Commit 4f87bb8d by Jonathan Wakely

PR libstdc++/71044 optimize std::filesystem::path construction

This new implementation has a smaller footprint than the previous
implementation, due to replacing std::vector<_Cmpt> with a custom pimpl
type that only needs a single pointer. The _M_type enumeration is also
combined with the pimpl type, by using a tagged pointer, reducing
sizeof(path) further still.
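
As a rough illustration of the tagged-pointer idea described above, the sketch below packs a two-bit type tag into the low bits of a suitably aligned pointer, so the object is still one pointer wide. The gdb printer in this commit reads the tag with `& 3`, which suggests two tag bits; everything else here (the names `Kind`, `Impl`, `TaggedPtr` and the numeric tag values) is hypothetical and not taken from libstdc++.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; a sketch of the tagged-pointer idea, not libstdc++ code.
// The numeric values of the tags are an assumption made for this example.
enum class Kind : std::uintptr_t { Multi = 0, RootName = 1, RootDir = 2, Filename = 3 };

// Alignment of at least 4 guarantees the two low bits of any Impl* are zero.
struct alignas(4) Impl {
  int size;
  int capacity;
};

class TaggedPtr {
  static constexpr std::uintptr_t tag_mask = 3;
  std::uintptr_t bits_ = 0;

public:
  // Multi-component case: store a real pointer, tag bits stay 0 (Kind::Multi).
  explicit TaggedPtr(Impl* p) : bits_(reinterpret_cast<std::uintptr_t>(p)) {
    assert((bits_ & tag_mask) == 0);
  }
  // Single-component case: nothing is allocated, only the tag is stored.
  explicit TaggedPtr(Kind k) : bits_(static_cast<std::uintptr_t>(k)) {}

  Kind kind() const { return static_cast<Kind>(bits_ & tag_mask); }
  Impl* get() const { return reinterpret_cast<Impl*>(bits_ & ~tag_mask); }
};

static_assert(sizeof(TaggedPtr) == sizeof(void*), "the tag costs no extra storage");
```

The design trade-off is that every access has to mask the tag off before dereferencing, in exchange for not storing a separate enumeration member.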

Construction and modification of paths are now done more efficiently, by
splitting the input into a stack-based buffer of string_view objects
instead of a dynamically-allocated vector containing strings. Once the
final size is known only a single allocation is needed to reserve space
for it.  The append and concat operations no longer require constructing
temporary path objects, nor re-parsing the entire native pathname.
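
The two-phase approach described above might look roughly like the following sketch: components are first recorded as `std::string_view` objects in a fixed-size buffer on the stack, and only once the count is known is the destination container reserved in one step. The function name `split_components`, the bound of 64 and the use of `std::vector<std::string>` as the destination are assumptions made for illustration; the real code uses its own `_Parser` and `_List` types and must also cope with inputs that exceed the buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper, not the libstdc++ _Parser: split `input` on '/' into
// views held in a fixed-size stack buffer, then reserve the result once.
std::vector<std::string> split_components(std::string_view input)
{
  constexpr std::size_t max_parts = 64;         // assumed bound for this sketch
  std::array<std::string_view, max_parts> buf;  // on the stack, no allocation
  std::size_t n = 0;

  std::size_t pos = 0;
  while (pos < input.size() && n < max_parts) {
    std::size_t end = input.find('/', pos);
    if (end == std::string_view::npos)
      end = input.size();
    if (end > pos)                              // skip empty pieces from "//"
      buf[n++] = input.substr(pos, end - pos);
    pos = end + 1;
  }

  std::vector<std::string> out;
  out.reserve(n);                               // one allocation for the container
  for (std::size_t i = 0; i < n; ++i)
    out.emplace_back(buf[i]);                   // copy characters only at the end
  return out;
}
```

In this sketch, components beyond the assumed bound are silently dropped, which a real implementation could not do.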

This results in algorithmic improvements to path construction, and
working with large paths is much faster.

	PR libstdc++/71044
	* include/bits/fs_path.h (path::path(path&&)): Add noexcept when
	appropriate. Move _M_cmpts instead of reparsing the native pathname.
	(path::operator=(const path&)): Do not define as defaulted.
	(path::operator/=, path::append): Call _M_append.
	(path::concat): Call _M_concat.
	(path::path(string_type, _Type)): Change type of first parameter to
	basic_string_view<value_type>.
	(path::_M_append(basic_string_view<value_type>)): New member function.
	(path::_M_concat(basic_string_view<value_type>)): New member function.
	(_S_convert(value_type*, __null_terminated)): Return string view.
	(_S_convert(const value_type*, __null_terminated)): Return string view.
	(_S_convert(value_type*, value_type*))
	(_S_convert(const value_type*, const value_type*)): Add overloads for
	pairs of pointers.
	(_S_convert(_InputIterator, __null_terminated)): Construct string_type
	explicitly, for cases where _S_convert returns a string view.
	(path::_S_is_dir_sep): Replace with non-member is_dir_sep.
	(path::_M_trim, path::_M_add_root_name, path::_M_add_root_dir)
	(path::_M_add_filename): Remove.
	(path::_M_type()): New member function to replace _M_type data member.
	(path::_List): Define new struct type instead of using std::vector.
	(path::_Cmpt::_Cmpt(string_type, _Type, size_t)): Change type of
	first parameter to basic_string_view<value_type>.
	(path::operator+=(const path&)): Do not define inline.
	(path::operator+=(const string_type&)): Call _M_concat.
	(path::operator+=(const value_type*)): Likewise.
	(path::operator+=(value_type)): Likewise.
	(path::operator+=(basic_string_view<value_type>)): Likewise.
	(path::operator/=(const path&)): Do not define inline.
	(path::_M_append(path)): Remove.
	* python/libstdcxx/v6/printers.py (StdPathPrinter): New printer that
	understands the new path::_List type.
	* src/filesystem/std-path.cc (is_dir_sep): New function to replace
	path::_S_is_dir_sep.
	(path::_Parser): New helper class to parse strings as paths.
	(path::_List::_Impl): Define container type for path components.
	(path::_List): Define members.
	(path::operator=(const path&)): Define explicitly, to provide the
	strong exception safety guarantee.
	(path::operator/=(const path&)): Implement manually by processing
	each component of the argument, rather than using _M_split_cmpts
	to parse the entire string again.
	(path::_M_append(string_type)): Likewise.
	(path::operator+=(const path&)): Likewise.
	(path::_M_concat(string_type)): Likewise.
	(path::remove_filename()): Perform trim directly instead of calling
	_M_trim().
	(path::_M_split_cmpts()): Rewrite in terms of _Parser class.
	(path::_M_trim, path::_M_add_root_name, path::_M_add_root_dir)
	(path::_M_add_filename): Remove.
	* testsuite/27_io/filesystem/path/append/source.cc: Test appending a
	string view that aliases the path.
	* testsuite/27_io/filesystem/path/concat/strings.cc: Test concatenating
	a string view that aliases the path.

From-SVN: r267106
parent 51beaeba
2018-12-13 Jonathan Wakely <jwakely@redhat.com>
PR libstdc++/71044
* include/bits/fs_path.h (path::path(path&&)): Add noexcept when
appropriate. Move _M_cmpts instead of reparsing the native pathname.
(path::operator=(const path&)): Do not define as defaulted.
(path::operator/=, path::append): Call _M_append.
(path::concat): Call _M_concat.
(path::path(string_type, _Type)): Change type of first parameter to
basic_string_view<value_type>.
(path::_M_append(basic_string_view<value_type>)): New member function.
(path::_M_concat(basic_string_view<value_type>)): New member function.
(_S_convert(value_type*, __null_terminated)): Return string view.
(_S_convert(const value_type*, __null_terminated)): Return string view.
(_S_convert(value_type*, value_type*))
(_S_convert(const value_type*, const value_type*)): Add overloads for
pairs of pointers.
(_S_convert(_InputIterator, __null_terminated)): Construct string_type
explicitly, for cases where _S_convert returns a string view.
(path::_S_is_dir_sep): Replace with non-member is_dir_sep.
(path::_M_trim, path::_M_add_root_name, path::_M_add_root_dir)
(path::_M_add_filename): Remove.
(path::_M_type()): New member function to replace _M_type data member.
(path::_List): Define new struct type instead of using std::vector.
(path::_Cmpt::_Cmpt(string_type, _Type, size_t)): Change type of
first parameter to basic_string_view<value_type>.
(path::operator+=(const path&)): Do not define inline.
(path::operator+=(const string_type&)): Call _M_concat.
(path::operator+=(const value_type*)): Likewise.
(path::operator+=(value_type)): Likewise.
(path::operator+=(basic_string_view<value_type>)): Likewise.
(path::operator/=(const path&)): Do not define inline.
(path::_M_append(path)): Remove.
* python/libstdcxx/v6/printers.py (StdPathPrinter): New printer that
understands the new path::_List type.
* src/filesystem/std-path.cc (is_dir_sep): New function to replace
path::_S_is_dir_sep.
(path::_Parser): New helper class to parse strings as paths.
(path::_List::_Impl): Define container type for path components.
(path::_List): Define members.
(path::operator=(const path&)): Define explicitly, to provide the
strong exception safety guarantee.
(path::operator/=(const path&)): Implement manually by processing
each component of the argument, rather than using _M_split_cmpts
to parse the entire string again.
(path::_M_append(string_type)): Likewise.
(path::operator+=(const path&)): Likewise.
(path::_M_concat(string_type)): Likewise.
(path::remove_filename()): Perform trim directly instead of calling
_M_trim().
(path::_M_split_cmpts()): Rewrite in terms of _Parser class.
(path::_M_trim, path::_M_add_root_name, path::_M_add_root_dir)
(path::_M_add_filename): Remove.
* testsuite/27_io/filesystem/path/append/source.cc: Test appending a
string view that aliases the path.
* testsuite/27_io/filesystem/path/concat/strings.cc: Test concatenating
a string view that aliases the path.
* testsuite/27_io/filesystem/path/generation/proximate.cc: Use
preferred directory separators for normalized paths.
* testsuite/27_io/filesystem/path/generation/relative.cc: Likewise.
@@ -1244,6 +1244,77 @@ class StdExpPathPrinter:
     def children(self):
         return self._iterator(self.val['_M_cmpts'])

+class StdPathPrinter:
+    "Print a std::filesystem::path"
+
+    def __init__ (self, typename, val):
+        self.val = val
+        self.typename = typename
+        impl = self.val['_M_cmpts']['_M_impl']['_M_t']['_M_t']['_M_head_impl']
+        self.type = impl.cast(gdb.lookup_type('uintptr_t')) & 3
+        if self.type == 0:
+            self.impl = impl
+        else:
+            self.impl = None
+
+    def _path_type(self):
+        t = str(self.type.cast(gdb.lookup_type(self.typename + '::_Type')))
+        if t[-9:] == '_Root_dir':
+            return "root-directory"
+        if t[-10:] == '_Root_name':
+            return "root-name"
+        return None
+
+    def to_string (self):
+        path = "%s" % self.val ['_M_pathname']
+        if self.type != 0:
+            t = self._path_type()
+            if t:
+                path = '%s [%s]' % (path, t)
+        return "filesystem::path %s" % path
+
+    class _iterator(Iterator):
+        def __init__(self, impl, pathtype):
+            if impl:
+                # We can't access _Impl::_M_size because _Impl is incomplete
+                # so cast to int* to access the _M_size member at offset zero,
+                int_type = gdb.lookup_type('int')
+                cmpt_type = gdb.lookup_type(pathtype+'::_Cmpt')
+                char_type = gdb.lookup_type('char')
+                impl = impl.cast(int_type.pointer())
+                size = impl.dereference()
+                #self.capacity = (impl + 1).dereference()
+                if hasattr(gdb.Type, 'alignof'):
+                    sizeof_Impl = max(2 * int_type.sizeof, cmpt_type.alignof)
+                else:
+                    sizeof_Impl = 2 * int_type.sizeof
+                begin = impl.cast(char_type.pointer()) + sizeof_Impl
+                self.item = begin.cast(cmpt_type.pointer())
+                self.finish = self.item + size
+                self.count = 0
+            else:
+                self.item = None
+                self.finish = None
+
+        def __iter__(self):
+            return self
+
+        def __next__(self):
+            if self.item == self.finish:
+                raise StopIteration
+            item = self.item.dereference()
+            count = self.count
+            self.count = self.count + 1
+            self.item = self.item + 1
+            path = item['_M_pathname']
+            t = StdPathPrinter(item.type.name, item)._path_type()
+            if not t:
+                t = count
+            return ('[%s]' % t, path)
+
+    def children(self):
+        return self._iterator(self.impl, self.typename)
+
 class StdPairPrinter:
     "Print a std::pair object, with 'first' and 'second' as children"
@@ -1759,9 +1830,9 @@ def build_libstdcxx_dictionary ():
     libstdcxx_printer.add_version('std::experimental::filesystem::v1::__cxx11::',
                                   'path', StdExpPathPrinter)
     libstdcxx_printer.add_version('std::filesystem::',
-                                  'path', StdExpPathPrinter)
+                                  'path', StdPathPrinter)
     libstdcxx_printer.add_version('std::filesystem::__cxx11::',
-                                  'path', StdExpPathPrinter)
+                                  'path', StdPathPrinter)

     # C++17 components
     libstdcxx_printer.add_version('std::',
@@ -112,6 +112,33 @@ test04()
 #endif
 }

+void
+test05()
+{
+  std::basic_string_view<path::value_type> s;
+
+  path p = "0/1/2/3/4/5/6";
+  // The string_view aliases the path's internal string:
+  s = p.native();
+  // Append that string_view, which must work correctly even though the
+  // internal string will be reallocated during the operation:
+  p /= s;
+  VERIFY( p.string() == "0/1/2/3/4/5/6/0/1/2/3/4/5/6" );
+
+  // Same again with a trailing slash:
+  path p2 = "0/1/2/3/4/5/";
+  s = p2.native();
+  p2 /= s;
+  VERIFY( p2.string() == "0/1/2/3/4/5/0/1/2/3/4/5/" );
+
+  // And aliasing one of the components of the path:
+  path p3 = "0/123456789/a";
+  path::iterator second = std::next(p3.begin());
+  s = second->native();
+  p3 /= s;
+  VERIFY( p3.string() == "0/123456789/a/123456789" );
+}
+
 int
 main()
 {
@@ -119,4 +146,5 @@ main()
   test02();
   test03();
   test04();
+  test05();
 }
@@ -57,8 +57,36 @@ test01()
   VERIFY( p.filename().string() == file );
 }

+void
+test02()
+{
+  std::basic_string_view<path::value_type> s;
+
+  path p = "0/1/2/3/4/5/6";
+  // The string_view aliases the path's internal string:
+  s = p.native();
+  // Append that string_view, which must work correctly even though the
+  // internal string will be reallocated during the operation:
+  p += s;
+  VERIFY( p.string() == "0/1/2/3/4/5/60/1/2/3/4/5/6" );
+
+  // Same again with a trailing slash:
+  path p2 = "0/1/2/3/4/5/";
+  s = p2.native();
+  p2 += s;
+  VERIFY( p2.string() == "0/1/2/3/4/5/0/1/2/3/4/5/" );
+
+  // And aliasing one of the components of the path:
+  path p3 = "0/123456789";
+  path::iterator second = std::next(p3.begin());
+  s = second->native();
+  p3 += s;
+  VERIFY( p3.string() == "0/123456789123456789" );
+}
+
 int
 main()
 {
   test01();
+  test02();
 }
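
The aliasing cases exercised by the new tests can be illustrated on a plain `std::string`. In the sketch below, the helper `append_with_separator` is hypothetical and always inserts a `/`, so it does not reproduce the semantics of `path::operator/=`. It shows only one way to stay correct when the `string_view` argument points into the string being grown: remember the offset rather than the pointer before any reallocation can happen. It assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>

// Hypothetical helper: append '/' and then `s` to `str`, remaining correct
// when `s` views characters stored inside `str` itself.
void append_with_separator(std::string& str, std::string_view s)
{
  const char* first = str.data();
  const char* last = first + str.size();
  // std::less gives a total order over pointers, so the range check is
  // well defined even when s does not point into str.
  std::less<const char*> before;
  const bool aliases =
      !s.empty() && !before(s.data(), first) && before(s.data(), last);

  if (!aliases) {
    str += '/';
    str += s;
    return;
  }

  // Record the position, not the pointer: reserve() may move the buffer
  // and leave s dangling.
  const std::size_t offset = static_cast<std::size_t>(s.data() - first);
  const std::size_t len = s.size();
  str.reserve(str.size() + 1 + len);
  str += '/';
  str.append(str, offset, len);  // copy the original characters by position
}
```

A simpler alternative would be to copy `s` into a temporary `std::string` whenever it aliases, at the cost of an extra allocation.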