Age | Commit message | Author |
|
|
|
|
|
This moves the DirEntry and Error types out into their own separate
modules.
This is prep work to (hopefully) make the impending refactoring (or
more likely, rewrite) more palatable.
|
|
This commit performs the unenviable task of implementing directory
traversal for Windows, Unix-like platforms and Linux (where Linux is
supported via the generic Unix implementation, but also has its own
specialized implementation that uses getdents64 directly).
The purpose of doing this is largely to control our costs more
explicitly, and to fix platform-specific bugs. For costs, this largely
boils down to amortizing allocation. That is, instead of allocating
a fresh OsString for every entry, we can now read directory entries
into a previously used entry. Anecdotally, this leads to a 20-30%
performance improvement on listing a single large directory for *both*
Windows and Linux.
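
The amortization idea can be sketched in isolation. This is only an illustration under stated assumptions: `ReusableEntry`, `fill` and `longest_name_len` are hypothetical names, and the names come from a slice rather than from a real directory handle, so it shows the buffer reuse pattern and not the platform code in this commit.

```rust
use std::ffi::{OsStr, OsString};

/// Hypothetical reusable entry: a single name buffer that is overwritten for
/// each directory entry instead of being reallocated.
struct ReusableEntry {
    name: OsString,
}

impl ReusableEntry {
    fn new() -> ReusableEntry {
        ReusableEntry { name: OsString::new() }
    }

    /// Overwrite the stored name. `clear` keeps the existing capacity, so a
    /// new allocation happens only when `raw_name` outgrows the buffer.
    fn fill(&mut self, raw_name: &OsStr) {
        self.name.clear();
        self.name.push(raw_name);
    }

    fn name(&self) -> &OsStr {
        &self.name
    }
}

/// Returns the byte length of the longest name, visiting every name through
/// one shared entry.
fn longest_name_len(raw_names: &[&str]) -> usize {
    let mut entry = ReusableEntry::new();
    let mut longest = 0;
    for raw in raw_names {
        entry.fill(OsStr::new(raw));
        longest = longest.max(entry.name().len());
    }
    longest
}

fn main() {
    let names = ["Cargo.toml", "src", "README.md"];
    println!("longest name: {} bytes", longest_name_len(&names));
}
```

The trade-off is that a caller who wants to keep a name beyond the next `fill` has to copy it out explicitly.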
As for bugs, these mostly center around long file paths. For example,
on Unix, the typical maximum path length is 4096 bytes. There is no
way to avoid this using std's APIs, so we need to instead roll our own
based on file descriptors. There's also Windows, which has a maximum
file path length of only 260 chars, although this is trickier to fix.
There are basically two downsides to doing things this way:
1. This increases maintenance burden quite a bit, and this change will
almost certainly introduce bugs on less well-tested platforms.
2. It's not clear yet whether we will *also* need to hand-roll our own
`Metadata` implementation. std provides no way to build the
`Metadata` type from raw inputs. The key factor here is that std's
`DirEntry::metadata` method benefits from using `fstatat` internally.
If we don't re-roll our own `Metadata`, then we'll have to use
`std::fs::metadata`, which does a normal `stat` call on the file
path. `fstatat` should generally be faster. On the other hand, std's
implementation strategy also means that file descriptors aren't
closed until the last `DirEntry` is dropped, which is a bit sneaky.
(And is also indicative of a bug in walkdir, since this means its
"maximum opened fds" feature doesn't actually work.)
Unfortunate, but everything gets sacrificed at the altar of performance.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
And also add a `source` method on the `Error` impl.
And finally, permit the use of the deprecated `description` method,
since removing it would be a breaking change.
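
A minimal sketch of that shape, assuming a hypothetical `WalkError` wrapper around `std::io::Error` (the real error type in this crate is more involved), with `source_message` as an illustrative helper:

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical error type that wraps an underlying I/O error.
#[derive(Debug)]
struct WalkError {
    inner: io::Error,
}

impl fmt::Display for WalkError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "walk failed: {}", self.inner)
    }
}

impl Error for WalkError {
    // Kept for backwards compatibility. The attribute silences the
    // deprecation warning triggered by calling `description` below.
    #[allow(deprecated)]
    fn description(&self) -> &str {
        self.inner.description()
    }

    // Expose the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.inner)
    }
}

/// Renders the immediate cause of an error, if it has one.
fn source_message(err: &dyn Error) -> Option<String> {
    err.source().map(|cause| cause.to_string())
}

fn main() {
    let err = WalkError {
        inner: io::Error::new(io::ErrorKind::NotFound, "missing"),
    };
    println!("{} (cause: {:?})", err, source_message(&err));
}
```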
|
|
For now, we don't switch to Rust 2018 to avoid creating a
larger-than-necessary divergence with the in-progress walkdir 3 rewrite.
|
|
|
|
|
|
|
|
A somewhat recent change permitted the `push` function to exit early
after `oldest_opened` was incremented, but before a new entry was pushed
on to the stack. Specifically, the only way this could happen was if
a handle could not be opened to an ancestor path, on Windows only. We
fix this by incrementing `oldest_opened` only after we push a new entry
to the stack.
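
The ordering can be illustrated with a heavily simplified model. Everything here is an assumption made for the sketch: `Stack`, its fields and the `Result` passed to `push` are hypothetical stand-ins, not the crate's internals. The point is only that the counter is touched after the fallible step has succeeded.

```rust
/// Hypothetical, heavily simplified stand-in for a traversal stack.
struct Stack {
    entries: Vec<String>,
    /// Index of the oldest entry that still holds an open handle.
    oldest_opened: usize,
    /// Maximum number of handles allowed to be open at once.
    max_open: usize,
}

impl Stack {
    fn new(max_open: usize) -> Stack {
        Stack { entries: Vec::new(), oldest_opened: 0, max_open }
    }

    /// `opened` models a fallible step such as opening a handle.
    fn push(&mut self, opened: Result<String, String>) -> Result<(), String> {
        let open_count = self.entries.len() - self.oldest_opened;
        let at_limit = open_count == self.max_open;
        // Fallible step first: an early return here leaves the bookkeeping
        // exactly as it was.
        let handle = opened?;
        self.entries.push(handle);
        // Only now, with the new entry on the stack, advance the counter.
        if at_limit {
            self.oldest_opened += 1;
        }
        Ok(())
    }

    fn invariant_holds(&self) -> bool {
        self.oldest_opened <= self.entries.len()
    }
}

fn main() {
    let mut stack = Stack::new(1);
    stack.push(Ok("a".to_string())).unwrap();
    // A failed push must not disturb `oldest_opened`.
    assert!(stack.push(Err("cannot open ancestor".to_string())).is_err());
    stack.push(Ok("c".to_string())).unwrap();
    println!("oldest_opened = {}, ok = {}", stack.oldest_opened, stack.invariant_holds());
}
```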
Credit goes to @LukasKalbertodt for figuring out this bug!
|
|
The method sometimes destroyed an internal invariant by not decreasing
`oldest_opened`. That then leads to panics in `push`. We fix this by
calling the canonical `pop` function, which is what should have happened
from the beginning.
This includes a regression test that fails without this fix.
Fixes #118, Closes #124
|
|
This moves the DirEntry and Error types out into their own separate
modules.
This is prep work to (hopefully) make the impending refactoring (or
more likely, rewrite) more palatable.
|
|
This gets rid of a lot of unnecessary infrastructure around maintaining
the directory hierarchy in a tree. This was principally used in order to
support effective quickcheck tests, but since we dropped quickcheck, we
no longer need such things.
We now center tests around the Dir type, which makes setting up the
tests simpler and easier to understand.
|
|
This is in preparation to rewrite the tests.
|
|
These weren't carrying their weight. Depending on rand is super
annoying, so just stop doing it. In particular, we can bring back the
minimal version check.
|
|
This supplants the previous "example" which was more like a debugging
program. So this commit not only rewrites it (dropping docopt in the
process in favor of clap), but moves it to its own non-published binary
crate.
|
|
... to 0.6 and 0.8, respectively.
We aren't running tests on the MSRV anymore anyway, so we might as
well keep on moving.
Unfortunately, the rand ecosystem refuses to advertise and maintain
correct minimal versions in their Cargo.toml, so we have to remove the
minimal version check.
|
|
|
|
Because we aren't ready to bump our MSRV yet.
|
|
Because rand. Sigh.
|
|
|
|
|
|
|
|
PR #119
|
|
|
|
|
|
|
|
This commit fixes a nasty bug where the root path given to walkdir was
always reported as a symlink, even when 'follow_links' was enabled. This
appears to be a regression introduced by commit 6f72fce as part of
fixing BurntSushi/ripgrep#984.
The central problem was that since root paths should always be followed,
we were creating a DirEntry whose internal file type was always resolved
by following a symlink, but whose 'metadata' method still returned the
metadata of the symlink and not the target. This was problematic and
inconsistent both with and without 'follow_links' enabled.
We also fix the documentation. In particular, we make the docs of 'new'
less ambiguous, where it previously could have been interpreted as
contradictory to the docs on 'DirEntry'. Specifically, 'WalkDir::new'
says:
If root is a symlink, then it is always followed.
But the docs for 'DirEntry::metadata' say:
This always calls std::fs::symlink_metadata.
If this entry is a symbolic link and follow_links is enabled, then
std::fs::metadata is called instead.
Similarly, 'DirEntry::file_type' said:
If this is a symbolic link and follow_links is true, then this
returns the type of the target.
That is, if 'root' is a symlink and 'follow_links' is NOT enabled,
then the previous incorrect behavior resulted in 'DirEntry::file_type'
behaving as if 'follow_links' was enabled. If 'follow_links'
was enabled, then the previous incorrect behavior resulted in
'DirEntry::metadata' reporting the metadata of the symlink itself.
We fix this by correctly constructing the DirEntry in the first place,
and then adding special case logic to path traversal that will always
attempt to follow the root path if it's a symlink and 'follow_links'
was not enabled. We also tweak the docs on 'WalkDir::new' to be more
precise.
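
The documented contract for metadata lookups can be sketched with std alone. This is an illustration of the rule quoted above, not the crate's code: `entry_metadata` is a hypothetical helper, and the special handling of the root path is not modelled.

```rust
use std::fs::{self, Metadata};
use std::io;
use std::path::Path;

/// Hypothetical helper: report the symlink's own metadata unless the entry is
/// a symlink and `follow_links` is set, in which case report the target's.
fn entry_metadata(path: &Path, follow_links: bool) -> io::Result<Metadata> {
    // `symlink_metadata` does not follow a symlink in the final component.
    let md = fs::symlink_metadata(path)?;
    if follow_links && md.file_type().is_symlink() {
        // `metadata` follows symlinks and describes the target.
        fs::metadata(path)
    } else {
        Ok(md)
    }
}

fn main() -> io::Result<()> {
    let cwd = std::env::current_dir()?;
    let md = entry_metadata(&cwd, true)?;
    println!("{} is a directory: {}", cwd.display(), md.is_dir());
    Ok(())
}
```

With this rule the file type and the metadata of an entry always describe the same thing, which is the consistency the fix is after.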
Fixes #115
|
|
|
|
|
|
PR #114
|
|
|
|
|
|
|
|
We do still need winapi for a std-library work-around.
|
|
|
|
PR #112
|
|
|
|
This commit includes a new method, `same_file_system`, which when
enabled, will cause walkdir to only descend into directories that are on
the same file system as the root path.
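
One way to express such a check on Unix-like platforms is to compare device IDs. The sketch below is Unix-only and assumes that approach for illustration; `on_same_file_system` is a hypothetical helper and does not claim to mirror the implementation in this commit.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical helper (Unix only): two paths are treated as being on the
/// same file system when their metadata reports the same device ID.
fn on_same_file_system(root: &Path, candidate: &Path) -> io::Result<bool> {
    let root_dev = fs::metadata(root)?.dev();
    let candidate_dev = fs::metadata(candidate)?.dev();
    Ok(root_dev == candidate_dev)
}

fn main() -> io::Result<()> {
    let same = on_same_file_system(Path::new("."), Path::new("/"))?;
    println!("current directory shares a file system with /: {}", same);
    Ok(())
}
```

A traversal would run a check like this before descending into each directory and skip the ones that fail it.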
Closes #8, Closes #107
|
|
This can avoid an allocation and copy in iterator chains that need to
produce a PathBuf.
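
A rough illustration of the difference, using a hypothetical `Entry` type with a borrowing `path` accessor and a consuming `into_path` method (names chosen for the sketch):

```rust
use std::path::{Path, PathBuf};

/// Hypothetical directory entry that owns its path.
struct Entry {
    path: PathBuf,
}

impl Entry {
    /// Borrowing accessor: a caller that needs an owned `PathBuf` must copy.
    fn path(&self) -> &Path {
        &self.path
    }

    /// Consuming accessor: hands over the existing buffer with no copy.
    fn into_path(self) -> PathBuf {
        self.path
    }
}

/// Collects owned paths by moving each buffer out of its entry.
fn collect_paths(entries: Vec<Entry>) -> Vec<PathBuf> {
    // The borrowing alternative, `.map(|e| e.path().to_path_buf())`,
    // would allocate and copy once per entry.
    entries.into_iter().map(Entry::into_path).collect()
}

fn main() {
    let entries = vec![
        Entry { path: PathBuf::from("src/lib.rs") },
        Entry { path: PathBuf::from("Cargo.toml") },
    ];
    println!("first entry: {}", entries[0].path().display());
    println!("owned paths: {:?}", collect_paths(entries));
}
```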
PR #100
|
|
|
|
This commit fixes a bug where the first path always reported itself as
a symlink via `path_is_symlink`.
Partially fixes https://github.com/BurntSushi/ripgrep/issues/984
|
|
|
|
|