Skip to content

refactor: file matching implementation#2566

Open
sinrohit-desco wants to merge 2 commits into
molybdenumsoftware:masterfrom
sinrohit-desco:patch-0
Open

refactor: file matching implementation#2566
sinrohit-desco wants to merge 2 commits into
molybdenumsoftware:masterfrom
sinrohit-desco:patch-0

Conversation

@sinrohit-desco

Copy link
Copy Markdown

Precursor to #1921

@mightyiam

Copy link
Copy Markdown
Member

How about we replace most of this with usage of some popular and well tested crate, instead?

@sinrohit-desco

Copy link
Copy Markdown
Author

How about we replace most of this with usage of some popular and well tested crate, instead?

Well, do you have any in mind that we could use?

@mightyiam

Copy link
Copy Markdown
Member

@sinrohit-desco

Copy link
Copy Markdown
Author

Maybe dirwalk - crates.io: Rust Package Registry?

I don't think that this is very good idea. Reasons below

  1. dirwalk is fairly new and has just 89 downloads vs ignore which has downloads in millions and is much more popular and used by ripgrep.
  2. Even dirwalk uses ignore https://codeberg.org/madk/dirwalk/src/branch/main/dirwalk/Cargo.toml#L33
  3. No multiple roots, which means for multiple targets it's not going to be as easy as calling .add() to walk multiple targets in one pass. I'd need separate walks per target, hence losing deduplication.
  4. I used OverrideBuilder to implement --ignore patterns, where as dirwalk has .glob() and .extensions() but no equivalent to the override system that I rely on.

The motivation to use this crate was to simplify things. In my opinion ignore seems much better option but I will let you decide, since you are the maintainer.

@mightyiam

Copy link
Copy Markdown
Member

ignore sounds great.

@mightyiam

Copy link
Copy Markdown
Member

Need any help with this, @sinrohit-desco?

@sinrohit-desco

Copy link
Copy Markdown
Author

Need any help with this, @sinrohit-desco?

Yes, I’d appreciate a review of this PR.

@mightyiam

Copy link
Copy Markdown
Member

Would it be easier to review and better for maintenance and more reliable to switch to ignore instead of this refactor?

@sinrohit-desco

Copy link
Copy Markdown
Author

Would it be easier to review and better for maintenance and more reliable to switch to ignore instead of this refactor?

We were already using the ignore crate, and this PR is specifically about refactoring our usage of it. Previously, we used the ignore::gitignore API for pattern matching along with a custom Walker implementation for directory traversal. This PR replaces that hand-rolled traversal logic with ignore::WalkBuilder, which combines directory walking with gitignore rule handling.

So I’m not sure what you mean by “switch to ignore” here. Could you clarify what you think is missing?

@mightyiam

Copy link
Copy Markdown
Member

I don't see any existing tests. Do you think we should have tests before making such changes?

@sinrohit-desco

Copy link
Copy Markdown
Author

I don't see any existing tests. Do you think we should have tests before making such changes?

Since this is a refactoring PR and does not introduce new functionality, I’d prefer to keep the scope limited to the code changes here. If there are gaps in the existing test coverage, I agree they should be addressed, but I think that should be done in a separate PR. And I am not interested in opening another PR.

Replace hand-rolled Walker iterator and build_ignore_set with
ignore::WalkBuilder from the same ignore crate. Earlier we were using
ignore::gitignore API for pattern matching and a
custom Walker struct for directory traversal. WalkBuilder combines
both — it walks directories and applies gitignore rules.
@mightyiam

Copy link
Copy Markdown
Member

@sinrohit-desco, since @Lymah123 did #2668 we now have tests for this. Would you be willing to take a look? Seems like a formatting issue.

@sinrohit-desco

Copy link
Copy Markdown
Author

@mightyiam Could you approve the workflows again ?

@mightyiam

Copy link
Copy Markdown
Member

Thank you for rebasing, @sinrohit-desco!

It seems that @Lymah123's #2668 is paying off with a couple of failing assertions here.

https://github.com/molybdenumsoftware/statix/actions/runs/28359363290/job/84531255651?pr=2566#step:4:1073

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants