Excluding directories
I’m coding the directory exclusions feature of my Delphi-code-searching tool, and I wanted to get you guys’ feedback.
You’ll give the app a directory to start in, and it will automatically recurse through all subdirectories, except the ones you tell it to skip. I want this because our code base has some utilities that were written for conversion to the current program version. We no longer compile those tools as part of our current builds, so I don’t worry about them when I refactor, and I don’t want to search in them. One of my readers also commented on wanting to exclude third-party code from searches. I think this “skip this directory” feature will be pretty useful.
I wrote the directory-recursing code this morning. The simple case (no exclusions) would basically look like this:
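// Breadth-first walk: the list grows while the loop reads from it.
ArrayList arrayList = new ArrayList();
arrayList.Add(_parameters.RootDirectory);
for (int i = 0; i < arrayList.Count; ++i)
{
    DirectoryInfo thisDirectory = (DirectoryInfo) arrayList[i];
    // Queue this directory's children so later iterations visit them too.
    arrayList.AddRange(thisDirectory.GetDirectories());
}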
But if I want exclusions, the question arises: Where do I put that logic? There are two places that would make sense:
- Instead of the AddRange, I could do a foreach through the GetDirectories() results, and only add directories to the ArrayList if they aren’t on the exclude list. This would mean that I would never recurse into the subdirectories of anything on the exclude list. (This is what I decided to have the code do for now.)
- The other option would be to go from the starting directory and scan its entire directory tree, and then make a second pass to remove the excluded directories from the ArrayList. The difference here is that, if I exclude directory C:\Code\Foo, the ArrayList will still contain Foo’s subdirectories.
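To make the two options concrete, here is a rough sketch of both, side by side. It is not the tool's actual code: the class name ExclusionSketch, the method names ScanPruned and ScanThenFilter, and the idea of passing exclusions as a list of path strings are all assumptions made for illustration. It also assumes case-insensitive path matching, which suits Windows but may not suit other systems.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch; none of these names come from the real tool.
public static class ExclusionSketch
{
    // Normalize exclusions so they compare cleanly against DirectoryInfo.FullName.
    // Case-insensitive matching is an assumption (appropriate for Windows paths).
    private static HashSet<string> Normalize(IEnumerable<string> paths)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in paths)
        {
            set.Add(Path.GetFullPath(path).TrimEnd(
                Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar));
        }
        return set;
    }

    // Option 1: check each child before queuing it, so an excluded
    // directory is never entered and its subdirectories never appear.
    public static List<DirectoryInfo> ScanPruned(
        DirectoryInfo root, IEnumerable<string> excludedPaths)
    {
        HashSet<string> excluded = Normalize(excludedPaths);
        var directories = new List<DirectoryInfo> { root };
        for (int i = 0; i < directories.Count; ++i)
        {
            foreach (DirectoryInfo child in directories[i].GetDirectories())
            {
                if (!excluded.Contains(child.FullName))
                    directories.Add(child);
            }
        }
        return directories;
    }

    // Option 2: scan the whole tree, then remove only the directories
    // that are named explicitly. Their subdirectories stay in the list.
    public static List<DirectoryInfo> ScanThenFilter(
        DirectoryInfo root, IEnumerable<string> excludedPaths)
    {
        HashSet<string> excluded = Normalize(excludedPaths);
        var directories = new List<DirectoryInfo> { root };
        for (int i = 0; i < directories.Count; ++i)
        {
            directories.AddRange(directories[i].GetDirectories());
        }
        directories.RemoveAll(d => excluded.Contains(d.FullName));
        return directories;
    }
}
```

With C:\Code\Foo excluded, ScanPruned leaves out Foo and everything beneath it, while ScanThenFilter leaves out only Foo itself and still returns Foo's subdirectories.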
The difference is in whether “exclude directory X” means “ignore X and all its subdirectories”, or if it instead means “don’t process any of the files in X, but do include its subdirectories (unless I also exclude them specifically)”.
Which option do you think would be more useful? The second option gives you more fine-grained control; you can include or exclude individual directories at your discretion. But if what you really want is “exclude this entire directory tree”, then the first option would be better — unchecking all those directories one by one could be really cumbersome.
What do you think? Would you only use the first option? Only the second? Or would you really need the ability to choose, per exclusion, whether you’re skipping that directory tree or just the files in that directory?
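If the answer turns out to be "choose per exclusion", one possible shape is to tag each excluded path with a mode. The sketch below is only an illustration under that assumption: the ExclusionMode enum, the ModalScan class, and the dictionary of exclusions are hypothetical names, and the dictionary keys are assumed to be full paths without a trailing separator.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical: how a single exclusion should be interpreted.
public enum ExclusionMode
{
    EntireTree, // skip the directory and everything beneath it
    FilesOnly   // skip the directory's own files, but still visit its subdirectories
}

public static class ModalScan
{
    // Returns the directories whose files should be searched.
    // 'exclusions' maps a full directory path to its mode; build it with
    // StringComparer.OrdinalIgnoreCase if paths should match case-insensitively.
    public static List<DirectoryInfo> Collect(
        DirectoryInfo root, IDictionary<string, ExclusionMode> exclusions)
    {
        var toVisit = new List<DirectoryInfo> { root };
        var toSearch = new List<DirectoryInfo>();
        for (int i = 0; i < toVisit.Count; ++i)
        {
            DirectoryInfo current = toVisit[i];
            ExclusionMode mode;
            bool isExcluded = exclusions.TryGetValue(current.FullName, out mode);

            if (isExcluded && mode == ExclusionMode.EntireTree)
                continue; // never descend, so nothing below is seen

            if (!isExcluded)
                toSearch.Add(current); // FilesOnly directories are walked but not searched

            toVisit.AddRange(current.GetDirectories());
        }
        return toSearch;
    }
}
```

One consequence of this design is that a directory created later under an EntireTree exclusion stays excluded automatically, while one created under a FilesOnly exclusion is searched unless it gets its own entry.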
October 13th, 2004 at 7:20 am
"unchecking all those directories one by one could be really cumbersome."
That’s more of a UI-side issue than a business-logic one, which is where the rest of your question lies.
I think the second would be better in the long term, as it’s more flexible.
For the UI side of the question, look at MS-Backup.
October 13th, 2004 at 7:46 am
True, but it’s an issue that the business logic needs to consider as well. In particular, if I really want to exclude the entire subtree, but I have to uncheck each folder manually, then new directories will be automatically *included*, which is not what I want.
The main consideration is the user’s intent. You’re right that UI design is a side issue (although it should mesh pretty well with the user’s intent).
October 13th, 2004 at 8:09 am
For our directory structure, ignoring the directory and all its subdirectories would be what I would want to have it do.
October 14th, 2004 at 2:37 pm
It took me a while to think of a reason I would want to exclude a parent directory, but include some of its children.
It would be if I wanted to include one child directory with many siblings.
It might be easier to exclude the parent and include the one I want, than it would be to exclude all the siblings I don’t want.
Other than that narrow case, I would expect you would want the whole tree.
HTH.
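The "exclude the parent, include one child" case could be handled by letting the most specific rule win. What follows is a minimal sketch of that idea, not anything from the tool itself: the OverrideRules class and its IsSearched method are hypothetical, paths are assumed to be already normalized (platform separator, no trailing separator), and matching is assumed to be case-insensitive.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: the deepest matching rule decides.
public static class OverrideRules
{
    // True if 'path' is 'ancestor' itself or lies somewhere beneath it.
    private static bool IsSelfOrDescendant(string path, string ancestor)
    {
        if (path.Equals(ancestor, StringComparison.OrdinalIgnoreCase))
            return true;
        // Appending the separator stops "Code\Lib" from matching "Code\Library".
        return path.StartsWith(ancestor + Path.DirectorySeparatorChar,
                               StringComparison.OrdinalIgnoreCase);
    }

    // A directory is searched unless its most specific matching rule is an
    // exclusion. A longer rule path is treated as more specific; when the same
    // path is both excluded and included, the include wins (an arbitrary choice).
    public static bool IsSearched(
        string path, IEnumerable<string> excludedTrees, IEnumerable<string> includedTrees)
    {
        int bestExclude = -1;
        int bestInclude = -1;
        foreach (string excluded in excludedTrees)
        {
            if (IsSelfOrDescendant(path, excluded))
                bestExclude = Math.Max(bestExclude, excluded.Length);
        }
        foreach (string included in includedTrees)
        {
            if (IsSelfOrDescendant(path, included))
                bestInclude = Math.Max(bestInclude, included.Length);
        }
        return bestInclude >= bestExclude;
    }
}
```

Note that a scan using rules like these can no longer skip an excluded directory outright, because an included child may sit beneath it. It would have to walk the whole tree and apply IsSearched to each directory it finds.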
October 17th, 2004 at 10:53 pm
Option interaction