Joe White’s Blog

Life, .NET, and Cats


DGrok 0.8.1: multithreading, default options, GPL

Version 0.8.1 of the DGrok Delphi parser and tools is now available for download. Download DGrok 0.8.1 here.

What is DGrok?

DGrok is a set of tools for parsing Delphi source code and telling you stuff about it. Read more about it on the DGrok project page.

What’s new in 0.8.1?

Quick summary of what’s new (more information below):

  • Now GPL-licensed.
  • Reasonable defaults for {$IFOPT}.
  • Multithreaded parser.
  • Less memory usage when parsing twice.
  • Copy tree results to clipboard.

Now GPL-licensed

Prior versions of DGrok used NUnitLite for their unit tests, and therefore had to ship under the same license as NUnitLite: the OSL (Open Software License). I’ve never been happy about that. The world really doesn’t need yet another tiny variation on the GPL, especially when that variation isn’t GPL-compatible.

So for this release, I dumped NUnitLite and switched to NUnit. That let me drop the OSL and switch to an industry-standard open-source license, the GPL (GNU General Public License).

There are a few downsides. NUnit has tremendous overhead; on my laptop, it takes about fifteen seconds just to start the NUnit console runner and load the tests, plus the time to run them (which is also slower than under NUnitLite). It also adds an extra 321 KB to the download size. And now I have to clutter my test code with a bunch of stupid [Test] attributes.

If I think that’s a good trade, then apparently the OSL annoyed me more than I thought.

Reasonable defaults for {$IFOPT}

An annoyance in previous versions (even to me) was that, if you were parsing code that contained things like {$IFOPT C+}, you would have to switch to DGrok’s “Options” page to tell it which compiler settings it should consider to be “on” and which are “off”. If it hit an {$IFOPT} you hadn’t told it about, it would fail to parse that source file.

In 0.8.1, that’s no longer the case. DGrok knows about the default compiler options in a clean install of Delphi, and by default, it assumes you’re using those options. You can still use the Options page to override those settings one by one (e.g. if you compile with range checking on, and want DGrok to parse code inside your {$IFOPT R+} sections), but it’s no longer necessary to do it for every single option.

If anyone’s curious, here are the settings DGrok uses. I just opened Delphi (actually Turbo Delphi) and pressed Ctrl+O Ctrl+O, which prefixes the current file with all the compiler directives currently in effect. Then I did a bit of testing on the odd cases, like A and Z (which can have numbers in addition to + or -, and which do have numbers when inserted by Ctrl+O Ctrl+O). Here’s what I wound up with:

B-, C+, D+, E-, F-, G+, H+, I+, J-, K-, L+, M-, N+, O+, P+, Q-, R-, S-, T-, U-, V+, W-, X+, Y+, Z-

You may notice that A isn’t listed. A is an oddball case, in that it’s treated as neither on nor off. That is, {$IFOPT A+} and {$IFOPT A-} will both evaluate as “false”. There’s a compelling reason for that: it’s what Delphi does under the default settings! So don’t blame me; I’m just being compatible with the real Delphi compiler.
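
To make the rule concrete, here is a minimal sketch of the lookup in Python. It is only an illustration: DGrok itself is written in C#, and the names `parse_switches`, `eval_ifopt`, and `overrides` are made up for this example rather than taken from DGrok's code. The one behavior it is meant to show is that a switch missing from the table (such as A) makes both the + and - tests come out false.

```python
# Illustrative sketch only; not DGrok's actual (C#) implementation.
# Default switch states as listed above. "A" is deliberately absent.
DEFAULT_SWITCHES = (
    "B-, C+, D+, E-, F-, G+, H+, I+, J-, K-, L+, M-, N+, "
    "O+, P+, Q-, R-, S-, T-, U-, V+, W-, X+, Y+, Z-"
)


def parse_switches(text):
    """Turn 'B-, C+' into {'B': False, 'C': True}."""
    switches = {}
    for item in text.split(","):
        item = item.strip()
        switches[item[0].upper()] = item[1] == "+"
    return switches


DEFAULTS = parse_switches(DEFAULT_SWITCHES)


def eval_ifopt(arg, overrides=None):
    """Evaluate the argument of {$IFOPT ...}, e.g. 'C+' or 'R-'.

    `overrides` is a hypothetical stand-in for per-switch settings
    such as those chosen on an options page.
    """
    letter, sign = arg[0].upper(), arg[1]
    settings = dict(DEFAULTS)
    settings.update(overrides or {})
    if letter not in settings:
        # Neither on nor off: both {$IFOPT A+} and {$IFOPT A-} are false.
        return False
    return settings[letter] == (sign == "+")
```

With these defaults, `eval_ifopt("C+")` is true, `eval_ifopt("R+")` is false until you pass `overrides={"R": True}`, and both `eval_ifopt("A+")` and `eval_ifopt("A-")` are false.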

Multithreaded parser

When you use the DGrok demo app to parse a source tree, it now spins up multiple threads to do the parsing. There’s a setting on the “Options” tab to control how many threads you want it to use.

I actually implemented this a few months back, and since then, it’s occurred to me that I was making the problem too complicated — life would be simpler if I’d just used the ThreadPool, and queued a work item for each file I wanted to parse. Oh well; what’s there seems to work. I’ll probably do the thread-pool thing in the future, though.
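
For the curious, the "one work item per file" idea looks roughly like this. This is a Python sketch of the general pattern, not DGrok's code (which is C# and would use the .NET ThreadPool); `parse_file` is a placeholder for a real parser, and `parse_tree` and `max_workers` are names invented for the example.

```python
# Illustrative sketch only; not DGrok's actual (C#) implementation.
from concurrent.futures import ThreadPoolExecutor


def parse_file(path):
    """Placeholder 'parser' (an assumption): just counts lines in the file."""
    with open(path, encoding="utf-8", errors="replace") as source:
        return len(source.read().splitlines())


def parse_tree(paths, max_workers=4, parse=parse_file):
    """Queue one work item per file and collect a result (or error) per path."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; the pool decides which thread runs what.
        futures = {path: pool.submit(parse, path) for path in paths}
        for path, future in futures.items():
            try:
                results[path] = future.result()
            except Exception as error:
                # One bad file should not abort the whole run.
                results[path] = error
    return results
```

The appeal is that the pool handles all the scheduling, so the calling code never manages threads directly. (In CPython the global interpreter lock limits the speedup for CPU-bound parsing, so read this as the shape of the design rather than a benchmark.)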

Less memory usage when parsing twice

I’m embarrassed by this one. In previous versions, if you clicked “Parse” more than once in the same program run (e.g. if you were tweaking the “Options” to deal with {$IFOPT}s), DGrok would temporarily take twice as much memory as it needed to. That’s because I built the new list, and then stored it in the top-level variable… so the old list (stored in that same variable) was still “live” as far as the GC knew, up until the point when I overwrote its reference at the very end.

It’s better now — it nulls out the reference before it starts parsing, so the old list gets GCed as soon as the new parse run starts allocating gobs of memory. So if you regularly parse a million-line code base (like I do), you’ll notice significantly less thrashing.
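
Here is the pattern in miniature, as a Python sketch rather than DGrok's actual C# code; `ParseResults`, `ParseSession`, and `demo` are hypothetical names. The only line that matters is the one that clears the old reference before the new results are built.

```python
# Illustrative sketch only; not DGrok's actual (C#) implementation.
import weakref


class ParseResults:
    """Hypothetical container for the output of one parse run."""

    def __init__(self, nodes):
        self.nodes = nodes


class ParseSession:
    def __init__(self):
        self.results = None

    def reparse(self, build_results):
        # Drop the old results first, so they are no longer reachable
        # while the new (large) results are being allocated.
        self.results = None
        self.results = build_results()


def demo():
    """Return (old results gone during rebuild?, nodes of the new results)."""
    session = ParseSession()
    session.reparse(lambda: ParseResults(["first run"]))
    old = weakref.ref(session.results)
    observed = []

    def build():
        # CPython frees the old object as soon as its last reference goes
        # away; a tracing GC like .NET's would reclaim it at its next
        # collection instead.
        observed.append(old() is None)
        return ParseResults(["second run"])

    session.reparse(build)
    return observed[0], session.results.nodes
```

Without the `self.results = None` line, the old results would stay reachable through `session.results` for the whole rebuild, which is exactly the double-memory behavior described above.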

Copy tree results to clipboard

Pretty simple. There’s a “Copy” button under the tree that shows the parse results. This is mainly useful when you’ve used DGrok to search for, for example, all the with statements in your code, and now want to copy that list into Excel for easy sorting and printing.

Happy parsing!

6 Responses to “DGrok 0.8.1: multithreading, default options, GPL”

  1. DiGi Says:

    I found some grammar constructs that DGrok doesn’t know:


    // Incorrect error "Expected CaseSelector but was Caret"
    procedure TSomeForm.edSomeEditKeyPress(Sender: TObject; var Key: Char);
    begin
      case Key of
        ^C : // Yes, this is working. And it is used in some parts of VCL
          begin
            DoSomeCopyWork;
            Key := #0;
          end;
      end;
    end;

    The second is class types. The first line after a class type declaration is marked with “Expected EqualSign but found Identifier” or “Expected EqualSign but found ProcedureKeyword”.

    http://dn.codegear.com/article/34324

  2. Lex Y. Li Says:

    Hi, nice license change. Now I am going to see if my open source project can use DGrok to do something new. And this is also the first time I’ve noticed that DGrok is written in C#. Originally I thought it was written in Delphi for .NET.

    :-) Thanks for creating this library.

  3. Joe Says:

    DiGi:

    Yeah, I was aware of the ^C syntax, I just haven’t been motivated to do anything about it yet. It’s very context-sensitive — ^C in a case label means something totally different from ^C in a type section. So I can’t handle it at lex time like I do all the other tokens; I would have to rewrite the token nodes at parse time. Still, I trip over it myself every now and then (e.g. when I try to parse VCL units, or some third-party libraries), so I’ll probably address it eventually.

    Class type: good catch. It’s thinking “public” is the name of the next type in the type section, rather than the end of the type section and a return to the guts of the class. I’ve already got other rules being smart about that (e.g. “class var” does handle “public” correctly), so it should just be a matter of adding a test and making it pass. Thanks!

  4. Joe Says:

    Lex Y. Li:

    Cool — let me know how it goes. I’ll be interested to hear whether anyone other than me can figure out how to use the durn thing. (grin) Let me know if you have any questions about it.

    And no, I wouldn’t use Delphi for .NET for new development. I’ve ranted before about how bad a job CodeGear has done on Delphi for .NET lately, among the many other reasons to use C# (free command-line compiler, better language features, etc.)

  5. Dante Says:

    Hello,

    I’m a Brazilian Delphi programmer and I’ve been using DGrok (essentially the demo) to help me translate some parts of the software I’m working on right now. I couldn’t be happier with the results; it saves me a lot of time! Instead of analyzing 9000 lines of grepped code, I’m only looking at the 500 lines of code that really matter.

    I would like to contribute two visitors I created, one for finding raise statements and the other for method calls with literal strings. How do I submit them?

    Thanks for your work, it’s great!

  6. thomas Says:

    Hi Joe,

    thanks a lot for DGrok, it gives me a great start for a documentation and quality check tool in a really big Delphi project.
    I encountered two smaller problems with includes:
    Our project is spread out over several directories, but we use centralized includes for compiler settings. It would be nice if GetSourceTokensForInclude would scan the search path (like Delphi does) for the file. I was able to introduce this, but had to pass around codebaseOptions a lot. There is probably an easier way.

    Delphi supports {$I 'myfile.inc'} and {$I myfile.inc}, but DGrok fails on the first.

    Thanks a lot for your great work!



Joe White's Blog copyright © 2004-2008.