Comparing Arbitrary Exclusion Methods

Without a doubt, the most time-consuming aspect of using VBLM to localize your projects is getting it to extract only those strings that should be translated. The best way to exclude the strings that should not be translated is to use the exclusion rules, because each rule systematically gets rid of every string that meets its criteria. However, many strings will not be amenable to systematic exclusion, which brings us to the current topic: arbitrary exclusion. Arbitrary exclusion is simply the practice of (somehow) telling VBLM that you don't want to extract this or that particular instance of a string for translation.

When you right-click a string in the language table editor, a popup menu offers three ways to exclude it arbitrarily:

[Image: language table editor popup menu (image\LTE_POPUP_MENU_shg.gif)]

Each of these methods -- marking as excluded, excluding via list file, and excluding via directive comments in source code -- is described in detail elsewhere. Our purpose here is to discuss their relative strengths and weaknesses.

Marking as Excluded

Marking strings as excluded is quick and easy to do, and quick and easy to undo. Since strings marked as excluded drop to the bottom in the LTE, they're easy to see. And although these strings are not actually excluded, VBLM handles them as if they were.

The weaknesses of marking strings as excluded are twofold. First, they aren't really excluded -- they've been extracted and reside in the LMP file, cluttering it up. This is less of an issue now that, as of V6, VBLM can deal with projects of virtually unlimited size. However, a large number of strings marked as excluded can impose significant overhead on processing time, whether for loading and sorting the LTE, building a new project, or any other operation that works through the LMP file. It's not uncommon to need to exclude 1/4 to 1/3 of the strings VBLM initially extracts, and if you do this in a large project by marking thousands of strings as excluded, you're going to spend a lot more time than you need to waiting for VBLM to complete tasks.

Second, when you mark strings as excluded, the exclusions do not exist independently of the LMP file; i.e., if you select File/New and run VBLM against the VB project again, all those strings will be extracted again and will need to be marked again. VBLM allows you to address this problem by using LMX files to persist and transfer your exclusion markings between projects (see Export Options). The other two methods do not have this problem in the first place.

Exclusion List Files

Excluding strings by listing them in an SXL file has several strengths. It is just as easy to do as marking strings as excluded. The strings actually are excluded, making for a lean LMP file and no excess overhead. Further, the exclusions exist independently as a file; when you select File/New and run VBLM against the VB project again, all those strings will be excluded again, with no more effort on your part than clicking Yes when VBLM asks if you want to apply the file. Finally, if you configure VBLM to apply the exclusions loosely (see SXL options), a single exclusion can serve to exclude multiple instances.
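To make the last point concrete, here is a minimal Python sketch of how one list entry might exclude several instances. It is an illustration only, not VBLM's implementation and not the SXL file format. It assumes that strict application matches a string's text and its location, and that loose application matches the text alone; the real rules are described under SXL options. The names ExtractedString, is_excluded, and apply_exclusions, and the sample file names, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedString:
    """One string instance found in a project (hypothetical model)."""
    text: str
    source_file: str
    line: int


def is_excluded(candidate, exclusions, loose):
    """Return True if any list entry excludes this string instance.

    Assumption: strict mode requires text and location to match;
    loose mode requires only the text to match.
    """
    for entry in exclusions:
        if entry.text != candidate.text:
            continue
        same_place = (entry.source_file, entry.line) == (
            candidate.source_file,
            candidate.line,
        )
        if loose or same_place:
            return True
    return False


def apply_exclusions(extracted, exclusions, loose=False):
    """Keep only the strings that no list entry excludes."""
    return [s for s in extracted if not is_excluded(s, exclusions, loose)]


# Sample data: a font name appears twice and should not be translated.
extracted = [
    ExtractedString("Arial", "frmMain.frm", 12),
    ExtractedString("Arial", "frmAbout.frm", 40),
    ExtractedString("Save changes?", "frmMain.frm", 88),
]

# The list holds a single entry, recorded from the first instance.
exclusions = [ExtractedString("Arial", "frmMain.frm", 12)]

strict_result = apply_exclusions(extracted, exclusions, loose=False)
loose_result = apply_exclusions(extracted, exclusions, loose=True)

print(len(strict_result))              # strings left after strict matching
print([s.text for s in loose_result])  # strings left after loose matching
```

Under these assumptions, strict application removes only the instance the entry was recorded from, while loose application removes every instance with the same text, which is why one entry can stand in for many.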

On the other hand, while you can choose to apply or not apply the entire file, it is very difficult to undo single exclusions: you'd need to save the SXL file in text format and carefully edit it, which is not recommended. And while the extraction log will document which strings have been excluded by the SXL file, you can't view them as easily as strings marked as excluded, which simply drop to the bottom of the LTE; again, you need to save the file in text format, then view it with a file viewer or text editor.

Directive Comments in Source Code

I tend to think of directive comments in source code as the "gold standard" for excluding strings. The strings are actually excluded, with the attendant benefits. The exclusions exist independently of the LMP file; in fact, they are as durable as the source code itself (somebody could, after all, delete an SXL file). They are self-documenting, visible as you scroll through the code, rather than viewed separately in a list file or the LTE. They are relatively easy to do and undo; as of V6, you can even mark the strings in the LTE and have VBLM edit the source to insert the directive comments. Finally, once developers are familiar with VBLM, they can enforce exclusions in real time, as code is added and modified. This seems to me (given my biases as a developer) to be a recipe for high quality.

The weaknesses of source code directives are, once again, twofold. The first is a matter of principle: having to modify source code to support localization can be construed, in some respects anyway, as a failing of VBLM. Shouldn't a localization tool be able to do its job without touching the original source? That's always been a goal for VBLM.

On a more practical level, though, it's clear from talking to users that people doing localization sometimes don't have the access or authority to make changes to the source code. In fact, it's not uncommon (to the occasional distress of WhippleWare tech support) for VBLM users to have, at best, minimal programming knowledge, in which case such constraints make perfect sense.