[Rpm-maint] [PATCH 4/5] Optimize readLine routine
Alexey Tourbin
alexey.tourbin at gmail.com
Thu Feb 7 04:36:42 UTC 2013
On Tue, Feb 05, 2013 at 04:31:38AM +0000, Alexey Tourbin wrote:
> This change introduces the following optimizations:
> - Line chunking used to be done with per-character loop; it now uses
> strchr(line, '\n') call.
> - Likewise, the "balancing check" has been enhanced with strpbrk(3).
>
> Callgrind annotations for 'rpmspec -P glibc.spec', previous commit:
> 94,796,525 PROGRAM TOTALS
> 33,242,383 rpmio/argv.c:argvCount
> 9,886,925 build/parseSpec.c:readLine
> 3,964,158 glibc-2.16-75f0d304/malloc/malloc.c:_int_malloc
> ...
>
> Callgrind annotations for 'rpmspec -P glibc.spec', this commit:
> 88,013,018 PROGRAM TOTALS
> 33,242,383 rpmio/argv.c:argvCount
> 3,964,158 glibc-2.16-75f0d304/malloc/malloc.c:_int_malloc
> ...
Update: on the second thought, it occurred to me that two more
optimizations related to the "balancing check" can be implemented:
- Note that the check actually does two different things: it tracks
line concatenation with trailing backslash; and it further checks
for balanced brackets and parens, but only within a macro substitution.
Unless a macro has been started, the check for brackets/parens can be
skipped. Likewise, unless a trailing backslash has been seen, there
is no need to search for a newline. Thus, strpbrk() can be first
called with a reduced set of characters: only "\\" and "%" have to be
tracked down. When a trailing backslash is found, or when a macro
substitution is started which needs a closing bracket/paren, the set
of character is changed to a full set of "\\\n%{}()".
- The balancing check can also be used to find out whether the
rpmExpandMacros() call is needed at all. Note that, on average, only
about 30% of specfile lines have "%" character in them (among the
remaining 70% are empty lines with trailing "\n", which nontheless
still have to be processed). Thus most of the rpmExpandMacros() calls
and the associated buffer allocation/copying can be skipped altogether.
Callgrind annotations for 'rpmspec -P glibc.spec', updated commit:
80,831,100 PROGRAM TOTALS
33,244,684 rpmio/argv.c:argvCount
3,244,420 glibc-2.16-75f0d304/string/../sysdeps/x86_64/strcmp.S:__GI_strncmp
...
Update2: it is not clear whether the backslash in "\%{foo}" should
somehow "escape" the balancing check, as it does now. Anyway, this case
is still subject for macro expansion.
(So I updatd the change on github, here's the relevant part:)
--- a/build/parseSpec.c
+++ b/build/parseSpec.c
@@ -322,21 +312,25 @@ static int copyNextLineFromOFI(rpmSpec spec, OFI_t *ofi)
/* Check if we need another line before expanding the buffer. */
int pc = 0, bc = 0, nc = 0;
- for (const char *p = spec->lbuf;
- (p = strpbrk(p, "\\\n%{}()")) != NULL; p++)
+ /* Also check if we need to expand macros */
+ int needExpand = 0;
+ for (const char *p = spec->lbuf, *set = "\\%";
+ (p = strpbrk(p, set)) != NULL; p++)
switch (*p) {
case '\\':
switch (*(p+1)) {
- case '\n': p++, nc = 1; break;
+ case '\n': p++, nc = 1; set = "\\\n%{}()"; break;
case '\0': break;
+ case '%': needExpand = 1; break;
default: p++; break;
}
break;
case '\n': nc = 0; break;
case '%':
+ needExpand = 1;
switch (*(p+1)) {
- case '{': p++, bc++; break;
- case '(': p++, pc++; break;
+ case '{': p++, bc++; set = "\\\n%{}()"; break;
+ case '(': p++, pc++; set = "\\\n%{}()"; break;
case '%': p++; break;
}
break;
@@ -354,7 +348,7 @@ static int copyNextLineFromOFI(rpmSpec spec, OFI_t *ofi)
spec->lbufOff = 0;
/* Don't expand macros (eg. %define) in false branch of %if clause */
- if (spec->readStack->reading) {
+ if (needExpand && spec->readStack->reading) {
char *exp = rpmExpandMacros(spec->macros, spec->lbuf,
basename(ofi->fileName), ofi->lineNum,
undefined, spec);
More information about the Rpm-maint
mailing list