[Rpm-maint] rpmstrpool.c way too slow
Alexey Tourbin
alexey.tourbin at gmail.com
Wed Jan 30 00:08:02 UTC 2013
Hello,
My previous patches were made by hacking on rpmio/ and injecting new
code via LD_LIBRARY_PATH=$PWD/rpmio/.libs. Thus previous profiling
results were obtained using system librpm.so library, which comes with
rpm-libs-4.10.2-1.fc18.x86_64. Today I made new "full" profiling of
rpm specfile parser - that is, without using system rpm-libs - and the
result changed dramatically.
Here is the exact profiling command I used:
$ LD_LIBRARY_PATH=$(echo $PWD/*/.libs |sed 's| |:|g') valgrind
--tool=callgrind --dump-instr=yes ./.libs/rpmspec -P
../fc18-specs/texlive.spec >/dev/null
Here are the top20 routines from callgrind_annotate(1) output:
57,420,754,719 PROGRAM TOTALS
10,035,890,516 rpmio/rpmstrpool.c:rpmstrPoolId
5,874,365,680 rpmio/rpmstrpool.c:rstrhash
3,062,539,970 glibc-2.16-75f0d304/stdlib/bsearch.c:bsearch
3,029,616,175 lib/header.c:copyTdEntry
2,824,229,185 rpmio/rpmstrpool.c:rpmstrPoolGet
2,720,364,090
glibc-2.16-75f0d304/string/../sysdeps/x86_64/strcmp.S:__GI_strcmp
2,613,824,638 rpmio/rpmstrpool.c:poolHashAddHEntry
2,594,167,218
glibc-2.16-75f0d304/string/../sysdeps/x86_64/strcmp.S:__GI_strncmp
1,828,340,892
glibc-2.16-75f0d304/string/../sysdeps/x86_64/memset.S:__GI_memset
1,616,755,970 rpmio/rpmstrpool.c:rpmstrPoolStr
1,611,388,096 lib/header.c:intGetTdEntry
1,538,230,858 rpmio/rpmstrpool.c:rpmstrPoolPut
1,387,732,877 lib/header.c:headerGet
1,251,399,444 rpmio/rpmstrpool.c:poolHashFree.part.0
1,244,486,470
glibc-2.16-75f0d304/string/../sysdeps/x86_64/rawmemchr.S:__GI___rawmemchr
1,227,687,377 lib/header.c:findEntry
978,801,050 lib/headerutil.c:headerGetString
968,938,071
glibc-2.16-75f0d304/string/../sysdeps/x86_64/memcpy.S:__GI_memcpy
785,113,238 lib/rpmds.c:rpmdsNIndex
749,893,987 lib/rpmds.c:rpmdsNext
With system librpm.so, bsearch was the most expensive routine, and now
it is only #3, lagging behind by a great margin. Actually the new
rpmstrpool.c code squanders as much as 45% of the PROGRAM TOTALS.
$ Sum() { perl -MList::Util=sum -ln0 -e 'print sum(split)'; }
$ callgrind_annotate |fgrep rpmstrpool.c |sed 's/,//g' |Sum
25754696291
$ callgrind_annotate |fgrep TOTAL |sed 's/,//g'
57420754719 PROGRAM TOTALS
$ perl -le 'print 25754696291/57420754719'
0.448525910483688
$
So I'm a little bit lost here. I need faster specfile parsing, because
faster parsing would make it possible to run extensive tests more
often (e.g. processing all Fedora specfiles). However, with
rpmstrpool.c, any possibility of faster specfile parsing become very
remote.
More information about the Rpm-maint
mailing list