[Rpm-maint] [PATCH] rpmdbCountPackagesArch

Thomas Fitzsimmons fitzsim at redhat.com
Wed Jul 2 13:51:24 UTC 2008


Pixel wrote:
> Thomas Fitzsimmons <fitzsim at redhat.com> writes:
> 
> [...]
> 
>>> In the typical case, rpmtsRun() is already doing 2 iterations on
>>> "Name" doing headerLoad. And there are a few more.
>>>
>>> IMO if we look at optimisations, there are many things that should be
>>> done before this. For example, it would be far more effective to
>>> introduce a "Packages-light" with only the more interesting piece of
>>> information (EVRA and maybe a few more). This would make "rpm -qa"
>>> much much more faster/lighter (I/O accesses on Packages are really
>>> killing it).
>>>
>>> So i would say, implement rpmdbCountPackagesArch() using
>>> rpmdbSetIteratorRE() for now.
>> I've attached tests that benchmark name-only lookups versus
>> rpmdbSetIteratorRE-based name-arch lookups.  On my Fedora 9 32-bit x86
>> machine:
>>
>> [...]
>>
>> So over many transactions, using rpmdbSetIteratorRE is much slower
>> than using the Namearch index.
> 
> sure it's much slower. 
> 
> But afaik this rpmdbCountPackagesArch() is to be used for scriptlets.

Mostly, yes.  It will also be exposed as a new rpmdb function though, so I was 
trying to make rpmdbCountPackagesArch meet people's performance expectations for 
rpmdbCountPackages.

> So my question is how much slower it is in a package install. I would
> say it is barely noticeable.
> 
> i've tried adding the "slow" rpmdbCountPackagesArch() in transaction.c
> after the RPMPROB_FILTER_OLDPACKAGE and RPMPROB_FILTER_REPLACEPKG
> checks.
> 
> i get:
> 
> *** time 534 usecs (checking a newer version is not installed)
> *** time 169 usecs (checking a same version is not installed)
> *** time 148 usecs (counting same arch)
> 
> ie the first one is quite costly, then it's quite the same.
> 
> i could not see the speed difference with or without the added
> rpmdbCountPackagesArch() call.
> 
> (nb, test was: installing pkg "a-3" where "a-1" and "a-2" are installed)

Yes, I ran several use case benchmarks on my previous rpmdbSetIteratorRE-based 
patch and there was no noticeable speed difference.  So for rpm's usage patterns 
it doesn't matter if rpmdbCountPackagesArch is much slower than rpmdbCountPackages.

> 
> 
>>      rpmdbSetIteratorRE (mi, RPMTAG_ARCH, RPMMIRE_DEFAULT, "i686");
> 
> RPMMIRE_DEFAULT is "regex with \., .* and ^...$ added"
> using RPMMIRE_STRCMP instead gives you back 25% speed.

Nice.
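
To make the difference concrete, here is a minimal standalone sketch of the 
two matching strategies. It is not rpm's code and uses no rpm API. The helper 
names count_arch_strcmp and count_arch_regex are hypothetical, and the regex 
variant only adds the ^...$ anchors (it does not escape dots or translate 
globs the way RPMMIRE_DEFAULT is described above).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count entries equal to `arch` with a plain
 * string compare, the strategy RPMMIRE_STRCMP stands for. */
static size_t count_arch_strcmp(const char *const *archs, size_t n,
                                const char *arch)
{
    size_t count = 0;

    assert(archs != NULL && arch != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(archs[i], arch) == 0)
            count++;
    }
    return count;
}

/* Hypothetical helper: count entries matching an anchored POSIX regex
 * built from `arch`. Simplification: only "^...$" is added; regex
 * metacharacters in `arch` are NOT escaped. Returns (size_t)-1 if the
 * pattern does not fit the buffer or fails to compile. */
static size_t count_arch_regex(const char *const *archs, size_t n,
                               const char *arch)
{
    char pattern[64];
    regex_t re;
    size_t count = 0;
    int len;

    assert(archs != NULL && arch != NULL);

    len = snprintf(pattern, sizeof pattern, "^%s$", arch);
    if (len < 0 || (size_t)len >= sizeof pattern)
        return (size_t)-1;

    /* The pattern is compiled once, then executed per candidate. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return (size_t)-1;

    for (size_t i = 0; i < n; i++) {
        if (regexec(&re, archs[i], 0, NULL, 0) == 0)
            count++;
    }
    regfree(&re);
    return count;
}
```

For a literal arch such as "i686" both helpers return the same count, so the 
regex machinery buys nothing for this lookup. That is presumably why switching 
to RPMMIRE_STRCMP recovers some of the time.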

How are database rebuilds (rpm --rebuilddb) handled during deployment?  If the 
process is cumbersome or error-prone then maybe a slower rpmdbCountPackagesArch 
is a worthwhile trade-off for avoiding an RPM database rebuild.

Tom



