[Rpm-maint] Rpm Database musings

Thierry Vignaud thierry.vignaud at gmail.com
Sat Mar 9 13:19:34 UTC 2013


On 7 March 2013 21:28, Panu Matilainen <pmatilai at laiskiainen.org> wrote:
> I wouldn't worry too much about hash algorithms and storage optimization at
> this point: that's something that can be tweaked and tuned over time as long
> as the cache structure is internally versioned so we know when we need to
> rebuild it.
>
> Right now I'm more interested in what the overall design of this all might
> look like. Like said, I'd like to see the cache be a "read-only media" so
> there are zero locking needed for queries that only need data from the
> cache. It'll undoubtedly penalize writers (ie transactions) as the entire
> cache probably needs to be regenerated even if just one package is
> installed/removed, but then we're not in the "millions of transactions per
> second" database business at all, in rpm's case painless (say, without
> having a library steal your signals and blow up in all directions if you
> miss a single iterator free, etc) and fast reads are what really counts I
> think.

Please don't be yum-centric :-)
Whereas yum and rpm perform only one transaction, URPM/urpmi (and thus
rpmdrake as well as drakx, the Mageia ISO installer) usually splits the work
into several small transactions:
- sort packages by deps
- split them into small transactions
- for each transaction:
  o download (if needed) only the packages needed by this transaction
  o verify those packages
  o install those packages
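
To make the steps above concrete, here is a rough Python sketch of that loop.
It is not the urpmi code (which is Perl): the function names, the `deps`
mapping, and the fixed-size batching are assumptions made for illustration,
and the download/verify/install steps are callbacks supplied by the caller.
It also assumes the dependency graph has no cycles, which real package sets
do not guarantee.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def sort_by_deps(deps):
    """Order packages so that each one comes after the packages it requires.

    `deps` maps a package name to the set of package names it requires.
    Raises graphlib.CycleError if the graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


def split_transactions(ordered, max_size):
    """Cut an already ordered package list into batches of at most max_size."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [ordered[i:i + max_size] for i in range(0, len(ordered), max_size)]


def main_loop(deps, max_size, download, verify, install):
    """Run download, verify and install once per small transaction."""
    batches = split_transactions(sort_by_deps(deps), max_size)
    for batch in batches:
        download(batch)  # fetch only what this transaction needs
        verify(batch)    # check those packages
        install(batch)   # one rpm transaction per batch
    return batches
```

Naive fixed-size batching like this can separate packages that must be
installed together, so a real implementation needs smarter grouping; the
point here is only that every batch is a separate transaction.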

See http://search.cpan.org/~tvignaud/urpmi-7.19/urpm/main_loop.pm

In my experience, this makes urpmi more robust than yum.

If we upgrade a whole system with lots of packages (say, a couple of thousand),
urpmi may well perform ~100-200 rpm transactions.

A much more expensive per-transaction overhead would
thus be a huge penalty for urpmi.
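
As a back-of-the-envelope illustration of why this matters, assume the cache
is fully regenerated once per transaction. The 0.5 second rebuild cost below
is a made-up placeholder, not a measurement; only the transaction counts
come from the figures above.

```python
def total_rebuild_overhead(n_transactions, rebuild_seconds):
    """Total time spent regenerating the cache, one rebuild per transaction."""
    return n_transactions * rebuild_seconds


# Hypothetical rebuild cost in seconds; replace with a measured value.
REBUILD_SECONDS = 0.5

for n in (1, 100, 200):
    overhead = total_rebuild_overhead(n, REBUILD_SECONDS)
    print(f"{n:>3} transaction(s): {overhead:.1f} s of cache rebuilds")
```

Whatever the real per-rebuild cost turns out to be, urpmi pays it 100-200
times where a single-transaction tool pays it once.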

