[Rpm-maint] Rpm Database musings

Michael Schroeder mls at suse.de
Fri Mar 1 16:32:19 UTC 2013


Hi Panu et al,

here are some numbers/musings about changing the database
implementation to just one single packages file:

- I assume that we still want to store all the headers (in some
  format) anyway.

- I checked all the headers of the i586/noarch packages from FC18
  to get an idea of how big they are and whether it makes
  sense to compress them. Here's the result:

    scanned: 28423 rpms (header sizes in bytes)
    uncompressed: sum: 777290960, avg: 27348, median: 10600
    lzo:          sum: 305711769, avg: 10756, median:  4805
    gzip:         sum: 255995670, avg:  9007, median:  4154
    xz:           sum: 215564872, avg:  7585, median:  3728

  (The median is much smaller than the avg, which means that
  a few packages have very big headers.)

  As you can see, compression roughly halves the size of the headers
  or better (going by the sums, LZO leaves about 39%, xz about 28%).
  LZO seems to be "good enough" and has the advantage that it's
  really fast.

- That means, if I have 2000 packages installed on my system
  (which is about the real number), the concatenated headers will
  use about 20 MByte uncompressed (using the median), about
  10 MByte with LZO compression, and about 7.5 MByte with xz.

- So if we want to drop all index files and just scan the
  packages database, we would need (assuming disk IO throughput
  of 50 MByte/s and the 10 MByte LZO case) about 0.2 seconds to
  read the data and create the in-memory index. Which maybe is
  too much, I dunno.
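
  As a rough cross-check, the arithmetic behind these estimates can
  be sketched in Python. This is only an illustration and not part
  of rpm: the function name estimate() is made up, "50 M/s" is taken
  to mean 50 * 10^6 bytes per second, and decompression and index
  building time are ignored.

```python
# Figures copied from the FC18 header scan above (sizes in bytes).
STATS = {
    "uncompressed": {"sum": 777290960, "median": 10600},
    "lzo":          {"sum": 305711769, "median": 4805},
    "gzip":         {"sum": 255995670, "median": 4154},
    "xz":           {"sum": 215564872, "median": 3728},
}

INSTALLED = 2000           # packages on the example system
THROUGHPUT = 50 * 10**6    # assumed disk throughput in bytes/s

def estimate(fmt, packages=INSTALLED, throughput=THROUGHPUT):
    """Return (compression ratio, database size in bytes, scan time in s).

    The ratio is taken over the summed header sizes; the database size
    uses the median header size, as in the estimate above. Only raw read
    time is modelled, not decompression or index construction.
    """
    entry = STATS[fmt]
    ratio = entry["sum"] / STATS["uncompressed"]["sum"]
    db_bytes = packages * entry["median"]
    return ratio, db_bytes, db_bytes / throughput

if __name__ == "__main__":
    for fmt in STATS:
        ratio, db_bytes, secs = estimate(fmt)
        print(f"{fmt:13s} ratio {ratio:4.2f}  "
              f"db {db_bytes / 1e6:5.1f} MB  scan {secs:4.2f} s")
```

  With these inputs the LZO case comes to about 9.6 MByte and 0.19
  seconds of pure read time, and the uncompressed case to about
  21 MByte and 0.42 seconds.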

Cheers,
  Michael.

-- 
Michael Schroeder                                   mls at suse.de
SUSE LINUX Products GmbH,  GF Jeff Hawn, HRB 16746 AG Nuernberg
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}

