[Rpm-maint] Rpm Database musings

Panu Matilainen pmatilai at laiskiainen.org
Mon Mar 4 16:08:29 UTC 2013


On 03/04/2013 12:21 PM, Michael Schroeder wrote:
> On Sun, Mar 03, 2013 at 05:46:10PM +0200, Panu Matilainen wrote:
>> Right, in this context compression does indeed seem quite attractive. When
>> we talked about this in the devconf, I was thinking about the way rpm
>> itself currently keeps (re)loading the headers from Packages and adding
>> repeated decompression to the other costs of header loading didn't seem
>> like a way to make it faster. But for roughly halving the amount of io
>> needed for scanning through it exactly once (which is of course the way
>> libsolv operates) it's quite a different thing.
>>
>> 0.2s is not a whole lot, for many operations absolutely nothing really, but
>> I'd think some kind of cache would be in order to avoid having to read
>> through all of packages just for those simple 'rpm -qf /foo' kind of
>> queries.
>
> Yes, I think it's too much. An rpm query call should be pretty
> instantaneous. Plus, reading all headers will kill 10 MBytes precious
> block buffer cache.
>
>> Such as, store the in-memory index structures into a memory mapped
>> cache file. The cache could perhaps be write-once and read-only for other
>> uses so there's no need for locking within the cache: eg recreate it from
>> scratch at the end of transactions and atomically replace the old one so
>> the cache itself is always coherent.
>
> Yes, but that's pretty much the way rpm currently works. (The "but"
> is not meant negative, we don't need to change everything just
> for change's sake.)

Not really, perhaps I should elaborate a bit... What I mean is a custom
(perhaps mmap'ed) cache file whose structure etc. we have full control
over and which contains not just indexes but actual data as well, so
that the most critical / commonly used data can be retrieved without
going to the actual headers (falling back to loading headers for the
uncached data).
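
To make the idea a bit more concrete, here is a rough sketch in Python
(not rpm code, and not a proposal for an actual format). The file
layout, the names write_cache() and lookup(), and the load_from_headers
callback are all made up for illustration. It only shows the general
shape: the cache is written once to a temporary file, atomically renamed
over the old one, and readers mmap it read-only and fall back to the
headers on a miss.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk layout (an assumption for this sketch only):
#   u32 entry count, then a sorted index of fixed-size entries,
#   then the raw key and value bytes.
HEADER = struct.Struct("<I")
ENTRY = struct.Struct("<IIII")  # key offset, key length, value offset, value length


def write_cache(path, table):
    """Build a fresh cache from a dict of bytes -> bytes and swap it in atomically."""
    items = sorted(table.items())
    data_start = HEADER.size + ENTRY.size * len(items)
    index = bytearray()
    blob = bytearray()
    for key, value in items:
        key_off = data_start + len(blob)
        blob += key
        val_off = data_start + len(blob)
        blob += value
        index += ENTRY.pack(key_off, len(key), val_off, len(value))

    # Write to a temporary file in the same directory so the rename
    # stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(HEADER.pack(len(items)))
            f.write(index)
            f.write(blob)
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the new one, never a mix,
        # so the cache itself needs no locking.
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise


def lookup(path, key, load_from_headers):
    """Binary-search the mmap'ed cache; fall back to the headers on a miss."""
    try:
        f = open(path, "rb")
    except FileNotFoundError:
        return load_from_headers(key)
    with f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        (count,) = HEADER.unpack_from(m, 0)
        lo, hi = 0, count
        while lo < hi:
            mid = (lo + hi) // 2
            koff, klen, voff, vlen = ENTRY.unpack_from(
                m, HEADER.size + mid * ENTRY.size)
            candidate = m[koff:koff + klen]
            if candidate == key:
                return m[voff:voff + vlen]
            if candidate < key:
                lo = mid + 1
            else:
                hi = mid
    return load_from_headers(key)
```

Something like lookup(path, b"/foo", slow_header_scan) would then answer
an 'rpm -qf /foo' style query from the cache when the entry is there,
and only touch the headers otherwise.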

>
>> Or something... this isn't that far
>> from libsolv's .solv files.
>>
>> Speaking of which... a funny little idea I got at the end of the devconf:
>> regardless of future rpmdb format changes, it should be now possible to
>> write an rpm plugin that creates + updates a .solv file for the rpmdb, so
>> you should never have to actually read through the entire rpmdb in libsolv
>> and its users like libzypp, dnf etc.
>
> Actually libsolv can do an "incremental" update if it has an old
> solv file available, i.e. it takes the unchanged content from the
> old solv file and only queries new headers from the rpm database.

Yup, I seem to recall this being the case. Updating / creating the .solv
file from inside rpm probably wouldn't be a huge win, but it'd ensure
the .solv is always up-to-date even when somebody does a direct 'rpm
-U/-e', and the cost of doing that would probably just get lost in the
noise of all the other costs of rpm transactions. Anyway, it's just an
idea; whether it would actually be worth the trouble, I dunno.
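
For reference, the incremental update Michael describes boils down to
something like the sketch below. This is plain Python and not libsolv's
or rpm's API: incremental_update(), the dict-based cache and the
load_header callback are hypothetical, and it assumes that a header id
is never reused for different content.

```python
def incremental_update(old_cache, current_ids, load_header):
    """Return a new cache, reusing old entries and loading only new headers.

    old_cache   -- dict mapping header id -> cached record
    current_ids -- ids of the headers currently in the database
    load_header -- callback that reads one header (the expensive part)
    """
    new_cache = {}
    for header_id in current_ids:
        if header_id in old_cache:
            # Unchanged package: take the record from the old cache.
            new_cache[header_id] = old_cache[header_id]
        else:
            # Newly installed package: query the database.
            new_cache[header_id] = load_header(header_id)
    # Ids present only in old_cache belong to erased packages and are
    # dropped simply by not being copied.
    return new_cache
```

Run at the end of a transaction, this would only need to read the
headers that the transaction itself added.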

	- Panu -

