[Rpm-maint] RFC: file verification API

Panu Matilainen pmatilai at redhat.com
Thu Feb 21 10:13:53 UTC 2008

One of the first items on the rpm.org TODO list last May was an API for 
package verification; it might be time to actually do something about it...

There is already an API call to do file verification: rpmVerifyFile(). The 
problem with that is that you only get the bits of what differs, but what 
if you want to look at the actual values? Eg "/bin/foo owner mismatch, 
should be root but is joedoe" type of reporting. Sure it's just a stat() 
away but rpmVerifyFile() already called it, kinda silly to have to do it 
again... and for file checksumming it's not so simple as rpm does prelink 
undo automatically (and in any case it's not exactly a cheap operation). 
So basically we'd like some way of storing the results of what was read 
from disk on verification.

The idea I've been playing with is to use the rpmfi structure for that: 
rpmfi already has all the necessary methods for retrieving file ownership, 
timestamps etc, so we wouldn't have to invent and implement yet another 
"object" with methods for storing and retrieving the bits, usernames 
etc.

Yesterday I got around to experimenting a bit with it, to the point of a 
"proof-of-concept" implementation. By using rpmfi for storage of from-disk 
data, verification now looks like this:

rpmfi fi = rpmfiNew(ts, hdr, RPMTAG_BASENAMES, 1);
rpmfi diskfi = rpmfiNewFromDisk(ts, hdr, RPMTAG_BASENAMES, verifyflags);

/* walk both iterators in lockstep: header values vs. on-disk values */
while ((rpmfiNext(fi) >= 0) && (rpmfiNext(diskfi) >= 0)) {
     if (rpmfiFMtime(fi) != rpmfiFMtime(diskfi))
         verifyResult |= RPMVERIFY_MTIME;
     ... /* other checks */
}


Implementing custom verification procedures that print out actual value 
differences or whatever is pretty trivial this way, as is doing an 
rpmVerifyFile() type operation that just raises verify-failed bits 
(like the above does) if you don't care about the actual values.
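
To make the "print out actual values" case concrete, here is a minimal sketch of such a custom procedure. It deliberately does not use rpm at all: struct file_rec, verify_rec() and the SKETCH_VERIFY_* bits are hypothetical stand-ins for the per-file values the two rpmfi iterators would hand out, and the bit values are made up rather than taken from rpm's RPMVERIFY_* flags.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the per-file values read from an rpmfi. */
struct file_rec {
    const char *path;
    const char *user;   /* owner name */
    time_t mtime;
};

/* Illustrative bits only; rpm's real RPMVERIFY_* values differ. */
enum {
    SKETCH_VERIFY_USER  = 1 << 0,
    SKETCH_VERIFY_MTIME = 1 << 1
};

/*
 * Compare header values against on-disk values. Returns the mismatch
 * bits and writes a human-readable report (possibly truncated) to buf.
 */
unsigned verify_rec(const struct file_rec *hdr, const struct file_rec *disk,
                    char *buf, size_t buflen)
{
    unsigned result = 0;
    int n = 0;

    if (buflen > 0)
        buf[0] = '\0';

    if (strcmp(hdr->user, disk->user) != 0) {
        result |= SKETCH_VERIFY_USER;
        n = snprintf(buf, buflen,
                     "%s owner mismatch, should be %s but is %s\n",
                     hdr->path, hdr->user, disk->user);
    }
    if (hdr->mtime != disk->mtime) {
        result |= SKETCH_VERIFY_MTIME;
        /* append only if the first message left room in the buffer */
        if (n >= 0 && (size_t)n < buflen)
            snprintf(buf + n, buflen - (size_t)n,
                     "%s mtime mismatch, should be %lld but is %lld\n",
                     hdr->path, (long long)hdr->mtime,
                     (long long)disk->mtime);
    }
    return result;
}
```

A caller that only wants the bits can ignore the buffer contents, which is the rpmVerifyFile() style of use; a caller that wants the report prints buf.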

One not-so-nice thing with this approach is that there's no way to verify 
individual files: rpmfiNewFromDisk() works on a header at a time. Of course 
you can ignore the files you don't care about when iterating over them, 
but you'll pay the penalty of md5summing everything. How big a deal that 
is in reality... dunno; from the cli, rpm has only ever supported 
header-at-a-time verification. The other issue is that since the reading 
from disk and comparison are detached, things like lstat failure reasons
are lost (whereas currently you get "permission denied" and such), but 
that could probably be dealt with by just adding an extra entry to rpmfi 
to store errno values (which would be empty for a "normal" rpmfi).
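
As a rough illustration of the errno idea, the sketch below keeps the lstat() failure reason next to the data that was read, so a later comparison pass can still report "permission denied" and the like. struct disk_rec and disk_rec_read() are hypothetical names for this example only; this is not how rpmfi is laid out, just the shape of the extra per-file entry.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() under strict compiler modes */
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical per-file record; not an actual rpmfi layout. */
struct disk_rec {
    struct stat st;  /* meaningful only when err == 0 */
    int err;         /* errno from lstat(), 0 on success */
};

/* Read one file's metadata, remembering why it failed if it did. */
void disk_rec_read(struct disk_rec *rec, const char *path)
{
    memset(rec, 0, sizeof(*rec));
    if (lstat(path, &rec->st) != 0)
        rec->err = errno;
}
```

The comparison step would then check err first and use strerror(err) for the message instead of comparing stale zeroed values.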

Comments? Any gaping showstopper holes in the idea that I'm too blind to 
see? For the curious, the draft-implementation of rpmfiNewFromDisk() is 
here http://laiskiainen.org/rpm/patches/rpm-rpmfi-from-disk-1.patch

The alternative to hijacking rpmfi (which seems very natural for the 
purpose) would be implementing some other means of storing and accessing 
the verification results, a fair bit of mostly tedious work most likely.

Then there are of course the other parts of package verification: 
dependencies and verify-scripts. Those would need some sort of API too, I 
suppose... ideas welcome.

 	- Panu -
