[Rpm-maint] [PATCH] Add RPMTAG_IDENTITY calculation as tag extension

Vladimir D. Seleznev vseleznv at altlinux.org
Wed Apr 4 14:32:57 UTC 2018


On Wed, Apr 04, 2018 at 09:18:35AM -0400, Jeff Johnson wrote:
> 
> 
> > On Apr 4, 2018, at 5:47 AM, Panu Matilainen <pmatilai at redhat.com> wrote:
> > 
> > On 04/04/2018 11:01 AM, Panu Matilainen wrote:
> > [...]
> >>> +static int identityTag(Header h, rpmtd td, headerGetFlags hgflags)
> >>> +{
> >>> +    int rc = 1;
> >>> +    unsigned int bsize;
> >>> +    void * hb;
> >>> +    DIGEST_CTX ctx;
> >>> +    char * identity = NULL;
> >>> +
> >>> +    struct rpmtd_s ttd;
> >>> +    Header ih = headerNew();
> >>> +    HeaderIterator hi = headerInitIterator(h);
> >>> +
> > >>> +    while (headerNext(hi, &ttd) && rc) {
> > >>> +        if (rpmtdCount(&ttd) > 0) {
> > >>> +            if (!identityFilter(ttd.tag))
> > >>> +                rc = headerPut(ih, &ttd, HEADERPUT_DEFAULT);
> > >>> +        }
> > >>> +        rpmtdFreeData(&ttd);
> > >>> +    }
> > >>> +    headerFreeIterator(hi);
> > >>> +
> > >>> +    if (!rc)
> > >>> +        return 0;
> >>> +
> >>> +    hb = headerExport(ih, &bsize);
> > 
> > Forgot to bring this up on the first round: another question is
> > whether it actually makes sense to go through all this trouble of
> > copying the header, then exporting it and calculating the digest
> > from that.
> > 
> > It'd be considerably cheaper to calculate a digest on the go while
> > iterating over the data (from the immutable region, see my other
> > email). A newly imported header is guaranteed to be sorted (by tags)
> > so it's consistent.
> > 
> 
> There is no particular reason why the IDENTITY digest should need/use
> a header blob.
> 
> Any faithful transformation (e.g. using sprintf or hex strings) on the
> data for the set of tags can be used for an IDENTITY digest. The
> header blob implicitly includes padding, which can/will change
> depending on the definition ordering even when the tag data is
> identical/reproducible.

Ok.
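
A minimal sketch of the "digest on the go" idea, to make the suggestion
concrete. It is not rpm code: struct entry is a hypothetical stand-in for
what the header iterator returns, and FNV-1a is only a toy stand-in for a
real digest context. The point is that tag, type, count and data are fed
to the digest directly, in iteration order, so no header blob (and no
blob padding) is involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one header entry; this is NOT rpm's rpmtd. */
struct entry {
    uint32_t tag;
    uint32_t type;
    uint32_t count;
    const void *data;
    size_t len;                 /* size of data in bytes */
};

/* FNV-1a constants; a toy stand-in for a real digest such as SHA-256. */
#define FNV_INIT  0xcbf29ce484222325ULL
#define FNV_PRIME 0x100000001b3ULL

static uint64_t fnv_update(uint64_t h, const void *buf, size_t len)
{
    const unsigned char *p = buf;
    while (len--) {
        h ^= *p++;
        h *= FNV_PRIME;
    }
    return h;
}

/* Feed a 32-bit value in a fixed (big-endian) byte order. */
static uint64_t fnv_update_u32(uint64_t h, uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };
    return fnv_update(h, b, sizeof(b));
}

/* Digest the entries in the order given, without building a blob. */
static uint64_t identity_digest(const struct entry *e, size_t n)
{
    uint64_t h = FNV_INIT;
    for (size_t i = 0; i < n; i++) {
        h = fnv_update_u32(h, e[i].tag);
        h = fnv_update_u32(h, e[i].type);
        h = fnv_update_u32(h, e[i].count);
        h = fnv_update(h, e[i].data, e[i].len);
    }
    return h;
}
```

In the real tag extension the loop would presumably run over the header
iterator and skip filtered tags before updating the digest context.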

> The filtering should also be positive/inclusive rather than
> negative/exclusive. While a positive list of tags to include is harder
> to enumerate than the shorter list of tags to exclude, the IDENTITY
> tag set will then be closed and well known.

I still don't understand why the filtering should be inclusive rather
than exclusive. The only reason I can see is arbitrary tags, but the code
currently does not include tags beyond RPMTAG_FIRSTFREE_TAG.
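
For comparison, a small sketch of how the two filtering styles differ,
under one possible reading of the "closed and well known" argument. The
tag numbers and list contents are made up for illustration and are not
rpm's real values. The two filters agree on every tag known today and
differ only for a tag that is added later.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag numbers for illustration; not rpm's real values. */
enum { FTAG_NAME = 1, FTAG_VERSION = 2, FTAG_BUILDTIME = 3, FTAG_ADDED_LATER = 4 };

static const uint32_t excluded[] = { FTAG_BUILDTIME };
static const uint32_t included[] = { FTAG_NAME, FTAG_VERSION };

static int in_list(uint32_t tag, const uint32_t *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == tag)
            return 1;
    return 0;
}

/* Exclusive filter: keep everything that is not explicitly excluded. */
static int exclusive_keeps(uint32_t tag)
{
    return !in_list(tag, excluded, sizeof(excluded) / sizeof(excluded[0]));
}

/* Inclusive filter: keep only what is explicitly listed. */
static int inclusive_keeps(uint32_t tag)
{
    return in_list(tag, included, sizeof(included) / sizeof(included[0]));
}
```

With the exclusive list a tag introduced later enters the digest without
anyone deciding that it should; with the inclusive list it stays out
until it is added on purpose.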

> The simplest way to mark the positive/inclusive IDENTITY tag set is to
> change the awk script that generates the tag table to add a marker
> (like the array return marker) to identify which tags to include. 
> The members (and ordering) of the IDENTITY tag set might also need to
> be configurable without recompiling.

Maybe it would be easier to add a marker which says that the marked tag
should be filtered out of the IDENTITY calculation? But what is the best
way to do that? Should I add a new field to struct
headerTagTableEntry_s? Or am I missing something?
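
To show what I mean by a marker, here is a rough sketch with a made-up
table. The struct, the flag name and the tag numbers are all
hypothetical; this is not the layout of struct headerTagTableEntry_s,
only an illustration of a per-tag flag that the IDENTITY calculation
could consult.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-tag flag: leave this tag out of IDENTITY. */
#define MTAGFLAG_SKIP_IDENTITY (1u << 0)

/* Hypothetical table entry; NOT rpm's headerTagTableEntry_s. */
struct marked_tag {
    const char *name;
    uint32_t val;
    unsigned int flags;
};

/* Made-up tag numbers for illustration only. */
enum { MTAG_NAME = 1, MTAG_VERSION = 2, MTAG_BUILDTIME = 3 };

static const struct marked_tag marked_table[] = {
    { "NAME",      MTAG_NAME,      0 },
    { "VERSION",   MTAG_VERSION,   0 },
    { "BUILDTIME", MTAG_BUILDTIME, MTAGFLAG_SKIP_IDENTITY },
};

/* Return 1 if the table marks the tag as excluded from IDENTITY. */
static int identity_skips(uint32_t tag)
{
    size_t n = sizeof(marked_table) / sizeof(marked_table[0]);
    for (size_t i = 0; i < n; i++)
        if (marked_table[i].val == tag)
            return (marked_table[i].flags & MTAGFLAG_SKIP_IDENTITY) != 0;
    return 0;   /* tags missing from the table are not skipped here */
}
```

The same flag could just as well mean "include" instead of "skip" if the
inclusive approach is preferred.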

> But overall, dynamically generating the IDENTITY tag set with a tag
> extension can be deployed (and back ported and maintained) far more
> easily than changing header code.
> 
> Nice job!

-- 
   With best regards,
   Vladimir D. Seleznev

