[Rpm-maint] [PATCH] Add RPMTAG_IDENTITY calculation as tag extension

Wed Apr 4 13:18:35 UTC 2018

> On Apr 4, 2018, at 5:47 AM, Panu Matilainen <pmatilai at redhat.com> wrote:
> 
> On 04/04/2018 11:01 AM, Panu Matilainen wrote:
> [...]
>>> +static int identityTag(Header h, rpmtd td, headerGetFlags hgflags)
>>> +{
>>> +    int rc = 1;
>>> +    unsigned int bsize;
>>> +    void * hb;
>>> +    DIGEST_CTX ctx;
>>> +    char * identity = NULL;
>>> +
>>> +    struct rpmtd_s ttd;
>>> +    Header ih = headerNew();
>>> +    HeaderIterator hi = headerInitIterator(h);
>>> +
>>> +    while (headerNext(hi, &ttd) && rc) {
>>> +        if (rpmtdCount(&ttd) > 0) {
>>> +            if (!identityFilter(ttd.tag))
>>> +                rc = headerPut(ih, &ttd, HEADERPUT_DEFAULT);
>>> +        }
>>> +        rpmtdFreeData(&ttd);
>>> +    }
>>> +    headerFreeIterator(hi);
>>> +
>>> +    if (!rc)
>>> +        return 0;
>>> +
>>> +    hb = headerExport(ih, &bsize);
> 
> Forgot to bring this up on the first round: another question is whether it actually makes sense to go through all this trouble of copying the header, then exporting it and calculating the digest from that.
> 
> It'd be considerably cheaper to calculate a digest on the go while iterating over the data (from the immutable region, see my other email). A newly imported header is guaranteed to be sorted (by tags) so it's consistent.
> 

There is no particular reason why the IDENTITY digest should need/use a header blob.

Any faithful transformation (e.g. using sprintf or hex strings) of the data for the chosen set of tags can be used for an IDENTITY digest. A header blob implicitly includes padding, which can change depending on the ordering of the entries even when the tag data itself is identical and reproducible.
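
As a rough sketch of what such a blob-free transformation might look like: each entry is rendered as a "tag:length:hexdata" line and fed straight into a running hash, so no header copy or export is needed and padding never enters the input. Everything here is made up for illustration: struct tag_entry stands in for whatever the header iterator hands back, and FNV-1a is only a placeholder for a real digest algorithm.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one header entry; not an rpm type. */
struct tag_entry {
    uint32_t tag;               /* tag number */
    const unsigned char *data;  /* raw tag data */
    size_t len;                 /* data length in bytes */
};

/* FNV-1a (64-bit), a placeholder for a real cryptographic digest. */
#define FNV_OFFSET 0xcbf29ce484222325ULL
#define FNV_PRIME  0x100000001b3ULL

static uint64_t fnv1a_update(uint64_t h, const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        h ^= (unsigned char)s[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* Feed "tag:len:hexdata\n" for each entry into the running hash.
 * Entries are assumed to arrive already sorted by tag. */
static uint64_t identity_digest(const struct tag_entry *e, size_t n)
{
    uint64_t h = FNV_OFFSET;
    char buf[64];

    for (size_t i = 0; i < n; i++) {
        int k = snprintf(buf, sizeof(buf), "%u:%zu:",
                         (unsigned)e[i].tag, e[i].len);
        h = fnv1a_update(h, buf, (size_t)k);
        for (size_t j = 0; j < e[i].len; j++) {
            k = snprintf(buf, sizeof(buf), "%02x", (unsigned)e[i].data[j]);
            h = fnv1a_update(h, buf, (size_t)k);
        }
        h = fnv1a_update(h, "\n", 1);
    }
    return h;
}
```

Because only the tag number, length and data bytes are hashed, two headers with the same tag data give the same value no matter how their blobs are laid out.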

The filtering should also be positive/inclusive rather than negative/exclusive. While a positive list of tags to include is harder to enumerate than the shorter list of tags to exclude, the IDENTITY tag set will then be closed and well known.
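
A minimal sketch of an inclusive filter, assuming the set is just a sorted table of tag numbers. The membership below is purely illustrative (the real list is exactly what would need to be agreed on), and identity_includes() is a hypothetical name, not something from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative closed set, kept sorted for bsearch(). Which tags
 * really belong here is a policy decision, not part of this sketch. */
static const uint32_t identity_tags[] = {
    1000, /* name */
    1001, /* version */
    1002, /* release */
    1003, /* epoch */
    1022, /* arch */
};

static int cmp_tag(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a;
    uint32_t y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Return 1 if tag is in the IDENTITY set, 0 otherwise. */
static int identity_includes(uint32_t tag)
{
    return bsearch(&tag, identity_tags,
                   sizeof(identity_tags) / sizeof(identity_tags[0]),
                   sizeof(identity_tags[0]), cmp_tag) != NULL;
}
```

With this shape, a tag added to rpm later is left out of the digest by default, which keeps the set closed.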

The simplest way to mark the positive/inclusive IDENTITY tag set is to change the awk script that generates the tag table to add a marker (like the array return marker) to identify which tags to include. 

The members (and ordering) of the IDENTITY tag set might also need to be configurable without recompiling.

But overall, dynamically generating the IDENTITY tag set with a tag extension can be deployed (and backported and maintained) far more easily than changing header code.

Nice job!

73 de Jeff

>    - Panu -