[Rpm-maint] Fingerprinting skipDir() brokenness ponderings
Panu Matilainen
pmatilai at redhat.com
Wed Jun 13 08:33:48 UTC 2007
I suppose pretty much everybody here knows the issue from the subject line
already, but if not, see the following bugs (and their duplicates) for
full description:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=140055
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=209306
I personally think this is something we *must* address in 4.4.2.1 somehow.
It's not like there aren't options, several of them. It's just that I'm not
too happy with any of them. List follows (in no particular order) with
some pros and cons pointed out:
1) Remove the "temporary" skipDir() hack dating back to 2002 completely.
+ Is really the responsible and right thing to do.
+ Fixes the shared files problems.
- Memory consumption goes sky-high and performance degrades badly. This
might not be that much of a problem on most modern systems, but for e.g.
OLPC it is likely to be a showstopper.
2) Further band-aid around skipDir(): disable it on multilib systems
+ Only causes the memory + performance hit on modern systems that are
likely to survive it
+ Fixes the shared files removal problems where it hits the worst
- Ugly as sin
- Leaves non-multilib systems affected by the problems in some cases
like the one described in rhbz#140055
3) Apply the findfpexclude patch + hack that uses it to not skipDir()
on erase. The pros and cons are largely the same as in 1) with a twist:
memory use isn't terrible on install but is on erase, so it'd still be
problematic for low-end systems.
4) Apply findfpexclude + taggedfileindex patches, remove skipDir() hack.
+ Performs extremely well, both from wallclock and memory consumption POV
in all cases
+ Fixes the shared files removal problems
- Breaks fingerprinting semantics, other concerns raised by jbj in
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=140055#c11
5) Fix the fingerprinting algorithm to not consume ungodly amounts of
memory.
+ The right thing to do
- If it was that easy, it'd probably have been done already (it's only like
9 years old...)
6) Get rid of fingerprinting completely.
- Not the kind of change one wants to do for a maintenance release
7) Do nothing.
- "If it ain't broken..." doesn't apply, so this is not really an option at
all: it doesn't address the issue.
8) Variant of 1-2: make skipDirs runtime configurable, defaulting to empty
+ Default behavior is correct
+ Lets vendors and users tweak it as necessary without rebuilding
- Default behavior performs hideously
- Doesn't really fix the problem, only pushes responsibility elsewhere
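
To make option 8 a bit more concrete, here is a minimal sketch of a skip list that is read at runtime and defaults to empty. This is not the actual rpm code: the function name skip_dir_configured(), the colon-separated list format and the prefix matching on path boundaries are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of rpm.
 * skip_list is assumed to be a colon-separated list of directories, e.g.
 * "/usr/share/doc:/usr/share/man", obtained from configuration at runtime.
 * NULL or "" means "skip nothing", which is the proposed default.
 * Returns 1 if dirname lies at or below one of the listed directories.
 */
static int skip_dir_configured(const char *skip_list, const char *dirname)
{
    assert(dirname != NULL);

    if (skip_list == NULL || *skip_list == '\0')
        return 0;   /* empty default: every directory gets fingerprinted */

    size_t dlen = strlen(dirname);
    const char *p = skip_list;

    while (*p != '\0') {
        size_t len = strcspn(p, ":");   /* length of the current entry */
        size_t n = len;

        /* Ignore trailing slashes on the entry, but keep a lone "/". */
        while (n > 1 && p[n - 1] == '/')
            n--;

        /* Match only on a path boundary, so ".../doc" does not hit ".../docs". */
        if (n > 0 && n <= dlen && strncmp(p, dirname, n) == 0 &&
            (dirname[n] == '\0' || dirname[n] == '/'))
            return 1;

        p += len;
        if (*p == ':')
            p++;    /* step over the separator */
    }
    return 0;
}
```

With a NULL or empty list nothing is skipped, which is the "correct but slow" default described above; a vendor who sets the list to something like "/usr/share/doc" gets the old memory savings back, along with the old shared files problems for that directory.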
Having done a bit of fingerprinting torture-testing, I have to say 4)
looks very attractive, but deliberately breaking fingerprinting semantics
on a maintenance release is ... um, not nice. OTOH, the semantics are
totally broken already because of skipDir() kludgery! So it'd be trading a
very broken behavior for more correct (not 100% correct) behavior in the
typical cases while improving performance a lot. It wouldn't seem like a
bad tradeoff at all, but it still just doesn't feel quite right. And I
wonder about jbj's concerns like the > 65K files in a package case in
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=140055#c11 - what
exactly happens then with the tagged fileindexes?
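
As a toy model of why the skipDir() shortcut breaks things for shared files: the sketch below assumes that on erase a file is kept only if some other installed package is found to own the same path, and that for a skipped directory this lookup simply never happens. The struct, the function names and the flat "database" array are hypothetical and much simpler than what rpm really does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one installed file and the package owning it. */
struct owned_file {
    const char *pkg;
    const char *path;
};

/* Count packages other than `erasing` that own `path`. */
static int other_owners(const struct owned_file *db, size_t n,
                        const char *erasing, const char *path)
{
    int count = 0;

    assert(db != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(db[i].path, path) == 0 && strcmp(db[i].pkg, erasing) != 0)
            count++;
    }
    return count;
}

/*
 * Returns 1 if `path` would be deleted when package `erasing` is removed.
 * dir_skipped models a directory for which the shared-owner lookup is
 * bypassed (assumption: this is the effect of the skipDir() shortcut).
 */
static int would_remove(const struct owned_file *db, size_t n,
                        const char *erasing, const char *path,
                        int dir_skipped)
{
    if (dir_skipped)
        return 1;   /* no lookup is done, so sharing goes unnoticed */
    return other_owners(db, n, erasing, path) == 0;
}
```

In this model, a README owned by both the i386 and x86_64 builds of a package survives the erase of one of them only when its directory is not skipped, which matches the multilib flavor of the shared files removal problem.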
Thoughts / opinions? At the moment, only consider 4.4.2.1, we'll probably
want to revisit this issue afterwards regardless of the decision taken
now.
- Panu -