[Rpm-ecosystem] hawkey query performance

Daniel Mach dmach at redhat.com
Wed Sep 9 08:52:17 UTC 2015


Hi,
I'm struggling with hawkey query performance[1].

Hawkey performs a sequential scan through the whole package set on every
query. With 100k+ RPMs, computing a dependency closure (pairing Requires
with Provides, including file deps) takes a very large number of such
queries. My use case is creating a package set for a compose[2]; a
similar use case is drawing a dependency graph.
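
To illustrate the kind of indexed lookup I have in mind, here is a minimal
Python sketch. It does not use hawkey or libsolv at all: the package data,
the names build_provides_index and closure, and the data layout are made up
for illustration, and version comparison is ignored entirely. It only shows
that with a Provides index built once, each Requires lookup becomes a
dictionary hit instead of a scan, and that broken deps can be recorded
without aborting.

```python
from collections import defaultdict, deque

# Hypothetical package data (not real repo metadata):
# package name -> (set of Provides, set of Requires), file deps included.
PACKAGES = {
    "app": ({"app"}, {"libfoo.so.1", "/usr/bin/python3"}),
    "libfoo": ({"libfoo", "libfoo.so.1"}, {"libc.so.6"}),
    "glibc": ({"glibc", "libc.so.6"}, set()),
    "python3": ({"python3", "/usr/bin/python3"}, {"libc.so.6"}),
    "unrelated": ({"unrelated"}, {"missing-dep"}),
}


def build_provides_index(packages):
    """Map every provided capability to the set of packages providing it."""
    index = defaultdict(set)
    for name, (provides, _requires) in packages.items():
        for capability in provides:
            index[capability].add(name)
    return index


def closure(packages, index, seeds):
    """Return (selected packages, broken deps) reachable from the seeds.

    All providers of a requirement are pulled in, even if they would
    conflict with each other, and unresolved requirements are collected
    instead of raising an error.
    """
    selected = set(seeds)
    broken = set()
    queue = deque(seeds)
    while queue:
        name = queue.popleft()
        for requirement in packages[name][1]:
            providers = index.get(requirement)
            if not providers:
                broken.add((name, requirement))
                continue
            for provider in providers:
                if provider not in selected:
                    selected.add(provider)
                    queue.append(provider)
    return selected, broken


# Built once, then reused for every lookup.
INDEX = build_provides_index(PACKAGES)
```

Calling closure(PACKAGES, INDEX, ["app"]) walks the graph breadth-first from
"app". The index costs one pass over all Provides up front; after that, the
work is proportional to the Requires actually visited rather than to the
size of the whole package set.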

Libsolv's depsolving probably can't be used, because a compose can contain
packages that are individually installable but mutually conflicting, or
packages with broken deps (we need to deliver the latest RPMs to quality
engineering at any cost, even if the deps are broken).

Do you have any idea how to resolve my problem?
Can hawkey or libsolv be improved to index data for queries?
Or should I be using it differently?
Please note that current performance is worse than that of the original
YUM backend we're still using in Pungi, possibly because it leveraged
SQLite indexing.


thanks,
- daniel


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1225501
[2] https://pagure.io/fork/dmach/pungi/branch/gather-dnf

-- 
Daniel Mach <dmach at redhat.com>
Release Engineering, Red Hat

