[Rpm-ecosystem] hawkey query performance

Daniel Mach dmach at redhat.com
Wed Sep 9 08:52:17 UTC 2015

I'm struggling with hawkey query performance[1].

Hawkey performs a sequential scan through the whole package set on every
query. Consider 100k+ RPMs and computing a dependency closure (pairing
Requires with Provides, including file deps). My use case is to create a
package set for a compose[2]; a similar use case is drawing a dependency
graph.
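
To illustrate the kind of indexing I have in mind, here is a minimal
sketch in plain Python. It does not use the hawkey or libsolv API; the
package records, build_provides_index() and dependency_closure() are
hypothetical names made up for this example, and versioned dependencies
are ignored. The idea is to pay for one pass over all Provides up front,
so that each Requires lookup becomes a dictionary hit instead of a scan:

```python
from collections import defaultdict, deque

# Hypothetical in-memory records (NOT hawkey objects): package name
# mapped to the capabilities it provides and requires, incl. file deps.
PACKAGES = {
    "bash": {"provides": {"bash", "/bin/sh", "/bin/bash"},
             "requires": {"glibc"}},
    "glibc": {"provides": {"glibc", "libc.so.6"}, "requires": set()},
    "make": {"provides": {"make"},
             "requires": {"/bin/sh", "libc.so.6"}},
    "broken": {"provides": {"broken"},
               "requires": {"no-such-capability"}},
    "unrelated": {"provides": {"unrelated"}, "requires": set()},
}


def build_provides_index(packages):
    """Scan the package set once and map capability -> provider names."""
    index = defaultdict(set)
    for name, pkg in packages.items():
        for capability in pkg["provides"]:
            index[capability].add(name)
    return index


def dependency_closure(packages, index, seeds):
    """Breadth-first closure over Requires, using the index for lookups.

    Every provider of a capability is kept (no conflict checking), and
    requirements without a provider are recorded instead of failing.
    """
    selected = set(seeds)
    unresolved = set()
    queue = deque(seeds)
    while queue:
        name = queue.popleft()
        for capability in packages[name]["requires"]:
            providers = index.get(capability)
            if not providers:
                unresolved.add((name, capability))
                continue
            for provider in providers:
                if provider not in selected:
                    selected.add(provider)
                    queue.append(provider)
    return selected, unresolved


index = build_provides_index(PACKAGES)
selected, unresolved = dependency_closure(PACKAGES, index,
                                          ["make", "broken"])
print(sorted(selected))
print(sorted(unresolved))
```

With such an index the closure costs roughly one lookup per requirement
of each selected package, rather than one full scan per requirement.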

Libsolv's depsolving probably can't be used, because a compose can
contain mutually conflicting (but individually installable) packages or
broken deps (we need to deliver the latest RPMs to quality engineering at
any cost, even if the deps are broken).

Do you have any idea how to resolve my problem?
Can hawkey or libsolv be improved to index data for queries?
Or should I be using it differently?
Please note that current performance is worse than that of the original
YUM backend we're still using in Pungi, possibly because it leveraged
SQLite indexing.
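
For comparison, a rough sketch of what an indexed lookup looks like with
Python's sqlite3 module. The single "provides" table and the
providers_of() helper are simplified, hypothetical stand-ins for this
example and not the real YUM metadata schema:

```python
import sqlite3

# Hypothetical, simplified schema: one row per (capability, package).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provides (name TEXT, pkg TEXT);
    CREATE INDEX provides_name ON provides (name);
""")
conn.executemany(
    "INSERT INTO provides (name, pkg) VALUES (?, ?)",
    [("/bin/sh", "bash"), ("bash", "bash"), ("libc.so.6", "glibc")],
)


def providers_of(capability):
    """Return the names of packages providing the given capability."""
    rows = conn.execute(
        "SELECT pkg FROM provides WHERE name = ? ORDER BY pkg",
        (capability,),
    )
    return [row[0] for row in rows]


# Ask SQLite how it would run the lookup; the last column of each row
# describes the chosen access path (index search vs. full table scan).
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT pkg FROM provides WHERE name = ?",
    ("/bin/sh",),
).fetchall()

print(providers_of("/bin/sh"))
for row in plan:
    print(row[-1])
```

The query plan should report a search using the provides_name index
instead of a scan of the whole table.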

- daniel

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1225501
[2] https://pagure.io/fork/dmach/pungi/branch/gather-dnf

Daniel Mach <dmach at redhat.com>
Release Engineering, Red Hat
