[Rpm-ecosystem] lazy loading of filelists.xml to speed up dnf

Michael Schroeder mls at suse.de
Tue Aug 7 08:50:25 UTC 2018


On Mon, Aug 06, 2018 at 04:36:07PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> this mail is a continuation of an FPC [1] and a FESCo [2] ticket.
> 
> A proposal was made to disallow packages in Fedora from using file
> deps, and to optimize dnf to not load filelists.xml. File deps would
> still be supported, because external packages and users want to use
> them, but they would not be allowed for distro packages.
> 
> Not downloading or loading filelists.xml which are required for file
> deps would provide significant bandwidth savings (~47 MB compressed)
> and noticeable runtime savings (~10s at dnf startup) in many common
> cases.
> 
> So this is something that is worth exploring, but it's not clear if it
> is at all feasible.

There's also something that can easily be done and would make
loading the filelists unnecessary in most cases: extend the
primary filelist to include a whitelist of files. The whitelist
must also be stored in the primary data, so that the solver knows
what to expect.
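
A minimal sketch of the repository-side idea, in Python rather than
in any real metadata tool: the primary entry for a package carries
both the whitelist patterns and the subset of files that match them.
The function names, the dict layout and the extra fonts pattern are
assumptions made up for illustration, not an existing format.

```python
from fnmatch import fnmatchcase

# The first three patterns are the defaults mentioned in this thread;
# the last one is a hypothetical whitelist extension.
WHITELIST = ["/etc/*", "*bin/*", "/usr/lib/sendmail", "/usr/share/fonts/*"]


def primary_files(all_files, whitelist):
    """Return the files that belong in the primary data."""
    return [p for p in all_files
            if any(fnmatchcase(p, pat) for pat in whitelist)]


def build_primary_entry(name, all_files, whitelist):
    """Build a hypothetical primary entry that records the whitelist
    next to the filtered file list, so a consumer knows which paths
    it can resolve without the full filelists."""
    return {
        "name": name,
        "file_patterns": list(whitelist),
        "files": primary_files(all_files, whitelist),
    }
```

Storing the patterns with the data is the important part: a solver
that sees a file dependency matching one of them can trust that the
primary data already lists every provider.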

> It seems that dnf would need to support loading
> filelists.xml lazily. In the mailing list discussions, some people
> said that this would be hard, some people said that it would be
> possible... What is the situation here?

Lazy loading of primary extensions is supported in libsolv; the
demo solver included in the package makes use of that feature.

> IIUC, dnf would need to restart
> the resolution of a transaction mid-flight once it encounters a file dep,
> which would require support across the different layers.

No, it works differently. At some point the solver creates the ruleset
needed for dependency resolution. To do this, it has to find which
packages provide a given dependency. If that's a filename dependency,
it will check if it matches the default patterns (/etc/* *bin/*
/usr/lib/sendmail). If it does not match, it will search the filelists.xml
extension. Here's where libsolv can use a callback to make the lazy
loading happen.
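
The lookup described above can be sketched roughly as follows. This
is a Python model of the control flow only, not libsolv's C API: the
Repo class, the loader callback and the function names are all
hypothetical.

```python
from fnmatch import fnmatchcase

# Default patterns as listed above.
DEFAULT_PATTERNS = ("/etc/*", "*bin/*", "/usr/lib/sendmail")


class Repo:
    """Hypothetical repo: primary data is always present, the full
    filelists are fetched through a callback on first use."""

    def __init__(self, primary_files, load_filelists):
        self.primary_files = primary_files    # package -> list of paths
        self._load_filelists = load_filelists  # callback, may download
        self._filelists = None
        self.loads = 0                         # how often the callback ran

    def full_filelists(self):
        if self._filelists is None:
            self._filelists = self._load_filelists()
            self.loads += 1
        return self._filelists


def covered_by_primary(path, patterns=DEFAULT_PATTERNS):
    """True if the primary data is expected to list this path."""
    return any(fnmatchcase(path, pat) for pat in patterns)


def providers_of_file(repo, path):
    """Find the packages providing a file dependency, loading the
    full filelists only when the path is outside the patterns."""
    if covered_by_primary(path):
        source = repo.primary_files
    else:
        source = repo.full_filelists()
    return sorted(pkg for pkg, files in source.items() if path in files)
```

In this model nothing is restarted: the callback runs in the middle
of the provider lookup, at most once per repo, and only if some
dependency actually needs it.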

> If Fedora commits to making use of this, would it be possible to
> implement this in dnf? What kind of changes would be required?
> 
> [1] https://pagure.io/packaging-committee/issue/714
> [2] https://pagure.io/fesco/issue/1955

I don't think this is hard to implement, but there's a little detail
that needs to be discussed: what should happen if the filelists.xml
download fails? This can happen because the metadata has been rewritten
in the meantime. How should the error be propagated back to the user?

Cheers,
  Michael.

-- 
Michael Schroeder                                   mls at suse.de
SUSE LINUX GmbH,           GF Jeff Hawn, HRB 16746 AG Nuernberg
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}