[Rpm-ecosystem] lazy loading of filelists.xml to speed up dnf

Pascal Terjan pterjan at gmail.com
Wed Aug 8 17:08:36 UTC 2018

On 7 August 2018 at 09:50, Michael Schroeder <mls at suse.de> wrote:
> On Mon, Aug 06, 2018 at 04:36:07PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
>> this mail is a continuation of an FPC [1] and a FESCo [2] ticket.
>> A proposal was made to disallow packages in Fedora from using file
>> deps, and to optimize dnf to not load filelists.xml. File deps would
>> still be supported, because external packages and users want to use
>> them, but they would not be allowed for distro packages.
>> Not downloading or loading filelists.xml which are required for file
>> deps would provide significant bandwidth savings (~47 MB compressed)
>> and noticeable runtime savings (~10s at dnf startup) in many common
>> cases.
>> So this is something that is worth exploring, but it's not clear if it
>> is at all feasible.
> There's also something that can easily be done and would make
> loading the filelist unneeded in most of the cases: extend the
> primary filelist to include some whitelist of files. The whitelist
> must also be stored in the primary data, so that the solver knows
> what to expect.

That's what Mandrake/Mandriva/Mageia/... have been doing for many
years. There is a small file-deps file listing the file dependencies
we end up generating, mostly from scriptlets IIRC, and provides for
those files are added to the main metadata when it is generated. File
lists are then lazily loaded when people want to query them, but they
are not used for dependency resolution.

$ GET http://ftp.free.fr/mirrors/mageia.org/distrib/cauldron/x86_64/media/media_info/file-deps
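
To illustrate the idea, here is a minimal sketch in Python of turning
such a whitelist into extra provides at metadata generation time. It is
not the actual Mageia tooling: the function name, the package
dictionary layout and the one-path-per-entry whitelist are all
assumptions made for the example.

```python
def add_file_provides(packages, file_deps):
    """Return {package name: sorted provides} with whitelisted files added.

    packages:  {name: {"provides": set of str, "files": list of paths}}
               (hypothetical layout, for illustration only)
    file_deps: iterable of whitelisted file paths, assumed one per entry
    """
    wanted = set(file_deps)
    result = {}
    for name, pkg in packages.items():
        # Only whitelisted files that the package really ships become provides.
        extra = wanted.intersection(pkg["files"])
        result[name] = sorted(set(pkg["provides"]) | extra)
    return result
```

With this, the resolver only needs the main metadata to satisfy any
file dependency that is on the whitelist.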

>> It seems that dnf would need to support loading
>> filelists.xml lazily. In the mailing list discussions, some people
>> said that this would be hard, some people said that it would be
>> possible... What is the situation here?
> Lazy loading of primary extensions is supported in libsolv, the
> demo solver included in the package makes use of that feature.
>> IIUC, dnf would need to restart
>> the resolution of a transaction mid-flight once it encounters a file dep,
>> which would require support across the different layers.
> No, it works differently. At some point the solver creates the ruleset
> needed for dependency resolution. To do this, it has to find which
> packages provide a given dependency. If that's a filename dependency,
> it will check if it matches the default patterns (/etc/* *bin/*
> /usr/lib/sendmail). If it does not match, it will search the filelists.xml
> extension. Here's where libsolv can use a callback to make the lazy
> loading happen.
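
The lookup described above can be sketched roughly as follows. This is
a toy model in Python, not libsolv's API: the Repo class, its
attributes and the load_filelists callback are hypothetical names, and
only the three default patterns come from the text above.

```python
from fnmatch import fnmatchcase

# Default patterns quoted above; matching files are already in primary data.
DEFAULT_PATTERNS = ("/etc/*", "*bin/*", "/usr/lib/sendmail")


def in_primary(path):
    """True if a file dependency can be answered from primary data alone."""
    return any(fnmatchcase(path, pattern) for pattern in DEFAULT_PATTERNS)


class Repo:
    """Toy repository whose full file lists are fetched only on demand."""

    def __init__(self, primary_files, load_filelists):
        self.primary_files = primary_files      # {path: [package, ...]}
        self._load_filelists = load_filelists   # hypothetical callback
        self._filelists = None                  # not loaded yet
        self.filelists_loads = 0

    def providers_of_file(self, path):
        if in_primary(path):
            return self.primary_files.get(path, [])
        if self._filelists is None:
            # First dependency outside the default patterns:
            # this is the point where the lazy loading happens.
            self._filelists = self._load_filelists()
            self.filelists_loads += 1
        return self._filelists.get(path, [])
```

The point of the sketch is that nothing has to be restarted: the
callback runs while providers are being looked up, and at most once.
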
>> If Fedora commits to making use of this, would it be possible to
>> implement this in dnf? What kind of changes would be required?
>> [1] https://pagure.io/packaging-committee/issue/714
>> [2] https://pagure.io/fesco/issue/1955
> I don't think this is hard to implement, but there's a little detail
> that needs to be discussed: what should happen if the filelists.xml
> download fails? This can happen because the metadata has been rewritten
> in the meantime. How should the error be propagated back to the user?
> Cheers,
>   Michael.
> --
> Michael Schroeder                                   mls at suse.de
> SUSE LINUX GmbH,           GF Jeff Hawn, HRB 16746 AG Nuernberg
> main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}
