[Rpm-ecosystem] lazy loading of filelists.xml to speed up dnf

Neal Gompa ngompa13 at gmail.com
Thu Aug 9 05:34:44 UTC 2018


On Wed, Aug 8, 2018 at 7:09 PM Pascal Terjan <pterjan at gmail.com> wrote:
>
> On 7 August 2018 at 09:50, Michael Schroeder <mls at suse.de> wrote:
> > On Mon, Aug 06, 2018 at 04:36:07PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
> >> this mail is a continuation of an FPC [1] and a FESCo [2] ticket.
> >>
> >> A proposal was made to disallow packages in Fedora from using file
> >> deps, and to optimize dnf to not load filelists.xml. File deps would
> >> still be supported, because external packages and users want to use
> >> them, but they would not be allowed for distro packages.
> >>
> >> Not downloading or loading filelists.xml, which is required for file
> >> deps, would provide significant bandwidth savings (~47 MB compressed)
> >> and noticeable runtime savings (~10s at dnf startup) in many common
> >> cases.
> >>
> >> So this is something that is worth exploring, but it's not clear if it
> >> is at all feasible.
> >
> > There's also something that can easily be done and would make
> > loading the filelist unneeded in most of the cases: extend the
> > primary filelist to include some whitelist of files. The whitelist
> > must also be stored in the primary data, so that the solver knows
> > what to expect.
>
> That's what Mandrake/Mandriva/Mageia/... have been doing for many
> years. There is a small file-deps file listing the file deps we end
> up generating, mostly from scriptlets IIRC, and provides for those
> are added to the main metadata when it is generated. File lists are
> then lazily loaded when people want to query them, but are not used
> for dependency resolution.
>
> $ GET http://ftp.free.fr/mirrors/mageia.org/distrib/cauldron/x86_64/media/media_info/file-deps
> /bin/csh
> /bin/grep
> /bin/perl
> /usr/bin/ln
> /usr/bin/rm
> /sbin/service
> /usr/bin/chattr
> /usr/bin/guile
> /usr/bin/openssl
> /usr/bin/pear
> /usr/bin/texhash
> /usr/bin/tr
> /usr/bin/which
> /usr/sbin/groupadd
> /usr/sbin/groupdel
> /usr/sbin/useradd
> /usr/sbin/userdel
>

So primary.xml already includes all of that. If you look in the
primary.xml.gz files in the Mageia rpm-md data, those paths are already
there. The problem is that some people request files outside of the
base whitelist as a way to ask for "things" without knowing how they
are packaged, because the file path is the one thing that is consistent
across distros. This is supported in both YUM and DNF, just slightly
differently.
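
To make the split concrete, here is a rough Python sketch of the idea
discussed above: whitelisted paths are answered from the primary data,
and the full file lists are only loaded the first time somebody asks
for a path outside the whitelist. This is not DNF, YUM, or libsolv
code. WHITELIST, REPO, build_primary and LazyFileIndex are all names
made up for illustration, and the package contents are toy data.

```python
from typing import Callable, Dict, List, Optional, Set

Filelists = Dict[str, List[str]]  # package name -> files it ships

# Toy data (assumption): a few paths from the quoted file-deps list and
# an invented package-to-file mapping, not real repository metadata.
WHITELIST: Set[str] = {"/bin/csh", "/usr/bin/openssl", "/usr/sbin/useradd"}
REPO: Filelists = {
    "tcsh": ["/bin/csh", "/usr/share/doc/tcsh/FAQ"],
    "openssl": ["/usr/bin/openssl", "/etc/pki/tls/openssl.cnf"],
    "shadow-utils": ["/usr/sbin/useradd"],
}


def build_primary(filelists: Filelists, whitelist: Set[str]) -> Dict[str, Set[str]]:
    """Metadata generation step: keep file provides for whitelisted paths only."""
    primary: Dict[str, Set[str]] = {}
    for name, files in filelists.items():
        for path in files:
            if path in whitelist:
                primary.setdefault(path, set()).add(name)
    return primary


class LazyFileIndex:
    """Answer file deps from primary data; load full file lists on demand."""

    def __init__(
        self,
        primary: Dict[str, Set[str]],
        whitelist: Set[str],
        load_filelists: Callable[[], Filelists],
    ) -> None:
        self.primary = primary
        self.whitelist = whitelist
        self._load = load_filelists
        self._full: Optional[Filelists] = None
        self.loads = 0  # how many times the expensive load actually ran

    def providers(self, path: str) -> Set[str]:
        # The whitelist ships with the primary data, so for these paths the
        # primary data is authoritative, even when the answer is "nobody".
        if path in self.whitelist:
            return set(self.primary.get(path, set()))
        # Anything else needs the full file lists; fetch them once, lazily.
        if self._full is None:
            self._full = self._load()
            self.loads += 1
        return {name for name, files in self._full.items() if path in files}


if __name__ == "__main__":
    index = LazyFileIndex(build_primary(REPO, WHITELIST), WHITELIST, lambda: REPO)
    print(index.providers("/usr/bin/openssl"), index.loads)
    print(index.providers("/etc/pki/tls/openssl.cnf"), index.loads)
```

The whitelist check in providers() is why the whitelist has to be
stored alongside the primary data: without it, the solver cannot tell
"no package provides this path" apart from "this path is simply not
covered here, go load the file lists".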

In this case, the wish is to restore the YUM behavior of loading
filelists.xml lazily, only when it is actually needed. The idea is
that stacking this on top of the Zchunk deltarepo extension will yield
incredible boosts for everything.


--
真実はいつも一つ!/ Always, there's only one truth!

