[Rpm-ecosystem] lazy loading of filelists.xml to speed up dnf
n3npq at me.com
Wed Aug 8 18:59:04 UTC 2018
> On Aug 7, 2018, at 4:30 AM, Jonathan Dieter <jdieter at gmail.com> wrote:
>> On Mon, 2018-08-06 at 16:36 +0000, Zbigniew Jędrzejewski-Szmek wrote:
>> Hi dnf and libsolv developers,
>> this mail is a continuation of an FPC and a FESCo ticket.
>> A proposal was made to disallow packages in Fedora from using file
>> deps, and to optimize dnf to not load filelists.xml. File deps would
>> still be supported, because external packages and users want to use
>> them, but they would not be allowed for distro packages.
>> Not downloading or loading filelists.xml, which is required for file
>> deps, would provide significant bandwidth savings (~47 MB compressed)
>> and noticeable runtime savings (~10s at dnf startup) in many common
>> cases. So this is something that is worth exploring, but it's not clear
>> if it is at all feasible. It seems that dnf would need to support loading
>> filelists.xml lazily. In the mailing list discussions, some people
>> said that this would be hard, some people said that it would be
>> possible… What is the situation here? IIUC, dnf would need to restart
>> the resolution of a transaction mid-flight once it encounters a file
>> dep, which would require support across the different layers.
>> If Fedora commits to making use of this, would it be possible to
>> implement this in dnf? What kind of changes would be required?
>>  https://pagure.io/packaging-committee/issue/714
>>  https://pagure.io/fesco/issue/1955
>> Zbyszek, on behalf of FESCo (but note that this writeup is based
>> on my understanding, so all errors are mine.)
> This is a bit off on a tangent, but we are working on zchunk (see
> https://fedoraproject.org/wiki/Changes/Zchunk_Metadata) which should make
> this whole problem far less painful. As things are going, it's highly
> unlikely that we'll be completely done for F29 (we have working code
> that needs review), but it should be good to go for F30.
This comment below is also tangential to lazy loading of file lists, but while you are here ...
Zchunk is a variant of a librsync protocol that is sensitive to package boundaries in the file being transported.
There is another optimization that would be useful for zchunk: rsync --fuzzy.
The --fuzzy option is particularly useful when distributing *.rpm package files into a directory, where the NEVRA is almost always different.
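To make the --fuzzy idea concrete: when the destination file is missing, rsync looks in the same directory for a similarly named file to use as the delta basis. Here is a rough Python sketch of that kind of selection for package file names. The function name pick_fuzzy_basis, the min_ratio cutoff, and the use of difflib for scoring are all my own assumptions for illustration; this is not rsync's actual scoring code.

```python
from difflib import SequenceMatcher


def pick_fuzzy_basis(target_name, existing_names, min_ratio=0.6):
    """Return the existing file name most similar to target_name.

    Returns None when no candidate scores above min_ratio.
    Hypothetical helper: the difflib similarity ratio stands in for
    whatever name-matching heuristic a real transfer tool would use.
    """
    best_name, best_ratio = None, min_ratio
    for name in existing_names:
        ratio = SequenceMatcher(None, target_name, name).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name


if __name__ == "__main__":
    on_disk = [
        "bash-4.4.23-1.fc28.x86_64.rpm",
        "zlib-1.2.11-8.fc28.x86_64.rpm",
    ]
    # The older bash build is the natural delta basis for the newer one.
    print(pick_fuzzy_basis("bash-4.4.23-5.fc29.x86_64.rpm", on_disk))
```

With a basis chosen this way, only the blocks that differ between the old and new builds of a package would need to be transferred, even though the two file names never match exactly.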
For that matter, it would seem not to be impossibly hard to generalize zchunk transport to pay attention to rpm file sections (lead/signature/metadata/payload) as well as payload file contents and improve upon rdist distribution of package files.
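As a rough illustration of where those section boundaries sit, here is a Python sketch that computes the lead/signature/metadata/payload offsets of a package file. It assumes the classic RPM v3/v4 layout (96-byte lead, signature header padded to an 8-byte boundary, then the main header, then the payload). The function names are hypothetical, and nothing here is zchunk's actual chunking code; a chunker could merely use such offsets as candidate cut points.

```python
import struct

LEAD_SIZE = 96                   # fixed-size lead at the start of the file
HEADER_MAGIC = b"\x8e\xad\xe8"   # magic shared by signature and main header


def header_length(data, offset):
    """Length in bytes of the header structure starting at offset.

    Layout assumed: 3-byte magic, 1-byte version, 4 reserved bytes,
    4-byte index entry count, 4-byte data size (both big-endian),
    followed by 16 bytes per index entry and then the data store.
    """
    magic, _version, _reserved, nindex, hsize = struct.unpack_from(
        ">3sB4sII", data, offset
    )
    if magic != HEADER_MAGIC:
        raise ValueError("no header magic at offset %d" % offset)
    return 16 + 16 * nindex + hsize


def rpm_sections(data):
    """Return {section name: (start, end)} byte ranges for a package."""
    sig_start = LEAD_SIZE
    sig_len = header_length(data, sig_start)
    # The signature header is padded so the main header is 8-byte aligned.
    hdr_start = sig_start + sig_len + (-sig_len % 8)
    hdr_len = header_length(data, hdr_start)
    payload_start = hdr_start + hdr_len
    return {
        "lead": (0, sig_start),
        "signature": (sig_start, hdr_start),
        "metadata": (hdr_start, payload_start),
        "payload": (payload_start, len(data)),
    }
```

Cutting chunks at these offsets would keep a change in the signature or metadata from disturbing the chunks that cover an otherwise unchanged payload.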
73 de Jeff