[Rpm-ecosystem] lazy loading of filelists.xml to speed up dnf
Jeff Johnson
n3npq at me.com
Wed Aug 8 18:59:04 UTC 2018
> On Aug 7, 2018, at 4:30 AM, Jonathan Dieter <jdieter at gmail.com> wrote:
>
>> On Mon, 2018-08-06 at 16:36 +0000, Zbigniew Jędrzejewski-Szmek wrote:
>> Hi dnf and libsolv developers,
>>
>> this mail is a continuation of FPC [1] and FESCo [2] tickets.
>>
>> A proposal was made to disallow packages in Fedora from using file
>> deps, and to optimize dnf to not load filelists.xml. File deps would
>> still be supported, because external packages and users want to use
>> them, but they would not be allowed for distro packages.
>>
>> Not downloading or loading filelists.xml, which is required for file
>> deps would provide significant bandwidth savings (~47 MB compressed)
>> and noticeable runtime savings (~10s at dnf startup) in many common
>> cases.
>>
>> So this is something that is worth exploring, but it's not clear if it
>> is at all feasible. It seems that dnf would need to support loading
>> filelists.xml lazily. In the mailing list discussions, some people
>> said that this would be hard, some people said that it would be
>> possible… What is the situation here? IIUC, dnf would need to restart
>> the resolution of a transaction mid-flight once it encounters a file
>> dep, which would require support across the different layers.
>> If Fedora commits to making use of this, would it be possible to
>> implement this in dnf? What kind of changes would be required?
>>
>> [1] https://pagure.io/packaging-committee/issue/714
>> [2] https://pagure.io/fesco/issue/1955
>>
>> Zbyszek, on behalf of FESCo (but note that this writeup is based
>> on my understanding, so all errors are mine.)
>
> This is a bit off on a tangent, but we are working on zchunk (see
> https://fedoraproject.org/wiki/Changes/Zchunk_Metadata), which should
> make this whole problem far less painful. As things are going, it's highly
> unlikely that we'll be completely done for F29 (we have working code
> that needs review), but it should be good to go for F30.
>
The comment below is also tangential to lazy loading of file lists, but while you are here ...
Zchunk is a variant of a librsync protocol that is sensitive to package boundaries in the file being transported.
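There is another optimization that would be useful for zchunk: rsync --fuzzy
The --fuzzy option is particularly useful when distributing *.rpm package files into a directory, where the NEVRA is almost always different: a new build rarely shares a file name with the one it replaces, and --fuzzy lets rsync use a similarly named file already in the directory as the basis for the delta.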
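
To make the --fuzzy idea concrete, here is a rough Python sketch of picking a basis file by name similarity. It is only an illustration: the function name pick_basis, the 0.6 cutoff, and the use of difflib scoring are my assumptions, not what rsync actually implements, and the package file names are made up.

```python
import difflib

def pick_basis(new_name, existing_names, cutoff=0.6):
    """Return the existing file name most similar to new_name, or None.

    Hypothetical helper: difflib's ratio stands in for whatever
    similarity score a real transfer tool would use.
    """
    matches = difflib.get_close_matches(new_name, existing_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# Illustrative (made-up) directory contents and incoming package.
existing = [
    "bash-4.4.23-1.fc28.x86_64.rpm",
    "zlib-1.2.11-8.fc28.x86_64.rpm",
]
basis = pick_basis("bash-4.4.23-5.fc29.x86_64.rpm", existing)
print(basis)
```

With a basis chosen this way, only the delta against the older build of the same package would need to be transferred, even though the NEVRA in the file name differs.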
For that matter, it does not seem impossibly hard to generalize zchunk transport to pay attention to rpm file sections (lead/signature/metadata/payload) as well as payload file contents, and so improve upon rdist distribution of package files.
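
A minimal sketch of what section-aware chunking could buy, assuming the section offsets are already known to the caller. The helpers split_at and chunks_to_fetch, the sha256 choice, and the toy byte strings are all hypothetical; this does not parse real rpm files and is not the zchunk format.

```python
import hashlib

def split_at(data, boundaries):
    """Split data into chunks at the given sorted byte offsets."""
    cuts = [0] + [b for b in boundaries if 0 < b < len(data)] + [len(data)]
    return [data[a:b] for a, b in zip(cuts, cuts[1:])]

def chunks_to_fetch(old_chunks, new_chunks):
    """Return indexes of new chunks whose digest is not already held locally."""
    have = {hashlib.sha256(c).digest() for c in old_chunks}
    return [i for i, c in enumerate(new_chunks)
            if hashlib.sha256(c).digest() not in have]

# Toy stand-ins for lead / signature / metadata / payload sections.
old = b"LEAD" + b"SIG1" + b"HDR-A" + b"PAYLOAD"
new = b"LEAD" + b"SIG2" + b"HDR-A" + b"PAYLOAD"
offsets = [4, 8, 13]  # assumed section boundaries, supplied by the caller

needed = chunks_to_fetch(split_at(old, offsets), split_at(new, offsets))
print(needed)
```

In this toy case only the signature chunk differs, so a package that was merely re-signed would cost one small chunk rather than the whole file.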
73 de Jeff
> Jonathan
> _______________________________________________
> Rpm-ecosystem mailing list
> Rpm-ecosystem at lists.rpm.org
> http://lists.rpm.org/mailman/listinfo/rpm-ecosystem