[Rpm-maint] [PATCH] Performance improvement of rpmfcSaveArgs
ffesti at redhat.com
Sat Jan 15 08:39:55 UTC 2011
On 01/13/2011 02:51 PM, Giulio Eulisse wrote:
> the current implementation of rpmfcSaveArgs (now argvAddUniq) is quite suboptimal since it does a sort for each insertion in order to preserve ordering in the argv.
> Find in attachment a more efficient implementation which finds the insertion point (using a linear search, could be improved by using a std::lower_bound like search), makes some space and inserts a new entry directly in the correct spot so that the resulting vector is already sorted.
> If my math does not fool me, my implementation is roughly O(N) (due to the linear search) while the current one is O((1+N)*log(N)) (due to the search and the sort).
> More pragmatically, in my tests, the (rpm)build time of an extremely large (~ one thousand dynamic libraries and several thousands files) package we have, went from three hours to two.
Thanks for spotting this! If the performance impact is really that big
we should probably consider replacing the ARGV by an better data type
that allows insertion in even shorter time. O(N) for every insert still
adds up to O(N^2). There already is a generic hash data type in the code
(lib/rpmhash.[HC]) that we used everywhere else to fix exactly the same
performance problem. I've not yet had a deeper look how complicated
replace the ARGV will be but it definitely worth a try.
More information about the Rpm-maint