[Rpm-maint] [rpm-software-management/rpm] Add macro to force fsync() on close() (#187)

Jeff Johnson notifications at github.com
Mon Apr 17 05:43:03 UTC 2017


Wall clock time measurements for an RPM command (this is RPM5, but the I/O trace is identical to rpm.org):

`sudo /usr/bin/time /opt/local/bin/rpm -Uvh --nodeps --noscripts --relocate /=/opt/local/tmp /var/tmp/kernel-core-4.11.0-0.rc6.git2.1.fc27.x86_64.rpm`

BASE
----
2.26user 0.43system 0:03.90elapsed 69%CPU (0avgtext+0avgdata 196708maxresident)k
0inputs+75744outputs (1major+82590minor)pagefaults 0swaps

2.36user 0.44system 0:07.54elapsed 37%CPU (0avgtext+0avgdata 195668maxresident)k
45280inputs+75784outputs (10major+82618minor)pagefaults 0swaps

USING fallocate+fdatasync+fadvise+fsync
---------------------------------------
2.60user 0.89system 2:01.76elapsed 2%CPU (0avgtext+0avgdata 196632maxresident)k
0inputs+76488outputs (1major+82685minor)pagefaults 0swaps

2.59user 0.85system 2:01.93elapsed 2%CPU (0avgtext+0avgdata 196456maxresident)k
0inputs+75752outputs (1major+82684minor)pagefaults 0swaps

2.80user 0.89system 2:30.68elapsed 2%CPU (0avgtext+0avgdata 196004maxresident)k
44952inputs+76520outputs (19major+82666minor)pagefaults 0swaps

2.77user 0.95system 3:00.88elapsed 2%CPU (0avgtext+0avgdata 196216maxresident)k
50008inputs+75792outputs (27major+82668minor)pagefaults 0swaps

(aside)
These measurements were taken on a build server -- the slower measurements are the result of active builds running in Docker.

So fallocate+fdatasync+fadvise+fsync is (nominally) ~30x slower, measured by wall clock for an install onto rotating media.

@jayzmh: (to summarize, I have said most of this before indirectly. JMHO, YMMV)

I'd like to hear simple measurements of the effect of RPM+patch on both SSD and rotating media. The measurements (or at least a warning) should be included with any "opt-in" macro that you choose for configuration so that other users might be forewarned of the significant delay to expect from using fsync(2).

(aside)
I'll be happy to run these measurements for you on whatever patch you choose to merge into RPM, as well as the tests below.

I'd also like to hear that RPM+whatever_patch_you_use "works" because the installed files are not in cache, not merely because RPM is running ~30x slower (or whatever the delay is on SSD) and hence no longer competing for cache effectively. There's nothing in the fsync(2) man page that specifies any effect on the kernel cache, only that fsync(2) returns once the file's data and metadata have been written to disk.

Basically I'd like to see what mincore(2) has to say about installed-file-in-cache after RPM exits if you only wish to use fsync(2), or (perhaps) a comparison in which fsync(2) is replaced with a ~50ms delay per file (roughly the fdatasync(2) delay I am seeing for every file installed), i.e. only
    `usleep(50000);`
is used.

One simple way to check the kernel cache would be to use fincore(1) from linux-ftools (https://code.google.com/p/linux-ftools/) immediately after an RPM install (or use the C routine fmincore() in my patch above).
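
For the record, a minimal sketch of the mincore(2) approach looks like the following. This is not the fmincore() routine from my patch, and pages_in_cache() is an invented name; it only illustrates how per-page residency can be read back for an installed file after RPM exits:

```c
/* Illustrative cache-residency check using mincore(2); not the fmincore()
 * routine from the patch, just a sketch of the same idea. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Return the number of resident page-cache pages for a file, or -1 on error. */
static long pages_in_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat sb;
    if (fstat(fd, &sb) < 0 || sb.st_size == 0) {
        close(fd);
        return -1;
    }

    long pagesize = sysconf(_SC_PAGESIZE);
    size_t npages = (size_t)((sb.st_size + pagesize - 1) / pagesize);

    /* Map the file read-only; mincore(2) reports per-page residency. */
    void *map = mmap(NULL, sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    unsigned char *vec = calloc(npages, 1);
    long resident = -1;
    if (vec && mincore(map, sb.st_size, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;     /* LSB == page is in core */
    }

    free(vec);
    munmap(map, sb.st_size);
    close(fd);
    return resident;
}

int main(int argc, char *argv[])
{
    for (int i = 1; i < argc; i++)
        printf("%s: %ld pages resident\n", argv[i], pages_in_cache(argv[i]));
    return 0;
}
```

Run against the files listed in the just-installed package, this gives the same answer as fincore(1): how much of the payload is still sitting in the page cache.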

@cgwalters: Jens Axboe's "buffer bloat" patch deals only with autoscaling to prevent large writes from degrading overall system performance. The other part of the problem is that file durability via fsync(2) is costly, and there isn't an obvious solution (at least that I can find) for mitigating that cost. Perhaps spinning off a helper thread on file close will "work": I'm not sure I trust the widely cited conclusions at http://oldblog.antirez.com/post/fsync-different-thread-useless.html, particularly with the "buffer bloat" patch applied. RPM I/O is also many files, not one very large file.
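
To make the helper-thread idea concrete, a rough sketch might look like the following. close_with_async_fsync() is an invented name, and this is not a proposed implementation, just an illustration of handing fsync(2) to a detached thread at close time:

```c
/* Illustrative only: hand the fsync(2) off to a detached thread so the
 * per-file flush does not block the installer's main loop. Errors are only
 * logged, not propagated, which is part of the doubt about this approach. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static void *fsync_worker(void *arg)
{
    int fd = (int)(intptr_t)arg;
    if (fsync(fd) != 0)
        perror("fsync");
    close(fd);
    return NULL;
}

/* Drop-in replacement for close(fd): schedules fsync+close in the background. */
static int close_with_async_fsync(int fd)
{
    pthread_t tid;
    pthread_attr_t attr;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    if (pthread_create(&tid, &attr, fsync_worker, (void *)(intptr_t)fd) != 0) {
        /* Fall back to a synchronous flush if no thread could be spawned. */
        int rc = fsync(fd);
        close(fd);
        pthread_attr_destroy(&attr);
        return rc;
    }

    pthread_attr_destroy(&attr);
    return 0;
}
```

Whether this actually helps, or merely moves the stall into the kernel's writeback path as the antirez post argues, is exactly the open question.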

@pmatilai: worrying about naming a macro "_periodic_fsync" and later generalizing it to call f*sync(2), say, every nBytes/nSecs is likely unnecessary. For starters, RPM installations involve many files, not a single large file, and fsync(2) takes a single fdno argument. Opening with "w+.fdio" (i.e. open read-write and truncate) and pre-sizing with fallocate(2) (so that st->st_size metadata does not need to be repeatedly sync'd as writes progress) is likely a trivial implementation step in the right direction on all media.
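
A rough sketch of that direction, using raw syscalls rather than rpmio's Fopen()/"w+.fdio" path, and assuming the final payload size is known up front (install_file() is an invented name, not actual rpm code):

```c
/* Rough sketch: pre-size the target file so its length metadata is settled
 * once, write the payload, then flush the data with fdatasync(2). This
 * mirrors the fallocate-before-write idea, not the rpmio implementation. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int install_file(const char *path, const void *payload, size_t size)
{
    /* Open read-write and truncate, roughly what "w+.fdio" does. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Reserve the final size up front so st_size/block allocation is not
     * repeatedly dirtied as writes progress. */
    if (fallocate(fd, 0, 0, (off_t)size) != 0)
        perror("fallocate");            /* not fatal; write() still extends */

    /* Simplified: a real implementation would loop on partial writes. */
    ssize_t n = write(fd, payload, size);

    /* Flush file data; the size metadata was already settled by fallocate(2). */
    if (n == (ssize_t)size && fdatasync(fd) != 0)
        perror("fdatasync");

    close(fd);
    return (n == (ssize_t)size) ? 0 : -1;
}
```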

-- 
https://github.com/rpm-software-management/rpm/pull/187#issuecomment-294409265