[Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

Steven Morad notifications at github.com
Wed Jun 7 18:13:46 UTC 2017


Let me preface this by saying we are doing something unorthodox: we are running RPM 4.12.90 on macOS 10.12.

It turns out that on Linux, querying and writing to the database can cause corruption. On macOS, just querying in parallel can cause it. We can reproduce it by running `for i in {1..30}; do /bin/rpm -qa & done`. I have some information about how and why this happens: using sandbox-exec, I was able to trace what `rpm -qa` does and what `rpm --rebuilddb` does to fix the corruption.
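
For anyone who wants to repeat the experiment, here is a rough sketch of the loop above that also waits for every query and counts the ones that failed. The function name `run_parallel` is made up for illustration, and treating a non-zero exit status as a sign of trouble is only an assumption; corruption may well show up in other ways.

```shell
#!/usr/bin/env bash
# Hypothetical helper: launch N copies of a command in parallel, wait for
# all of them, and print how many exited with a non-zero status.
# Usage: run_parallel N command [args...]
run_parallel() {
    local n=$1; shift
    local pids=() failures=0 i pid
    for ((i = 0; i < n; i++)); do
        "$@" >/dev/null 2>&1 &      # discard output, keep only the status
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failures=$((failures + 1))
    done
    echo "$failures"
}
```

Something like `run_parallel 30 /bin/rpm -qa` would then stand in for the one-liner.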

BDB `mmap`s regions of the db to increase performance, and then backs those regions with files on the filesystem. I'm not sure why it does this, as I would imagine mmap already takes care of flushing changes back to the db. Perhaps the db regions are "decompressed" and more performant?
Source: https://web.stanford.edu/class/cs276a/projects/docs/berkeleydb/ref/env/region.html
What is happening is that `rpm -qa` is actually writing to the files behind these file-backed mmapped regions:

```
[root@redacted ~]# grep write /tmp/trace/trace_output.sb
(allow file-write-data (path "/dev/dtracehelper"))
(allow sysctl-write (sysctl-name "kern.procname"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/.dbenv.lock"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.001"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.001"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.002"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.003"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.004"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/.dbenv.lock"))
```
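
To see at a glance which files a query writes to, the trace can be condensed into a count per path. This is only a sketch: `summarize_writes` is a made-up name, and it assumes the trace lines look exactly like the `file-write-data` lines above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count file-write-data entries per path in a
# sandbox trace file whose lines look like
#   (allow file-write-data (path "/some/file"))
summarize_writes() {
    sed -n 's/^(allow file-write-data (path "\(.*\)"))$/\1/p' "$1" |
        sort | uniq -c | sed 's/^ *//'   # strip the padding uniq -c adds
}
```

Run against the trace above (`summarize_writes /tmp/trace/trace_output.sb`), it would report two writes each for `.dbenv.lock` and `__db.001`, and one each for the other region files and `/dev/dtracehelper`.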


The way `rpm --rebuilddb` fixes this is by unlinking the regions:

```
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.001"))
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.002"))
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.003"))
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.004"))
```

It turns out that unlinking them by hand also fixes the corruption. I haven't figured out why the corrupted regions don't flush their changes back to the real db and corrupt that as well.
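
Unlinking by hand amounts to something like the sketch below. `remove_bdb_regions` is a made-up helper name, and I am assuming this is only safe while no other rpm process has the database open.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove the Berkeley DB region files (__db.NNN)
# from an rpm database directory, leaving the real database files alone.
# Assumption: no other rpm process is using the database at the time.
remove_bdb_regions() {
    local dbdir=$1 f
    for f in "$dbdir"/__db.[0-9][0-9][0-9]; do
        [ -e "$f" ] || continue     # glob matched nothing
        rm -f -- "$f" && echo "removed $f"
    done
}
```

On our machines that would be `remove_bdb_regions /opt/yum/var/lib/rpm`.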

I've written a sandbox profile that disallows writes to the file-backed mmapped regions. This means that we can call `sandbox-exec -f $sandbox_profile rpm -qa` to safely read, with zero chance of corrupting the db. Note that the region files keep the same sizes and timestamps across repeated queries:

```
[root@redacted ~]# sandbox-exec -f rpm-query-nowrite.sb -- /bin/rpm -qa &>/dev/null
[root@redacted ~]# ls -la /var/lib/rpm/__db.00*
-rw-r--r--  1 root  root    24576 Jun  7 10:19 /var/lib/rpm/__db.001
-rw-r--r--  1 root  root   507904 Jun  7 10:19 /var/lib/rpm/__db.002
-rw-r--r--  1 root  root  1318912 Jun  7 10:19 /var/lib/rpm/__db.003
-rw-r--r--  1 root  root   811008 Jun  7 10:19 /var/lib/rpm/__db.004
[root@redacted ~]# sandbox-exec -f rpm-query-nowrite.sb -- /bin/rpm -qa &>/dev/null
[root@redacted ~]# ls -la /var/lib/rpm/__db.00*
-rw-r--r--  1 root  root    24576 Jun  7 10:19 /var/lib/rpm/__db.001
-rw-r--r--  1 root  root   507904 Jun  7 10:19 /var/lib/rpm/__db.002
-rw-r--r--  1 root  root  1318912 Jun  7 10:19 /var/lib/rpm/__db.003
-rw-r--r--  1 root  root   811008 Jun  7 10:19 /var/lib/rpm/__db.004
```
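
The contents of `rpm-query-nowrite.sb` are not shown above. As an illustration only, a profile in that spirit could be generated as follows; the helper name `write_nowrite_profile` and the exact rules are assumptions, not the actual profile, and the database directory has to be given as the sandbox sees it (after symlinks are resolved).

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a sandbox profile that allows everything
# except writes to the __db.NNN region files under the given directory.
# Note: the directory is pasted into the regex unescaped, so it should
# not contain regex metacharacters.
# Usage: write_nowrite_profile DBDIR OUTPUT_FILE
write_nowrite_profile() {
    local dbdir=$1 out=$2
    cat > "$out" <<EOF
(version 1)
(allow default)
(deny file-write* (regex #"^${dbdir}/__db\\.[0-9]+\$"))
EOF
}
```

After `write_nowrite_profile /opt/yum/var/lib/rpm rpm-query-nowrite.sb`, the query would be run as in the session above: `sandbox-exec -f rpm-query-nowrite.sb -- /bin/rpm -qa`.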

Is it possible there is a bug in the way you file-back your mmap'ed regions?

-- 
https://github.com/rpm-software-management/rpm/issues/232