[Rpm-maint] [PATCH] handle errors when constructing lists in the Python bindings

David Malcolm dmalcolm at redhat.com
Mon Dec 19 15:24:25 UTC 2011


On Mon, 2011-12-19 at 09:24 +0100, Ales Kozumplik wrote:
> On 12/16/2011 04:24 AM, David Malcolm wrote:
> > Various functions in the Python bindings construct lists of objects,
> > but assume that all calls succeed.
> >
> >    python/header-py.c: hdrKeyList
> >    python/rpmps-py.c: rpmps_AsList
> >    python/rpmtd-py.c: rpmtd_AsPyobj
> >    python/spec-py.c: spec_get_sources
> >    python/spec-py.c: spec_get_packages
> >
> > Each of these could segfault under low-memory conditions: if the PyList_New()
> > call fails, PyList_Append(NULL, item ) will segfault.
> > Similarly, although Py_List_Append(list, NULL) is safe, Py_DECREF(NULL) will
> > segfault.
> >
> > Add error-handling for both cases to each function.
> >
> > Caveat: compiles, but is not tested.
> >
> > Found using an experimental static analysis tool I've been writing:
> >    http://gcc-python-plugin.readthedocs.org/en/latest/cpychecker.html
> > which is part of https://fedorahosted.org/gcc-python-plugin
> >
> > A sample HTML report from the tool on one of these cases (specifically
> > hdrKeyList) can be seen here:
> >    http://fedorapeople.org/~dmalcolm/gcc-python-plugin/2011-12-15/header-py.c.hdrKeyList-refcount-errors.html
> > ---
> 
> Hi David,
> 
> thank you for the patch.
> 
> I have a question about returning NULL from a python method like the 
> patch does. Will it cause an exception? It would be preferable, rpm 

The standard behavior of functions that return a PyObject* is that a
NULL value means that an exception *has* been raised: the standard
exception objects (e.g. the PyExc_MemoryError global variable for
"MemoryError") are pre-allocated (to avoid the need to allocate them
when memory is low), and when a NULL is returned, the thread-local
"current exception" has been set to point to an exception object.  So if
you get a NULL back, you can return NULL to propagate the exception back
up the call stack, and that's typically the best thing to do.

See e.g.:
http://docs.python.org/extending/extending.html#intermezzo-errors-and-exceptions
and:
http://docs.python.org/c-api/exceptions.html


> generally tries to terminate quickly if an OOM situation is encountered. 

This is a pet peeve of mine: I think it's a bad idea for libraries to
terminate when they encounter errors: even out-of-memory problems.  We
don't know about what the rest of the program is doing, and it may be
that a well-coded application could trap the out-of-memory exception and
free up resources and try again (e.g. running the garbage collector, or
freeing a cache, etc).  Also, we should give the application a chance to
exit gracefully, but we can't know all of cleanups that are needed
(leaving the rpm db in a sane state is one thing, but there may be
others).  Python tends to do a better job at error handling than raw C,
given that we have a standardized pattern for error handling (i.e.
exceptions): assuming the code is written to handle NULLs, the NULLs can
propagate the MemoryError exception up the call stack and the latter can
perhaps be handled somewhere.

Having said that, glib (and thus gtk) take the attitude of "if malloc
fails, kill the process":
http://developer.gnome.org/glib/2.30/glib-Memory-Allocation.html#glib-Memory-Allocation.description
which makes all of this in the context of e.g. anaconda somewhat moot,
alas :(

>   Why not use http://docs.python.org/c-api/exceptions.html#PyErr_Format 
> then?
When PyList_New() fails, it has already called PyErr_NoMemory(), which
sets the thread's exception state to PyExc_MemoryError.  Manually
wrapping all of those possible failures to add additional contextual
information would lead to madness.

Hope this is helpful
Dave



More information about the Rpm-maint mailing list