[Rpm-maint] [PATCH 10/15] Port tagNumFromPyObject on Python 3 to use unicode objects

Panu Matilainen pmatilai at redhat.com
Tue Oct 20 08:29:57 UTC 2009


On Thu, 15 Oct 2009, James Antill wrote:

> On Thu, 2009-10-15 at 15:15 -0400, David Malcolm wrote:
>> Preserve the string-based API to headers:
>>    h['name']
>> by expecting a PyUnicode on Python 3, and a PyString on Python 2.
>> ---
>>  python/header-py.c |    7 +++++++
>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>
>> diff --git a/python/header-py.c b/python/header-py.c
>> index 98cd753..f3454df 100644
>> --- a/python/header-py.c
>> +++ b/python/header-py.c
>> @@ -421,8 +421,15 @@ int tagNumFromPyObject (PyObject *item, rpmTag *tagp)
>>      if (PyInt_Check(item)) {
>>  	/* XXX we should probably validate tag numbers too */
>>  	tag = PyInt_AsLong(item);
>> +#if PY_MAJOR_VERSION >= 3
>> +    } else if (PyUnicode_Check(item)) {
>> +	PyObject *utf8_bytes = PyUnicode_AsUTF8String(item);
>> +	tag = rpmTagGetValue(PyBytes_AsString(utf8_bytes));
>> +	Py_XDECREF(utf8_bytes);
>> +#else
>
> There's no reason for ifdef here, as it isn't a bug if:
>
> ipkg.hdr[u'name']
>
> ...works in 2.* ... ditto for hdr[b'name'] working in 3.*.

Yup, and the same goes for any place where rpm accepts strings from
python, e.g. ts.dbMatch(). Current rpm.org HEAD is being strict about it
and only accepts plain strings everywhere, leaving all the responsibility
for encoding to the caller, who is not really in any better position to
know what rpm might accept.

Leaving it that way is an option, but if everything is Unicode in Python
3... maybe we should just bite the bullet and handle the conversions to
utf8 in the bindings after all. It just needs to be consistent: either we
accept unicode everywhere or we don't. Might be best handled with an arg
conversion function, e.g.

int rpmStringFromPyObject(PyObject *item, PyObject **str)
{
     PyObject *res = NULL;
     if (PyBytes_Check(item)) {
         Py_XINCREF(item);
         res = item;
     } else if (PyUnicode_Check(item)) {
         res = PyUnicode_AsUTF8String(item);
     } else {
         PyErr_SetString(PyExc_TypeError, "string or unicode expected");
     }
     if (res) {
         *str = res;
     }
     return (res != NULL);
}

...so any place needing to handle either a unicode or a string/bytes
object, such as tagNumFromPyObject(), would be something like

     PyObject *stro = NULL;
     ...
     if (PyInt_Check(item)) {
         /* XXX we should probably validate tag numbers too */
         tag = PyInt_AsLong(item);
     } else if (rpmStringFromPyObject(item, &stro)) {
         tag = rpmTagGetValue(PyBytes_AsString(stro));
         Py_XDECREF(stro);
     } else {
     ...
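
From the Python side, the behaviour being aimed at could be modelled
roughly as below. This is only a sketch in Python 3 terms, not the
binding code: rpm_string_from_object() and tag_num_from_object() are
hypothetical stand-ins for the C helpers above, and the _TAGS table and
the -1 "not found" value are illustrative assumptions in place of
rpmTagGetValue().

```python
# Illustrative lookup table only; stands in for rpmTagGetValue().
_TAGS = {b"name": 1000, b"version": 1001}


def rpm_string_from_object(item):
    """Hypothetical model of rpmStringFromPyObject(): always hand back bytes."""
    if isinstance(item, bytes):
        return item                   # bytes pass through unchanged
    if isinstance(item, str):
        return item.encode("utf-8")   # unicode is encoded to UTF-8
    raise TypeError("string or unicode expected")


def tag_num_from_object(item):
    """Hypothetical model of tagNumFromPyObject(): accept int, bytes or str."""
    if isinstance(item, int):
        return item                   # tag numbers are taken as-is
    key = rpm_string_from_object(item)
    return _TAGS.get(key, -1)         # -1 is an assumed "not found" marker
```

With that, tag_num_from_object('name') and tag_num_from_object(b'name')
resolve to the same tag, which is the consistency argued for above.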

 	- Panu -

