[Rpm-maint] [rpm-software-management/rpm] Python module's name changed unnecessarily, making it impossible to express dependencies on it (#373)
Adam Williamson
notifications at github.com
Sat Dec 23 00:39:22 UTC 2017
OK, let's both calm down a bit. I was just annoyed at running into this entirely unnecessary problem while I was in a hurry last night. I agree that in practical terms it's not a major disaster that's likely to affect lots of people catastrophically. It's still *wrong*, though.
Let's take a second here to dig into naming a bit, because this whole area of nomenclature is kind of a disaster in Python. We're dealing with quite a few sort of overlapping...*things*, here. I've certainly been using 'module' and 'library' a bit sloppily and interchangeably.
Let's start with the easy terms: a module is, as https://packaging.python.org/glossary/#term-import-package puts it, "The basic unit of code reusability in Python". So in your project, `transaction.py` is a single module, which you could load in Python by running `import rpm.transaction`.
One level up from that we have what the glossary calls an "Import Package". If you do `import rpm`, you're loading an "import package". On the filesystem, I'd say these things are part of the "import package" called `rpm`:
/usr/lib64/python2.7/site-packages/rpm
/usr/lib64/python2.7/site-packages/rpm/__init__.py
/usr/lib64/python2.7/site-packages/rpm/_rpm.so
/usr/lib64/python2.7/site-packages/rpm/_rpmb.so
/usr/lib64/python2.7/site-packages/rpm/_rpms.so
/usr/lib64/python2.7/site-packages/rpm/transaction.py
Notably, the `.egg-info` file *IS NOT* part of this "import package". An "import package" *as such* doesn't really contain any packaging 'metadata'. It's just a bunch of Python modules organized in a certain way.
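To make the module / "import package" distinction concrete, here is a minimal sketch. It uses the standard library's `json` rather than `rpm` (an assumption made only so the snippet runs anywhere), and the helper name `describe` is hypothetical. An "import package" carries a `__path__` attribute; a plain module does not.

```python
import importlib


def describe(name):
    """Import `name` and report whether it is an import package or a plain module."""
    mod = importlib.import_module(name)
    # Import packages have a __path__ attribute (the directories searched for
    # submodules); plain modules do not.
    kind = "import package" if hasattr(mod, "__path__") else "module"
    return kind, getattr(mod, "__file__", None)


# `json` is a directory with an __init__.py; `json.decoder` is one file inside it.
for name in ("json", "json.decoder"):
    kind, path = describe(name)
    print(f"{name}: {kind} ({path})")
```

Nothing in this check touches packaging metadata, which is the point: the import system only sees the modules.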
Lastly, we have what the glossary refers to as a ["Built Distribution"](https://packaging.python.org/glossary/#term-built-distribution). When you run `python setup.py build` you're creating one of these "built distributions", and when you run `python setup.py install` you install it. In *your* project, the "built distribution" basically consists of the "import package" plus the `.egg-info` file, but they can potentially contain much more than this - they can contain multiple "import packages" and different modules, 'extra data' files, tons of stuff. (For the record, the term 'library' doesn't really *have* a canonical meaning in this area, but mostly where I've used it above, I really meant "built distribution").
So there's a key point here: the name of your "import package" is 'rpm'. It always has been - before this change, after this change. What's at issue is the name of the "built distribution". Before this change the "built distribution" was called 'rpm-python'. Now it's called 'rpm'.
All the tools we've been talking about - setuptools, pip, tox etc. - deal in "built distributions", not "import packages". If you run, e.g., `pip install rpm`, you are telling pip "ensure the built distribution named 'rpm' is installed". You are definitely *not* telling it "ensure an import package named 'rpm' is installed". I hope this is clear now and doesn't require further explanation? When you browse around on pypi, for example, you're browsing through various "built distributions". By policy of course two "built distributions" registered on pypi can't have the same name, but two "built distributions" registered on pypi *could* potentially install identically-named but quite different "import packages", or modules.
It's easy to get these concepts confused when you're dealing with a relatively simple codebase like this (at least so far as the Python bits are concerned), where there's one "import package" and there's no particular reason why the "built distribution" should have a different name from it. But not *all* codebases are like that. There are quite a few "built distributions" which install *multiple* "import packages". (And by the same token, there are ones that install only a single module, no "import package" at all). Obviously, in this case, the name of the "built distribution" will match at most *one* of the "import packages". It may match none, but it certainly can't simultaneously be the same as *all* of them.
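One way to see the two namespaces side by side is to ask the installed metadata which top-level import names each distribution provides. The sketch below assumes Python 3.8 or later (for `importlib.metadata`, which did not exist when this thread was written) and assumes the distributions were built with setuptools, which usually records those names in `top_level.txt`. The function name is hypothetical.

```python
from importlib import metadata


def import_names_by_distribution():
    """Map each installed distribution name to the top-level names it installs."""
    mapping = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name is None:
            # Skip entries whose metadata carries no Name field.
            continue
        # setuptools-built distributions usually list their top-level import
        # names in top_level.txt; other build tools may omit the file.
        top_level = dist.read_text("top_level.txt") or ""
        mapping[name] = sorted(top_level.split())
    return mapping


# Show only the distributions whose name differs from what you would import.
for name, imports in sorted(import_names_by_distribution().items()):
    if imports and imports != [name]:
        print(f"{name} -> {imports}")
```

What this prints depends entirely on what is installed locally. An empty list for a distribution only means `top_level.txt` was absent, not that the distribution installs nothing.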
There's also a very common case where the names necessarily differ (and this one really *is* a case where I'd be right behind you in saying someone did something stupid here): Python import names are not allowed to have dashes in them, so when you're separating words in the name of a module or "import package", you usually use underscores. However, pypi by policy (it could actually be a rule inherited from one of the underlying tools, I'm not sure...) *does not allow* underscores in the name of "built distributions". So you very commonly find that the pypi "built distribution" for an "import package" with a `_` in its name has a `-` in it instead. I own [one of these](https://pypi.python.org/pypi/resultsdb-conventions) - the "built distribution" is called `resultsdb-conventions`, the "import package" is called `resultsdb_conventions`...
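The dash/underscore mismatch can be checked mechanically. The sketch below uses the name normalization described in PEP 503, under which runs of `-`, `_` and `.` compare as equivalent in distribution names; whether that is the exact policy the paragraph above has in mind is an assumption. Both helper names are hypothetical.

```python
import re


def normalize_distribution_name(name):
    """Normalize a distribution name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_importable_name(name):
    """True if every dot-separated part of `name` is a valid Python identifier."""
    return all(part.isidentifier() for part in name.split("."))


for candidate in ("resultsdb_conventions", "resultsdb-conventions"):
    print(
        candidate,
        "normalized:", normalize_distribution_name(candidate),
        "importable:", is_importable_name(candidate),
    )
```

Both spellings normalize to the same distribution name, but only the underscore form can appear in an `import` statement.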
I don't think there's anything fundamentally wrong with this whole setup. "Built distributions" are quite analogous to distribution packages in this system, after all, and we don't allow *those* to have multiple names. I suppose we *do* have the whole `Provides:` mechanism in RPM, though (and equivalent in other distribution packaging systems), which at least AFAIK this system doesn't provide. So you could argue that either that *or* the ability to express dependencies like `requires foo OR bar` is missing.
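In the absence of a `Provides:`-style mechanism, a consumer can approximate `requires foo OR bar` at runtime by probing for each distribution name in turn. This is only a sketch of that workaround, not something the packaging tools do for you; it assumes Python 3.8 or later for `importlib.metadata`, and `first_installed` is a hypothetical helper.

```python
from importlib import metadata


def first_installed(candidates):
    """Return (name, version) of the first installed distribution, else None."""
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution name is not installed; try the next one.
            continue
    return None


# Accept either the new or the old distribution name from this thread.
print(first_installed(["rpm", "rpm-python"]))
```

This only helps code that runs after installation. It does nothing for a declarative dependency list, which is where the renaming actually bites.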
Finally, just to try (again) to calm things down a bit: I don't intend my suggestion that you don't know all this stuff as a *criticism*. There is a nearly infinite amount of stuff to know about and we all know almost none of it. We're all ragingly ignorant of about 99.9% of all possible human knowledge. :P All I meant to say was, there's quite a lot of background information to know about all these bits, and renaming a "built distribution" is a relatively unusual thing to do that, if you're interested in doing it properly, probably requires some research. For instance, *I* certainly don't know how I'd go about doing that, if I wanted to do it in such a way that things which depended on the old name still worked. I've never done it, so I haven't developed that knowledge. If I were going to do it, the first thing I'd do is go do some more research.
--
https://github.com/rpm-software-management/rpm/issues/373#issuecomment-353695883