[Rpm-ecosystem] Containerization using rpm

Tue May 19 14:55:35 UTC 2015

Hi.

Sorry for the slow response... 

On Fri, 15 May 2015 09:07:43 -0400
Neil Horman <nhorman at tuxdriver.com> wrote:

...
> > I agree completely. However my impression is that "quicker" is actually
> > the point of containers (and "dirtier" is the usual side-effect). I've
> > already mentioned this: "time to market" is the cost function for
> > container application developers. And if we want them to use rpm we need
> > to offer them a solution that would not be worse than the competing ones in
> > this regard.
> > 
> 
> That's an excellent point. I too hear that most people want containers that
> are 'quick' to assemble, because getting a service stood up ASAP is
> financially desirable.  I wonder if, given our proposed use of rpm here, it
> might be worth trying to change the narrative such that containers don't
> have to be quick to assemble if they are instead quick to tailor to your
> needs?  That is to say, if we allow for a container to be built and curated
> in a slow path of development, but provide an interface that allows for
> derivative containers to quickly customize aspects that the parent
> container marks as being customizable, would that be a worthwhile endeavor?

Maybe. It's up to the users... However, if producing e.g. a Docker image were
significantly easier than producing a "container rpm", then I wouldn't bet
on rpm.  And if there are no images to modify, the ease of modification
doesn't matter...

> > There is one more thing that anybody who tried to install a container from
> > distribution packages encountered: the resulting containers were rather
> > huge (314 MB only for httpd in my current test case...). Being able to
> > remove the
> Yup, the test container manifest that I have in freight is an httpd server,
> and it clocks in right at 314MB.
> 
> > unnecessary stuff and still not lose the possibility to use rpm for the
> > content management might actually make the "dirtier" aspect less
> > dangerous. I would be able to find out there is a security update for the
> > bits I ship in my container for example...
> > 
> Hm, do you think it would be worth adding a strip stage to the container
> build, that could let a user specify a list of files to remove from the
> container prior to packaging?  That would certainly be a dirty hack, but it
> would allow for reduction of size that was custom to a user's needs, and if
> you did it in the input manifest, the freight-builder tool could easily scan
> the list and warn/fail the operation if you tried to remove 'critical files'
> like the rpm database, so you would be prevented from removing rpm query
> functionality.

I think the users already do remove some files manually. At least those I
know. So having this ability could be useful.
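
A check like the one described above could be quite small. The following is
only a sketch, not anything freight-builder actually does: the names
check_strip_list and CRITICAL_PATHS are made up for illustration, and it
assumes the strip list holds absolute, already normalized paths and that the
rpm database lives in /var/lib/rpm.

```python
from pathlib import PurePosixPath

# Assumption: paths that must survive the strip stage. /var/lib/rpm is the
# usual rpm database location; the list itself is hypothetical.
CRITICAL_PATHS = ("/var/lib/rpm",)


def check_strip_list(strip_list, critical=CRITICAL_PATHS):
    """Return the strip-list entries that would remove a critical path."""
    offending = []
    for entry in strip_list:
        path = PurePosixPath(entry)
        for crit in critical:
            crit_path = PurePosixPath(crit)
            # An entry is dangerous if it is the critical path itself,
            # lies below it, or is one of its parent directories.
            if (path == crit_path
                    or crit_path in path.parents
                    or path in crit_path.parents):
                offending.append(entry)
                break
    return offending


if __name__ == "__main__":
    wanted = ["/usr/share/doc", "/var/lib/rpm/Packages", "/var/lib"]
    for entry in check_strip_list(wanted):
        print("refusing to strip critical path:", entry)
```

The builder could then decide whether an offending entry is a warning or a
hard failure.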

> > I was thinking also about another approach: merge the rpm whose file I
> > modified into the "container rpm". I.e., for all the modified or removed
> > non-config files do something like 'rpm -e --justdb' of their owners and
> > then the subsequent container tree scan would identify them as unpackaged
> > and add to my new container (it should be possible to save the scriptlets
> > before the package removals too)... This should be possible without
> > modifying the rpm tools but the resulting package would be an
> > extraordinary mess and I'm afraid I would end up with a single all-in-one
> > rpm file quite often. The main advantage (tiny rpm that can compose its
> > own container) would disappear.
> > 
> 
> That's an interesting idea.  Another variant of the idea along the same lines
> would be to create a derivative rpm with custom files, that does various
> rpm/yum manipulations as part of a post install script.  That is to say,
> imagine you have a container that wants to run httpd, and you want it to run
> your custom website, which uses postgres, but the generic httpd container
> ships with mysql. What if your custom container had a spec file that looked
> like this:

Well... In the world of "microservices" such a situation should not happen:

* A container is exactly one service (IIRC Kubernetes even assumes a container
  has only one network port open).
* If an application requires more services, it gets composed from more
  containers, and those are then glued together by some management
  infrastructure (by creating their private virtual network, mounting a common
  filesystem, etc.). The customization is there just to add configuration or
  content (web site files to the httpd container, an altered my.cnf in the
  database container).

And in my world of "self-composing containers" it should not happen either,
of course. :)

Eventually it would be nice if the management tooling could support anything
from a "microservice" to a full OS container (complete userspace). That is
why I wanted to package only the changed parts of any container. Then it
would be up to the user where to install them (a separate container or the
same one; even outside of any container should work).
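
Finding the changed parts could start from rpm's own verification output.
Below is a rough sketch, assuming the usual `rpm -V` line layout (attribute
flags or the word "missing", an optional one-letter marker such as "c" for
%config, then the path). The function name changed_files is made up for
illustration, and any reported difference is treated as "modified".

```python
import re

# One line of verify output: flags (or "missing"), optional file marker
# (c = %config, d = %doc, g = %ghost, l = %license, r = %readme), path.
VERIFY_LINE = re.compile(
    r"^(?P<flags>missing|[A-Za-z0-9.?]{8,9})\s+"
    r"(?:(?P<marker>[cdglr])\s+)?"
    r"(?P<path>/.*)$"
)


def changed_files(verify_output):
    """Split verify output into (modified, removed) non-config paths."""
    modified, removed = [], []
    for line in verify_output.splitlines():
        match = VERIFY_LINE.match(line)
        if not match or match.group("marker") == "c":
            continue  # skip unparsable lines and config files
        if match.group("flags") == "missing":
            removed.append(match.group("path"))
        else:
            modified.append(match.group("path"))
    return modified, removed


if __name__ == "__main__":
    sample = (
        "S.5....T.  c /etc/httpd/conf/httpd.conf\n"
        "S.5....T.    /usr/sbin/httpd\n"
        "missing     /usr/share/doc/httpd/README\n"
    )
    print(changed_files(sample))
```

The input would come from something like `rpm -Va --root <container tree>`,
and the owners of the listed files could then be looked up with `rpm -qf`.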

> 
> Name: mywebsite
> Version: 1
> Release: 1
> Requires: generic-httpd-container # houses httpd/mysql/php
> Source0: mywebsite-html.tbz2 # extracts to /var/www
> Source1: mywebsite.conf
> 
> %install
> tar xvf $SOURCE0
> 
> %postinstall
> yum erase mysql
> yum install postgres
> cp $SOURCE1 /etc/httpd/conf.d/
> #any other config file edits go here
> 
> #make the container disk image a bit more lightweight
> rpm -e bash libreoffice mongodb supertuxkart foobar gcc 

This is why I think composing a container from small pieces is better than
having to purge a big generic one.  If the vendor of the bits is trustworthy
(unlike anonymous images in the upstream Docker registry), I don't have to
worry about incompatibilities, and I get the advantage of not having to ship
everything on my own, of getting the latest fixes, and eventually of being
able to make partial updates.  (I know it's considered a non-issue right now,
but I expect people to discover soon that having to re-deploy several hundred
MB each time there is a security update of a small library might actually be
quite expensive.)

> It's incomplete of course, but there's no reason that, on install, we can't
> customize a container on the fly.  And the database remains consistent that
> way (albeit changed from the parent database, which is needed).
> 
> The space savings on disk would be nice, though it's still a bit messy
> because you have to download the additional bits before removing them,
> which is less than great

Yes.  I don't like that idea too much.  It would be nice if those cleanups
weren't needed at all.  There is some initiative to break unnecessary
dependencies, and the soft dependencies might also help. However, it's going
to take time before we see the real effect, and the result would still be
worse than the "micro-distributions" in this regard. So the ability to track
what the user did to the "clean" system and to replay those actions would
still be valuable.  Managing containers (container images) would require
greater flexibility than managing a Linux distribution.
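
At the package level, the tracking could be as simple as comparing two
package lists. This is only a sketch under stated assumptions: the function
name replay_actions is made up, the lists are plain package names (for
example from `rpm -qa --qf '%{NAME}\n'` run on both trees), and file edits or
version changes are not covered.

```python
def replay_actions(clean_packages, current_packages):
    """Derive (action, package) steps that turn the clean set into the current one."""
    clean, current = set(clean_packages), set(current_packages)
    # Packages the user removed from the clean system.
    actions = [("erase", name) for name in sorted(clean - current)]
    # Packages the user added on top of the clean system.
    actions += [("install", name) for name in sorted(current - clean)]
    return actions


if __name__ == "__main__":
    clean = ["bash", "httpd", "mysql"]
    current = ["bash", "httpd", "postgres"]
    for action, name in replay_actions(clean, current):
        print(action, name)
```

Replaying the resulting list on another clean system would reproduce the
package set, though not any manual file removals.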

Regards,
-- 
Tomáš Smetana