1. CRIU configuration files

    One of CRIU's use cases is container checkpointing and restoring, which can also be used to migrate containers. Container runtimes therefore use CRIU to checkpoint all the processes in a container as well as to restore them. Many container runtimes are layered, which means that the user-facing layer (Podman, Docker, LXD) calls another layer to checkpoint (or restore) the container (runc, LXC), and this layer then calls CRIU.

    This leads to the problem that if CRIU introduces a new feature or option, all involved layers need code changes. And if one of those layers makes an assumption about how to use CRIU, the user has to live with that assumption, which may be wrong for the user's use case.

    To offer the possibility to change CRIU's behaviour through all these layers, whether because the container runtime has not implemented a certain CRIU feature or because the user needs a different CRIU behaviour, we started to discuss configuration files in 2016.

    Configuration files are evaluated by CRIU and offer a third way to influence CRIU's behaviour; setting options via the CLI and via RPC are the other two.
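    For context: a configuration file simply contains one option per line, named like the corresponding CLI option but without the leading --. To my knowledge CRIU looks at /etc/criu/default.conf and $HOME/.criu/default.conf by default, but the CRIU wiki has the authoritative details. A minimal example could look like this:

    # cat /etc/criu/default.conf
    log-file /tmp/criu.log
    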

    At the Linux Plumbers Conference in 2016 during the Checkpoint/Restore micro-conference I gave a short introduction talk about how configuration files could look and everyone was nodding their head.

    In early 2017 Veronika Kabatova provided patches which were merged into CRIU's development branch criu-dev. At that point development stalled a bit, and only in early 2018 was the discussion picked up again. Having a feature merged into the master branch, which means it will be part of the next release, requires complete documentation (man-pages and wiki) and feature parity between CRIU's CLI and RPC modes. At this point the feature was documented but not supported in RPC mode.

    Adding configuration file support to CRIU's RPC mode was not a technical challenge, but if any recruiter ever asks me which project was the most difficult, I will talk about this one. We were exchanging mails and patches for about half a year, and it seems everybody had different expectations of how everything should behave. I think in the end they pitied me and just merged my patches...

    CRIU 3.11, which was released on 2018-11-06, is the first release that includes support for configuration files, and now (finally) I want to write about how it can be used.

    I am using the Simple_TCP_pair example from CRIU's wiki. First start the server:

    # ./tcp-howto 10000
    

    Then I start the client:

    # ./tcp-howto 127.0.0.1 10000
    Connecting to 127.0.0.1:10000
    PP 1 -> 1
    PP 2 -> 2
    PP 3 -> 3
    PP 4 -> 4
    

    Once client and server are running, let's try to checkpoint the client:

    # rm -f /etc/criu/default.conf
    # criu dump -t `pgrep -f 'tcp-howto 127.0.0.1 10000'`
    Error (criu/sk-inet.c:188): inet: Connected TCP socket, consider using --tcp-established option.
    

    CRIU tells us that it needs a special option to checkpoint processes with established TCP connections. No problem, but instead of changing the command line, let's add the option to the configuration file:

    # echo tcp-established > /etc/criu/default.conf
    # criu dump -t `pgrep -f 'tcp-howto 127.0.0.1 10000'`
    Error (criu/tty.c:1861): tty: Found dangling tty with sid 16693 pgid 16711 (pts) on peer fd 0.
    Task attached to shell terminal. Consider using --shell-job option. More details on http://criu.org/Simple_loop
    

    Alright, let's also add shell-job to the configuration file:

    # echo shell-job >> /etc/criu/default.conf
    # criu dump -t `pgrep -f 'tcp-howto 127.0.0.1 10000'` && echo OK
    OK
    

    That worked. Cool. Finally! Most CLI options can be used in configuration files, and more detailed documentation can be found in the CRIU wiki.
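    For reference, this is what the configuration file looks like after the two echo commands above. Assuming the checkpoint images were written to the current directory, the client can now also be restored without any extra command-line options, as CRIU picks up both options from the file:

    # cat /etc/criu/default.conf
    tcp-established
    shell-job
    # criu restore
    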

    I want to thank Veronika for her initial implementation and everyone else who helped, discussed and reviewed emails and patches to get this ready for release.

    Tagged as: criu podman
  2. Nextcloud in a Container

    After using Podman a lot during the last weeks while adding checkpoint/restore support to it, I was finally ready to use containers in production on our mirror server. We were still running the ownCloud version that came via RPMs in Fedora 27, and it seems like many people have moved on to Nextcloud, installed from tarballs.

    One of the main reasons to finally use containers is Podman's daemonless approach.

    The first challenge in moving from ownCloud 9.1.5 to Nextcloud 14 was the actual upgrade. To make sure it would work, I first made a copy of all the uploaded files and of the database and did a test upgrade yesterday in a CentOS 7 VM. With PHP 7 from Software Collections it was not a real problem. It took some time, but it worked. I used the included upgrade utility to upgrade from ownCloud 9 to Nextcloud 10, to Nextcloud 11, to Nextcloud 12, to Nextcloud 13, to Nextcloud 14. Lots of upgrades. Once I had verified that everything was still functional, I did it once more, but this time with the real data, after disabling access to our ownCloud instance.
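    Each step of that chain follows the same pattern: put the next release's files in place and run the bundled command-line upgrade utility. As a rough sketch (paths and the web server user depend on the setup; occ is the tool shipped with both ownCloud and Nextcloud):

    # cd /var/www/html
    # sudo -u apache php occ maintenance:mode --on
    # sudo -u apache php occ upgrade
    # sudo -u apache php occ maintenance:mode --off
    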

    The next step was to start the container. I decided to use the nextcloud:fpm image as I was planning to use the existing web server to proxy the requests. The one thing which makes using containers on our mirror server a bit difficult is that it is not possible to use any iptables NAT rules. At some point there were just so many network connections in the NAT table, from all the clients connecting to our mirror server, that it started to drop connections. This problem has probably been fixed for a long time, but it used to be a problem and I try to avoid it. That is why my Nextcloud container is using the host network namespace:

    podman run --name nextcloud-fpm -d --net host \
      -v /home/containers/nextcloud/html:/var/www/html \
      -v /home/containers/nextcloud/apps:/var/www/html/custom_apps \
      -v /home/containers/nextcloud/config:/var/www/html/config \
      -v /home/containers/nextcloud/data:/var/www/html/data \
      nextcloud:fpm
    

    I reused my existing config.php, in which the connection to PostgreSQL on 127.0.0.1 was still configured.
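    The relevant part of such a config.php could look like the following sketch (all values are placeholders; the point is that dbhost can stay 127.0.0.1 because the container shares the host network namespace):

    'dbtype' => 'pgsql',
    'dbhost' => '127.0.0.1',
    'dbname' => 'nextcloud',
    'dbuser' => 'nextcloud',
    'dbpassword' => 'secret',
    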

    Once the container was running, I just had to add the proxy rules to the Apache HTTP Server and it should have been ready. Unfortunately it was not as easy as I had hoped. All the documentation I found was about using the Nextcloud FPM container with NGINX; I found nothing about Apache's HTTPD. The following lines took most of the time of the whole Nextcloud upgrade project:

    <FilesMatch \.php.*>
       # Hand all PHP requests to the FPM process listening on 127.0.0.1:9000
       SetHandler proxy:fcgi://127.0.0.1:9000/
       # Map the /owncloud/ URL prefix to the document root inside the container
       ProxyFCGISetEnvIf "reqenv('REQUEST_URI') =~ m|(/owncloud/)(.*)$|" SCRIPT_FILENAME "/var/www/html/$2"
       # Pass everything after the .php script name on as PATH_INFO
       ProxyFCGISetEnvIf "reqenv('REQUEST_URI') =~ m|^(.+\.php)(.*)$|" PATH_INFO "$2"
    </FilesMatch>
    

    I hope these lines are actually correct, but so far all clients connecting seem to be happy. To have the Nextcloud container automatically start at system startup, I based my Podman systemd service file on the one from the Intro to Podman article.

    [Unit]
    Description=Custom Nextcloud Podman Container
    After=network.target
    
    [Service]
    Type=simple
    TimeoutStartSec=5m
    ExecStartPre=-/usr/bin/podman rm nextcloud-fpm
    
    ExecStart=/usr/bin/podman run --name nextcloud-fpm --net host \
       -v /home/containers/nextcloud/html:/var/www/html \
       -v /home/containers/nextcloud/apps:/var/www/html/custom_apps \
       -v /home/containers/nextcloud/config:/var/www/html/config \
       -v /home/containers/nextcloud/data:/var/www/html/data \
       nextcloud:fpm
    
    ExecReload=/usr/bin/podman stop nextcloud-fpm
    ExecReload=/usr/bin/podman rm nextcloud-fpm
    ExecStop=/usr/bin/podman stop nextcloud-fpm
    Restart=always
    RestartSec=30
    
    [Install]
    WantedBy=multi-user.target
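
    Assuming the unit file above is saved as /etc/systemd/system/nextcloud-fpm.service (the file name is my choice here), it can be enabled and started the usual way:

    # systemctl daemon-reload
    # systemctl enable --now nextcloud-fpm.service
    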
    
    Tagged as: fedora nextcloud podman
