1. S3 sleep with ThinkPad X1 Carbon 6th Generation

    For a few weeks now I have had the new ThinkPad X1 Carbon 6th Generation and, like many people, I really like it.

    The biggest problem is that suspend does not work as expected.

    The issue seems to be that the X1 is using a new suspend technology called "Windows Modern Standby," or S0i3, and has removed classic S3 sleep.[1]

    Following the instructions in Alexander's article it was possible to get S3 suspend to work as expected and everything was perfect.

    With the latest firmware update to 0.1.28 (installed using sudo fwupdmgr update; thanks a lot to the Linux Vendor Firmware Service (LVFS) that this works!!!) I checked whether the patch mentioned in Alexander's article still applies, and it did not.

    So I modified the patch so that it applies again and made it available here: https://lisas.de/~adrian/X1C6_S3_DSDT_0_1_28.patch
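
    For reference, applying such a DSDT patch roughly follows the usual ACPI override workflow. This is only a sketch (it assumes the acpica tools acpidump and iasl are installed; the exact iasl flags and patch paths may need adjusting, and Alexander's article has the full details):

    • acpidump -b (dumps the ACPI tables, producing dsdt.dat among others)
    • iasl -d dsdt.dat (decompiles the table to dsdt.dsl)
    • patch < X1C6_S3_DSDT_0_1_28.patch (applies the patch to dsdt.dsl)
    • iasl -tc dsdt.dsl (recompiles the table to dsdt.aml)

    The resulting dsdt.aml is the file used in the grub steps below.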

    When I talked with Christian about it, he mentioned an easier way to include the changed ACPI table in grub. For my Fedora system this looks like this:

    • cp dsdt.aml /boot/efi/EFI/fedora/
    • echo 'acpi $prefix/dsdt.aml' > /boot/efi/EFI/fedora/custom.cfg

    Thanks to Alexander and Christian I can correctly suspend my X1 again.

    Update 2018-09-09: Lenovo fixed the BIOS and everything described above is no longer necessary with version 0.1.30. Also see https://brauner.github.io/2018/09/08/thinkpad-6en-s3.html

    Tagged as : fedora X1
  2. Latest CRIU for CentOS COPR

    The version of CRIU included in CentOS has been updated with every minor CentOS release since 7.2 (at least at the time of writing this), but once a minor CentOS release is available CRIU is not updated again until the next minor release. To make it easier to use the latest version of CRIU on CentOS I am now also rebuilding the latest version in COPR for CentOS: https://copr.fedorainfracloud.org/coprs/adrian/criu-el7/.

    To enable my CRIU COPR on CentOS the following steps are necessary:

    • yum install yum-plugin-copr
    • yum copr enable adrian/criu-el7

    And then the latest version of CRIU can be installed using yum install criu.

    Tagged as : CentOS criu migration
  3. archive.rpmfusion.org

    After many years the whole RPM Fusion repository has grown to over 320GB. There have been occasional requests to move the unsupported releases to an archive, just like Fedora handles its mirror setup, but until last week this did not happen.

    As of now we have moved all unsupported releases (EL-5, Fedora 8 - 25) to our archive (http://archive.rpmfusion.org/) and clients are now being redirected to the new archive system. The archive consists of 260GB which means we can reduce the size mirrors need to carry by more than 75%.

    From a first look at the archive logs, the amount of data requested by all clients for the archived releases is only about 30GB per day. Those 30GB are downloaded by over 350000 HTTP requests, and over 98% of those requests only download the repository metadata (repomd.xml, *filelist*, *primary*, *comps*).

  4. OpenHPC: Building Blocks

    I will be giving two talks about OpenHPC in the next weeks. The first talk will be at DevConf.cz 2018: OpenHPC Introduction

    The other talk will be at the CentOS Dojo in Brussels.

    I hope I will be able to demonstrate my two-node HPC system based on Raspberry Pis, and it definitely will be about OpenHPC's building blocks:

    OpenHPC Building Blocks

    And the results:

    OpenHPC Building Blocks

    Come to one of my talks and you will be able to build your own OpenHPC engineer from the available building blocks.

    Tagged as : CentOS OpenHPC
  5. Optimizing live container migration in LXD

    After having worked on optimizing live container migration based on runc (pre-copy migration and post-copy migration) I tried to optimize container migration in LXD.

    After a few initial discussions with Christian I started with pre-copy migration. Container migration in LXD is based on CRIU, just as in runc, and CRIU's pre-copy migration support is based on Linux's dirty page tracking support: SOFT-DIRTY PTEs.

    As LXD uses LXC for the actual container checkpointing and restoring I was curious if there was already pre-copy migration support in LXC. After figuring out the right command-line parameters it almost worked thanks to the great checkpoint and restore support implemented by Tycho some time ago.

    Now that I knew that it works in LXC I focused on getting pre-copy migration support into LXD. LXD supports container live migration using the move command: lxc move <container> <remote>:<container>
    This move command, however, did not use any optimization yet. It basically did:

    1. Initial sync of the filesystem
    2. Checkpoint container using CRIU
    3. Transfer container checkpoint
    4. Final sync of the filesystem
    5. Restart container on the remote system

    The downtime for the container in this scenario is between step 2 and step 5 and depends on how much memory the processes inside the container use. The goal of pre-copy migration is to dump the memory of the container and transfer it to the remote destination while the container keeps on running, and to do a final dump with only the memory pages that changed since the last pre-dump (more about process migration optimization theories).

    Back to LXD: At the end of the day I had a very rough (and very hardcoded) first pre-copy migration implementation ready and I kept working on it until it was ready to be submitted upstream. The pull request has already been merged upstream and now LXD supports pre-copy migration.

    As not all architecture/kernel/criu combinations support pre-copy migration it has to be turned on manually right now, but we already discussed adding pre-copy support detection to LXC. To tell LXD to use pre-copy migration, the parameter 'migration.incremental.memory' needs to be set to 'true'. Once that is done and if LXD is instructed to migrate a container the following will happen:

    • Initial sync of the filesystem
    • Start pre-copy checkpointing loop using CRIU
      • Check if the maximum number of pre-copy iterations has been reached
      • Check if threshold of unchanged memory pages has been reached
      • Transfer container checkpoint
      • Continue pre-copy checkpointing loop if neither of those conditions is true
    • Final container delta checkpoint using CRIU
    • Transfer final delta checkpoint
    • Final sync of the filesystem
    • Restart container on the remote system

    So instead of doing a single checkpoint and transferring it, there are now multiple pre-copy checkpoints and the container keeps on running during those transfers. The container is only suspended during the last delta checkpoint and the transfer of the last delta checkpoint. In many cases this reduces the container downtime during migration, but there is the possibility that pre-copy migration also increases the container downtime during migration. This depends (as always) on the workload.

    To control how many pre-copy iterations LXD does there are two additional variables:

    1. migration.incremental.memory.iterations (defaults to 10)
    2. migration.incremental.memory.goal (defaults to 70%)

    The first variable (iterations) is used to tell LXD how many pre-copy iterations it should do before doing the final dump and the second variable (goal) is used to tell LXD the percentage of pre-copied memory pages that should not change between pre-copy iterations before doing the final dump.

    So LXD, in the default configuration, does either 10 pre-copy iterations before doing the final migration or the final migration is triggered when at least 70% of the memory pages have been transferred by the last pre-copy iteration.
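
    Put together, enabling and using pre-copy migration could look roughly like this (a sketch; <container> and <remote> are placeholders, and the two values shown are just the defaults again):

    $ lxc config set <container> migration.incremental.memory true
    $ lxc config set <container> migration.incremental.memory.iterations 10
    $ lxc config set <container> migration.incremental.memory.goal 70
    $ lxc move <container> <remote>:<container>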

    Now that this pull request is merged and if pre-copy migration is enabled a lxc move <container> <remote>:<container> should live migrate the container with a reduced downtime.

    I want to thank Christian for the collaboration on getting CRIU's pre-copy support into LXD, Tycho for his work preparing LXC and LXD to support migration so nicely and the developers of p.haul for the ideas how to implement pre-copy container migration. Next step: lazy migration.

    Tagged as : criu migration pre-copy
  6. Lazy Migration in CRIU's master branch

    For almost two years Mike Rapoport and I have been working on lazy process migration. Lazy process migration (or post-copy migration) is a technique to decrease the process or container downtime during the live migration. I described the basic functionality in the following previous articles:

    Those articles are not 100% correct anymore as we changed some of the parameters during the last two years, but the concepts stayed the same.

    Mike and I started working on this about two years ago, and the latest CRIU release (3.5) includes the possibility to use lazy migration. Now that the post-copy migration feature has been merged from the criu-dev branch into the master branch, it is part of the normal CRIU releases.

    With CRIU's 3.5 release lazy migration can be used on any kernel which supports userfaultfd. I already updated the CRIU packages in Fedora to 3.5 so that lazy process migration can be used just by installing the latest CRIU packages with dnf (still in the testing repository right now).
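
    On Fedora this currently means enabling the testing repository during installation, for example:

    $ sudo dnf --enablerepo=updates-testing install criu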

    More information about container live migration in our upcoming Open Source Summit Europe talk: Container Migration Around The World.

    My pull request to support lazy migration in runC was also recently merged, so it is now possible to migrate containers using pre-copy migration and post-copy migration. The two can also be combined.

    Another interesting change about CRIU is that it started as x86_64 only and now it is also available on aarch64, ppc64le and s390x. The support to run on s390x has just been added with the previous 3.4 release and starting with Fedora 27 the necessary kernel configuration options are also active on s390x in addition to the other supported architectures.

    Tagged as : criu fedora
  7. Linux Plumbers Conference 2016

    It is a bit late but I still wanted to share my presentations from this year's Linux Plumbers Conference:

    On my way back home I had to stay one night in Albuquerque, and it looks like the hotel needs to upgrade its TV system. It is still running Fedora 10, which has been EOL since 2009-12-18:

    Still Fedora 10

    Tagged as : criu
  8. Influence which PID will be the next one

    To restore a checkpointed process with CRIU, the process ID (PID) has to be the same as it was during checkpointing. CRIU uses /proc/sys/kernel/ns_last_pid to set the last PID to one lower than the PID of the process to be restored, just before fork()-ing into the new process.

    The same interface (/proc/sys/kernel/ns_last_pid) can also be used from the command-line to influence which PID the kernel will use for the next process.

    # cat /proc/sys/kernel/ns_last_pid
    1626
    # echo -n 9999 > /proc/sys/kernel/ns_last_pid
    # cat /proc/sys/kernel/ns_last_pid
    10000
    

    Writing '9999' (without a newline) to /proc/sys/kernel/ns_last_pid tells the kernel that the next PID should be '10000'. This only works if no other process is created between writing to /proc/sys/kernel/ns_last_pid and forking the new process. So it is not possible to guarantee which PID the new process will get, but it can be influenced.
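
    As a quick demonstration from a shell (still racy, of course, since another process could be created in between), the next process the shell forks should then get PID 10000:

    # echo -n 9999 > /proc/sys/kernel/ns_last_pid
    # sh -c 'echo $$'
    10000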

    There is also a posting which describes how to do the same with C: How to set PID using ns_last_pid

    Tagged as : criu fedora
  9. Combining pre-copy and post-copy migration

    In my last post about CRIU in May 2016 I mentioned lazy memory transfer to decrease process downtime during migration. Since May 2016 Mike Rapoport's patches for remote lazy process migration have been merged into CRIU's criu-dev branch as well as my patches to combine pre-copy and post-copy migration.

    Using pre-copy (criu pre-dump) it has "always" been possible to dump the memory of a process using soft-dirty-tracking. criu pre-dump can be run multiple times and each time only the changed memory pages will be written to the checkpoint directory.

    Depending on the processes to be migrated and how fast they are changing their memory, this can still lead to a situation where the final dump is rather large, which can mean a longer downtime during migration than desired. This is why we started to work on post-copy migration (also known as lazy migration). There are, however, situations where post-copy migration can also increase the process downtime during migration instead of decreasing it.

    The latest changes regarding post-copy migration in the criu-dev branch offer the possibility to combine pre-copy and post-copy migration. The memory pages of the process are pre-dumped using soft-dirty-tracking and transferred to the destination while the process on the source machine keeps on running. Once the process is actually migrated to the destination system everything besides the memory pages is transferred to the destination system. Excluding the memory pages (as the remaining memory pages will be migrated lazily) usually only a few hundred kilobytes have to be transferred which reduces the process downtime during migration significantly.

    Using criu with pre-copy and post-copy could look like this:

    Source system:

    # criu pre-dump -D /tmp/cp/1 -t PID
    # rsync -a /tmp/cp destination:/tmp
    # criu dump -D /tmp/cp/2 -t PID --port 27 --lazy-pages \
     --prev-images-dir ../1/ --track-mem
    

    The first criu command dumps the memory of the process PID and resets the soft-dirty memory tracking. The initial dump is then transferred using rsync to the destination system. During that time the process PID keeps on running. The last criu command starts the lazy page mode which dumps everything besides memory pages which can be transferred lazily and waits for connections over the network on port 27. Only pages which have changed since the last pre-dump are considered for the lazy restore. At this point the process is no longer running and the process downtime starts.

    Destination system:

    # rsync -a source:/tmp/cp /tmp/
    # criu lazy-pages --page-server --address source --port 27 \
     -D /tmp/cp/2 &
    # criu restore --lazy-pages -D /tmp/cp/2
    

    Once criu is waiting on port 27 on the source system the remaining checkpoint images can be transferred from the source system to the destination system (using rsync in this case). Now criu can be started in lazy-pages mode connecting to the page server on port 27 on the source system. This is the part we usually call the UFFD daemon. The last step is the actual restore (criu restore).

    The following diagrams try to visualize what happens during the last step: criu restore.

    step1

    It all starts with criu restore (on the right). criu does its magic to restore the process and copies the memory pages from criu pre-dump to the process and marks lazy pages as being handled by userfaultfd. Once everything is restored criu jumps into the restored process and the restored process continues to run where it was when checkpointed. Once the process accesses a userfaultfd marked memory address the process will be paused until a memory page (hopefully the correct one) is copied to that address.

    step2

    The part that we call the UFFD daemon, or criu lazy-pages, listens on the userfault file descriptor for a message, and as soon as a valid UFFD request arrives it requests that page via TCP from the source system, where criu is still running in page-server mode. If the page server finds that memory page, it transfers the actual page back to the UFFD daemon on the destination system, which injects the page into the kernel using the same userfault file descriptor it previously got the page request from. Now that the page which initially triggered the page fault (or in our case userfault) is in place, the restored process continues to run until another missing page is accessed and the whole procedure starts again.

    To be able to remove the UFFD daemon and the page-server at some point we currently push all unused pages into the restored process if there are no further userfaultfd requests for 5 seconds.

    The whole procedure still has a lot of possibilities for optimization but now that we finally can combine pre-copy and post-copy memory migration we are a lot closer to decreasing process downtime during migration.

    The next steps are to get support for pre-copy and post-copy into p.haul (Process Hauler) and into different container runtimes which already support migration via criu.

    My other recently posted criu related articles:

    Tagged as : criu fedora
  10. HS100 - Wi-Fi Smart Plug

    For my recently installed PXCAB I was looking for a way to power it on and off remotely. I found the Wi-Fi Smart Plug "HS100" and a blog post describing that it can be controlled from the command-line.

    The referenced script uses messages captured with wireshark and just re-transmits these messages from a shell script. In one of the comments someone points out that this is XOR'd JSON and how it can be decoded. Instead of a shell script I re-implemented it in Python, and I am now always using XOR to encode and decode the JSON messages without needing to include the encoded commands in my script. This makes the script easier to read and to extend.

    The protocol used is JSON which is XOR'd and then transmitted to the device; the same goes for the answers. Each character of the JSON string is XOR'd with the previous character of the string, and the value for the first XOR operation is 0xAB. Additionally each message is prefixed with '\x00\x00\x00\x23'.

    The message to turn on the power looks like this:

    {
     "system": {
      "set_relay_state": {
       "state": 1
      }
     }
    }
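
    As a small illustration of the encoding, a minimal bash sketch could look like this. It assumes the usual autokey variant of the scheme described above (every byte is XOR'd with the previous encoded byte, starting with 0xAB), hard-codes the prefix mentioned above, and prints hex, so the result still has to be converted to raw bytes (e.g. with xxd -r -p) before sending it to the plug over TCP. The function and file names are just placeholders:

    xor_encode() {
        local msg="$1" key=171 c i
        printf '00000023'                    # the '\x00\x00\x00\x23' prefix
        for ((i = 0; i < ${#msg}; i++)); do
            printf -v c '%d' "'${msg:i:1}"   # ASCII value of the current character
            key=$(( key ^ c ))               # XOR with the previous encoded byte
            printf '%02x' "$key"
        done
    }

    xor_encode '{"system":{"set_relay_state":{"state":1}}}' | xxd -r -p > set_relay_on.msg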
    

    To find out more about which commands the device understands I used the information I got from: Why not root your Christmas gift?

    I downloaded the firmware for the US model of my smart plug and used binwalk to analyze the content of the firmware. The firmware contains a busybox-based ramdisk which includes the smart plug relevant programs /usr/bin/shd and /usr/bin/shdTester, and it seems that at least the following commands exist:

    • system
    • reset
    • get_sysinfo
    • set_test_mode
    • set_dev_alias
    • set_relay_state
    • check_new_config
    • download_firmware
    • get_download_state
    • flash_firmware
    • set_mac_addr
    • set_device_id
    • set_hw_id
    • test_check_uboot
    • get_dev_icon
    • set_dev_icon
    • set_led_off
    • set_dev_location

    With the knowledge from the original shell script implementation and the results from binwalk I wrote the following script: https://lisas.de/~adrian/hs100.py

    Using this script I can easily power the device behind the smart plug on and off:

    $ ./hs100.py -H p-pxcab.example.com off
    $ ./hs100.py -H p-pxcab.example.com state
    Power OFF
    $ ./hs100.py -H p-pxcab.example.com on
    $ ./hs100.py -H p-pxcab.example.com state
    Power ON
    

    The only annoying thing about the smart plug is that it tries to communicate with some cloud systems so that it can be controlled from anywhere. After starting, the smart plug does a name lookup for devs.tplinkcloud.com and connects to port 50443. I can connect to that system with openssl s_client -connect devs.tplinkcloud.com:50443, but I do not know what the smart plug actually sends to that system. If I do not block the smart plug in the firewall I see an NTP request after that, and then the communication seems to stop. Right now the smart plug is blocked and makes no NTP requests, but it still tries to reach devs.tplinkcloud.com:50443 once a minute.

    Tagged as : pxcab
