One of the things that constantly annoys me on Ubuntu (and therefore, one of the things I’m intending to fix this cycle) is how bloody long it takes for kernel upgrades to install! Most of this turns out to be down to two components (which … ahem … my team is responsible for): update-initramfs which spends nearly two minutes re-building the initrd.img for the boot partition, and flash-kernel which spends another two minutes copying everything to the boot partition.
Under (Com)pressure
At some point last week when I was testing a lot of different kernel bits (see the prior post for context!), this annoyed me to the point that I sat down to dig into it. The first bit, update-initramfs, was a bit tricky to analyze: there's no really good means of profiling shell scripts, sadly, so I wound up using the old PS4-calling-date trick and writing a quick Python script to analyze the results. Once this was done, however, one line stood out like a sore thumb: the compression of the resulting image.
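For the curious, the PS4 trick looks something like this — a minimal sketch, with a throwaway demo script standing in for the real initramfs hooks, and all file paths being placeholders:

```shell
# The PS4 trick: make every -x trace line start with a high-resolution
# timestamp. This needs bash; dash won't run the $(date) command
# substitution inside PS4.
cat > /tmp/demo.sh <<'EOF'
sleep 0.1
echo done > /dev/null
EOF
PS4='+ $(date "+%s.%N") ' bash -x /tmp/demo.sh 2> /tmp/trace.log
head -n 2 /tmp/trace.log
# Each trace line now begins with the time it ran; the gaps between
# consecutive timestamps point straight at the slow commands.
```

A small script (in any language) can then diff consecutive timestamps in the log and sort by duration.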
Impish has switched to using zstd as its compression algorithm and, while this certainly compresses things better than gzip, it takes ages to run on an ARM processor. Specifically, even on an overclocked Pi 400 running at 2GHz it was taking 85 seconds (!) to compress the image. A quick change to /etc/initramfs-tools/initramfs.conf and we can chop that down to 10 seconds with lz4:
#
# COMPRESS: [ gzip | bzip2 | lz4 | lzma | lzop | xz | zstd ]
#

#COMPRESS=zstd
COMPRESS=lz4
Admittedly the resulting image is much larger (~35MB with lz4 vs ~22MB with zstd) and that will affect boot speed. However, even at SD card read speeds that's only about one second lost per boot versus 75 seconds saved per kernel install (and lately I've been doing about one kernel install per boot!). Anyway, consider whether you do more or fewer than 75 boots per kernel install, and adjust accordingly!
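The back-of-envelope arithmetic behind that "one second" claim, for anyone who wants to plug in their own numbers (the ~20MB/s SD read speed is my assumption, not a measurement):

```shell
# Extra boot-time cost of the larger lz4 image, back-of-envelope.
# Image sizes are from the post; the SD read speed is an assumed
# ballpark figure for a decent card.
lz4_mb=35
zstd_mb=22
sd_read_mbps=20                       # assumed sequential read speed
extra_ms=$(( (lz4_mb - zstd_mb) * 1000 / sd_read_mbps ))
echo "${extra_ms} ms extra per boot"  # ~650 ms, i.e. roughly one second
```

Against 75 seconds saved per install, the break-even point is about 75 boots per kernel install.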
Flash, ahh ahh!
On Ubuntu (and for that matter, Debian), the flash-kernel tool is charged with installing the bootloader, device-tree, kernel, and initrd (basically everything needed to get the system started) to … wherever they need to be on a given board. Sometimes that’s some NVRAM, sometimes a special ext4 partition, sometimes (as in the Pi’s case) that’s a FAT partition on some storage medium.
The Pi is quite an unusual case here inasmuch as its root storage is fully expected to go walkies (such as when an SD card is pulled from an old Pi and moved to a newer Pi model). Due to this, flash-kernel (in Ubuntu) copies the device-tree files for all Pi models (not just the one it finds itself on) to the boot partition every time the kernel is updated (along with everything else it usually copies). Unfortunately, the flash-kernel function that handles each transfer is quite slow and has a lot of overhead.
For most boards, this doesn’t matter as they copy (maybe) five files: a bootloader, a kernel, an initrd, a device-tree, and possibly some firmware or other scripts, or more often a single file which is some amalgam of the aforementioned pieces. But for the Pi there’s well over 200 individual files to copy and the result is a major slow-down.
Once I dug into the code the cause was pretty obvious. In the /usr/share/flash-kernel/functions script, the backup_and_install function on line 642 is the function we’re interested in:
642 backup_and_install() {
643 local source="$1"
644 local dest="$2"
645 local do_dot_bak=$(get_dot_bak_preference)
646 local mtd_backup_dir=$(get_mtd_backup_dir)
647 if [ -e "$dest" ]; then
648 if [ -n "$do_dot_bak" ]; then
649 echo "Taking backup of $(basename "$dest")." >&2
650 mv "$dest" "$dest.bak"
651 else
652 echo "Skipping backup of $(basename "$dest")." >&2
653 fi
654 fi
655 # If we are installing to a filesystem which is not normally mounted
656 # then take a second copy in /var/backups, where they can e.g. be
657 # backed up.
658 if [ -n "$boot_mnt_dir" ] && [ -n "$mtd_backup_dir" ] ; then
659 local bak="$mtd_backup_dir/"$(basename "$dest")
660 #echo "Saving $boot_device:"$(basename "$source")" in $bak"
661 mkdir -p "$mtd_backup_dir"
662 cp "$source" "$bak"
663 fi
664 echo "Installing new $(basename "$dest")." >&2
665 mv "$source" "$dest"
666 maybe_defrag "$dest"
667 }
Yes, it’s convoluted, but it needs to be for certain copying cases, and mostly it’s pretty quick. However, it’s the last line that’s interesting. That leads us to the “maybe_defrag” function on line 471:
471 maybe_defrag() {
472 local file="$1"
473 local field="Bootloader-Has-Broken-Ext4-Extent-Support"
474 local broken_fw
475
476 if ! broken_fw=$(get_machine_field "$machine" "$field"); then
477 return
478 fi
479 if [ "$broken_fw" != "yes" ]; then
480 return
481 fi
482 if [ "$(df --output=fstype ${file} | sed -e 1d)" != "ext4" ]; then
483 return
484 fi
485 if ! command -v e4defrag > /dev/null; then
486 error "e4defrag command not found, unable to defrag $file"
487 fi
488 if ! e4defrag "$file" > /dev/null 2>&1; then
489 error "e4defrag of $file failed. Try freeing up space in /boot and re-executing flash-kernel"
490 fi
491 }
From working on flash-kernel in the past, I know that “get_machine_field” calls are quite expensive (it’s essentially using shell script to parse a text-file database). That’s why there are a bunch of calls caching these lookups around line 1008. Ultimately the “proper” fix is to cache the result of this call too. However, on the Pi the “maybe_defrag” function never does anything (and never will) anyway, so a quick hack is to simply comment out that final line in backup_and_install:
642 backup_and_install() {
643 local source="$1"
644 local dest="$2"
645 local do_dot_bak=$(get_dot_bak_preference)
646 local mtd_backup_dir=$(get_mtd_backup_dir)
647 if [ -e "$dest" ]; then
648 if [ -n "$do_dot_bak" ]; then
649 echo "Taking backup of $(basename "$dest")." >&2
650 mv "$dest" "$dest.bak"
651 else
652 echo "Skipping backup of $(basename "$dest")." >&2
653 fi
654 fi
655 # If we are installing to a filesystem which is not normally mounted
656 # then take a second copy in /var/backups, where they can e.g. be
657 # backed up.
658 if [ -n "$boot_mnt_dir" ] && [ -n "$mtd_backup_dir" ] ; then
659 local bak="$mtd_backup_dir/"$(basename "$dest")
660 #echo "Saving $boot_device:"$(basename "$source")" in $bak"
661 mkdir -p "$mtd_backup_dir"
662 cp "$source" "$bak"
663 fi
664 echo "Installing new $(basename "$dest")." >&2
665 mv "$source" "$dest"
666 #maybe_defrag "$dest"
667 }
Once I’d put these two changes in place, I found that kernel installs went from taking more than 4 minutes to slightly less than 2. Okay, still not fantastic, but I am operating on an SD card, and halving the time with two trivial changes is not bad!
I’ll get the “proper” fixes in place during the next development cycle.
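In the meantime, here's a rough sketch of what the caching half of that fix might look like. To be clear: the stub, the variable name, and the machine name below are my inventions for illustration, not the actual patch — the real get_machine_field and $machine come from /usr/share/flash-kernel/functions:

```shell
# Sketch: look the field up once at top level, then let maybe_defrag
# test the cached value instead of re-parsing the machine database
# for every one of the 200+ files it installs on a Pi.
get_machine_field() { echo yes; }   # stand-in for the real (slow) lookup
machine="raspberrypi"               # illustrative machine name

if [ -z "${broken_fw_cached+set}" ]; then
    broken_fw_cached=$(get_machine_field "$machine" \
        "Bootloader-Has-Broken-Ext4-Extent-Support" || true)
fi
echo "$broken_fw_cached"
```

The guard on `${broken_fw_cached+set}` means the expensive lookup runs at most once per flash-kernel invocation, no matter how many files get installed.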