It's Pi all the way down...

When HDMI freezes over

Wed 13 October 2021
by Dave Jones

Well, it’s time for another Ubuntu release, and so time for another slew of blog posts from me! In this particular case though, I need to start with a mea culpa on the state of the Ubuntu Desktop for Raspberry Pi release.

The bug, the bodge, and the deadline

Shortly before release, a bug in HDMI output handling (initially part of another bug) was discovered which vexed us right up to the release itself. The symptoms ranged from the annoying (display output freezes after a certain amount of use, usually a couple of hours in my case) to the critical (display output freezes during boot, in the case of a colleague).

The determining factor eventually turned out to be something to do with high-refresh-rate monitors. With my eyesight, I’ve never wanted or needed anything beyond good ol’ 1080p at 60Hz … and so that’s what all my monitors are. But younger colleagues (with better eyesight) have fancier displays with higher refresh rates, and on these the issue was most definitely critical.

We quickly found a workaround (it’s in the release notes), which was simply to replace the “kms” overlay in config.txt with the “fkms” overlay, but then the race was on to fix things before release.
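For anyone who would rather script that one-line edit than do it by hand, here is a rough sketch in Python. It is not what the release notes prescribe, just an illustration. It assumes the overlays appear in config.txt under their usual full names, vc4-kms-v3d and vc4-fkms-v3d (the post above only calls them “kms” and “fkms”), and swap_overlay is a made-up helper name. It works on the file's text only, so reading and writing config.txt (wherever your boot partition happens to be mounted) is left to the caller.

```python
import re

# Assumption: full overlay names as normally spelled on Raspberry Pi images.
KMS = "vc4-kms-v3d"
FKMS = "vc4-fkms-v3d"


def swap_overlay(config_text: str, old: str = KMS, new: str = FKMS) -> str:
    """Return config_text with each active "dtoverlay=<old>" line switched to <new>.

    Parameters after a comma (such as ",cma-256") are kept, commented-out
    lines are left alone, and overlays whose names merely start with <old>
    are not touched.
    """
    pattern = re.compile(
        rf"^([ \t]*dtoverlay=){re.escape(old)}(?=[ \t]*(?:,|$))",
        flags=re.MULTILINE,
    )
    # A function replacement avoids any backslash surprises in `new`.
    return pattern.sub(lambda m: m.group(1) + new, config_text)
```

Calling swap_overlay(text) applies the workaround, and swap_overlay(text, old=FKMS, new=KMS) undoes it once a fixed kernel is installed. Keeping a backup copy of config.txt before writing anything back is a sensible precaution.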

The kernel team worked tirelessly to try and come up with a fix, but it quickly became apparent this was an upstream issue too, and that it was unlikely we’d find a fix on our own. Eventually it was obvious that the bug was going to make it into the release images, and so the debate switched to whether to release with the fkms workaround in place.

To bodge, or not to bodge

As will be obvious to anyone who’s tried the desktop release with a high-refresh-rate monitor, we released without the workaround in place (and that’s the royal we, because ultimately this was my decision, so if you’re going to yell at anyone, yell at me; comments below!).

By way of (lengthy) explanation, here was my reasoning:

  1. We have regular meetings with the Pi Foundation (or more precisely the engineers at Raspberry Pi Trading Ltd., just to appease Ben’s nagging about my constant conflation of these entities!). My understanding from these meetings is that the “kms” overlay is considered the “future direction of development”. As such, it is under active development in the upstream kernel.
  2. The “fkms” overlay is considered legacy, and while it is supported in the kernel in use in Raspberry Pi OS, if it happens to break in the development kernel (which is where the Ubuntu kernel patches are sourced from), that’s acceptable.
  3. We have a mechanism for modifying the boot configuration during a release upgrade, but no such mechanism currently exists for kernel upgrades. The modifications performed during release upgrades are not perfect (there are edge cases we cannot reasonably account for), but a release upgrade is a sufficiently major operation that a certain amount of breakage might be considered reasonable in the case of esoteric configurations. The same cannot be said of kernel upgrades.
  4. Other (minor) points against “fkms”: audio doesn’t operate correctly (choppy) under “fkms” without the “tsched=0” workaround, and video displays screen-tearing. These are obviously non-critical, but it’s a sub-par experience compared to fixing “kms” properly.

To summarise: we could switch people to the “fkms” overlay during the impish release upgrade (and obviously modify the release image to use “fkms” too). But that would leave us in a position where a future kernel upgrade within the release could break things (in fact, our testing with a few interim upstream kernels suggested this was already the case), with no good mechanism for fiddling with the boot configuration during a regular apt update and apt upgrade.

Alternatively, we could release with the “kms” overlay, knowing full well that it would break badly on certain high-refresh-rate monitors, but with a workaround in the release notes that wasn’t terribly complex (editing one line in a text file, which is accessible on all platforms), and the hope that we could fix the kernel in a (quickly released) update. Obviously this would still entail pain for people who have to use the workaround even to boot (in order to install such an update). But it would also mean that such people would have experience of applying the workaround, and thus would have little difficulty removing it themselves, without us having to bodge a postinst script into the kernel package to try removing it (probably imperfectly).

Conclusion … ?

As it turns out, we may have a fix sooner than I’d hoped: as I write this final section, I’m just waiting on another SD card to flash so I can play with a testing kernel which may fix at least part of the issue (I wasn’t kidding when I said the kernel team were working tirelessly on this!). There are also a couple of patches that appear to stabilise things completely, at least on my old 1080p 60Hz monitor.

Anyway, to all those affected, all I can say for now is “sorry” that this release isn’t everything I’d hoped it would be, and that it includes this issue. That’s on me: I could’ve made the call to switch to “fkms” for the release, but I genuinely think that although this is obviously the more painful course, it’s the right one for a better experience in the end.