
Remotely upgrading OpenWrt is easy, right?

This is a tale from the sysadmins' trenches; if you're in the same boat and require assistance, see the detailed tutorial/guide at Remotely upgrading OpenWrt 15 to 19 on Mikrotik Routerboards with NAND flash.

Some time ago, I was tasked with upgrading a few dozen Routerboards without physical access. You'd think, "well, that sounds like a routine sysadmin job", but ooh boy, you're in for a ride! Here's the thing: when the LEDE project split off from OpenWrt, they, among other things, changed the file system used for flash-based systems from yaffs2 to UBIFS.

File system support in OpenWrt/LEDE

          ≤15.05   ≥17.01
yaffs2    yes      no
UBIFS     no       yes

The problem in a nutshell.

yaffs2 isn't mainline, so I fully understand why it was switched out. But LEDE didn't just switch over the default file system, they also completely removed yaffs2 support at the same time. So OpenWrt 15 speaks yaffs2 but no UBIFS, and later versions speak UBIFS but no yaffs2. So how do you upgrade such a system? Well, the manual just tells you to plug in a serial cable and reinstall locally from scratch. Not so easy when you have dozens of devices scattered all around the state.[1]

The device in question
PCB of the Routerboard 450G. Image © Mikrotik.

Now seems like a good moment to talk more about the Routerboard 450G: it's a small embedded Linux system based on the MIPS-powered ar71xx chipset, with a custom proprietary bootloader and a single SLC flash chip. This flash chip has a few partitions, most notably a 3.4MB "kernel" partition and a larger one for the "rootfs". The bootloader looks at this kernel partition and expects to find an ELF file to boot there. Guess what file system the bootloader expects? Correct, yaffs2.
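To make the partition layout a bit more concrete: on Linux, MTD partitions are listed in /proc/mtd, one line per partition with hexadecimal size and erase size plus a quoted name. The following is a minimal sketch of parsing that listing to look up the "kernel" partition. The sample listing, including the partition sizes and the "booter" entry, is made up for illustration and is not a dump from a real Routerboard 450G; the helper names are hypothetical as well.

```python
import re
from typing import List, NamedTuple


class MtdPartition(NamedTuple):
    dev: str        # e.g. "mtd1"
    size: int       # partition size in bytes
    erasesize: int  # erase block size in bytes
    name: str       # partition name without the quotes


# One data line of /proc/mtd looks like: mtd1: 00380000 00020000 "kernel"
_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text: str) -> List[MtdPartition]:
    """Parse the contents of /proc/mtd, skipping the header line."""
    parts = []
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m:
            dev, size, erase, name = m.groups()
            parts.append(MtdPartition(dev, int(size, 16), int(erase, 16), name))
    return parts


def find_partition(parts: List[MtdPartition], name: str) -> MtdPartition:
    """Return the first partition with the given name, or raise KeyError."""
    for p in parts:
        if p.name == name:
            return p
    raise KeyError(name)


# Illustrative listing only (assumed values, not taken from a real device).
SAMPLE = '''dev:    size   erasesize  name
mtd0: 00040000 00020000 "booter"
mtd1: 00380000 00020000 "kernel"
mtd2: 07c00000 00020000 "rootfs"
'''

if __name__ == "__main__":
    kernel = find_partition(parse_proc_mtd(SAMPLE), "kernel")
    print(kernel.dev, kernel.size)
```

On a real device you would feed it the output of `cat /proc/mtd` instead of SAMPLE; the device name it returns tells you which /dev/mtdblock node belongs to the kernel partition.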

To this day I haven't figured out how OpenWrt's sysupgrade(8) tool manages to install its kernel onto this partition without support for the underlying filesystem, but at this point I honestly didn't really care any more.[2]

There are some extra challenges, like OpenWrt 15's sysupgrade tool not supporting NAND flash, and the wget2nand utility keeping the existing filesystem intact (meaning I can install OpenWrt 19, but the new kernel can't read the (yaffs2) userland, and won't boot). We also didn't want to deploy a custom kernel with yaffs2 support patched back in, since that'd just complicate further upgrades, so we'd have to deal with the whole thing again in a year or two.[3]

At this point, after chasing a bunch of dead ends, I recapped what I found out:

Oh, and did I mention that there's no DHCP on the target network, so you better not fuck up the network config?
'This is fine' meme
Me at this point in time

Getting desperate, I briefly contemplated grabbing a kernel image with an initramfs from openwrt.org and flashing just that to the "kernel" partition without ever touching "rootfs". But that won't work, because such an image forgets any changes you make to it on reboot. Not just the configuration of the software we need on the device, but even basic things, like the static IP it's supposed to have. But with this idea in my mind, some gears started turning.

Maybe it clicked already for some of you. So here's what we ended up doing:

  1. Compile a minimal OpenWrt 19 image with an initramfs that fits in the small kernel partition. It just has a UBIFS-capable kernel, busybox, sshd and sysupgrade (it occupies just ~2.5MB of the 3.4MB available, so we could even fit some debug utils). We hard-code the network config and rebuild for each device, so our Routerboards get a valid IP when they boot from this image.
  2. Log into the old OpenWrt system, upload said image into tmpfs and manually upgrade the kernel to our new image. This requires mounting the MTD block device and copying the file there (wget2nand turned out to be useful to steal copy-paste commands from).
  3. Reboot. If all goes well, we are now in our minimal system, and hopefully have the correct IP and sshd running.[4]
  4. Transfer the official sysupgrade.bin upgrade file and a tape archive containing a minimal configuration (sshd, network) and install it with sysupgrade.
  5. Wait for the device to automatically reboot, pray the install worked, and reconfigure the device with Ansible.
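
As a rough illustration of steps 2 to 4, here is a sketch that only builds the list of commands for one device and does not run anything. Nearly every specific in it is an assumption and not copied from our actual scripts: the mount point /mnt, the file name "kernel" on the yaffs2 partition, the paths under /tmp, the use of scp for transfers, and the helper names. The only piece taken from the sysupgrade usage text is the -f option, which restores a configuration from a .tar.gz archive.

```python
import shlex
from typing import List

# "3.4MB" as quoted in the text; the exact byte count is an assumption.
KERNEL_PARTITION_BYTES = 3_400_000


def fits_kernel_partition(image_bytes: int,
                          partition_bytes: int = KERNEL_PARTITION_BYTES) -> bool:
    """Step 1 sanity check: does the initramfs image fit the kernel partition?"""
    return 0 < image_bytes <= partition_bytes


def _ssh(host: str, remote_cmd: str) -> str:
    """Build an ssh command line that runs remote_cmd as root on host."""
    return f"ssh root@{host} {shlex.quote(remote_cmd)}"


def plan_upgrade(host: str, initramfs: str, sysupgrade_image: str,
                 config_tar: str, kernel_mtdblock: str) -> List[str]:
    """Return the shell commands for steps 2-4, in order, without running them."""
    install_kernel = (
        f"mount -t yaffs2 {kernel_mtdblock} /mnt && "  # old system still speaks yaffs2
        "cp /tmp/initramfs.elf /mnt/kernel && "        # file name is an assumption
        "sync && umount /mnt"
    )
    return [
        # Step 2: upload the minimal image into tmpfs, then overwrite the kernel.
        f"scp {shlex.quote(initramfs)} root@{host}:/tmp/initramfs.elf",
        _ssh(host, install_kernel),
        # Step 3: reboot into the minimal initramfs system.
        _ssh(host, "reboot"),
        # Step 4: upload the official image plus config archive and install.
        f"scp {shlex.quote(sysupgrade_image)} {shlex.quote(config_tar)} root@{host}:/tmp/",
        _ssh(host, "sysupgrade -f /tmp/config.tar.gz /tmp/sysupgrade.bin"),
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, not one of our devices.
    for cmd in plan_upgrade("192.0.2.10", "initramfs.elf", "sysupgrade.bin",
                            "config.tar.gz", "/dev/mtdblock1"):
        print(cmd)
```

Keeping this as a dry-run plan is deliberate: with no DHCP and no physical access, you want to read every command before it touches a device. Between steps 3 and 4 you'd also have to wait until the device answers on ssh again, which the sketch leaves out.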
Success, finally!

This solution actually worked flawlessly for all but one device: as luck would have it, this device was manufactured with an MLC flash chip, which UBIFS can't handle and refuses to use. There's an interesting thread on the OpenWrt mailing list about that, where the vendor admitted to having swapped out chips for a "short time in 2010". Lucky for us (for real this time), this is the one device that is in the basement of our office and not one of those out in the field. So we just binned it.

Foot- and Sidenotes

[1]: Given that we are in Austria, we're talking about the size of Connecticut, not Texas, but within the Alps, which means either going over or around mountains a lot.

[2]: I studied how sysupgrade(8) and the shell scripts it sources work, and it looks like it just creates a UBI volume on the MTD partition. How the bootloader copes with that I don't know, but it does.

[3]: Yet another side note: We attempted this before in 2018 and ran into all the same problems then too, but we gave up back then to solve other problems. I actually talked (well, emailed) with Richard Weinberger, UBIFS' maintainer and fellow countryman, about how to get around this problem, but even he couldn't come up with a satisfying scheme.

[4]: When we first tried this, dropbear(8) refused to start in our image, because we didn't pull in uci(8), OpenWrt's configuration system. So we start it from /etc/rc.local.