I’ve previously pointed out that the AX.25 implementation in the kernel is pretty poor. It’s not really being maintained, and even when it gets fixes after I report bugs, with people running LTS OSs it can take something like five years before the fix actually reaches users, if ever. So when writing applications, you still have to work around kernel bugs from a decade ago. This makes it kind of pointless to upstream patches.

The exception is security patches, and reading between the lines of why the AX.25 code is now being removed from the kernel, it sounds like maybe some LLM (like the looming “Mythos” and the related Glasswing) may have found some severe problems. But even if there aren’t any known security problems yet, having code is now more of a liability than ever. Code needs to be removed, or someone needs to take responsibility for it. (tangent about ffmpeg at the bottom of this post)

With the kernel code removed, say goodbye to the old walkthrough.

The new API

Well, not “new”, per se, but “replacement”.

With the socket based API about to be gone, we need some other way for applications to send packets and manage connections.

For sending raw packets to and from the modem there’s KISS. I have no real complaints about it. Not much to get wrong about sending frames. It’s implemented by most modems, like the software modem Direwolf and by some radios like the Kenwood TH-D75, so it’s not going anywhere.
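To show how little there is to get wrong, here is a minimal sketch of KISS framing. The four special byte values come from the KISS protocol description; the function names are my own illustration and are not taken from any particular crate. It only handles data frames on port 0.

```rust
// KISS special bytes.
const FEND: u8 = 0xC0; // frame delimiter
const FESC: u8 = 0xDB; // escape byte
const TFEND: u8 = 0xDC; // follows FESC, stands for a literal FEND
const TFESC: u8 = 0xDD; // follows FESC, stands for a literal FESC

/// Wrap one AX.25 frame as a KISS data frame on port 0.
fn kiss_encode(frame: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(frame.len() + 3);
    out.push(FEND);
    out.push(0x00); // command byte: port 0, "data frame"
    for &b in frame {
        match b {
            FEND => out.extend_from_slice(&[FESC, TFEND]),
            FESC => out.extend_from_slice(&[FESC, TFESC]),
            _ => out.push(b),
        }
    }
    out.push(FEND);
    out
}

/// Decode the bytes found between two FENDs.
/// Returns the command byte and the unescaped payload, or None if malformed.
fn kiss_decode(body: &[u8]) -> Option<(u8, Vec<u8>)> {
    let (&cmd, rest) = body.split_first()?;
    let mut out = Vec::with_capacity(rest.len());
    let mut it = rest.iter();
    while let Some(&b) = it.next() {
        match b {
            FESC => match it.next()? {
                &TFEND => out.push(FEND),
                &TFESC => out.push(FESC),
                _ => return None, // invalid escape sequence
            },
            FEND => return None, // delimiter inside a frame body
            _ => out.push(b),
        }
    }
    Some((cmd, out))
}
```

Calling kiss_encode on the bytes 01 C0 DB 02 gives C0 00 01 DB DC DB DD 02 C0, and kiss_decode on everything between the two C0 bytes returns the original four bytes.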

For connected mode (streams of in-order data, like with TCP) the biggest contender seems to be AGW. Direwolf implements it, and I’ve made a messy implementation of an AGW client in Rust. The async Rust API works, as we’ll see, but the code needs some refactoring and cleanup, since it was written exploratorily while I was deciding what it should even do, and how.

The AGW protocol is not super amazing, but it gets the job done. One can build a connection API on top of it, as I have, and never have to think about the AGW protocol ever again.
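For a feel of what gets hidden behind such an API, here is a rough sketch of the fixed 36-byte header that precedes every AGW frame. The field layout is my reading of the AGWPE TCP/IP API documentation, so treat the offsets as an assumption to verify against that document. The struct and function names are hypothetical and this is not the code from my crate.

```rust
/// The fixed 36-byte header in front of every AGW frame (assumed layout).
#[derive(Debug, PartialEq)]
struct AgwHeader {
    port: u8,          // radio port index on the AGW server
    data_kind: u8,     // frame type, e.g. b'X' register, b'C' connect, b'D' data
    pid: u8,           // AX.25 PID, 0xF0 meaning "no layer 3"
    call_from: String, // at most 9 ASCII characters
    call_to: String,   // at most 9 ASCII characters
    data_len: u32,     // number of payload bytes that follow the header
}

/// Copy a callsign into a 10-byte NUL-padded field.
fn put_call(field: &mut [u8], call: &str) {
    let bytes = call.as_bytes();
    let n = bytes.len().min(9); // keep at least one terminating NUL
    field[..n].copy_from_slice(&bytes[..n]);
}

/// Read a callsign back out of a NUL-padded field.
fn get_call(field: &[u8]) -> String {
    let end = field.iter().position(|&b| b == 0).unwrap_or(field.len());
    String::from_utf8_lossy(&field[..end]).into_owned()
}

impl AgwHeader {
    fn to_bytes(&self) -> [u8; 36] {
        let mut b = [0u8; 36]; // reserved bytes stay zero
        b[0] = self.port;
        b[4] = self.data_kind;
        b[6] = self.pid;
        put_call(&mut b[8..18], &self.call_from);
        put_call(&mut b[18..28], &self.call_to);
        b[28..32].copy_from_slice(&self.data_len.to_le_bytes());
        b
    }

    fn from_bytes(b: &[u8; 36]) -> AgwHeader {
        AgwHeader {
            port: b[0],
            data_kind: b[4],
            pid: b[6],
            call_from: get_call(&b[8..18]),
            call_to: get_call(&b[18..28]),
            data_len: u32::from_le_bytes([b[28], b[29], b[30], b[31]]),
        }
    }
}
```

A connect request from M0QQQ-2 to M0QQQ-1 would then be a header with data_kind b'C' and data_len 0, written to the AGW TCP socket with no payload after it.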

There’s another protocol called RHP, specified here and here. It came out of the XRouter project. Since XRouter is closed source, I have a strong aversion to it. It seems both counter to how I see amateur radio, and anachronistic, for it to be closed source. It’s bad enough that VARA and Winlink are closed source. And people are definitely working on replacing VARA with various other modes because of it.

tl;dr: I’m going with AGW for now. If someone writes a Rust crate for RHP exposing a compatible AsyncRead/AsyncWrite API, I certainly wouldn’t mind adding that dependency to optionally use.

I have not yet implemented AGW (or RHP) in my own AX.25 stack, but I plan to. For now that means I’ll use Direwolf.

axsh

Link to code.

The previous implementation

My previous axsh implementation, since deleted, had some problems:

  • it was implemented in C++, and not only do I prefer Rust, how could I even call something written in C++ “secure”? (a blog post for another day)
  • used the kernel API, so that needs rewriting,
  • used SEQPACKET, which proved to be a bit “weird” when interoperating with some other APIs, and
  • used crypto primitives vulnerable to quantum computers.

So with everything but terminal management needing a rewrite, this is a reason to rewrite the whole thing.

Requirements

  • Don’t use kernel AX.25 sockets — this means use AGW.
  • Use Rust.
  • Also work on TCP (mainly for debugging) — This means using an internal framing protocol.
  • Be quantum safe — Use ML-DSA+ed25519 dual signed for authentication of server and client.
  • Be efficient — This means don’t use ML-DSA for per packet signatures (they are huge), at the cost of some quantum safety (see the README).

Non-requirement: Encrypt — This would violate the amateur radio license. And then, why not just use SSH?
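Why does TCP support imply a framing protocol? AX.25 hands you discrete frames, but TCP is a plain byte stream, so message boundaries have to be added back. A minimal sketch of one common approach is a length prefix. To be clear, this is an illustration only: the 2-byte big-endian prefix, the MAX_FRAME limit, and the function names are assumptions, not axsh's actual wire format.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound on one message, to reject garbage lengths early.
const MAX_FRAME: usize = 4096;

/// Write one message as a 2-byte big-endian length followed by the payload.
fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(payload.len() as u16).to_be_bytes())?;
    w.write_all(payload)
}

/// Read exactly one message, however the stream happened to chunk the bytes.
fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 2];
    r.read_exact(&mut len)?;
    let len = u16::from_be_bytes(len) as usize;
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut buf = vec![0u8; len];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

Because both functions only need Read and Write, the same code works on a TcpStream, on an in-memory buffer in tests, or on anything else that exposes a byte stream.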

Example

If you have an AGW server, such as Direwolf, then it’s easy to run axsh. Just start a server:

axshd \
    -k server.key \
    -v debug \
    -a authorized_keys \
    --agw-addr localhost:8010 \
    -l M0QQQ-1

Then log in:

axsh \
    -k client.key \
    -s M0QQQ-2 \
    --agw-addr localhost:8000 \
    M0QQQ-1

Then wait like 30-40 seconds for the handshake to complete. The reason for the wait is the large ML-DSA signatures used in the handshake.
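A back-of-the-envelope sketch of where that time goes. The signature sizes are the ones in FIPS 204 (ML-DSA) and RFC 8032 (Ed25519). Which ML-DSA parameter set is used, and the assumption that the handshake carries exactly one dual signature in each direction, are mine for the sake of the arithmetic. Headers, public keys, acknowledgments and turnaround times are ignored.

```rust
/// Signature sizes in bytes for the three ML-DSA parameter sets (FIPS 204).
const ML_DSA_SIG_BYTES: [(&str, u32); 3] = [
    ("ML-DSA-44", 2420),
    ("ML-DSA-65", 3309),
    ("ML-DSA-87", 4627),
];

/// An Ed25519 signature is 64 bytes.
const ED25519_SIG_BYTES: u32 = 64;

/// Seconds on the air for `bytes` at `bps`, ignoring all framing overhead.
fn airtime_secs(bytes: u32, bps: u32) -> f64 {
    f64::from(bytes) * 8.0 / f64::from(bps)
}

/// Airtime for one dual signature in each direction (an assumption, see above).
fn handshake_sig_secs(ml_dsa_sig: u32, bps: u32) -> f64 {
    airtime_secs(2 * (ml_dsa_sig + ED25519_SIG_BYTES), bps)
}

/// One line per parameter set, for printing.
fn report(bps: u32) -> Vec<String> {
    ML_DSA_SIG_BYTES
        .iter()
        .map(|&(name, sig)| format!("{name}: {:.1}s", handshake_sig_secs(sig, bps)))
        .collect()
}
```

Under those assumptions, at 1200bps the signatures alone cost about 33 seconds with the smallest parameter set and about 63 seconds with the largest.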

It can’t be the same Direwolf instance, since Direwolf only shuffles packets between the radio and AGW clients, not from one AGW client to another. In my case I had one Direwolf connected to an ICom 9700, and another to a Baofeng UV5R using an AIOC (all-in-one cable). AIOC is highly recommended for experimentation over the air.

So yeah, my test is between just about the cheapest VHF/UHF radio that exists, and maybe the most expensive one.

Direwolf setup

ICom 9700

In addition to running rigctld -m 3081 -r /dev/ttyUSB0 -s 19200:

$ cat direwolf-9700.conf
# Identified with `aplay -l`.
#
# To get pulseaudio to get its dirty hands off of the device, I turned it "Off"
# under "Configuration" in `pavucontrol`.
ADEVICE plughw:2,0
PTT RIG 2 localhost:4532
CHANNEL 0
MYCALL M0QQQ-3
AGWPORT 8010
KISSPORT 8011
MODEM 1200
PACLEN 256
$ direwolf -t 0 -c direwolf-9700.conf

AIOC

$ cat direwolf-aioc.conf
# Identified with `aplay -l`.
#
# To get pulseaudio to get its dirty hands off of the device, I turned it "Off"
# under "Configuration" in `pavucontrol`.
ADEVICE plughw:1,0
ARATE 48000
PTT /dev/ttyACM0 DTR -RTS
CHANNEL 0
MYCALL M0THC-8
MODEM 1200
PACLEN 256
$ direwolf -t 0 -c direwolf-aioc.conf

Why not just run this over TCP/IP?

With KISS providing packet support (and AGW providing a higher level API on top, if preferred), why not just run TCP/IP, and let the very stable OS TCP implementation take care of everything?

TCP is definitely more modern, stable, and maintained, but it doesn’t scale down to slow speeds very well. A TCP+IPv4 header is at least 40 bytes, and if you don’t want to be some sort of caveman, IPv6 is another 20 bytes. At 1200bps that would be 267-400ms overhead for every packet[1]. Checking a random TCP data packet on my laptop I see that with TCP options TCP/IPv4 is actually 52 bytes, or about 347ms.

Counting the air time (milliseconds, not just bytes) makes this overhead problem more obvious.

And because amateur radio license rules mean every transmission still has to identify your callsign, you probably have to add 17 bytes (113ms) as a surrounding header.

That leaves TCP with 69 or 89 bytes overhead per packet, meaning 460ms or 593ms. And since you don’t want to tie up the RF channel for too long (only for the whole packet to be dropped due to interference), you won’t want to send packets that are too large.
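The millisecond figures above are just bytes times eight divided by the bit rate. A minimal sketch that reproduces them; the function name is mine, and like the text it ignores bit stuffing:

```rust
/// Airtime in milliseconds for `bytes` at `bps`, rounded to the nearest ms.
/// Ignores bit stuffing, flags and key-up delay.
fn airtime_ms(bytes: u32, bps: u32) -> u32 {
    (bytes * 8 * 1000 + bps / 2) / bps
}

/// Per-packet overhead in bytes for the cases discussed in the text.
const CASES: [(&str, u32); 6] = [
    ("TCP/IPv4, no options", 40),
    ("TCP/IPv6, no options", 60),
    ("TCP/IPv4 with options", 52),
    ("callsign wrapper", 17),
    ("TCP/IPv4 + options + wrapper", 69),
    ("TCP/IPv6 + options + wrapper", 89),
];

/// One formatted line per case, for printing.
fn overhead_table(bps: u32) -> Vec<String> {
    CASES
        .iter()
        .map(|&(name, bytes)| format!("{name}: {bytes} bytes, {}ms", airtime_ms(bytes, bps)))
        .collect()
}
```

At 1200bps this gives 267, 400, 347, 113, 460 and 593ms respectively. Passing 300 instead of 1200 gives the Bell 103 numbers, which are four times larger.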

Of course it’s 4x as slow if you want to do something like Bell 103 on HF.

AX.25 connected mode takes that down to 19 bytes (127ms) overhead (if using Mod 128 mode) per data packet.

Because of the AX.25 segmenter, for bulk data TCP is not as bad as it may have sounded. A 1500 byte TCP segment fits in just under eight 200-byte AX.25 frames, for a total of 17*8+69=205 bytes of overhead. That means 1367ms of overhead, compared to 1013ms for plain AX.25 (19*8=152 bytes). A 1500 byte payload takes 10 seconds to send, so that’s an overhead of 13.7% instead of 10.1%.

But for interactive use cases, where the worst case is a packet carrying a single byte of payload, it’s 467ms vs 133ms. And that’s only counting the data frames, not the acknowledgments. A TCP ACK is at a minimum 17+40=57 bytes, or 380ms. An AX.25 RR is 18-19 bytes, or 120-127ms.

That makes TCP about three times less efficient, compared to AX.25.
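The bulk and interactive comparisons can be sketched the same way. The code follows the accounting used above (17 bytes of wrapper per frame plus the 69 byte per-packet figure for TCP, 19 bytes per frame for AX.25). The constant and function names are mine, and the 200 byte frame size is the one from the example.

```rust
const AX25_UI_HDR: u32 = 17; // callsign-carrying wrapper around each fragment
const TCP_IP_HDR: u32 = 52; // TCP/IPv4 with options, as measured in the text
const AX25_CONN_HDR: u32 = 19; // AX.25 connected mode, mod 128

/// Number of frames needed to carry `bytes`, rounding up.
fn frames_needed(bytes: u32, frame_payload: u32) -> u32 {
    (bytes + frame_payload - 1) / frame_payload
}

/// Overhead bytes for one TCP segment split by the AX.25 segmenter,
/// mirroring the text's 17*8+69 for a 1500 byte payload.
fn tcp_bulk_overhead(payload: u32, frame_payload: u32) -> u32 {
    let frames = frames_needed(payload + TCP_IP_HDR, frame_payload);
    AX25_UI_HDR * frames + TCP_IP_HDR + AX25_UI_HDR
}

/// Overhead bytes for the same payload over plain AX.25 connected mode.
fn ax25_bulk_overhead(payload: u32, frame_payload: u32) -> u32 {
    AX25_CONN_HDR * frames_needed(payload, frame_payload)
}

/// Overhead as a percentage of the payload. The bit rate cancels out.
fn percent(overhead: u32, payload: u32) -> f64 {
    f64::from(overhead) * 100.0 / f64::from(payload)
}

/// How many times more airtime TCP needs than AX.25 for one small packet.
fn interactive_ratio(payload: u32) -> f64 {
    f64::from(TCP_IP_HDR + AX25_UI_HDR + payload) / f64::from(AX25_CONN_HDR + payload)
}
```

For a 1500 byte payload in 200 byte frames this gives 205 vs 152 bytes of overhead, or 13.7% vs 10.1%. For a single payload byte, interactive_ratio(1) is 3.5, and the ACK comparison (57 vs 19 bytes) is a factor of 3.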

A bigger problem with TCP, especially untweaked, is resend timers and window sizes. At 1200bps you don’t actually want too big a window size, since you don’t want to tie up the RF channel for several minutes if the other end has gone away. So a bunch of airtime tweaks are needed. And at best you’ll end up with the numbers above.

Maybe you could tweak TCP to be more friendly to lower speeds, and find the other overhead acceptable. If so, then you’ll be happy to hear that axsh supports running on TCP as well.

Why not QUIC?

Well first, it inherits the same problems from TCP/IP. Sure, the UDP header is smaller than the TCP header, but then on top of that there’s the QUIC header.

The second problem is that QUIC is meant to be encrypted. Ripping out encryption, while staying secure, seems more dangerous than keeping it simple and just working from the requirements. Probably the whole handshake would have to be redesigned.

FFmpeg & Google

AX.25 being removed from the Linux kernel reminds me of an LLM finding that bug in ffmpeg, causing all that drama.

I have no dog in this fight, but in my opinion ffmpeg is in the wrong, here. Their argument seems to be all about how this particular encoder is rarely used, is just a hobby project, etc. Ok, but it’s in your code base. Even if disabled by default, why would you want to ship a security footgun? Maybe some hobbyists out there build ffmpeg with all encoders enabled. Do you want them to be vulnerable to someone’s virus?

So Google should either keep quiet, or give a patch? Well, keeping quiet because the codec is rarely used is not really an option. That’s borderline negligent and morally culpable, for when someone eventually gets hacked.

So Google “should” always provide a patch in these cases? Perhaps, depending on the meaning of the word “should”. Google is rich, so it “should” be morally forced to contribute to your software, just because Google (presumably, via YouTube) is a heavy user of ffmpeg?

Well, that just sounds like the (non-)problem with open source software (or free software) in general. The license permits use and profit without contribution. If you wanted a tithe then you should have put that in the license. Sounds like you want everyone to be free only to do what you want. That’s not how that works.

This is also why I don’t like the AGPL license. It’s not free software if it binds me in your serfdom.

Footnotes

  1. Actually, it’s a tiny bit more, because of the occasional bit stuffing