The IRC protocol

I grew up in the age of IRC. I was a channel operator in DALnet's #irchelp at the height of the network's popularity. Recently, with some spare time on my hands, I toyed with the idea of building an IRC server implementation in Zig. But once I became more familiar with the protocol, I abandoned the project.

Here are some of the things that stood out.

Privacy

This probably won't come as a surprise to most, but the IRC protocol has very little (no?) consideration for privacy. Of particular note is the public disclosure of a client's IP address. There's nothing stopping implementations from providing a more privacy-focused experience, e.g. cloaking the IP address or forcing a connection over TLS - both things that some IRC networks support. But the fact is that there's nothing baked into the protocol and, from what I can tell, IRCv3 does little to address this.

Given the current regulatory context, and what other protocols are doing with respect to privacy, it kinda feels like IRC is best left in the past.

Text-Based Protocol

IRC is a text-based, newline-terminated protocol. The benefits and drawbacks of both text and binary-based protocols are well known and documented. In my mind though, binary protocols are easier to implement, have fewer edge cases, are more flexible and more efficient.

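One consequence of a delimiter-based text format is that names can't contain the delimiters: a channel name, for instance, can't include a space or a comma. Our choice of channel or nickname should not be limited by the protocol.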

If the human readable benefit of text-based protocols were considered essential, then I'd look at Redis' RESP protocol or something similar which at least still has the possibility of length-prefixing data.
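To make the length-prefixing point concrete, here is a minimal sketch of RESP's bulk string framing (`$<length>\r\n<data>\r\n`). The function names and the sample channel name are illustrative assumptions, and the sketch ignores the rest of RESP (arrays, null bulk strings, and so on).

```python
def encode_bulk(data: bytes) -> bytes:
    """Encode bytes as a RESP bulk string: $<length>\r\n<data>\r\n."""
    return b"$" + str(len(data)).encode("ascii") + b"\r\n" + data + b"\r\n"


def decode_bulk(buf: bytes) -> tuple[bytes, bytes]:
    """Decode one bulk string from the front of buf; return (data, rest)."""
    if not buf.startswith(b"$"):
        raise ValueError("not a bulk string")
    header_end = buf.index(b"\r\n")
    length = int(buf[1:header_end])
    start = header_end + 2
    end = start + length
    if buf[end:end + 2] != b"\r\n":
        raise ValueError("missing terminator or truncated data")
    return buf[start:end], buf[end + 2:]


# Hypothetical channel name containing a space, a comma and a newline,
# none of which a delimiter-based format could carry unescaped.
name = b"#zig help, general\n"
encoded = encode_bulk(name)
decoded, rest = decode_bulk(encoded)
```

Because the reader knows how many bytes to consume before it looks at them, the payload never has to be scanned for delimiters, and the header stays readable in a packet dump.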

Error Handling

The IRC protocol uses explicit error codes with descriptions for various error cases, but doesn't have more generic error capabilities. For example, code 431 is used to indicate "No nickname given", but there isn't a more generic way to indicate that a specific required parameter is missing. There's one generic error code, 461, to indicate that parameters are missing, but it has no way to specify which ones, and the opposite error (too many parameters) doesn't exist.

Rather than having specific error codes for only some commands and some parameters, it would be more useful and consistent to have parameterized errors. As-is, there's a bunch of error cases which don't have specific error codes and thus it isn't clear how they should be handled or communicated.
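As a rough illustration of what a parameterized error could look like, the sketch below carries the command, the offending parameter and a reason in a single reply. Everything in it is hypothetical: the `ParamError` type, the `ERROR_PARAM` verb, the reason strings and the server name are made up for this example and appear in no IRC specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamError:
    """Hypothetical generic error: which command, which parameter, what went wrong."""
    command: str
    param: str
    reason: str  # e.g. "missing", "unexpected", "invalid"

    def to_line(self, server: str) -> str:
        # Hypothetical wire format, not part of any IRC specification.
        return f":{server} ERROR_PARAM {self.command} {self.param} {self.reason}"


# One shape covers the case 431 handles today...
missing_nick = ParamError("NICK", "nickname", "missing")
# ...and the "too many parameters" case that has no code at all.
extra_param = ParamError("MOTD", "2", "unexpected")

line = missing_nick.to_line("irc.example.org")
```

A client would then need one handler for the whole family of parameter errors instead of a lookup table of per-command numerics.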

Protocol Inefficiencies

IRC isn't just a text-based protocol; it's also a really verbose one. Consider this partial reply to the MOTD command:

:tantalum.libera.chat 375 karltest :- tantalum.libera.chat Message of the Day -
:tantalum.libera.chat 372 karltest :- This server provided by Hyperfilter (https://hyperfilter.com)
:tantalum.libera.chat 372 karltest :-
:tantalum.libera.chat 372 karltest :-
:tantalum.libera.chat 372 karltest :- Welcome to Libera Chat, the IRC network for
:tantalum.libera.chat 372 karltest :- free & open-source software and peer directed projects.
...

The IRC RFC does not seem to require my nickname to be included, but it seems consistently implemented nonetheless. (It is required as per the "living specification" found at https://modern.ircdocs.horse.) I don't understand what purpose it serves. It does, however, make the implementation more complicated by requiring the MOTD to be dynamic, which is noteworthy because we're limited to 512-byte lines.

Unnecessary dynamic data added to replies makes writing those replies more difficult, adds more edge cases (especially given the short reply length limit) and makes caching virtually impossible.
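The effect on caching can be sketched in a few lines. This assumes the `:<server> 372 <nick> :- <text>` shape shown in the MOTD excerpt above; the helper names and the second nickname are made up for illustration.

```python
MAX_LINE = 512  # bytes per message, including the trailing \r\n


def motd_line(server: str, nick: str, text: str) -> bytes:
    """Render one 372 (MOTD body) line for a specific client."""
    line = f":{server} 372 {nick} :- {text}\r\n".encode("utf-8")
    if len(line) > MAX_LINE:
        raise ValueError(f"MOTD line is {len(line)} bytes, limit is {MAX_LINE}")
    return line


def room_for_text(server: str, nick: str) -> int:
    """Bytes left for the MOTD text once the per-client framing is accounted for."""
    return MAX_LINE - len(motd_line(server, nick, ""))


server = "tantalum.libera.chat"
# The same MOTD text has a different byte budget for every client,
# so the rendered lines cannot be built once and reused.
room_short = room_for_text(server, "karltest")
room_long = room_for_text(server, "a_much_longer_nickname")
```

Without the nickname, the whole MOTD could be rendered into a single buffer at startup and written as-is to every client that asks for it.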

In some cases, the additional data might be necessary due to the asynchronous nature of IRC communication. I'd like to see a fixed 2 or 4 byte message_id in every command and reply. (IRCv3 supports a Message ID extension, but it appears limited to some server-initiated replies.)
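For illustration only, here is one way a fixed 2-byte message_id could be framed. The layout is an assumption, not something IRC or IRCv3 defines: a big-endian 16-bit id followed by a 16-bit payload length, with the length field added so the sketch is self-delimiting.

```python
import struct

# Hypothetical header: message_id and payload length, both unsigned 16-bit, big-endian.
HEADER = struct.Struct("!HH")


def pack_frame(message_id: int, payload: bytes) -> bytes:
    """Prefix the payload with its id and length."""
    return HEADER.pack(message_id, len(payload)) + payload


def unpack_frame(buf: bytes) -> tuple[int, bytes, bytes]:
    """Read one frame from the front of buf; return (message_id, payload, rest)."""
    message_id, length = HEADER.unpack_from(buf)
    start = HEADER.size
    end = start + length
    if len(buf) < end:
        raise ValueError("incomplete frame")
    return message_id, buf[start:end], buf[end:]


request = pack_frame(7, b"MOTD")
request_id, command, _ = unpack_frame(request)
# The server echoes the id, so the client can match the reply to its request
# without the reply having to repeat the nickname or other context.
reply = pack_frame(request_id, b"Welcome to Libera Chat")
```

The id costs a fixed two bytes per message no matter how long the nickname or server name is, which is what makes the reply body cacheable.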

Length limit

I've mentioned the 512-byte length limit of commands and replies already. All the specification really says is that messages longer than 510 bytes (plus trailing \r\n) shall not be sent. It's not clear what should happen if a client, or server, sends more than this.

What's interesting, and I think fairly indicative of the protocol in general, is that a client can send a valid command (fewer than 512 bytes) which results in a server having to issue an invalid reply (greater than 512 bytes). Consider this exchange:

privmsg karltest :hello
:karltest!~karltest@178.256.02.33 PRIVMSG karltest :hello

My 23-byte message has generated a 57-byte reply. What happens if I send a 500-byte message? The specs don't seem to offer any suggestion, but what I've seen happen is the message getting truncated. To be clear, I don't mean the message gets split into two. I mean bytes are simply dropped.
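The arithmetic behind that exchange can be sketched as follows. The `relay` function mimics the truncation described above; it is an assumption about what a server might do, not documented behavior, and the helper names are made up.

```python
MAX_LINE = 512  # bytes per message, including the trailing \r\n


def relay_overhead(prefix: str, target: str) -> int:
    """Bytes a relayed PRIVMSG spends on everything except the message text."""
    return len(f":{prefix} PRIVMSG {target} :\r\n".encode("utf-8"))


def relay(prefix: str, target: str, text: str) -> bytes:
    """Relay a PRIVMSG, dropping whatever text no longer fits in MAX_LINE."""
    room = MAX_LINE - relay_overhead(prefix, target)
    body = text.encode("utf-8")[:room]  # bytes past `room` are silently lost
    return f":{prefix} PRIVMSG {target} :".encode("utf-8") + body + b"\r\n"


prefix = "karltest!~karltest@178.256.02.33"  # sender prefix from the exchange above
# Text bytes the client is allowed to send in a valid command...
sent_room = MAX_LINE - len(b"PRIVMSG karltest :\r\n")
# ...versus text bytes that still fit once the server prepends the prefix.
relayed_room = MAX_LINE - relay_overhead(prefix, "karltest")
```

With this prefix the gap between the two budgets is 34 bytes, the length of `:` plus the prefix plus a space. Since a client cannot reliably know which prefix the server will attach, it cannot know exactly how much of a long message will survive. Slicing raw bytes can also cut a multi-byte UTF-8 character in half.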

This shouldn't happen at the protocol level and if it must happen, the behavior should be documented.

Autobahn Testsuite

I'll finish by saying that I wish there was an Autobahn Testsuite equivalent for IRC. Both clients and servers would benefit from it. I'm not sure the IRC protocol, as-is, is worth investing in, but if I were trying to move it forward, I'd probably start there.
