Reliable UDP Basics

Back in university (2012–2015), my instructors taught us that UDP was so unreliable it was essentially unused in modern applications. Since TCP guarantees reliable, in-order delivery and provides useful features like bidirectional streaming and congestion control, the matter was presented as settled: there was no reason to use UDP.

In fact, it was implied that future development wouldn’t even require knowledge of TCP sockets, and possibly not even the details of HTTP. That was the prevailing stance at the time.

While my school required all instructors to have relevant industry experience, some gaps remained. Anyone who has written streaming applications or networked games knows the key reason to use UDP, or more specifically, Reliable UDP: performance.

In its most basic form, Reliable UDP consists of three additional features:

  1. Sequence numbers, which our application-level protocol can use to detect missing/duplicate packets.
  2. Packet acknowledgements (ACKs), which provide reliability by confirming receipt of packets.
  3. Retransmissions triggered by timeouts or sliding window mechanisms, enabling the sender to resend packets that have not been acknowledged.

Reliable UDP vs TCP

You’re probably thinking “these features resemble the first steps toward implementing TCP”, which is true, but important TCP features are deliberately omitted when implementing Reliable UDP.

  1. No persistent connections. Although establishing a TCP connection seems trivial, it requires either maintaining the underlying socket(s) or paying the handshake cost each time you communicate with a peer.
  2. Faster recovery by avoiding head-of-line blocking. TCP guarantees in-order delivery, so a missing segment blocks delivery of everything behind it until it is retransmitted and acknowledged. Reliable UDP lets us decide when retransmissions or in-order delivery are actually necessary.
  3. Omitting transport-layer congestion control allows higher throughput and lower latency, especially when sending many small packets. Congestion control ensures fair network usage by reducing transmission rates and window sizes for specific senders, but this can negatively affect performance in applications like multiplayer games.

Consider a multiplayer online FPS game. The server regularly broadcasts player positions to all clients multiple times per second. Losing a single player location update is not critical since another will arrive within tens of milliseconds. In this case, we may not care about missing acknowledgments and can skip retransmissions. Using TCP would remove this choice and likely cause gameplay issues due to delays from retransmitting missing packets.

Reliable UDP in Zig

After completing the ziglings exercises, I wanted to explore the language further. So, I pulled out some old university assignment requirements and decided to reimplement one that involved Reliable UDP, including simulating packet loss and implementing a sliding window feature.

Note: The code here targets Linux/POSIX systems only; if you are implementing networking in Zig on another platform, you will likely need to adjust it. I’m new to Zig, so some of my code may not follow idiomatic style or community best practices. Finally, I’m using a 0.15.0 release of Zig, so the syntax may look off to those still using a 0.14.x version.

Socket Basics

First, we need to create some sockets.

const std = @import("std");
const net = std.net;
const posix = std.posix;

const receiver = net.Address.initIp4(.{ 127, 0, 0, 1 }, 9091);

const sock = try posix.socket(
    posix.AF.INET,
    posix.SOCK.DGRAM,
    posix.IPPROTO.UDP,
);

try std.posix.bind(sock, @ptrCast(&receiver.any), receiver.in.getOsSockLen());

For those unfamiliar with socket-level programming, the socket is the file descriptor used for send and receive calls.

  • INET specifies usage of IPv4 (as opposed to IPv6, Unix sockets, raw sockets, Bluetooth, etc.).
  • DGRAM indicates a datagram socket, meaning it is connectionless and each send or receive call handles a single discrete packet.
  • UDP specifies the UDP protocol.
  • After creating the socket, we bind it to reserve the IP/port combination within the OS. This ensures we receive incoming packets sent to that destination port.
  • We can use setsockopt to allow address/port reuse if needed, as sketched below.
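
As a minimal sketch of that last point (this would need to run before the bind call above):

// Allow re-binding to the same address/port without waiting for the OS
// to release it, e.g. when restarting the receiver in quick succession.
try posix.setsockopt(
    sock,
    posix.SOL.SOCKET,
    posix.SO.REUSEADDR,
    &std.mem.toBytes(@as(c_int, 1)),
);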

Then we can try to receive data on the socket:

var buf: [1024]u8 = undefined;
            
const ret = try posix.recvfrom(sock, &buf, 0, null, null);
std.debug.print("data: {s}\n", .{buf[0..ret]});

Note that by passing null for the last two parameters, we discard information about the sender provided by the transport layer. We’ll revisit this later.

We can test our socket with netcat. Start the receiver in one terminal:

zig run main.zig

Then send a datagram from another terminal:

echo "hello world" | ncat --udp 127.0.0.1 9091

The receiver prints:

data: hello world

Sending is similar. You don’t strictly need to call bind; if you don’t, the OS assigns a random available port when you first send. Binding is still generally a good idea, since it gives the peer a stable port to reply to.

const sender = net.Address.initIp4(.{ 127, 0, 0, 1 }, 9090);
const receiver = net.Address.initIp4(.{ 127, 0, 0, 1 }, 9091);
const message = "hello world";
const sock = try posix.socket(
    posix.AF.INET,
    posix.SOCK.DGRAM,
    posix.IPPROTO.UDP,
);

try std.posix.bind(sock, @ptrCast(&sender.any), sender.in.getOsSockLen());
_ = try posix.sendto(sock, message, 0, @ptrCast(&receiver.any), receiver.getOsSockLen());

With ncat listening in another terminal, we again run zig run main.zig:

ncat -l --udp -v 9091
Ncat: Version 7.97 ( https://nmap.org/ncat )
Ncat: Listening on [::]:9091
Ncat: Listening on 0.0.0.0:9091
Ncat: Connection from 127.0.0.1:9090.
hello world

Minimal Viable Reliability

As mentioned earlier, we need at least sequence numbers and acknowledgements to build Reliable UDP.

To simplify our implementation, we define a packet structure in Zig using an enum and a tagged union to facilitate runtime packet type checks.

Note: you could leave the src and dest fields out of this packet definition, but in the near future I want to write a network simulation routine that sits between the source and destination, deciding whether to forward or drop packets, without needing excessive configuration or tricks like ARP spoofing.

pub const PacketType = enum(u8) {
    Data,
    Ack,
    EoT,
};

pub const Kind = union(PacketType) {
    Data: struct {
        seq: u32,
        ack: u32,
        len: usize,
        data: []const u8, // not owned — you manage backing storage elsewhere
    },
    Ack: struct {
        ack: u32,
    },
    EoT: void, // no payload
};

pub const Packet = struct {
    src: std.net.Ip4Address,
    dest: std.net.Ip4Address,
    kind: Kind,
};

See the repo here for the full source code.
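
The serialize and deserialize helpers are part of that source. Purely to illustrate the shape of the approach (this wire layout is my own sketch, not the repo’s actual format), the Ack case might look like:

pub fn serialize(self: *const Packet, buf: []u8) !usize {
    switch (self.kind) {
        .Ack => |a| {
            // A one-byte type tag followed by the ack number, big-endian.
            if (buf.len < 5) return error.BufferTooSmall;
            buf[0] = @intFromEnum(PacketType.Ack);
            std.mem.writeInt(u32, buf[1..5], a.ack, .big);
            return 5;
        },
        // Data would also write src/dest, seq, len, and the payload bytes;
        // EoT is just the tag on its own.
        else => return error.NotImplemented,
    }
}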

So, now we can implement a basic form of Reliable UDP.

Receiver:

fn handle_recv(allocator: std.mem.Allocator, senderAddr: std.posix.sockaddr, senderLen: std.posix.socklen_t, s: std.posix.socket_t, p: packet.Packet, expected_seq: *u32) !void {
    switch (p.kind) {
        .Data => |d| {
            if (d.seq != expected_seq.*) {
                // Out-of-sequence (e.g. duplicate) packet; drop it and let
                // the sender's timeout drive the retransmission.
                return;
            }
            std.debug.print("{s}", .{d.data});
            const ack = packet.Packet{
                .src = p.dest,
                .dest = p.src,
                .kind = packet.Kind{
                    .Ack = .{ .ack = expected_seq.* },
                },
            };
            const buffer = try allocator.alloc(u8, 1024);
            defer allocator.free(buffer); // freed even if sendto fails
            const n = try ack.serialize(buffer);
            _ = try posix.sendto(s, buffer[0..n], 0, &senderAddr, senderLen);
            expected_seq.* += 1;
        },
        .Ack => return, // in theory this should never happen, we could log an error here
        .EoT => {
            std.debug.print("\n", .{});
            // The sender has indicated this is the end of its transmission, so we can return an error telling the main loop that it can exit
            return Error.ErrorEoT;
        },
    }
}

// recv packets in a loop and ack them until we recv an EoT
fn recv_loop(allocator: std.mem.Allocator) !void {
    const sock = try posix.socket(
        posix.AF.INET,
        posix.SOCK.DGRAM,
        posix.IPPROTO.UDP,
    );

    try std.posix.bind(sock, @ptrCast(&receiver.any), receiver.in.getOsSockLen());

    var buf: [1024]u8 = undefined;
    var expected_seq: u32 = 0;

    var addr: std.posix.sockaddr = undefined;
    var addr_len: std.posix.socklen_t = @sizeOf(std.posix.sockaddr);

    while (true) {
        const ret = try posix.recvfrom(sock, &buf, 0, &addr, &addr_len);
        if (ret == 0) {
            // This shouldn't be possible since the socket is blocking, but let's be safe.
            continue;
        }
        const p = try packet.Packet.deserialize(buf[0..ret]);
        handle_recv(allocator, addr, addr_len, sock, p, &expected_seq) catch |err| {
            if (err == Error.ErrorEoT) {
                std.debug.print("\ngot an EoT, exiting\n", .{});
                return;
            }
            return err;
        };
    }
}

Our receiver sits in a loop where the recvfrom call blocks. When it receives a packet, it calls handle_recv, which parses the messages out of Data packets, responds with Acks, and updates the expected sequence number as needed.

Notice that we pass &addr to recvfrom to save the sender’s IP:port, which lets us reply with the Ack.

Sender:

fn send_wait(sock: std.posix.socket_t, allocator: std.mem.Allocator, pkt: *const packet.Packet, expect_ack: u32, timeout_ms: u64) !void {
    const buffer = try allocator.alloc(u8, 1024);
    defer allocator.free(buffer); // freed even if serialize fails

    const n = try pkt.serialize(buffer);

    var buf: [1024]u8 = undefined;

    // Set the receive timeout once up front; it applies to every recvfrom below.
    const timeval = std.posix.timeval{
        .sec = @intCast(timeout_ms / 1000),
        .usec = @intCast((timeout_ms % 1000) * 1000),
    };
    try std.posix.setsockopt(
        sock,
        std.posix.SOL.SOCKET,
        std.posix.SO.RCVTIMEO,
        std.mem.asBytes(&timeval),
    );

    while (true) {
        _ = try std.posix.sendto(sock, buffer[0..n], 0, @ptrCast(&receiver.any), receiver.getOsSockLen());

        const recv_len = std.posix.recvfrom(sock, &buf, 0, null, null) catch |err| switch (err) {
            error.WouldBlock => { // would block is how the timeout is represented for recvfrom
                std.debug.print("Timeout waiting for ack {d}, retrying...\n", .{expect_ack});
                continue; // retry
            },
            else => return err,
        };

        const pkt_recv = try packet.Packet.deserialize(buf[0..recv_len]);

        switch (pkt_recv.kind) {
            .Ack => |ack| {
                if (ack.ack == expect_ack) {
                    return;
                }
            },
            else => std.debug.print("Unexpected packet received while waiting for ack\n", .{}),
        }
    }
}

fn send_loop(allocator: std.mem.Allocator) !void {
    const sock = try posix.socket(
        posix.AF.INET,
        posix.SOCK.DGRAM,
        posix.IPPROTO.UDP,
    );
    // we'll try to recv acks too so bind early
    try std.posix.bind(sock, @ptrCast(&sender.any), sender.in.getOsSockLen());

    var expect_ack: u32 = 0;
    var file = try std.fs.cwd().openFile("book.txt", .{});
    defer file.close();

    // to start we'll expect to send and recv acks in sequence
    while (true) {
        // we need to account for the space needed to write the packet header
        var buf: [1024 - 25]u8 = undefined;

        const bytesRead = try file.read(buf[0..]);
        if (bytesRead == 0) break; // EOF reached

        var p = try packet.initDataPacket(
            allocator,
            sender.in,
            receiver.in,
            @intCast(expect_ack),
            expect_ack,
            buf[0..bytesRead],
        );
        // defer runs at the end of each loop iteration, covering both the
        // success path and any error from send_wait.
        defer p.deinit(allocator);

        try send_wait(sock, allocator, &p, expect_ack, 100);
        expect_ack += 1;
    }
    // all of our packets have been ack'd
    const eot = packet.Packet{
        .src = sender.in,
        .dest = receiver.in,
        .kind = .EoT,
    };
    const buffer = try allocator.alloc(u8, 1024);
    defer allocator.free(buffer);

    const n = try eot.serialize(buffer);
    _ = posix.sendto(sock, buffer[0..n], 0, @ptrCast(&receiver.any), receiver.getOsSockLen()) catch |err| {
        std.debug.print("error sending EoT to receiver: {}\n", .{err});
        return err;
    };
}

For the sender, we bind to an IP:port so we aren’t assigned a random port and the receiver knows where to send Acks. send_loop opens book.txt (a random book downloaded from Project Gutenberg) and sends its contents in a loop. Once all the data has been sent and acknowledged, we send an EoT packet to signal that we’re finished.

In send_wait we set a timeout on our socket. SOL.SOCKET refers to a generic socket-level option rather than one specific to a particular protocol, and RCVTIMEO is the flag for setting the receive timeout value. See the docs here.

We loop until we receive an ACK for the packet, checking for error.WouldBlock, which indicates the timeout. See the Zig docs here, where it specifies:

If sockfd is opened in non blocking mode, the function will return error.WouldBlock when EAGAIN is received.

Note that sockets have no built-in support for exponential backoff or a maximum number of retries; if we want those, we have to layer them onto our retry loop ourselves.
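
As a rough sketch (names like max_retries and error.MaxRetriesExceeded are mine, not from the repo), send_wait’s retry loop could bolt on both like this:

var timeout: u64 = timeout_ms; // doubled after every timed-out attempt
var attempt: u32 = 0;
const max_retries: u32 = 5;

while (attempt < max_retries) : (attempt += 1) {
    _ = try std.posix.sendto(sock, buffer[0..n], 0, @ptrCast(&receiver.any), receiver.getOsSockLen());

    // Re-apply RCVTIMEO on each attempt, since the timeout now changes.
    const timeval = std.posix.timeval{
        .sec = @intCast(timeout / 1000),
        .usec = @intCast((timeout % 1000) * 1000),
    };
    try std.posix.setsockopt(sock, std.posix.SOL.SOCKET, std.posix.SO.RCVTIMEO, std.mem.asBytes(&timeval));

    _ = std.posix.recvfrom(sock, &buf, 0, null, null) catch |err| switch (err) {
        error.WouldBlock => {
            timeout *= 2; // exponential backoff
            continue;
        },
        else => return err,
    };
    return; // got a datagram; check the Ack as before
}
return error.MaxRetriesExceeded;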

Retry Performance

Let’s quickly compare the slowdown incurred by our retry mechanism.

With the addition of a random number generator in our recv_loop function we can quickly simulate some packet loss. At the top of the function we’ll add:

var prng = std.Random.DefaultPrng.init(@intCast(std.time.nanoTimestamp()));
const rand = prng.random();
var r: f64 = undefined;
const percent_drop: f64 = 1; // 1% chance

And then inside the while loop before we call handle_recv we’ll add:

r = rand.float(f64) * 100.0; // random in [0, 100)
if (r < percent_drop and p.kind != .EoT) {
    continue;
}

Timing with no dropped packets:

zig run main.zig -- --mode send  0.02s user 0.04s system 39% cpu 0.150 total

Timing with 1% of packets dropped:

zig run main.zig -- --mode send  0.02s user 0.04s system 2% cpu 2.198 total

Only 2% CPU over the 2.198s total indicates that we spent essentially all of our execution time waiting on IO. This makes sense, as earlier we set the timeout to 100ms when we called send_wait:

try send_wait(sock, allocator, &p, expect_ack, 100);

We can confirm this using strace by prepending strace -T -e recvfrom to the command that runs our program. With some grep and awk we can total the time spent in recvfrom calls that eventually timed out.

strace -T -e recvfrom zig run main.zig -- --mode send 2>&1 | grep EAGAIN | grep -oP '<\K[0-9.]+' | awk '{sum+=$1} END {print "Total wait time in recvfrom:", sum, "seconds"}'
Total wait time in recvfrom: 1.87963 seconds

Note that for me the relevant strace lines look like this:

recvfrom(3, 0x11d4ac5, 1024, 0, NULL, NULL) = -1 EAGAIN (Resource temporarily unavailable) <0.104772>

In theory we could set that timeout to a lower value, but 100-200ms is a reasonable max timeout for something like a networked game where players might be located in different geographic locations.

Next Steps

At this point, it should be clear that we’re effectively replicating TCP’s in-order delivery by sending and acknowledging a single packet at a time.

Whether we can tolerate out-of-order delivery or avoid retries depends entirely on the needs of the end application. That said, there’s a simple optimization we’ve already hinted at that can significantly improve throughput.

By introducing a sliding window, we can reduce the per-packet overhead by batching together packets into a “window” and waiting for a single acknowledgment in response. The receiver replies once per window, indicating the highest sequence number it has successfully received. The sender then slides the window forward based on that acknowledgment.
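
As a minimal sketch of the sender-side bookkeeping (the Window struct and its names are hypothetical; the real version comes in the follow-up):

const std = @import("std");

// Hypothetical sender-side window state, kept deliberately tiny.
const Window = struct {
    base: u32 = 0, // oldest unacknowledged sequence number
    next_seq: u32 = 0, // next sequence number to transmit
    size: u32 = 8,

    // There is room to send while next_seq sits inside [base, base + size).
    fn canSend(self: *const Window) bool {
        return self.next_seq < self.base + self.size;
    }

    // A cumulative ack for `ack` confirms every seq <= ack, so the
    // window slides forward past it.
    fn onAck(self: *Window, ack: u32) void {
        if (ack >= self.base) self.base = ack + 1;
    }
};

test "window slides on cumulative ack" {
    var w = Window{};
    while (w.canSend()) : (w.next_seq += 1) {} // "send" seqs 0..7
    try std.testing.expect(!w.canSend());
    w.onAck(4); // receiver reports the highest in-order seq received
    try std.testing.expectEqual(@as(u32, 5), w.base);
    try std.testing.expect(w.canSend()); // seqs 8..12 may now go out
}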

In a follow-up post, we’ll implement this sliding window logic and explore Zig’s allocator types to reduce memory usage and gain better insight into allocation behavior overall.