Abstract Nonsense

A place for musings, observations, design notes, code snippets - my thought gists.

Activation functions and empiricism

In deep learning literature, there’s a veritable menagerie of different activation functions that are commonly employed betwixt layers. Proponents of one or another class of functions will usually proffer up some rationalisation for what makes their choice grounded: differentiability, smoothness, computational complexity, numerical stability, concision…

Today I was reading through GLU Variants Improve Transformer by Noam Shazeer (also of Attention Is All You Need fame) and came across this gem of empiricism:

We have extended the GLU family of layers and proposed their use in Transformer. In a transfer-learning setup, the new variants seem to produce better perplexities for the de-noising objective used in pre-training, as well as better results on many downstream language-understanding tasks. These architectures are simple to implement, and have no apparent computational drawbacks. We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence.

Well, I appreciate the honesty.

Programming Rust

I finished reading Programming Rust by Jim Blandy, Jason Orendorff and Leonora F. S. Tindall (Amazon). I feel like I spent too much of 2025 flitting between new concepts, and I missed the structured and concerted learning from university. A goal for 2026 was to switch from breadth-first to depth-first learning; and to be honest, I miss reading textbooks! I read this cover to cover, and overall I really enjoyed it. In particular:

  • I really enjoyed the sections on the Rust borrow checker and semantics of ownership. The diagrams were especially helpful in understanding the memory layout of Rust values and types.
  • The illustrative code examples were a bit hit-and-miss for me: I thought they were often quite bland or didn’t demonstrate the topic at hand very well.
  • Exercises would have been a nice touch, but there is a plethora of exercise ideas floating through the ether in any case.
  • I found the asynchronous programming section to be a bit of a mess. However, that could well be due to my lack of experience with async programming as a whole. In which case, I have much to make up for!

All in all, I found it quite an enjoyable read, and feel that I’ve got a better handle on Rust programming. I’m looking forward to picking up and reading through more textbooks this year!

Rustlings

As part of my recent quest to learn Rust, I worked through the Rustlings exercises. You’re given a series of ~100 problems, and your goal is to fix the code to make it compile with passing tests. It was a pretty quick whirl-through, and I don’t think the examples are the most interesting, but as a short introduction to the syntax and rudiments of the semantics, it does the trick with aplomb. I did quite enjoy the interactive problem-solving CLI, though.

Networking Concepts by Beej

Computer networking always felt like an area in which I was lacking after completing my undergraduate CS education. A good while ago I stumbled across the excellent and freely accessible Beej’s Guide to Networking Concepts and worked through it, tracking my solutions in this solutions-guide-to-networking-concepts repository.

All solutions herein were produced without any LLM assistance. Where any cloze-style template is provided by the textbook for the project, I’ve adopted that as the basis for my solution. Any errors or lapses of understanding are solely my own.

Please note that the selection of chapters for which solutions are presented is non-contiguous as not all chapters in the textbook had programmatic questions or projects associated. The list of projects is as follows:

| Chapter | Textbook link | Solutions |
| --- | --- | --- |
| 5 | HTTP Client and Server | Solutions |
| 9 | A Better Web Server | Solutions |
| 12 | Atomic Time | Solutions |
| 13 | The Word Server | Solutions |
| 16 | Validating a TCP Packet | Solutions |
| 19 | Computing and Finding Subnets | Solutions |
| 22 | Routing with Dijkstra’s | Solutions |
| 30 | Using Select | Solutions |
| 39 | Multiuser Chat Client and Server | Solutions |

With the exception of uni-curses for the last project, there are no dependencies other than Python and the standard library.

My focus in these solutions is on simplicity and straightforward, imperative code, following the style used by Beej in the textbook. For production applications, you would certainly want to add more abstraction and robust parsing and error handling. These solutions are meant to be educational and illustrative.

I really enjoyed this little foray into the dark arts of networking, and learned a lot along the way. Most importantly, I learned a lot about what I still don’t know, which I find equally valuable. Here’s a little section-by-section retrospective on working through the exercises:

# 5 HTTP Client and Server

Here you build your own HTTP client and server, and practice opening sockets, sending bytes through and receiving bytes back. It was interesting reading through the socket docs to see what’s exposed by the OS, and learning about arcane incantation option flags like socket.SOL_SOCKET, socket.SO_REUSEADDR…
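As a rough illustration of those incantations, here is a minimal sketch of a one-shot server and client talking over loopback. It is not the code from the solutions repository: the function names `serve_once`, `fetch` and `demo`, the fixed `hello` body and the 4096-byte buffer are all assumptions made for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept a single connection and answer it with a fixed response."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # read (and ignore) the request in this sketch
        body = b'hello'
        conn.sendall(
            b'HTTP/1.1 200 OK\r\n'
            b'Content-Type: text/plain\r\n'
            + f'Content-Length: {len(body)}\r\n'.encode('ascii')
            + b'Connection: close\r\n\r\n'
            + body
        )


def fetch(host: str, port: int) -> bytes:
    """Send a bare-bones GET request and read until the server closes."""
    with socket.create_connection((host, port)) as s:
        request = f'GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n'
        s.sendall(request.encode('ascii'))
        return b''.join(iter(lambda: s.recv(4096), b''))


def demo() -> bytes:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        # SO_REUSEADDR lets the address be rebound quickly after a restart.
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
        listener.listen()
        port = listener.getsockname()[1]

        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        response = fetch('127.0.0.1', port)
        server.join()
        return response
```

Calling `demo()` returns the raw response bytes, status line and headers included, which makes it easy to see exactly what travels over the wire.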

# 9 A Better Web Server

This was a fun little exercise, and got me thinking about how I would construct type-safe objects to represent the usual menagerie of HTTP headers, requests, paths, methods, protocols… For simplicity I went with TypeAlias to have something more explicit to put into function signatures. I can’t recall why I didn’t use the (relatively new) type statement, though.

It was also a little scary (and fun!) to think about just how many vulnerabilities a naive implementation of a web server would expose, and just how much complexity is required to remediate those security holes.
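One example of such a hole, chosen here purely as an illustration, is path traversal: a request path like `/../../etc/passwd` escaping the directory being served. A minimal sketch of one possible guard, assuming a hypothetical `resolve_request_path` helper and Python 3.9+ for `Path.is_relative_to`:

```python
from pathlib import Path
from typing import Optional


def resolve_request_path(server_root: Path, request_path: str) -> Optional[Path]:
    """Map a request path onto the filesystem, refusing to leave server_root.

    Returns None when the resolved path would fall outside the root.
    """
    root = server_root.resolve()
    # Strip the leading slash so the join stays relative to the root, then
    # resolve() to collapse any '..' segments before checking containment.
    candidate = (root / request_path.lstrip('/')).resolve()
    if candidate.is_relative_to(root):
        return candidate
    return None
```

This covers only one class of problem; a production server would need far more than this.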

# 12 Atomic Time

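```python
import socket
from contextlib import closing


def get_nist_time() -> int:
    # TIME_SERVER, TIME_PROTOCOL_PORT, RESPONSE_BUFFER_SIZE and logger are
    # assumed to be defined elsewhere in the module.
    with closing(socket.socket()) as s:
        s.connect((TIME_SERVER, TIME_PROTOCOL_PORT))
        logger.info(f'Connected to {TIME_SERVER}:{TIME_PROTOCOL_PORT}')

        # Keep calling recv() until it returns b'' (the server closed the connection).
        response = b''.join(iter(lambda: s.recv(RESPONSE_BUFFER_SIZE), b''))
        logger.debug(f'Response: {response}')

        # Interpret the response bytes as a big-endian unsigned integer.
        return int.from_bytes(response, byteorder='big')
```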

I initially thought that using the closing utility was a smart way to automatically .close the socket after exiting the context manager, but I just realised that socket objects already support the context manager protocol themselves. From the socket docs:

Changed in version 3.2: Support for the context manager protocol was added. Exiting the context manager is equivalent to calling close().
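In other words, the socket can be used in a with statement directly. A minimal sketch that checks this without touching the network; `open_and_release` is a hypothetical helper name:

```python
import socket


def open_and_release() -> tuple[int, int]:
    """Return the socket's file descriptor inside and after the with block."""
    with socket.socket() as s:  # no contextlib.closing needed
        inside = s.fileno()     # a valid (non-negative) descriptor
    after = s.fileno()          # -1 once the socket has been closed
    return inside, after
```

The second value being -1 confirms that leaving the block closed the socket.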

I also enjoyed using the functional-esque form of iter(callable, sentinel, /). The docs use this example that I borrowed to read data into the buffer from the socket:

One useful application of the second form of iter() is to build a block-reader. For example, reading fixed-width blocks from a binary database file until the end of file is reached:

```python
from functools import partial
with open('mydata.db', 'rb') as f:
    for block in iter(partial(f.read, 64), b''):
        process_block(block)
```
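The same two-argument form can be tried without a file or a socket. A small self-contained sketch using an in-memory stream; `read_blocks` is a hypothetical helper:

```python
import io
from functools import partial


def read_blocks(stream: io.BufferedIOBase, block_size: int) -> list[bytes]:
    """Call stream.read(block_size) repeatedly until it returns b''."""
    return list(iter(partial(stream.read, block_size), b''))


blocks = read_blocks(io.BytesIO(b'abcdefgh'), 3)
# blocks == [b'abc', b'def', b'gh']; the sentinel b'' itself is never yielded
```

With a socket, `recv` plays the role of `read`: it returns `b''` once the peer has closed the connection, which is what ends the iteration.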

# 13 The Word Server

In this project you have to build up a server-side packet stream that encodes words (English words that is, not machine words) and a client that consumes the packets and parses them. I don’t think my construction here is the cleanest, but I think it does the job for working within the scaffold of the solution and keeping it imperative. I also went down some side exploration on Python’s struct standard library module, and protobufs.
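For a flavour of how struct helps with this kind of packet stream, here is a sketch that assumes each packet is a 2-byte big-endian length followed by the UTF-8 encoded word. That layout, and the names `encode_word` and `decode_words`, are assumptions for illustration and may not match the textbook scaffold or the solution.

```python
import struct

WORD_LENGTH_FORMAT = '!H'  # network (big-endian) unsigned 16-bit integer
WORD_LENGTH_SIZE = struct.calcsize(WORD_LENGTH_FORMAT)  # 2 bytes


def encode_word(word: str) -> bytes:
    """Prefix the encoded word with its length in bytes."""
    payload = word.encode('utf-8')
    return struct.pack(WORD_LENGTH_FORMAT, len(payload)) + payload


def decode_words(stream: bytes) -> tuple[list[str], bytes]:
    """Parse every complete packet; return the words and any leftover bytes."""
    words = []
    offset = 0
    while len(stream) - offset >= WORD_LENGTH_SIZE:
        (length,) = struct.unpack_from(WORD_LENGTH_FORMAT, stream, offset)
        end = offset + WORD_LENGTH_SIZE + length
        if end > len(stream):
            break  # the packet is incomplete: wait for more bytes
        words.append(stream[offset + WORD_LENGTH_SIZE:end].decode('utf-8'))
        offset = end
    return words, stream[offset:]
```

Returning the leftover bytes matters because a single `recv` can end in the middle of a packet, so the caller needs to keep the tail and prepend it to the next chunk.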

# 16 Validating a TCP Packet

This was an exercise in bit twiddling and in coercing Python’s unbounded integers into behaving like fixed-width values. But quite enjoyable once you squash the inevitable off-by-n errors.
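The core of that bit twiddling is the 16-bit ones’ complement checksum. A minimal sketch, assuming the bytes passed in are already laid out as the checksum expects; the pseudo header that TCP also covers is left out here, and `ones_complement_checksum` is a hypothetical name:

```python
def ones_complement_checksum(data: bytes) -> int:
    """Compute the 16-bit ones' complement checksum of data."""
    if len(data) % 2 == 1:
        data += b'\x00'  # pad to a whole number of 16-bit words

    total = 0
    for offset in range(0, len(data), 2):
        word = int.from_bytes(data[offset:offset + 2], byteorder='big')
        total += word
        # Python ints never overflow, so fold the carry back in by hand
        # to emulate 16-bit ones' complement addition.
        total = (total & 0xFFFF) + (total >> 16)

    return (~total) & 0xFFFF  # mask, because ~ yields a negative int
```

Using the worked example from RFC 1071, the bytes `00 01 f2 03 f4 f5 f6 f7` sum to `0xddf2`, giving a checksum of `0x220d`. Appending that checksum to the data and recomputing yields 0, which is how validation works.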

# 19 Computing and Finding Subnets

More bit-twiddling exercises! This all came together quite nicely though, and made for some fun arithmetic. doctest was my friend here for helping me validate the functions as I wrote them.
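A sketch of the kind of arithmetic involved, with doctests doubling as documentation. The function names and signatures are illustrative and not necessarily those used in the textbook or the solutions.

```python
import doctest


def ipv4_to_value(address: str) -> int:
    """Pack a dotted-quad IPv4 address into a 32-bit integer.

    >>> ipv4_to_value('1.2.3.4')
    16909060
    """
    value = 0
    for octet in address.split('.'):
        value = (value << 8) | int(octet)
    return value


def value_to_ipv4(value: int) -> str:
    """Unpack a 32-bit integer into dotted-quad form.

    >>> value_to_ipv4(16909060)
    '1.2.3.4'
    """
    return '.'.join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def subnet_mask(prefix_length: int) -> int:
    """Build the mask for a prefix length such as the 24 in '/24'.

    >>> value_to_ipv4(subnet_mask(24))
    '255.255.255.0'
    """
    return (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF


def network_address(cidr: str) -> str:
    """Return the network address for an address in CIDR notation.

    >>> network_address('198.51.100.77/24')
    '198.51.100.0'
    """
    address, prefix_length = cidr.split('/')
    return value_to_ipv4(ipv4_to_value(address) & subnet_mask(int(prefix_length)))


if __name__ == '__main__':
    doctest.testmod()  # silent when every example passes
```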

# 22 Routing with Dijkstra’s

This was a straightforward implementation of Dijkstra’s algorithm for route finding. I haven’t dug into it much, but I know there’s a positively inordinate amount of complexity re efficient network routing. I quite enjoy reading the Cloudflare Blog to expose myself to more about the wide, wild and wonderful world of the web.
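For reference, a compact sketch of Dijkstra’s algorithm using a binary heap. The adjacency-dict graph representation and the router names in the usage note are made up for the example and do not reflect the textbook’s data format.

```python
import heapq

Graph = dict[str, dict[str, int]]  # node -> {neighbour: edge weight}


def shortest_path(graph: Graph, source: str, destination: str) -> list[str]:
    """Return the lowest-cost path from source to destination, or [] if none."""
    distances = {source: 0}
    previous: dict[str, str] = {}
    queue = [(0, source)]
    visited: set[str] = set()

    while queue:
        distance, node = heapq.heappop(queue)
        if node in visited:
            continue  # stale queue entry for an already-settled node
        visited.add(node)
        if node == destination:
            break
        for neighbour, weight in graph.get(node, {}).items():
            candidate = distance + weight
            if candidate < distances.get(neighbour, float('inf')):
                distances[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if destination not in distances:
        return []

    # Walk the predecessor links backwards, then reverse.
    path = [destination]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path
```

For a graph such as `{'A': {'B': 1, 'C': 5}, 'B': {'A': 1, 'C': 2}, 'C': {'A': 5, 'B': 2}}`, the path from `'A'` to `'C'` goes via `'B'` (cost 3) rather than over the direct link (cost 5).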

# 30 Using Select

Not much to say here: just a small exercise on using select. This was instructive in the broader sense of I/O and OS file integration, though.
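A tiny sketch of select in action, using a connected socket pair so that no network access is needed. The helper names `is_readable` and `demo` are hypothetical.

```python
import select
import socket


def is_readable(sock: socket.socket, timeout: float = 0.0) -> bool:
    """Ask the OS whether sock has data waiting, blocking up to timeout seconds."""
    readable, _writable, _errored = select.select([sock], [], [], timeout)
    return sock in readable


def demo() -> tuple[bool, bool, bytes]:
    left, right = socket.socketpair()
    with left, right:
        before = is_readable(right)               # nothing sent yet
        left.sendall(b'ping')
        after = is_readable(right, timeout=1.0)   # data is now pending
        data = right.recv(4096)
    return before, after, data
```

In a real server the first list passed to `select.select` would hold the listening socket plus every connected client, so a single thread can wait on all of them at once.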

# 39 Multiuser Chat Client and Server

By far my favourite exercise, and one of those “you’ve got to build it yourself once” programmer exercises I can tick off the list!

There’s a veritably endless set of extensions I wanted to add to the project, but I got called away by the pursuit of something else in the knowledge-verse. It was the perfect confluence of everything the book had worked through: bits and bytes, sockets, threads, error-handling, packet-wrangling… It was nice handling concurrent users and distributing packets between clients. As a small bonus, I could sneak in my favourite and seldom-used walrus operator:

```python
while (cmd := read_command(f'{args.client_nickname}> ')) != CLIENT_QUIT_COMMAND:
    if cmd.startswith('/message'):
        _cmd, recipient, message = cmd.split(maxsplit=2)
        packet: bytes = chat_packet.dict_to_packet({
            'type': 'private_message',
            'recipient_nickname': recipient,
            'message': message
        })
    ...
```

# Conclusion

I’m glad I went through the textbook end-to-end and wrote up solutions for the learning experience, but I don’t think I’ll revisit this much. I think my time is probably better spent on learning other things now. At some point it might be nice to revisit these exercises in Rust, though. I think Rust provides better affordances and abstractions for writing cleaner and more expressive networking code than Python.

GitHub hijacks and breaks browser search

I like to keep a word list of any new and interesting words I come across day-to-day. Today I was curious how many entries were in my list and went to search the YAML file on GitHub.

To my delight, I discovered that GitHub has hijacked the native Cmd-F browser search. To top it off, it seems the maximum number of matches GitHub’s search returns is limited to 200.

I’d excuse a search function that at least reported “> 200 results matched”, but there is no indication in the UI that this is the case. Even if you navigate to the 200th result and can see additional matches in the viewport, GitHub’s UI steadfastly refuses to acknowledge their existence.

Here’s what it looks like on macOS with Safari version 26.2:

[Screenshot: Safari GitHub search]

A quickly-disappearing hint on the GitHub search modal reported that hitting Cmd-F again brings up the native browser search. I gave that a whirl, but it still wasn’t finding all matches. I thought I’d inspect the page source for the text elements to see what’s going on under the hood:

[Screenshot: Viewport bug Safari GitHub]

… and hit a render error. This is another lovely bug I’ve been running into when the viewport changes quickly in Safari. I can see a data-target="react-app.reactRoot" attribute lurking in the dark: maybe I shouldn’t besmirch the React-ification of GitHub’s UI, though. After all, at least the raw file searches instantly in the browser:

[Screenshot: Safari GitHub raw search]

Luckily, Firefox retains its native search experience:

[Screenshot: Firefox GitHub search]

Maybe I’m getting this all wrong, and I’d love to be corrected. But it sure feels like GitHub’s UI and UX have become increasingly slow and unfriendly of late. In any case, I’ve reported this bug and I’ll update if I hear anything back.