Abstract Nonsense

A place for musings, observations, design notes, code snippets - my thought gists.

Mapbox documentation uses real tokens

I’ve been working on a small side project that requires embedding an interactive map. I spent some time evaluating different mapping providers - initially OpenStreetMap with Leaflet for annotations. OpenStreetMap is great for rapid prototyping since it’s free and doesn’t require any API key, but it doesn’t look great (by default, at least) and it doesn’t have the same rich POI data integration.

I dabbled with Mapbox next and was really impressed. It has end-to-end API coverage for heaps of mapping use cases, and provides lots of examples of how to integrate Mapbox into your choice of language/platform/framework. However, the thing that really stood out to me was their onboarding design. Once you create an account and log in, Mapbox injects your personal API key directly into the example code on the documentation pages, so you can copy/paste and immediately start working.

I ended up settling on Apple’s MapKit JS. It’s got a generous free tier allowance (as per usual with Apple, only if you pay for an Apple Developer account, which is fair) and a nice UI. The downsides are that it doesn’t have any React bindings, so you have to roll your own wrapper, and the overall documentation is lacking. The dearth of documentation is especially noticeable when contrasted with the tomes provided by Mapbox.

Internationali(z|a)tion is hard

I came across a UI glitch today in my Uber app. At first glance it appears to be a preposterous oversight: the s in Favourites has been orphaned!

I live in Australia, where we closely follow British English spelling - meaning it’s “Favourites” and not “Favorites”. In the world of UI/UX, it’s common to use localisation dictionaries to map strings to locale-appropriate versions. I suspect some UI designer carefully crafted this screen for US English and mapped over to AU English, accidentally committing a tiny typographic crime.
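To make the suspicion concrete, here is a minimal sketch of a localisation dictionary with a fallback locale. The string keys, the table layout and the `localise` helper are all hypothetical; this is not how Uber (or any particular framework) actually does it.

```python
# Hypothetical string tables keyed by locale; keys and values are illustrative only.
STRINGS = {
    "en-US": {"favorites_title": "Favorites", "trips_title": "Trips"},
    "en-AU": {"favorites_title": "Favourites"},
}


def localise(key: str, locale: str, default_locale: str = "en-US") -> str:
    """Look up a UI string for a locale, falling back to the default locale."""
    table = STRINGS.get(locale, {})
    if key in table:
        return table[key]
    return STRINGS[default_locale][key]


us = localise("favorites_title", "en-US")
au = localise("favorites_title", "en-AU")

# The AU string is one character longer than the US one.
extra_chars = len(au) - len(us)
```

If a label was sized to fit the US string exactly, that single extra character could be enough to push the trailing s onto its own line.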

There’s a fantastic site literally called grumpy.website that aggregates many such UI/UX whoopsies. I’d definitely recommend checking it out!

Image

Writing in Future Tense: Machine Time

I published a blog post last night but it never appeared on the site. My GitHub Actions workflow kicked in, my commit hit the server, my Cloudflare build completed with no warnings or errors - everything looked good.

The culprit? Timezone mismatch. I’m writing from AEST (+10, I’m in Melbourne), but Cloudflare Pages Workers builds in UTC (“server time”). Hugo saw my future timestamp and politely ignored the post.

The fix: Use hugo --buildFuture as the build command in Cloudflare Pages settings to include posts “in the future”. I’ll consider this a cautionary tale … it’s not the first time timezones have caused me havoc in production.
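Here is a rough sketch of the comparison that bit me, using only Python’s standard library. It assumes the post date was written as Melbourne wall-clock time without a UTC offset and then read as UTC by the build; the dates and the `is_future` helper are made up for illustration, and this is not Hugo’s actual code.

```python
from datetime import datetime, timedelta, timezone

# Fixed +10:00 offset for AEST (no daylight saving in June).
AEST = timezone(timedelta(hours=10), "AEST")


def is_future(post_date: datetime, build_time: datetime) -> bool:
    """Return True if the post would be treated as not yet published."""
    return post_date > build_time


# Hypothetical post written at 22:00 Melbourne time.
written_at = datetime(2025, 6, 1, 22, 0, tzinfo=AEST)

# The build runs at the same instant, but the server clock is in UTC (12:00).
build_time = written_at.astimezone(timezone.utc)

# Assumption: the date has no offset, so the build reads "22:00" as 22:00 UTC.
read_as_utc = written_at.replace(tzinfo=timezone.utc)

skipped = is_future(read_as_utc, build_time)   # date appears 10 hours ahead
published = is_future(written_at, build_time)  # with the offset, same instant
```

Under that assumption, writing the date with an explicit offset such as +10:00 would be another way to avoid the problem, since both sides then refer to the same instant.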

⚡️Apache Spark 4.0 released

Apache Spark 4.0 has been released. It’s the first major version update since Spark 3.0 in 2020.

Here are some of the highlights I’m excited about:

  • A new SQL pipe syntax. It seems to be a trend with modern SQL engines to include “pipe” syntax support now (e.g. BigQuery). I’m a fan of functional programming inspired design patterns and the excellent work by the prql team, so I’m glad to see this next evolution of SQL play out.
  • A structured logging framework. Spark logs are notoriously lengthy, and structured output means you can now use Spark to consume Spark logs! Coupled with improvements to stack traces in PySpark, hopefully this will mean less grepping through tortuously long stack traces.
  • A new DESCRIBE TABLE AS JSON option. I really dislike unstructured command line outputs that you have to parse with awkward bashisms. JSON input/outputs and manipulation with jq is a far more expressive consumption pattern that I feel captures the spirit of command line processing.
  • A new PySpark Plotting API! It’s interesting to see it supports Plotly as a backend engine. I’ll be curious to see how this plays out going forward… Being able to do #BigData ETL as well as visualisation and analytics within the one tool is a very powerful combination.
  • A new lightweight Python-only Spark Connect PyPI package. Now that Spark Connect is getting more traction, it’s nice to be able to pip install Spark on small clients without having to ship massive jars around.
  • A bug fix for inaccurate Decimal arithmetic. This is interesting only insofar as it reminds me that even well-established, well-tested, correctness-first, open-source software with industry backing can still be subject to really nasty correctness bugs!
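As a small illustration of why structured logs beat grepping, here is a standard-library sketch that filters JSON log lines by level. The field names (`ts`, `level`, `msg`, `logger`) and the sample lines are assumptions for illustration, not copied from real Spark output, and `filter_by_level` is a hypothetical helper.

```python
import json

# Hypothetical sample lines; field names are assumed, not taken from real Spark logs.
SAMPLE_LOG = """\
{"ts": "2025-06-01T12:00:00Z", "level": "INFO", "msg": "Job started", "logger": "Scheduler"}
{"ts": "2025-06-01T12:00:05Z", "level": "ERROR", "msg": "Executor lost", "logger": "Scheduler"}
not a json line
{"ts": "2025-06-01T12:00:09Z", "level": "WARN", "msg": "Retrying task", "logger": "TaskSet"}
"""


def filter_by_level(lines, level):
    """Yield parsed log records whose level matches, skipping non-JSON lines."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray plain-text lines mixed into the log
        if isinstance(record, dict) and record.get("level") == level:
            yield record


errors = list(filter_by_level(SAMPLE_LOG.splitlines(), "ERROR"))
```

The same idea should carry over to loading the log files into a Spark DataFrame and filtering on a column instead of pattern-matching free text.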

Databricks has some excellent coverage on the main release and the new pipe syntax specifically.

Answering the Unasked

I’m not sure exactly where this originated from, but I’m quite delighted by this exam question:

State some substantive question which you thought might appear on this exam, but did not. Answer this question (correctly).

As an interview question, I’ll sometimes ask: “Tell me something interesting you’ve discovered or learned recently.” I find it goes a long way towards understanding the way the candidate thinks, how they convey technical knowledge to others, and how real their passion and interest is for the domain.