Blogs by Folkert

Tech blog on web, security & embedded

While working on the Roc compiler, we regularly dive deep on computer science topics. A recurring theme is speed, both the runtime performance of the code that we generate, as well as the performance of our compiler itself.

One extremely useful technique that we have been playing with is data-oriented design: the idea that the actual data you have should guide how code is structured.

December 9, 2022

Sorting with SIMD

Google recently published a blog article and paper introducing their SIMD-accelerated sorting algorithm.

SIMD stands for single instruction, multiple data. A single instruction is used to apply the same operation to multiple pieces of data. The prototypical example is addition, where one instruction can do e.g. 4 32-bit additions. A single SIMD addition should be roughly 4 times faster than performing 4 individual additions.

This kind of instruction-level parallelism has many applications in areas with a lot of number crunching, e.g. machine learning, physics simulations, and game engines. But how can this be used for sorting? Sorting does not involve arithmetic, and the whole idea of sorting is that each element moves to its unique correct place in the output. In other words, we don't want to perform the same work for each element, so at first sight it's hard to see where SIMD can help.

To understand the basic concepts, I played around with the ideas from the paper Fast Quicksort Implementation Using AVX Instructions by Shay Gueron and Vlad Krasnov. They provide an implementation in (surprisingly readable) assembly on their github. Let's see how we can make SIMD sort.

For the last couple of months we at Tweede golf have been working on implementing a Network Time Protocol (NTP) client and server in Rust.

The project is a Prossimo initiative and is supported by their sponsors, Cisco and AWS. Our first short-term goal is to deploy our implementation at Let's Encrypt. The long-term goal is to develop an alternative fully-featured NTP implementation that can be widely used.

Recently, we gave a workshop for the folks at iHub about using Rust, specifically looking at integrating Rust with cryptography libraries written in C.
Over the past months, we have worked with Scailable to optimize their neural network evaluation. Scailable runs neural networks on edge devices, taking a neural network specification and turning it into executable machine code.

In our last post, we've seen that async can help reduce power consumption in embedded programs. The async machinery is much more fine-grained at switching to a different task than we reasonably could be. Embassy schedules the work intelligently, which means the work is completed faster and we race to sleep. Our application actually gets more readable because we programmers mostly don't need to worry about breaking up our functions into tasks and switching between them. Any await is a possible switching point.

Now, we want to actually start using async in our programs. Sadly there are currently some limitations. In this post, we'll look at the current workarounds, the tradeoffs, and how the limitations might be partially resolved in the near future.

Previously we talked about conserving energy using async. This time we'll take a look at performing power consumption measurements. Our goal is first to get a feel for how much power is consumed, and then to measure the difference between a standard synchronous and an async implementation of the same application.
To more effectively write Embedded Rust applications, we want a clearer picture of two aspects: how can we ergonomically perform multiple tasks concurrently, and how can we exploit low-power modes to save energy. In the coming weeks, we want to write a small but non-trivial application that communicates with 2 sensors, uses async, and uses the low-power modes to conserve energy.
In embedded systems, energy efficiency is crucial for practical applications. Usually devices run on a battery, so the less energy you use, the longer the power supply will last. In this post we'll look at the basics of going to sleep and waking back up, and build a proof of concept using the nRF52840 development kit.

Blog posts

Show all

If one person at Tweede golf is a Rustacean, it’s definitely Wouter. Whether it’s about web, embedded, or even games: he tried it. And probably not just tried it, but prototyped, created, documented, presented, and nailed it. Just take a look at Wouter’s GitHub page[1]. He’s also engaged in the Rust community as an organizer of RustFest, member of the Dutch Rust foundation, and as maintainer of several open-source crates. He believes that Rust is well on its way to perfection.

August 14, 2020

Functional Rust? (4/5)

Lars started at Tweede golf about a year ago. We lured him in with the prospect of working on a cool embedded project in Rust. Since then he clocked a lot of Rust hours on it and on a research project we are running. Still, he manages to astound us with critical notes on Rust. Rightly so? Let's just say interviewing a functional programming purist like Lars gives us a lot of new perspectives around Rust.
Marlon is the rookie at Tweede golf, joining us a few months back. He started out with Rust not too long ago and is therefore the guy to talk to about his learning experience with Rust.