Folkert

  • Embedded software engineer
  • folkert@tweedegolf.com

Folkert is a true Rust fanatic, which makes him a perfect fit for the team. He wants to focus primarily on applying Rust to embedded problems, low-level material where he feels completely at home. He also enjoys solving practical problems with more high-level technology.

Besides programming languages, Folkert is also interested in linguistics, the study of natural languages, in particular Frisian (which he speaks himself) and the Scandinavian languages. When he is not busy with a (programming) language, he likes to cook or work in the garden.

Rust, Compilers, Linguistics

Blog posts by Folkert

December 9, 2022

Sorting with SIMD

Google recently published a blog article and paper introducing their SIMD-accelerated sorting algorithm. SIMD stands for single instruction, multiple data: a single instruction applies the same operation to multiple pieces of data. The prototypical example is addition, where one instruction can perform, for example, four 32-bit additions. A single SIMD addition should be roughly four times faster than performing four individual additions. This kind of instruction-level parallelism has many applications in areas with a lot of number crunching, such as machine learning, physics simulations, and game engines. But how can this be used for sorting? Sorting does not involve arithmetic, and the whole idea of sorting is that each element moves to its unique correct place in the output. In other words, we don't want to perform the same work for each element, so at first sight it's hard to see where SIMD can help. To understand the basic concepts, I played around with the ideas from the paper Fast Quicksort Implementation Using AVX Instructions by Shay Gueron and Vlad Krasnov. They provide an implementation in (surprisingly readable) assembly on their GitHub. Let's see how we can make SIMD sort.
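To make the addition example concrete, here is a minimal sketch in Rust of one instruction performing four 32-bit additions. It is an illustration only, not code from the article or the paper: the function name `add4` is hypothetical, and the sketch assumes an x86_64 target, where the SSE2 intrinsics from `std::arch` are always available. On other targets it falls back to a plain loop.

```rust
/// Adds two groups of four 32-bit integers lane by lane, wrapping on overflow.
/// `add4` is a hypothetical name used only for this illustration.
#[cfg(target_arch = "x86_64")]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_add_epi32, _mm_loadu_si128, _mm_storeu_si128};

    let mut out = [0i32; 4];
    // SAFETY: SSE2 is part of the x86_64 baseline, and the unaligned
    // load/store intrinsics only need 16 readable/writable bytes, which
    // a `[i32; 4]` provides.
    unsafe {
        // Load both inputs into 128-bit registers (four 32-bit lanes each).
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        // One instruction adds all four lanes at once.
        let sum = _mm_add_epi32(va, vb);
        // Write the four sums back to memory.
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, sum);
    }
    out
}

/// Scalar fallback for targets other than x86_64: four individual additions.
#[cfg(not(target_arch = "x86_64"))]
pub fn add4(a: [i32; 4], b: [i32; 4]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}
```

Calling `add4([1, 2, 3, 4], [10, 20, 30, 40])` returns `[11, 22, 33, 44]`. Every lane receives exactly the same operation, which is why it is not obvious at first how the same machinery could help with sorting.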

For the last couple of months we at Tweede golf have been working on implementing a Network Time Protocol (NTP) client and server in Rust. The project is a Prossimo initiative and is supported by their sponsors, Cisco and AWS. Our first short-term goal is to deploy our implementation at Let's Encrypt. The long-term goal is to develop an alternative fully-featured NTP implementation that can be widely used.

Recently, we gave a workshop for the folks at iHub about using Rust, specifically looking at integrating Rust with cryptography libraries written in C.

Over the past months, we have worked with Scailable to optimize their neural network evaluation. Scailable runs neural networks on edge devices, taking a neural network specification and turning it into executable machine code.

In our last post, we saw that async can help reduce power consumption in embedded programs. The async machinery is much more fine-grained at switching to a different task than we reasonably could be. Embassy schedules the work intelligently, which means the work is completed faster and we race to sleep. Our application actually gets more readable because we programmers mostly don't need to worry about breaking up our functions into tasks and switching between them. Any await is a possible switching point. Now, we want to actually start using async in our programs. Sadly, there are currently some limitations. In this post, we'll look at the current workarounds, the tradeoffs, and how the limitations might be partially resolved in the near future.

Previously we talked about conserving energy using async. This time we'll take a look at performing power consumption measurements. Our goal is first to get a feel for how much power is consumed, and then to measure the difference between a standard synchronous and an async implementation of the same application.

To write embedded Rust applications more effectively, we want a clearer picture of two aspects: how we can ergonomically perform multiple tasks concurrently, and how we can exploit low-power modes to save energy. In the coming weeks, we want to write a small but non-trivial application that communicates with two sensors, uses async, and uses the low-power modes to conserve energy.

In embedded systems, energy efficiency is crucial for practical applications. Usually devices run on a battery, so the less energy you use, the longer the power supply will last. In this post we'll look at the basics of going to sleep and waking back up, and build a proof of concept using the nRF52840 development kit.