Blog

Tech blog on web, security & embedded

While working on the Roc compiler, we regularly dive deep on computer science topics. A recurring theme is speed, both the runtime performance of the code that we generate, as well as the performance of our compiler itself.

One extremely useful technique that we have been playing with is data-oriented design: the idea that the actual data you have should guide how code is structured.

December 9, 2022

Sorting with SIMD

Google recently published a blog article and paper introducing their SIMD-accelerated sorting algorithm.

SIMD stands for single instruction, multiple data. A single instruction is used to apply the same operation to multiple pieces of data. The prototypical example is addition, where one instruction can do e.g. 4 32-bit additions. A single SIMD addition should be roughly 4 times faster than performing 4 individual additions.

This kind of instruction-level parallelism has many applications in areas with a lot of number crunching, e.g. machine learning, physics simulations, and game engines. But how can this be used for sorting? Sorting does not involve arithmetic, and the whole idea of sorting is that each element moves to its unique correct place in the output. In other words, we don't want to perform the same work for each element, so at first sight it's hard to see where SIMD can help.

To understand the basic concepts, I played around with the ideas from the paper Fast Quicksort Implementation Using AVX Instructions by Shay Gueron and Vlad Krasnov. They provide an implementation in (surprisingly readable) assembly on their github. Let's see how we can make SIMD sort.

Over the past months, we have worked with Scailable to optimize their neural network evaluation. Scailable runs neural networks on edge devices, taking a neural network specification and turning it into executable machine code.

The Dutch government offers the AHN [[1]](https://www.ahn.nl/) as a way to get information about the height of any specific place in the country. They offer this data by using a point cloud. That is, a large set of points with some additional meta information. With the current version of the AHN the resolution of the dataset is about eight points per square meter. This results in about 2.5TB of compressed data for the relatively small area of the Netherlands. While this is something that is not impossible to store locally, it does offer some challenges.

De verhouding tussen groen (bomen, struiken, gras) en grijs (bebouwing, tegels, straat) is door de jaren heen verslechterd, met name in de steden. Weinig groen kan voor veel problemen zorgen, zoals slechte opname van regenwater en het lang vasthouden van hitte waardoor het in de stad een aantal graden warmer is dan daarbuiten.
In 2014 we won an innovation grant from the province of Gelderland based on our proposal to provide 'intelligent' gardening advice to users of Draw Your Garden (Dutch: Teken Je Tuin). We created this web application for one of our clients and we have been gradually expanding it since its release. In the app users can both design their garden and view it in 3D as well as order products and contact gardeners, who in turn can submit proposals based on the users' design.

At tweede golf, we value innovation: we take the time to research new technologies and subsequently challenge ourselves to try out these new techniques in order to discover new applications. We also like to learn by doing: build something first, ask questions later.