Async and asleep: designing our future embedded applications
Embedded devices have many different tasks to complete, but usually only one core to perform them on. A task must often wait for hardware to finish an operation (e.g. sending a physical signal to a sensor, or waiting for a response). In such cases, we'd like the CPU to work on something else.
Raw speed is not the only concern: in embedded systems, we must also consider energy consumption. Work often occurs in bursts, interspersed with periods of inactivity. We would therefore like ergonomic control over entering low-power modes, waking back up only when there is work to be done.
The remainder of this post sketches the background of this project, describes the problems that we'll have to overcome, and explains how we will evaluate the project afterwards.
The Power Problem
For most software, power usage is not a concern. But in embedded systems, it can make or break a product. Devices out in the field are usually battery-powered. And when the battery runs out while the system is expected to perform its task, that's a real problem.
Dealing with power requirements is hard because seemingly trivial changes in the program can cause a big change in power consumption. Furthermore, batteries are inherently unreliable. My colleague Wouter told me a story of testing a big collection of batteries for a heart-rate monitoring patch. He bought a certain battery type from all vendors he could find and measured their performance.
It turned out that none of the batteries performed as described by the manufacturer. Worse, there was significant performance fluctuation between batteries from the same batch.
Ultimately this meant he had to make a very conservative estimate of the available power, which in turn means that while writing code, power consumption was a constant concern. He recommends testing the power consumption continually to prevent regressions.
This and other experiences have made us think about more structural approaches to reduce the power consumption of our programs.
The landscape today
Most of our production Embedded Rust applications are big state machines. Each state performs some work (read a sensor, write to disk, etc.) and then transitions to one of several successor states. This is how embedded applications have been written traditionally, and how they are still commonly architected in C or C++.
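To make the shape of such a state machine concrete, here is a minimal sketch. The states, events and transitions are invented for illustration and do not come from one of our production applications.

```rust
/// Hypothetical application states; the names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Idle,
    ReadSensor,
    Store { sample: u16 },
    Fault,
}

/// Hypothetical inputs that drive a transition.
#[derive(Debug, Clone, Copy)]
enum Event {
    TimerElapsed,
    SampleReady(u16),
    WriteDone,
    Error,
}

/// Perform one transition: given the current state and an event,
/// return the successor state.
fn step(state: State, event: Event) -> State {
    match (state, event) {
        // An error moves us to the fault state from anywhere.
        (_, Event::Error) => State::Fault,
        (State::Idle, Event::TimerElapsed) => State::ReadSensor,
        (State::ReadSensor, Event::SampleReady(sample)) => State::Store { sample },
        (State::Store { .. }, Event::WriteDone) => State::Idle,
        // Any other combination leaves the state unchanged.
        (unchanged, _) => unchanged,
    }
}
```

Every new piece of behavior means another state, another event, and more match arms, which is where most of the architectural effort goes.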
For communication with individual sensors we rely on the extensive Rust ecosystem, and its standardization of common behavior as traits. For instance, the embedded-hal crate defines an abstraction for communicating over SPI, the accelerometer crate defines an abstract interface for accelerometers, and finally, the lis3dh crate builds on top of these crates to allow us to communicate with the LIS3DH accelerometer over SPI.
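The layering can be sketched roughly as follows. Note that the traits below are simplified stand-ins that we made up for this example: they are not the real embedded-hal or accelerometer APIs, and the byte protocol is invented.

```rust
/// Hypothetical, simplified bus abstraction (NOT the real embedded-hal API).
trait SpiBus {
    type Error;
    /// Send `tx` and fill `rx` with the bytes clocked back in.
    fn transfer(&mut self, tx: &[u8], rx: &mut [u8]) -> Result<(), Self::Error>;
}

/// Hypothetical, simplified sensor abstraction.
trait Accelerometer {
    type Error;
    /// Raw acceleration on the x, y and z axes.
    fn read_raw(&mut self) -> Result<[i16; 3], Self::Error>;
}

/// A driver that is generic over the bus it talks through.
struct ExampleAccel<B> {
    bus: B,
}

impl<B: SpiBus> Accelerometer for ExampleAccel<B> {
    type Error = B::Error;

    fn read_raw(&mut self) -> Result<[i16; 3], Self::Error> {
        // Made-up protocol: one command byte, then six data bytes.
        let tx = [0x01u8, 0, 0, 0, 0, 0, 0];
        let mut rx = [0u8; 7];
        self.bus.transfer(&tx, &mut rx)?;
        Ok([
            i16::from_le_bytes([rx[1], rx[2]]),
            i16::from_le_bytes([rx[3], rx[4]]),
            i16::from_le_bytes([rx[5], rx[6]]),
        ])
    }
}

/// A fake bus for host-side testing: it replays canned bytes.
struct FakeBus {
    response: [u8; 7],
}

impl SpiBus for FakeBus {
    type Error = ();

    fn transfer(&mut self, _tx: &[u8], rx: &mut [u8]) -> Result<(), ()> {
        rx.copy_from_slice(&self.response);
        Ok(())
    }
}
```

Because the driver only depends on the bus trait, the same driver code works with any chip that implements it, and with a fake bus on a development machine.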
Ultimately this means we have to spend most of our development time on architecture (the state machine), but relatively little time worrying about the exact implementation of how we communicate with sensors: that code has already been written by others.
The promise of async Rust
The introduction of async in Rust promised to alleviate the pain of manually writing state machines. With the await keyword we can indicate points in the program where it might be a good idea to switch tasks (e.g. because we're waiting for some data from a sensor, which may take a while). But otherwise, our code looks like straightforward synchronous code. No explicit state machine is required.
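A small host-side sketch shows the idea. Everything here is an assumption made for illustration: `SensorRead` is a pretend sensor future, and `block_on` is a deliberately naive executor that simply polls in a loop. A real embedded executor would put the CPU to sleep between polls instead.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that is not ready on its first poll, standing in for
/// "the sensor has not answered yet".
struct SensorRead {
    polled_once: bool,
    value: u16,
}

impl Future for SensorRead {
    type Output = u16;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u16> {
        if self.polled_once {
            Poll::Ready(self.value)
        } else {
            self.polled_once = true;
            // Ask to be polled again; real hardware would wake the
            // task from an interrupt handler instead.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn read_sensor(value: u16) -> SensorRead {
    SensorRead { polled_once: false, value }
}

/// Reads like straight-line code; each `.await` is a point where an
/// executor may switch to another task.
async fn measure() -> u16 {
    let a = read_sensor(20).await;
    let b = read_sensor(22).await;
    a + b
}

/// Waker that does nothing, because `block_on` polls unconditionally.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Naive single-future executor: poll until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

Calling `block_on(measure())` drives both reads to completion. The compiler generates the state machine behind `measure` for us.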
Today, the embassy crate provides an async executor that runs on many embedded architectures. More widely used executors like tokio cannot be used on embedded systems, because they require a system allocator and other parts of Rust's standard library that cannot be implemented (efficiently) on our hardware.
And yes, async on embedded does deliver on its promise. We are able to write very clean embedded code. The embassy crate provides us not just with a solid embedded executor, but also comes with an extra layer of API design that should make writing embedded applications even nicer.
However, async today makes writing the other parts of our applications harder. Currently, traits cannot contain async methods, which means we're left entirely without the ecosystem that we rely on to communicate with our sensors.
Making an async version of a synchronous sensor crate is not terribly hard, but it is annoying and for now, cannot be re-used in a convenient way.
Using low-power modes
We also want to combine async with low-power modes. These modes use substantially less energy than when the hardware is running normally. Conceptually, this is quite simple: when there is no work to do, and we know or assume there will be no work to do for a while, we can make the device enter a low-power mode to conserve energy.
The catch is that it may take longer to respond to an event because the machine has to wake up first. Additionally, a sleep/wake cycle by default does not retain the program state: it's like a reset. Any sensor configuration and static values are erased.
In practice, we need to be careful when to enter a sleep mode, and to make sure all relevant state is accurately stored when going to sleep and retrieved when waking back up.
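One way to think about storing and retrieving state is as a snapshot written to memory that survives sleep. The sketch below is an assumption about how this could look, not our implementation: the `Retained` struct, its byte layout, and the XOR checksum (a stand-in for a real CRC) are all invented for this example.

```rust
/// Hypothetical application state that must survive a sleep/wake cycle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Retained {
    sample_count: u32,
    sensor_config: u8,
}

/// Marker byte that distinguishes a snapshot from blank memory.
const MAGIC: u8 = 0xA5;
/// Snapshot size: magic + 4 count bytes + config + checksum.
const LEN: usize = 7;

/// XOR of all bytes; a stand-in for a real CRC.
fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0, |acc, b| acc ^ b)
}

impl Retained {
    /// Serialize the state before going to sleep.
    fn to_bytes(&self) -> [u8; LEN] {
        let mut buf = [0u8; LEN];
        buf[0] = MAGIC;
        buf[1..5].copy_from_slice(&self.sample_count.to_le_bytes());
        buf[5] = self.sensor_config;
        let sum = checksum(&buf[..6]);
        buf[6] = sum;
        buf
    }

    /// Restore the state after waking up. Returns `None` if the buffer
    /// does not hold a valid snapshot, for example after a cold boot.
    fn from_bytes(buf: &[u8; LEN]) -> Option<Self> {
        if buf[0] != MAGIC || buf[6] != checksum(&buf[..6]) {
            return None;
        }
        Some(Retained {
            sample_count: u32::from_le_bytes([buf[1], buf[2], buf[3], buf[4]]),
            sensor_config: buf[5],
        })
    }
}
```

On wake-up the application would try `from_bytes` first and fall back to a full initialization when it returns `None`.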
Speaking of waking up, this too is a bit fiddly. Because the device is off, most incoming signals are simply not processed. So how do we know when there is more work to be done? There are not a lot of (Rust) examples of this mechanism for the hardware that we use.
Ideally, we'd find a reusable way to achieve power savings using the low-power mode, but that seems unlikely given how many conditions must be met to go to sleep safely and advantageously.
We want to write a small but non-trivial application that uses async to communicate with at least two sensors. By default, the chip is in a low-power mode. The sensors periodically send new measurements. Each new measurement wakes the device, which processes it and goes back to sleep.
In terms of measurements, we first need some baselines. We will measure the power consumption of some common scenarios like wait-for-event and a busy-waiting loop. We are specifically interested in the CPU's power draw, and will ignore the power used by sensors or other components of the device.
After that, we will measure the power consumption of going to and from the low-power mode. This will tell us how long we should sleep (at least) before going to sleep is more efficient than just idly waiting for the next incoming measurement.
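The break-even point follows from a simple energy balance. Staying idle for a time t costs P_idle · t, while sleeping costs roughly E_transition + P_sleep · t, so sleeping wins once t exceeds E_transition / (P_idle − P_sleep). The sketch below assumes this simplified model, in which the transition time itself is folded into the transition energy. The function name and the numbers used with it are made up.

```rust
/// Shortest idle period (in seconds) for which sleeping costs less
/// energy than staying awake, under the simplified model
/// `E_transition + P_sleep * t < P_idle * t`.
///
/// Powers are in milliwatts and the transition energy (going to sleep
/// plus waking up) is in millijoules, so the result is in seconds.
/// Returns `None` if sleeping draws no less power than idling.
fn break_even_seconds(p_idle_mw: f64, p_sleep_mw: f64, e_transition_mj: f64) -> Option<f64> {
    let saving_mw = p_idle_mw - p_sleep_mw;
    if saving_mw <= 0.0 {
        None
    } else {
        Some(e_transition_mj / saving_mw)
    }
}
```

With invented figures of 10 mW idle, 2 mW asleep and 4 mJ per sleep/wake cycle, the function returns 0.5 seconds: any shorter gap between measurements is better spent idling.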
Finally, we will measure the power consumption of the CPU while running 2 versions of our test application: sync and async.
Besides the energy consumption of the sync and async version, the code style and amount of code are also important. Chiefly we want to know when the convenience of async is worth the cost of having to make an async version of the sensor communication code.
We will use a power analyzer to measure the energy draw. We will have to figure out how to measure exactly what we want to measure: power consumed by the CPU. We expect relative changes in CPU power consumption to translate well between different CPUs.
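As an illustration of how logged samples turn into an energy figure, the sketch below sums current samples taken at a fixed interval. It assumes a constant supply voltage and evenly spaced samples. The function name and units are our own choice for this example, not the output format of any particular analyzer.

```rust
/// Approximate energy in millijoules from current samples.
///
/// Assumes a constant supply voltage (volts), current samples in
/// milliamperes, and a fixed sampling interval `dt_s` in seconds.
/// Each sample contributes `V * I * dt` (mW * s = mJ).
fn energy_mj(supply_v: f64, current_ma: &[f64], dt_s: f64) -> f64 {
    current_ma.iter().map(|i_ma| supply_v * i_ma * dt_s).sum()
}
```

Dividing the result by the total duration gives the average power, which is the number we would compare between test cases.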
Evaluating the code quality is less objective. Nonetheless, we think we can draw some conclusions after building the sync and async test cases. Likewise, we can make a more informed trade-off when considering porting code from sync to async Rust.
In the coming weeks, we'll be releasing more posts detailing our progress on this research project. We'll zoom in on the various test cases that we're going to build, tell more about how we measure current consumption and which tools we use, and compare the ergonomics of sync versus async programming for embedded applications in Rust.
At the end there will be a thorough comparison of the sync and the async approaches. Stay tuned!
This is the first article in a series on async embedded development with Rust.