Running real-time Rust
As a disclaimer, though, I have to note that I've never had the pleasure (or misfortune) of working on a hard-real-time system. I may be missing some things, so don't take this blog as gospel.
Many systems have some real-time component. You can imagine a thrust-vectoring rocket engine that needs to respond in time or the rocket crashes. But many systems are easier, for example EV chargers, where the power pins have to be disconnected within a certain number of milliseconds after the plug is pulled from the car.
Real-time is also one of those terms that everybody uses differently. But generally it's about being able to know or predict that some operation finishes before some deadline. This means keeping an eye on (unexpected) latencies and potentially unbounded blockers.
So let's dive in!
How real-time can Rust be?
Rust is compiled to machine code, just as C is. And similarly, it doesn't have a GC. This means that Rust is really only constrained by what the hardware can deliver.
For example, even if the code is perfect, it can't prevent jitter in sensors, peripherals or interrupts. But the level of control Rust gives us is such that it's the hardware, not the language, we have to care about.
Another thing to consider is where the code is executed from: flash will be slower and less predictable than RAM, for example. There are lots of things like this that are determined by the hardware rather than the language.
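For instance, cortex-m-rt's default linker script copies everything in `.data.*` sections from flash to RAM at startup, which can be used to run a hot function from RAM. A minimal sketch; the section suffix and the function itself are made up for illustration:

```rust
// Placed in a `.data.*` section, this function is copied to RAM at boot
// by cortex-m-rt's startup code, so it executes without flash wait states.
#[link_section = ".data.control_loop"]
#[inline(never)] // keep it out-of-line so the placement actually matters
fn control_loop_step(setpoint: i32, measured: i32) -> i32 {
    setpoint - measured
}
```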
All this is to say that Rust itself is not a constraining factor when creating a real-time system and can be a great choice.
Things to look out for
Rust cares about correctness. It's one of the major reasons why people use it! In theory this doesn't cost anything since Rust is all in on zero-overhead abstractions.
However, things pan out differently in practice. We need to prove to the compiler that we're doing things in a safe manner. This means, for example, that we often choose to wrap something in a mutex even when we wouldn't in an equivalent C program. The alternative can sometimes be writing some unsafe code, but for good reasons we often opt for the 'worse' safe option.
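As a small illustration, here's a counter shared with an interrupt handler, sketched with the critical-section crate that we'll get to below. In C this might just be a volatile global; in Rust we have to wrap it so the compiler can prove the access is safe:

```rust
use core::cell::Cell;
use critical_section::Mutex;

// A static mutated from multiple contexts must be wrapped so that every
// access is provably synchronized.
static COUNTER: Mutex<Cell<u32>> = Mutex::new(Cell::new(0));

fn increment_counter() {
    critical_section::with(|cs| {
        let counter = COUNTER.borrow(cs);
        counter.set(counter.get() + 1);
    });
}
```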
So what are common things to worry about?
Critical sections
A common way to make mutexes work is by taking a critical section. This is popular enough that a widely used abstraction, the critical-section crate, has been made for it.
While a critical section is active, no other code can run. This means interrupts are delayed until the end.
Because of its pervasive use, this can be a big concern: any dependency can, at any point, start a critical section that takes too long.
To analyze where it's used, you can run `cargo tree -i critical-section` to see all dependencies that use the crate. This won't catch code that implements critical sections manually, but that doesn't happen much.
Most uses of critical sections are perfectly fine: the CS is taken, a flag is changed, and the CS is released. But every use should be checked.
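For example, here's a sketch of the kind of use that deserves scrutiny (the buffer and its size are made up): all interrupts stay delayed until the closure finishes, however long the processing takes.

```rust
use core::cell::RefCell;
use critical_section::Mutex;

static BUFFER: Mutex<RefCell<[u8; 512]>> = Mutex::new(RefCell::new([0; 512]));

fn process_buffer() {
    // Worth flagging in review: interrupts are blocked while the whole
    // buffer is summed. Better to copy the data out (or swap buffers)
    // and do the work outside the critical section.
    critical_section::with(|cs| {
        let buf = BUFFER.borrow_ref(cs);
        let _sum: u32 = buf.iter().map(|b| *b as u32).sum();
    });
}
```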
An alternative is creating a custom critical-section implementation that, for example, disables all but your 'real-time' interrupt. This comes with a bunch of safety caveats, but is very much possible. This is the approach the RTIC framework takes, behind some abstractions that make it safe.
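A sketch of that idea, assuming a Cortex-M core with BASEPRI and the crate's restore-state-u8 feature; the priority threshold 0x20 is an arbitrary placeholder. Note the big caveat: data shared with the still-enabled interrupt is no longer protected by this critical section.

```rust
use critical_section::RawRestoreState; // u8 with the `restore-state-u8` feature

struct BasepriCriticalSection;
critical_section::set_impl!(BasepriCriticalSection);

unsafe impl critical_section::Impl for BasepriCriticalSection {
    unsafe fn acquire() -> RawRestoreState {
        let prev = cortex_m::register::basepri::read();
        // Mask all interrupts at priority 0x20 and below, leaving the
        // single higher-priority 'real-time' interrupt free to preempt.
        cortex_m::register::basepri_max::write(0x20);
        prev
    }

    unsafe fn release(prev: RawRestoreState) {
        cortex_m::register::basepri::write(prev);
    }
}
```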
Async/await
Async/await is awesome! But it's also cooperative. This means that if one task is running, it'll keep running until an await point is reached. Only then can the executor switch to another task.
Luckily there are ways around it. Async/await is always cooperative within one executor, but two executors can preempt one another. This can be accomplished with Embassy by running multiple executors at different interrupt priorities, or with RTIC, which has this built in.
Still, you'll have to factor the overhead of the executor into the equation. This adds latency that may or may not be predictable, depending on its implementation.
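A rough sketch of the Embassy route, loosely based on its multiprio example; the STM32 target, the borrowed UART4 interrupt and the priority are placeholder choices:

```rust
use embassy_executor::InterruptExecutor;
use embassy_stm32::interrupt;
use embassy_stm32::interrupt::{InterruptExt, Priority};

// A second executor driven by an otherwise-unused interrupt; its tasks
// preempt whatever the normal thread-mode executor is running.
static EXECUTOR_HIGH: InterruptExecutor = InterruptExecutor::new();

#[interrupt]
unsafe fn UART4() {
    EXECUTOR_HIGH.on_interrupt()
}

#[embassy_executor::task]
async fn high_prio_task() {
    // Time-critical work goes here.
}

fn start_high_prio() {
    interrupt::UART4.set_priority(Priority::P6);
    let spawner = EXECUTOR_HIGH.start(interrupt::UART4);
    spawner.spawn(high_prio_task()).unwrap();
}
```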
Lock-free is not wait-free
Lock-free data structures are very cool! They're used a lot in embedded Rust; you'll find them in e.g. thread-safe channels that pass values between tasks. The nice thing is that they don't require a mutex, which is more efficient.
Instead, some atomics are used, typically with a compare-exchange at the heart of it. That operation can fail on concurrent access, in which case it needs to be retried.
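A minimal sketch of a lock-free increment makes the retry visible:

```rust
use core::sync::atomic::{AtomicU32, Ordering};

static VALUE: AtomicU32 = AtomicU32::new(0);

fn lock_free_increment() {
    let mut current = VALUE.load(Ordering::Relaxed);
    loop {
        // compare_exchange fails if another context changed VALUE in the
        // meantime; we then reload and retry. Each attempt is fast, but
        // the number of attempts is unbounded under contention.
        match VALUE.compare_exchange(
            current,
            current + 1,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(_) => break,
            Err(actual) => current = actual,
        }
    }
}
```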
In theory, when contention is high, a task may never succeed in finishing the operation. It's unlikely to happen, but if people's lives are at risk, this is definitely something to consider.
Generics and cache
More and more microcontrollers have a cache on board, especially for the flash. A function executes faster when it's already in cache. This is no different from C.
However, just like C++ templates, generic Rust functions get monomorphized: every use of a function with a different combination of type parameters gets its own instantiation. So if you rely on a function being in cache, make sure it's the exact instantiation you're thinking of.
```rust
fn foo<T>() { /* ... */ }

foo::<&u32>();
foo::<&mut u32>(); // A different function than the one above
```
This can have an adverse effect on your worst-case execution time (WCET) calculation.
Implicit drop
RAII is a great concept and Rust uses it to the max. But drop is usually called implicitly; the compiler just inserts it for you. And a Drop impl can run arbitrary code. This means the concerns are similar to those of C++ destructors.
So make sure you know what you're dropping, in what order and what the drops do.
In hot paths, it may be worthwhile to take control and call drop manually or to defer the drop to another non-critical task.
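A small illustration of that last point (SensorReport and its Drop impl are made up):

```rust
struct SensorReport {
    samples: [u16; 32],
}

impl Drop for SensorReport {
    fn drop(&mut self) {
        // A Drop impl can run arbitrary code: flush a log, release a
        // peripheral, notify another task...
    }
}

fn hot_path() {
    let report = SensorReport { samples: [0; 32] };

    // ... time-critical work using `report` ...

    // Drop explicitly so the cost lands at a point we chose, instead of
    // implicitly (and invisibly) at the end of the scope.
    drop(report);

    // ... the rest of the hot path runs with no pending drops ...
}
```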
Things that are easier in Rust
Though we've just seen some pitfalls of Rust being a high-level language, that same high level can certainly also help us!
Dynamic allocations
When we use no_std (disabling the standard library), the alloc part of Rust is disabled too. This is great, because we want to avoid dynamic allocation anyway: making an allocation can take an unpredictable amount of time. And because no_std is the default mode of operation on embedded, the majority of the embedded Rust ecosystem doesn't use dynamic allocations.
Instead we have lots of great crates with fixed-size data structures, like heapless and embassy-sync, which are easy to plumb in thanks to Rust's traits, generics and slices.
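For example, heapless's Vec keeps its capacity in the type and its storage inline, so no allocator is ever involved (a minimal sketch):

```rust
use heapless::Vec;

fn collect_samples() {
    // The capacity (8) is part of the type; the storage lives inline.
    let mut samples: Vec<u16, 8> = Vec::new();

    // push returns an Err instead of reallocating when full.
    samples.push(42).unwrap();

    // It derefs to a slice, so it plugs into normal slice-based APIs.
    let _sum: u16 = samples.iter().sum();
}
```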
Error handling
The way to deal with errors is by returning Result and Option values. There's no need to worry about exceptions as you do in C++.
There still is panic, though, but that's only for unrecoverable errors and is not that different from a hardware exception (like a hardfault on cortex-m). This means it's not part of the normal flow of your program.
Good standard practice is to define a panic handler that optionally prints or stores the panic message and then escalates to a hardfault. In the hardfault handler you then do all the normal recovery, like stopping the motors or whatever your application needs.
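Sketched out for a cortex-m target (how you store or print the message is up to you):

```rust
use core::panic::PanicInfo;

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    // Optionally store or print `info` here, e.g. to a reserved RAM
    // region, so it can be inspected after the reset.
    let _ = info;

    // Escalate with an undefined instruction, which triggers a hardfault.
    // The actual recovery (stopping motors, safing outputs) lives in the
    // hardfault handler, shared with real hardware faults.
    cortex_m::asm::udf()
}
```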
That which does not run, does not have latency
Rust can do a lot at compile time. This can come in the form of macros, traits + generics and increasingly const functions.
Anything you can do ahead of time at compilation can't get in the way of reaching your deadline targets.
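For example, a lookup table built by a const function is computed entirely during compilation, so it contributes nothing to runtime latency (a minimal sketch):

```rust
// Runs at compile time: the finished table is baked into the binary.
const fn make_square_table() -> [u32; 16] {
    let mut table = [0u32; 16];
    let mut i = 0;
    while i < 16 {
        table[i] = (i as u32) * (i as u32);
        i += 1;
    }
    table
}

static SQUARES: [u32; 16] = make_square_table();
```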
Conclusion
It should be clear that creating real-time systems is no easy feat! And Rust, just like any high(er)-level language (like C++), can make it trickier to pull off.
As with any project, the language you select comes with trade-offs. But hopefully this blog post gives you a better sense of the things you can and should think about, so you're able to write some good real-time firmware in Rust.