Embedded async debugging and inspect-embassy
To try it out, see the GitHub repository: https://github.com/tweedegolf/inspect-embassy. Read on for more details.
The state of embedded async debugging
Oftentimes while working on a project made with Embassy, I change something and suddenly the program just seems to stop working. The problem can be very simple, like forgetting to send a message another task is waiting on, but finding out where it is waiting can be very hard. With current tools we don't have any insight into what is going on inside async functions. Trying to use a debugger like GDB will often just show the program counter sitting at some sort of sleep instruction, waiting for the next task to poll: not very helpful.
Usually, at this point, I start to add print statements all over my program just to try to find where the program got stuck. Only then can I start to set breakpoints in useful places. Sadly, as the project grows, all of this can become very time consuming.
As I see it, the above problems all stem from the same source: debuggers assume every core is only working on one thing at a time. While this is correct in a physical sense, it is often a level of abstraction too low when trying to understand the (mis)behavior of an async program. In these systems, it is often the interactions between different tasks that cause the problems.
In an ideal world
In an ideal world, executors would somehow communicate the full list of tasks to the debugger so that they could be shown in the same way the list of threads can be shown today. But for an ergonomic experience, there would also need to be a way to see the "backtraces" of all tasks, not just the one of the function currently being polled.
These "backtraces" of waiting async functions would ideally look and work the same as backtraces do for sync functions: allowing the inspection of local variables on every level of the stack and showing where functions up the stack were called from.
Screenshot of a GDB session of a sync program: Function arguments and call locations are visible.
The list of tasks and the backtraces of those tasks would allow you to see the full state of a system while debugging in a way that is already familiar. But additional functionality that is specific to async debugging would also be needed, like some way to see what tasks are waiting on what other tasks.
Being able to do all of that perfectly would require collaboration between Rust, async executors and debugging tools. While the ideal situation is ambitious and not within the scope of this project, a lot is possible using the current debug output. So I have created inspect-embassy as a first proof of concept.
A proof of concept
Inspect-embassy is a TUI tool that can be used with probe-rs or embedded inside GDB. It reads the debug information and the memory of the target to try to enhance the debugging experience so that it is closer to my ideal world. But a video says more than a thousand words, so here is a small demo of what inspect-embassy can do right now. Here it is used to see four futures waiting on a button being pressed in a join4:
Source code for the example and inspect-embassy can be found here:
https://github.com/tweedegolf/inspect-embassy/blob/main/test_crates/nrf52840-join/src/main.rs
As you can see in the video, inspect-embassy shows you the line number where an async fn is waiting. Then as a child item it shows you what the await point is waiting on (this can be another async fn, creating the "backtrace"). You are able to open up an async fn to see a representation of the memory layout, and what local variables it is storing. When using the GDB backend, GDB's pretty printer is used to format these values.
Select and join also have special handling, allowing you to see all the futures they are waiting on as child items. For a future inside a join you can inspect the value it resolved to while waiting on the others to complete.
All of the above updates live any time a task is polled, allowing you to not only see the full state of your system, but also how it behaves over time. For example: if in the above demo you accidentally use the wrong pin number for one of the buttons, it would be immediately obvious as the future not resolving when pressing the button.
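To make the join behaviour described above a bit more concrete, here is a minimal, std-only sketch of why a resolved value stays visible in memory while the other futures are still pending. It is not the code of inspect-embassy or of Embassy's join; the names `MaybeDone`, `Button`, `NoopWake`, `describe` and `demo` are made up for this illustration, and the "button" is just a flag.

```rust
use std::cell::Cell;
use std::fmt::Debug;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// One slot of a join: still running, or finished with its output kept in
/// memory until the other futures are done. (Illustrative type.)
enum MaybeDone<F: Future> {
    Pending(F),
    Done(F::Output),
}

impl<F: Future + Unpin> MaybeDone<F> {
    /// Polls the inner future if it is still running and stores its output.
    fn poll(&mut self, cx: &mut Context<'_>) {
        let output = match self {
            MaybeDone::Pending(fut) => match Pin::new(fut).poll(cx) {
                Poll::Ready(output) => output,
                Poll::Pending => return,
            },
            MaybeDone::Done(_) => return,
        };
        *self = MaybeDone::Done(output);
    }
}

/// What a debugger could show for a slot, based only on its value in memory.
fn describe<F>(slot: &MaybeDone<F>) -> String
where
    F: Future,
    F::Output: Debug,
{
    match slot {
        MaybeDone::Pending(_) => "pending".to_string(),
        MaybeDone::Done(value) => format!("done({:?})", value),
    }
}

/// Stand-in for "wait for a button press": ready once the flag is set.
struct Button<'a> {
    pressed: &'a Cell<bool>,
    id: u8,
}

impl Future for Button<'_> {
    type Output = u8;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u8> {
        if self.pressed.get() { Poll::Ready(self.id) } else { Poll::Pending }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls two slots before and after the first button is "pressed".
fn demo() -> Vec<String> {
    let (first, second) = (Cell::new(false), Cell::new(false));
    let mut a = MaybeDone::Pending(Button { pressed: &first, id: 1 });
    let mut b = MaybeDone::Pending(Button { pressed: &second, id: 2 });
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut snapshots = Vec::new();

    a.poll(&mut cx);
    b.poll(&mut cx);
    snapshots.push(format!("{} {}", describe(&a), describe(&b)));

    first.set(true);
    a.poll(&mut cx);
    b.poll(&mut cx);
    snapshots.push(format!("{} {}", describe(&a), describe(&b)));
    snapshots
}

fn main() {
    for line in demo() {
        println!("{}", line);
    }
}
```

The second snapshot shows the first slot as done with its value while the second slot is still pending, which is the kind of state that can be displayed per child item of a join.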
Inner workings
This section explains the internal workings of inspect-embassy; it expects a basic understanding of the async machinery in Rust. If this does not interest you, you can skip to the conclusion.
At startup, the debug data gets parsed into a model containing the memory layout of all async fn, join and select futures. The memory location of all task pools is also read from the debug data.
A breakpoint is then set at the end of the embassy-executor poll function. Every time it gets hit, the memory is read at the location of all the known task pools. This memory can then be parsed using the layouts gathered from the debug data, resulting in the human readable fields that are then displayed in the UI.
More information on the workings can be found in the Architecture.md file in the repository.
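The decoding step described above can be sketched roughly as follows. This is a heavily simplified illustration and not the actual data model of inspect-embassy: the types `FutureLayout`, `StateLayout` and `FieldLayout`, the `decode` function, the `BLINK` layout and all offsets and discriminant values are hypothetical, and every field is assumed to be a little-endian u32.

```rust
/// Hypothetical layout of one async fn future, as it might be reconstructed
/// from debug information.
struct FutureLayout {
    name: &'static str,
    /// Offset of the byte that says which state the future is in.
    discriminant_offset: usize,
    states: &'static [StateLayout],
}

/// One state of the future and the variables that are live in it.
struct StateLayout {
    discriminant: u8,
    label: &'static str,
    fields: &'static [FieldLayout],
}

/// A stored local variable; assumed to be a little-endian u32 in this sketch.
struct FieldLayout {
    name: &'static str,
    offset: usize,
}

/// Turns the raw bytes of a task into a human readable line.
/// Returns None if the bytes do not fit the layout.
fn decode(layout: &FutureLayout, mem: &[u8]) -> Option<String> {
    let discriminant = *mem.get(layout.discriminant_offset)?;
    let state = layout
        .states
        .iter()
        .find(|state| state.discriminant == discriminant)?;

    let mut out = format!("{}: {}", layout.name, state.label);
    for field in state.fields {
        let b = mem.get(field.offset..field.offset + 4)?;
        let value = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        out.push_str(&format!(", {} = {}", field.name, value));
    }
    Some(out)
}

/// Made-up layout for an async fn called `blink`.
const BLINK: FutureLayout = FutureLayout {
    name: "blink",
    discriminant_offset: 0,
    states: &[
        StateLayout { discriminant: 0, label: "not started", fields: &[] },
        StateLayout {
            discriminant: 3,
            label: "waiting at await point 0",
            fields: &[FieldLayout { name: "count", offset: 4 }],
        },
    ],
};

fn main() {
    // Bytes as they might be read from the target at the breakpoint.
    let memory = [3u8, 0, 0, 0, 5, 0, 0, 0];
    println!("{:?}", decode(&BLINK, &memory));
}
```

In this sketch the layout is a constant; in a real tool it would be built once at startup from the debug data and then applied to fresh memory every time the breakpoint is hit.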
Future work
As inspect-embassy is just a prototype at the moment, it only works with a single version of Embassy/Rust and it is not nearly as well integrated into the rest of the debugging workflow as I would have liked. There are some limitations in the rest of the ecosystem right now that make it difficult to get closer to the ideal world above:
- At the moment, the names and layout of the Embassy task pool types are hardcoded, so task detection would break with new Embassy versions.
- I was unable to find a way to use GDB's extension interface to add the extra information more seamlessly.
But probably the most difficult problem is custom implementations of the Future trait. To be able to give useful debug output of waiting futures, the debugger has to be able to understand what a future is doing from just its value in memory. But this is not possible to do in a general way with custom implementations of Future, like select and join.
Inspect-embassy is able to show what futures an async fn is waiting on because it knows the code generated by Rust will always poll the future it is storing at that moment. So it can always tell from the value in memory which future is being waited on (if a future is stored there, it is being waited on). Join and Select are also just special cases: inspect-embassy knows how these futures are supposed to work and can use that information together with their memory layout to know what futures they are waiting on.
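The point that an async fn stores the future it is awaiting can be observed with plain Rust, without any executor. The sketch below uses two made-up functions, `wait_for_button` and `blink`, and only measures the size of the futures they return; the exact sizes depend on the compiler version, so only the relation between them is of interest here.

```rust
use std::future::pending;
use std::mem::size_of_val;

/// Hypothetical inner async fn: keeps a buffer alive across its await point,
/// so the buffer has to be stored inside the future.
async fn wait_for_button(pin: u8) -> u8 {
    let presses = [pin; 64];
    pending::<()>().await; // never resolves; this future is only measured
    presses[0]
}

/// Hypothetical outer async fn: while suspended at the await point it has to
/// store the whole inner future inside its own state.
async fn blink() -> u8 {
    wait_for_button(13).await
}

/// Returns (size of the inner future, size of the outer future) in bytes.
fn sizes() -> (usize, usize) {
    let inner = wait_for_button(13);
    let outer = blink();
    (size_of_val(&inner), size_of_val(&outer))
}

fn main() {
    let (inner, outer) = sizes();
    println!("inner future: {} bytes, outer future: {} bytes", inner, outer);
}
```

Because the outer future contains the inner one, a tool that knows both layouts can walk from a task to the innermost await point, which gives the nested "backtrace" view.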
But all the above breaks down with custom implementations; it is then not possible to know what the future is doing from just the debug information and the value in memory. Right now inspect-embassy will show any unknown future as a leaf node not awaiting any other future, but this is not always correct. I do not yet know how to correctly handle, in a generic way, the cases where such a future is awaiting something else.
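As an illustration of why custom implementations are hard, consider the following made-up future. `WithFallback`, `NoopWake` and `poll_three_times` are hypothetical names for this sketch only. Both inner futures are always present in memory, and which one is actually being waited on is decided by logic inside `poll`, which is not described by the memory layout.

```rust
use std::future::{pending, ready, Future};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical custom future: polls `primary` for the first `budget` polls
/// and switches to `fallback` afterwards. Both fields always exist in memory.
struct WithFallback<A, B> {
    primary: A,
    fallback: B,
    budget: u32,
}

impl<A, B> Future for WithFallback<A, B>
where
    A: Future + Unpin,
    B: Future<Output = A::Output> + Unpin,
{
    type Output = A::Output;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let this = self.get_mut();
        // The meaning of `budget` only exists in this code. A debugger that
        // just sees the three fields cannot tell which future is awaited.
        if this.budget > 0 {
            this.budget -= 1;
            Pin::new(&mut this.primary).poll(cx)
        } else {
            Pin::new(&mut this.fallback).poll(cx)
        }
    }
}

/// A waker that does nothing; enough for polling by hand.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the future three times and records each result.
fn poll_three_times() -> Vec<Poll<u8>> {
    let mut fut = WithFallback {
        primary: pending::<u8>(),
        fallback: ready(7u8),
        budget: 2,
    };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    (0..3).map(|_| Pin::new(&mut fut).poll(&mut cx)).collect()
}

fn main() {
    println!("{:?}", poll_three_times());
}
```

A tool could of course add special handling for one such type, as is done for join and select, but that does not scale to arbitrary user-defined futures.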
And in a slightly different direction, all the tooling in inspect-embassy could in principle also be made to work with other async executors, like Tokio. This would be a lot more work, as these executors don't have fixed locations for tasks, and desktop programs are also a lot more likely to use dyn Futures or other forms of indirection, which would then need to be supported as well.
Conclusion
The debugging experience of async programs, and especially that of embedded ones, has some extra difficulties at the moment. But I don't think these are insurmountable. Thanks to the excellent debug output in Rust, it was possible to create a tool like inspect-embassy, getting us one step closer to parity with sync debugging.
But as mentioned, there is still work to do, so if you are working on an Embassy project, go and try inspect-embassy at https://github.com/tweedegolf/inspect-embassy/!
And if you have any ideas to improve the async debugging experience even further, feel free to contribute to inspect-embassy (e.g. by opening a PR, or starting a discussion) or maybe start your own tool!