Soundness and Unsafe Issues
An operating system necessarily must use unsafe code. This document explains the rationale behind some of the key mechanisms in Tock that do use unsafe code but should still preserve safety in the overall OS.
static_init!
The "type" of static_init! is basically:
T => (fn() -> T) -> &'static mut T
This means that, given a function that returns something of type T, static_init! returns a mutable reference to T with static lifetime.
This is effectively meant to be equivalent to declaring a mutable static variable:
static mut MY_VAR: SomeT = SomeT::const_constructor();
Then creating a reference to it:
let my_ref: &'static mut SomeT = &mut MY_VAR;
However, the initializer (rvalue) in a static declaration must be const (because Rust doesn't have pre-initialization sections). So static_init! basically allows static variables that have non-const initializers.
Note that in both of these cases, the caller must wrap the calls in unsafe, since referencing a mutable static variable is unsafe (due to aliasing rules).
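The idea can be sketched as a small macro. This is a minimal illustration under stated assumptions, not Tock's actual implementation: the names static_init_sketch!, Counter, and make_counter are hypothetical, the sketch uses std where kernel code would use core, and it omits the double-initialization check discussed under Soundness.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// Hypothetical simplified macro (not Tock's real static_init!): reserve a
// static, uninitialized buffer for $T, write the result of a non-const
// expression into it, and hand back a &'static mut $T.
macro_rules! static_init_sketch {
    ($T:ty, $e:expr) => {{
        static mut BUF: MaybeUninit<$T> = MaybeUninit::uninit();
        // Must be expanded inside an `unsafe` block: dereferencing a pointer
        // to a mutable static is unsafe.
        let slot: &'static mut MaybeUninit<$T> = &mut *addr_of_mut!(BUF);
        slot.write($e)
    }};
}

// Hypothetical type whose constructor is not a const fn.
pub struct Counter {
    pub count: u32,
}

impl Counter {
    pub fn new(start: u32) -> Counter {
        Counter { count: start }
    }
}

/// # Safety
/// Must be called at most once: a second call would create a second
/// &'static mut to the same static buffer.
pub unsafe fn make_counter(start: u32) -> &'static mut Counter {
    // `start` is only known at run time, so this initializer is not const.
    unsafe { static_init_sketch!(Counter, Counter::new(start)) }
}
```

Each expansion of the macro gets its own static buffer, so the aliasing hazard comes from running the same expansion twice, not from using the macro in two different places.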
Use
static_init! is used in Tock to initialize capsules, which will eventually reference each other. In all cases, these references are immutable. It is important for these to be statically allocated for two reasons. First, it helps surface memory pressure issues at link time (if they are allocated on the stack, they won't trivially show up as out-of-memory link errors if the stack isn't sized properly). Second, the lifetimes of mutually-dependent capsules need to be equal, and 'static is a convenient way of achieving this.
However, in a few cases, it is useful to start with a mutable reference in order
to enforce who can make certain calls. For example, setting up buffers in the
SPI driver is, for practical reasons, deferred until after construction but we
would like to enforce that it can only be called by the platform initialization
function (before the kernel loop starts). This is enforced because all
references after the platform is set up are immutable, and the config_buffers
method takes &mut self. (Note: it looks like this is not strictly necessary, so
it may not be a big deal if we can't do this.)
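A minimal sketch of this pattern follows. Only the method name config_buffers comes from the text above; Spi, Client, platform_init, and the buffer_len field are hypothetical stand-ins. It shows how a method taking &mut self becomes uncallable once the reference has been reborrowed immutably for 'static.

```rust
// Hypothetical driver; only the method name config_buffers comes from the text.
pub struct Spi {
    buffer_len: usize,
}

impl Spi {
    pub fn new() -> Spi {
        Spi { buffer_len: 0 }
    }

    // Requires exclusive access, so it is only callable while the caller
    // still holds the unique &mut reference.
    pub fn config_buffers(&mut self, len: usize) {
        self.buffer_len = len;
    }

    pub fn buffer_len(&self) -> usize {
        self.buffer_len
    }
}

// Hypothetical capsule that only ever sees a shared reference.
pub struct Client {
    pub spi: &'static Spi,
}

pub fn platform_init(spi: &'static mut Spi) -> Client {
    // Still mutable here, so configuration is allowed.
    spi.config_buffers(64);
    // Reborrow immutably for 'static. From this point on `spi` can no longer
    // be used mutably, so config_buffers cannot be called again.
    let shared: &'static Spi = spi;
    Client { spi: shared }
}
```

In a board file the &'static mut Spi would come from static_init!. A capsule that holds only the shared reference in Client has no way to reach config_buffers.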
Soundness
The thing that would make the use of static_init! unsafe is if it were used to create aliases to mutable references. The fact that it returns an &'static mut is a red flag, so it bears explaining why I think this is OK.
Just as with any &mut, once it is reborrowed it cannot be used for as long as the reborrow is live. What we do in Tock, specifically, is use it mutably in some cases immediately after calling static_init!, then reborrow it immutably to pass into capsules. If a particular capsule happened to accept a &mut, the compiler would try to move the reference and it would either fail that call (if it's already reborrowed immutably elsewhere) or disallow further reborrows. Note that this is fine if it is indeed not used as a shared reference (although I don't think we have examples of that use).
It is important, though, that the same code calling static_init! is not executed twice. Doing so would create two major issues. First, it could technically result in multiple mutable references. Second, it would run the constructor twice, which may create other soundness or functional issues with existing references to the same memory. I believe this is no different from code that takes a mutable reference to a static variable. To prohibit this, static_init! internally uses an Option-like structure to mark when the static buffer has been initialized, and causes a panic! if the same buffer is re-initialized (i.e. the same static_init! was called twice). With this check, we can mark static_init! as safe.
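The check described above can be sketched as a flag stored next to the uninitialized storage. This is an illustration only: InitOnce and its fields are hypothetical, and the use of an AtomicBool is an assumption made so the sketch is sound in an ordinary multi-threaded Rust program. It is not a description of Tock's internal mechanism.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical "Option-like" static buffer: a flag plus uninitialized storage.
pub struct InitOnce<T> {
    used: AtomicBool,
    slot: UnsafeCell<MaybeUninit<T>>,
}

// SAFETY: `slot` is only touched by the single caller that flips `used`
// from false to true, so sharing the wrapper is sound as long as T itself
// may be handed to that caller.
unsafe impl<T: Send> Sync for InitOnce<T> {}

impl<T: 'static> InitOnce<T> {
    pub const fn new() -> Self {
        InitOnce {
            used: AtomicBool::new(false),
            slot: UnsafeCell::new(MaybeUninit::uninit()),
        }
    }

    // Safe to call: a second call panics instead of creating a second
    // mutable reference or re-running the constructor over live data.
    pub fn init(&'static self, value: T) -> &'static mut T {
        if self.used.swap(true, Ordering::AcqRel) {
            panic!("static buffer initialized twice");
        }
        // SAFETY: the swap above returns false exactly once, so this is the
        // only mutable reference ever created to the storage.
        let slot: &'static mut MaybeUninit<T> = unsafe { &mut *self.slot.get() };
        slot.write(value)
    }
}
```

The design choice is to turn a possible aliasing bug into a deterministic panic at initialization time, which is what lets the initializing call be exposed as safe.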
Alternatives
It seems technically possible to return an immutable static reference from static_init! instead. It would require some code changes, and wouldn't allow us to restrict certain capsule methods to initialization, but that may not be a particularly big deal.
Another alternative would be to use static variables of type Option everywhere (ugly, but maybe reasonable).
Capabilities: Restricting Access to Certain Functions and Operations
Certain operations and functions, particularly those in the kernel crate, are
not "unsafe" from a language perspective, but are unsafe from an isolation and
system operation perspective. For example, restarting a process, conceptually,
does not violate type or memory safety (even though the specific implementation
in Tock does), but it would violate overall system safety if any code in the
kernel could restart any arbitrary process. Therefore, Tock must be careful with
how it provides a function like restart_process(), and, in particular, must
not allow capsules, which are untrusted code that must be sandboxed by Rust, to
have access to the restart_process() function.
Luckily, Rust provides a primitive for doing this restriction: use of the unsafe keyword. Any function marked as unsafe can only be called from a different unsafe function or from an unsafe block. Therefore, by removing the ability to define an unsafe block, using the #![forbid(unsafe_code)] attribute in a crate, no module in that crate can call any function marked with unsafe. In the case of Tock, the capsules crate is marked with this attribute, and therefore capsules cannot use unsafe functions. While this approach is effective, it is very coarse-grained: it provides either access to all unsafe functions or none. To provide more nuanced control, Tock includes a mechanism called Capabilities.
Capabilities are essentially zero-memory objects that are required to call
certain functions. Abstractly, restricted functions, like restart_process(),
would require that the caller has a certain capability:
fn restart_process(process_id: usize, capability: ProcessRestartCapability) {}
Any attempt to call that function without possessing that capability would
result in code that does not compile. To prevent unauthorized uses of
capabilities, capabilities can only be created by trusted code. In Tock, this is
implemented by defining capabilities as unsafe traits, which can only be
implemented for an object by code that is allowed to use unsafe. Therefore, code
in the untrusted capsules crate cannot generate a capability on its own, and
instead must be passed the capability by a module in a different crate.
Capabilities can be defined for very broad purposes or very narrowly, and code can "request" multiple capabilities. Multiple capabilities in Tock can be passed by implementing multiple capability traits for a single object.
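The mechanism can be sketched in a few lines. Everything here is hypothetical apart from the names restart_process, load_processes, and ProcessRestartCapability, which come from the text. ProcessLoadCapability and BoardCapabilities are invented for illustration, the function bodies are placeholders, and passing the capability as a generic reference is one possible choice, not necessarily the signature Tock uses.

```rust
// Hypothetical capability traits. Because they are `unsafe` traits, only code
// that is allowed to write `unsafe impl` can create a type that carries them.
pub unsafe trait ProcessRestartCapability {}
pub unsafe trait ProcessLoadCapability {}

// Restricted functions: a call compiles only when the caller passes a value
// whose type implements the required capability trait.
pub fn restart_process<C: ProcessRestartCapability>(process_id: usize, _capability: &C) -> usize {
    process_id // placeholder: a real kernel would restart the process here
}

pub fn load_processes<C: ProcessLoadCapability>(_capability: &C) -> usize {
    0 // placeholder: number of processes loaded
}

// Trusted (board-level) code: one zero-sized type that carries both
// capabilities by implementing both traits.
pub struct BoardCapabilities;
unsafe impl ProcessRestartCapability for BoardCapabilities {}
unsafe impl ProcessLoadCapability for BoardCapabilities {}
```

A caller holding a BoardCapabilities value can invoke either function, while a call site without such a value has nothing of a suitable type to pass and fails to compile.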
Capability Examples
- One example of how capabilities are useful in Tock is with loading processes. Loading processes is left as a responsibility of the board, since a board may choose to handle its processes in a certain way, or not support userland processes at all. However, the kernel crate provides a helpful function called load_processes() that provides the Tock standard method for finding and loading processes. This function is defined in the kernel crate so that all Tock boards can share it, which necessitates that the function be made public. This has the effect that all modules with access to the kernel crate can call load_processes(), even though calling it twice would lead to unwanted behavior. One approach is to mark the function as unsafe, so only trusted code can call it. This is effective, but not explicit, and conflates language-level safety with system operation-level safety. By instead requiring that the caller of load_processes() has a certain capability, the expectations of the caller are more explicit, and the unsafe function does not have to be repurposed.

- A similar example is a function like restart_all_processes() which causes all processes on the board to enter a fault state and restart from their original _start point with all grants removed. Again, this is a function that could violate the system-level goals, but could be very useful in certain situations or for debugging grant cleanup when apps fail. Unlike load_processes(), however, it might make sense for a capsule to be able to call restart_all_processes(), in response to a certain event or to act as a watchdog. In that case, restricting access by marking it as unsafe will not work: capsules cannot call unsafe code. By using capabilities, only a caller with the correct capability can call restart_all_processes(), and individual boards can be very explicit about which capsules they grant which capabilities.
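The second example can be sketched as follows, under stated assumptions. Only restart_all_processes comes from the text. RestartAllCapability, the capsules module, Watchdog, WatchdogRestartCapability, and board_setup are hypothetical, and the function body is a placeholder. The inline module stands in for a separate crate that carries #![forbid(unsafe_code)] at its root.

```rust
// Hypothetical capability trait and restricted kernel function.
pub unsafe trait RestartAllCapability {}

pub fn restart_all_processes<C: RestartAllCapability>(_capability: &C) -> &'static str {
    "restarted" // placeholder for faulting and restarting every process
}

// Stand-in for the capsules crate. Unsafe code is forbidden here, so nothing
// in this module can write `unsafe impl RestartAllCapability`.
#[forbid(unsafe_code)]
pub mod capsules {
    use super::{restart_all_processes, RestartAllCapability};

    // Hypothetical watchdog capsule: it cannot create a capability itself,
    // it can only store one that trusted code handed to it.
    pub struct Watchdog<C: RestartAllCapability> {
        capability: C,
    }

    impl<C: RestartAllCapability> Watchdog<C> {
        pub fn new(capability: C) -> Self {
            Watchdog { capability }
        }

        pub fn on_timeout(&self) -> &'static str {
            restart_all_processes(&self.capability)
        }
    }
}

// Board code: the only place that mints the capability and decides which
// capsule receives it.
pub struct WatchdogRestartCapability;
unsafe impl RestartAllCapability for WatchdogRestartCapability {}

pub fn board_setup() -> capsules::Watchdog<WatchdogRestartCapability> {
    capsules::Watchdog::new(WatchdogRestartCapability)
}
```

A capsule that the board does not hand a capability to has no value it could pass to restart_all_processes(), so the grant is visible in one place: the board's setup code.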