Kernel Attacks on the Encryption Service
In this last submodule of the HWRoT course, we'll explore how Tock's kernel-level isolation mechanisms help protect sensitive operations in a HWRoT context.
Our previous attempt at an attack on the HWRoT encryption service—an SRAM dumping attack—assumed that we were able to load a malicious application. As we saw, Tock's process-level isolation guarantees prevented the malicious application from being able to compromise other processes.
But what if the attacker tries to compromise the kernel itself? To give our attacker even more of an advantage this time, let's assume that a hypothetical attacker of our HWRoT might try to slip some questionable logic into a kernel driver, and see how Tock provides defense-in-depth via language-based isolation at the driver level.
NOTE: For a full description of Tock's threat model and what forms of isolation it's intended to provide, see the Tock Threat Model page elsewhere in the Tock Book.
Background
Rust Traits and Generics
The Rust programming language (which the Tock kernel is written in) allows for defining methods on structs and enums, similar to class methods in many languages.
Following this analogy, Rust traits are like interfaces in other languages: they let you specify shared behavior between types. For instance, the Clone trait in Rust roughly looks like

    pub trait Clone {
        fn clone(&self) -> Self;
    }

which indicates that any type that implements the Clone trait needs to provide an implementation of clone returning something of its own type (Self) given a reference to itself (&self). Implementations are provided in impl blocks: for instance, to implement the above trait, you might write something like

    struct MyStruct { ... }

    impl Clone for MyStruct {
        fn clone(&self) -> Self { ... }
    }
Types can be bound by traits: for instance, a function signature like

    fn duplicate<C: Clone>(value: C) { ... }

indicates that duplicate() is generic over any type C that implements the Clone trait, and that the input to duplicate() will be of this type C.
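To make the trait-bound idea concrete, here is a minimal runnable sketch. The Key type and the tuple return value are assumptions made for illustration; they are not part of the tutorial code.

```rust
/// Hypothetical type used only for this illustration.
#[derive(Debug, PartialEq)]
struct Key {
    bytes: [u8; 4],
}

// A hand-written implementation of the `Clone` trait for `Key`.
impl Clone for Key {
    fn clone(&self) -> Self {
        Key { bytes: self.bytes }
    }
}

/// Generic over any type `C` that implements `Clone`.
/// Returns the original value together with one copy of it.
fn duplicate<C: Clone>(value: C) -> (C, C) {
    let copy = value.clone();
    (value, copy)
}

fn main() {
    // Works for our own type...
    let (a, b) = duplicate(Key { bytes: [1, 2, 3, 4] });
    assert_eq!(a, b);

    // ...and for any standard type that is `Clone`, such as `u32`.
    let (x, y) = duplicate(7u32);
    assert_eq!(x, y);
}
```

Calling duplicate() with a type that does not implement Clone is rejected at compile time, which is the same mechanism the kernel relies on later in this submodule.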
As a last note, traits can be marked as unsafe to denote that any implementation of such a trait must uphold invariants that the Rust compiler can't verify. One common example is the Sync trait, which types can implement to indicate that they're safe to share between threads. Because such invariants can't be compiler-verified, the Rust compiler requires implementations of these traits to be marked as unsafe as well, e.g.

    struct MyStruct { ... }

    unsafe impl Sync for MyStruct {}
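As a rough illustration of why a trait might be marked unsafe, the sketch below defines a hypothetical marker trait, ZeroInit, whose implementors promise that an all-zero bit pattern is a valid value. Both the trait and the zeroed() helper are assumptions made up for this example; the compiler cannot check the promise, so it demands the unsafe keyword on every impl.

```rust
/// Hypothetical marker trait: implementing it is a promise that a value
/// made entirely of zero bytes is valid for the type. The compiler cannot
/// verify that promise, hence `unsafe trait`.
unsafe trait ZeroInit: Sized {}

// Each implementation must be written as `unsafe impl`; omitting the
// keyword is a compile error.
unsafe impl ZeroInit for u32 {}
unsafe impl ZeroInit for [u8; 4] {}

/// Safe wrapper that relies on the promise made by implementors.
fn zeroed<T: ZeroInit>() -> T {
    // SAFETY: `T: ZeroInit` means the implementor vouched that all-zero
    // bytes form a valid `T`.
    unsafe { std::mem::zeroed() }
}

fn main() {
    let n: u32 = zeroed();
    let buf: [u8; 4] = zeroed();
    assert_eq!(n, 0);
    assert_eq!(buf, [0u8; 4]);
}
```

The unsafe impl lines are the places a reviewer would need to audit, which is the property Tock's capability traits build on.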
Submodule Overview
Our goal in this submodule is to modify an existing kernel capsule to "slip in" a function call that a (malicious) userspace app can trigger that compromises the overall system integrity. To make this a subtle attack, the attacker wants to hide this new function call in the kernel so that when the board maintainer updates to a new version of Tock the attack is present in the kernel.
For demonstration, we will insert a call to hardfault_all_apps(). This is of course a sensitive API designed exclusively for testing. This API should not be accessible to userspace, but we will see if an attacker can expose this to userspace without the board maintainer knowing about the change.
Milestones
We additionally have two small milestones in this section: one to sneak some logic into our encryption oracle driver, and then one to add an application which uses it.
- Milestone one adds a minimal bit of logic to the encryption oracle driver which a userspace application can use to fault all running applications, but with the caveat that it requires the board definition to explicitly give it that permission.
- Milestone two adds a userspace application to trigger this driver, and then demonstrates how Tock performs language-level access control to capabilities which the Tock board definition has to explicitly grant.
Starter Code
Again as in the previous section, we have some starter code in libtock-c. The only new directory we'll use is the questionable_service/ subdirectory in libtock-c/examples/tutorials/root_of_trust/.
To launch this 'questionable' service, which we'll use to trigger the fault all processes driver, simply navigate as per the previous submodules to the Questionable service in the on-device menu, select it, and then select Start as usual.
Milestone One: Adding the Fault All Processes Driver
As a first step, we'll need to add some logic to our encryption oracle capsule. Open tock/capsules/extra/src/tutorials/encryption_oracle_chkpt5.rs (the completed encryption oracle driver) and do the following:
- First, we need to ensure our compromised driver has a reference to the kernel, as well as a capability of generic type C. This capability will be necessary in a second, but for now we take it for granted. Down where the EncryptionOracleDriver struct is, add a new type parameter C: ProcessManagementCapability, and then a kernel and capability member:

      pub struct EncryptionOracleDriver<'a, A: AES128<'a> + AES128Ctr, C: ProcessManagementCapability> {
          kernel: &'static Kernel,
          capability: C,
          aes: &'a A,
          process_grants: Grant<
              ProcessState,
              ...
          >,
          ...
      }

  Don't forget to add an import for kernel::capabilities::ProcessManagementCapability and kernel::Kernel as well at the top of the file:

      use core::cell::Cell;
      use kernel::capabilities::ProcessManagementCapability;
      use kernel::grant::{AllowRoCount, AllowRwCount, Grant, UpcallCount};
      ...
      use kernel::{ErrorCode, Kernel};
      ...
- Next, now that we've added a new type parameter to EncryptionOracleDriver, we'll need to update every impl block for it so that each one supplies the new parameter. In the impl block just below our newly-modified struct definition, we'll change

      impl<'a, A: AES128<'a> + AES128Ctr> EncryptionOracleDriver<'a, A> { ... }

  to

      impl<'a, A: AES128<'a> + AES128Ctr, C: ProcessManagementCapability> EncryptionOracleDriver<'a, A, C> { ... }

  Later in the file, you'll also want to change

      impl<'a, A: AES128<'a> + AES128Ctr> SyscallDriver for EncryptionOracleDriver<'a, A> { ... }

  to

      impl<'a, A: AES128<'a> + AES128Ctr, C: ProcessManagementCapability> SyscallDriver for EncryptionOracleDriver<'a, A, C> { ... }

  and

      impl<'a, A: AES128<'a> + AES128Ctr> Client<'a> for EncryptionOracleDriver<'a, A> { ... }

  to

      impl<'a, A: AES128<'a> + AES128Ctr, C: ProcessManagementCapability> Client<'a> for EncryptionOracleDriver<'a, A, C> { ... }
- Now, we need to change our new() associated function to accept a reference to the kernel as well as an instance of our desired capability. Add kernel and capability as new arguments to new(), and use them to construct the returned EncryptionOracleDriver:

      /// Create a new instance of our encryption oracle userspace driver:
      pub fn new(
          kernel: &'static Kernel,
          capability: C,
          aes: &'a A,
          source_buffer: &'static mut [u8],
          ...
      ) -> Self {
          EncryptionOracleDriver {
              kernel,
              capability,
              process_grants,
              aes,
              ...
          }
      }
      ...
- Lastly, we want to sneak in our new logic. In the definition of command() is a large match statement that causes our EncryptionOracleDriver to exhibit different behavior when it receives a command based on the value of command_num. We'll add a new branch for command number 2 to fault every application.

      impl<'a, A: AES128<'a> + AES128Ctr, C: ProcessManagementCapability> SyscallDriver for EncryptionOracleDriver<'a, A, C> {
          fn command(
              &self,
              command_num: usize,
              ...
          ) -> CommandReturn {
              match command_num {
                  ...
                  // Request the decryption operation:
                  1 => { ... }

                  // Hardfault all applications
                  2 => {
                      self.kernel.hardfault_all_apps(&self.capability);
                      CommandReturn::success()
                  }

                  // Unknown command number, return a NOSUPPORT error
                  _ => CommandReturn::failure(ErrorCode::NOSUPPORT),
              }
          }
      }
With this, our changes to the driver are complete! Whenever it receives a Command syscall with command number 2, it should fault every application.
If we take a look at the implementation of the Kernel::hardfault_all_apps() function we used, we'll see that it has signature

    pub fn hardfault_all_apps<C: capabilities::ProcessManagementCapability>(&self, _c: &C) { ... }

which indicates that, to be called, it must be passed a reference to a value of some generic type C which implements the trait capabilities::ProcessManagementCapability.
As such, we'll need to do two things when modifying our board definition:
- Create a new type (say EncryptionOracleCapability) implementing the capabilities::ProcessManagementCapability trait.
- Instantiate our new driver and provide it with an instance of our EncryptionOracleCapability type.
Opening boards/tutorials/nrf52840dk-root-of-trust-tutorial/src/main.rs, we can get started.
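Before editing the board file, it may help to see the whole capability pattern in miniature. The following is a toy model, not Tock code: Kernel, Driver, BoardCapability, and the string error are all stand-ins assumed for illustration, and the "fault" is just a flag being set. Only the shape of hardfault_all_apps() and the command number 2 mirror the text above.

```rust
use std::cell::Cell;

/// Stand-in for `kernel::capabilities::ProcessManagementCapability`.
pub unsafe trait ProcessManagementCapability {}

/// Toy stand-in for Tock's `Kernel`; it only records that a fault was requested.
pub struct Kernel {
    faulted: Cell<bool>,
}

impl Kernel {
    pub fn new() -> Self {
        Kernel { faulted: Cell::new(false) }
    }

    /// Callable only by code that holds a value of a capability type.
    pub fn hardfault_all_apps<C: ProcessManagementCapability>(&self, _c: &C) {
        self.faulted.set(true);
    }

    pub fn faulted(&self) -> bool {
        self.faulted.get()
    }
}

/// Toy driver: it never creates a capability, it can only store one it is handed.
pub struct Driver<'k, C: ProcessManagementCapability> {
    kernel: &'k Kernel,
    capability: C,
}

impl<'k, C: ProcessManagementCapability> Driver<'k, C> {
    pub fn new(kernel: &'k Kernel, capability: C) -> Self {
        Driver { kernel, capability }
    }

    /// Simplified command dispatch (hypothetical return type).
    pub fn command(&self, command_num: usize) -> Result<(), &'static str> {
        match command_num {
            2 => {
                self.kernel.hardfault_all_apps(&self.capability);
                Ok(())
            }
            _ => Err("NOSUPPORT"),
        }
    }
}

/// "Board" side: the only place an `unsafe impl` appears.
struct BoardCapability;
unsafe impl ProcessManagementCapability for BoardCapability {}

fn main() {
    let kernel = Kernel::new();
    let driver = Driver::new(&kernel, BoardCapability);

    assert!(driver.command(7).is_err());
    assert!(!kernel.faulted());

    assert!(driver.command(2).is_ok());
    assert!(kernel.faulted());
}
```

The driver side of this sketch contains no unsafe at all; the sensitive call only becomes reachable because the board side chose to mint a capability and pass it in.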
- First, let's define our new EncryptionOracleCapability type. In Tock, the ProcessManagementCapability we need to implement is defined as follows:

      /// The `ProcessManagementCapability` allows the holder to control
      /// process execution, such as related to creating, restarting, and
      /// otherwise managing processes.
      pub unsafe trait ProcessManagementCapability {}

  This is an unsafe trait with no methods, so we won't have to do much to implement it. Add the following right above the definition of struct Platform in our main.rs:

      struct EncryptionOracleCapability;
      unsafe impl capabilities::ProcessManagementCapability for EncryptionOracleCapability {}

  Note that if you don't include the unsafe in the second line, rustc will error, stating that the trait in question requires an `unsafe impl` declaration.
- Now, let's tweak our platform to indicate that the oracle driver takes in an EncryptionOracleCapability. You'll want to modify the Platform struct definition to read as follows:

      struct Platform {
          base: nrf52840dk_lib::Platform,
          screen: &'static ScreenDriver,
          oracle: &'static capsules_extra::tutorials::encryption_oracle_chkpt5::EncryptionOracleDriver<
              'static,
              nrf52840::aes::AesECB<'static>,
              EncryptionOracleCapability,
          >,
      }
- Lastly, in the actual main() function, just above the block comment indicating PLATFORM SETUP, SCHEDULER, AND KERNEL LOOP, you'll want to modify the initialization of the encryption oracle driver to include our reference to the kernel and an instance of our capability.

      let oracle = static_init!(
          capsules_extra::tutorials::encryption_oracle_chkpt5::EncryptionOracleDriver<
              'static,
              nrf52840::aes::AesECB<'static>,
              EncryptionOracleCapability,
          >,
          capsules_extra::tutorials::encryption_oracle_chkpt5::EncryptionOracleDriver::new(
              board_kernel,
              EncryptionOracleCapability {},
              &nrf52840_peripherals.nrf52.ecb,
              aes_src_buffer,
              ...
          ),
      );
You should now be able to build and install the kernel as usual; not much should be noticeably different until the next step.
Milestone Two: Triggering the Fault All Processes Driver
To actually trigger the driver, we'll need to send it a Command syscall with command number 2. We'll do this in two simple steps. To start, back in libtock-c, rename questionable_service_starter/ to just questionable_service/.
If you get stuck, see questionable_service_milestone_one/.
- First, as with the previous submodules, copy your implementations of wait_for_start(), setup_logging(), and log_to_screen(), and call wait_for_start() and setup_logging() at the top of main().
- Now, change main to perform the following:
  - Log to the screen that all apps are about to be hardfaulted
  - Trigger the hardfault driver using command(), i.e.

        syscall_return_t cr = command(/* driver num */ 0x99999, /* command num */ 2, 0, 0);

  - Add another log to screen (can be anything; this should never be reached, as the app should have already faulted)
Install and run the application. You should see that the first log appears, but the second one never does. A fault dump should instead appear over the tockloader listen console.
Now that we have a working setup, one question might be whether we can make do without adding a noisy unsafe impl in our board definition file main.rs, likely the first file someone would inspect.
One idea might be to move the unsafe impl into our driver code. Unfortunately, if we try that, e.g. by moving the struct definition into the driver source (encryption_oracle_chkpt5.rs) and changing our Kernel::hardfault_all_apps() call to

    struct EncryptionOracleCapability;
    unsafe impl ProcessManagementCapability for EncryptionOracleCapability {}

    impl<'a, A: AES128<'a> + AES128Ctr, C: ProcessManagementCapability> SyscallDriver for EncryptionOracleDriver<'a, A, C> {
        fn command(
            &self,
            command_num: usize,
            ...
        ) -> CommandReturn {
            match command_num {
                ...
                // Hardfault all applications
                2 => {
                    self.kernel.hardfault_all_apps(&EncryptionOracleCapability {});
                    CommandReturn::success()
                }
                ...
            }
        }
    }

then rustc will error, noting implementation of an `unsafe` trait.
Indeed, Tock drivers (and capsules in general!) cannot make use of unsafe constructs, so any capabilities given to them must come from the board definition, where they can be more carefully audited. This makes following the expected access control policy a prerequisite for the kernel to compile.
Along with capabilities, disallowing unsafe code in drivers has many other positive isolation effects. For instance, without access to unsafe, drivers cannot use core functions like core::slice::from_raw_parts() to construct slices to directly access memory, meaning they can only make use of memory explicitly granted to them.
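As a small sketch of that last point, compare a safe, bounds-checked slice view with one built through from_raw_parts(). The helper names safe_window() and raw_window() are hypothetical. The second function only compiles because of its unsafe block; in a crate that forbids unsafe code (as Tock's capsule crates are understood to do via a crate-level lint), it would be rejected outright.

```rust
use std::slice;

/// Safe, bounds-checked view into `buf`; `None` if the range does not fit.
fn safe_window(buf: &[u8], start: usize, len: usize) -> Option<&[u8]> {
    let end = start.checked_add(len)?;
    buf.get(start..end)
}

/// View of the first `len` bytes, built from a raw pointer.
/// `from_raw_parts` is an `unsafe fn`, so the `unsafe` block is mandatory.
fn raw_window(buf: &[u8], len: usize) -> &[u8] {
    assert!(len <= buf.len());
    // SAFETY: `buf.as_ptr()` is valid for `buf.len()` bytes and `len` was
    // checked above. Nothing stops a careless or malicious caller of
    // `from_raw_parts` from passing any pointer and length instead.
    unsafe { slice::from_raw_parts(buf.as_ptr(), len) }
}

fn main() {
    let buf = [10u8, 20, 30, 40];

    // The safe API refuses to hand out memory beyond the buffer.
    assert_eq!(safe_window(&buf, 1, 2), Some(&buf[1..3]));
    assert_eq!(safe_window(&buf, 3, 5), None);

    // The raw API is only sound here because of the explicit length check.
    assert_eq!(raw_window(&buf, 2), &buf[..2]);
}
```

With unsafe unavailable, a driver is limited to the safe form and therefore to buffers it was actually given.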
For more details on Tock's isolation mechanisms, see the Tock Design page on the website, as well as the EuroSec 2022 paper Tiered Trust for Useful Embedded Systems Security.