
A Rusty CHERI - The path to hardware capabilities in Rust


Formal Metadata

Title
A Rusty CHERI - The path to hardware capabilities in Rust
Subtitle
A status report on ongoing efforts to support CHERI architectures in Rust
Title of Series
Number of Parts
542
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
CHERI defines hardware extensions to encode access constraints on pointers, enabling hardware enforcement of such restrictions based on metadata stored alongside pointers. There is an ongoing drive to support compiling Rust code in a way that can make use of these extensions. Doing this provides another layer of protection. We can encode knowledge about provenance validity, bounds and other access restrictions that the compiler (and OS/etc.) knows about in a way the hardware can enforce at runtime. The Rust memory model is famous for being able to enforce these types of restrictions at compile time, but not for unsafe Rust code. Unsafe Rust code sometimes needs to be written, which leads to situations that can only be verified at runtime. Some other nice benefits could come from this work. For example, runtime bounds checking can now be done by hardware rather than software, and since provenance information is necessary for operations on capabilities, closing gaps where it is not currently preserved forms a part of this work. This talk is a discussion of what is required for this support, and gives an overview of the state of the various attempts to implement it.
Transcript: English(auto-generated)
So, my talk is about modifying the Rust compiler to support CHERI's hardware capabilities. I'm going to start off with a brief introduction. My name is Lewis Revill and I work for a company
called Embecosm. I work on many things, but I'd say I specialise in developing LLVM backends for constrained or unusual architectures. Embecosm itself is a software services company. We operate on the boundary between hardware and software, particularly in the embedded space,
where you can find many unusual, difficult and interesting problems, like writing compilers. So, what is CHERI? It's an acronym: Capability Hardware Enhanced RISC Instructions. It's best described as an instruction set extension which can be adapted and applied to different architectures. The main feature of CHERI is that you can encode
access constraints on memory addresses using things called capabilities. Capabilities essentially have metadata alongside memory addresses that allows you to specify these access constraints. Capabilities can only be operated on using capability operations, which replace the
normal pointer operations, and these operations utilise the metadata to enforce those access constraints. It's worth pointing out there are two modes of operation for CHERI. There's purecap mode, where all pointers are capabilities, and in hybrid mode you have
pointers that are by default normal pointers, but capabilities are annotated as such in the source code. So, capabilities together with capability operations allow you to enforce spatial, referential and temporal safety in the hardware at runtime. Spatial safety is to do with
disallowing accesses out of bounds of an original allocation. Referential safety is disallowing accesses without valid provenance, and temporal safety means that if the lifetime of an object is over, you can no longer access it through a capability. So, what about
integrating CHERI and Rust? Well, we're working on this as part of a project which is led by our customer CyberHive, and they're funded in turn by Digital Security by Design, which is a UK government initiative. CyberHive want to use CHERI hardware to enhance secure network
protocols that are written in Rust. So, the goal for us is to produce a Rust compiler that's capable of targeting CHERI-based architectures, with the long-term goal of a stable compiler that can produce production-ready code for security purposes, and we know that we're
initially going to be targeting Arm's Morello platform. So, other than being able to compile existing Rust code for CHERI, what's the motivation behind integrating CHERI and Rust? Essentially, it boils down to another layer of protection. We know that Rust is good at
identifying and enforcing access constraints at compile time, but with CHERI you can identify constraints at compile time and enforce them in hardware at runtime. So, a good example is that Rust code annotated with unsafe is often a necessity in many real-world projects, which
means that it could behave badly, but we don't know until runtime. With CHERI you can prevent this bad behavior in hardware when it occurs at runtime. There are some other small side benefits, such as replacing slow software bounds checks with hardware bounds checking and replacing pointer-plus-length types with CHERI capabilities. So, to make things clearer, I have a motivating
example. So, say we want to add a dynamic offset to a pointer and then load from that pointer. Well, this needs to be done in an unsafe block because we don't know until runtime if it's
going to do something bad. Without CHERI you could end up accessing out of range of your original allocated array, but with CHERI that access will not occur at runtime and the hardware will either panic or give you a default value. So,
now that we know that we want these benefits, how do we go about modifying Rust to get them? The main problem is that we need to account for capability sizes correctly. That is, we need to stop assuming that the pointer type size is equal to the addressable range of the pointer. Because capabilities have metadata, this isn't the case. Also in LLVM,
in the CHERI LLVM fork, capabilities are pointers in address space 200, whereas in Rust it seems like we assume that all pointers to data are in address space zero. Also, if we want to support
hybrid mode, we need to be able to specify different pointer type sizes for different address spaces, so address space zero will have different sizes from address space 200. One thing I hope doesn't require many changes is that we need provenance and bounds to be propagated through the compiler, because they need to be attached to capabilities.
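
The motivating example and the size assumption described above can be sketched in a few lines of Rust. This is an illustration only, not code from the talk or from the modified compiler: the names `load_at` and `pointer_is_address_sized` are hypothetical, and the comments about CHERI behaviour restate what the talk describes rather than something this snippet demonstrates on its own.

```rust
use std::mem::size_of;

/// Loads the element `offset` positions past `base`.
/// (Hypothetical helper, named here for illustration.)
///
/// # Safety
/// `base` must point into a live `u32` array and `offset` must stay
/// within that array. The compiler cannot check this, which is why
/// the load has to sit in unsafe code.
unsafe fn load_at(base: *const u32, offset: usize) -> u32 {
    // On a conventional target an out-of-range `offset` is undefined
    // behaviour and may read adjacent memory. Per the talk, a CHERI
    // capability carrying the array's bounds would stop that access
    // in hardware at runtime.
    unsafe { *base.add(offset) }
}

/// True when a data pointer occupies exactly as many bytes as `usize`.
/// This holds on conventional targets; it is the assumption that stops
/// holding once a pointer also carries capability metadata.
fn pointer_is_address_sized() -> bool {
    size_of::<*const u8>() == size_of::<usize>()
}
```

For instance, `unsafe { load_at(data.as_ptr(), 2) }` on `[10u32, 20, 30, 40]` is in bounds and yields 30, while an offset of 4 would be exactly the kind of access that only shows up at runtime.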
And of course, if we want the optional bonus stuff, we need to implement that as well. Progress so far: the data layout changes are completed, which means that we can correctly specify capability sizes, both the type size and the addressable range, for both purecap and
hybrid mode. I've modified APIs which produce pointer types to get rid of the assumption that pointers are in address space zero, so those APIs now require an explicit address space parameter. And the biggest change is that
for APIs where we report a size for a type, this is replaced with a total type size and a size of the value that you can represent. And yeah, this means that, like I said before, we can support CHERI capabilities. There's also, in the strict provenance
API, an explicit unsafe method of producing pointers with no provenance from a usize. And for CHERI we need to use CHERI operations to set the address of a null capability to achieve the same result. What I'm currently working through is trawling through assertion
failures that come up when building the core libraries with this modified compiler. What still needs to be done? Well, there's almost definitely going to be modifications to the libraries to remove any assumptions that break for CHERI. There's also the question of how do
we specify capability types in hybrid mode, because I don't think that Rust annotations are the right tool to specify a specific pointer as being a capability; I think this requires a different approach. I also replaced a size with a type size and added a size of the value that you can represent.
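
The split between a total type size and the size of the value you can represent can be sketched as a small data structure. This is a hypothetical model, not the compiler's actual data layout code: `PointerLayout`, its fields, and `layout_for` are invented for illustration. The figures assume a 128-bit capability holding a 64-bit address, as on Morello, and the out-of-band tag bit is not modelled.

```rust
/// Hypothetical description of one pointer kind in a target's data layout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PointerLayout {
    /// Address space the pointer lives in (0 = plain, 200 = capability).
    address_space: u32,
    /// Total bits the pointer occupies in memory, metadata included.
    type_size_bits: u32,
    /// Bits of that total that hold the address itself (1..=128).
    value_size_bits: u32,
}

impl PointerLayout {
    /// Bytes to reserve when storing the pointer.
    fn type_size_bytes(&self) -> u32 {
        self.type_size_bits / 8
    }

    /// Bits used for bounds, permissions and other metadata.
    fn metadata_bits(&self) -> u32 {
        self.type_size_bits - self.value_size_bits
    }

    /// Largest address the pointer can represent. Offsets and object
    /// sizes are limited by this, not by the total type size.
    fn max_address(&self) -> u128 {
        u128::MAX >> (128 - self.value_size_bits)
    }
}

/// Assumed plain 64-bit pointer in address space zero.
const PLAIN: PointerLayout = PointerLayout {
    address_space: 0,
    type_size_bits: 64,
    value_size_bits: 64,
};

/// Assumed 128-bit capability with a 64-bit address in address space 200.
const CAPABILITY: PointerLayout = PointerLayout {
    address_space: 200,
    type_size_bits: 128,
    value_size_bits: 64,
};

/// In hybrid mode both kinds coexist, so lookups need the address space.
const HYBRID: [PointerLayout; 2] = [PLAIN, CAPABILITY];

/// Finds the layout for a given address space, if the target defines one.
fn layout_for(table: &[PointerLayout], address_space: u32) -> Option<PointerLayout> {
    table.iter().copied().find(|l| l.address_space == address_space)
}
```

In this model, code that allocates storage would ask for `type_size_bytes()`, while code that reasons about offsets or object sizes would use `max_address()`. Mixing the two up is the kind of mistake the audit of type size uses is meant to catch.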
We need to go through all of those uses of the type size and see if they should really be using the size of the value that you can represent, because this is the main cause of the errors that I'm seeing in building the libraries. And of course a lot of testing and
polishing is going to be required. Before I finish the talk, I do need to mention that there's ongoing and past work in this same area. There was a master's thesis from the University of Cambridge, and there's another government-funded project from the University of
Kent. And, well, thank you for listening. Please feel free to check out the code on GitHub or ask me any questions outside.