
SPARKNaCl: A Verified, Fast Re-implementation of TweetNaCl


Formal Metadata

Title
SPARKNaCl: A Verified, Fast Re-implementation of TweetNaCl
Number of Parts
287
License
CC Attribution 2.0 Belgium:
You may use, modify, and reproduce the work or its content in unchanged or modified form for any legal purpose, and distribute and make it publicly available, provided that you credit the author/rights holder in the manner they have specified.
Abstract
SPARKNaCl is a new, verified and fast reference implementation of the NaCl API, based on the TweetNaCl distribution. It has a fully automated, complete and sound proof of type-safety and several key correctness properties. In addition, the code is fast - out-performing TweetNaCl on an Ed25519 Sign operation by a factor of 3 at all optimization levels. This talk will cover how "Proof Driven Optimization" can result in code that is both correct and fast on bare-metal embedded targets. TweetNaCl is a compact reference implementation of the NaCl API. It was initially constructed to show that an entire crypto library could fit into "100 tweets", but has since been re-used in some critical applications, such as the WireGuard VPN. There are no comments in the code at all, and all assurance rests on a single brief academic paper, and the formidable reputation of the authors. Can we do better? Can we produce a reference implementation which is amenable to automatic verification and yet is competitive with TweetNaCl in terms of performance and code size? This talk presents SPARKNaCl - a complete re-implementation of TweetNaCl in SPARK, which comes with a fully automated proof of type-safety, memory-safety and a number of key correctness properties. Having established a solid foundation, we went on to compare the performance and code size of SPARKNaCl against the original C implementation. Various transformations and optimizations have been applied that result in SPARKNaCl out-performing TweetNaCl on a bare-metal 32-bit RISC-V machine for a single Ed25519 "Sign" operation, while retaining automation and completeness of the proof. Furthermore, SPARKNaCl is freely available under the 3-clause BSD licence. This talk will present an overview of the results from both the proof work and performance analysis of SPARKNaCl.
Transcript: English (automatically generated)
Okay, thanks for coming everybody. Nice to see you all at FOSDEM this year. Nice to see you even virtually, if not in person. So this afternoon, I'm going to be talking about SPARKNaCl, which is a cryptographic library written
in Spark, which I'm sure most of you already know is a subset of Ada and associated tooling. This was a kind of a lockdown side project for me. It started out as a bit of fun, just to do some programming work during lockdown, and then kind of became a bit more serious as I got it more towards a production level of quality.
So what I want to talk about then, in order, is to start by talking a little bit about something called TweetNaCl, which is an existing version of the NaCl crypto library, which I started from. Tell you a bit about that and then tell you, well, why am I bothering to do this in the first place? What caught my interest about it?
I'll tell you about the goals that I set myself for this project. I'll also tell you what my goals definitely weren't in terms of things I didn't want to try to achieve. The two main technical contributions, I suppose, are firstly a section I'll talk about the proofwork on the software. So this is about the formal verification of the Spark version and what I was able to verify and what I wasn't able to verify.
Then we'll talk about the performance side, which is kind of saying, well, OK, we verified it, but can we actually make it go fast enough to be competitive with the C code without breaking the proof as well, which is a very interesting adventure. And I'll talk about finally some further work and ideas that we might have to take it further.
So moving straight ahead, what is NaCl? Well, OK, NaCl on its own is sometimes pronounced "salt", or it actually stands for Networking and Cryptography Library, but no one calls it that. People refer to it as libsalt or libsodium sometimes. NaCl is, I guess you could describe it as, the specification of a cryptographic library's API, which embodies a number of relatively recent cryptographic algorithms.
And there are many implementations of NaCl written in different languages. There are implementations in C, Python, Rust, C++, you name it. Lots of languages have implementations. But there are particularly well-regarded implementations of this, which have been subject to a
lot of quality control, and it is very highly regarded as a piece of software. TweetNaCl is an implementation of the NaCl API, which is written in C, and it has some rather unusual properties. This version of NaCl was actually written and designed by the people that designed the API in the first place.
So a rather odd property of TweetNaCl is that all of the source code for TweetNaCl, for the whole library, can be tweeted in 100 tweets. Now, this is a slightly strange thing to do. Firstly, it's an exercise in demonstrating that a useful cryptographic library can actually be
done in such an amazingly small amount of code, which is an achievement in itself. And secondly, it's a little bit of an exercise in saying, well, you know, demonstrating that you can distribute this kind of stuff to the whole world by tweeting it without having to get in the way of any copyright or distribution or export controls or anything like that.
So, as a result of trying to get the thing down to 100 tweets, one issue is that there are really actually almost no comments in the code whatsoever. So it is spectacularly difficult to understand how it works. I'll show you a bit of the code in a minute or two.
There is a technical paper of about 16 pages that describes how the library is put together. It is a really rather succinct paper, and it makes sense if you understand the mathematics of things like elliptic curves and all these other algorithms. But if, like me, you're not a hardcore crypto mathematician, if you're a software engineer,
it's pretty difficult reading this paper, and trying to understand the code is still pretty difficult. And the assurance for TweetNaCl largely rests on things like, well, there are lots of people using it. There are some significant users. I believe a version of it has been reused in something called the WireGuard
VPN system, which is widely used and is built into many Linux distros these days. So there are lots of users, and the designers of NaCl, particularly a chap called Dan Bernstein and his colleagues, do have a very formidable reputation as cryptographic mathematicians, certainly.
But it's interesting to look at a piece of code like that and say, well, you know, is it really actually trustworthy and has it got any bugs in it? And if we applied state of the art formal verification tools, what would we find? So that's kind of where I came from. I thought, well, OK, let's just rewrite it in Spark and see what we get. What you get with TweetNaCl, in terms of the algorithm set, is a number of modern cryptographic algorithms.
So Salsa20 is a modern stream cipher. There's an authenticator called Poly1305. It uses a standard hash algorithm, just the 512-bit variant of SHA-2. And for signatures and key exchanges, it uses an elliptic curve called Curve25519, on top of which is built the Ed25519 signature scheme.
And you can also use it to establish a shared secret using the variant of ECDH built on that elliptic curve. So you get a really useful set of primitives with which you can do most things that you would want to do. So it is really a useful library in so little code, it's quite surprising.
So as I said, looking at the C code, when I originally looked at it, I just looked at it and thought, gosh, I don't get this. How is this possibly working? Because, as I said, there were no comments. There's also a bit of trickery going on where the code is doing some cunning mathematical tricks to achieve an acceptable level of performance.
The code is doing some interesting mathematical trickery to make things like the elliptic curve operations go sufficiently fast. It's not as highly optimized as some libraries that you might come across, so it's not written in assembler or doing really wild things, but it is still pretty difficult to understand. Secondly, it is programmed in a fashion called constant-time programming style.
So the idea here is that the running time of the algorithms does not vary in a way that depends upon any variable or any input variable of any of the algorithms, if you like. So this is to defend the algorithms against side channel attacks based upon the execution time of the algorithm.
And that makes for the code being more difficult to understand, because you look at a piece of code and think, what on earth's going on? And then you think, oh, why didn't they just write it differently, you know, it would be much easier if you wrote this out as an if statement. And then you think, oh, they can't write an if statement because you'd be testing a sensitive variable, and you can't branch based on a sensitive variable.
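To make that constant-time idea concrete, here is a minimal, hedged sketch of the style (an illustrative fragment, not the actual TweetNaCl or SPARKNaCl source): a conditional swap written without an if statement, where the secret bit is turned into an all-zeros or all-ones mask so that the same instructions execute whatever the secret value is.

```ada
with Interfaces; use Interfaces;

--  Illustrative constant-time conditional swap. B is assumed to be 0 or 1.
--  The mask is all-zeros or all-ones, so there is no branch on the secret
--  value and the running time does not depend on B.
procedure CSwap (P, Q : in out Unsigned_64;
                 B    : in     Unsigned_64)
is
   Mask : constant Unsigned_64 := 0 - B;              --  0 or all-ones
   T    : constant Unsigned_64 := Mask and (P xor Q); --  0 or (P xor Q)
begin
   P := P xor T;
   Q := Q xor T;
end CSwap;
```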
So it's kind of written in a slightly weird way in this constant time style. And finally, as I said, the paper is rather terse. So here's an example of what the paper says and then what the code looks like. So, for example, here's a rather marvelous paragraph from the paper. There's an algorithm in the signing algorithm.
There's an algorithm that reduces a number, a large number, modulo a big number called the order of the group. And you think, OK, and this one paragraph is the description of this whole code function in the library. And you look at that and you think, huh, what on earth is that about?
And is that really right? Well, yeah, except actually, when you get down into the weeds of it, and as a sort of spoiler of something I found while doing the formal verification of this code, there is actually a mistake in this paragraph, and I'm sure you all spotted it, right?
Obvious. It's a harmless mistake, I'm pleased to say, but there is actually a factual error in this paragraph. I won't tell you where it is yet. But that one paragraph is the only description you get of some code. Just for the sake of readability, we just introduce a couple of things to make it readable.
There's a couple of typedefs you need to know about. There's a signed integer 64 called i64 that the code uses. There's a type called GF, which is an array of 16 of those signed 64 bit values. So the GF type is used to represent very large numbers, basically.
It's an array of integers where each integer is like a digit of a large number made up of an array of digits. And those are used to represent numbers which are points on the elliptic curve. So this is a type that's all over the place in the software. It then uses a couple of #defines just to abbreviate a couple of common idioms,
which makes the code shorter and therefore easier to fit in 100 tweets. So that paragraph of mathematical stuff becomes that code in the TweetNaCl distribution. And you look at that and you think, wow, what on earth is going on there? If you look here, you can see all these mathematical operators.
There's lots of various multiplications and additions and subtractions and shifting. And imagine every single one of those operators: you've got to convince yourself that it can't overflow, and these are all signed integer operations. So remember that in C, signed integer overflow is undefined behavior.
So you've got to convince yourself that every one of those could never overflow or underflow in any way. That's pretty difficult to do. And it's pretty difficult to see how this code works at all, let alone know it's doing the right thing and you're not going to be undefined or anything. So that's the kind of scale of the challenge that you're faced with this kind of code.
So why did I even bother to do this? In the last five to ten years, I guess there's been a real explosion in interest in this field, particularly people doing formal development or formal verification of cryptographic code. There have been many, many notable achievements in this area from people like Microsoft Research,
Amazon Web Services, the people at Galois out in the States with their Cryptol and SAW tool sets. There's a system called Jasmin with EasyCrypt, which is extremely strong, an effort called hacspec with Rust to specify and refine cryptographic code, Fiat Cryptography. And probably as many projects again that I've forgotten to mention.
And I apologize to anyone who, you know, feels aggrieved that I've forgotten to mention their favorite project. And there are lots and lots of very impressive results in this area. Also, there are very highly respected implementations of NaCl out there already; in particular, the kind of benchmark implementation that many other distributions are based upon is called libsodium.
And libsodium is very highly regarded. It has been subject to a great deal of audit and analysis. It is also extremely efficient because bits of it are implemented in assembly language for particular targets. And they use some extraordinary tricks to make the thing go very, very fast. So that's a really well respected implementation. So, you know, why bother doing another implementation of NaCl?
Well, I really wanted to see, particularly with Spark 2014, since the language was, you know, so significantly enhanced with Spark 2014 and modern-day proof engines, I wanted to see if the proof system could deal with code of this kind of complexity, this kind of mathematical
nature, using this constant-time programming style, to see if the tools could cope with code of this nature. So what could we prove? Could we prove that it's so-called type safe, or free of runtime errors, and prove the code couldn't raise an exception, for example? Or could we do more? Could we actually go on to prove some more advanced sort of correctness properties?
I'm also interested in automation, meaning can I get the theorem proving tools to prove everything automatically? So you just push the button and it says, yes, everything is OK. And it proves all of the verification conditions automatically with one or more of the main proof engines, because I don't want to be involved with having to check false alarms from the proof tool.
I definitely don't want to have to be firing up some sort of interactive proof tool to kind of finish off the proving job. I want to automate the thing completely. So it'd be interesting to see what level of automation can be achieved. There's also a myth that I come across quite a lot working in industry.
There is a general perception I see in industry that people think that formal languages are either too slow or not compatible with people's target hardware and sort of unusable in the real world, where the real world might be, you know,
running on a small embedded target, something that hasn't got much memory or much computing power, or, these days, not much battery power. People are very interested in power consumption as well as speed. But there is this perception that formal languages are sort of too slow or too big for production use. This often comes out in the wash restated as people saying, well, this code has to be really fast and really efficient.
And it's got to run on this little embedded target, so we have to program it in C, or C++, or C#, or any language that starts with C, or possibly assembly language. And languages like Spark don't even get a look in. People don't consider that it could be done in Spark. And I was interested in seeing if that really was the case and seeing if we can achieve both efficiency at execution time and also
the level of assurance and proof that we can do with Spark and see if we can get both of those at the same time. So what goals did I set myself and not set myself? So I wanted to do the whole library. I didn't just pick on one
of those cryptographic algorithms. I decided I'd do the whole library, all of the algorithms. So did that. Fine. Obviously, my new library should at least pass all the regression tests that exist for TweetNACL. And indeed it does. And that's fine. That's pretty, pretty obvious. The implementation retains the constant time property of the programming.
That's important. That stuff about avoidance of the side channel attack from timing variation. It is an important property of the library. And I decided I had to retain that to see if the tools could deal with it as well. And what I really set my goal on was this idea that I get a completely automatic
proof of at least so-called type safety, where type safety in Ada terms basically means proving that the program could never raise an exception at runtime, but also covers things like completely getting rid of any possibility of undefined behavior or erroneous behavior and so on.
Auto-active is a term that was, I believe, coined by Rustan Leino, who used to work for Microsoft Research and more recently works for Amazon. Auto-active is a sort of portmanteau of automatic and interactive. So it's this idea that the proof is automatic in that you just press the button
and it works, but it's interactive in that, to make it automatic, you have to write contracts in the code, where contracts are things like assertions, preconditions, postconditions, loop invariants and so on. So there is a bit of work involved, you actually have to think. But once you've got those contracts in place, the proof is completely automated.
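As a flavour of what that looks like in practice, here is a small, hedged sketch (an illustrative example written for this summary, not code taken from SPARKNaCl): the postcondition and the loop invariant are the contracts you write by hand, and the proof tool then discharges the resulting verification conditions without further interaction.

```ada
package Contracts_Sketch
  with SPARK_Mode => On
is
   type Byte is range 0 .. 255;
   type Byte_Seq is array (Natural range <>) of Byte;

   --  Zeroize a buffer; the postcondition states the result we want proved.
   procedure Zeroize (X : in out Byte_Seq)
     with Post => (for all I in X'Range => X (I) = 0);
end Contracts_Sketch;

package body Contracts_Sketch
  with SPARK_Mode => On
is
   procedure Zeroize (X : in out Byte_Seq) is
   begin
      for I in X'Range loop
         X (I) := 0;
         --  The loop invariant records what has been zeroed so far; with it
         --  in place, the postcondition is proved automatically.
         pragma Loop_Invariant (for all J in X'First .. I => X (J) = 0);
      end loop;
   end Zeroize;
end Contracts_Sketch;
```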
That matters because I really didn't want to have to fire up an interactive proof tool like Isabelle or Coq or any of those other tools, because I know from experience they're not much fun to drive. Another goal was that, for bare-metal, kind of very small embedded targets, I
wanted the whole library to be compatible with the zero footprint runtime, which is implemented by GNAT. So that means you put the compiler in a particular mode, and it never ever generates a call to the Ada runtime library, so that the code will effectively run on anything. There are no dependencies on any other packages or libraries, there are no COTS components at
all, and the whole library is programmed in this sort of heap free style. So there's no use of dynamic allocation or deallocation or anything, it needs a bit of stack memory, and that's basically all. And it's completely portable, the same code compiles, if you can compile it, I assume, it will run on anything, any 32 or 64 bit machine, it should just work straight out of the box.
I wanted the performance to be, shall we say, competitive with TweetNaCl, we'll see how we go, we'll come back to that. And will we find any bugs? If I actually attempt this verification, will I actually find bugs in my code, which turn out to be bugs in the original master copy from TweetNaCl?
We have some form in this area, I did a similar exercise some years ago with a hash algorithm called Skein. And in doing that verification in Spark, I found a bug in the Skein algorithm, or at least the implementation of the Skein algorithm in C from the designers of the algorithm; it actually had a very subtle arithmetic overflow bug in it, which
had never been discovered before by anyone else, so we have some sort of form in this regard. As for what I didn't want to do: I'm not going to try and compete with libsodium in terms of performance, because that would be silly, I'm not going to compete with very carefully crafted assembly language. I'm not going to try to prove, you know, actual
heavyweight mathematical correctness properties of the code; that seemed probably a bit ambitious. And I'm not trying to say anything about the mathematical security of things like the elliptic curve, Curve25519; that's the domain of crypto mathematicians, not software engineers, so I'm not sort of venturing into that territory.
OK, so on to the proof work. Firstly, a few statistics for you. It's about 1700 logical lines of code, comprising statements and declarations, and slightly over 100 subprograms: 74 procedures and 37 functions.
In terms of contracts, this is a bit interesting. There's quite a lot of types and subtypes, as you know from using Ada, types and subtypes are sort of super important in the correct use of the language. But other contracts, there's 27 preconditions, 20 explicit post-conditions, dynamic
subtypes, or dynamic subtype predicates, turn out to be really important. I think they are something of a killer feature of Ada 2012, which the Spark tools can exploit. There are a few of those. And there are 51 explicit assertions in the code, assert pragmas. That's quite a lot, and that reflects the difficulty of some of the proof that we have to do: the asserts are used
to kind of help the prover, if you like, to kind of give it a hint of things it should prove first. Before it can prove C, you should prove B, and before B, you should prove A. Well, you can kind of do that by giving it assertions. So there are quite a few intermediate assertions in the code that we'll come back to.
Proving things about loops is often tricky. I had a quick look. There were 56 loop statements in the library. Interestingly, of those 56, 25 don't require any loop invariants at all to be stated by the user. The tool just synthesizes something and that's enough to prove what we want to prove. 18 of them require a user written loop invariant, but it's really simple. It just
took me about 30 seconds to write the correct loop invariant and it was done. And only 13 of them are sufficiently hard that I had to write a loop invariant and it actually took some real work and some real thinking to get it out. But I will admit those 13 were actually quite tricky. If you look in the code, you'll see some of them are really quite, quite non-trivial.
And I'll tell you where those are in a minute. Yeah, the difficult bits, unsurprisingly, are things like there's a multiply operator for that GF type. So this is multiplying 256 bit integers modulo a large prime number.
The large prime number, by the way, is two to the power of 255 minus 19, which is where the curve gets its name. So, yeah, multiplication modulo two to the 255 minus 19. That's a really difficult bit of code. And proving that it doesn't overflow, for example, proving that all the intermediate numbers that you get don't overflow is actually quite tough.
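As a hedged reconstruction of the kind of bound involved (assuming the radix-2^16, 16-limb representation that TweetNaCl uses, so this is indicative rather than the exact figures from the proof):

```latex
% p and the limb representation:
%   p = 2^{255} - 19, \qquad a = \sum_{i=0}^{15} a_i\,2^{16 i}, \qquad 0 \le a_i < 2^{16}.
% High product limbs fold back into low ones with a factor of 38, because
\[
  2^{256} = 2 \cdot 2^{255} \equiv 2 \cdot 19 = 38 \pmod{2^{255} - 19}.
\]
% Each raw product limb is a sum of at most 16 terms a_i b_j < 2^{32}, so after
% the times-38 folding the intermediate limbs stay below roughly
% 16 * 39 * 2^{32} < 2^{42}, comfortably inside a signed 64-bit type, which is
% the sort of bound the overflow verification conditions have to establish.
```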
That was a difficult one. There is a subsidiary function to that called carry and reduce, which is abbreviated to CAR. That function does the modulo reduction thing. That's quite tricky as well. And proving that that terminates is actually quite tricky and proving that it converges turned out to be quite difficult.
And we've actually done that. And then there's this modulo L operation, the fragment of code I showed you earlier. That is spectacularly difficult to understand and was really quite difficult to prove to be type safe. Interestingly, the first two of those that the multiply and the carry and reduce operations are also the most performance critical in the loops of the whole library.
For a single signature generation in the library, I think the carry and reduce function is called about 20,000 times. So that's very intensely performance critical. So trying to, for example, get the inner loop of that code to be really, really fast was something of a big deal.
So what have we proven? So this notion of type and memory safety in Ada terminology kind of reduces to the complete absence of undefined or erroneous behavior. There's no dependence on unspecified behavior, which is an intrinsic property of Spark.
There's nothing that would raise an exception. So we prove that all range checks, overflow checks, index checks, division-by-zero checks and so on can never fail, using the theorem proving tools. Obviously, because it's Spark, there's no explicit use of pointers and no use of the heap. So that makes a whole load of problems go away.
Also, obviously, what we're doing is proving that all object values always satisfy their type declarations and all users supplied contracts are always true. So preconditions, post conditions, assertions and loop invariants are always true as well. And then if you get the whole lot out, you know, you're in a good place. So that's basically what we've proven.
Some observations about that. Types basically are awesome. If you're used to using Ada, which I'm sure most of you are, since you're in the Ada room at FOSDEM, you'll know that already. But this comes as a surprise to other people, people that are used to programming in C. They kind of don't get the idea of things like user defined scalar types,
user-defined integer types are something of a crazy concept to C programmers, you know. And it's just so important in Ada to use these things appropriately. It really is the key to achieving automation and completeness, meaning zero false alarms and getting one hundred percent of the VCs out.
It really is all about using the type system properly. And another observation is that types, this idea of type safety is not really a fixed price concept because in Spark, the more effort you put in to declaring and using types and subtypes properly, the more you get back in terms of what you can prove. So it's really up to you once you've established these basic properties of no undefined behavior and everything else.
How far you go then depends upon what you actually write: the more declarations you write, and the stronger they are, the more value you can get back from the proof system. Types are like, if you like, a weak specification of correctness.
And so some of the properties that emerged from the proof are actually what you would think of as being correctness properties of the code. Although they're expressed as types in the language. So this notion of, well, is it type safe or is it partially correct? Actually, it kind of is a bit fuzzy in Spark and in this library because
very strong use of types actually starts to establish partial correctness properties as well. And a really interesting observation for me was the discovery that the so-called dynamic subtype predicate thing from Ada 2012 really is amazing.
It offers a leap in the expressive power of the type system in Ada 2012, which is enormously important. I've recently discovered they also exist in languages like Haskell and OCaml, where they're called liquid types, where liquid stands for logically qualified, which is a rather strained abbreviation of the term, I suppose.
But yeah, they do exist in other languages. But this idea that you can define a subtype with a fully general predicate is really, really amazing and very, very powerful. So an example from the code is this bunch of declarations. So what we're looking at here is this type, the type GF we saw in C, which is this array of 64 bit integers.
You can see that in the Spark code it's called type GF64. And it's an array of 16 values of type GF64_Any_Limb, where GF64_Any_Limb is an I64 subtype. Because we're in Ada, we're going to define the bounds on that thing, which are defined in terms of some constants we've declared above.
But then, interestingly, I've got a subtype called Normal_GF64, which is a GF64 with a dynamic predicate that says: it's a GF64, so it's an array of 16 64-bit signed integers, but each of the elements of that array is constrained to be in the range 0 to 65535.
So that's like an array of 64-bit integers where each digit is constrained to be normalized down into a much smaller range. So the use of this dynamic predicate thing is really, really important. And the proof tools are aware of the dynamic predicate and what it means and will prove that you always respect it.
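As a hedged sketch of the style of those declarations (the names and the exact bound constants here are illustrative, not copied from the SPARKNaCl sources):

```ada
with Interfaces; use Interfaces;

package GF_Sketch is
   subtype I64 is Integer_64;

   Max_Limb : constant := 2**53;   --  illustrative bound on an un-normalized limb

   subtype GF64_Any_Limb is I64 range -Max_Limb .. Max_Limb;

   type Index_16 is range 0 .. 15;
   type GF64 is array (Index_16) of GF64_Any_Limb;

   --  A "normal" GF64 is still an array of 16 signed 64-bit limbs, but the
   --  dynamic predicate constrains every limb to 0 .. 65535. The provers
   --  know this predicate and check that it is always respected.
   subtype Normal_GF64 is GF64
     with Dynamic_Predicate =>
       (for all I in Index_16 => Normal_GF64 (I) in 0 .. 65535);
end GF_Sketch;
```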
And it's really, really cool. In terms of volume of proof, the code generates exactly 907 verification conditions for a particular release. I'm pleased to say that with a combination of the provers Z3, CVC4 and Alt-Ergo, the distribution, basically the community
edition, the 2020 and 2021 community editions of Spark managed to prove all 907 of those verification conditions completely automatically. About 770 something are so trivial that all three provers blow them away every time very, very trivially.
And 130 are hard, in that those 130 are ones where at least one of the provers fails to prove the VC. So you have to have more than one prover switched on. Of the 130 that are hard, the scores are that Z3 gets 122 of them.
CVC4 then comes along and proves another two. And then Alt-Ergo comes along and proves the last six. And that's how we managed to get them all out. But it's interesting to note that you do need all three of the provers. No two of them alone are enough: you won't get 100 percent with only two of them. You do still need all three to get to 100 percent.
In terms of elapsed time, it's about four and a half minutes on the laptop sitting in front of me. This has eight processor cores, so that's not too bad. That's kind of nearly a coffee break, which is reasonably acceptable. The tools are pretty good at being parallelized, and that's good news: they generate lots of verification conditions.
And, you know, you get a VC file almost for each check, for each line of code that has a check on it. For each subprogram, you get a verification condition file, and you can prove them in parallel using as many provers as you like. That works very, very well and scales very well. And you can just buy more processor cores to speed the proof up.
That's really good news. OK, so on to performance. So what does it actually go like when you run it on a real machine? For the test setup, I needed an embedded target, so I bought one of these things. This is a SiFive HiFive1 Rev B board, which is a small 32-bit RISC-V machine, because it's a small embedded machine.
Also, the GNAT Community Edition release from 2020 has support for this out of the box. This is a 32-bit RISC-V, so it was supported straight out of the box by GNAT, which was very nice. And thanks to AdaCore and the team that work on the BSPs and so on for supporting it.
So I started with the GNAT Community 2020 edition, which is essentially, under the hood, a release of GCC 8.4.1. And what I decided to test was a sign operation, because that's computationally intensive, it's important, and it uses all of the mathematically interesting stuff, the elliptic curve operations.
And I'm signing a 256-byte message using a particular fixed key. And what I did was to sign the same message using SPARKNaCl and using the C implementation from TweetNaCl, and measure their runtime on the board to see what I got.
There's some reasons why I thought Spark might be slower than C. So firstly, I adopted a more functional programming style in Spark than you would in C. So particularly in Spark, for example, I was able to overload the mathematical operators for that GF type so that I can supply a proper binary plus operator, a multiply operator and so on for that type,
which essentially, you know, in Spark is a function that returns a GF type. And normally in Ada, the canonical semantics say that that function return would be by copy. But that GF type, remember, is 16 64-bit integers, which is 1024 bits.
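A hedged sketch of what that looks like (illustrative, not the actual SPARKNaCl spec): the operator is just a function returning the GF type by value, and marking it with GNAT's Pure_Function aspect records that it has no side effects, which is what later lets the compiler apply the return slot optimization discussed a little further on.

```ada
with Interfaces; use Interfaces;

package GF_Ops_Sketch is
   type GF is array (0 .. 15) of Integer_64;

   --  "+" returns a 1024-bit value, nominally by copy. Pure_Function is a
   --  GNAT aspect saying the function has no side effects.
   function "+" (Left, Right : in GF) return GF
     with Pure_Function;
end GF_Ops_Sketch;

package body GF_Ops_Sketch is
   function "+" (Left, Right : in GF) return GF is
      R : GF;
   begin
      for I in GF'Range loop
         R (I) := Left (I) + Right (I);   --  sketch only: no carry handling here
      end loop;
      return R;
   end "+";
end GF_Ops_Sketch;
```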
And obviously returning 1024 bits by copy on the stack might actually be really, really slow. So that might be a reason why we would be slow in Spark. So that's a possibility. TweetNaCl also has a rather interesting trick in it where it only does two of those normalization steps after the multiply operation.
In SPARKNaCl, it was actually a lot easier to prove if you did three normalization calls after a multiply. And also, SPARKNaCl normalizes after addition and subtraction of those GF values. So the bottom line is that in SPARKNaCl we do more of those normalization operations, so that might slow us down.
Finally, SPARKNaCl sanitizes its local variables, which is good practice in cryptographic programming: you sort of erase your local state before you return from a function. TweetNaCl doesn't do that. So that's another reason why Spark might be slower. There are also some reasons why Spark might be faster than C.
Obviously, once we've proven that the Spark code is type safe and free of runtime errors, we are compiling it with the switch on the compiler that says suppress all runtime checking, the -gnatp switch, because we can. We've got justification and evidence to do that, because the proof tools tell us that the runtime checks can never fail.
Therefore, you can never raise an exception; therefore, suppress all the runtime checking. So that's good news. Spark also has many properties that make it intrinsically good for an optimizer: things like having no aliasing of names in the language, no undefined behavior, and no side effects in expressions or function calls.
These are correctness properties, but also they're properties that an optimizer might be able to exploit to improve the code. So there's a potential way we might be good. So what did we get? Well, we started optimization level zero, one and two. And obviously I'm using GNAT Community Edition here to compile the C as well as the Spark.
So it's the same back end of the same compiler. At optimization levels one and two, the C code basically won by a bit, as you can see in green there. But a really big surprise was that the Spark code was substantially faster at optimization level zero, much faster than the C.
And that was actually quite a surprise. So what was going on? Firstly, I found at optimization level zero, the C code is very slow at doing bitwise rotate because in C there is no bitwise rotate operator. So you have to write it out as a full blown function. And without the optimizer, that function doesn't get inlined in C.
It's a full-blown function call. In Spark, rotation is an intrinsic function in package Interfaces, so you typically get one instruction or a very small number of instructions for a rotate operation, therefore very, very fast. And that saved us a lot of time.
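For example, a hedged sketch of the Ada side (illustrative; SPARKNaCl's own rotate usage will differ in detail):

```ada
with Interfaces; use Interfaces;

--  Rotate_Left on Unsigned_32 comes from package Interfaces and is an
--  intrinsic, so even at -O0 it typically compiles down to a rotate
--  instruction or a very short shift/or sequence, not an out-of-line call.
function RL32 (X : Unsigned_32; Amount : Natural) return Unsigned_32 is
begin
   return Rotate_Left (X, Amount);
end RL32;
```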
There's a small improvement owing to an improvement in the carry and reduce code, which comes from the WireGuard sources that I adopted. That's not that significant. A big deal was this thing called return slot optimization. So, that concern I had about the cost of a function return of a thousand bits or more being very, very slow:
turns out that's almost entirely removed by the compiler because, again, of the properties of Spark. Because the function calls are side-effect free and alias free, the compiler is able to actually optimize those function calls, and essentially it optimizes them into doing a return by reference directly into the caller's stack frame. And that works even at optimization level zero in GNAT.
And that's really, really cool because that concern just disappears. We'll come back to return slot optimization a bit later on. So the golden rule I had with improving performance was whatever I do to improve the performance of this code, I'm not going to break the proof. So having established a 100 percent automatic proof, I'm going to keep it 100 percent and fully automated whatever I do to improve performance.
So that was my kind of guiding rule, if you like. It also was really nice because the proof stops me making stupid mistakes. If I start with type safe code, obviously I should finish with type safe code because I shouldn't break the correctness properties of the code.
So the proof tool is really, really useful because it stops me doing things I shouldn't do. I also found, and we'll come to this, that the proof system and the types in particular language can kind of suggest or guide optimization. And we'll see that in just a moment. So the first round of things I did was some fairly basic optimizations that I knew would help.
So things like going for an optimal initialization of large objects rather than initializing everything in a large object once and then initializing it all again in a loop or something. I was able to optimize that. I did a bit of manual common subexpression elimination and partial redundancy elimination, and a little bit of judicious unrolling, particularly of the inner loop of the GF multiply operation.
There's a big, big improvement available by unrolling the innermost of two nested loops. I removed some slices from the Spark code because that was a bit of a source of slowness. And where possible, I actually applied those changes back into the C source code of TweetNaCl to kind of level the playing field.
So with that lot in place, the numbers come out like this. And this is a bit less surprising. So now we see that TweetNaCl is winning by a bit at all optimization levels, but with a big improvement across the board in both languages, which was very pleasing, just by doing some very basic stuff.
In my second round, I came across this observation about the GF type: most of the time, the numbers in that GF type are actually limited to be unsigned and 16 bits in range, even though they're being stored in 64-bit digits.
Because when you're doing a multiply, those digits overflow a normal 16-bit range. The digits actually get very, very big as you multiply up all the different digits and coefficients and add them all together. Some of those digits get up to about two to the forty-two or two to the forty-three or something. So you need 64 bits to store one, for sure.
But a problem is that, while the code is compact, it's very wasteful of storage. You're storing a lot of zeros there. You're storing 1024 bits where really, most of the time, you could store one of those GF things in 256. So a question is, could we compress that type down to be 256 bits?
And also, can we use the proof system to prove that some of the 64 bit mathematical operations could actually be done in 32 bits? Because if you can prove that something will always fit in a 32 bit plus or a 32 bit multiply, you can save a lot of time. So, for example, on RISC-V 32 bit, a 32 bit multiply is one instruction, but it's six instructions to do a 64 by 64 bit multiplication.
So that's a potentially enormous saving. So, for example, in the carry and reduce code, the first application of carry and reduce starts with 64 bit values.
But it turns out the result of that operation, all the digits fit in 32 bits. The second carry and reduce code goes from 32 bit to 32 bit. The third application goes from 32 bit back to 16. What you can do is in that second and third call, you can actually do all the arithmetic in 32 bit maths, not 64 bit maths.
And that makes a huge difference. The other potential benefit from compressing the representation is that just assigning these things is fast. If you do need to copy one of them, you only need to copy 256 bits instead of 1024. Also, you might get more of them in the data cache, which would help as well.
And as I said, the proof system is backing me up here by proving that these transformations are OK and proving that the operations still don't overflow, even if I do them in 32 bits and so on. So that's nice to know I'm not going to make a mistake. With that in place, there is an absolutely amazing improvement.
The Spark code goes from ninety-something million CPU cycles for a sign operation down to thirty-something or twenty-something. So we're getting about a factor of three better from the Spark code with this revision, this so-called operator narrowing kind of transformation and compressing the storage. We absolutely get an amazing improvement at pretty much all the optimization levels, particularly at one and two.
We get down to 32 or 27 million CPU cycles from 90 something, which is really, really, really impressive. In the third round of optimization, I thought, well, let's try optimization level three. But use the no-unroll pragma to tell optimization level three where not to unroll loops just to make sure we were in control of what it was doing.
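Concretely, that looks something like the following hedged fragment (illustrative; GNAT's Loop_Optimize pragma is placed inside the particular loop you want to keep rolled):

```ada
with Interfaces; use Interfaces;

procedure No_Unroll_Demo (Total : out Unsigned_32) is
begin
   Total := 0;
   for I in 1 .. 64 loop
      --  Ask the optimizer not to unroll this particular loop, even at -O3.
      pragma Loop_Optimize (No_Unroll);
      Total := Total + Unsigned_32 (I);
   end loop;
end No_Unroll_Demo;
```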
I also had a look at optimization level S, which means optimize for size, to see what it would do. I also tried the so-called compressed instruction set from RISC-V, where basically some of the 32-bit instructions on RISC-V can actually fit in 16 bits.
They can be compressed. So you get smaller code size and possibly better hit rate in the instruction cache. On this particular target machine, there was actually a penalty, though, because if you branch to an instruction that's not on a 4-byte boundary, there's actually a performance penalty for that. So we're not sure that's going to help or not, but we'll try it.
Then I discovered the compiler could actually change that. You can say, give me compressed instructions, but force alignment. So all jump targets, all basic blocks always start on a 4-byte boundary. The compiler has switches to do that. So that's what I actually settled on. So with that in place, we get the following results.
We get a bit more improvement across the board, except for some reason optimization level two, there's some sort of degradation that I never really understood. But we do get a bit of improvement at that level. So at this point, the best we've got is about 25 million cycles for a sign operation at optimization level three.
The other interesting numbers: the stack usage is down to about two and a half K for a sign operation. That's computed using the GNATstack tool. The code size for the whole library is about 18K with optimization S switched on. That's OK. I mean, it's only 18K here, folks. I don't know if many of you remember what it was like programming
when your memory was measured in kilobytes, not megabytes, but that's quite reasonable. A bit more optimization in round four was things like doing some loop fusion: merging the first carry and reduce operation with the plus and minus functions helped a bit. I also realized I could narrow another multiplication operation in the actual multiply
function itself to do a 16 by 16 bit multiplication instead of a 64. Again, the proof system said that was OK. And a bit more judicious loop unrolling in the multiply function to avoid double initialization of the intermediate. That gave us another few hundred thousand CPU cycles.
So we were down into the realms of 23 and a bit million cycles at optimization level three, which was quite nice. Then a year passed, and GNAT Community 2021 came along, based on GCC 10.3. And as a result of the work I'd done earlier, many thanks to Eric Botcazou at AdaCore, who actually implemented some changes
in GCC so that the return slot optimization stuff was enabled for more cases of function calls because of the properties of Spark. So particularly, if you put the aspect Pure or Pure_Function onto a function in Spark now, the compiler picks that up and says, aha, and is able to turn on return slot optimization in more cases.
So that got us about another two or three hundred thousand CPU cycle saving. So down to about twenty three point two two million at optimization level three, which was useful. Observations, then the optimizations seem to come in several flavors. There are language specific stuff like array slices and this return slot stuff.
There's stuff to do with the micro architecture, like compressing instructions or not, as the case may be. There's algorithmic improvement like transforming the mathematics, but using the proof engine to support and prove that we're not making a mistake. And then there's what I would call back porting. So this is taking optimizations that happen at a higher
optimization level and transforming the source code so that you get the benefit of that optimization, but without the optimizer. This is things like partial redundancy elimination and loop unrolling, where you change the source code to make the optimization happen, but you still get the benefit at lower optimization levels.
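For instance, a hand back-ported partial redundancy elimination might look like this toy example (invented, not code from the library):

```ada
--  Invented example of "back-porting" an optimization to the source:
--  the common subexpression X (I) * Y (I) is computed once and reused,
--  so even at -O0 the multiply is only executed once per iteration.
with Interfaces; use Interfaces;

procedure PRE_Demo is
   type Vec is array (0 .. 15) of Integer_64;
   X, Y, R, C : Vec := (others => 1);
   T : Integer_64;
begin
   for I in Vec'Range loop
      --  Before (the product is evaluated twice unless the optimizer
      --  removes the redundancy):
      --     R (I) := R (I) + X (I) * Y (I);
      --     C (I) := C (I) + X (I) * Y (I);
      --  After hoisting the product into a temporary by hand:
      T := X (I) * Y (I);
      R (I) := R (I) + T;
      C (I) := C (I) + T;
   end loop;
end PRE_Demo;
```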
It is hard to predict what you're going to get, though. You really do have to have the hardware and do the measurements to understand what's going on. Right. The combination of using the proof system with the optimization is really, really great. The proof tool just stops me making mistakes, but it also suggests things.
It says, you know, the types are there, and the types are saying you could do this operation in 16 bits or 32 bits instead of 64 — and I can prove that that's safe. That's really, really great and works really well. I do come across a lot of projects in industry where people don't like using the optimizer, which is quite a surprise.
People still compile with the optimizer off. And there is a general perception, particularly among C programmers, that they don't trust the optimizer — or at least they say, well, we tried switching it on and our code broke, so we switched it back off again. That's disappointing. SPARK is amazingly robust in the face of optimization, and this is not a surprise: SPARK was designed to have this property
that the semantics is independent of optimization, and the complete elimination of undefined and unspecified behavior from the language means that optimization is typically very safe and robust on SPARK. And that's good news. Loop invariants are still really hard to find for the non-trivial stuff.
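To give a flavor, here is a deliberately simple, invented example — the real invariants in the carry/reduce and multiply routines are considerably nastier:

```ada
--  Invented example, much simpler than the real SPARKNaCl invariants:
--  the loop invariant gives the prover a bound on the running total,
--  so it can show the next addition cannot overflow and that the
--  postcondition holds on exit.
with Interfaces; use Interfaces;

procedure Sum_Demo
  with SPARK_Mode => On
is
   subtype Limb is Integer_64 range 0 .. 65535;
   type GF is array (0 .. 15) of Limb;

   function Sum (X : in GF) return Integer_64
     with Post => Sum'Result in 0 .. 16 * 65535
   is
      Total : Integer_64 := 0;
   begin
      for I in X'Range loop
         Total := Total + X (I);
         pragma Loop_Invariant (Total in 0 .. Integer_64 (I + 1) * 65535);
      end loop;
      return Total;
   end Sum;

   Result : Integer_64;
begin
   Result := Sum (GF'(others => 1));
   pragma Assert (Result = 16);
end Sum_Demo;
```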
The difficult ones were really difficult. If you want to have a look at them in the code, be my guest — they're not very nice in about three or four of the functions. Reproducibility of results with the prover is a tricky subject at the moment. Things like: if you take a new version of a tool like Z3 or CVC4, it's not guaranteed that you'll get the same results.
And it's not guaranteed that you'll get better results. You know, I've tried this and sometimes a newer proof tool actually degrades in that it fails to prove something it proved before. And that's really disappointing. Also, timeouts in proof tools are a terrible, terrible thing.
The proof tool timing out, as a way of generating what amounts to a false alarm, is not a great thing if the timeout is based on the passing of wall-clock time, because then you don't get the same results on all your desktop machines.
You don't get the same result on your server, or in your continuous integration, or out there in the cloud. That's really not very helpful. So getting the same results on all the platforms, for all the developers, all the time is really important, and you have to go to things like using step limits in the proof engines rather than timeouts.
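In gnatprove terms that means pinning an explicit step limit in the invocation everyone uses — something like the line below, where the project file name and the step count are just illustrative assumptions:

```
gnatprove -P sparknacl.gpr -j0 --steps=20000
```

Because the step count is prover-internal rather than wall-clock based, the same verification conditions succeed or fail identically on a laptop, a CI runner, or a cloud machine.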
In terms of further work, then — what we might do next with it. Firstly, what if we ran on a 64-bit target? Would all this still apply? What sort of results would we get? The answer is: don't know, but we might try it. It would also be very interesting to do this all again, but instead of using GCC, try it with GNAT LLVM to see what the performance is like with an LLVM-based compiler.
That would be interesting. It would also be interesting to take some of those verification conditions — for example, the eight verification conditions that Z3 can't prove. We know those eight VCs are true, because CVC4 and Alt-Ergo can prove them.
We could take those eight that Z3 won't prove and submit them to the Z3 developers to say, hey, come on, why don't you prove these? And maybe we can get one of the provers up to a standard where it proves everything. That would be good. It would also be nice to try and prove that we only need to do two applications of that carry-and-reduce operation rather than three.
That proof has actually been done, by someone called Benoît Viguier over in Belgium. He did it in Coq, though, using a really high-powered theorem prover, working interactively in higher-order maths, and he's been able to prove that that's the case. So can we get that at the code level, with automation, as well? Because that would save us another ton of time.
And finally, there's formal verification of the constant-time property, which we don't have at the moment. I believe we can do that by using the information flow analysis engine: inside that analysis there's this thing called a program dependence graph, and I think we can use that PDG to prove that subprograms exhibit the constant-time property.
And that's something we'd like to look at in the future. So that pretty much gets us to the end — I hope I'm on time. In terms of resources, the code is on GitHub at that location. If you want to look at the code, be my guest, and please submit comments or issues if you have any. If you want to see how the optimizations were done:
each of the optimizations is tagged in the GitHub repo, so from one tag to the next you'll be able to see the changes I made to achieve a particular transformation or optimization — you can see exactly what happened. And there is a sequence of blog entries on the AdaCore blog which documents this performance analysis and optimization work in gruesome detail.
If you want to know exactly what I did and how it worked, then please go and check out those blog entries. Thank you all very much for listening. I'll leave it there at this point and hope we get some really interesting questions and comments from you all in just a moment. Thank you to the organizers of FOSDEM for inviting me, and I hope to speak to you soon.
Amazing presentation, especially for making this topic, which is quite complex, digestible, in my opinion.
So the first question I would like to ask you is: was SPARK worth it? Was it worth using? That's a good question. Well, I would like to think so, but obviously I'm biased.
It was an interesting exercise. It is interesting to see how far the proof tools have come. I think, you know, 10 years ago we wouldn't have dreamt of doing that depth of proof with the tools, especially with that level of automation. The tools just couldn't have managed it with the previous version of SPARK, before SPARK 2014.
Now, with SPARK 2014 and the really modern proof engines, it's amazing — the power of the proof tools and also the power of the language. As I said, things like the type system in SPARK are really powerful. The addition of the dynamic subtype predicate in Ada 2012, coming into SPARK 2014, is amazing in terms of what we can express now.
And the tools' ability to automate the process has come on in leaps and bounds from just a decade or so ago. So it was really worth it, and it's interesting — and, I suppose I would say, it was good fun to do. I did most of it during lockdown; I needed something to do as a side project.
So I enjoyed it and thought it was worth doing. And I hope the rest of you found it interesting to listen to. OK, so you mentioned that maybe you could find some bugs in the original implementation.
Interestingly, from the release of TweetNaCl that I started with, the answer is no: we didn't find any clear bugs in TweetNaCl, which is a pretty remarkable result, actually, for such a significant body of complicated C code.
There are a couple of things in TweetNaCl where I came to translate it to SPARK and thought, oh, that's terrible. Things like: TweetNaCl does shift operations on signed integers, which is a terrible thing, because in C that's implementation-defined in one direction and undefined in the other, and it's not a great thing to do. So there's a little bit of slightly dodgy practice in there.
But we didn't find any clear bugs. There is one interesting result related to that, though. If you look online, the first release of the TweetNaCl sources was made in December 2013, and then in April 2014 there was a second release saying, oh, we fixed a bug.
So there was a bug in the first release of TweetNaCl, which means that for about five months no one noticed there was actually a rather subtle, nasty bug in it. This was in the modular reduction algorithm. And what I did, actually, just for fun, was to take my copy of SPARKNaCl, re-inject that original bug, and put it back in.
It's literally a 14 where there should be a 15, or a 15 where there should be a 14 — it's normalizing the wrong element of an array of digits. I put that bug back into SPARKNaCl, ran the proof tools, and because of the predicated types, because of things like predicated subtypes, the proof tool spots the bug immediately.
It comes back and says: no, you've got one undischarged VC there — and spots it straight away. And that's a really cool result, because it means you couldn't have made that mistake. If the code had been written in SPARK in the first place, you never would have made that bug, because the proof tool spots it.
So it was really pleasing to see that SPARK would have got that one. Other than that, no, we didn't find any bugs in the original, which is good news, I guess — it speaks very highly of the ability of the original designers. Thank you. You mentioned using SPARK 2014.
SPARK has evolved since then, and there have been new additions to the language and new tools. Have you tried those new additions and improvements or not? So I think the one really recent feature of the language that I did exploit was this attribute — well, not an attribute.
There's an aspect in the language called Relaxed_Initialization, which I believe appeared in the 2020 or 2021 release of the SPARK Community edition — I'm sure Yannick will correct me. That allows us to prove initialization using the proof tools rather than by the more pessimistic flow analysis.
That's very important for performance, because what you'd normally do in SPARK, if you've got a composite object like an array, is initialize it all in one go using an aggregate, and then maybe you initialize it again, or initialize a few elements again, element by element. With the Relaxed_Initialization aspect and this 'Initialized attribute, you can use the proof system to prove initialization instead. What that means in terms of performance is that you end up initializing what you need to initialize exactly once, which is faster. Basically, you get faster code by being able to exploit this more expressive way of proving initialization of arrays.
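A minimal sketch of that pattern, with invented names rather than the real SPARKNaCl code:

```ada
--  Invented example: R is written exactly once, element by element,
--  and the proof tools (rather than the pessimistic flow analysis)
--  show that it is fully initialised on return.
with Interfaces; use Interfaces;

procedure Relaxed_Demo
  with SPARK_Mode => On
is
   type GF is array (0 .. 15) of Integer_64;

   procedure Init_Squares (R : out GF)
     with Relaxed_Initialization => R,
          Post => R'Initialized
   is
   begin
      for I in R'Range loop
         R (I) := Integer_64 (I) * Integer_64 (I);
         --  The prover tracks which slice has been initialised so far.
         pragma Loop_Invariant (R (R'First .. I)'Initialized);
      end loop;
   end Init_Squares;

   X : GF;
begin
   Init_Squares (X);
end Relaxed_Demo;
```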
So that was very, very useful. I guess the other big technological advance came, as I said, with the Community 2021 release of GNAT: Eric Botcazou had actually implemented some changes in the middle end of the compiler to turn on this thing called the return slot optimization for SPARK functions.
So if the SPARK function is marked pure, and there's no aliasing of the actual parameters with the returned object, then the compiler switches on this return slot optimization. And again, we got a noticeable performance improvement from the compiler being smart enough to exploit the properties of SPARK when generating code.
So that was very useful as well. And, as I said, the killer feature from Ada 2012 is the dynamic subtype predicate. I couldn't have done it without dynamic subtypes and the predicates on them. They're really powerful — definitely the thing I found most useful for doing this work.
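For anyone who hasn't met them, the idea looks roughly like this — invented names and a simplified range, not the actual SPARKNaCl declarations:

```ada
--  Illustrative only: a dynamic subtype predicate stating that every
--  limb of a "normalised" GF value fits in 16 bits. Any value carried
--  around with this subtype is known by the prover to satisfy the
--  bound, which is exactly the kind of property that catches a
--  wrong-index normalisation bug.
with Interfaces; use Interfaces;

package Predicated_Types
  with SPARK_Mode => On
is
   type GF is array (0 .. 15) of Integer_64;

   subtype Normal_GF is GF
     with Dynamic_Predicate =>
       (for all I in Normal_GF'Range => Normal_GF (I) in 0 .. 65535);

end Predicated_Types;
```

With a predicate like that in place, re-introducing the off-by-one normalization bug mentioned earlier would leave at least one verification condition undischarged, which is exactly the behaviour described above.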
OK, thank you. We have another question: did you have too many false alarms? Well, initially, when I did the first port — if you look at the GitHub repo and go right back to the beginning,
you'll find my initial port to SPARK, and I hadn't really developed the contracts very well at that point. If you run the tools on that version, yes, there were false alarms, because I hadn't written a sufficiently strong set of contracts — I hadn't written the preconditions and loop invariants yet. So initially there were quite a few false alarms; I can't remember how many.
But then, as my understanding of the code developed — how this multiplication and reduction work, and all these other algorithms — you begin to understand how they work and go, aha, I've understood something, OK, I'll put that in. I could encode my understanding of the algorithm as a precondition or as an assertion as I developed the contracts.
This is the sort of auto-active bit of it. You put contracts in, the false alarm rate comes down and comes down, and eventually we got there: we got to a point where there were zero false alarms, because all 900-and-something verification conditions are automatically discharged. So effectively we got to zero, which was the point — that's what I was trying to get to, to see if it was possible.
So we got there in the end, and I won't pretend it was easy. Three or four of those functions took a lot of thinking about to understand. It was really a matter of just understanding how they work in the first place. Once you understand how they work, you go, ah — you express your understanding as a contract and, bang, the proof tool goes, oh yeah, that one's easy.
And they managed to prove the code. So it's an interesting process, but yeah, it was worth doing, and we got to zero in the end, which was a good thing. So Frederick Braga asks: how much time did it take you to build this project, knowing that you are a long-time SPARK user?
That's a good question. I regret to say that I did not actually keep a log of the time I spent — I wish I had — so I don't have a record of how many hours this all took.
It's interesting that there's a big split. Some of the routines, like the SHA-2 / SHA-512 hash implementation, translated from C into SPARK very easily, and almost all of it proved at the first attempt — it was like a day or so, a few hours' work, easy peasy. At the other end of the spectrum is the mathematically complicated stuff in the elliptic curve algorithm.
Translating the code — you know, 10 lines of C translates into SPARK, not too bad, a few hours. But the proof of some of those hard algorithms wasn't sitting-at-the-keyboard typing work. It was thinking; it was just getting my head around how it worked.
That was measured in hours and hours and hours of just going: why can't I understand this algorithm? What's it doing? How does it work? There was a lot of sitting and thinking, and sometimes waking up in the middle of the night and going, aha, I've got it — quick, I know how it works — and having to get up in the middle of the night and write an assertion down before I forgot what I was thinking.
Some of those algorithms took quite a lot of effort. I mean, measured in many hours of thinking about them and then trying the proof. The proof tool said, oh, not quite. No, can't prove it. And then you go, ah, I understand something else. And then the proof tool would say yes. I mean, end to end, sort of working on this just part time as a side project.
The whole thing has been spread over more than a year or two of work. Again, if you look at the GitHub repo, you'll see the commits on it go back — it must be almost two years now. So it's quite a lot of effort, but very spread out, in spare time, over more than a year.
You'll see there's sort of groups of commits where there are lots of commits in a short space of time. And then nothing happens for months because I'm doing some real work and getting paid to do something elsewhere. And then there's obviously a period where I had time and I did a lot more work and you'll see the commits are grouped into little bursts.
So, yeah, I don't know the actual numbers for productivity and effort, but that's the way it goes. So we have a question from Jeffrey, who asks whether it is even possible to prove the correctness of an encryption algorithm.
Well, if you mean partial correctness — does the ciphertext you get back, are the bits actually correct with regard to the specification of the cryptographic algorithm — then yes, it is possible. We've not done that in SPARKNaCl, but it is possible, because lots of other groups have done it. Look at Fiat Cryptography.
Look at the work with EasyCrypt and Jasmin. Look at the work that Galois have done with their cryptographic specification language, from which they can refine code. Microsoft's work on Project Everest, Amazon's work — lots of other groups are looking at really serious correctness of cryptographic algorithms.
So, yeah, it's been done. It comes down to having a specification. Thankfully, most of these crypto algorithms do have very strong mathematical specifications you can work from, and that's great. We haven't done it in this case because I was going for automation, and type safety was my main goal, rather than full correctness.
I mean, for TweetNaCl there is a correctness proof of the scalar multiplication on the elliptic curve which, as I said, was done by a guy called Benoît Viguier in Belgium. But that was kind of a Ph.D.-sized project, and he had to use a full-blown interactive theorem prover to do it.
I believe he used Coq to do it, and it's a big, complicated bit of real, hardcore mathematics — he's actually proved that the code really does generate the right answer. So that's an entirely different scale of effort, if you like, and a different complexity from what we did here. But it has been done, and there are lots of projects out there doing really serious correctness proofs on cryptographic algorithms all the way down,
some even going down to the level of the machine code and proving correspondence with the binary. It really is very impressive what else has been going on out there in the crypto community. OK, so another question: how many contracts, assertions, et cetera, did you have to add
that should not have been necessary — meaning, I shouldn't have to guide the prover towards this goal? That's a good question, to which I have to say I don't know the answer. Adding these assertions is sometimes a bit difficult. You have to think: well, if I add this assertion, will it help? And you add an assertion, and sometimes it helps.
And the proof tool says, yeah, that works. And sometimes it doesn't change anything — oh, that didn't help. So how many shouldn't be needed? Well, if I had an infinitely powerful theorem prover, hopefully you wouldn't need that many. But we don't have infinitely powerful theorem provers. The theorem provers are as capable as they are today, and they'll get better as time goes on, no doubt.
It would be an interesting exercise. I suppose you could take the SPARKNaCl sources and the proof as it is, with zero false alarms and full automation, and try selectively removing some of the assertions to see if the proof still works. And if it does, remove another assertion and see if the proof still works, and see how many you can remove before you break the proof.
That would be an interesting exercise; I haven't had the time to do it, so I just don't know how many you could remove. What would be the minimum set you'd need to retain automation of the proof? Don't know. It would be a fun thing to do, but probably pretty time-consuming, because you'd spend a lot of time waiting for the proof tools to see if they still work.
OK, sounds like a bit of a Herculean effort, frankly. Yeah. So we've got 10 seconds left. So there are still some questions left unanswered. So please, viewers, join this room. I think Rod is going to stay here. Thank you.