
How (not) to make a mockery of trust

Formal Metadata

Title
How (not) to make a mockery of trust
Subtitle
Testing client software for public-key infrastructure
Number of Parts
287
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
The ever-continuing push for digitalisation has increased our reliance on trust services of various kinds, filling various needs relating to document signing, code signing, authorization tokens, and so forth. Many of these trust services rely on public-key infrastructure (PKI) and X.509 certificates. The sensitive nature of these tools makes them difficult to use in a testing environment. On the one hand, exposing access to production keys in your CI is obviously a terrible idea. But on the other hand, setting up and maintaining a fully functional "mock" PKI environment is also pretty tricky. What can you do about that? Using PKI tools in test workflows involves many challenges. Here are a few examples: Even a (supposedly) basic task like validating an X.509 certificate involves quite a bit of complexity. Apart from "local" validation logic, you might also have to check the revocation status of your certificate, which could entail talking to an OCSP responder service or looking up a CRL. If you're using secure timestamps (RFC 3161) in your code, your tests might also require access to a time stamping service. Maybe you're using a remote signing service vendor that doesn't offer any sort of "sandbox" for testing purposes. In all of these scenarios, both test data generation and mock service integration can be quite cumbersome. Both in my own time and on the job, I write a lot of code that relates to digital signing in various ways, and this is a kind of problem that I run into all the time. After trying out a variety of methods, I grew dissatisfied with the "traditional" options, and rolled my own PKI testing framework: Certomancer. Certomancer handles test data generation, performs trust service mocking, comes with a plugin API, and most importantly, it's FOSS (MIT licence). In my talk, I'll take you through some of the "how"s and "why"s of Certomancer's feature set, and talk about some of the mileage that I've gotten out of it.
Transcript: English (auto-generated)
Good afternoon, everyone. It's great to be back here at FOSDEM this year.
Today, I want to tell you about the frustrations I've experienced running tests against public key infrastructure and the tools that I ended up developing in response. So I'm very grateful for this opportunity to share some of that with you here today. First, let me quickly say a couple things about myself. I work in the PDF industry in the research department at iText.
The company I work for grew out of an open source project, and I also do quite a bit of FOSS work in my own time as well. So in that sense, you could say that FOSS is both my job and my hobby. What's particularly relevant for today's topic is that I work a lot with digital signing as well, mostly in connection with PDF documents. And while that's not really what this talk is about,
you'll hear them pop up from time to time. For those who are interested, I've also listed my GitHub profile and personal website here on the slide. All right, since we're in the testing and automation room, let me spend some time to set up a testing problem that I run into way too often.
In order to do that, we need to back up a bit and review how digital signatures work. Most of what I'm about to say is not specific to PDF signatures, by the way. The technological basis of digital signing is public key cryptography. What that means is that every signer has a pair of keys, the private half of which is used for signing, the public half for validation.
Of course, the crucial thing is that there is a mathematical relationship between the private key and the public key that makes digital signatures practically impossible to forge if you don't know the private key. At the same time, anyone with knowledge of the public key can still validate signatures produced using the private one.
So what this amounts to is that if you're given a piece of data, a signature and a public key, there is a mathematical procedure that allows you to check whether the signature was generated on that particular piece of data by the private key corresponding to the public key that you have.
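To make that procedure concrete, here is a minimal sketch using textbook RSA with deliberately tiny numbers. It is an illustration only: the primes, the helper names (`digest`, `sign`, `verify`) and the lack of padding are all assumptions made for this example, and nothing about it is secure or taken from a real library.

```python
import hashlib

# Toy textbook-RSA parameters: far too small to be secure, illustration only.
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def digest(data: bytes) -> int:
    """Hash the data and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes, private_exponent: int) -> int:
    """Produce a signature over the data using the private key."""
    return pow(digest(data), private_exponent, N)


def verify(data: bytes, signature: int, public_exponent: int) -> bool:
    """Check the signature against the data using only the public key."""
    return pow(signature, public_exponent, N) == digest(data)


message = b"I agree to the contract"
signature = sign(message, D)

print(verify(message, signature, E))             # True
print(verify(message, (signature + 1) % N, E))   # False: altered signature
```

Note that `verify` never sees `D`. It also says nothing about who holds `D`, which is the gap that certificates are meant to fill.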
What the maths don't tell you is who that key then belongs to, and that's where certificates come in. The role of a certificate is to bind a signer's public key to their identity. And we'll talk about what that entails in a minute. But in the meantime, you should remember that a certificate is just a special type of signed data.
It's essentially a signed statement by an entity authorized to identify other signers. And that's what we call a certificate authority. Most common formats for certificates these days are derived from the X.509 standard that you might have heard of, or perhaps a profile of that. And they're typically attached to or embedded into the signed payload.
Now, what's important to keep in mind here is that the certificate itself is not really part of the mathematical signing process. It's just one of the ways to solve the problem of determining which keys belong to whom. In some sense, from the validation point of view, you could even argue that the mathematics is the easy part of the validation process.
In almost all cases in the real world, the algorithm that you're trying to use has already been implemented by someone else. You can just use their library to handle that validation for you. In actuality, the hard part is not so much in validating the mathematical integrity of the signature, but rather in verifying whether or not the signer is who they actually claim to be.
In other words, the question that we're asking here is who is actually in control of the signing key? Since that's what certificates and public key infrastructure are all about, let's zoom in on that aspect a little further. As I already said in the beginning, a certificate is essentially just a statement
from a certificate authority asserting that a certain key belongs to a certain owner. And this owner doesn't even necessarily have to be an actual person. It could also be a company or a domain name like, you know, a website. And of course, the CA itself is not an exception to this rule either. So in most cases, you'll have a set of certificate authorities that you trust absolutely.
Those are the trust roots. And everyone else basically has to have a path or a chain of trust that goes back to one of these roots. The idea is that each of these links is backed up by yet another certificate. So in order to validate a certificate, you essentially always have to validate a chain of trust consisting of multiple certificates.
In general, to be issued a certificate, the subject must prove to the CA that they control the key for which they're trying to obtain a certificate. The precise mechanism by which that happens is highly dependent on the use case and varies from vendor to vendor as well.
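The chain-building idea described here can be sketched in a few lines. This is a simplified model, not how any particular validator works: the `Cert` class and `build_chain` function are hypothetical names, certificates are reduced to a subject and an issuer, and the signature check on each link is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def build_chain(leaf, known_certs, trust_roots):
    """Follow issuer links from the leaf up to a trusted root.

    Returns the list of certificates on the path, or None if no trusted
    root can be reached. A real validator would also verify the signature
    on every link; that step is omitted in this sketch.
    """
    by_subject = {cert.subject: cert for cert in known_certs}
    chain = [leaf]
    current = leaf
    while current.subject not in trust_roots:
        parent = by_subject.get(current.issuer)
        if parent is None or parent in chain:
            return None  # unknown issuer, or a loop that never hits a root
        chain.append(parent)
        current = parent
    return chain


root = Cert("Root CA", "Root CA")
interm = Cert("Intermediate CA", "Root CA")
alice = Cert("Alice", "Intermediate CA")
mallory = Cert("Mallory", "Unknown CA")

path = build_chain(alice, [root, interm], {"Root CA"})
print([cert.subject for cert in path])   # ['Alice', 'Intermediate CA', 'Root CA']
print(build_chain(mallory, [root, interm], {"Root CA"}))   # None
```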
But the point is that it requires a degree of trust. And therefore, certificates are naturally limited in time because trust doesn't last indefinitely, right? So that's why certificates have an expiry date. And that already brings into view the tip of the iceberg of complexity that is certificate validation.
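The expiry check itself is simple, as the following sketch suggests. The function name `is_time_valid` and the dates are made up for illustration. What matters for testing is that the moment of validation is passed in as a parameter rather than read from the system clock, so a test can pin it to any instant.

```python
from datetime import datetime, timezone


def is_time_valid(not_before, not_after, moment):
    """True if `moment` falls inside the validity window (inclusive)."""
    return not_before <= moment <= not_after


not_before = datetime(2020, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2022, 1, 1, tzinfo=timezone.utc)

during = datetime(2021, 6, 1, tzinfo=timezone.utc)
afterwards = datetime(2023, 6, 1, tzinfo=timezone.utc)

print(is_time_valid(not_before, not_after, during))       # True
print(is_time_valid(not_before, not_after, afterwards))   # False
```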
Because now the obvious question arises, what happens if the key is compromised before the certificate expires? To respond to that need, there are mechanisms by which CAs can revoke certificates that they issued before they expire. There are many reasons for which a CA might want to revoke a certificate.
For example, if the key is compromised, if they suspect that the certificate was issued fraudulently or for any number of reasons. And that means that revocation checking is an essential part of the certificate validation process. This is especially true for high value signatures like those on contracts and whatnot.
The two main mechanisms by which a certificate authority communicates the status of its issued certificates to the broader public are CRLs and OCSP. CRL stands for certificate revocation list. So as the name implies, it's basically just a list of all the certificates that are currently revoked.
Very straightforward. But for large commercial CAs, these CRLs can get very large and unwieldy. And for those cases, the CA can also offer OCSP access. The way this works is that the CA exposes an OCSP responder service to the internet. And if you want to know the status of one of the certificates issued by that CA, you
can send a request to the OCSP responder and basically ask it, hey, is this certificate still valid? Yes or no. And again, this is all complexity that has to be dealt with by pretty much all applications that rely on signature validation for integrity. Testing all of that would be hard enough already, but it gets worse.
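The difference between the two revocation mechanisms can be modelled with a toy CA, as in the sketch below. The `ToyCA` class and its methods are invented for this example, and the real formats are signed ASN.1 structures rather than Python sets. The three status values mirror the good, revoked and unknown answers that OCSP defines.

```python
from enum import Enum


class Status(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


class ToyCA:
    """Tracks issued and revoked serial numbers for a pretend CA."""

    def __init__(self):
        self.issued = set()
        self.revoked = set()

    def issue(self, serial):
        self.issued.add(serial)

    def revoke(self, serial):
        if serial in self.issued:
            self.revoked.add(serial)

    def publish_crl(self):
        # CRL model: one list covering every revoked certificate.
        return frozenset(self.revoked)

    def ocsp_query(self, serial):
        # OCSP model: one answer about one certificate.
        if serial not in self.issued:
            return Status.UNKNOWN
        if serial in self.revoked:
            return Status.REVOKED
        return Status.GOOD


ca = ToyCA()
for serial in (1001, 1002, 1003):
    ca.issue(serial)
ca.revoke(1002)

print(1002 in ca.publish_crl())   # True
print(ca.ocsp_query(1001))        # Status.GOOD
print(ca.ocsp_query(9999))        # Status.UNKNOWN
```

With a CRL the client downloads the whole list and looks up the serial number itself. With OCSP the lookup happens on the CA's side, which is why the responder has to be reachable at validation time.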
It turns out that it's not just validators, but signers also have to deal with this stuff. Without going into too much detail, that usually has to do with signatures that have to outlive their certificates. And this has significant implications for the software engineering process. Because if you're designing an application that needs to validate digital signatures or produce digital signatures with long lifetimes, then, well, your application needs to interact with those trust services.
And that brings us to the central question that I wanted to ask here today. That is, how do you even begin to test such a setup? Because due to the nature of the thing that we're trying to do, replicating these trust services in a testing environment is not always easy.
And that's the question I want to focus on for the remainder of my time. The first and most basic thing to note here is that the problem is really about so much more than just generating test data. Because indeed, the generation of test certificates and CRLs is something that can be scripted. With OpenSSL and some bash skills, you'll get pretty far.
And while these scripts are more often than not write-only and hard to maintain, it's still pretty doable, all things considered. Now, as the title of the talk kind of implies, the real issue is not so much in the generation of test data, but rather in mocking these online services that you need in a PKI environment.
Those include OCSP responders, which we discussed before, but things like time stamping services and remote signing implementations would also fall into this category. And of course, having these test services up and running in your testing environment is only half the story. Because hopefully you're not just testing the happy path in your application, you also need to test all sorts of failure cases.
For example, what does your application do if it's served a certificate that's broken in some way, or a revoked certificate, or maybe it needs to validate certain exotic certificate extensions that OpenSSL doesn't even handle. In all of these cases, chances are that you'll hit the limits of the bash plus OpenSSL approach pretty quickly.
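As a rough illustration of what testing those failure cases involves, the sketch below drives a validation routine against an in-process stand-in for a revocation responder. Everything here is hypothetical: `FakeResponder`, its scripted behaviours and `accept_certificate` are names made up for the example, and a realistic mock would sit behind HTTP and speak the actual protocol.

```python
class FakeResponder:
    """In-process stand-in for a revocation responder with scripted behaviour."""

    def __init__(self, behaviour):
        self.behaviour = behaviour

    def query(self, serial):
        if self.behaviour == "timeout":
            raise TimeoutError("responder did not answer")
        if self.behaviour == "garbage":
            return "not-a-valid-status"
        return self.behaviour  # "good" or "revoked"


def accept_certificate(serial, responder):
    """Accept only on an explicit 'good' answer (fail closed)."""
    try:
        status = responder.query(serial)
    except TimeoutError:
        return False
    return status == "good"


behaviours = ("good", "revoked", "timeout", "garbage")
outcomes = {b: accept_certificate(1001, FakeResponder(b)) for b in behaviours}
print(outcomes)
# {'good': True, 'revoked': False, 'timeout': False, 'garbage': False}
```

Failing closed on a timeout is a policy choice made for this sketch. Some applications deliberately accept a certificate when the responder is unreachable, and that behaviour needs a test of its own.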
And just as an illustration of how messy it gets, I've included a screenshot of a very small excerpt of my previous testing setup, and I hope you'll agree with me that this is neither pretty nor particularly maintainable.
So at some point, my frustration with this process reached critical mass, and I sat down for a weekend to put together an actual solution. And that's what became Certomancer. Certomancer not only generates test certificates and CRLs for you, but it also handles the provisioning of these trust services that we discussed. One of its biggest advantages, in my opinion, is that it doesn't require any
scripting. In the vast majority of use cases, you can simply configure it declaratively. If for whatever reason your use case is more complicated than that, then you can always use the Python API to extend its functionality. The code is hosted on GitHub. There's a link on the slide and also on the summary page for this talk.
So to wrap up, I thought I'd give you a quick tour of what working with Certomancer is like in practice. Certomancer takes two kinds of input. First, it needs keys. Actually, generating the keys is not something that Certomancer does by itself, but you can use a tool like OpenSSL to generate key pairs for most common algorithms.
Those include RSA, DSA, ECDSA, and Edwards-curve DSA. Once you have the keys, you'll need to write some configuration. The first part of that configuration involves defining entities that can be the subject of or issue certificates. We do that by defining the entity's name and associating at least one key pair with each entity.
And the next step involves configuring certificates in our virtual PKI system. The example on the slide here shows the issuer being set, the validity period, and a couple of very common certificate extensions that you'll find in most production certificates out there.
Note in particular that the configuration includes a CRL distribution points extension, which is going to point to a CRL repository hosted by Certomancer itself. And that brings us to the final part of the configuration, namely the actual trust services. In this example, I've included configuration for an OCSP responder, a CRL repository, and a time stamping service.
As you can see, this part of the configuration is not particularly complicated. It's just a couple lines and Certomancer handles the rest. To see all of that in action, you can run the Certomancer animate command and then point your browser to the running instance.
From that very primitive web UI, you can interface with the virtual PKI environment that you just created. If you want to see a more extensive demo of the command line interface that Certomancer exposes, there is one available on asciinema that's also linked from the project's
GitHub page. I've personally gotten quite a bit of mileage out of Certomancer myself, both in my personal projects and at work. I not only use it for integration tests of various kinds, but I've also used it to cobble together a mock server implementation for the Cloud Signature Consortium API. And outside the testing use cases, it's also really convenient for
demonstrations, proof of concept implementations, troubleshooting, internal training, and all sorts of other situations where being able to, quote unquote, make a mockery of trust is pretty useful. All right, that's all I got. I hope you enjoyed it. And if you have any questions,
I'll happily take them now. Thank you.