Is YAML the Answer?

Formal Metadata

Title
Is YAML the Answer?
Subtitle
… and if so, what has ever been the question?
Title of Series
Number of Parts
542
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
If you nowadays work with and on the cloud or simply *-as-Code, chances are you came into proximity of ‘Yet Another Markup Language’ (YAML). Deceptively simple in appearance, YAML provides you with a plethora of ways to shoot yourself in the foot. This talk discusses a number of common pitfalls. In a nutshell: Why YAML is not as easy as you may think.
Transcript: English (auto-generated)
This entire talk is inspired by a single remark by a former co-worker of mine, who just casually dropped the line that YAML was so simple that nobody could ever attain mastery of it.
So, a question for the audience, also slight audience participation, sorry. Who would tend to agree? That is almost nobody. That's a shame, because you would be in good company. This is in the goals section of every version of the YAML spec, there it is.
So let's get a bit into detail. At its very core, YAML exists to provide a printable text representation of structured data. And in that regard, it is a rival to things like JSON, XML and other formats.
It's been around for quite a while; we're looking at almost 20 years of YAML now. It is somewhat interwoven with JSON: since version 1.2, all of JSON is actually also valid YAML. That introduces an interesting relation, because now, since I think 2018, JSON is
a strict subset of JavaScript, and thus there's an intersection of YAML and JavaScript now that is precisely JSON. Let's not get into the argument of whether that's good or not, just, you know, it's a thing.
If you write a lot of YAML, most of the examples, most of the real-life specimens, if you will, will lead you to believe that there are no real types
in YAML. Actually, the opposite is true. YAML is heavily typed; almost everything in a YAML document has a type. Here's a selection, nothing surprising. All the types you see here are also present in JSON, and that also raises
an interesting question. Let's say you have a YAML document. Could you just change the syntax to JSON and have a valid representation? Would that work? Oh, you're too good. I see I attracted the wrong audience, because, yeah, no, that doesn't work.
Actually, it falls apart with the map type. The map type in YAML is really, really wide. It allows for such things as composite keys for its entries. If you're really interested, they're introduced by, what's the token, a question mark followed by a space.
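To make the composite-key point concrete, here is a minimal sketch using only Python's standard `json` module. The mapping and its tuple key are made up for illustration; the point is merely that a JSON object has no place for a key that is itself a structured value.

```python
import json

# A mapping whose key is itself a compound value (a pair).
# Key and value are hypothetical, chosen only for illustration.
composite = {("x", 1): "origin marker"}

try:
    json.dumps(composite)
    outcome = "serialized"
except TypeError:
    # The json module only accepts str, int, float, bool or None as keys,
    # because a JSON object key is always a string.
    outcome = "rejected"

print(outcome)
```

A YAML mapping with such a key therefore has no direct JSON counterpart; any conversion has to invent its own encoding for the key.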
So that's a thing. There are also not-so-basic types. omap is an ordered map; the regular mapping is not ordered. set is somewhat, yeah, special.
This is not a complete list, by the way. There's a way to have YAML inside YAML, that's nice. Then there's a type specifically for binary data. This one is really, really useful, provided it actually works.
Try it. You know, the problem really is that in JSON, XML, and also in YAML, you can't have certain bytes. So the first 16, I believe, code points are off limits; they are control characters.
They can't be part of the stream. With this type, you can just Base64 everything, and once your YAML is being parsed, that is expanded into the raw binary. Pretty neat, eh? First example.
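The Base64 idea behind the binary type can be sketched with Python's standard library alone. This is not a YAML parser; it only shows the encode and decode steps that a parser supporting the binary type would perform on your behalf. The sixteen sample bytes are an arbitrary choice echoing the control characters mentioned above.

```python
import base64

# Sixteen control bytes that could not appear literally in a text stream.
raw = bytes(range(16))

# Base64 turns them into plain printable ASCII that is safe to embed.
encoded = base64.b64encode(raw).decode("ascii")

# Decoding restores the original bytes exactly.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == raw)
```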
This is not minimal YAML, but it does help to illustrate a few points. I suppose a lot of you have a lot to do with, say, OpenShift, Kubernetes, and the like. You've seen the three hyphens a lot, haven't you?
Okay, who knows what that does? Come again? Not quite, no. No, it's not the beginning of the document. What this is, is a document separator. So what you see here is not actually a YAML document, it's a YAML stream.
YAML is meant to be streamed, possibly. Don't do that unless you have solid message framing, because truncated YAML tends to also be valid YAML, so think twice before you do that. The thing is, most of the tools that you have for YAML will assume there's only
one document ever. You need to do some convincing to get all the documents out of a stream. Oh, by the way, do you know what happens if you omit those three hyphens?
You do? Okay, pretty much everything you see here, if it's missing, it's implied. Great, eh? As for the version number, that's also a bit of homework for you. There's a chance this is going to break the tools because they do not understand version
1.2. The majority of the tools are still stuck at 1.1. So, let's get into it. This is something you see quite commonly. What do we have here? It's a mapping with a single entry, which has the key variables.
Inside is another mapping under the key app version. We've got something in it. Now, there's no indication of what data type that is. It's an integer, right?
You agree? It depends on your schema. It depends on your schema. We're not quite there yet. This is first and foremost pure YAML. So, this is an integer, we agree, for the time being.
This is a float, right? This is a string. And this is still a string. You may have noticed I omitted a few things. What about three dots? It depends on your schema. No? Yes, it does.
The regular expression for float says three dots is a float. What is point one then? It's also a float. So, if you want to make sure this is really always a string,
you may be tempted to do something like this. Quoted strings are a thing, also in YAML. Big surprise. Single-quoted also. So, the professional may do something like this. This is a tag. The two exclamation marks mean it's a global tag.
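The remark about the float regular expression can be tried out in a few lines of Python. The pattern below is the base-10 alternative as I recall it from the YAML 1.1 float type description, so treat it as an assumption rather than a quotation. Real parsers often ship stricter patterns of their own, so their verdicts may differ from what this sketch prints.

```python
import re

# Base-10 float pattern, recalled from the YAML 1.1 float type description.
# Assumption: reproduced from memory, not copied from the specification.
YAML11_FLOAT_BASE10 = re.compile(
    r"[-+]?([0-9][0-9_]*)?\.[0-9.]*([eE][-+][0-9]+)?"
)

def looks_like_float(scalar: str) -> bool:
    """Return True if the whole plain scalar matches the pattern."""
    return YAML11_FLOAT_BASE10.fullmatch(scalar) is not None

# Sample scalars chosen for illustration.
for scalar in ["1.2", ".1", "...", "1", "abc"]:
    print(repr(scalar), looks_like_float(scalar))
```

Quoting the value, or tagging it explicitly as a string, takes the scalar out of this kind of pattern-based resolution altogether.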
So, it has global meaning. Oh, I'm running out of time. It's a string. You have my word for it. The true professional who lost the plot may do something like this. Tags are identified by URNs. Also, there's a namespacing mechanism in YAML.
That's the part where you go, yay, namespaces. So, advanced features. This is something you do not commonly see. Most users of YAML are probably unaware this exists, but you have some tools to reduce duplication and
redundancy within your structures. One is anchors. Okay, they're basically markers. And the other is aliases, which invoke those anchors. Pretty nifty. Also, these give way to an attack
known as a billion laughs. So, basically you can set an anchor on an array or list of aliases which themselves contain, well, a lot of anchors.
So, this allows for a very compact representation of a very complex data structure that quickly expands to plenty of nodes. So, if you happen to... I'm really running out of time. If you happen to consume YAML from untrusted sources, this is something you should know.
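The growth behind this kind of attack can be modelled without any YAML library. In the sketch below, Python lists that share references stand in for anchors and aliases. The depth and fan-out of nine are arbitrary illustrative values, not numbers taken from the talk.

```python
def build_shared(levels: int, fanout: int):
    """Build a nested structure in which every level holds `fanout`
    references to the same object one level below (anchor plus aliases)."""
    node = "lol"
    for _ in range(levels):
        node = [node] * fanout  # `fanout` references, one shared object
    return node

def count_leaves(node, cache=None) -> int:
    """Count the leaves a naive full expansion would have to visit."""
    if cache is None:
        cache = {}
    if not isinstance(node, list):
        return 1
    key = id(node)
    if key not in cache:
        cache[key] = sum(count_leaves(child, cache) for child in node)
    return cache[key]

compact = build_shared(levels=9, fanout=9)
print(count_leaves(compact))
```

Only nine small list objects exist in memory here, yet a consumer that copies every alias on expansion would have to materialise 9 to the power of 9 leaves. That asymmetry is why limits on alias expansion matter for untrusted input.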
The merge operator. This is another really nifty tool. It's only valid in 1.1 of YAML. It got immediately deprecated in 1.2. And also, it's a data type. It's there to basically merge mappings into other mappings.
Great stuff. So, tales from the trenches. These are examples that really happened. Do you see... I should explain this. This is part of a GitLab setup, a script. This is expected to be a sequence of strings
to be executed on the shell. That's not what it is. Who sees the issue? No? Oh, very good, yes. That's not a string. It's a mapping.
Because of the single-pass design of YAML, the algorithm is very, very greedy. So, it sees that colon there and says, oh, great, this is a mapping. And it completely ignores the quotes. So, how do you fix this? There's a bunch of ways.
Yes, I think the third one is my favourite. The fourth is really unsafe, because, once again, raw binary data. Having said that, this is another favourite. Again, GitLab CI.
We have a mapping. We try to set some variables for GitLab to expand. What is the content of bar? I must remind you, mappings don't have order. Oh, who knows?
It's going to be empty. It's going to be empty. The thing is, the mapping doesn't have an order. The YAML implementation in GitLab has other ideas. So, it takes that mapping and applies an order to it. So, bar goes on top because of lexicographic order.
And then there is a single round of interpolation. And foo at that point is empty as a variable. So, how do you fix that? One way out is to rename your variables, or this.
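The behaviour just described can be imitated with a toy model. This is emphatically not GitLab's implementation; it only replays the two steps named in the talk, namely sorting the keys lexicographically and doing a single round of interpolation. The function name `expand_once` and the variable values are hypothetical.

```python
import re

def expand_once(variables: dict[str, str]) -> dict[str, str]:
    """Toy model: visit keys in lexicographic order and substitute $name
    using only the variables that have already been processed."""
    resolved: dict[str, str] = {}
    for name in sorted(variables):
        resolved[name] = re.sub(
            r"\$(\w+)",
            lambda match: resolved.get(match.group(1), ""),
            variables[name],
        )
    return resolved

# bar sorts before foo, so $foo is still unknown when bar is expanded.
broken = expand_once({"foo": "hello", "bar": "$foo"})

# Renaming so that the referenced variable sorts first avoids the problem.
renamed = expand_once({"a_foo": "hello", "bar": "$a_foo"})

print(broken)
print(renamed)
```

In this model `bar` ends up empty in the first case and holds `hello` in the second, which mirrors the rename workaround mentioned above.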
Thank you, thank you. This never happened to me, but it's been too good to pass up. What do we have here? What does languages contain? It's a sequence of...
Tell your schema. We will need to drink. It's one string and one boolean. No, it's supposed to represent Norwegian here.
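Why an unquoted country or language code can turn into a boolean is easy to see from the YAML 1.1 boolean pattern, sketched here with Python's `re` module. The list of language codes is a made-up stand-in for the slide, and individual parsers may implement only part of this pattern.

```python
import re

# Plain scalars that the YAML 1.1 bool type resolves to a boolean.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)

def is_yaml11_bool(scalar: str) -> bool:
    """Return True if the whole plain scalar matches the boolean pattern."""
    return YAML11_BOOL.fullmatch(scalar) is not None

# Hypothetical language codes, written unquoted in the YAML source.
languages = ["en", "no"]
for code in languages:
    print(code, is_yaml11_bool(code))
```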
It is accepted as a boolean. So, you need to tag that or quote it, however you wish. So, my observations. Because of the edge cases and the hidden complexity, there is a huge disparity in the features that various tools actually support.
They also show different behavior. It's one hot mess. I can't put it in other words. Also, writing YAML is admittedly really a pleasure. It's easy to type, but you can never let your guard down.
YAML will try to do a lot of the legwork for you, being very accommodating in sometimes the worst way possible. Some proposals I have. Because the versions of YAML really do behave differently, you should start to tag your streams accordingly.
You should see that the tools you use for consuming YAML are properly configured. Things like language-specific extensions: parts of a YAML stream could be evaluated in the process, read: executed. That's a bit scary.
Yes, right. As most YAML is relatively simple, the complexity mostly comes from it being deeply nested, so that you can't properly edit it. StrictYAML may be a solution. It's a strict subset of YAML, with a lot of the ambiguity removed.
Way easier, safer. Way easier. Tooling support is so-so. So, I teased a question with this talk. The question could be: YAML is exactly then the answer if all you wanted was JSON with comments.
There are some other nice tools as well, but that's pretty much it. So, this concludes my talk. Thank you very much. You've been terrific. Are there any questions? Please repeat the questions. We've got four seconds.
Is JSON any better? It depends on what your use case is, really. Repeat the question, please. Oh yes, the question was, is JSON any better?
So yes, thank you very much again.