Is YAML the Answer?
Formal Metadata
Number of Parts | 542
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/61550 (DOI)
FOSDEM 2023, 206 / 542
Transcript: English (auto-generated)
00:05
This entire talk is inspired by a single remark by a former co-worker of mine, who just casually dropped the line that YAML was so simple that nobody could ever attain mastership
00:21
in it. So, a question towards the audience, also slight audience participation, sorry. Who would tend to agree? That is almost nobody. That's a shame, because you would be in good company. This is in the goals section of every spec of YAML, there it is.
00:46
So let's get a bit into detail. At its very core, YAML exists to provide a printable text representation of structured data. And in that regard, it is a rival to things like JSON, XML and other formats.
01:02
It's been around for quite a while; we're almost looking at 20 years of YAML now. It is somewhat interwoven with JSON: since version 1.2, all of JSON is actually also valid YAML. That introduces an interesting relation, because now, since I think 2018, JSON is
01:29
a strict subset of JavaScript, thus there's an intersection of YAML and JavaScript now that is precisely JSON. Let's not get into the argument of whether that's good or not; just, you know, it's a thing.
01:47
If you write a lot of YAML, most of the examples, most of the real-life specimens, if you will, will lead you to believe that there are no real types
02:00
in YAML. Actually, the opposite is true. YAML is heavily typed; almost everything in a YAML document has a type. Here's a selection, nothing surprising. All the types you see here are also present in JSON, and that also inspires
02:21
an interesting question. Let's say you have a YAML document. Could you just change the syntax to JSON and have a valid representation? Would that work? Oh, you're too good. I see I attracted the wrong audience, because, yeah, no, that doesn't work.
02:45
Actually, it falls apart with the map type. The map type in YAML is really, really wide. It does allow for such things as composite keys for its entries. If you're really interested, they're introduced by, what's the token, question mark space.
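To see why a plain syntax swap is not enough, here is a small Python sketch. Python is used purely for illustration, and the dictionary contents are made up. A tuple key stands in for a YAML composite key, and the standard `json` module refuses to serialize it, because JSON object keys must be strings.

```python
import json

# A tuple key stands in for a YAML composite (non-scalar) mapping key.
# The coordinates and the city name are illustrative values only.
composite = {(52.52, 13.40): "Berlin"}


def to_json_or_error(data):
    """Return the JSON text, or an error message if JSON cannot express the data."""
    try:
        return json.dumps(data)
    except TypeError as exc:
        return f"not representable in JSON: {exc}"


result = to_json_or_error(composite)
print(result)
```

A mapping with plain string keys passes through the same helper unchanged, which is the subset that YAML and JSON share.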
03:06
So that's a thing. There are also not-so-basic types. omap is an ordered map; the regular mapping is not ordered. set is somewhat, yeah, special.
03:23
This is not a complete list, by the way. There's a way to have YAML inside YAML, that's nice. Then there's a type specifically for binary data. This one is really, really useful, provided it actually works.
03:42
Try it. You know, the problem really is, in JSON, XML, also in YAML, that you can't have certain byte values. So the first 16, I believe, code points are off limits; they are control characters.
04:00
They can't be part of the stream. With this type, you can just base64 everything, and once your YAML is being parsed, that is expanded into the raw binary. Pretty neat, eh? First example.
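The round trip behind the binary type can be sketched in a few lines of Python. This only shows the base64 step with the standard library; it does not involve a YAML parser, and the sample bytes are made up.

```python
import base64

# Raw bytes that include control characters, which cannot appear
# literally in a YAML, JSON, or XML text stream.
raw = bytes([0x00, 0x01, 0x02, 0x1B]) + b"payload"

# What would be written into the document: plain printable ASCII.
encoded = base64.b64encode(raw).decode("ascii")

# What a parser that supports the binary type would hand back.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == raw)
```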
04:20
This is not minimal YAML, but it does help to illustrate a few points. I suppose a lot of you have a lot to do with, say, OpenShift, Kubernetes, and the like. You've seen the three hyphens a lot, haven't you?
04:41
Okay, who knows what that does? Come again? Not quite, no. No, it's not the beginning of the document. What this is, is a document separator. So what you see here is not actually a YAML document, it's a YAML stream.
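The idea of a stream holding several documents can be sketched as follows. This is a deliberately naive splitter, not a YAML parser: it assumes every separator sits alone on its line and never occurs inside a block scalar, and it silently drops empty documents. The function name and the sample stream are made up for illustration.

```python
def split_stream(text):
    """Naively split a YAML stream into document chunks at lines that are exactly '---'.

    Assumption: separators sit alone on their line and never occur inside
    block scalars. A real parser handles many more cases.
    """
    documents, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            if current:
                documents.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        documents.append("\n".join(current))
    return documents


stream = "---\nname: first\n---\nname: second\n"  # made-up example stream
chunks = split_stream(stream)
print(len(chunks))
```

Leaving out the leading `---` gives the same two chunks, which matches the point that the first marker is implied when it is missing.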
05:03
YAML is meant to be streamed, possibly. Don't do that unless you have solid message framing, because truncated YAML tends to also be valid YAML, so think twice before you do that. The thing is, most of the tools that you have with YAML will assume there's only
05:25
one document ever. You need to do some convincing to get all the documents out of a stream. Oh, by the way, do you know what happens if you omit those three hyphens?
05:41
You do? Okay, pretty much everything you see here, if it's missing, it's implied. Great, eh? It's a folder version number, that's also a bit of the homework for you. There's a chance this is going to break the tools because they do not understand version
06:03
1.2. The majority of the tools are still stuck at 1.1. So, let's get into the title of it. This is something you see quite commonly. What do we have here? It's a mapping with a single entry, which has the key variables.
06:28
Inside is another mapping for the key app version. We've got something in it. Now, there's no indication as to what data type that is. It's an integer, right?
06:41
You agree? It depends on your schema. It depends on your schema. We're not quite there yet. This is foremost pure YAML. So, this is an integer, we agree, for the time being.
07:01
This is a float, right? This is a string. And this is still a string. You may have noticed I omitted a few things. What's three points? It depends on your schema. No? Yes, it does.
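The guessing game above can be sketched in Python. The patterns are modelled on the tag-resolution rules of the YAML 1.2 core schema; tools that follow version 1.1 use different, looser patterns. The sample values are made up, since the slide is not part of the transcript, and `resolve` is a hypothetical helper name.

```python
import re

# Patterns modelled on the YAML 1.2 core schema for plain (unquoted) scalars,
# checked in this order; anything that matches none of them is a string.
CORE_SCHEMA = [
    ("null", re.compile(r"null|Null|NULL|~|")),
    ("bool", re.compile(r"true|True|TRUE|false|False|FALSE")),
    ("int", re.compile(r"[-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+")),
    ("float", re.compile(
        r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?"
        r"|[-+]?\.(inf|Inf|INF)|\.(nan|NaN|NAN)"
    )),
]


def resolve(scalar):
    """Return the type name an untagged plain scalar would resolve to."""
    for name, pattern in CORE_SCHEMA:
        if pattern.fullmatch(scalar):
            return name
    return "str"


# Illustrative values in the spirit of an "app version" entry.
samples = ["1", "1.2", "1.2.3", "3.", ".1"]
resolved = [resolve(value) for value in samples]
print(list(zip(samples, resolved)))
```

Quoting or tagging a value bypasses this pattern matching entirely, which is why those are the usual ways to force a string.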
07:22
The regular expression for float says three points is a float. What is point one then? It's also a float. So, if you want to make sure this is really always a string,
07:40
you may be tempted to do something like this. Quoted strings are a thing, also in YAML. Big surprise. Single-quoted also. So, the professional may do something like this. This is a tag. The two exclamation marks mean it's a global tag.
08:02
So, there's global meaning. Oh, I'm running out of time. It's a string. You have my word for it. The true professional who lost the plot may do something like this. Tags are identified by URNs. Also, there's a namespacing mechanism in YAML.
08:24
That's the part where you go, yay, namespaces. So, advanced features. This is something you do not commonly see. Most users of YAML are probably unaware this exists, but you have some tools to reduce duplication,
08:44
redundancy within your structures. One is anchors. Okay, they're basically markers. And the other is aliases, which invoke those anchors. Pretty nifty. Also, these give way to an attack
09:03
known as billion laughs. So, basically you can set an anchor on an array or list of aliases which themselves contain, well, a lot of anchors.
09:21
So, this allows for a very compact representation of a very complex data structure that quickly expands into plenty of nodes. So, if you happen to... I'm really running out of time. If you happen to consume YAML from untrusted sources, this is something you should know.
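The blow-up can be modelled without any YAML at all. In this Python sketch, a list that holds several references to the same object plays the role of a list of aliases to one anchor. The depth and fan-out of nine are illustrative choices, and both function names are made up.

```python
def build(depth, fanout=9):
    """Build a nested structure where each level holds `fanout` references
    to the level below, the way a list of aliases points at one anchor."""
    node = "lol"
    for _ in range(depth):
        node = [node] * fanout  # stores references, not copies
    return node


def expanded_size(node, cache=None):
    """Count the leaves a consumer would see after fully expanding all aliases."""
    cache = {} if cache is None else cache
    if not isinstance(node, list):
        return 1
    key = id(node)
    if key not in cache:
        cache[key] = sum(expanded_size(child, cache) for child in node)
    return cache[key]


bomb = build(depth=9)
print(expanded_size(bomb))
```

Only nine short lists are stored, yet a consumer that copies every alias would have to materialize 9 to the power of 9 leaves. The counter above stays fast only because it caches shared nodes instead of expanding them.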
09:41
Merge operator. This is another really nifty tool. It's only valid in 1.1 of YAML. It got immediately deprecated in 1.2. And also, it's a data type. It's there to basically merge mappings into other mappings.
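What merging mappings means can be sketched with plain Python dictionaries. This models the precedence rules of the YAML 1.1 merge key as I understand them: keys written directly in a mapping win over merged ones, and earlier merge sources win over later ones. `apply_merge` and the sample data are hypothetical.

```python
def apply_merge(own_entries, merged_sources):
    """Model YAML 1.1 merge-key semantics with plain dicts.

    own_entries: keys written directly in the mapping (they always win).
    merged_sources: the mappings referenced by the merge key, in order;
    earlier sources win over later ones.
    """
    result = {}
    for source in reversed(merged_sources):
        result.update(source)
    result.update(own_entries)
    return result


defaults = {"image": "alpine", "retries": 2}  # hypothetical anchored mapping
job = apply_merge({"retries": 5}, [defaults])
print(job)
```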
10:04
Great stuff. So, tales from the trenches. These are examples that really happened. Do you see... I should explain this. This is part of a GitLab setup, a script. This is expected to be a sequence of strings
10:21
to be executed on the shell. That's not what it is. Who sees the issue? No? Oh, very good, yes. That's not a string. It's a mapping.
10:40
Because of the single-pass design of YAML, the algorithm is very, very greedy. So, it sees that colon there and says, oh, great, this is a mapping. And completely ignores the quotations. So, how do you fix this? There's a bunch of ways.
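The trap described above can be approximated with a small lint check in Python. This is a simplified heuristic and not a YAML parser: it only flags sequence items that do not begin with a quote or block-scalar indicator and that contain a colon followed by a space. The script lines are made up, since the slide is not part of the transcript.

```python
def parses_as_mapping(item):
    """Heuristic: would this sequence item be read as a mapping rather than a string?

    Assumption: a simplified check. Items starting with a quote or a
    block-scalar indicator are treated as safe; anything else containing
    a colon followed by a space (or ending in a colon) is flagged.
    """
    text = item.strip()
    if text[:1] in ("'", '"', "|", ">"):
        return False
    return ": " in text or text.endswith(":")


script = [
    'echo "status: ok"',      # quotes start too late to protect the colon
    '\'echo "status: ok"\'',  # the whole item is quoted, so it stays a string
    "make test",
]
flags = [parses_as_mapping(line) for line in script]
print(flags)
```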
11:06
Yes, I think the third one is my favourite. The fourth is really unsafe, because, once again, raw binary data. Having said that, this is another favourite. Again, GitLab CI.
11:21
We have a mapping. We try to set some variables for GitLab to expand. What is the content of bar? I must remind you, mappings don't have order. Oh, who knows?
11:43
It's going to be empty. It's going to be empty. The thing is, the mapping doesn't have an order. The YAML implementation in GitLab has other ideas. So, it takes that mapping and applies an order on it. So, bar goes top because lexicographic order.
12:02
And then there is a single round of interpolation. And foo at that point is empty as a variable. So, how do you fix that? One way out is to rename your variables, or this.
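The behavior described here can be modelled in a few lines of Python. This is a model of the talk's description, not GitLab's actual implementation: keys are processed in lexicographic order and each value gets a single round of `$NAME` interpolation against what has been expanded so far. The variable names and values are hypothetical.

```python
import re


def expand_variables(variables):
    """Expand variables in sorted key order with one interpolation round each.

    Unknown or not-yet-expanded names are replaced by the empty string.
    """
    expanded = {}
    for name in sorted(variables):
        expanded[name] = re.sub(
            r"\$(\w+)",
            lambda match: expanded.get(match.group(1), ""),
            variables[name],
        )
    return expanded


# BAR sorts before FOO, so FOO is still unknown when BAR is expanded.
result = expand_variables({"FOO": "hello", "BAR": "$FOO"})

# Renaming so that the referenced variable sorts first avoids the problem.
renamed = expand_variables({"A_FOO": "hello", "BAR": "$A_FOO"})
print(result, renamed)
```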
12:28
Thank you, thank you. This never happened to me, but it's too good to pass up. What do we have here? What does languages contain? It's a sequence of...
12:43
Tell your schema. We will need to drink. It's one string and one boolean. No, it's supposed to represent Norwegian here.
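The surprise comes from how wide the boolean pattern is in YAML 1.1 compared with the 1.2 core schema. The Python sketch below reproduces both patterns as I recall them from the respective specifications and applies them to a made-up list of language codes. Individual tools may implement only part of the 1.1 list.

```python
import re

# Boolean pattern of the YAML 1.1 bool type versus the YAML 1.2 core schema.
BOOL_1_1 = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
BOOL_1_2 = re.compile(r"true|True|TRUE|false|False|FALSE")

languages = ["en", "no"]  # hypothetical list of language codes

as_1_1 = [bool(BOOL_1_1.fullmatch(code)) for code in languages]
as_1_2 = [bool(BOOL_1_2.fullmatch(code)) for code in languages]
print(as_1_1, as_1_2)
```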
13:00
It is accepted as a boolean. So, you need to tag that or quote it, however you wish. So, my observations. Because of the edge cases and the hidden complexity,
13:21
They also show different behavior. It's one hot mess. I can't put it in other words. Also, if you're writing YAML, it is admittedly really a pleasure. It's easy to type, but you can never let your guard down.
13:40
YAML will try to do a lot of legwork for you, being very accommodating in sometimes the worst way possible. Some proposals I have. Because the versions of YAML really do behave differently, you should start to tag your streams accordingly.
14:02
You should see that the tools you use for consuming YAML are properly configured. Things like language-specific extensions: part of the YAML stream could be evaluated in the process, read: executed. That's a bit scary.
14:22
Yes, right. As most of YAML is relatively simple, the complexity is mostly because it's deeply nested and you can't properly edit it. StrictYAML may be a solution. It's a strict subset of YAML, with a lot of the ambiguity removed.
14:44
Way easier, safer. Way easier. Tooling support is so-so. So, I teased a question with this talk. The answer could be: YAML is exactly then the answer if all you wanted was JSON with comments.
15:03
There are some other nicer tools as well, but that's pretty much it. So, this concludes my talk. Thank you very much. You've been terrific. Are there any questions? Please repeat the questions. We've got four seconds.
15:20
Is JSON any better? It depends what your use case is, really. Repeat the question, please. Oh yes, the question was, was JSON any better?
15:42
So yes, thank you very much again.