Hijacking Ruby Syntax in Ruby
Formal Metadata
Number of Parts | 66
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/46591 (DOI)
Ruby Conference 2018, 45 / 66
Transcript: English (auto-generated)
00:02
Hello everyone! I think that RubyKaigi is one of the most amazing programmer conferences,
00:25
and we will show the fun of RubyKaigi and the power of RubyKaigi. We will talk about hijacking Ruby syntax in Ruby. Okay. First, let me introduce myself. My name is Tomohiro
00:45
Hashidate. I come from Japan. But my friends call me Joker or Joker-san, because my Twitter ID and GitHub ID are joker1007. Please remember me as Joker. I'm working at Repro as CTO.
01:09
I usually work as a data engineer, infra engineer, web application engineer, et cetera. Because of that, I'm familiar with Ruby, Ruby on Rails, Fluentd, and others,
01:25
et cetera. This RubyConf is my first RubyConf experience, and it's my first trip to the United States. And this talk is my first speech ever, and of course, this talk is my
01:44
first English speech. Because of that, I'm so nervous now. But I'm very happy to talk at RubyConf. It's my great pleasure. Next is the introduction of my talking partner, Moris-san.
02:01
Hello everyone again. I am Satoshi Tagomori, also known as tagomoris. Many Ruby friends call me Moris, so please call me Moris. I am an open-source software developer and maintainer of some projects, including Fluentd, MessagePack-Ruby,
02:23
Norikra, Woothee, and many others. I'm working at a processor company, Arm, right now, as a software developer in the IoT services division, providing an enterprise data analytics platform. So Joker-san and I are working in different companies,
02:49
but the reason we are talking together here is Asakusa.rb. Asakusa.rb is a very great regional
03:01
Ruby community in Tokyo, and we are always talking together about interesting programming techniques and writing a lot of interesting code, including metaprogramming. So in this talk, we will introduce some Ruby features,
03:25
Ruby standard features, and then we will talk about our own code, which uses these metaprogramming techniques. First, I will talk about Binding. A binding is a kind of context object, which includes a set of local variables, instance variables, and some others.
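As a minimal sketch of what "context object" means here (this is not the code from the slides; the `Vault` class, its `@secret` value, and the variable names are hypothetical), a binding captured inside a method exposes that method's local variables and its receiver:

```ruby
# Hypothetical class used only for illustration.
class Vault
  def initialize
    @secret = "s3cret"
  end

  # Returns a Binding that captures this method's local variables and self.
  def get_binding
    n = 42
    binding
  end
end

b = Vault.new.get_binding

# Local variables are read with local_variable_get.
n_value = b.local_variable_get(:n)

# Instance variables belong to the receiver (self) captured by the binding.
secret = b.receiver.instance_variable_get(:@secret)

# local_variables lists the local variable names visible in the binding.
names = b.local_variables

puts n_value
puts secret
puts names.inspect
```

Note that `local_variable_get` only reads local variables; the instance variable is reached through `Binding#receiver`.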
03:47
In many cases, bindings are used for template engines and the like. This sample code shows how to get a binding object from the context of
04:06
the method get_binding, and then get the values of variables, like the local variable n or the instance variable @secret, using methods such as local_variable_get. And the Binding
04:22
class has some other methods, like receiver, eval, local_variables, local_variable_get, local_variable_defined?, and local_variable_set. So this code shows what we can do using Binding. In this code,
04:43
it gets a binding object from a binding proxy method with variables a, b, and c, and dumps the contents of the binding object. Also, this code calls the local_variable_get
05:01
method to get the values of variables a, b, and c, and dumps them. This result shows that the binding object is recreated on each binding call. So that means,
05:26
if we call local_variable_set on a binding object acquired from a different context, we can see the newly created local variable d just after local_variable_set,
05:46
but in the original context, the local variable d has disappeared. That means we cannot create a new local variable in a different context. But interestingly, this code is calling
06:04
local_variable_set on both local variables d and a with the integer value 20. The result shows that we can see both local variables d and a with the integer 20,
06:20
and in the original context, we can also see the local variable a with the integer 20. This means Binding#local_variable_set either adds a variable only in that binding instance, or overwrites the value of an existing variable in the original context.
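The two behaviors of `local_variable_set` described above can be checked with a small sketch. The method name `demo` and the result keys are assumptions made for this illustration, not the speakers' slide code:

```ruby
def demo
  a = 10
  b1 = binding

  b1.local_variable_set(:d, 20) # new name: the variable lives only inside b1
  b1.local_variable_set(:a, 20) # existing name: overwrites the method's real local

  {
    d_in_b1: b1.local_variable_get(:d),
    # A fresh Binding object from the same method does not know about d.
    d_in_fresh_binding: binding.local_variable_defined?(:d),
    a_in_method: a
  }
end

result = demo
p result
```

So a binding can overwrite variables that already exist in the original context, but it cannot inject new ones into it.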
06:47
We can do that. Next is TracePoint. This sample code shows the usage of TracePoint. TracePoint is to trace
07:00
events in the Ruby virtual machine and call hooks on these events. We can hook many events, like line for lines, class and end for the beginning and the end of a class definition, call and return to hook method calls,
07:21
that is, Ruby method calls, c_call and c_return for methods implemented in the C language, b_call and b_return for blocks, and thread_begin, thread_end, and fiber_switch. The Ruby documentation says that we can use TracePoint to gather information specifically for exceptions,
07:41
but of course we can use various other events, and I think that the document is completely wrong. So look at this code. This code is very simple: a method definition that defines ea with an argument a, dumps the content of the local variable a, and then returns the string
08:09
a in lower case. Then it calls the ea method with the integer 100 and dumps the result of that method. The result is, surprisingly, that the value of the local variable a is the string 100,
08:33
and the returned value is an upcased A. What happened? In fact, above this code, I
08:45
enabled a TracePoint that sets the local variable to its stringified value and calls the upcase method on the returned value. So this is what TracePoint can do.
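The trick described above can be approximated as follows. This is a sketch under assumptions, not the code shown in the talk: the body of `ea` is changed so that one return value reveals both effects, and the return value is altered with the in-place `upcase!` because a hook cannot swap the returned object itself.

```ruby
# The method embeds its argument in the returned String so both effects
# of the hook are visible in one value.
def ea(a)
  "value: #{a.inspect}"
end

trace = TracePoint.new(:call, :return) do |tp|
  next unless tp.method_id == :ea

  case tp.event
  when :call
    # Overwrite the argument with its stringified value via the frame's binding.
    b = tp.binding
    b.local_variable_set(:a, b.local_variable_get(:a).to_s)
  when :return
    # Mutate the returned String in place.
    tp.return_value.upcase!
  end
end

plain = ea(100)

hijacked = nil
trace.enable { hijacked = ea(100) }

puts plain
puts hijacked
```

Using `enable` with a block keeps the hook active only while the block runs, which limits the blast radius of this kind of rewrite.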
09:02
TracePoint has many methods to control its behavior or to get the details of events, such as what the event is or where the event occurs. We have methods to get the details of events, like method_id, callee_id, raised_exception, or return_value,
09:24
and TracePoint also has a binding method. That means we can get a binding object at any point we can hook using TracePoint. So that means we can use TracePoint to get information
09:42
about the Ruby virtual machine, or we can overwrite any local variable at any time in Ruby code. So you should not require my code, because I can break any kind of Ruby code, your Ruby
10:01
application, your Ruby library, anytime, completely. Yes, I can. Trust me. Anyway, I will talk about two more features. The first is refinements.
10:25
Refinements provide a way to extend a class locally. It's useful for monkey patching safely. This is a sample of a safe monkey patch.
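A locally scoped monkey patch of this kind might look like the sketch below. The module names `Shouting` and `Greeter` and the `shout` method are hypothetical; `using` is called inside a module body here so that the refinement is active only until the end of that body.

```ruby
# A refinement that adds String#shout without touching String globally.
module Shouting
  refine String do
    def shout
      upcase + "!"
    end
  end
end

module Greeter
  using Shouting # active only until the end of this module body

  def self.greet(name)
    "hello, #{name}".shout
  end
end

greeting = Greeter.greet("ruby")

# Outside the scope where the refinement is activated, String#shout does not exist.
leaked = begin
  "hello".shout
  true
rescue NoMethodError
  false
end

puts greeting
puts leaked
```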
10:40
I'm sad because most Ruby programmers don't use this feature. In fact, refinements are a difficult and limited feature. I heard some Ruby committers say, "I actually want to remove this feature." I'm so sad. Refinements are pretty cool.
11:06
I like this feature. And there is another use case: super private methods. This refined method can be used only in this file, absolutely. It's useful for refactoring,
11:22
for example, method extraction. Second, I want to talk about method hooks. Does anybody know method hooks? If you have used them before, please raise your hands.
11:41
Thank you. So I think that it is not a popular feature. These hook methods are called when a method is defined, removed, or undefined. For example, this sample code shows how method_added behaves.
12:07
method_added is called from the line of def foo, in this case line number seven, and receives the defined method name as a symbol. So method hooks provide a way to implement
12:28
method modifiers like public, private, and protected. By the way, I made three method modifiers. First, the final modifier. final forbids method overriding. Second, the override modifier.
12:49
override enforces that the target method has a super method. Third, the abstract modifier. abstract enforces that the target method is overridden.
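To make the hook mechanism concrete, here is a minimal sketch (not taken from the speakers' gems) of `method_added` together with a modifier-style class method. The module `Announce` and the `noted` modifier are hypothetical names; the modifier works because `def` evaluates to the method name as a symbol.

```ruby
module Announce
  def added_methods
    @added_methods ||= []
  end

  # Ruby calls this hook each time an instance method is defined on the class.
  def method_added(name)
    added_methods << name
    super
  end

  # Hypothetical modifier: receives the Symbol returned by `def`.
  def noted(name)
    (@noted ||= []) << name
    name
  end

  def noted_methods
    @noted.to_a
  end
end

class Sample
  extend Announce

  def foo; end       # triggers method_added(:foo)

  noted def bar; end # triggers method_added(:bar), then noted(:bar)
end

p Sample.added_methods
p Sample.noted_methods
```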
13:00
These modifiers are similar to the ones in Java. If you know Java, please imagine those modifiers and you can understand easily. These method modifiers work when Ruby defines a class. It is at runtime. But in most cases,
13:25
these checks run before the main application logic. In other words, there is no overhead. Here is sample code for the finalist gem. The finalist gem provides a final modifier.
13:42
When a method is defined, or when the class includes or extends a module, if a superclass has a final method and the method is overridden, an exception is raised. Like this. This is a sample of the overrider gem. When the class definition is finished,
14:09
if the modified method has no super method, an exception is raised. This is a sample of the abstriker gem. When the class definition is finished, if the modified method is not overridden, an exception is raised.
14:26
How did I implement these method modifiers? I used many hook methods: included, extended, method_added, and TracePoint. And I also used Ripper.
14:41
In other words, I used the power of much black magic in Ruby. Here is a use case of method_added in finalist. method_added calls the main method verification logic.
15:01
Method hooks are useful for implementing method modifiers, but I cannot implement the final modifier with method hooks alone. Why? Ruby has many ways of defining methods: def, define_method, include, extend, and prepend.
15:25
Each case calls dedicated hooks. Why? Because include changes only the method lookup chain. When a Ruby program includes a module, the Ruby interpreter inserts the module into the method lookup hierarchy.
15:47
It's different from adding a method. It's important. For your information, the ancestors method of a class displays the class and module hierarchy.
16:02
For this reason, the finalist gem uses many hooks in order to cover the various cases: method_added to detect overriding by a subclass, singleton_method_added to detect overriding in a subclass's singleton class, and included and extended likewise.
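The following sketch illustrates why `method_added` alone is not enough. It is a deliberately simplified `final` check, not the finalist gem's implementation; `SimpleFinal`, `FinalViolation`, and the sample classes are hypothetical names.

```ruby
class FinalViolation < StandardError; end

module SimpleFinal
  def final_methods
    @final_methods ||= []
  end

  # Modifier: remember the method name as final for this class.
  def final(name)
    final_methods << name
    name
  end

  # Runs for the extended class and its subclasses whenever `def` is used.
  def method_added(name)
    super
    ancestors.drop(1).each do |mod|
      next unless mod.respond_to?(:final_methods)
      raise FinalViolation, "#{mod}##{name} is final" if mod.final_methods.include?(name)
    end
  end
end

class Base
  extend SimpleFinal

  final def locked
    :original
  end
end

# Overriding with `def` in a subclass is caught by method_added.
violation = begin
  class Child < Base
    def locked
      :overridden
    end
  end
  nil
rescue FinalViolation => e
  e
end

# Overriding through `include` only changes the lookup chain:
# method_added is never called, so this sketch cannot detect it.
module Sneaky
  def locked
    :overridden_by_module
  end
end

class Leaky < Base
  include Sneaky
end

p violation.class
p Leaky.new.locked
p Leaky.ancestors.first(3)
```

Covering the `include` case requires additional hooks such as `included`, which is the reason given above for using several hooks together.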
16:24
I want to talk about TracePoint too. The overrider gem and the abstriker gem use TracePoint. In these gems, the inherited hook or the included hook starts TracePoint tracing.
16:43
In the TracePoint hook, I check the class and module hierarchy to detect method existence. Module#instance_method is very useful in such a situation, because the method object it returns has the super_method method.
17:03
And super_method provides a way to trace the method lookup chain. I'm using the word "method" too many times, and maybe you are confused. Why did I use TracePoint? Ruby defines methods dynamically. It's determined at runtime.
17:26
Because of that, I must wait until the end of the class definition to know whether a method is present or not. So override and abstract cannot detect a violation at the moment they are called.
17:44
In Ruby, the only way to detect such a violation is TracePoint. And I have an advanced use case of TracePoint in the abstriker gem and the overrider gem. These gems must detect the end of a specific class definition in the TracePoint hook.
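A rough sketch of "wait for the end of the class definition, then check" is shown below. It is an assumption-laden illustration rather than the overrider gem's code: `SimpleOverride` and `MissingSuperMethod` are hypothetical names, the trace is started from the `extended` hook instead of `inherited` or `included`, and only the `class ... end` form is handled.

```ruby
class MissingSuperMethod < StandardError; end

module SimpleOverride
  # Modifier: only records the name; the check is postponed.
  def override(name)
    (@override_targets ||= []) << name
    name
  end

  def self.extended(klass)
    trace = TracePoint.new(:end) do |tp|
      # Ignore the end of nested or unrelated class/module definitions.
      next unless tp.self.equal?(klass)

      trace.disable
      targets = klass.instance_variable_get(:@override_targets) || []
      targets.each do |name|
        next if klass.instance_method(name).super_method
        raise MissingSuperMethod, "#{klass}##{name} overrides nothing"
      end
    end
    trace.enable
  end
end

class Animal
  def speak
    "..."
  end
end

class Dog < Animal
  extend SimpleOverride

  override def speak
    "Woof"
  end
end # the :end event fires here and the check passes

error = begin
  class Rock
    extend SimpleOverride

    override def speak
      "?"
    end
  end
  nil
rescue MissingSuperMethod => e
  e
end

puts Dog.new.speak
p error
```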
18:08
In such a situation, there is one point that needs attention. Ruby has two syntaxes to define a class.
18:20
The class ... end statement, and Class.new with a code block. Class.new with a block is just a method call. It's important. In other words, the end event of TracePoint cannot detect Class.new with a code block.
18:45
Because of this, I used the c_return event and the return_value property. It may cause trouble. I was in heavy trouble.
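The difference between the two forms can be observed with a small sketch (variable and class names are hypothetical). Whether the `c_return` branch records anything depends on `Class.new` being a C-implemented method in the running Ruby, so that part is collected but treated as an assumption.

```ruby
ended = []       # classes whose definition finished with an :end event
made_by_new = [] # classes returned from a C-implemented `new` call

trace = TracePoint.new(:end, :c_return) do |tp|
  if tp.event == :end
    ended << tp.self
  elsif tp.method_id == :new && Class === tp.return_value
    # Assumption: Class.new is implemented in C on this Ruby,
    # otherwise no c_return event is emitted for it.
    made_by_new << tp.return_value
  end
end

trace.enable

class Named
end

Anonymous = Class.new do
  def hello
    :hi
  end
end

trace.disable

p ended
p made_by_new
```

Only `Named` shows up in `ended`; the class built with `Class.new` never produces an `end` event.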
19:01
I have another advanced case. It's the combination of TracePoint and Ripper. Do you know Ripper? Ripper is a built-in library, but not popular. So, a standard library. Ripper is a parser for Ruby code, and it can output a token list and an S-expression.
19:23
An S-expression represents the structure of Ruby code. It has token strings and token positions. It is similar to an AST. By the way, as a future option,
19:40
we can use the RubyVM::AbstractSyntaxTree module. It's already added in the current Ruby trunk. It has a better interface than Ripper. It's great work. And sample code of Ripper is this.
20:02
This code is from the Ruby reference manual. There is a token position in the nested array, like @ident and the string m. That nested array is the token position.
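For readers without the slides, the snippet below parses the one-line example used in the Ripper documentation and collects the identifier tokens with their positions. The helper `ident_tokens` is a hypothetical function written for this illustration.

```ruby
require "ripper"
require "pp"

sexp = Ripper.sexp("def m(a) nil end")
pp sexp

# Collect every [:@ident, name, [line, column]] token found in the tree.
def ident_tokens(node, found = [])
  return found unless node.is_a?(Array)

  if node[0] == :@ident
    found << node
  else
    node.each { |child| ident_tokens(child, found) }
  end
  found
end

idents = ident_tokens(sexp)
p idents
```

Each token carries `[line, column]`, which is what makes it possible to match syntax against the file path and line number reported by a TracePoint event.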
20:22
This is a use case in the abstriker gem. I'm sorry, this sample is very complicated. Please see the details after we upload this deck. TracePoint events have a file path and line number.
20:41
And the S-expression that is output by Ripper has token positions. I can detect the structure of code syntax easily with this information. In the abstriker gem, I check the structure of Ruby code
21:01
to detect whether some method modifier is called inside a class definition or outside a class definition. Like this, Ripper empowers TracePoint. TracePoint tells us events and where they occur.
21:21
And the Ripper.sexp method tells us how methods were called. We can get detailed information about invoked methods from this information. As one of the other use cases,
21:42
the power_assert gem is implemented with this combination. power_assert is a bundled gem for testing. Anyway, these gems are proofs of concept. But they are definitely practical.
22:03
I think that black magic is dangerous, actually. But it is very fun, and it extends Ruby's potential. We can change Ruby syntax with Ruby code itself. Because of these features, I like Ruby very much.
22:29
And here is a break-time slide. We have one question, the Ruby quiz. What is the difference between undef_method and remove_method?
22:42
I pass the mic to Moris-san during thinking time. Okay, did you get the answer? So when we call the instance method foo, the Ruby virtual machine digs through the method lookup chain
23:06
and finds the definition of the method foo in the class Bar. The class Bar is a subclass of the superclass Foo, and Foo also has a definition of the method foo.
23:22
But anyway, the method foo in class Bar will be called. When we call remove_method on the class Bar, it just removes the definition of the method foo from class Bar. So then when we call the instance method foo,
23:44
the Ruby virtual machine will dig through the method lookup chain and then find the method definition foo in class Foo. But on the other hand, when we call undef_method for foo on class Bar,
24:03
that method masks the definition of the method foo on class Bar and marks that method as nonexistent. So then when we call the instance method foo on class Bar, the Ruby virtual machine raises NoMethodError.
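The quiz answer can be reproduced with a short sketch. A second subclass, `Baz`, is added here (an assumption beyond the Foo and Bar of the explanation) so that both calls can be shown side by side.

```ruby
class Foo
  def foo
    "Foo#foo"
  end
end

class Bar < Foo
  def foo
    "Bar#foo"
  end
end

class Baz < Foo
  def foo
    "Baz#foo"
  end
end

Bar.send(:remove_method, :foo) # deletes only Bar's own definition
Baz.send(:undef_method, :foo)  # marks foo as undefined on Baz

# Lookup continues to the superclass after remove_method.
after_remove = Bar.new.foo

# Lookup stops at the undefined marker after undef_method.
after_undef = begin
  Baz.new.foo
rescue NoMethodError => e
  e.class
end

p after_remove
p after_undef
```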
24:22
So when you want to raise NoMethodError, you should use undef_method. Anyway, this is my use case of these metaprogramming techniques.
24:42
When we write middleware or huge applications, we do resource allocations and resource releases many times. Resources mean huge memory, files,
25:04
file handles, sockets, database connections, and many others. We need to release these resources when we no longer use them. Of course we can use begin and ensure clauses for that purpose.
25:24
But that is a bit complex and messy. Other languages have special idioms for that purpose. For example, this Java code shows the try-with-resources clause
25:46
for that purpose, only for that purpose. That is a very simple and useful way to allocate resources and then release them safely.
26:01
Of course, Ruby has a way to handle these problems. That is the open method with a block. The open method of the File class takes a path argument, and in the block the file will be opened,
26:24
and then at the end of the block the file will be closed. That is a kind of Ruby way, I think. But it requires a lot of indentation, one level per resource, and sometimes that method is not implemented.
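The two standard idioms mentioned above, `begin`/`ensure` and `File.open` with a block, can be compared in a small sketch. The temporary directory and the file name `sample.txt` are assumptions for the example.

```ruby
require "tmpdir"

explicit_closed = block_closed = content = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "sample.txt")

  # Explicit form: the resource is released in an ensure clause.
  file = File.open(path, "w")
  begin
    file.write("hello")
  ensure
    file.close
  end
  explicit_closed = file.closed?

  # Block form: File.open closes the file when the block finishes,
  # at the cost of one indentation level per resource.
  handle = nil
  content = File.open(path) do |f|
    handle = f
    f.read
  end
  block_closed = handle.closed?
end

p content
p explicit_closed
p block_closed
```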
26:42
For example, TCPSocket does not have this type of method. So I wrote a Ruby gem, with_resources. This Ruby gem realizes safe resource allocation
27:04
using a top-level with method, by a kind of refinement. This with method takes just one argument, a block object that assigns resources.
27:20
Then it passes the assigned resources to the block parameters. The interesting thing is that this argument block has two statements. Of course it can have any number of lines or statements,
27:41
but that block object has two statements and just returns one value. Still, it can catch the allocated resources, sock and db, and then pass these values to the block parameters.
28:04
That is, of course, TracePoint. It uses TracePoint with the b_return and line events to pass the allocated resources to the block parameters and to identify the allocated resources in the lambda statement.
28:24
It also uses Binding to detect newly defined local variables in the allocation lambda. And it also uses refinements to introduce the top-level with method without any side effects.
28:41
But this idiom still requires indentation for the block, so when we want to assign resources in multiple steps, the first time, second time, third time,
29:01
and we call the with method three times, it requires three levels of nesting. It is not so bad, but it looks a bit messy, so we want a much cooler alternative.
29:24
And Golang has defer. defer does not require indentation, and a deferred method call will be called at the end of the context. That looks very smart. So yes, I created that.
29:44
We have some options to implement defer. One option is to use a defer class or something of that kind, which also takes a block,
30:01
and then in that block, we can use deferred processing. But it still requires indentation, and after calling the defer method,
30:21
we can replace the values of variables which are used in the blocks, and that causes unexpected exceptions or resource leaks. So that is not safe. So I created the deferral gem,
30:40
of course using refinements to introduce a top-level defer method. The sample code is this, and it looks very similar to defer in Golang, I think.
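For comparison, the simpler block-based option mentioned above (the one that still needs indentation) could be sketched as follows. This is not the deferral gem and uses no TracePoint; `DeferScope` and its methods are hypothetical names.

```ruby
# Hypothetical block-based alternative: deferred blocks run in reverse
# order when the enclosing block finishes, even if it raises.
class DeferScope
  def self.run
    scope = new
    yield scope
  ensure
    scope&.run_deferred
  end

  def initialize
    @deferred = []
  end

  # Register a block to be called at the end of the scope.
  def defer(&block)
    @deferred.push(block)
    nil
  end

  # Call registered blocks last-in, first-out.
  def run_deferred
    @deferred.pop.call until @deferred.empty?
  end
end

log = []
value = DeferScope.run do |scope|
  log << "open A"
  scope.defer { log << "close A" }

  log << "open B"
  scope.defer { log << "close B" }

  log << "work"
  :done
end

p value
p log
```

The deferred blocks here close over local variables, so reassigning such a variable before the scope ends changes what gets released, which is the safety problem described above.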
31:03
Of course, I use TracePoint. The defer method enables TracePoint if it is not enabled yet, and it initializes an internal stack frame. So defer manages when the resources should be released
31:21
using its own internal stack frames. TracePoint monitors the method call stack, defer takes a snapshot of the local variables in the defer block, and it calls the release block at the end of the scope.
31:41
Scope here means that virtual stack frame. It also uses Binding to store and restore the set of local variables in the release block, and it also uses refinements to introduce the top-level defer method without any side effects.
32:05
So these techniques are very powerful, and we can do many things using them, but the hard thing is debugging.
32:24
If we forget to disable TracePoint, or TracePoint works at an unexpected time, the TracePoint will continue to work forever, and the only way to stop it
32:43
is to send a kill signal to the Ruby process. That is a very hard thing, but it also gives us a kind of superpower to realize many good things, many interesting things.
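One way to reduce the risk of a hook that is never disabled is the block form of `TracePoint#enable`, which turns the trace off when the block exits, even on an exception. The sketch below assumes a hypothetical method `traced_work`; it is a general precaution, not something claimed in the talk.

```ruby
calls = []
trace = TracePoint.new(:call) { |tp| calls << tp.method_id }

def traced_work
  :ok
end

begin
  trace.enable do
    traced_work
    raise "boom" # the trace is still disabled when the block exits this way
  end
rescue RuntimeError
  # Nothing to clean up: enable with a block already disabled the hook.
end

still_enabled = trace.enabled?
traced_work # not recorded, the hook is off

p still_enabled
p calls.count(:traced_work)
```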
33:04
So, as a gentleman said a long time ago, far, far away: "Give yourself to the dark side. It is the only way you can save your friends." Thank you very much.