
Coala Framework: Discover Static Code Analysis in Python

Formal Metadata

Title
Coala Framework: Discover Static Code Analysis in Python
Alternative Title
Discover Static Code Analysis in Python with Coala Framework
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
We, as developers, aim to provide code that matches our team's code style, looks better and behaves correctly. Static code analysis (SCA) tools are one of the ways to achieve that. But with multi-language projects and all kinds of code-related needs, it's difficult to address all those use cases without dealing with a large number of SCA tools. Coala is a language-agnostic static code analysis framework that provides a common command-line interface for linting and fixing all your code. It is written in Python and supports well over 50 languages in addition to language-independent routines. So, instead of building new analysis tools from scratch, you can build your own custom logic and let Coala deal with the rest. This talk introduces the audience to the Coala framework and guides them through how they can use it to build routines to do almost anything they want with their code.
Transcript: English (auto-generated)
Let me introduce you to Lionel, who is going to tell us more about the Coala framework. Thank you. So, hello everyone. My name is Lionel. I work as a DevOps engineer at e.Voyageurs SNCF.
e.Voyageurs SNCF is a digital factory which addresses the SNCF group's digital challenges. We are almost 1,500 employees working across three sites, Lille, Nantes and Lille. I work in Lille.
We address two main challenges: digital distribution, through OUI.sncf and Rail Europe internationally,
and travel information, via the mobile app L'Assistant SNCF. We deliver IT services to them. Today we are going to talk about static code analysis.
We will start with an overview, a quick definition of static code analysis. We will follow up with the Coala framework and Coala bears, and how they are used to do static code analysis in Python.
Then we will see what is coming next in the framework to make it easier to use Python for static code analysis.
And we will finish with the Q&A, if you have questions of course. Static code analysis. We can define it as a method to extract facts from source code, and to detect and also fix defects in it, without executing it.
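To make the definition concrete, here is a minimal sketch that uses only Python's standard `ast` module, not Coala. It extracts two kinds of facts from a piece of source code (the function names and the lines that call `eval`) by parsing it, never by running it. The sample source and the helper name `extract_facts` are made up for illustration.

```python
import ast

# Made-up source code to analyze; it is only parsed, never executed.
SOURCE = '''
def greet(name):
    return "Hello, " + name

def unused(a, b):
    eval("a + b")
'''


def extract_facts(source: str) -> dict:
    """Collect function names and the line numbers of eval() calls."""
    tree = ast.parse(source)  # builds the syntax tree without running anything
    functions = []
    eval_lines = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            functions.append(node.name)
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Name)
              and node.func.id == "eval"):
            eval_lines.append(node.lineno)
    return {"functions": sorted(functions), "eval_lines": eval_lines}


facts = extract_facts(SOURCE)
print(facts)
```

The function list is a fact extracted from the code, and the `eval` line is a detected defect candidate, which are the two halves of the definition above.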
It is mainly used for code quality, code reviews, and also compliance, when you have to run compliance tests for security or you just want to ensure that your team's code style is respected.
And of course to detect flaws and try to fix them before running the code. The Coala framework. We have a lot of tools today to do static code analysis.
As you see, there is a bunch of tools. And the more tools you have, the more ways you have to configure those tools. And it's pretty hard. So, let's categorize those tools into analyzers, as you see up there.
You also have the way we use those tools, via editors, tools and services, and the way we consume the results produced by those tools, like exporting your results into JSON or an HTML report.
As you see, it's pretty complicated to deal with all of those tools and the way we use them. So, what can we do? What if we could have only one tool to manage all this mess?
That's why Coala has been built. Coala is not much, really; it's just an API which is language agnostic.
That means you write for Coala in Python, but you can analyze code in any language. It supports more than 60 programming languages for now.
Let's have a closer look. That's the typical constitution of a static analysis tool. We start by using the code as data, and you have some model extraction to just get data from the code.
You then produce an intermediate representation. You can have an AST, you also get data structures,
you could also have call graphs and also control flow graphs. If we zoom in on the main goal of Coala, as I said, you have data structures. For analysis, we have rules, also called routines.
In Coala, they are called bears. It's the way you implement your algorithm to do static code analysis. After that, the produced results can take two forms, outputs and actions.
Output is the way you report detected flaws and errors. You could also have as an output a fix recommendation, how we can fix our code. And you get, as a result, also compliance reports for flaws and code style, as I said before.
You also have actions. There are a lot of actions there. An action could be applying a patch, that is, actually fixing the flaws or the defects.
Let's see how to set it up. You could use pip to install coala-bears, which installs everything you need to start working with Coala. Or you could also use Docker, which I think is the recommended way
because you don't have to deal with pip packages. You just use the container, which is packaged with all the things you need. There is another way, which is online. There is a beta web page online,
where you can just enter the repository of your code and start working on it. Let's see how to use Coala. Let's see some code. First, you clone the project,
and as you see, you have the bears directory and the src directory. This project has mainly two languages, C and Python. And the way you approach the use of Coala is by saying,
okay, I have a project with two languages. Let's list the available bears, the available rules that I could use to analyze my code. So, you run this, and you see a bunch of bears available.
For the next slides, I will just alias coala to make the slides shorter. So, when you just run coala, it means running Coala behind Docker.
So, I want to analyze my code written in Python. I will list the bears available for Python. Here, I choose the PEP8Bear, for example.
I don't know much about it, so I get some documentation on how to use it. And I also see the optional settings, how to set the configuration for the bear, and what the bear can do. The bear can detect formatting and can fix formatting.
So, you run Coala, specifying the bear PEP8Bear on your Python files. First, you will get a diff output.
It shows how your code is not compliant with the PEP 8 rules. And after that, you have actions, as I said before. You can either do nothing, open the file,
apply the patch, which means making the code compliant, or add an ignore comment. And you can also pass --apply-patches on the command line to do the apply patch action directly.
You also have another option to specify which action you want to run. As you have noticed, the tool also suggests that you use --save to save your configuration. That brings us to the configuration file.
As I said, you have a lot of tools and you have a lot of configuration files to tell the tools how they should analyze your code. In Coala, you have just one file to deal with all bears.
When you run it with --save, it produces a file called a .coafile. It looks like this. It's an INI file with sections, and you have at least two mandatory settings,
which specify the bears, the rules you need to apply to the code, and the files they should be applied to. You can also enter your settings on the command line, as you see.
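As a rough illustration of the file just described, the sketch below holds a small INI-style configuration with one section and the two mandatory settings, then reads it back. The section name and the glob are invented for the example, and Python's standard `configparser` stands in for Coala's own parser here, which is an assumption made only so the snippet runs on its own.

```python
import configparser

# Hypothetical .coafile content: one section with the two mandatory settings.
COAFILE = """
[python]
bears = PEP8Bear
files = src/**/*.py
"""

parser = configparser.ConfigParser()
parser.read_string(COAFILE)


def missing_settings(section) -> list:
    """Return which of the two mandatory settings a section lacks."""
    return [key for key in ("bears", "files") if key not in section]


for name in parser.sections():
    section = parser[name]
    print(name, dict(section), "missing:", missing_settings(section))
```

A section that omitted `files` would be reported with `missing: ['files']`, which mirrors the point that both settings are needed before a bear can run.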
In the configuration file, you have a way to organize your bears. This is to avoid repeating yourself in the configuration. So you have inheritance: you just have to prefix the section name with the name of the parent section, here the bears section,
and as you see, sections one and two extend the configuration set in the bears section.
You also have append operators, with which you can append files from section to section, like this. As you see, in the bears section, I'm only analyzing Python files, and in section one, I want to analyze not only Python files, but also C files.
Here is an example. I have all the C files in the all section, and in the example section, I also want to check that the spacing is consistent in my code.
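The inheritance and append behaviour described above can be modelled in a few lines. This is only a sketch of the idea, not Coala's implementation: sections are plain dicts, a dotted name such as `all.section1` is assumed to inherit from `all`, and a key ending in `+` stands in for the append operator. The section names and the comma used to join appended values are assumptions.

```python
# Hypothetical sections written as plain dicts. A key ending in "+" stands
# for the append operator (written `files += ...` in a configuration file).
SECTIONS = {
    "all": {"bears": "PEP8Bear", "files": "**/*.py"},
    "all.section1": {"files+": "**/*.c"},
    "all.section2": {"bears": "SpaceConsistencyBear"},
}


def resolve(name: str, sections: dict) -> dict:
    """Return the effective settings of `name`, applying parents first."""
    parts = name.split(".")
    effective = {}
    # Walk "all", then "all.section1", so a child overrides or extends
    # whatever its parent defined.
    for depth in range(1, len(parts) + 1):
        current = sections.get(".".join(parts[:depth]), {})
        for key, value in current.items():
            if key.endswith("+"):
                base = key[:-1]
                inherited = effective.get(base)
                effective[base] = f"{inherited}, {value}" if inherited else value
            else:
                effective[key] = value
    return effective


print(resolve("all.section1", SECTIONS))
print(resolve("all.section2", SECTIONS))
```

Section one ends up with both the Python and the C globs, while section two keeps the inherited files and only swaps the bear, which is the "don't repeat yourself" effect the inheritance is meant to give.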
Okay, let's see how bears work, what bears really are, and how to create your own bears.
As I said, bears are just rules; the bear is the base construct when you need to write a rule. You have to implement the run function, and the run function is the one which is executed to run your algorithm.
From the Bear class, you have two classes, LocalBear and GlobalBear. Your code should extend one of those two, and you can also have user input, which is provided by the framework.
A local bear runs on every file of your project, and the run function provides the file name and the file content for you to run your algorithm, along with user input and settings,
and a global bear runs analysis on the whole project. As you see in the run function, you don't have the file,
but you can do some internal things to do whatever you want with your files. Let's see an example. I have a Hello World bear, and I just print some logging output.
So I extend LocalBear, which means I want to run it on every file, and just output a user input, which is provided by the user.
And at the end, the rule is that you have to yield the result. Why? Because when you combine bears, you can get the results from any bear in your bear. If you have dependent bears, you can use the results
coming from the inner bears you have used in your bear. I run it, and that's the output. It asks me to enter my user input, and I put my name,
and it suggests, as you saw before, actions to take on this. So, bears. To write bears, you have three main categories. You have native bears, linter bears to do linting, and external bears.
What are native bears? Native bears are the bears that extend LocalBear and GlobalBear, simply like this.
If you want an example, that's a native bear. As you saw, the Hello World bear was a native bear: I just implemented the run function and yielded some results. And global bears, on their side, do the analysis on the whole project.
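The contract of a local bear (a `run` function called once per file with the file name, the file content and the user's settings, yielding results) can be sketched without Coala installed. Everything below is a stand-in written for this illustration: `Result`, `LocalBearSketch` and `run_on_project` are hypothetical names, not Coala's classes, and the real framework does much more around them.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Result:
    origin: str   # which rule produced the result
    message: str
    file: str


class LocalBearSketch:
    """Stand-in for a per-file rule; this is not Coala's LocalBear class."""

    def run(self, filename: str, file: List[str], **settings) -> Iterator[Result]:
        raise NotImplementedError


class HelloWorldBear(LocalBearSketch):
    def run(self, filename, file, user_input="world"):
        # Yield rather than return, so one rule can emit any number of
        # results and other rules can consume them lazily.
        yield Result(
            origin=type(self).__name__,
            message=f"Hello {user_input}! {filename} has {len(file)} lines.",
            file=filename,
        )


def run_on_project(bear, project: dict, **settings) -> List[Result]:
    """Call the rule once per file, as described for local bears."""
    results = []
    for filename, content in project.items():
        results.extend(bear.run(filename, content.splitlines(True), **settings))
    return results


# Made-up in-memory project: file name -> file content.
PROJECT = {
    "src/a.py": "x = 1\ny = 2\n",
    "src/b.c": "int main(void) { return 0; }\n",
}

results = run_on_project(HelloWorldBear(), PROJECT, user_input="Lionel")
for result in results:
    print(result.message)
```

A global rule would differ only in the driver: it would receive the whole `PROJECT` mapping in a single call instead of one file at a time.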
Linter bears. Linter bears use your own tools and wrap them. Just imagine you have a linter
and you want to use Coala to wrap it. That's when you use a linter bear. You specify the executable, and you have to say whether this bear is global or not. If it's global, it means your tool analyzes the whole project,
and if not, it means your tool analyzes each file separately. The particularity there is that the linter bear has to implement create_arguments.
That's how you pass arguments to your executable. Let's see an example. You also have a way to provide a configuration. If you have a tool that is highly configurable through a file,
you can just do everything in generate_config, produce the JSON you need to execute your tool, and it will be injected into create_arguments after that.
Let's see an example to see how it works. I have pylint that I need to wrap. So, I write the create_arguments function. That's where I return a tuple.
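A sketch of the two pieces a wrapper like this needs: a function that builds the argument tuple for the executable, and a way to interpret the tool's text output. It is written as plain Python so it runs without Coala or pylint; the regular expression, the `parse_output` helper and the sample output are assumptions made for illustration, and the sample output is hand-written to follow the message template rather than captured from a real pylint run.

```python
import re


def create_arguments(filename, file, config_file=None):
    """Build the command-line arguments handed to the pylint executable."""
    args = (
        "--reports=n",
        # Ask pylint for one compact, easily parsed line per message.
        "--msg-template={line}:{column}:{msg_id}:{msg}",
    )
    if config_file is not None:
        args += ("--rcfile=" + config_file,)
    return args + (filename,)


# Assumed pattern matching the message template chosen above.
OUTPUT_REGEX = re.compile(
    r"(?P<line>\d+):(?P<column>\d+):(?P<origin>[A-Z]\d+):(?P<message>.*)"
)


def parse_output(output: str) -> list:
    """Turn the tool's text output into one dict per reported message."""
    matches = (OUTPUT_REGEX.match(line) for line in output.splitlines())
    return [match.groupdict() for match in matches if match]


# Hand-written sample in the shape of the template, not a real pylint log.
SAMPLE_OUTPUT = (
    "************* Module example\n"
    "1:0:C0114:Missing module docstring\n"
)

print(create_arguments("example.py", []))
print(parse_output(SAMPLE_OUTPUT))
```

Keeping argument building separate from output parsing is what lets the same wrapper shape work for any command-line linter: only the tuple and the pattern change.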
The tuple contains every option pylint needs to analyze my code. That's simply how it works. And with the output format specified there, you specify how you want to parse the output
and interpret the result. External bears also wrap your tool, but one written in any language.
Coala will provide you some data in JSON, and as a rule, you have to produce a result in JSON, which will be analyzed or used by Coala.
As an example, I create a bear. I wrap my tool using Node: I write my script in Node.js, and after that, with my add result function, I produce the JSON expected for my tool to be considered a bear, like this.
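The JSON exchange can be sketched as follows. The talk's script is in Node.js; Python is used here only to keep one language across the examples. The field names (`filename`, `file`, `settings` on the way in, `results` with `message` and `line` on the way out) are assumptions about the shape of the exchange, and the TODO check is an invented rule.

```python
import json


def analyze(payload: str) -> str:
    """Read a JSON request, flag TODO comments, and answer in JSON."""
    request = json.loads(payload)
    results = []
    for number, line in enumerate(request["file"], start=1):
        if "TODO" in line:
            results.append({"message": "Unresolved TODO", "line": number})
    return json.dumps({"results": results})


# Made-up request in the assumed shape: file name, file lines, settings.
REQUEST = json.dumps({
    "filename": "src/app.js",
    "file": ["let a = 1;\n", "// TODO remove\n"],
    "settings": {},
})

response = analyze(REQUEST)
print(response)  # an external tool would write this to standard output
```

Because the only contract is JSON in and JSON out, the analysis logic itself can be written in whatever language the wrapped tool already uses.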
I use console.log to output it as JSON. Going further, the new thing that will come up is that they are creating a way to provide us the AST of any language,
a new API using aspect-oriented programming, with aspects and tastes, and a new package manager to just specify your requirements in your bears,
and it will go fetch your npm package or your pip package for your bears, to be sure that when your bear runs, you have everything you need to analyze your code. Thank you.
We still have time for a few questions. This gentleman here.
With Coala, are you able to do some kind of semantic static analysis
to deduce some properties of your program and so on? The question was, can I do some semantic analysis? I would answer yes.
You have just the run function, so you can do whatever you want. The intermediate representation in that case is just a Python data structure.
The local bear gives you the file content and the file name, and you can work with this. If you have the file content, you can run whatever semantic analysis tools you want on it.
That's how it works. They saw that that was a little complicated, and that you cannot do much with this simple construct. That's why I said they are working on bringing some real tools to give you more abilities.
Any additional questions?
You mentioned it's language agnostic. Can you use it for Python 2.7 as well? Can you speak up? Python 2?
I think I said that it was available for Python 3, I think. Yes?
It supports 60 languages. Oh. It supports 60 languages. Yes.
Oh, it's hard, right? Oh, man. Yes. It supports 60 languages, so I could analyze a C program in Python with the complete AST from C?
Yes. It depends on the compiler options. How does this work? Yes. The question was: Coala supports 60 languages, and you can use Python to analyze C code. How does it work?
Yes, that's my question, because it depends on the compiler options, the flags. Yes. You can use the linter, the linter bear, and provide your compiler options via the bear. You have to see it as a wrapper.
That's why I think it's very powerful, because you can have just one tool and use it to drive whatever is out there to analyze your code.
Any questions? Okay, thank you. Thank you very much.