Green software engineering
Formal Metadata
Number of Parts | 542
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/61482 (DOI)
Transcript: English (auto-generated)
00:06
So hello and welcome to my talk about green software engineering and more specifically about building energy measurement tools and ecosystems around software. My name is Arne and I work for Green Coding Berlin, which is a
00:23
company that specializes in making open source tools for energy-aware software measurement. I would like to take you on a tour today of a concept for a possible future ecosystem we imagine, where the energy consumption of software is a first-order metric and
00:44
available for every developer and user. So let's have a look at a hypothetical scenario. The Windows 10 operating system typically comes with minimum system requirements. If you look on the vendor's web page, you can see the processor that is needed, 1 gigahertz,
01:02
1 gigabyte of RAM, a particular amount of hard disk space, a graphics card, etc. However, what is never given is the power that this operating system draws, for instance in idle, on the reference hardware that the vendor apparently already specifies. So this should be pretty doable, right?
01:21
Also something like the power for desktop activity: how much power does it use just to move around in the operating system, opening the file explorer, using the taskbar and so on, on the reference system that Microsoft specifies, for instance, or on a reference system that we or the community specify.
01:42
And imagine you could then make informed choices, just by saying: hey, I'm looking at Windows 10 and I see that it draws 45 watts in idle, and apparently my computer is mostly in idle. So it might be more interesting to use Ubuntu, for instance, which draws just 20 watts in idle, or whose desktop activity is even lower.
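To make the comparison concrete, idle power can be converted into energy over a period of time. The following is a minimal sketch using the two concept figures from the talk (45 W and 20 W); the eight idle hours per day are an assumption chosen only for illustration, and the function name is hypothetical.

```python
def idle_energy_kwh(idle_power_watts: float, idle_hours: float) -> float:
    """Energy in kWh drawn while idling at a constant power for the given hours."""
    return idle_power_watts * idle_hours / 1000.0


# Concept figures from the talk; the 8 idle hours per day are an assumed workload.
IDLE_HOURS_PER_DAY = 8
for name, watts in {"Windows 10": 45.0, "Ubuntu": 20.0}.items():
    per_year = idle_energy_kwh(watts, IDLE_HOURS_PER_DAY * 365)
    print(f"{name}: {per_year:.1f} kWh per year in idle")
```

Because the energy scales linearly with the idle power, the ratio between the two systems stays the same whatever idle time is assumed; only the absolute saving changes.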
02:01
So why not choose this operating system if energy is my main concern, if this is what I value the most in an operating system, or if it is an important metric for me? If you take this thought even further, you can think about comparing the energy of applications very specifically, not only in the idle scenario or in one scenario,
02:23
but in very specific usage scenarios that are tied to how people typically use such an application. What you see here are two radar charts: on the left side is WhatsApp and on the right side is Telegram. Please keep in mind that these are concept pictures, so this is not actually the energy that these applications use for these use cases.
02:45
But let's say your use case is that you message a lot with an app, but you don't do that many video calls. So if you look at WhatsApp, you see here that it has quite a high energy budget when it comes to messaging, whereas Telegram has a considerably lower budget. Telegram is, however, very bad when it comes to video, where WhatsApp could, for instance, be better.
03:05
So if you say that you are mostly doing messaging with your application and you would like to preserve your battery life or, if you use Telegram on the desktop, keep your desktop energy consumption low, then with such metrics you could actually make an informed decision whether WhatsApp or Telegram is the better app for you, if energy is an important concern.
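One way to picture such a decision is to weight each scenario's energy cost by how often a given user actually performs it. The sketch below is only an illustration: the two apps are anonymous placeholders, and all joule values and the usage profile are invented, not measurements of any real application.

```python
def expected_energy(per_scenario_joules: dict[str, float],
                    usage_share: dict[str, float]) -> float:
    """Weight each scenario's energy cost by the share of use it accounts for."""
    return sum(per_scenario_joules[name] * share
               for name, share in usage_share.items())


# Hypothetical per-use energy costs in joules; NOT measured values.
app_a = {"messaging": 8.0, "video_call": 30.0}
app_b = {"messaging": 3.0, "video_call": 60.0}

# Assumed profile of a user who mostly messages and rarely makes video calls.
profile = {"messaging": 0.95, "video_call": 0.05}

for name, costs in {"App A": app_a, "App B": app_b}.items():
    print(name, round(expected_energy(costs, profile), 2), "J per average interaction")
```

With this profile the app that is cheaper for messaging comes out ahead even though it is worse for video; a user with a video-heavy profile would get the opposite ranking from the same data.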
03:26
And imagine as a developer, if you think even one step further, that you go to GitHub or to GitLab or wherever your software is hosted, and you look in the repository and you see right away, with something like an Open Energy Badge, as we call it internally,
03:43
how much energy the software, you see it down here, is actually using for the intended use case that its developer had in mind. So you can compare one piece of software that maybe has a very limited use case to another piece of software or library just by the energy budget
04:01
because you have the metrics already available. We are actually trying to build these tools, and in the very short timeframe that we have been given by FOSDEM, just about 20 minutes, I would like to take you on a tour through our projects, more as an appetizer,
04:21
so you see what we are working on and what we think could be a possible ecosystem in the future. You will be presented with a view that looks like this: the Green Metrics Tool, EcoCI, Open Energy Badge and Cloud Energy. The Green Metrics Tool is what I would like to talk about today mostly,
04:43
because I think it is the tool that outlines our concept of transparency in the software community the best. And then we will talk later about our approaches for CI pipelines or restricted environments like the cloud. So first of all, I think it makes sense, although I know people tend to hate diagrams or flowcharts to some degree,
05:05
but I think it makes sense to quickly go over how the concept of the tool works from a high-level perspective. So in order to measure software, we follow the container-based approach. So we assume that your software is already in a containerized format or can be put in such a format.
05:26
So, for instance, even a Firefox browser, if you want to measure desktop applications, can be put in a container and be measured with our tool. Also machine learning applications, simple command line applications, but also web applications. Typically, when you develop software, you already have infrastructure files like Dockerfiles,
05:45
Docker Compose files, or even Kubernetes files available, which our tool can consume. In all fairness, Kubernetes support is still a work in progress, but Dockerfiles it can consume. The tool then basically orchestrates the containers
06:02
and attaches every reporter that you want for measuring metrics. Here we are still very similar to typical data logging approaches, like those of Datadog, for instance, or other big players. The memory, the AC power, DC power, the network traffic, CPU percentage, CPU and RAM
06:21
are all logged during the execution of what we call a standard usage scenario. In the first couple of slides, I showed you the concept of looking at software in terms of how it is typically used. And people have already thought about this concept quite a lot when they write end-to-end tests for their software,
06:41
because this is a typical flow that a user goes through in your application. Or unit tests, which might cover very reduced amounts of functionality tested in a block. Or benchmarks that are already inside the software repository, session replays, shell scripts, or build files, with which we could measure your build process.
07:03
All of this is typically already available, and our tool can consume these files, run these workflows, and then tell you the energy budget over the time of this particular run. This slide is more for those who are not too familiar with Docker: the idea is just to have every service or every component of the application in a separate container
07:24
so that we can later break down the metrics at a finer granularity and see which component might be interesting to look at if you want to do energy optimizations in particular. When you use the tool, and I will just go quickly over that,
07:40
and then probably go with you through a live version of what we are hosting at the moment, you will get a lot of metrics. So you will obviously get something like the CPU utilization, or the average memory that was used, or maybe the network bandwidth that was used. But what is interesting about its dashboard, and basically its USP, is that you also get the energy metrics from the CPU and from the memory.
08:03
You get a calculation of what the network has used in energy, and you get combined, or basically aggregated, values, because it often makes sense to look at CPU and memory in conjunction, or to look at all the metrics that you have available to get something like a total energy budget.
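The aggregation described here can be pictured as integrating each component's power samples over the run and then summing the results. The following is a minimal sketch under assumptions: the sample format (timestamp in seconds, power in watts), the trapezoidal integration and all numbers are illustrative and not taken from the tool itself.

```python
def energy_joules(samples: list[tuple[float, float]]) -> float:
    """Integrate (timestamp_seconds, power_watts) samples with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total


# Hypothetical samples, one per second, for two components of one run.
cpu = [(0.0, 10.0), (1.0, 14.0), (2.0, 12.0)]
memory = [(0.0, 2.0), (1.0, 2.0), (2.0, 3.0)]

per_component = {"cpu": energy_joules(cpu), "memory": energy_joules(memory)}
total = sum(per_component.values())
print(per_component, "total:", total, "J")
```

Keeping the per-component values next to the total is what later allows a single container or component to be singled out for optimization.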
08:24
Then you can obviously also look at the AC side, so at the wall plug: not only what your CPU and your RAM are using, but what the total machine is using. Or, something that we have as a setup in our lab, you just look at the mainboard, so not at the outside of the PSU, at what is basically plugged into the desktop computer,
08:41
but only the power that flows directly into the mainboard. And here you can see that our tool automatically calculates the CO2 budget based on the energy that was used for this run. The tool also shows you in an overview which reporters have been used, and then it shows you a lot of charts.
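A common way to derive a CO2 figure from measured energy is to multiply the energy in kilowatt-hours by the carbon intensity of the electricity grid. The sketch below assumes this approach; whether the tool calculates it exactly this way is not stated in the talk, and the intensity value of 400 g per kWh is a placeholder, not a real grid figure.

```python
JOULES_PER_KWH = 3_600_000.0  # 1 kWh = 3.6 million joules


def co2_grams(energy_in_joules: float, grid_intensity_g_per_kwh: float) -> float:
    """Convert an energy amount into grams of CO2 for a given grid carbon intensity."""
    return energy_in_joules / JOULES_PER_KWH * grid_intensity_g_per_kwh


# Hypothetical run: 3600 J measured, assumed grid intensity of 400 g CO2 per kWh.
print(f"{co2_grams(3600.0, 400.0):.3f} g CO2")
```

The result depends as much on the assumed grid intensity as on the measured energy, so the same run can yield different CO2 budgets in different regions or at different times.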
09:01
This is a sample chart. What the tool can give you is not only an overview, but also introspection, for instance when you are interested in the idle time of the application. What is my application doing when no user is interacting with it? Is it actually using energy, and is this too much energy
09:21
in my opinion or in the opinion of the community? For instance, here we have an example of a WordPress setup that we have done with an Apache container, a Puppeteer container that runs Chrome, and also a MariaDB instance. And you can see that here are a couple of requests
09:41
that have been made to the WordPress instance, and then we are basically just idling. But the web server is still doing quite some work, and there have been no WebSockets active, so why is there server and database activity here? Is this valid? Is this maybe some caching, some housekeeping, or is this unintended behavior? We picture that our tool could highlight such energy hotspots
10:04
or energy malfunctions, as we call them, to better understand how software uses energy. You can also look at energy anomalies. We sometimes work with features like Turbo Boost, which is typically not turned on in cloud environments, but very often on desktops,
10:21
which puts your processor into a kind of overdrive state so that it can react very quickly at a frequency above its normal frequency. In this example we have run a constant CPU utilization, but as you can see here, the CPU clocks at different frequencies over time, and sometimes it uses disproportionately more energy
10:42
for the same task. It finishes quicker, but the energy it uses to do the task grows more than linearly. This is a very interesting insight that our tool can deliver, for instance, when you are trying to optimize the energy use of your software.
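The effect can be illustrated with the relation energy = average power times runtime. In the sketch below, the cycle count, the frequencies and the power figures are invented for illustration and do not come from the measurement shown in the talk.

```python
def task_energy_joules(work_cycles: float, frequency_hz: float,
                       power_watts: float) -> float:
    """Energy for a fixed amount of work: runtime (cycles / frequency) times power."""
    runtime_seconds = work_cycles / frequency_hz
    return runtime_seconds * power_watts


WORK = 20e9  # cycles needed by the task (assumed)

base = task_energy_joules(WORK, 2.0e9, 15.0)   # 10 s at an assumed 15 W
boost = task_energy_joules(WORK, 3.0e9, 30.0)  # about 6.7 s at an assumed 30 W
print(f"base: {base:.0f} J, boost: {boost:.0f} J")
```

In this made-up case the boosted run is 1.5 times faster but draws twice the power, so the same task costs more energy overall even though it finishes sooner.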
11:00
So what is the whole idea behind this project? Let me move myself down here a little bit so you can see the full slide. We want to create an open source community, or a green software community, that focuses on the transparency of software, so that you basically have an interface,
11:22
which we call the usage scenario, against which you can measure software, and then later ask questions against a database or an API that holds measurements of all this software, questions like: how much does this software consume? Is there a more carbon-friendly alternative? Or is there a piece of software that makes fewer network requests?
11:45
The idea, if these apps are available in your country (Yuka is, to my knowledge, from the US, for instance, and CodeCheck is more of a German application), is that we want to be the Yuka or the CodeCheck of software. So we want to deliver answers to developers
12:04
who ask questions about the energy budget of a library, of a piece of software, or of a functionality, by providing a framework to make these measurements. So let me move up here again and then back to the slides. Let me show you our other tools
12:22
that we believe are needed to build an ecosystem around green software, because software does not only run in desktop environments or on a single machine. It also runs a lot in the cloud, and I would like to encourage you to read a bit about which sensors are available in our tool,
12:42
but those sensors are not available in the cloud, for instance. So let me bring up my browser again. If you are on the homepage, where you have seen the Green Metrics Tool that I've just talked about, you'll also see that we have the Cloud Energy project and the EcoCI project.
13:02
EcoCI focuses on measuring the energy of software in a continuous integration pipeline that, for instance, runs in a virtual machine. Our focus is currently on GitHub Actions. To get the energy in a virtual machine you have to estimate, because you cannot measure: you have no access to the wall plug in the data center,
13:21
you have no access to sensors in the CPU or the like, so you have to estimate the machine's energy based on measurements that you already have for the same hardware. If you click on Cloud Energy, you can see here that we have based our machine learning model on a research paper from Interact DC and the University of London, and they have basically taken the data
13:43
from the SPECpower database, which is an open database of servers that have been measured with a fixed workload so they can be compared against each other. Based on this data, we can create a machine learning model, which is also free and open source,
14:01
and which is just a Python tool that you call with the information that you have. So let's say you have the information that your CPU is from Intel, that the frequency you're running at is 2.6 gigahertz, that you have seven gigabytes of RAM, and you know the CPU has 24 threads. But you don't know any more than that. You don't know if it's a Skylake processor
14:21
or a more modern Intel processor. You have no more information because the hypervisor restricts what you can see. So if you give the model more information, it can give you more accurate estimates, but it can also work with the limited information in the cloud. It then prints to standard output the current wattage that you have been using,
14:43
and you can then reuse that in a tool that we built on top of it. So now that you've understood that there is a machine learning model behind the idea, I would like to bring you to EcoCI. EcoCI is a GitHub Action that is based on the work from the Cloud Energy project and that can give you, in a GitHub Action,
15:01
the information of how much energy a CI pipeline has used. So if you go, for instance, to the GitHub repository, you can also go to the marketplace. So we go one step further, and here you can see that you can use it directly.
15:21
It is very easy to use. It just needs two calls to initialize the tool and then one more call whenever you want to get a measurement, and that is what it does for you. So let's quickly go to our repository, where we actually use GitHub Actions to measure every one of our workflows.
15:42
So we click on Actions. Let's say we go to "Manual test run virtual machine". We click on main. And you see here, I ran this yesterday. It succeeded. Our log tells us, hey, all tests have worked fine: the runner worked fine and also the API. And for this run in the Azure cloud,
16:01
where GitHub Actions runs in virtual machines, I have used 650 joules of energy. And you get a nice ASCII graph over time. We are a bit limited here in the graphs we can display in the GitHub Actions overview. But you can see here at what point in time the energy is highest, for instance, and then maybe look at the later tests if you deem them to be more energy consuming
16:23
than at the start, for instance, where it was using only a fixed amount of energy. So this gives developers and also users the information how much energy not only the software is using, but also the development of the software. Is it maybe using more than we want
16:42
as developers or maybe even as a community? These are our concept tools, just a first start on what we think could be possible in the near future, where software is basically measured
17:00
and the data on its energy usage is also constantly published by developers. The idea is then to have something like an Open Energy Badge in basically every repository that tells you the energy for this software and for the usage scenario that comes with it, be it for instance running the tests, building the containers,
17:22
or the intended use case of the software. So let's say the NumPy library for Python has an energy badge that says: hey, for a 1000 by 1000 matrix multiplication this software uses this amount of energy on the reference system that we have specified. And when you use the same reference systems
17:41
to compare software against each other, you arrive at the scenario that we showed at the start, in the first slides, where you can tell whether one piece of software is more energy hungry than another for the same use case. So let me quickly get back to my slide deck.
18:01
So let's wrap up. Measuring software energy consumption is, we believe, still too hard. It should be as easy as starting a Docker container, and it should happen transparently. Therefore we have created the Green Metrics Tool, which can reuse Dockerfiles and infrastructure files to make it very easy to orchestrate your architecture. And then a flow that you already have,
18:23
be it a Puppeteer file or just a shell script, you can run with our tool just by passing it as a parameter, and it will tell you how much energy has been used over the particular scenario that you feed in. Measuring software is also very complex. This is why we have integrated best practices into our tool,
18:41
like pausing between measurements, letting systems idle before you actually use them, turning functionality like SGX off, checking whether Turbo Boost is on, and many more features. Just inline measuring, as Datadog or other providers are doing it at the moment, is in our view not enough, and it is too arbitrary a basis for talking about energy.
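The practice of pausing between measurements and letting the system idle first can be sketched as a small harness. This is an illustration only: `measure_repeated` and `run_scenario` are hypothetical names, the readings are stand-in values, and the actual tool may organize its runs differently.

```python
import time
from statistics import mean
from typing import Callable


def measure_repeated(run_scenario: Callable[[], float], repetitions: int,
                     idle_seconds: float) -> list[float]:
    """Run a scenario several times, letting the system idle before each run.

    run_scenario is a hypothetical callable that executes the usage scenario
    and returns the energy it measured in joules.
    """
    results = []
    for _ in range(repetitions):
        time.sleep(idle_seconds)  # let the system settle before measuring
        results.append(run_scenario())
    return results


# Stand-in scenario returning fixed readings; a real one would read a sensor.
fake_readings = iter([101.0, 99.0, 100.0])
runs = measure_repeated(lambda: next(fake_readings), repetitions=3, idle_seconds=0.0)
print("mean energy:", mean(runs), "J")
```

Repeating the run and looking at the spread, rather than a single reading, is one way to tell a real difference between two pieces of software from measurement noise.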
19:03
Software must be measured against a standard usage case, so we provide standard usage cases for software as an interface. But we also ask you, the community, or rather we need to see over time, which standard usage cases we can all agree on.
19:20
The software must be comparable to other, similar software in terms of energy. This is why we need these standard usage cases, to make it comparable. This also means it must be measured on reference machines that everybody has access to, which we want to provide to the community as a free service. Energy metrics must also be available
19:41
in restricted environments like the cloud, so I've talked about estimation models that need to be open source and available for everybody to implement. And energy must be transparent and a first-order metric in developing and using software, so people should know before they use the software how much energy it is consuming. This is what we are trying to achieve
20:02
with the tools we are developing. I hope I could pique your interest in our work and in the tools we are developing, some as concepts, some already production ready. Thank you for listening, and now I hope it's time for questions.