The state of ARM - a 64bit view of what does/doesn't work
Formal Metadata
Number of Parts | 63
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/54560 (DOI)
openSUSE Conference 2016, 55 / 63
Transcript: English (auto-generated)
00:08
So, thanks for coming. I'm going to talk about the state of Arm and a 64-bit view of what does and doesn't work. The intention is for this to be a little bit more interactive.
00:24
I'll go through a few points and try and spur some discussion. So, who am I? My day job, I'm a Senior Principal Engineer at Arm. I focus on open source, so distributions, upstream projects, and downstream consumers of open source are
00:47
my primary focuses. I've been involved with openSUSE for quite a long time. 6.2 was my first distro and unfortunately I've not been able to get
01:01
away since then. I'm also known to have some questionable fashion choices. Kind of goes on through different events and my homemade mankini that's a bit
01:23
difficult to see on the screen was my last design outfit. But I'm also European in spite of 52% of my compatriots. So, that's about me. What else am I going to talk
01:43
about? I'm going to cover briefly what AArch64, otherwise known as ARM64, is, give an overview of the software status, things that we know do work, things that have been optimized, things that we know are broken.
02:02
We don't necessarily know what's missing because without a bit of feedback we can't do anything about it. A brief overview of the current hardware status and I'm going to touch briefly on benchmarking as well. And then there's
02:26
next steps, which is kind of where you guys come in for the discussion part. So, for those that aren't aware of what AArch64 is, it's also known as ARM64,
02:40
in kernel space, it's arch/arm64. Debian and Ubuntu use the arm64 moniker rather than aarch64. It's part of the eighth revision of the Arm architecture, known as ARMv8. It was a ground-up design of the architecture,
03:01
but ensuring that there was backwards compatibility. And Arm have come out with three 64-bit CPU designs so far: the Cortex-A53, the A57, and the A72. And the diagram there just shows the evolution from ARMv5 all
03:24
the way through with the new features being added in each revision and ensuring that there's always backwards compatibility. You'll notice that under ARMv8, there is an AArch32 and an AArch64. AArch32 is the 32-bit implementation of the ARMv8
03:44
architecture. And we recently announced the Cortex-A35, I believe it is, which is a pure 32-bit implementation. So, what's the software status on AArch64?
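The naming split described above (aarch64 in some places, arm64 in others) tends to show up in build scripts and packaging checks. As a minimal sketch, not something shown in the talk, a script could treat both spellings as the same architecture. The helper name `is_arm64` and the alias set are assumptions made for illustration; `platform.machine()` is the standard Python call that reports the machine string.

```python
import platform

# Names that different projects use for the same 64-bit Arm architecture.
# This alias set is an illustrative assumption, not an exhaustive list.
ARM64_ALIASES = {"aarch64", "arm64"}


def is_arm64(machine: str) -> bool:
    """Return True if the machine string names the 64-bit Arm architecture."""
    return machine.strip().lower() in ARM64_ALIASES


if __name__ == "__main__":
    machine = platform.machine()
    print(f"{machine}: 64-bit Arm = {is_arm64(machine)}")
```

Comparing against a small alias set keeps 32-bit strings such as armv7l from matching by accident.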
04:06
It's mostly working, basically. Almost everything builds. So, as I overheard not too long ago, if it builds, it's done. Not quite. So, if you look at OBS,
04:29
there's a bit of disparity between x86 and AArch64 there. But we are pretty close.
04:40
So, what we know is that the obvious things that people care about are done. And because that is the large chunk of software out there, the set of items that still need to be ported is smaller, but part of the reason is that they're also much
05:00
harder to port. The low-hanging fruit has now pretty much been picked. So, that's great. The software is ported, but does it run? If it's got integrated unit tests, it should run. A lot of the software hasn't necessarily been tested
05:23
because not everyone runs all the software that's available. And some people that run some of those niche case software workloads may not have tried it on Arm yet. Even if it does run, there's a large chance that it may not
05:42
run optimally. So, you're not going to get the best out of that software. But if we look at what has been optimized to run well on Arm, most of the key languages are there. OpenJDK, C, C++, Node.js is there. Obviously,
06:10
the compilers, both GCC and LLVM, have been optimized. The kernel's had a lot of optimization added to it, especially in the areas of crypto and RAID 6. And there's continual work going on, obviously, in the kernel to ensure
06:25
that more is optimized to run as well as possible. And if you look further up the stack, at actual application stacks, you've got OpenSSL, Ceph, Hadoop,
06:40
and Xen have all been added. Ceph and Hadoop especially have improved CRC performance. And the Xen code base on ARM64 is much smaller than it is on x86. It was a relatively fresh implementation of the Xen hypervisor.
07:05
Well, that's all good, but what's the difference really on optimized code versus straight out of the box? So, if we look at what some of our friends over in Debian have done using their botch application as part of their QA
07:24
side of things, it's written in OCaml. And if you use the fallback C implementation of OCaml, it runs in just under five hours. But if you're using the native optimized OCaml, it runs in only one hour and 15 minutes.
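To put a number on those two run times, here is a small worked calculation. It rounds "just under five hours" up to five hours, so the result is an upper bound on the speedup rather than a measured figure.

```python
from datetime import timedelta

# Run times quoted in the talk. The fallback figure is "just under five
# hours", so five hours is an upper bound, not a measurement.
fallback = timedelta(hours=5)
native = timedelta(hours=1, minutes=15)

# Dividing one timedelta by another yields a plain float ratio.
speedup = fallback / native
print(f"native code is roughly {speedup:.1f}x faster")
```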
07:45
So, there is a vast performance improvement using optimized native code. May not be rocket science, but some people don't actually grasp the relevance there. So, there's quite a bit that has been ported,
08:04
but not optimized yet. Lua, Rust, Golang, they've not quite had the care and attention needed to get the most out of them.
08:22
Recently, Mono has been released upstream with full ARM64 support. So, that's probably one of the newer porting additions to the stack. Wookey, who works at Arm and is also a Linaro assignee,
08:42
spends his life dealing with ARM64 builds. And at Linaro, a couple of years ago, they spent almost a full year going through assembler code to find out which software packages had assembler in there that was
09:04
just getting in the way. And rather than trying to fix the assembler code, it's much easier, much cleaner just to remove it. A lot of the time, it's superfluous. Linaro are running a competition, if you go
09:22
to performance.linaro.org, for help in improving the performance of software on ARM64. So, what pieces do we know are missing?
09:43
We know LuaJIT still does not run on ARM64. There's quite a few network-related applications that leverage LuaJIT now. That is being worked on. Linaro is working with upstream to try and drive that through. Golang currently is moving to an SSA backend for the Go
10:06
compiler. When that gets released in 1.7, it will not have support for ARM64. It will fall back to the existing backend from version 1.6. But we
10:22
are working with upstream to get the SSA backend ported across to Arm. That's expected in Golang 1.8, which is due next year, I believe. And as I mentioned, Mono's been ported. But within openSUSE, it's not been packaged yet. So, a lot of the unresolvables that I
10:45
showed earlier on are Mono dependencies. So, that should hopefully bring us much, much closer now to what's available on x86. So, software's all good. But can you run it on anything?
11:07
It depends. So, if you've not got any space, what do you do? Can you use an EC2 instance or something? Yes. RunAbove from OVH uses ThunderX systems. So, you can get VMs on there,
11:29
if you want to use them as build hosts or have them as your VPS. And of course, you can use the OBS to build your
11:40
software. It's running a mixture of Applied Micro X-Gene 1s and Seattle. So, you've got a choice of cloud-based infrastructure, if you wish. But say you've got a big desk and you like to see
12:03
blinky lights. It's quite soothing to help with your bug reports. There's a wide variety now of hardware that you can get that's running AArch64. So, it depends, out of that list, what your
12:22
budget is. If you've won the lottery, you could go for an HP Moonshot, kind of sitting around just over 10 grand. You can go a little bit lower and go for the Gigabyte H270, somewhere around the $5,000 mark. That's four sleds, each
12:43
with two sockets. Each socket's 48 cores. So, that's 96 cores per sled, 384 cores roughly, if my math works out. If that's still a little bit too expensive, that's fine. We can
13:02
go a little bit lower. 3,000 for dual socket. 96 cores. Bit more disk space, et cetera. Still a bit on the pricey side. We can go for the SoftIron Overdrive 3000. I've got one and it runs beautifully. Or you can go for an Applied
13:25
Micro X-Gene 1 dev platform. A little bit more of a desktop-size form factor. Or you can go for a single-socket 1U Gigabyte running ThunderX. And even smaller still,
13:45
server form factor is the R120 as well, which, again, runs ThunderX, produced by Gigabyte. If you're not keen on that form factor, you can buy the MP30 from Gigabyte running
14:03
X-Gene 1 and put that in whatever chassis you desire. You can put in a PCI graphics card and use that as your desktop. The Overdrive 1000, just announced, is a
14:24
nice small desktop form factor. Great for testing software out, writing code on, running, use it as a home server. It's up to you. More bare bones is the
14:44
LeMaker. That's very similar to the Overdrive 1000. It has a few additional pieces, like an exposed PCI slot. But it's just a single-board computer. You don't get
15:00
any hard drive, any case, et cetera. So it's kind of BeagleBone-esque, but grown up. We can go for some of the smaller 96Boards, like the HiKey, which comes in two variations from LeMaker. One gig at $75 or a two gig version at $99. It's very much targeted
15:25
at the more embedded side of usage. And the same with Qualcomm's DragonBoard 410c from 96Boards. That, again, $75. Or you can go a little bit less. $35, you
15:45
get the Raspberry Pi 3. And probably the cheapest one at the minute is the Pine64. Two variations, half a gig or a gig of RAM, and it depends on your budget, $15, $30. So I've got my hardware. I've got
16:07
my software. How do I test how well it all performs? Benchmarking is not simple. It's very easy to game. You need to ensure that you have equivalent platforms
16:23
if you want to do comparisons. If it's not like for like or as close as possible, there's not much point in doing a comparison. And when you're doing your benchmarking or your tests, make sure you do it
16:40
over a set period of time longer than one minute. You will find variations as you repeat your tests over and over and over again. So you'll need to amortize your results. You can't just take, well, that looks like the best one. I'll just take that
17:01
one. And you need to pick a real world metric and one that's common across all the platforms that you want to look at. If you're just going to pick something that's specific to ARM, yet you want to compare it against Power, MIPS, x86, whatever, or a different ARM platform, you need to ensure
17:22
that that metric is common across the platforms that you want to test. And at the end of the day, if you know of good benchmarks, please let me know, because there's a wealth of benchmarks out there, SPECint, et cetera, but they're not really
17:44
representative of real world use cases. If an application has a good benchmarking setup, please let me know. That'd be great to find out. There have been a couple of articles on benchmarking since
18:01
Cavium publicly announced production availability of their ThunderX platforms. One was by Intel, where they went through a third party to obtain a system, and within that, they openly admitted that they have no idea how to tune the system, so yeah, your mileage may vary, but we know
18:20
how to tune our Intel systems, and when we tune our Intel systems and run this software workload, it looks much better than these really crappy numbers that you can get on this ThunderX platform. Not quite like for like. And there has been another one where it's a little bit more evenly balanced,
18:42
so I recommend you read The Next Platform on that. So that's my brief run through of what the current status is, where we know we're at. Now, this is where the interaction comes.
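The benchmarking advice above (repeat the test, expect variation between runs, and don't just pick the best-looking result) can be sketched as a small timing harness. This is a minimal illustration, not a tool mentioned in the talk: the function names `time_command` and `summarize` are made up for the example, and the workload in the usage section is a placeholder that is far shorter than the minute-plus runs recommended above.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=5):
    """Run cmd `repeats` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Summarize all runs so that no single run is cherry-picked."""
    return {
        "runs": len(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
        "best": min(durations),
    }


if __name__ == "__main__":
    # Placeholder workload; substitute the real-world workload you care about.
    cmd = [sys.executable, "-c", "sum(range(1_000_000))"]
    for key, value in summarize(time_command(cmd)).items():
        print(f"{key}: {value}")
```

Reporting the mean and spread alongside the best run makes it harder to game a comparison, and running the same command on each platform keeps the metric common across them.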
19:05
So does anybody have any questions, or does everybody want to get outside to the paddling pool?
19:24
So the question was, how much open source is in the stack, and are there any black spots that I would rather not cover? So with the design of ARMv8 and the concerted targeting
19:44
of the enterprise space, because as Norman Fraser mentioned, we're traditionally mobile space, which is very black-boxy. What we find while moving into the enterprise space is that we need to be as close as possible to existing
20:04
architectures, so that for people to migrate to or implement on ARM servers, it's as seamless a move as possible. So the stack is as open as possible.
20:22
We have open specifications for base server systems, which dictate that to be compliant, you must ensure that you use EFI, et cetera, and other standard policy-based things.
20:42
We've also gone one step further with server base boot recommendations, where we make some recommendations on what you should do for your boot process. Not everyone agrees with that, which is fine. But there's open firmware available.
21:05
There's, you know, we push everything that ARM does, we push that upstream. Linaro, who ARM work with, who's a nonprofit engineering organization that is made up of lots
21:23
of ARM partners and lots of different segments, they push as much as they can upstream. I think they're in the top five of Linux kernel contributions now, have been for a year or so, and they work in multiple areas, both low-level stack and higher up in user space.
21:47
So we try and keep everything as open as possible because we know it will bite us squarely if we don't. We may not be as quick as one would like, but we do our best.
22:08
Could you say a few words about the West African country? No comment.
22:29
Are there any plans to provide an AArch32 build of openSUSE? So, possibly.
22:43
Nothing has been decided. Obviously with Leap moving to 64-bit, Leap is only on AArch64, but there's nothing stopping anybody maintaining a 32-bit version of Leap if they wanted to, but it's going to be a lot of hard work.
23:05
Okay, so work in progress. Thank you. So for an AArch32 build, it's certainly possible. There's no AArch32 hardware available in the market yet,
23:22
but that will be coming pretty soon. So one question for everyone here is, does anybody know of software that does not work on AArch64
23:41
that they would like? Excellent. So we're all good there. Works for me. Yeah, any other questions? Comments?
24:00
Flames? Love? Perfect. Thanks.