
Intel BayTrail graphics overview


Formal Metadata

Title
Intel BayTrail graphics overview
Subtitle
Discussion of the Intel BayTrail platform and architecture
Title of Series
Number of Parts
199
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Discussion of Intel BayTrail SoC architecture from a graphics perspective, including overview of render engine, display engine, memory architecture characteristics, and current status in Linux. Hopefully the presenter will have some sample platforms for people to play with after the talk. The Intel BayTrail platform incorporates a new, out of order Atom CPU, an Intel HD graphics engine, and various IP blocks to support both tablet and notebook platforms. This talk will give an overview of the new hardware features, and the status of the graphics driver support in Linux.
Transcript: English (auto-generated)
All right, I think everyone's settled in. I'm going to jump in here. My name is Jesse Barnes. I work in the Intel Open Source Technology Center with Daniel, who just presented, and several others who are here today on the i915
graphics driver primarily, and then miscellaneous other GPU-related stuff inside of Intel. Today, I'll be talking a little bit about the Intel Bay Trail platform. And I had actually wanted to cover this last year, but I only got approval to talk about anything in any detail this year, so push forward.
So some of this may not be totally new to you, especially if you've been reviewing what came out of IDF a couple of months ago on Bay Trail. But this is something I'm really excited about, because it's a new SOC for us in a couple of ways.
It's got a new CPU, a redesigned Atom CPU, and a new GPU relative to our previous SOC products. So it's really interesting. And we've also been involved on the bring-up and some of the design work from very early on in the open source
side of things. So it's exciting for us from that perspective as well. Wasn't just totally driven by Windows stuff or powered on solely with Windows, that sort of thing. So anyway, like I said, this is a new SOC specifically focused at the tablet market and convertible market.
It's not designed to go into phones. It doesn't quite have the power envelope that you need for that and the level of integration you need for that, but it does make a pretty decent tablet chip. It's fabbed on our 22-nanometer process, which isn't our bleeding edge. That's the Broadwell stuff that we're pushing out now on 14-nanometer.
But our 22-nanometer process is pretty good. It's pretty mature and works well for us. They're really big. I mean, it's a really big change, both CPU and GPU-wise for us on the SOC side, on the Atom side. And like I said, the Linux stuff was a priority from early on.
And the Linux team has been involved. Our open source team doing Linux drivers has been involved from very early on. We were at the power-on when the silicon first came back from the factories, doing the initial debug and development of our driver stack. So Ben, who may be somewhere around here, was at that. Ken Graunke, who's on the Mesa team, was there helping.
And we got 3D games up and running and stuff within a day or two of getting stable silicon. So that was really exciting. So some of these slides you may think, well, these are really pretty. How did he come up with those? Well, I didn't. These are mostly stolen from IDF, so hopefully they won't be
too full of marketing stuff. But this is the overall architecture of the chip. So it is an SOC. It's got a whole bunch of IP blocks on it. But the big changes are the new Atom up to quad core, and then the new HD graphics, which is not new from the client side. It's basically an Ivy Bridge graphics part.
But it's been chopped down and put onto the SOC. So if you're familiar or if you've been following our previous Atom stuff, you'll know that earlier chips used an IP block from Imagination, the PowerVR graphics stuff. And I'm sure everyone has a lot of love for their stuff.
And it'll be sad to see it go, but now we've got gen graphics instead. So you'll have to tolerate that. And then it's got kind of a similar feature set to what you'd see on a client part. There's the CPU and graphics power sharing and the same ISA extensions for security and virtualization. That sort of thing is all present.
So it looks like a regular client CPU. And it actually has a similar speed. So like I said, these are kind of the key changes. So the big one on the CPU side is going to the Silvermont architecture, which I'll talk about in a second, and then the graphics side going from the SGX running at a lower frequency to the gen graphics
running at a higher frequency. And of course, it's a totally different rendering model, too. It's not a deferred tile-based rendering. It's immediate. Resolutions increased. And in fact, on the Linux driver, we go up higher than 2560x1600, thanks to the mode setting architecture stuff that Daniel was talking about.
We're able to change the display clock, the core display clock, at runtime if we need to bump up to an even higher resolution. Memory bandwidth has dramatically increased. So on some of these SKUs, like on the convertible tablets and notebooks, you get regular PC-style memory bandwidth, which is really great. And we went up to new USB as well.
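As a rough illustration of why a higher resolution can require a faster core display clock: a mode's pixel clock is roughly total pixels per frame times refresh rate, and the display clock has to be fast enough to feed it. The sketch below shows only that arithmetic. The blanking overhead, the candidate clock list and the 90% headroom factor are illustrative assumptions, not Bay Trail hardware values or the i915 driver's actual selection logic.

```python
from typing import Optional, Sequence


def pixel_clock_khz(hactive: int, vactive: int, refresh_hz: float,
                    blanking_overhead: float = 0.12) -> int:
    """Approximate pixel clock in kHz for a display mode.

    blanking_overhead is an assumed fraction of extra pixels spent in
    blanking; real timings come from the panel's EDID.
    """
    total_pixels = hactive * vactive * (1.0 + blanking_overhead)
    return round(total_pixels * refresh_hz / 1000)


def pick_display_clock_khz(pixel_khz: int,
                           candidates_khz: Sequence[int],
                           headroom: float = 0.9) -> Optional[int]:
    """Return the lowest candidate clock that can feed the pixel clock.

    headroom is a hypothetical limit on how much of the display clock
    the pixel clock may use. Returns None if no candidate is fast enough.
    """
    for clk in sorted(candidates_khz):
        if pixel_khz <= clk * headroom:
            return clk
    return None


# Hypothetical clock steps, for illustration only.
HYPOTHETICAL_CLOCKS_KHZ = (200_000, 266_667, 320_000, 400_000)
```

With these made-up numbers, a 1920x1080 mode at 60 Hz fits the lowest step, while 2560x1600 at 60 Hz needs a higher one, which is the situation where changing the clock at runtime pays off.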
So that's also important. So you don't have to wait forever for your multi-gigabyte USB stuff. Overall, this is what the basic block diagram looks like. And this is a bit different than what you would see on a client platform. It doesn't have a QPI bus. It's a custom ring fabric that we use
on the Atom side, just within the SoC. One of the big differences relative to the client parts: at least since Sandy Bridge, the GPU has been able to share, in a cache-coherent way, the last level cache on the CPU. And so both the GPU and the CPU are integrated on the
same die, and they can share the last level cache. On here, they're integrated on the same die, but we do not have that cache sharing. So each of those dual core chunks of IP has its own shared cache, but it's independent of the graphics. So that makes for quite a different set of performance
characteristics. So if you're tuning for performance on this part versus, say, an Ivy Bridge laptop, you're going to see very different things because of that lack of cache sharing. So you have to be careful about how you map and read back memory, and also how you set up your rendering and things like that. And it has the Intel HD graphics plus the Quick Sync
stuff that you would see on the client side. So it can do the regular video decode and encode you would see in the client space. And in addition to that, we've got a VP8 decode engine on these parts that's available and used in some cases.
This one looks really marketing-like. But the takeaway for this one is the huge improvements in power or performance you get to choose. The huge difference here is this is the first Atom part that's out of order.
So if you've looked at our previous micro-architectures on the CPU side, they've been in order, maybe dual issue. Now we've got up to four cores, out of order, and the performance has just dramatically improved. If you use one of the old Atoms, that's like using an ARM platform. You're like, I don't want to do any builds on this. I don't want to actually log into this machine. I want to just copy stuff onto it and pretend it's not
there. This guy, you can log into it, and it's like using a Core 2 Duo laptop from a couple years ago. I mean, it's like reasonable speed. You don't want to stab your eyes out when you're using it, so it's a huge, huge improvement from that point of view, and it's got the new architecture extensions. So it's a reasonable CPU to deal with in terms of having
the feature set you're used to. And there's all the fab stuff if you're interested in the 3D gates. So this is some of the benchmarks we've done comparing the new Atom CPU side to some of the competition.
Like I said, the other guys, these are fast ARM CPUs, but they're still terrible. Sorry if there are any ARM folks in here, but I used Atom, and even our old generation Atom was pretty competitive. It actually beat out some of the ARMs.
And in using it, I'm like, I don't understand how the ARM guys even survive. Like, this is a terrible thing to use. I don't want to use this platform. This new one is like, it's significantly faster. So you can see, I mean, this is just SPECint. So this is your current bench, basically. And it's way, way faster. And the power consumption is also lower.
Some of these guys, I don't know if this is actually iso-power, but you can see at least the clock rate. Yeah, Greg's saying that Tegra even takes more power on this same benchmark and gets less performance. And there's legal disclaimers at the end that I don't know
what I'm talking about. So yeah, I'll just agree with that. Like I said, the next big chunk is the graphics side. So, getting rid of Imagination. Imagination, for all their faults, they're tuned to work in the mobile space and tend to be
fairly power efficient and can achieve high performance. I mean, if you look at the latest iPad Airs, they've done pretty amazing things with that architecture. It's proven. I don't like to work on it, because it's a closed source driver stack, and it's a different architecture than what we have. And I'm more used to ours, so I'm really happy about
this change, because we have a fully open source stack. But that's mostly a personal preference. From an Intel perspective, I think we have an opportunity here to really prove this architecture out in a lower power envelope. And I think we're doing that in Bay Trail.
So you can see that it's only got four EUs. It's a pretty simple chop down from the client side Ivy Bridge. So the performance is scaled down to what you'd expect minus the LLC sharing that I already talked about. And all I can say in terms of the future is just keep
your eye on this space. It's going to get pretty interesting this year and next year. And then I talked about the power and the video decode stuff, it's all available too. Here you can see the gaming side. So this is basically comparing the graphics performance.
This isn't a leadership part, by any means, from a graphics perspective. It's merely competitive. To me, this is actually pretty impressive, given that it is a fairly straightforward chop down versus like a dedicated redesign for mobile. The fact that we did this well without a tremendous
amount of effort tells me that we have quite a bit of room to improve. And subsequent parts should be really nice. Plus, we've got an open source stack, so that's pretty exciting. And yeah, it beats the crap out of our old stuff.
I don't know if there are any media folks in here. There are so many acronyms and different codecs. But it does have, like I said, the same encode and decode support and transcode support that you would see in the client part. So that's nice from teleconferencing or video
calling, that sort of thing, point of view. So all that stuff is there, and that's actually a really nice feature of our parts. And I probably ought to know more about it, but I don't spend much time on the media side. On the display side, 10 by 7, you can't really see this.
This is a chamber of horrors. There are demons hanging people and torturing them. On the display side, the pipe stuff and the register block on the display mostly matches what you would see on the client side.
But when you get out to the port side, they redesigned everything and did a custom implementation of the display PHY itself. That's the chamber of horrors, and it's been a thorn in our side that I think the hardware guys really take pleasure in torturing us. It's just a totally different programming model.
We have a very low level access to all of the PHY characteristics and all of the filtering and things that get applied on the DAC for the PHY. So that's been a challenge to get right across a variety of configurations. But on the plus side, we've got MIPI here.
We've got four-lane DisplayPort. And we can actually go up past 2560x1600, which was mentioned earlier. I think we're at like 38 by 18 or something like that. So it's not a bad display part either. And it's relatively low power.
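To put the lane count in perspective, here is a back-of-the-envelope check of whether a mode fits on a DisplayPort link. The per-lane rates (1.62, 2.7 and 5.4 Gbit/s) and the 8b/10b encoding overhead come from the DisplayPort standard. Which of those rates Bay Trail's ports actually support is not stated in the talk, so the rate is left as a parameter, and the 268.5 MHz pixel clock used in the note below is an assumed, commonly quoted reduced-blanking figure for 2560x1600 at 60 Hz.

```python
# Per-lane DisplayPort link rates in Gbit/s (RBR, HBR, HBR2).
DP_LINK_RATES_GBPS = {"RBR": 1.62, "HBR": 2.7, "HBR2": 5.4}


def dp_payload_gbps(lanes: int, rate_name: str) -> float:
    """Usable link bandwidth in Gbit/s after 8b/10b encoding overhead."""
    raw = lanes * DP_LINK_RATES_GBPS[rate_name]
    return raw * 8 / 10


def mode_fits(pixel_clock_mhz: float, bits_per_pixel: int,
              lanes: int, rate_name: str) -> bool:
    """True if the mode's pixel data rate fits in the link's payload rate.

    This ignores protocol details beyond 8b/10b, so treat it as an
    approximation rather than a definitive link-training decision.
    """
    needed_gbps = pixel_clock_mhz * bits_per_pixel / 1000
    return needed_gbps <= dp_payload_gbps(lanes, rate_name)
```

For example, `mode_fits(268.5, 24, 4, "HBR")` evaluates to true, while the same mode on two HBR lanes or four RBR lanes does not fit, which is why having all four lanes matters for the higher resolutions.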
We have some power gating features that we can do. When the pipes go off, we can actually power gate the whole display engine, like we do on the render side. When the GT's idle, we can power gate that. So we have that support on the display side now too, which is really nice. The low level aspect of it is a pain, but we mostly have
it under control now. So this is what we're shooting for. This is like a display on idle case, where you're just looking at your screen, but you're not doing anything. Or maybe you're rendering a little bit. You can see the graphics is lit up.
But what we're shooting for on Linux is this. When the display goes off, we want the whole SoC to basically go off and power gate everything. So there's just some tiny microcontrollers and things that are still awake. But this is not a full S3 state. This is a runtime suspend state. So the SoC architecture that we have now lets us get to
really low power consumption during runtime. Greg asks, are we going to bother with S3 because we're going to do runtime suspend? Yeah, S3 in some cases can still save you a little bit more than a full runtime suspend case. So we'll support both S3 and runtime
suspend all over the place. Going forward, if we can get parity between S3 and runtime suspend, I think at some point we may just implement S3 in terms of runtime suspend. But there's a logic, and there's an overhead to all of the runtime stuff. So S3 is also simpler from a platform and implementation
point of view. So there's trade-offs. So the upstream status of this: we started pushing these bits way back, almost two years ago now. Landed the initial graphics stuff in 3.6. And the full S0ix, that's our runtime suspend support,
isn't pushed upstream just yet in the core side of things. And then on graphics, that's pretty much done. We just have to get it merged. Things really stabilized around 3.10. So if you're starting to get these platforms for yourself, don't do anything before 3.10. Or you're going to be in that chamber of horrors that I
talked about. The display up and down clock I talked about. So I don't think any of those super high res display products are out on the market yet. But when they are, we'll be ready for them. So you'll be able to use those right away. RC6 is a feature on this, too. And it works.
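One way to check whether RC6 is actually kicking in is to sample the GPU's RC6 residency counter twice and see what share of the interval was spent in RC6. The sketch below assumes the i915 driver exposes that counter at `/sys/class/drm/card0/power/rc6_residency_ms`. The path and card index are assumptions that can differ by kernel version and system, and the helper names are made up for illustration.

```python
import time
from pathlib import Path

# Assumed sysfs location of the i915 RC6 counter; adjust for your system.
DEFAULT_RC6_FILE = Path("/sys/class/drm/card0/power/rc6_residency_ms")


def read_residency_ms(path: Path = DEFAULT_RC6_FILE) -> int:
    """Read the cumulative RC6 residency counter, in milliseconds."""
    return int(path.read_text().strip())


def residency_percent(before_ms: int, after_ms: int,
                      elapsed_s: float) -> float:
    """Share of the sampling interval spent in RC6, as a percentage."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return 100.0 * (after_ms - before_ms) / (elapsed_s * 1000.0)


def sample(path: Path = DEFAULT_RC6_FILE, interval_s: float = 1.0) -> float:
    """Sample the counter twice, interval_s apart, and return the percentage."""
    start = time.monotonic()
    before = read_residency_ms(path)
    time.sleep(interval_s)
    after = read_residency_ms(path)
    return residency_percent(before, after, time.monotonic() - start)
```

On an idle desktop, calling `sample()` should report a high percentage if RC6 is working. A value near zero suggests RC6 is disabled or something is keeping the GPU busy.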
I know some of you may have had problems with that. RC6 on Bay Trail is a different implementation than it is on the client side. And it's actually simpler. And it seems to be pretty stable. We've got some display power shutdown stuff like I talked about, some other display features that are actually not
Bay Trail specific, the dynamic refresh rate support. So if the screen is mostly idle, you could actually drop the refresh rate, save some power on your memory bandwidth, let your memory stay in self-refresh for longer. We have some bugs. Again, chamber of horrors. This one we just fixed the day before I caught my flight
out here. And it had been driving me nuts for a week and a half. So I'm really happy that we fixed that. So if you were buying a Bay Trail-based platform on Amazon or something now, you would have run into that right away. But we've got that under control now. So that should be fixed in the current kernels.
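Going back to the dynamic refresh rate point: the memory bandwidth spent just scanning out the framebuffer scales linearly with the refresh rate, so dropping the rate on an idle screen cuts that traffic proportionally. The sketch below is only that arithmetic. It ignores blanking, multiple planes and any hardware-specific fetch behavior, and the 40 Hz figure in the note is an illustrative value, not a number from the talk.

```python
def scanout_bandwidth_mb_s(width: int, height: int, refresh_hz: float,
                           bytes_per_pixel: int = 4) -> float:
    """Approximate memory read bandwidth for scanout, in MB/s (10^6 bytes)."""
    return width * height * bytes_per_pixel * refresh_hz / 1e6


def savings_percent(high_hz: float, low_hz: float) -> float:
    """Percentage of scanout bandwidth saved by dropping the refresh rate."""
    return 100.0 * (high_hz - low_hz) / high_hz
```

A 1920x1080 panel at 32 bits per pixel reads about 498 MB/s at 60 Hz. Dropping to 40 Hz would save roughly a third of that, which is time the memory can spend in self-refresh instead.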
And we'll push that back to stable. And then there's some other issues. Like Alan Cox got this T100 machine with a Bay Trail-T, a tablet SKU, in it. And it's actually a MIPI-based panel on his machine.
But it showed up as a VGA for some reason. And as he was trying to configure it, he's like, oh, I can get it to work if I program the VGA to do this. And it just so happened that he was programming the VGA in such a way that we were driving both VGA and MIPI with the same timings, and the display happened to work. So there are still some things we have to work out on
the MIPI side. We haven't tested it tremendously, at least in upstream Linux; we have it in our Android trees. So we'll see some improvements there. On the Mesa side, it's been in for a long time, since Mesa 9.2 and even before, I think. There's a lot of performance headroom here. Like the performance for some things is actually looking
pretty good. But for other things, we're like at half the speed that we could be. So the Mesa team is just kind of continually improving performance there. Ian is probably not here since we were out last night until about 6.30 AM. But you can ask him later when he shows up.
On the media side, it's been supported in libva as of the 1.2.1 release. The VP8 stuff is in progress. The way that the VP8 bits are integrated, it shares some logic in terms of power gating and interrupt support
with the i915 IP block. So there are some i915 changes required, and then there's a bunch of driver code in our Android tree that's published in open source, but it's not integrated upstream. And I don't know if the people we have working on it are actually going to push it upstream or not. It might land in staging at some point, but it's pretty
ugly still. So we're working on that. There are a few platforms out there now, mostly Windows stuff. I keep refreshing on Amazon because I want to see the Android tablets come out. But I think those will be coming in the next couple of months.
Or if you go to Shenzhen, China, you can have them build you one right now. That's been another kind of issue with people building weird configurations and saying, hey, it doesn't work. So if you're looking for the SKUs, I just included them here. You can also jump on Wikipedia. Those are all there. But if you're on Newegg or something and you want to look,
there's a new NUC, the Next Unit of Computing, that's kind of a side project for Intel, a nice little tiny form factor device. But there's one with Bay Trail in it now. There's a new one with Haswell and a new one with Bay Trail. And so both of those are really nice platforms for a tiny
device like that. And the Bay Trail one, hopefully, is fanless. So that's a fun way to get your feet wet with Bay Trail if you want. And then there's new stuff coming. So just keep looking. OK, legal disclaimer. Anything I said, you can't hold against me because I'm a fool.
So any questions?
I can hear you. So you're going to be on the Mesa side of the graphics. You fully support OpenGL ES 3.0. And when you bring this over to Android, you're obviously going to be supporting it there as well. Will you be bringing desktop OpenGL to the Android side
while being able to create it with an EGL context? Will we be bringing full OpenGL to Android? That's not something that we're going to stop people from doing. I don't think Google has any plans to expose the full GL API through their framework, which is really
what would be required. But if you want to take a Cyanogen build with Mesa and enable full GL in your build and then do NDK stuff, that's definitely doable, I would think. On Android, there's some additional complexity because they've got a GL wrapper library to unify the feature
set, but yeah, it's something that you could definitely do if you wanted to. I guess Ian's still not here, but he would be a good one to ask about that too. Any other questions?
Cool. Well, I'm glad I got to brag about this. I've been excited about this platform, and I'm glad it's finally out there. Thanks. So next talk is going to be about Nouveau.