Huawei HiAI Foundation Accelerated AI
Formal Metadata
Title: Huawei HiAI Foundation Accelerated AI
Number of parts: 90
License: CC Attribution 3.0 Unported: You may use, adapt, copy, distribute and transmit the work or content in unchanged or adapted form for any legal purpose, as long as the work is attributed to the author/rights holder in the manner specified.
Identifiers: 10.5446/47685 (DOI)
Transcript: English (automatically generated)
00:00
Those of you who don't know me, I guess most of you don't. My name is Sean. I'm the principal product manager, product unit manager, actually, from Huawei. And you're going to stick with me for some time this afternoon. And I'll tell you something about our AI stuff, on-device AI stuff. And hopefully, towards the second half of the session, I will actually do some
00:26
real-time coding. Hopefully, the demo guy is with me, and that can work. If that doesn't work, I also have a backup plan to play some videos. But I really hope it will work. And so, first off, let's do some polls.
00:42
Who of you guys are developers here? Please raise your hand. Okay. I see, like, 45%. So, who of you guys are not developers? Man, this works every time.
01:01
You see, this really works every time. You ask a room full of audience... Oh, it's not full here because it's such a big room. You ask them, like, who are the developers? Some of you guys raise your hand, and then you ask who are not developers, and the other half didn't. So, that means we have some guys here in this room that's not in a binary state.
01:22
It's in the middle. You guys who didn't raise your hand in both of the times, you're not developers, at the same time are developers. So, who are you? So, let's try it again, huh? So, who are not developers here? I see one girl, I know her. So, three guys, four, five.
01:41
Okay, the camera guys. Who are developers? I think we're reaching, like, 97%. I still think there are one or two or three guys who didn't raise their hand both times, but anyways, just a quick bit of info here. This is called, in behavioural economics,
02:02
the default option phenomenon, which means that most people stick with the default option — because the default option here is not raising your hand. Now, surprisingly, this also fits the description of physics, because in physics there's this law of the lowest energy state.
02:25
Everything in the universe tries to stay at the lowest energy, and here, in this case, not raising your hand is saving energy. As you can see, all these laws in physics and economics just work together in this situation.
02:41
It's really amazing. Every time I try this, every time it works. But anyhow, let's get started. So I'm here to talk about our AI stuff. So like I said, two sessions. Let's just get started with the first session. So you probably have been through some of our talks, or at least some intros or something on our stage.
03:01
So basically, one of the major features of the new Huawei devices coming from the end of last year and this year, namely the P20 series and the Mate 10 series, is that we actually put a chip in our phones. That chip is called the NPU. NPU stands for neural processing unit,
03:21
which means that that chip processes neural networks on the device without any help from the cloud. And this means that you can put a lot of AI stuff on the device now, for two reasons. Number one is that it's faster and more power efficient
03:40
because now you don't have to send a bunch of stuff into the cloud. This actually is very useful in some scenarios, like the Prisma guys who's going to come later on the stage to talk about their stuff. Imagine you have to send a big picture into the cloud, do the inference, then come back. That takes time because network is not always stable.
04:02
So that's the first reason. It's performant yet power saving. It performs much faster. I can show you some metrics later. It performs much faster but uses much less energy than the CPU. And the second reason being that sometimes you have to really think
04:24
about the privacy of your users. And sometimes the stuff they give you, they do not necessarily want to share with the rest of the world. Meaning that you want to process these stuff on your device without having to connect to the network, without having to send it to the cloud.
04:43
So for these two reasons, you want to use the on-device AI. And maybe there are some other reasons as well, but I think these are basically two of the major reasons. Now, we have seen some trend in the past couple of years, actually, namely this year and last year.
05:00
You know, AI stuff are coming to the device. Now, I have actually collected these couple of articles and they are from three Jeffs from different companies. But apparently, you know, after the fact, it seems like successful guys tend to name themselves Jeff.
05:22
So if you want to be more successful, maybe you should thinking about changing your name to Jeff. But anyhow, the first Jeff from Apple says that the smartphone will get even smarter with on-device machine learning. Of course, Apple has their agenda because they are not particularly strong on the cloud, so they want everything to be on the device.
05:41
But anyhow, I think this is true though. The second Jeff from Qualcomm said that on-device processing and AI goes hand in hand. Of course, if a lot of AI is moving on the device, then you do more processing on the device, which is good for their business because they're Qualcomm, they make chips, right?
06:00
And the third guy, Google, Jeff Dean, I think most people should be familiar with Jeff Dean, or at least know about this guy. He's an amazing guy. He said 80% of smartphone will have on-device AI capabilities by 2022, which I think is going to be true. We'll see. I mean, it's 2018, so there's four year more to go.
06:23
But anyhow, as you can see, there is this trend that AI processing — or AI inferencing, for that matter — is moving more and more towards the device, right? So that's the trend there. Now, let's talk about our NPU.
06:41
As you can see here, we have some benchmarks. This is done by our own lab, but in all fairness, we have 25 times more performance than if you would do the same thing on a CPU in terms of doing the AI inferencing with neural networks. And yet, it's power efficient,
07:03
50 times more power efficient than the CPU. And if you want to talk about specific numbers, here's a benchmark we did with ResNet-50: we can do 2,005 image inferences per minute,
07:20
while if you do that on the iPhone — particularly the iPhone X — you can do like 880 per minute. And then everything from there goes downhill; if you do it with the Galaxy S8 — I don't think they have an NPU, but I could be wrong — it does 95.
07:42
So you can see the comparison there. And these are other networks that we did benchmarks on, but anyhow, I want to actually have another quick poll. Don't worry, it's not a trap anymore. So who of you guys are familiar, at least familiar with AI or machine learning,
08:00
and know the basic concepts, and you know about ResNets and 50s and 152s and Inceptions and VGG networks? So some of you guys know. But anyhow, for those of you who don't know, these are just different AI models. They're mostly trained for image recognition.
08:21
They have different architectures and have different depths, if you will, if you compare ResNet-50 to ResNet-152, it's just basically how complex they are, 50 versus 152, if it's under the same name, like ResNet here. But anyhow, it's just different AI models that can do the image inference,
08:42
and of course, the amount of computation in these models is different. So here are some benchmarks for these models. But if you go down to the basics, the reason the NPU is faster than the GPU and the CPU comes down to the traditional architecture: on a CPU,
09:04
if you think of it as a table, a lot of the surface area is occupied by the control unit. That is what the CPU is good at — it's good at logic control — and it has a lot of units that do the computation,
09:26
arithmetic computation, but it's probably like 50%, 60% of the size is being used for that. But if you move on to GPU, it has a lot more computation unit,
09:42
and a lot fewer control units. That's because the GPU is traditionally good at processing images, and a lot of the time you can process images in parallel — that's why the GPU is actually good at processing vectors, basically.
10:02
So now, the NPU is good at — or designed to — process tensors, tensors being multidimensional matrices or arrays, if you will. That's why the NPU is faster: it's just good at processing these multidimensional arrays,
10:22
or, as we call them, tensors, which are the building blocks or cornerstone of neural networks — and neural networks are basically what deep learning is, which is where AI has been advancing
10:42
So that's why we can directly process tensors and in that way, a lot faster than GPU than CPU. So that's the secret of it being faster than GPU and CPU, hence the metrics before. So that's a little bit why.
11:01
Since a lot of you guys here don't, my slide is stuck, I don't know why. Okay, now it works. This is probably why I left Microsoft, because sometimes Windows just gives you all these problems
11:24
and a restart would work, but I don't wanna restart my computer here. I kinda wanna skip these slides because this is a bit more like details, since a lot of you guys don't know a lot about AI here. But this slide basically says that how we actually run the model,
11:42
we don't actually load the model and then compile or interpret it on the fly and run it. We actually compile it to machine code and directly run that. Hence, we can run it faster, but that also means that we have this preparation step
12:00
where you actually have to convert your model, like your model from TensorFlow or Caffe, into our format or our machine code. So there's this little step needed. But don't worry, we'll talk about that later with our demos and in our tool. We will show you how you can easily do that with just a drag and drop.
12:24
Now, talking about neural networks, you have to know that it's basically a cluster of computations. And in that cluster of computations, you have basically different layers. It's just like our neurons, you have layers.
12:41
And in these layers, you have different operators to take care of different kinds of computations — convolutions, concat, and whatnot. But we have a comprehensive list here that tells you these are the neural network operators that we support at the moment. Note: it's "at the moment",
13:00
because we're gonna add more support gradually. Even though this is already the biggest list on the market today, if you look at CoreML supported operators, CoreML being the Apple's on-device AI framework, they support 60 some operators and the Android NN from Google
13:22
supports, I think, 30-some operators. So we have a pretty comprehensive list here. Now, you might ask: what if I have something that's not on this list? What if I trained a model that's a little bit more complex than you thought, and it has operators not on the list?
13:40
That's okay, because we will have tools to support that. Again, we can demo that later. But just to wrap it up, it's faster, it saves energy, and we support a lot of different frameworks, and a lot of neural network operators are already supported, but we're adding more support.
14:03
Last but not least, the integration part, the part where you are actually gonna get down to the business, you're gonna get your hands dirty to integrate your own model, will have tooling support, which makes it much, much easier. And we will actually show that later in the demo.
14:23
I'll do some coding if the demo gods are with me. But anyhow, we just wanna share a little bit of how the Prisma guys did it. For those who don't know about Prisma, they have a wonderful app. I think this is the first AI-capable app
14:42
I can remember in the market. I think it was a couple of years ago. They first published on the iPhone, and they can do these style transfers. Basically you take a picture, then it transfers it to an entirely different style, like impressionist or what have you. Basically it makes the photo look really stunning
15:01
in a special way. But anyhow, they use AI, and now their AI processing is on the device, our mobile phones. And our friend from Prisma will just tell you a little bit about how they did it from a developer perspective. What kind of do's and don'ts,
15:22
and what kind of stuff you have to be careful about. Then after that, I'm actually gonna show a real demo on my device, and actually show how you could do such an app with our framework support and with our tooling support. So without further ado.
15:47
Just talking — all good. Thanks, Sean, for the introduction. Let me give a little introduction about myself. My name is Maxim. I am a developer at Prisma, and I am doing research on how to accelerate AI
16:01
on Android phones, and I'm working at Prisma. Prisma is a company doing AI photo improvement, and we have B2C and B2B directions. B2C — that's the application. On the B2B side, we provide SDKs and technologies
16:20
for use on Android devices, such as style transfer, segmentation, and scene recognition, which help you use semantic information from the image to improve it. So did we get some help from Google on the Android platform? Maybe — no, maybe not. So we use the DDK at Prisma,
16:42
and logically it consists of almost a single library that provides a JNI layer. For this, I will provide some code listings just to show what comes from the Java side and what we need to do to perform some operations.
17:01
So this module logically contains an initialization phase and a processing phase. The initialization phase covers maintenance of the AI manager and the AI environment — such as model initialization, model loading, and unloading — to keep memory in a consistent state.
17:22
As you can see, for loading — for example, loading from assets — we just need the asset manager and the model name. For unloading, we just need the model name. And if the model manager is not initialized yet, for example on first start, we just create it.
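A minimal Java sketch of that load/unload lifecycle, assuming a thin wrapper around the native layer. NpuModelManager and its method names are placeholders for whatever JNI-backed wrapper a project exposes — they are not the actual Huawei DDK API.

```java
import android.content.res.AssetManager;

// Sketch of the lifecycle described above: load a model from assets by name,
// and unload it by name to keep native memory in a consistent state.
// NpuModelManager is a hypothetical wrapper, not the real DDK class.
public final class ModelLifecycle {

    interface NpuModelManager {
        void loadModelFromAssets(AssetManager assets, String modelName);
        void unloadModel(String modelName);
    }

    private final NpuModelManager manager;

    ModelLifecycle(NpuModelManager manager) {
        this.manager = manager; // in the real app, created lazily on first use
    }

    void load(AssetManager assets, String modelName) {
        // Loading from assets needs only the AssetManager and the model name.
        manager.loadModelFromAssets(assets, modelName);
    }

    void release(String modelName) {
        // Unloading needs only the model name.
        manager.unloadModel(modelName);
    }
}
```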
17:44
The processing phase itself consists of three steps: pre-processing the data, doing the forward inference itself, and post-processing. We just need a float array buffer that contains the image information,
18:03
and such things as width and height, and that's all. We are doing accelerated style transfer, so we just need to know the input width and height; this is just an image-to-image transformation.
18:22
For pre-processing and post-processing, there is one thing to think about. On Android, the native format is interleaved — in an Android bitmap, for example. So, for example, RGBA_8888 contains pixels in a layout
18:40
like RGBA, RGBA, and so on. But the AI works on a planar memory layout, so you get first all the values for the R channel, then for the G channel, and so on. So we need to transform this,
19:00
and maybe apply some mean normalization as well. We use Intel MKL for parallel processing here, but you can use any arbitrary parallel processing library — OpenMP, for example, is included on Android — it doesn't matter.
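To make the layout difference concrete, here is a minimal, single-threaded Java sketch of that pre-processing step: interleaved ARGB pixels (the form Bitmap.getPixels returns) are rearranged into a planar float buffer with per-channel mean subtraction. The parallelised MKL/OpenMP version is not shown — the memory layout is the point.

```java
// Interleaved-to-planar conversion with mean normalization, plus the inverse step
// used for post-processing. Single-threaded sketch; real code would parallelise it.
public final class PlanarConverter {

    // argbPixels holds one 0xAARRGGBB int per pixel (as from Bitmap.getPixels).
    // The result is [all R values, all G values, all B values], mean-subtracted.
    static float[] interleavedToPlanar(int[] argbPixels, int width, int height,
                                       float meanR, float meanG, float meanB) {
        int plane = width * height;
        float[] planar = new float[3 * plane];
        for (int i = 0; i < plane; i++) {
            int p = argbPixels[i];
            planar[i]             = ((p >> 16) & 0xFF) - meanR; // R plane
            planar[plane + i]     = ((p >> 8)  & 0xFF) - meanG; // G plane
            planar[2 * plane + i] = ( p        & 0xFF) - meanB; // B plane
        }
        return planar;
    }

    // Post-processing: gather one value from each plane (add the channel mean back
    // here if one was subtracted) and re-interleave into an ARGB int for a Bitmap.
    static int planarToArgb(float[] planar, int plane, int i) {
        int r = clamp(planar[i]);
        int g = clamp(planar[plane + i]);
        int b = clamp(planar[2 * plane + i]);
        return 0xFF000000 | (r << 16) | (g << 8) | b;
    }

    private static int clamp(float v) {
        return Math.max(0, Math.min(255, Math.round(v)));
    }
}
```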
19:21
Post-processing is just the same thing but vice versa: you convert from the planar channels at the output of the AI back into the interleaved fashion to display it, and convert to a bitmap to do whatever you want. The inference itself just consists of a buffer initialization step and tensor initialization: we allocate tensors
19:42
and copy from the input float buffer into this tensor via memcpy and just run it, and that's all. So we use the style transfer models for our acceleration, and I have some tips for tuning. For example, convolutions run really, really fast,
20:02
even compared with the iPhone. But to reduce the chance of being bound by memory bandwidth and to not create a memory bottleneck, there is a requirement — or a tip — to reduce copying operations. And about upsampling: the upsampling layer
20:23
is absent in Caffe, but you can use a fork of Caffe that contains an Interp layer that does the same, or emulate upsampling via deconvolution — whatever you want. So we tested our style transfer model
20:41
on the Huawei Mate 10 Pro. The style transfer model that we run on the CPU takes about a second; we just tried to run this model on the NPU and already got a 2x speed-up. After some model tuning,
21:00
we increased the speed-up to three times, so this model runs in about 300 milliseconds compared to one second for the CPU version. A quick summary of the results: our model runs in our Prisma application about three times faster on Huawei devices
21:23
that support the DDK, and it was pretty easy to integrate; it's pretty easy to use the API provided by Huawei and the DDK. And you can contact Huawei anytime for support
21:40
and we will answer. Thank you. Thanks. So as we can see, it will take some time for you to integrate your AI models into the app, but that was actually last year, because these guys started early and they didn't have much of tooling support,
22:02
but now it's a lot better situation. We have tooling support, we do a lot of stuff automatically for you, so you can actually just focus on your business logic, writing your apps logic rather than focusing on, oh, how do I do this? How do I, you know, how do I actually integrate a model?
22:24
Why have a model? So right now I'm actually gonna talk to you about some of these tooling support we have. Then we'll see a demo.
22:40
But before that, I just wanna show you this. If you look at the right side — or actually the left side — this was 8:30 at night. I took a picture somewhere in South China, in Shenzhen, next to Hong Kong. As you can see, a lot of things are overexposed.
23:04
That's a typical situation, like you go out at night, you wanna take some good pictures, you take out your phone, doesn't matter if it's iPhone or iPhone X, doesn't matter, it's Android phone, the highest you can get from Samsung, the S9, I believe, or S9 Plus.
23:20
You see this kind of situation, that you're like, what can I do? You can do nothing. But at the same time, exactly the same time, I took that picture with my Huawei P20 Pro. As you can see, it's a lot better. A lot of things are not overexposed. This is not just a single shot with the camera, it actually uses the camera to take multiple shots
23:44
and using AI to determine how to stitch them together under different exposures, so that the right things are exposed with the right amount of exposure, and then you have this kind of photo where things just how they supposed to look like,
24:01
maybe a little bit more enhanced than your naked eye, but it's certainly better than the ones that actually has a lot of overexposure. This is kind of one of the examples how we put AI together onto our devices to make basically the user experience better. You could think a lot of different scenarios, different cases, you can use AI
24:20
to improve your own apps experience, improve your user experience. This is just one of the example. Now, you have probably heard something from us yesterday how we have all these amazing APIs that basically with AI trained model behind it, or today how you can actually integrate
24:42
your own AI models into our apps on our platform, and you probably think, oh, this is cool. I mean, I almost used S word, but anyhow, and it is recorded, so oh, this is cool, but how do I use it, right? Because you always hear about these amazing stuff, but when you get down to the business,
25:01
when you want to get your hands dirty, you feel like helpless because it's great promise, but you cannot implement it. So, you're probably, you know, you're in yourself probably like this now, like, oh, this stuff is great, but where are the tools, how do I use it, where's the documentation, the samples,
25:21
and oh, you talk about AI models — how do I convert my own model from TensorFlow or Caffe or any other framework onto your platform? And oh, you talk about operators, but which operators are supported, which are not, and what are the things that I should pay attention to? Or even better, can you tell me if my model just works or not?
25:43
Or if it doesn't work, how can I improve it, how can I fix it? So these are the kinds of things we do with our tooling support. Basically, we call it an IDE, but what it actually is is an Android Studio plugin, because we've done a study and figured out that everybody's using Android Studio to develop Android apps now,
26:01
and we shouldn't build yet another IDE or whatever; we should just stand on the shoulders of the giant. So we built this plugin on top of Android Studio. We're gonna continue putting a lot more developer tooling support into this tool, so that not only AI but also many other aspects
26:23
of the Huawei Mobile platform can be used with our tool, and hopefully that will make your developer life a lot easier. So like I said, it's an Android Studio plugin, and it's supported both on Mac and Windows,
26:41
because again, we did a study and figured that even though you guys are Android developers, you don't feel ashamed to use Macs, right? I see Macs here and there. But I mean, we just wanna use the best tooling. So we support it on macOS and on Windows, and most of the UI stuff
27:01
is drag and drop. So you just do this drag and drop and then fix a few things here and there, mostly because there is this one last mile where the business logic has to work with the code, and we have no way of guessing or inferring that — it doesn't matter what kind of AI we have. So you still have to do that one last mile,
27:22
but a lot of things are done automatically for you behind the scenes. And we also have a lab. I believe we deployed quite some machines here in your lab, so that you can use our IDE to remotely connect to our device from the cloud, as if you're just working with it on your desktop,
27:47
and you can also remote-ADB into it, basically pushing your app onto that device and debugging it remotely. And we also have this HiKey 970 board, which is basically an embedded device.
28:01
So it runs Android and other flavors of Linux as well, so you can use that to develop your solutions for embedded scenarios. And we also support that with our IDE, basically once you plug it in and we'll just install the drivers for you, automatically connect that and project the screen onto your desktop and everything.
28:22
So when we talk about models, like I said, the AI Foundation basically needs your model to work, because you know your business logic better, you know your scenarios better, you probably have the kind of data a generic AI model trainer or AI team doesn't have, so you wanna train your own model.
28:40
But once you have that AI model for yourself, you're gonna wanna import it into your APK and onto our platform, and we have that as well. Basically, it's still a drag-and-drop experience. When your model is supported, everything goes through smoothly. We basically put all the SO libraries and C++ code and JNI encapsulations and Java API references
29:03
for you automatically, so that at the end of the day, you just need one line of code to reference your model to do the inference or prediction, if you will. And if your model had some of the compatibility issues, like I said earlier, if you use some of operators we don't support at the moment,
29:21
it will actually tell you: here is your model, it has these and these layers, and these are supported, these are not; and for the ones that are not supported, why not — and there might be a suggestion for how you want to replace that layer, do the retraining, and come back to integrate it.
29:43
And we also have this, it's very primitive, but still we're trying to build that within the next couple of months, basically AI model store where we share more AI models with the community and basically you can use that as well if you don't have your own model,
30:00
but still you wanna try the shared model. Cause there are basically three things. We have our own APIs, which is the AI engine, these are the models that we train ourselves, we just encapsulate that into the API, then you can basically directly use that if that fits your needs. If you already have some AI model or if you have some very specific
30:21
particular business scenarios, you can address the best with your own AI models, then you train that model and you can use our foundation layer to import that into the app and have that working. And the third way would be that, try to figure out if there's shared model
30:40
out there in the community and you can also use that. So this is one of the, this is the third way basically using the AI model store. So the entire process of how you work with these, of course, starts from downloading something. I mean, it always starts from there. You download our plugin and install it
31:01
and then here's the part where you decide whether you have your own model or whether you wanna use our own APIs. So if you wanna use our APIs, you go the top route: basically you use our tool for the engines, the HiAI Engine tools, and then you just drag and drop some of the APIs we have.
31:21
Or otherwise, if you have own model, you use the tool at the bottom to integrate. And then once you have coded up your app, you have that APK, you might wanna test it. The way you test, of course, is using a real device. There's no better than a real device. And you use one of the Huawei devices,
31:43
but you can also use the remote devices, because we provide them for free — as long as you register yourself there, you can get it for free, and you can do the testing over there through the remote devices. So that's another way. If you're working with the HiKey 970, which also has the AI capabilities,
32:01
you can also use our tool to connect to that and try to test and debug it. And once you've gone through that, you still need some testing, of course. We also have a comprehensive testing service in our lab. I believe it's starting to be offered in Europe, where you can do performance testing
32:21
and functional testing, and if you do games, there are very specific gaming testing and a lot of different testings. So you can do that, and when you finish that, you can publish your app into our app gallery. So basically, that's end to end walkthrough of how you would use our toolings.
32:42
And like I said, if you use the engine part, which is basically using our AI APIs, you can just directly do drag and drop, and hopefully you can be done within under an hour, from half an hour to an hour, and then integrate that into your app. So again, we do a lot of things automatically,
33:02
so you just have to find the right API here and drag and drop to your code lines or between your code lines, and we do actually three things for you automatically. So first off, if you look at the middle piece, the code would be actually generated in the place you drop that API icon in,
33:22
and the second thing is that, if you look at the top, whenever you use some APIs, you have to do the imports, right? In this case, we actually automatically put the import code there at the top, so you don't have to go up there yourself. And the third thing is that our solution is basically an AAR-package-based solution, which means that you would actually have to
33:42
write this one line of code in the Gradle script to import that AAR and sync it — but we do that automatically for you as well, so you only have to drag and drop, and then all three of these things will be there. Of course, at this point, Android Studio will ask you if you want to synchronize the Gradle packages;
34:00
you just click yes, and the package will be automatically downloaded for you. Now, you might be wondering where's documentation, right? Because always, when you look at the API, the API itself is not enough, and we have documentation there, it's just within the tool, within the IDE.
34:20
You click that API and you will see: okay, here's the documentation, and here is actually what the input is. In this case, it's an API I think we demoed yesterday, which — trained with a lot of pictures, by the way — basically tells you how beautiful a picture is.
34:40
Not the content of the image, but the overall setting and lines and everything. So in this case, if you look at the right side, the documentation is actually telling you the input is an image, the output is a float score from zero to 100, and then there's some sample code down there;
35:00
you can also drag and drop these guys to the corresponding places. So this is how you would integrate an AI API, or an engine API, if you will. And now, the foundation part, basically the part I talked about where you have to integrate your models into the app, still it's a drag and drop experience,
35:21
and we do the model analysis for you first, if the layers or the operators in your model basically are compatible with what we support, with everything on that list I showed you earlier, then you will be able to convert the model and integrate it into the right places.
35:43
But in case you have some operators that are not on the compatibility list, we'll actually show you a report that says: okay, here's the list of the layers, or operators, you have, and these are the ones that are not supported.
36:02
That's how you use the foundation tool to integrate your model. This is the report, basically. Again, I'll show you later in the real demo. And this is the remote device part where you can directly access
36:22
from within our plugin in Android Studio. You just have to click the connect part, and you will be shown a login page. You log in there, and then you can basically remotely access these devices. You can interact with them just like with an emulator. And you can also click the debug button there,
36:41
up on the right side on Android Studio. Now your ADB device list will have one more device, which is this guy. We connect this remote ADB so that you can actually push your app and debug it remotely. And this is the AI model store I told you about.
37:01
But I'm not gonna go more details on that. So now I'm gonna show you a little demo and show you how I actually use our stuff to code it. So the demo here is simple, but not very simple. So here's what I wanna do. I wanna be able to just type in your name.
37:22
And then I wanna find this guy. I use Bill Gates as an example. I wanna find Bill Gates — how old he is — and I wanna get a picture of him, because I wanna do this: I wanna take that picture and use one of the models shared by the community. It's called AgeNet. This AI model can determine from a picture
37:44
how old the person in that picture is. Basically, it'll give you an age range. Then I can compare that inference from the AI model with the real age of Bill Gates and see if my prediction is close.
38:01
Then I'm gonna, because normally on the search engine this would be the kind of picture that's shown at the thumbnail size. So they're actually relatively lower resolution. And then I can use our AI engine API to blow it up, to make it clearer, which is the part at the bottom if you look at.
38:22
And that would need one of our APIs to do, so I can use that. So basically I use our Foundation with my own imported model, and I also use our AI Engine API. So that's the purpose of it. Of course, I use a search engine
38:41
to basically search the name plus age, then I get the result on the right side and I use the app to pull the image and text in and do the rest. So I'm just gonna show you how this app works and show how you can use our API to do that.
39:04
Oh, by the way, we have this wonderful app that's a companion for our devices. It's called HiSuite. I believe there is an English version, an international version. So all you have to do is basically install this guy and plug in your Huawei device. It actually projects the screen of the device
39:21
onto your desktop. So I don't need all these. If you remember yesterday, there was this kind of projection light and a camera on it. So this is basically what's happening on my screen. So I can pull up this app I set. And so, oh, I already used it. So Bill Gates is here.
39:42
I don't know if you can see the difference or not, but the picture at the bottom is enhanced. So it's clearer than the one on the top. I can see that on my phone, but I don't know at this resolution on the projector if you can see it or not. But those of you who don't believe me, you can come here to see it later. So let me try something else or somebody else.
40:05
I might wanna try Elon Musk. Let's hope the wifi works.
40:23
Yeah, I haven't really adopted Google yet, so it's using another search engine, so it takes some time to get the image from there. But anyhow, here you can see that the actual age of Elon Musk is 46. And from this picture, this particular picture,
40:40
we use our AI model, AgeNet, to infer that the guy in this picture looks like he's between 25 and 32, with 90.46% probability. And that's mostly because of two things.
41:02
Those pictures you get from search engine, they're not necessarily the picture from this year. They could be from before. So there will be some error margins there. Another thing is that, in case you don't know, a guy like Elon Musk, he has his own stylist. So he might look younger with all the work
41:22
that has been done by the stylist on him. So maybe let me just go back to Bill Gates as they go. Again, network latencies.
41:46
Yeah, it's a bit better with Bill Gates. His actual age is 62, and this one infers that he's between 48 and 55, with relatively high probability, 80 some percent. And yeah, that's basically what the app does.
42:02
So right now I'm gonna show you how we can code such an app. Of course, I'm not gonna start from scratch, in that I have some basics like the UI and image placeholders and the buttons and whatnot.
42:21
I just wanna show you the part where we integrate the model — my own hnet AI model — and how we use the HiAI Engine API to basically enhance the image. So now I have this model already downloaded on my disk here, this guy.
42:49
It's a Caffe model. Now, all I have to do — I'll just try to do it again. So once you install our plugin, you will see it here.
43:03
That vehicle — that's our tooling branding; we're still working through the branding. And now you see the Engine part: like I said, the HiAI Engine part is where you directly use the APIs from us — there are, of course, trained AI models behind them. And then there's the HiAI Foundation, which is the part I talked about earlier,
43:22
where you use your own model. Now, like I said, it's a drag-and-drop experience. If you just drag your model here, it will know it's a Caffe model and hence configure the rest of the options — especially this guy, because if you know about the Caffe framework,
43:42
it produces two things. One is the model, the other one is a description file for the model, basically just put a text file. But anyhow, we infer that because it's in the same folder, like I said, we try to do as much stuff as we can automatically for you so you don't have to bother with all these details.
44:00
And it will put basically the assets in this folder and will generate such a Java API. It's called hnetmodel.java. It's because my model file is named hnet. It will just basically call the Java API your file name, then model.java.
44:23
Of course, we remove all the underscores and whatnot. Then you can later directly use hnetmodel.predict to load your model and use it to do the inference. So now I will click Run to start.
44:43
We have a Docker container behind it that actually does the conversion, but like I said, we first analyze the model to see if it's compatible or not. I can show later a case why it's not compatible, what you will see. Then we do the conversion. Then we actually basically take all these results
45:01
and put that into your project. I just want to show the details because it's kind of daunting if you would do it manually because there are a lot of stuff behind the scene we're putting here. These SO packages, we support both the V8A and V7A architecture and also this binary,
45:22
the machine code from the model here. It's all your project in these places. You can inspect them later if you want. It's a C++-based solution, so there will be some C++ code, but we encapsulate that automatically for you with JNI
45:42
so you'll get this Java-level API you can directly use. But in case you want to mess around or you actually want to do some more stuff on the C++ layer because it's more efficient, you can do it here. And actually quite some code our friend from Prisma
46:02
showed you earlier is actually in the C++ layer. You can also do that freely. So we put all this stuff in the right places. So in the end, all you have to do in your activity —
46:23
here I have a main activity, all you have to do is one thing, basically you see the right place here, basically. So here I'm actually loading the, if you remember, I'm actually loading the image above.
46:43
Oh, I'll just use the slide here. So I'm actually loading this image into my memory and convert that into pixels and give my API, then give it to the model to do the inference
47:00
and get back the age, range, and probability. So I'm actually getting these pixels from that image, and before I also have some code to go to the search engine to do the searching part to find the right text, basically the age, and also get the image
47:22
from the HTML, basically put it into memory, and convert it into pixels. And then I can use my model here — or rather, use the Java API here — to do the prediction.
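As a rough sketch of that "put the image into memory and convert it into pixels" step, using standard Android APIs rather than the HiAI tooling: decode the downloaded bytes into a Bitmap, scale it to the model's input size, and read the pixels into an int array that can be handed to the generated model class. The target size is whatever the model expects (227x227 is common for AgeNet-style models, but that is an assumption here).

```java
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;

// Decode downloaded image bytes and extract a pixel array for the model.
public final class ImageToPixels {

    static int[] toPixels(byte[] imageBytes, int targetW, int targetH) {
        Bitmap decoded = BitmapFactory.decodeByteArray(imageBytes, 0, imageBytes.length);
        Bitmap scaled = Bitmap.createScaledBitmap(decoded, targetW, targetH, true);
        int[] pixels = new int[targetW * targetH];
        // Copies one ARGB int per pixel, row by row.
        scaled.getPixels(pixels, 0, targetW, 0, 0, targetW, targetH);
        return pixels;
    }
}
```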
47:43
So I want to predict on these pixels, and I get a result here from the inference of the model. The reason why there are red squigglies is because I actually did more stuff in the C++ layer,
48:04
but I can do that later. Just before that: now you need to actually load this model, so you have to do this — basically, you use hnetmodel.load,
48:21
and basically load the model into the memory when you run it, and you give it the asset manager, because it needs the asset manager to actually find the model and load it. And then at the right place you use it to do the inference, but then just to be a responsible citizen of the community, you want to unload it,
48:41
you want to basically not burn the memory of your user, so you would actually write this line of code in onDestroy of your activity here: you basically want to say hnetmodel.unload, to unload it from memory.
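Pulled together, the flow just described looks roughly like the sketch below. HnetModel is the class the Foundation tooling generates from the hnet Caffe model; its exact signatures are an assumption written from the talk, and the small placeholder class at the bottom exists only so the sketch stands alone.

```java
import android.app.Activity;
import android.content.res.AssetManager;
import android.os.Bundle;

// Load the converted model once, predict on pixels, and unload when the activity dies.
public class AgeDemoActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        // The model needs the AssetManager to find its binary in the APK assets.
        HnetModel.load(getAssets());
    }

    float[] estimateAge(int[] pixels) {
        // One line of Java to run inference on the downloaded picture's pixels.
        return HnetModel.predict(pixels);
    }

    @Override
    protected void onDestroy() {
        // Be a responsible citizen: free the native memory when done.
        HnetModel.unload();
        super.onDestroy();
    }

    // Placeholder so this sketch compiles on its own; in a real project this class is
    // generated by the HiAI Foundation tooling and backed by JNI.
    static final class HnetModel {
        static void load(AssetManager assets) { /* generated */ }
        static float[] predict(int[] pixels) { return new float[0]; /* generated */ }
        static void unload() { /* generated */ }
    }
}
```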
49:05
So that's using the model part: you have to do model.predict, then the input. Here I actually have to fix some small things. I'm not going to go through the details of these fixes here, but I actually did have
49:21
some C++ code to make this work better, because then I changed the method signature here, and this guy returns a string array, but I want a float array, it's because, then I change my C++ code,
49:43
now it actually returns a float array, but those of you who know about the JNI encapsulations, you also have to do it here, just so that the JNI encapsulation
50:01
has the corresponding signature. Now this guy won't complain anymore, and this guy won't complain either.
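For readers who have not touched JNI, the point about "the corresponding signature" is that the Java declaration and the native implementation must be changed together — the compiler cannot check across the JNI boundary. The names below (package, class, library) are illustrative only, not the demo's actual code.

```java
// Java side of an illustrative JNI bridge. If the return type changes from String[]
// to float[], the C++ implementation must be changed accordingly as well.
public class AgeNetBridge {

    static {
        System.loadLibrary("agenet_bridge"); // hypothetical native library name
    }

    // Now declared to return a float array instead of a String array.
    public static native float[] predict(int[] pixels);

    // The matching C++ side (in the .cpp file) follows the
    // Java_<package>_<Class>_<method> export-name convention:
    //
    //   extern "C" JNIEXPORT jfloatArray JNICALL
    //   Java_com_example_agedemo_AgeNetBridge_predict(JNIEnv* env, jclass, jintArray pixels);
    //
    // If the two sides disagree, the mistake only shows up at runtime, which is why
    // both declarations have to be updated together.
}
```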
50:23
But then we still have the bottom part, where I actually want to enhance the image, and that would be something else we can use, because now we have just integrated our model. For that, like I said, I want to use our APIs. We have a lot — I think right now it should be 20-some APIs in here. Some of them are, you know, something about faces; you can do,
50:52
for example, these face comparisons, you can compare a similarity of two faces, and face detection, you can find faces from images, and also other stuff like the one I told you before,
51:04
here's documentation, so here's how you use it, you click an API, you see that okay, here is the documentation, and here is a sort of image telling you what is the input, here it's input is image, and output is some score
51:22
telling you how beautiful this image is. Then there are these code samples — again, these guys you can drag and drop. So basically, then, I want to find this one among the image enhancement ones: it's called image super-resolution —
51:43
this one, I'm just gonna use it directly here, so again, it's drag and drop, I need to find the right place, basically here, I have the, actually I should have removed these code, because I'm gonna do it now,
52:05
So I have the image in memory; now I have to do the enhancement part, so I'll just directly drag and drop this guy here. This is the part where Gradle asks me if I want to download the package — I will say yes.
52:21
now I'll just hide this guy here, now this is a couple of lines of code you have dragged and dropped here, of course, like I said, I have to fix a few things, because these are sample code, now, this is the initialization of our API,
52:41
so most of our APIs always have this initialization part, and referencing part, and the destroy part, you might want to put them into the right places, for example, this one here, initialization, I would actually put it on top,
53:01
in onCreate of my activity here. Like I said, I should have removed them — so basically, it's this part, I'll just put it here.
53:21
then there's also this removing from the memory part, the destroy, I'll just put it on the onDestroy method, like I said here, of course, I have to fix a few things,
53:45
this context looks right, and then, here's a body of API, basically, this API just initialize itself, then it needs a bitmap as input,
54:01
some parameters, and you do the super resolution from that bitmap that's being loaded into a frame, then it returns, of course, another bitmap, which is enhanced one, so this needed some fix, and because this is another thread, it's not a main thread, so I have to do the proper context,
54:22
MainActivity.this rather than this, because it's a different thread. And I have already loaded a bitmap into memory, so I wouldn't need this one — I just need that one, because I have already declared a bitmap
54:42
and loaded it with the original image. And then this is the returned bitmap, the enhanced one — clearly, I wanna make the naming a bit clearer, so I do that. So now I return them and show them on the UI.
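The shape of what the dragged-in sample does, as a hedged sketch: initialise the engine once in onCreate, run it off the main thread (which is why the explicit Activity context is needed), and release it in onDestroy. SuperResolutionEngine is a placeholder interface standing in for the actual HiAI Engine classes the sample references.

```java
import android.app.Activity;
import android.graphics.Bitmap;
import android.os.Bundle;

// Init / run / release lifecycle for an Engine API, per the pattern described above.
public class EnhanceActivity extends Activity {

    interface SuperResolutionEngine {       // hypothetical wrapper, not the real API
        void init(Activity context);
        Bitmap run(Bitmap input);           // returns the upscaled bitmap
        void release();
    }

    private SuperResolutionEngine engine;   // obtained from the dragged-in sample code
    private Bitmap original;                // already decoded from the search result

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        engine.init(this);                  // initialization belongs in onCreate
    }

    void enhanceInBackground() {
        new Thread(new Runnable() {
            @Override
            public void run() {
                // Inside this Runnable, "this" is the Runnable itself, so the Activity
                // context has to be named explicitly: EnhanceActivity.this.
                final Bitmap enhanced = engine.run(original);
                EnhanceActivity.this.runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        showResult(enhanced);
                    }
                });
            }
        }).start();
    }

    void showResult(Bitmap enhanced) { /* update the ImageView with the enhanced bitmap */ }

    @Override
    protected void onDestroy() {
        engine.release();                   // free the engine's native resources
        super.onDestroy();
    }
}
```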
55:07
And let's see if it compiles.
55:47
there's some problem here,
56:01
never seen this one, what's this? Let me connect it a bit,
56:39
normally, this network dies on me,
56:41
but now, it's something else, program type, anybody knows why,
57:05
is the import build conflict, could be the Gradle script,
57:22
these guys contradicting with one another,
57:41
I'll give it one more go — this is the typical demo effect, when something works just before but it's not guaranteed to work on stage. Oh, and it completed successfully, so I guess it was that thing.
58:02
so now, I can push that app again on my device, just to show you that it works,
58:25
there's always this awkward silence while it's building, and hopefully, it will end soon,
58:40
Now I should show you my screen. Now I see the app relaunches — there's some delay here — and I'll try to do the Bill Gates thing again. And it works. All right,
59:00
so this is how you would code the demo. Just to recap a little bit: we have the tools — the HiAI Foundation tools — to support you in importing your own models, and a lot of stuff happens behind the scenes, but you don't have to remember everything. All you have to remember is that, in the end, you can call that model with one line of Java code,
59:21
of course, if you wanna mess around with the C++ layer to put more efficient stuff in there, you can still do that, and if you wanna use our pre-trained models, you can use them through our HiAI Engine APIs, for which we also have a tool here, you can still do the drag and drop, and all the documentation is there,
59:43
and it just takes a couple of lines of code that actually help you leverage one of these nice models we have trained behind the scenes. So the one last thing I wanna show you is the remote device, if the network is good enough here.
01:00:00
Oh, I have to log in, let's see. I'm tied to the Chinese version, but the one you have would be the European version, so
01:00:25
don't worry if you don't read Chinese, you won't have to take a two-year Chinese lesson to be able to use our tool. And now you see the device list, you can search here. Of course, I want to just
01:00:42
use one of these P20 devices, okay, yes, I saw the search didn't work, but it worked, so we have three, if I kill this guy, we have more, yes, so I'll just use one of these
01:01:00
P20s, I'll just click connect. Like I said, I didn't really have enough time to prepare the European version, so my version is tied to the Chinese version, so right now it's actually trying to connect to the lab in China. As you can see, there are some delays, plus
01:01:30
the network here isn't really great. But anyways, like I said, it's a Chinese lab, so it's in Chinese, but the one you'll be using will be tied to the European lab, and you won't have to struggle with
01:01:44
all these weird characters that you don't read. Like I said, it's from far away, so there is some delay here, something is blocking
01:02:00
this guy, okay, now it's good, then, of course, we can click all these buttons, so basically to interact with it, and I believe, if I do the debugging, it's because
01:02:21
this guy is lagging a bit, otherwise I would have seen two devices. But anyhow, this would be how you use our devices remotely, through our tooling, in case your dev team has more people than the number of devices you have, and all of a sudden, you need to use the
01:02:41
devices at the same time. Another thing I just want to show you briefly, which will be the last thing, is the support we have here for HiKey devices. I don't have one with me today; otherwise, if I had it plugged in, it would say connect, install the driver for me, and project a virtual screen on my desktop,
01:03:05
or you can also connect that device through HDMI to another real screen, it's your choice, but we also support that in our IDE, and there will be more testing services coming to Europe, and you can directly access that from our menu here, and that should be everything
01:03:25
I want to talk about today. Thank you. And with that, I think we still have some time, right? If there's any questions, or, yeah, please.
01:03:48
The first one or the second one? Yes, this one. Yeah.
01:04:14
I didn't quite get the second part clear. Can you use the mic? So you said about all these operators on the list.
01:04:21
Yeah, how do you compare them? What's the key metric, or what's it all about? What does it compare? Is it possible to compare these operators, I mean, what they're capable of? You said Apple had, like, 60 operators, and Google had 30, and so what does it
01:04:50
say in terms of, you know, the power of the frameworks? Well, it's not really about the power, it's just about your model, because, you
01:05:00
know, different scenarios require different kinds of neural network architectures and have different operators to be used. For example, if you would train any sort of neural network on images, chances are you're going to have to use convolutions. I'm not going to walk you through the details of convolutions, but it basically
01:05:23
does matrix multiplications and adds the results together. And in other cases, you would use, for example, ReLUs, these are one of the musts
01:05:41
if you need an activation layer, you either use a ReLU or a tanh; or for classification, binary yes-or-no or multiple classes, if you want to tell whether an image is a chair or a table or a speakerphone or a glass, you would use a softmax operator.
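For reference, these are the operators just mentioned, written out in plain Java purely as an illustration of what they compute; on the device they would run as optimized NPU or CPU kernels, not like this.

```java
// Plain-Java illustrations of the operators mentioned above (illustration only).
final class Ops {

    // 2D convolution on a single channel: multiply a small kernel over each
    // window of the image and add the results together.
    static float[][] conv2d(float[][] img, float[][] k) {
        int oh = img.length - k.length + 1, ow = img[0].length - k[0].length + 1;
        float[][] out = new float[oh][ow];
        for (int i = 0; i < oh; i++)
            for (int j = 0; j < ow; j++)
                for (int ki = 0; ki < k.length; ki++)
                    for (int kj = 0; kj < k[0].length; kj++)
                        out[i][j] += img[i + ki][j + kj] * k[ki][kj];
        return out;
    }

    // ReLU activation: clamp negative values to zero.
    static float[] relu(float[] x) {
        float[] y = new float[x.length];
        for (int i = 0; i < x.length; i++) y[i] = Math.max(0f, x[i]);
        return y;
    }

    // tanh activation: squash values into (-1, 1).
    static float[] tanh(float[] x) {
        float[] y = new float[x.length];
        for (int i = 0; i < x.length; i++) y[i] = (float) Math.tanh(x[i]);
        return y;
    }

    // Softmax: turn raw class scores into probabilities (chair vs. table vs. glass, ...).
    static float[] softmax(float[] scores) {
        float max = Float.NEGATIVE_INFINITY;
        for (float s : scores) max = Math.max(max, s);   // subtract the max for numerical stability
        float sum = 0f;
        float[] p = new float[scores.length];
        for (int i = 0; i < scores.length; i++) {
            p[i] = (float) Math.exp(scores[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= sum;
        return p;
    }
}
```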
01:06:04
The point being that for different architectures you need different operators, and for different scenarios you need different operators, so it rather depends on your scenario and your architecture. It does not necessarily say that one operator is better or worse than another one, it's just that in different scenarios you use
01:06:25
different ones. I don't know if that answers your question more clearly. We have another one. So I have maybe three questions. The first one is... Oh, you have three questions? Yeah, but they are interrelated.
01:06:42
I think we still have time, right? Okay. So the first question, do I have to be online to, when I create my own model in TensorFlow, for example, do I have to be online to convert this model to use it in Android Studio? Because you showed us the conversion process.
01:07:03
We use echo, do I have to be online or is it done offline? So the question, in case the rest of you guys didn't hear it clearly, is about the model conversion part: does it have to be online or offline? We have two solutions. One is that the conversion is done on your desktop.
01:07:24
For this, you actually need to install Docker and the Docker image, which I believe everybody knows how to do today, and everybody should have the Docker engine on their desktop. So if you have Docker, it's offline, it doesn't need any internet. Of course, you have to download the Docker image beforehand.
01:07:40
But while you're converting, it's an entirely offline solution. And we also have the online solution, which will come, I think, next month, for those who don't want to install Docker or who are somehow limited in their choices and cannot install Docker, for example if your laptop is managed by your company, these enterprise-managed
01:08:04
laptops, where you don't have a lot of freedom to turn on the virtualization engine to install Docker. So the answer is that there is an offline version and there is an online version, and both are free for you to choose. Okay, thank you.
01:08:21
So the next question is, maybe I will ask all the questions. What if my model isn't compatible with the NPU? And the third question is, we have the operators for specific frameworks.
01:08:40
I can see that most operators are from TensorFlow, so does it mean that I should use TensorFlow for preparing my models, because there are more operators, so there
01:09:03
is a greater chance that my model will be faster when I prepare it in TensorFlow, not in Caffe or another framework. The second question I just want to repeat is, what if the operator is not supported?
01:09:23
So the third question is, okay, there are clearly more TensorFlow operators we support here, a little bit more than Caffe, so does that mean we endorse TensorFlow versus Caffe? To answer your second question, which is a really good question, by the way: if your model is not supported... because I actually forgot to demo one thing, because
01:09:46
I only demoed the positive case, and as we know, in software testing you always have to test the negative cases. Here, if I have a model that's not supported, I just want to show you what it looks like.
01:10:03
This is a very complex model, a lot of layers. We use it for testing. By the way, the one that was successful, with the HNet, was a Caffe model, and here is a TensorFlow model for actually predicting human poses, like whether your pose is like
01:10:27
this or like that, basically, and if you run this guy, we go through the model checking part, and it will actually tell me, oh, I'm sorry, there are some things that
01:10:40
are not supported here, some operators that are not supported here, and I can click this guy to open the report. This guy, like I said, is huge, it has 248 layers. Normally you wouldn't train such a huge model, especially when you're running something on the device; you wouldn't want a huge model, because even though it's energy-efficient, you are still constrained on
01:11:04
the device, whereas in the cloud you can have a lot more computation power. Here it will tell you that these are the operators that are not supported, and the reason is that some of them are a type we don't support, and some of them
01:11:21
are somehow outside the constraints. So once you see these kinds of operators, you almost always have one choice left: you need to go back to the training code to fix your model architecture.
01:11:43
In some cases you can replace one operator with another, in some cases you can work around it, but that more or less means you have to retrain your model, unfortunately, and that might take anywhere from minutes to hours, hopefully not days, depending on
01:12:02
how complex your model is, but with the current solution, when you hit the wall like this, you have to retrain your model, but we have actually a newer version which is under development, hopefully it will come out pretty soon, like within a month or two, where you don't
01:12:24
have to do this, because for those operators we don't support on the NPU, we actually have a CPU version, meaning that for these layers we support it, but the computation is going to be offloaded to the CPU, which means that it's going to be a little bit slower
01:12:45
or less efficient than the NPU, but only for that particular operator or that particular layer, so if you don't have a lot of these layers or operators that aren't compatible with the NPU, you are actually fine; for example, if you have one or two layers out of ten-ish,
01:13:04
20-ish, even 30-ish, basically that's fine, we just do the computation on the CPU. It's going to be a little bit slower, maybe like half a millisecond or 1.2 milliseconds slower, but it wouldn't be like one second slower, so that's the solution there.
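Conceptually, the per-layer fallback described here boils down to placing each layer on the NPU when its operator type is supported and on the CPU when it is not. The sketch below only illustrates that idea under assumed names; it is not how the HiAI runtime is actually implemented.

```java
import java.util.Set;

// Illustration of per-layer NPU/CPU placement; not the HiAI runtime's actual code.
final class LayerPlacement {

    enum Device { NPU, CPU }

    private final Set<String> npuSupportedOps;   // e.g. taken from the compatibility report

    LayerPlacement(Set<String> npuSupportedOps) {
        this.npuSupportedOps = npuSupportedOps;
    }

    // A layer whose operator the NPU supports runs there; anything else is
    // offloaded to the CPU, so only those layers pay the slowdown.
    Device place(String operatorType) {
        return npuSupportedOps.contains(operatorType) ? Device.NPU : Device.CPU;
    }
}
```

So a model with one or two unsupported layers out of a few dozen would still run end to end, with only those layers taking the slower CPU path.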
01:13:24
Now the third question, if I still remember it, is that we have more operator support for TensorFlow and less for Caffe, so does that mean we endorse TensorFlow versus Caffe? The answer is no. It just happens that right now we have a little bit more operator support for TensorFlow
01:13:45
than for Caffe, but we will quickly work around that and just add more operator support, so you won't have to see a list where Caffe has a bit less than TensorFlow, and
01:14:02
you are of course always free to choose your own framework to train your model and to import your model into our framework, and we will actually support more kinds of model formats. For example, CoreML, like I said, will come out in, I think, a month,
01:14:21
so you can basically drag and drop a CoreML model into our tool and we will just convert it for you, and we are eventually going to support ONNX. For those of you who don't know about it, ONNX stands for Open Neural Network Exchange; basically it's an intermediate format that's compatible with everybody, and hopefully every framework will
01:14:43
build their adapters in and out of ONNX so that eventually we can support everything and hopefully all the operators, so that's the plan there. And also, a lot of times, forgive me for maybe one minute's delay, as app developers we probably cannot dictate what our AI team or data scientists
01:15:06
do, sometimes they are just familiar with Caffe more than TensorFlow, and sometimes they just want to use TensorFlow, not Caffe, so normally your AI team or your model training
01:15:21
team wouldn't be dictated to by your request, and that's why we don't want to have such an endorsement, closing the door on one versus another; we want to support everything within our capabilities. But of course sometimes there are just priorities, Caffe comes first
01:15:42
or TensorFlow comes first, we might have to sacrifice a little bit and do that short term pick, but in the long term we want to support everything. Maybe we still have time for a couple of questions, there was no one here, right? There's one question over there.
01:16:03
Hello, I have a little question about your binaries. As far as I understand, you provide pre-compiled binaries and they cannot be compiled together with another C++ application, for example. Do I have the opportunity to compile my own binaries for your SDK, because
01:16:25
for instance I maybe need a binary built with Clang or GCC, is that possible? I don't know the answer to that question, unfortunately, but if you give me your contact information I can relay it to the right guy.
01:16:43
Okay, thank you. Yeah, because I don't know a lot about Unity, and I don't want to speculate here. Sorry for that, but I'll come to you later. So maybe we have time for the last question. Still one?
01:17:01
Go. Hey, so I have several questions, I'm not sure if we have time for all of them. I might have asked this yesterday as well, does your API work on any NPU powered device, or only Huawei NPU powered devices?
01:17:23
The answer is, at the moment it works only on Huawei NPU devices. Because we are still at a very early stage of neural processors on devices especially, there isn't an industry standard yet to guide everybody, so right now we are doing
01:17:44
our own implementation. I don't think any other device makers are doing it yet, but they will; it's just a matter of time. I believe a lot of them will follow within this year or next year, but their solution might be different, for example like Qualcomm's,
01:18:06
which is, if you ask me, less efficient and inferior, but that's their choice. So the short answer is that it will only run on our devices. But we are working on a solution so that you can code your app once and it runs
01:18:24
on our devices and on non-Huawei devices, so you don't have to code two apps; it's just one app that runs on all devices, but on a Huawei device it just happens to be much more efficient, it does the computation much faster, so you will have a much better
01:18:41
user experience. On other devices you might have to face the choice of either letting your users bear the slowness of a lower-performance app, or maybe just not providing that capability on other devices, with some sort of smooth transition out of that capability.
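As a rough illustration of that "one app, graceful fallback" idea, the sketch below probes whether the accelerated path is available and falls back otherwise. The availability check and the two code paths are stand-ins invented for illustration, not the SDK's actual API; a real app would rely on the SDK's own availability check or catch its connection failure.

```java
import android.content.Context;
import android.graphics.Bitmap;
import android.os.Build;

// Sketch of the "one app that runs on all devices" idea; names are illustrative only.
final class EnhancementDispatcher {

    // Crude probe used here only for illustration; prefer the SDK's own availability API.
    static boolean acceleratedPathAvailable() {
        return Build.MANUFACTURER != null && Build.MANUFACTURER.equalsIgnoreCase("HUAWEI");
    }

    static Bitmap enhance(Context context, Bitmap input) {
        if (acceleratedPathAvailable()) {
            return enhanceOnNpu(context, input);   // fast, efficient path on NPU devices
        }
        // On other devices: run a slower CPU implementation, or degrade smoothly
        // by simply returning the original image and hiding the feature in the UI.
        return input;
    }

    private static Bitmap enhanceOnNpu(Context context, Bitmap input) {
        // Placeholder for the accelerated call shown earlier in the demo.
        return input;
    }
}
```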
01:19:00
Okay, cool, thanks.