Can you hear me now? - Using the Web Audio API
Formal Metadata

Title: Can you hear me now? - Using the Web Audio API
Number of Parts: 170
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/13969 (DOI)
Transcript: English(auto-generated)
00:02
doesn't seem like people are straggling too much. I don't see a bunch of people. Howdy. My name is Jory Prum. I'm going to talk to you about web audio today. How many people here are audio developers? Kind of. I'm the only one. How many of you are doing
00:23
JavaScript and other web platform stuff already? Okay. So that's where we hit the mark. Excellent. All right. So I'm going to give you something you probably already know. But I'll just do a real quick little "where did we start, and why are we where we're at today?" So I figure we'll start by consulting the wayback machine,
00:44
Mr. Peabody. So first, how many people have been developing websites for longer than 10 or 15 years? Okay. A couple of you. So in the beginning, there was silence. Web pages could be gray. The links could be blue, purple if you visited them. Images could exist.
01:04
If you wanted them to be links, they had blue or purple squares around them, and you couldn't control any of this. And there was no sound. And then as things got better, there continued to be no sound. And we continued to have no sound. And then all of a sudden, one day, we had
01:20
plugins, like RealPlayer. RealVideo was the first one I remember playing with, and QuickTime. All of these were pretty cool in various ways. You know, with Real, we could play video, we could play audio. It wasn't enormous, but it wasn't great quality, of course, you know, for our lovely little modem download pipes. But it was streaming only,
01:47
and you couldn't interact with any of the data. So nothing could respond to anything. QuickTime was cool, but it was pretty Mac-centric. Didn't work very well on Windows for a variety of reasons. Beatnik was awesome in that you could do all kinds of responsive
02:06
customization, but you couldn't stream. And then there was Shockwave, which was just enormous. The files were huge, and it, of course, was a processor hog. So then a few years later, we finally got one plugin to rule them all, and Flash became ubiquitous everywhere,
02:24
which was awesome, because we'd never had any of this stuff before. But, you know, while we got some of the cool things, we could do lots of interactivity, we could do MP3s, we could do streaming, it was universal, it was everywhere. Everybody seemed to have it,
02:40
but no MIDI. So that was kind of crappy, and stability is certainly not a word that is spoken much around Flash. And, of course, then there was mobile. Anybody remember Flash for mobile? No. HTML5 became our saving grace, though, with the audio element. Awesome. We have something
03:02
actually created for audio, which is really great. It allowed us to create all kinds of custom interfaces, which is cool, so you didn't have to have a reliance on operating system for making your user interface look cool on a web page. But there were a lot of limitations, right?
03:21
You really couldn't control a huge amount. There's only so much that you could actually rely upon when it came to mobile devices. For example, if you use the audio element on iOS, you could only play one sound at a time. So if you want to play more than one sound, well, that's too bad. The last sound will stop playing, and the new sound will start playing.
03:44
Oh, and by the way, the first time you want it to play is when it's going to finally decide to load your sound because it's saving you battery time and bandwidth on your phone, because that's the priority, is your battery in your phone, not making a machine gun sound when you start pushing the trigger on the machine gun. So there's a lot of limitations
04:03
there, and that was kind of a problem. There were some ways around it. You could build sprites with audio, which is interesting but not great. Oh, and if you want to do it on Android, well, anybody before Ice Cream Sandwich, well, while the audio element was recognized, there were no codecs in the browser, and so it couldn't play sound at all.
04:23
So kind of a problem. I started getting into doing some audio stuff on the web. I'm an audio engineer. I've been doing sound design for games for the last 16 years, and so I started developing things on the iPad and wanted to build some things that would
04:41
have some very simple interactive audio and found that I couldn't do it, which brought me to the Web Audio API. But it's not like you couldn't do cool things with the audio element. For example, this is Lux Ahoy. I'm just going to play a little video of it, but this is running in the Chrome browser. Now, this relies on the audio element to play all the
05:11
sound, and it works great on a desktop browser but not so well on a mobile device because, well, you can't rely on what it's actually going to be capable of playing and simultaneous
05:22
sounds and things like that. So that brought us to the Web Audio API, which is actually pretty damn cool. How many people here have looked into the Web Audio API so far? So a few of you. Excellent. Have you done any programming with it yet? Okay. So half of those
05:40
people have actually used it. So Web Audio is very cool. It's a node-based system for being able to program all kinds of different things with sound. You can do simple playback if you want. You can do manipulation of your sound files. You can generate sound, so you can create synthesizers and things like that. Lots of really, really cool stuff. We still have
06:06
one major problem, one major drawback, though, which is also a problem with the audio element, which is codec support in all the different browsers because, as you see here, there is not a single audio format that is supported on all browsers everywhere still today. In fact,
06:24
I was answering somebody only an hour ago who was trying to figure out why they couldn't get their MP3 files to play in Firefox because MP3s don't play in Firefox unless you run a specific version on a specific Windows setup with the particular hardware
06:40
that will then do the MP3 decoding because Firefox has principles rather than support. So you end up having to do all your audio work in two different formats, which is a drag, but, you know, it's not the worst thing in the world. Despite that, there are some really cool things people are making. So I'm going to show you a bunch of different cool
07:03
examples. So looking back into the past year, I'm actually going to run this one live. So this is a really cool example called the Infinite Jukebox. I will turn on my mirroring.
07:26
So the Infinite Jukebox was made during a music hack day, and what this does is it actually will let people supply an MP3 of whatever is their favorite song, and it will then beat map the whole song and then play that song forever because,
07:45
you know, when your favorite song just isn't long enough, we have the Infinite Jukebox. And there's some awesome examples up on the website, but this is kind of my favorite one. So we have Billie Jean. Now all of those little bands there are places it can jump like it just
08:00
did, and it'll just kind of randomly choose. You can jump anywhere in the playback as well. So this is all real time, and it's pretty damn seamless. Hours of fun.
09:11
Now that it got somewhere interesting, the same guy wrote another version before that, which was actually just as much fun. It's the Infinite Gangnam Style.
09:33
Might not be working anymore. Anyway, the best feature of this one is that it plays Gangnam Style forever, but it changes the button to say, make it stop, please.
09:44
And down here it shows you the listening time so you can keep track of how long you've been playing the song. Oh, I bet it works in the older version. They just changed some things this week in the nightly build of Chrome. So maybe that's why it's not working. There we go.
10:14
It'll actually edit the video as well. Never detect when it makes the jump, please.
11:11
You have to say please. Some very cool other things. Let's see. There's a lot of synthesizers. Some very cool stuff. You can play them from the keyboard as
11:24
well. And all of this is actually being generated in real time. So you can actually go through here and play with all the settings, which is awesome because there's actually even a MIDI version.
11:55
Let me hook up the MIDI version. You're not going to see it. There we go. So I've got a
12:43
synthesizer. But here we have the... Oh, I need to be in the other version. It's not seeing MIDI for so... Oh, I probably have to enable it. So MIDI is actually behind a
13:03
flag in Chrome. Let's try that again. There we go. Sound out of here or no? There we go.
13:34
So playing MIDI from an iPad. But if you have actual MIDI controllers, you can do it through there as well. It doesn't have quite the horrible latency the iPad does for that.
13:44
And actually the controls are MIDI controlled as well. So you can see that drive going up and down. And just like all other MIDI things, it can have stuck notes. Some fun stuff.
14:06
One of the coolest things I think with the Web Audio API that you can do is trigger interactive events, event-based audio. So I actually put together a really simple example of some of the
14:20
most common concepts we use in video games. Well, the most obvious would be just play and stop. Right? Another would be elements that you would mouse over and have different state sounds for interface stuff. Another would be firing a weapon or controlling weapons. So
14:46
for example, I have a pistol. Now, one of the things we do a lot of in game audio is randomization. We want to make sure that you never feel like you're hearing the same sound over and over again. There's a great reason for that. I'm going to play you this pistol sound,
15:03
but it's only one pistol sound and it's not randomized in any way, shape, or form. And you tell me if you could tell whether it was only one sound. Single sounds have a funny way of being very obvious to the human ear. But let's say we
15:20
randomized just the pitch of that sound. So I'm just randomizing the playback pitch of one sound by I think 2%, maybe 5% at the most. And so it's just randomly playing that back. I can add additional sounds as well. So here's just three versions of the sound without
15:43
randomization of the pitch. Pretty good. But together you end up with something that sounds really good. Obviously, no pistol fires that quickly. I hope. But we also do other kinds of weapons like shotguns. Right? And the shotgun you need
16:03
to be able to control the length of time it takes to reload it. So, you know, very simple code to do something like that. So I'm just holding down the key to play the shotgun sound. Now, it's actually randomizing the firing sounds and the reload sounds, and it's randomizing
16:25
the pitch on all of them. And it's keeping track of how long each one of those is, playing them back at the right times, and making sure I can't fire it any more often than it takes to actually refill it. Now, I'm not a programmer. I do hack a bit in code. But
16:46
I wrote this code in a matter of about an hour. So it shows just how incredibly accessible the Web Audio API really is. They also have now the ability to do loop markers. So
17:04
one of the things we do a lot of in game audio and event-driven audio is have loops that play for however long something is occurring. But we need a start and an ending phase to that loop as well. So we can actually programmatically choose where the loop markers would go inside something. So I could deliver a file that has the start, the loop, and the stop
17:24
all in one file, and then program it so that it will play through all of those. So here's a machine gun. Now, however long I hold this down, it will keep firing the loop. And when I let go, it'll stop. So it's actually arbitrarily creating the loop markers based on what I've
17:48
programmed it for. So we could do that with a car starting, looping in idle, revving up to another state, revving down. So we can create those different things. We could crossfade between
18:00
them. And the Web Audio API provides all that capability. And it's very easy to access. Let's see. So I'll show you some really basic functionalities that people have been programming. Wavy Jones is kind of a fun one. It's basically just an oscilloscope made with
18:21
a line in Canvas, reacting to the sound file that's being played.
19:09
Okay. That one doesn't feel like working today. Let's try it in Chrome.
19:24
They must have changed something again on me. Sorry about that. I tested all these last night; they all worked. This example actually draws the waveform, and you can click anywhere in it to play from that location. So I'll just play a little bit
19:44
of it. Wait a minute. Wait a minute, doc. Are you telling me that you built a time machine? Kind of a DeLorean. So you've got realtime waveform creation. Are you telling me that you built a time machine, doc? You can move around the file seamlessly. So this is kind of the kinds of tools that I'm used to
20:03
working in for content creation, to be able to move throughout waveforms and do editing. Now, that means that if we have something that would visualize that, let's say peak meters like that, well, that's getting towards some of the tools that I work
20:24
with as well. So this is, again, a canvas example using a gradient on a line. Or we could expand that out into multiple frequencies. Or we could actually do that as
20:47
a spectrogram. Some pretty detailed capabilities for a web browser if you ask me when it comes to
21:02
sound. So that means then the obvious next step would be, say, using a web browser to actually create content. So we could say add a track. And let's say we create an instrument, maybe a keyboard.
21:34
And we can turn on the metronome. The metronome happening there. We can play. We go back and
21:52
take a look and it's actually captured our sound. In fact, we could trim it. So we have a functioning digital audio workstation built in a web browser. That's pretty cool. And this is
22:07
collaborative. So multiple people can use it at once. I can go and add something else on top of that. So they've provided a bunch of different instruments for us. We'll load a drum kit.
22:42
All right. So we can take a look at what we recorded. We can scroll that back. We can actually
23:04
view the notes and work with them if we want. Get out of the piano roll. And this will actually
23:28
then create audio files for me. Pretty cool. Let's go to another example. This is kind of one of my favorite examples recently. It doesn't look like much. But what's really awesome is it first off
23:44
asks to be allowed to use the microphone. And it will actually record. So it's not going to show me very much while I'm actually doing anything. But when I hit it, it's going to give me the file, which of course is going to open in Peak.
24:03
So it does record, which is also pretty damn cool. We can actually create content inside a web browser now. So I'm going to show you a couple more examples. And then I'm going to show you how the code works a little bit. So another really cool example.
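For reference, microphone capture like the recorder just shown is requested through getUserMedia. This is only a hedged sketch, using the modern `mediaDevices` form (the talk-era API was the prefixed `navigator.webkitGetUserMedia`), and the function names are my own:

```javascript
// Does this environment expose microphone capture at all?
function micSupported() {
  return typeof navigator !== 'undefined' &&
    !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
}

// Browser-side sketch: ask permission, then feed the mic into the graph.
async function tapMicrophone(ctx) {
  if (!micSupported()) {
    throw new Error('getUserMedia not available');
  }
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const source = ctx.createMediaStreamSource(stream);
  const analyser = ctx.createAnalyser();
  source.connect(analyser); // tap the signal without echoing it to the speakers
  return analyser;
}
```

From the analyser you can pull waveform data to draw, or route the source into further processing nodes instead.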
24:25
The BBC went and recreated a lot of the well-known Doctor Who radiophonic workshop sounds, which is amazing. And they provide all the code for it too. Does it work? This one I have a
24:50
video of. So I'll show you a video of it instead. Like I said, they just changed a bunch of stuff in Chrome this week. So a lot of stuff. This example, I don't think this one's
25:15
going to work. I'm going to see if we can try it. I'm going to skip that one.
25:25
So these guys went, it's a group in Sweden called Plan 8 that have been doing a lot of web audio development. A lot of their stuff isn't currently working. They've taken it offline for various reasons. So there's a couple things I can show. One of them is called Socket Tennis, which Lynn's going to help me demonstrate. I'm going to unplug this.
25:50
So one of the cool things about this one is it uses a mobile device and all the sensors in it. Connection here. OK, so I'm going to create a game.
26:14
And the one you want to put in is AZTL. Player joined. Player joined.
26:26
OK, so now we have a tennis game that's all sound and using the motion sensor.
26:42
OK, is it your serve? I think you have to turn the phone the other way. You have to hold it from the back side. Yeah. Yep. Try it. Too late.
27:02
It's your serve. Oh, it's my serve. Try now. There you go. Swing. Anyway, it's a fun example. The eyes move back and forth, which is a lot of fun.
27:27
OK, I'm going to jump back in here. Look at Racer. This is a really cool mobile example
27:43
that was also implemented by the Plan 8 guys. So this was an experiment that Google did. And it uses mobile devices. I'll show you the video, because having five devices all set up, you won't see much. But it actually is... Sorry, let me put the sound back over.
28:10
So it actually moves the sound from one device to the next, as well as keeping all of the devices in sync and keeping the music in sync. And depending on how many people you have playing,
28:20
it layers more layers of sound and music on top. And it works on most mobile devices that support web audio. So things like that collision, the sound plays on the device that the
29:04
collision occurred on. And as the slot cars move from device to device, the sound comes out of those speakers. It's very cool. I've only ever actually had a chance to play five devices one time. Because some devices don't support web audio, some are just too slow to support
29:20
the software. But it's a lot of fun. I was going to mention, there's a bunch of third-party libraries out nowadays for doing web audio. So you don't actually have to write it all from scratch, which is cool. Some of the things I didn't show you, for example, there's guitar effects processors that are out that you can utilize. And you can just basically
29:42
plug them in and start doing distortion, chorus, phasers, flangers, all kinds of cool stuff. There's one called Tuna, which allows you to do all that kind of stuff. There's another one called Gibberish. It's really designed for just-in-time compilation. And so it's very
30:02
heavily optimized. Very cool stuff. There's another one called Web Audio API Clock, which will allow you to schedule sounds. The Web Audio API is sample accurate, which is great. So that means you really can do some very cool audio things. Keeping things in sync with picture is a whole different challenge. But being able to use the clocking to be able to
30:25
schedule events, whether you were, let's say, trying to build a multi-channel mix of music, it's pretty cool. And then there's another one called Component FM that also allows you to use various sound manipulation tools. Faust recently created a new tool that allows you to export
30:46
custom processors. That's pretty neat. You can actually make JavaScript processors that will do custom manipulation of sound. So if it doesn't exist in the Web Audio API, you actually still can do it. Surround is another thing you can do in Web Audio API. And actually, the audio
31:05
element will support surround playback as well. This radio station in France actually has a 5.1 as well as binaural audio outputs. Now, the biggest question people ask me generally is, well, that's nice, but how are people going to use it? It's actually easier than you think.
31:25
HDMI. You plug your computer via HDMI into your home theater playback system, and it will generally start playing surround straight out for you. In my studio, I was able to actually create an example that used Web Audio API to play surround music out through a browser through
31:45
the Chrome browser. The hardest part was figuring out how to get a six-channel OGG file created properly. Once I did that, it all worked just fine. That example code is up online. Obviously, I'm not going to show any surround stuff. Nintendo actually forked WebKit for the Wii U.
32:06
So you can develop HTML5 games that will work on the Nintendo Wii U. And their implementation, because it's WebKit, supports Web Audio API as the audio engine. So you can do some pretty darn nifty things. I'm still trying to get my hands on one of
32:23
those dev kits so I can do some of that coding. So let's take a look at the code. So this is the interactive example I showed with the guns. This is the code that's running that, basically. So I'll show you the back-end object code and then what's triggering that.
32:44
All of this is available online. If you go to the HTML5 audio blog or if you pick up the magazine here, I wrote an article, and there's pointers to the GitHub repositories for all this code. So the first thing you have to do is create an audio context. Makes sense.
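As a rough sketch, context creation looks something like this (the guard and function name are mine; the exact constructor depends on the browser, as discussed next):

```javascript
// Minimal sketch: create an audio context, falling back to the
// webkit-prefixed constructor used by older implementations
// (e.g. Safari 6 / iOS 6). Assumes a browser environment.
function createAudioContext() {
  const g = typeof window !== 'undefined' ? window : {};
  const Ctor = g.AudioContext || g.webkitAudioContext;
  if (!Ctor) {
    throw new Error('Web Audio API not available');
  }
  return new Ctor();
}
```

Everything else in the graph hangs off the context this returns.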
33:03
In some cases, that might need to be prefixed with WebKit. Depends on what you're running it on. Older implementations of the Web Audio API, such as Safari 6 or iOS 6, will not support un-prefixed. So you've got to do a little bit of testing. There have been some changes to
33:22
the API, and they're still developing it. So it's still ongoing. Just in the last week, in the latest version of Canary, they changed some of the names of the routines, like createGainNode becoming createGain. They got rid of the word "node" because it wasn't in any of the others.
33:44
So they deprecated it a while back, expected people to notice and change, and so all of a sudden, things stopped working. So you've got to keep an eye on some of the stuff that's still in process. Once you've created your context, you can then connect to it all kinds of different nodes. So, for example, a panner node obviously will pan from left to right.
34:05
A gain node will allow you to increase the volume, and it works based on percentage. So if you gave the gain a value of one, that would be 100%. Two is 200%. So +6 dB would be roughly a two. So now we're going to get some sound data. Now, there's two ways to do this.
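The percentage-versus-decibel relationship above follows the standard amplitude formula, gain = 10^(dB/20); a small helper (the function name is my own) makes the conversion explicit:

```javascript
// Convert a decibel change to a linear gain value for a GainNode.
// Gain 1.0 = 100% (unity), 2.0 = 200%; +6 dB comes out at roughly 2.
function dbToGain(db) {
  return Math.pow(10, db / 20);
}

// In a browser: gainNode.gain.value = dbToGain(6); // roughly doubles the level
```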
34:26
One is with an XML HTTP request, an XHR. Personally, I like to embed my sound data as much as I can unless I'm building something that's so heavy weight that it needs to be streaming the audio data. So most of the examples I've been showing you that I created use this base64 binary
34:46
decoder. I wish that was built into JavaScript. Unfortunately, you have to go find a library that will let you do that. The library I'm using is listed in all the code. So you're going to decode the audio source, and then you're going to put that into a buffer and just keep track of
35:06
whether it's been loaded or not. So once you've done that, you can actually create an object to load the sound into. Once you've done that, you load all the relevant parts, you know, the buffer and the panner and gain and all that will get loaded into that object.
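The embed-and-decode approach described above can be sketched like this; `base64ToArrayBuffer` stands in for the third-party decoder library, and `decodeAudioData` is the real Web Audio call that turns raw bytes into a playable buffer:

```javascript
// Sketch: turn an embedded base64 string into an ArrayBuffer,
// then hand it to the Web Audio API for decoding.
function base64ToArrayBuffer(b64) {
  // atob exists in browsers (and modern Node); the Buffer branch is only
  // a fallback for running this sketch outside a browser.
  const bin = typeof atob === 'function'
    ? atob(b64)
    : Buffer.from(b64, 'base64').toString('binary');
  const bytes = new Uint8Array(bin.length);
  for (let i = 0; i < bin.length; i++) {
    bytes[i] = bin.charCodeAt(i);
  }
  return bytes.buffer;
}

// In a browser, with an AudioContext `ctx` and base64 sound data:
//   ctx.decodeAudioData(base64ToArrayBuffer(soundData), function (buffer) {
//     sound.buffer = buffer;  // keep the decoded buffer on the sound object
//     sound.loaded = true;    // and track that loading has finished
//   });
```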
35:26
And then you connect the whole thing to a destination, and now you can play it. The nice thing is it's very easy to play. You just issue the start command. Now, the zero indicates, in seconds, as a floating-point number, when you'd like to schedule that sound to play. So
35:46
being able to schedule your sounds is very, very simple. And then, of course, return a handle so we can use it again later. Killing the sound, you know, you want that handle to be able to manipulate it. So for the example of
36:01
the pistol, this is the actual code that's running it. So we're initializing the sound and setting the gain. And then we're listening for a user to have an action, and then we're just calling it. And that's really all there is to it. So I'll show you an example that's making use of all of this kind of in the same place.
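As a hedged sketch of that pattern (names like `playPistol` and the 5% spread are my own, not the actual demo code): decode the buffer once, then on each user action create a fresh source node with a slightly randomized pitch:

```javascript
// Randomize playback pitch within +/- `spread` (e.g. 0.05 = 5%),
// so repeated gunshots never sound identical.
function randomPitch(spread) {
  return 1 + (Math.random() * 2 - 1) * spread;
}

// Browser-side sketch: one-shot playback with randomized pitch.
// `ctx` is an AudioContext and `buffer` a decoded AudioBuffer.
function playPistol(ctx, buffer) {
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.playbackRate.value = randomPitch(0.05);
  const gain = ctx.createGain();
  gain.gain.value = 1.0;   // unity gain, i.e. 100%
  source.connect(gain);
  gain.connect(ctx.destination);
  source.start(0);         // 0 = play now; a future time schedules it
  return source;           // the handle, so we can stop() it later
}
```

For the machine-gun style sounds, the same source node also exposes `loop`, `loopStart`, and `loopEnd`, which is how the start/loop/stop-in-one-file trick works.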
36:26
So I do a lot of video game production. And one of the things we did recently in the production process I'm working in is we changed to an HTML5-based script format. So this is an example
36:45
from Telltale Games' The Wolf Among Us, if any of you have played that. I apologize if there's any spoilers in this that give anything away. Basically, Telltale changed their script format. Their tool exports for us an HTML5 script, which is kind of cool, actually,
37:02
because it means we can play the feed lines of dialogue to an actor if we've already recorded the counterpart lines of the conversation. So I went through and actually redid the way they wrote that code so that I can actually get a lot more information. Originally, those gray
37:21
buttons, you could keep clicking them, and it'll just keep playing instances of the sound. And the sounds weren't embedded. So I decided to base64 encode all that so we have more portability because we're not playing this off a web page. We're playing it off a local machine. And then I implemented a playhead-type functionality so we can actually keep track of where we are. So these are going to be kind of quiet, yeah.
37:52
So originally, we didn't have the ability to see what the progress of the sound playback was. And it didn't change the state of the button, and you could play more than one of
38:04
these at a time. So it won't let me play more than one anymore, which is great. And it's just space bar to start or stop. As I started building this, I thought, okay, we might need to jump around. So let's say we needed to go find one that's valid.
38:27
So we probably wouldn't want to play that whole thing for an actor, you know. So if Bigby was the next one.
38:45
So as we started working with this, I realized, well, this is actually kind of cool as a playback tool, but it's about 60% of the way to my ideal mastering tool. Because the mastering process for dialogue in games is very tedious.
39:04
For example, I spent about 48 hours straight over the course of a week mastering all the dialogue for a game like The Walking Dead. And the process is very simple, but it's a long, tedious, arduous process. I open up a program called AudioFinder,
39:23
which shows me a list of all the dialogue, which I have meticulously ordered. And it's not in a script format, so I just have basically a bin of files that all have nine-digit numbers as their file names. So, like, this number here, that would be the file name of this line.
39:42
So when it gets recorded, that's what I would have. It's like, okay, so I have a folder full of those under one of the character's names, and I have another folder full of another character's. And I have to actually master them one environment at a time. I don't want to master the characters. I want to master the environments. Because if I master the characters, the environments may not match. So when they start talking to each other,
40:04
the two characters, one might be louder than the other, and it becomes obvious when it's in the game. But I'm not in the game. I'm outside of the game. But using it inside the HTML5 script, well, now I am inside the game. So the process used to be I would open it up. First
40:21
off, I would rename all the files, work I'd later have to undo, done just for the mastering process. I would add the environment name to the end of the file name, and then I would add the character name. And then I would reorganize all the files into environments. And then once I'd finished mastering, move them all back into character folders, and then strip all that extra stuff off the file names before I deliver them.
40:42
And then when I start going through them in AudioFinder, I use the arrow key to play back a file. It shows me the file's waveform, shows me how loud that file is currently at its loudest point. And as I'm listening through the files, you know, I'm just arrowing through them. Then I hear one that's not loud enough. I hit a command key, and it opens it up in Peak.
41:04
I hit another command key. It gives me a change-gain dialog. I type in the amount of gain I want, hit enter, wait a second or two for it to process, and listen to it. Nope, that's not quite right. Make the change again. Listen again. Okay. Save. Close. Go back to Audio
41:21
Finder. Go back a few files. Start moving through the files again, listening to the dialogue. Same thing when I want to put on a high-pass filter, which would get rid of, say, a plosive (if there's a big puh sound on some performance) or too much rumble. We use these tools a lot. It's about 90% of the work of the mastering job.
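In Web Audio terms, that high-pass is just a BiquadFilterNode in the chain. A sketch, with my own function name and assuming an AudioContext and decoded buffer are already available:

```javascript
// Route a dialogue buffer through a high-pass filter to tame a plosive
// or low-frequency rumble before it reaches the speakers.
function playWithHighpass(ctx, buffer, cutoffHz) {
  const source = ctx.createBufferSource();
  source.buffer = buffer;

  const filter = ctx.createBiquadFilter();
  filter.type = 'highpass';          // attenuate below the cutoff
  filter.frequency.value = cutoffHz; // e.g. 92, or 105 after a tweak

  source.connect(filter);
  filter.connect(ctx.destination);
  source.start(0);
  return source; // handle for stopping playback later
}
```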
41:42
So it's about 48 hours to do that. But I started thinking, well, I'm almost there, right? I can play through everything in sequence. So, like, that cough isn't loud enough. I want it to be louder. I keep raising it. Oh, that's too loud. Let's go back to zero.
42:04
Excuse me. Excuse me. Excuse me. Excuse me. Oh, hi. I've seen you around, but you may have forgotten. Oh, hi. I know who you are, Flycatcher. I know who you are. I think I've lost count.
42:26
But you know, Crane. But you know, Crane. Oh, yeah. Oh, yeah. Oh, yeah. But you know, Crane. Oh, yeah. And let's say I decided that one was too bassy. I can just invoke a 92-hertz high pass.
42:40
Oh, wait, actually, I want 105. So I've got it set so I can just jump through them. Yeah. Yeah, I mean, it's no big deal. Oh, yeah. Go back. Excuse me. Oh, hi. I've seen you around, but you may have forgotten. I know who you are, Flycatcher. You've worked at the
43:00
Woodlands for how many years? I think I've lost count. But you know, I think I've lost, but you know, Crane. Let me go recently. Oh, yeah. Yeah, I mean, it's no big deal. So what brings you here? I mean, can I help you with anything? So I'm making all these changes,
43:20
right? Now I just need to process them. So I wrote a little code that exports them as a tab-delimited file. So now I've got all my changes, just for the files I've modified. I've got the file name. I've got the amplitude change. I've got the amplitude change in dB, in
43:41
case I want to know what change I actually made. Here's my high-pass filter on that one file. And now I've got a bash script that will run just those files through the plugin settings I've already chosen. I've taken a 48-hour job and turned it into a two-and-a-half-hour job, almost entirely in the Web Audio API.
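The batch step might look roughly like this. This is a guess at the shape of it, not the actual script: it assumes a tab-delimited export of filename, gain in dB, and an optional high-pass cutoff, and it prints sox commands as a dry run rather than assuming any particular processor is installed.

```shell
#!/bin/sh
# Read "filename<TAB>gain_db<TAB>highpass_hz" lines and emit one
# processing command per modified file (dry run: echo instead of exec).
process_changes() {
  tab="$(printf '\t')"
  while IFS="$tab" read -r file gain_db hp_hz; do
    if [ -n "$hp_hz" ]; then
      hp=" highpass $hp_hz"
    else
      hp=""
    fi
    echo "sox $file out/$file gain $gain_db$hp"
  done
}

# Usage: process_changes < changes.tsv
```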
44:05
So I would say, you know, me being not a programmer, more of a code hacker, I found ways to improve my pipeline. It shows how utterly easy it is to work in Web Audio.
44:27
So what's coming next? Well, the next step is we've got the Web MIDI API, which you've seen a tiny bit of. It's really only implemented in Chrome at this time, but it's getting there.
44:40
And the audio working group is working on that. We've got WebRTC for real-time communication, so chatting and whatnot. But that also gives us access to new codecs, like the Opus codec. And it gives us more ability to get real-time input in more browsers, because right now getting access to the microphone is still kind of a challenging thing.
45:04
But, you know, Chrome does it. And there's a way to trick iOS into doing it as well: you can actually capture video and then strip the video away and keep the audio. There's a good example on the HTML5 audio blog for that. And then there's the Web Speech API, which I think came in around Chrome 25. And I think
45:27
the synthesis part of it is behind a flag as well. Maybe it's been released by now. But synthesis is available in Safari 7, too. So we're starting to see a lot more of these types of capabilities. And all of it ties together to give us more cool stuff we can do.
45:43
So anyway, there are ways you can get involved. First off, if you're interested in helping out and influencing the way any of these APIs are actually being created, the audio working group at the W3C is actually very eager to get people to
46:00
put in their two cents. It's a very active group, and they're very welcoming of people coming in and helping out. And then there's the HTML5 audio blog, which is a resource I put together where you can find a ton of examples. And there's actually a third one I didn't list here, but you can find
46:21
it pretty easily through either the audio working group or through the blog. Chris Lowis, who is involved with the working group, has a weekly newsletter that puts out basically some of the coolest and newest things currently being showcased in Web Audio. Anyway, I'll be happy to take any questions people
46:42
might have about Web Audio or anything else, like the universe and everything. So the question is browser compatibility: obviously not everything is working 100% of the
47:04
time, so how long do we have to wait before we can use this? Well, they're still working on the API, but the Web Audio API is available in almost every browser at this point in some form. And there's actually a thing called the monkey patch, put together by Chris Wilson
47:23
from Google, that you can use to smooth over the compatibility differences between browsers and not have to worry about them so much. There's still, I think, another eight or nine months before they'll really finalize the API. But, you know, it's been a long process.
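At its simplest, the compatibility problem the monkey patch smooths over is the vendor prefix. A hand-rolled detection sketch (my own, not the monkey patch itself) looks like this:

```javascript
// Return a usable AudioContext constructor, preferring the unprefixed
// name and falling back to the webkit-prefixed one; null if unsupported
// (or when not running in a browser at all).
function getAudioContextCtor() {
  if (typeof window === 'undefined') return null; // not a browser
  return window.AudioContext || window.webkitAudioContext || null;
}

// const Ctor = getAudioContextCtor();
// if (Ctor) { const ctx = new Ctor(); } // else fall back, e.g. to Flash
```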
47:42
When I first got involved, it was March of 2012, and this was barely in Chrome. And then all of a sudden in July it was in iOS 6. And then it was, wow, we can use this in a lot of places, such as mobile devices. It came into Chrome on Android not long after.
48:02
That was about the end of 2012. And when Opera switched over to WebKit, they got it. And Firefox has been very involved, and they have support now, too. And Microsoft has now quietly, and by quietly I mean they put it on the modern.ie website, said that they're working on it.
48:25
So we'll see support in IE soon. And Mozilla getting involved has actually been really great for the API, because it's helped flesh out some of the inconsistencies and question marks as to how the API is supposed to work. So some of the growing pains here are,
48:44
I think, actually really good, in that we've found places where things broke. Like the things I discovered this week, where the name of an API call changed: those were things I remember somebody talking about a while ago, and I've just been too busy to go look for them. So it's partially my fault some of these things don't work. Some of it is that the developers made
49:05
examples a while ago and went, you know, I'm just making it to put it out there, and they're not really worrying about supporting it. In some cases, some of those developers have pulled their code altogether and just put up videos saying, this is what we made two years ago, it's pretty cool. And that's about it. So there's still a ways to go. But I'm actually
49:27
using the API because, you know, I'm not going to wait. There's just too much cool stuff to do. There are also third-party libraries you can use that will detect the API and handle compatibility for you. So, for example, if you were making a game and you didn't want to write any of the
49:43
code yourself, you could use, say, the SoundJS library, and it will fall back to Flash if it can't get anything else working. So, you know, there are some stumbling blocks, as with pretty much everything in HTML5 at this point. Soon, I hope. Any others? Yeah.
50:12
So, yeah. The question is: given the number of software limitations, are there equivalent hardware limitations? I personally haven't run into a whole lot of hardware
50:23
limitations other than whether there's support or not. The Android hardware limitation, as far as I've seen, is just that the handset makers basically push a product out the door and then forget it exists. Or the telcos decide they're not going to certify
50:40
an update because they don't really care, and they'd rather you pay for another device anyway. So the software update thing is a real drag. For the Web Audio API, the way it's evolved, it's actually a problem on the iOS side as well, unfortunately, because of the way Apple does Safari pushes: they basically fork WebKit, and
51:08
at some point they decide to no longer incorporate new features from WebKit once they're locking down the code. So, for example, anybody who's running Safari 6.0 from when that came out
51:22
will never see the updates to the API that came in Safari 7.0. But Safari 7.0 got a ton of updates to the API, and Safari 8.0 will probably get a whole bunch more. So there's an incentive for people to upgrade; people can upgrade,
51:40
it's just a question of whether they do. I mean, on this machine I'm running Snow Leopard, so I'm not even able to use the Web Audio API in Safari. I have to switch over to Mavericks, which I also have on here. But, you know, there are limitations pretty much everywhere. In terms of whether the hardware can play sound, I haven't run into too much. I mean,
52:02
the latency, for example, in Chrome is 9 milliseconds, if I remember correctly. Pretty damn good for a browser. You know, the audio functionality has really been separated from pretty much everything else involved with the browser experience. I was noticing
52:20
some issues yesterday when I was updating code to be compatible with the latest Canary. I was getting some weird noise, but that may have just been my computer, so I haven't tested it more thoroughly. Other questions? Yeah. Do I think it'll have an impact on desktop audio apps?
52:46
To a certain degree. I don't think, you know, a web-browser-based digital audio workstation is ever going to replace Cubase or Pro Tools or Logic. But to a degree, it's also a democratization. As someone who had, until last year,
53:02
never written a single line of code that had anything to do with sound manipulation, I would say it certainly demonstrates a certain degree of democratization when it comes to audio programming. I'm certainly building more tools with it. That example I showed where I could record the sound and then save a WAV file, I'm actually
53:25
going to take that and integrate that into my script tool so that I can do the processing inside there. I spent about a week on it, yeah. That's right. Well, I have a whole other talk
53:50
I'm actually putting together for a different conference, a game conference in Montreal in November, which is all about how come, 35 years into making games, we still don't have custom tools
54:01
for audio? You know, it's a whole other discussion, but I agree with you. It's kind of disgraceful that, you know, currently we don't have the ability to make these tools. I mean, I sat in on a couple of really excellent game development talks here today, and neither one of them mentioned
54:21
audio at all. They both spent the entire time talking about 3D, which is totally fine. I was tempted to ask, but I knew the answer was going to be, oh, I haven't thought about it, because that's kind of the world of audio: nobody really thinks about it. I think in terms of the democratization, the availability of being able to do things,
54:43
it's certainly going to have some impact. I'm hoping that what I can do is build basically prototypes and then turn around and hire real programmers to make real tools for me. But, you know, I don't know how to run that kind of process yet, so I'm just building it myself. Then I can say, here, make this happen, and somebody who's
55:07
wiser in the ways of true programming can hopefully solve those problems for me. Any more questions? Well, thank you so much for coming to the last talk of the day,
55:21
the last talk of the week. I hope you guys saw some great stuff, and I hope you get involved with the Web Audio API. And if there's anything else, any questions you have outside of here, please feel free to hit me up through the blog, HTML5audio.org, or through my website, or Facebook, or LinkedIn, or wherever you find me. Thanks for coming.