Now Hear This!
Formal Metadata
Part Number | 42
Number of Parts | 94
License | CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers | 10.5446/30685 (DOI)
RailsConf 2015, part 42 / 94
Transcript: English (auto-generated)
00:13
So clearly you're in the session Now Hear This: putting voice, video, and text into Rails. A quick introduction: my name is Ben Klang. I'm actually very proud to be from the
00:23
city of Atlanta. Welcome. Hope you have all enjoyed the night so far. You may also know me through some of my open source contributions. Just a quick show of hands: has anyone heard of Adhearsion? Has anyone used it? A couple? All right, cool. I'm not going to talk about Adhearsion today, but I do want
00:41
to quickly mention it because it bears relevance to the talk. Adhearsion is an open source framework for voice applications. You can think of it this way: Rails is for the web, and Adhearsion is for voice and real-time communication. I'm also the founder of a company called Mojo Lingo, based here in Atlanta, and this is what we do. We work with
01:00
voice applications. We build them, we scale them, we do usability work. This is a topic close to my heart: communications applications in particular. Today, I want to tell you why the web is a lot like outer space. Because on the web, no one can hear you scream.
01:24
So let me just paint a scenario. Imagine you're working with your app, and all of a sudden something happens and you realize you need to speak with one of your customers. What most of you are going to do is pick up a telephone. And the main problem with this is that when you pick up that phone,
01:42
any communication that you have is now outside of your business process. It's not noted within the business application, it's not recorded, and the fact that the call even happened is, in most cases, in no way reflected in what your application knows about that customer.
02:00
And also, the communication itself is fairly limited. You've got this really crappy narrowband audio signal to talk through. You can't easily share pictures, you can't easily share links; you really don't have a very rich communication experience. Wouldn't it be cool if we could, instead of having that phone call happen outside of the app,
02:21
put the communication right into the application itself? Wouldn't that be cool? That brings us to something called WebRTC. So, by a show of hands, has anyone heard of WebRTC? That number goes up every time I ask, which is absolutely a happy thing for me to see. Has anyone actually tried it?
02:44
Well, hopefully, at the end of this talk, I'll have some resources for you that will inspire you and give you some information on how you can try it. For those who aren't familiar, WebRTC is fundamentally about accessing the speaker, the microphone, and the camera in the browser, so that you can use them in a web application.
03:02
So what is it? It is the camera and microphone, but without any plugins. This means that if you want to build a real-time communication app, if you want to take advantage of the mic and speaker for some kind of app, you don't need Flash, you don't need Java, and you avoid all of the bad things that come with having
03:21
plugins, such as crashes and insecurity. It's built right into the browser. WebRTC additionally has functionality built in to establish peer-to-peer connectivity between two or more endpoints. This is a really interesting point, which I'll touch on more in a minute, but connectivity across the internet can be really
03:41
tricky with NAT, firewalls, and things like this. So WebRTC has functionality built in to help connections traverse firewalls. The last thing it provides is a common set of codecs for actually exchanging high definition media. I'll talk more about that in just a second as well.
04:03
So fundamentally, WebRTC is a JavaScript browser API. You access it using JavaScript, and it's built into the browser. It can also be used on mobile, although in the mobile world what you get are all of the back-end pieces that have been standardized, but you don't necessarily have the same API. It's a different SDK.
04:25
I'm not going to talk too much about mobile today. But the standards for interoperability are really interesting. These codecs, Opus, G.711, H.264, and VP8, are what make very high quality audio and video possible on the internet.
04:42
G.711 aside, which we don't really care about, Opus is a pretty amazing codec. It comes from a lot of research, including significant contributions from Skype. Probably most of you have made Skype calls and noticed just how good the audio sounds. Opus builds on that research and actually goes further.
05:01
Opus as a codec is good enough not only to transmit voice efficiently, which is to say using the minimum amount of bandwidth to preserve the highest quality of voice; it can also scale up and transmit music. So it's a very, very high quality codec. It's built into the browser with no royalties. If anyone has ever dealt with licensing codecs,
05:21
you know what that can be like. Opus is entirely royalty free. H.264 and VP8 are two competing standards for transmitting video. H.264 is not royalty free; it is actually patent-encumbered, although Cisco has paid for licenses so that open source software like Firefox, and eventually Chrome, will support H.264.
05:42
VP8 is a codec that came from a company Google acquired; Google then released all of its IP. So it's a fully open standard, openly licensed codec for video, and that's very exciting because it means that we'll eventually be able to do video without paying royalties.
06:01
What these give you, built into the browser, is very, very high quality audio and video. There are a few more alphabet soup type things that are built into the standard. SDP is the mechanism by which the two endpoints exchange information about where they are. ICE, STUN, and TURN are the protocols
06:21
used to traverse the firewall, and DTLS-SRTP is exciting because it is basically on-by-default encryption. So all the media on your calls will be encrypted. So finally, what is WebRTC? A lot of people in the telephony industry get really excited about the idea
06:43
of putting a telephone in a web browser. And please, if you take one thing away from today, do not take that away. The point isn't to put a telephone in a web browser, because we can do so much more. The web offers a rich set of user interface possibilities. So think of it instead as communications in a web browser. And a quick note on the relevancy
07:02
of WebRTC. This is the only chart of this kind I've got in this whole talk. Dean Bubley put together this chart projecting the adoption of WebRTC. The grey line at the bottom represents browsers, and we are pretty much hitting that point today, with a little over a billion desktop
07:23
browser-based devices that support WebRTC. The interesting part is the growth in tablets and smartphones, because these communications options won't just be in the browser; they will also be on mobile devices, whether that is mobile web or native apps. Very soon there will be a lot of WebRTC-capable devices out there. OK, so before we go
07:44
further into WebRTC, I want to give a real quick background on communication topologies. This is how communications are facilitated today. Most of you pick up a phone, and you might have your service through AT&T. When Alice wants to call Bob, she'll pick up the phone and dial;
08:00
that signal will hit AT&T. AT&T shoots it over to Verizon. Verizon sends it back down to Bob. This is called the trapezoid. It's pretty classic. It relies on every subscriber having a relationship with a carrier, and then all of those carriers being federated with each other. The advantage of this is that everybody can call everybody. We have one set of phone numbers and generally, as long as all of the
08:24
carriers federate, everyone is reachable. But there are a lot of problems with that. Because of the overhead that comes with all that federation, there is a lot of innovation that gets lost. You just don't move very quickly when you have to coordinate companies all around the world. And then not to mention devices.
08:42
Also, it's not particularly user friendly. If you think about identity in the form of a cell phone, your identity is your phone number. But that's just ten random digits that are assigned to you by your phone company and that really mean nothing to you. And yet we come to be associated with this identity. So this architecture
09:04
has some significant drawbacks. The next architecture is more like a triangle, and Skype is a good example of this. You have one central service and you have endpoints that connect into it. Now these guys are going to innovate a lot more because they control both the network and the endpoints. So we got things like video, we got things
09:23
like high definition calls. We have things like usernames: usernames that we actually picked in the process of signing up. But there are still two things that are problematic with this. One is that it's essentially a walled garden. I can't build an app that integrates with Skype all that well. I certainly can't embed Skype in one of my business processes. Which means,
09:43
second of all, it's not very contextual. I still have to go to a separate service or separate application to actually handle my communication. It's not part of my business process. With WebRTC we actually get to do something that looks more like this. You still get to keep the triangle; it's actually a more perfect triangle.
10:04
Because what's happening here is that the signaling goes back to the website. So again, I don't have to download any plugins. I just go to the web application and it serves me all of the tools I need to enable this communication.
10:21
Secondly, the signaling and the media are separated. What happens is, when that call is being set up, Alice will send a request, which just contains her information, to the web service. The web service will share that information with Bob. But let's imagine you have a firewall here. The media is actually exchanged behind the firewall.
10:40
This has some really interesting implications for performance, and for quality, if you are on a low bandwidth link. I was actually working in Barbados once and the internet connection off the island went dead. If you have connectivity on the island, but not off the island, you can actually still do it, because all of the media is exchanged locally,
11:02
so that we were not using expensive bandwidth for round trips across congested links. All of the video was being passed on the LAN, even though the session setup could be elsewhere. So let's dig a little bit further into how WebRTC session setup works. We'll start with Alice using Firefox.
11:23
She is going to send a request to initiate communication with Bob. That request contains something called an SDP, a Session Description Protocol payload. For practical purposes, just think of it as an opaque blob of text. But this blob of text contains a bunch of things, which include her contact information in the form of an IP
11:43
address and a port. It contains a list of the codecs that her device supports; this being WebRTC, that would include the codecs we just covered. And it also contains her public key, which can be used to encrypt communication being sent to her. Now the web server doesn't have to do anything with that blob. Again, it's
12:02
just opaque. All it really has to do is forward it on to the recipient, in this case Bob. Bob, upon receiving that offer, generates his own response, containing largely the same kind of information, and passes it back via the web service to Alice. Now at this point, a whole bunch of packets start flying
12:22
between Alice and Bob, starting with ICE, then STUN, and then TURN. What those three things do: ICE in particular enumerates all of the network interfaces that you have. You might have a LAN, you might have a VPN. It will also ping out to the internet and figure out what your public IPs are. And it will use all that information
12:42
to try to tell Bob how Alice can best be reached. If they can communicate directly on the LAN, great. If they can't, because there are several firewalls, maybe we'll try to pierce through the firewall, and that's where STUN comes in. In the worst case scenario, if they can't communicate directly, either locally or using STUN to traverse the firewall,
13:02
then there are relay servers called TURN servers that will actually proxy the media. They'll just receive it from one party and pass it right to the other. Now, because they've exchanged keys using the signaling layer, the media will be encrypted. So even though the TURN server is technically in the path of the media,
13:22
in the worst case scenario, all that audio is still encrypted. The TURN server can't see it. It can't do anything with it. It's just data getting passed back and forth. This built-in security is one of the big things with WebRTC that I think is relevant, given our friends at the NSA who like to jump into all of our conversations. Properly
13:42
deployed, WebRTC protects against them being able to see into that communication. One other point I want to make about signaling is that I've used web servers as my example, but it really doesn't have to be web servers. All you have to do is get that SDP from one side to the other. We've done deployments with XMPP as the carrier.
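The signaling role described here is small enough to sketch. The following is a minimal, hypothetical illustration in Python, not WebRTC's own API: the `SignalingRelay` class, its method names, and the placeholder blobs are all assumptions made for the example. It only shows that the relay stores and forwards the offer and the answer without ever looking inside them.

```python
from collections import defaultdict, deque


class SignalingRelay:
    """Store-and-forward relay for opaque session descriptions (hypothetical)."""

    def __init__(self):
        # One first-in, first-out inbox per recipient name.
        self._inboxes = defaultdict(deque)

    def send(self, sender, recipient, kind, blob):
        # `kind` is "offer" or "answer"; `blob` is passed through untouched.
        if kind not in ("offer", "answer"):
            raise ValueError(f"unknown message kind: {kind!r}")
        self._inboxes[recipient].append(
            {"from": sender, "kind": kind, "sdp": blob}
        )

    def receive(self, recipient):
        # Return the oldest pending message, or None if nothing is waiting.
        inbox = self._inboxes[recipient]
        return inbox.popleft() if inbox else None


# Alice offers, Bob answers; the relay never parses either blob.
relay = SignalingRelay()
relay.send("alice", "bob", "offer", "<alice's SDP blob>")
offer = relay.receive("bob")
relay.send("bob", offer["from"], "answer", "<bob's SDP blob>")
answer = relay.receive("alice")
```

Because the relay treats the payload as plain text, the in-memory inboxes could in principle be swapped for any transport that moves a string from one side to the other.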
14:02
It's messy. You can do it with Redis. I've even seen an example where someone actually took it, put it in a text file on a USB drive, carried it to another computer, and loaded it back in, which is the least efficient way I could possibly imagine. But it does work. So that's enough about plumbing. What really gets me excited is the
14:24
applications: how to use it. In the last couple of years of building these applications, I've thought about what it takes to build these apps and what attributes applications like this should have. So I came up with a list of
14:40
tenets that I want to share, which you should consider when designing communications applications. A modern voice application should be adaptive, which to me means that it should take advantage of the capabilities of the devices around it. It should be fluid, which is to say it should be able to move across devices and across time, even across users, and still preserve the context of the conversation.
15:05
It should be contextual, because really this is the value of what you're building: communication in context with whatever application it supports. It should be trustworthy, because the worst thing in the world is to communicate something sensitive and then have it revealed, or to otherwise lose the user's trust. And the last point
15:24
is that it should be referenceable. So let me go a bit more into each of these. Adaptive. What does it mean to be adaptive? If Alice again is on Firefox, she has a pretty broad range of options available. She has a keyboard for input, so she can obviously send text back and forth; she's got a microphone, a camera, and speakers. She can really have a very rich
15:43
communications interface. And maybe she's talking to this guy over here who's on his iPad, with a very similar set of input options available. So whatever app we build for them might enable video conversation, audio conversation, text, what have you. Now this one wants to join the conversation as well. She's on a smartphone, and this particular
16:04
smartphone either doesn't have a camera, or maybe she doesn't have enough bandwidth or enough battery to support a video stream, but she still wants to participate in the conversation. She still wants to talk about whatever the issues are. Well, she can still receive text messages if we have a mobile app in play, and she can also
16:24
participate by audio. So think of this sort of as a conference call where some of the people have a side channel where they can use video and richer communications, whereas this third party really is only in by voice, but she is still able to participate. The same is true for this poor guy. I don't even know where I found that phone.
16:44
But he can still join in. He can still talk. And then we have this last guy, who also has a browser, but either his microphone is broken or maybe his baby is asleep and he doesn't want to talk. I actually work with a guy who's in Milan, and he's 6 hours ahead of us, so a lot of times we'll have calls after his kids go to bed
17:04
and he won't always be able to talk. So we'll say something, and if he wants to give feedback, a lot of times he'll just write something into our side channel. So an app that's adaptive will enhance or degrade gracefully based on the capabilities and choices of its users. That's what being adaptive is all about.
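One way to picture graceful enhancement and degradation is as a function from each participant's capabilities and choices to the kinds of media they can contribute. The sketch below is only an illustration under assumed names: the `Participant` fields and `sendable_modes` are hypothetical, and a real application would detect capabilities from the device rather than declare them by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    keyboard: bool = False
    microphone: bool = False
    camera: bool = False
    video_bandwidth: bool = True   # enough bandwidth or battery for video
    prefers_silence: bool = False  # a choice, e.g. the baby is asleep


def sendable_modes(p):
    """Return the set of media this participant can contribute right now."""
    modes = set()
    if p.keyboard:
        modes.add("text")
    if p.microphone and not p.prefers_silence:
        modes.add("audio")
    if p.camera and p.video_bandwidth:
        modes.add("video")
    return modes


# The participants from the talk, with capabilities filled in by assumption.
alice = Participant("Alice", keyboard=True, microphone=True, camera=True)
smartphone = Participant("smartphone caller", keyboard=True, microphone=True,
                         camera=True, video_bandwidth=False)
old_phone = Participant("old phone caller", microphone=True)
quiet = Participant("quiet browser user", keyboard=True, microphone=True,
                    prefers_silence=True)
```

Everyone stays in the same conversation; only the set of modes each person can send differs.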
17:24
All right, let's talk about being fluid. Conversations often start, especially today, with chat. If I want to reach out to somebody, the first thing I do is certainly not pick up the phone. At least if it's a co-worker, it's not. I'd like to start with chat. I want to see where they are. I want to ask a question.
17:44
Maybe I just want to see if they're available. But at some point chat becomes too cumbersome, so we'll switch to audio. I want to be able to click a button that upgrades that conversation, the same conversation in the same context, from chat to audio. Maybe I want to pull a couple more people in because this discussion is getting bigger. Maybe we want to invite a customer.
18:04
Maybe we want to invite someone from another department. Then upgrade to video, because if a picture tells a thousand words, video tells that 20 times a second. But then when we're done we should be able to go back to chat. And the point is that this is still one conversation. I think Skype does this very well. If you've ever had a Skype conversation where you
18:24
started chatting, then went up to video, and then back down, you can see all of that in the history of the conversation. And then, of course, being able to switch devices. This is a big one, and not everyone really gets this right. Again, I want to give Skype credit for this. If I'm at my desk and I need to leave, I can actually transition that call to my notebook fairly easily.
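The idea that upgrades, downgrades, new participants, and device switches all belong to one conversation can be sketched as a single object with one identifier and one history. This is a hypothetical model for illustration only; the `Conversation` class and its method names are assumptions, not something shown in the talk.

```python
class Conversation:
    """One conversation whose mode and devices change over time (hypothetical)."""

    MODES = ("chat", "audio", "video")

    def __init__(self, conversation_id):
        self.id = conversation_id
        self.mode = "chat"   # conversations start as chat
        self.devices = {}    # participant name -> current device
        self.history = []    # every event, in order, for the whole conversation

    def join(self, who, device):
        self.devices[who] = device
        self.history.append(("joined", who, device))

    def switch_mode(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if mode != self.mode:
            self.history.append(("mode", self.mode, mode))
            self.mode = mode

    def switch_device(self, who, device):
        if who not in self.devices:
            raise KeyError(who)
        self.history.append(("device", who, self.devices[who], device))
        self.devices[who] = device


# Chat, then audio, a customer joins, video, a device switch, back to chat.
talk = Conversation("demo-1")
talk.join("ben", "desktop")
talk.join("coworker", "desktop")
talk.switch_mode("audio")
talk.join("customer", "tablet")
talk.switch_mode("video")
talk.switch_device("ben", "notebook")
talk.switch_mode("chat")
```

The identifier never changes, so the history reads as one continuous record rather than a series of separate calls.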
18:46
Being contextual. This is my favorite of the five. A friend of mine has this really great quote: in the future, communicating isn't what you're going to be doing; it's what you're doing while you're doing something else. This idea that we have dedicated communication devices, I think, is
19:02
done. I mean, for all of us, the phone that we carry isn't primarily a phone. It's everything else, right? So being contextual is all about getting context into the conversation, or putting the conversation in the context of the task. These are just some, not entirely random, examples of
19:22
information that may be useful to a conversation. How many callers are waiting in queue, if you have a contact center? Or how much have we sold so far this month? I like the one that says "add my manager to this call" because it implies not only that the manager can be easily added to an existing conversation, but that the business relationship is understood
19:42
by the application. So if I make a request and I say "my manager", the application knows who I am, who my manager is, and how to reach my manager, and then actually adds them to the conversation. You can see this in text as well. A good multimodal app that's contextual will not only facilitate the direct participants of the conversation
20:02
but also third party services. In this case, it looks like we had a problem with Asterisk, and you can see that notifications from New Relic were being pushed directly into the conversation. That just gives everybody more visibility into whatever problem they're trying to solve.
20:21
This is a great example of really business-specific information. In this case I was in a conversation, and when I sent this message, all I typed was that I wanted to know about A12345. The application understood that that was a special string, a hashtag so to speak, and it actually looked up that information from the database
20:43
and rendered some information live. So this conversation now has not only what I said, but the context of what I said, with very little effort from me. This makes all of that communication much more fluid. Everybody's on the same page. So the fourth one: a communications app absolutely has to be trustworthy. Users don't trust anyone.
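The lookup of `A12345` described above can be sketched as a pattern match followed by a database query. In this hypothetical Python example the token format (one capital letter followed by five digits) is inferred from the single example in the talk, and the `RECORDS` dictionary with its made-up fields stands in for the real database.

```python
import re

# Assumed token format: one capital letter followed by exactly five digits.
REFERENCE = re.compile(r"\b[A-Z]\d{5}\b")

# Stand-in for the application's database; the record is invented.
RECORDS = {
    "A12345": {"kind": "order", "status": "shipped"},
}


def annotate(text, records=RECORDS):
    """Return the message together with context for any known reference tokens."""
    context = []
    for token in REFERENCE.findall(text):
        record = records.get(token)
        if record is not None:
            context.append({"token": token, **record})
    return {"text": text, "context": context}


message = annotate("I wanted to know about A12345")
```

Unknown tokens are simply left alone, so an ordinary message passes through with an empty context list.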
21:05
So how can it be trustworthy? I think the number one rule is: don't surprise the user. Don't do something that they don't expect. An example of that would be: don't share the contents of the conversation with people they don't expect that conversation to be shared with.
21:20
So if you have a conversation between two people, generally speaking no one else should be able to come back later and access that same conversation. It's really important, I think, as well to help users make smart choices where it's required. There's been a lot of debate about this. As WebRTC
21:40
has matured, as the standards have gone through and the browsers have adopted the standards, there's been a lot of discussion about how and when to request permission for the camera and microphone. Google does a really nice job, I think. Here we're going to Google Hangouts to start a Hangout. When you first load the page, if you've not been there before, the first thing it asks you is: can I access the camera?
22:01
Now, it does remember that. I'll get to details about that later. It does remember that you've granted access, but before it drops you into the conversation you see a picture of yourself that says: here's what you look like, are you ready to go in? I think that's an important step, because what you don't want is for somebody to load a site they've been to before, where
22:20
they've already granted permission, have it take them straight into the conversation, and then realize they're not wearing pants. That can be rather embarrassing. So doing little things like that to help users always feel fully in control of their communication is really important. That is especially true with microphones. At least on Macs, cameras have that little green dot that tells us that it's on.
22:40
Microphones don't have such a dot. So there are plenty of examples where an application is listening but you don't know it. That can lead to some unhappiness. Another item about trustworthiness is identity. So identity is an interesting thing. There are lots of applications that make their name based on anonymity. There is no identity. That's an important use case.
23:04
But I think for most apps, having identity is core to facilitating communication. You want to know who is on the other end of the line. When a phone call comes in, I can look at the number and I can say, well, that's my wife. I know who that is.
23:20
In reality, caller ID is actually very easy to scoop. So the only reason that we don't see a lot more of that scooping is just because of basically the carrier is controlling the network itself. But anyone who gets a certain type of user ID connection or a whole PRI will be able to actually set the caller ID to whatever they want. I may or may not have known that. But
23:43
with the web, we have many more options for asserting identity. We have OAuth, we have social identities like Facebook and GitHub and Twitter. And we can actually use those to enhance the communication. And if your communication is built into an app, use the identity that comes from your app to assert who the user
24:03
is. And finally, these conversations should be referenceable. Referenceability is, I think, all about sharing. Every conversation, in my mind, should have a URL. This is an easy thing to do. We deal with resources and objects all day long. So every conversation gets a URL that is permanent,
24:23
unique, and represents the latest state of the communication request. So if you schedule a call, you should generate a URL that says this call will happen. If the call is going on and you hit that URL, it should bring you straight to the conversation. It should present the user interface that lets you participate in that conversation.
24:43
Once the call is complete, you should provide some kind of transcript. You may be recording multiple content types. So if you record the call and you transcribe it, provide the option to download the audio as well as search the transcript. Any images or links that were shared can be combined into that view.
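The "one URL, latest state" idea above can be sketched as a small state lookup. This is only an illustration: the `Conversation` record, its field names, the `/conversations/<slug>` path, and the view names are all assumptions, not anything shown in the talk.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Conversation:
    """Hypothetical record behind one permanent conversation URL."""
    slug: str
    scheduled_for: datetime
    started_at: Optional[datetime] = None
    ended_at: Optional[datetime] = None
    # Assumed shape, e.g. {"audio": "call.ogg", "transcript": "call.txt"}
    artifacts: dict = field(default_factory=dict)

    @property
    def url(self) -> str:
        # The path layout is an assumption; the point is that it never changes.
        return f"/conversations/{self.slug}"


def view_for(conversation: Conversation) -> str:
    """Pick what the single URL should show for the current state."""
    if conversation.ended_at is not None:
        return "archive"    # recording, searchable transcript, shared links
    if conversation.started_at is not None:
        return "live"       # drop the visitor straight into the call UI
    return "scheduled"      # "this call will happen at ..."
```

The URL stays the same while the view behind it moves from scheduled to live to archive, which is what makes the link safe to share at any point.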
25:03
So the idea is that a single conversation can be referenced by a URL in all its forms. And whenever possible, make it searchable and downloadable. Because if you don't know something is there and you can't find it, it may as well not be there. Oh, and it ought to be shareable. That's really one of the main
25:21
points of having a URL, right? You can copy and paste it and send it to anyone, and assuming they have permission to view it, they'll be able to see it. So those are my five tenets. I want to try to apply those. I've got three application ideas, and these are not necessarily great ideas by themselves. I think they illustrate where you can take these tenets
25:43
and these tools and embed them to enhance web applications today. They're kind of silly. The first is a live, anonymous matchmaking service. Think Tinder, but with video. So here are a couple of funny dates. You can kind of see we've got two people here who have video sessions going.
26:03
They've got some stats on how they were matched. Looks like they both like mustaches and puzzles. But these people also really want a sense of privacy. They don't necessarily want to share much information about themselves. They certainly don't want to give out phone numbers. So not only do we give them the ability to find each other and communicate, we also give them
26:22
these stickers that go over their face. You've probably seen something similar with Google. We can help them obscure some of their identity by giving them these tools. They can still get a sense for who each other is. They can hear each other's voices. They can see some of the expressions. But they can still hide some of their identity fairly effectively with a tool like this.
26:43
So what does this give you? It gives you safe introductions with strict anonymity. Everybody comes to the site. The site only reveals what it's designed to reveal. No need to exchange phone numbers. There's very low friction to getting started. There's no app to download. There's no plugin to install. Really, just by going to the site, the entire
27:02
toolkit of communication is built right in. And then there are silly tricks that you can use to break the ice or, again, to maintain the anonymity. And if you want, we can do a thumbs up. Skype just did their language translation, right? Why not apply that to this site, either by text or by audio?
27:22
Alright, so the second example is an incident response app. My background: before I became a developer, I did a lot of ops. I did server administration stuff. And with this kind of thing, whenever something goes down, you get that phone call at 3 in the morning, and you're getting everybody in and on the same page and trying to tell what's going on. So what if we could build something
27:42
like this? What if we could build something that would enable people to not only discuss whatever is actually broken, but also bring in contextual information surrounding the problem? So, on the left you can see the chat. Just like before, you can see where people are discussing the problem. Third party services, the tan
28:02
and green lines, are pushing in data. The content there is important. It's the idea that anytime someone does a deploy, we can see that the push was made. If there's an alert from the monitoring system, it can be pushed into that text chat. On the right, you can see the voice and video conversation going on. And
28:22
of course the people who have joined by video will see each other, but there's nothing to say that they couldn't also join by mobile device, either by telephone call or mobile app, say, if they're in the car. Now what's really interesting, what makes this different from just any other communications app, is what's on the bottom. What's on the bottom is charts, graphs, contextual data
28:42
from the monitoring application itself. So rather than waiting for an event to come in, I can actually see trend lines happening in real time as part of a communications tool. The way I envision this is that someone, maybe a company that builds monitoring tools, goes and builds this into their dashboard. I don't know, I'm not here. You can see this map. Whoever does this.
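The incident view described above mixes chat messages with events pushed in by third-party services, such as deploy notices and monitoring alerts. One way to sketch that on the server side is to merge several time-ordered feeds into a single timeline. The `Entry` type, its fields, and the source labels are assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True, order=True)
class Entry:
    """One line in the incident timeline (hypothetical shape)."""
    at: datetime   # compared first, so entries sort chronologically
    source: str    # e.g. "chat", "deploy", "monitoring"
    text: str


def merged_timeline(*feeds: Iterable[Entry]) -> List[Entry]:
    """Interleave feeds that are each already sorted by time."""
    return list(heapq.merge(*feeds))
```

Because `heapq.merge` expects every input to be sorted already, each service's feed is assumed to arrive in chronological order.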
29:02
So the key here is timely and contextual information. The view itself can adapt. If you're on a mobile device, you'll get more focus on the communication and less of the dashboard-type features. But on desktop, you'll get the full experience. I like, in particular, the emphasis on group-based communication.
29:22
I can click a button, and everyone on the ops team gets an invite to join that particular conversation, with context about that conversation. But more importantly, if I need to bring in a vendor, maybe my storage vendor needs to jump into the conversation, I don't want to give him a user account in my system just for that purpose. We can generate this unique URL.
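A unique guest URL like this could carry a signed token that is only valid for one conversation. The sketch below uses an HMAC over the conversation and guest identifiers; the URL layout, parameter names, and the in-memory `SECRET_KEY` are assumptions, and a real application would also want expiry and revocation.

```python
import hashlib
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: a server-side secret; a real app would load this from configuration.
SECRET_KEY = secrets.token_bytes(32)


def sign(conversation_id: str, guest_id: str) -> str:
    """Sign the pair so the token cannot be reused for another conversation."""
    message = f"{conversation_id}:{guest_id}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def guest_url(conversation_id: str) -> str:
    """Build a one-off invite URL for an outside guest, such as a vendor."""
    guest_id = secrets.token_urlsafe(8)
    query = urlencode({"guest": guest_id, "sig": sign(conversation_id, guest_id)})
    return f"/conversations/{conversation_id}?{query}"


def can_join(url: str) -> bool:
    """Check that the token in the URL was issued for the conversation in its path."""
    parsed = urlparse(url)
    conversation_id = parsed.path.rsplit("/", 1)[-1]
    query = parse_qs(parsed.query)
    guest_id = query.get("guest", [""])[0]
    sig = query.get("sig", [""])[0]
    expected = sign(conversation_id, guest_id)
    return hmac.compare_digest(expected.encode(), sig.encode())
```

Because the conversation identifier is part of the signed message, pasting the same token onto a different conversation's URL fails the check, so the guest sees only the one conversation.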
29:41
He'll have a token. He can come straight to that page. He can join that conversation and see what's discussed there without my exposing any of my other conversations to him. Of course, we integrate with external services. Like I mentioned, we can push in data from GitHub or whatever. We can also record these incidents and learn from them later. So once we record it, we can actually tag back to our issue tracking system
30:03
with a link to this conversation. So if something later breaks in a similar way and someone does a search and finds it, they can come back to this original conversation and understand how this group solved it. Okay, my third example: medical records, patient services. So imagine
30:22
you have this very simplistic website. You've been to the ophthalmologist, and you want to see the advice the ophthalmologist gave you. Actually, excuse me, this is a phone call. So you call the ophthalmologist, you talk to him, he gives you some advice. The call was recorded. That recording is available for you as the patient to download from this site.
30:41
The transcript of that recording is there as well. And the doctor is actually able to go back and do annotations. The last time I talked to the doctor, he used some words that I thought I knew how to spell, and I was wrong. In this case, he would actually write them in and I'd be able to find out what all of those things are.
31:01
In addition, there's another button here. If I've got a problem with my bill, or if I need to talk to another doctor about a non-emergency problem, I can click the button right here and immediately be connected with someone who already knows who I am and who has access to whatever information I was looking at when I initiated the call, whether that's the bill or medical information.
31:21
And I don't need to keep track of his phone number, I don't need to keep track of any security information. I think the identity part of this in particular is interesting, because if you call your bank, they ask you for the same three pieces of information: your name, your account number, the last four digits of your social. I mean, anybody
31:40
can fake that, right? But if I have secure communication through this website, and I log in with my password, and they have two-factor authentication, that strong authentication is carried through to my voice conversation. So when I click that button and whoever takes my call takes my call, they know who they're talking to, much more so than if somebody had just memorized my Social Security information.
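Carrying the web login through to the call could look something like the sketch below: the call is started from an already authenticated session, so the person answering receives the caller's identity and the page they were on. The `WebSession` and `CallContext` types, the factor names, and the "strong"/"basic" labels are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class WebSession:
    """Hypothetical authenticated session in the web app."""
    user_id: str
    factors: FrozenSet[str]   # e.g. {"password", "totp"}
    current_page: str         # what the user was looking at


@dataclass(frozen=True)
class CallContext:
    """What the person answering the call gets to see."""
    caller_id: str
    assurance: str
    viewing: str


def start_call(session: WebSession) -> CallContext:
    """Derive the call's identity from the web session instead of asking again."""
    # Assumption: two or more factors counts as strong authentication.
    assurance = "strong" if len(session.factors) >= 2 else "basic"
    return CallContext(session.user_id, assurance, session.current_page)
```

The agent never has to ask the caller for a name or account number, because the context arrives with the call.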
32:04
So in the medical advice use case, secure caller authentication is, I think, one of the big deals here. You reuse the primary authentication from the web app. You can maybe even do voice biometrics to make sure that the person who's calling sounds like the person they claim to be. You can even cross-check against location, which is not something you can really do over the phone.
32:25
And then you can automate the claims. You've got the recordings, the transcripts, and the bill, available both to the patient and to the claims processing people. Any of the medical advice that's given, so that long string of things I should do three times a day that you gave me and that I didn't write down because I was too busy listening, all goes into the file. I can go back
32:44
afterward and read it. It also gives you an easier way to do service quality assurance. Okay, I hope I haven't put you to sleep. I have a demo. So this WebRTC thing is pretty cool. And I thought that if I could show you,
33:01
it would be cooler. We have, as you can see, I've got
33:22
Firefox running. I built this really simple little Sinatra app. And all it's done so far is connect to this page. So this is an example of WebRTC requesting permission for the camera. You may have heard me mention earlier that sites will remember this preference. The standard has settled
33:43
upon the idea that if a site is not using HTTPS, then it forces the request every time. I'm running a little Sinatra app here and there's no certificate, so I have to accept it. If this were a site that had HTTPS, and everyone should be using it, then it would remember. So this is Chrome. So I've got
34:06
I'm sorry, this is Firefox. I've got Chrome over here. You can't see. I'm going to bring up the launcher.
34:23
And now the rocket launcher has a camera on it. Now what we have is, I'll show you Chrome. Chrome is right here. You can kind of see the video coming from
34:42
Chrome. And it's being transmitted to Firefox. Now, both of these browsers are running on localhost. So the traffic is actually only going through the laptop, even though the server is elsewhere. The other thing I've set up: Google has this really cool speech API. And they
35:02
expose it. There's a JavaScript library called annyang. And annyang is constantly listening. It's listening right now. If I do it right... Now granted, this is a demo, so it's probably going to blow up on me. But if I say the magic word, it should actually activate the launcher. Then I should be able to talk to the launcher to steer it and fire it.
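The control flow described here (listen constantly, ignore everything until an activation phrase is heard, then map spoken phrases to launcher actions) can be sketched independently of the browser. This is not the demo's code, which used a JavaScript library in Chrome. The phrase list and the choice of "weapons free" as the activation phrase are assumptions based on the commands spoken in the demo.

```python
from typing import Optional

# Assumption: the activation phrase and command set are illustrative only.
ACTIVATION_PHRASE = "weapons free"
COMMANDS = {
    "move left": "left",
    "move right": "right",
    "move up": "up",
    "move down": "down",
    "fire": "fire",
}


class VoiceControl:
    """Turn recognized phrases into launcher actions once activated."""

    def __init__(self) -> None:
        self.armed = False

    def hear(self, phrase: str) -> Optional[str]:
        """Return an action name, or None if the phrase should be ignored."""
        normalized = phrase.strip().lower().rstrip(".!")
        if not self.armed:
            # Until the activation phrase is heard, nothing moves.
            if normalized == ACTIVATION_PHRASE:
                self.armed = True
            return None
        return COMMANDS.get(normalized)
```

Each returned action would then be sent to the launcher with an ordinary HTTP request.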
35:23
The listening rounds. Weapons free. Move left. Move left. Move left.
35:45
I should have made it go further. How else was that? This is a dangerous demo.
36:06
Move up. Move down.
36:33
The way the communication works here is that the video is using WebRTC. The audio is actually using a different browser extension.
36:42
A different JavaScript API. That is Chrome only. The particular recognizer I'm using right now is Chrome only. But speech recognition can be done client side or server side. If you're using WebRTC you can very easily call into something like FreeSWITCH or Asterisk and do your recognition there. And then the last piece of it, which is just to move
37:01
the launcher with the rockets, is really nothing more than a curl. Or in my case, just a jQuery API call. So that's really it. There's nothing magic to it. So that's pretty fun. So with that, that was my presentation on WebRTC. I have a few
37:28
resources for you if you'd like to know more about WebRTC. The first link in particular is great. It's what I used primarily when I was setting up my demo. It's the official set of samples from
37:41
it's the official set of sample code from WebRTC.org. It's on GitHub. The demos are live. So if you go to this page, you'll be able to flip through and actually start the video. And each demo includes a link to the source on GitHub. Then WebRTC.org itself is sort of the central point for
38:00
WebRTC resources. I also want to point out an initiative run by a friend of mine, Lantre, who's actually based here in Atlanta, called WebRTC Challenge. His goal is to get 1 million developers using WebRTC by 2020. It's a pretty audacious goal. I think he can do it. He's got some pretty interesting content on the mailing list. Highly recommended. Check that out. A couple of things I want to mention as well, this being
38:23
RailsConf, if you're interested in doing more voice with Ruby, definitely check out Adhearsion, the Rails-like framework for voice. I'll also point out RubySpeech. If you do get into some of the more interesting speech recognition scenarios, RubySpeech is a library for generating the markup needed for driving synthesizers and recognizers.
38:43
It makes that whole process a lot easier. The last bit is my contact information. I'm Ben Klang. That's bklang on Twitter and GitHub. And, of course, if you're new to me. But with that, if you have any questions, I would love to answer them. Yes? Yes.
39:09
That's a great question. So the question is, in the example with medical recording, if WebRTC does peer-to-peer encryption, how would the call be recorded?
39:20
So, the answer to that is that WebRTC can be peer-to-peer, but it doesn't have to be peer-to-peer. I kind of hinted at this earlier when I said that if the architecture is built to enforce that, you can ensure it's peer-to-peer. Something I didn't explain is that if you have a media server in your network,
39:40
something like FreeSWITCH or Asterisk, it will participate as a WebRTC endpoint. So instead of going direct from browser to browser, it's browser to FreeSWITCH, FreeSWITCH to the browser. And in that case, because you're decoding all of the media, you can record it. Yes? How do you handle the use case where the
40:01
person on the other end stepped away from his computer, is not at his computer, whatever? Do you have options to route over plain telephone, or just take a voice message, or what are your options for that? So the question is, if a user steps away from the computer, how do you deal with that? And it sounds like, I guess, if they step away before
40:21
a conversation happens, or during a conversation? Yeah, maybe they're selling something on some site, and they aren't at the computer, but the consumer wants to reach them. Okay, so it's like a contact center scenario where the agent walks away from the desk. Yeah. Okay, so in that case,
40:42
I would probably make sure that the call that comes in gets routed through something that can make more intelligent decisions. So it might try WebRTC first, and then after five seconds of no answer, either try somebody else, or try their cell phone. Asterisk and FreeSWITCH are both open source telephony engines.
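The routing decision described here (try WebRTC first, give it a few seconds, then fall back to another person, a plain telephone call, or a voice message) boils down to an ordered list of targets. The sketch below only models that decision; in practice it would live in the dial plan of the media server. The `Target` type, the target names, and the `ring` callback are assumptions, with `ring` standing in for whatever actually places the call and reports whether it was answered within `ring_seconds`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Target:
    """One place to try reaching the person (hypothetical shape)."""
    name: str             # e.g. "webrtc:agent", "pstn:agent", "voicemail"
    ring_seconds: float   # how long to wait before moving on


def route_call(targets: List[Target],
               ring: Callable[[Target], bool]) -> Optional[str]:
    """Try each target in order and return the first one that answers."""
    for target in targets:
        # `ring` is assumed to block for up to target.ring_seconds.
        if ring(target):
            return target.name
    return None
```

Putting WebRTC first with a five-second limit and the telephone network second gives the fallback behaviour described in the answer.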
41:01
They both support WebRTC. If you take that call from WebRTC on the client side, once you get it into either of those, it can be converted to more WebRTC, it can be converted to standard SIP, it can go to voice over IP, or even to regular telephone network calls. So the same kind of rules that apply to contact center scenarios would apply in that case as well. If you really want to get esoteric,
41:24
there are some motion detection libraries for JavaScript, if you wanted to detect presence by whether they're moving on camera. That's going pretty far; I'm not sure I'd recommend that. But basically anything you would do to detect session activity, timers, anything you do
41:40
to detect presence on one end can be used as input to some other decision. Does that answer your question? Any other questions? So are you asking about the IP address?
42:09
You're asking if I can get the person's IP address or, I guess, location, based on the negotiation period. You could. So in that particular case, what you would have to do is
42:24
you have to control the negotiations in two years, such that you script out the person and then apply all the information. You have to work with the partner server side to protect the event. It could be done, but you probably wouldn't want to worry about the problem. If you're, depending on text editor, you're using it, and how strict you're
42:45
Alright, thanks a lot. Thank you very much.