Integrating Voice through Adhearsion
Formal Metadata
Number of Parts | 90
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/40300 (DOI)
FOSDEM 2013, 41 / 90
Transcript: English (auto-generated)
00:00
can't work with the database. But if you're looking at doing complex billing and call rejecting or routing based on something that is in the database, the code adds up. So at some point it's better to just write a query in ActiveRecord and call it a day. How does Adhearsion talk to a platform? First of all, AHN is the official abbreviation
00:21
because the name is too long. Asterisk is reached through AsyncAGI. It used to be AGI and AMI in version 1. Adhearsion 2 only uses AsyncAGI, which is basically AGI over AMI. FreeSWITCH uses the inbound event socket. So in both cases, and also
00:40
in the Prism case, which is Rayo, it's Adhearsion that will connect to the platform. So we're going in, not... So Adhearsion is a client of the server. So how does an Adhearsion application work? Aside from configuration, we have call controllers and routing. Routing decides which controller is invoked, based on some conditions,
01:02
usually a phone number, could be time of the day, could be any other. It's similar to the Rails routing. Then the Adhearsion command provides what you need to directly create and run an application. This is only a 30 minute presentation, so I won't be going through the real basics because they're well documented. I would like to show you what you can do and then you can
01:22
dig in. So call controllers hold the application logic. The run method is the entry point, so it is what's used to run the controller. And this is different from, say, a web controller, where every method will be an action. Here, the run method is what is invoked for that controller. As you can see,
01:40
this is a very simple example. This controller answers the call, because if you don't answer, you're going to be using early media by default, so it's built in. It plays something, whistles, and then hangs up. Also, hangup has to be explicit, because otherwise the controller will run until the end of the code.
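The controller just described can be sketched roughly as follows. This is a minimal sketch, not the code from the slides: `FakeCallController` is a hypothetical stand-in for the framework's controller base class so the example runs without the gem, and the sound file name is made up.

```ruby
# Hypothetical stand-in for the framework's controller base class, so the
# sketch runs without any gem. It only records the commands issued, in order.
class FakeCallController
  attr_reader :log

  def initialize
    @log = []
  end

  def answer
    @log << [:answer]
  end

  def play(sound)
    @log << [:play, sound]
  end

  def hangup
    @log << [:hangup]
  end
end

# Controller in the shape described in the talk: run is the single entry point.
class WhistleController < FakeCallController
  def run
    answer              # without answering, the call would stay in early media
    play 'whistle.wav'  # hypothetical sound file name
    hangup              # explicit, although here the call would end anyway
  end
end

controller = WhistleController.new
controller.run
p controller.log
```

Unlike a web controller, nothing other than `run` is called from outside; every other method is a helper.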
02:01
Well, in that case, that hangup is redundant, as it will hang up anyway because the call is over, but it might not be the case. So we can handle playback, menus, user input, and recording in a transparent fashion, very simply. We have a very good configuration system, which allows you to
02:21
have configuration that is also self-documented, and can automatically import values from environment variables. So you can just run your production with something setting the ENV variables, and that's all you need. There's a good plugin architecture that we'll use later, and routing is very good
02:41
and pretty complex, but also simple to use if you know what you're doing. Testability. All Adhearsion applications are fully testable using RSpec, and, if you're so inclined, Cucumber, using the helpers we developed. And also, in general, controllers are just Ruby classes, so you don't need
03:00
anything special unless you're doing functional testing. If you mock at the correct level, tests will just work. So, sounds great, huh? But extensions.conf does it all. No, it doesn't. Stop using extensions.conf for applications. So, demo time. So this is a little longer as a controller. I like putting
03:21
code in my slides, because it's all in one place, but there's also a GitHub repo with everything you see now. So this controller basically will present you a menu, will ask you if you want to test input or record a message. If you press 1, notice we have timeout, invalid, and failure support, so
03:42
timeout is when the user doesn't press anything, or, in case of multi-digit input, just presses one digit and then stands there. Invalid will tell the user that this input does not match one of the possibilities, and failure, after a set number of tries, three tries in this case, will tell the user, I couldn't get your input. So this is a
04:04
pretty complex state machine, which is usually used for menus, reduced to a simple and usable DSL. And depending on what you press here, it will go either to enter number, which will ask the user for a number using a sound and then play it back, or to record a message.
04:22
Recordings are saved in a file:// URI format, so to play one back in Asterisk we actually need to clean it up: remove file:// and remove .wav, because, as you know, Asterisk plays files without the extension. So, let's get to the demo, which involves throwing out a...
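The clean-up step for the recording path could look like this. It is a minimal sketch assuming the recording location arrives as a `file://` URI ending in `.wav`; the helper name and the example path are hypothetical.

```ruby
# Turn a recording URI into the form Asterisk expects for playback:
# no file:// scheme and no file extension.
def asterisk_playback_path(recording_uri)
  recording_uri.sub(%r{\Afile://}, '').sub(/\.wav\z/, '')
end

# Hypothetical recording location, for illustration only.
puts asterisk_playback_path('file:///var/recordings/message-42.wav')
```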
04:45
So I have an Adhearsion application running, and an Asterisk running on my machine. I will just dial... "Press 1 to test input. Press 2 to leave a message." "Please enter a number."
05:02
I'm probably missing the sound file, because it should say you pressed 3. Recording is more of the same: just ask it for a recording, and then it will save everything in a file. This is the Adhearsion console. This is pretty powerful. I set it to debug to show you that everything that comes in and out of Asterisk
05:21
is translated to an internal Adhearsion event, so we can handle any event that happens in a transparent manner. We usually run at a little less verbose state, because otherwise it's very difficult to go on working. So, this was probably not a very interesting demo, but at the same time it shows what you can do.
05:42
So, events. Adhearsion allows you to trace and react to events. So, you can send commands, but you can also do stuff based on what the server is telling you. You can use guarded handlers. Guarded handlers is a library we use to define event handlers. All classes, anything in an Adhearsion app, can set an event handler that will be asynchronously called
06:06
when that event pops up and executes the block accordingly. You can also just write an events block like in the old Adhearsion 1, which is the way I prefer to handle recurring events.
06:21
So, this code goes in a separate file and will just handle all events from Punchblock. What's Punchblock? Punchblock is just the underlying library for Adhearsion that has the task of getting the events into the system. So, punchblock in this context means all events.
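The idea of handlers that only fire when a guard matches can be illustrated with a small registry. This is a simplified sketch, not the handler library used by the framework: `EventBus`, the event hashes, and the peer name are hypothetical, and handlers run synchronously here for brevity whereas the talk describes them as asynchronous.

```ruby
# Simplified event registry with optional guards (illustrative only).
class EventBus
  def initialize
    @handlers = Hash.new { |hash, type| hash[type] = [] }
  end

  # guard is an optional predicate; the block runs only when it returns true.
  def register(type, guard = nil, &block)
    @handlers[type] << [guard, block]
  end

  def trigger(type, event)
    @handlers[type].each do |guard, block|
      block.call(event) if guard.nil? || guard.call(event)
    end
  end
end

bus = EventBus.new
seen = []

# Catch-all handler: sees every event of this type.
bus.register(:punchblock) { |event| seen << event[:name] }

# Guarded handler: only reacts to peer status changes.
peer_status = ->(event) { event[:name] == 'PeerStatus' }
bus.register(:punchblock, peer_status) { |event| seen << "peer:#{event[:peer]}" }

bus.trigger(:punchblock, { name: 'PeerStatus', peer: 'SIP/100' })
bus.trigger(:punchblock, { name: 'Hangup' })
p seen
```

A dashboard like the one mentioned in the talk would replace the `seen << ...` bodies with code that forwards the event to connected browsers.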
06:40
All events will be published on a Faraday connection. So, what you do is distribute events to browsers with WebSockets, for example. You could be doing a simple dashboard that shows which of your SIP peers are online at any given moment, what they're doing, if they are on a call, if they are waiting in a queue, or whatever, any other events you want to handle.
07:05
So, this brings up the second demo, which is events. The fastest way to show you this is just to turn off Blink for a moment.
07:28
As you can see, I turned off Blink and I got the unregistered event in my browser. Obviously, in this scenario, I'm just pushing whatever the inspect for the event is to the browser.
07:45
You could be doing any kind of logic with this. I'm back to being registered and I got pushed up to fame. So, the ability to interoperate with any different system is, I think, what sets Adhearsion apart from other solutions.
08:01
And we will visit some pretty cool stuff with it. Plug-ins. So, this is another plug-in. For example, there's a lot of plug-ins for Adhearsion.
08:24
Most of them involve some functionality we took out from core. For example, there's an Adhearsion Asterisk plug-in that gives you execute and bgexecute and all the other methods that are not in core any longer because we rely on Rayo, which is a cross-platform protocol. Virginia is a plug-in I built that embeds a Reel app in your application.
08:45
What's a Reel app? It's simply a simple web server. So, you can do this. Adhearsion will pop up the server on the port you tell it to. And, has anybody else used Sinatra here?
09:02
Anyway, Sinatra is a simple DSL that allows you to define very short and simple web applications. What we're saying here is react to a GET on the dial URL. So, it will be localhost this and that, slash dial. By doing this, what we're doing is an originate, and many people know what that does.
09:23
So, we're going to call the number. What's cool is that, when the call is connected, we can have it drop into one of our controllers. So, we can, say, ask people if they want to receive a call. Our customers use this to do robodialing, which we'll see later. So, basically, they dial out to people and ask them if they're interested in a commercial offer by using TTS.
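The flow described here, a web request that triggers an outbound call which then drops into a controller, can be sketched without a web framework. Everything in this sketch is a hypothetical stand-in: `FakeDialer` replaces the telephony side, `handle_request` replaces the Sinatra route, and `OfferController` is an empty placeholder for the controller the answered call would run.

```ruby
# Hypothetical stand-in for the telephony side: it only records requests.
class FakeDialer
  attr_reader :requests

  def initialize
    @requests = []
  end

  # Queue an outbound call; once answered it would be handed to `controller`.
  def originate(destination, controller:)
    @requests << { to: destination, controller: controller }
    :queued
  end
end

# Placeholder for the controller that asks whether the callee wants the call.
class OfferController; end

# Stand-in for the web route: a GET on /dial starts the outbound call.
def handle_request(verb, path, dialer)
  if verb == 'GET' && path == '/dial'
    dialer.originate('SIP/100', controller: OfferController)
    [200, 'dialing']
  else
    [404, 'not found']
  end
end

dialer = FakeDialer.new
p handle_request('GET', '/dial', dialer)
p dialer.requests
```

The point of the design is that the web layer only names the destination and the controller; all call logic stays in the controller class.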
09:47
If they're interested, they drop into a controller that takes them to a human sales rep. So, they can have the minimum number of sales reps and still get a lot of sales. Experience proved that, whether it's TTS or a human, people who are not interested will hang up anyway.
10:08
So, using TTS allows them to have very few people handling calls. Second demo time. So, what we're doing here is, this code is all live. I brought this up on my machine.
10:21
And what happens is, simply, when I click this button, the server will dial me, 100, and ask me if I want to take the call. If I want to take the call, I will be connected to the other number, which is 200, just by clicking the button on the browser.
10:51
So, now 100 has been dialed, and it's been connected to 200, which is the other user, and now they're talking. Hello. Hi.
11:02
Would like to buy a used car. So, that's how it works. You sold a used car. I think you're sold now, right? You want to see more, right?
11:21
So, I was thinking of, I will actually cap off my presentation by going through some of what our customers do, as Mojo Lingo. Which is not to do any publicity to anyone. It's just because real-life use cases are what people use to decide if they're interested in the framework or not. No one wants a product that is not used, because it means it will not be supported, or probably is not good enough.
11:44
So, let's start with my personal flagship project, because I built the thing. RingPlus is an MVNO and MVNE, which means they are a mobile virtual network operator, or enabler. It's simply a cell phone company that doesn't have its own network, but rents the radio space from a larger carrier.
12:07
And they built everything to be able to also sell this service as a package to other people interested in becoming their own phone provider. Their goal is also to get into companies and have them have their own phone provider to get costs lower.
12:24
They do over-the-top applications, so they do music ringtones, billing, recharging, all that stuff. And they also have a very interesting functionality that is in-call apps. So, while you're speaking to someone else, and say the person is telling you how to get to his house,
12:43
I always forget those directions, and usually you're in the car, so you do not have anything to write them down. Just press 5 and, without breaking the call, we'll take you to a menu that you can configure, where, say, you have 1 as record this call. You press 1, and an announcer tells people in the call that they're being recorded for the time being.
13:03
Your friend tells you where he lives, you press 1 again, recording stops, and you find it on your smartphone in the companion application for RingPlus. So you can listen to it again, and that's just one of the uses. More exotic uses involve automated translation between two parties that don't speak the same language.
13:22
We've been experimenting with that. Obviously, the barrier there is the quality of ASR and translation services. But it's still interesting as a proof of concept. Adhearsion here powers all call logic, billing, and features. Sprint, which is the underlying provider, just gives us a SIP pipe that ties into our servers.
13:42
We handle everything else, from routing to billing to all the apps and features. Ifbyphone. Ifbyphone is basically an ad support company. What they do is when you place an ad, they give you different phone numbers for every medium you're using.
14:01
So you have a phone number for newspaper, a phone number for web, and a phone number for SMS marketing. And they basically use this to measure which channels are better for your business. So all numbers go ultimately to the same menu or IVR. And they also do a hosted IVR for the company. So say you want to do phone marketing, but you don't have a structure for that,
14:24
they will provide you with a hosted IVR that leads people through the choices you decide, and then just to a sales rep if you want to. And this also does ad tracking, as said, and link tracking. So after you get called, they provide you with a platform that will have all the data for the call,
14:42
how long it was, the phone number, time of the day, has the person called the system before or not. All of this is powered by Adhearsion. This is actually a very complex solution, one application we are in the process of porting to Adhearsion 2. But we can still do that.
15:01
So, PHRG. These are pretty large companies, only not known in Europe, especially PHRG, because PHRG is the largest American provider of renovation. You know that American houses are usually built out of panels, and not much in the way of bricks and concrete. So basically what they sell is renovating your house by replacing entire sections of it,
15:23
whether it's windows, an entire wall, or doors. So they're not related to IT directly. But at the same time, these are our largest customers per call volume. So what happens is that they do that robo-dialing thing I explained earlier. They have a system that has prospects in it.
15:41
The system will go pick up the prospects and dial them, an automated voice will ask them if they would like to speak to a sales representative, and if they accept, they get placed in the queue and speak to a live rep. Another thing they do is appointment confirmation. So you are interested, you tell them you are.
16:02
They tell you the sales rep will come to your house at a certain time. You will get a call the day before, or whatever you set it to, that tells you to be at home tomorrow at 10 AM, because the sales rep from PHRG is coming to visit you. And all this data, including everything in real time, so when a sales rep goes to the appointment,
16:22
he can then report it in. And all the accepted appointments, rejected appointments, the sales rep performance, everything is pushed to a CRM platform that has a mobile version. What is impressive here is the number of calls they ran last year,
16:41
which is over 5 million calls in 2012. And they are looking at doubling the figure this year to 10 million. So that's one call every five seconds. It's not 50 CPS we were debating last night, but it's still an impressive amount of calls, especially because they are all live and all of them go through databases and web services.
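As a rough check of these call-rate figures, the average spacing between calls can be computed directly. This sketch assumes calls are spread evenly over a full 365-day year, which real traffic is not, so it gives an average rather than a peak rate.

```ruby
SECONDS_PER_YEAR = 365 * 24 * 60 * 60 # 31,536,000 seconds

# Average number of seconds between calls for a given yearly call count.
def average_interval(calls_per_year)
  SECONDS_PER_YEAR.to_f / calls_per_year
end

puts average_interval(5_000_000).round(2)   # 2012 volume
puts average_interval(10_000_000).round(2)  # volume targeted for this year
```

Under that assumption, 5 million calls average one call roughly every 6.3 seconds and 10 million roughly every 3.2 seconds, so the quoted "every five seconds" sits between the two.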
17:03
So this is an Italian service I built, and what they do is a prepaid translation service for travelers. So you buy a phone card that has an international number in it, and also a few others for countries that don't support that international number. You can call that number,
17:22
and they will get you in contact with someone that speaks the language you asked for. What happens here is that actually it's sort of a very large distributed call center, so the translators are not physically in an office or somewhere, but they are at home sitting at their computers,
17:42
so they have access to a softphone, or on their cell phones where they can use an app with envelope, or even just their cell phone as in getting called on a cell phone. Here, Adhearsion is used for billing and credit management: those prepaid cards have a set number of minutes that gets decreased,
18:01
we whisper into the channels and tell the people that credit is low, and all that kind of interesting functionality. And the core of the application is doing the translator search with AMD detection and adaptive selection. These are fancy terms to say: we basically say we want someone that speaks Chinese,
18:21
we will call the first in the list, and you know that many cell phone companies actually have some sort of answering machine, if your phone is actually turned off. So we need to detect that, because we want the person to speak to a live rep, not to an answering machine obviously. So we need to detect that.
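The sequential search with answering machine detection can be sketched as a loop over candidates. This is a minimal sketch under stated assumptions: `find_live_translator` is a hypothetical helper, the dial attempt is supplied by the caller as a block, and the block is assumed to return `:human`, `:machine`, or `:no_answer`; it is not the detection logic of any telephony platform.

```ruby
# Try candidates in order and return the first one answered by a person.
# The block performs the dial attempt and reports what picked up.
def find_live_translator(candidates)
  candidates.each do |translator|
    outcome = yield(translator)
    return translator if outcome == :human
    # :machine or :no_answer: move on to the next candidate
  end
  nil # nobody available
end

# Simulated dial outcomes, for illustration only.
outcomes = { 'anna' => :machine, 'bo' => :no_answer, 'chen' => :human }
attempts = []

winner = find_live_translator(%w[anna bo chen dee]) do |name|
  attempts << name
  outcomes.fetch(name, :no_answer)
end

p winner
p attempts
```

The search stops at the first live answer, so translators further down the list are never disturbed.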
18:41
And we also do adaptive selection, so people that call the service for the same language at subsequent times tend to be connected with the same rep if possible, which is important because if you had an issue while you were on vacation, in a place where you don't speak the language, and you call again,
19:00
you probably want to speak with the person that handled the situation the first time. This service is actually going quite well, they have over 30,000 customers. So, what are we going to do in the future? This is Adhearsion now. All the apps you saw are in production, live, with thousands of users.
19:22
So, I will say, we've been tackling scaling pretty well, but there's still a lot to do. So, WebRTC. Everybody loves WebRTC, right? Yay! So, Adhearsion can obviously perform all third-party call control, since Adhearsion does not sit either in the signaling path or in the media path.
19:41
So we don't really care about WebRTC in the sense that we can handle WebRTC applications in the same way we handle normal applications. So, with WebRTC and SIP and other, what we built on other projects I cannot mention by name, we built multiple device tracking and switching. So what happens is that when you call,
20:00
say you're on your cell phone, you get to your computer, you want to sit down and not hold the phone to your ear, you can just press a button and send the call to your computer. You get up, you go in the living room, you move the call to your smart TV or your iPad or whatever, without any loss of conversation or signal.
20:21
Next thing we're going to be doing is protocol convergence. So we're going to bring in XMPP, it's already a first-class citizen in Adhearsion. We're going to integrate messaging and presence in the framework, so you can get easy events in the same format you get them from the telephony platforms. All of these are already implemented in upcoming products,
20:40
which I cannot mention by name, but you'll recognize them when you see them. So what's going to happen? Adhearsion 3.0 is currently at the planning stage, and we're deciding what we want to do. What we decided on is we want a unified communication platform.
21:00
So all XMPP events and messages will be handled by Adhearsion in a unified format, so we can have messaging controllers in addition to voice controllers. Our idea is treating messaging as another kind of media, like SIP does. So everything is unified.
21:21
We need better ASR support. Currently our ASR support is limited to what the platforms can do. We would like to detach from platform specifics of every ASR engine and build some abstract layer you can use. So you can have ASR even if you don't know how it works better. And we're revamping the component structure,
21:41
which is maybe something that is not that interesting at this level, but the components are quite complex at the moment. Some of them need to be broken up in parts. Luca, we actually are at 9.30. Oh. Well, come to AdhearsionConf 2013. Thank you.
22:04
Does anybody have any questions for Luca? Is there an automatic speech recognition over here? Then it will be pointing at an angle. I'm going to put down the thing. Okay. Great. Thank you.
22:20
I don't know if it comes down or not. I think it's kind of bolted in. Let me try. Maybe the top part will roll.