Building an embedded VoIP network for video intercom systems

Formal Metadata

Title
Building an embedded VoIP network for video intercom systems
Subtitle
How to leverage open standards to bring voice and video capabilities to IP hardware intercom solutions
Title of Series
Number of Parts
490
Author
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
IP video intercom systems combined with smartphones can leverage regular RTP/SIP VoIP technology to offer a new set of services to end-users: getting a notification when visitors press the doorbell, seeing them on video before answering the call, interacting with them via voice and video and deciding to open the door, at home or anywhere else via Wi-Fi or 3G coverage. Linphone (a SIP user-agent) and Flexisip (a SIP proxy server) can be integrated into IP video door phones, in-house panels and video surveillance devices to build a complete VoIP network. Linphone and Flexisip use open standards to reliably send the audio and video streams captured from IP video intercoms to in-house devices, including smartphones and tablets, connected either to a local network or to the public internet. These open source SIP-based software solutions run well on small hardware devices with a reduced footprint, and can easily be integrated into embedded GNU/Linux systems thanks to their Yocto packages. This lecture will describe how Linphone and Flexisip can be used together to build an embedded SIP network dedicated to home automation or video surveillance. The network architecture used in these contexts can also be deployed in other areas, such as the emergency services or the Internet of Things. Linphone and Flexisip can be integrated into IP video intercom systems to make the audio and video capabilities of a door entry panel accessible by in-house control screens and smartphones, connected either to a local network or to the public internet. Indeed, the Linphone software fits well in embedded systems, which makes it a good candidate for use in home automation devices, such as outdoor panels or indoor monitors, where video is to be captured or displayed.
However, a SIP user-agent by itself is not sufficient for setting up a fully functional SIP network: we propose the use of Flexisip, which is able to run with a reduced footprint on embedded devices as well as in a large-scale cloud deployment, to fork incoming calls to in-house monitoring panels, smartphones or tablets. When used together, Linphone and Flexisip offer advanced features for IP door phones and video monitoring systems, such as:
- HD video and HD voice (with support for H.264 and H.265 hardware-accelerated codecs, and the Opus codec)
- Call forking with early media video
- ICE, STUN and TURN support for optimised NAT traversal, allowing peer-to-peer audio and video connections whenever possible
- Secure user authentication with TLS client certificates
- Interconnection with push notification systems, for reliably notifying users when someone rings at the door
Transcript: English (auto-generated)
Good morning everybody, my name is Jean Monnier. I have been a software engineer since 1999, and I have been mainly involved in the Linphone project since 2010. In a few words, the Linphone project
is voice over IP software first developed for Linux in the very early 2000s. What I'm going to talk about is how to use voice over IP technology to build
an intercom system for a door entry. The presentation will first introduce the use case, so what it is about when you want to build this kind of system, then the voice over IP technology that we suggest using, then how to build a very
simple voice over IP network using a Raspberry Pi as an example, with the Linphone software, and finally what the next steps would be.
So the simple use case is like this: you have an intercom doorbell in front of your building and you want someone to open the door. In the past, the very simple solution was to have a camera at the front of the building
and to use a simple coax cable to link the camera to the home screen. That's pretty cool, but now suppose you want to have multiple displays at home, like the
home screen but also, why not, a mobile device or a PC. So we start to be in the multi-display use case. Okay, so same thing, I have the coax, but what if I want to bring the video to my mobile
device, for instance? It becomes much more complex. We can also imagine the same kind of use case where you want the video to be displayed at home but also in the street on your mobile device, if you want to know if something is happening
in front of your home. The good thing is that we can leverage digital infrastructure to do this kind of thing. So no more analog audio or video; we can use IP to carry audio and video packets
from the door entry panel to the home screen, to a mobile device, or even to a mobile device connected to the public internet. And when it's about IP, voice over IP is a good idea, and especially two protocols,
which are SIP and RTP, the Real-time Transport Protocol. So the voice over IP technology that I'm going to use for this presentation is based on
two main IETF standards. The first one is SIP, the Session Initiation Protocol, and the second one is RTP, the Real-time Transport Protocol.
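As a rough illustration of what these two standards look like on the wire, the sketch below builds a minimal SIP INVITE as plain text and packs the fixed 12-byte RTP header in front of a media payload. The addresses (`door@my.house`, `everyone@my.house`), the branch and tag values and the function names are placeholders chosen for the example, not something taken from the talk, and a real user agent such as Linphone also adds an SDP body, authentication and retransmission handling.

```python
import struct


def build_invite(caller: str, callee: str, call_id: str, local_host: str) -> str:
    """Return a minimal SIP INVITE as text (no SDP body, illustration only)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch=z9hG4bK-demo",  # placeholder branch
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=demo-tag",  # placeholder tag
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # SIP, like HTTP, separates header lines with CRLF and ends them with an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"


def build_rtp_packet(payload_type: int, sequence: int, timestamp: int,
                     ssrc: int, payload: bytes, marker: bool = False) -> bytes:
    """Prefix a media payload with the fixed 12-byte RTP header."""
    first_byte = 2 << 6  # version 2, no padding, no extension, no CSRC entries
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload
```

The signalling (SIP) is human-readable text, while the media (RTP) is a compact binary header followed by encoded audio or video, which is why the two are handled by different parts of a user agent.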
So SIP, in a few words: it's a text-based protocol inspired by HTTP, standardized in 2000. There are two main components in a SIP network, the SIP user agent and the SIP proxy, which is
about routing SIP messages between different clients. And RTP in short: basically the idea is to send audio and video in a packetized way
over the internet. Okay, so now we want to use IP, and we will use SIP and RTP. So what we need is a SIP user agent located in the door entry camera and a SIP user
agent in the home screen for the software. For the hardware, at the door entry camera level, what we need is to be able to capture audio and video, and for the home screen we just need to be able to display video
and capture audio, because most of the time there is no screen at the front door. For the software, what we propose is to use Linphone, which is a SIP
user agent with RTP capabilities, and for the home screen we can do exactly the same. Okay, so now we have the hardware, which can be a Raspberry Pi or any small hardware running
ARM or any other processor, a display and a cable, and on the software side we can use the Raspbian distribution of the Raspberry Pi with Linphone on both sides.
Okay, so at the door entry camera level, what we can use is linphone-daemon, which is a command-line interface for placing or receiving calls,
with the early media feature. I'm going to explain what it is about. Early media is a specific way of initiating calls. For a regular call, you wait for the call to be established before exchanging media
between the two endpoints. In the case of early media, the idea is to be able to send the video before the call is answered, because when someone is in front of the door, you want to be able to see
their video. It's a kind of preview, without necessarily accepting the call. So linphone-daemon is an open source project derived from Linphone.
It's available on our public Git repository, and also, something interesting for the embedded case, we provide some Yocto recipes to integrate linphone-daemon within a Yocto distribution.
Okay, so how will it work? Just a button connected to the GPIO. I just start linphone-daemon in the background with a Unix pipe, and with a couple of lines of Python it's pretty easy to handle the button, open the socket,
communicate with linphone-daemon and initiate a call. That's it. On the display side, it's almost the same.
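The "couple of lines of Python" described here might look roughly like the sketch below. Several details are assumptions rather than facts from the talk: the socket path `/tmp/linphone-daemon-pipe`, the `call <sip-uri>` command syntax, and the address `sip:everyone@my.house` should all be checked against the linphone-daemon version actually installed. The button is passed in as a callable so that the example does not depend on a particular GPIO library.

```python
import socket

# Assumption: path of the Unix socket on which the daemon accepts commands.
DAEMON_SOCKET = "/tmp/linphone-daemon-pipe"


def build_call_command(sip_uri: str) -> bytes:
    """Format the text command asking the daemon to place a call (assumed syntax)."""
    if not sip_uri.startswith(("sip:", "sips:")):
        raise ValueError("expected a SIP URI such as sip:user@domain")
    return f"call {sip_uri}\n".encode("ascii")


def send_command(command: bytes, socket_path: str = DAEMON_SOCKET) -> str:
    """Send one command over the Unix socket and return whatever text comes back."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(5.0)
        sock.connect(socket_path)
        sock.sendall(command)
        try:
            return sock.recv(4096).decode("utf-8", errors="replace")
        except socket.timeout:
            return ""  # no reply within the timeout; the caller decides what to do


def ring_on_press(wait_for_press, sip_uri: str, presses=None, send=send_command) -> None:
    """Place a call every time wait_for_press() returns.

    wait_for_press is any blocking callable, for example the wait method of a
    GPIO button object. presses limits the number of iterations (None = forever).
    """
    count = 0
    while presses is None or count < presses:
        wait_for_press()
        send(build_call_command(sip_uri))
        count += 1
```

On a Raspberry Pi, `wait_for_press` could for instance be `gpiozero.Button(17).wait_for_press`, where pin 17 is only a hypothetical wiring choice; the rest of the script stays the same whichever GPIO library is used.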
You can just start linphone-daemon in auto-answer mode and it will automatically display the video. So that's the very simple use case. Now, if you want something a little bit more complex, you want to be able to distribute
the video not only to the home screen but to the home screen and to another piece of equipment, which can be, for instance, a mobile application. You need to add another piece of equipment from the SIP world, which is the SIP proxy.
The purpose of the SIP proxy will be to route the call both to the SIP user agent which is in the home screen and to another user agent, which can be a mobile application, for
instance. So now we suggest using another piece of software, which is called the Flexisip proxy. As I said in the beginning, it's the job of SIP proxies to route calls, and this one comes with
a special feature, which is early media call forking, because, as I said at the beginning of the presentation, what we would like is to have the video coming from the door entry system to the home screen or to the mobile application before establishment
of the call. So we need a SIP proxy which is able to duplicate the audio stream in order for it to be played on both devices. It's what is called early media call forking:
the audio and the video go from the call initiator to all ringing devices. Okay, so Flexisip is a simple process running on Linux with a very small configuration file, so just a couple of sections to authenticate.
I need to set the domain of my house, and there is a smaller configuration file with the passwords in order to authenticate the users.
If we want to go deeper into the SIP protocol: I have the smartphone app, which is registered to the home screen SIP proxy, and the home screen panel, which is registered as well.
When the door entry camera wants to place a call, the INVITE goes to the SIP proxy, which forks the INVITE to both the smartphone application and the home screen. And the RTP with the video can be displayed on both devices at the same time.
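The registration and forking flow just described can be modelled in a few lines. This is a toy in-memory model meant only to show the idea, not how Flexisip is implemented: the class name, the contact addresses and the simplified early-media rule (media may flow once a 183 Session Progress or a 200 OK has been received) are all assumptions made for the example.

```python
from collections import defaultdict


class ForkingProxy:
    """Toy registrar plus forking logic, for illustration only."""

    def __init__(self):
        # address-of-record -> list of contact URIs currently registered for it
        self._bindings = defaultdict(list)

    def register(self, aor: str, contact: str) -> None:
        """Record that a device (contact) answers for the given address."""
        if contact not in self._bindings[aor]:
            self._bindings[aor].append(contact)

    def fork_invite(self, caller: str, aor: str) -> list:
        """Return one outgoing INVITE description per registered contact."""
        return [
            {"method": "INVITE", "from": caller, "to": aor, "request_uri": contact}
            for contact in self._bindings.get(aor, [])
        ]


def may_send_media(status_code: int) -> bool:
    """Simplified rule: 183 means early media, 200 means the call was answered."""
    return status_code in (183, 200)
```

With the home screen and the smartphone both registered for the same address, a single INVITE from the door panel produces two branches, and the caller can start sending video as soon as one branch answers with 183, before anyone picks up.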
Okay, so now, if we go a little bit deeper in the complexity: I want the video of my door entry system to be seen at home, but also, if I'm outside, it would be cool if I could see who is pressing the button in front of my house.
So now what we need is to cascade two SIP proxies, one at home and another which can be located anywhere on the internet. We can do it with Flexisip the same way.
We add a small configuration file which states where to fork the call on the public internet. So as you can see, we have a first SIP address, which is "everyone" at my house, which
is forked to the home screen and to bob at sip.linphone.org, which is a SIP address reachable from the public internet. Okay, so more or less it's the same.
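The static route described here can be pictured as a small lookup table in which each target is either handled by the home proxy or handed over to the proxy responsible for another domain. The sketch below is a model of that decision only; the domain `my.house`, the `homescreen` user name and the function names are assumptions for the example, and it does not reproduce Flexisip's actual configuration format.

```python
# Assumed static route: one address fans out to a local and a remote target.
STATIC_ROUTES = {
    "sip:everyone@my.house": [
        "sip:homescreen@my.house",
        "sip:bob@sip.linphone.org",
    ],
}
LOCAL_DOMAIN = "my.house"


def domain_of(uri: str) -> str:
    """Extract the host part of a SIP URI, ignoring user, port and parameters."""
    rest = uri.split(":", 1)[1] if uri.startswith(("sip:", "sips:")) else uri
    hostport = rest.rsplit("@", 1)[-1]
    return hostport.split(";", 1)[0].split(":", 1)[0].lower()


def route(aor: str, routes=STATIC_ROUTES, local_domain: str = LOCAL_DOMAIN) -> dict:
    """Split the targets of an address into local deliveries and forwards."""
    local, remote = [], []
    # An address without a static route is simply routed as itself.
    for target in routes.get(aor, [aor]):
        if domain_of(target) == local_domain:
            local.append(target)
        else:
            remote.append(target)
    return {"deliver_locally": local, "forward_to_internet": remote}
```

The home proxy keeps forking to whatever is registered locally, while anything outside its own domain is forwarded to the second proxy, which then applies the same logic to its own registrations.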
Just the end is different. The SIP proxy from the home screen forwards the INVITE to the public internet SIP proxy. And at the end, the video can be seen on the local network and on the device connected
to the public internet. Security considerations: it's very sensitive data which is coming from your apartment and from the doorbell, so make sure to use SIP over TLS and SRTP everywhere to secure the communication.
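For the signalling half of that advice, a client-side TLS setup in Python could be sketched as follows. It only covers SIP over TLS (conventionally on port 5061); SRTP for the media is negotiated separately by the user agent and is not shown. The function name and the optional client certificate parameters are assumptions for the example.

```python
import ssl


def make_sip_tls_context(ca_file=None, client_cert=None, client_key=None) -> ssl.SSLContext:
    """Build a TLS client context suitable for connecting to a SIP proxy."""
    # create_default_context enables certificate and hostname verification.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    if client_cert:
        # Optional: authenticate the device itself with a client certificate.
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx
```

A socket wrapped with `ctx.wrap_socket(sock, server_hostname=...)` will then refuse to talk to a proxy whose certificate does not match its name, which matters when the home proxy forwards calls to a proxy on the public internet.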
Also, even if you store passwords only on the local network, it's still better to hash them.
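One common way to avoid storing SIP passwords in clear text is to keep only the HA1 value used by SIP digest authentication, which is a hash of `username:realm:password`. The sketch below computes it; the user name, realm and password are placeholders, and whether a given proxy accepts pre-hashed credentials, and in which file format, is not covered by the talk and has to be checked in its documentation.

```python
import hashlib


def ha1(username: str, realm: str, password: str, algorithm: str = "md5") -> str:
    """Return the digest-authentication HA1 value as a hexadecimal string.

    MD5 is the classic digest algorithm; "sha256" can be passed instead when
    both ends support it.
    """
    data = f"{username}:{realm}:{password}".encode("utf-8")
    return hashlib.new(algorithm, data).hexdigest()
```

Because the realm is part of the hash, a leaked HA1 value is only usable against that one SIP domain, which is still a clear improvement over a reusable plain-text password.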
Okay, so that was the presentation of the VoIP network for the video intercom system. What next? We can also imagine doing the opposite, I mean calling the door entry panel from a smartphone.
That is also doable. It's also possible to add actions to the system. I mean, if you want to be able to open the door, you can, as an example, use DTMF
to drive a switch at the door entry panel level. Another interesting thing that you can use is mDNS, to be able to automatically discover
the door entry panel or the home screen without having to configure IP addresses. Another interesting topic that we could discuss is how to use push notifications to be able
to wake up the mobile application when someone is pressing the door entry button. This is something which is also available from the Flexisip proxy.
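Coming back to the door-opening idea mentioned a moment ago, one way DTMF digits travel during a call is as RTP "telephone-event" payloads (RFC 4733), four bytes each. The sketch below decodes such a payload and triggers a callback once a door code has been typed. It assumes the digits arrive this way rather than, say, in SIP INFO messages, and the class name, the code `1#` and the `open_door` callback are hypothetical.

```python
import struct

# Event codes 0-9 are the digits; 10 and 11 are * and #; 12-15 are A-D.
EVENT_TO_KEY = {i: str(i) for i in range(10)}
EVENT_TO_KEY.update({10: "*", 11: "#", 12: "A", 13: "B", 14: "C", 15: "D"})


def parse_telephone_event(payload: bytes) -> dict:
    """Decode the 4-byte telephone-event payload: event, E bit, volume, duration."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    return {
        "key": EVENT_TO_KEY.get(event),
        "end": bool(flags & 0x80),  # E bit: the key has been released
        "volume": flags & 0x3F,
        "duration": duration,
    }


class DoorCodeListener:
    """Call open_door() when the expected key sequence has been received."""

    def __init__(self, code: str, open_door):
        if not code:
            raise ValueError("code must not be empty")
        self.code = code
        self.open_door = open_door
        self._typed = ""
        self._last_timestamp = None

    def feed(self, payload: bytes, rtp_timestamp: int) -> None:
        event = parse_telephone_event(payload)
        if not event["end"] or event["key"] is None:
            return  # key still held down, or an event that is not a DTMF key
        if rtp_timestamp == self._last_timestamp:
            return  # end packets are retransmitted; count each key press once
        self._last_timestamp = rtp_timestamp
        self._typed = (self._typed + event["key"])[-len(self.code):]
        if self._typed == self.code:
            self._typed = ""
            self.open_door()
```

On the door panel, `open_door` would be whatever drives the lock relay, for example a GPIO output, so the same small board handles both the call and the switch.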
Another interesting point is interworking with existing door entry cameras. I'm pretty sure that in your building there is a door entry panel, and what we experience is that most of them use SIP as the protocol
to bring the audio and video from the door entry panel to a piece of equipment in the house. So if you want to use existing equipment and just change the display, it's something which is possible as long as they follow the SIP standard.
And the last thing: in my presentation I introduced linphone-daemon, which is a command-line tool, but if you want to have deeper control of the application,
you can use the library version of Linphone to do the same. It's available in C, C++ and many other languages, even Python. Okay, thank you. I think I'm three minutes ahead.
Thank you for your questions.
Okay, any questions? Thank you for the presentation.
So, is that a commercial project or is that your personal hobby project? In fact, there is no real project for this kind of application yet. I would say that I'm part of the Linphone team and there is a company backing the Linphone project,
which offers this kind of service, but there is no project dedicated to this kind of application.
It's more a usage of Linphone and Flexisip for this kind of application than a real project dedicated to door entry systems. Was that your question? The question was really, you were talking about having your proxy
and then SIP on the stationary tablet to display the incoming call. So, would you put the proxy on the same tablet? Oh, the proxy which is in the home? Yes. The idea is to use the same hardware for both the home screen and the SIP proxy.
Okay, so let's say you're streaming the video to the proxy and then it goes to the tablet itself and the mobile phone. How do you deal with video decoding?
Is that hardware supported on both phones and tablets? Can the proxy do recording, or how do you do this? Video decoding? Is video decoding done in software, or is it hardware decoding? The proxy is not decoding the video at all. It's the same RTP stream which is forked to both the home screen
and the mobile application. And the mobile application is supposed to be able to decode either H.264 or VP8, depending on the codec which is used. Most of the time it's H.264 because it's possible to leverage hardware acceleration.
So, we have implementations of this kind of codec available either on Raspberry Pi or on many other embedded hardware platforms. Okay, cool. Thank you. You're welcome.
So, how does Linphone compare to other VoIP software like XiVO and Elastix? The question is, is there any...
XiVO. XiVO, yes. It's an open source project built around Asterisk. I would say that we have no experience in this kind of scenario with XiVO, but Linphone works with XiVO for regular phone calls. So, as XiVO is based on Asterisk and Asterisk follows the same protocol,
I don't see any kind of issue with using XiVO with this kind of deployment of Linphone within the doorbell.
On the type of hardware that you've been talking about here, what sort of latency do you get between somebody actually pressing the doorbell and a mobile phone getting an image?
And then what's the quality of communication between the two ends like, in terms of delay between one speaking and the other hearing? Oh, I would say it's a matter of network quality. So, if the quality is good, if it's a wired connection, there is almost no delay.
You have the encoding time at the doorbell level, but if it's hardware encoding, it's good. And then it's the ping time, more or less.
It's around 50 milliseconds, 100 at maximum, but not more on the local network. If you are speaking about the public internet, it's a little bit more complex. But on the local network, there is no problematic latency.
Thank you for the presentation. My question is: I started using a voice over IP product, like a service from a service provider, a few years ago,
and I have it always connected in my Zoiper or something. Have you considered giving the doorbell, for example, connectivity directly to the internet or through some proxy, with some general voice over IP telephone number,
so that I would only set the telephone numbers of me and my family members and it would act as a standard voice over IP video call in their mobile clients? Why not? I don't see any issue with that. As I said, it's standard SIP.
So, if your SIP network provider follows the standard, you can simply configure the SIP address that you have and directly connect the doorbell to your public provider.
So, if the provider keeps to all the standards and supports DTMF, everything would be possible using a standard client. Okay, thank you. Okay, this is all the time we have now, so thank you very much.