Reverse Engineering: Satellite Based IP Content Distribution

Formal Metadata

Reverse Engineering: Satellite Based IP Content Distribution
CC Attribution 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
The presentation will cover reverse engineering a satellite-based IP content delivery system. These systems are generally used for moving digital media (such as movies and video on demand) but can also be used for digital signage and any other type of file. The presentation will touch on all aspects of the reverse engineering, from satellite reception, packet analysis, and forward error correction reverse engineering (along with an explanation of the math), to the difficulty of dealing with extremely high constant bitrates on an off-the-shelf Linux PC. The end result of the entire reverse engineering project was a Linux-based software client with features similar to the commercial version, based solely on an analysis of the protocol and incoming data.
My name's Taylor Jacob, and I'm here to discuss a satellite-based IP protocol that I reverse engineered a couple of years ago. This protocol is used all over North America to distribute news media, video-on-demand films, movies to theaters, and digital signage files, amongst other more mundane uses. There are many different protocols that operate on the same principles as what I will discuss today. As for the prevalence of these systems, a similar system is being used in Europe, and based on the vendor's website I suspect these systems are in use across the globe and not just in North and South America.
First, a little background to make sure we're all on the same page. Although there are many different types of satellites circling the Earth, this presentation deals specifically with the most commonly known type: geostationary satellites. These satellites are well suited for distributing a variety of information from a single source to a large geographic area. The most commonly known use is TV content, but the setup is also well suited for other types of media distribution. There is typically never a return path in this sort of delivery network. That sort of setup is fine if you're watching TV, where it's unimportant what any one viewer is watching. And if you're sending media files, they usually won't be needed within seconds of delivery; the data is usually positioned days beforehand.
So for this, the hardware setup is pretty basic: just standard Ku-band and C-band dishes, PCI and PCIe DVB-S cards, and a Linux x86 PC.
The software is all open source: dvbsnoop, which is a transport stream analyzing tool; the regular DVB tools like szap and dvbtraffic, to do some in-depth analysis of the transport stream, the PIDs and the traffic; and the standard Unix tools grep, sort, uniq and other text-based tools. Then at some point I started writing my own software to get deeper into it.

So before I dive in, I want to briefly cover some technical aspects of satellite video formats. DVB is the most prevalent digital video broadcasting standard in the world. There are three main types: DVB-T for terrestrial, DVB-C for cable television, and DVB-S for satellite, along with the newer versions DVB-S2, T2 and C2. Although the V in DVB indicates it's for video, it can also carry any other type of digital content; in the case of this presentation, specifically IP traffic. The main difference between DVB-T, C and S is the transmission medium: for T it's the air, for C it's a copper cable, and for S it's the Earth's atmosphere. The physical interface is generally referred to as a mux in all three DVB flavors; in the case of DVB-S it's also referred to, rather erroneously, as a transponder. Once the signals are demodulated, the bitstreams are virtually identical.

The standard way this data is moved is an MPEG transport stream, also called TS for short. The transport stream format is relatively simple: 188-byte packets that have a simple 4-byte header. All you really need to know for this presentation is that the packet starts with 0x47 and has a 13-bit field called the packet identifier, more commonly known as the PID. On a transport stream there are 8,191 available PIDs, and generally only a fraction are used simultaneously. Each PID carries a specific type of traffic: a single video stream, an audio stream, program metadata, or other data traffic. The PID is how a set-top box or other component can filter out the content not relevant to its operation.
For example, a digital video channel is made up of a few PIDs. Generally you'll have one PID for the video stream, another for the audio track, and potentially others for subtitles, or captions as we call them in North America. The set-top box filters on these specific PIDs and ignores everything else in the stream. As for encryption, DVB encryption runs on the 184-byte payload, the generic payload part in my diagram.

The second part to cover is the packetized elementary stream, or PES. PES is the format that defines the elementary data, which is generally audio or video, carried in the transport stream. The PES data is carried in sequential order in the payloads of the transport stream packets. The format of the video or audio codec isn't defined and has changed over time: the video would have been MPEG-2, now it's H.264, and it's migrating to HEVC. The important part you need to know is that a packetized elementary stream packet always starts with a 00 00 01 sequence in hex.

The other type of payload is PSI. It is most commonly used to describe the layout of a transport stream: how it's configured, what TV channels are available, and what PIDs are used for the video and audio streams. But there are many other types of PSI data that let different standards ride on top of the transport stream and define what is necessary for the application.

Now, the DVB multiprotocol encapsulation (MPE) standard is the standard that pertains to this presentation. It defines a means to carry IP traffic over the transport stream. The key parts are that one or more PIDs can carry DVB-MPE traffic, and they all operate independently of each other. Each PID stream may contain more than one destination IP. The intention is to take IP packets from one Ethernet network and place them on another Ethernet network once they are decoded. In North America I encountered very little unencrypted unicast IP traffic, but I did see it here and there.
The vast majority of the IP traffic I encountered was multicast UDP, since a satellite link is usually single-directional.
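A minimal sketch of the transport stream packet header described above, in Python. It only decodes the fields the talk relies on (the 0x47 sync byte, the 13-bit PID, and the 00 00 01 PES start sequence); the function names and the returned dictionary keys are illustrative choices, not taken from any tool mentioned in the talk.

```python
SYNC_BYTE = 0x47
TS_PACKET_SIZE = 188
PES_START_CODE = b"\x00\x00\x01"


def parse_ts_header(packet: bytes) -> dict:
    """Decode the fixed 4-byte header of one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet is exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing 0x47 sync byte")
    return {
        # Set when a new PES packet or PSI section starts in this payload.
        "payload_unit_start": bool(packet[1] & 0x40),
        # 13-bit packet identifier: low 5 bits of byte 1 plus all of byte 2.
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        # Non-zero scrambling control bits mean the payload is encrypted.
        "scrambled": bool(packet[3] & 0xC0),
        "has_adaptation_field": bool(packet[3] & 0x20),
        "continuity_counter": packet[3] & 0x0F,
    }


def payload_starts_pes(packet: bytes) -> bool:
    """True if a packet without an adaptation field begins a PES packet."""
    header = parse_ts_header(packet)
    if not header["payload_unit_start"] or header["has_adaptation_field"]:
        return False
    return packet[4:7] == PES_START_CODE
```

The 184 bytes after the header are the payload that DVB encryption operates on, which is why the scrambling bits live in the header rather than in the payload itself.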
So, a little bit of background. I've always enjoyed scanning satellites for news feeds, sports events, etc. Although there's a lot of programming available by subscription, it's always been fun to see what you aren't supposed to see. When scanning signals with a satellite set-top box you look for these hidden channels. It's common to run across a lot that are encrypted. It's also common to find a transponder with a transport stream that has no programs on it, which you can see here with a set-top box doing a blind scan. These transponders have no programs on them, encrypted or otherwise. They obviously have some purpose, but it's likely not television traffic. Many times it will be satellite-based internet service, but not always. For years I wondered what they were, but I never did much investigation, assuming it was just encrypted internet traffic. At some point seven or eight years ago I saw hints here and there on some forums of there being TV channels on these unknown transponders; they were calling it IPTV. Unfortunately all these signals were on C-band, which I couldn't receive at the time. So I set out to find a C-band dish to start examining the signals. My Linux PC already had a PCI DVB-S card, so examining them was just a matter of sitting and poking around at what I could find once the dish was installed. Once I had acquired my first dish and had it installed, I started taking notes on these empty transponders.
The process I would use to examine these empty transponders, once I had found them with the set-top box, was to see what PIDs were present. Once I knew what PIDs were present, I'd start identifying them. As I said, I didn't really know what I was looking for, but if it was unencrypted and not a regular PES stream, I was interested. One of the tools I used to identify what PIDs were present was dvbtraffic. It's part of the standard Linux DVB tools. It operates by checking every packet in the transport stream over a second of time and reporting what PIDs are present and their bandwidth. You can see its output here. The capture on the right is a regular TV mux, and you can see there are a lot of different PIDs at various bitrates. The one on the left is what I'm calling an empty mux. It has one PID that, as you can see, has about 70 megabits of data just sitting there.
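The per-PID tally described above can be sketched in a few lines of Python. This is not how dvbtraffic itself is implemented; it is an illustration that works on an already captured buffer of transport stream data, and the function name and the `seconds` parameter are assumptions made for the example.

```python
from collections import Counter

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def pid_bitrates(ts_data: bytes, seconds: float) -> dict:
    """Return {pid: bits per second} for a capture that lasted `seconds`."""
    counts = Counter()
    # Walk the buffer one 188-byte packet at a time.
    for offset in range(0, len(ts_data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = ts_data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            continue  # not aligned on a packet; ignore it in this sketch
        pid = ((packet[1] & 0x1F) << 8) | packet[2]
        counts[pid] += 1
    return {
        pid: count * TS_PACKET_SIZE * 8 / seconds
        for pid, count in counts.items()
    }
```

A mux where one PID accounts for nearly all of the bitrate, and no program tables point at it, is the "empty mux" pattern the talk goes looking for.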
Once I had identified one of these unknown PIDs with lots of traffic, I'd start to look at it in dvbsnoop. dvbsnoop allows you to examine in really fine detail whatever content is on a particular PID. I'd use it after dvbtraffic had identified the PID. After a quick glance you can tell what kind of traffic is being carried. If you look at this example here (I cut out a lot of it; it would have been 30 screens' worth of data), you can see it's IP traffic. In red you can see the IP and the destination port, so there's some traffic going out here. Again, I did all of this manually at first, but eventually I programmed it up, because it's kind of boring writing all this stuff down on paper.
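To illustrate what "the IP and the destination port" means at the byte level, here is a small Python sketch that pulls those two fields out of a raw IPv4 datagram carrying UDP. It assumes the datagram has already been extracted from the MPE section; that extraction step, and the function name, are outside what the talk describes and are hypothetical here.

```python
import ipaddress
import struct


def parse_ipv4_udp(datagram: bytes):
    """Return (destination IP, destination port, UDP payload)."""
    if datagram[0] >> 4 != 4:
        raise ValueError("not an IPv4 datagram")
    header_len = (datagram[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if datagram[9] != 17:
        raise ValueError("not UDP (IP protocol 17)")
    dst_ip = ipaddress.IPv4Address(datagram[16:20])
    # UDP header: source port, destination port, length, checksum.
    _src_port, dst_port, udp_len = struct.unpack_from("!HHH", datagram, header_len)
    payload = datagram[header_len + 8:header_len + udp_len]
    return dst_ip, dst_port, payload
```

The returned address object exposes `is_multicast`, which is a quick way to separate the multicast traffic from the unicast traffic.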
So anyway, dvbsnoop provided a tremendous amount of detail about almost anything I encountered, and here's a portion of the output.
Basically, since dvbsnoop outputs text, I was able to feed its output into other tools.
I was able to pipe this output through grep, look at what IPs were present, get other statistics, and figure out what I was looking for. At first I wasn't really sure what I was looking for. The first thing I did was to identify which IP addresses were present, and then I started looking at the traffic on individual IPs. I realized very quickly that anything carrying unicast traffic wasn't very interesting. Multicast was a completely different story.
Looking at a dump of the UDP packets, you can see highlighted the 0x47s, which are spaced 188 bytes apart, and in blue, which I guess you can't see very well, there's a bunch of PES start sequences. So it's really obvious in this packet that there's some sort of video, some sort of transport stream, but how to get at it I didn't know at the time. By this point I had reached the end of the greppable part of my investigation, so I threw together some very basic code to capture the IP packets and manipulate them.
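The two telltale signs mentioned here, 0x47 bytes spaced 188 bytes apart and 00 00 01 start sequences, are easy to test for in code. The following Python sketch is an illustration only; the `offset` parameter and the function names are assumptions, and a 00 00 01 hit is only a candidate PES start since the same byte pattern can occur elsewhere in video data.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PES_START_CODE = b"\x00\x00\x01"


def looks_like_transport_stream(payload: bytes, offset: int = 0,
                                min_packets: int = 2) -> bool:
    """True if 0x47 appears every 188 bytes starting at `offset`."""
    body = payload[offset:]
    packet_count = len(body) // TS_PACKET_SIZE
    if packet_count < min_packets:
        return False
    return all(body[i * TS_PACKET_SIZE] == SYNC_BYTE for i in range(packet_count))


def count_start_code_candidates(payload: bytes) -> int:
    """Count 00 00 01 sequences, which may mark PES packet starts."""
    return payload.count(PES_START_CODE)
```

Requiring several consecutive sync bytes matters because a single 0x47 will show up by chance in almost any binary data.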
At first the little program would just display what IPs were present on the PID, and then I started to save the payloads of certain IPs to a file. My first success turned out to be linear TV channels. Linear is an industry-standard term to describe real-time television, a real-time television station. These channels are made up of a single MPEG transport stream encapsulated in UDP packets, with one channel per multicast IP. Each UDP packet would contain exactly seven transport stream packets. On two transponders on the same satellite I found a large number of encrypted channels, with a few that were in the clear. One of these had NHL Center Ice streaming unencrypted for about two years. To play these channels back on a computer, I wrote a simple program to capture the UDP packets, de-encapsulate them and forward them to a multicast IP, which I would play back in VLC. You can see a little screenshot here. After finding the encapsulated linear video, I still had no clue what the majority of the multicast IPs were. There was a multicast IP on the same transponder as NHL Center Ice that took up most of the bandwidth of the transponder.
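As a small illustration of the "exactly seven transport stream packets per UDP packet" layout, the Python sketch below splits one such payload back into its 188-byte packets. The function name is hypothetical, and the strict length check reflects only the linear-channel case described in the talk, not the file transfer protocol discussed later.

```python
TS_PACKET_SIZE = 188
PACKETS_PER_DATAGRAM = 7
SYNC_BYTE = 0x47


def split_linear_payload(payload: bytes) -> list:
    """Split a 1316-byte UDP payload into seven transport stream packets."""
    expected = TS_PACKET_SIZE * PACKETS_PER_DATAGRAM
    if len(payload) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(payload)}")
    packets = [payload[i:i + TS_PACKET_SIZE]
               for i in range(0, expected, TS_PACKET_SIZE)]
    if any(packet[0] != SYNC_BYTE for packet in packets):
        raise ValueError("payload is not aligned on transport stream packets")
    return packets
```

Appending the packets from successive datagrams of one multicast IP to a file yields a transport stream that a media player can open directly.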
So I wanted to look at that. With linear video, each multicast IP was used for a single program. If you used the same technique of saving the UDP payloads from one of these unknown IPs and tried to play them back, every now and then VLC would attempt to render a picture or start playing some audio. You can see the example here of how it would kind of start to render something, but it never really did anything. I found examples of this type of behavior on numerous satellites. If you examined the UDP payloads, you'd see the telltale signs of the MPEG transport stream and the PES packets, but I couldn't figure out how to extract them.
Upon closer examination it was clear the UDP payloads had some sort of header on them, so I wrote software to strip it and save the UDP payload using the same technique. It was difficult to determine what was the header and what was the payload. I tried this blind header-stripping technique on numerous transponders without much luck. Then I tried it on a transponder that I'd tucked away in my notes, and on that transponder I got lucky and ended up with playable video. Although I was excited to finally get video to play without errors, the content turned out to be an unremarkable American program, which wasn't exactly the most rewarding prize, but it was progress nonetheless. I still wasn't sure why my previous attempts at header stripping had not worked on the other transponders.
By this point I had determined that the header and packet were a fixed size, with a few exceptions. I started looking at the header more specifically to see if I could understand why I had not always been able to get some sort of video to play. Like before, I used dvbsnoop to examine just the UDP packet headers in real time. Looking at the slide, you can see almost immediately that the headers have some sort of 32-bit field that is interleaved, that's in red, and a separate 32-bit counter, in blue, a number that increments with each packet.

Once I started to look at the different transponders in my notes, I realized that I wasn't always dealing with the same protocol or the same version of the protocol. I identified at least five different systems that are almost identical in design and very similar at a technical level, but differed in the details. I decided to focus my efforts on the high-bandwidth C-band transponders that carried NHL Center Ice, as I thought they might yield the most interesting results. From here on out I mainly focused on this transponder and the protocol it was using.

Going back to the slide, you can see almost all the packets begin with the same few bytes. I started to speculate that the interleaved 32-bit field in the header was some sort of transmission ID, so I started saving each transmission ID to its own file. With this basic change I was able to render video from a much larger percentage of files, and it became clear what my header stripping had been doing wrong previously: it was a mess because multiple files were being sent at the same time. I then examined the files that I couldn't play and looked at their packet headers. Almost all the transmissions were sent in sequential order based on the counter, but in the examples I couldn't play, the counter was not in sequential order. So I realized that the counter was actually a block number in the transfer, and I started saving the payload using the counter as a block offset.
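The reassembly idea, one output file per transmission ID with each payload written at an offset derived from its block number, can be sketched as follows in Python. The talk does not give the byte positions of these fields, so this sketch takes the already extracted values as arguments; the function name, the file naming scheme, and the assumption of a fixed block size per transmission are all illustrative.

```python
import os


def store_block(directory: str, transmission_id: int, block_number: int,
                payload: bytes, block_size: int) -> str:
    """Write one payload at its block offset in the file for its transmission."""
    path = os.path.join(directory, f"{transmission_id:08x}.bin")
    # Open for update without truncating if earlier blocks already exist.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as output:
        output.seek(block_number * block_size)
        output.write(payload)
    return path
```

Because every block lands at `block_number * block_size`, blocks that arrive out of order or are retransmitted end up in the right place, and several interleaved transmissions no longer corrupt each other.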
This cleaned up almost all the errors I was getting, and I was able to play back almost every file. Once I was able to play the majority of the files being transferred, I also noticed that the running time of the videos was significantly longer than the time it took to download them. It was only then that I realized the files were meant for playback in the future, i.e. this was a content delivery system.

Being able to save the media files was a great achievement, but it left a lot to be desired. Every file was saved as a 32-bit number with no indication as to what the content was. Identifying it required further examination, which meant me trying to play the file in a media player. I would also receive files numerous times with no knowledge that it was a retransmission. Also, the end of these files always seemed to contain some seemingly random data. My hunch was that it was some sort of error correction data, but I had no way of proving it at the time. The split between the file and this extra data was pretty obvious: the random data always started on a packet boundary.

I knew there were other packets in the stream that I was ignoring. I assumed there had to be some sort of control packets: the receiving stations had to be aware of the file's name and know when a transmission started or ended. I started logging all the non-payload packets to a file and grepping them to get a basic understanding of the other types. All packets have the same 8-byte header: a 16-bit packet type, a 16-bit packet length, and the 32-bit object ID field. After this, all the packets differed. But since all the packets used the same object ID field, I started to look at how the other packets correlated with the files I downloaded, to see what I could determine. The type 3 packet was the most prevalent and it was the longest of all of them. From the strings it contained, its purpose seemed quite obvious, and I'll come back to it.
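The common 8-byte header (16-bit type, 16-bit length, 32-bit object ID) maps directly onto a fixed struct layout. In this Python sketch the little-endian byte order is an assumption, as are the names; the talk does not say whether the length field counts the header itself, so the sketch returns it without interpreting it.

```python
import struct

# Assumed layout: packet type, packet length, object ID, all little-endian.
COMMON_HEADER = struct.Struct("<HHI")


def parse_common_header(packet: bytes):
    """Return (packet_type, packet_length, object_id, body)."""
    if len(packet) < COMMON_HEADER.size:
        raise ValueError("packet shorter than the 8-byte common header")
    packet_type, packet_length, object_id = COMMON_HEADER.unpack_from(packet, 0)
    return packet_type, packet_length, object_id, packet[COMMON_HEADER.size:]
```

Grouping logged packets by `object_id` is what lets the control packets be matched against the files that were saved under the same number.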
The other two packets, the 06 and FF types, only occurred before and after a transmission. Their bodies were really short, and it was soon clear that they were used to indicate when a new object ID was starting in the stream or when one was being removed. So, the type 3 packet: like I said, I'd come back to it.
This is the one I was looking for. It clearly contained the file name, along with some other descriptive information about the file. So I started to log as many of these as I could, looking for patterns. My process was pretty rudimentary: I would put two packets side by side, and I piped and grepped packets with various filters to see what patterns I could discern. The first problem was that the ASCII file name was not at a fixed offset in the packet. This meant I had to learn more about how the packet was structured to be able to extract it programmatically. So my goal was to identify as many parts of the type 3 packet as possible, so that I could see what remained. The packet was made up of little-endian 32-bit integers and null-terminated strings, so there were a lot of zeros. I knew the total payload size including the extra data, and I had a good idea of the file size, to within a few hundred bytes, once the extra data was removed. I also knew the transmission block size. I was able to clearly see the file name in the packet as well. I also found a date field that was usually within the past week, in Unix timestamp format. I may be showing my age a little bit, but I printed out hex dumps of a couple of type 3 packets for files where I knew all these properties, and highlighted them.
Once I had marked off all these areas, it was obvious that the fields I was interested in were all clustered together. You may not see it clearly here in the photo, but there was always the same 32-bit little-endian value right above the transmission block size, which is highlighted in yellow. It was followed by a 32-bit little-endian length that correlated with where the file name sat in the packet. Based on this knowledge and a little bit more human pattern recognition, I was able to make a parser that was token-based, starting 24 bytes into the packet. Now, this had some major flaws. It would crash early and often, as I didn't know nearly enough about the format. But when it was successful, it would properly save the files with the correct file names. I added bounds checking and kept the parser limping along over the years, and whenever it mis-parsed a type 3 packet I'd compare the packet that caused the crash to what I was expecting. This process continued for a year or two until the parser stopped running into errors. I'm sure my parser routine is only a facsimile of the real software's, but it has been stable for four or five years now.
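A token-based parser of the kind described, reading little-endian 32-bit integers and null-terminated strings with bounds checking, might look like the Python sketch below. Only the token types, the Unix timestamp format and the 24-byte starting offset come from the talk. The field order used in `parse_announcement` is invented purely for illustration and is not the real packet layout.

```python
import struct
from datetime import datetime, timezone


class ParseError(Exception):
    """Raised instead of crashing when a packet does not match expectations."""


def read_u32le(buf: bytes, pos: int):
    """Read one little-endian 32-bit integer; return (value, next position)."""
    if pos + 4 > len(buf):
        raise ParseError("truncated 32-bit integer")
    return struct.unpack_from("<I", buf, pos)[0], pos + 4


def read_cstring(buf: bytes, pos: int):
    """Read one null-terminated ASCII string; return (text, next position)."""
    end = buf.find(b"\x00", pos)
    if end == -1:
        raise ParseError("unterminated string")
    return buf[pos:end].decode("ascii", errors="replace"), end + 1


def parse_announcement(packet: bytes, start: int = 24) -> dict:
    """Walk a HYPOTHETICAL field order: block size, timestamp, file name."""
    pos = start
    block_size, pos = read_u32le(packet, pos)
    timestamp, pos = read_u32le(packet, pos)
    file_name, pos = read_cstring(packet, pos)
    return {
        "block_size": block_size,
        "sent": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "file_name": file_name,
    }
```

Raising a dedicated `ParseError` rather than letting an index error escape makes it practical to log the offending packet and compare it against what was expected, which is the debugging loop the talk describes.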
So the biggest problems at this point were dealing with the volume and speed of the transmissions. The big video-on-demand system had two transponders that combined were a constant 150 megabit stream. I could fill a terabyte hard drive in less than two days, and most of the content I was collecting I had little or no interest in watching. I was always able to filter some of the content based on file names, but it wasn't an ideal solution and required me to manually clean up the hard drive every few days. Over the course of a month I moved up to a ten-terabyte array of two-terabyte drives. I never fully addressed the file problem programmatically, but with some string matching I was able to reject a large enough portion that I could sort through the files once a month or so.

The overall bitrate was the most difficult problem to tackle. This is by far the most I/O-bound application I'd ever worked with. The transmissions were constant, so I needed to be able to write data at least as fast as it arrived, because no amount of buffering would help. The ten-terabyte array was more than adequate to keep up with the data rate if left alone, but when doing other operations on the disk I ran into problems.
The first thing I tried was to increase the buffering in the DVB driver. I tried using a buffer as big as 64 megabytes, but that only covered about 7 seconds. The value I had been using before was smaller, so this was an improvement, but it didn't eliminate the problem. I can't quantitatively say how much it helped, but I still ran into problems pretty consistently. From watching the hard drive lights, it was clear that the data wasn't being written continuously; it was being written to disk in quick bursts of a second, followed by a pause of a few seconds. I started to look at the disk using vmstat, and you could also see the burstiness there. You can see that here in this slide, in the top two blocks in red: it's writing nothing, then writing about 45,000, then nothing again.

So I did some research and found the kernel settings that control how much I/O is cached in memory until it's flushed to disk. There's a pair of settings in the kernel, dirty_background_bytes and dirty_background_ratio. What they do is set, either as a byte count or as a ratio, the amount of data held in the write cache before it is flushed to disk. The default setting for the kernel I was using was 10%. To be conservative, say 4 gigabytes of RAM was being used for write caching: writes wouldn't get flushed to disk until approximately 400 megs needed to be flushed. At 150 megabits, a rough estimate is that it would take around 20 seconds to fill the buffer. The problem was that the buffers would still be filling while the data was being written out, so that's why the bursts were so much more closely spaced. The real problem can be seen in vmstat's wa field. This is the percentage of time spent stalled waiting on I/O. Buffer overruns could occur if everything in my application had to stop and wait. So basically that means that everything is waiting for the I/O to happen.
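The two rough estimates in this part of the talk can be checked with a few lines of arithmetic. The Python sketch below reads the garbled figures as a 64 MB driver buffer fed by a roughly 70 Mbit/s transponder, and a 10% dirty threshold on 4 GB of cache fed by a roughly 150 Mbit/s combined stream; those readings, the decimal megabyte, and the function names are assumptions.

```python
MB = 1_000_000        # decimal megabyte, an assumption for round numbers
MBIT = 1_000_000      # bits per megabit


def seconds_to_fill(buffer_bytes: int, stream_bits_per_second: int) -> float:
    """How long a constant-rate stream takes to fill a buffer."""
    return buffer_bytes * 8 / stream_bits_per_second


def dirty_threshold_bytes(cache_bytes: int, ratio_percent: int) -> int:
    """Bytes of dirty data allowed before background writeback starts."""
    return cache_bytes * ratio_percent // 100


# A 64 MB driver buffer at about 70 Mbit/s lasts roughly 7 seconds.
driver_buffer_seconds = seconds_to_fill(64 * MB, 70 * MBIT)

# 10% of 4 GB is 400 MB, which about 150 Mbit/s fills in roughly 20 seconds.
threshold = dirty_threshold_bytes(4000 * MB, 10)
writeback_delay_seconds = seconds_to_fill(threshold, 150 * MBIT)

print(round(driver_buffer_seconds, 1), threshold // MB,
      round(writeback_delay_seconds, 1))
```

Both results land close to the "about 7 seconds" and "around 20 seconds" quoted in the talk, which is why a larger buffer alone could not hide a write stall of several seconds.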
You can see that in the last row there, in yellow: for 17 percent of that one second, everything is just paused. To stop this behavior I changed the dirty background ratio/bytes setting to 0, which meant the data was written to disk as soon as it arrived. This immediately changed the disk access behavior to be much more linear: data was written out as it was received, continuously, instead of being so bursty. It also meant the potential spikes in I/O blocking were reduced significantly. You can see this improvement in the lower vmstat output, where it's writing a consistent amount in each sample.

After tweaking the VM dirty settings I was able to write data reasonably well, but I still had problems when doing other file system manipulations, very specifically deleting a file. Up until this point I was just using the ext4 file system that is the default on Linux, and I had no reason to think anything else would suit my needs better. I started researching file deletion times. The larger a file is on ext4, the longer it takes to delete, and it's worse if the file has been fragmented. I had done some work with MythTV years before, and I remembered the hardcore users recommended something other than ext4, but I couldn't remember why. It turns out it was the long deletion times that were causing the I/O spikes. I did a little bit more research, started trying various file systems, and ended up settling on XFS. The two main reasons were that I could control how large a block of the disk was allocated at a time, and that the file deletion time was always fractions of a second regardless of how large the file was. The large block allocation was also very useful since almost every file downloaded was multiple gigs. I set that to 1 gig. There are some XFS tools to see how many non-contiguous blocks a file takes up. With the 1 gig setting, really large files on the order of 20 gigs or more would still be split up into only a few pieces.
this point I was able to store just about every media file from the video on demand distribution system. I still had problems here and there due to reception problems, but overall I was able to complete 95% or more of the files, and buffer overruns were a thing of the past. So although I was quite a
ways toward being able to store and organize the files, due to the nature of satellite transmission I would quite regularly be missing a few packets from transmissions. It was quite frustrating to be missing one or two packets from a movie. So eventually I decided that I should take a look at the FEC. Up until then I didn't understand much about the FEC, aside from that at a high level it provided data protection using some fancy method I didn't understand at the time. At some point I found that the vendor's website did mention using a proprietary FEC method. So I started researching FEC and started looking at the FEC data I was downloading more critically, to see how I could focus my energy. I knew that the FEC data always filled the packets exactly, and I knew there was no padding on the FEC data like there was on the actual file. This led me to believe that it was block based. I also noticed that the ratio of FEC data had a close correlation to the file being transmitted. I started examining other systems that use the same protocol, and this ratio differed a little on different systems. The C band systems tended to use less redundancy than Ku band; this is most likely because C band is less sensitive to interference and signal loss than Ku band. One of the more important observations I happened to make is that a single packet transmission's FEC was an exact copy of the payload. So this is an example of a two-packet file that was 2,600 total bytes; you can see the contents are the same at byte 0 and byte 1300. Up until this point I was mostly overwhelmed with attempting to know where to start, because I had focused my energy on the large files for some reason. So I started examining the smaller transmissions and started collecting samples of 2, 3 and 4 packet transfers. At some point during my research I had done patent searches
and come across a few potentially useful FEC schemes that I hadn't been able to make sense of. Eventually I went back to the scanned documents and started trying to piece together how it all worked. This mundane patent search actually provided the big breakthrough I needed; I just didn't know the answer had been sitting there the whole time. From this point on I started researching all the math mentioned in the patents and trying to remember how matrix math worked. I had to learn all about finite field math, Galois fields and matrix math. Over the course of a few weeks and many, many sheets of paper I managed to calculate the FEC of small file examples by hand. I won't turn this into a long math lesson, but I want to walk through a high-level overview of how the math works in the FEC system. So a
key part of the forward error correction is Galois field arithmetic. A Galois field has two important properties. The first is that it's a set of integers where the math operations (addition, subtraction, multiplication, division) between the integers give another integer in the set. The second property is that the field is finite. The FEC scheme that I'm speaking of, and in fact most others, are a set of algebraic equations that are used to calculate the missing elements from the others. Performing the math operations in a Galois field is different from standard math. Addition and subtraction are just XOR. Multiplication and division are a bit more complicated, but given the total number of possibilities in a 2 to the 8th field they can easily be performed in a computer with a lookup table. All the Galois field math is done using a primitive polynomial, one not reducible by any other polynomial.
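The table-driven arithmetic described above can be sketched as follows. This is a minimal illustration, not the speaker's code: it assumes the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), which is the one commonly used for Reed-Solomon coding in QR codes, and the names EXP, LOG, gf_add, gf_mul and gf_div are made up for this example.

```python
# Sketch of GF(2^8) arithmetic using lookup tables.
# Assumption: primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D).
PRIM_POLY = 0x11D

EXP = [0] * 510  # EXP[i] = 2^i in the field; doubled so sums of logs need no modulo
LOG = [0] * 256  # LOG[x] = i such that 2^i == x (LOG[0] is never used)

value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1              # multiply by 2
    if value & 0x100:        # overflowed 8 bits: reduce by the polynomial
        value ^= PRIM_POLY
for power in range(255, 510):
    EXP[power] = EXP[power - 255]


def gf_add(a, b):
    """Addition and subtraction are the same operation: XOR."""
    return a ^ b


def gf_mul(a, b):
    """Multiply by adding logarithms."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide by subtracting logarithms."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^8)")
    if a == 0:
        return 0
    return EXP[LOG[a] - LOG[b] + 255]
```

With these tables every multiplication or division is two lookups and one integer addition, which is why the approach suits a stream running at a constant high bitrate.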
In the case of the 2 to the 8th Galois field, the polynomial is represented by the x to the 8th equation on the bottom. This is the same polynomial used for QR codes, Reed-Solomon and lots of other correction schemes. The FEC method employed, which I found out later was called the Vandermonde FEC method, consists of the basic equation Y_k = X_n × G, where Y_k are the transmitted codewords of all the packets with the FEC data, X_n is the original data, and G is the n by k generator matrix. Then you can get back X_n with the following equation, which is X_n = C_n × A^-1, where A^-1 is the repair matrix. In the second equation, C_n is where you store your received codewords and A^-1 is the repair matrix, which is based on the generator matrix. You have to have at least n elements of Y to be able to recreate X. So the whole principle of the FEC system is that you have a system of k equations in n variables, so if you have at least n pieces you can solve for the missing values. So what does all that mean in a real-world implementation? Well, let's use a small matrix as an example and do a demonstration on a 2 by 3 FEC scheme. This means there are 2 bytes we want to protect using 3 bytes. It's a small example, so we can do it quickly and the whole process can be understood. The first step is to build a generator matrix. For this example of 2 by 3, it's built from a 2 by 3 Vandermonde matrix, and this is the equation right here; it's just a simple geometric progression. So the actual
2 by 3 Vandermonde matrix for the Galois field 2 to the 8th is shown in Figure 5, but this isn't the generator matrix. To create the generator matrix we need to get it into standard form by reducing it. I'm not going to get into this, but that's Figure 6 right there.
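The reduction step that the talk skips over can be sketched like this. It is an illustration under stated assumptions, not the speaker's implementation: the field polynomial is taken to be 0x11D, the Vandermonde evaluation points are assumed to be 0, 1, 2, ..., and every function name is hypothetical. Reducing to standard form is done here by multiplying the Vandermonde matrix by the inverse of its top n rows.

```python
# Sketch: build a Vandermonde matrix over GF(2^8) and reduce it to standard
# (systematic) form. Assumptions: polynomial 0x11D, evaluation points 0, 1, 2, ...
PRIM_POLY = 0x11D


def gf_mul(a, b):
    """Multiply in GF(2^8) by shift-and-add, reducing by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def gf_pow(a, exponent):
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    """Inverse as a^254, because a^255 == 1 for every nonzero element."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    return gf_pow(a, 254)


def mat_mul(a, b):
    return [[_dot(row, col) for col in zip(*b)] for row in a]


def _dot(row, col):
    acc = 0
    for x, y in zip(row, col):
        acc ^= gf_mul(x, y)
    return acc


def mat_inv(m):
    """Gauss-Jordan elimination over GF(2^8); assumes m is invertible."""
    n = len(m)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [v ^ gf_mul(factor, p) for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def systematic_generator(n, k):
    """k rows (one per packet), n columns; the first n rows become the identity."""
    vandermonde = [[gf_pow(point, j) for j in range(n)] for point in range(k)]
    return mat_mul(vandermonde, mat_inv(vandermonde[:n]))


generator = systematic_generator(2, 3)
print(generator)  # [[1, 0], [0, 1], [3, 2]]
```

Under these assumptions the 2 by 3 case comes out as an identity block followed by a single FEC row of 3 and 2.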
You'll notice that the 2 by 2 at the bottom left of the matrix is all ones on the diagonal. This is called the identity matrix, and it also proves to be a very useful feature later on in the FEC scheme. OK, so the 2 bytes we want to protect are 1F and F7. I got these values from actual packets, and you can see that in Figure 9 here. So now to do the math. How many of you remember how matrix multiplication works? I had forgotten. As a quick review: you flip the 2 by 1 matrix on its side, leave the 2 by 3 matrix alone, multiply the 2 by 1 matrix with each row, and end up with a 3 by 1 matrix. To do the math you multiply 1F by 1 and F7 by 0, and you follow this through for each of the 3 fields. In a Galois field, anything times 1 is itself and anything times 0 is 0. That makes it easy to see that the first part of the codeword is 1F and the second is F7. For the last one: 1F times 3 in the Galois field is 21, F7 times 2 is F3, and 21 XOR F3 is D2. So D2 is the value which would be in the third packet, the byte transmitted in the third packet. The great property here is that the first bytes of the codeword are the same as the original data, because that section of the generator matrix is always an identity matrix. As I mentioned earlier, this is the case for any of the FEC data rates. It also means that all the bytes written to disk for a transfer won't have to be altered later.
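The encoding step can be reproduced in a few lines. This sketch reads the two data bytes in the example as 1F and F7, an interpretation of the transcript that reproduces the quoted intermediate results 21, F3 and D2 exactly. The polynomial 0x11D and the names GENERATOR and encode are assumptions for illustration.

```python
# Sketch of the 2 by 3 encoding example.
# Assumptions: polynomial 0x11D; data bytes 0x1F and 0xF7; generator rows
# [1, 0], [0, 1], [3, 2] (identity block plus one FEC row).
PRIM_POLY = 0x11D


def gf_mul(a, b):
    """Multiply in GF(2^8) by shift-and-add, reducing by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


# One row per transmitted packet.
GENERATOR = [[1, 0], [0, 1], [3, 2]]


def encode(data, generator):
    """Return one codeword byte per generator row (row dotted with the data)."""
    codeword = []
    for row in generator:
        acc = 0
        for coefficient, byte in zip(row, data):
            acc ^= gf_mul(coefficient, byte)  # addition in the field is XOR
        codeword.append(acc)
    return codeword


codeword = encode([0x1F, 0xF7], GENERATOR)
print(" ".join(f"{byte:02X}" for byte in codeword))  # 1F F7 D2
```

The first two output bytes equal the input because of the identity rows, which is the property that lets the received file bytes be written to disk unmodified.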
Now that we've been able to generate the FEC, the next question is how do we get the data back out. Let's say in our example from before we get the first packet and the third packet, and we've lost the second. Since we got two of the codewords, we will be able to regenerate the missing byte from the second packet. To do so we need to build the repair matrix. To do this we take the rows of the generator matrix for the packets we received and build a 2 by 2 matrix, in this case what you see in Figure 13. The 1 and the 0 correlate to the first packet, and the 3 and 2 are the generator matrix's third field. The matrix then needs to be inverted to create A inverse, and that's Figure 14. This gives us A inverse, which is what we use to regenerate X.
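The repair of the lost second packet can be sketched as below, including the final multiplication by the inverted matrix. As before this is an illustration rather than the speaker's code: the polynomial 0x11D, the byte values 1F and D2 for the first and third packets, and all names are assumptions. The received bytes are treated as a column vector here, which is the same system of equations written transposed.

```python
# Sketch: recover the original two bytes from packets 1 and 3 of the example.
# Assumptions: polynomial 0x11D, generator rows [1, 0], [0, 1], [3, 2],
# received bytes 0x1F (packet 1) and 0xD2 (packet 3).
PRIM_POLY = 0x11D


def gf_mul(a, b):
    """Multiply in GF(2^8) by shift-and-add, reducing by PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def gf_inv(a):
    """Brute-force inverse; fine for a sketch, a table would be used in practice."""
    for candidate in range(1, 256):
        if gf_mul(a, candidate) == 1:
            return candidate
    raise ZeroDivisionError("zero has no inverse")


def invert_2x2(m):
    """Closed-form inverse; in this field minus equals plus, so no sign changes."""
    (a, b), (c, d) = m
    det = gf_mul(a, d) ^ gf_mul(b, c)
    if det == 0:
        raise ValueError("matrix is singular")
    s = gf_inv(det)
    return [[gf_mul(s, d), gf_mul(s, b)], [gf_mul(s, c), gf_mul(s, a)]]


def mat_vec(m, v):
    out = []
    for row in m:
        acc = 0
        for coefficient, byte in zip(row, v):
            acc ^= gf_mul(coefficient, byte)
        out.append(acc)
    return out


GENERATOR = [[1, 0], [0, 1], [3, 2]]
received = {0: 0x1F, 2: 0xD2}  # packet index -> byte; packet index 1 was lost

indices = sorted(received)
repair_rows = [GENERATOR[i] for i in indices]  # [[1, 0], [3, 2]]
a_inverse = invert_2x2(repair_rows)
original = mat_vec(a_inverse, [received[i] for i in indices])
print(" ".join(f"{byte:02X}" for byte in original))  # 1F F7
```

For larger schemes the 2 by 2 shortcut no longer applies and a general elimination routine is needed, but the steps are the same: select the generator rows for the packets that arrived, invert, multiply.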
Now I multiply the 2 bytes that we have, 1F and D2, by A inverse. The math is the same as before, basic matrix arithmetic. That's a very simple example, but it explains the process, and I did numerous examples on paper before I was able to understand enough to do much in software. So now that I understood how the FEC process worked, I started to work on a software implementation. Luckily, Galois field math is very standard computer science; it didn't take long to find some code online to build multiplication and division tables. Once I did that, I wrote a few functions to reduce and invert the matrices. By this point I had all the mathematical tools in software needed to deal with the FEC. For this part I modified my software to stop truncating the FEC data from files and instead save it. I collected a few examples of complete files with complete FEC repair sections, and since I should be able to generate the FEC sections as well as repair sections, my first step was to write software that verified the FEC data in the saved files. It didn't take very long to implement and test the software for files that consisted of less than 100 packets, but after that there were problems when files consisted of 101 packets or more. I made the assumption that 100 must not be some arbitrary number, and that the programmers had decided on it for some reason. It turns out that after 100 packets, the packets are interleaved, so that 100 by X is the largest scheme. The interleaving was easy to calculate: it's the total number of packets divided by 100, and any packets that weren't present to get to an even 100 are considered all zeros. All these little things were necessary to do a complete implementation, and it was easy to test the software with complete files and FEC sections to test against. So by now I was elated that I was able to repair bad files, but I was still not totally sure what the FEC rate would be. I could make some assumptions based on the total file sizes, but I took a
look back at the myriad of unknown sections in the packets from much earlier, and sure enough there was a field that correlated FEC and data, in the 32 bit field after the file name. The only reason that it wasn't obvious was that it stores the FEC value times 100, so if the FEC scheme was 100 by 106, the field would be 10,600. Once I had this last piece of information figured out, it was time to try and do it in real time on the live stream. So the time it takes
to build a generator matrix, which is the basis of the repair matrices, is quite quick for the 2 by 3 example I just presented, but with the larger FEC rates that are used in the field, such as 100 by 109, it takes much longer. So I precalculated the generator matrices and stored these in a header file that was compiled into the executable. I only stored the FEC section of the generator matrix, since the rest is the identity matrix. Once I implemented FEC, I was able to emulate, as far as I could tell, the entire protocol. Very many of the errors were ones the FEC was able to repair, since they came in short bursts. So in conclusion, the biggest challenge of the project for me was the FEC. The math was something that I had never really dealt with in my schooling, aside from remembering it long enough to pass a test, and that was that. I'm sure you'll be wondering about the video-on-demand system and whether it's still running. The system is running, but all the files are now encrypted using some sort of scheme based on AES-256. Hopefully I explained all this in a way that's easy to understand, and thank you for listening to my presentation. So once you applied all the error-correcting stuff, what proportion of the data stream were you able to decode successfully? Other