Daala Video Codec

Video in TIB AV-Portal: Daala Video Codec

Formal Metadata

Daala Video Codec
Research Update
Alternative Title
Open Media - Daala
Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Production Year

So, I'm from Mozilla, and I'll be talking about the Daala video codec project, which we're affectionately calling the next-next-generation video codec. First, some motivation for why free codecs matter. When we talk about free here, we're talking about control, not cost. The idea is that you can do anything you want with this codec and apply it to any application you want, without having to ask permission from anybody.
If you look at the current media codecs, they are a billion-dollar toll tax on communication tools. What this means is that for every cell phone that plays audio or video content, there is a small cost associated with licensing that codec. If you look at the price of the components of that device over time, they all go down, but the cost of licensing the codecs stays constant. And of course, as these things are deployed widely, that cost is multiplied a millionfold, so there is a heavy cost there.
Also, if you look at the licensing terms on these codecs, they are used as a kind of competitive weapon. Commodity hardware manufacturers use them as weapons: by holding patents on some portions of the codec, they can get reciprocal licenses with other hardware vendors. What this means is that new entrants into the market, who don't have any of those advantages, have to pay a little more per device, and that makes them uncompetitive or unprofitable. So this is a tool that is used to keep competitors out of the market.
And finally, the Internet's success was based on innovation without asking permission. Having to license a codec is already burdensome; if you have an idea, beginning from that point may be a non-starter. So you see a lot of people innovating first and asking for forgiveness later. I work at Mozilla, and we ship a web browser that you might have heard of.
We distribute it through a volunteer network of distributors who run FTP sites hosting the source code and the binaries. If you're a small open source project, you are probably distributing in the same fashion, and we all have the same problem: we can't count how many people are using our products. For many of these licenses there is a per-user cost, and just keeping track of the number of users is burdensome, so you can't possibly even begin to pay a license cost.
So what people usually do is ignore these costs. For a small project, you're probably fine doing so, on the assumption that you're too small to sue. What happens is that once you're successful, you become a target. A famous case is Skype, which started out using VP7 and VP8, and then, when they became larger and successful, they had the revenue to license other codecs like H.264. So there is a sort of tax on success that shows up when you use a codec without being mindful of licenses.
And finally, the cost of licensing is not actually the largest cost in deploying codecs; the largest cost is incompatibility. If you think about the life cycle of a product, some hardware manufacturers might spend orders of magnitude more just ensuring compatibility across all the devices and all the deployments, so licensing fees are really not so much of a concern. You may have seen this before: you think, OK, we should make a new codec that covers all these use cases, and then we can get rid of all these compatibility issues, and now you have one more codec that you have to be compatible with.
So are we, by developing new codecs, missing the point? There are some really good reasons to do it anyway, and they mostly revolve around the cost of licensing. One is that there may simply be no acceptable license.
For certain video codecs, you might call the licensor and they'll give you option A or option B, and neither of these applies to you, so there's no acceptable way for you to even get a license. An example: for H.264, if you are deploying to a large group of people on the Internet, there is a license cap, so you can just pay the cap and not worry about counting the number of people. For other codecs, like AAC, there's no cap, so you're back to the same kind of problem: no license that works for your distribution model.
In some cases, building a new codec may be cheaper than the licensing terms. The cost of the Daala development team within Mozilla is far below the cap for H.264, and we're not sure exactly what the licensing will look like for H.265. So it makes sense for an organization that has these distribution problems to build, or to invest in the development of, a free codec that it can then use.
And of course, adversarial licensing is a huge risk in a competitive market. FRAND is often not fair, reasonable, or non-discriminatory. There was an FTC hearing in June of 2011 where an intellectual property adviser for a large networking company said that, to her, FRAND meant she had to call and sign an NDA before she could get licensing terms. With an NDA, she can't talk to other people who are licensing the same technologies, so how can this possibly be reasonable and non-discriminatory? So with Daala we are trying to change things in a competitive market.
Creating good codecs is not an easy problem, but we really don't need that many of them, and many of the best implementations are already free software. If you look at the open source community, many of the best implementations of these commercial codecs are already open source software, and you can use them. If you have any kind of deployment at scale, you may have to go and license the appropriate patents.
But the implementations are already out there: x264 and other implementations work just fine. And of course, network effects decide the market. There are cases where a royalty-free codec has taken over and has not been displaced by a royalty-bearing codec. Take JPEG: there have been lots of attempts to replace that standard. Lots of new image formats have come out that are patent-encumbered and perhaps offer better performance in some very specific cases, but none has reasonably displaced it.
But being royalty-free is not enough. Different people care about different things. In the video space there are people who care about the cost per bit, in terms of the number of bits per pixel, that is, the compression efficiency of the codec, but who don't care about price. They'll be willing to pay anything to have the very best codec, whatever the licensing terms are. There are people in the mobile space who want the best performance per watt. There are people who have other needs. So in order to really win this game, a royalty-free codec has to win on all fronts.
At Xiph we shipped other codecs: we shipped Theora and we shipped Vorbis. These were not best in class for their use cases at that time, and they didn't see great adoption, even though they were royalty-free and in some cases significantly better.
Our real success has been Opus. This is an example of a royalty-free audio codec that we did at the IETF, a standards body. Opus is better almost across the board for every use case, and it basically makes something like ten other codecs obsolete. You get this all in one place, and it covers all use cases, from very low-bitrate voice communication to low-latency, high-quality stereo.
And it goes all the way up to high-fidelity music quality. We want to do the same sort of thing with video, but the strategy is essential here: all of these things are necessary for us to be successful in deploying a royalty-free codec.
We design alternatives to avoid the worst patent thickets. It is not enough to just avoid existing known patents by working around them; we have to actually have a compelling story that we can tell people about why we're royalty-free. Just going in and saying "we read these patents and navigated around them" is insufficient, because the people we're talking to, major technology companies, are also going to read those patents. So we have to have a compelling story.
We will analyze patents and publish the results. Often the advice is that you should not publish your patent analysis, because it gives your competitors a blueprint of how you might defend yourself in, say, patent court. The point is that when we analyze and publish these results, we can then defend ourselves against IPR claims, like we did with Opus, by simply pointing to specific parts of the analysis and saying: your patent's claims on this technique we're using do not apply, for this specific reason. We're not actually giving away any patent defense as a result; it's a very defensible statement.
And of course we patent technology that we develop; we've done this already with Daala. The idea is that by patenting specific technologies, we can go to other partners in the industry and get them to listen to our claims as to why we believe we're royalty-free. Until Opus had patents, it was very hard to have that conversation, but once we had filed some patents on Opus, other partners became interested, because we can take those patents and use them to set up reciprocal licenses.
The fourth bullet point here is what we did with Opus: we partnered with other people in the industry and granted a license on reciprocal terms, which meant that you could use our patents for your deployment of Opus so long as you did not sue anybody else who was deploying Opus. If you did go after someone else who was deploying Opus, you would lose that defense, and any of the other partners deploying Opus who wanted to sue you would then have the right to do so without losing their own license. So this became sort of like the GPL, in a sense: it became our technique to encourage good behavior.
And finally, we're targeting the next-next generation. We're not targeting H.265; we're looking at what comes after that. The codec development cycle is a long cycle, and we believe that to be competitive we need to take the time to actually do something that is significantly better than H.265, because they have already come to market. If we were to deploy something that was merely equal, they would have an advantage in the hardware space and in other spaces. So we're actually aiming to be 30 to 50 percent better than what H.265 is doing.
And finally, we need to document all this stuff and make it abundantly clear that Daala is royalty-free and that Daala is better in all these use cases, and let people know why we're royalty-free.
This whole process is actually very difficult. We have to be the best in all cases: the best in compression per bit, and the best in performance per watt for the mobile case. We have to be good for archival use cases, good for streaming, and good for real-time communication. We have to be able to speak to our competitors and our critics in the other camps and get them on board with what we're doing. There is a huge number of developers who work on royalty-bearing codecs, with great minds among them, and we'd like to encourage those people to contribute parts of their technology to the Daala development process.
They would get the benefit of using it on a royalty-free basis once it becomes available, say after the next iteration of codecs.
One of the strategies that worked really well for us with Opus was that we found a niche that was not covered by existing audio codecs, and we developed a strong use case around it. For low-latency, high-quality audio there was nothing in that space, and Opus filled that niche. Until we started showing that we were very successful there, we couldn't get other people to become interested, but once we showed some success, everybody realized this was going to be something they could deploy.
And finally, we know that at the moment this is about ten people, and we're not in a position to develop our own hardware. So what we'd like to do is create technology in a way that is shown to be compelling, but is also something that other people want to pick up and can easily convert into a hardware implementation.
OK. Some of the things that worked really well with Opus that we're going to try to do with Daala: we're going to try to do all the work in a public process, in a recognized standards body with a strong IPR disclosure policy. The work on Opus was done at the IETF, and the IETF has a strong IPR disclosure policy that applies to anybody who shows up to contribute or to make any comment.
Anyone taking part in the process for developing a standard is required to disclose any patents they hold or know about that read on that standard, and that disclosure is required to be specific: it has to give the patent number. This is good. There is nothing in the IETF policy that says you must not use patent-encumbered ideas or technologies, but because they give us a specific patent number, we can evaluate that patent and say whether we do or do not agree that this IP reads on what we're doing. If we believe that it does, we can work around it: we can opt not to use that technology and find some different way of doing things.
With Daala we really question all the assumptions around the conventional structure of the codec. Basically, this is a high-risk, high-reward approach: we're going to try new and radical techniques, with the idea that some of them should give us performance gains above and beyond what traditional video techniques give. The way H.264, H.265 and VP9 have been developed is by incremental improvement: you take an existing technique and you say, now that we have more CPU budget, what can we do that's different? Maybe you refine that technique a bit by applying a little more computational power, so you get better motion vectors or better intra coding.
We also try to find applications where flexibility is essential. We have targeted Daala at real-time communication. This is in line with our work on Opus, where the IETF adopted Opus as the mandatory-to-implement audio codec. We would like to develop a video codec that fits this niche for real-time communications, so that it will eventually get used at the IETF and with WebRTC.
And finally, the process that others use relies on PSNR to select the features that are included in their codecs.
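For reference, PSNR is derived purely from the mean squared error between the original and the decoded pixels. The following is a minimal sketch, assuming 8-bit samples (peak value 255) and frames given as flat lists of pixel values; the function name `psnr` and the sample data are illustrative only and are not taken from the talk or from any codec's test tooling.

```python
import math

def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical 16-pixel flat grey patch.
reference = [100] * 16

# Two distortions with the same total squared error (16):
spread_out = [101] * 16            # every pixel off by 1
concentrated = [104] + [100] * 15  # a single pixel off by 4

print(psnr(reference, spread_out))    # about 48.13 dB
print(psnr(reference, concentrated))  # the same value
```

Both distorted patches get the same score, because the metric only sees the total squared error and not how that error is distributed over the picture.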
PSNR doesn't actually correlate well with what people perceive as quality, so we actually look at the videos and choose our techniques according to what gives better visual performance, rather than just according to arbitrary metrics.
Very quickly, I'm going to give you an overview of how video codecs work. There are four main pieces that pretty much all video codecs have: prediction, where you use what you already know about a scene when coding the current frame; transformation, where you rearrange the data so that it is in a more compact form; quantization, where you lower the resolution of the transformed data; and entropy coding. First, prediction.
There are two kinds of prediction in video codecs. There's intra prediction, where you predict portions of the current frame from already-decompressed portions of that same frame, so lower in the frame you can use references from above it. And there's inter prediction, where you use the decoded previous frame to predict the next frame. Here you can see, for this current frame, that we've constructed a reference frame from the previous frames, and that's the residual. There is significantly less information in the residual, so that's how you get the bulk of the compression.
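The inter prediction idea described above can be sketched in a few lines. This is a toy illustration under stated assumptions: frames are small lists of rows, the motion is a known one-pixel shift to the right, and the helper names (`residual`, `reconstruct`, `shift_right`, `energy`) are hypothetical. A real encoder searches for motion vectors per block and works on decoded reference frames; none of that is shown here.

```python
def residual(current, prediction):
    """Per-pixel difference between the frame being coded and its prediction."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, prediction)]

def reconstruct(prediction, resid):
    """Decoder side: add the decoded residual back onto the same prediction."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, resid)]

def shift_right(frame, dx):
    """Move the frame content dx pixels to the right, replicating the left edge."""
    return [[row[max(x - dx, 0)] for x in range(len(row))] for row in frame]

def energy(block):
    """Sum of squared values, a rough proxy for how much is left to code."""
    return sum(v * v for row in block for v in row)

# A bright object on a dark background moves one pixel to the right.
previous = [
    [10, 10, 80, 80, 10, 10],
    [10, 10, 80, 80, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
current = [
    [10, 10, 10, 80, 80, 10],
    [10, 10, 10, 80, 80, 10],
    [10, 10, 10, 10, 10, 10],
]

no_motion = residual(current, previous)                    # predict with the previous frame as is
with_motion = residual(current, shift_right(previous, 1))  # predict with the shifted frame

print(energy(current), energy(no_motion), energy(with_motion))  # 27000 19600 0
```

Even the naive prediction leaves less to code than the raw frame, and once the prediction accounts for the motion the residual vanishes in this toy case. The decoder recovers the frame by adding the residual to the same prediction.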
Next, the transformation. Most codecs use a 2D DCT, which takes spatial-domain image information, as pixels, transforms it into a sparser domain, keeps the largest coefficients, and uses those as what it codes. This also gives great compression, but it is responsible for some of the blocky edges you see in codecs, and for ringing effects. The last two pieces are quantization and entropy coding. This is where we take those transformed coefficients and reduce the precision with which we represent them, and then an entropy coder converts those values into bits, using the fact that the numbers follow some probability distribution to code them efficiently.
In Daala we basically do different things for all of those. Instead of doing just a DCT, we apply lapped transforms, a technology that was around about 20 years ago, in the early nineties, and was abandoned because of the computational cost. Now that we've got faster CPUs, this is something that is tractable. It also requires us to go through and develop a bunch of other new techniques, because none of the existing intra prediction techniques work with lapped transforms in exactly the same way.
We use multi-symbol arithmetic coding, whereas most other codecs use binary arithmetic coding, sometimes adaptive. We use a great technique borrowed from Opus called perceptual vector quantization; if you were around yesterday, Jean-Marc gave a great talk describing the technique and why it works really well for us. There are also interesting ideas around prediction where you no longer take the difference between frames. Doing so sidesteps a large number of patents that begin by saying "take the difference of two frames and do some additional processing".
And finally, we do chroma-from-luma prediction, which is different from other codecs. We'll be talking about overlapped block motion compensation next month at a conference, and we also do time-frequency resolution switching.
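The conventional transform-and-quantize step described above can be sketched as follows. This is a plain orthonormal 2D DCT-II followed by uniform quantization, written in pure Python for a 4x4 block. It is not Daala's lapped transform or its vector quantizer, and the block contents, the step size of 8, and all function names are assumptions chosen for illustration.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a list of samples."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)))
    return out

def idct_1d(c):
    """Inverse of dct_1d."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct_2d(block):
    """Separable 2D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    return transpose([dct_1d(c) for c in transpose(rows)])

def idct_2d(coeffs):
    """Separable inverse: undo the column transform, then the row transform."""
    cols = [idct_1d(c) for c in transpose(coeffs)]
    return [idct_1d(r) for r in transpose(cols)]

def quantize(coeffs, step):
    """Represent each coefficient as an integer multiple of the step size."""
    return [[round(c / step) for c in row] for row in coeffs]

def dequantize(levels, step):
    return [[lv * step for lv in row] for row in levels]

# A smooth 4x4 gradient, typical of what remains easy to compact.
block = [[100 + 4 * x + 2 * y for x in range(4)] for y in range(4)]
step = 8

levels = quantize(dct_2d(block), step)
decoded = idct_2d(dequantize(levels, step))

nonzero = sum(1 for row in levels for v in row if v != 0)
max_error = max(abs(a - b) for ra, rb in zip(block, decoded) for a, b in zip(ra, rb))
print(nonzero, "of 16 coefficients are nonzero; largest pixel error:", max_error)
```

For this smooth block only three of the sixteen quantized coefficients are nonzero, which is the sparsity the entropy coder then exploits, at the price of a small reconstruction error introduced by the quantizer.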
All of these are new techniques that are not currently used in video coding, and we're going to use them all to try to have a compelling story for being royalty-free.
If you'd like to follow our work, I've got a link here to some of the demos we've put together. These are all online, and the slides are online with the links in them, so you can click on them if you have any interest in the specific coding techniques. This is a chart that we made.
It describes our progress over the previous year. What you're looking at here: the red line is what H.265 does on a certain video set, as of November of last year, and these lines here are Daala's curves over time, so we're making progress. In addition, Jean-Marc landed code yesterday that gives a further improvement at low rates. So we're moving closer to that red line of H.265, and we believe there are many more techniques we can apply that will get us even closer.
The techniques I showed before are not the end of it; there is new, innovative work being done in this space. For example, this is a very interesting technique that lets you take the center frame, which is a composite of the two other images, and separate them. These are the results of doing this through sparsity-induced prediction: it has actually separated two frames that were overlaid. Now, this has an additional computational cost, but it may be tractable in the next few years. So the idea is that there are new techniques that haven't been deployed yet that we believe we can get gains out of.
So what does the road ahead look like? As the slides showed, we've been making some progress with these new, innovative techniques, finding new ways to work around the issues of using lapped transforms, but there's still a long way to go. The industry is currently looking at deploying H.265 (HEVC) and VP9, so it is not really focused on the generation after that. We'd like to take this opportunity to innovate and make Daala something that will be competitive.
I would love to get help from people. In particular, we are looking for application domains where there is some novel use case that is not currently covered by video codecs. If there's something that you think would be interesting, maybe around videoconferencing or similar areas, let us know; we'd like to accommodate that. Right, so I have time for questions.
We have time for one or two questions.
[Audience question, inaudible.]
So the question was: will we be proposing Daala to a standards body at completion? The answer is that we're actually talking with the IETF about forming a working group, hopefully soon, to do the development of the codec in a public process. So even before we've got everything finalized, we'd love to engage with a standards body, bring in other partners to give us contributions, and involve the community.
[Audience question, inaudible.]
Right, yes. So the question was: how does the CPU usage of Daala compare to H.265? We have been running benchmarks against H.265, and currently we run significantly faster than the reference model. Our goal is to develop not just a reference model but also production-quality code that we can release and that can actually be used.
We want to do this in software, at full frame rate, from day one. We're not going to just hope that CPU performance will improve over time, so we are designing techniques with that in mind: the entropy coder, for example, is designed to work really well with SIMD. The techniques we're using have not been finalized, and there is a lot of optimization we have not done. x264 has gone through assembly implementations of some of the transforms and the motion search, and we haven't done anything like that, so there's still lots of room for additional performance. Thank you very much.