Breaking News Detection [DEMO #2]


Formal Metadata

Breaking News Detection [DEMO #2]
Title of Series
Number of Parts
Weller, Peter
Weichselbaum, Christian
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Numenta Platform for Intelligent Computing (NuPIC)
Release Date

Content Metadata

Subject Area
The Cortical.IO team demonstrates Twitter analysis.
we believe that the arms from the data and you will high and so
We are the Cortical.IO team, and this is the breaking news hack. The team is made up of [inaudible], Christian and myself. The agenda is: we describe the problem, the data we used, and our solution.

The problem is filtering news in Twitter feeds, with the goal of semantically detecting the topics in the feed and, if possible, new trends. As for the data: for training and testing purposes we had offline data feeds [inaudible]; this is a subset of the firehose from February 2013. For the live demo, as you can see here, [inaudible], so we use the real Twitter feed. I will hand over to my colleague to present the solution.

So this is the simple, minimalistic workflow we used to solve the problem. Apart from the Twitter feed access, we of course also need the HTM technology, and we combine it with the Cortical.IO API to get a semantic representation of the tweets. The idea is that we should be able to detect two different kinds of anomalies: one is the new-topic anomaly and the other is the new-trend anomaly.

This is a small graph of the anomaly scores for this experiment. We extracted 100 tweets from the same source, and here we can differentiate the two types of anomalies: the high scores are the new topics, and the lower scores are the new-trend anomalies. The difference arises because we chose a single source, so it is more or less speaking and writing about different but similar topics. When something appears that is seen for the first time, it is really new news, so the score is high. When people are retweeting, or just creating tweets about the same news, we can see the score value going down.

So this is an example of the highest anomaly value, and that is just because this is the first time that something related to this news appears in the stream. Here we can see that the first fingerprint is the reality and the second fingerprint is the prediction. What is happening there is that it can predict almost nothing, and the prediction is completely different from the input, just because the HTM has never seen this before in the stream.

Here we have an example of the other type of anomaly. The first two tweets are where we detect anomalies, and the other two are tweets that happened in the past in the same stream. We can also see that now the reality is much more similar to the prediction than in the case we saw before. And here is just a collection of all the new topics detected; we can clearly see that they are all about different topics [inaudible], and all from this BBC breaking news feed.

So now we pass to the live demo, and I will let my colleague continue with this. Please go to Twitter and tweet something using the hashtag [inaudible].
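The comparison between the "reality" fingerprint and the "prediction" fingerprint described above can be sketched in a few lines. This is a minimal illustration, not the team's code: it assumes each fingerprint is a set of active bit positions and that the anomaly score is the fraction of active bits that were not predicted, which is how NuPIC defines its raw anomaly score. The two thresholds and the function names are hypothetical and chosen only to show how high scores (new topic) and lower scores (new trend) could be separated.

```python
from typing import Iterable, Set

# Hypothetical thresholds, picked only for illustration; the talk does not
# state the values that separate the two anomaly types.
NEW_TOPIC_THRESHOLD = 0.8
NEW_TREND_THRESHOLD = 0.4


def anomaly_score(actual: Iterable[int], predicted: Iterable[int]) -> float:
    """Return the fraction of active bits in `actual` that were not predicted.

    0.0 means the input was fully predicted; 1.0 means nothing was predicted.
    """
    actual_bits: Set[int] = set(actual)
    if not actual_bits:
        return 0.0  # nothing active, so nothing can be surprising
    overlap = len(actual_bits & set(predicted))
    return 1.0 - overlap / len(actual_bits)


def classify(score: float) -> str:
    """Map a score to the two anomaly types named in the talk (assumed cut-offs)."""
    if score >= NEW_TOPIC_THRESHOLD:
        return "new topic"
    if score >= NEW_TREND_THRESHOLD:
        return "new trend"
    return "known"


if __name__ == "__main__":
    reality = {1, 5, 9, 12}
    print(classify(anomaly_score(reality, {40, 41})))       # never seen before
    print(classify(anomaly_score(reality, {1, 5, 40})))     # partly predicted
    print(classify(anomaly_score(reality, {1, 5, 9, 12})))  # fully predicted
```

With these assumed thresholds, a tweet whose fingerprint shares no bits with the prediction scores 1.0, and the score falls as retweets and follow-up tweets make the stream more predictable.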
[inaudible] So basically what we do here is take what the team developed and apply it to that live feed. At the bottom we see that it started to learn a bit: there were a couple of recurring tweets at the beginning, and that caused the really good predictions. Towards the end we see that still every tweet seems to be some sort of breaking news. At the moment we are at almost 50 tweets [inaudible] 51, 52, but it is still breaking news [inaudible]. The anomaly score is going down. I guess it would take a lot more tweets to learn from the stream. [inaudible] OK, so it's interesting.
[inaudible] Let's try. The only problem is that with the sample feed it is really slow; we would need the firehose for that. And from that 1% I am only selecting tweets that have English language assigned to them [inaudible].
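The language selection mentioned here can be sketched as a simple filter. This is an assumption-laden illustration: it supposes each tweet arrives as a dictionary decoded from Twitter's JSON, where the `lang` field carries the language Twitter assigned to the tweet. The function name `english_only` is hypothetical.

```python
from typing import Dict, Iterable, Iterator


def english_only(tweets: Iterable[Dict]) -> Iterator[Dict]:
    """Yield only tweets whose assigned language is English.

    Tweets without a `lang` field are skipped rather than guessed at.
    """
    for tweet in tweets:
        if tweet.get("lang") == "en":
            yield tweet


if __name__ == "__main__":
    sample = [
        {"text": "Storm hits coast", "lang": "en"},
        {"text": "Sturm trifft die Kueste", "lang": "de"},
        {"text": "no language assigned"},
    ]
    for tweet in english_only(sample):
        print(tweet["text"])
```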
OK, there are incoming tweets that match. [inaudible] So, what is going on here: we pick a tag, whatever you would like to filter on, and then we start capturing tweets, and after a few tweets it starts predicting. The only preprocessing is done with the Cortical.IO API, where we, for example, remove the most common words [inaudible] from the text.

[inaudible] really junky [inaudible] really hard. The initial examples from the BBC were really nice and clean.

There is also a small preprocessing step, which is basically removing "http", because this term is contained in the retina and would bias the meaning of all the tweets. The same goes for things like the retweet symbol and the "@". I think that is all the preprocessing that is needed.

We can see that, although we have only gone through five tweets, the HTM is already learning to predict, since they all seem to be pretty similar. [inaudible] In case there is some new news in there, we would get notified. [inaudible]
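The small cleanup step described above (dropping "http", the retweet symbol and the "@") could look roughly like the following. This is a sketch under assumptions, not the demo's actual code: it assumes that whole URLs and whole @-mentions are removed rather than only the symbols, and that the retweet symbol is the standalone token `RT`. The name `clean_tweet` is hypothetical.

```python
import re

# Assumption: the whole link is dropped, not just the "http" prefix.
URL_RE = re.compile(r"https?://\S+")
# Assumption: the retweet marker is the standalone, upper-case token "RT".
RETWEET_RE = re.compile(r"\bRT\b")
# Assumption: the whole @-mention goes, including a trailing colon if present.
MENTION_RE = re.compile(r"@\w+:?")


def clean_tweet(text: str) -> str:
    """Strip links, retweet markers and mentions, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = RETWEET_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "RT @BBCBreaking: Storm hits coast http://t.co/abc123"
    print(clean_tweet(raw))  # Storm hits coast
```

The cleaned text is what would then be sent on for the semantic fingerprint, so that tokens shared by nearly every tweet do not pull all fingerprints toward the same meaning.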

