Classification (13.01.2011)
Speech transcript
00:00
So, welcome to our lecture. We are right in the middle now, somewhere in the part where the data warehousing ends and the data mining starts. Last week we had sequence mining.
00:22
Yes, last week we were in the data mining part and were talking about time series: how data that is collected over some time span can be exploited for getting new knowledge, new information out of the data, looking at trends, predictions and all of that kind of stuff. I am not quite sure whether you have looked at some of the videos. For all the people who could not be with us for the first part of the lecture, there are videos on the course website, and you might want to refresh your knowledge about data warehousing and data mining. If there are any questions, I will be happy to go over all the questions that you might have. [unintelligible] We will be back with some nice lectures, the data warehousing one or the data mining one, which you might particularly like because it is very helpful in industry.
02:21
And this is what they care about. So today we want to talk about classification for a little while, and basically we will look into three major techniques for classification: first decision trees, then naive Bayes classification, and last, one of the most modern ones, support vector machines, which work very well. [unintelligible]
02:57
The basic point of classification: the idea is that you have a collection of records, whatever they may be, maybe transactions from your cash register or some other kind of data, and each record belongs to one or more classes. A class is a semantic characteristic that is shared by some of the records. So you could just say that all people who are male and all people who are female form the two classes of people, where each person is a record. What you want to do in classification is basically tell whether somebody is male or female by looking at him or her, without asking. In most cases you see that somebody is a man or a woman, but how you recognize it is that you basically take other attributes into account, maybe some heuristics. For example, women tend to have longer hair, which is surely wrong for some of us here, but two out of three is not too bad. [unintelligible] So you see that such a heuristic basically takes some of the attributes that you can see and tries to match them with what you know about typical types, and the more attributes fit into the picture, the better the classification will be. So basically what you do is define the class attribute as a function of the characteristics, of the attributes that you know of a case. And whenever you encounter a new record, a new entity, you need to assign this entity to one of the classes as accurately as possible. How do you do that? Well, basically these are learning algorithms. You have a setup where somebody says: OK, here are 100 people, and I tell you which of them is of which class. Then you have to figure out how this decision was made and what it could be based on. This is the notion of the training set: I train my model, my classifier, with some known entities. And then I have a test set, where I basically check how accurate my classifier is: how many of the test cases are classified correctly, and how many are wrongly assigned to some class? That is the basic idea. So you have a set of objects, and you divide them into the test set and the training set. With the training set you train the model, and then with the test set you test the model and evaluate its accuracy. And then you can iterate: take a new training set and a new test set.
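The train-and-test procedure described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names, the 30 percent test fraction, the toy records and the deliberately trivial majority "model" are all hypothetical and only illustrate the mechanics of splitting and measuring accuracy.

```python
import random

def split_train_test(records, test_fraction=0.3, seed=0):
    """Shuffle labeled records and cut them into a training and a test set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def train_majority(training_set):
    """Deliberately trivial 'model': always predict the most frequent training label."""
    labels = [label for _, label in training_set]
    majority = max(set(labels), key=labels.count)
    return lambda features: majority

def accuracy(classifier, test_set):
    """Fraction of test records whose predicted class equals the known class."""
    correct = sum(1 for features, label in test_set if classifier(features) == label)
    return correct / len(test_set)

# Hypothetical toy data: (features, known class label).
records = [({"hair": "long"}, "female")] * 6 + [({"hair": "short"}, "male")] * 4

train, test = split_train_test(records)
model = train_majority(train)          # the model only ever sees the training set
print(len(train), len(test), accuracy(model, test))
```

The important point is that the labels of the test set are known to the evaluator but are never shown to the training step.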
07:19
Now the question is whether this is useful, and of course it is. The first use was in the financial field. What people wanted to know about were processes such as the approval process for loans: somebody walks into a bank and says, "I am 54, I own a house, I now want to buy a second house, I earn so and so much money, will you give me a loan?" And somebody has to decide, not by gut feeling; you have to have some measures. Or you want to target some demographic: you are investigating some group of people, say the 20 to 30 year olds, and find out that your product is bought by everybody, only the 20 to 30 year olds do not buy it. Why is this group not buying? Is the product not valuable for them, or do they just not know about it? You would like to find out. Or fraud recognition: you classify credit card transactions as legitimate or fraudulent, where you suspect some foul play in the usage. That is basically what you do with classification algorithms, and this is why they are very useful in the real world. [unintelligible] Anyway, how does it work? You have a training set. The training set has a certain class label, saying that something belongs to a class or does not belong to a class, or, if you have a lot of classes, to which category it belongs.
09:44
And then the records: for example, this is a record. It has an ID, some identification number, and just values for some attributes, which may be yes/no values, may be numbers, may be whatever. And there is the correct class label for each of them. So what I do is try to figure out how this decision is interrelated with the values of the attributes. Is there some kind of connection between them? This is what the model is about. A good model could be: everybody who has "yes" in attribute 1 is of class X, and everybody who has "no" in attribute 1 is of class Y, for example, something like that. The model is trained by some learning algorithm. There are different kinds of algorithms; we will revisit three of them: the decision tree algorithms, the naive Bayes classifiers and support vector machines. From that you have an instance of the model. But of course you trained it with some fixed set of data. How would it work with unknown data? This is exactly what the test set is about, because here the class attribute is missing. I do know it, because I have taken it from the training data; usually you take part of the training set as a test set before training the model. But the algorithm does not know it. So the algorithm comes up with a conjecture, and then I, knowing the original labels of the test set, determine the correctness of my classifier. Is it a good model, a sufficiently good model, or does the model totally underperform? That is the basic idea behind it.
12:06
Let's go into an example. We have talked about credit approval. The first step will be learning. I have some records, and I already know what the decision was, basically because I sat next to the bank clerk, looked at the people walking in and wrote down what they are like: the name, the age, the income, whatever the clerk is interested in. Then the clerk makes the decision, and I just note the class attribute. I collect, say, 500 or a thousand cases, and then I have this table. Then I use a classification algorithm to understand somehow what happened in the head of the clerk and put it into classification rules. For example: if somebody is young, then the decision should always be "risky", because those people cannot deal with money and do not own anything worth mentioning, and stuff like that. Another rule may be: if the income is high, then the decision is "safe". This kind of rule gives me a way to deal with unknown data, because it reflects the decisions made just by the attributes: the income is high, the age is young, stuff like that, sometimes several of them combined.
13:53
The second step is the classification. I have my classification rules and now turn to new data. For example: Henry, middle-aged, low income, unknown decision. But the rule says that a low income is always risky, so the decision is "risky". This is basically what we want. Of course the intelligence lies in how I get the classification rules out of the training set and how effective they are, and this is what we are going to look into in a minute.
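Applying learned rules to a new record, as in the Henry example, might look like the following sketch. The rule set and its ordering are assumptions pieced together from the spoken examples (young means risky, high income means safe, low income means risky); the attribute names and values are hypothetical.

```python
def classify(applicant):
    """Apply hand-written rules in order; the first rule that matches decides."""
    if applicant["age"] == "young":
        return "risky"
    if applicant["income"] == "high":
        return "safe"
    if applicant["income"] == "low":
        return "risky"
    return "unknown"  # no rule fires for this record

# New data with an unknown decision, modeled on the spoken example.
henry = {"name": "Henry", "age": "middle", "income": "low"}
print(classify(henry))  # risky
```

Because the rules refer only to attribute values, they work on records that were never part of the training table.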
14:28
There are basically three kinds of learning. One is supervised learning: in supervised learning the training data comes with all the class labels, and new data is then classified based on the training set. That is the simplest kind of learning. There is also unsupervised learning, which is a little bit trickier, where there are no labels in the training set, so we have to find out by some other methods, clustering for one, how the training set can be grouped into sensible units. The class labels are basically unknown. That is the idea of clustering, which will be covered in next week's lecture. Today we will deal only with supervised learning, so we know the class labels in the training set and have to figure out how to build a good classification. There is a third possibility, so-called semi-supervised learning, where some of the labels are known and some are not, but we are not going to go into that. We will just cover the supervised classification case and the unsupervised learning case, clustering.
15:54
The prominent classification techniques are, on the one hand, decision trees, which are very popular because they yield exactly those classification rules, and the same holds for rule-based classifiers, where you just state some simple rules like "somebody is young, so the loan application should be rejected". A second possibility is a probabilistic approach, which is naive Bayes and Bayesian networks. Then there are the more complex ones, support vector machines, which we will go into at the end of the lecture, and last, neural networks. Those are especially tricky, and we will not go into them too deeply here, because nobody really knows what they do: the network gets some training and learns some rules and can classify most of the new data pretty effectively, but you do not get to know about the nature of the rules. You only know what the network looks like, and that does not help you too much in understanding the decision. The first one we look at today is the decision tree. A decision tree is basically a flowchart-like tree structure: you start at the root, and every internal node in the tree is basically a test on some attribute.
17:48
For example, the root node here tests the age. From every node there are branches, the edges of the graph, which tell you some restriction on the attribute. So for the age node the branches might be "smaller than 30", "31 to 40" and "larger than 40", which basically split up all possibilities for the age. They should of course be disjoint: there should not be several possibilities to go, because if somebody tells you his age, there should be exactly one branch to follow. At the end of a path there are the leaf nodes, which are the decisions themselves, the class labels. The internal nodes are always new tests. So depending on what answer you get at some internal node, you have different possibilities of asking new questions. For example, if you find out that somebody is between 31 and 40, he is a good candidate for a loan, no further questions asked. Or you find out that somebody has a high income and wants only a small amount of money: just hand it out, not many questions. That is basically the idea. But how do we find out what the decision tree should look like? We do not have the tree; we have the training data with the attributes and the class labels, nothing more. There are many algorithms for doing that. One of the earliest methods, which we will go into in just a minute, is Hunt's algorithm. The best-known method, which has some of the most widely used implementations in existence at this moment, is C4.5, a very well-known technique, which we will not go too deep into because it is somewhat complex, and the basic idea is very similar to Hunt's algorithm. So what we want to look at is Hunt's algorithm.
20:36
Then there is classification and regression trees, the CART algorithm, also one of the top data mining algorithms, and there are some other interesting techniques that we will not go into too deeply here.
20:52
So basically I want to talk about Hunt's algorithm for a moment. The general structure is that you have a set of training records, the data that reaches some node. What you do is split this data by the different classes that the records belong to. The node becomes a leaf with the class label if all the items already belong to the same class; then there is no point in splitting them any further. In the other case, where they belong to different classes, you have to split them such that objects with different classes fall into different groups. That is not always possible with a single split; you might need more. [unintelligible] So if a single split does not help you in classifying the data correctly, you have to apply the procedure recursively to the new data sets: find a new node, split it into the different classes, and look at what happens. Let's look at an example: these are records about
22:54
people getting tax refunds. These are tax records, and you have the ID of somebody, the marital status, the information whether he or she got a refund or not, and the taxable income. And there is another interesting column here: did he or she cheat or not? We know that some people did. I am not quite sure how you would know that they did cheat; maybe somebody checked and did find something. [unintelligible] So "cheat" will be our class attribute. What you do is first pick some attribute, for example the refund, and split by it. Getting the information about the refund has two possible outcomes, yes or no: either somebody got a refund or he did not. Now we look at the classes we find. For refund "yes" we have cheat "no" here, cheat "no" here and cheat "no" here. So we know: if you got a refund, you did not cheat. There is no need to split this further, because there is a single class label. What about the other alternative, refund "no"? Here we have "no" and "yes" mixed, so we have different classes and have to split them again. So we look at one of the other attributes; for example, we focus on the marital status. The people who got a refund are already dealt with; they are in the other branch, and we are not interested in them any more. For the rest we want to know about the marital status. We have married here, married here and married here, and all the others are single or divorced. The married people who got no refund all have "no", so this is a leaf and we do not have to split further. On the other side we have a mix: the divorced one has a "yes", and the singles have "yes" and "no". So a three-way split would not help; we put the single and the divorced into one set, and we have to work on it again. We have worked with the refund, we have worked with the marital status, and there is still one attribute left, the taxable income. So we take the singles and the divorced who did not get a refund and split them by the taxable income: there is 70K for one of them, and all the others have a higher income. For the small income they do not cheat, and all the others do cheat. What do we get out of this? People who did not get a refund, are single or divorced, which means they have time on their hands, and have a high income are bound to cheat next time. That is the basic idea. There are some questions in this that are a bit tricky: in which order did I choose the refund first and then look at the marital status? That is basically where some heuristics come in.
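The recursive procedure walked through above can be sketched as follows. This is a simplified illustration, not the lecturer's implementation: the function name `hunt`, the fixed attribute order (refund, marital status, income) and the ten records are assumptions. In particular, the income values are pre-discretized into "low" and "high" and were not read off the lecture slide, and the sketch uses one branch per attribute value instead of merging single and divorced into one group.

```python
from collections import Counter

def hunt(records, attributes, target="cheat"):
    """Grow a decision tree recursively in the spirit of Hunt's algorithm.

    Returns a class label for a leaf, or (attribute, {value: subtree}) for a test node.
    """
    labels = [r[target] for r in records]
    if len(set(labels)) == 1:      # all records share one class: make a leaf
        return labels[0]
    if not attributes:             # nothing left to test: fall back to the majority class
        return Counter(labels).most_common(1)[0][0]
    attribute, remaining = attributes[0], attributes[1:]
    groups = {}
    for r in records:              # one branch per attribute value seen at this node
        groups.setdefault(r[attribute], []).append(r)
    return attribute, {v: hunt(g, remaining, target) for v, g in groups.items()}

# Illustrative records (assumed values): refund, marital status, income, cheat.
rows = [
    ("yes", "single", "high", "no"),
    ("no", "married", "high", "no"),
    ("no", "single", "low", "no"),
    ("yes", "married", "high", "no"),
    ("no", "divorced", "high", "yes"),
    ("no", "married", "low", "no"),
    ("yes", "divorced", "high", "no"),
    ("no", "single", "high", "yes"),
    ("no", "married", "low", "no"),
    ("no", "single", "high", "yes"),
]
records = [dict(zip(("refund", "marital", "income", "cheat"), row)) for row in rows]

tree = hunt(records, ["refund", "marital", "income"])
print(tree)
```

On these records the refund "yes" branch and the married branch become leaves immediately, and only the singles without a refund need the income test, which mirrors the order of the spoken walk-through.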
29:55
The first one: it is a greedy strategy. You split the records on the attribute that optimizes a certain criterion. How do you do that? How do you choose whether to take the refund first, or the marital status first, or the taxable income first? Which one do you use as the first attribute for the decision? How do I specify the attribute test conditions, and how do I determine the best split? And of course: how do I determine when to stop splitting? At some point the only alternative is that the groups get smaller and smaller, and then you will be done, of course.
30:46
It basically depends on the attribute type. There are three types of attributes: the nominal attributes, which have some discrete number of values that are kind of categories, like married, single, divorced; the ordinal ones; and the others are continuous, which are numbers from some range. And of course there are also different ways of splitting: the binary split, which has only two branches, like "larger than 50K" and "smaller than 50K", and the multiway split: married, single, divorced would be a three-way split. There are different possibilities that might be useful, for example splitting a car type into family cars and so on, or an ordinal attribute into small, medium and large. So there are categorical and continuous choices, and binary and multiway splits can be used.
32:04
The subgroups can of course differ in size and in the homogeneity of the group. It might be that a split immediately gives you some objects that all have the same label, so there is a classification decision already, but the other groups are still heterogeneous in terms of the class labels. So what do you think is more useful: trying to find the best split as a split into roughly equal sizes, or getting class labels very early? What would you expect? [unintelligible] So you want a small tree, OK. And what is the advantage of that? Exactly: you do not have to ask the user so many questions. If you can split the group by asking two questions, and there is an alternative tree that splits the group by asking five questions, then of course the one asking only two would be preferred, because it saves you time and money. If you are the bank clerk, asking two questions or five questions makes a difference. On the other hand, getting labels early can be misleading, because you can always split off one record, which obviously has a single label, and stay with a very big set of all the remaining objects. Then you have one object with a single label, and that would be a very bad idea, because of the many cases that remain. So this reflects a decision that I have to take by looking at the tree. [unintelligible]
35:32
The binary and the nominal attributes are not too problematic. How about continuous attributes, which look like real numbers or something like that? That is a little bit more difficult, because you need some kind of discretization to form a categorical attribute out of them. You just have the ranges that are possible, and then you look at what falls into which range. This can be static: you decide once at the beginning, like "I cut this into three or into five intervals". Or it can be dynamic: once it comes to this attribute, I look at what data is left and then decide on the intervals such that they hold equal numbers of elements or cover equal parts of the space. Or it can be done as a binary decision: I just say, OK, there is some value in the middle, some expected value or median, and I take what is to the left of it and what is to the right of it. These are the basic possibilities for what you can do.
36:50
For the example of age as a continuous attribute, I can say something like: OK, is it bigger than 40, yes or no, which is a binary split. Or I can have a multiway split: smaller than 30, 30 to 40, larger than 40. Where exactly I define the boundary, whether it is exactly 40, depends on your data, depends on the class labels, or depends on some semantic knowledge. For example, people may buy alcohol only from 21 on, OK, so there may be some restrictions that you just put into the decision, where the semantics come in.
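The discretization options mentioned for continuous attributes (intervals covering equal parts of the value range, intervals holding equal numbers of records, and a binary cut in the middle) can be sketched like this. The function names and the sample ages are hypothetical. Each function can be run once up front on all data (the static variant) or at a node on whatever data is left (the dynamic variant).

```python
def equal_width_cuts(values, k):
    """Cut points that divide the value range into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def equal_frequency_cuts(values, k):
    """Cut points chosen so that each interval holds about the same number of values."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // k] for i in range(1, k)]

def median_cut(values):
    """Single threshold in the middle of the sorted values, for a binary split."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]

def interval_index(value, cuts):
    """Index of the interval a value falls into (a value equal to a cut goes right)."""
    return sum(value >= c for c in cuts)

# Hypothetical ages, for illustration only.
ages = [22, 25, 27, 31, 35, 38, 41, 45, 52, 60, 64, 70]

print(equal_width_cuts(ages, 3))       # [38.0, 54.0]
print(equal_frequency_cuts(ages, 3))   # [35, 52]
print(median_cut(ages))                # 41
print(interval_index(40, equal_frequency_cuts(ages, 3)))  # 1
```

Cut points that come from semantic knowledge, such as a legal age limit, would simply be passed to `interval_index` as a fixed list instead of being computed.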
37:46
But how do we determine the best split? That is the more problematic part. For example, we have the age, the income, the question whether somebody is a student, and the credit rating: which one do we use for the first split? What is usually done is that you look at the homogeneity of the classes. For example, you look at the income and at the different groups: for all the low-income persons you look at what the decision is, and here we have three "yes" and one "no". Then we do the same for the medium and the high incomes. And then we do it for a different attribute, for example the question whether somebody is a student. For the students who say "no" we look at the outcome and end up with three "yes" and four "no", and if we look at the students who say "yes", we have six "yes" and one "no". Which split is better? Should we first ask about the income, or should we first ask about the student status of somebody? Why? What we see here is that the sizes of the groups are about the same: seven people and seven people, and for the income four, six and four, which is also roughly the same. But if we split by student, we already have very strong evidence that somebody who is a student will buy the computer, with just a single exception. [unintelligible] So this is definitely helpful for the decision. But how do we determine that in general?
41:59
So the problem is really: what is the measure? What the algorithm prefers are homogeneous classes; that was exactly the argument. We want to avoid heterogeneity within a node. On the other hand, a class with just a single record is perfectly homogeneous but does not help much. So there is a trade-off between the size of the subsets of the split and the homogeneity of the subsets, and there are several measures for the purity of the decisions.
42:56
The most recognized one is the so-called information gain. C4.5 uses it, which is basically a more complex version of Hunt's algorithm. The attributes are assumed to be categorical. You can determine the information gain by the ratio between how many objects of some label are there as opposed to objects of the other label. It can also be modified for continuous attributes, which basically means that you discretize them and so make them categorical. Another name for the information gain is the Kullback-Leibler divergence; it is the same concept, just different names. As you might guess, the information gain that you have with six "yes" and one "no" is very high; if you have two "yes" and two "no", it is very low, because the impurity makes an information gain almost impossible. The second one is the Gini index. It assumes the attributes to be continuous-valued and assumes that there exist several split possibilities for each attribute, and then you can find out the ratio between the numbers of objects on each side of each possible split. On the other hand, with these continuous attributes you might need some clustering to find out the possible split values. Let's look at the information gain, assuming just two classes.
44:59
We have a set S of examples with p elements of class P and n elements of class N. Intuitively speaking, the notion of information is this: the clearer the decision becomes, the more of one class you have in your set, the less information you need. So the amount of information that you need to decide whether some example belongs to P or N is defined by this nice-looking formula. Basically, this part estimates the probability that the label is P: you count how many P's there are and normalize by how many objects there are in total, p plus n. And on the other hand there is the probability that something is N: you count the N's and again normalize by the total count. The factors here are normalization factors, so it does not matter whether the data set is large or small. The major parts of the information are the probability that the label is P and the probability that the label is N. The further they drift apart, the better it is for the information. Many of type P and few of type N, or many of type N and few of type P, is a very good thing; roughly the same number of both is a very bad thing to happen. That is the basic idea of information, and this is how you use it in the decision tree.
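The two-class information measure described here is the standard binary entropy, I(p, n) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n)). A minimal sketch, assuming that the formula on the slide is this standard definition (the function name is hypothetical):

```python
from math import log2

def information(p, n):
    """Bits needed to decide between class P (p examples) and class N (n examples)."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:  # an empty class contributes nothing (0 * log2(0) is taken as 0)
            fraction = count / total
            result -= fraction * log2(fraction)
    return result

print(information(4, 0))            # 0.0, a pure group needs no further information
print(information(2, 2))            # 1.0, an even mix is the worst case
print(round(information(6, 1), 3))  # 0.592
```

A group with six "yes" and one "no" is therefore much closer to a decision than a group with two of each.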
47:36
You choose an attribute A. If you split along this attribute, you end up with subsets. Each subset contains some items of class P and some of class N. Then you can look at the entropy, which is basically defined via the impurity value that we just had, summed up over all the different subsets: you sum up the impurities of the different subsets of the split. The split having the best value, where the subsets have the lowest impurities, is the best split for the tree. And then you define the information gain as the basic impurity minus the entropy. So the bigger the entropy gets, the worse the split is. The entropy is usually seen as the amount of disorder in the system. It is very clear what the class label is if all items are the same: the order is very high. If it is a mixed set, then the order is low. Has everybody understood the concept of information gain and impurity? OK.
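Written out, the quantities described in this segment take the following standard form. The symbols are assumptions chosen to match the spoken description: S is split by attribute A into subsets S_1 to S_v, subset S_i holds p_i examples of class P and n_i of class N, and the weight in front of each term is the relative size of the subset.

```latex
I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}

E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)

\mathrm{Gain}(A) = I(p, n) - E(A)
```

A split whose subsets are all pure has E(A) = 0, so its gain equals the full impurity I(p, n) of the unsplit set.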
49:35
We can do it, for example, here: we select an attribute by information gain computation. We have the decision whether somebody buys a computer, yes or no. First we compute the general impurity, which basically tells us whether the labels are rather ordered or rather evenly distributed. We can see it already: there are nine "yes", one, two, three, four, five, six, seven, eight, nine, and there are one, two, three, four, five "no". Nine to five, that is about two thirds, something not too impressive, and if you put it into the formula you end up with 0.94, which is the general impurity. This is the hardness of the decision. If it were zero, it would be a very simple problem, because whatever decision rule you make, you would always say "yes"; that is easy. But in this case we have some impurity. Now we have to look at the different attributes. For example, take the attribute age, decide on some partition, discretizing it, and count how many positive and how many negative labels there are in each group. We have 2 and 3, 4 and 0, and 3 and 2. So for the under-30s and the over-40s it is about half and half, and for 30 to 40 it is very clear: they buy the computer.
51:40
OK. What is the impurity of each subgroup? We get it again by putting the counts into the same formula we have seen before: we get 0.97, or 0.97 here, and where the decision is clear the impurity is 0. Now look at the entropy of the split: it again works with the impurity, summed up over the different subgroups, and there is a normalisation factor for each subgroup. The factor is the size of the subgroup divided by the size of the total group. A very small group that is impure is not so bad, whereas a very big impure group should get some more attention, and that is exactly what this factor is about. Every subgroup is weighted by its importance: we take the impurity of each subgroup, weight it, and add it all up. For the attribute age, with 14 cases in total, the factor for the first group is 5 out of 14, the second group is slightly smaller with just 4 out of 14, and the last group is again 5 out of 14. The cases are pretty much evenly distributed, the groups have roughly the same number of cases, 5, 4, 5, so the factors are about the same. Now look at the impurity of each group: this one has 0, and these two have 0.97. The total entropy that you incur by splitting in this way on the attribute age is 0.69. What we need to do now is find the gain. The information gain is the general impurity minus the entropy of the split. The general impurity was 0.94, minus 0.69, which gives about 0.25. This is the information gain that we get from the split. Not too bad. We then calculate the same for income, student and credit rating, and find out that their information gain values are much smaller, for example 0.048. The pure group affected this very strongly, because its impurity of 0 cancels its factor completely. So the rationale is that we need splits that produce pure groups. The purer the groups are, the less entropy the split will incur, and the more information we gain. That's the basic idea behind it.
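The calculation for the age split can be retraced in a few lines. This is a minimal sketch, not the lecturer's code: the function names are made up for illustration, and the yes/no counts per age group (2/3, 4/0, 3/2) are assumed from the standard 14-record buys-computer table, which agrees with the group sizes 5, 4, 5 and the 9 buyers mentioned in the lecture.

```python
from math import log2

def entropy(counts):
    """Impurity of a group, given its class counts, e.g. (yes, no)."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def information_gain(parent, groups):
    """General impurity minus the size-weighted impurity of the subgroups."""
    total = sum(parent)
    split_entropy = sum(sum(g) / total * entropy(g) for g in groups)
    return entropy(parent) - split_entropy

# Assumed counts (yes, no): 9 buyers and 5 non-buyers in total,
# split by age into young, middle-aged and senior.
parent = (9, 5)
age_groups = [(2, 3), (4, 0), (3, 2)]

print(round(entropy(parent), 3))                       # general impurity
print(round(information_gain(parent, age_groups), 3))  # gain for age
```

The pure middle-aged group contributes nothing to the weighted sum, which is why the split on age comes out so well.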
56:08
So since age provides the highest information gain, it becomes the splitting attribute.
56:16
OK.
56:18
It becomes the splitting attribute, and we might attach labels to the branches: somebody is at most 30, so these are the young people; 31 to 40, the middle-aged people; and the rest are the senior people. Then you forget about the attribute age and just consider the rest of the attributes, income, student and credit rating, because it doesn't make any sense to split twice on age. So we have split the total set into the different groups, and you can already see a group that has a clear class, directly from the path: if somebody is middle-aged, he will definitely buy a computer. That's yes, yes, yes and yes, and no noes. For the other groups we again calculate the information gain for the remaining attributes, and again and again, until we are done.
57:51
So whenever we have a group where all records have the same label, it's safe to assume that this is a perfect decision, and there is nothing left to do.
58:06
This is one stopping rule. You can also say: OK, I stop if the records have similar labels for the most part, so that I don't make too many mistakes. For example, you may still have a fair credit rating and an excellent credit rating mixed in a group, but that doesn't matter; you take the majority label.
58:35
Using the decision tree works just the other way around. Once you have all the splitting nodes and the decisions that are based on the splitting attributes, you take the new data and let it sift through from the root: the first decision takes you to the next node, and at some point you get a decision from the leaf you happen to end up in. You just take the label from the leaf node and attach it to the record. Since all the splits are disjoint, there are no different ways through the tree: there is just a single leaf node where you will end up, and there are also no cycles. OK. And with that, here is a new concept for this lecture, which is called the detour. The detour is the exercise part of the lecture: you just sit back, relax and let things happen, and you get some good insight from practical examples and some practical information about how it can be done. We are now going deeper into the things that have just been talked about, so we will continue the example we have seen. We have spoken about two steps. The first is the induction. The input for the induction is the training data set, the one where we know not only the attributes but also the class, so everything is given. We need this data for training, to see what the data looks like in order to understand how the classes should look. The output of this first step is the decision tree. The second step is the deduction. With the deduction we are able to predict the classes for a data set. In order to test how well our tree works, we can take data that was classified by hand beforehand and then see how well the decision tree has performed against the human-classified data.
1:01:27
In the second step the decision tree is the input, and the output is the class of each record.
1:01:36
So this is what our data set looks like. It is the example we have seen: who buys a computer and who doesn't buy a computer. We have four attributes, and we have the class, which is either yes or no.
1:01:52
But we go further. We have already calculated the various information gains for the attributes, and we have already learned that we want to choose the best attribute at each moment. So we go with age for the first split, and this gives us three branches: young, middle-aged and senior. This one is already a class, because all its records have the same label. For the other two it is not clear yet; we don't know what will come out, so let's investigate.
1:02:39
OK, the decision is first made on the age, and the young people come into question. So age is not an attribute any more; the attributes I have right now are only income, student and credit rating, and all of the records are young people, as you can see. I first compute the impurity as we discussed before. Before, we had 14 records; now we have 5, and not all of the records belong to the positive class. Then I go on to the entropy, starting with the first attribute, income, again with a normalisation factor for each subgroup. For low income the factor is 1 out of 5. For medium income, as you can see, the group is quite impure, because we have one yes and one no, not just 1 and 0 or 0 and 1, so it has a high impurity. The last case is high income, whose factor is again 2 out of 5, and it is quite pure. I sum up the weighted impurities into the entropy, subtract it from the impurity, and get the information gain for splitting by income. I do the same for student and for credit rating. For student this is quite intuitive, as you can see here: if he is not a student, he is not going to buy a computer, so he belongs to the no class; if he is a student, he belongs to the yes class. So things are quite clear, and this is also expressed by the information gain: it is a very high information gain. I could stop here, because there is not much more information to gain, but I can also calculate the information gain for credit rating. So just remember the value for student, and compare it with the information gain for credit rating here.
1:06:18
And it is again quite small. So the largest gain was for the student attribute. OK, this is what I'm going to split on in the young branch, and the decision tree now looks like this: I've decided first on age, and when it comes to data belonging to the young branch, in the end I'm asking myself: is he a student?
1:06:46
Here is the data from before, displayed after this split. As you can see, both groups are completely pure, so I don't need to do anything after this split. I can label each of them with its class, and I can also close this branch.
1:07:11
OK, so then I go back to the age and consider the middle-aged class. This is again a pure branch, so I don't want to go any further and don't need to do anything. Then I have the last one, the branch of the senior people, with the corresponding data, and I'm going to perform the same process. So again I compute the impurity, and then the entropy for income, which has the values low and medium for these 5 records.
1:07:47
For low, one is positive and one is negative; there are three times medium, and there is no high. So again the entropy is pretty big and the gain is quite small. I do the same for student and the same for credit rating.
1:08:04
And I see that credit rating has the best gain. Therefore I decide to split by credit rating. It's quite simple to continue the tree by building the split by credit rating. Remember, we are in the senior branch, so we go into the senior branch and split the data into the two subsets for credit rating, fair and excellent.
1:08:33
And we get two new subsets. This example is pretty easy: this leads to pure leaf nodes, which we can already label, and with that the tree is complete.
1:08:50
So the induction phase already finishes with this tree; this is how it looks. And now we can perform the deduction, the second step.
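The whole induction phase can be sketched as a short recursive procedure: pick the attribute with the highest information gain, split, and repeat on each impure subgroup without reusing the attribute. This is only an illustrative sketch. The 14 records below are an assumption: they are the standard textbook buys-computer table, which matches every count mentioned in the lecture, and the names `ROWS`, `gain` and `build` are invented for this example.

```python
from collections import Counter
from math import log2

ATTRS = ["age", "income", "student", "credit"]
# Assumed training set: (age, income, student, credit rating, buys computer)
ROWS = [
    ("young", "high", "no", "fair", "no"),
    ("young", "high", "no", "excellent", "no"),
    ("middle", "high", "no", "fair", "yes"),
    ("senior", "medium", "no", "fair", "yes"),
    ("senior", "low", "yes", "fair", "yes"),
    ("senior", "low", "yes", "excellent", "no"),
    ("middle", "low", "yes", "excellent", "yes"),
    ("young", "medium", "no", "fair", "no"),
    ("young", "low", "yes", "fair", "yes"),
    ("senior", "medium", "yes", "fair", "yes"),
    ("young", "medium", "yes", "excellent", "yes"),
    ("middle", "medium", "no", "excellent", "yes"),
    ("middle", "high", "yes", "fair", "yes"),
    ("senior", "medium", "no", "excellent", "no"),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain(rows, i):
    """Information gain of splitting rows on attribute index i."""
    split_entropy = 0.0
    for value in {r[i] for r in rows}:
        sub = [r[-1] for r in rows if r[i] == value]
        split_entropy += len(sub) / len(rows) * entropy(sub)
    return entropy([r[-1] for r in rows]) - split_entropy

def build(rows, attr_idx):
    labels = [r[-1] for r in rows]
    if len(set(labels)) == 1:          # stopping rule: group is pure
        return labels[0]
    if not attr_idx:                   # nothing left to split on
        return Counter(labels).most_common(1)[0][0]
    best = max(attr_idx, key=lambda i: gain(rows, i))
    rest = [i for i in attr_idx if i != best]   # never split twice on one attribute
    return {ATTRS[best]: {v: build([r for r in rows if r[best] == v], rest)
                          for v in sorted({r[best] for r in rows})}}

tree = build(ROWS, [0, 1, 2, 3])
print(tree)
```

On this table the procedure splits on age first, then on student in the young branch and on credit rating in the senior branch, as in the walkthrough above.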
1:09:05
Here we have new data. This can either be a test set, where we already know the class, or data where it is actually unknown. We have people aged between 31 and 40, young people and senior people, with different incomes, students and non-students, and different credit ratings. Just imagine this is a survey conducted by a big retailer for its marketing department, which just wants to know its chances: on a big data set collected from one area, should it open a shop there? It would like to see how many computers it could sell, given the demographics of the area and the income. It is not about whether one particular person buys or not; I just run the decision tree and see what the probability that people will buy a computer looks like. What I'm doing is this: I question the first record about the age. You see that his age is between 31 and 40, so he is middle-aged, which means he will buy a computer. This is what my decision tree says, so I just fill in the class. Then the next one: he belongs to the young class, so I'm here, and then I ask myself if he is a student. I go to the student field and see that he is a student. Great, he will buy a computer according to the decision tree. The same for the third one: he is young again, he is a student, so it is the same case. In the case of a senior, I'm not going to ask about the income or about the student attribute; I look at the credit rating and see fair. I follow that branch to its leaf, and the label I find there is what I'm writing down. It's easy. This was just a small data set to show the results for the decision tree, but we would perform the same with larger data sets. The last step that is still needed is to extract the classification rules from the tree, and the rules are always of the kind if-then.
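The deduction step and the extraction of if-then rules can both be sketched as simple walks over the finished tree. This is an illustrative sketch only: the nested-dictionary layout and the function names are invented here, and the two leaves under the senior branch (fair gives yes, excellent gives no) are assumed from the standard textbook table rather than taken from the recording.

```python
# Assumed tree from the induction phase, written as nested dictionaries.
TREE = {"age": {
    "young": {"student": {"yes": "yes", "no": "no"}},
    "middle": "yes",
    "senior": {"credit": {"fair": "yes", "excellent": "no"}},
}}

def classify(tree, record):
    """Let a record sift from the root down to exactly one leaf."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[record[attribute]]
    return tree

def rules(tree, conditions=()):
    """Turn every root-to-leaf path into one if-then rule (a conjunction)."""
    if not isinstance(tree, dict):
        lhs = " and ".join(f"{a} = {v}" for a, v in conditions)
        return [f"if {lhs} then buys_computer = {tree}"]
    attribute, branches = next(iter(tree.items()))
    out = []
    for value, subtree in branches.items():
        out += rules(subtree, conditions + ((attribute, value),))
    return out

new_record = {"age": "young", "income": "low", "student": "yes", "credit": "fair"}
print(classify(TREE, new_record))
for rule in rules(TREE):
    print(rule)
```

Because the branches of every split are disjoint, `classify` always ends in a single leaf, and the number of rules equals the number of leaves.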
1:11:57
The rules are always of the kind if-then: if somebody is young, then he will buy a computer; if somebody is young and married, or somebody is old and married, then he will buy, and stuff like that. So basically each path in the decision tree represents a certain rule. The path gives you the decisions along the way, for example the knowledge about the age: somebody is young, or somebody is middle-aged. The next node may tell you whether somebody has a high income or a low income, or something like that. Following a path through several nodes just means a conjunction: if somebody is young and the income is high. That's basically the idea behind it. If you get rules of this kind, it is very easy, for example for new employees, to decide which class somebody belongs to. For example, from the tree here: if somebody is young and he is not a student, then he will not buy. It's quite simple: you just follow the path to find the correct label. What are the advantages of decision trees, and why are they used very often? They are very inexpensive to construct. Basically, we tested the different attributes for their information gain and took the split with the highest information gain
1:13:49
and then ended up pretty quickly with a decision tree. It also allows for extremely fast classification: I don't need to calculate anything, I only have to check the attribute values along a path, which is pretty easy to do. And once the tree is small-sized, it is also very easy to have some semantic understanding of what it actually means. You can interpret it: for example, students need a computer for working, so it comes as no surprise that young students buy a computer. Or it comes as no surprise that people are more creditworthy when they are married, because then they have already shown commitment, or something like that. You can find interpretations for all these things, which is kind of an interesting thing to do. And the accuracy that you get is, compared to other classification algorithms, very good. So it is a very useful way of splitting the data. One of the dangers, and it is a danger in all classification, is the so-called overfitting. If you split
1:15:17
the groups until they all have the same labels, then you might end up in a situation where, in all likelihood, the eye colour has something to do with buying a computer, because your training data set just said that among the young, married, high-income people there was one who did not buy a computer, and exactly that one had green eyes, so it must be the green eyes. That is far too clever by half. If there are too many branches, some of them may just reflect anomalies that you don't want in your decision tree, and when you apply it to unseen examples, the outcome might be nonsensical. Obviously, the eye colour has nothing whatsoever to do with computers or buying computers. So this kind of sanity check is still needed, and there are basically two techniques to prune the tree. Once it becomes too bushy or too deep, you might want to snip off some of the branches, because extending the branches is really not valuable for getting the last bit of accuracy out of it. On the one hand there is the prepruning approach, where you halt the tree construction at an early stage: if the information gain from splitting the group again falls below some threshold, you just don't do it. How to choose an appropriate threshold is the problem; you might find out only afterwards what the semantics of the rules actually are. The second possibility is postpruning. Prepruning works already during tree construction; in postpruning you look at the fully grown tree once it comes out, with everything distributed, and then you progressively prune from the back and say: this subtree is so small that we can just snip it off. You take a set of data different from the training set, a test set, apply the tree, and check which parts of the tree are vital and which are really nonsensical, overfitted parts of the tree. Those are the two techniques usually used for pruning.
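Prepruning can be sketched as a simple check before each split: only split if the information gain reaches some threshold. This is a minimal sketch under stated assumptions: the threshold value of 0.05 is arbitrary and purely illustrative, and the class counts for the young branch are assumed from the standard textbook table.

```python
from math import log2

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def information_gain(parent, groups):
    total = sum(parent)
    return entropy(parent) - sum(sum(g) / total * entropy(g) for g in groups)

def should_split(parent, groups, min_gain=0.05):
    """Prepruning: halt tree construction if the gain is below the threshold.

    min_gain is a hypothetical value; choosing it well is the hard part.
    """
    return information_gain(parent, groups) >= min_gain

# Young branch, counts as (yes, no): 2 buyers and 3 non-buyers.
young = (2, 3)
by_student = [(2, 0), (0, 3)]   # students, non-students
by_credit = [(1, 2), (1, 1)]    # fair, excellent

print(should_split(young, by_student))  # worthwhile split
print(should_split(young, by_credit))   # gain too small, do not split
```

Postpruning would instead grow the full tree first and then remove subtrees that do not help on a separate test set.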
1:18:03
There are some enhancements of the algorithm. Some allow for continuous-valued attributes, and some handle missing values. Say you have a missing value for the age in the first record: what do you do? You cannot use the decision tree, because the root node is concerned with the age, and you don't know which way to go. You could try several possibilities, assuming one of them is correct, and take a majority vote. There are several ideas about how to manage that. Missing values are usually a big problem with decision trees, because if a value is missing along the path, it becomes rather unclear what you should do and which branch you should take. Another enhancement is the construction of new attributes: you just put together some of the old attributes and aggregate them, and the new aggregated attribute gives a better decision. A typical example is the body mass index, which is put together from several measurements of your body and tells you whether you might have health problems, although none of the single characteristics could be used on its own. On the other hand, just combining the attributes may bring its own problems. That's one way to do it.
1:19:43
Now, before we go on, let's take a short break, and then we will continue with naive Bayes in the second part of the lecture. The second type of classification that I want to present is a probabilistic approach, Bayesian classification. What you do is this: you have some hypotheses, and a hypothesis says which class an item belongs to: this person buys a computer, this person does not buy a computer. Then we compute the probability that this hypothesis is true, and of course you take the label with the highest probability. That should give you sufficient quality. You train a Bayesian classifier again on the training set, where you use the frequencies of the items to see how much proof you have for each hypothesis. You can also do it incrementally: whenever some customer comes to your shop, you learn something new and use that to adapt your probability model, and over time you find out which hypotheses are correct and which hypotheses still have problems. You can predict a single label or multiple labels. You might not want to decide for a single label when it is something like fifty-fifty between buying a computer or not; then of course you weight the hypotheses by their probability rather than deciding for a single one. It is called Bayesian classification because of Thomas Bayes, actually a preacher from England, who invented many of the statistical ideas behind conditional probabilities, which led to what is called Bayes' theorem. The basic idea is that you have the persons who buy a computer.
1:22:38
This is what you are interested in. The persons who buy a computer can be split into those who are students, those who are young people, those who have an excellent credit rating, and stuff like that. So you can find subclasses of this class of people buying computers that have certain characteristics, and I can define probabilities on these classes. How probable is it that somebody buys a computer, without knowing anything else about the person? I use a frequentist approach and say: let's take all the people in the world and look at how many of them bought a computer, and that is exactly the probability that some random person buys a computer: the number of people who bought one, divided by the number of people in total. OK. Now take some smaller subset of the persons who bought a computer, the students. Again we can calculate probabilities: the probability that somebody is a student is just the number of students divided by the number of people in the world. And we want the probability that somebody is a student when he bought a computer, so the subset inclusion. This is a conditional probability: under the condition that somebody already bought a computer, how probable is it that he or she is a student? I think everybody knows about conditional probabilities. Basically you use a conjunction here: it is the number of people who bought a computer and are at the same time students, divided by the number of people who bought a computer. The same goes for the conditional probability of students among the people who did not buy a computer. And all these probabilities can be directly estimated from the training set: you count the number of people in total and find out how many of them are students, how many of them are middle-aged, how many of them have an excellent credit rating. For small samples you would probably need some smoothing to get to serious probabilities.
1:25:37
OK, so now we need to classify a new record. Somebody comes into the shop, and he or she is a student, has a certain age, and has a certain credit rating. What you want to know is not the subset of people who bought a computer; you want to know whether this person is bound to buy a computer. So you basically go the other way around: the label is what you are looking for. You want to find out whether the hypothesis is true, given all the information you have about the person. And here Bayes' theorem comes into the game. It basically says that the probability of some event A under the condition that B has already happened is exactly the same as the probability that B happens under the condition that A has already happened, times the probability that A happens on its own, divided by the probability of B. This is Bayes' theorem, one of the pillars of statistics, and very, very useful. What can we do with it? Let's look at the test set.
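Written out, the theorem as stated above reads as follows. The second line applies it to the shop example; the symbols $C$ for the class (buys a computer or not) and $X$ for the observed attribute values of a customer are notation introduced here for illustration.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\qquad\text{and so}\qquad
P(C \mid X) = \frac{P(X \mid C)\, P(C)}{P(X)}
```

The terms on the right-hand side are the ones that can be counted from the training set, which is what makes the turned-around form useful.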
1:27:20
And we have a new record: a young student with a fair credit rating comes into my shop, and I want to know, based on what can be learned from the past, whether this person is bound to buy a computer or not. Do I have to take my one wonderful computer salesman and set him on the track of this guy, or is that a waste of money and time? This is what I want to find out. So, under the condition that somebody is a student, is not middle-aged but young, and has not an excellent credit rating but a fair one, what is the probability that he or she buys a computer? On the basis of Bayes' theorem I can calculate this probability. And now comes the interesting part, because those items can be gained from the training set: I find out how many people as a whole bought a computer, and how many people are students with a fair credit rating and young. OK, so we still have these conjunctions.
1:28:57
They are very hard to handle: what is the probability that all three things occur at the same time? And this is why it is called naive Bayes: we are just assuming that the events are statistically independent. Because what happens to the probability of a conjunction of events when the events are statistically independent? I can basically compute the probability of the conjunction, in the case of statistically independent events, as the multiplication of the individual probabilities. That allows me to split the events. So I am just assuming they are statistically independent, which makes the computation easier, and this is what is called naive Bayes. It is kind of naive, but it is a very common assumption. It is wrong as a matter of fact, because the credit rating and the age might well have something to do with each other, so it is not highly sophisticated, but we don't care, and the errors it introduces are usually not much trouble. The way to do it is by taking this conjunction and just multiplying the individual probabilities, under the assumption of statistical independence. So how do we classify a new record, that is, get the class, say somebody buys a computer, for this new record? I just compute the probability for each class separately, for each possible label, and then assign the label with the highest probability. That is the idea behind naive Bayes. Now let's go into the details.
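In formulas, the independence assumption and the resulting decision rule look like this. The notation is introduced here for illustration: $x_1, \dots, x_n$ are the attribute values of the new record and $C$ ranges over the possible labels.

```latex
P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)
\qquad\Longrightarrow\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```

The denominator $P(x_1, \dots, x_n)$ from Bayes' theorem is dropped in the decision rule because it is the same for every label and does not change which label wins.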
1:31:40
OK, so we go back to the previous example from the decision tree and use the same case with naive Bayes. We have the same database with persons buying computers and the attributes age, income, student and credit rating, and we know the class for the whole training set. Now we count the people buying computers and the people not buying computers, and compute the frequency compared to the total number of records: 9 persons out of 14 have bought a computer, and 5 have not. Then we take each of the attributes and calculate the frequencies per value, since, thanks to the statistical independence, we need them later so that we can multiply them. For age we again have the three subgroups, and for each of these subgroups we count how many persons have bought a computer, which is the positive case, and how many have not, which is the negative case. How many young persons have bought a computer? 2 out of the 9 positives. For the negatives, there are 5 negatives in total, and 3 out of those 5 are young. The same happens for the middle-aged persons: 4 out of 9 bought a computer, and 0 out of 5 did not. For senior it is the same procedure, and I won't discuss it any further. So we are done with the age and go on with the income.
1:34:12
The income has the subgroups low, medium and high. We count how many out of the 9 buyers have a low income, and 1 out of the 5 non-buyers has a low income; then how many out of the 9 have a medium income, and how many out of the 5; and so on. The same is done for the student attribute and for the credit rating, and this completes the table.
1:34:41
I'm interested in the following record: a young person with low income who is a student and has a fair credit rating. Has he bought a computer? I want to compute the probability that he has bought a computer under the condition that he is young, has a low income, is a student and has a fair credit rating, and the probability that he would not buy a computer under the same condition. How can I calculate each of these? Due to Bayes' theorem, the left-hand side can be turned around into the right-hand side: the probability that somebody is young and so on, given that he is about to buy a computer, times the probability that somebody buys a computer in the first place, divided by the probability that somebody is young and so on. So far this has nothing to do with our case; this is just Bayes' theorem. Let's go back one slide.
1:36:19
Again, this is the probability of A given B, and the other part was the probability of B given A: the probability that somebody buys the computer given whatever we know about him. OK.
1:36:36
We just take the right-hand side. This part becomes the multiplication here, due to the statistical independence. And the denominator we can ignore anyway, because it is the same for both classes.
1:36:56
And this comes down to the multiplication of the values we have already calculated, because we have already counted how many young persons we have who bought the computer, how many persons with low income we have who bought the computer, and how many students we have who bought the computer. This is the positive case, as I said, and then again the same for the negative case. And what we actually have to do is just compare the two numbers and say: OK, in this case this one is bigger. It is more probable that he buys the computer, so we label him as yes.
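The comparison of the two numbers can be retraced with exact fractions. This is a sketch under stated assumptions: the priors 9/14 and 5/14 and the age counts 2/9 and 3/5 are the ones from the lecture, while the remaining conditional frequencies are assumed from the standard textbook table. The dictionary layout and function names are invented for illustration.

```python
from fractions import Fraction as F

# Class frequencies in the training set.
PRIOR = {"yes": F(9, 14), "no": F(5, 14)}

# Conditional frequencies P(attribute value | class); partly assumed, see above.
LIKELIHOOD = {
    "yes": {"age=young": F(2, 9), "income=low": F(3, 9),
            "student=yes": F(6, 9), "credit=fair": F(6, 9)},
    "no":  {"age=young": F(3, 5), "income=low": F(1, 5),
            "student=yes": F(1, 5), "credit=fair": F(2, 5)},
}

def score(label, evidence):
    """Prior times the product of the individual conditional frequencies.

    The common denominator P(X) is left out, since it is equal for both labels.
    """
    result = PRIOR[label]
    for item in evidence:
        result *= LIKELIHOOD[label][item]
    return result

def classify(evidence):
    return max(PRIOR, key=lambda label: score(label, evidence))

X = ["age=young", "income=low", "student=yes", "credit=fair"]
for label in PRIOR:
    print(label, float(score(label, X)))
print(classify(X))
```

The scores are not probabilities themselves, because the denominator is missing, but their ratio is the ratio of the two posterior probabilities, which is all the comparison needs.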
1:37:56
In summary: since the probabilities are based on counts summed up over the whole data set, naive Bayes is very robust to isolated noise. One outlier doesn't change the probabilities much. It also handles missing values very well, by just ignoring the instance. As opposed to the decision tree, it is not necessary to make a decision for each and every attribute; you just need to determine some basic and very general probabilities, so missing values are not really a problem. Irrelevant attributes also don't affect the result, because effectively they are not part of the computation: if it is totally irrelevant whether somebody is young or old, it will be totally irrelevant both for the positive, computer-buying people and for the negative, non-computer-buying people. So you would have the same probability multiplied into each of the numbers that you use for classification. So this is very robust. The problem with naive Bayes is the independence assumption, which may not hold, and very strange things can happen once attributes depend on each other: then the whole computation is wrong and the classification goes totally wrong.
1:39:51
If you suspect that this happens, look at the data in semantic terms, and if you find something that is really correlated somehow, say older people also have higher credit ratings or something, you have to put it together in one way or the other. That is how naive Bayes works, and again we see the power of Bayes' theorem, which is very useful also in classification. The last algorithm for today is the support vector machine. The support vector machine is a wonderful thing. Basically it just does binary classification, so you have only two classes: yes or no, buys a computer or doesn't buy a computer, spam or non-spam, relevant document or irrelevant document, and stuff like that. You have a vector representation where each dimension of the vector corresponds to some attribute, so you might have an axis for the age, an axis for the credit rating, and so on, and the number in the vector shows what value you actually have. That gives you a d-dimensional real vector. The goal for the support vector machine is to find a linear classifier, a hyperplane that divides the d-dimensional space into two parts such that the data samples are separated: the positive ones on one side of the hyperplane, the negative ones on the other side.
1:41:42
It's easy. The task is: I have some positive examples and some negative examples, the negative examples here and the positive ones over there, and now my support vector machine has to draw some line, or hyperplane, between them. In two-dimensional space it is a one-dimensional hyperplane that separates them. I may do it like this, which is a valid separator. I may also do it like that, which is also a valid separator. And I may also do it like this, which is also a valid separator, but probably not as nice. And now we have the same problem that we had with the decision trees: we have to figure out what makes the niceness of a separator, what makes a good separator, what makes a bad separator, and which separator we need. So this one works, but it is not too impressive, and this one is kind of nice. We find all kinds of linear classifiers and have to find out which are the best ones. One idea is that we could measure the quality of a linear classifier by the margin around the border of the classifier. So for example, take this classifier, which is a valid one.
1:43:33
The distance of the points with respect to the classifier should be rather big. OK? So if I have kind of a big demilitarised zone over here, that would be a good idea, because when new examples come in, chances are that they will come in somewhere close to where the old ones already were, which tends to make the classes more compact. If I had chosen not this classifier but one like that, which passes very close to the points, that is not a good idea: a new point that is very close to its class could still be misclassified. So the margin of the classifier seems to have some semantic meaning in terms of robustness. OK, and this is the basic idea of the approach. So which classifier should we use, looking at the margins? This one.
1:45:00
Good classifier.
1:45:06
Classifier.
1:45:08
A good classifier, because of the margin on both sides of the line, up to the closest points: there are parallel lines on both sides of the classifier. I don't know which one is best for the accuracy, but the support vector machine tells me that the one leaving the most space in either direction is a good idea, because it allows for errors from both sides. So that is basically what we are looking for: the maximum margin classifier, as it is called, which is the simplest kind of support vector machine.
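The idea of judging a separator by its margin can be made concrete with a small calculation. This is only a sketch: the six two-dimensional points and the two candidate lines are made-up toy data, and the code merely measures the margin of given separators; it does not search for the best one, which is what an actual support vector machine does.

```python
from math import hypot

# Hypothetical toy data: ((x1, x2), label) with labels +1 and -1.
POINTS = [((3, 3), 1), ((4, 3), 1), ((3, 4), 1),
          ((1, 1), -1), ((0, 1), -1), ((1, 0), -1)]

def margin(w, b, points):
    """Smallest signed distance of any point to the line w.x + b = 0.

    A positive result means every point lies on its correct side.
    """
    norm = hypot(*w)
    return min(y * (w[0] * x1 + w[1] * x2 + b) / norm
               for (x1, x2), y in points)

def closest_points(w, b, points, tol=1e-9):
    """Points that attain the margin; for the best separator these are the support vectors."""
    m = margin(w, b, points)
    norm = hypot(*w)
    return [p for p, y in points
            if abs(y * (w[0] * p[0] + w[1] * p[1] + b) / norm - m) < tol]

diagonal = ((1, 1), -4)   # the line x1 + x2 = 4
vertical = ((1, 0), -2)   # the line x1 = 2

print(margin(*diagonal, POINTS))   # wider margin
print(margin(*vertical, POINTS))   # valid separator, but narrower margin
print(closest_points(*diagonal, POINTS))
```

Both lines separate the toy data, so both are valid classifiers, but the diagonal one keeps every point further away and would be preferred under the maximum margin idea.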
1:46:02
Because the higher-dimensional support vector machines are very complicated, and we will go into them very briefly at the end of the lecture, but only very briefly. For example, if your data set is not linearly separable, if you tried to divide the two sets by a line, it just doesn't work, and you would need something like a curve. These kinds of shapes only happen in higher dimensions and their projections; we don't care for now. It is a nice topic, but we stick to the easiest case. Who has heard the notion of a support vector machine already? One, two. For the others of you: why is it called a support vector machine? The correct question is: what are the support vectors? When we look at the data set, it really doesn't matter whether there are more points out here, because that doesn't affect the classifier; it won't change the separation of the data set. It can only change if a point lies in here, within the margin, because then the margin becomes smaller, and it may affect the angle of the separator. So, for example, if we have another point here, the separator might rather switch a little and shift the angle to get more distance here and keep the maximum margin. The new margin accounts for the new point, and the point might affect the width of the margin. But that can only happen if the points are close to the separator. It is totally not of interest what happens out here, only what happens here, close to the border. The points that lie on the margin are called the support vectors, and they define the best classifier: it is determined only by them, not by the points out here. So we can restrict the search to a couple of points only, the border points, which is very convenient for the large data sets that we usually have.
So why do we take a maximum margin? Well, it is intuitive to divide the sets as far away from each other as possible, and once new points come into the game, having a wide margin is a good idea because it allows for some error. It makes for a very robust classifier, obviously. There are also some theoretical arguments, and it is easy to compute, but the robustness is basically the main issue here, the main point of interest. And empirically it has been shown that it actually works. So the support vector machine is quite a nice technique. Let us formalize it. We have some training data: d-dimensional real vectors, each together with a binary class label. So the training data basically consist of the vector and the class, and the class label is either minus 1 or plus 1, negative example or positive example, that is all. And then we have y_i as the class label here.
1:50:46
Positive. Quite easy. But right now we have to find a hyperplane. How can we determine and how can we describe hyperplanes? Basically, describing a hyperplane is done with vectors again. If we write down the normal vector of the hyperplane and how far it is from the origin, then this uniquely identifies the hyperplane, OK? Because the normal vector gives me the facing of the hyperplane, and the distance from the origin gives me the point in space where it is fixed. That means that any hyperplane can be defined by some normal vector w along with one scalar.
1:52:01
And that is just a distance scalar, b. That allows me to define a linear equation defining the hyperplane: the points in the d-dimensional space that are described by this equation are the ones lying on the hyperplane. It basically means: this is the vector, this is the distance to the origin, nothing more. Normal vector w, distance to the origin b, OK? For example, this hyperplane here: it is (1, -1) and -2. The -2 is exactly where it intersects the y-axis, and (1, -1) is exactly the vector: go one step in one direction and one step in the other. With this I can uniquely identify this hyperplane. And having identified the hyperplane, I can divide the space by it into those points where the expression is equal to 0, those points where it is smaller than 0, and those points where it is bigger than 0. The points where it is bigger than 0 should all have the same label, and the points where it is smaller than 0 should also all have the same label. The points where it is equal to 0 lie on the hyperplane that is the separator; there should be no points there, because this is the margin. And this is where the support vectors come from.
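The example above can be checked with a minimal sketch. It assumes the hyperplane is written as <w, x> + b = 0, which is the sign convention under which the example w = (1, -1), b = -2 crosses the y-axis at -2; the function name `decision_value` and the test points are illustrative and not from the lecture. Note that b is the exact distance to the origin only when w has length 1; in general that distance is |b| divided by the norm of w.

```python
def decision_value(w, b, x):
    """Return <w, x> + b; its sign tells on which side of the hyperplane x lies."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


w = (1.0, -1.0)  # normal vector from the lecture's example
b = -2.0         # shift scalar from the lecture's example

# Illustrative points: one on the hyperplane, one on each side.
for point in [(0.0, -2.0), (4.0, 0.0), (0.0, 0.0)]:
    value = decision_value(w, b, point)
    if value == 0:
        side = "on the hyperplane"
    elif value > 0:
        side = "positive side"
    else:
        side = "negative side"
    print(point, value, side)
```

The point (0, -2) lies on the hyperplane, which matches the remark that -2 is where the line intersects the y-axis.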
1:55:22
The basic problem is that you have to evaluate this to find out about the label of some object. So if the line is a valid separator, all objects lying left of it should be of label minus 1, and all objects lying right of it should be of label plus 1. On one side should be the positives, on the other side the negatives; which is which [unclear].
1:56:03
If there is a separating hyperplane, I can even go a little bit in each direction. I will call that b plus and b minus. This would be b plus, and this would be b minus, OK? Even if I add b minus to things, I am still safe; even if I subtract b plus, I am still safe. These parallel lines are defined by this new scalar; they just differ in where they cut the y-axis, where they intersect the y-axis, OK? But they have the same normal vector: they are parallel, with the same vector that is perpendicular to them.
1:57:21
And now, as we said, if I want equal margins, with b plus and b minus of the same size, I can define some b prime: I take the combined step of b plus and b minus and just divide it by 2, so each side gets half of it. What happens, basically, is this: given the two lines, the one over here was b plus and the one over here was b minus, I take the distance, divide it in half, and end up with a new b. Nothing happens; I just stick with the same normal vector and only move the intersection with the y-axis such that it has equal margins to the support vectors. Then I can normalize the whole thing. Nothing happens: we just divide everything by b prime, the new one, but that doesn't change anything, because everything is divided by it.
1:59:01
I am defining a new normal vector that is normalized by b prime, and a new intersection with the y-axis, also normalized by b prime. Taking the whole equation and dividing it by some common factor doesn't matter, doesn't change anything; nothing happened here. I just want to have the shift constant scaled such that, with the new quantities, the shift constant to the support vectors is 1. It is not b prime any more, because I divided by b prime, and b prime divided by b prime makes 1. That is the point. Nothing happened.
1:59:54
I always love mathematicians saying "nothing happened" after some heavy computing, but it is true. Basically, what we just did with the normalization is this: if there exists a separating hyperplane defined by the normal vector w and the intersection with the y-axis b, then there is a hyperplane defined by some other normal vector and some other intersection with the y-axis that is still a valid separating hyperplane, but normalized: I changed b plus and b minus into b prime, and the bounding hyperplanes are shifted away by 1. OK, nothing happened. So what we need to do is find a maximum margin classifier, and that is exactly one of this type. It is a special one, but it doesn't matter which one we use: we can denormalize and get any other equation out of it, but it is the same kind of thing; it didn't change anything. And the basic idea is, of course: if we have two bounding hyperplanes, we want the one in the middle, the central one, that has exactly the same margin here as here, OK? And our search space is any normal vector, that is, any orientation of the hyperplane, and any intersection with the y-axis b.
2:01:54
So we are shifting the hyperplane. It has to be such that all the examples that are classified as minus 1 are on the left-hand side of the classifier, all the ones classified as plus 1 are on the right-hand side of the classifier, and the margin size is 1, OK? That is how I normalized it, with the 1 for the bounding hyperplanes. And we know that there exists some object on each side that lies exactly on each of the bounding hyperplanes, because if there weren't, I could push them further apart. What happens is this. This is our w, OK? This is the origin of the coordinate system. And here, this is our b, the intersection with the y-axis. And we know that our bounding planes are equally far away, at distance 1, and that points from the data set lie on them. To visualize the idea: some training points, all of these are positive and all of these are negative. What I am doing is trying to find the correct vector, OK? That is why we first need to look at what the margin of such a hyperplane would be. It depends on how far the support vectors are away from each other. But I cannot measure that directly [unclear]. I have one here and one here, and the distance between the two points is not the point here; the distance perpendicular to my hyperplane is the point. What I do is this, OK? And as you can see, it is kind of like a right triangle.
2:05:06
What happens is that in this situation I have minus 1 and plus 1 for the margins, shifted away and normalized, so that I have the plane in the middle that is the separator. I have to find this, OK? And I find that this is [unclear] w, OK? By taking the norm of w I can find out how big the margin is. The margin is obviously 2, because I shifted 1 away in each direction, divided by the norm of w, because of the normalization. OK, now we have to maximize the margin subject to the constraints that we had for the plus 1 and the minus 1 class. So we have the constraints on the validity of the linear separator, and we have to find out how we get this with maximum margin. So we maximize the margin size under all these constraints that we discussed before, OK?
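The margin width and the separator constraints described above can be illustrated with a small sketch. It assumes the normalized form in which negative examples satisfy <w, x> + b <= -1 and positive examples satisfy <w, x> + b >= +1; the function names and the toy data are hypothetical and chosen only for illustration.

```python
import math


def margin_width(w):
    """Width of the band between the two bounding hyperplanes: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def is_valid_separator(w, b, samples):
    """Check the two normalized constraints for every (x, y), y in {-1, +1}."""
    for x, y in samples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y == 1 and score < 1:
            return False
        if y == -1 and score > -1:
            return False
    return True


# Hypothetical toy data: negatives at x1 = 0, positives at x1 = 2.
samples = [((0.0, 0.0), -1), ((0.0, 1.0), -1), ((2.0, 0.0), 1), ((2.0, 2.0), 1)]

for w, b in [((1.0, 0.0), -1.0), ((2.0, 0.0), -2.0), ((0.5, 0.0), -0.5)]:
    print(w, b, is_valid_separator(w, b, samples), margin_width(w))
```

Of the candidates that satisfy the constraints, the one with the smaller norm of w has the wider margin; the third candidate has an even smaller norm but violates the constraints.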
2:06:35
Well, this is just because we need some constraint for every point vector. Once we only look at the support vectors, this is not needed any more, so we can leave that out, because we don't have to consider all points, and it becomes a little bit easier. So this is the classification constraint, and this is the margin, which we have to maximize: 2 divided by the norm of w.
2:07:18
When does it become maximal? When the norm of w becomes minimal, OK? So we have to minimize the norm of w. Then we can square it, because the norm has a square root, so we minimize the squared norm, and we add a factor of one half: if I minimize the norm of w, then I also minimize the norm of w divided by 2. It is just a factor; don't even think about it now. It just brings the problem into some standard form that we know from optimization, but it doesn't matter. The point is, we have to minimize the norm of w, OK?
2:08:14
So we minimize this over all normal vectors and all possible intersections with the y-axis such that it is a valid separator, OK? And I can even combine those constraints into one equation by just using the class label y, which is either minus 1 or plus 1, and multiplying with it; then I just get a single inequality. So now I have a minimization problem subject to a single constraint, and we know how to solve that from mathematics. Who knows the correct term? It is basically like this: how did you find the extreme points of some function? You build all possible derivatives in all the different directions, look where they are 0, and then you check the second derivatives, the partial derivatives and all these things, and find the smallest one.
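Written out, the steps described above amount to the following optimization problem. This is a sketch in the notation used in the lecture (normal vector w, shift scalar b, labels y_i in {-1, +1}); the sign convention <w, x> + b and the index n for the number of training points are assumptions.

```latex
% Two normalized constraints, one per class:
\langle w, x_i \rangle + b \ge +1 \quad \text{if } y_i = +1,
\qquad
\langle w, x_i \rangle + b \le -1 \quad \text{if } y_i = -1

% Multiplying by the label y_i combines them into a single inequality:
y_i \left( \langle w, x_i \rangle + b \right) \ge 1, \qquad i = 1, \dots, n

% Maximizing the margin 2 / \|w\| is equivalent to minimizing \|w\|,
% and hence to the standard quadratic form:
\min_{w,\, b} \; \tfrac{1}{2} \lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \langle w, x_i \rangle + b \right) \ge 1, \qquad i = 1, \dots, n
```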
2:09:52
It is the same stuff that we need here, and it is called quadratic programming or quadratic optimization. There are standard procedures to find a solution, and because of the special form of the problem there are some tricks with which you can speed up the computation, but that should not concern us too much. There are many libraries for the computation of support vector machines that are ready to use, OK? I just wanted to show you what is behind the idea of support vector machines; as far as the implementation goes, that is done.
2:10:41
But at the beginning we were assuming that the training set is linearly separable. What happens here? That is kind of difficult, because we cannot find a proper line. What we can do, however, is use so-called soft margins: we don't have a separator that divides everything, but we treat misclassifications as errors, and the further they are away from the margin,
2:11:15
the worse it gets. So we punish the error. That can be done as well: you have the total classification error, and you have to minimize the total classification error to find the best separator. So that is a minimization problem over a minimization problem, a little more complex to compute, but still manageable, OK?
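One common way to make "punishing the error" concrete is sketched below. It is an assumption that the error is measured as the hinge slack max(0, 1 - y * (<w, x> + b)) and weighted by a penalty factor; the lecture does not name a specific formula, and the names `hinge_slack`, `soft_margin_objective` and `c` are illustrative.

```python
def hinge_slack(w, b, x, y):
    """Zero for points on the correct side outside the margin, growing with the violation."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)


def soft_margin_objective(w, b, samples, c=1.0):
    """Half the squared norm of w plus c times the total slack (c is a hypothetical weight)."""
    norm_sq = sum(wi * wi for wi in w)
    total_slack = sum(hinge_slack(w, b, x, y) for x, y in samples)
    return 0.5 * norm_sq + c * total_slack


w, b = (1.0, 0.0), -1.0
print(hinge_slack(w, b, (2.0, 0.0), 1))   # on the positive bounding hyperplane
print(hinge_slack(w, b, (1.0, 0.0), -1))  # negative example on the separator
print(hinge_slack(w, b, (0.0, 0.0), 1))   # positive example deep on the wrong side
```

The slack grows with the distance of a misplaced point from its bounding hyperplane, which matches the remark that the further away the error lies, the worse it gets.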
2:11:38
We also had the assumption that we just have two classes, one negative and one positive. How can we handle more than that? Basically, every multi-class problem can be split up into a set of two-class problems. With classes A, B and C, we first split into A and not-A, and then split not-A into B and C, OK? This is an iterative procedure that can do that, and there are various algorithms of the type I just showed you: you take one class against the rest of the training set and just classify whether it is this class or not, or you can use a one-versus-one classifier, where you take a pair of classes, decide between the two, and then take the next combination. There are also multi-class support vector machines, which are more complicated because they need something like [unclear] diagrams and are more complicated to compute, but it can be done. So this is not a restriction on the use of support vector machines. But there is one real restriction, and that is again the problem of overfitting, as I told you. It is kind of difficult, because the basic support vector machine is linear: the support vector machine with a maximum margin can only do linear classification.
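The split into "A versus not-A" followed by "B versus C" mentioned above can be sketched as a cascade of binary classifiers. This is only an illustration: the helper `linear`, the two hand-picked classifiers and the class regions are hypothetical, standing in for trained two-class support vector machines.

```python
def linear(w, b):
    """Build a binary classifier returning +1 or -1 from a hyperplane (w, b)."""
    def classify(x):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1 if score > 0 else -1
    return classify


def cascade_classify(x, is_a, is_b):
    """First decide A versus not-A; only for not-A decide B versus C."""
    if is_a(x) > 0:
        return "A"
    return "B" if is_b(x) > 0 else "C"


# Hypothetical hand-picked separators on the first coordinate:
is_a = linear((-1.0, 0.0), 1.0)  # +1 when x[0] < 1
is_b = linear((1.0, 0.0), -3.0)  # +1 when x[0] > 3

for point in [(0.0, 0.0), (2.0, 0.0), (5.0, 0.0)]:
    print(point, cascade_classify(point, is_a, is_b))
```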
2:13:27
But if your data set looks like this, you cannot divide it linearly. And even worse, if you put some line through here, what happens is that you get a lot of misclassifications. So even soft margin classifiers don't help you, because there are errors everywhere. What you can do is skip the linear assumption and just go to a higher dimension, saying: what if my classifier could look like that? And how is this basically done? It is basically done by going to the third dimension, where you have a parabolic classifier, which is easy to define in three dimensions, and then you cut through it with a plane. The projection of this onto the plane is a circle, OK? You take the parabola from the 3D space, project it down, and end up with a circle. So you can use higher-dimensional classifiers, and again it gets more complicated. But basically, using a high enough dimension, you can get any shape you like into the image. I could really get something like this green line down here, just by moving up a couple of dimensions and accounting for that, OK?
Should you do that? Well, probably not, because, as I told you before, if we have an outlier here, building such a separator is possible. It is possible. But what happens if some red point comes in here? It would be very bad for the classification, but it is a perfectly legal, perfectly valid place for some red point to appear, OK? So the higher I go in the dimensions, the less robust the thing gets, and I told you that one of the major advantages of support vector machines is the robustness. So there is a trade-off again, and again overfitting is a very difficult thing to control, and you have to make sure that you don't run into overfitting. Usually this is also done with test sets. If there is a perfect classification for the training set, that is something to be suspicious of: it might generalize downright badly to new items, because you got too special again by perfectly fitting the training set at whatever cost. What you usually do is a kind of cross-validation: you split the available data into a training set and a test set.
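The move into a third dimension described above can be sketched as follows. The sketch assumes the lift (x, y) -> (x, y, x^2 + y^2), so that cutting the resulting paraboloid with a horizontal plane corresponds to a circle in the original plane; the specific map, the radius parameter and the function names are assumptions made for illustration.

```python
def lift(point):
    """Map a 2D point onto the paraboloid z = x^2 + y^2 in 3D."""
    x, y = point
    return (x, y, x * x + y * y)


def classify_lifted(point, radius):
    """Linear classifier in the lifted space: the plane z = radius^2."""
    w, b = (0.0, 0.0, 1.0), -radius * radius
    lifted = lift(point)
    score = sum(wi * zi for wi, zi in zip(w, lifted)) + b
    return 1 if score > 0 else -1


# Points inside the circle of radius 1 get -1, points outside get +1.
for point in [(0.5, 0.5), (0.0, 0.0), (2.0, 0.0), (-1.5, 1.5)]:
    print(point, classify_lifted(point, 1.0))
```

The classifier is still a plain hyperplane test, only applied after the lift, which is why a circular boundary in 2D becomes reachable.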
2:17:05
And then you find a solution for the training set and test with the test set whether it is overfitted or whether it works. So I use one part for training the classifier and the second part for checking the classifier. You try some other classification, maybe some slightly worse classification on the training set, again compute the results on the test set, and then take the classifier that has the best performance on the test set, not on the training set. That was cross-validation. The other option is regularization: if you know what a good classifier should roughly look like, for example that it is usually possible to divide the set linearly, you can also introduce penalty values into the optimization problem. If you are far away from what you would expect, you incur a large penalty, and if you are quite close to what you would expect, semantically speaking, you incur only a small penalty or none, and you take the one that minimizes your penalty. An example of such a regularization is the soft margin technique, which accepts that you cannot separate the set properly, but you incur some penalty to punish the error, and the more so the further the points are from the margin.
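The procedure described above, splitting the data, fitting on one part and choosing the classifier that performs best on the held-out part, can be sketched as below. The function names, the 30 percent test fraction and the fixed seed are illustrative assumptions, and candidate classifiers are assumed to be plain callables that return a label.

```python
import random


def split_train_test(samples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and cut it into a training and a test part."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(classifier, samples):
    """Fraction of (x, y) pairs labelled correctly; assumes samples is non-empty."""
    return sum(1 for x, y in samples if classifier(x) == y) / len(samples)


def pick_best(candidates, test_set):
    """Choose the candidate with the best accuracy on the held-out test set."""
    return max(candidates, key=lambda c: accuracy(c, test_set))


def by_sign(x):
    return 1 if x > 0 else -1


def always_positive(x):
    return 1


data = [(float(v), 1 if v > 0 else -1) for v in range(-5, 6) if v != 0]
train, test = split_train_test(data)
print(len(train), len(test), pick_best([always_positive, by_sign], test).__name__)
```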
2:18:45
But still, it is not overfitted, whereas otherwise you could get a function that takes in all the outliers.
2:18:58
So today we talked about classification, basically three kinds. First we talked about decision trees, where you use the information gain by looking at what splitting characteristics you can find to divide your set into smaller subsets that adhere to some label, and you can directly generate classification rules from that, in terms of statements like: if somebody is young and [unclear], then he is likely to buy a computer. Second, the naive Bayes classification was probabilistic: you derive probabilities for the event that somebody buys a computer or does not buy one; by making the naive assumption you get an easy calculation, and you abstract from statistical dependence. On the other hand, it is very robust, so it might be a good idea. And third, we talked about support vector machines, a fine classification technique that usually works in linear spaces, or you can use more complex spaces. The idea was to find a maximum margin classifier, one that maximizes the distance between the two different item sets among all the different classifiers, and then chooses the separator in between; basically a problem of linear algebra. So we have seen three approaches: one probabilistic approach, one approach from linear algebra, and one approach based on information theory. Totally different approaches to come to a correct classification of the data. It depends on the application which one you use and what you need: whether you need something that is very robust, whether you have to deal with missing data, and so on. All of them have their pros and cons, and you can just shop around with the ideas. Questions? Everything clear? Everybody understood the principle of classifiers? OK. Next.
2:21:36
So, next week [unclear] we will talk about clustering algorithms: about hierarchical clustering algorithms and all you ever wanted to know. Thanks for your attention.
Metadata
Formal metadata
Title  Classification (13.01.2011)
Series title  Data Warehousing and Data Mining Techniques (WS 2010/2011)
Part  10
Number of parts  13
Author
Balke, Wolf-Tilo

Contributors
Homoceanu, Silviu

License
CC Attribution-NonCommercial 3.0 Germany: You may use and modify the work or its content for any legal and non-commercial purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you name the author/rights holder in the manner specified by them.
DOI  10.5446/327
Publisher  Technische Universität Braunschweig, Institut für Informationssysteme
Release year  2010
Language  English
Producer
Technische Universität Braunschweig

Production year  2011
Production place  Braunschweig
Content metadata
Subject area  Computer science
Abstract  In this course, we examine the aspects of building, maintaining and operating data warehouses and give an insight into the main knowledge discovery techniques. The course deals with basic issues like storage of the data, execution of analytical queries and data mining procedures. The course will be taught completely in English. The general structure of the course is: typical DW use case scenarios; basic architecture of DW; data modelling on a conceptual, logical and physical level; multidimensional E/R modelling; cubes, dimensions, measures; query processing, OLAP queries (OLAP vs OLTP), roll-up, drill-down, slice, dice, pivot; MOLAP, ROLAP, HOLAP; SQL99 OLAP operators, MDX; snowflake, star and starflake schemas for relational storage; multimedia physical storage (linearization); DW indexing as a means of search optimization: R-trees, UB-trees, bitmap indexes; other optimization procedures: data partitioning, star join optimization, materialized views; ETL; association rule mining, sequence patterns, time series; classification: decision trees, naive Bayes classification, SVM; cluster analysis: k-means, hierarchical clustering, agglomerative clustering, outlier analysis.