Data Mining Overview, Association Rule Mining (16.12.10)
Transcript
00:00
So, hello everyone, and welcome to the lecture Data Warehousing and Data Mining. We are kind of through with the data warehousing part, so we can get rid of that, and for the rest of the term we will deal with data mining issues, some pretty exciting algorithms, and interesting ways to get more out of your data. Today we will start with a short introduction into what business intelligence is and then present the first algorithm, which will be frequent itemset mining and association rule mining. OK, last time we were talking about how to build data warehouses, and we concluded with a few details of what metadata actually is and how it is handled. We were also talking a bit about the ETL process, which is the most important process in the life cycle of the data warehouse, because the old "garbage in, garbage out" paradigm still holds: if the data in your data warehouse is bad, then the decisions based on that data and everything you mine from it are of no quality either, and that is one of the annoying things. So you have to consider how you get the data into the warehouse. In the extraction phase you have to transport it into some global schema that you can really use for a lot of applications and which is also extensible for applications that come later. Then you have to look into the transformation phase, and in the loading phase you have to make sure that the data is correct and that it is complete, because that determines the major quality issues of the data warehouse. Interestingly enough, if you look at some of the big companies today that have terabytes and petabytes of data, and you look at the quality of the data that is actually in there, you will often find that the data quality is actually pretty poor. Of course that is a problem: in today's data warehouses in bigger companies it costs a lot of money to clean up. Finally, the loading is usually done in bulk mode: you load everything that has been cleaned up and extracted from the underlying productive systems overnight into the data warehouse, and the warehouse is ready for the analyses that you could think of the next day.
03:01
Another thing we were talking about briefly last time was metadata: how you describe what the data actually is, how you keep up with the data, and how you get an understanding of the nature of the data that your warehouse comprises. Something that is very often done these days in terms of metadata is the so-called data lineage or provenance, so information about where the data comes from. That can be very interesting: if you see something in your data that you just don't understand, looking at the trail of how it was derived may give you a feeling for how this data was aggregated, and maybe there is something wrong. So this can give you really interesting insight into your data. But today we want to go into data mining, and the major term that is needed if we talk about data mining and data warehouses in one lecture is business intelligence, because that is what you want to find:
04:17
Information you don't know yet about your business, about your company, about its customers, about whatever. You have the data, and the data warehouse is the place to look for it. On the other hand, it is quite a difficult thing to look for, because you just don't know what you are looking for. The way that you find out about some hidden relationship is that you have to use algorithms and let them loose on tons of data. We will briefly provide an overview of what the business intelligence challenges are today and then start with data mining. Today's topic will be one of the most important algorithms for data mining, which is very often employed, especially for marketing purposes: association rule mining. But first, business intelligence. Business intelligence is kind of the insight about your company, insight about what you do in your company. And it should be said that what you get from a warehouse is actually data.
05:40
What you need to make a decision is information, and those are two totally different things. You can generate information from the data, but you have to generate it; the data itself is only the basis that gives you support. Information lets you make statements about what is going on, and information is more digestible: it makes it easier to grasp the concepts underlying the data. And there is another step that many people talk about, knowledge: the kind of state of affairs in your company, which you extract from the information derived from the data. And what is at the peak of the pyramid here? Decisions, the actions that you take based on what you know. So the data is mostly the raw facts. Information is statements that build on the data. Knowledge builds up on information and gives you some perspective and some decision support. And the decisions you take are basically the actions supported by that knowledge. So what we basically do is take the data warehouse at the bottom, and to get from the data to the information you use analysis tools: this is where OLAP resides and this is where the data mining algorithms reside, all working on the basis of the data and generating information from it. The next step is basically a manual step: pulling together information into knowledge, which basically means you go through all the information, derive strategies from it, and find out what you should do to improve the processes of the company, to increase the revenue of the company, to increase the happiness of your customers, or to increase the productivity of the people who work for you, whatever it can be. But it is all founded on the data that we have stored in the warehouse.
08:58
Virgo From the 1st of to
09:06
So what could be good applications for business intelligence? One very popular thing is segmentation, mostly customer segmentation or market segmentation: find different groups of customers that you could address with a certain advertising campaign, or find out how the market is actually segmented and whether your products cater for specific segments of the market. That may lead to new decisions like: our product doesn't address the whole market yet, why don't we address some other segments? Then there is propensity to buy: who are the customers that are willing to buy, that have the money and are willing to spend the money if you just give them the right idea of what they want? That also has something to do with advertising. Profitability: is it worthwhile to cater for certain customers? If customers are demanding extremely high-quality products for very cheap prices, so that your revenue is eaten away by the costs anyway, you might not be willing to cater for these customers any more. A central thing is fraud detection. Think about a credit card company or something like that, which needs to find out whether the transactions taking place on a credit card are plausible. For example, you find out that the card was used in three different countries within a short time. Maybe that is possible, but it is not very probable. With data mining you can detect fraud quite easily, or rather you can generate candidates for fraud detection, and then somebody has to look at them and find out whether this really is fraud. The same goes for customer attrition: are customers leaving, or slowing down their buying over time? And channel optimization: how do you get the product to the customer, and are you addressing the segment through the right channels? For example, email is not a very good way of advertising any more because of the spam filters these days; maybe some postal mailing would be much nicer. All of this helps you find out what your customers really want.
12:11
Customer segmentation, as I said, is basically about market segments: the young people, the old people, does your product cater for them? You might need some flashy products for the younger people and something very reliable for the older people, something like that. Then there is the personalization of customer relationships, which is usually referred to as customer relationship management. You will all have heard of CRM, which is a big buzzword in the industry today; it means customer relationship management. You follow up on purchases and find out whether the customer is satisfied, or whether you can maybe do something to make them even more satisfied. You handle warranty claims and stuff like that; that is part of customer relationship management. And of course with customer relationship management and data mining you can find out whether somebody always uses the warranty or hardly ever uses the warranty, and use that in fulfilling the claims: if somebody hardly ever used the warranty, you just fulfil the claim, but if somebody always uses the warranty and has returned some product five times already, there might be something wrong. This is how you find out what each customer is like.
13:51
Then propensity to buy: which customers are most likely to respond to a promotion? That means targeting customer groups with special campaigns and planning the timing of a campaign, for example which television commercials to aim at certain customer groups. It might not be that interesting to put my advertisement in the evening program if I want to cater for kids, who should go nagging their parents about what they want for Christmas. For the new product from a toy company, you rather put it on during SpongeBob or one of the other children's television programmes.
14:36
Profitability: what is the lifetime profitability of my customers? Does it pay off to send them a Christmas card? Are they going to buy another car in a few years, or did they just buy one car and will never be interested again? Because if they will definitely buy a new car next time, or something like that, it is kind of interesting to follow up on customers and find out about their overall profitability, so you can really focus on the profitable customers. This is very often found in banks these days: they have key account management, where, if some customers are especially profitable, you have key account managers, people who cater exactly for these very important customers. And fraud detection: tell which transactions are likely to be fraudulent. Credit card companies use a lot of data mining for finding out what is actually fraud and what is normal. This also has something to do with a little bit of
15:55
outlier detection and profiling: you find out what somebody normally does and look for the outliers, those transactions that do not fit his or her profile. Those could be fraud. Every day you read messages on the news services about the Nigeria scam or things like that; such things can be found out. And the good thing about data mining algorithms is that they work automatically. You can run them overnight and react very quickly to what is happening, sometimes while it is happening, but at least immediately after it has happened, and that can cut the losses considerably.
17:02
This is interesting also for the police. Then customer attrition and channel optimization. Attrition tells you which customers you should take care of because they are about to leave, or are kind of leaving in the sense that they are considering alternatives to your products or your services. The idea is to prevent the loss of high-value customers; if somebody is not profitable anyway, your competition is absolutely welcome to them, of course. Channel optimization is about whether you do TV advertisements, email spamming, cold calls, fancy banners in the city centre, or whatever: does the channel fit the customer, and does it fit the product? This is what you find out. All of this basically leads to the idea of the data warehouse at the centre of the company.
18:09
You have some data sources, which are basically productive systems but can also be customer relationship management systems or key account systems, and the ETL process basically brings the data into the data warehouse. From this, the business users and the analysts work with OLAP on the data. From the analyst's point of view you are best off if you can see things that are automatically generated. Some things are planned, like a report that is run over every period, but next to that you will find that there are some data mining algorithms used that hint at what is going on, at what is interesting, at which trends you should look at and which trends you can safely forget about because they are not very interesting. This finally goes to the analysts or to the decision makers, who will build strategies or adapt strategies as shown by the data. And the interesting thing here is the future component: intelligent systems. This is where we make the data mining algorithms work on their own, directly on the data warehouse, and deliver what is noteworthy, what is exciting, what is new, to the interface of the analyst, who can really see what happens.
19:55
Another thing besides finding out what is happening is automated decision making. For example, fraud detection mostly happens automatically. Very often you don't have a human in the loop: the algorithms run, and immediately when your credit card seems to be misused, you get a letter that your credit card account has been closed, or has at least been put on ice for the moment: "Did you really make these three transactions that look fishy?" And then you respond: yes, I was buying something in Greenland yesterday, or: no, that was definitely not me. And then the usual processes are worked through for the credit card. But at least closing down or freezing the account and sending off a letter to the potential victim can be done automatically. This is kind of interesting if you just have a rule-based system: it has certain solutions ready if the case arises that fraud is possible or probable. At this point I don't know whether it is fraud or whether it will be perfectly harmless, but better safe than sorry: you should really freeze the account at a very early stage, before the money is gone and the customer asks why you did not detect it, and you are liable for the loss. The same goes for loan approvals, which are also mainly automatic these days: you just type in what you earn and what you basically have in terms of securities, and the decision can be made automatically. The same goes for business performance management. Just getting the indicators of how your business is running is very important, and it is mostly based on the so-called balanced scorecard, which is an economic system that shows you the key performance indicators of your business. That can be computed automatically, directly on the data warehouse, and you can see which performance indicators are up and running fine and which need attention. A possibility to get this to the people, to the decision makers, are so-called dashboards or business cockpits, because they look like the dashboard in your car: you may have some meters showing whether you are in the green area or in the red area with your revenues or what you gained, or some progression showing that the trend is upwards or downwards or stagnating at the moment. You can see all these pictures and derive from them the information and knowledge that you might need to find out what is happening.
23:57
Now we have said what to do with it, but it is kind of interesting how we get all this wonderful information that we want to visualize on a dashboard. This is known as knowledge discovery in databases, or you just call it data mining: you mine all the information you need from the data mountains. And this is exactly what it is: to find the interesting information or patterns that are hidden in the data, given that the databases are so large that you don't see them with the naked eye. If I am a small company and look into my account books, it is very easy to find out whether I am profitable. Now if I am a multinational company with thousands of holdings, it is not so easy any more, and I have to look very closely. What does interesting mean? It should be non-trivial: this month's revenue, or that things are not good when I have lost money, I don't need somebody to tell me that. But maybe somebody should tell me about a certain segment of customers being more and more dissatisfied with my products. It should be implicit: I mean, it is not just a single number, but information that puts a lot of numbers together in some factual connection with each other. And it should of course be previously unknown, because I don't like systems telling me what I already know; for more of the same I can have McKinsey over. That is basically what we mean when we are talking about interesting information. Of course there are other ways to get to that information, for example deductive query processing, expert systems, or statistical programs. That is sort of the topic of a different lecture, the knowledge-based systems lecture. There are a lot of techniques, mostly rule-based, all of them logical ways of deducing information or mining interesting information, but what we want are purely statistical algorithms running over the data, and then we want to find out what is interesting.
27:13
The applications of data mining are, on the one hand, database analysis: find out what is in the data. And of course decision support: find out what to do when something happens, as in the case of business intelligence. These are the older applications. Then there are a lot of newer applications, especially in the life sciences, where most of them are about finding some characteristics or some patterns in the data. Other things go for text mining, which is still hot at the moment: exploiting blogs for finding out what people think about a product by looking at postings, or email analysis. These days new services arrive every five minutes that find something else to do for you, so there are a lot of different data mining applications.
28:26
The one that we are focusing on is mainly market analysis. You want to know how the market is segmented and how your customers fit into the profiles you address with a product. You want to find purchasing patterns: is it worthwhile to close down the shop over the weekend because nobody buys anything over the weekend? Is something especially seasonal, so that you just want to get into the Christmas business and forget about the rest of the year? Then cross-market analysis: what happens between different products, how are they related to each other? Maybe up-selling or cross-selling would be a good idea: if a customer buys something, maybe he wants the extensions, or something that fits very well with the product. That would be a nice thing to know. Then summary information, so again the topic of reporting: I want to know what happened, in the reports, and that of course includes all the statistics.
29:52
Other things in this area are risk management and finance planning: what is my company worth, what is the cash flow that is happening at the moment? Can I predict some interesting trends: are the sales going up or down, are some products running out, are some in demand at the moment? Then resource planning: what do I have to put into resources? Do I need more of something, or should I slow down production because the market doesn't take it at the moment? Then competition: I want to know what my competitors are doing, how they are doing, what their segmentations and their revenues are, and of course the interesting pricing strategies. What is the stuff worth that I produce? That does not only depend on what it costs, but rather on what the market is willing to give for it. It might be a high-quality product that the market is just not taking because it is too expensive, and there might be, which actually happens very often, some amazing gadget where the market is prepared to pay anything for it, so why not take it? That is basically what you do. And the architecture of data mining systems is basically just as shown for business intelligence: you have the data warehouse.
31:34
That is basically built from the productive systems, with the cleaned data resident in the data warehouse, and on top of that you put the data mining engine. That is the main component, which runs all the algorithms for data mining that you are interested in and finds the patterns. Then I need something to evaluate the patterns. Very often this is rule-based, and very often it is based on some constraints with which you can cut away parts of the search space. And then you need a graphical user interface to get the information across: a nice chart that you can show to your boss, stating "here is a trend, and it is sloping down, and we don't want that, so we have to do something."
32:27
The major parts of data mining cover, first, the sector of associations, which we will deal with today. Association means that things may be correlated with each other because one thing depends on another, or follows from some event that has happened, and then those should be considered together. How, when, and which things depend on each other: that is association mining. Second, classification and prediction: what are the segments my products fall into, and can I predict which segment is growing stronger or getting weaker? I try to find models that allow me to predict a lot of what is going on in my business, and there are various ways to present this information, for example decision trees. We will go into some of them during the next lectures, and of course we will also look into the future with prediction algorithms and find out what will happen next.
33:58
Clustering is also a very big topic in data mining. You might have different classes in your data, and you want to find out how to optimally group your products or services together so that you can manage them more efficiently, for example produce them together, or outsource a certain kind of service to some subcontractor. The same goes for advertisement. If I find out what kind of people are buying my products, I can target my advertising: maybe I want to address the people that are rich, so I don't go into the slums or the poorer parts of the city to distribute my leaflets, but I go to the suburbs where the rich people live in their beautiful villas and do my advertising there. We will also talk briefly about outlier analysis, mostly for fraud detection and stuff like that: something happens that is out of the norm, maybe a Black Friday, some market crash, something that is definitely out of the ordinary at the moment and should be detected early. Maybe some fraudulent use of a credit card; again, it is rather useful if you detect that early. So today's topic is association rule mining: associations between objects. The idea is that with the objects, or customers, or whatever it is that you are looking at,
35:49
if they frequently occur together, there has to be some hidden relationship between them. We don't know exactly what it is, but we can find the association. The classic application for that, used very much by the big vendors, is market basket analysis. For example, Walmart was one of the very big companies doing market basket analysis. And for what reason, actually? What it means is: you go to a supermarket and have your market basket or trolley, you put in stuff and buy it. And then you find out, for example, some association rule: everybody who buys cheese also buys wine, or rather, it is probable that if somebody buys cheese he will also buy wine. It is obvious where this rule comes from, because many people like to have wine and cheese as dessert or something. And you will have a certain support for the rule: for example, 10 percent of the customers buy cheese and wine together, and 80 percent of those who buy cheese will also buy wine. This rule is intuitively clear. The interesting thing is: why should I know this? What can I do with it? One thing might be special offers: buy cheese and wine together and get 15 percent off, or something. But it has been found that one of the major uses is planning where to put your items. If I know there is a relationship between two items, it is much more probable that I get customers to buy exactly these two items, whatever promotions I am doing on them, if I put them close to each other, because people think: "Oh, the wine. Oh, cheese, wonderful, that goes well together," and they also buy the cheese. Cheese and wine is the classic example, and it is very obvious. A less obvious example: Walmart is said to have found out that nappies, Pampers and stuff like that, are often bought together with beer, with six-packs. It is not obvious, but when you think about it, it kind of makes sense: it is the dad who goes Saturday afternoon shopping and is told to get the Pampers for the baby, and he takes some beer along for himself. So put the beer next to the Pampers and you are fine, because people will buy it. This is kind of like making things more efficient.
39:41
Let's go into the algorithm. How do you find, given a large set of transactions, efficiently and correctly what the association rules are? We have two basic components. One is the items that I am selling: the products, everything that is in my shop. An item is a product that somebody can buy and put into his basket. And I consider a market basket as a transaction. Everybody takes a market basket, does the shopping, goes to the cashier, and puts everything on the belt, and that is the stuff that was bought together. A single customer is one such transaction. That is very easy to get, because all supermarkets now have electronic cash registers and scan everything. So at the end of the day you know exactly what each customer bought. I don't care about the identity of the customer, as long as the customer is not using a Payback card or some other kind of loyalty card. You just know that some customer bought these items together today. And an association rule is an implication: we say somebody who bought this set of items also bought that set of items, and the two sets should be different. I mean, I don't want tautologies: "somebody who bought wine also bought wine", I am not interested in that. I am interested in things like "somebody who bought wine also bought cheese" or "somebody who bought Pampers also bought beer". That is what we try to find out.
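The vocabulary of items, transactions, and rules can be written down as a small sketch. The baskets below are made up for illustration, and the names `Rule` and `is_valid_rule` are hypothetical; the disjointness check is one way of encoding the "no tautologies" requirement from the lecture.

```python
from dataclasses import dataclass

# Items are plain strings; a transaction is the set of items in one basket.
# The customer's identity is deliberately not stored.
transactions = [
    frozenset({"wine", "cheese"}),
    frozenset({"pampers", "beer"}),
    frozenset({"cheese"}),
]


@dataclass(frozen=True)
class Rule:
    """An association rule X -> Y: whoever bought X also bought Y."""
    antecedent: frozenset  # X, the items on the left-hand side
    consequent: frozenset  # Y, the items on the right-hand side


def is_valid_rule(rule: Rule) -> bool:
    """X and Y must be non-empty and share no item (no 'wine -> wine')."""
    return (
        bool(rule.antecedent)
        and bool(rule.consequent)
        and rule.antecedent.isdisjoint(rule.consequent)
    )


wine_cheese = Rule(frozenset({"wine"}), frozenset({"cheese"}))
tautology = Rule(frozenset({"wine"}), frozenset({"wine"}))
print(is_valid_rule(wine_cheese))  # True
print(is_valid_rule(tautology))    # False
```

Using sets for baskets reflects that only the fact that items were bought together matters, not their order on the belt.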
42:14
If we take the items sold in a store, they might be beef, chicken, milk, cheese, anything. Then you look into the baskets of people at the cash register and say: there was one guy who bought beef, chicken, and milk, a second guy who bought beef and cheese, and a third one who bought cheese and boots, or whatever. That is the basic idea. And you can find an association rule: if somebody buys beef and chicken, then there is a big probability that he will also buy milk. This would, for example, allow me to put the fridge with the milk next to the beef. Of course I could make thousands of such rules, because people will buy anything they need, and I might come up with a rule like "this guy bought shoe polish and pineapple", and you might say they have to have something in common, there must be a connection. There might have been no connection; he just needed both things.
43:32
So rules can be weak, like the shoe polish and pineapple example, or they can be strong. What makes them strong? Basically the regularity with which people buy the things together. Consider something like shoe polish: people will buy that when they need it, and it doesn't really matter what they buy it together with. Every two months they will buy a box of shoe polish, and that is about it. The two basic measures for the strength of a rule are the support and the confidence. The support deals with how strong the data basis is; the confidence deals with the semantics. Support basically means: have various people done this, are we quite sure that people buy this together, or is it just a one-off by this one guy? I have never sold shoe polish in my shop, and then comes one guy and buys shoe polish and pineapple: is that a new rule? No, this is no new rule. It has to happen a certain number of times that people buy shoe polish and pineapple together before it states some sensible information about what is bought together, and this is basically the support. So the support of a rule is the percentage of transactions that contain all the items of the rule. Basically this is the probability that all these things occur in a single transaction: how often does that happen? The support for some rule "X implies Y" is computed like this: you take the items in X and Y, count the number of transactions where all these items are bought together, and divide by the number of transactions that you had during the whole day. With that you can find out whether half of the people buy this combination, or nobody buys it, or people buy it usually anyway, and you find out whether there is something interesting. The confidence of some rule "X implies Y" has the semantics: if X is bought,
46:41
how likely is it that Y is bought as well? Take cheese and wine. If somebody buys cheese, what is the probability that he will also buy wine? Or the other way around: somebody buys wine, is he also bound to buy cheese? These could differ. It could be that they are bought together very often, because wine and cheese go well together. But there might be a lot of anti-alcoholics who don't buy wine at all but rather like cheese, so they will buy cheese and not buy wine. Then the rule seems to work rather this way: "somebody who buys wine also wants to buy cheese" holds more strongly than "somebody who buys cheese also wants to buy wine". How do we find out? By counting the number of times wine and cheese have been bought together and looking at how many times the single product has been bought individually. So if I want to state the rule that somebody who buys cheese is bound to buy wine, I count the number of times wine and cheese were bought together and divide by the number of times cheese was bought, with or without wine. So the confidence of the rule is basically: take the number of times X and Y are bought together, so basically the support count, and divide it by the number of times that the premise X alone has been bought.
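The two measures can be sketched in a few lines. In words: support(X implies Y) is the number of transactions containing all items of X and Y divided by the number of all transactions, and confidence(X implies Y) is that same count divided by the number of transactions containing X. The function names and the five baskets below are assumptions chosen for illustration, not data from the lecture; they are picked so that the two directions of the cheese and wine rule get different confidences.

```python
def support(transactions, itemset):
    """Fraction of all transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Of the transactions containing X, the fraction that also contain Y."""
    x, y = frozenset(x), frozenset(y)
    with_x = sum(1 for t in transactions if x <= t)
    with_both = sum(1 for t in transactions if (x | y) <= t)
    return with_both / with_x if with_x else 0.0


# Hypothetical baskets: half of the cheese buyers skip the wine,
# but every wine buyer here also takes cheese.
baskets = [
    frozenset({"cheese", "wine"}),
    frozenset({"cheese", "wine"}),
    frozenset({"cheese"}),
    frozenset({"cheese"}),
    frozenset({"bread"}),
]

print(support(baskets, {"cheese", "wine"}))       # 0.4
print(confidence(baskets, {"cheese"}, {"wine"}))  # 0.5
print(confidence(baskets, {"wine"}, {"cheese"}))  # 1.0
```

Both directions share the same support, since they involve the same set of items; only the denominator of the confidence changes.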
49:02
The rationale behind support and confidence is basically this. Support is a way of saying: this is a typical rule, this really has statistical relevance, rather than being just like the shoe polish and pineapple idea, which happened once, and then nobody ever bought shoe polish again. That is not a real rule of the system, so low support should be avoided. As for low confidence: if the confidence is low, people buy anything with cheese, and putting the wine next to the cheese is as random as putting shoe polish next to the cheese. It doesn't help you, because you don't have confidence in your rule. The association rule mining algorithm has to discover all associations in a number of transactions that hold with a minimum support and a minimum confidence. You have to set some values. For example, I am only interested in products that make up 10 percent of my business and I do not consider one-shots, just give me the big players, and I want rules that are reasonably believable, maybe with a confidence over 80 percent.
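The problem statement, "find all rules with at least minimum support and minimum confidence", can be made concrete with a deliberately naive sketch that enumerates every itemset and every way of splitting it into a left and a right side. This brute force grows exponentially with the number of items and is only meant to pin down what the result should be, not how an efficient algorithm would compute it. The seven baskets are an assumption: only the first three are spelled out in the lecture, and the remaining four are invented for this sketch. The function name `mine_rules` is hypothetical as well.

```python
from itertools import combinations

# ASSUMPTION: seven illustrative baskets. Only the first three appear in the
# lecture text; the other four are invented for this sketch.
transactions = [
    frozenset({"beef", "chicken", "milk"}),
    frozenset({"beef", "cheese"}),
    frozenset({"cheese", "boots"}),
    frozenset({"beef", "chicken", "cheese"}),
    frozenset({"beef", "chicken", "clothes", "cheese", "milk"}),
    frozenset({"chicken", "clothes", "milk"}),
    frozenset({"chicken", "milk", "clothes"}),
]


def mine_rules(transactions, minsup, minconf):
    """Brute force: try every itemset and every split of it into X -> Y."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            both = sum(1 for t in transactions if itemset <= t)
            if both == 0 or both / n < minsup:
                continue  # not enough support, no rule from this itemset
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    x = frozenset(lhs)
                    with_x = sum(1 for t in transactions if x <= t)
                    conf = both / with_x  # with_x >= both > 0
                    if conf >= minconf:
                        rules.append((x, itemset - x, both / n, conf))
    return rules


rules = mine_rules(transactions, minsup=0.3, minconf=0.8)
target = (frozenset({"chicken", "clothes"}), frozenset({"milk"}))
for x, y, sup, conf in rules:
    if (x, y) == target:
        print(sorted(x), "->", sorted(y), round(sup, 2), conf)
        # ['chicken', 'clothes'] -> ['milk'] 0.43 1.0
```

With these assumed baskets, chicken and clothes occur together in three of seven transactions, and each of those three also contains milk, which gives a support of 3/7 and a confidence of 3/3.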
50:42
So let's try that. I have a number of transactions here; seven customers came in. The first one bought beef, chicken and milk, the second one bought beef and cheese, the third one bought cheese and boots, and so on. It is not too easy to see the associations in these transactions. It could be that somebody who buys beef also buys chicken: one possible rule. It could also be that somebody who buys beef and chicken also buys milk. Or somebody who buys beef and milk also buys chicken. There are lots of possibilities, because the items can be put together in any arbitrary combination: an exponential explosion of possible rules. So calculating the support and the confidence for every one of these combinations is madness. I have six things here: cheese, boots, beef, chicken, milk and clothes. With only six items that is still manageable, but with 20,000 products you cannot try all the rules. So let's say we have a minimum support and a minimum confidence. We are interested in things that occur in more than 30 percent of all transactions, and we are only interested in rules that have a minimum confidence of 80 percent. One candidate is the rule I just invented: whoever buys chicken and clothes also buys milk. Does it have the support? Where do chicken, clothes and milk occur together? Once, twice, three times. Here it is only chicken and milk, and here chicken and cheese, so these do not count because they don't contain all the items I'm interested in. But three of them count, so I have three out of seven transactions, about 43 percent, which is definitely higher than my minimum support of 30 percent. So the rule obviously meets the minimum support. What is my confidence that whoever buys chicken and clothes will also buy milk? Let's find out who buys chicken and clothes: here, here and here, the fifth, the sixth and the seventh transaction. In how many of these cases did somebody who bought chicken and clothes also buy milk? One, two, three: three out of three. So this is 100 percent confidence; it never happened that somebody bought chicken and clothes without buying milk. The rule seems to be causal. It does not have to be, but it has a confidence of 100 percent and a support of over 30 percent, so it might be useful. Now we could have done it the other way around: somebody who buys clothes also buys milk and chicken, somebody who buys milk also buys clothes. We can make thousands of these rules. How do I find out efficiently which association rules are above the minimum support and which meet the minimum confidence? If I had had the idea, I would have invented one of the most prestigious algorithms in data mining, the Apriori algorithm.
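The counting just walked through can be sketched in a few lines of Python. The seven baskets below are a hypothetical reconstruction, since the slide is not part of the transcript; they are chosen only so that they reproduce the numbers quoted in the lecture (3 out of 7 for the support, 3 out of 3 for the confidence). The helper names `count`, `support` and `confidence` are illustrative.

```python
# Hypothetical baskets, chosen to match the counts quoted in the lecture.
transactions = [
    {"beef", "chicken", "milk"},
    {"beef", "cheese"},
    {"cheese", "boots"},
    {"beef", "chicken", "cheese"},
    {"beef", "chicken", "clothes", "cheese", "milk"},
    {"chicken", "clothes", "milk"},
    {"chicken", "milk", "clothes"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(body, head, transactions):
    """Fraction of all transactions that contain body and head together."""
    return count(body | head, transactions) / len(transactions)


def confidence(body, head, transactions):
    """Fraction of the transactions containing the body that also contain the head."""
    return count(body | head, transactions) / count(body, transactions)


body, head = {"chicken", "clothes"}, {"milk"}
sup = support(body, head, transactions)
conf = confidence(body, head, transactions)
print(f"support = {sup:.2f}, confidence = {conf:.2f}")
# prints: support = 0.43, confidence = 1.00
```

With a minimum support of 0.3 and a minimum confidence of 0.8, this rule passes both thresholds.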
55:56
We'll see the algorithm in a moment, and it's actually quite simple once you've seen it. Of course, this is a rather simplistic view of the shopping basket. We are not considering the quantity in which things are bought or the prices paid, and the result could be affected by special offers, where we say this cheese is a bargain today, so everybody will buy it this week. That is a problem with the basic model. But once you define your transactions and your items in the right way, you will find that the result is deterministic: it doesn't matter how you mine, you can always compute the confidence and the support, so whatever algorithm you use, the rules are the same. There is one fixed set of rules that has the minimum support and the minimum confidence.
57:12
There are three major algorithms that we will look at today: the Apriori algorithm, mining with multiple minimum supports, and the so-called class association rules, which we will go into very briefly at the end. The best known of them is Apriori, and it is definitely a very efficient way of getting things done. It basically consists of two steps. The first one is the so-called frequent itemset mining: find all the itemsets that are above the minimum support. The second step is generating, from the frequent itemsets, candidate rules that have a chance of being above the minimum confidence. These are the only rules that will be generated. We will not look at all the rules that are possible and still get the correct result, so we never miss a rule. The basic idea is this:
58:39
When an itemset is frequent, that is, above the minimum support, then all the items contained in the set have to be frequent themselves. So if I have a minimum support of, say, 10 percent, and milk, cheese and bread form a frequent itemset, then the three items individually have to occur at least as many times as they occur together. On top of that, they could occur individually. So their support is definitely higher than or equal to the support of the bigger itemset. This is called the downward closure property: any subset of a frequent itemset is also a frequent itemset. Take chicken, clothes, milk. The parts of it are chicken and clothes, clothes and milk, chicken and milk. Clothes and milk occur at least in the three instances, and could occur some more; here they don't, they occur exactly three times. What about the others? Chicken and milk occur in the same three transactions, but they even occur a fourth time without clothes. In any case, a subset definitely occurs at least as many times as the bigger set. This property is amazingly useful because it allows us to steer the candidate generation from frequent itemsets of low cardinality to frequent itemsets of high cardinality. And that means we won't have to try all the different rules that are possible, but only those that are definitely built from a frequent itemset. And that works.
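The downward closure property can be checked directly on a small example. The following sketch uses hypothetical baskets (the slide is not in the transcript) that are consistent with the counts mentioned above: the full set occurs three times, and chicken with milk occurs a fourth time without clothes.

```python
from itertools import combinations

# Hypothetical baskets, consistent with the counts quoted in the lecture.
transactions = [
    {"beef", "chicken", "milk"},
    {"beef", "cheese"},
    {"cheese", "boots"},
    {"beef", "chicken", "cheese"},
    {"beef", "chicken", "clothes", "cheese", "milk"},
    {"chicken", "clothes", "milk"},
    {"chicken", "milk", "clothes"},
]


def count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


full = frozenset({"chicken", "clothes", "milk"})

# Count every non-empty proper subset of the full itemset.
subset_counts = {
    frozenset(s): count(frozenset(s))
    for size in range(1, len(full))
    for s in combinations(sorted(full), size)
}

for subset, c in sorted(subset_counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(subset), c)
print(sorted(full), count(full))
```

Every subset count is at least the count of the full set, which is exactly what the property states.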
1:01:49
So how do I find frequent itemsets? We start by looking at one-element sets. Once we have the one-element sets, how can two-element sets be built? Basically by putting together one-element sets that are frequent, because if part of a set is not frequent, the downward closure property tells us that the bigger set cannot be frequent either. That's the basic idea. In each iteration we look at the sets generated in the last phase and put them together such that one more new item enters the set. That means we go from frequent sets of k minus 1 items to frequent sets of k items. As a brief optimization, we will assume that the items are sorted in lexicographical order, or in some fixed ordering of the items, so that you can very easily find out what two itemsets have in common. Otherwise, when comparing two itemsets, we would have to look at everything contained in both of them. If they are sorted lexicographically, I just have to look at the beginning, and once they start diverging there can be no common prefix any more. So let's start with finding the itemsets of size 1, which are basically the products that are sold often enough to meet the minimum support.
1:03:54
That is easy: just go through your transaction list and find them. Then I need the next step: I need to build two-element, three-element, four-element, five-element frequent itemsets. How do we do that? By taking the one-element itemsets and putting them together two by two. That gives a candidate. We still have to check in the database whether it really is a frequent itemset. Consider an item 1 and an item 2, and these are my transactions. It doesn't really matter what else was bought with them; they could have been bought with all kinds of different things. Consider a minimum support of two transactions. Item 1 and item 2 are both frequent items. But is the set of item 1 and item 2 a frequent itemset? In my little example, exactly not: the items individually fulfil the minimum support, but the intersection of their transactions is empty, so they are never bought together. So we find the candidates that are possible, and then we have to check which of them are actually frequent.
1:06:09
How do you find the candidates? This is the join step. We put together two sets of cardinality k minus 1 such that one new item enters the set. This is a very efficient thing to do because we have ordered the items lexicographically: the two sets have to have the same head, and only the single item at the end should be different. If I take two such sets and put them together, I get a set of cardinality k. The head has to be the same, the last element may differ, and putting them together results in the same head followed by the two different items, again sorted lexicographically. That is very efficient to do. Then comes the prune step, which works on the candidates before we scan the data: you look at all their subsets and prune every candidate that does not respect the downward closure property. What does that mean? Every subset of a frequent itemset has to be frequent. Do I know the subsets? Well, yes, because in the last step I calculated all frequent sets of size k minus 1. If any subset of size k minus 1 of one of my candidates is not among them, I can immediately throw the candidate away, because it violates the downward closure property. An example may make it easier to see.
1:08:37
Say we have worked our way up to 3-itemsets, and these itemsets are {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 3, 5} and {2, 3, 4}. Now we want to put together 4-itemsets, so we want to go from three to four. First we join the candidates. How do we join? We look at the first items of each set. Those with the same head can be joined together, and with the different items in the last place you create a new itemset of higher cardinality. So we start here and look which couples can be joined. The first one is 1 2, and this one is 1 2 as well, so that is definitely a join. The next ones start with 1 3 and 2 3, so no join with those. Putting the first two together results in {1, 2, 3, 4}. Let's go to the next one, 1 2 4: there is nothing left to join it with, so it does not create anything new. Next one, 1 3: yes, 1 3 here and 1 3 here, so this creates {1, 3, 4, 5}. Good. And the last one, 2 3, cannot be joined with anything because there is no other 2 3. So I get two new candidates from my five 3-itemsets, and those are the only candidates that could be frequent itemsets. There can be no other itemsets, for instance anything containing a 6, because 6 is not a frequent item.
1:11:11
Now we have to see whether these candidates survive the pruning. How do we prune? We look at all the possibilities to get three-element subsets out of the candidate. For {1, 2, 3, 4} I could have {1, 2, 3}, {1, 2, 4}, {1, 3, 4} and {2, 3, 4}; these are the three-element sets that are part of the four-element set. And do they exist as frequent itemsets? Let's look: 1 2 3, got it. 1 2 4, got it. 1 3 4, got it. 2 3 4, got it. So all the subsets of my bigger candidate are themselves frequent itemsets, the downward closure property holds, and this one definitely stays a candidate.
1:12:34
The second candidate is {1, 3, 4, 5}. Its subsets are {1, 3, 4}, {1, 3, 5}, {1, 4, 5} and {3, 4, 5}. Let's look: 1 3 4 and 1 3 5 are there, but the other two are not. So this candidate violates the downward closure property and cannot be a frequent itemset. After the prune step, the candidates are down to just a single one.
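The join and prune steps from this walkthrough can be sketched as follows. This is a minimal illustration that assumes itemsets are stored as sorted tuples; the function names `join` and `prune` are made up for the sketch.

```python
from itertools import combinations


def join(frequent_prev):
    """Join (k-1)-itemsets that share the same head (all but the last item)."""
    prev = sorted(frequent_prev)
    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            if a[:-1] == b[:-1]:
                # Same head, different last item: a sorted k-itemset results.
                candidates.append(a + (b[-1],))
    return candidates


def prune(candidates, frequent_prev):
    """Drop every candidate that has a (k-1)-subset which is not frequent."""
    prev = set(frequent_prev)
    return [
        c for c in candidates
        if all(s in prev for s in combinations(c, len(c) - 1))
    ]


f3 = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)]
c4 = join(f3)
print(c4)             # [(1, 2, 3, 4), (1, 3, 4, 5)]
print(prune(c4, f3))  # [(1, 2, 3, 4)]
```

The second candidate is dropped because (1, 4, 5) and (3, 4, 5) are not among the frequent 3-itemsets. Only the surviving candidate has to be counted against the data.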
1:13:21
Of course, if we do that, we have to scan the transaction list time and again. Let's look at an example with a minimum support of 0.5. We have four transactions here, which means an item needs to appear in two of them to make 50 percent. Let's look at the individual items. Here is item 1 and here is item 1, so two occurrences, which just meets the minimum support of 50 percent. Item 2: here is item 2, here is item 2, here is item 2. Three occurrences, above the minimum support of 50 percent. Item 3: here, here and here, three occurrences. Item 4 occurs only once. Item 5: one, two, three. Now these are the items which have the minimum support of 50 percent, everything that occurs at least two times. Item 4 is cut out. Good. Now we have to join them together. Joining one-element sets to two-element sets is very easy because they all fit together; they all have the same, empty, beginning: 1 2, 1 3, 1 5, 2 3, 2 5, 3 5. Item 4 does not occur any more because it is not a frequent item in itself. Narrowing down the search space while still getting the correct result is what makes the algorithm efficient. Now we would have to prune, but in this case every subset is a frequent one-element set, so nothing is pruned yet.
1:15:56
And now we scan through the data: which of these candidates actually have the minimum support of 50 percent? It is possible that they have it, but whether they do, we have to check. Let's look at 1 2: it happened once, not enough. 1 3: here is 1 3 and here is 1 3, so twice. 1 5 occurs only once. And so on; you calculate the supports, and {1, 2} and {1, 5} are definitely not in the game. All the others are frequent itemsets of size 2. Now how do we join them to itemsets of size 3? Can I join something that starts with 1? No, I don't have a second one starting with 1 any more. Can I do anything with 2? Yes, here is a second one starting with 2, so join them: 2 3 5. Anything with 3? No. So this is the only candidate for three-element frequent itemsets. How about its subsets? 2 3 is here, 3 5 is here, 2 5 is here, so there is no need to prune. Is it really frequent? Let's look where 2 3 5 occurs: it happened here and it happened here. That's it: two occurrences, a support of 50 percent, so it definitely is in the set. Can we join it with something? No, we don't have anything. So I considered five one-element itemsets, six two-element itemsets and a single three-element itemset. That is much less than all the combinations, all the subsets, of the five items. Very efficient. That was the first step of Apriori.
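The whole first step can be put together in a short sketch. The four transactions below are an assumption: the slide is not in the transcript, so they are reconstructed to match the counts read out in the lecture (item 1 twice, items 2, 3 and 5 three times, item 4 once, {2, 3, 5} twice). The function name is illustrative.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset (sorted tuple): count} for all frequent itemsets."""
    min_count = min_support * len(transactions)

    # Pass 1: count the single items.
    counts = {}
    for t in transactions:
        for item in t:
            counts[(item,)] = counts.get((item,), 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    while frequent:
        prev = sorted(frequent)
        # Join step: same head, different last item.
        candidates = [
            a + (b[-1],)
            for i, a in enumerate(prev)
            for b in prev[i + 1:]
            if a[:-1] == b[:-1]
        ]
        # Prune step: every (k-1)-subset must be frequent.
        candidates = [
            c for c in candidates
            if all(s in frequent for s in combinations(c, len(c) - 1))
        ]
        # One pass over the data per level to count the survivors.
        counts = {
            c: sum(1 for t in transactions if set(c) <= t)
            for c in candidates
        }
        frequent = {c: k for c, k in counts.items() if k >= min_count}
        result.update(frequent)
    return result


# Assumed transactions, reconstructed from the counts in the lecture.
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
freq = apriori_frequent_itemsets(transactions, 0.5)
for itemset in sorted(freq, key=lambda s: (len(s), s)):
    print(itemset, freq[itemset])
```

On this data the sketch finds four frequent 1-itemsets, four frequent 2-itemsets and the single 3-itemset (2, 3, 5), with one scan of the data per level.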
1:19:07
Let's go to the more difficult one, the second step. In step two we generate rules from the frequent itemsets and evaluate their confidence. For that, the items of a frequent itemset have to be distributed, because a frequent itemset is just something like {1, 2, 3}, whereas an association rule looks like: who bought 1 and 2 also bought 3. So there are several ways to split the items of the frequent itemset to get association rules. What we do is distribute the items of the frequent itemset over the body and the head of the rule. Then we need the support of the whole itemset divided by the support of the body, which gives us the confidence: basically how often the items occur together, divided by how often the body of the rule occurs on its own. Because if the body often occurs on its own, it is not the cause of the head. And this is what confidence wants to measure: the support of X and Y together divided by the support of X. So what can we do? We have the frequent itemset {2, 3, 5}, which has a support of 50 percent, as we calculated. What does that entail in terms of subsets?
1:21:01
We have {2, 3}, {2, 5}, {3, 5}, {2}, {3} and {5}; those could be the body of a rule. And the head of the rule follows from what is missing: if the body of the rule is 2 3, then the 5 is missing and is put in the head. We already calculated the supports of the different sets in the last step; for example, {2, 3} has a support of 50 percent, {2, 5} has a support of 75 percent, and so on. Now we generate the rules that are possible: 2 3 makes 5, 2 5 makes 3, 3 5 makes 2, 2 makes 3 5, 3 makes 2 5, and 5 makes 2 3. And now we need the confidence. Confidence is basically the occurrences of all the items divided by the occurrences of the items that make up the body of the rule, which is easy to calculate. For example, how often were 2, 3 and 5 bought together? Two times. How often were 2 and 3 bought together? Also two times. That means we have a confidence of 100 percent for this rule, because 2 and 3 are never bought unless 5 is also bought. It might be a cause: very high confidence, 100 percent. For the others we find that there are actually two rules with 100 percent, and all the other rules have a confidence of 67 percent, which for some purposes is also very high. You would only get those as association rules if the minimum confidence is set that much lower. It is easy to calculate from the counts we already have. You do that for all the different frequent itemsets that you have, and you get the rules while only considering a minimum number of candidates.
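As a sketch of this second step, the rules for {2, 3, 5} can be generated from stored support counts alone. The counts below are those of the assumed four-transaction example (reconstructed to match the figures read out in the lecture, such as 50 percent for {2, 3} and 75 percent for {2, 5}); the function name `rules_from` is illustrative.

```python
from itertools import combinations

# Support counts out of four transactions (assumed example data).
support_count = {
    frozenset({2}): 3, frozenset({3}): 3, frozenset({5}): 3,
    frozenset({2, 3}): 2, frozenset({2, 5}): 3, frozenset({3, 5}): 2,
    frozenset({2, 3, 5}): 2,
}


def rules_from(itemset, support_count, min_confidence):
    """Split a frequent itemset into body -> head in every possible way."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for body in combinations(sorted(itemset), size):
            body = frozenset(body)
            # confidence = count(whole itemset) / count(body)
            conf = support_count[itemset] / support_count[body]
            if conf >= min_confidence:
                rules.append((sorted(body), sorted(itemset - body), conf))
    return rules


for body, head, conf in rules_from({2, 3, 5}, support_count, 0.0):
    print(body, "->", head, f"confidence = {conf:.2f}")

strong = rules_from({2, 3, 5}, support_count, 0.8)
print([(b, h) for b, h, _ in strong])
# prints: [([2, 3], [5]), ([3, 5], [2])]
```

No pass over the data is needed here, because every body is a subset of a frequent itemset and its count was already recorded in step one.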
1:24:16
So step two is easy. To summarize:
1:24:31
With the damage
1:24:35
We want association rules of the type body makes head, for which we need the support of the body of the rule and the support of the whole frequent itemset. And of course we already know both, because we counted them when calculating the frequent itemsets. So testing the support is not a problem and testing the confidence is not a problem either; there is no need to read the data again.
1:25:05
For every step of the algorithm, for every extension of the frequent itemsets, we make just one pass over the data. So if our largest itemset is of size k, we pass over the data k times. Given that there are exponentially many possible association rules, that is quite efficient, and each pass is a linear scan of the data. Choosing a high minimum support and a high minimum confidence makes it especially fast, because the frequent itemsets will break down nicely and the candidate generation can be restricted to only a few sets. On the other hand, it is sometimes interesting to focus on rare items. What about shoe polish? I never get any rules with shoe polish, because it is not bought that often in the supermarket, as opposed to milk or yoghurt or whatever it is that people buy every day. They buy shoe polish maybe once a month. So all the rules will be considering milk and yoghurt and bread and whatever, and no rule considers shoe polish, because it is just rare. What do we do about that? There is a variant of the algorithm, basically mining with multiple minimum supports. We say the minimum support is not one global value bound to the whole set of items in my shop; instead, every product gets a special minimum support that reflects how often this product as a whole is bought.
1:27:24
That sounds easy but poses some problems. The basic point is that it really introduces the rare items into the associations. If you have, for example, a cooking pan and a frying pan, and you are sure that they are bought much less frequently than bread, you just lower the minimum support for them, and they have a chance of getting into the frequent itemsets. And then you can find rules for them.
1:28:01
This is exactly the rare item problem. If the minimum support is set too high, you will never find rules involving rare items. If you need rules that involve both frequent and rare items and you use one global minimum support, you would have to set it very low, and then you generate a lot of rules about bread and milk and everything, because everything is over the minimum support. So a global minimum support cannot do the trick; you need individual ones. And that is what I am stating here: you have a minimum support for each item.
1:28:45
Then you need to find out which rules make sense. A rule about shoe shine and bread is not reasonable, because bread is a frequently bought item that is usually bought without the shoe shine. The confidence of rules like that is bound to be very low; you don't want them. How about things that are bought about equally rarely? Maybe people buy shoe shine with shoelaces, which are also rarely bought. There the confidence could be very high. So basically you restrict the frequent itemsets to those where the items do not have too far diverging supports. You don't want bread and shoe shine, but you might want shoe shine and shoelaces, or shoe shine and some other rare product. Things that are very rarely bought could have something to do with each other. Well, they might not, but still both are rare items. So what you basically do is look at the maximum support of any item in the set and at the minimum support of any item in the set, and if they diverge too much, say one item has a support of 50 percent and another one has a support of 10 percent, you just don't consider the set. It makes no sense.
1:30:45
Once I have chosen my frequent itemset, every item in the frequent itemset may have a different minimum support. What do you do for the rules that you generate? The basic idea is: take the minimum of the minimum support values that you have, and that is the minimum support value for the whole rule. For example, say the user-specified minimum supports for bread, shoes and clothes are 2 percent, 0.1 percent and 0.2 percent. Then we could have a rule saying that everybody who buys clothes will also buy bread, with a support of 0.15 percent and a confidence of 70 percent. But the minimum supports of clothes and bread are 0.2 percent and 2 percent, so the minimum is 0.2 percent. A support of 0.15 percent is too low for that; the rule doesn't meet the minimum support. On the other hand, take the rule that everybody who buys clothes will also buy shoes, taken from the same frequent itemset, with the same support and confidence. This time the 0.15 percent works, because it is larger than 0.1 percent. So depending on what items your rule contains, you define the minimum support for the rule as the minimum of the minimum supports of all the items in it.
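The bread, shoes and clothes example can be checked with a few lines. This is a sketch: the values are the percentages as understood from the lecture, and the names `mis`, `rule_min_support` and `meets_min_support` are made up for the illustration.

```python
# Minimum item supports in percent, as understood from the lecture example.
mis = {"bread": 2.0, "shoes": 0.1, "clothes": 0.2}


def rule_min_support(items, mis):
    """Minimum support of a rule: the smallest minimum support of its items."""
    return min(mis[i] for i in items)


def meets_min_support(items, support, mis):
    return support >= rule_min_support(items, mis)


# clothes -> bread with support 0.15 percent: threshold is min(0.2, 2.0) = 0.2
print(meets_min_support({"clothes", "bread"}, 0.15, mis))   # False
# clothes -> shoes with support 0.15 percent: threshold is min(0.2, 0.1) = 0.1
print(meets_min_support({"clothes", "shoes"}, 0.15, mis))   # True
```

The same support value passes for one rule and fails for the other, purely because of which items the rule contains.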
1:32:39
The problem with multiple minimum supports is that the downward closure property breaks. Here is why. Consider that we have four items in the database, 1 to 4, with different minimum supports: 10 percent, 20 percent, 5 percent and 6 percent. Then we could again join them, for example 1 and 2. Say the set has a support of 9 percent. The minimum supports are 10 percent and 20 percent, so the minimum is 10 percent, and this is not a frequent itemset. With plain Apriori it would be discarded, and nothing would be built on it. But if we take {1, 2, 3}, we get 10 percent, 20 percent and 5 percent. Now the minimum support for the set is 5 percent, not 10 percent as was the case before. So the one set has a minimum support of 10 percent, and the other has a minimum support of 5 percent. What the downward closure property tells us is that the frequency of {1, 2, 3} is definitely smaller than or equal to the frequency of {1, 2}. But now that we also have a smaller minimum support, that frequency could be sufficient. And if we had not created {1, 2}, we would never have thought of {1, 2, 3}, which might have 8 percent or so. So we have to find a way to fix that.
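A small numeric sketch shows how the property breaks. The minimum supports and the 9 percent for {1, 2} follow the lecture's example; the 8 percent for {1, 2, 3} is a hypothetical value chosen for illustration.

```python
# Minimum item supports from the example: items 1 to 4.
mis = {1: 0.10, 2: 0.20, 3: 0.05, 4: 0.06}

# Observed supports; the value for (1, 2, 3) is hypothetical.
support = {(1, 2): 0.09, (1, 2, 3): 0.08}


def itemset_mis(itemset):
    """Minimum support of an itemset: smallest minimum support of its items."""
    return min(mis[i] for i in itemset)


def is_frequent(itemset):
    return support[itemset] >= itemset_mis(itemset)


print(is_frequent((1, 2)))     # False: 0.09 is below min(0.10, 0.20)
print(is_frequent((1, 2, 3)))  # True: 0.08 is above min(0.10, 0.20, 0.05)
```

The superset is frequent although one of its subsets is not, so a candidate generation that builds only on frequent subsets would miss it.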
1:34:56
Basically, we order the items again, but this time not lexicographically but according to their minimum supports, in ascending order: the item with the smallest minimum support comes first, then the next larger one, and so on. The idea behind that is: if we do so and always keep the beginning of the list of items in an itemset, the minimum support of the set stays the same. It can only change when the one item at the front is taken out.
1:35:44
The multiple minimum supports algorithm is a straightforward extension of the Apriori algorithm. Again we have two steps. Step one is the frequent itemset generation, only that the candidates are now generated with respect to the new multiple minimum supports. The pruning is slightly different from the Apriori prune step, because we still have to consider some of the sets that don't meet their minimum support but might be useful in some bigger set where they will meet the minimum support. And step two, the rule generation, works exactly as in Apriori.
1:36:42
For the frequent itemset generation, we take the items and their minimum supports, which are 10 percent, 20 percent, 5 percent and 6 percent. Now we sort the items, starting with the lowest minimum support and going up to the highest. With 5 percent, item 3 is the first item, item 4 is the second, then item 1 with 10 percent and item 2 with 20 percent. OK, so far nothing has happened. Now we scan the data and count each item: item 3 occurs 6 times, item 4 three times, item 1 nine times and item 2 occurs 25 times, so this is a very frequent item. With 100 transactions, the support of 2 is 25 percent, the support of 3 is 6 percent, of 4 it is 3 percent and of 1 it is 9 percent. Having done that, we go through the items and find the first item that meets its minimum support, and this item is the seed for generating the candidate sets.
1:38:06
So we start with 3. The minimum support of 3 is 5 percent, and we have 6 occurrences in 100 transactions, which makes 6 percent, so 3 makes it. The next one is 4: we have 3 occurrences, and that is below the 5 percent, so 4 is out. Then further on, item 1, which has a minimum support of 10 percent: we have 9 occurrences. So it doesn't meet its own minimum support, but it meets the minimum support of the first item, so it is kept. It is not in itself a frequent item, but put together with the 3 it might become part of a frequent itemset. We do the same with item 2: it has a minimum support of 20 percent and occurs 25 times, so it is in, not only because it is over the 5 percent but also because it meets its own minimum support. So the seed list is 3, 1 and 2. And now we find out that item 1 does not reach its own minimum support and has to be excluded from the frequent one-element sets; it is not a frequent 1-itemset. Why do we consider the minimum support of the first item at all? Because item 1 could be interesting for building bigger frequent itemsets, though it is not frequent itself. I could combine it with the 3. It has 9 percent, and 3 has a minimum support of 5 percent, so put together with the 3 it is above the relevant minimum support. This is why we keep it.
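The first pass just described can be sketched as follows, using the minimum supports and counts from the example (100 transactions). The function name `first_pass` and the variable names are illustrative, not taken from the lecture.

```python
# Minimum item supports and occurrence counts from the example.
mis = {1: 0.10, 2: 0.20, 3: 0.05, 4: 0.06}
counts = {1: 9, 2: 25, 3: 6, 4: 3}
n = 100  # number of transactions


def first_pass(mis, counts, n):
    """Return (seed list, frequent 1-itemsets), both in ascending MIS order."""
    order = sorted(mis, key=lambda item: mis[item])
    seeds = []
    for item in order:
        sup = counts[item] / n
        if not seeds:
            # The first seed must meet its own minimum support.
            if sup >= mis[item]:
                seeds.append(item)
        elif sup >= mis[seeds[0]]:
            # Later items only need the minimum support of the first seed.
            seeds.append(item)
    frequent_1 = [i for i in seeds if counts[i] / n >= mis[i]]
    return seeds, frequent_1


seeds, frequent_1 = first_pass(mis, counts, n)
print(seeds)       # [3, 1, 2]
print(frequent_1)  # [3, 2]
```

Item 1 stays in the seed list although it is not a frequent 1-itemset, and item 4 is dropped entirely.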
1:41:08
Now we have to build the two-element sets. We found out that items 2 and 3 meet their own minimum supports, so they can be mixed with others. Item 1 is not frequent itself, but it is still a candidate for joining. And item 4 doesn't even meet the lowest minimum support, so it is definitely out of the game. We are also assuming that we do not want any rules where the spread of the supports is too far apart. So we take the first item and use the seed list 3, 1, 2. Those two are good, and this one is not a frequent item but is a contender for joining. Let's start with the 3 and see whether we can join it with the others. The support of 3 is above its minimum support, so we can build level 2 candidates with it. There are two possibilities: {3, 1} and {3, 2}.
1:43:05
{3, 1} is a candidate: the support of 1 is larger than the minimum support of 3, and the supports of 3 and 1 are close together; this one has a support of 0.09 and this one has a support of 0.06, which is close enough. Now let's look at {3, 2}: 25 percent and 6 percent, very far apart. They differ by more than 10 percent, so we cut it out. Those two items have nothing to do with each other, like the bread and the shoe shine. But see that we have created a candidate that matches the minimum support requirements using an item that in itself is not a frequent item. Now we are done with the 3 and take the next seed from the list, which is 1. The support of 1 is smaller than its own minimum support, so we cannot use it as a seed. The candidate generation is completed with only a single candidate, {3, 1}.
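The generation of the two-element candidates can be sketched like this. The seed list, counts and minimum supports are those of the example; the maximum allowed spread of 10 percent is taken from the "differs by more than 10 percent" remark, and the function name is illustrative.

```python
def level2_candidates(seeds, mis, counts, n, max_spread):
    """Build 2-item candidates from the seed list (ascending MIS order)."""
    sup = {item: counts[item] / n for item in seeds}
    candidates = []
    for pos, low in enumerate(seeds):
        if sup[low] < mis[low]:
            continue  # an item below its own minimum support cannot go first
        for high in seeds[pos + 1:]:
            close_enough = abs(sup[high] - sup[low]) <= max_spread
            if sup[high] >= mis[low] and close_enough:
                candidates.append((low, high))
    return candidates


mis = {1: 0.10, 2: 0.20, 3: 0.05, 4: 0.06}
counts = {1: 9, 2: 25, 3: 6, 4: 3}
seeds = [3, 1, 2]

print(level2_candidates(seeds, mis, counts, 100, 0.10))  # [(3, 1)]
```

The pair (3, 2) is rejected because the supports differ by 19 percentage points, and item 1 cannot start a pair because it misses its own minimum support.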
1:44:53
Now we look at the data again, read the transactions and find out that the support of {3, 1} is 6 percent, which is larger than the minimum support of the set, and that is the minimum support of 3, which was 5 percent. So this candidate is a frequent itemset. That is basically how it works. If you want to have bigger sets, it works exactly like before, but we have to look at how far the supports diverge across the set; this is another check. Otherwise we do exactly the same thing, which means we look at the front of the sets.
1:45:45
If the heads of two sets are identical up to the last item, they are joined into a new candidate, and then we need to prune. The point is that we cannot simply go through all the subsets of size k minus 1 and check whether each is a frequent set, because some of them might have been pruned away due to a higher minimum support; the downward closure property does not hold any more. If we do find all the subsets, everything is good and the candidate is still valid. But it could be that the head of the candidate, the first item, the one with the smallest minimum support, is missing from a subset. That would increase the minimum support of the subset, because we have ordered the items so that the smallest minimum support is at the front. So the prune step is the same as before, with one exception: we look at the head item. If a subset does include the head item, then it has to be among the frequent sets that have been calculated before, and if it is not, the candidate goes out. If a subset does not contain the head item, it does not have to be in there. For example, we have a couple of three-element sets and we join them: {1, 2, 3} and {1, 2, 5} make {1, 2, 3, 5}; {1, 3, 4} and {1, 3, 5} make {1, 3, 4, 5}; {1, 4, 5} and {1, 4, 6} make {1, 4, 5, 6}. Nothing else can be joined.
1:47:59
And after pruning we might be left with {1, 2, 3, 5} and {1, 3, 4, 5}. Let's figure out why. We need all the three-element sets that can be built out of each four-element candidate. For the first one: 1 2 3 is there, 1 2 5 is there, 1 3 5 is there and 2 3 5 is there, so we keep it. Let's look at the second one, 1 3 4 5: 1 3 4 is there, but 3 4 5 is missing. Does it matter that it is missing? No, it does not matter, because 3 4 5 does not include the item with the minimum support. Item 1 is the one that presses the minimum support down, so a subset without it does not need to be in the set. The other subsets are 1 4 5 and 1 3 5, and they are both there, so we keep this candidate as well. Then the last one, 1 4 5 6: we have 1 4 5 and we have 1 4 6; 4 5 6 doesn't have to be there, because the first item is not in it; and then 1 5 6, oops, that one is missing, and the first item is included in it. So the whole candidate goes out. The pruning is the same as before, only that you don't have to find those subsets in which the first item is not contained. And that's all.
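The modified prune step can be sketched as below. The frequent 3-itemsets are the ones named in the walkthrough; the minimum support values are hypothetical and only chosen so that the items are already in ascending order. The extra equality check on the second item follows the usual formulation of the multiple-minimum-support prune step and is not mentioned in the lecture, so treat it as an assumption.

```python
from itertools import combinations


def ms_prune(candidates, frequent_prev, mis):
    """Prune candidates whose relevant (k-1)-subsets are not frequent.

    Items in each tuple are assumed to be sorted by ascending minimum support.
    """
    prev = set(frequent_prev)
    kept = []
    for c in candidates:
        ok = True
        for s in combinations(c, len(c) - 1):
            # Only subsets with the same minimum support as the candidate
            # are required to be frequent: those containing the head item,
            # or any subset if the second item has the same minimum support.
            must_be_frequent = c[0] in s or mis[c[1]] == mis[c[0]]
            if must_be_frequent and s not in prev:
                ok = False
                break
        if ok:
            kept.append(c)
    return kept


# Hypothetical minimum supports, strictly increasing with the item number.
mis = {1: 0.01, 2: 0.02, 3: 0.03, 4: 0.04, 5: 0.05, 6: 0.06}
f3 = [(1, 2, 3), (1, 2, 5), (1, 3, 4), (1, 3, 5),
      (1, 4, 5), (1, 4, 6), (2, 3, 5)]
c4 = [(1, 2, 3, 5), (1, 3, 4, 5), (1, 4, 5, 6)]

print(ms_prune(c4, f3, mis))  # [(1, 2, 3, 5), (1, 3, 4, 5)]
```

(1, 3, 4, 5) survives although (3, 4, 5) is missing, because that subset lacks the head item; (1, 4, 5, 6) is dropped because (1, 5, 6) contains the head item and is not frequent.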
1:50:58
Now for the rule generation. We found that the downward closure property does not hold any more. That means we may have a frequent k-itemset that contains a subset of size k minus 1 which is not frequent. So we may have to build rules from something whose subsets are not frequent itemsets. And that is a problem, because we don't have their supports recorded; we have to compute the supports for those rules. The problem is that we have to go over the data again, which is an efficiency problem, not an effectiveness problem. This is the so-called head item problem: we might end up with some sets for which we have to look at supports but don't have them, because the subset in question is not frequent. The easy way to deal with that is: if something is missing, go over the data again and find out. Of course, this is not a very efficient way to deal with it.
1:52:27
There are some ways to do that, and we will come to that. A typical example is a three-item set, say {shoes, clothes, bread}. The minimum support for it is the minimum over all three of its items. So let's say we have the values for bread, clothes and shoes, and 0.1 is definitely the minimum, so this one is OK. Now let's say, for example, that {clothes, bread} has a support of 0.15 and {shoes, clothes, bread} has a support of 0.12, which meets the minimum support. But what about {clothes, bread}? Well, for {clothes, bread} the minimum here is 0.2, so the minimum support for this set does not hold any more, and I don't have its count for computing a confidence. All the rules I can build that have shoes in the head I cannot compute, and they might be true. I just cannot compute them without accessing the data. OK.
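The head-item problem in this example can be sketched in a few lines of Python. The supports 0.15 and 0.12 and the thresholds 0.1 and 0.2 are the numbers quoted in the lecture; which of clothes and bread carries the 0.2, the value for the other item, and all function names are assumptions made for illustration.

```python
# Minimum item supports. 0.1 and 0.2 follow the lecture; assigning 0.2 to
# clothes and 0.3 to bread is an assumption.
mis = {"shoes": 0.1, "clothes": 0.2, "bread": 0.3}

triple = frozenset({"shoes", "clothes", "bread"})
pair = frozenset({"clothes", "bread"})

# True supports in the data (values quoted in the lecture).
support = {triple: 0.12, pair: 0.15}

def is_frequent(itemset):
    """An itemset is frequent if it reaches the lowest MIS of its items."""
    return support[itemset] >= min(mis[item] for item in itemset)

# A level-wise miner only keeps the supports of frequent itemsets.
stored = {s: v for s, v in support.items() if is_frequent(s)}

def confidence(body, head, stored):
    """Confidence of body -> head, or None if a needed support is missing."""
    whole = body | head
    if whole not in stored or body not in stored:
        return None
    return stored[whole] / stored[body]

rule_conf = confidence(pair, frozenset({"shoes"}), stored)
print(rule_conf)  # None: the support of {clothes, bread} was never stored

# With access to the data the rule could be evaluated after all.
true_conf = support[triple] / support[pair]
```

The three-item set is frequent, but the body {clothes, bread} is not, so the confidence of the rule with shoes in the head cannot be computed from the stored supports alone.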
1:53:57
Not too difficult. This is what we call the head-item problem: I can calculate the support values and the confidence for some rules, and for some I cannot do it without reading the data again. And, well, basically there is a possible solution that gives at least a high probability of getting by without reading the data again. If I take every frequent itemset and take the item with the lowest minimum support out of it, I can record the support value for those reduced itemsets as well. It is not guaranteed that I will need them, but it is probable, so calculating them while you run through the data anyway is a good idea. The advantage of multiple minimum supports is that they are very realistic for practical applications, because bread and shoeshine are also sold together, but they are not the same thing.
1:55:30
One is a convenience product that is bought every day; the other is an article that is bought basically once every four months or so. You cannot treat them with one global threshold. If you use multiple minimum supports, you will not end up with rules saying that people buy bread because they buy shoeshine, or something like that, but rather with rules of the type that people buy laces because they buy shoeshine. These are kind of like rare-item rules, and that is what we set out to do. Basically, if we set the minimum support value to 100 per cent, we can prevent items from occurring in rules at all. That is also nice: if I am not interested in anything that has something to do with bread, I just ask for a minimum support for bread of 100 per cent. As soon as there is a single transaction without bread, it is thrown out, which removes bread very effectively, because you exclude the item from the frequent itemsets. That may be good. Yeah, and that's it for today in terms of algorithms. Yes?
1:57:13
This is basically kind of like the steering wheel: setting the minimum support sets the granularity of the rules that you are interested in. If you are really only interested in rules that have something to do with your top products, raise the bar. If you are interested in all your products and their relationships, with thousands of rules, set a small minimum support. So basically it sets the granularity of how you look at your data. So, at the end of today's lecture we look at another kind of rules, class association rule mining. OK. So, in the approaches we have seen so far, we were performing association rule mining in the market basket style, but this kind of mining has no target: I am not expecting something specific on the right side of my rules. Class association rule mining introduces specific semantics into this kind of rules by exploiting classes. The idea is that we predefine a set of classes and try to identify those rules which explain which items correlate with those classes. The best way to explain this is with an example, and I have taken an example from the area of text mining, which is a typical setting for class association rules.
1:59:21
OK, so let's imagine we have a set of documents. Here I have taken seven documents, each speaking about one topic: the first three documents speak about education and the last four about sport. We have the words of each document. Of course, for the sake of the example I have taken a very small number of words for each document; you should imagine hundreds of them. I have taken a few words connected to education for the first document, and so on. We apply classical Apriori and we come to rules with the support and confidence values which we have seen before. So, for example, for the rule from student to education we find the support as a count divided by 7, the number of documents. This has to be over the minimum, just as before, for both the minimum support condition and the minimum confidence condition. The association rules that I am getting, as you can see, also have a specific structure, namely the words on the left side and the classes on the right side. This is the whole idea of class association rule mining.
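A small Python sketch of class association rules on seven labelled documents. The word lists, the thresholds of 20 per cent support and 60 per cent confidence, and the function name `class_rules` are assumptions for illustration; only the split into three education and four sport documents comes from the lecture. For brevity the sketch enumerates word combinations by brute force instead of using the Apriori join and prune steps.

```python
from itertools import combinations

# Assumed example data: (words of the document, class label).
docs = [
    ({"student", "teach", "school"}, "education"),
    ({"student", "school"}, "education"),
    ({"teach", "school", "city", "game"}, "education"),
    ({"baseball", "basketball"}, "sport"),
    ({"basketball", "player", "spectator"}, "sport"),
    ({"baseball", "coach", "game", "team"}, "sport"),
    ({"basketball", "team", "city", "game"}, "sport"),
]

def class_rules(docs, min_sup, min_conf, max_len=2):
    """Return {(words, class): (support, confidence)} for rules words -> class."""
    n = len(docs)
    words = sorted(set().union(*(w for w, _ in docs)))
    classes = sorted({c for _, c in docs})
    rules = {}
    for k in range(1, max_len + 1):
        for cond in combinations(words, k):
            cond_set = set(cond)
            # Documents that contain all words of the condition.
            cond_count = sum(1 for w, _ in docs if cond_set <= w)
            if cond_count == 0:
                continue
            for cls in classes:
                # Documents that contain the words and carry the class label.
                rule_count = sum(1 for w, c in docs
                                 if cond_set <= w and c == cls)
                sup = rule_count / n
                conf = rule_count / cond_count
                if sup >= min_sup and conf >= min_conf:
                    rules[(cond, cls)] = (sup, conf)
    return rules

rules = class_rules(docs, min_sup=0.2, min_conf=0.6)
for (cond, cls), (sup, conf) in sorted(rules.items()):
    print(cond, "->", cls, round(sup, 3), round(conf, 3))
```

Because the class is fixed as the right side of every rule, no further splitting of frequent itemsets into bodies and heads is needed.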
2:01:03
The advantage is that association rule mining with classes can be performed in just one step. I go through Apriori with the candidate join and prune and identify the frequent itemsets, but then I don't need to perform any combinations to extract the rules, because I already know that the classes are on the right side of my rules. So basically I have each transaction as before, and the resulting rules are of the form conditions imply class, where the conditions are my words or items and the labels are my classes. It is pretty simple: the Apriori algorithm can be used to discover class association rules just as before, only adding semantics through the classes. Of course, the idea of multiple minimum supports can also be used here, by assigning different classes different minimum supports. A classic example of this: I have defined
2:02:24
two classes, one positive and one negative, with the corresponding items. And then I say: OK, I am really interested in what makes the positive class, so I am going to use a low minimum support for it, and I am going to use a higher minimum support for the negative class, which I am not that interested in but which comes in my data anyway. One can also exclude it entirely from the analysis by using a minimum support of 100 per cent.
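The effect of class-specific minimum supports can be sketched as follows. The ten records, the thresholds and the function name `rules_per_class` are made up for illustration, and the sketch only considers single-item conditions to stay short.

```python
# Hypothetical records: (items, class label). The positive class is rare.
data = [
    ({"a", "b"}, "positive"),
    ({"a"}, "positive"),
    ({"b", "c"}, "negative"),
    ({"c"}, "negative"),
    ({"c"}, "negative"),
    ({"c", "d"}, "negative"),
    ({"c"}, "negative"),
    ({"d"}, "negative"),
    ({"c"}, "negative"),
    ({"c", "d"}, "negative"),
]

def rules_per_class(data, min_sup_by_class, min_conf):
    """Single-item rules item -> class, with one support threshold per class."""
    n = len(data)
    items = sorted(set().union(*(t for t, _ in data)))
    rules = {}
    for item in items:
        cond_count = sum(1 for t, _ in data if item in t)
        for cls, min_sup in min_sup_by_class.items():
            rule_count = sum(1 for t, c in data if item in t and c == cls)
            sup = rule_count / n
            conf = rule_count / cond_count
            if sup >= min_sup and conf >= min_conf:
                rules[(item, cls)] = (sup, conf)
    return rules

# Low threshold for the interesting class, 100 per cent to exclude the other.
focused = rules_per_class(data, {"positive": 0.1, "negative": 1.0}, 0.6)
# One global threshold for comparison.
global_rules = rules_per_class(data, {"positive": 0.5, "negative": 0.5}, 0.6)
print(focused)
print(global_rules)
```

With the single global threshold only the rule for the frequent negative class is found; with class-specific thresholds the rare positive rule is found and the negative class drops out.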
2:02:57
Some classical tools for performing association rule mining: Apriori, which we have discussed, and the variations we have discussed are offered both in open-source tools and in commercial ones. Among the best-known open-source tools for association rule mining, and for data mining in general, is RapidMiner. You can download such tools and test them, and see what kind of results they offer. Then there are commercial solutions, which are more powerful when it comes to scalability than RapidMiner, and of course Oracle also has its own.
2:03:57
I wanted to bring practical examples, so I tried association rule mining in one of these tools. From a university's dataset repository you can of course find datasets for performing exactly this kind of association rule mining, and I downloaded a dataset about customer preferences when it comes to buying a car. The classes range from unacceptable over acceptable up to very good, and the attributes which customers took into account when deciding were the cost of the car, which varies from very high to low, the maintenance cost, the number of doors, the luggage compartment, and there were more attributes as well.
2:05:05
I performed the study with one of these tools. In the interface one can say how many rules one wants, for example 100 rules, and one can set the support, the confidence and the class index in case one wants to perform class association rule mining. In this case I ran classical Apriori with classical association rules.
2:05:42
And I wanted to see what comes up. You can see that the first rule, for example, was that a two-person car was found unacceptable. This is pretty convincing: it has a confidence of 100 per cent. The same can be said, for example, of cars for two persons with a small luggage compartment: again unacceptable, and so on. These are pretty simple rules. Apriori displays first the rules with just a small number of items in the left part. So I waited a bit longer and also received more complex rules, like for example a car which was found unacceptable due to its low safety. So if a car is perceived as unacceptable, it is most certainly because of its safety. And if you wait some more, or define a lower support, you get even more complex rules, which also decrease in confidence due to the large number of records.
2:07:14
OK, and then I wanted to try something larger. I took an accidents database, which is quite big: 150 thousand records and 54 attributes. I was curious to see what kind of patterns come out. There was information about where the accident took place and what the conditions were; all of this information is in there. So I wanted to see what the rules look like, in order to find out what kind of conditions are more frequent for such accidents. I started it, and with only 54 attributes and 150 thousand records it ran out of memory. So you can imagine that the complexity of the algorithm can go up pretty fast, and about two gigabytes of memory were not enough. Of course there are improvements of this algorithm which achieve this kind of performance and can cope with this amount of data. I was also interested to see how commercial tools can run such algorithms, and found out that what they are doing is actually quite simple: they just throw hardware at something like this, and they try to keep as much as possible in memory.
2:08:59
So, what I would like you to take from today's lecture: we have spoken about business intelligence; we have spoken about the importance of segmenting the customers, about the propensity of the customers to buy, how and why they decide to buy, what the profitability of the customer is, and how to keep customers loyal to the company. We have spoken about data mining, what data mining is and what the interesting algorithms are. And we started association rule mining, where we introduced the Apriori algorithm, which is based on two measures of the strength of a rule: support and confidence. The most important part of Apriori is the downward closure property, which minimizes the intermediate results by pruning. We have also discussed association rule mining with multiple minimum supports, which solves the rare-item problem, and we discussed the head-item problem, which results from using multiple minimum supports.
2:10:35
After the holidays we will continue with data mining as an application on data warehouses, with time series and similarity search, and sequence patterns. And have a nice ...
Metadata
Formal Metadata
Title  Data Mining Overview, Association Rule Mining (16.12.10) 
Title of Series  Data Warehousing and Data Mining Techniques (WS 2010/2011) 
Part Number  8 
Number of Parts  13 
Author 
Balke, Wolf-Tilo

Contributors 
Homoceanu, Silviu

License 
CC Attribution  NonCommercial 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
DOI  10.5446/333 
Publisher  Technische Universität Braunschweig, Institut für Informationssysteme 
Release Date  2010 
Language  English 
Producer 
Technische Universität Braunschweig

Production Year  2011 
Production Place  Braunschweig 
Content Metadata
Subject Area  Information technology 
Abstract  In this course, we examine the aspects of building, maintaining and operating data warehouses and give an insight into the main knowledge discovery techniques. The course deals with basic issues like storage of the data, execution of analytical queries and data mining procedures. The course will be taught completely in English. The general structure of the course is: typical DW use case scenarios; basic architecture of DW; data modelling on a conceptual, logical and physical level; multidimensional E/R modelling; cubes, dimensions, measures; query processing, OLAP queries (OLAP vs. OLTP), roll-up, drill-down, slice, dice, pivot; MOLAP, ROLAP, HOLAP; SQL99 OLAP operators, MDX; snowflake, star and starflake schemas for relational storage; Multimedia physical storage (linearization); DW indexing as a search optimization means: R-trees, UB-trees, bitmap indexes; other optimization procedures: data partitioning, star join optimization, materialized views; ETL; association rule mining, sequence patterns, time series; classification: decision trees, naive Bayes classification, SVM; cluster analysis: K-means, hierarchical clustering, agglomerative clustering, outlier analysis.