Latent Semantic Indexing (11.5.2011)
Formal Metadata
Title 
Latent Semantic Indexing (11.5.2011)

Title of Series  
Part Number 
5

Number of Parts 
13

Author 

Contributors 

License 
CC Attribution - NonCommercial 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and noncommercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor. 
Identifiers 

Publisher 

Release Date 
2011

Language 
English

Producer 

Production Year 
2011

Production Place 
Braunschweig

Content Metadata
Subject Area  
Abstract 
This lecture provides an introduction to the fields of information retrieval and web search. We will discuss how relevant information can be found in very large and mostly unstructured data collections; this is particularly interesting in cases where users cannot provide a clear formulation of their current information need. Web search engines like Google are a typical application of the techniques covered by this course.

00:00
Welcome again to the new lecture of Information Retrieval and Web Search Engines. Today we want to talk a little bit about what we can do to the term space, where the documents are represented, to get a feeling for what a document is actually describing: what the topic of the document is, what the main purpose of the document is. The main technique for that is called latent semantic indexing, and this is what we will be doing today.
As a motivation: so far we were assuming that the axes in the term space are independent. But two terms may mean exactly the same, and documents on different axes could actually be about the same thing. This is one of the notions that became very clear in the 1990s. That was not very early for information retrieval, but it was very clear that what users wanted was not really a focus on single terms, but rather a focus on topics, on things related to the query. To do that, you need to relate each individual term to some topics, you need to relate the documents in the collection to topics, and then of course, on the IR side, you need to relate the queries to topics. We know from web search statistics that most queries consist of about two to three keywords, and of course it is much more difficult to get a topic from three individual words than from a whole text, where we get many words. So there are different techniques, and we want to go into some of them to see how we can do better.
The naive approach when dealing with topics is basically the easiest one. Think about the Library of Alexandria, around 300 before Christ or something like that: a very big library that contained so many volumes that, after its destruction, it took until the 20th century until we had a single library that again held about as many volumes as the Library of Alexandria. How did they manage to know where each scroll, as it was in those times, was located and what it was about? During this time came the first idea that you need people who know about the stuff in the library, librarians, who had a first classification system: this is history, this is a book about politics, this is mathematics, or something. Even a very simple top-level scheme just describing the area was something that was investigated and used before Christ. What they basically did is that they looked at some of the scrolls and found out what the main topics of the library would be, and they separated the scrolls from each other, so that the politics goes here and the mathematics goes there. So there is some guy who just knows what is going on; he identifies the topics and assigns each new incoming document to a topic. And then you go like: OK, is it about politics?
04:34
Take a document about the Olympics: you just assign a weight for politics, a low one, because it is not mainly about politics but about the games, or something like that; maybe politics scores a little bit higher because there are also political implications. These scores are manually assigned, and from then on you can use vector space based retrieval like we did before, based on these values, on these topic scores.
And then you have to find a method to transform the queries over terms into topics. Many interfaces today, like faceted search interfaces, let you just tick some boxes saying "I'm interested in politics" or "I'm interested in biology", so the broad topic is found. But if you just do it by the terms themselves, it can be pretty difficult, because the terms are likely to be ambiguous about what the user is looking for: are you looking for an animal, or are you looking for a new car? It depends on many things: how often these terms are used, and maybe the profile of the person posing the query. All of these could be hints for getting the topic.
But of course this method really depends on how good the librarians are at their work, and it depends on how many documents there are. Assigning the scores in a rather consistent way to a few documents might work out. Actually doing it for millions or billions of books is tedious, and you will get inconsistencies. That is something you don't want, and this is why there should be an approach that does not rely on manual work.
06:59
Once you have found out what a document is about, you can do the retrieval rather easily. But building the topic-centred document representation without human indexing, that is really something that we have to look into.
07:25
One solution is latent semantic indexing. It was proposed by Susan Dumais, now at Microsoft, who was at that time at Bellcore. It was the beginning of the nineties, so it was still not the web; it was still document collections from the pre-web era. But the central idea was: let's not talk about the documents themselves, let's not talk about the bag-of-words model, let's talk about what topics a document covers. And whether a document covers a topic or does not cover a topic is not really a binary decision: some topics may be just mentioned in the paper, so any document is to some degree about some topic.
It is called latent semantic indexing because the topics are latent in the documents: you can't quite put your finger on them. A document will probably not say "this document is about politics". Rather, it will start to talk about the prime minister and a new law that was passed by parliament, or some other things of that kind. So the topic is latent, but it is somehow there in the paper.
How this works is of course also a very interesting thing. Many of you will have heard of singular value decomposition, but not so many of you will have had a closer look at the singular value decomposition. It is a very useful technique that comes up all the time in computer science but was stolen from the mathematicians, actually from linear algebra, and it is one of the ways to get at the eigenvalues of matrices. I already reckon that many of you will have forgotten about all the linear algebra, or it might be buried a bit deeper, so we want to go into this with a short recap of linear algebra, so that we can talk about singular value decomposition in a more sensible way, and then pave the way to latent semantic indexing.
10:14
In linear algebra, you basically study systems of linear equations, and as soon as the systems get high-dimensional, we are talking about vectors and vector spaces. A vector is just a representation of a multidimensional point in some space. The space itself is given by some vectors which are called the basis: every space has a basis that spans the space.
Once we have points in a space, we can do interesting things with them. For example, if you have a figure like some pyramid, its corner points could be given by vectors, and what we can do is basically turn the pyramid, put it upside down, make it smaller, make it bigger, or just shift it around in space. These are all things we can do, and the way these things are done is basically by linear transformations: we change the points according to some plan. This is usually done by a matrix: a matrix of matching dimension is multiplied with the vector, changing the values of the vector to make it longer, to make it point into a different direction, whatever you want.
These vectors and matrices are used all over computer science. They are very popular techniques and very important for information retrieval, and you will meet them time and again. So pay attention, and you will learn something that is also useful in many other lectures.
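The idea of changing a figure by multiplying its corner points with a matrix can be tried out in a few lines. The following is only a minimal sketch in plain Python: the triangle (a two-dimensional stand-in for the pyramid), the helper `transform`, and the three example matrices are assumptions made for illustration and do not come from the lecture slides. Shifting a figure is left out on purpose, because a plain matrix multiplication always keeps the origin where it is.

```python
def transform(matrix, point):
    """Apply a matrix (given as a list of rows) to a point (a list of coordinates)."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]

# Corner points of a triangle standing on its base (hypothetical example figure).
triangle = [[0, 0], [2, 0], [1, 3]]

flip = [[1, 0], [0, -1]]   # negates the y-coordinate: the figure is turned upside down
grow = [[2, 0], [0, 2]]    # doubles every coordinate: the figure gets twice as big
turn = [[0, -1], [1, 0]]   # rotates by 90 degrees counter-clockwise

upside_down = [transform(flip, p) for p in triangle]
bigger = [transform(grow, p) for p in triangle]
turned = [transform(turn, p) for p in triangle]

print(upside_down)  # [[0, 0], [2, 0], [1, -3]]
print(bigger)       # [[0, 0], [4, 0], [2, 6]]
print(turned)       # [[0, 0], [0, 2], [-3, 1]]
```

Each matrix encodes one "plan": the same helper is reused, and only the numbers in the matrix decide what happens to the points.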
12:30
As I said before, vectors are points in a space, and you can write them either as a row or as a column, where a column is the transposed row. It doesn't really matter how we write it; it is the same point. All the vectors and matrices that we talk about will be real-valued, so each of these variables is given by some real number. And a matrix has some dimensionality:
13:20
m rows and n columns. Such a matrix basically is a map that transforms vectors into other vectors by multiplication. If you take a matrix with m rows and n columns, and you take a column vector x, you can transform this vector x into some vector x' by multiplying it with the matrix. What you basically do is take a row of the matrix and the column of the vector, multiply them entry by entry, and sum the products up; then you do the same with the next row, and so on until the last one. This gives you a new vector (of the same dimensionality if the matrix is square), but somehow the values in the vector have changed. They have not been changed arbitrarily, but with respect to the matrix. There is a large number of different matrices of different forms that can do different things to vectors, that perform different kinds of transformations.
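The row-times-column rule described above can be sketched in a few lines of plain Python. This is only a minimal illustration; the function name `mat_vec` and the example numbers are assumptions made here and do not come from the lecture.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) with a column vector."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("each row needs as many entries as the vector")
    # One output entry per row: multiply entry by entry, then sum up.
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Hypothetical 2x3 matrix and 3-dimensional vector, chosen only for illustration.
A = [[1, 0, 2],
     [0, 3, 1]]
x = [4, 5, 6]
x_prime = mat_vec(A, x)
print(x_prime)  # [16, 21]
```

Note that the result has one entry per row of the matrix, so a 2x3 matrix maps a 3-dimensional vector to a 2-dimensional one.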
15:03
For example, one matrix that you all know is the identity matrix. It has just ones on the diagonal and zeros everywhere else. What does it do if you multiply it with a vector? Nothing at all. It does exactly nothing, because every entry of the vector is multiplied by 1 and nothing else is added. A generalization of this is the diagonal matrix: you are allowed to put some other numbers, not only ones, onto the diagonal, and you still have a diagonal matrix. What does it do to vectors? It scales the entries of the vector according to what you multiply each entry with. It does not necessarily scale evenly: if you use different numbers on the diagonal, the vector is scaled in different ways in its different components. A matrix does not have to be square, that is, it does not have to have the same number of rows and columns; you can also have rectangular matrices. One class of matrices that is very important in practical applications is the symmetric matrix, where you have something on the diagonal, and the stuff above it is exactly the same as the stuff below it; the same numbers are basically mirrored along the diagonal. So what we all know about vector spaces is that we have to have some basis vectors that span the space. But how do we actually determine the dimension of the space? How do we know whether something is a point, or a space, or a two-dimensional surface, or a one-dimensional line? What is the dimension of a space?
17:30
It is the number of axes in the space, and these axes have a certain characteristic: they are linearly independent. This means that if we take two vectors that span a space, neither of the vectors can express the other. Take a two-dimensional space given by two axes. They don't have to be perpendicular, though that is sometimes very helpful. Neither of these vectors a and b can represent the other one, because they are going in different directions. There is no way of building vector a out of vector b by just multiplying b by some number. But if I take some third vector c, I can actually build it from a and b, by taking part of a and taking part of b. So a two-dimensional space has no way of having three linearly independent vectors, because every third vector can be represented somehow by the two independent vectors. This is the concept of linear dependence: vectors are dependent if there are real numbers such that we can build one of the vectors out of the remaining ones. And when we cannot add another vector that cannot be built from the ones we have, then we have found the dimensionality of the space, and every further vector has to be in the span. That is basically one of the theorems of linear algebra: whenever you exceed the dimensionality of the space, you will get linear dependence. This leads us to the idea of the linear span: you take k vectors, linearly independent or not, and all possible vectors in the space they span can be built by linear combinations of them.
20:49
Well, basically linear spans are subspaces of the n-dimensional space of real numbers, with dimension at most k: take any k vectors in n-dimensional space and they span a subspace. If they are linearly independent, which is possible in case k is smaller than or equal to n, then the subspace will be k-dimensional. And if some of them are already linearly dependent on each other, then the dimension is smaller.
21:38
Good. So the span of some vectors can be a single point, if there is only the zero vector. It could be a line, if all the k vectors are multiples of each other. It could be a plane, if two of the k vectors are linearly independent and can be used to build all the others, and this is how it goes on. For example, take the span of three vectors. This is the first one, (1, 2, 3). The second one is (2, 4, 6), and (2, 4, 6) can be built out of the first vector just by doubling every number, so b is lambda times a with lambda equal to 2. So they are immediately linearly dependent. And c looks like that: it is also dependent on the first vector, and c is dependent on b as well. So we find that the three vectors basically all depend on each other, and they only span a one-dimensional subspace of the three-dimensional space. But if we take the vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1), there is simply no way of doing that. Whatever lambda I multiply with, this entry basically gets bigger or smaller, but this zero does not change, and to get the 1 here out of a 0, I cannot do anything. There is simply no way; the vectors are linearly independent. Thus they span a three-dimensional space, and since every vector here has three entries, this is the biggest space we can get. This is a wonderful kind of basis, because it is the unit basis: every vector has length one, and they enclose 90-degree angles, so they are all perpendicular to each other. This is a very nice form for building an actual basis. Good.
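The dimension of a span can be checked numerically as the rank of the matrix whose rows are the vectors. The following is a minimal sketch using NumPy; the exact entries of the three dependent vectors are an assumption (in particular `c = 3 * a` is a hypothetical stand-in for the third vector, which the transcript does not preserve).

```python
import numpy as np

# Three mutually dependent vectors (entries assumed for illustration).
a = np.array([1.0, 2.0, 3.0])
b = 2 * a          # b is lambda * a with lambda = 2
c = 3 * a          # hypothetical third vector, also a multiple of a
dependent = np.vstack([a, b, c])

# The unit vectors (1,0,0), (0,1,0), (0,0,1).
unit = np.eye(3)

# The rank equals the dimension of the spanned subspace.
rank_dependent = int(np.linalg.matrix_rank(dependent, tol=1e-9))
rank_unit = int(np.linalg.matrix_rank(unit, tol=1e-9))
print(rank_dependent, rank_unit)  # 1 3
```

The explicit tolerance only guards against floating-point noise; for these small integer-valued examples the default would most likely behave the same.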
24:59
Talking about bases: if we have a set of linearly independent n-dimensional vectors, then their span is a subspace, and any point in it is generated by a unique linear combination of the basis vectors. There is just one single possibility of multiplying the basis vectors and adding them up to get to some point. This set of independent vectors is basically the basis of the subspace. Let's look at two bases of the two-dimensional space. We can take the standard basis, which has just a 1 on the x-axis and a 1 on the y-axis. Or we could look at the vectors (1, 1) and (2, 3).
25:59
These cannot be transformed into each other. Whatever I multiply the (1, 1) vector with, both entries change in the same way. I can make it a (2, 2) vector, a (3, 3) vector, a (√2, √2) vector, but it has to be the same number in both entries. If I want different numbers here, I cannot build this vector out of the (1, 1) vector. So they are linearly independent, and thus they are a basis of the space. As we can see, it is not a beautiful basis. With the standard basis it always goes like this: if I have to get to a certain point, let's look at the point (3, 4), how do I get there with the standard basis? The first number says how often I take the first basis vector, and the second number says how often I take the second basis vector. So to get to (3, 4), I take 3 times the first basis vector and 4 times the second basis vector, and there is no other way of getting to this point with this basis. It doesn't help to go one step further, because at some point I would have to go back to get to the point, which would basically be a minus, and I would end up again with 3 times the first basis vector and 4 times the second. But I can also get to this point using a different basis. If I have the (1, 1) vector and the (2, 3) vector, I can take one time the (2, 3) vector and one time the (1, 1) vector: 1 plus 2 makes 3, and 1 plus 3 makes 4. These are just different ways of getting to every point, and again the combination is unique. Good.
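Finding "how often to take each basis vector" means solving a small linear system. A minimal sketch for the two-dimensional case, using Cramer's rule in plain Python; the function name `coordinates_in_basis` is an assumption introduced here for illustration.

```python
def coordinates_in_basis(b1, b2, point):
    """Solve c1 * b1 + c2 * b2 = point for 2D vectors (Cramer's rule)."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent, so they are no basis")
    c1 = (point[0] * b2[1] - b2[0] * point[1]) / det
    c2 = (b1[0] * point[1] - point[0] * b1[1]) / det
    return c1, c2


# The point (3, 4) in the standard basis and in the basis (1, 1), (2, 3).
standard = coordinates_in_basis((1, 0), (0, 1), (3, 4))
other = coordinates_in_basis((1, 1), (2, 3), (3, 4))
print(standard)  # (3.0, 4.0)
print(other)     # (1.0, 1.0)
```

A non-zero determinant is exactly the condition that the two vectors are linearly independent, which is why the solution is unique whenever it exists.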
28:43
The standard basis, I mean, you can always go to the standard basis, the unit basis; for every dimensionality it is unique. But sometimes it is not the most helpful one, and this is where a so-called basis transformation makes things simpler. For example, if you look at humans, you will find that the height and the weight of people are somewhat correlated: the larger somebody is, the more he or she will weigh. There are some counterexamples, small people who are very heavy, but usually the bigger you are, the more weight you put on the scale. So the points might look like this, and there seems to be some kind of connection between the points. So maybe what we are actually interested in is not the weight and the height, but rather the size of somebody, plus a small deviation from the norm. If somebody is of a certain size, he or she should weigh about this much, and if somebody is bigger, the weight grows in proportion and might be here, or here. If somebody is overweight, the point is above the line. If somebody is Germany's next top model, it is below the line. This new basis gives you, with this axis, the connection, the correlation between weight and height, and with that axis, what part of the population is not normal and in what way it is not normal. Is this different information? It is basically the same information, because the points have not changed. They have just changed with respect to the new basis, and the only thing I was actually doing was tilting the basis vectors by some angle. In this case it was the same angle for both, so I kept them perpendicular; it does not have to be like that, I could also tilt them differently. But still, sometimes changing the basis vectors is a good idea. And note: if you have two sets of basis vectors, those are of course linearly dependent on each other, because you can only have one basis at a time; but within each basis the vectors are independent, of course, otherwise it would not be a basis.
32:36
Each point in the space can be expressed with respect to one set of basis vectors, or expressed with respect to the other set of basis vectors. I can switch between them by some transformation matrix T, such that when I multiply vectors from the old basis with T, I get the vectors in the new basis. And since the same T is useful for transforming the basis vectors into each other, I can also transform every point in the space, every single vector, into a point in the new coordinate system. That means the point will be expressed by a different set of basis vectors. That is all I am doing here: just talking about vectors, bases of vector spaces, and how to transform points and sets of vectors.
33:58
So I have two examples here, two sets of basis vectors. Obviously the vectors in each set are independent; they point in different directions, so both sets are bases. Now I take a point that has the coordinates (1, 1) in the blue basis. This means that I have to use the first basis vector one time and the second basis vector also one time to get to the point. Let's do it: the first vector is the (1, 1) vector, and I use one of it; the second vector is the (2, 3) vector, and we also use one of it. Then I arrive at this point, which in the blue basis is the point (1, 1). Now I want to change to the other basis. How often do I have to use its vectors? The point over here is still the same; only the basis has changed. Looking at it, to get to the point I have to use the first basis vector four times and the second vector just once. So in the new basis, the green basis, this would be the point (4, 1): four times the first basis vector and one time the second basis vector. The coordinates have changed with respect to the new basis, from (1, 1) to (4, 1), but it still is the same point. We can calculate how the basis vectors relate to each other, and if we use this as a transformation matrix, we can transform from the first basis into the second basis. How do we do that? Exactly, we just multiply row by column. If we use the matrix to transform the vector (1, 1), then for the first entry we get 1 plus 3, which makes 4, and for the second entry we get 1/3 plus 2/3, which makes 1. And indeed, this point has been transformed from (1, 1) in the old basis to (4, 1) in the new one. This is the expression of the vector with respect to the new basis. So a basis transformation just shifts the coordinate system, and the transformation matrix basically records how it is shifted.
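The arithmetic spoken at the end of this passage (1 plus 3 makes 4, and 1/3 plus 2/3 makes 1) suggests the following transformation matrix. Its entries are reconstructed from those two sums and are therefore an assumption, since the slide itself is not part of the transcript.

```latex
T = \begin{pmatrix} 1 & 3 \\ \tfrac{1}{3} & \tfrac{2}{3} \end{pmatrix},
\qquad
T \begin{pmatrix} 1 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 1 + 3 \cdot 1 \\ \tfrac{1}{3} \cdot 1 + \tfrac{2}{3} \cdot 1 \end{pmatrix}
= \begin{pmatrix} 4 \\ 1 \end{pmatrix}
```

Under this reading, and assuming the old basis consists of (1, 1) and (2, 3), the matrix is consistent with a new basis made of (0, 1) and (3, 0): both 1·(1, 1) + 1·(2, 3) and 4·(0, 1) + 1·(3, 0) give the point (3, 4) in standard coordinates.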
38:22
Then we still need the scalar product, which we have to talk about as well. Basically, if I want to have the scalar product of two vectors, I just take each entry of one vector, multiply it with the corresponding entry of the other vector, and sum the products up. So the scalar product of two vectors gives me a single number. Based on the same idea we get the norm of a vector, which is the square root of the sum of the squared entries: just sum up the squares of the entries of the vector and take the square root of that; this is the length of the vector. And what we can say is that two vectors are orthogonal if their scalar product is zero. What does it mean to be orthogonal? It basically means to be perpendicular, to enclose a right angle. The scalar product of two vectors can also be expressed by the lengths of the vectors and the cosine of the angle between them, and what we can easily see is that as soon as the angle is 90 degrees, the cosine gets zero, so the whole thing is zero, and the same holds for 270 degrees. So orthogonal basically means to enclose a right angle.
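A minimal plain-Python sketch of the scalar product, the norm, and the angle between two vectors, as described above. The helper names and the two example vectors are assumptions chosen for illustration.

```python
import math


def dot(u, v):
    """Scalar product: multiply corresponding entries and sum up."""
    return sum(a * b for a, b in zip(u, v))


def norm(v):
    """Length of a vector: square root of the sum of squared entries."""
    return math.sqrt(dot(v, v))


def angle_degrees(u, v):
    """Angle between u and v, from dot(u, v) = |u| * |v| * cos(angle)."""
    cos_angle = dot(u, v) / (norm(u) * norm(v))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


# Hypothetical pair of perpendicular vectors.
u = (3, 4)
v = (-4, 3)
print(dot(u, v))   # 0
print(norm(u))     # 5.0
print(angle_degrees(u, v))  # approximately 90 degrees
```

A scalar product of zero is the test for orthogonality; the angle computation is only there to connect it to the geometric picture.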
40:29
The interesting thing is that if you have a set of mutually orthogonal vectors, you can immediately say that they are linearly independent. Why is that? If they enclose a right angle, then you can only go into one direction in space using the first vector, and into the other direction in space using the second vector. There is simply no way of building the vectors with respect to each other, because you are going in the wrong direction. So they have to be linearly independent. The actual proof in linear algebra is more difficult than that, but this is the geometric interpretation: you just cannot build a vector in a different direction of space if you only have vectors going perpendicular to it. We can also say that vectors are not only orthogonal but orthonormal, if they are also normalized, that is, if each has a length of 1. This is called an orthonormal basis, and of course it can be extended to matrices. We say a matrix is column-orthonormal if its set of column vectors is orthonormal: I take the matrix and look at all the columns in the matrix, and they should all have a length of 1 and should be orthogonal with respect to each other. And the same goes for row-orthonormal matrices.
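Column- and row-orthonormality can be checked by multiplying the matrix with its own transpose and comparing the result with the identity matrix. The following NumPy sketch is illustrative only; the function names and both example matrices are assumptions.

```python
import numpy as np


def is_column_orthonormal(M, tol=1e-12):
    """True if all columns have length 1 and are mutually orthogonal."""
    M = np.asarray(M, dtype=float)
    gram = M.T @ M  # entry (i, j) is the scalar product of columns i and j
    return bool(np.allclose(gram, np.eye(M.shape[1]), atol=tol))


def is_row_orthonormal(M, tol=1e-12):
    """True if all rows have length 1 and are mutually orthogonal."""
    return is_column_orthonormal(np.asarray(M, dtype=float).T, tol)


# Hypothetical examples: a rotation by 30 degrees, and a 3x4 matrix
# whose rows are unit vectors but whose last column is all zeros.
theta = np.pi / 6
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
wide = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0, 0.0]])

print(is_column_orthonormal(rotation), is_row_orthonormal(rotation))
print(is_column_orthonormal(wide), is_row_orthonormal(wide))
```

The second example shows that a rectangular matrix can be row-orthonormal without being column-orthonormal, because a zero column cannot have length 1.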
42:19
Then we can talk about how much is actually contained in a matrix, and that is its rank: the number of linearly independent rows or columns, which is the same number for both. If I can express different columns with respect to each other, the columns are linearly dependent, and the rank of the matrix is smaller than if they were linearly independent. If I have a matrix whose size is bigger than the dimensionality of the underlying space, I can immediately say that the rank of the matrix must be smaller than or equal to the dimension of the space; if the matrix is bigger than the dimension of the space, it cannot have full rank. We can also define the rank as the dimension of the image of the linear map: if I multiply vectors with the matrix, then the image is spanned by its vectors, and the more independent vectors there are, the bigger the space. The identity matrix is perfect in this respect, because it just contains orthogonal unit vectors. And as we all know from the basics, if the matrix is a diagonal matrix, then its rank can easily be seen. In a diagonal matrix everything off the diagonal is zero, so the only interesting thing is what is on the diagonal. As long as the diagonal entries are not zero, for example if you have three non-zero entries on the diagonal, you have basis vectors pointing in different directions. They are orthogonal, because one column can never be built from another column. So the number of linearly independent column vectors, and thus the rank of the matrix, is given by how many non-zero entries I have on the diagonal. If there is a zero on the diagonal, the corresponding vector is the null vector. It doesn't mean anything, and I cannot build anything out of it, because it doesn't matter what I multiply this vector with, it will be zero. So the rank of a diagonal matrix is equal to the number of non-zero diagonal entries.
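The rule for diagonal matrices stated above can be checked numerically. A minimal NumPy sketch; the diagonal entries are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical diagonal matrix with one zero on the diagonal.
D = np.diag([3.0, 0.0, 0.5, 2.0])

nonzero_on_diagonal = int(np.count_nonzero(np.diag(D)))
rank = int(np.linalg.matrix_rank(D))
print(nonzero_on_diagonal, rank)  # 3 3
```

Here `np.diag` is used in both of its roles: building a diagonal matrix from a list, and extracting the diagonal from a matrix.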
46:20
This matrix here is the identity matrix. It is column-orthonormal and row-orthonormal: every vector is perpendicular with respect to the others, and they are normalized to a length of 1. Take the norm of this one: it is basically just the square root of 1, which is 1. And the rank is 4, because I have four non-zero entries on the diagonal; it is a basis of a four-dimensional space. Looking at this matrix, again we find that some columns are independent, but what about this column? It is not independent, because I can multiply any of the other vectors by 0 and get this vector. So this matrix is not column-orthonormal. But if we look at the rows, then I cannot use the different vectors to transform them into each other, so it is indeed a row-orthonormal matrix, and the rank of this matrix is 3. Good.
48:20
And finally, eigenvalues. If A is a square matrix and I take some non-zero vector x, then x is called an eigenvector of A if it satisfies the equation A x = λ x for some real number λ. So basically, multiplying the matrix with the vector just leads to a scaling of the vector. The matrix doesn't change the direction or do something complicated; it just scales the vector, makes it bigger or makes it smaller. These kinds of vectors are called eigenvectors. There is a joke among mathematicians about this: there was a very famous scientist by the name of Eigen, and some people claim that the idea of eigenvectors was come up with by him, but that is of course not true. They are just called eigenvectors because "eigen" in German means "own": it is the own characteristic of the matrix, and every matrix has its own specific eigenvectors. This is how the word went public: linear algebra was big at the beginning of the 20th century, many of the big mathematicians were speaking German, they used the word Eigenvektor, and it was just taken over in the same form into English. The same goes for lambda: the scaling factor that the vector has with respect to the matrix is called the eigenvalue, and every eigenvalue corresponds to some eigenvector. So basically the idea is that the matrix does not change anything important about the vector but just scales it, and how strongly it affects the vector is given by the eigenvalue. If the eigenvalue is big, it is a big scaling; if it is small, a small scaling; and with an eigenvalue of 1 nothing happens to the vector at all.
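The defining equation A x = λ x can be verified numerically. This is a minimal sketch with NumPy's `numpy.linalg.eigh`, which is meant for symmetric matrices; the example matrix is a hypothetical choice and not taken from the lecture.

```python
import numpy as np

# Hypothetical symmetric 2x2 matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns the eigenvalues in ascending order and the
# corresponding eigenvectors as the columns of the second result.
eigenvalues, eigenvectors = np.linalg.eigh(A)

for lam, x in zip(eigenvalues, eigenvectors.T):
    # Multiplying by A only scales the eigenvector by its eigenvalue.
    assert np.allclose(A @ x, lam * x)

print(eigenvalues)  # approximately [1. 3.]
```

For this matrix the eigenvectors point along (1, -1) and (1, 1): the first direction is left unchanged in length, the second is stretched by a factor of 3.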
51:20
OK, so for example, if we take the unit vectors and we want to see their images when they are multiplied by some matrix, then we will find that they look like this.
51:53
But now
51:55
We will find that the eigenvectors move along with it: if I change to a different basis, the images of the eigenvectors change as well. So it could actually be useful to change to a different kind of coordinate system, to a different basis, and using the set of eigenvectors as a basis is a very good idea. Why? Exactly: if you use the eigenvectors as the basis, the matrix is just a scaling for each individual eigenvector. How do you do that? You take the unit matrix and put the scaling factors on the diagonal, and when you transform your basis, every eigenvector is just scaled by what you put on the diagonal. So with respect to this basis the matrix has λ1, λ2 and so on along the diagonal, and everything else is zero. If you take the first eigenvector, it is just multiplied by λ1; the second eigenvector is just multiplied by λ2, and so on. Those are just scaling factors on the diagonal of the matrix.
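Changing to the eigenvector basis turns the matrix into a diagonal matrix of eigenvalues. A minimal NumPy sketch under the assumption that the matrix is symmetric (so that the eigenvectors are orthonormal and the inverse of the basis matrix is simply its transpose); the example matrix is hypothetical.

```python
import numpy as np

# Hypothetical symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of V are orthonormal eigenvectors of A.
eigenvalues, V = np.linalg.eigh(A)

# Expressed in the eigenvector basis, A is purely diagonal:
# only the scaling factors (eigenvalues) remain.
D = V.T @ A @ V
assert np.allclose(D, np.diag(eigenvalues))

# Going back: A = V * diag(eigenvalues) * V^T.
A_rebuilt = V @ np.diag(eigenvalues) @ V.T
assert np.allclose(A_rebuilt, A)

print(np.round(D, 10))
```

For a general, non-symmetric matrix one would have to use the inverse of `V` instead of its transpose, and real eigenvalues are not guaranteed.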
54:02
So, are there any more questions on the linear algebra so far? At this point the usual question is: what is it useful for? Up to now we have seen a lot of vectors and spaces and matrices, but what do we do with them, what is it all good for?
54:25
And this is what we want to talk about now. If I decompose a matrix A of rank r, I can always find a transformation such that A is decomposed into a product U S V^T of three matrices: you have a column-orthonormal matrix U, you have a row-orthonormal matrix V^T, and the diagonal matrix S in between basically gives the scaling factors when multiplying the rows and columns of the outer matrices. It is a theorem that every matrix can be decomposed like that. The columns of the first matrix, on the left, are called the left singular vectors, the rows of the last matrix, on the right, are the right singular vectors, and the elements on the diagonal of the matrix in the middle are the singular values. So the singular value decomposition is basically very similar to the eigenvalue decomposition: you have orthonormal matrices on the outside, and you put the singular values onto the diagonal.
56:22
What happens, basically, is this: you take any matrix A, and you get a column-orthonormal matrix U with rank r, and you get a diagonal matrix S with rank r. Since A may not have full rank, there might be some zeros on the diagonal, but there are exactly r non-zero values, the singular values. And the rows of V^T form a row-orthonormal matrix. Now if A is a linear map that transforms some vector x into some other vector, you can basically do that in three mapping steps once A is decomposed in that way. I first transform the vector x using the row-orthonormal matrix V^T, then I scale the result according to the singular values, and finally I map it into the target space using the column vectors of U. I can do this from the right-hand side with n-dimensional column vectors; it has to fit in terms of row times column, so if the rows are n-dimensional, the column has to be n-dimensional as well. Or I can do it from the left-hand side; then I take an m-dimensional row vector, because again it will be row by column, and the columns have m dimensions. Whether it is a left multiplication or a right multiplication, it is always row times column.
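The three mapping steps can be traced with NumPy's `numpy.linalg.svd`. This is a sketch under stated assumptions: the 2x3 matrix and the vector are hypothetical, and the names `U`, `s`, `Vt` follow NumPy's convention (singular values returned as a one-dimensional array, sorted in descending order).

```python
import numpy as np

# Hypothetical 2x3 matrix and 3-dimensional vector for illustration.
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])

# Reduced SVD: U is 2x2, s holds 2 singular values, Vt is 2x3.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

step1 = Vt @ x      # express x with respect to the right singular vectors
step2 = s * step1   # scale each coordinate by its singular value
step3 = U @ step2   # map into the target space via the left singular vectors

# The three steps together do the same as multiplying by A directly.
assert np.allclose(step3, A @ x)
# And the three factors multiply back to A.
assert np.allclose(U @ np.diag(s) @ Vt, A)

print(U.shape, s.shape, Vt.shape)  # (2, 2) (2,) (2, 3)
```

With `full_matrices=False` the factors are trimmed to the sizes actually needed, which matches the idea that only as many singular vectors as singular values matter.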
59:20
Good. Back to the example of the height and the weight of people. If we take measurements on people, we might find something like this: somebody is 1 metre 70 centimetres and weighs 69 kilograms; somebody who is bigger might be a bit heavier. But of course that does not always hold: this guy is quite a bit bigger, but for some reason he weighs less, one of Germany's next top models. What we can do is just take these measurements as a matrix, where the first row is basically all the different heights and the second row is all the different weights. If we compute the singular value decomposition of this matrix, we get some big matrices with the row and column vectors, which are needed to get to the dimensions of the matrix, but the interesting part lies on the diagonal. There is one very big scaling factor and one quite small scaling factor. What is really interesting about the difference between the scaling factors? Think about the basis transformation we did before. The first value is always taken with this row and this column, so it has a large influence in scaling the vectors. The second, smaller one is taken with that row and that column and has a very small influence. It doesn't mean very much, because it is quite close to zero: if I multiply by zero, the result will be zero, and if I scale anything by some number very close to zero, I get numbers that are also very close to zero. So basically I have shifted the basis system into a direction where I get one very important axis and one not so important axis.
1:04:00
If we consider our picture from before, what we could do is change the basis vectors and say: well, if that is what we have in terms of observations, then one basis vector should run along them. This first one, corresponding to the large scaling factor over here, will discriminate very well between the points in terms of their distance with respect to each other, whereas the other one, corresponding to the smaller one, does not do a good job of it. If I look at the points along that axis, they are basically all the same; the values are actually quite small, and it does not discriminate. This is why the space has one important axis, the one with the large singular value, and one rather unimportant, uninteresting axis, the one with the low singular value. Having this might be an advantage over having two axes that both have some scaling but where we cannot really distinguish. Again, a basis transformation does the trick of concentrating the information on a few axes, whereas the other axes are unimportant. And this is a very good idea to do.
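The height and weight example can be reproduced with a small made-up data set. Only the first person (170 cm, 69 kg) is mentioned in the lecture; all other values are hypothetical, as is the choice of centimetres as the unit. The sketch only shows that one singular value dominates when the two rows are strongly correlated.

```python
import numpy as np

# First row: heights in cm, second row: weights in kg.
# Only the first column (170, 69) comes from the lecture; the rest is invented.
measurements = np.array([[170.0, 180.0, 160.0, 190.0, 175.0],
                         [69.0, 80.0, 58.0, 92.0, 64.0]])

U, s, Vt = np.linalg.svd(measurements, full_matrices=False)

print("singular values:", s)
print("ratio of first to second:", s[0] / s[1])

# Coordinates of every person with respect to the new basis vectors.
# The first row varies a lot between people; the second row stays small.
coordinates = np.diag(s) @ Vt
print(np.round(coordinates, 1))
```

The columns of `U` are the two new axes in the height/weight plane; the large singular value belongs to the axis that runs along the point cloud.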
1:06:59
Yes? If the heights were given in centimetres and not in metres, you would basically make it 180 instead of 1.80, and the effect would be the same. It does not depend on the units of the measurement; it rather is interested in how the values are interconnected. If I have a couple of points scattered like here, here and here, there is no way that you could concentrate the information in one axis, because there is no correlation, nothing, between them. But if my points lie like that, I can discriminate these points very well with respect to this axis, and I cannot discriminate them with respect to that axis. So I can scale that axis down and still discriminate the points with respect to the first axis. If I try the same trick here, it doesn't work: I can discriminate these two points when I bring in the second axis, but without it I cannot discriminate them any more, because they would be thrown onto the same point on the first axis. No, the space itself is not more discriminative. This axis is more discriminative because its value is high, and that axis is less discriminative. Both sets are a basis of the underlying space, only that in the new one, one axis is more discriminative and one is less, whereas if we stick with the old one, both axes are discriminative to some degree. Well, actually it could be done in that way. There are lots of relatives of the singular value decomposition, principal component analysis among others; you take different matrices and different ideas for the basis transformation. If you take the observations as values of a random variable, related methods immediately spring to mind, and there are variations for every kind of matrix. What we are intending to use it on is of course the term-document matrix, to find out what discriminates between documents well and what does not. So, as I said, we got one vector with a very strong weight and one with a very small one.
1:11:21
If we multiplied everything out, we would get the original matrix back, because this was just a decomposition of the matrix, a lossless decomposition. So another way to look at the SVD of a matrix is this: you take pairs of the basis vectors, a column and a row, scaled with the factor from the diagonal matrix, and A is just the sum of all these matrices. It is the way the new basis vectors add up to build the original vectors, like we did before. And what we can see is that some of these basis vectors may be very strong, and some of them will be very weak or may even be zero. That might happen.
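Written out, the "sum of all these matrices" view of the SVD looks as follows. The symbols are notation assumed here: u_i is the i-th column of the left matrix, v_i^T the i-th row of the right matrix, s_i the i-th diagonal entry of the middle matrix, and r the rank of A.

```latex
A \;=\; U S V^{T} \;=\; \sum_{i=1}^{r} s_i \, u_i \, v_i^{T},
\qquad s_1 \ge s_2 \ge \dots \ge s_r > 0
```

Each term s_i u_i v_i^T is a matrix of rank one with the same dimensions as A, so a small s_i means that the corresponding term contributes very little to the entries of A.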
1:12:30
And the idea of course is this: if we order the singular values by size (we can always reorder the matrices as long as we keep the pairs intact), then the first parts of the sum are important, because they get a lot of weight and carry a lot of the information in the matrix, whereas the later parts, which are multiplied by zero or by numbers very close to zero, are not as important, because they don't change the values much. This is kind of like concentrating the energy of what is expressed by the matrix into the first couple of singular values, and the rest is the tail of the spectrum. So the idea is that we don't talk about A any more, but about some A' that basically consists of the first parts and leaves out the rest. If we have some noisy data, the noise might be exactly what is in the tail, which does not contribute much to the matrix, so we just leave it out. This is basically also the idea of a subspace: I just use some of the basis vectors and ignore the other basis vectors. I look into the subspace that contains the most information about the matrix, and the subspace of the other basis vectors, those with small singular values, I just ignore. This is the idea of a rank-k approximation of a matrix: I have the sum of all these matrices, and I just take the first k
1:15:25
vectors. And of course they should be non-zero, so the singular values from sigma_1 to sigma_k should be non-zero; otherwise the rank would be even smaller than k. That is the idea of the rank-k approximation. And of course I do not need the basis vectors of the small singular values. If I have the decomposition, keep only the first k singular values and set the rest to zero in the S matrix, then I can do the same with the matrices U and V: when I compute the product, at some point I multiply by zero, and multiplying something with zero is just useless. So I remove the columns of U that would only be multiplied by zero, and do the same for V, removing the rows that would only be scaled by zero. I get smaller matrices, and A_k is still a good approximation of A, because it took most of the energy; the matrices U_k, S_k and V_k are just smaller. That is the basic idea. How do we measure the approximation error? It can be measured by the Frobenius distance between two matrices, which is very similar to the Euclidean distance between vectors: you go through all the rows and all the columns, sum up the squares of the differences between the entries in the respective row and column, and then you take the square root, just as we did with vectors before. This is roughly the same as the mean squared entry-wise error. What can be shown
1:18:07
is that the rank-k approximation of A has minimum error if you lose the smallest singular values. If you lose only those that are zero, it is not an approximation at all; it results in the same matrix, because they would have dropped out in the multiplication anyway. And if you cut out some singular values that are bigger than zero but rather small, you make only a small error. That is actually what makes this so interesting. So for every matrix B of rank at most k
1:19:03
we will find that the Frobenius distance to A is always at least as large as when we did what we did, cutting out the smallest singular values. Whatever else you cut out, you would lose more information. For the optimal rank-k approximation it does not really matter how we cut, as long as we make sure that we cut the smallest singular values first. Take any matrix A with some rank r, 18 or something like that, and say we want an approximation of rank k. We look at the distance between A and some matrix B of rank k, and we can say that whatever we cut out, the minimal error, the smallest distance between the matrices, is obtained by cutting out the smallest singular values and the basis vectors belonging to them. It is always the same: when you do something like this, make sure that you lose the smallest amount of the energy. It is like in a Fourier transformation or something: the first coefficients carry the maximum information. Cutting them out is a bad idea; cutting the tail is fine. Similar thing. OK. So basically, if you take out the smallest ones, what you are losing is less important than the things you are keeping. You can calculate what the exact error is, but that should not be too interesting for us. Just note: if you have to cut something, make sure you cut the smallest singular values.
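The rank-k approximation and its error can be tried out on a toy matrix. The following is a sketch under assumptions: NumPy is used, the matrix and the helper names `rank_k_approximation` and `frobenius_distance` are made up for this illustration. It relies on the known fact that the Frobenius error of the truncated SVD equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Keep the k largest singular values and drop the rest."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Only the first k columns of U, k singular values and k rows of V^T are needed.
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def frobenius_distance(A, B):
    """Square root of the sum of squared entry-wise differences."""
    return np.sqrt(np.sum((A - B) ** 2))

# Made-up example matrix.
A = np.array([[4.0, 0.0, 1.0],
              [3.0, 1.0, 0.0],
              [0.0, 2.0, 5.0],
              [1.0, 0.0, 4.0]])
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_k = rank_k_approximation(A, k)

# Error of cutting the SMALLEST singular value.
error = frobenius_distance(A, A_k)
# "Energy" of the discarded part of the spectrum.
discarded = np.sqrt(np.sum(s[k:] ** 2))

# For comparison: another rank-k matrix that cuts the LARGEST singular value instead.
B = U[:, 1:] @ np.diag(s[1:]) @ Vt[1:, :]
worse_error = frobenius_distance(A, B)

print(error, discarded, worse_error)
```

Cutting the largest singular value instead of the smallest gives a matrix of the same rank but a larger distance to `A`, which is the point made above.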
1:21:53
Well, remember we have the two axes. If we drop the second axis, the differences with respect to that direction are set to zero, so this point actually goes here, this point goes here, and this point goes here. It is just a projection of the points in the two-dimensional space onto one dimension, losing the information with respect to the second axis and its direction. All this information is lost, resulting in this projection. But of course what I have now is a one-dimensional space: the green axis is a one-dimensional subspace of the two-dimensional space. Right? And I can still distinguish the points within it. But if two points lie here and here, I might actually lose information, because after skipping the second axis they would be projected onto the same point in my one-dimensional space, and I cannot discriminate between the points any more. So if I drop an axis, I make sure that the axis that is dropped is the one with the ridiculously small singular value in the singular value decomposition, so that as little information as possible is lost. It may happen that a point is misplaced, but usually it does not matter, and I can still distinguish between the points. OK. But now we have a one-dimensional space instead of a two-dimensional space: I am saving storage space, my computations get easier, and I am not losing much. This is the idea behind it. So if we start with this matrix and compute its singular value decomposition,
1:24:56
we end up with a nice decomposition. We just cross this value out and set it to zero here, so that some part of my matrices is filled with zeros when multiplying the whole thing. Multiplying it all out, what happens is that we get this matrix. Now we can see that the values have changed slightly: instead of the 1 70 we get 1 55 here, and instead of the 6 9 we get 6 9 something here. So the values are not really what they were before; some of the information in the values is lost due to the projection. But I just took the part with the biggest singular value, and I just took the two vectors instead of the entire matrices, and I am still using most of the information. I am not able to recreate the matrix exactly, to really get the full information out of it, but it still suffices; it is somehow compressed. Sometimes, like here, values that were different before are even mapped onto the same value, because points from different parts of the space are projected to the same point. It can happen.
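The projection onto one axis can be reproduced with a few made-up points. This is a sketch assuming NumPy; the points are hypothetical and are not the values from the slide.

```python
import numpy as np

# Hypothetical 2-D points (one per column), roughly along the diagonal.
points = np.array([[1.0, 2.0, 3.0, 4.0],
                   [1.1, 1.9, 3.2, 3.9]])

U, s, Vt = np.linalg.svd(points, full_matrices=False)

# Set the smaller singular value to zero and multiply everything out again.
s_reduced = s.copy()
s_reduced[1] = 0.0
projected = U @ np.diag(s_reduced) @ Vt

# One coordinate per point along the remaining axis (the first column of U).
coords_1d = U[:, 0] @ points

print(projected)
print(coords_1d)
```

The projected points all lie on one line, so a single number per point (`coords_1d`) is enough to store them; the size of what was lost is exactly the dropped singular value.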
1:26:49
So what is the connection to eigenvectors? If I have an m-times-n matrix A and the respective singular value decomposition, A = U S V^T, and I look at the matrix times the transposed matrix, A A^T, then I can write the different matrices of the decomposition one after the other, such that V^T and V stand together in the middle. But since the columns of V are orthonormal, they are perpendicular to each other, and the scalar product of perpendicular vectors is zero. So everything perpendicular is zero, and where a row meets its own column, on the diagonal, the entry becomes one. So actually this is the identity matrix. And once I discard it, because it is the identity, I can look at what happens here. Since S is a diagonal matrix, the transpose of the matrix is exactly the same as the matrix itself, so this is the square of the matrix, and the square of a diagonal matrix just means taking the square of every entry on the diagonal. So this is all that happens. And looking at U and this diagonal matrix, one can show that the columns of U are the eigenvectors of A times the transposed matrix, A A^T, and the matrix S^2 just contains the corresponding eigenvalues. So this is actually a decomposition into eigenvalues and eigenvectors. And if you do it the other way around, if you take the transposed matrix times the original matrix, A^T A, then the rows of V^T are the eigenvectors, and S^2 still contains the eigenvalues. It is just a different matrix, and this time the orthonormal columns of U cancel out to the identity. So that is the connection to eigenvectors: singular vectors are something very similar to eigenvectors, and the singular values are very similar to the eigenvalues.
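The relation between singular values and eigenvalues can be verified on a small example. This is a sketch assuming NumPy, with an arbitrary matrix chosen only for illustration.

```python
import numpy as np

# Arbitrary 2x3 example matrix.
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A A^T is symmetric; eigh returns its eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(A @ A.T)
eigenvalues = eigenvalues[::-1]  # descending, like the singular values

# Each column u of U satisfies (A A^T) u = sigma^2 u.
residual_u = (A @ A.T) @ U - U @ np.diag(s ** 2)

# The same holds for A^T A and the rows of V^T (the columns of V).
residual_v = (A.T @ A) @ Vt.T - Vt.T @ np.diag(s ** 2)

print(eigenvalues, s ** 2)
```

Both residuals vanish up to rounding, and the eigenvalues of `A @ A.T` are the squared singular values.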
1:29:56
Now comes the question: OK, we know about linear algebra, we know about the singular value decomposition; what does it have to do with our problem of looking at synonyms, of looking at topics? And this is where latent semantic indexing comes in, because at some point somebody said: well, why don't we apply the singular value decomposition to term-document matrices?
1:30:28
Does everybody still remember the term-document matrix? Basically, we put all the terms on one side and all the documents of the collection on the other, and in the matrix we note the occurrence of some term in some document. OK? The term-document matrix. This is usually a rectangular matrix, depending on whether you have more documents with a small vocabulary or more terms with a small document collection; rectangular anyway. If you apply the singular value decomposition, you create some intermediate dimensions, and those with the highest singular values can be seen as topics. Basically, what it does is show correlations between the terms. If some terms in the original space always occur in the same context, the frequencies of occurrence of these terms are correlated; they can be exchanged for each other, so they are probably synonyms. This correlation will be shown by the singular value decomposition, because it gives a large singular value to some axis along the line of correlation, as we just saw in the example. OK? And it gives a very small singular value to the axis orthogonal to it. That axis basically does not add much information. Discriminating between two documents that differ only in synonyms is impossible with respect to the new axis, because we discard the difference between the two. So what we are doing is kind of putting synonyms together. Polysemy is the other case: a term that sits on a single axis in the term space but occurs in different contexts. It is not possible to have just one axis showing a correlation, because there is none. It would rather look like several clouds of points here, where a linear approximation with one axis does not make any sense. If there are different clusters and you try to figure out how to discriminate those points with a single axis, it just does not work; you need two axes to distinguish between them. This is what we call polysemy. And the rank-k approximation discards the dimensions having low singular values, and, like any such approximation, it removes the noise from the data. So it actually does not lose much information about the term-document matrix, but at the same time it makes the term-document matrix smaller and more compact, and it removes some noise that comes through the usage of synonyms or different contexts. OK? And, as we said, you can show that A_k is the best approximation of that rank in terms of quality with respect to the original.
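As a reminder of what a term-document matrix looks like, here is a minimal sketch in plain Python. The four toy documents are invented for this example; "car" and "automobile" play the role of synonyms that occur in the same context.

```python
# Toy collection; documents and vocabulary are made up for illustration.
documents = {
    "d1": "car engine repair",
    "d2": "automobile engine repair",
    "d3": "car automobile dealer",
    "d4": "bank money loan",
}

# Sorted vocabulary: one matrix row per term.
terms = sorted({word for text in documents.values() for word in text.split()})
doc_ids = sorted(documents)

# Entry [i][j] counts how often term i occurs in document j.
term_document = [
    [documents[d].split().count(t) for d in doc_ids]
    for t in terms
]

for term, row in zip(terms, term_document):
    print(f"{term:12s} {row}")
```

Rows are terms and columns are documents; the singular value decomposition is then applied to exactly this kind of matrix.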
1:34:44
One of the examples, which is a rather old one I think: take a small collection of book titles. Depending on what they are about, we have integral equations here, equations again, algorithms here, again algorithms. So there are different terms occurring in different combinations. Then the term-document matrix, with the documents here,
1:35:33
looks like that. We cannot see very much from the term-document matrix; we cannot really see where clusters are or what belongs together. But what we can do is just perform a singular value decomposition on it, without knowing what happens. And if we do that
1:35:55
and take the first two dimensions of the singular value decomposition, that means the vectors with respect to the highest singular values, just two of them. Before that, the dimensionality of the term space was one, two, three, four, and so on, up to 16: for these 16 different terms the term space is 16-dimensional. I have cut it down to two.
1:36:37
And what can we see? Well, there seem to be groups of books. So the different topics appear as clusters. This one seems to be something rather obscure with respect to axis two. And if we also put the terms into the new space, we find for example "differential" and "equations" here, "integral", "problem", "oscillation", "delay", and "implementation", "application", "algorithms". So we find that there are different centres. For example, let us look now at the 7th and the 17th and what they actually were.
1:37:29
The 7th is "Knapsack Problems: Algorithms and Computer Implementations", and the 17th is "The Double Mellin-Barnes Type Integrals and Their Applications to Convolution Theory". Do they have anything to do with each other in terms of the topic? Well, they both seem to be somewhat practical books: one has to do with the implementation of algorithms, the other with applications to something. So, just judging from the first two axes, we cannot really say that this one is a book about algebraic geometry or that one is a book about differential equations or something. But what we can say is that they have something to do with each other, because they are close together in these latent dimensions. We cannot really put labels on these axes, so we do not really know what these axes actually mean, but they seem to mean something, and similar terms are used in a similar context. OK?
1:38:54
But why are they in the same part of the space? OK. Yes, stop words are removed, words like "and", "to", "of", and we just focus on the 16 terms that are underlined, because that is what the documents are talking about. It does not matter that there are more words in the titles; that does not add context. But that "equations" has something to do with "delay", that is important information. On the other hand, I see here that some terms
1:39:49
have nothing to do with "delay". This is the kind of information we are picking up. And the terms that are underlined are basically the terms that we use for the matrix. We could also do it with all the other terms, but they do not add much information anyway. Cutting them out before computing the singular value decomposition, which is a very expensive operation on a huge matrix, or cutting them out afterwards does not make much difference; only I have to compute more. So we are just focusing on the interesting terms and how they co-occur with respect to each other in the documents, where every book is a document. That is the interesting part that we want to focus on. Good.
1:40:58
But well, how do we map the documents and terms into the latent space? We had this idea of the rank-k approximation, and what we do is kind of get rid of the scaling factors, the singular values. We split up this matrix S and put half of it into U and the other half into V.
1:41:31
Remember, S only has entries on the diagonal, so I can take the square roots of the values, and if I multiply the two roots together again, this is just the same as the original value. So basically what I can do is extract the square roots from the diagonal matrix and then define two matrices, U_k' and V_k', that incorporate the information from the singular values, both with the same scaling factors. Then I am left with just two matrices. The latent space coordinates of the j-th document are given by the j-th column of the matrix V_k', and the coordinates of the i-th term are given by the i-th row of the first matrix, U_k'. This is the kind of decomposition that we did with the term-document matrix A. OK?
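Splitting the singular values into two square-root halves can be written down in a few lines. This is a sketch assuming NumPy; the small term-document matrix is invented, and the names `U_k_prime` and `V_k_prime` simply mirror the U_k' and V_k' used above.

```python
import numpy as np

# Made-up term-document matrix: rows = terms, columns = documents.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0, 1.0]])
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Put one square root of the singular values into each factor.
root = np.diag(np.sqrt(s_k))
U_k_prime = U_k @ root    # row i: latent coordinates of term i
V_k_prime = root @ Vt_k   # column j: latent coordinates of document j

# The two scaled factors multiply back to the rank-k approximation.
A_k = U_k @ np.diag(s_k) @ Vt_k
print(np.allclose(U_k_prime @ V_k_prime, A_k))
```

With `k = 2`, every term and every document gets two coordinates, which is what the two-dimensional plots show.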
1:43:00
Now, how does querying work? Well, it is kind of the same idea: we map the query vector into the latent space. The query usually contains considerably fewer terms than the documents, but it is expressed over the terms of the collection, so we can do exactly the same transformation that we did with the terms in the documents. This is what the matrix U_k' was interesting for: the i-th row of U_k' gives the coordinates of the i-th term. So we are just assuming we do not know the latent coordinates q' of the query yet,
1:44:03
but we can compute them, because we know the matrix that transforms the latent coordinates back to get the original term vector again. And now, if we are given the term vector q and we know the matrix U_k', we can just derive q' by inverting the matrix U_k' and multiplying it from the left onto q.
1:44:48
So what we can do is this: we have q equal to U_k' times q', and we solve this equation with respect to q'. We multiply by the transpose of U_k; for orthonormal matrices the transpose actually is the inverse. We multiply from the left and basically end up with the rest, the singular value part, which again we multiply by its inverse. If you want to get the inverse of a diagonal matrix, you just take the reciprocal values of the diagonal entries. OK? So if you have an entry a on the diagonal, you multiply by 1 divided by a to get 1. OK? So if I have a matrix with a, b, c and so on along the diagonal and zeros elsewhere, and I want the inverse of that, the inverse is just 1 divided by a, 1 divided by b, 1 divided by c, and so on, again with zeros elsewhere. Multiplying the two gives me the identity matrix. So I take this inverse matrix and multiply again from the left. OK? And then I have the transformation of the vector q to get q'. And these are all parts I already know: U_k and the singular values of the decomposition. That is the recipe.
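The recipe for the query can be sketched as follows, assuming NumPy and the square-root scaling introduced before (U_k' = U_k times the square roots of the singular values). The function name `fold_in` and the toy matrix are assumptions made for this illustration.

```python
import numpy as np

def fold_in(q, U_k, s_k):
    """Map a term-space vector q into the k-dimensional latent space."""
    # U_k has orthonormal columns, so U_k^T undoes U_k from the left;
    # the diagonal scaling is undone by the reciprocal square roots.
    return np.diag(1.0 / np.sqrt(s_k)) @ (U_k.T @ q)

# Made-up term-document matrix: rows = terms, columns = documents.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0, 1.0]])
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
V_k_prime = np.diag(np.sqrt(s_k)) @ Vt_k   # document coordinates

# A query that mentions the first and the second term.
q = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
q_latent = fold_in(q, U_k, s_k)
print(q_latent)
```

A useful sanity check of this sketch: folding in a column of `A` as if it were a query lands exactly on that document's own latent coordinates.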
1:46:56
So that is all of the theory. Now the application: we can map the query vector into the latent space just by multiplying it, as shown before, and then look for the most relevant entries, for example by the cosine measure, taking the cosine of the angle between the different things. Everything in the shaded area has a cosine similarity of at least 0.9. So documents 6, 16 and 5 are pretty similar to "application", and document 8, for example, is very dissimilar. Looking at what 8 was: "Methods of Solving Singular Systems of Ordinary Differential Equations". No "application" in there.
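The cosine threshold can be illustrated in plain Python. The latent coordinates and document labels below are hypothetical placeholders, not the book titles from the slide; only the 0.9 threshold is taken from the lecture.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 2-D latent coordinates; the labels are placeholders.
latent_docs = {
    "doc_a": (0.9, 0.1),
    "doc_b": (0.8, 0.3),
    "doc_c": (0.1, 0.9),
}
query = (1.0, 0.2)
threshold = 0.9

# Rank all documents by similarity to the query, best first.
hits = sorted(
    ((name, cosine(query, vec)) for name, vec in latent_docs.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
relevant = [name for name, score in hits if score >= threshold]
print(relevant)
```

Only the documents whose direction is close to the query direction pass the threshold; vector length plays no role.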
1:48:11
So let us look at one of the other documents that was similar, and see whether things work fine. One that seems to be very close is B5.
1:48:27
OK.
1:48:29
Let us go back and look at what B5 was.
1:48:34
"Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra". Hmm, "application"? It does not contain it. Well, maybe "introduction" has something to do with "application", so they seem to occur in the same context, but it is not something that you can immediately see. However, the topic is somehow related. It is not like the cosine similarity in the original space of the vector space model, where the cosine similarity is only large if a document contains the same words. This is not true any more here: the similarity is large if the terms occur in the same context, here in the context of "theory" and "algorithms". And that is good.
1:49:52
So, has everybody understood how it works, and does everybody understand the limitations and the chances of it?
1:50:01
Questions? Nope.
1:50:08
Well, let us take another example. Here we have documents from newsgroups, actually Usenet postings from the areas of computer graphics, hockey and religious texts. We have the terms here, about 12,000 terms in total remaining in the documents after stop words were removed, and the documents here. We can see that the document-term matrix does not seem to have a good structure. Obviously some terms are contained in almost all documents, like the one here, and this goes for every area. And there are documents that contain a lot of different terms, but there are also documents that are very sparse in terms, for example the document here with only a few points for the different terms. Whenever there is a point, it just means that this term is contained in the respective document. And as you can see, we basically see nothing at all. I mean, this term-document matrix does not tell us very much. If we look at the similarities between the different documents and the different
1:51:44
terms, taken basically from the other documents, and we want to see what happens in computer graphics, we find that many of the computer graphics documents are kind of similar to each other, and the sparsity of the other parts shows that they are kind of dissimilar to the other documents. Still, comparing for example the terms of the computer graphics documents with the terms of the religious documents, there are not too many points. Concerning the hockey documents, they seem to be very similar to each other. And of course we can also see that there is some kind of correlation, because this area seems to be very crowded and the area over there seems to be very sparse. So maybe shifting the axes would be a good idea. Shifting the basis is exactly what we did, and if we take the
1:53:09
three eigenvectors, the three latent dimensions, and plot the documents in them, we can see that the different documents cluster in different parts of the space. If we label them according to where they came from, we can see that the singular value decomposition did a good job in discriminating between them. We can also see that there are some hockey documents that seem to be computer graphics documents, talking about the same context as computer graphics documents usually do (I cannot put my finger on a single one), and there are some that lie in the other groups' context. But for the mass of documents the distribution seems to work quite well. And unlike the 12,000 terms that we had before,
1:54:35
we now have three dimensions, and every document is represented three-dimensionally, not by 12,000 dimensions, while losing very little.
1:54:53
Good. Yet another example: the Reuters collection, one of the classical collections for testing information retrieval algorithms, which basically consists of about 20,000 newswire messages taken from 1987. These are the top results for the query terms "taxes" and "Reagan", who was the president at that point, using latent semantic indexing. The first hit has Reagan and a tax hike in its headline. So obviously this is a perfect match, because Reagan says something about taxes. The second one: "Rostenkowski says will back U.S. tax hike ...". So this is a document in which it is not Reagan who says something about the taxes, but it still talks about Reagan's policy in principle. And the third one is really interesting, because it does talk about taxes, but it does not talk about Reagan. It does, however, talk about the White House: "White House says it opposes tax increase as unnecessary". As you can see, this third one is not a syntactic match for both search terms, but it definitely contains some information about taxes and about Reagan, which we could not see from the original term-document matrix. But the matrix after the SVD kind of puts "White House" and "Reagan" into the same category. They are often used in messages that exchange the terms "White House" and "Reagan": for many messages saying "Reagan does this or that" there is a similar message saying "the White House does this or that". And this is kind of the beauty of latent semantic indexing: it does not work directly on the terms, on whether they occur or not, like the vector space model, like Boolean retrieval, like probabilistic retrieval. It tries to figure out what the context of the terms is and groups together those documents containing the same words, or words in the same context, while at the same time making the dimensionality of the whole thing much smaller. This improves both retrieval quality and efficiency. It actually is good to use, because the space is small. What more can you expect to get?
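The "Reagan" and "White House" effect can be imitated on a tiny invented collection. This is a sketch under assumptions: NumPy is used, the five terms and four documents are made up, and the query is folded into the latent space with the square-root scaling described earlier. It is not the Reuters data.

```python
import numpy as np

terms = ["reagan", "whitehouse", "taxes", "hockey", "goal"]
# Columns: d1 = "reagan taxes", d2 = "whitehouse taxes", d3 and d4 = "hockey goal".
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Query containing only the term "reagan".
q = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
q_latent = np.diag(1.0 / np.sqrt(s_k)) @ (U_k.T @ q)
docs_latent = np.diag(np.sqrt(s_k)) @ Vt_k   # one column per document

raw_d2 = cosine(q, A[:, 1])                      # d2 shares no term with the query
latent_d2 = cosine(q_latent, docs_latent[:, 1])  # but lies in the same latent topic
latent_d3 = cosine(q_latent, docs_latent[:, 2])  # hockey document, different topic

print(raw_d2, latent_d2, latent_d3)
```

In the term space the query has nothing in common with `d2`, but in the two-dimensional latent space both point along the same "politics" axis, while the hockey documents stay on the other axis.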
1:58:28
And also, to give you a different view: LSI can be seen in a different way. Sometimes it is easier to understand things when you see them from a new angle, so just lean back, relax a bit, and let me take you through the SVD again, but this time with a different view, as a network, similar to a neural network. It is a graph that consists of nodes, like this one, and connections between nodes. Now, if you have a matrix, as we have with the term-document matrix and its singular value decomposition, then the matrix has rows and columns, and each entry can be seen as the strength of the connection between the row and the column. For example, the second row and the third column are connected with a degree of 7 in this example; the higher the number, the stronger the connection. So in the network representation of the matrix we have a node for each row, and for each column we also have a node, shown in red, and we draw a connection between each row and each column and assign a weight to each connection that corresponds to the respective matrix entry. So now we have rows directly connected to columns, and it looks very confusing. So what can we do? How can the singular value decomposition help?
2:00:29
So the singular value decomposition essentially introduces a new layer into this network, creating additional nodes which are used to connect the rows to the columns. So from now on the rows and columns are not connected directly any more, but only indirectly through this intermediate layer. So again, here is some matrix A, and if we do a singular value decomposition we get A equal to U times S times V, and the latent dimensions created by this decomposition are what we view as the intermediate layer. And now we can do the same connection trick to draw the network: now the first row is connected to the first topic (remember, the columns of U are topics, and also the rows of V are topics). We can see that the first row is connected to the first topic with a degree of 0.2, and the first topic is connected to the first column with a degree of 0.4. And S goes directly into the intermediate layer in terms of weights: the first intermediate node gets the first singular value as its weight, and the second node gets a weight of 1.7. So now we can reconstruct our original matrix from this representation.
2:02:29
For example, let us look at the entry in the second row and the first column, the 5. Second row, first column: in the original picture the weight of the connection between them is 5. Now, with this intermediate layer, we can reconstruct this 5 from the topic representation. Row 2 is connected to the first topic with a weight of 0.5, and topic 1 is connected to column 1, so one connection goes this way; and there is a second connection going along the second topic this way. Now we can reconstruct the original value by just following both paths through the network and multiplying the weights along each path. So for the first path we have 0.5, times the weight of the first topic, times the connection strength to the respective column, and we add the product corresponding to the path through the second topic, which has a topic weight of 1.7 and one negative connection weight. In total we get 5 point something minus 0.357, and ignoring rounding errors this is about the 5 we would expect. So by introducing a new layer into the network we can avoid all direct connections and simplify the network. If you think of a very large network with very many rows and columns, there would be number of columns times number of rows connections in the original representation, and usually, if you introduce this small intermediate layer, you end up with a lot fewer connections overall than we have in the original representation. So with this representation with an intermediate layer we are able to find systematic correspondences between rows and columns, going across topics. So let us look at a term-document matrix in an example. It might look like this, with many entries in it, and the network might look like this:
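Following all paths through the intermediate layer is just the entry-wise form of multiplying U, S and V transposed. The sketch below assumes NumPy; the 3x3 matrix is invented (it only borrows the 5 in the second row, first column), so the individual edge weights will differ from those on the slide.

```python
import numpy as np

# Made-up matrix; the entry in the second row, first column is 5.
A = np.array([[1.0, 2.0, 0.0],
              [5.0, 1.0, 7.0],
              [0.0, 3.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

def entry_from_paths(row, col):
    """Sum over all paths row -> topic t -> column."""
    total = 0.0
    for t in range(len(s)):
        # edge weight row->topic, times topic weight, times edge weight topic->column
        total += U[row, t] * s[t] * Vt[t, col]
    return total

# Second row, first column (indices start at 0).
value = entry_from_paths(1, 0)
print(value)
```

With all topics kept, every entry is reconstructed up to rounding; dropping topics with small weights removes their paths and gives the approximation discussed before.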
2:05:20
A strong line means that there is a strong connection. In this example, the documents that contain the second term are the documents connected to this term in the network; here there are three documents connected to the term. And again we can introduce an intermediate representation with the singular value decomposition.
2:05:51
And here we can see the topics. Here we constructed the full singular value decomposition with a total of three intermediate topics, and it is still quite a confusing network. But as you have seen, we can leave out some of the topics and keep only those topics which are really important. This one only has a weight of about 0.5, which is much less than the other two topics, so we can cut it out, simplifying this network. And this is essentially what LSI tries to do: get a clearer picture of how the connections are between terms and documents. At first sight this example does not look really simpler than the original representation, but if you think of a really large network, an intermediate layer with connections only between terms and the intermediate layer and between documents and the intermediate layer has far fewer connections than you would have in the full network, where every term is connected to every document.
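The saving in connections is simple arithmetic and can be written down in plain Python. The function names and the large example sizes are assumptions chosen for illustration.

```python
def direct_connections(num_terms, num_docs):
    """Every term node is connected to every document node."""
    return num_terms * num_docs

def layered_connections(num_terms, num_docs, num_topics):
    """Terms and documents are only connected to the topic layer."""
    return num_topics * (num_terms + num_docs)

# Tiny network: the layered version is not smaller at all.
small_direct = direct_connections(3, 3)
small_layered = layered_connections(3, 3, 2)

# Hypothetical large collection with 100 topics kept.
large_direct = direct_connections(12_000, 20_000)
large_layered = layered_connections(12_000, 20_000, 100)

print(small_direct, small_layered)
print(large_direct, large_layered)
```

This matches the remark above: for a toy example the intermediate layer does not look simpler, but for large collections it needs far fewer connections.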
2:07:04
OK, so that is a different way to think about it. Now another question is how we can compute the SVD. So far you have seen some examples where we simply showed you some matrix decomposition and asked you to trust us that this is the right decomposition. How it is computed we are not going to discuss in this lecture, because this is really pure mathematics, and there is abundant literature on how to compute the SVD and on what its properties are. In general you can say that computing the SVD is computationally expensive. For small collections the operation needs only seconds; for really large document collections it can take hours or days. For the Reuters collection with its 20,000 documents, computing the SVD on a notebook computer takes about 19 seconds. Now think about document collections containing millions or billions of documents: computing the SVD is a huge task and requires massively parallel computation; it is not easy to do. One problem is that many traditional algorithms for computing the SVD that have been developed in the field of mathematics require that the term-document matrix A, the matrices U, S and V and all intermediate values be kept in main memory in order to be fast. Of course you will not be able to keep the term-document matrix corresponding to the World Wide Web in main memory, even if you have a very big computer, because it is many, many thousands of gigabytes that you would need, and you would need a parallel computer that could do this efficiently. Because of this there are some specialized algorithms available that are able to swap intermediate results out to disk without incurring too large time penalties. How to compute the SVD on large document collections still seems to be a research topic. Looking into the literature, up to now there have not been any researchers who published work on LSI using more than 1 million documents. Maybe some companies know how to do this efficiently, but obviously, once you find out how these difficult things work, it becomes a trade secret, and companies are not interested in publishing it and helping the competition. So it is very difficult, but somehow you can manage it; usually it requires a lot of hardware and some tricks, maybe distributed computation, I have no idea. In recent research there are some ideas on how to use shortcuts and approximation algorithms to come up with an approximation of the SVD. Some of them are based on gradient descent optimization techniques and are quite simple to implement. Currently it is still a research topic, and it is open which way to go.
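To give a rough feeling for the gradient descent idea, here is a generic sketch, assuming NumPy. It is not one of the published algorithms mentioned above: it simply fits two small factor matrices `P` and `Q` to a made-up matrix by plain gradient descent, which approximates the rank-k approximation A_k but does not produce orthonormal singular vectors. The step count and learning rate are hand-picked assumptions for this toy matrix.

```python
import numpy as np

def low_rank_by_gradient_descent(A, k, steps=2000, learning_rate=0.01, seed=0):
    """Fit A ~ P @ Q by plain gradient descent on the squared Frobenius error."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    P = rng.normal(scale=0.1, size=(m, k))
    Q = rng.normal(scale=0.1, size=(k, n))
    for _ in range(steps):
        residual = P @ Q - A
        # Gradients of 0.5 * ||P Q - A||^2 with respect to P and Q.
        grad_P = residual @ Q.T
        grad_Q = P.T @ residual
        P -= learning_rate * grad_P
        Q -= learning_rate * grad_Q
    return P, Q

# Made-up example matrix.
A = np.array([[4.0, 0.0, 1.0],
              [3.0, 1.0, 0.0],
              [0.0, 2.0, 5.0],
              [1.0, 0.0, 4.0]])
k = 2

P, Q = low_rank_by_gradient_descent(A, k)
gd_error = np.linalg.norm(A - P @ Q)

# Reference: the error of the exact rank-k approximation from the SVD.
s = np.linalg.svd(A, compute_uv=False)
best_error = np.sqrt(np.sum(s[k:] ** 2))

print(gd_error, best_error)
```

No rank-k matrix can beat `best_error`, so the gradient descent result can only come close to it from above; on large, sparse term-document matrices such a method would need considerably more care than this toy version.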
2:10:56
OK. When you want to compute the SVD, you use some kind of algorithm. There are many, many libraries you can download from the Web, or you can use tools such as MATLAB, and this will work for most document collections of medium size.
The next question is k: how many topics, or dimensions, should be taken? In the examples we only had a few dimensions, and on the Web you would expect some more, but we have to find out the correct number.
If we use too many dimensions, then of course we have a better approximation of the original term-document matrix, but we have more dimensions to compute on, so the computation is getting more expensive. And of course the effect of condensing the term-document matrix into smaller ones cannot be achieved. So too many dimensions are bad.
On the other hand, too few dimensions are bad, too, because we simply do not cover enough topics. If you compute the SVD on a large document collection and keep only the most significant topics, these would be topics where we lump together large groups of documents, and we will not be able to differentiate between the smaller topic variations in the collection. So essentially, using too few dimensions is not the way to go either: people will simply not be able to find what they are looking for, because the system is not able to differentiate between different documents. So too few dimensions is also bad.
Finding the right k usually involves just trying out different values. And of course it depends on how specialized the document collection is. If there are many, many different areas in the collection, for example a collection about all the different areas in science, you would need many topics. If the collection contains almost the same kind of documents and not many different topics, a smaller k could be better. So you just have to try it out.
There has been an experiment where different values of k were used, with k on the axis.
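The trade-off between approximation quality and the number of dimensions can be sketched numerically. The following is a minimal illustration, assuming NumPy; the random matrix is only a stand-in for a real term-document matrix, and the helper `rank_k_error` is a made-up name. It measures only how well the rank-k matrix approximates the original, which says nothing by itself about retrieval quality.

```python
import numpy as np

# Hypothetical stand-in for a term-document matrix (30 terms, 20 documents).
rng = np.random.default_rng(0)
A = rng.random((30, 20))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

def rank_k_error(k):
    """Frobenius norm of the difference between A and its rank-k approximation."""
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return float(np.linalg.norm(A - A_k))

# More dimensions give a better approximation, at the cost of more work
# and less condensation of the original matrix.
errors = {k: rank_k_error(k) for k in (1, 2, 5, 10, 20)}
for k, err in errors.items():
    print(f"k={k:2d}  approximation error={err:.3f}")
```

The error shrinks as k grows and reaches zero (up to rounding) at full rank, so the approximation error alone would always favour the largest k. That is why the choice of k has to be judged on the retrieval results instead.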
2:13:58
The quality measure in that experiment was some kind of synonym test: they checked how many synonyms have correctly been grouped into the same topic, and how many words that have different meanings have been put into different topics. They found out that, for their collection, using an intermediate value for k of about 200 to 300 yielded the best results. A too small number of dimensions is bad, and a too large number of dimensions is also bad. Finding the right one is usually difficult, so there are some values from experience: some people say that about 200 to 500 topics is a good value for k.
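The idea behind such a synonym test can be sketched on toy data. The example below is not the experiment mentioned in the lecture; the five terms, the matrix and the helper names are made up, and representing terms by the rows of U_k scaled by the singular values is one common convention that is assumed here. "car" and "automobile" never occur in the same document, but both occur together with "engine".

```python
import numpy as np

# Hypothetical toy collection: rows = terms, columns = documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 0, 0, 0],  # automobile
    [1, 1, 1, 0, 0],  # engine
    [0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 1, 1],  # petal
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

def cosine(x, y):
    """Cosine similarity; returns 0.0 if one vector is (numerically) zero."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    if nx < 1e-12 or ny < 1e-12:
        return 0.0
    return float(x @ y / (nx * ny))

def term_similarity(t1, t2, k):
    """Cosine similarity of two terms in the k-dimensional latent space."""
    coords = U[:, :k] * s[:k]  # assumed convention: term vectors = U_k S_k
    return cosine(coords[terms.index(t1)], coords[terms.index(t2)])

# In the original term space the two synonyms share no document.
raw = cosine(A[terms.index("car")], A[terms.index("automobile")])
print("raw cosine car/automobile:", raw)

for k in (1, 2, 3):
    syn = term_similarity("car", "automobile", k)
    other = term_similarity("car", "flower", k)
    print(f"k={k}: car/automobile={syn:.3f}  car/flower={other:.3f}")
```

On this toy matrix, k = 2 puts the two synonyms on the same latent axis and keeps "flower" apart, while at k = 3 (the full rank of the matrix) the latent similarities fall back to the raw ones and the synonyms are separated again. A real synonym test would count such hits and misses over many term pairs for each k.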
2:14:51
But you need to find out what works for your collection.
What are the pros and cons of LSI? First the pros. Usually the retrieval quality is quite good, as we have seen in the example: no other method we know would be able to retrieve this document, but LSI has identified that the two terms stand for essentially the same concept. So you are able to retrieve relevant documents that cannot be found with any other method, and that is definitely a big pro.
Another pro is that LSI rests on very strong mathematical foundations. We have seen other methods where many heuristics are used. Think of TF-IDF in the vector space model, where you have to make some very strange assumptions, and in the end it just works but nobody knows why. With LSI you can explain why you expect it to work.
Another pro is that the SVD as a technique can be applied to other kinds of problems, not only in information retrieval. You can also apply it to finding similar movies, for example. If you try to find out which movies are similar to a given one, with SVD techniques you are able to find this out without asking people what is similar. So the SVD is just a good thing in general. It has its benefits in many different fields, so it is also a good thing to use in information retrieval.
Now there are also some cons. The latent dimensions you get are very difficult to understand. You have no idea what a dimension means. You can be sure that documents and terms that have similar coordinates in the latent space are semantically similar, that is all you know, but you cannot say that the first axis stands for some particular topic. You have no idea.
Another con is that the computational requirements are really, really high. As I said, it can easily take days or weeks to compute the SVD for medium-sized document collections, and this is something you will not do on a day-by-day basis. All this has to be done offline, and definitely not at query time.
And finding the right k is quite hard. Just try out different values of k, use some examples, and check whether the results are acceptable for your application. As always in information retrieval, we can only provide you with some methods and some ideas of how to use them; how to tune the parameters always depends on the collection and the problem you have to solve.
Next week we will talk about language models, which are another model of how to do information retrieval, of how to search for documents. And we come to the question of what relevance is. We always said that relevance is, well, you know it when you see it, but there is more to say about relevance: there are different levels of what relevance really is, and we will take a look at this. And finally we will discuss how one can evaluate how good an information retrieval system really is. With these ideas we will be able to compare, for example, the vector space model and binary independence retrieval, and see what works best on most collections.
That's it for today. Thank you very much.