
Texture Features, Low-Level Texture Features, Tamura Measure, Random Field Models, Transform Domain Features (21.04.2011)


Formal Metadata

Title
Texture Features, Low-Level Texture Features, Tamura Measure, Random Field Models, Transform Domain Features (21.04.2011)
Title of Series
Part Number
3
Number of Parts
14
Author
Contributors
License
CC Attribution - NonCommercial 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Producer
Production Year: 2011
Production Place: Braunschweig

Content Metadata

Subject Area
Genre
Abstract
In this course, we examine the aspects of building multimedia database systems and give an insight into the techniques used. The course deals with content-based retrieval of multimedia data. The basic issue is the efficient storage and subsequent retrieval of multimedia documents. The general structure of the course is:
- Basic characteristics of multimedia databases
- Evaluation of retrieval effectiveness, precision-recall analysis
- Semantic content of image-content search
- Image representation, low-level and high-level features
- Texture features, random-field models
- Audio formats, sampling, metadata
- Thematic search within music tracks
- Query formulation in music databases
- Media representation for video
- Frame/shot detection, event detection
- Video segmentation and video summarization
- Video indexing, MPEG-7
- Extraction of low- and high-level features
- Integration of features and efficient similarity comparison
- Indexing via inverted file index, GEMINI indexing, R*-trees
Transcript: English (auto-generated)
Hello everyone and welcome to the wonderful world of multimedia databases. And last time we were beginning to talk a little bit about colors, about images.
We saw color as the first and the primary impression towards perception. So what you immediately notice, if you're not colorblind, is kind of the contrast and the colors that are in an image
and that makes an immediate impression on you. And of course this could be formalized, so we were talking about a couple of color spaces. The RGB space for example, CMYK, usually used for printing. But also other spaces for building the actual histograms, for building the actual features.
We've reflected on HSV. Does anybody still know why we reflected on HSV? Exactly, it's not a truly psychological color space, but the distances, or the measurements of distances, in HSV give a pretty good notion of how humans distinguish between colors. And the basic form of HSV was a cylinder: you have the color hue, which runs around the circumference of the cylinder, then you have the saturation, going from the inside of the cylinder to the outside, and you have the brightness, starting very bright at the top and going all the way down to the dark areas.
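As an aside, the cylinder just described corresponds to the standard RGB-to-HSV conversion; here is a minimal sketch using Python's standard library colorsys module (an illustration, not code from the lecture):

```python
import colorsys  # standard library RGB <-> HSV conversion

# Hue is the angle around the cylinder, saturation the distance
# from its axis, and value the height (brightness).
def describe_hsv(r, g, b):
    """Map normalized RGB in [0, 1] to (hue in degrees, saturation, value)."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0, s, v

# Pure red sits at hue angle 0, fully saturated and fully bright.
print(describe_hsv(1.0, 0.0, 0.0))
```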
So we were talking about how colors can be mixed and how colors can be subtracted or added. But in the end we were really interested in what we could actually do with this color. How could we compare images based on color?
And the one idea that really works well is the color histogram. We say, well, what percentage of each color is in the picture? And then you can do all kinds of tricks with the layout, where you say no, you also have to consider where the color actually is, so the location of the color,
and you can do a lot of tricks there and it gets more complicated, but in the end what you get is a feature vector, and there are different ways to compute similarities between these feature vectors, beginning from simple histogram distances, so just subtracting the different columns from each other, up to quadratic measures or the Mahalanobis distance, which takes the correlation between different colors and the similarity between different colors in the spectrum into consideration.
And today we will be moving on from the simple colors to something that is also very interesting for recognizing or describing images, and that is textures. What is a texture? Anybody want to venture a definition? Why is the surface of this table not like the surface of the carpet?
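The histogram comparison just recapped, from simple bin-wise subtraction up to quadratic measures, can be sketched as follows. This is a toy illustration assuming NumPy and three-bin histograms, not the lecture's implementation:

```python
import numpy as np  # assumed available for the vector arithmetic

def l1_histogram_distance(h1, h2):
    """Simple histogram distance: subtract the columns and sum the absolute differences."""
    return float(np.abs(h1 - h2).sum())

def quadratic_form_distance(h1, h2, A):
    """Quadratic measure: A[i, j] encodes how similar colors i and j are,
    so mass moved between perceptually close bins is penalized less."""
    d = h1 - h2
    return float(d @ A @ d)

h1 = np.array([0.5, 0.3, 0.2])  # toy three-bin color histograms
h2 = np.array([0.2, 0.3, 0.5])
print(l1_histogram_distance(h1, h2))
# With the identity as A, the quadratic form reduces to the squared Euclidean distance.
print(quadratic_form_distance(h1, h2, np.eye(3)))
```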
Where is the difference? The material that is used here? But I mean, you cannot see materials. And abstracting from the color (this is brown, this is gray, yes, I noticed), but still, also in terms of visual impression? Yes, the light reflection is somewhat different. But could you describe it? Take a look at the table before you: how would you describe it? Seems pretty smooth to me too.
Any ideas? It's hard, isn't it? So this is kind of stripey in a way, you know, like with the wooden stripes in it, you know, this is kind of pointy, I don't know.
It's very hard to describe what it actually is and what makes it so different, but we can immediately recognize the difference. Even if I would do the same in the same kind of colors
or in the same types of, you know, like smoothness of the surface, if I would take exactly the same reflection properties, you know, I will immediately see that this surface looks somehow different from the surface over here.
And the idea of this lecture is to describe this: how is it different? And this is what we call the texture. So we'll move on to texture-based image retrieval today. We will first go into the basics of textures and find out what makes a texture, a structure, a surface pattern, or however you may call it, a texture. We will then talk about some features that could be used to measure or to describe such textures. Along the way we will also introduce the idea of low-level features and high-level features. Low-level features are very basic descriptions of something. High-level features are, well, essentially intrinsic descriptions built by mathematical models. So they can be pretty complicated
but usually give you a better impression than a low-level feature. They don't abstract as much. But both have their uses. So let's hop into that. So textures describe the nature of typical recurrent patterns in pictures.
So if I look at the surface of the table here, there are the stripes, and there's not just a single stripe, but this kind of layer of stripes on top of each other. And they're not really regular, so I wouldn't go so far as to say this is hatched somehow. But the stripes are of different strengths, and some are perpendicular. No, perpendicular probably not, but some are vertical, some are a little bit angled or a little bit skewed. So there is a regularity, though it's not as regular as if I were to consider the blackboard and the really vertical edges here, okay? But the reason I know that this is a pattern is that it is recurrent. It repeats several times; otherwise I wouldn't recognize it as a pattern. And that is the basic idea, that it is recurrent. A hatching is only a hatching if there are several lines. If it's just a single line, nobody would call it a hatching, okay?
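This recurrence can even be made measurable: a repeating pattern produces a second peak in its autocorrelation at a lag equal to the period, which a single line does not. A small sketch on a one-dimensional gray-level profile (assuming NumPy; an illustration, not a method from the lecture):

```python
import numpy as np  # assumed available

def autocorr(profile):
    """Normalized autocorrelation of a 1-D gray-level profile."""
    s = profile - profile.mean()
    full = np.correlate(s, s, mode="full")
    mid = full.size // 2
    return full[mid:] / full[mid]  # lag 0 normalized to 1

# Stripes with period 4: the autocorrelation rises back to a peak at lag 4,
# which is exactly the "it repeats" property of a texture.
stripes = np.tile([0.0, 0.0, 1.0, 1.0], 16)
r = autocorr(stripes)
```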
And the same goes for the pattern of the carpet here. That's rather pointy, or I don't know what to call it actually, and that is part of the problem: because if I don't know what to call it, you will not understand what I mean or what I'm talking about without actually seeing it. Say I try to describe the pattern of the carpet over the phone to somebody. That would be very difficult, huh? Well, it's like a grayish, pointy, carpety thing. Got an impression? Probably not. What we really need is a good description for those images,
for those patterns. On the one hand I could use the objects: so this is a wood structure, and everybody knows what I'm talking about, and I say, well, it's patterned like wood, because the different kinds of wood are, well, different in a way, but they are kind of similar to each other. It's all these stripes, all these big loops that they have, knot holes probably, huh? And everybody knows what I'm talking about. Not exactly, but everybody has an idea.
I can only do that with things that are somehow natural, you know, like grass. Everybody has an idea what the pattern of a meadow is: just leaves of grass beside each other. Or gravel, huh? A heap of gravel, what does it look like? Well, it will be pebbles of different sizes, so it will be a little bit coarse. But it also works for artificial things, like a brick wall. What does a brick wall look like? Well, usually it's kind of like a brick, and the next brick, and the next brick, you know. You have an idea, though I don't really say what it is,
what makes a brick wall, or how big the bricks are. That is the basic idea that we are looking at. Come on. Yeah, there's the screen saver again. And the idea of this lecture, our quest for today, is to order and somehow describe arbitrary textures that may occur in images. This is kind of very regular, though it is a natural thing; it's bamboo.
So how do we describe it? It's kind of parallel lines, maybe that is easier. Kind of knotted, parallel and perpendicular pieces of something. Grass, gravel, I mean, it's hard to do. It's really hard to do. Whoever has tried it knows: it's clear what I'm getting at, but it's totally unclear how to represent it in a computer.
Because even when talking to you, and natural language, speech, is one of the most effective and efficient ways of transporting information,
I say "kind of" a lot. A computer doesn't know "kind of". A computer knows one and zero. And this is something that we have to consider. And actually this is not only useful for multimedia databases,
but the description of textures is very important in many areas of computer science. So we will also revisit a couple of techniques that you may well know from other lectures, like for example Fourier transformation, as a typical high-level feature, because textures are used in many other applications.
So one problem is always the segmentation of textures. If I talk about a certain texture, I talk about a certain location in the image. If I talk about a wooden texture,
not all that I see here has a wooden texture, but only this table. If I take a step aside, the wooden texture is gone. So it has something to do with segmentation. And talking about the texture of an entire image would require the entire image to be totally covered in that texture.
Which doesn't make too much sense, because most pictures, I mean take any photo that you did recently, it doesn't show a single texture. It will show maybe happy people,
and maybe a tree that has a leaf pattern, and maybe there's sea sand, or whatever, you know? But there are certain elements in the picture. Trying to figure out which element is which is very important, called texture segmentation.
Then we need to classify the texture. We need to know what we're talking about. Okay, the area A over here is of the wooden texture. The area B over here of the image is of the carpety texture. The area C, is that a texture? Is it?
Would you describe it?
It's not really a texture, is it? Because it's not regular enough. Because here are some words, and this is white, and this is green, you know. But it's not really a texture as such. So some parts of the images may be very hard to describe in terms of texture. Also that is something that we have to think about.
So this is basically the classification of the texture. And these two parts are the parts we definitely need for multimedia databases. We have to investigate incoming pictures, so pictures that are put into the database, or pictures that are put up as a query picture,
what textures are contained, which need segmentation. And we need to classify those textures to compare them between different images. Okay? So we need the classification of the texture. The third part that is very important, but that we will not go into,
is so-called texture synthesis. And this is one of the major features of, for example, computer graphics. Think about gaming, 3D engines. What is the trick there? The trick is to project textures on surfaces.
And that makes it pseudo-realistic. Texture mapping. And for those kinds of techniques, it's the same problem over and over again. You need to classify the textures, you need to see how the textures look, you need to do ray tracing and whatever, you know, very complex algorithms
to get a good visual impression of the texture, an impression that could fool the observer into believing this texture is real. This is a wooden wall in the computer game. Or this is a wooden table. You will immediately recognize that if you see it, just because the texture seems woody.
Okay? So creating texture, that is something that is a big part of texture research, but we will not go into that here. So for the texture segmentation, we want to find regions in the image which have a certain texture.
And one often calls this scene decomposition. For example, here is a grapey texture, whatever that may mean, and the texture over here in the rest of the picture is leafy.
Very hard to describe, you know, like a lot of colors, nothing really regular. Okay? But finding out the difference between the two leads to understanding what the image shows. What does the image show? It shows a bunch of grapes and some wine leaves.
With the leaves, I can have recollections of how leaves look like, what texture do they have, maybe this classical here with the stem and then the little branches over here. Okay? This is kind of like how leaves look like
and then the branches even branch out further. Okay? This is what we would expect of leaves. With the grapes, it's clear, it's kind of like little round shapes repeating all over the bunch of grapes.
So a very regular texture with little circles basically, spheres. Okay? This is what we would expect. And the color and texture are usually related. So what you often do in texture segmentation is look at the colors of the image.
So for example, you find the brown color here in the sandy part of the image, that gives you a certain texture or well there's basically no real regularity there. But as soon as we look at the green part over here,
we find that there's a certain pattern, which is kind of like the change between light green and dark green parts. So it has something to do with the color, but it's not true that textures always have to come in the same color; they can be very colorful, or a mixture of different colors. Still, the periodicity of the pattern, the recurrence of certain colors, might be a very good hint at what a texture really is, or which area of a picture really is a texture.
If we denote the segmented region with a predominant texture, that might also have a benefit beside just being able to focus on a single texture.
Because very often in images, areas with a certain texture belong to the same entity in the real world. Think about sonograms or x-rays to some degree.
The idea basically is to make things visible that are inside the body and very often they are kind of color coded and you see this is the liver over here, it's this area that has the same typical form usually but also a certain texture that reflects the sound waves
and in a certain way if it's a sonogram or if you have tomography, it will be other kinds of rays that penetrate the body or the outer layers of the body and are reflected in a certain way.
And this way of reflection will be a certain texture that is imposed on this area or on the specific point in your body and doctors actually can see things.
So for example in oncology, if you're looking at cancer, it's very often possible to see tumors just because there's some change in texture or there's some change in the way the rays are reflected. Same goes for satellite images.
If you look at satellite images, you can immediately see what is water and what is land because the water has, besides the color, has a different texture. It has kind of like longitudinal stripes which are kind of waves in the area close to coasts and a very flat, no-texture area in the middle of the oceans.
Whereas on land masses, you usually have some mountains, you have some cities, you have some, I don't know, like forests or something that will change the texture very quickly.
And if you look at images of densely populated areas where you have agriculture, for example, you will have this carpet pattern, you know, like with different corn fields and whatnot. So what we have to do now for the classification is we have to describe the corresponding texture with some features
or words or whatever that can be used by computers to compare the textures of different images, whether it's the same texture or it's a different texture, and how close matching textures are.
Yeah, the classification on one hand can be semantic, so I can say, well, if something is textured like that in a medical image, it is the liver. Or if something looks rather bubbly in x-rays, it might be the lung.
So there's a semantic meaning to the things. But this is very strongly dependent on the application, so in medical imaging one can do that. In many other kinds of remote sensing also, it's just not possible to say what it actually is.
It's like seeing something on the radar, you know, like there is something but you have no idea what it actually will be. But you can figure out it's not the background, it's not the background noise. The same happens if you look at photos, photos of friends. You will immediately recognize where the person is in the image and where the background is.
You get an idea of that just by looking at the clues because there's something that is hair textured around the head of a person. Well, very often, more often than not. There's something that is kind of like textured here.
Sylvia is a very good example of texture today, where you would expect the shirt, yeah, exactly, you find regular textures, and you will immediately focus on those areas that are more interesting to you. And the good thing about textures is that you can actually skip the parts that are all texture, because you recognize it and then you go: oh, this is the shirt, no interest; this is the face, I have to look at it to see who it is. So, by segmenting the images, this classification is also very helpful.
If we consider the classification not to be, you know, like in real world terms, this is the shirt, this is the hair of a person or something like that, but if we rather say, well, this is a striped area, this is a wood textured area, you know,
whether it's a table or kind of chair or wall panel or whatever, this is just a wood pattern area. Then it allows us to compare between images.
We can say, oh, this is an image that has a wood-covered or wood pattern area, and this is another image that has a wood pattern area. It doesn't matter if one shows a table or the other shows a chair, you know; it's not semantic anymore, it's just the visual impression that it makes on us. And that actually allows us to compare between images,
compare, describe the visual impression. This is kind of the same trick that we did with the colors last time. We didn't care about what the colors actually depicted. Was it an elephant that was shown there or was it a carpet that was shown there or was it Sylvia that was shown there?
Well, that's hard with the elephant. But still, we just focus on the colors. We can do the same here. And we can also do something that is called query by example. We just say, okay, if I'm looking for a wooden table,
then I will just give you a piece of wooden pattern, whatever it may be, and I want all the pictures having this wooden pattern in it. It frees me from having to do something like annotating every image, what is a table.
It can be very helpful. Anyway, so for the classification of images, one of the classical examples is satellite images where you do it semantically, so you look at the things and find out this here is a river
with a very smooth texture. This here is sand, which has a very light and coarse texture. And then you can use it for later segmentation of what you're looking at. The question is really how to describe textures for measurement. And there are, on one hand, low-level features that just say,
well, what are the building blocks of the texture? What makes the texture the texture? So also a shading starts with the single line. And then you add a parallel line, and then you add a third line, and then you probably have a shading at some point.
Or you can have high-level features that are kind of like a mathematical interpretation of how things, different patterns, different statistical characteristics of patterns
reflect on the users. So typical examples here are Gabor filters or the Fourier transformation. We will go into that during the course of this lecture. The interesting question that always remains,
or that always is there once you want to build a system that is really useful to humans is kind of how do people distinguish textures? How do we find out that this is a regular pattern?
So how do we? What would you say? Any ideas? Yes?
Some little black lines here, as you can see. So there's no black line here. So if we kind of walk through this image pixel by pixel, then at some point we will hit a black line.
And once we are over it, we have kind of the same distribution of pixels, of colors, of intensities as before. What happens if we walk in this direction?
We will not hit black lines. So the change of light intensity or color when walking in different directions over the image seems to be something that could be used, yes.
Other things? Well, what do you say that the texture is different
in different parts of the image? Maybe I just showed the lower part here of the image. It's much harder to see the texture there
than looking at this part of the image. Okay, why is that? So looking at this part, one can see it. Looking at this part, I would say one could see it better.
Why is that? Yes?
So if we do exactly what you said, we walk that way, and we do the same over here. In the upper corner, we find that we hit very fine lines of black. Whereas, as you say, in the lower part, in the red part of the picture,
we find that there's a lot of darkness, so we had longer stretches of black that might belong to a shading or might be not. So also this is kind of what characterizes a texture. And this is kind of what gave the first idea.
So there are basically three main criteria. One is the repetitiveness. Something is not a texture if it does not repeat in some interval. Hitting one black line is not enough.
You have to hit black lines at regular intervals. It also has something to do with in what way you walk visually through the image. Because in one way you may hit black lines periodically, in the other you may hit nothing at all,
which makes it shading, basically. So this is orientation. And the other thing is the complexity of the pattern. There are very simple patterns and very complex patterns. It's like carpet weaving, you know. You have the very simple carpets that are just striped, very easy.
Or you have the ones with elaborate floral patterns and everything, you know. This is also a pattern, but very hard to describe because it's more complex. So Rao and Lohse actually defined three criteria, the repetition, the orientation, and the complexity of a pattern,
that actually make for the possibility to discover its nature or to describe its nature. And the question, of course, is can we measure that? Do we have any chance? Do we have any chance to find it out?
The idea of describing such patterns is actually older than computer science: some psychologists were already interested in it long before computer science adopted the problem because it wanted wonderful 3D ego shooters. The 60s and 70s basically focused on gray-level analysis and said it has something to do with going over the picture in different directions and finding when you hit a black line. So what we do is we kind of take the gray values of the pixels
and we have the intensity of each pixel. We build a histogram on that and just count how many, I mean, light pixels are there, how many dark pixels are there and so on. We get these histograms. These histograms could be kind of like compared to each other
like we did with the color histograms. And we can use some statistical information about these histograms. Do they have just a single peak? Are there several peaks? Something periodic maybe? What's the standard deviation if you have peaks?
What's the expected value or the mean or the median or whatever? And the idea of course was that similar patterns would produce similar kinds of histograms, similar types of histograms. And since you abstract it from the color by just taking the gray values, by just taking the intensity values,
similar patterns should look alike in the histograms. And if you then take moments of the first order, which is basically the expected value,
you throw away all the information of where each pixel is located. So if I say there's 50% black pixels and 25% gray pixels and 25% white pixels, it could be that it's just a bar of white pixels, a bar of gray pixels,
a bigger bar of black pixels. But it could also be that it's kind of shaded, that they are well mixed, which makes it very hard to see the periodicity.
So if I look at this picture here, there's basically no periodicity, and we get a gray-level histogram that looks like that. What does it tell us? Well, there's very little black down here, then it goes up, with an expected value somewhere here in the middle gray area.
This is where you would expect the usual picture. But you've thrown away all the information where each pixel is located by using this histogram. The solution to that is kind of the gray level co-occurrence.
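The position-blindness of such first-order histograms is easy to demonstrate. Here is a minimal sketch in plain Python (representing images as nested lists of gray values is an assumption for illustration):

```python
def gray_histogram(image, levels=256):
    """Count how often each gray level occurs -- pixel positions are ignored."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    return hist

def mean_gray(hist):
    """First moment (expected value) of the gray-level distribution."""
    total = sum(hist)
    return sum(g * count for g, count in enumerate(hist)) / total

# A striped image and a well-mixed image share exactly the same histogram:
stripes = [[0, 0, 255, 255],
           [0, 0, 255, 255]]
mixed   = [[0, 255, 0, 255],
           [255, 0, 255, 0]]
```

Both images yield identical histograms, and therefore identical moments, even though their textures look completely different; this is exactly the weakness that co-occurrence statistics address.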
You want to find out where the different intensities of pixels co-occur with other pixels. If you have a pixel at some certain position, it has an intensity, say Q.
And one of the first approaches, one of the first investigations of that, also came from psychology: Béla Julesz, in 1962. And he said, basically, what I have to do is, I have to look at each pixel in some image,
here for example, a picture, and then I have to see, taking its gray value, which is here the blue, what happens if I move it in some direction? What is the expectation that this gray value changes?
And if I have a pattern, for example, like this, and I take a pixel, a white one at this time, and I change the direction into this direction,
the expectation that the intensity changes is zero. If I change it in this direction, the expectation that it changes is very high. And that is the same for all the pixels here. That is something that is interesting now,
because now, not really looking at the exact point where a pixel is located, I can still describe some characteristics of it. For each pixel anywhere in the picture, if it is a regular texture, shifting it in different directions
should result in the same probability distributions of changing its color value or changing its intensity value. Is this a clever idea? Psychologists, hey!
So, what he did is, he calculated the empirical probability distribution for intensity changes of the value at pixel shifts. And he just used shift to the right. So, we have said, well, basically, my pixel has the intensity of Q,
and if I shift it, D positions to the right, then its intensity is M, and I want to know the probability. And with that, I get a probability distribution over the whole picture, and if the probability distribution for two pictures is identical,
the texture is identical. Wow! Well, of course, this is only true if you have horizontal changes. If it's a texture like that, it doesn't help you, because it's kind of not changing. So, a couple of years later, Julesz said, well, yeah, I see that. Let's generalize it to shifts in different directions. So, in any direction that we walk through the picture,
we need the probability distribution, and then, as a two-dimensional distribution function for every single picture, we get our probability estimation.
And this gives us actually the gray-level co-occurrence matrix. So, for any direction, I just say, what is the expected pixel change? The gray-level co-occurrence matrix basically considers all the pixel pairs within a Euclidean distance of d. And for all these pixel pairs, the entry in the matrix is the probability of that shift. Okay? So, if a point (x1, y1) has a gray value of i,
and a point (x2, y2) has a gray value of j, where a gray value is usually between 0 and 255, then you record in the gray-level co-occurrence matrix, in the cell (i, j), the number of pixel pairs that show exactly that transition within the distance, in any direction. Okay? I just count them. How many pixel pairs are there in the picture such that, if I move from one pixel to the other,
I get a shift from intensity i to intensity j. Okay? Big matrix. What can you do with this matrix? Well, what you can always do with matrices: you can compare them to other kinds of matrices, very efficiently, actually. If two pictures have nearly the same matrix, then the texture should be the same. And actually, that is the thesis Julesz stated in 1973: if two pictures show the same, or nearly the same, gray-level co-occurrence statistics, then it's not possible for humans to distinguish between the patterns in them. A nice thesis, but wrong. Interestingly, newer perception psychology shows that it's not like that. And actually, one of the psychologists showing it was Julesz himself, a couple of years later, seeing, well, the theory is nice, but the tests don't work out so well; humans usually don't behave like they are told to. Especially not if they are told to perceive something by a gray-level co-occurrence matrix. So, as a rule of thumb, similar co-occurrence matrices indeed do point to similar textures, but it's not really true that the textures have to be the same. You can trick the system. Okay? Still, as a rule of thumb, interesting enough.
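The counting scheme described above can be sketched as follows. This is a minimal, unnormalized variant (dividing each entry by the total count would give the probabilities; the `levels` parameter and the list-of-lists image format are assumptions for illustration):

```python
from math import hypot

def cooccurrence_matrix(image, d=1, levels=256):
    """Unnormalized gray-level co-occurrence matrix: entry (i, j) counts the
    ordered pixel pairs within Euclidean distance d with gray values i and j."""
    h, w = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    # all offsets within Euclidean distance d (for d=1: the 4-neighbourhood)
    offsets = [(dy, dx)
               for dy in range(-d, d + 1) for dx in range(-d, d + 1)
               if (dy, dx) != (0, 0) and hypot(dy, dx) <= d]
    for y in range(h):
        for x in range(w):
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    matrix[image[y][x]][image[ny][nx]] += 1
    return matrix

# Two-level vertical stripes: horizontal neighbours always flip the level,
# vertical neighbours keep it.
stripes = [[0, 1, 0, 1],
           [0, 1, 0, 1]]
glcm = cooccurrence_matrix(stripes, d=1, levels=2)
```

Two images could then be compared entry by entry on their matrices, which is much cheaper than comparing the pixels themselves.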
So, this was the first idea that was actually along the lines of, yeah, if I go through the picture and hit the black line, you know, like, so I have the pixel, gray intensity change, yeah, this is what I'm doing. It is exactly the idea that the people had in the 60s. So, you seem to be kind of a 60s man.
And it was a good idea. But it became clear that this is not the whole truth, and on top of that, these gray-level co-occurrence matrices are very hard to compute. It's not very efficient. Because you have to shift every pixel. Then you have to look at the different gray values that you get from the shift. Then you have to count them. Then you have to put it into the matrix. And you have to repeat that for all the pixels in the picture.
This is rather tedious work, and, well, even in the age of very quick computers it needs a lot of computation time to prepare the stored images and also to prepare the query image. And this time is really needed. So Tamura and some colleagues said in 1978: well, basically, maybe we take it down a notch. We don't go from every pixel in every direction and look at everything. Maybe there are some basic characteristics that more or less describe the image, and we can have a smaller feature vector than this huge quadratic gray-value matrix, which at the lowest resolution is already a 256 x 256 matrix.
If you consider more gray values, it will grow. And they said, well, basically, what we can see is that the granularity,
the coarseness of the image has something to do with the perception. So if you consider gravel versus sand: sand is very, very fine-granular; looking at it, it's a smooth kind of texture. With gravel, you see the individual pebbles,
and they differ in color. So it has a coarse impression. Then the contrast. There are areas of light, there are areas of shade. The more the areas dissolve into each other, the less contrast you have, and the more washed-out the perception of the texture will be. Directionality. If I go in some directions, the intensities of the pixels change very quickly. If I go in others, think of the bamboo, going along the cane of the bamboo, they don't change at all.
It's the same color all over the picture. Line-likeness. Does it look rather pointy and elongated, or is it rather bubbly, like pebbles? I can immediately discern it. The regularity of the pattern: does it really repeat? Is there a periodicity? And finally the roughness, that is the impression that, whereas this wooden structure seems very smooth to us, the carpet structure does not seem smooth.
I mean, seen from here to where the carpet is over there, it seems rather smooth, but if I look directly here, I see some irregularities which do not make it smooth. So also this could be. And they were actually measuring or trying to figure out
how to measure these things and found out that, well, basically, these seem to be correlated to the other three. So if you got the three, the other ones are more or less linear combinations of the ones before. They seem to be dependent on the other ones.
If you have a strong directionality, you also will have a strong regularity. Or if you have a strong directionality, the linelikeness is going to increase. So they looked at the different correlations between them and crossed out the last three. So this is what we want to do.
For the granularity, it has something to do with the image resolution. So, for example, if you look at aerial photographs from different heights, you will find that here you can see the buildings on the left-hand side, whereas same picture, just a different resolution.
This is both Manhattan, is it? Yes. It seems to be Manhattan. So this is the point of Manhattan, Statue of Liberty here. And this is a house block in Manhattan. And you can actually see the different houses here, different skyscrapers, okay?
Gives you a totally different impression. But how do you measure it? Any ideas?
Scaling, and then? I mean, if you scale each image enough, you will end up with the individual pixels. That doesn't really help you. You have to take the picture as it is. Yes. Yes!
There we go for the 80s man. So why don't we look at different size pictures or frames in the image and look how regular the colors are in there. So basically the idea is
I take a rectangle of a certain size and I move it over the picture. And if I do that with the same size rectangle here and there, assuming this is the same size rectangle, I will find out that this rectangle here very often houses pixels of the same color.
Here it will not. There's also always a mixture of different colors. This is one way of describing the granularity and this is actually what you do. You examine the neighborhood of each pixel
for brightness changes. Not actually the color; the brightness is enough. So you work pixel by pixel. You have a window that starts at a size of 1 x 1 and goes up to 32 x 32 pixels. So, different sizes. And you just record for every pixel in the image what the brightness change within this area is. These are, for example, the typical values of IBM's QBIC, Query By Image Content, which was one of the first running systems for multimedia databases. And then for each size of the window you record the average gray level in the corresponding window. So you get one value per pixel for each window size.
Okay? Distribution of gray values for the different sizes. Good. Then you compute the difference of means of gray levels between this window and the window next to it and the window on the other side.
Okay? What does it help you? Well, if you have three windows directly adjacent and of the same size roughly and there's a change in the gray level distribution
then it means that it's not a regular area. If there's no change or very little change the three windows may belong to the same area.
So going back to our example here if I take any three adjacent things I will find in this case
that the gray value has not changed very much. So there seems to be one area making this area of the image very coarse.
Okay? If I do the same over here it changes very much showing that this area is very fine granular. So it's not these three different windows
do not belong to the same area, to the same object. So this seems to be a rather fine granular. And I can do the same over the whole image with different sized windows and for each pixel I determine the maximum window size
where it has the maximum difference different from its neighbors. Okay? So basically what I'm trying to do is if I have an image
I work with these little windows of different size and I basically try to blow them up until they fit the coarseness of the pattern. And if I have a very fine granular image this will not be possible
because already at small windows the distribution between adjacent windows will change. If I have stretches of the same texture, of the same color this will be possible to blow them up because the adjacent windows still have the same texture
means the same gray-level distribution. Clear? Yeah, that is basically the idea of what you would do. And the granularity of the entire image is the mean of the maximum window sizes over all pixels. So if you have one area that is very coarse and one area that is very fine-granular, the mean just balances between them. And you can also use a histogram mapping the number of pixels corresponding to each window size; then you would know how much coarse texture and how much fine texture there is in the image. Or you could just use a single coarseness value for the entire image, namely the expected value of that histogram. Very well. So there's one problem with that, and that is that the image region whose granularity you need to determine might itself be very small, because you segmented your total image into different texture areas before. So you may be left with very small areas where you have to determine a texture.
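As a rough illustration of the window-growing idea, here is a heavily simplified, horizontal-only sketch: slide windows of growing width over each row, measure how much the mean gray level differs between adjacent windows, and take the width where that difference peaks. Real Tamura coarseness works per pixel and in both directions, so this is only a sketch of the principle:

```python
def mean(values):
    return sum(values) / len(values)

def avg_adjacent_diff(image, s):
    """Average |difference of mean gray levels| between horizontally
    adjacent windows of width s."""
    width = len(image[0])
    diffs = []
    for row in image:
        for x in range(width - 2 * s + 1):
            left = mean(row[x:x + s])          # window [x, x+s)
            right = mean(row[x + s:x + 2 * s])  # neighbouring window
            diffs.append(abs(left - right))
    return mean(diffs)

def characteristic_scale(image, scales=(1, 2, 4, 8)):
    """Window width at which neighbouring windows differ the most;
    ties go to the smallest scale."""
    width = len(image[0])
    usable = [s for s in scales if 2 * s <= width]
    return max(usable, key=lambda s: avg_adjacent_diff(image, s))

# Stripes of width 1 (fine) versus width 4 (coarse), 16 pixels wide:
fine   = [[(x % 2) * 255 for x in range(16)] for _ in range(2)]
coarse = [[((x // 4) % 2) * 255 for x in range(16)] for _ in range(2)]
```

On the fine stripes the differences peak already at width 1; on the coarse stripes the peak moves to width 4, matching the stripe width.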
So consider an image I'm just sitting here on this table and the most of the part is you see me and you see some wooden texture here right beside my knee. In the image that would be just a very small part. Most of the table structure would be covered by me.
And this is one of the problems that you have to deal with. Well, there are actually some ways to estimate the maximum delta, the maximum difference, from smaller values. So if you can't blow up the pixel windows that you move over the image to a certain size, there are still ways to do that in a probabilistic fashion; see the papers on the web page if you want to look it up. The second part is a little bit easier: the contrast.
So we have focus on the coarseness of the picture, now we want to focus on the contrast of the picture. And the contrast is kind of like the clarity or the sharpness of the colour transitions. Also the shadows that are there. So for example, again looking at Manhattan here, we find that this is very much a grey area, you know,
and you hardly can see the different skyscrapers in some areas of the pictures. You can distinguish some here, yes, but it's getting different because the contrast is very low. Whereas here you can see clear cuts between parts of the image.
Very high contrast image. This is something that you can easily measure, for example from the gray-level histogram distribution. You just build the gray-level histogram: for each pixel, you record the intensity and add to the column of the histogram for that intensity. And the contrast of a picture actually describes that histogram: it's the standard deviation divided by the fourth root of the kurtosis, where the kurtosis is based on the fourth central moment. Everybody knows about statistical moments.
The first statistical moment of some distribution is, ah, you need to polish up your statistics at some point, the expected value. Basically moments are statistical values that you get from distributions to describe the kind of distribution.
So how can I describe such a distribution versus such a distribution? Well, to distinguish between them, I could use just the expected value, okay?
This doesn't tell me the whole story, but this is the first moment. How can I distinguish between such a distribution and such a distribution?
Well, they have the same expected value. So the first moment is exactly the same here.
So we can derive the second moment, which is the variance, clearly distinguish it.
Then versus something like that.
Third moment, the skewness, okay? And so on. So these are different statistical measures taken on the probability distribution. And if you look at those distributions, you could also imagine them as histograms,
as a gray level histogram. And this is what we use here for describing the contrast, okay? We could use the expected value, we could use the variation of the histogram, we could use the skewness of the histogram, but what we use is the kurtosis,
which is basically the fourth central moment divided by the squared variance. So please look it up in your statistics books if you don't remember; it's not too interesting. The kurtosis is not there to describe the mean of the distribution, or the variance of the distribution, or the skewness of the distribution, but rather the peakedness, and with it the modality, of the distribution. So that means what we can distinguish with the fourth statistical moment is such a distribution from such a distribution, okay? Bimodal distribution, unimodal distribution. Kurtosis can distinguish between them.
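The moments and the contrast measure just discussed can be sketched like this; the exact normalization, sigma divided by the fourth root of the kurtosis, follows Tamura's formulation and is an assumption here:

```python
def moments(values):
    """Mean, variance, skewness and kurtosis of a sample of gray values."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    sd = var ** 0.5
    skew = sum((v - mu) ** 3 for v in values) / (n * sd ** 3) if sd else 0.0
    # kurtosis: fourth central moment normalized by the squared variance
    kurt = sum((v - mu) ** 4 for v in values) / (n * var ** 2) if var else 0.0
    return mu, var, skew, kurt

def tamura_contrast(image):
    """Contrast = sigma / kurtosis^(1/4): widely separated gray levels
    score high, a narrow gray-level range scores low."""
    pixels = [v for row in image for v in row]
    _, var, _, kurt = moments(pixels)
    return (var ** 0.5) / (kurt ** 0.25) if kurt else 0.0

high = [[0, 255], [255, 0]]      # far-apart gray levels: high contrast
low  = [[120, 136], [136, 120]]  # close gray levels: low contrast
```

A symmetric two-peak (bimodal) sample also has a lower kurtosis than a single-peaked sample with the same mean, which is exactly the modality distinction made above.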
Okay, this is what we do. Still, directionality, how do we deal with that? Well, it's kind of the predominant direction of elements. And seeing that I walk through the image, I will hit black lines here very quickly,
in that direction not as quickly. In this direction, always very quickly. So this is the way of distinguishing between the images. And what you look at is the gradient of the color change.
If I walk through this image in this direction, which is highly directional, I will find that the gray level values will go like this. Because here is black, I go to light, I go back to black,
I go to light, I go back to black, okay? This is exactly what happens here. Black, go up to light, go back to black and stuff like that. What happens if I go into that direction?
It looks like that. I have some color and it never changes as I go through the image. The gradient here is zero. The gradient here is quite high, okay?
So the gradient is a good measure when walking over the picture for the change of colors. The same goes here. So here in both directions, the color changes very quickly, okay? High gradients in all directions. This is a way to distinguish between the two images.
Yes, exactly. It's a problem, so what do we do?
Perfect! Typical computer science solution. Pragmatic and works. Yes, there are many arbitrary, many directions, but what we will do, we will just stick to eight of them. Like in the good old nautical charts. There's east, west, north and south and the things between and that's it.
And this is what we do. So for directionality, we just determine the strength or the magnitude of the direction of the gradient in each pixel. So we can, for example, use a Sobel edge detector or whatever you know. So we fix pixels, then walk in the different directions, look at the gradient.
Size of the gradient determines whether there is a big change or there is not a big change. Okay? Same here. Good? What do we do? We build histograms with the directions, number of pixels that have a big gradient in that direction.
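A minimal sketch of such a direction histogram, using simple finite differences as a stand-in for the Sobel operator; the threshold and bin count are assumptions:

```python
import numpy as np

def directionality_histogram(gray, bins=8, threshold=0.1):
    # Per-pixel gradient (finite differences instead of a full Sobel).
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    # Directions modulo 180 degrees: a gradient and its opposite
    # describe the same edge orientation.
    angle = np.arctan2(gy, gx) % np.pi
    strong = magnitude > threshold
    hist, _ = np.histogram(angle[strong], bins=bins, range=(0, np.pi))
    return hist

# Vertical stripes: all strong gradients share one direction,
# so the histogram has a single dominant column.
stripes = np.tile([0.0, 0.0, 1.0, 1.0], (8, 2))
hist = directionality_histogram(stripes)
```

For the striped test image, every strong-gradient pixel falls into the same angle bin, which is exactly the peak the lecture describes.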
Okay? So, create the histograms for each angle, the number of pixels with a gradient above a certain threshold. And if there's a dominant direction in the image, there will be a peak in the histogram
because many pixels will have a high gradient there. Okay? Easy. What could happen now is that I'm not interested in the direction, so, you know, like if something is shaded or if something is shaded,
is this the same texture or not? Depends is a very good answer.
Yes, exactly. So it is kind of the same texture. But one could argue that in certain semantic occurrences,
if you have a horizon, for example, it might not be the same texture. So what you can decide is you could say, well, I want my measure for directionality to be invariant with respect to the rotation. I want to see, well, basically it's the way the photo has been taken of this wooden structure,
whether I photograph it this way or whether I photographed it this way. It's still the same texture and it's rather random what kind of photo I got. It's unfair to kind of like punish the photographer or the pattern for the photographer.
Then I should vote for something that is in rotation invariant. But if it really has a sense that something is horizontally striped or vertically striped, then I should probably not. So this is a design decision that we can make.
Good thing, our directionality measure, our histograms, can be made both ways. If I note the different directions, 8 or 16 or however you want to, and leave it like that, it's not rotationally independent.
If you have a strong directionality in north-south direction, this is a different texture from if you have a strong directionality in east-west direction. On the other hand, I can also say, well, do not look at the different columns.
Just count how many columns are there with different directionalities. Is there a single predominant directionality? Are there two? If I do that, it becomes rotation independent, because I don't look at the exact histogram or the exact places of histogram columns,
but just look at the structure of the histogram as such. Okay? Good! Tamura then went on to show that these first three measures,
the coarseness, the contrast, and the directionality, are not correlated, so they are independent with respect to each other. So the distance measure between two images with respect to the texture
could just be the coarseness value plus the contrast value plus the directionality value, Euclidean distance, just simple thing, divided by scaling factors, which is basically the standard deviation to kind of normalize between the three features.
That's it. Our first texture measure! So now we can take images apart, we can measure three aspects of the image and compare images with respect to each other in a digital computer system.
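A sketch of this normalized Euclidean distance over the three Tamura features; the per-feature scaling by standard deviations over the collection follows the description above, the names are illustrative:

```python
import numpy as np

def tamura_distance(f1, f2, sigma):
    # f1, f2: (coarseness, contrast, directionality) of two images.
    # sigma: standard deviation of each feature over the collection,
    # used to normalize the three scales against each other.
    f1, f2, sigma = (np.asarray(v, dtype=float) for v in (f1, f2, sigma))
    return float(np.sqrt(np.sum(((f1 - f2) / sigma) ** 2)))
```

With unit scaling factors this is plain Euclidean distance; the sigmas only keep one feature from dominating because it happens to live on a larger numeric range.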
Ain't that wonderful? You don't seem so happy. It's great! Anyway, and it actually works. So if you do it, this is the IBM cubic system, and I took here a couple of coats of arms, so this is an ermine pattern,
and tried to find it in different pictures, and you see here that with rising distances, the things get more and more difficult. So of course, the first one is an exact match, but also these inverse colored, slightly different patterned images are immediately recognized,
and the more you go towards something that is rather striped here, you will have a higher difference measure. OK? Good!
Second possibility of computing the similarity is so-called random field models. So you could also say that basically your image is a random variable,
the random value being, for each pixel, the intensity. And if you set the intensity of some pixel, then, through the pattern, it will influence the intensity of other pixels.
So for example, if I do have one of my nicely striped patterns, then knowing that this pixel here is red influences the probability of the color of this pixel.
OK? Because if it would be a pattern, this pixel should be white. OK? Same goes for this pixel. It would not be regular if this pixel were not red too. So knowing kind of this pixel gives me at least an impression,
if it is a pattern, given that it is a regular pattern, of the surrounding area of this pixel. And this is basically the idea. So the basic idea is textures repeat periodically.
If something is not regular, it's not a sensible pattern. So what you have to do when you do the synthesis of patterns is basically you have a small sample of the pattern and then you just repeat it.
Put it together. And what you do to make a realistic pattern is, what do the game industry people do to make it look realistic? They introduce some errors.
Because if it would be a perfect pattern, it would look artificial. In real world textures, there are no perfect patterns.
There's always a little noise, and this is, you know, like you have the tree with a lot of leaves, but they're not side by side. One may be slightly behind the other, you know. Some may be missing for some reason or the other. And the same with a brick wall. A brick wall is very regular, you know.
But there's a chip taken off some of the bricks. There's some smear of the mortar somewhere that makes it look realistic. So introducing a certain irregularity is basically a good way of making a texture look realistic.
Why don't we do just the opposite thing? So from looking at a somehow flawed pattern, we could find out what was the model that generated this pattern.
The statistical model with which this pattern was synthesized. And if we decide on a certain class of models, then we can find the parameters of this model that with high probability were used to generate the texture.
These parameters are a perfect description of our pattern. Okay? Everybody understood the idea? So in texture synthesis, you use statistical models to change a texture.
A little bit, just slightly, you know. To introduce some noise, introduce some, or get rid of, take the edge off some of the artificiality if you want it like that.
Knowing this model will result in similar textures. Now, also the opposite is true. Similar textures will result in the same model. Knowing what model very probably or with the highest likelihood
is behind the synthesis of some texture helps us to describe the texture. Good? Actually very simple idea. So if you have a good model, you create different but very similar textures.
And we do it the other way around. Which model, which parameters for a certain model class generates the textures occurring in an image in the best way?
Well, how to model generated texture? Let's just assume we have some model, call it X because I don't know its name. And using this model, what are the expected intensity values of the pixels surroundings?
Fix a pixel at some point and look at the surrounding. Based on the intensity value of the pixel and based on your statistical models, I can predict with a certain probability the intensity of all the other places.
Okay? For example, if my model created this texture,
it is obviously a model that creates stripes. But then, knowing that this pixel is black should increase the probability, using this model, that also this pixel is black and this pixel is black.
And at the same time, it should increase the probability that this pixel is white and this pixel is white. Okay? And of course, in the created texture, this could also be black. Sure, yes, because this would be an error.
But in the long run, with the highest probability, it's white. Now we do it the other way around. We take the pixel, look at its surrounding and see this surrounding as an observation
of a statistical model in action. Try to figure out what are the parameters for this model. This is our feature vector for the image, for the texture.
Good. So, if we do the same here, it's a more complex pattern. So also the model has to have different parameters. If it would be the same, the upper and the lower should be white if a pixel is black and the right and the left should be black if a pixel is black,
it would also be some kind of stripey pattern. But it has to be different here. For irregular patterns, it's a very complex model and the parameters definitely have to look different. So let's go. We describe the image by some matrix.
Basically, this is the image with the different pixels, look inside. We take the intensity value of each pixel and put it into a matrix.
Same entries as pixels in the picture. This is our matrix F. Nothing happened. Now we assume a model where this matrix is a random variable.
Of course a matrix is two-dimensional, this is why they call it a random field. It's a two-dimensional random variable, nothing more. If I want to know the distribution of this F by just assuming some kind of model,
I still have to look for the parameters that resulted in this observation. So we have an image, we assume that it is an observation of this model creating those matrices.
What are the parameters for the corresponding distribution? And this leads us basically to a maximum likelihood estimation. So what are the most likely values creating this specific matrix and therefore the specific picture?
So a picture is seen as a matrix with intensity values as the entries. We assume there is a common model always producing the matrices.
We take all the textures that we have as input values, as observations of our model, and then do a maximum likelihood estimation.
What are the parameters of the model that created these observations with the highest probabilities? And this is what it's called. So the problem is the dependency. So if I look at a certain pixel, will it kind of influence the color of its neighbors?
Well, if it's a pattern, yes.
Otherwise, it will not result in a pattern if I use the probability distribution. How about the neighbors that are a little bit more far off? Surely also yes. How about things over here?
Well, transitively yes. But on the other hand, if I look at some pattern, locally I can see very strong connection.
On long distances, things might have changed. It might not be as regular. Look at this wooden structure here. Yes, it is striped, but I can see a very good and a very regular striping in small areas of the wooden part.
Just saying this goes on until the edge of the table over here is not true. Because there are some irregularities in between that would change it. So what we can do is we can have some idea of locality and just say,
well, basically if the neighbors to the left and right are white and the up and down neighbors are black, and assuming we have some striped thing, the pixel considered here has a very high probability of being white. And this is not influenced by some pixel over here.
Just look at the immediate surrounding of any pixel to determine its value or its most probable value. Anybody sees, well, how many of you have actually heard statistics?
So many. Interesting. Then nobody obviously sees what this characteristic is. This is the Markov property. So probably many of you will have heard the terms Markov chains or hidden Markov models.
Yes, so just as a name. This is actually one of the properties of Markov chains or Markov stochastic processes. That you are always restricted to the locality. So the probability of some event occurring only depends on the probability of events occurring in the neighborhood.
So immediately before or two time steps or whatever before but not 50 years before. Makes it easier to calculate.
So we just assume that the value of some pixel does not depend on the values of all the other pixels in the image but just on the values of the pixels in the neighborhood of this pixel. And this is called the Markov property.
Again, like in the directionality idea, we say neighborhood is basically shifts. 5 pixels to the left, 5 pixels to the right, 5 pixels up, 5 pixels down and the diagonals. So this is it. The neighborhood of some pixel S is start from the pixel S and go T in some direction.
And what we basically do is we will just go into every direction possible 1 pixel.
Enough for us at the moment. So (0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (1, -1), (-1, 1), (-1, -1), okay.
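These eight shift vectors can be written down directly; a small sketch with illustrative names:

```python
# The eight shifts t that define the neighborhood of a pixel s.
NEIGHBORHOOD = [(dr, dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)]

def neighbors(s, shape):
    # All positions s + t that still lie inside an image of the
    # given (height, width).
    r, c = s
    h, w = shape
    return [(r + dr, c + dc)
            for dr, dc in NEIGHBORHOOD
            if 0 <= r + dr < h and 0 <= c + dc < w]
```

An interior pixel has all eight neighbors; a corner pixel only three, which is why the estimation later typically runs over interior pixels only.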
See, these are the neighborhood pixels here for pixel S. Now we have to define a model that reproduces the observed distribution, our images in the collections, our textures with the best value, with the best parameters.
And actually for everybody who has heard lectures or worked in computer gaming, there are a lot of different texture models. And they all have their drawbacks and their advantages but basically it's a lot of them.
And of course if we want to compare images from different collections, we cannot say well let's just assume different models and different parameters and different everything but we have to restrict ourselves to one class of models and then just look at the parameters
and the difference in the parameters will be kind of like a measurement for the difference in the texture quality of the image. So a popular class of models for texture descriptions are so-called simultaneous autoregressive models.
Basically what they do is they take the different intensity values from the pixel of the neighborhood with a certain parameter that changes it, okay.
This kind of parameter is what we later need for our feature vector because this is characteristic for each different texture. The neighborhood is the same for every texture, just moving,
but the way the neighborhood is influenced by the color of some pixel, that is different. In a striped thing, the neighborhood in these directions is influenced to be white, the neighborhood in this direction is influenced to be black.
If you have some, I don't know, like pebble type pattern, then the neighborhood is influenced having kind of the same characteristics or the same intensity as our pixel, okay. And this is basically encoded in this vector.
Same goes for this vector here. This is noise. You just have a random variable with mean zero and variance one. So it's spread, it's a distribution spread over the whole probability space.
And every value of it has kind of the same characteristics. So basically this is just adding white noise totally randomly and it's just to account for small errors in the textures
that are given by the observations, okay. Also this parameter, how much noise is in the image, how much noise belongs to a certain texture is characteristic.
And these are therefore the two parameters that we want to estimate with the maximum likelihood. This is what we want to have, okay. This is what describes our texture. The other parts here are the same for every image in the collection.
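A least-squares sketch of estimating these two parameters; this is a simplified stand-in for the maximum likelihood estimation, with an assumed four-pixel neighborhood and illustrative names:

```python
import numpy as np

OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def estimate_sar(gray):
    # Fit f(s) ~ sum_t theta(t) * f(s + t) + noise over all interior
    # pixels; theta is the texture-specific weight vector, beta the
    # residual noise level.
    g = gray.astype(float)
    h, w = g.shape
    rows, targets = [], []
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            rows.append([g[r + dr, c + dc] for dr, dc in OFFSETS])
            targets.append(g[r, c])
    A, y = np.array(rows), np.array(targets)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    beta = float(np.std(y - A @ theta))
    return theta, beta
```

For a perfectly regular vertical-stripe image, the vertical neighbors predict each pixel exactly, so the weight on the up/down offsets sums to one and beta comes out near zero; any irregularity in a real texture shows up as a larger beta.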
We look at the same size of the neighborhood. We look at the same Gaussian noise that is distributed somehow. Just the intensity of the noise is different between different textures and just the way the color is changed in the different directions
is different between different textures, okay. That's it. Yeah, the problem is really that restricting ourselves to some neighborhood of the picture also means that different periodicities of textures
could be detected or not. So for example, I have a texture like that and I have a texture like that. Those are striped textures. If I look at the same size of environment around certain pixels,
I will find that in one case I clearly detect some pattern, in the other case I detect nothing at all because it seems the same to me.
This is kind of difficult and unfortunately this is a non-trivial problem. What is often done to solve it is so-called multi-resolution simultaneous autoregressive models. So I don't use a single model but I use different models of the same type
or autoregressive similar models with different sizes of neighborhoods. Then I say, well the feature vector is not only these parameters, theta and beta for a certain neighborhood size
but the feature vector is theta and beta for neighborhood size one pixel, theta and beta for neighborhood size two pixels, theta and beta for neighborhood size four pixels and so on. And this gives me different resolutions of the images basically that are different sizes of the neighborhood. So this is what I can do.
So what I end up with is really a feature vector consisting just of these parameters. That is high compression for the image's texture, isn't it?
It's wonderful. We had a complete image before, now we have two values or if we use different sizes of neighborhoods, maybe eight values or sixteen values, something like that. We compressed the texture information of an image into sixteen values.
That's great. The only assumptions that we did is Markov condition is valid. Colors or textures are of a local nature
and they are repetitive. And we chose the size of the neighborhood well for the pictures in the collection, either manually or by using several of them and looking which works best. Okay? This is what you have to do. Good?
Shall we go for a short break? Break time! What's the time? Half past eleven? Ten minutes. Very well.
So what I showed you right now was two basic ways of describing textures with very little effort. One was basically coarseness, directionality, contrast.
Measure it, determine the values for each picture. That's your texture measure. And you can compare it. Second one was random field models. Well, do a maximum likelihood estimation of the model parameters.
That's your feature vector for the picture. But there are also other ways of describing textures and there are ways that are derived from signal processing actually. Because, aha, interesting, isn't it? Because what is a texture? Well, a texture is a periodical pattern
in the continuous intensity information of an image. So if I see the line of each image
as a sequence of intensity information to detect a periodical pattern in that, I can use the tools of signal processing.
Right, that was what I was looking for. And this is what is often called transform domain features. So I go from the signal, as such,
the intensity information, as such, into some other domain, talk about frequencies, how often does something occur, talk about the amplitudes, like how high is the change or what is the rate of change. And I can talk about these things rather in a different domain,
not in the image domain anymore. And this is something totally different than the low-level features that we had before, which kind of abstract from the feature, from the actual picture. So if I tell you there's a certain granularity
or a certain shading in the picture, you can't really reconstruct the picture from that. The granularity, it will give you an idea how it looks like, but you can't paint it. On the other hand, if I tell you, well, there's a signal of intensities
and going along the system and measuring the amplitudes and measuring the periods and the frequencies and whatnot, you know, change rate, I can reconstruct the signal.
Okay? So I can reconstruct the actual image. This is the idea here. We don't focus on certain aspects, we focus on the whole image and we can reconstruct the complete picture from that if it's lossless. And this is what is called a high-level feature.
So it takes the whole picture into account and the whole picture can be reconstructed. So transformation or transform is basically the conversion of something of a signal in a different representation.
I can transform it into the domain, I can transform it back from the domain and it preserves all the information. So, for example, if I have the picture of a straight line, I can describe it by two points, A, B.
Just storing these two points will always allow me to reconstruct the line, this line, not some other line, exactly this line and therefore also the picture of this line. I could also say, well, actually,
I don't know point B, I just need point A and a gradient here. Also these two pieces of information will allow me to draw exactly that line and therefore reconstruct the picture of it. It's totally different, the gradient and point information
is totally different to the two-point information but they encode the same thing uniquely. And this is the same for all the transformations. For example, for images we often use Fourier transformation, so this is an image
and this is a representation of this Fourier transformation. We don't see anything anymore. It's a different way of looking at the image and it's reversible. We can get it back from Fourier space. We will see in a moment how that actually works
and how that helps us. So the idea is we gain information by transforming to some other representations to see other things, to see frequencies, to see periodicities that we know about but that we can't really put our fingers on.
Now we can quantify it. We can say how much of each frequency is in the picture, for example, yes? And of course this is also a good measure. Good! Let's start with some algebra. Before that we had statistics, now we go for algebra.
We take points in space, so here, oh, I should take red, here and here and here and so on, okay? And if I have n points in space,
there's a theorem in linear algebra actually that I can construct a polynomial of degree n-1 touching all these points. So it doesn't really matter where they lie. I will always find a polynomial going through them. So for example, if I have just a single point,
I can always construct this point. If I have two points, I can always construct a line going through the two points.
It does not depend where they are. The equation for the line will look different, yes. But it's always a line. If I have three points,
I can always have a parabola going through these three points. It does not matter where they lie. And as the line is a polynomial of degree 1 and the parabola is a polynomial of degree 2
with more points, the polynomials get more interesting to say the least. But the degree of the polynomial is always n-1. Good. So then to describe the points, I could just say, well, given the points,
I look at the coordinates of the points, n points, or I will give you the polynomial, the formula for the polynomial, and I will give you the steps where you have to look at the polynomial to get the points.
So, for example, if I have this polynomial and I say, okay, this is 0, 1, 2, 3, 4, 5, then just knowing the blue line, I can detect the points. And the representation is totally different.
From the representation of the actual points. But it is the same information. Okay? Good.
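A quick NumPy sketch of this: fit the unique degree n-1 polynomial through n points and recover the points from it (the sample points are made up):

```python
import numpy as np

# Three points determine a unique parabola (degree 2 polynomial).
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([1.0, 3.0, 7.0])

# Fit the interpolating polynomial: with deg = n - 1 this is an
# exact fit, not an approximation.
coeffs = np.polyfit(xs, ys, deg=len(xs) - 1)

# Evaluating the polynomial at the sample positions gives the
# original points back: same information, different representation.
recovered = np.polyval(coeffs, xs)
```

Here the three points happen to lie on x² + x + 1, so the fitted coefficients come out as (1, 1, 1), and evaluating the polynomial reproduces the points exactly, which is the lossless-transformation idea in miniature.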
Let's now say that an image is just a discrete function that assigns each pixel on a two-dimensional plane an intensity value. So if I look at an image,
I will just take the location of each pixel and say for every pixel in a certain intensity dimension what the degree of intensity is. So, for example, this here is light and the pixel next to it may be darker or even lighter.
So I get a kind of flying carpet thing. Each image is a two-dimensional function that is somehow contorted in space. And you can also use color images and not talk about intensities
but about the colors, of course, but that makes things more complicated. Anyway, each row of an image can be interpreted as a sequence of real numbers. Okay? And each row, since it's just a sequence of real numbers, can then be described by some polynomial function.
Okay? Easy to do. But for textures, this is kind of strange, isn't it? Because we know something about textures and that is exactly they are regular
and they are repeating forever and ever and ever. So it's not just some polynomial that kind of like does something
and then moves off into any direction, but it's actually a certain type of polynomial that does the same over and over again forever. Anybody knows the kinds of functions that do that? The sine and the cosine.
Exactly. And this information is kind of like not very new but goes back to the early 1800s, to Joseph Fourier, who said that basically any periodic signal
can be covered by sine and cosine functions. Okay, so Joseph Fourier was a French mathematician and this was his great idea. He said that any periodic signal can be decomposed into a sum of a series
of such oscillating functions, the sines and the cosines. And why are we mentioning Joseph Fourier here in this context? We've just said that images can be represented as signals based on the gray value. It goes from dark, from black, to white.
This is our signal right now, real numbers. Okay, this is two dimensional, but it's our signal. The idea is since polynomial representation don't repeat themselves, but sines and cosines do. And Fourier says that any periodic signal
can be decomposed in the sum of such series. Why not represent patterns, so images containing patterns, in such a Fourier series, such a sum of oscillating functions. This is what Fourier said. He said that any such function can be decomposed in a possibly
infinite sum of such functions. And let's start with the basics. Let's start with a one dimensional signal. You can imagine, for example, radio waves or voice, for example. So sound has the amplitude and the frequency in time.
So if you have, well, this is not quite a good example for a sound wave. It's something like a step function, but anyway, it's perfect for, exactly, for intensity information.
So the value would be then our intensity, and then we have the time axis. And discretizing this time axis, you can obtain some real numbers at different moments in time.
Like, for example, this one, this one, this one, this one, this one, and so on. Well, I'll leave one out. This one I didn't represent. Anyway, and then our signal can be described through this series of real numbers.
And according to Fourier, the signal, the series of real numbers can then be decomposed into a series of sine or cosine functions. So then, let's start with a simple sine function
and overlap it over the signal with the whole time spectrum, considering the whole time spectrum. So not a locality, like, for example, the first seconds or so, but the whole time spectrum. And this looks something like this with a sine with the lowest frequency.
And then we go further. We increase the frequency and add additional sine or cosine functions to better approximate our original signal. Well, the idea here is actually that, and so on, I can't draw on this thing.
A lower frequency and a higher frequency and a different amplitude
give me another sine representation, which added. So then I add this representation here and this one and get such a curve like this one. So I'm just basically adding sines together with some frequency and some amplitude
to get a better approximation of my signal. And I do this further with higher frequencies. You see, for example, here, the third addition, and then the fourth, and so on.
And at some moment in time, the quality would get higher and higher with higher frequencies. And what's important to notice here is that if I'm happy with the quality of my estimate of the original signal,
then I could just stop and cut and say, okay, I'm going up to this frequency and I'm already happy with the representation. Otherwise, I can go up to infinity and get the perfect match. So this would be the idea. I can then decompose the original signal in a sum of oscillations,
as you've seen here, going up to infinity by increasing the frequency and depending on some amplitude. Okay, so then we have, as I've said, some signal here
and what Fourier says is that I have some amplitude, a coefficient, some frequency, and the sine or cosine, plus another signal with some amplitude and the frequency, and so on.
And then I represent the signal as this sum here. What this helps me for is to translate my original signal from this time domain, the intensity over time,
into the frequency domain, where I say, okay, let me represent the frequencies I've just built. This frequency here, this frequency here, this frequency here. I get them all in my frequency domain, as for example,
in a histogram with the amplitudes described by these coefficients here. So I'm building this histogram. This is the representation in the frequency domain of the signal
in the time domain. And as I've said, the further I go with higher frequencies, the better quality I get. If you want some intuitive comparison, think of sound. The higher the frequency,
the less information it contains. So if you cut somewhere at 22,000 hertz, you won't lose too much because the human ear doesn't hear such high frequencies and they also don't contain that much information. Okay, more formally, what Mr. Fourier says is that
every sequence of real numbers, in our case the intensities, can be transformed into a sequence of coefficients. The coefficients we've just seen for the cosine and sine functions.
And then, of course, we have to vary the frequency of the sine and cosine functions. So then we need practically in order to transform this sequence of real numbers only to establish these coefficients here.
And for calculating these coefficients, what we basically need to do is to project the signal, the original signal we have, onto each cosine or sine wave we compose our signal with.
So in order to project this, for example, for the coefficient of a certain cosine function, we just multiply the intensity the signal has at each point with the corresponding cosine value and sum over all points.
And we do the same for the sine coefficient. This is how we calculate the coefficients. And then this is how we then represent the signal. This was great for the one dimension. Of course, images are two dimensional.
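Before moving on to two dimensions, a minimal sketch of the one-dimensional case, using NumPy's real FFT as a stand-in for the hand-written sine and cosine projections; the step signal is illustrative:

```python
import numpy as np

def lowpass_approximation(signal, keep):
    # Project the signal onto sines and cosines (the FFT performs
    # exactly these projections), keep only the `keep` lowest
    # frequencies, and transform back.
    spectrum = np.fft.rfft(signal)
    spectrum[keep:] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

# A step-like intensity signal, as in the lecture's example.
step = np.array([0.0] * 8 + [1.0] * 8)
```

Keeping more frequencies can only lower the approximation error, and keeping all of them reconstructs the signal exactly, which is the "stop when you're happy, or go to infinity for a perfect match" idea above.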
So then things become a bit more complicated. There is, of course, a two-dimensional generalization of the Fourier transformation. The idea here is, for example, that we describe the intensities in an
image as sums of sines and cosines, now also taking directionality into account. For example, these pixels here all have the same intensity: the intensity varies
only along this one direction, say from white to black.
And if you were to compare something like this with a pattern in an image, you should imagine something like a black background with white stripes running in this direction.
OK, so the formula becomes a bit more complicated, because I need to take the
two dimensions into consideration. The two dimensions are the coordinates of the pixels, based on the width and height of the image. And then I have, again, the same cosine and sine oscillations describing my function: each pixel intensity is built up, again,
from the amplitudes of these oscillations, and can be represented via these a and b coefficients, which are now matrices. Yes?
You mean the amplitude, not the frequency, because the frequency ranges over the whole spectrum. Well, for the amplitude, you perform
projections of the signal onto the sine basis, so to say, and this is how you calculate the amplitude for that basis function. If that basis function is not suited to represent the signal, then its coefficient will be 0.
So it is out of the picture, so to say. For the step signal, yes: you run over the whole range of frequencies, and each of them is independent.
It's like interpolation points, through which there is a unique line:
I can't say, well, this is more of this frequency and less of that, or more of the first one and less of the second one. And the way you build them is not by optimizing them piece by piece and laying them next to each other. Rather, it is: take all you can get from
the first frequency, then small error corrections with the second, then still smaller error corrections with the third frequency, and so on. In any case, you need to do that.
OK, so then again, the amplitudes can be calculated based on the same projection principle. Only this time, I have a two-dimensional function, and I
multiply it with the corresponding cosine and sine functions. OK, now the purpose, the goal of this Fourier transformation was to compare different images. So I want actually to be able to compare an image with
another one in the frequency domain. One possibility would be to compare these A and B coefficient matrices for the sine and cosine of one image with the A and B matrices for the sine and cosine of the second image. Well, of course, this is not the best solution. One reason is that the Fourier
coefficients are actually complex: they have a real component and an imaginary component. And the second is that we would rather have a representation that shows us the data from a different perspective.
And this is the Fourier spectrum representation, which shows the frequency spectrum. The idea here is that we show the frequencies as points
with different light intensities, where we have, for example, the fundamental frequency, and then some lower and higher harmonics of this
frequency, showing us the directionality together with how much information this frequency actually contains. Let me show you a better example that I hope I brought. So, the properties of these images here are that
they are centered on the fundamental frequency, the lowest frequency of my sines and cosines; this is this one here. And they are symmetric about the origin: what happens above also happens below, and likewise left and
right. Then I have the harmonics, as in an audio signal: the upper frequencies, which are multiples of the fundamental frequency. Like, for example, what you see here; they become weaker and weaker.
I get the strength of a frequency from the light intensity in this representation. So the brighter a point is, the more strongly the image contains the pattern with that
frequency. So for example, for this image here, I have again this signal here. And I should imagine that the amplitude of, so it lies
such that information that I get from this pattern here is in this main frequency. So then I have such a sine, for example, so that you can
get an image in the time domain, which repeats itself over this directionality, directionality which is given also in the Fourier domain, with the strength of how much from this pattern is in my image here.
Of course, I can have many more patterns. And these patterns, if I would imagine of having, for example, also these stripes here, could be present something like this, yeah?
Of course, this representation gives me not only the direction of the pattern, but also the size of its period: how big is the distance between two stripes, for example, in a certain pattern?
Well, if we start from the image domain and we imagine something like this, a pattern like this, then we can already see in the frequency domain that there is something happening on the vertical.
So I'm varying practically from bright to black to bright to black and so on, yeah? That the frequency, actually, the period is not that big. The intensity is quite high.
And if you compare it with the Fourier representation of the next image, you can see that the directionality of the pattern is again the same. So I have the same vertical variation from bright to black.
But the frequency is higher. Again, on both of the images, you can see that nothing or almost nothing happens on the horizontal. And then again, if you rotate the image we've
previously seen, then you also can have some diagonal representation. You can already see this also in the Fourier transformation. So clearly recognizable directionality.
You can see the amplitude, the difference in amplitude. And this is how you can differentiate between this pattern here, this pattern here, and this pattern here, just by looking at their Fourier representations.
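This directional behaviour can be checked numerically. Below is a pure-Python sketch of my own (not lecture code): the naive, nested-loop 2D DFT magnitude, applied to two synthetic stripe patterns with the same direction but different frequencies.

```python
import math

# Naive 2D DFT magnitude: O(n^4), fine for a tiny 8x8 test image.
def dft2_magnitude(img):
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            re = im = 0.0
            for y in range(h):
                for x in range(w):
                    ang = -2 * math.pi * (u * y / h + v * x / w)
                    re += img[y][x] * math.cos(ang)
                    im += img[y][x] * math.sin(ang)
            out[u][v] = math.hypot(re, im)
    return out

def stripes(freq, n=8):
    # Vertical stripes: intensity varies only in the x direction.
    return [[0.5 + 0.5 * math.cos(2 * math.pi * freq * x / n)
             for x in range(n)] for _ in range(n)]

wide = dft2_magnitude(stripes(1))    # wide stripes, low frequency
narrow = dft2_magnitude(stripes(3))  # narrow stripes, same direction
```

Both spectra concentrate all their energy in the row of zero vertical frequency (the stripes vary only horizontally), but the peak sits at column 1 for the wide stripes and column 3 for the narrow ones, which is exactly the difference visible in the spectrum images.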
OK, now I've said that in order to get this, we must first calculate the amplitude coefficients of the sine and cosine waves.
In order to compute this, I've shown you a formula which has quite bad complexity: if you were to program it, you would have two nested for loops, which means quadratic complexity.
This is not that great if you have a big database of images and you want to extract the Fourier domain for each image and then compare them. Imagine you have hundreds of thousands or millions of images: you have to extract these coefficients for each image,
and then extract the coefficients at query time for the query image, and then compare them. For this reason, there is a more efficient implementation, the Cooley-Tukey algorithm, which
implements the discrete Fourier transformation as the so-called fast Fourier transformation. The idea here is to reduce the problem: it works according to the divide-and-conquer paradigm, repeatedly dividing the domain and reducing the whole problem to smaller subproblems.
Through this reduction, the Cooley-Tukey algorithm achieves a complexity of n log n. Of course, for our exercises, I won't require you to
implement this in Java or something like that, because it's rather laborious; it's not that trivial to implement. And this is the great advantage of MATLAB: it has libraries which implement the Fourier transformation efficiently.
You can just call the two-dimensional Fourier transformation directly. So what you need for images is a single function call. What you need to do beforehand is get the gray levels of the image: you have an image with colors, and you get the intensities.
So you transform the image to gray levels. And then, just by calling fft2, the two-dimensional FFT, you get the Fourier representation of that image. It's that simple. I've brought an example here.
You will probably need this for the next homework. What you need to remember is: image to gray levels, gray levels to Fourier transformation. And of course, in order to be able to see and compare something, you need to center the Fourier
coefficients; as I've said, this image should be centered on the fundamental frequency. That would be everything about the Fourier transformation. So, is everything clear about the Fourier transformation?
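The same workflow can be sketched in Python with NumPy, where np.fft.fft2 and np.fft.fftshift play the roles of MATLAB's fft2 and fftshift; the stripe image below is a made-up example, not lecture data.

```python
import numpy as np

def fourier_representation(gray):
    coeffs = np.fft.fft2(gray)          # complex Fourier coefficients
    centered = np.fft.fftshift(coeffs)  # centre on the fundamental frequency
    return np.log1p(np.abs(centered))   # log magnitude for display

# Gray-level "image": horizontal variation only, period 16 pixels.
x = np.arange(64)
gray = np.tile(128 + 64 * np.cos(2 * np.pi * x / 16), (64, 1))
spectrum = fourier_representation(gray)
```

After fftshift, the fundamental frequency sits in the middle of the 64x64 spectrum, with two symmetric bright points to its left and right for this stripe pattern.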
It's basically like we did with the polynomial: given some points in the intensity domain, we can calculate a unique representation in terms of sine and cosine curves.
And taking the coefficients of each frequency as a feature vector, we can distinguish between different patterns. This is not only true for the Fourier transformation; there are many transformations of this kind.
So there is, for example, the discrete cosine transformation, which restricts itself to cosine waves only. And there is another one that I want to go into very briefly, the so-called wavelet transformation.
And the basic idea is the same for all of them, but they produce slightly different domains, and different domains are sensible for different purposes. So the frequency domain might be very interesting for things in signal processing, whereas wavelets can be very
interesting in slightly different application areas, like image processing. And this is where they are actually used. The idea behind them is essentially the same. In the cosine transformation, I do just the same as in the Fourier
transformation, restrict myself to cosine functions. And one application area where this is very practical is the encoding of JPEG images, for example. So in JPEG, you also go through the pictures for compression, and you want to compress how many pixels
following each other have the same color. Okay? This is the basic compression. Of course, this gives you a function again, and you can compress this function using the cosine transformation. This is what you basically do.
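As a small illustration (my own sketch, not lecture code), here is the one-dimensional DCT-II, the cosine-only transform mentioned above; JPEG applies a two-dimensional version of it to 8x8 pixel blocks.

```python
import math

# DCT-II: project the signal onto cosine waves only (no sines are
# needed, thanks to the implicit even symmetry of the transform).
def dct(signal):
    n = len(signal)
    return [sum(signal[x] * math.cos(math.pi * (x + 0.5) * u / n)
                for x in range(n))
            for u in range(n)]

# A constant block of pixels compresses to a single DC coefficient.
flat = dct([1.0] * 8)
```

For the constant block, only flat[0] is non-zero; this energy compaction is what makes the cosine transformation attractive for compression.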
Basically, since both are based on sine and cosine waves, the power spectrum, that is, these coefficient matrices, is what you compare. The images that Silvio showed you are just meant to show how different patterns act on the spectrum,
on the frequency spectrum. So you can actually visualize it. But for the comparison, you will take the matrices of the coefficients. Usually, you restrict yourself to only the first few
coefficients, not the tail of the distribution, because that is good enough. With the wavelet transformation, you approximate the intensity function again, but with a different class
of basis functions. You don't use sine and cosine functions anymore; you use functions that only exist in very localized spots. The idea is that you have functions,
which can have different shapes, and we will see a couple of them in a minute, that exist on some interval and are zero on the rest of it. They can be used as a new basis system for building
up your intensity function. Because for every point in the intensity function, you have to say where it is in the sequence, and you have to say what its value is.
So you take as a basis all the functions of some type that exist in that interval, and put them together uniquely to get to this value. This is the basic idea of the wavelet transformation. And the idea of how to reach the value is exactly
the same as in the Fourier part. You start with the biggest wavelet that exists, you take as much as you can off that, and then you add on smaller kinds of wavelets that exist in this interval to shape the curve.
The functions that we may use vary considerably. They are somewhat like polynomials, and there are also things with spikes, for example, so not really polynomials.
But they are local, locally integrable, and the integral over each function is zero. So they don't have any net mass.
This peak here cancels out the area down here: exactly the same mass. And this is true for all the different wavelet functions. And we will only consider the simplest kind
of wavelet function, which might look like what? Well, that's basically a step function: this area here cancels this area here. Just a step function, and this is actually called
the Haar wavelet. And as soon as we have any such wavelet, whatever it may be, just call it psi, we can generate a basis by shifting this wavelet around and by scaling the wavelet in size.
The functions exist only locally. For the rest, they are zero. So shifting that around will get us the mass of the distribution where we need it
to get the desired intensity value. But then, each wavelet has a certain shape, and our intensity curve also has a certain shape, which is usually not the shape of the wavelet. So we have to even out
the discrepancies in shape. What we do is scale the wavelet and add or subtract it where we need it. And this will then give us a representation of the curve. So basically, there is a scaling factor,
with which we compress the wavelet, and there is a shifting factor, with which we just shift it left or right. Okay? And for wavelet bases, you usually use powers of two.
So I have one wavelet that exists on the whole interval. Then I consider two wavelets covering the halves, four wavelets covering the quarters, eight wavelets covering the eighth parts of the whole space. Okay? So I make them smaller by a factor of two each time.
This factor doesn't have to be two, but one usually uses two; this is called critical sampling. So the simplest example is the Haar wavelet: it's one for half of the interval and minus one for the other half.
As you can easily see, this part cancels this part, so the integral of this function is zero. Okay. Using this wavelet, how do we scale it?
Well, easy. We just scale it by half. And we have a wavelet that lives on the half interval. Right?
Again, the areas even out. But of course, this lives only on the half interval because it's half the size. So what do we do? We just take a second instantiation of that,
shift it to the other half. Okay? As the next step, we take the quarter-sized wavelet, which we have to shift one, two, three times.
So we get four of these quarter-sized wavelets. And we want them to capture the finer detail, so we don't take the full amount but rescale them as well.
I will show you in a minute: so far we have an orthogonal basis, but what we need is an orthonormal basis. So we normalize the smaller wavelets by factors, so that we can even out the finer areas. Adding a factor of two to the power of j over two,
we can make the basis orthonormal. So what happens is: we start with the mother wavelet, the biggest wavelet that we can get,
and we get two of the smaller wavelets, whose amplitude is adjusted by this orthonormalization factor. So they are shifted and they are scaled. The next generation of wavelets, four of them,
is shifted four times for the different parts of the interval and again rescaled. Relative to the mother wavelet's amplitude of one, the child wavelets carry the factor square root of two, and the
grandchild wavelets the factor two.
Okay? Good. In addition, we can represent the basis with a scaling function. And for Haar wavelets, the scaling function that we use to build this is just
the characteristic function on the interval from zero to one: this function is one within this interval and zero everywhere else. And if I have a data set of cardinality two to the power of n, there is a theorem stating
that I can represent it on this normalized interval from zero to one by a piecewise constant function, using the specific scaling factors that we need
and the locations of the wavelets in this interval, together with the characteristic function, which says: I am one within this interval, I am zero outside of it.
Okay? This is basically what the scaling function does. The step functions we have for our intensity values are obviously finite, since the image ends at some point, and they have a limited number of points.
Then I can represent them by the scaling function and the Haar wavelets: I take the different Haar wavelets with different scalings and different shifts,
and add the scaling function, for each part of the interval. So if I have a function here, this is the intensity and this is the row of the image:
Pixel one, pixel two, pixel three, pixel four and so on, okay? Then I can use the scaling function saying, okay, I consider all these little intervals individually
and for each value in the interval, I can build a unique representation with respect to my wavelets. Okay? So let me show you an example that we can see it.
I have the step function given by these values. So these are intensity values from one row of my image: the first pixel has intensity one, the second pixel has intensity zero, the third pixel has intensity minus three, or whatever. And I want a resolution of three levels for my basis.
So I take the mother wavelet, the child wavelets and the grandchild wavelets. That's the only thing I want to do. Then I need to calculate the orthonormalization factors, which would be one, the square root of two,
and two, to scale the wavelets. What happens now is that I build myself a characteristic function for the interval, which is just one over the whole interval. Then I build myself the mother wavelet,
which is basically one for half of the interval and minus one for the other half. Then I build myself two baby wavelets. The first one lives in the first half of the interval:
plus one in the first quarter and minus one in the second quarter, scaled by the orthonormalization factor, so its values are the square root of two and minus the square root of two.
Okay? Then I shift this to the other half of the interval. So the first wavelet covers the entire interval,
and the two child wavelets cover the first half and the second half respectively, okay? And each is zero on the other half. So they are just shifted.
Otherwise they are identical. Then we have the grandchildren covering the first quarter, the second quarter, third quarter and the last quarter, okay?
We have one mother wavelet. We have two baby wavelets. We have four grandchild wavelets. Why do we have eight intervals? Well, that is basically the scaling function behind it because we have eight data points.
One, two, three, four, five, six, seven, eight, okay? So these eight data points basically define how many parts we have to separate our interval into, okay?
Good. So if we want to build anything with respect to some basis, then we just make it a linear combination of the basis functions.
Making a linear combination means I just need the factors: one coefficient for the characteristic scaling function, one coefficient for the mother wavelet, two coefficients for the baby wavelets,
four coefficients for the grandchild wavelets, okay? And what I want to represent by this is the intensity function: intensity of pixel one, intensity of pixel two,
intensity of pixel three, and so on. Yes? Clear? This is basically what it does. Okay. So I solve this linear system, which I obviously can do, and I get these coefficients, okay?
And what do they mean? Well, they mean I have to take half of the characteristic function for each point. I have to take minus one half of the mother wavelet at each point. I have to take such and such
for the first baby wavelet at each point. And if we now build the function from the wavelets and the factors derived there,
then we can reconstruct each point of our step function. For example, let's reconstruct the first point. This is the interval from zero to one eighth of the interval length.
We had determined our interval had eight buckets, okay? One, two, three, four, five, six, seven, eight, okay? What is the value here? Okay, let's try it out.
Between zero and one eighth, which wavelets live there? The characteristic function, yes, so we take its one half. The mother wavelet, yes, so we take its minus one half, okay?
The first baby wavelet, yes, and the first grandchild wavelet, yes. So what we do is we just enter the values into the function wherever it exists, okay?
So the first one, the characteristic function, does exist; this gives us its one half. The second one, the mother wavelet, does exist in this area; this gives the minus one half.
The third one, the first baby wavelet, does exist on this interval, so it goes in, okay? The second baby wavelet does not exist on this interval.
We don't take it. The first grandchild wavelet does exist, so we take it, okay? And the remaining ones don't exist on our interval. Good? And if we add this all up, we get one. Looking back at our function,
the first value indeed was one, okay? We can go from the function to the wavelet representation and back to the function again. This is how it works. So actually, it's very easy if you look at it.
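The whole walkthrough can be reproduced in a short script. This is a sketch with hypothetical intensity values, since the transcript does not give the slide's full list of eight numbers.

```python
import math

def haar_basis_8():
    # Characteristic (scaling) function and mother wavelet ...
    basis = [[1.0] * 8, [1, 1, 1, 1, -1, -1, -1, -1]]
    # ... plus 2 child wavelets (factor sqrt(2)) and 4 grandchild
    # wavelets (factor 2), each living on its own part of the interval.
    for level, factor in [(1, math.sqrt(2)), (2, 2.0)]:
        width = 8 // 2 ** level
        for shift in range(2 ** level):
            w = [0.0] * 8
            for i in range(width // 2):
                w[shift * width + i] = factor
                w[shift * width + width // 2 + i] = -factor
            basis.append(w)
    return basis

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

signal = [1.0, 0.0, -3.0, 2.0, 1.0, 1.0, 0.0, 4.0]  # hypothetical pixel row
basis = haar_basis_8()
# The basis is orthonormal under <u, v> = (1/8) * sum(u_i * v_i), so the
# coefficients of the linear combination are plain projections.
coeffs = [dot(w, signal) / 8 for w in basis]
reconstructed = [sum(c * w[i] for c, w in zip(coeffs, basis))
                 for i in range(8)]
```

The reconstruction gives back the original eight values exactly, and the first coefficient is simply the mean intensity of the row.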
And as a summary for today: I showed you some low-level texture features, where we just considered the coarseness or granularity, or statistical models and their parameters. And I showed you some high-level features which embed the image information
into a different domain, be it the frequency domain or a description in a wavelet basis, okay? The low-level features allow us to describe the textures but not to reconstruct the image; the high-level features even allow us to reconstruct the image and do some interesting things
with the feature values, compare them to each other. And if there are no more questions today, are there? Everybody happy? Good. Then I would say happy Easter
and see you again next week, when we will continue with texture analysis, a little bit of multi-resolution analysis, and then start on the interesting part: the shape features.