Scikit-learn to "learn them all"

Formal Metadata

Title
Scikit-learn to "learn them all"
Alternative Title
Why SCIKIT-LEARN is so cool
Title of Series
Part Number
49
Number of Parts
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Place
Berlin

Content Metadata

Subject Area
Genre
Abstract
Valerio Maggio - Scikit-learn to "learn them all"

Scikit-learn is a powerful library that provides implementations of many of the most popular machine learning algorithms. This talk gives an overview of the "batteries" included in scikit-learn, along with working code examples and insights into its internals, in order to get the best out of our machine learning code.

-----

**Machine Learning** is about *using the right features, to build the right models, to achieve the right tasks*. However, defining what **right** actually means for the problem at hand requires analysing huge amounts of data and evaluating the performance of different algorithms on these data.

Moreover, deriving a working machine learning solution for a given problem is far from being a *waterfall* process. It is an iterative process in which both the data to be used (i.e., the *right features*) and the algorithms to apply (i.e., the *right models*) need continuous refinement.

In this scenario, Python has proved very useful to practitioners and researchers: its high-level nature, combined with the available tools and libraries, makes it possible to rapidly implement working machine learning code without *reinventing the wheel*.

**Scikit-learn** is an actively developed Python library, built on top of the solid `numpy` and `scipy` packages. Scikit-learn (`sklearn`) is an *all-in-one* software solution, providing implementations of several machine learning methods, along with datasets and (performance) evaluation algorithms. These "batteries" included in the library, combined with a clean and intuitive API, have made scikit-learn one of the most popular Python packages for writing machine learning code.

In this talk, a general overview of scikit-learn will be presented, along with brief explanations of the techniques provided out-of-the-box by the library. These explanations will be supported by working code examples and by insights into the algorithms' implementations, aimed at providing hints on how to extend the library code. Moreover, the advantages and limitations of the `sklearn` package will be discussed in comparison with other existing machine learning Python libraries (e.g., "Shogun Toolbox", "PyML", "MLPy"). In conclusion, (examples of) applications of scikit-learn to big data and computationally intensive tasks will also be presented.

The general outline of the talk is as follows (the order of the topics may vary):

* Intro to Machine Learning
* Machine Learning in Python
* Intro to Scikit-Learn
* Overview of Scikit-Learn
* Comparison with other existing ML Python libraries
* Supervised Learning with `sklearn`
* Text Classification with SVM and Kernel Methods
* Unsupervised Learning with `sklearn`
* Partitional and Model-based Clustering (i.e., k-means and Mixture Models)
* Scaling up Machine Learning
* Parallel and Large Scale ML with `sklearn`

The talk is intended for an intermediate-level audience (i.e., Advanced). It requires basic math skills and a good knowledge of the Python language.
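To give a flavour of the "batteries" mentioned in the abstract, here is a minimal sketch of three of the outlined topics: text classification with a linear SVM, k-means clustering, and cross-validated evaluation. It is not code from the talk. The toy sentences, the labels `sport` and `tech`, and all variable names are made up for illustration, and the imports assume a reasonably recent scikit-learn release (older releases laid out some modules differently).

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC, LinearSVC

# --- Supervised learning: text classification with a linear SVM ---
# Hypothetical toy corpus; a real task would use far more documents.
docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the new laptop has a fast processor",
    "the phone ships with more memory",
]
labels = ["sport", "sport", "tech", "tech"]

# The pipeline turns raw text into TF-IDF features, then fits the SVM.
text_clf = make_pipeline(TfidfVectorizer(), LinearSVC())
text_clf.fit(docs, labels)
predicted = text_clf.predict(["a goal in the final match"])

# --- Unsupervised learning: k-means clustering on a bundled dataset ---
X, y = load_iris(return_X_y=True)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)  # one cluster index per sample

# --- Evaluation: 5-fold cross-validation of a linear-kernel SVM ---
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=5)

print("predicted label:", predicted[0])
print("cluster sizes:", [int((cluster_ids == k).sum()) for k in range(3)])
print("mean CV accuracy:", scores.mean())
```

Both the classifier and the clustering model are driven through the same `fit` / `predict` style of calls, which is what makes swapping algorithms cheap. For larger workloads, `cross_val_score` also accepts an `n_jobs` argument to spread the folds over several CPU cores.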
Transcript: English (auto-generated)