Bridging the Gap: from Data Science to Production

Formal Metadata

Title
Bridging the Gap: from Data Science to Production
Title of Series
Number of Parts
132
Author
License
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
A recent but quite common observation in industry is that, although data science has been widely adopted, many companies struggle to get it into production. Large teams of well-paid data scientists often present one fancy model after another to their managers, but their proofs of concept never turn into anything business-relevant. Frustration grows on both sides, among managers and data scientists alike. In my talk I elaborate on the many reasons why moving data science to production is such a hard nut to crack. I start with a taxonomy of data use cases to make technical requirements easier to assess. Building on that, I focus on overcoming the two-language problem: Python/R, loved by data scientists, versus the enterprise-established Java/Scala. From my project experience I present three different solutions, namely 1) migrating to a single language, 2) reimplementation and 3) use of a framework. The advantages and disadvantages of each approach are presented, and general advice based on the introduced taxonomy is given. My talk also addresses organisational problems as well as problems in quality assurance and deployment. Best practices and further references are presented at a high level in order to cover all facets of data science to production. With my talk I hope to convey the message that breakdowns on the road from data science to production are the rule rather than the exception, so you are not alone. At the end of my talk, you will have a better understanding of why you and your team are struggling and what to do about it.
Transcript: English (auto-generated)