Building a metadata ecosystem using the Hive Metastore

Formal Metadata

Title
Building a metadata ecosystem using the Hive Metastore
Title of Series
Number of Parts
69
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Metadata has been a key data infrastructure need since the beginning of our team's history at Stitch Fix. We began this journey in 2015 by setting up the Hive Metastore to work with Spark, Presto, and the rest of the platform infrastructure. As our business needs grew, we needed to enhance and extend our metadata ecosystem. In this talk, we share our journey of building additional metadata capabilities to solve data and business challenges. Starting with our base infrastructure, the Hive Metastore, we highlight each capability that led us to build extensions into our present-day metadata infrastructure. These include improvements to the Hive Metastore itself, extending the use of metadata beyond table schemas, and additional microservices that make metadata easier to access and use. Building these capabilities has helped our team use metadata to power internal use cases. We also share how we went about building this ecosystem and the lessons we learned along the way.