Simplifying upserts and deletes on Delta Lake tables

Formal Metadata

Title
Simplifying upserts and deletes on Delta Lake tables
Series Title
Number of Parts
69
Author
Contributors
License
CC Attribution 3.0 Unported:
You may use, adapt, copy, distribute, and make publicly available the work or its content, in unchanged or adapted form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Data engineers face many challenges with data lakes. GDPR requests, data quality issues, large metadata, merges, and deletes are among the tough problems that nearly every data engineer encounters when a data lake is built on formats such as Parquet, ORC, or Avro. This session shows how to apply updates, upserts, and deletes to a Delta Lake table in a few lines of code, how to use time travel to go back in time and reproduce experiments and reports, and how to avoid the problems caused by small files. Delta Lake was developed by Databricks and has been donated to the Linux Foundation; the code can be found at http://delta.io. Many companies around the world use Delta Lake because of its advantages for data lakes, and many enterprises have made it the default data format in their architecture. We will discuss and demo how Delta Lake can help your data lake, using SQL or the equivalent Python or Scala API to showcase various Delta Lake features.