Demystifying Spark: A Deep Dive into Its Workings
Formal Metadata
Number of parts: 18
License: CC Attribution 4.0 International: You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided you credit the author/rights holder in the manner specified by them.
Identifier: 10.5446/69797 (DOI)
Transcript: English (automatically generated)
00:05
So Neil is a backend engineer with a passion for big data, and he already holds a Master of Science in data science and is doing another degree. He didn't write down which one, so it's actually a secret.
00:22
And yeah, we are looking forward to hearing about Apache Spark's inner workings. So, who of you has worked with Apache Spark in the past? Oh, that looks like about one third of you. Yes, about one third.
00:40
So yeah, you will help us understand how Apache Spark really works. So thanks a lot, and a warm welcome to you. Thank you. My name's Neil Gibbons, and I'm currently doing a Master's in computer science, but I've worked as a backend engineer for the last couple of years.
01:02
And the topic of my talk today is an introduction to the internals of Spark. I think most people are probably familiar with what Spark is, but if you're not, it's an open-source framework for processing large amounts of data.
01:23
And I found that understanding it at a deeper level is a bit difficult, because it offers so much functionality. It can handle streaming workloads, interactive workloads, batch
01:43
workloads, and also very iterative machine learning workloads. So it's quite difficult to know where to start with all of this, and on top of that, there's a massive code base. So it's difficult to pick it all apart, and there aren't actually that many
02:02
resources on the internals of Spark. So that was my motivation for making this talk today. In terms of structure, first of all I'm going to
02:21
speak about Map, Shuffle and Reduce. This is one of the core ideas behind Spark, but a lot of the MapReduce machinery is abstracted away, so thinking about how Spark is doing MapReduce underneath helps you build some
02:42
intuition for it. In the next part of the talk, I want to talk about RDDs, Resilient Distributed Datasets, which are the core abstraction behind Spark. Once you understand how these work, the whole of the Spark system
03:09
becomes a bit clearer. And then at the end, if we have time, I want to go through the Spark scheduler: how it takes this collection of RDD objects and uses it to actually
03:28
do our computation and produce a result. So to start with, I'm going to talk about how Map, Shuffle and Reduce sit behind Spark.
03:42
So the first thing to say, I think, is that Spark is just another in a long line of computation frameworks that break down tasks into Map and Reduce. So what is MapReduce?
04:02
It's basically a way of breaking down a computation task so that it can be parallelized, so it can be done by multiple actors at the same time. That's really key for what they call horizontal scaling.
04:25
And something that I quite liked is that this idea is more general than just running on computers. People haven't called it MapReduce, but they've used the idea for hundreds of years.
04:43
So in a census, you don't have one person going around to every single house and counting the number of people in a country. You sort of split up the task. So you have a person, say, for each village, and then they sort of bring their results together.
05:03
So the person going to each village would be the equivalent of the Map stage. And then bringing your results into one place is like the Shuffle. And then the Reduce task is actually combining the results from all the different villages into one.
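To make those three steps concrete, here is a small sketch of the census example in plain Python. This is only an illustration of the pattern (the village names and per-house numbers are made up), not anything Spark-specific yet:

```python
from collections import defaultdict

# Hypothetical census data: each "map" worker handles one village on their own.
villages = {
    "Alpha": [4, 2, 3],   # people counted per house in village Alpha
    "Beta":  [5, 1],
    "Gamma": [2, 2, 6],
}

# Map: every worker independently turns their village into (village, count) pairs.
mapped = [(village, count) for village, houses in villages.items() for count in houses]

# Shuffle: bring all partial counts for the same key together in one place.
shuffled = defaultdict(list)
for village, count in mapped:
    shuffled[village].append(count)

# Reduce: combine the partial counts from each village into the final totals.
totals = {village: sum(counts) for village, counts in shuffled.items()}
print(totals)   # {'Alpha': 9, 'Beta': 6, 'Gamma': 10}
```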
05:25
Another manual example of MapReduce is creating a book index. They actually did this before they had computers, too: if you had a team of 10 people
05:45
and you wanted to create a book index, you might give 10% of the pages to each person, and they would pick out the important words from their pages.
06:00
Then there's a Shuffle step where they bring all their data together, and a Reduce where everything is combined. Now, to talk about some actual machine implementations of MapReduce,
06:26
a very early one was in the 1890s for the US Census: the Hollerith machine. The Map stage there would be encoding each person's information
06:44
on a punch card, which you could then feed into a machine. The machine would then do the Reduce step for you, combining all of these people's information
07:00
to create the final census. And this is such a powerful idea that it became the core of Google's own implementation of MapReduce,
07:21
which they described in a paper in 2004, and that led to the open-source Hadoop MapReduce. And I think it's worth mentioning that Google developed this in order to produce their search index in a way that could be parallelized.
07:46
Spark was then an evolution of Hadoop MapReduce. There were some limitations with the original MapReduce,
08:04
which is why they implemented Spark. So that's an explanation of the MapReduce idea. Now I want to talk a bit more about the other core idea behind Spark,
08:25
which is resilient distributed datasets: where they come from, the problem they're trying to solve, and how they're actually implemented. I mentioned that what came before Spark was MapReduce, and what they found
08:48
using it in practice is that it was very good for batch workloads, something like producing Google's search index by scraping the open internet;
09:03
it could do that quite well. But for more interactive and iterative workloads, it wasn't as good. An interactive workload would be something like doing exploratory data analysis
09:24
with SQL queries one after another. Similarly, an iterative workload comes up especially in machine learning, where maybe you're waiting for convergence or you have a loop that keeps running.
09:49
MapReduce wasn't so good at handling these workloads. And this was a problem, because around that time, 2008, 2009,
10:05
machine learning was becoming a lot more popular. This is what motivated the AMPLab team at UC Berkeley to have a look at how they could improve MapReduce.
10:23
One of the first things they did was diagnose the problem: what was actually causing the slowness, and what does the system spend most of its time doing? And what it was spending most of its time doing
10:45
was writing the intermediate data in a calculation to disk, to stable storage. The reason it was doing this was fault tolerance. In any distributed system, fault tolerance is really important,
11:05
because you don't want to lose any data. And the strategy they were using to achieve fault tolerance was replication.
11:23
And writing to disk every time you go through a loop, or every time you run a SQL query,
11:40
at every stage in the workload, was very costly. So what the AMPLab team wanted to do was find a way to avoid that. And something to introduce here is that there's a tension between fault tolerance and performance,
12:05
and what they were trying to build is a system that achieves fault tolerance and performance at the same time, the best of both worlds. So the next step for this team
12:26
was to think of other strategies for achieving fault tolerance that could have better performance. One of these strategies was re-computation.
12:42
And it's almost surprising that this was the one that ended up working, because if you think back to creating a book index or a census,
13:01
the time-consuming part there is having to physically go through the pages or the houses; handing your data over to a particular person isn't what takes the time.
13:21
So if you think about it in terms of the manual example, you could say that re-computation seems like a really bad idea. But on a computer,
13:43
the actual counting or computation isn't what's expensive; it's the moving of data that's expensive. The graph I've got here is from one of the papers the AMPLab team produced.
14:02
When you have an iterative algorithm and you use this re-computation strategy, it ends up being a lot faster. For an iterative workload, the first iteration
14:29
is still reading from stable storage, so that one's a bit slower. But for the following iterations, you're not having to read from and write to disk,
14:40
so those ones are a lot quicker. And even when there is a failure, if you just have to recompute one partition of the data, it's not that much slower. The other really good thing about this strategy is that rather than doing the expensive thing for fault tolerance on every single iteration,
15:10
you only have to do the expensive thing, which is re-computation when there actually is a fault. So that would be on iteration number six.
15:24
So now that we have this idea of how we're going to provide fault tolerance and good performance at the same time, achieving it through re-computation when there is a fault,
15:45
the next question is: how do we actually implement this? RDDs, or Resilient Distributed Datasets, the core idea behind Spark, are an implementation of this idea. And the good news for understanding Spark is that they are actually quite simple.
16:06
What do you need for re-computation? You need to be able to follow the chain of computation steps, so you need a link back to your parent.
16:21
You also need a function which can compute a partition based on its parents. And we need to keep track of the partitions of the data, how we've actually split the data apart.
16:44
And that's basically it. That's how we represent our data in Spark, and that's how we achieve fault tolerance with good performance.
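As an illustration only (not Spark's actual classes), a minimal sketch of that idea might look like the toy class below, with a link to the parent, a compute function, and a number of partitions. Real RDDs also carry partitioner and preferred-location information:

```python
# Toy sketch of the RDD idea: lineage + a compute function + partitions.
# Illustration only; not Spark's real implementation.
class ToyRDD:
    def __init__(self, parent, compute, num_partitions):
        self.parent = parent                    # link back to the parent RDD (the lineage)
        self.compute = compute                  # how to derive one partition's data
        self.num_partitions = num_partitions    # how the data has been split apart

    def get_partition(self, i):
        if self.parent is None:
            return self.compute(i)                   # a source RDD produces its own data
        parent_data = self.parent.get_partition(i)   # recompute the parent partition if needed
        return self.compute(parent_data)             # ...then apply this step on top of it

# If a partition is lost, it can be rebuilt by replaying the compute functions from the
# source, instead of keeping replicated copies of intermediate data on stable storage.
numbers = ToyRDD(None, lambda i: list(range(i * 10, (i + 1) * 10)), num_partitions=4)
doubled = ToyRDD(numbers, lambda part: [x * 2 for x in part], num_partitions=4)
print(doubled.get_partition(2))   # recomputed on demand: [40, 42, ..., 58]
```

This sketch only models narrow, one-to-one dependencies; shuffles, where a partition depends on many parent partitions, are where the DAG scheduler later has to draw stage boundaries.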
17:02
And we can do this re-computation when we need to. Then, just briefly to finish out the talk, I want to speak about the Spark scheduler. The question here is: how does it actually take the collection of RDD objects
17:25
and turn that into a result? Here's an example of some Spark code. When you write this code
17:43
and specify transformations like split, map or reduceByKey, Spark doesn't actually materialize the data at that point.
18:02
It builds up a logical plan. Then when you call the collect or count method on the RDD, that's called an action, and that's what finally triggers the computation.
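A minimal PySpark sketch of that behaviour (not the code from the slide): every line except the last just records a transformation, and only the action at the end submits a job.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "lazy-evaluation-demo")

rdd = sc.parallelize(range(1_000_000))        # base RDD, nothing computed yet
squares = rdd.map(lambda x: x * x)            # transformation: recorded in the logical plan
evens = squares.filter(lambda x: x % 2 == 0)  # transformation: still just extending the plan

# The action below is what finally hands the plan to the DAG scheduler and runs tasks.
print(evens.count())
```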
18:24
The first part of the system, which takes all of the RDD objects, is the DAG scheduler. What it's doing is breaking up the computation task into stages.
18:46
What it does is look for shuffle boundaries, and it basically tries to pipeline as many transformations together into one
19:11
stage as it can. It's also worth mentioning that the DAG scheduler is responsible for the re-computation strategy
19:26
for fault tolerance. Before I come on to what happens after it's split the work into stages: this code here, I realize I'm going through it quite quickly,
19:43
but it's basically taking a dataset like the one you see in the top right-hand corner, a CSV, and trying to rank locations by popularity.
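The exact code on the slide isn't in the transcript, but it was presumably something along these lines (the file path, column layout and "active" flag here are assumptions made for this sketch):

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "location-popularity")

logs = sc.textFile("logs.csv")                    # the CSV from the slide (path assumed)
rows = logs.map(lambda line: line.split(","))     # assumed layout: user_id,location,active
active = rows.filter(lambda r: r[2] == "true")    # drop inactive users
pairs = active.map(lambda r: (r[1], 1))           # (location, activity) style tuples
counts = pairs.reduceByKey(lambda a, b: a + b)    # wide transformation: shuffle boundary
ranked = counts.map(lambda kv: (kv[1], kv[0])).sortByKey(ascending=False)  # second shuffle
print(ranked.collect())                           # action: triggers the whole job
```

The filter and the narrow maps can be pipelined into a single stage, while reduceByKey and sortByKey each need a shuffle, which is why the DAG scheduler ends up with three stages here.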
20:02
So we have this Spark code, and the DAG scheduler produces what we have at the bottom: it combines the filter and the map together into one stage, and then it creates two more stages, one for the reduceByKey transformation
20:21
and one for the sortByKey transformation. Once it's got this set of stages and tasks, it passes them on to the task scheduler. The task scheduler basically talks to a cluster manager,
20:43
something like Apache Mesos or YARN, and launches specific tasks. There's a preferred location where it will run each task:
21:01
it takes into account where the data is actually stored, and also which workers are free to take on the particular task. To build on the example from earlier,
21:23
we've got our filter and map stage, and it breaks that down into tasks, one task for each partition of the original dataset. So task one reads partition one of the logs RDD, filters out the inactive users,
21:46
and then creates these (location, activity) tuples. That gives a set of tasks that the scheduler then hands to the cluster manager to run.
22:05
It'll do a similar thing for the other stages, stage two and stage three. The actual task within stage two is reducing all records for the locations assigned to its particular partition.
22:24
And then there's the sortByKey, which sorts records within each partition. And this kind of brings everything together: we started with our code,
22:42
and now we've built up this, which is what the actual execution looks like. That's what the scheduler has done. In this example, let's say we have three workers, and we've got five partitions of the data.
23:02
At the bottom I've put the stages; that was the DAG scheduler that did that for us. Once you've figured out what the stages of your computation are,
23:20
the task scheduler creates a task for each partition. It starts with task one and decides to put it on worker one, just because it sees that it's available.
23:42
Same for task two, which goes on worker two, and task three on worker three. Then all of our workers are taken up, so we have to wait a bit before we can put task four on worker two once that's finished, and then task five on the next worker that frees up.
24:05
A similar process happens for stage two: the task scheduler puts its tasks onto the different workers. But notice how we can't immediately put task one of stage two onto worker one.
24:27
Worker one has to wait, because of the dependency that the DAG scheduler has worked out: we need to completely finish everything within stage one before we can start on stage two.
24:46
I think I'm basically out of time, but just to recap what I spoke about: at a very high level, Spark is just another in a long line of
25:04
methods of breaking down a computation task into these map, shuffle and reduce stages. It's maybe not that obvious that it's doing this behind the scenes, because it's got such a nice user interface and such nice abstractions.
25:27
But that's what it's doing, and when you understand it in that way, it seems a bit less intimidating. The other major idea I was speaking about was
25:41
resilient distributed datasets: where the idea came from and how they were actually implemented. And then I went over the Spark scheduler example as well, and talked about how it takes these RDD objects
26:01
and, from that plan, converts them into computation and calculates a result for you. So yeah, I think that's everything.
26:29
If there is one question, we would have time for one. There's one over there. Oh, hello. Thank you so much for your presentation.
26:43
As you mentioned, Spark relies on its own mechanisms to create a distributed cluster solution, and some companies do not like that much. And we have other options in the market, emerging solutions in Python such as Ray and Dask. So I would like to hear your opinion about them.
27:02
Yeah, I haven't actually worked with alternative cluster managers that much, so I can't really comment. But I think Spark even has its own cluster manager as well. But yeah, I can't comment on that question.
27:25
All right, thank you very much. Okay, thank you. And yeah, enjoy your lunch break. Thank you.