Understanding Neural Network Architectures with Attention and Diffusion

EuroPython

Karzynski, Michal

Formal Metadata

Title

Title of Series

EuroPython 2023

Number of Parts

141

Author

Karzynski, Michal

Contributors

N. N. (Moderation)

License

CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this

Identifiers

10.5446/68753 (DOI)

Publisher

EuroPython

Release Date

2023

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

Neural networks have revolutionized AI, enabling machines to learn from data and make intelligent decisions. In this talk, we'll explore two popular architectures: Attention models and Diffusion models. First up, we'll discuss Attention models and how they've contributed to the success of large language models like ChatGPT. We'll explore how the Attention mechanism helps GPT focus on specific parts of a text sequence and how this mechanism has been applied to different tasks in natural language processing. Next, we'll dive into Diffusion models, a class of generative models that have shown remarkable performance in image synthesis. We'll explain how they work and their potential applications in the creative industry. This is a good talk for visual learners. I prepared schematic diagrams, which present main features of the nerual network architectures. By necessity, the diagrams are oversimplified, but I believe they will allow you to gain some insight into Transformers and Latent Diffusion models.