Laptop-sized ML for Text, with Open Source

Formal Metadata

Title
Laptop-sized ML for Text, with Open Source
Title of Series
Number of Parts
60
Author
Contributors
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
2023
Language
English

Content Metadata

Subject Area
Genre
Abstract
AI text models like GPT-3, ChatGPT, Bing AI and GitHub Copilot are getting a lot of buzz right now, both good and bad. Many of the training techniques are public, but the computational and data requirements mean most of us can't build our own. Using these big models typically involves paying for access or sharing your data. What if that's not an option? Luckily, there are a number of open source language models out there, with pre-trained versions available to download! They won't let you compete with Google or OpenAI, but they're good enough for a number of real-world problems. We'll start with a quick introduction to the main open ML-for-text systems like Word2vec, GloVe, ELMo and BERT, along with how they differ from traditional text relevancy measures like TF-IDF. Then, we'll discover how open source ML frameworks let us easily work with those techniques, and how pre-trained models let us quickly get up and running. With our ML-for-text model running on our laptop (or hefty Docker container!), next it's time to see what kinds of problems we can solve! We'll look at embeddings for search, inference, semantic reasoning, prediction and more, all with (fairly) minimal coding. Finally, we'll see how we can improve the pre-trained models for specific use-cases with our own text. It may not run on your phone and it probably won't hallucinate incorrect answers, but there are still a lot of text problems we can solve just with open source on our laptops. And we'll share the code you need to do so!
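To make the contrast between TF-IDF and embedding-based approaches concrete, here is a minimal sketch of TF-IDF relevance ranking using only the Python standard library. It is not code from the talk: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`), the whitespace tokenizer, the smoothed IDF formula and the three example documents in `DOCS` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical example corpus, invented for this sketch.
DOCS = [
    "the cat sat on the mat",
    "a feline rested on the rug",
    "stock markets fell sharply today",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed IDF (an assumption; other variants are equally common).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Score every document against the query, best match first."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus carry no weight.
    q_tf = Counter(t for t in tokenize(query) if t in idf)
    total = sum(q_tf.values())
    q_vec = {t: (c / total) * idf[t] for t, c in q_tf.items()} if total else {}
    scored = [(cosine(q_vec, v), d) for v, d in zip(vectors, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    for score, doc in rank("cat", DOCS):
        print(f"{score:.3f}  {doc}")
```

With the query "cat", the first document ranks highest, while "a feline rested on the rug" scores exactly 0.0 because it shares no literal term with the query. That lexical gap is the kind of limitation that learned embeddings such as Word2vec, GloVe or BERT are meant to narrow, since they place related words near each other in vector space.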