
Closing Keynote by Lamere


Formal Metadata

Title
Closing Keynote by Lamere
Title of Series
Part Number
88
Number of Parts
89
Author
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Paul Lamere works at Spotify, where he spends all his time building machines to figure out what song you really want to listen to next. When not at work, Paul spends much of his spare time hacking on music apps. Paul's work and hacks are world-renowned including the 'The Infinite Jukebox', and 'Girl Talk in a Box'. Paul is a four-time nominee and twice winner of the MTV O Music Award for Best Music Hack (Winning hacks: 'Bohemian Rhapsichord' and 'The Bonhamizer') and is the first inductee into the Music Hacker's Hall of Fame. Paul also authors a popular blog about music technology called 'Music Machinery'.
Transcript: English(auto-generated)
This is great, so, this is my very, very first RailsConf, and a full disclosure, I have never, ever shipped a line of Rails code in my life.
I am a total outsider, but I came to this conference, I came to the whole thing, I went to lots of really, really great talks, I got to meet lots of really, really great people. It was a really welcoming community, and I'm really glad I came. I've learned a few keywords that I need to study up on, something like Active Record, something Turbo,
or Turbo something, and apparently you folks are into this thing called testing, which I've never had to do. So, I'm Paul Lamere, I work at Spotify, a music streaming service, and I'm really excited
about this talk because I get to talk about my two very, very favorite things. One is music, and one is code. So, let's talk a little bit about music. So, when I was in high school, I was in a band, it was the Memorial High School Marching Band, and I played trombone, and I really loved
being a musician. I loved getting in front of people, I loved to share a performance, I loved to maybe even be able to touch people emotionally with power, or make them feel maybe sad because of the sad trombone, or something, but I sort of realized, first of all,
I was not really that good, I could not make a career as a trombonist, so I moved on to become a programmer. This is back in the early 80s, this is what it looked like to write code in 1984. An oscilloscope, that's me wearing a tie, and you notice a tie actually makes me look a lot slimmer.
So, I spent 20 years or so writing all sorts of software in lots of different industries, including, funnily enough, the industry that Nicholas was talking about. I wrote software that controls the cameras that are flying in the U-2, and they're actually
still flying to this day, much to my shock and horror. But, about 15 years ago, I got to combine the streams, my two very favorite things, code and music. I started to work in the music tech industry, and eventually that led me to working at Spotify.
And so now, I'm in this total heaven, because I'm surrounded by all of these awesome people who are building this awesome product, but even better, I'm surrounded by all this great music data that lets us learn about how people actually experience music. So this talk is called Data Mining Music,
but that's just so you can go back to your bosses and say, yeah, we learned about data mining. It's really hacking on music, because that's what I do. I like to explore music through code in all sorts of different ways. I like to find out how people experience music. I like to think about how people can learn about music.
I like to make the music experience more interesting, more interactive, and do all sorts of things like that. Now, my real job at Spotify is to spend every day looking through all this data and trying to improve the music listening experience for our listeners. And so that may be things like helping them
find a little bit more about the music itself or help organize their music collection or give them good music recommendations. So in this talk, I'm gonna show 14 examples of really kind of hacks, but I call them experiments, because that's better than hacks, that show different ways that we can
learn stuff about music. So one of the things about Spotify is we're surrounded by data, and the data that we get at Spotify about how people are listening to music gives us a view of music listening that we've really never had before. And I'm just gonna dive right into a quick example
to give you a taste of this, and this is called the Aerosmith Anomaly. So you guys, I'm sure, are all familiar with Aerosmith. They're the bad boys from Boston. They had their first hit way back in 1975, but I'm sure most of you are most familiar with their only number one hit, which happened in 1998.
That was, it's the song, I'm blanking on the name of the song, but you guys know what it is. This is what it sounds like. ["Aerosmith Anomaly"]
First spontaneous applause I've ever received in a talk. So that, some of you may have actually, that may have been your very first slow dance in middle school. I should have given you a trigger warning.
So, because we know exactly when people play this song on Spotify, we can look at a time series of this song. So here's the track of streams as a function of time for I Don't Want to Miss a Thing over about a two year period,
from January of 2013 to December 2014. So you notice a few things about this plot. One is, you know, there's this general slope up to the right, and that's not because this song's getting more popular, that's because Spotify is getting more popular. You also see a lot of these little bumps, and that's the weekly listening pattern.
So during the week, people listen more when they're at work, and they listen a little bit less during the weekend. And that's a pattern pretty common across all songs. But the really interesting thing, and this is the Aerosmith Anomaly, are these peaks. What is causing these peaks? We see some peaks where the volume of listening is doubling or tripling
in what looks like a single day. I mean, this actually is pretty rare. We don't see this across any other song. So what is causing this? So we can start off by taking a look at the dates. All right, so, all right. February 14th, February 14th, so you're thinking middle school dance, right? But then what about this August 7th,
September 18th, and November 13th? Well, it turns out these are all, these spikes are here all for the same reason, and it's not because of Valentine's Day. It's for another reason, and it's, actually, it's an unworldly reason. So take a look. This first one is close shave with asteroid 2012.
Potentially hazardous asteroid, this is the Earth. Rosetta spacecraft orbits Comet 67P. Philae landing site selected on Comet 67P. Robotic lander touches down on 67P.
Hey, we're a team, too. So yeah, you guys have totally figured this out, right? This song was on the soundtrack for the movie Armageddon where Bruce Willis goes and saves the world from a comet that's about to hit the Earth.
So what we found out is this song is sort of the poster child. This is the go-to song when there's some kind of asteroid or meteor coming our way. So this is the kind of thing we never ever would have found out if we looked at the old traditional chart.
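One way to surface peaks like these automatically is to compare each day against a baseline built from the days just before it. The sketch below is only an illustration under assumptions: the function name `find_spikes`, the seven-day window (chosen to absorb the weekly rhythm) and the doubling threshold are not from the talk, and the stream counts are made up.

```python
from statistics import median

def find_spikes(daily_streams, window=7, ratio=2.0):
    """Return indices of days whose count is at least `ratio` times
    the median of the preceding `window` days (hypothetical thresholds)."""
    spikes = []
    for i in range(window, len(daily_streams)):
        # The median is robust to a single earlier spike inside the window.
        baseline = median(daily_streams[i - window:i])
        if baseline > 0 and daily_streams[i] >= ratio * baseline:
            spikes.append(i)
    return spikes

# Made-up counts: a weekly rhythm with one day that roughly triples.
streams = [100, 105, 110, 108, 102, 80, 78,
           101, 106, 330, 109, 103, 81, 79]
print(find_spikes(streams))  # prints [9]
```

The flagged indices would then be mapped back to calendar dates and checked against the news, which is the manual step described above.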
So here's Billboard, and before the digital music revolution, people would go to Billboard, and Billboard would actually call record stores and call radio stations to find out how often people were playing songs. And so if we look at how much more data, how fine a view we have of the data,
it's kind of akin to when Galileo took his telescope and pointed it at the sky the first time, and he saw the moons of Jupiter or the rings of Saturn. It maybe hasn't had as big a global impact as that, but it's the same sort of thing,
certainly a big impact in the music industry. The data that we see here is just opening up our eyes to all sorts of ways of how people are listening to music. But before we leave this example, just one more thing. I know you guys know that correlation does not necessarily imply causation.
So if you go back to that chart with the Aerosmith thing, just sort of think maybe, what if the correlation thing is actually reversed and increased Aerosmith listening is actually attracting asteroids and meteors? Now think about it, no really.
All right, so why do we care? We care that we have lots of music data that lets us find strange things about music. Well, back to when I was first writing that code, this was our exciting new music technology, the Sony Walkman. You could put 10 songs in your pocket, really big deal. The biggest challenge was actually trying to get that
to fit in your pocket. And 20 years later, Steve Jobs got on stage and introduced the first iPod. You could put 1,000 songs in your pocket. The exciting technology that went with this was shuffle play. You could shuffle 1,000 songs and get a pretty good listening experience. And then 10 years later, Spotify launched in the US
and lots of other streaming services did as well. They essentially put 30 million songs in your pocket. So you're essentially a tap away from listening to just about any song that has ever been recorded, except for Taylor Swift. It's true. And so we're gonna need help figuring out
how to listen to music. When you have 30 million songs, you can't hit the shuffle play button because you're gonna get some John Philip Sousa march and some Gregorian chant and some Kesha. And you're gonna get iPod whiplash, right? So we need tools to help with this. So we have 30 million songs in our pocket.
What are we gonna do? So this is what I do at Spotify. I try to figure out how to improve the listening experience. So in this talk, I'm gonna sort of walk through 14 experiments around how we can better engage the listener. And we're talking about this all about data. So I'm gonna focus on four different kinds of music data.
Music metadata, so this is the basic facts about music. Cultural data, so this is what people are saying online about music. There's listener data, so this is who's playing what and when. And finally, acoustic data. So this is treating the music as data itself. So what does the music actually sound like?
So we're gonna start off with music metadata, sometimes called the most boring-est of the music data. So this is really the basic facts. Spotify, we have millions of tracks, artists and albums, and they're all interlinked. Artists have albums, albums have tracks, tracks can appear in multiple albums. So there's a crazy intertwining of all these things,
and they have lots and lots of different facts associated with them. And it turns out that this is really, really hard, because the music space is not really well-contained. Artists can call things whatever they want, they can do whatever they want, and nobody can stop them. And so it turns out that we spend a whole lot of time
trying to get this data right. And I'm just gonna give you an idea of some of the challenges here. So first of all, here's a simple music query. Somebody says, hey, what are the songs on the album White Christmas by Bing Crosby? The reason that this is hard is actually that Bing Crosby has 40 albums called White Christmas.
And they all have different album art, they have different tracks, they have different performers. So he had the cow called White Christmas, and he kept making sure he could milk it every year with a new album. So Sarah had told me that this is a really highly technical crowd, and that I should try to challenge you as much as I can. So I have a little math quiz here.
So why is this formula troublesome for music recommendation and discovery? Just shout out the answer if you know. Yeah, all right. All right, no, all right, the answer is this is the name of a song by Aphex Twin.
So if you're building a music service and you wanna make sure that your listeners can find this song, what do you do? It's not easy. So just, you kinda look at artist names, right? Back before we had Google, we had a band called The The, who made their name out of stop words. I know you guys are all familiar with the band Duran Duran, but did you know there's a band called Duran Duran Duran?
And there's a DJ who calls himself DJ Donna Summer. And there's Glass Teeth. And in fact, there's a whole genre called Witch House, which is just filled with unpronounceable names. And then the final artist, which is an artist that, as far as I'm concerned,
should burn in hell forever, is the artist named Various Artist. There really is a DJ whose name is Various Artist.
So say you're a fan of the band Eclipse. Did you know that there are actually 22 bands called Eclipse? You know, if you happen to be a fan of the Utah-based vocal choral group, and I happen to play for you the Ukrainian brutal death metal grindcore version of Eclipse, both of us are gonna have a bad day. All right, so music metadata, it's hard,
but we can still try to have a little bit of fun with it. So what we wanna do for our first experiment, we're just gonna put a toe in the water, a little bit of data, is to answer a very important question from the internet. And the question is, have band names been getting longer?
And this is a real question that's been posted on Quora. And so helpfully, this guy named Zachary Davidson, he starts off by giving us his qualifications, which are, I named both my bands. And he says, he would say, yes, they are getting longer, but only very gradually. And he goes on and on and on, justifying his things.
But of course, that's his opinion. We're just gonna do this with data. And so it's pretty easy to do. The only thing we have to do is go through the top 500 artists for each five-year window, blah, blah, blah, calculate the average name length, and we're done. And the code is actually shorter than Zach's answer.
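The computation just described can be sketched in a few lines. This is an assumed shape, not the code from the talk: the function name, the `(name, year)` input format and the sample artists are all hypothetical, and selecting the top 500 artists per window is left out.

```python
from collections import defaultdict

def average_name_length(artists, window=5):
    """artists: iterable of (name, year) pairs.
    Returns {window_start_year: mean name length in characters}."""
    buckets = defaultdict(list)
    for name, year in artists:
        start = year - year % window  # e.g. 1957 -> 1955
        buckets[start].append(len(name))
    return {start: sum(lengths) / len(lengths)
            for start, lengths in sorted(buckets.items())}

# Made-up artists, purely for illustration.
sample = [
    ("Big Band Leader and His Orchestra", 1956),
    ("Solo", 1957),
    ("Duo Act", 2011),
    ("Trio", 2014),
]
print(average_name_length(sample))  # prints {1955: 18.5, 2010: 5.5}
```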
So, have band names been getting longer? A little audience participation. Raise your hand if you think artist names now are longer than they were a long time ago. Okay, raise your hands. Okay, keep your hands up. If your hands up, you are wrong. The artist names were actually longest in 1955 to 1959.
You say, well, what was going on then? Because they were quite a bit longer. These are the kinds of band names we had back then. Van McCoy and the Soul City Symphony Orchestra, Academy of St Martin in the Fields. So essentially, the meme at the time was orchestra leader name plus orchestra name. So we ended up with very long names.
And of course, while we're at it, we might as well take a look at what the longest popular artist names ever are. And the longest is Tim and Sam's Tim and the Sam Band with Tim and Sam. I think Tim and Sam both grew up with short names and they compensated.
All right, so that's a quick tour through metadata. Now we're going to cultural data. And you probably may know less about cultural data because no one else, I don't know, whatever. I don't know what I'm saying. All right, so cultural data. So the idea of cultural data is
there's lots of data out there on the web where people are writing about music. So there's music blogs, there's music review sites, people are putting playlists together. There's lots and lots of really interesting data. So the idea is let's go out, crawl all this data, sort of a Google-scale crawl, collect it all up, do some natural language processing on it,
statistics on it, and use that data to give us a really good description of particular artists. So here's an example. So we have this black metal band, Dimmu Borgir. Here's a typical review that you'd see for a band. And you see there's actually lots of descriptive words
for this artist. So black metal, melodic black metal, unique, symphonic, Norwegian, fast, famous. So when you start to collect all these words up, we get a pretty good idea of what this band is about. And this is just from the words from the first paragraph of one review. So if you repeat this over thousands and thousands
of reviews that have been written about this band, collect up all the words, find the ones that are distinctive for the band, find the ones that appear a lot, we get a really good description of what this band is. And we can use this data for all sorts of things. But one of the things that we use it for is for artist similarity. So we can find out how similar two artists are.
So we've got Katy Perry, we have Dimmu Borgir, and we can look at their top terms. And here I'm only showing a half dozen terms. But imagine that these lists go on and on and they're weighted in all sorts of crazy ways. There's very little overlap, essentially just the decades that they played in. Whereas if we look at a different artist, like say Britney Spears,
quite a bit of overlap with Katy Perry. So we can sort of assume that Britney Spears and Katy Perry are very similar. All right, so now we're gonna do an experiment with this artist similarity. And the experiment is to expose the listener to new music by using this artist similarity.
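The overlap of weighted terms can be turned into a number in several ways. The sketch below uses cosine similarity over term-weight dictionaries; that choice, the function name and all the weights are assumptions for illustration, since the talk does not say which measure is actually used.

```python
import math

def cosine_similarity(terms_a, terms_b):
    """terms_a, terms_b: {term: weight}. Returns a value in [0, 1]
    for non-negative weights; 0.0 if either artist has no terms."""
    shared = set(terms_a) & set(terms_b)
    dot = sum(terms_a[t] * terms_b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in terms_a.values()))
    norm_b = math.sqrt(sum(w * w for w in terms_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up term weights for three hypothetical artists.
pop_a = {"pop": 0.9, "dance": 0.6, "2000s": 0.3}
pop_b = {"pop": 0.8, "dance": 0.5, "teen pop": 0.4, "2000s": 0.2}
metal = {"black metal": 0.9, "symphonic": 0.6, "norwegian": 0.5, "2000s": 0.2}

print(cosine_similarity(pop_a, pop_b) > cosine_similarity(pop_a, metal))  # prints True
```

With these weights the two pop artists score close to 1, while the pop and metal artists share only the decade term and score close to 0.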
But this is the extreme edition. And it's extreme because our goal is to help a Katy Perry fan listen to Dimmu Borgir. All right, so how are we gonna do this, right? So we start off, we have these two artists. We know that they're very, very dissimilar.
But maybe we can find an artist that fits in between that has some overlap with both of them, right? So here we have maybe a three-song playlist that would get us there. But that's a pretty big jump from Katy Perry to Evanescence and also a very big jump from Evanescence to Dimmu Borgir. So we have to go a little further.
So in fact, what we do is we can take all of the artists, put them in a big graph, and connect them all up to their nearest neighbors. This is a visualization that was done by researchers at Yahoo, by the way. It's pretty nice. So we see Katy Perry is at one end of the space, Dimmu Borgir is at the other, Evanescence is nicely in the middle. But what we wanna do then is just sort of fill in the gaps,
call in our friend Dijkstra to find a path through this space and we have a playlist. And so there's an app that does this. It's called Boil the Frog. All right, you guys are way ahead of me.
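A path search of this kind can be sketched with Dijkstra's algorithm over a nearest-neighbor graph. The graph below is hypothetical (placeholder style names stand in for artists, and edge weights stand in for dissimilarity), and the function is a generic textbook version rather than the app's actual code.

```python
import heapq

def shortest_path(graph, start, goal):
    """graph: {node: {neighbor: distance}}.
    Returns the lowest-cost list of nodes from start to goal, or None."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, distance in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + distance, neighbor, path + [neighbor]))
    return None

# Hypothetical graph: each node links only to its nearest neighbors,
# so there is no direct edge between the two extremes.
graph = {
    "pop": {"pop_rock": 1.0, "dance": 0.5},
    "dance": {"pop": 0.5},
    "pop_rock": {"pop": 1.0, "gothic_rock": 1.0},
    "gothic_rock": {"pop_rock": 1.0, "symphonic_metal": 1.0},
    "symphonic_metal": {"gothic_rock": 1.0, "black_metal": 1.0},
    "black_metal": {"symphonic_metal": 1.0},
}
print(shortest_path(graph, "pop", "black_metal"))
```

Because only near neighbors are connected, every hop on the returned path is a small step, which is what makes the resulting playlist gradual.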
You know the story about boiling a frog. You put a frog in cold water and heat it up very slowly. It won't notice, and it won't jump out. The idea is to do this musically. So take somebody who's listening to music, very gradually take them from one artist to another. So I'm gonna run this generated playlist
and we'll listen to very brief snippets of the 10 songs. And the idea is there should be like a gradual progression between the songs. Here we go. Katy Perry, Dimmu Borgir, Boil the Frog. We've got a playlist of 10 songs. So let's listen. We're gonna have audio, I hope. Here it comes.
A message that says, hey, you played this song, this Kesha song, on this date. So we get billions and billions of these messages every day. And it's really sort of the bread and butter of Spotify. We derive all sorts of things from this data, from charts and music recommendations and collaborative filtering and stuff like that.
But I'm gonna try to talk about something that's maybe a little future looking. And it's something I'm really interested in. And that's finding artists that have really passionate fans. So there are certain artists. Are there any Hamilton fans here by any chance? So Hamilton fans, they'll listen to that album
over and over and over and over again. Am I right? Yes, all right, yeah. And so they're very, very passionate. I think finding artists that have passionate fans is really interesting. If you're helping people find music and you can steer them to an artist that they become really passionate about, I think it's a big win.
So what we wanna do is see if we can find artists that have the most passionate fans. And we'll sort of bucket this and say, is it, which genre has the most passionate fans? Is it dubstep or is it metalheads? So how are we gonna find out how passionate the fans are for a particular artist? There's probably lots of ways to do it,
but one really simple way is to just look at sort of the average number of plays per fan for a particular artist. And so we can sort of illustrate this. These numbers are fake. I don't mean to disparage any artists here. So here's two artists with a million plays. Robin Thicke on the left, he has 200,000 fans, which means every fan is playing his song five times.
So Blurred Lines made it into the playlist. They play it five times and that's it. Whereas on the other side, we have Lorde and she has 10,000 fans. And each of those fans is playing her songs 100 times. So obviously Lorde's fans are much more engaged. So she has these high passion fans.
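The arithmetic behind this comparison is a single division. The sketch below uses the same deliberately fake numbers from the example; the function name and the anonymous artist labels are placeholders.

```python
def plays_per_fan(total_plays, fan_count):
    """Average number of plays per fan for one artist."""
    return total_plays / fan_count

# (total plays, number of fans), using the fake numbers from the example.
artists = {
    "Artist A": (1_000_000, 200_000),
    "Artist B": (1_000_000, 10_000),
}

# Rank artists from most to least passionate fan base.
ranked = sorted(artists, key=lambda name: plays_per_fan(*artists[name]), reverse=True)
for name in ranked:
    print(name, plays_per_fan(*artists[name]))
```

Same total plays, but the smaller fan base ends up with 100 plays per fan against 5.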
So same number of plays, different number of fans. We can say which artist has highest passion fans. So we can do this for lots and lots of artists. Here's a plot here. X-axis is the number of fans. Y-axis is the number of plays. And we can sort of see most artists follow the same general arc, but we have some artists that have really high number of average plays per listener
and there's some artists that have low. So to answer our question, which artists have the most passionate fans, we can just look for the artists that are in the top here. And so, here's the question. Who has the most passionate fans? Raise your hand if you think it's, so the choices are dubstep or metal.
Raise your hand if you think it's metal. Have the, and do the slew too. Yeah, all right, yeah, all right, all right. It's a pretty good number, right? So who has the most passionate fans? So the artist with the most passionate fans is a band called In Flames with 115 plays per fan. All right, yeah? Yeah, see?
Kesha fans wouldn't clap. I didn't mean to pick on Kesha. Taylor Swift's fans wouldn't clap. Oh, they would, I'm sorry. So here's In Flames with a song that means I'm clapping. All right, that's pretty good. You can see why they have the passionate fans. So if we look at the top 20 artists,
nine out of the top 20 are metal bands and zero are dubstep. So indeed, metalheads are the most passionate fans according to this one small experiment. All right, so who has the least passionate fans? There's one band that stands out. They have only five plays per fan.
Who could it be? And so they're sort of this one-hit wonder band. They have a song that's in thousands and thousands and thousands of playlists. Everyone listens to it, old guys like me, when they go running. But then you never listen to anything else.
So they have a pretty low plays per fan. All right, so that's a quick tour through the Passion Index. All right, so playlists. You know, at Spotify in the last 10 years, people have created two billion playlists. That's a huge amount of data. Oh, it sounded like Trump, sorry.
So the question here is what can we learn from two billion playlists? And so one of the things we can do is just look at the most frequent names of playlists. So we look through all these two billion playlists, find the most frequently occurring names, and these are the top 10. So we have rap, chill, country, party,
house, workout, rock, gym, music, and road trip. So it's kind of funny people make playlists called music. Lots and lots of people make playlists called music. But you notice here, five of them are genre-related. Five are not genre-related, they're context. So people are actually making playlists around what they are doing as opposed to
the type of music that's in the playlist. So if you look at the top hundred playlists, 17 of the top hundred names of playlists are genre-related, and 41 are context-related. So this leads me to believe context is really the new genre. So people are organizing, they're listening around.
Oh, I'm going for a run, or I'm having a dinner party, or I need some focus music. So just to give you an idea of some of the names that people are using when they're creating playlists, here's a wall of words, but they're bucketed into different sections like training and workout. You know, lots of different types of playlists. We have mood: relax, motivation, 420.
We have travel, road trip on the fly, commute fly, yeah. We have romance, love. We have time, 420, focus. 420 should be highlighted there.
I apologize for that. All right, actually I was talking to some Rails guys from Weedmaps. Yeah, yeah, we had this great idea, sort of a joint effort Spotify Weedmaps playlist thing. All right, socializing, yeah, party, dance, pregame, 420.
So lots of different ways people are organizing their music. So context is really, really important, but context is really, really hard because our music does not come pre-labeled with the context that it's good to listen to. So our challenge here is to find music
for a particular context by mining tracks from these two billion existing playlists. And so to put it very specifically, let's see if we can create a playlist of mainstream tracks that are good for running for a 55 plus year old male like me. All right, so how are we gonna do this? Well, here's the app, but we'll just look at what we do.
We have two billion playlists. We start, we just start by searching for all the playlists that have running in the title. We may get about 10,000 of those playlists. We aggregate all the tracks from them, so all the tracks that occur in those 10,000 playlists, some are going to occur a lot more than others, so we aggregate them up so that the tracks that appear the most in all those playlists go to the top.
We sort of adjust for popularity so that distinctive running tracks come to the top. Then we can filter these by demographics so that tracks that are more likely listened to by a 55 year old get re-ranked higher. And this leads us to a playlist. So here's the 55 year old male running playlist.
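The steps above (search titles, aggregate tracks, adjust for popularity, re-rank by demographics) can be sketched as follows. Everything here is an assumption made for illustration: the function name, the input shapes, dividing by overall play count as the popularity adjustment, and a per-track `affinity` multiplier standing in for the demographic re-ranking.

```python
from collections import Counter

def context_tracks(playlists, keyword, global_plays, affinity=None, top_n=10):
    """playlists: list of (title, [track, ...]).
    global_plays: {track: overall play count}, used to damp tracks that are
    popular everywhere. affinity: optional {track: multiplier} for a
    demographic group (hypothetical). Returns up to top_n track names."""
    affinity = affinity or {}
    counts = Counter()
    for title, tracks in playlists:
        if keyword in title.lower():
            counts.update(set(tracks))  # count each track once per playlist
    scores = {
        track: count / global_plays[track] * affinity.get(track, 1.0)
        for track, count in counts.items()
        if global_plays.get(track)
    }
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

# Made-up playlists and play counts.
playlists = [
    ("Running mix", ["anthem", "hit"]),
    ("morning running", ["anthem", "hit"]),
    ("Dinner party", ["hit", "ballad"]),
    ("RUNNING 2016", ["anthem"]),
]
global_plays = {"anthem": 10, "hit": 100, "ballad": 20}
print(context_tracks(playlists, "running", global_plays))  # prints ['anthem', 'hit']
```

Passing a different `affinity` dictionary per demographic group is what would produce different playlists from the same set of running playlists.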
So here's what it sounds like. So that movie, Rocky, came out in 1975, so all the 55 year olds watched that when they were 16 years old.
It's still their big go-to running song. So since we're doing this nice demographic filtering, we can generate the same playlist but targeted for an 18 to 24 year old female. And we get a much different set of tracks. We can hear what they sound like. Right, so you notice that the tempo is much higher
for the female, the 18 to 24 year old, than the male. So we have a few others. We have road trip tracks. Here's a road trip track. And here's what a young woman would listen to.
Sexy time. I sort of backed myself into a corner here because I have three 18 to 24 year old daughters.
So I'm just going to play the first song from the next playlist. You can read the titles. I don't even want to know.
They can make their own decisions. All right, so after sexy time, it's breakup time. You notice the men, it's all about the crying. The 18 to 24 year olds, it's all about the F-you.
So that's a quick tour of mining playlists. The fun thing is we're rolling this kind of thing out into Spotify. You'll be able to get good playlist recommendations
based on things like titles real soon. All right, experiment number five, perhaps one of my favorite experiments of all. And this is using scrubbing data to find the best parts of a song. So people listen to music. Oftentimes they don't just press play, they actually scrub inside the song. They'll move and play the best part
or they'll skip the worst part of the song. So what happens if you aggregate this data across millions and millions of people who are listening to the same songs? Well, you start to get a map of the hot spots in the song, the best parts of the song. So here, let me sort of blow this up, are the most common places that people are scrubbing to
in the song In the Air Tonight by Phil Collins. Now, people actually don't scrub to their favorite parts. They usually scrub a little before that. So if we sort of integrate the actual listening part, we get another curve and that shows us the most listened to parts of the song as a result of scrubbing. So you guys probably figure out
what this is going to sound like, but let's listen anyway. We're gonna start at the most scrubbed-to spot and we'll listen through the actual most-listened-to spot. And we need the volume up just a little bit on this example. Very good. No, you don't fool me. The hurt doesn't show, but the pain still grows. It's no stranger to you and me.
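The two curves described for In the Air Tonight, where people scrub to and what they actually end up hearing, can be sketched as two histograms over the song's timeline. This is a minimal sketch, assuming each scrub event is logged as a hypothetical pair `(seek_to_seconds, listened_seconds)`; the event format and the name `scrub_curves` are not from the talk.

```python
import math
from collections import Counter


def scrub_curves(seek_events, bucket_seconds=1):
    """Build two per-bucket curves from scrub events.

    seek_events: iterable of (seek_to_seconds, listened_seconds) pairs,
                 an assumed log format for this example.
    Returns (landed, listened):
      landed[b]   = number of scrubs that landed in bucket b
      listened[b] = number of scrubs whose playback covered bucket b
    """
    landed = Counter()
    listened = Counter()
    for seek_to, duration in seek_events:
        first = int(seek_to // bucket_seconds)
        last = math.ceil((seek_to + duration) / bucket_seconds)
        landed[first] += 1
        # "Integrate" the listening: every bucket played after the scrub
        # gets one vote, which shifts the peak later than the landing spot.
        for bucket in range(first, last):
            listened[bucket] += 1
    return landed, listened


events = [(200, 30), (202, 25), (200, 20), (50, 5)]
landed, listened = scrub_curves(events)
print(landed.most_common(1))  # the most scrubbed-to second
print(max(listened, key=listened.get))  # a most listened-to second
```

Because people tend to land a little before their favorite part, the peak of `listened` sits slightly after the peak of `landed`, which matches the gap between the two curves in the talk.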
So, a lot of times when I do these experiments, you get some interesting results,
but these results just pop. Just look at a dubstep song like this. That peak, that very, very sharp peak, it's the drop. So it really turns into this really awesome
drop detector, which is, the world needs this because who wants to listen to all that, right? You just listen to this bit.
Imagine millions and millions of people scrubbing to the exact same spot.
Pretty awesome, but there's more. So, because we can look at the shapes of these peaks, some peaks are more prominent than others.
So we can say, well, find me the most prominent peaks across a whole genre. So we can actually find what is the most interesting rap in all of hip hop. So we have this little interface there. We can search based on prominence and find the most prominent scrubbed-to spot. So, for hip hop, this is the most prominent peak.
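One common way to put a number on how much a peak stands out is topographic prominence: the height of the peak above the higher of the two valleys that separate it from taller ground on either side. The sketch below assumes that definition and a plain list of per-second scrub counts per song; the talk does not say how prominence was actually computed, and the names `peak_prominences` and `most_prominent_spot` are made up for the example.

```python
def peak_prominences(curve):
    """Return {index: prominence} for every strict local maximum in curve."""
    result = {}
    for i in range(1, len(curve) - 1):
        if not (curve[i] > curve[i - 1] and curve[i] > curve[i + 1]):
            continue  # not a strict local maximum (plateaus are ignored)

        # Walk left until we meet something taller, tracking the lowest point.
        left_min = curve[i]
        j = i - 1
        while j >= 0 and curve[j] <= curve[i]:
            left_min = min(left_min, curve[j])
            j -= 1

        # Same walk to the right.
        right_min = curve[i]
        j = i + 1
        while j < len(curve) and curve[j] <= curve[i]:
            right_min = min(right_min, curve[j])
            j += 1

        result[i] = curve[i] - max(left_min, right_min)
    return result


def most_prominent_spot(curves_by_song):
    """Search a whole collection; return (song, index, prominence) or None."""
    best = None
    for song, curve in curves_by_song.items():
        for index, prominence in peak_prominences(curve).items():
            if best is None or prominence > best[2]:
                best = (song, index, prominence)
    return best


genre = {
    "song_a": [0, 2, 1, 9, 1, 3, 0],
    "song_b": [0, 5, 4, 6, 0],
}
print(most_prominent_spot(genre))  # ('song_a', 3, 9)
```

With real data a library routine such as SciPy's `find_peaks` with its `prominence` argument would likely be a better fit than this hand-rolled loop.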
I don't know if you're familiar with the song, but. Hey Fab, I'ma kill you. Lyrics coming at you with supersonic speed. Someone, I'm a demon, I'm a human, I'm a human. What I gotta do to get it through to you, I'm superhuman, innovative, innovative, rebel, something, anything you're saying, is picking, singing off of me. And it'll do to you, I'm devastating, more than ever demonstrating how to give a motherfucking audience a feeling like it's levitating, never fading, and I know the haters are forever waiting for the day that they can see. I fell off, they'll be celebrating cause I know the way to get them motivated.
What do you think is the biggest drop, quote unquote, from the 19th century? Back then, when they did the drop, it was with cannons,
not with the bass. And classic rock, it's all about the guitar solo. And so, what do you think is the greatest moment
in classic rock? Take a moment to think about it. It may be controversial, I don't know. Here it comes. All right, greatest scream in rock, you know what this is, right? So Rolling Stones, I was wondering,
when I sort of looked at the song, I had in my mind what I thought would be the best part. So it's my favorite part of the song, but I said, nah, I probably listen differently than everybody else, and that's not the case. The favorite part of Gimme Shelter is when the backup singer's voice cracks,
and it's amazing, that little red dot is right at that moment. So everyone loves that. All right, yeah, I think. All right, and for pop music, it's all about the chorus.
Everyone loves this. All right, so that's Hot Spot. Why do we care? Well, all sorts of reasons. I'm just looking forward to the day where we have a Play Me Just a Drop button on Spotify.
Coming soon. All right, now we're getting into the last bit of data. This is the acoustic data. This is treating our music like data. And so how do we do this? So this is, we do all sorts of things. We take audio, we can do signal processing on the audio, do machine learning around this,
and we can extract all sorts of things, basic things like tempo, key, the mode. We can figure out how danceable a song is, or how energetic it is, or how loud it is. And we can also get a really good map of where all the beats are, and the bars, and the tatums, and things like this. And then we can use this to do all sorts of things,
and all the next experiments are gonna be driven off of that. But I wanna take a little step back. Just sort of think, before we had recorded music, how people used to listen to music. You were always in the same room with the musician. And this means that the listening session was much more interactive. You could have eye contact with the musician,
you could shout "play the chorus again," or you could sit down and join them, depending on the venue. So music listening was really, really interactive. Now compare that to today, how we listen to music. There's no interactivity at all, except for pause and play, and maybe a little bit of scrubbing. So we wanna see if we can use some of this data
to help make music listening a little bit more interactive. And some of these experiments will be around that. All right, so the first one is trying to see whether a human or a machine is setting the tempo for your favorite song. So recently, drummers have been putting on headphones and listening to click tracks when they set the beat.
And so what we can do is try to figure out if the band you're listening to is actually using a click track or not. And this is a little app called In Search of the Click Track. What this is showing is a variation in tempo over the course of the song. And you can sort of see over the course of the song, we get some natural variations, four peaks. And this is because the band is speeding up
and slowing down. And that's not necessarily because Stewart Copeland, the drummer for the Police, is a bad drummer. He's actually using the changes in tempo to add tension to the song and release it, over and over. So let's take a little listen to hear what that sounds like.
So we can do a little math, and we can see this:
there's definitely natural variation here in the tempo. So this is definitely a human drummer. Now let's look at another song. This is a song by Britney Spears. You can see it's perfectly flat. Here's what it sounds like.
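The "little math" behind a click-track check could look something like the sketch below: turn beat timestamps into a beat-by-beat tempo curve and measure how much it wobbles. This is a guess at the approach, not the app's actual code; the names `tempo_curve` and `looks_like_click_track` and the 1% threshold are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def tempo_curve(beat_times):
    """Convert beat timestamps (seconds) into instantaneous tempo (BPM)."""
    return [60.0 / (later - earlier)
            for earlier, later in zip(beat_times, beat_times[1:])]


def looks_like_click_track(beat_times, max_variation=0.01):
    """True when the tempo is nearly flat over the whole song.

    max_variation is an assumed cutoff on the coefficient of variation
    (standard deviation divided by mean) of the tempo curve.
    """
    bpm = tempo_curve(beat_times)
    if len(bpm) < 2:
        raise ValueError("need at least three beats to judge variation")
    return pstdev(bpm) / mean(bpm) <= max_variation


# A machine: a beat exactly every half second (120 BPM).
machine = [i * 0.5 for i in range(32)]

# A human: the gap between beats drifts a few percent either way.
human = [0.0]
for gap in [0.5, 0.52, 0.48, 0.55, 0.45] * 6:
    human.append(human[-1] + gap)

print(looks_like_click_track(machine))  # True
print(looks_like_click_track(human))    # False
```

A flat curve only says the tempo never moved; it cannot tell a drum machine from a drummer locked to a click, which is why the verdict is phrased as "probably a machine".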
So no tempo variation, probably a machine drumming and maybe even singing. All right, sorry, I don't mean to disparage any artists. It's just a funny joke. All right, so finding the most dramatic moments in music.
So we looked at hotspots, finding interesting moments, but I'm also really interested in finding dramatic moments. I really love dramatic music, music that goes from real quiet to real loud in 30 seconds or so. And so that's what this is. It's an app called Where's the Drama?
And the idea is you only wanna listen to the dramatic parts because that's the best part of the song. So it will find the most dramatic bits just by looking at the loudness contour, highlight it and you can hit the button and say, play me the drama. And here's what it sounds like. You put your lighter up there.
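Finding the drama from a loudness contour can be sketched as a search for the biggest quiet-to-loud rise that fits inside a window of about 30 seconds. The sketch assumes one loudness value (in dB) per second; the name `find_drama` and the brute-force search are illustrative assumptions, not how Where's the Drama? is actually implemented.

```python
def find_drama(loudness, max_window=30):
    """Find the largest rise in loudness within max_window samples.

    loudness: list of loudness values, assumed to be one per second.
    Returns (start_index, end_index, rise); rise is 0.0 if nothing rises.
    """
    best = (0, 0, 0.0)
    for start in range(len(loudness)):
        stop = min(len(loudness), start + max_window + 1)
        for end in range(start + 1, stop):
            rise = loudness[end] - loudness[start]
            if rise > best[2]:
                best = (start, end, rise)
    return best


# Ten quiet seconds, a ten-second build of 2 dB per second, then loud.
contour = [-30] * 10 + [-30 + 2 * i for i in range(1, 11)] + [-10] * 5
start, end, rise = find_drama(contour)
print(start, end, rise)  # 0 19 20
```

Ties go to the earliest start, so the result includes the quiet lead-in before the build, which is arguably what makes the loud part feel dramatic.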
So by the way, I think Evanescence is Sam Phippen's, the RSpec guy's, favorite band. So I make sure we have lots of Evanescence examples. All right, so next one. This is something built at a Music Hack Day,
sitting right next to Sam, by the way. And this is my attempt at doing something creative. So yeah, I'm coding, but I think back to the time when I was a musician and I could touch somebody emotionally. So this is my attempt at doing this. So since we know where all the bars and beats are and where the song gets exciting and where it falls off and where all the phrases are,
we can do a much better visualization that accompanies the music, much better than sort of the VU meter kind of visualizations you may have seen in Winamp or something like that. So this is an app that's written in three.js, runs in the browser, and it's really just an accompaniment for a song by Ellie Goulding. And this is a small excerpt.
["We Can Light It Up"] We can light it up So thick I'll put it out, out, out We can light it up, up, up So thick I'll put it out, out, out We can light it up, up, up
So thick I'll put it out, out, out We can light it up
We got the power And we gonna let it burn, burn, burn
Posted on the web. I'm a 55 year old man who had just created some Ellie Goulding fan art. So I figured in for a penny, in for a pound. So I found my way to an Ellie Goulding fan art site, of which there are apparently quite a few. But I found what I thought was the biggest one,
crafted a forum post and posted a link to there. And nobody liked it, so. All right, so next, automatically remixing your music. So, you know, a song is just a list of beats and bars and stuff. So what if you could manipulate these lists of beats and bars just like you manipulate any list
in your programming language. And so that's what this remix library is. Oh look, Python code. So here's an example of using the remix library. There are Python and JavaScript bindings for manipulating music algorithmically.
So here's a six line program. It's gonna take a song, Bad Romance by Lady Gaga, find the first beat in every bar and render it out into a new song. And for those who can't read Python, I'll do a little visual for you. Take the first beats and line them up like that. All right, so what does it sound like?
It sounds surprisingly musical, doesn't it?
What if you wrote three lines of code to put those beats in reverse order? What would it sound like? Well, it would look like this, sorry for the non-Pythoners. I have a theory that you can put Lady Gaga in any order, and it doesn't matter.
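The slides are not in this transcript, so here is a stand-in for the two remixes just described: keeping the first beat of every bar, and playing all the beats in reverse. It treats a song as plain Python lists, in the spirit of the talk, but it does not use the actual remix library; `Beat`, `first_beat_of_every_bar`, `reverse_beats`, and `render` are all hypothetical names, and the "audio" is just a list of samples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beat:
    start: float     # seconds from the start of the song
    duration: float  # seconds


def first_beat_of_every_bar(bars):
    """bars is a list of bars, each bar a list of Beat objects."""
    return [bar[0] for bar in bars if bar]


def reverse_beats(bars):
    """Flatten the bars into one list of beats, then reverse it."""
    return [beat for bar in bars for beat in bar][::-1]


def render(samples, sample_rate, beats):
    """Concatenate the audio slices for the given beats, in order."""
    out = []
    for beat in beats:
        begin = round(beat.start * sample_rate)
        finish = round((beat.start + beat.duration) * sample_rate)
        out.extend(samples[begin:finish])
    return out


# Two bars of four half-second beats; a toy "recording" with one
# sample per beat so the result is easy to read.
bars = [[Beat(0.5 * (4 * b + i), 0.5) for i in range(4)] for b in range(2)]
samples = list(range(8))

print(render(samples, 2, first_beat_of_every_bar(bars)))  # [0, 4]
print(render(samples, 2, reverse_beats(bars)))  # [7, 6, 5, 4, 3, 2, 1, 0]
```

The point of the design is that once beats and bars are ordinary lists, any list operation (slicing, reversing, shuffling, filtering) becomes a remix.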
So I have my youngest daughter, she was listening to me prepare this talk, and she'd
say, dad, dad, dad, do you like Lady Gaga? I'd say, yeah, I like some of her stuff, mainly the beat ones, bad joke, sorry. All right, next experiment now. So we know where all the beats are in a song, well, let's see if we can improve a song by changing the drummer, and that's the idea behind the Bonhamizer.
So the idea here is that any song could be improved if John Bonham was the drummer. So we know where all the beats are in a song, we have some outtakes from John Bonham, so we just need to align those beats and play some tricks, because songs are never the same tempo, and the real trick is to make sure that the drums stop at the
right time and start at the right time, otherwise it sounds just like a machine. So the idea here is to Bonhamize it, here's a before, before, not Bonhamized. Some nights I stay up, cashing in my bad luck, some nights I call it a draw. All right, now Bonhamized.
Some nights I stay up, cashing in my bad luck.
Who's one of the founders of the Echo Nest, which is the company that got bought by Spotify and why I'm at Spotify. So the idea here is to make your favorite song swing, so swinging a song is, you know, instead of going dun, dun, dun, dun, it goes da, da, da, da, da, da, da, da. So you just need to stretch the first half of the beat and shrink the last half
and you get a nice, swung song. So let's hear what this sounds like. Here's Unswung. Here's a swung version.
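The stretch-and-shrink idea can be written down as a plan of time-stretch ratios, one pair per beat. This is a sketch only: the name `swing_plan` and the default swing amount of one third are assumptions, and the actual audio time-stretching (which needs a DSP library) is left out.

```python
def swing_plan(beats, swing=1 / 3):
    """Plan how to swing a song.

    beats: list of (start_seconds, duration_seconds) pairs.
    swing: assumed amount, 0 leaves the song straight.
    Returns a list of (source_start, source_duration, stretch_ratio);
    a ratio above 1 means play that slice slower, below 1 means faster.
    """
    plan = []
    for start, duration in beats:
        half = duration / 2
        plan.append((start, half, 1 + swing))         # stretch the first half
        plan.append((start + half, half, 1 - swing))  # shrink the last half
    return plan


for source_start, source_duration, ratio in swing_plan([(0.0, 0.6), (0.6, 0.6)]):
    print(source_start, source_duration, ratio)
```

Because the stretch and the shrink cancel out within each beat, every beat still starts on time and the song keeps its overall length; only the off-beat moves later.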
This next one is called the Infinite Jukebox. This may be my most well-known hack. Anybody ever heard of the Infinite Jukebox? Anybody? All right. So the idea of the Infinite Jukebox is for when your favorite song is not long enough. So you all have those songs you wish would go on forever and that's what Infinite Jukebox does.
The way it does this: we know where all the beats are in the song, we know what the beats sound like, so we can find beats that actually sound very, very similar. So when we're playing back the song, instead of just playing back in sequential order, we can jump to a similar sounding beat. And we do this a little bit randomly and so we end up with a song that plays forever but it's always changing. So it's not just a simple loop.
So here, around the outside of the circle are all the beats, they're sort of colored by the timbres, and then the beats that sound most similar are connected up. And so as I play through the song, you'll see sometimes it will flash green on those arcs and that's when we're jumping to a different part of the song. So, here we go.
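The jump logic can be sketched as a generator that never ends. This is a simplified illustration, not the Infinite Jukebox source: each beat is reduced to a small feature vector (standing in for its timbre), similarity is plain Euclidean distance, and the names `similar_beats` and `infinite_play`, the distance threshold, and the jump probability are all assumptions.

```python
import math
import random
from itertools import islice


def similar_beats(features, threshold):
    """Map each beat index to the other beats that sound close to it."""
    neighbors = {i: [] for i in range(len(features))}
    for i, a in enumerate(features):
        for j, b in enumerate(features):
            if i != j and math.dist(a, b) <= threshold:
                neighbors[i].append(j)
    return neighbors


def infinite_play(features, threshold=1.0, jump_chance=0.2, seed=None):
    """Yield beat indices forever, occasionally jumping between soundalikes."""
    rng = random.Random(seed)
    neighbors = similar_beats(features, threshold)
    current = 0
    while True:
        yield current
        # Normally play the next beat; wrap around at the end of the song.
        upcoming = (current + 1) % len(features)
        # Sometimes play a beat that sounds like the upcoming one instead,
        # so the jump is hard to hear but lands elsewhere in the song.
        if neighbors[upcoming] and rng.random() < jump_chance:
            upcoming = rng.choice(neighbors[upcoming])
        current = upcoming


# Four toy beats: 0 and 2 sound alike, 1 and 3 sound alike.
features = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0), (5.0, 5.1)]
print(list(islice(infinite_play(features, threshold=0.5, seed=1), 12)))
```

The arcs in the visualization correspond to the entries in `neighbors`, and an arc flashing green corresponds to the branch where the random check succeeds.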
Because it made me actually feel like I created something beautiful,
which is something you very rarely get to do as a programmer. And this is called the auto-canonizer. And the idea of the auto-canonizer is to turn any song, well maybe not any song, but lots of songs into a canon. So you know what a canon is? It's a song like Row, Row, Row Your Boat where you can play the song against an offset copy of itself and it sounds pleasant.
So what the auto-canonizer does is it plays the song straight through but then it finds other parts of the song that would mesh well with the first voice and tries to generate a pleasing version of the song. So in this example, we're going to be playing the song Over the Rainbow by the Hawaiian singer Iz.
So this is not a duet, but you'll hear him singing it as a duet. So we can make this deceased singer sing a duet with himself. So here's what it sounds like.
It brings us to the final experiment, and this is called Girl Talk in a Box.
And the idea here is to turn a song into a musical instrument. Girl Talk is a remix artist. And so the idea is we can take a song, we can break it down into its beats, and then you can interact with it.
It all runs in the browser. And I am going to attempt a live performance for you of, oh look, that was just sitting in the background. I probably should have written some tests. Alright, I have no tests. So here's a song, Fancy, featuring, or whatever.
So all these are the beats in the song, and I can interact with this in all sorts of ways. So first of all, I can just sort of play the song, and you see the colors change on each beat. But I can stop the song. I can make it play backwards.
Or I can play it forward every other beat. Or I can just click around in different beats. This song is really, really rich. Every beat is filled with totally different sonic stuff. So it becomes a great musical instrument.
So I can bookmark these things. So I can bookmark things. That's how it works. So now I'm going to try to play it as a song. Here we go. This is it. I've practiced this quite a bit.
And I'm doing this really because I've never performed in front of a thousand people before. Even when I was playing trombone. Alright. I'm quite nervous. More nervous about this than any other part of the day. And I'll screw it up. Alright. Here we go.
Just to generalize to be able to work with the things that I love most. Music and code. But all of you are working in some domain. And I really strongly encourage you to get to know your data. Love your data. Play with it. Do experiments with it.
Build crazy stuff. Some of the stuff will just be crazy. But some of it actually becomes really, really useful. So love your data. Hack your data. So thank you very much. All of these experiments are online. You can go and try them out. So you've been a great audience. Special applause for our signers.