Hung Truong: The Blog!

Shuffling and Randomizing Algorithms for Music Playlists

June 06, 2009 | 2 Minute Read


I’ve been messing around with iTunes and its DJ functionality. It seems like all it does is pick random songs out of the library and show them in the order they’ll be played (unlike shuffle, which just randomly jumps around). Typically, I like to put my iTunes library on random when I’m listening. This lets me hear a bunch of different music in my library. If I don’t like a certain song, I’ll skip it, unless I’m not really paying attention.

I find that the most important piece of metadata in my iTunes library is probably “play count.” This is a pretty good indicator of how much I like a song. It’s a bit off sometimes though, since I might really like a new song with a lower play count because I haven’t had a chance to listen to it 80 times. I use a smart playlist that sorts on “play count” to determine which songs to stick into my iPhone on sync since the phone can’t hold my entire music library. Generally it works well. Perhaps there could be another measure like “normalized play count” that takes into account how long the song has existed in my library.
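To make the “normalized play count” idea a bit more concrete, here’s a rough sketch in Python. It just divides the play count by the number of days the song has been in the library, giving plays per day. The function name and the inputs (a play count, a date added, and today’s date) are my own assumptions for illustration; this isn’t anything iTunes actually exposes in this form.

```python
from datetime import date


def normalized_play_count(play_count, date_added, today):
    """Plays per day since the song was added (hypothetical measure).

    Songs added today are treated as one day old to avoid dividing by zero.
    """
    days_in_library = max((today - date_added).days, 1)
    return play_count / days_in_library


# An old favorite versus a new song with far fewer total plays.
old_favorite = normalized_play_count(80, date(2008, 6, 6), date(2009, 6, 6))
new_song = normalized_play_count(10, date(2009, 5, 27), date(2009, 6, 6))
print(old_favorite < new_song)
```

With these made-up numbers the new song comes out ahead, even though its raw play count is much lower, which is the behavior I’d want from the measure.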

The iTunes DJ is pretty lacking in terms of how you can weight what will randomly show up next. Basically, all you can do is check a box that says “play higher rated songs more often.” I don’t rate my songs (the metadata gets thrown away pretty quickly as I move from computer to computer or Mac to PC, etc.), so this feature doesn’t do a lot for me. I prefer the implicit rating (play count) to the explicit rating (star rating) because the implicit one happens naturally and doesn’t require me to do anything extra.

I’d like there to be a “play songs with a higher play count more often” feature. This could be bad, though, because it’d lead to a sort of rich-get-richer deal, where the songs that already get played the most keep pulling further ahead. So weighting would be important. I’d say it’d be a good heuristic to give each song a probability of “(play count + 1) / (total number of library plays + number of songs)” of being played. That way the more popular songs (the ones I like more) are played more often, but other songs still have a chance to be played as well (and skipped). There might be other, better algorithms for weighting songs based on play count that don’t lead to an unnatural skew (which would defeat the point of having the feature in the first place).
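Here’s a minimal sketch of that heuristic in Python. It assumes the library is just a dictionary mapping song titles to play counts (a made-up structure, not how iTunes stores anything), and the function names are hypothetical. Each song gets a weight of play count + 1, and since the weights add up to total plays + number of songs, the chance of picking any one song matches the formula above.

```python
import random


def selection_probabilities(play_counts):
    """Return each song's chance of being picked: (plays + 1) / (total plays + songs)."""
    total = sum(play_counts.values()) + len(play_counts)
    return {song: (count + 1) / total for song, count in play_counts.items()}


def pick_next_song(play_counts, rng=random):
    """Pick one song at random, weighted by play count + 1."""
    songs = list(play_counts)
    weights = [play_counts[song] + 1 for song in songs]
    # random.choices normalizes the weights, so no need to divide by the total.
    return rng.choices(songs, weights=weights, k=1)[0]


# Hypothetical three-song library.
library = {"Old Favorite": 80, "Never Played": 0, "Pretty Good": 18}
print(selection_probabilities(library))
print(pick_next_song(library))
```

In this toy library the never-played song still has a 1 in 101 chance of coming up, so it isn’t locked out, while the favorite gets picked about 80% of the time. That also shows how strong the skew can get, which is where a gentler weighting (say, based on the square root or log of the play count) might be worth trying.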

Another thing I could do is prune all of the songs I skip most out of my library. I’m too much of a digital packrat to do that, though, so I guess a smarter algorithm will have to suffice.