Data Analysis on my iTunes Library

You can export an XML file from iTunes that contains all the data from your music library. Today I’ve been playing around with this Perl script that reads the file and exports it as an HTML page. What I really wanted was a starting point to help me crunch some numbers. So far it’s been working great, and I used it to find my top ten albums.
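For anyone who wants to poke at the export without the Perl script, here is a minimal sketch in Python. It relies on the exported file being a property list with a "Tracks" dictionary whose entries carry keys such as "Artist", "Album", "Rating" and "Play Count". The function names `load_tracks` and `group_by_album` and the sample data are made up for illustration; this is not the script mentioned above.

```python
import plistlib
from collections import defaultdict


def load_tracks(xml_bytes):
    """Parse an exported iTunes library (a property list) and return its track dicts."""
    library = plistlib.loads(xml_bytes)
    # The export keeps tracks in a dict keyed by track ID under "Tracks".
    return list(library.get("Tracks", {}).values())


def group_by_album(tracks):
    """Group tracks under an (artist, album) key, skipping tracks with no album."""
    albums = defaultdict(list)
    for track in tracks:
        album = track.get("Album")
        if album is None:
            continue
        albums[(track.get("Artist", "Unknown"), album)].append(track)
    return dict(albums)


# Tiny made-up library, built in memory so the sketch runs without a real export.
sample = plistlib.dumps({
    "Tracks": {
        "1": {"Name": "Song A", "Artist": "Artist X", "Album": "Album One",
              "Rating": 80, "Play Count": 5},
        "2": {"Name": "Song B", "Artist": "Artist X", "Album": "Album One",
              "Rating": 100},
        "3": {"Name": "Song C", "Artist": "Artist Y", "Album": "Album Two",
              "Play Count": 2},
    }
})

albums = group_by_album(load_tracks(sample))
for (artist, album), tracks in sorted(albums.items()):
    print(artist, "-", album, "-", len(tracks), "tracks")
```

With a real export, you would read the file in binary mode and pass its bytes to `load_tracks`. Note that unrated or never-played tracks may simply lack the "Rating" or "Play Count" key, so any later number crunching should use `track.get(...)` with a default.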

The formula I used to calculate each rating was as follows. (Note: You can rate each song 0 through 5, but when you export the file it shows up as 0 through 100 with 1 star = 20, 2 stars = 40, etc.)

Rating = ((sum of the track ratings / number of tracks) + sum of the track ratings + (number of times played / 3)) / 2000

This won’t be the final formula I use, but the eventual one will resemble it. I want it to reward albums with many good songs, but also the occasional track-heavy album that is half good songs and half bad ones. I also want to give a boost to any album that is heavily played (note: a play only counts if it happened on my computer or my iPod, and I only recently got an iPod).
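The formula is easy to try out in code. The sketch below is one reading of it in Python, with two assumptions that the post does not spell out: "number of times played" is the total play count across the album's tracks, and unrated tracks count as a rating of 0. The names `album_score` and `stars_to_export` are hypothetical.

```python
def stars_to_export(stars):
    """Convert a 0-5 star rating to the 0-100 value used in the exported file."""
    return stars * 20


def album_score(ratings, play_count):
    """Score an album from its exported track ratings (0-100) and total plays.

    Assumes play_count is summed over the album's tracks and that
    unrated tracks appear in `ratings` as 0.
    """
    if not ratings:
        return 0.0
    total = sum(ratings)
    average = total / len(ratings)
    return (average + total + play_count / 3) / 2000


# Two made-up albums with the same number of plays:
short_great = album_score([stars_to_export(5)] * 10, play_count=30)
long_good = album_score([stars_to_export(4)] * 22, play_count=30)

print(f"10 tracks, all 5 stars: {short_great:.2%}")  # 55.50%
print(f"22 tracks, all 4 stars: {long_good:.2%}")    # 92.50%
```

The longer four-star album comes out well ahead of the shorter five-star one, because the sum of the ratings grows with the track count while the average does not.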

Here are the results:

| Album | Year | Disc Number | Tracks | Tracks Played | Rating |
| Beck – Random Stuff | ???? | 1 | 22 | 96 | 108.20% |
| Frank Black – Teenager Of The Year | 1994 | 1 | 22 | 110 | 103.20% |
| Liz Phair – Exile in Guyville | 1993 | 1 | 18 | 99 | 96.65% |
| Pixies – Complete ‘B’ Sides | 2001 | 1 | 19 | 120 | 87.25% |
| The Breeders – Last Splash | 1993 | 1 | 15 | 122 | 78.80% |
| Elastica – Elastica | 1995 | 1 | 16 | 140 | 78.80% |
| Frank Black – Frank Black | 1993 | 1 | 15 | 54 | 77.70% |
| Beck – One Foot In The Grave | 1994 | 1 | 16 | 83 | 76.80% |
| Pixies – Trompe Le Monde | 1991 | 1 | 15 | 124 | 76.70% |
| Pixies – Doolittle | 1989 | 1 | 15 | 100 | 74.20% |

Clearly, this algorithm favors good, long albums that I’ve played a lot. Teenager of the Year and Exile in Guyville fit that description well. The random Beck album I threw together from various downloads makes sense since I assembled the tracks myself!
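To see why long albums do so well, it helps to split the score into its three terms and look at how much each one contributes. The following Python sketch does that for a made-up album of 22 tracks, all rated four stars (80 in the export), with 110 total plays. The helper name `term_shares` is hypothetical, and treating the play count as an album-wide total is an assumption.

```python
def term_shares(ratings, play_count):
    """Return each term's share of the formula's numerator (shares sum to 1)."""
    if not ratings:
        return {}
    total = sum(ratings)
    terms = {
        "average rating": total / len(ratings),
        "sum of ratings": total,
        "plays / 3": play_count / 3,
    }
    whole = sum(terms.values())
    if whole == 0:
        return {name: 0.0 for name in terms}
    return {name: value / whole for name, value in terms.items()}


# Hypothetical album: 22 tracks at four stars each, 110 plays in total.
shares = term_shares([80] * 22, 110)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

For this example the sum of the ratings makes up well over 90 percent of the total, so the track count matters far more than the average rating or the play count.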

The only shocker on here is that Trompe Le Monde is rated higher than Doolittle. Ha! Six of the ten are Pixies or ex-Pixies, too. Something might be up with the Pixies B-Sides album, since I don’t think it deserves a spot in the top ten. My ratings must be off.


2 responses to “Data Analysis on my iTunes Library”

  1. very cool!

  2. JoanneG

    you are a very sick man.