17 Mar 2008, 22:45

UPDATE: This is currently DOWN because the APIs it relies on are BROKEN. If that annoys you, you can make your comment here:

Since my original journal entry about feeds for national charts about 15 months ago, the data to make the calculations has become available at . I've been busy making a script to compare the tastes of different nations using the same formula as before, but this time pulling the data in live from XML.

Although a table of results is good, a map is probably a better way of visualising the data, so each country now gets a spot on a Google map. Clicking a spot brings up that country's data.

I also got to thinking that maybe matching people to countries would be more interesting than matching countries to countries (opinions welcome).

You can take your pick of either version:

Match countries to countries

Match countries to users

Some limitations:

There is no feed for "Congo". Not much I can do there!
Update: They've fixed that now. Thanks to brtkrbzhnv for the heads up.

The feeds only cover the top 50 artists at the moment, so the results are a less accurate reflection of reality than I'd like. If the feeds are increased in size, the maps will update on their own.

The spots might not be in the right places - locating them all was a fairly painful process and some errors may have slipped in. Msg me if you spot any howlers!

The blobs aren't as nice looking as coloured areas (polygons) would be, but the thought of creating polygons for 240 countries makes me feel a bit ill. The dots load up nice and fast too so they're probably staying for now.
  • SEoD

    Haha Solak.. you should see where it dumped me! The country-to-country matching seems to work better than country-to-person because it's matching 2 large sets of data to each other. Statistically, a single person's listening data is so small in comparison to a whole country's that the match doesn't seem quite as good (except for marshee, who got a good match with his beloved Scandinavia!). I think the same thing explains why a lot of African countries have very low matches across the board: they don't have very many listeners. Only matching the top 50 doesn't help either, because it doesn't give a very rich picture of the country's or person's tastes. Like I say, the app will automatically pick up the extra artists if the feeds are improved. I do like the chart for Australia though - English-speaking nations come top, then Western Europe, then Eastern Europe, then the rest - quite close to how you might guess it would be.

    22 Mar 2008, 13:32
  • per_automatik

    Really interesting! Great work, love it that I was dumped in Tongo or something. How about doing the comparison between countries and persons based on the percentage difference in their charts, so that each artist in each chart has a value (starting from the top artist at 100%)? After that it shouldn't matter whether it's 5,000,000 or 200 when both are 100%. Or am I all wrong?

    27 Mar 2008, 23:19
  • SEoD

    Hi per_automatik, I think it might already be working how you describe... From the top 50 artists for a country, it works out the percentage weight of each artist. So (simplifying to only 5 artists) Radiohead 10, Blur 5, Inspiral Carpets 3, Stone Roses 1, Oasis 1 gives a total of 20 points, meaning Radiohead is worth 0.5 (50%), Blur is worth 0.25 (25%), etc. The same calculation is done for the person's chart, and where there is a matching artist, the lower of the 2 scores is taken. When all of those scores are added together, the result has a maximum of 100% (identical artists, identical weighting) and a minimum of 0% (no matching artists at all). I hope this makes some sense.. are you proposing something different?

    27 Mar 2008, 23:34
  • per_automatik

    How could you make sure they're graded for identical weighting? If (player 1) has Blur 1pt, Radiohead 10pt and (player 2) has Blur 10pt, Radiohead 1pt, won't they be a perfect match anyway if their values were added together?

    28 Mar 2008, 1:11
  • SEoD

    Say we are dealing with only the top 2 artists, as in your example:

    Score for Blur: player 1 = 1/(1+10) = 0.090909; player 2 = 10/(10+1) = 0.909091. The lowest score (0.090909) counts.

    Score for Radiohead: player 1 = 10/(1+10) = 0.909091; player 2 = 1/(10+1) = 0.090909. The lowest score (0.090909) counts.

    Total of both scores = 0.090909 + 0.090909 = 0.181818

    Therefore the match is approx. 18%.

    28 Mar 2008, 2:13
  • per_automatik

    Worth thinking about. Do you think 18% is a reasonable result for such a match?

    28 Mar 2008, 23:56
  • SEoD

    Obviously any measure of similarity is going to be a bit arbitrary beyond making sure 0% and 100% are in the right place, so it's hard for me to say what's right and wrong. I don't feel a result of 18% is all that bad for those charts given that the two entities rate the two bands they like so differently (roughly an order of magnitude difference in both cases). For the purposes of displaying the data as colours on a map it's probably good enough! If you'd seen some of the forum posts from a few years ago about the best formulae to calculate similarity, you would probably favour keeping it simple too :)

    29 Mar 2008, 0:38
  • SEoD

    MunchyBrain: In case you didn't find it, it's just off the edge of the map when it loads! Tokelau is here!

    29 Mar 2008, 17:01
  • SEoD

    Thanks for that spot RWP. When I was getting the locations of the different countries, they tended to default to the north-south and east-west centre of the country. For crescent-shaped countries like Croatia that isn't inside the country's borders! I'll have it fixed soon.

    10 Apr 2008, 6:35
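
The matching formula SEoD describes in the comments above (each artist's share of the chart total, then the lower of the two shares for every artist both charts have in common, summed) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `shares` and `match` are made up here, the charts are plain dictionaries of artist to points rather than the live XML feeds, and the original script may well differ in its details.

```python
def shares(chart):
    """Turn raw points per artist into each artist's fraction of the total."""
    total = sum(chart.values())
    if total == 0:
        return {}  # an empty chart has no shares to compare
    return {artist: points / total for artist, points in chart.items()}


def match(chart_a, chart_b):
    """Sum, over artists present in both charts, the lower of the two shares.

    The result is 0.0 when no artists overlap and 1.0 when both charts
    contain the same artists with the same weighting.
    """
    a, b = shares(chart_a), shares(chart_b)
    return sum(min(a[artist], b[artist]) for artist in a.keys() & b.keys())


# The five-artist chart from the comments: Radiohead is worth 0.5, Blur 0.25.
country = {"Radiohead": 10, "Blur": 5, "Inspiral Carpets": 3,
           "Stone Roses": 1, "Oasis": 1}
country_shares = shares(country)

# per_automatik's two-artist example, which comes out at roughly 18%.
player_1 = {"Blur": 1, "Radiohead": 10}
player_2 = {"Blur": 10, "Radiohead": 1}
score = match(player_1, player_2)
print(f"{score:.6f}")
```

Taking the lower share for each shared artist means two charts only score well where they agree on how much an artist matters, which is why swapping the weights of Blur and Radiohead yields about 18% rather than a perfect match.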
