iPod shuffle, or more generically, randomness

I ran across an article discussing randomness, specifically as it relates to the iPod, both the high-capacity units and the iPod Shuffle. The article took us right to the point of getting interesting, and then, boom, it was done. The basic idea is that people are seeing patterns in the shuffle and crying foul. The author cites Paul Kocher, president of Cryptography Research, as saying that “Our brains aren’t wired to understand randomness,” but then leaves the topic to die on the vine.

I think the reason we see patterns where there is only randomness is that we tend to treat the group as a whole and look for causality, rather than seeing individual, (almost) completely unrelated transactions. In card games of chance, for instance, most people seem bound to the idea that what has come before somehow affects what will come after. This is true to some extent if the sample size is small enough, but in Las Vegas there are so many decks of cards mixed together that taking one card out of play does little to change the chances of that card showing up again.

A more concrete example would be coin tosses. When you ask the average person on the street what the chances are for heads or tails in a coin toss, you will get an immediate, no-hesitation answer of 50/50. (As an aside, it actually is not exactly 50/50, but that is another story entirely.) But ask the same person what the chances are of getting heads after getting heads 5 times in a row, and you get some hesitation, some confusion, and (usually) an eventual admission that there is still an even chance of getting heads vs. tails. This is usually followed by a strongly stated claim that the chances of getting 6 heads in a row are astronomically small and very unlikely, and this is where the train leaves the rails. Because in actual fact, since there is no causality between tosses, the chances of getting 6 heads in a row are exactly the same (1 in 64, or 1 in 2⁶) as the chances of getting 1 head, then 2 tails, then 2 heads, then another tail. Note that I implied a chronological ordering of my second, seemingly more random set; I did not claim that the chances of 6 heads in a row are the same as the chances of getting 3 heads and 3 tails with no ordering involved. The statement only works if order matters, which I think is where many people get thrown into a tizzy.
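You can check this by brute force. Here is a small sketch that enumerates every equally likely outcome of six fair tosses and compares the ordered sequences (the specific sequences are just illustrative choices):

```python
from itertools import product

# Enumerate all 2**6 = 64 equally likely outcomes of six fair coin tosses.
outcomes = list(product("HT", repeat=6))
assert len(outcomes) == 64

# Any one specific ordered sequence occurs exactly once among the 64.
p_six_heads = sum(1 for o in outcomes if o == tuple("HHHHHH")) / len(outcomes)
p_mixed = sum(1 for o in outcomes if o == tuple("HTTHHT")) / len(outcomes)
print(p_six_heads, p_mixed)  # both 0.015625, i.e. 1 in 64

# The unordered question is different: 3 heads in *any* order is far likelier,
# because 20 of the 64 sequences contain exactly three heads.
p_three_heads_any_order = sum(1 for o in outcomes if o.count("H") == 3) / len(outcomes)
print(p_three_heads_any_order)  # 0.3125, i.e. 20 in 64
```

The last line is exactly the order-matters distinction: "some sequence with 3 heads" pools 20 outcomes together, while "6 heads in a row" and "H, T, T, H, H, T" each name a single outcome.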

In the case of the shuffle, the idea is to play each song once and only once, before doing a reshuffle. To quote the article,

More specifically, when an iPod does a shuffle, it reorders the songs much the way a Vegas dealer shuffles a deck of cards, then plays them back in the new order. So if you keep listening for the week or so it takes to complete the list, you will hear everything, just once.
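Apple has never published the actual algorithm the iPod uses, but the textbook way to do what the article describes, a dealer-style reorder where every ordering is equally likely, is the Fisher-Yates shuffle. A minimal sketch:

```python
import random

def fisher_yates_shuffle(items):
    """Return a uniformly random reordering of items (classic Fisher-Yates).

    Assuming an unbiased random number generator, every one of the n!
    possible orderings is equally likely."""
    deck = list(items)
    for i in range(len(deck) - 1, 0, -1):
        # Swap position i with a random position from the not-yet-fixed prefix.
        j = random.randrange(i + 1)
        deck[i], deck[j] = deck[j], deck[i]
    return deck

playlist = fisher_yates_shuffle(range(500))
# Every song appears exactly once; only the order changes.
assert sorted(playlist) == list(range(500))
```

Playing `playlist` front to back gives you exactly the behavior the article describes: every song exactly once, in a fresh random order.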

What people do not seem to understand is that where they see patterns (for example, a couple of songs from the same artist in a row), they are actually seeing randomness. To prevent such an occurrence, the shuffle algorithm would have to reduce the randomness by taking the artist and album into account, and trying to spread songs from a particular artist or album evenly across the ordering. This is the other place people get confused: they expect that random == even.

To see this a little more clearly, let's take an example of a collection of 500 songs, where each album has 10 songs, and we will ignore artists for now (partially because artists cannot stand to be ignored). If I were writing the randomness algorithm, I would index all the songs, so the first song on the first album has index 0, the second song on the first album has index 1, the first song on the second album has index 10, etc. (Sorry for the 0-indexing, I just cannot help it.) Let's say the first song that gets picked is 58. What are the chances that the next number is 59? 1 in 499. What are the chances that the next song picked is 487? 1 in 499. But people are looking for ordering in the randomness, so let's look at songs from a particular album. In this case the chances of the second song coming from the same album as the first are 9 in 499, or about 1.8%, meaning that if we ran through this scenario 1000 times, the next song would come from the same album about 18 times. The chance of getting a song from any other specific album is 10 in 499, or about 2%. Not much of a difference.
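A quick simulation bears out the 9-in-499 figure. This sketch uses the indexing scheme above (song index divided by 10 gives the album), picks two distinct songs without replacement, and counts how often they share an album:

```python
import random

ALBUM_SIZE, N_ALBUMS = 10, 50
songs = list(range(ALBUM_SIZE * N_ALBUMS))  # 500 songs; index // 10 is the album

trials = 100_000
same_album = 0
for _ in range(trials):
    first, second = random.sample(songs, 2)  # two distinct picks, no replacement
    if first // ALBUM_SIZE == second // ALBUM_SIZE:
        same_album += 1

print(same_album / trials)  # hovers around 9/499, roughly 0.018
```

Nothing in the shuffle "knows" about albums; the 1.8% falls straight out of the counting.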

Here is another way that our perception makes randomness look patterned. In the scenario above, people would say that song 59 should not be next because it is on the same album, but if the first song picked were 59, they would have no problem with the next song being 60, because that is from a different album, even though it is the next song in the indexing sequence both times.

I guess the more succinct way of saying this is that a random distribution is not the same as an even distribution.

The human mind pulls for an even distribution without ever explicitly acknowledging it, and yet people would complain about a truly even distribution as well. Would you accept something as a shuffle if you got to hear the first song from each and every album, then the second song from each and every album? I think not. The instant that you build pattern avoidance into a randomness algorithm ("well, I just played a song from that album/artist, I had better pick one from a different album/artist"), the algorithm is no longer pure, and certainly does not conform to the dictionary definition of random.
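To make that trade-off concrete, here is a purely hypothetical sketch (not anything the iPod actually does) of a pattern-avoiding shuffle that refuses to play two songs from the same album back to back. It still plays every song exactly once, but by design it excludes many valid orderings, so the result is no longer uniformly random:

```python
import random

def pattern_avoiding_shuffle(songs, album_of):
    """Hypothetical 'smart' shuffle: avoid back-to-back songs from one album.

    Because whole classes of orderings can never be produced, this is
    deliberately LESS random than a pure shuffle, which is the point."""
    remaining = list(songs)
    random.shuffle(remaining)
    out = []
    while remaining:
        # Prefer a song whose album differs from the one just played;
        # fall back to any song if only same-album songs are left.
        pick = next((s for s in remaining
                     if not out or album_of(s) != album_of(out[-1])),
                    remaining[0])
        remaining.remove(pick)
        out.append(pick)
    return out

order = pattern_avoiding_shuffle(range(500), album_of=lambda s: s // 10)
assert sorted(order) == list(range(500))  # still a permutation of the library
```

Every time the avoidance rule kicks in, a perfectly legitimate random ordering gets thrown away, which is exactly the loss of purity described above.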
