A YouTube choir: the history of the most viewed videos converted into sound


Why did I choose the videos that I did? Well, the original plan was to include only videos that have reached #1 at some point in time, which are mostly music videos. Including other music videos that haven’t gotten to #1 (like Hello, Sorry, Uptown Funk, etc.) isn’t such a good idea: since they constantly get views, we’d quickly end up with dozens of voices singing at once, which would turn into a muffled mess. That’s why flash-in-the-pan viral videos are so great: they add interesting events to the video, but don’t cloud up the audio space for more than a few seconds! My other criterion was a little subjective: a video had to get enough views to reach a comfortable pitch (about 2M views a day), but it also had to be considered part of “YouTube culture”, like the ALS ice bucket challenge certainly was. So I didn’t include Adele’s carpool karaoke, since that seems more like traditional TV.

There is very little data from before mid-2009, which is why I didn’t go back any further.

To get the majority of the data, I used the browser’s Inspect Element tool to find the coordinates of the vertices of the “daily views” graph underneath each video, and converted those pixel coordinates into view counts.
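As a rough sketch of that conversion (not the exact script used here), assume the graph has a linear vertical axis, that pixel y-values grow downward as they usually do in SVG, and that you can read off two calibration values: the pixel row of the zero line and the pixel row of one labelled gridline. The function name and parameters below are made up for illustration.

```python
def pixels_to_views(points, y_zero_px, y_ref_px, ref_views):
    """Convert (x_px, y_px) graph vertices into (x_px, daily_views) pairs.

    Assumes a linear vertical axis where y_zero_px is the pixel row of
    0 views and y_ref_px is the pixel row of a gridline worth ref_views.
    """
    # Pixel y grows downward, so the zero line has the larger pixel value.
    views_per_px = ref_views / (y_zero_px - y_ref_px)
    return [(x, (y_zero_px - y) * views_per_px) for x, y in points]


# Hypothetical vertices copied from the inspector
vertices = [(0, 100), (10, 50), (20, 0)]
daily = pixels_to_views(vertices, y_zero_px=100, y_ref_px=0, ref_views=2_000_000)
```

The x-coordinates can be turned into dates the same way, by calibrating against two known dates on the horizontal axis.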

For the videos whose stats were private, I could usually look up the SocialBlade stats of the channel that uploaded the video. If the main video accounted for over 75% of the channel’s views, I could somewhat safely assume that that fraction stayed about constant over time. So I just multiplied the channel’s total upload views by that fraction, and voilà.
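In code, that estimate might look something like the sketch below. The function name, its arguments, and the choice to raise an error below the 75% cutoff are all assumptions for illustration; the only things taken from the text are the constant-fraction assumption and the threshold.

```python
def estimate_video_views(channel_views, video_total, channel_total, min_share=0.75):
    """Estimate one video's views from its channel's view counts over time.

    channel_views: the channel's total upload views at each point in time.
    video_total / channel_total: the video's current share of the channel,
    assumed to have stayed roughly constant over time.
    """
    share = video_total / channel_total
    if share <= min_share:
        # Below the cutoff, the constant-share assumption is too shaky.
        raise ValueError(f"video is only {share:.0%} of channel views")
    return [v * share for v in channel_views]


# Hypothetical numbers: the video holds 7/8 of the channel's views
estimated = estimate_video_views([8, 16], video_total=7, channel_total=8)
```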

For videos that were older than SocialBlade itself, I used this Wikipedia article (https://en.wikipedia.org/wiki/List_of_most_viewed_YouTube_videos) for some historical data, and then interpolated between the data points it gives. To be extra fancy, I didn’t just linearly interpolate. Instead, I looked at the Google Trends data (which goes back to 2004, before YouTube existed) for that video title, assumed that the number of searches on a given day was roughly proportional to the views received on that day, and interpolated using that. All that work probably wasn’t worth it, though, because if you look at the two oldest videos (Evolution of Dance and Charlie Bit My Finger), their view-graph lines still look pretty close to linear to me.
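Here is a minimal sketch of that kind of trend-weighted interpolation, under the same assumption that search interest is proportional to views. Given two known cumulative view counts and one search-interest value per day in between, it spreads the view gain across the days in proportion to search interest. The function name and inputs are hypothetical, and falling back to plain linear interpolation when there is no search interest is my own choice.

```python
from itertools import accumulate


def trend_weighted_interpolate(start_views, end_views, trend):
    """Cumulative views at the end of each day between two known data points.

    trend holds one non-negative search-interest value per day; each day
    receives a share of the total gain proportional to its trend value.
    """
    total = sum(trend)
    if total <= 0:
        # No search signal at all: weight every day equally (linear).
        trend = [1] * len(trend)
        total = len(trend)
    gain = end_views - start_views
    return [start_views + gain * running / total for running in accumulate(trend)]


# Hypothetical two-day gap where the second day had 3x the search interest
filled = trend_weighted_interpolate(100, 200, [1, 3])
```

With a flat trend this reduces to ordinary linear interpolation, which fits the observation that the oldest videos’ lines look close to linear anyway.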

