Abstract: This page argues for ranking the best directors with a shrinkage estimator, links to the relevant IMDb ratings aggregated in 2010, and gives the results of applying the estimator for three different parameter choices. First Published: 7/20/2010. Last updated: 11/1/2012.
This project came about after watching Inception and deciding that it had to vault Chris Nolan into the discussion of who is the best director of all-time.
Specifically, I wanted to answer the question, statistically, is Chris Nolan the best director of all-time?
The short answer is, quite possibly.
For the longer answer, you’ll have to indulge me with a bit of stats. You see, there’s this website called IMDb (you may have heard of it), and one interesting fact about it is that it has the largest depository of user-generated ratings in recorded history.
So, after aggregating all of the movies that somebody has directed, it is easy to calculate his or her average rating on IMDb. To see whether Chris Nolan really is the best director of all time, I did this for anyone who seemed to have a reasonable chance of winning.
To be fair, I didn’t count early movies that the director probably didn’t have much funding for, movies that he/she wasn’t the main director of, and documentaries, because they tend to be lame.
Once I did this, I realized that I had meandered into a dilemma. You see, the top-rated directors had each directed only one movie! At the top of the list were Florian Henckel von Donnersmarck, who directed The Lives of Others (an 8.5), and Tony Kaye, who directed American History X (an 8.6).
To appropriately punish these slackers for their limited sample sizes, I had an excuse to employ a shrinkage estimator. This sounds much more complicated than it is.
Basically, I counted the total number of movies each person had directed (n), plugged in the average rating of a movie on IMDb (C = 6.9), and set an arbitrary variable, m, to some value between about 0.001 and 1000.
To calculate the weighted rating (WR), I put each director’s average rating (R) and total number of movies (n) through this equation:
$$WR = \frac{n}{n+m}\,R + \frac{m}{n+m}\,C \tag{1}$$
where WR refers to weighted rating, R is the director’s plain average rating, n refers to the number of movies each director has made with a reasonable budget and degree of creative control, and C is the average score of a movie across IMDb.
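Here is a minimal Python sketch of this calculation. It assumes equation (1) takes the standard weighted-average form WR = (n*R + m*C) / (n + m), where R is the director’s plain average rating. The function and argument names are made up for illustration, not taken from the original spreadsheet.

```python
def weighted_rating(avg_rating, n_movies, m, site_avg=6.9):
    """Shrink a director's average rating toward the site-wide average.

    avg_rating: the director's plain average rating (R, an assumed name)
    n_movies:   number of qualifying movies (n)
    m:          shrinkage strength; larger m pulls harder toward site_avg
    site_avg:   average movie rating across the site (C = 6.9 in this post)
    """
    if n_movies + m <= 0:
        raise ValueError("n_movies + m must be positive")
    return (n_movies * avg_rating + m * site_avg) / (n_movies + m)


# One movie rated 8.5 (The Lives of Others), shrunk with m = 3.
one_hit = weighted_rating(8.5, 1, 3)
print(round(one_hit, 2))  # 7.3
```

With a single movie and m = 3, the 8.5 gets pulled most of the way back to 6.9, which is exactly the punishment for small sample sizes described above.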
m is a controllable parameter specifying how much to punish directors that have made few total movies. To provide some intuition for it, consider the extremes.
If m approaches zero, then equation (1) reduces to WR = R, the director’s plain average rating, and so m has no effect.
If m approaches infinity, then the right-hand term of equation (1) comes to dominate, and every director’s rating will be computed via WR = C, the site-wide average.
Equation (1) is called a shrinkage estimator because it “shrinks” every director’s average rating towards the overall average, C.
Crucially, the equation shrinks the scores less the more movies a director has made (n). If m is small relative to n, then the left-hand term of (1) comes to dominate the total expression.
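To make the shrinking explicit, here is a short worked step. It assumes equation (1) has the weighted-average form WR = (nR + mC) / (n + m), with R standing for the director’s plain average rating:

```latex
\begin{aligned}
WR - C &= \frac{nR + mC}{n + m} - C \\
       &= \frac{nR + mC - nC - mC}{n + m} \\
       &= \frac{n}{n + m}\,(R - C)
\end{aligned}
```

So a director keeps only the fraction n / (n + m) of his or her lead over the site average. With m = 3, a one-movie director keeps 1/4 of that lead, while a director with 25 movies keeps 25/28 of it.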
The end result of applying the shrinkage estimator is that it spits out rankings that take into account the fact that von Donnersmarck and Kaye have only directed one movie each.
Now, determining which value to use for the variable m is an open and interesting question. It depends on your values: do you prefer a director that has made a whole lot of good movies, or one who has made just a few great movies? It’d be hard to answer this objectively.
If you prefer quality over quantity, then you should set your m low, so you don’t punish low sample sizes as much. If you think that a director has to be somewhat prolific to be even included in the discussion, then you should set your m high. I set m to three different values to be fair to each of these reasonable positions.
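To see how the choice of m changes the ordering, here is a rough Python sketch. The three directors and their numbers are hypothetical, invented only to illustrate the trade-off, and the formula is assumed to be WR = (n*R + m*C) / (n + m) with C = 6.9.

```python
def weighted_rating(avg_rating, n_movies, m, site_avg=6.9):
    """Assumed form of the shrinkage formula: (n*R + m*C) / (n + m)."""
    return (n_movies * avg_rating + m * site_avg) / (n_movies + m)


# Hypothetical directors: name -> (average rating, number of movies).
DIRECTORS = {
    "One Hit": (8.6, 1),
    "Few Great": (8.4, 7),
    "Many Good": (7.8, 25),
}


def ranking(m):
    """Return the director names sorted from best to worst for a given m."""
    scores = {
        name: weighted_rating(avg, n, m)
        for name, (avg, n) in DIRECTORS.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


for m in (0.001, 3, 20):
    print(m, ranking(m))
```

With these made-up numbers, a tiny m leaves the one-movie director on top, m = 3 favors the director with a few great movies, and m = 20 favors the prolific one.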
When m = 20, the top 5 directors are:
1) Akira Kurosawa, score = 7.40 (weighted), directed 25 movies. Highlights: Seven Samurai, Yojimbo.
2) Stanley Kubrick, score = 7.36, directed 11 movies. Highlights: Paths of Glory, Dr. Strangelove.
3) William Wyler, score = 7.34, directed 26 movies. Highlights: Dodsworth, Ben-Hur.
4) Ingmar Bergman, score = 7.34, directed 30 movies. Highlights: The Seventh Seal, Wild Strawberries.
5) Luis Buñuel, score = 7.29, directed 32 movies. Highlights: Viridiana, The Discreet Charm of the Bourgeoisie.
When m = 10, the top 5 directors are:
1) Stanley Kubrick, score = 7.58 (weighted).
2) Akira Kurosawa, score = 7.54.
3) Chris Nolan, score = 7.50, directed 7 movies. Highlights: Memento, The Prestige.
4) William Wyler, score = 7.46.
5) Hayao Miyazaki, score = 7.45, directed 9 movies. Highlights: Spirited Away, Princess Mononoke.
When m = 3, the top 5 directors are:
1) Chris Nolan, score = 7.93 (weighted).
2) Stanley Kubrick, score = 7.92.
3) Sergio Leone, score = 7.88, directed 6 movies. Highlights: The Good, The Bad, and The Ugly, Once Upon a Time in America.
4) Quentin Tarantino, score = 7.80, directed 7 movies. Highlights: Pulp Fiction, Inglourious Basterds.
5) Hayao Miyazaki, score = 7.78.
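As a sanity check on the three lists, the formula can be run backwards. The sketch below assumes the formula is WR = (n*R + m*C) / (n + m) with C = 6.9. It recovers the plain average implied by Kubrick’s m = 20 score (7.36 over 11 movies), then recomputes his score for the other two values of m. The helper names are made up, and the listed scores are rounded, so the implied average is only approximate.

```python
SITE_AVG = 6.9  # C, the average movie rating quoted in this post


def weighted_rating(avg_rating, n_movies, m, site_avg=SITE_AVG):
    """Assumed form of the shrinkage formula: (n*R + m*C) / (n + m)."""
    return (n_movies * avg_rating + m * site_avg) / (n_movies + m)


def implied_average(weighted, n_movies, m, site_avg=SITE_AVG):
    """Invert the formula to recover R from a listed weighted score."""
    return (weighted * (n_movies + m) - m * site_avg) / n_movies


# Kubrick: 11 movies, weighted score 7.36 when m = 20.
kubrick_avg = implied_average(7.36, 11, 20)
print(round(kubrick_avg, 2))

for m in (10, 3):
    print(m, round(weighted_rating(kubrick_avg, 11, m), 2))
```

The recomputed values come out to 7.58 for m = 10 and 7.92 for m = 3, matching the scores listed for Kubrick above.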
Another sort of difficult thing to choose is how to count Pixar’s movies. Most of the movies list different directors, but really, who actually knows what goes on in that forsaken place? If you consider the Pixar movie making team as its own distinct entity, then that entity would end up at 5th, 3rd, and 5th on the above lists.
If you’d like to check my raw data, feel free to peruse this Google document at your leisure.
So there you have it. Chris Nolan is the best director of all time, under certain assumptions. Finally, implicit in this essay is the recommendation that if you haven’t seen Nolan’s Inception yet, you need to get your act together.