Looking back on a number of metrics associated with the 253 feature-length science fiction, fantasy, or supernatural horror films I’ve seen from the years 2011-2015, I don’t find too much of interest, but an uninteresting result is still something. It’s not bad to have empirical evidence for what you might have guessed, and if you look at something closely enough for long enough, you come out the other side of uninteresting and enter a tiny world of nerdy fun.
Correlations among metrics
First, a matrix of correlation coefficients based on data from multiple sources:
                      A    B    C    D    E    F    G    H    I    J    K
                     ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
A: My rating          1    0.98 0.15 0.38 0.23 0.34 0.15 0.38 0.27 0.2  0.33
B: My retro rating    0.98 1    0.15 0.39 0.23 0.37 0.14 0.39 0.29 0.23 0.35
C: IMDb votes         0.15 0.15 1    0.46 0.94 0.22 0.81 0.43 0.1  0.17 0.24
D: IMDb rating        0.38 0.39 0.46 1    0.48 0.77 0.32 0.86 0.55 0.55 0.75
E: Letterboxd votes   0.23 0.23 0.94 0.48 1    0.34 0.71 0.47 0.22 0.33 0.35
F: Letterboxd rating  0.34 0.37 0.22 0.77 0.34 1    0.06 0.72 0.77 0.8  0.86
G: RT votes           0.15 0.14 0.81 0.32 0.71 0.06 1    0.33 0.05 0.05 0.15
H: RT user rating     0.38 0.39 0.43 0.86 0.47 0.72 0.33 1    0.65 0.55 0.79
I: RT crit. rating    0.27 0.29 0.1  0.55 0.22 0.77 0.05 0.65 1    0.87 0.96
J: Metacritic rating  0.2  0.23 0.17 0.55 0.33 0.8  0.05 0.55 0.87 1    0.87
K: IMDb * RT crit     0.33 0.35 0.24 0.75 0.35 0.86 0.15 0.79 0.96 0.87 1
As a basic explanation of what’s going on here, every available set of metrics has been compared with every other. For the sake of space, the column labels have been simplified to letters that correspond to the letters beside the row labels. “My rating” is the collection of scores I gave each film at the time I watched it. Evidently I was stingy with my marks: I didn’t give out a single “10.” I also never changed a rating, so the “My rating” metric is not colored by rosy retrospection. But since my opinions did evolve a little over time, I built a “retro rating” metric by awarding a one-point bonus to every film noted in my previous “Movie Favorites: SF/F/H 2011-2015” post. Most of the other data sources should be self-explanatory, except for row/column K, which is based on multiplying the IMDb user rating and the RT critic rating together, a metric I’ve used in some earlier data mining posts about movies.
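For anyone curious how a matrix like this comes together, here is a minimal sketch in Python using pandas. It is not the script I actually used, and the file and column names (ratings.csv, imdb_rating, rt_critic, and so on) are stand-ins:

    # A minimal sketch, not the exact script behind the table above.
    # Assumes a hypothetical ratings.csv with one row per film and one
    # numeric column per metric (my_rating, imdb_rating, rt_critic, etc.).
    import pandas as pd

    films = pd.read_csv("ratings.csv")

    # The composite metric from column K: IMDb user rating times RT critic rating.
    films["imdb_x_rt_crit"] = films["imdb_rating"] * films["rt_critic"]

    # Pairwise Pearson correlation coefficients across every metric column.
    corr = films.corr(numeric_only=True).round(2)
    print(corr)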
Unsurprisingly, my judgments line up best with those of other users at IMDb, RT, and (to a slightly lesser extent) Letterboxd. It’s only a moderate correlation, but aggregate ratings from critics have even weaker correlations with my scores.
Optimal movie selection
Looking only at the 40 feature-length films I eventually designated as favorites, here are the minimum scores they achieved:
IMDb        6.0+
Letterboxd  2.7+
RT user     43%+
RT critic   32%+
Metacritic  41+
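Those floors are trivial to pull out of the data. A quick sketch, again using the hypothetical ratings.csv and column names from the earlier example, plus an assumed boolean favorite column:

    # Sketch: the lowest score each metric reached among the favorites.
    # Assumes the same hypothetical ratings.csv as above, plus a boolean
    # "favorite" column marking the 40 designated favorites.
    import pandas as pd

    films = pd.read_csv("ratings.csv")
    metrics = ["imdb_rating", "letterboxd", "rt_user", "rt_critic", "metacritic"]
    print(films.loc[films["favorite"], metrics].min())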
If I’d known this in advance, I could have skipped 29 out of 253 films without missing out on any favorites. That is to say, if I had insisted sort of compulsively that everything I watch have at least the scores above, I could have achieved a 40/224 favorite-to-watched ratio. But if I’d been interested in a compromise—discovering fewer favorites but also watching far fewer films I didn’t enjoy—what criteria would have yielded better ratios? To answer that question, I wrote a script to cycle through random threshold values for all five metrics and find a set of scores yielding a good ratio for each number of favorites:
Ratio            IMDb  Let.  RT u.  RT c.  Meta
40/224 (17.9%)   6     2.7   43     32     41
39/207 (18.8%)   6     2.7   43     58     43
38/166 (22.9%)   6.3   3.1   55     42     41
37/150 (24.7%)   6.4   3.1   61     39     44
36/132 (27.3%)   6.3   3.3   48     68     45
35/127 (27.6%)   6.3   3.3   60     68     41
34/118 (28.8%)   6     3.35  53     68     46
33/114 (28.9%)   6.4   3.35  61     66     45
32/101 (31.7%)   6.1   3.4   52     67     48
31/95  (32.6%)   6.2   3.4   61     68     42
30/88  (34.1%)   6.3   3.45  46     68     41
29/85  (34.1%)   6     3.45  61     47     42
28/78  (35.9%)   6.4   3.4   73     67     43
27/70  (38.6%)   6.1   3.45  73     68     41
26/67  (38.8%)   6.1   3.4   60     88     60
25/59  (42.4%)   6.5   3.45  60     89     46
24/56  (42.9%)   6     3.35  73     89     62
23/50  (46.0%)   6.1   3.45  73     89     44
22/48  (45.8%)   6.4   3.45  73     89     68
21/41  (51.2%)   6.8   3.45  46     92     41
20/35  (57.1%)   6.8   3.65  56     89     60
My script in fact yielded much more data, showing each ratio it preferred over the next, and one very slow version of the script deterministically ran through every possible value for every metric. But the “random walk” version converged on similar results much more quickly.
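For the curious, the random version of that search doesn’t need to be anything fancy. A rough sketch of the idea, not the original script (the per-film data structure, sampling ranges, and names are my own stand-ins):

    # Rough sketch of the random-threshold search described above.
    # Each film is assumed to be a dict with one key per metric plus a
    # boolean "favorite" flag; the sampling ranges are assumptions too.
    import random

    METRICS = {                      # (low, high) ranges to sample thresholds from
        "imdb_rating": (6.0, 7.0),
        "letterboxd":  (2.7, 3.7),
        "rt_user":     (43, 80),
        "rt_critic":   (32, 95),
        "metacritic":  (41, 70),
    }

    def apply_thresholds(films, thresholds):
        """Count the favorites and the total films meeting every threshold."""
        kept = [f for f in films
                if all(f[m] >= t for m, t in thresholds.items())]
        favorites = sum(1 for f in kept if f["favorite"])
        return favorites, len(kept)

    def random_search(films, trials=100_000):
        """For each favorite count reached, remember the smallest watch list seen."""
        best = {}   # favorites kept -> (total watched, thresholds)
        for _ in range(trials):
            thresholds = {m: round(random.uniform(lo, hi), 2)
                          for m, (lo, hi) in METRICS.items()}
            favs, total = apply_thresholds(films, thresholds)
            if favs and (favs not in best or total < best[favs][0]):
                best[favs] = (total, thresholds)
        return best

A real run would load the 253-film dataset, call random_search on it, and print the best thresholds found for each favorite count, which is roughly where the table above comes from.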
My conclusions are a little impressionistic. Aiming at a ~33-34% favorite-to-total ratio “feels” about right. Ratios above 38% require very high RT critic ratings, perhaps limiting a viewer to a steady diet of blockbusters and children’s movies. Surprisingly, Metacritic and IMDb scores don’t seem very crucial in these results: they often fall back to their minimum values even as the ratios improve, suggesting that a combination of Letterboxd and RT user thresholds alone might have served about as well. So I suspect I could find favorites more easily in the future by selecting only SF/F films that have a Letterboxd rating of 3.4+, an RT user approval rating of 61% or above, and an RT critic rating of 68% or above. I mean, I doubt I’ll do much with the information (movies that don’t turn out to be favorites can still be fun, etc.), but it was an engaging puzzle to work through.