The Toronto International Film Festival (TIFF) is a prestigious event and a hot-spot for finding some of Hollywood’s biggest stars. Apparently, it’s also home to 47 000 ghosts. Shortly after premiering at the 2016 TIFF, a film titled The Promise received over 50 000 negative reviews on the Internet Movie Database (IMDb). Given that fewer than 3000 people had actually watched the film at TIFF, who (or what) generated these additional 47 000 reviews?
The Promise, starring Oscar Isaac, Charlotte Le Bon and Christian Bale, “follows a love triangle between Mikael [Isaac], a brilliant medical student, the beautiful and sophisticated Ana [Le Bon], and Chris [Bale] – a renowned American journalist based in Paris”, according to the IMDb synopsis. Set in the early 1900s in historic Armenia (modern-day eastern Turkey), the movie sheds light on the greatest unacknowledged mass killing of the 20th century: the Armenian Genocide.
How could a movie about genocide create such negative sentiment nearly six months before its theatrical release?
Most Armenians will tell you these review numbers are not at all surprising, and believe this to be an orchestrated campaign of genocide denial being carried out either directly or indirectly by the Turkish government. They claim that by negatively manipulating the ratings data, the Turkish government and pro-government supporters hope to undermine the popularity of the film and limit the attention it draws towards the events that took place in the Ottoman Empire (modern-day Turkey) at the beginning of the 20th century – events that, to this day, Turkey denies constituted genocide. In response to the deluge of negative ratings, Armenians worldwide have countered by giving The Promise a perfect score on IMDb. As of 7 April 2017, The Promise had received slightly over 96 000 ratings on that website, with nearly 60 000 people giving the film the lowest possible score of 1, and about 35 000 giving it the highest possible score of 10.
Background
On 24 April 1915, the Ottoman government arrested and executed hundreds of Armenian intellectuals, professionals, and religious leaders, catalyzing what would become one of the worst genocides of the 20th century. In the months following April 1915, hundreds of thousands of Armenian civilians were uprooted from their homes and forced into death marches through the desert of modern-day Syria. By the end of World War I, over 1.5 million Armenians had lost their lives, and those fortunate enough to have fled or survived were dispersed throughout the Middle East, Europe, and America.
Propaganda denying the Armenian Genocide has been a routine policy of the Turkish government for nearly a century. In fact, in Turkey, simply mentioning the word genocide in a manner that undermines Turkish national interests is considered a crime punishable by up to two years in prison. Moreover, Turkish officials have literally rewritten history in order to support their political agenda, with some history books in Turkish schools claiming that it was in fact the Armenians who committed massacres against the Turks. Among international genocide scholars, there is near-unanimous consensus that the actions of the Ottoman regime constituted genocide; organizations such as the International Association of Genocide Scholars, the Institute on the Holocaust and Genocide, and the Institute for the Study of Genocide all agree that the Armenian Genocide is a historical fact [1].
The Turkish government operates a highly effective lobbying effort in Washington; according to ProPublica, in 2007 “Turkey spent $3.5 million to mobilize its lobbyists to influence a resolution that hinged on the single word – genocide”. The resolution here refers to a bill which would have made the United States officially recognize the 1915 killings as a genocide. Due to its reliance on Turkish military support in the Middle East, the United States has refrained from using the term genocide to refer to the state-sanctioned massacres of the early 20th century. Unsurprisingly, the 2007 bill never passed.
Given lobbying campaigns like these, as well as a resolute determination to protect Turkey’s national image, it stands to reason that the Turkish government and pro-government sympathizers would take any measure to ensure that a movie like The Promise has as little impact as possible.
Digging deeper with data
An article in The Hollywood Reporter claims that: “The online campaign against The Promise appears to have originated on sites like Incisozluk, a Turkish version of 4chan, where there were calls for users to ‘downvote’ the film’s ratings on IMDb and YouTube. A rough translation of one post: ‘Guys, Hollywood is filming a big movie about the so-called Armenian genocide and the trailer has already been watched 700k times. We need to do something urgently.’” Similarly, an article in The Wall Street Journal reported how a Turkish social media post urged people to “Please go to Rotten Tomatoes or IMDb and rate this ½ a star. Let’s drop its ratings.”
There is little doubt that there was a concerted effort organized on Turkish social media aimed at distorting the ratings of The Promise before it even premiered in theaters. But perhaps this ratings “tug-of-war” is not unique to The Promise; perhaps the most polarizing non-fiction movies on IMDb tend to focus on controversial historical events?
By using simple data exploration techniques on IMDb’s publicly available ratings dataset, we can examine the kinds of movies that have ratings similar to those of The Promise. Specifically, we can focus on movies where at least 30% of the ratings are in each of the 1 (lowest possible score) and 10 (highest possible score) categories, so that a total of at least 60% of the ratings are in the most extreme rating categories. IMDb’s ratings dataset contains the ratings (rounded percentages of ratings in each score category) and genres of nearly 330 000 movies and around 390 000 TV shows from around the world, but the majority of the movie ratings are not relevant for our analysis. For example, the list of movies includes adult films, news programs, and musicals; furthermore, over half of all movies had fewer than 25 total ratings. To simplify our analysis, we can filter out movies with a small number of ratings (fewer than 100) and only consider non-fiction movies classified as some combination of drama, documentary, biography, history, romance, or war. This process generates a set of 34 films for which ratings data are available.
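The filtering steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`title`, `genres`, `num_votes`, `pct_by_score`) is hypothetical rather than IMDb's actual schema, the four example records are made up, and the non-fiction judgement, which genre labels alone cannot capture, is left out.

```python
# Hypothetical record layout: these field names are illustrative
# assumptions, not IMDb's actual schema.
ALLOWED_GENRES = {"Drama", "Documentary", "Biography", "History", "Romance", "War"}


def is_polarizing(movie, min_votes=100, min_share=30):
    """Return True if a movie passes the filters described in the text."""
    # Drop movies with too few ratings to be informative.
    if movie["num_votes"] < min_votes:
        return False
    # Keep only movies whose genres are all drawn from the allowed set.
    genres = set(movie["genres"])
    if not genres or not genres <= ALLOWED_GENRES:
        return False
    # pct_by_score maps each score (1..10) to a rounded percentage of ratings.
    pct = movie["pct_by_score"]
    return pct[1] >= min_share and pct[10] >= min_share


def percentages(values):
    """Build a score -> percentage mapping from ten values (scores 1..10)."""
    return dict(zip(range(1, 11), values))


# Made-up records, for illustration only.
movies = [
    {"title": "Example A", "genres": ["Drama", "History"], "num_votes": 5000,
     "pct_by_score": percentages([55, 1, 1, 1, 1, 1, 1, 1, 0, 38])},
    {"title": "Example B", "genres": ["Drama"], "num_votes": 20000,
     "pct_by_score": percentages([3, 2, 3, 5, 10, 20, 25, 18, 8, 6])},
    {"title": "Example C", "genres": ["Documentary"], "num_votes": 40,
     "pct_by_score": percentages([50, 0, 0, 0, 0, 0, 0, 0, 0, 50])},
    {"title": "Example D", "genres": ["Horror", "Drama"], "num_votes": 900,
     "pct_by_score": percentages([45, 2, 2, 2, 2, 2, 2, 2, 1, 40])},
]

polarizing = [m["title"] for m in movies if is_polarizing(m)]
print(polarizing)
```

Only the first record survives: the second is not polarized, the third has too few ratings, and the fourth carries a genre outside the allowed set.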
Table 1 shows the titles of these highly polarizing films. Some of these results make intuitive sense – for instance, there were nine movies about US politics and two movies about the Israeli/Palestinian conflict. But what stands out the most are the five movies about the Armenian Genocide and two movies about the history of the Ottoman Empire. While these results do somewhat validate the above assertion that the most polarizing films tend to focus on controversial historical events, the number of such movies specifically focusing on the Armenian Genocide is suspiciously large.
TABLE 1 Of the 34 most polarizing movies on IMDb, seven (in bold) are related to the Armenian Genocide or history of the Ottoman Empire.
These findings suggest that The Promise is not the first film about the Armenian Genocide that has fallen victim to an online “trolling” campaign. From a statistical point of view, the results on this small subset of the data suggest the hypothesis that, at the very least, the process generating extreme ratings for movies about the Armenian Genocide is different from that for other movies. Using statistical modeling, we can quantify just how strong the evidence is in support of this hypothesis.
A statistical model of extreme movie ratings
The results in Table 1 were obtained after looking for movies that were similar in genre and ratings to The Promise. A logical next step would be to collect data on all movies related to the Armenian Genocide (AG movies) and see if the distribution of their extreme ratings is significantly different from that of all other movies. Of the 37 AG movies listed on Wikipedia, only 15 had more than 100 ratings on IMDb.
Since our goal is to explore how the process generating extreme ratings differs between two groups of movies, we can model the proportion of extreme ratings (PROPER) for each movie, defined as the proportion of ratings for that movie that are in the most extreme categories (1 or 10), as a random variable. There are a variety of statistical techniques that can be used to test for differences in the distributions of PROPERs of AG and non-AG movies (e.g. Wilcoxon Rank-Sum test, ANOVA, two-sample t test). Here, we adopt a parametric approach by assuming that a movie’s PROPER is a random draw from an underlying beta distribution: the beta distribution is a very flexible probability distribution which is used to model variables that take values between 0 and 1, making it a suitable choice for modeling PROPERs. Table 2 shows our dataset, the PROPERs for the 15 AG movies. In addition, Figure 1 (left panel) shows the average distributions of scores for both AG movies (green) and non-AG movies (purple).
TABLE 2 Proportion of extreme ratings (PROPER) for each movie in dataset. Obtained 3 May 2017 from IMDb.
FIGURE 1 The left panel shows distributions of scores for AG movies (in green) and non-AG movies (purple), while the right panel shows probability densities for extreme movie ratings, again for AG movies (green) and non-AG movies (purple).
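As a small illustration of the PROPER defined above, the sketch below computes it from per-score rating counts. The function name and the input layout are assumptions made for this example. The counts use the rounded figures quoted earlier for The Promise, with the remaining ratings lumped at a score of 5 purely for illustration.

```python
def proper(counts):
    """Proportion of extreme ratings: the share of ratings equal to 1 or 10.

    `counts` maps a score (1..10) to the number of ratings at that score;
    scores that are missing from the mapping are treated as zero.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("movie has no ratings")
    return (counts.get(1, 0) + counts.get(10, 0)) / total


# Rounded figures from the text: about 96 000 ratings in total, roughly
# 60 000 at score 1 and 35 000 at score 10. Placing the remaining 1000 at
# score 5 is an arbitrary choice that does not affect the result.
promise_counts = {1: 60000, 5: 1000, 10: 35000}
promise_proper = proper(promise_counts)
print(round(promise_proper, 3))
```

With these rounded counts, about 99% of the ratings fall in the two extreme categories.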
Since there is no baseline PROPER distribution against which to compare AG movies, we can first use maximum likelihood estimation on non-AG movies to find θbaseline, the fitted parameter vector which best explains the PROPERs of non-AG movies. We can then test how likely it is that the distribution of PROPERs for AG movies, parameterized by the vector θ, is equal to this baseline distribution. In the notation of hypothesis testing, we are performing the following test:
Hnull: θ = θbaseline
Halt: θ ∈ Θ \ {θbaseline}
In plain English, the null hypothesis states that the proportions of extreme ratings for AG movies are coming from the same process generating extreme ratings for non-AG movies. The alternative hypothesis states that the proportions of extreme ratings for AG movies come from a process that is different from that of non-AG movies (Θ is the set of all plausible parameter values for beta distributions).
A likelihood ratio test statistic can be calculated to assess the strength of the evidence against the null hypothesis. In this case, the likelihood ratio test answers the question “How likely is it that the PROPERs for AG movies are coming from the distribution of PROPERs for non-AG movies?”
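One way such a test could be carried out is sketched below, using only the Python standard library. Several choices here are assumptions rather than details given in the article: the beta parameters are fitted with a simple compass search instead of a library optimizer, the baseline parameters are treated as fixed and known, and the p-value relies on the large-sample chi-square approximation with two degrees of freedom, which is rough for a sample of 15. The data and baseline values are invented for illustration.

```python
import math


def beta_loglik(data, a, b):
    """Log-likelihood of values in (0, 1) under a Beta(a, b) distribution."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return sum(
        log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)
        for x in data
    )


def fit_beta(data):
    """Approximate maximum likelihood fit of Beta(a, b) by compass search."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n
    # Method-of-moments starting point; needs data that are not all equal.
    scale = mean * (1 - mean) / var - 1
    log_a, log_b = math.log(mean * scale), math.log((1 - mean) * scale)
    best = beta_loglik(data, math.exp(log_a), math.exp(log_b))
    step = 0.5
    while step > 1e-6:
        improved = False
        for da, db in ((step, 0), (-step, 0), (0, step), (0, -step)):
            cand = beta_loglik(data, math.exp(log_a + da), math.exp(log_b + db))
            if cand > best:
                log_a, log_b, best = log_a + da, log_b + db, cand
                improved = True
        if not improved:
            step /= 2  # shrink the search radius once no neighbour is better
    return math.exp(log_a), math.exp(log_b)


def likelihood_ratio_test(data, baseline):
    """Test Hnull: theta = baseline against a freely fitted beta distribution."""
    a0, b0 = baseline
    a1, b1 = fit_beta(data)
    stat = 2 * (beta_loglik(data, a1, b1) - beta_loglik(data, a0, b0))
    stat = max(stat, 0.0)  # guard against tiny negative values from the search
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Invented PROPERs and baseline parameters, for illustration only.
illustrative_propers = [0.60, 0.70, 0.75, 0.80, 0.85, 0.90, 0.65, 0.72]
baseline = (2.0, 7.0)  # a Beta(2, 7) distribution has mean 2 / 9

stat, p_value = likelihood_ratio_test(illustrative_propers, baseline)
print(stat, p_value)
```

A large statistic and a correspondingly small p-value indicate that the baseline distribution explains the data far worse than a freely fitted beta distribution does.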
Figure 1 (right panel) shows the fitted probability density function for PROPERs of movies about the Armenian Genocide (green curve), as well as the fitted density for PROPERs of all other movies (purple curve); the individual data points (PROPERs corresponding to the 15 AG movies) are shown on the horizontal axis. From the diagram, it is clear that ratings for AG movies are being generated from a process that is much more extreme than for other movies. The likelihood ratio test confirms this: if all movies shared the same distribution generating extreme ratings, then the probability that movies about the Armenian Genocide would have ratings as extreme or more extreme than what we see in the data is less than 0.000000001%.
Moreover, according to our model, 75% of the ratings for a randomly selected movie about the Armenian Genocide are expected to be in the most extreme categories, compared to 22% for all other movies.
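The expected proportions quoted above are presumably the means of the two fitted beta distributions. For a beta distribution with parameters a and b, the mean is a / (a + b). The parameter pairs in the sketch below are hypothetical, chosen only so that the means match the quoted percentages; they are not the article's fitted values.

```python
def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical parameters, not the fitted values from the analysis.
ag_mean = beta_mean(3.0, 1.0)        # matches the 75% quoted for AG movies
baseline_mean = beta_mean(2.2, 7.8)  # close to the 22% quoted for other movies
print(ag_mean, round(baseline_mean, 2))
```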
Why is this important?
The past few years have witnessed a disturbing proliferation of misinformation over the internet, and the events surrounding the 2016 US presidential election suggest that some countries have the capability of carrying out elaborate misinformation campaigns. The deliberate suppression of information related to human rights violations, especially crimes against humanity such as genocide, is a dangerous first step towards ensuring that such atrocities remain unacknowledged, and perhaps more importantly, that those responsible for committing such atrocities remain unpunished. And if this reasoning seems far-fetched, recall the words Adolf Hitler used to justify his invasion of Poland:
“Who, after all, speaks today of the annihilation of the Armenians?”
About the author
Levon Demirdjian is a 5th year PhD student in the Department of Statistics at the University of California, Los Angeles. His dissertation research focuses on developing statistical methods and computational tools for the RNA sequencing analysis of alternative splicing. Levon has collaborated on numerous other projects, including research investigating the genetic determinants of methamphetamine addiction, modelling functional MRI data to diagnose neuropsychiatric disorders, and developing statistical models of extreme precipitation with scientists from NASA’s Goddard Space Flight Center.
Levon’s article was a runner-up in the 2017 Statistical Excellence Award for Early-Career Writing. The winning article was written by Kevin Lin. Details of next year’s competition will be announced in February 2018.
Reference
- Charny I., Melson R., Stanton G. An Open Letter Concerning Historians Who Deny the Armenian Genocide (PDF). International Association of Genocide Scholars (October 1, 2006). Retrieved 21 April 2017.