Re-Used Halo 3 Study Completely Skews Results to Frame Sexist Agenda [Updated]

This is an editorial piece. The views and opinions expressed in this article are those of the author and do not necessarily represent the views and opinions of, and should not be attributed to, Niche Gamer as an organization.

[Update]

This article has received an official response from the author of the study. In it, he explains the decisions that led to many of my problems with his article: why he included only the specific responses that were recorded, why he chose to look only at kills and deaths as a measure of skill rather than adding player assists to help clarify a player's skill, and other problems I failed to consider at the time I originally wrote this article. If you are reading this, Mr. Kasumovic, thank you very much for your response.

_____________________________________________________________________________________

 

Recently there has been an explosion of scientific research into video games, and with good reason. With an industry value of over 85 billion dollars in 2014 [1], video games have rapidly become the biggest entertainment industry in the world.  However, the potential that video games have to change gamers remains relatively unknown, and is an aspect that researchers are dying to understand.

Hundreds of new studies are being published each year looking at various gaming aspects, such as how platformers increase spatial reasoning, or how online cooperative games improve social skills, and so on. However, as has long been the case, a portion of these studies are misleading or flawed, and dramatic results seem more likely to get media attention even when the work behind them is flawed.

Unfortunately, it seems that not even peer review can prevent all of the studies like these from making their way into academic journals. Take, for instance, a study published recently discussing how people who played the game Halo 3 poorly were more likely to be sexist. This study, which has been presented as factually sound by several gaming and tech sites, has several distinct flaws that need to be fixed before the research should be taken seriously by anyone.

That is not to say that the entire study should be tossed out. Some aspects, such as the way the researchers went about obtaining their original material, are quite useful. For example, the researchers used three different gamertags on Xbox Live. Two of these tags used a male and a female voice, respectively, to determine whether players reacted differently toward different genders; the third tag was used as a control and didn't speak at all. Because the control never spoke, its data was removed on the grounds that it didn't tell anything about the gender interactions.

This approach could prove very useful for determining whether players are generally more hostile online when presented with certain gender signifiers. Another useful factor was the use of outside transcribers to determine what was said in each match. This helps prevent personal bias, and lets the researchers properly quantify the data behind their statistics. However, while these aspects are noteworthy, they do not save this study from being an otherwise poor one.

To start with, the data that this study used was not current in the least. It was obtained from a study on Halo 3 conducted back in 2012, when the game was already five years old. While using data from older studies is a perfectly acceptable practice, the researchers presented the older data as though it were new and original. This is reflected in their results page, which stated “we played” rather than the more accurate “we obtained from the original data”. Due to the wording, readers are left with the false notion that this information is a current and accurate depiction of today’s gamer, rather than a reflection of the Halo 3 gaming community in 2012.

Despite this, the researchers did provide a thorough description of how they processed the data from their original study. For each match in the data set, the researchers had independent transcribers record anything that was spoken, without letting the transcribers know the purpose behind their transcriptions. The researchers then crosschecked 10 percent of all transcriptions and determined that they were accurate. Next, the author and an independent coder looked through the transcriptions for comments directed toward the experimental player. These comments were classified as positive, negative, or neutral. Negative comments were then checked to see if they contained any sexist remarks made to the female experimental player.

While the distinction was made in order to determine when females were more likely to receive sexist comments, it willfully ignored any sexist remarks directed toward men. Without this critical piece of information, readers and researchers alike cannot determine whether the number of negative sexist comments observed was normal for both male and female players, or whether female players received a statistically significantly larger number of sexist comments than men. If the former is true, then the findings would only show that players who perform poorly are generally more negative than players who perform well, hardly the insight into gender norms and online gaming that the authors and media coverage billed it as.

The study suffers from another major problem: it generalized old information to the public of today without any indication of who comprised its sample, other than males who spoke while playing Halo 3 five years after it was released. This lack of information is partially due to the semi-anonymous nature of Xbox Live pseudonyms, but lacking key information such as location, ethnicity, or age makes it impossible to properly generalize this information to the public. They did not even record the one bit of demographic information that was available to them, player skill.

The manner in which skill was determined would not distinguish between, as an example, a teenager who may have never played competitive online games alongside women before and a fifty-year-old Klansman. Without the necessary demographic information, it is not possible to determine the significance of the data in relation to the general public. Without factoring in other potential causes for the sexist behavior, or even attempting to control for such factors, the author has attempted to convince the public that the only reason for such sexist attitudes is the fact that a woman is performing better than a man in a game, and nothing else.

While the researchers didn’t control for factors such as age, nationality, location, or even ethnicity, they did make sure to carefully explain just what information was used. In total, the older study from which the data was obtained had 1136 participants, of whom roughly 574 played against the female voice and 567 played against the male voice. However, these exact figures were omitted from the study because the author made sure to write only what he thought of as “necessary” for readers to form their own opinion.

This necessary information meant looking only at the participants who actually spoke, which ended up being 189 participants. While it is easy for a casual reader to assume that participants who didn’t talk contribute nothing useful, it is critical for researchers to publish all of the information that they obtained. Including such “unnecessary” information means the difference between a figure of 1.9% and a figure of 13.4% of participants using sexist comments. Once it is stripped of context, as in media coverage, the latter figure is seven times greater than the initial findings.

One of my old statistics professors told me in my undergrad program that you can prove anything using statistics. Change the number of participants in a study and you can prove that raising the temperature in a classroom can improve the likelihood of scoring an A on a final exam. This is what the researchers proved: if you manipulate the data enough, you can show that male gamers who play poorly will make sexist comments toward women while losing.

Out of every player who participated on the female side of the study, only 11 of 574 (or 1.9%) made a sexist comment. Narrowing that down to only the players who were active teammates of the female experimental player leaves 11/246 (or 4.47%). If we reduce the numbers further to only the 84 participants who actually talked, we are at 11/84 (or 13.4%) of all participants using sexist remarks.
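The arithmetic above is easy to check. The short Python sketch below recomputes each proportion from the counts quoted in this article (11 sexist commenters out of 574, 246, and 84 participants). The function and variable names are illustrative only and are not taken from the study. Run as written, the last ratio comes out nearer 13.1% than the 13.4% quoted, so one of the two figures behind it is probably slightly off.

```python
# Counts as quoted in this article; names are illustrative, not from the study.
SEXIST_COMMENTERS = 11

DENOMINATORS = {
    "all players in the female-voice condition": 574,
    "teammates of the female-voice player": 246,
    "participants who actually spoke": 84,
}


def percentage(part, whole):
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


def sexist_comment_rates():
    """Recompute the share of sexist commenters for each denominator."""
    return {
        label: percentage(SEXIST_COMMENTERS, n)
        for label, n in DENOMINATORS.items()
    }


if __name__ == "__main__":
    for label, rate in sexist_comment_rates().items():
        print(f"{label}: {rate}%")
```

The numerator never changes; only the denominator shrinks, which is the whole reason the headline percentage grows.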

This generalization promoted by manipulated data is far from the only over-generalization due to inconsistent or incomplete data. Factors such as Playlist Rank are more dependent on the amount of time played than on skill. This Playlist Rank system only takes the number of times a player has won or tied, lost, or disconnected into account, rather than factors such as cooperation, assists, or even driving vehicles, let alone a proper kill-to-death ratio. In this study, Rank is believed to be a “status symbol” that shows “dominance” to the other players, rather than a factor that could easily be manipulated, or paid for.

In this respect, Rank is no more than a cosmetic decal that shows how long a person has been playing. While this could have been a contributing factor had this study looked at games such as Call of Duty, where specific activity directly relates to Rank and thus drives competitiveness in pursuit of a higher rank, this is not the case in Halo 3. The only time skill becomes a factor in Halo 3’s rank system is at a much higher level than typically used in this study.

A second factor this study tries to use is a kill-death ratio to show the skill of the player. According to the study, the number of kills you have over the number of deaths you’ve had typically shows your general level of skill. Again, this takes out all cooperative, team-oriented behavior. Just because a player isn’t killing the opposing team doesn’t mean they aren’t a credit to their team. Assists matter in a team-based game, and driving the vehicle while others take the kill is important. By removing these important factors, the data cannot accurately represent who performs the best in a team-based game like Halo 3, and it skews the data to only show players who take aggressive action as dominant.

This study did show statistical significance for several interactions between the experimental player and the participants. It is important to understand that statistical significance is not proof of the conclusions they draw from it; it only indicates that the observed pattern would be unlikely to arise from random chance alone. Given the small sample size, even the variation due to chance is quite large.

It does not account at all for countless demographic confounders or other biases they do not control for and that could easily correlate with in-game communication habits, including their inappropriate use of Rank (which effectively measures the amount of time playing the game) and kill/death ratio (which is influenced by cooperative team-oriented playstyles). Instead they use this as a launching-off point to propose an elaborate hypothesis about the sociology of “low-status males”, as if this uncontrolled, scant, and noisy data is capable of providing insight into the human condition.
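To give a rough sense of how much chance variation a sample this small allows, the sketch below computes an approximate 95% Wilson score interval for 11 sexist commenters out of the 84 speaking participants quoted in this article. The Wilson interval is an illustrative choice here, not a method the study reports using, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the (low, high) Wilson score interval for a proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # 11 of 84: the counts quoted in this article, not re-derived from the study.
    low, high = wilson_interval(11, 84)
    print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

With these inputs the interval spans roughly 7% to 22%, which is wide compared with the single percentage that ended up in headlines.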

Despite the willful manipulation of the statistics, the over-generalization of the participant motives, the denial of a very useful control, the refusal to acknowledge demographic differences, and the fact that this data comes from a game that’s nearly eight years old, journalists continued to use this information.

This data was used to say that people who are bad at playing games are sexist. This study only had 11 participants say sexist things, but 11 participants became the public at large. This is why everyone should learn how to analyze a scientific paper, if only to come to your own conclusions.

Sources:

Cody Gulley

About

I am a research student with a history in psychology. I am a fan of tactical rpgs and I love to travel. I hope to one day be a clinical psychologist.



  • The Devil Within

    Thank you Cody, we appreciate this being analyzed. In the meantime, please ensure your own safety, this narrative was pushed by a huge number of interests, the backlash will be fierce from GJP 2.0 and the corrupt and dishonest academic clique that created this. Don’t let them have the satisfaction of hitting a soft target.

  • MaidKillua

    I’m sorry, but skill in no way relates to rank in CoD either; if anything it’s MORE of a treadmill progression than Halo. Unless they changed it, I stopped wasting my time after Black Ops

  • Dr. Evil’s Brother’s Evil Twin

    I’m getting sick of these people who claim to be gamers, but are only interested in pushing politics…

  • Fall

    But of course this is just another of his…

    HARMFUL OPINIONS

  • PenguinPlayer

    I need to find that study about how eating chocolate helps you lose weight that was published in a science magazine with a good reputation but was completely false and the people behind even admitted it’s false and did it to prove how easy it is to skew results these days.

    I’m glad they did it and proved how easy it was to skew results and publish complete rubbish even after peer review. It’s a sad day when we can’t even trust SCIENCE that comes from “respectable sources”.

  • Cody Gulley

    Oh I was using the information from the CoD wiki explaining XP (I have only played CoD 3 times in my life). It showed that xp was connected to quite a bit of skill based activities like scoring headshots, trickshots like placing a Semtex grenade on an enemy, or helping a teammate while they’re being attacked. This is the difference between the two rank systems. While both do count matches towards xp, CoD placed greater emphasis on other things.

  • Aaron Roberts

    I know you guys might be wondering, “Who would fund such a useless and inaccurate study?” Well, it was the Australian Fucking Government, apparently not content with ruining their own country.

  • Misogynerd

    Didn’t read the whole article, but basically, players who are losing are more likely to use insults, especially personal insults. In general, from what I’ve seen less skilled players tend to take the game less seriously and more emotionally. Teens tend to be less skilled since they have less experience.

    So yeah duh, of course shitty players are more willing to insult you.

  • Misogynerd

    Well they are interested in using games to push politics. Interactive Tumblr posts for everybody.

  • Ubernoob8470

    The government spending taxpayer dollars on useless studies that only serve to push agendas and keep people with gender studies degrees employed?
    I, for one, am SHOCKED.

  • xkumo

    What a surprise, the biggest nanny state in the world.

  • Holyfox25

    TL;DR did an excellent video on this “study” and it echoes my sentiments exactly.

    https://www.youtube.com/watch?v=QK8NwZPLqBw

  • Crizzyeyes

    This is completely true and the easiest way to see for yourself is to play Dota. Communication in that game is notably higher than in non-MOBA type games, due to the nature of the game. The most obnoxious player is almost always the worst one.

    If you are already good at Dota and consistently play with nice players, abandon two matches in a row so you get placed in low priority (or as I call it, “the shit pool”). If you manage to make it through a match in that pool without at least one of your teammates flaming someone, I’d just say you probably didn’t understand it in their language.

    Good players are often seen “flaming” one another on stream, but it becomes clear in time that it’s always just a show for the stream or in good fun. One of the most prolific players, Arteezy, often subtly trolls his viewers by picking bad items (and winning anyway).

  • Lex

    The more effort that’s put into getting the facts out there the better. This was disgraceful.

  • Arbitrary

    These people are scum.

  • Misogynerd

    Yes, this is based on my experience with Peruvian dipshits in Dota; Spanish is my first language. Yeah, there is a difference between trash talking and shit talking your allies in good fun and starting to insult people. Never got shit on in any game until I played Dota 2.

  • Laytonaster

    If I remember right, I think the guy who published it was part of Gawker or BuzzFeed. Just makes it all the worse really. I mean, someone who works for a site that has no basis in facts managed to get that shit in there so damned easily.

  • Maiq TheLiar

    Being a 27-year-old whose heyday was Halo 2 & 3, my belief is that teens are actually More skilled. The reason why is because without adult responsibilities like a job or college or actual chores, and without the funds to buy a new game on a whim, teens usually have several free hours a day to pour into getting better and better at the same game over months and months.

    Mostly why I don’t play pvp games anymore. The teens have all the time to get the exact physics and systems of the game down to a level only the developers should understand.

  • Cred

    This is a real study guys it is very serious look it says “Serious Study” in the title.

    Why would anyone publish information that is not true and lie?

  • Megapewpew

    They do it, because there is money in ed tech, and these parasites want that ed tech money, from the government.

  • TheDVI

    Positivism has turned the social sciences into nothing but cherrypicking statistics to suit a certain bias.

  • Misogynerd

    Well there is a range the 12-15 are pretty bad in general but 16-18 are the good ones.

  • TheCynicalReaper

    Good job ripping this stupidity a new asshole, Cody. Nicely done article.

  • TheCynicalReaper

    Haha Okay, fair point.

  • TheCynicalReaper

    Video didn’t come up for me, but yes. EDIT: Nevermind. I see you posted another one.

  • Holyfox25

    Weird. It double posted the video. I only have one video link in my comment. But yeah, the second one works.

  • chizwoz

    The social sciences haven’t been anything but left-wing propaganda for quite a while now.

  • Antoinant

    “This data was used to say that people who are bad at playing games are sexist.”
    So feminists in general ?

  • Immeis-sus in the Shade

    I am 100% sure that there is a research going on about whether fire is indeed hot, or is it just a common myth. Funded by some country.

  • CyberEagle

    /respect

  • A Real Libertarian

    Looks like he was lying about that:

    https://twitter.com/mombot/status/627001429257904128

  • Nonscpo

    There were so many things wrong with this study, it wasn’t even funny :(

  • Zachary Bower

    You’re leaving out the part where they used more insults towards the players manipulated to sound female. I realize people are skeptical of this finding, some for good reasons, but that doesn’t mean pretending it wasn’t found is a good idea.

  • Thanatos2k

    Another problem with this study is you cannot conclude that someone is sexist judging by what insults they use. Almost every troller on the internet says one thing but thinks another. They know insulting women using certain terminologies will be more effective, while insulting men for being men is ineffective in the male-dominated online gaming space. So they use the best tool in the box.

  • ThePete

    Unfortunately it’s common for media to buy into the hype, so to speak, from a study’s author, instead of objectively assessing the data and conclusions.

    In this case the media got the narrative they were looking for so why check the data.

    Like you said presenting sweeping conclusions from such uncontrolled and noisy data doesn’t provide that level of insight.

  • ThePete

    The laughable thing is that ARC grants are highly competitive.

    Only 1 in 5 get funded.

    Granted this study is just a subset of the research, but doesn’t exactly bode well that tax payers are getting value for money.

  • Dio Brando

    >Ignoring and not taking data from the control group
    INSTANT GARBAGE.
    WHERE THE FUCK IS PEER REVIEW WHEN YOU NEED IT?
    (sorry if I appear heated, I get very passionate about the scientific method)