Re-Used Halo 3 Study Completely Skews Results to Frame Sexist Agenda [Updated]

This is an editorial piece. The views and opinions expressed in this article are those of the author and do not necessarily represent the views and opinions of, and should not be attributed to, Niche Gamer as an organization.


This article has received an official response from the author of the study. In it, he explains the decisions that led to many of my problems with his article: why he included only the specific responses that were recorded, why he chose to look only at kills and deaths as a measure of skill rather than adding in player assists to help clarify a player’s skill, and other problems I failed to consider when I originally wrote this article. If you are reading this, Mr. Kasumovic, thank you very much for your response.



Recently there has been an explosion of scientific research into video games, and with good reason. With an industry value of over 85 billion dollars in 2014 [1], video games have rapidly become the biggest entertainment industry in the world.  However, the potential that video games have to change gamers remains relatively unknown, and is an aspect that researchers are dying to understand.

Hundreds of new studies are being published each year looking at various gaming aspects, such as how platformers increase spatial reasoning, or how online cooperative games improve social skills, and so on. However, as has long been the case, a portion of these studies are misleading and flawed, and over-dramatic yet flawed results seem more likely to get media attention.

Unfortunately, it seems that not even peer review can prevent all of the studies like these from making their way into academic journals. Take, for instance, a study published recently discussing how people who played the game Halo 3 poorly were more likely to be sexist. This study, which has been presented as factually sound by several gaming and tech sites, has several distinct flaws that need to be fixed before the research should be taken seriously by anyone.

That is not to say that the entire study should be tossed out. Some aspects, such as the way the researchers obtained their original material, are quite useful. For example, the researchers used three different gamertags on Xbox Live. Two of these tags used a male and a female voice to determine whether players reacted differently toward different genders; the third was used as a control and didn’t speak at all. Because the control never spoke, its data was removed on the grounds that it didn’t tell anything about gender interactions.

This method could prove very useful for determining whether players are generally more hostile online when certain signifiers are presented to different genders. Another useful factor was the use of outside transcribers to determine what was said in each match. This helps prevent personal bias and allows the study to properly quantify the exact data needed for its statistics. However, while these aspects are noteworthy, they do not save this study from being an otherwise poor one.

To start with, the data that this study used was not current in the least. It was obtained from a study on Halo 3 conducted back in 2012, a point in time when the game was already five years old. While using data from older studies is a perfectly acceptable practice, the researchers presented the older data as though it were new and original. This is reflected in their results section, which states “we played” rather than the more accurate “we obtained from the original data”. Due to the wording, readers are left with the false notion that this information is a current and accurate depiction of today’s gamer, rather than a reflection of the Halo 3 gaming community in 2012.

Despite this, the researchers did provide a thorough description of how they processed the data from their original study. For each match in the data set, the researchers had independent transcribers record anything that was spoken, without letting the transcribers know the purpose behind their transcriptions. The researchers then crosschecked 10 percent of all transcriptions and determined that they were accurate. Next, the author and an independent coder looked through the transcriptions for comments directed toward the experimental player. These comments were classified as positive, negative, or neutral. Negative comments were then checked to see if they contained any sexist remarks made to the female experimental player.

While the distinction was made in order to determine when females were more likely to receive sexist comments, it willfully ignored any sexist remarks directed toward men. Without this critical piece of information, readers and researchers alike cannot determine whether the number of negative sexist comments observed was normal for both male and female players, or whether female players received a significantly larger number of sexist comments than men. If the former is true, then the findings would only prove that players who perform poorly are generally more negative than players who perform well, hardly the insight into gender norms and online gaming that the authors and media coverage billed it as.

The study suffers from another major problem: it generalized old information to the public of today without any indication of who made up its sample, other than males who spoke while playing Halo 3 five years after it was released. This lack of information is partially due to the semi-anonymous nature of Xbox Live pseudonyms, but lacking key information such as location, ethnicity, or age makes it impossible to properly generalize this information to the public. They did not even record the one bit of demographic information that was available to them, player skill.

The manner in which skill was determined would not distinguish between, as an example, a teenager who may have never played competitive online games alongside women before and a fifty-year-old Klansman. Without the necessary demographic information, it is not possible to determine the significance of the data in relation to the general public. Without factoring in other potential causes for the sexist behavior, or even attempting to control for such factors, the author has attempted to convince the public that the only reason for such sexist attitudes is the fact that a woman is performing better than a man in a game, and nothing else.

While the researchers didn’t control for factors such as age, nationality, location, or even ethnicity, they did make sure to carefully explain just what information was used. In total, the older study from which the data was obtained had 1136 participants, with roughly 574 playing against the female voice and 567 against the male voice. However, these exact figures were omitted from the study because the author made sure to write only what he thought was “necessary” for the reader to form their own opinion.

This necessary information meant only looking at the participants that actually spoke, which ended up being 189 participants. While it is easy for a casual reader to assume that people who didn’t talk contributed nothing useful, it is critical for researchers to publish all of the information that they obtained. Including such “unnecessary” information is the difference between reporting that 1.9% or 13.4% of participants used sexist comments. Once stripped of context, as in media coverage, the latter figure is seven times greater than the former.

One of my old statistics professors told me in my undergrad program that you can prove anything using statistics. Change the number of participants in a study and you can prove that raising the temperature in a classroom improves the likelihood of scoring an A on a final exam. This is what the researchers proved: if you manipulate the data enough, you can show that male gamers who play poorly will make sexist comments toward women while losing.

Out of every player that participated on the female side of the study, only 1.9% made a sexist comment. That is 11/574 (or 1.9%). Break that down to only the players who were active teammates of the female experimental player, and we are left with 11/246 (or 4.47%). Reduce the numbers further to only the 84 participants that actually talked, and we’re at 11/84 (or 13.4%) of all participants using sexist remarks.
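The effect of shrinking the denominator can be checked with a few lines of Python. This is only a sketch that uses the counts as quoted above; the labels and the helper name `share_percent` are illustrative and do not come from the study.

```python
# Counts as quoted in this article; the labels are illustrative only.
SEXIST_COMMENTERS = 11

denominators = {
    "all participants on the female side": 574,
    "active teammates of the female player": 246,
    "participants who actually talked": 84,
}


def share_percent(count, total):
    """Return count / total as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


# Same numerator every time; only the denominator changes.
shares = {
    label: share_percent(SEXIST_COMMENTERS, total)
    for label, total in denominators.items()
}

for label, pct in shares.items():
    print(f"{label}: {SEXIST_COMMENTERS}/{denominators[label]} = {pct}%")
```

With the counts exactly as quoted, the last ratio comes out nearer 13.1% than 13.4% (13.4% would correspond to 11 out of 82), so one of those two figures may have been rounded or drawn from a slightly different count. Either way, the same 11 players account for anywhere from about 2% to about 13% depending on which group is used as the base.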

This generalization promoted by manipulated data is far from the only over-generalization due to inconsistent or incomplete data. Factors such as Playlist Rank are more dependent on the amount of time played than on skill. The Playlist Rank system only takes into account the number of times a player has won or tied, lost, or disconnected, rather than factors such as cooperation, assists, or even driving vehicles, let alone a proper kill-to-death ratio. In this study, Rank is believed to be a “status symbol” that shows “dominance” to the other players, rather than a factor that could easily be manipulated, or paid for.

In this respect, Rank is no more than a cosmetic decal that shows how long a person has been playing. While this could have been a contributing factor had this study looked at games such as Call of Duty, where specific activity directly relates to Rank and thus increases competitiveness to ensure a higher rank, this is not the case in Halo 3. The only time skill becomes a factor in Halo 3’s rank system is at a much higher level than typically used in this study.

A second factor this study tries to use is kill-death ratio as a measure of the player’s skill. According to the study, the number of kills you have relative to the number of deaths you’ve had typically shows your general level of skill. Again, this takes out all cooperative, team-like behavior. Just because a player isn’t killing the opposing team doesn’t mean they aren’t a credit to their team. Assists matter in a team-based game, and driving the vehicle while others take the kill is important. By removing these important factors, the data cannot accurately represent who performs best in a team-based game like Halo 3, and it is skewed to show only players who take aggressive action as dominant.

This study did show statistical significance for several interactions between the experimental player and the participants. It is important to understand that statistical significance is not proof of the conclusions the authors draw from it; it indicates only that the observed pattern would be unlikely to arise from random chance alone. Given the small sample size, even the variation due to chance is quite large.
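To get a feel for how much room chance leaves in a sample this small, here is a rough sketch that puts an approximate 95% Wilson score interval around the 11-out-of-84 figure quoted earlier. It is my own illustration, not a calculation taken from the study; the function name `wilson_interval` and the choice of the Wilson method are assumptions made for the example.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    )
    return centre - half_width, centre + half_width


# 11 sexist commenters among the 84 participants who talked (as quoted above).
low, high = wilson_interval(11, 84)
print(f"observed: {11 / 84:.1%}, approx. 95% interval: {low:.1%} to {high:.1%}")
```

Under these assumptions the interval runs from roughly 7% to 22%, a spread of about three to one around a single headline number.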

Statistical significance also does not account for the countless demographic confounders or other biases the researchers did not control for and that could easily correlate with in-game communication habits, nor for their inappropriate use of Rank (which effectively measures the amount of time playing the game) and kill/death ratio (which is influenced by cooperative team-oriented playstyles). Instead, they use the result as a launching-off point to propose an elaborate hypothesis about the sociology of “low-status males”, as if this uncontrolled, scant, and noisy data were capable of providing insight into the human condition.

Despite the willful manipulation of the statistics, the over-generalization of the participant motives, the denial of a very useful control, the refusal to acknowledge demographic differences, and the fact that this data comes from a game that’s nearly eight years old, journalists continued to use this information.

This data was used to say that people who are bad at playing games are sexist. This study only had 11 participants say sexist things, but those 11 participants became the public at large. This is why everyone should learn how to analyze a scientific paper, if only to come to their own conclusions.



I am a research student with a background in psychology. I am a fan of tactical RPGs and I love to travel. I hope to one day be a clinical psychologist.

  1. The Devil Within
    July 31, 2015 at 3:46 pm

    Thank you Cody, we appreciate this being analyzed. In the meantime, please ensure your own safety, this narrative was pushed by a huge number of interests, the backlash will be fierce from GJP 2.0 and the corrupt and dishonest academic clique that created this. Don’t let them have the satisfaction of hitting a soft target.

  2. MaidKillua
    July 31, 2015 at 4:00 pm

    I’m sorry, but skill in no way relates to rank in CoD either; if anything it’s MORE of a treadmill progression than Halo. Unless they changed it, that is; I stopped wasting my time after Black Ops.

  3. Dr. Evil's Brother's Evil Twin
    July 31, 2015 at 4:04 pm

    I’m getting sick of these people who claim to be gamers, but are only interested in pushing politics…

  4. Fall
    July 31, 2015 at 4:12 pm

    But of course this is just another of his…


  5. PenguinPlayer
    July 31, 2015 at 4:14 pm

    I need to find that study about how eating chocolate helps you lose weight that was published in a science magazine with a good reputation but was completely false and the people behind even admitted it’s false and did it to prove how easy it is to skew results these days.

    I’m glad those did it and proved how easy it was to skew results and publish complete rubbish even after peer review. It’s a sad day when we can’t even trust SCIENCE that comes from “respectable sources”.

  6. Cody Gulley
    Cody Gulley • Post Author •
    July 31, 2015 at 4:16 pm

    Oh I was using the information from the CoD wiki explaining XP (I have only played CoD 3 times in my life). It showed that xp was connected to quite a bit of skill based activities like scoring headshots, trickshots like placing a Semtex grenade on an enemy, or helping a teammate while they’re being attacked. This is the difference between the two rank systems. While both do count matches towards xp, CoD placed greater emphasis on other things.

  7. Aaron Roberts
    July 31, 2015 at 4:19 pm

    I know you guys might be wondering, “Who would fund such a useless and inaccurate study?” Well, it was the Australian Fucking Government, apparently not content with ruining their own country.

  8. Aaron Roberts
    July 31, 2015 at 4:21 pm

    I know you guys might be wondering, “Who would fund such a useless and inaccurate study?” Well, it was the Australian Fucking Government, apparently not content with ruining their own country.

  9. Misogynerd
    July 31, 2015 at 4:26 pm

    Didn’t read the whole article, but basically, players who are losing are more likely to use insults, especially personal insults. In general, from what I’ve seen less skilled players tend to take the game less seriously and more emotionally. Teens tend to be less skilled since they have less experience.

    So yeah duh, of course shitty players are more willing to insult you.

  10. Misogynerd
    July 31, 2015 at 4:29 pm

    Well they are interested in using games to push politics. Interactive Tumblr posts for everybody.

  11. Ubernoob8470
    July 31, 2015 at 4:37 pm

    The government spending taxpayer dollars on useless studies that only serve to push agendas and keep people with gender studies degrees employed?
    I, for one, am SHOCKED.

  12. xkumo
    July 31, 2015 at 4:40 pm

    What a surprise, the biggest nanny state in the world.

  13. Crizzyeyes
    July 31, 2015 at 6:21 pm

    This is completely true and the easiest way to see for yourself is to play Dota. Communication in that game is notably higher than in non-MOBA type games, due to the nature of the game. The most obnoxious player is almost always the worst one.

    If you are already good at Dota and consistently play with nice players, abandon two matches in a row so you get placed in low priority (or as I call it, “the shit pool”). If you manage to make it through a match in that pool without at least one of your teammates flaming someone, I’d just say you probably didn’t understand it in their language.

    Good players are often seen “flaming” one another on stream, but it becomes clear in time that it’s always just a show for the stream or in good fun. One of the most prolific players, Arteezy, often subtly trolls his viewers by picking bad items (and winning anyway).

  14. Lex
    July 31, 2015 at 6:45 pm

    The more effort that’s put into getting the facts out there the better. This was disgraceful.

  15. Arbitrary
    July 31, 2015 at 7:10 pm

    These people are scum.

  16. Misogynerd
    July 31, 2015 at 8:12 pm

    Yes, this is based on my experience with Peruvian dipshits in Dota, spanish is my first language. Yeah, there is a difference between trash talking and shit talking your allies in good fun and starting to insult people. Never got shit on any game until I played Dota 2.

  17. Laytonaster
    July 31, 2015 at 8:18 pm

    If I remember right, I think the guy who published it was part of Gawker or BuzzFeed. Just makes it all the worse really. I mean, someone who works for a site that has no basis in facts managed to get that shit in there so damned easily.

  18. Maiq TheLiar
    July 31, 2015 at 9:00 pm

    Being a 27-year-old whose heyday was Halo 2 & 3, my belief is that teens are actually More skilled. The reason why is because without adult responsibilities like a job or college or actual chores, and without the funds to buy a new game on a whim, teens usually have several free hours a day to pour into getting better and better at the same game over months and months.

    Mostly why I don’t play pvp games anymore. The teens have all the time to get the exact physics and systems of the game down to a level only the developers should understand.

  19. Cred
    July 31, 2015 at 11:27 pm

    This is a real study guys it is very serious look it says “Serious Study” in the title.

    Why would anyone publish information that is not true and lie?

  20. Megapewpew
    August 1, 2015 at 12:38 am

    They do it, because there is money in ed tech, and these parasites want that ed tech money, from the government.

  21. TheDVI
    August 1, 2015 at 1:25 am

    Positivism has turned the social sciences into nothing but cherrypicking statistics to suit a certain bias.

  22. Misogynerd
    August 1, 2015 at 1:47 am

    Well there is a range the 12-15 are pretty bad in general but 16-18 are the good ones.

  23. TheCynicalReaper
    August 1, 2015 at 3:06 am

    Good job ripping this stupidity a new asshole, Cody. Nicely done article.

  24. TheCynicalReaper
    August 1, 2015 at 3:07 am

    Haha Okay, fair point.

  25. TheCynicalReaper
    August 1, 2015 at 3:07 am

    Video didn’t come up for me, but yes. EDIT: Nevermind. I see you posted another one.

  26. Holyfox25
    August 1, 2015 at 4:41 am

    Weird. It double posted the video. I only have one video link in my comment. But yeah, the second one works.

  27. chizwoz
    August 1, 2015 at 5:23 am

    The social sciences haven’t been anything but left-wing propaganda for quite a while now.

  28. Antoinant
    August 1, 2015 at 6:44 am

    “This data was used to say that people who are bad at playing games are sexist.”
    So feminists in general ?

  29. Immeis-sus in the Shade
    August 1, 2015 at 11:36 am

    I am 100% sure that there is a research going on about whether fire is indeed hot, or is it just a common myth. Funded by some country.

  30. CyberEagle
    August 1, 2015 at 4:06 pm


  31. Nonscpo
    August 1, 2015 at 7:35 pm

    There were so many things wrong with this study, it wasn’t even funny :(

  32. Zachary Bower
    August 3, 2015 at 9:21 am

    You’re leaving out the part where they used more insults towards the players manipulated to sound female. I realize people are skeptical of this finding, some for good reasons, but that doesn’t mean pretending it wasn’t found is a good idea.

  33. Thanatos2k
    August 3, 2015 at 11:01 pm

    Another problem with this study is you cannot conclude that someone is sexist judging by what insults they use. Almost every troll on the internet says one thing but thinks another. They know insulting women using certain terminology will be more effective, while insulting men for being men is ineffective in the male-dominated online gaming space. So they use the best tool in the box.

  34. ThePete
    August 4, 2015 at 3:16 am

    Unfortunately it’s common for media to buy into the hype, so to speak, from a study’s author, instead of objectively assessing the data and conclusions.

    In this case the media got the narrative they were looking for so why check the data.

    Like you said presenting sweeping conclusions from such uncontrolled and noisy data doesn’t provide that level of insight.

  35. ThePete
    August 4, 2015 at 3:24 am

    The laughable thing is that ARC grants are highly competitive.

    Only 1 in 5 get funded.

    Granted this study is just a subset of the research, but doesn’t exactly bode well that tax payers are getting value for money.

  36. Dio Brando
    August 4, 2015 at 11:48 am

    >Ignoring and not taking data from the control group
    (sorry if I appear heated, I get very passionate about the scientific method)