Recently, there have been claims that the AssaultRunners in Lane 8 at both the European and North America West Semifinals were mis-calibrated. Based on Event 5 placements, it was postulated that the AssaultRunner in Lane 8 was significantly slower than the other runners in Europe and significantly faster in North America West. A statistical analysis combining the men and women in Europe has been conducted, but none has examined the European women and men separately, or the athletes in North America West.
The aim of this article is to statistically analyze the placements on Event 5 for all lanes across the male and female fields in Europe and NA West. If you don’t care to read through this analysis, skip to the final heading of this article (“Final Takeaways”) for the general review and final thoughts. If you want to read through the analysis in full, we’ll first look at the raw data for both the NA West and European fields on Event 5.
Statistical Analysis
To begin, placement in Event 5 was found for each lane. Groups were created based on lane assignment. For most lanes the sample size was 5-6 athletes. The exceptions were lanes 4 and 10 in the European men’s field where the sample size was 4.
Group size matters because a one-way ANOVA (a statistical test used to assess differences between group means) is more robust, and has more power to detect true differences between groups, when the group sizes are roughly equal. So, even though the samples are small, the near-equal group sizes let the test make the most of the data available.
Table 1.
Mean Event 5 Placements
Next, we’ll visualize the data from both competitions for the women’s field. This is done just to get a gist of what the data looks like and to note any potential errors or patterns.
Figure 1.
Event 5 Placement Means & Variation of European and NA West Women
Subjectively looking at this data, it appears that for the European women the mean placement in Lane 8 is close to all the other lanes except for maybe lanes 4, 5, and 7. Looking at the women in North America West, Lane 8 appears to be different from all other lanes. In particular, Lane 8 and Lane 1 seem to have a large difference between means.
Next, we’ll look at the standard deviation of the samples (the two scatter plots located at the bottom of fig. 1). Essentially, we’re analyzing how much variance was in each lane (e.g., Lane 8 in NA West had a 1st and a 42nd place finish, creating high variance). This can help determine if all athletes in the same lane performed similarly. If we think Lane 8 had an unfair advantage over all the other lanes in NA West, we would expect little variance, as all athletes would have placed well in Event 5 (due to the runner being faster).
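As a minimal sketch of the per-lane summary described above, the snippet below computes the mean and sample standard deviation for each lane. The placements are invented for illustration and are not the actual Event 5 results; `placements_by_lane` and `lane_summary` are hypothetical names.

```python
from statistics import mean, stdev

# Hypothetical placements keyed by lane; these numbers are invented for
# illustration and are NOT the actual Event 5 results.
placements_by_lane = {
    1: [30, 41, 22, 35, 28],
    8: [2, 5, 8, 10, 27, 40],
}


def lane_summary(groups):
    """Return {lane: (n, mean, sample standard deviation)}."""
    return {
        lane: (len(values), mean(values), stdev(values))
        for lane, values in groups.items()
    }


summary = lane_summary(placements_by_lane)
for lane, (n, avg, sd) in summary.items():
    print(f"Lane {lane}: n={n}, mean={avg:.1f}, sd={sd:.1f}")
```

In this made-up example the lane with the better mean also has the larger spread, which is the pattern to watch for: a good average can coexist with very inconsistent finishes.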
With the standard deviation added, we can see that the variance for all the lanes is large (note: a Levene’s test was used to ensure variance was equal before conducting the ANOVAs). Although Lane 8 in Europe was one of the slower lanes on average, the spread is large across all lanes, with a lot of overlap. Likewise, even though the mean placement of Lane 8 in NA West appeared to be much higher, there is a lot of variability in performance. Looking back at Table 1, two athletes in that lane finished 25th and 42nd respectively. So, even though there were four top 10 finishes in Lane 8 for the West, there’s still a large spread overall due to these two athletes with lower placements.
Next, we’ll run a one-way ANOVA. This will statistically analyze if the means of each lane are significantly different.
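For readers who want to see what the test computes, here is a rough sketch of the F statistic behind a one-way ANOVA, written with the standard library only. The function name `one_way_anova_f` and the toy data are assumptions made for illustration; this is not the software or the data used for the tables in this article. It stops at the F statistic, because the p-value requires the F distribution (SciPy's `scipy.stats.f_oneway` returns both).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each lane mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of athletes around their own lane mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data (hypothetical): three "lanes" with three placements each.
toy_lanes = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_between, df_within = one_way_anova_f(toy_lanes)
print(f_stat, df_between, df_within)
```

The ratio makes the logic of the test visible: F grows when lane means are far apart and shrinks when athletes within the same lane are spread widely.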
Table 2.
One-Way ANOVA of Women’s Fields
What we can take from both one-way ANOVAs is that neither the European nor the NA West women’s field placement on Event 5 was significantly impacted by lane assignment (p > .05). That might seem impossible (especially in NA West) given the appearance of the bar graph (fig. 1). But if we break down the actual means of these lanes, it’s a little clearer.
Table 3.
European Women Event 5 Average Placements per Lane
The overall means for each lane are about what we would expect. Lane 5 has the highest placement, followed by lanes 7 and 4. Arguably the most interesting mean is Lane 6, which has the second-lowest average placement. Theoretically, because the highest-ranking athletes in the competition are placed in lanes 4-6, Lane 6 should have one of the highest placements. If Lane 8 is in question for a potentially slower runner, Lane 6 should be under investigation too.
Table 4.
North America West Women Event 5 Average Placements per Lane
Moving on to the North America West means, we do see what appears to be a large difference in Lane 8 from all the other lanes. It has the highest placement on average and is almost half the value of other means. The best finishes came from three lanes that should have some of the lower ranking athletes in the competition (lanes 10, 3, and 8). So why isn’t Lane 8 significantly different from the other lanes in our analysis?
Recall the overlapping standard deviation bars in figure 1. With this wide spread of performance, we really can’t be sure if lane assignment was accounting for the placements seen on Event 5 (note: further analysis found that for the NA West women’s field, lane assignment only accounted for 10% of the variance seen in Event 5 placement). With this high variance, even though there does appear to be a difference in Lane 8 for the West, it is not statistically significant.
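A "share of variance accounted for" figure like the one in the note above is commonly expressed as eta squared: the between-lane sum of squares divided by the total sum of squares. The sketch below assumes that definition (the exact measure behind the 10% figure is not stated here) and uses invented toy data; `eta_squared` is a hypothetical helper name.

```python
from statistics import mean


def eta_squared(groups):
    """Return the proportion of total variance explained by group membership."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the lane means around the grand mean, weighted by lane size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of every individual placement around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between / ss_total


# Toy data (hypothetical): three "lanes" with three placements each.
toy_eta = eta_squared([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(toy_eta)
```

A value near 0.10 would mean that roughly 90% of the variation in placement comes from something other than lane assignment, such as differences between the athletes themselves.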
Another supporting analysis is to look at the male fields. If we’re questioning the runners for the women’s fields, the men’s fields should also show similar trends because they used the same runners.
Figure 2.
Event 5 Placement Means and Variation of European and North America West Men
Looking at the visuals of the data, we do see that Lane 8 for the European men had lower Event 5 placements than all the other lanes. We also see that Lane 8 in North America West had the best mean placement, but Lane 9 is almost equal. Right now, just looking at the visual data, the men’s and women’s fields are showing the same general trends of placement for Event 5.
The variance in performance on the men’s side in Europe suggests a possibly significant difference between Lane 8 and lanes 4 and 5. We see no overlap between those performances, and the means appear to be largely different. In NA West, the Lane 8 performance variance is almost identical to that of Lane 9. The variance is also larger in this competition, and nothing stands out as possibly being significant, although the means do differ. We’ll run the ANOVA again to see if there is statistical significance.
Table 5.
One-Way ANOVA of Men’s Fields
The one-way ANOVA for North America West has no significant values, but the ANOVA of the European men has a significant p-value (p = 0.0484). This means two or more of the lanes are significantly different from each other in their placement for Event 5. Using a Tukey post-hoc test (a follow-up analysis to see which specific lanes differ from each other), it is found that lanes 4 and 8 are significantly different (p = 0.0451).
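To illustrate what a Tukey-style post-hoc comparison measures, the sketch below computes the Tukey-Kramer q statistic for every pair of lanes: the gap between two lane means divided by a standard error built from the pooled within-lane variance. The helper name `tukey_kramer_q` and the toy data are assumptions, and the sketch stops short of p-values, which require the studentized range distribution (recent SciPy versions provide `scipy.stats.tukey_hsd` for that).

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def tukey_kramer_q(groups):
    """Return {(lane_a, lane_b): q} for every pair of lanes.

    `groups` maps a lane label to its list of placements.
    """
    n_total = sum(len(v) for v in groups.values())
    # Mean square within: pooled spread of athletes around their own lane mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    ms_within = ss_within / (n_total - len(groups))
    q_stats = {}
    for a, b in combinations(groups, 2):
        standard_error = sqrt(
            ms_within / 2 * (1 / len(groups[a]) + 1 / len(groups[b]))
        )
        q_stats[(a, b)] = abs(mean(groups[a]) - mean(groups[b])) / standard_error
    return q_stats


# Toy data (hypothetical), keyed by lane number.
toy_q = tukey_kramer_q({4: [1, 2, 3], 5: [2, 3, 4], 8: [3, 4, 5]})
print(toy_q)
```

Because the pooled within-lane variance sits in the denominator, a pair of lanes with very different means can still produce a small q when athletes within each lane finish far apart.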
One important note for the variances seen in samples: high variances can bias the ANOVA to miss statistically significant relationships. Likewise, low variances can bias the ANOVA to wrongly find statistically significant relationships. Both ranges of variances have pros and cons for statistical analyses. We’ll come back to this later, but for the final analysis we’ll try to address the low sample size (which can help decrease the variance of our analysis) by combining the male and female fields. Once again, we start with visualizing the data and then running an ANOVA.
Figure 3.
Event 5 Placement Means of European and North America West Men and Women
Table 6.
European and North America West Men and Women ANOVA
The one-way ANOVA for the European men and women has a significant p-value (p = 0.0131). So, two or more of the lanes are significantly different from each other. We’ll use a Tukey post-hoc test again. The Tukey test shows that lanes 7 and 8 are significantly different (p = 0.0457). Lanes 5 and 8 were almost significantly different from one another (p = 0.0548).
Putting It All Together
One of the biggest takeaways from this analysis is that athlete variance reduces the assumed significance of the runners in Lane 8 for both the NA West and Europe competitions. At first glance, looking at the visualized data, it seems like Lane 8 might have had an unfair advantage (in NA West) or disadvantage (in Europe). But the variance in athlete performance clouds the differences we see. The variance in athlete performance per lane is too large to make any conclusions about whether the runners in Lane 8 were calibrated unfairly.
As alluded to earlier, variance is essentially noise in data, and it can hide significant relationships. It’s possible that the runners in Lane 8 for Europe and North America West were mis-calibrated. But because of the wide spread of athlete performance and the limited sample size, we will never know definitively if the runners were unfair. More data points are needed to accurately assess this question. We did see a decrease in the p-value when we combined men and women, but 8 to 12 samples are still not enough to appropriately assess this data given the spread in performance per lane.
An alternative analysis would be to use the placement per heat. So, instead of athletes receiving a placement between 1-60, they would fall within the range of 1-10. This would greatly limit the variance seen and would be biased toward finding significant relationships between lane assignment and Event 5 placement. WodScience has conducted this analysis for the European competition (combining the male and female fields) and did indeed find many significant differences between lanes. While this analysis is not wrong in any way, it does have limitations. Just as the analysis conducted here was biased because of high variance, the alternative analysis is biased because of low variance.
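The effect of switching from overall placement to heat placement can be sketched as follows. The function re-ranks athletes within their own heat, so every heat produces placements starting from 1. The tuples of (heat, lane, overall placement) are invented for illustration, and `heat_placements` is a hypothetical helper, not the procedure WodScience used.

```python
from collections import defaultdict


def heat_placements(results):
    """Map (heat, lane) to the athlete's placement within that heat (1 = best).

    `results` is an iterable of (heat, lane, overall_placement) tuples.
    """
    by_heat = defaultdict(list)
    for heat, lane, overall in results:
        by_heat[heat].append((overall, lane))
    ranked = {}
    for heat, entries in by_heat.items():
        # A lower overall placement means a better finish, so sort ascending.
        for rank, (_, lane) in enumerate(sorted(entries), start=1):
            ranked[(heat, lane)] = rank
    return ranked


# Hypothetical results: two heats of three athletes each.
toy_results = [
    (1, 8, 42), (1, 7, 50), (1, 6, 45),
    (2, 8, 1), (2, 7, 3), (2, 6, 10),
]
toy_ranked = heat_placements(toy_results)
print(toy_ranked[(1, 8)], toy_ranked[(2, 8)])
```

In this invented example the two Lane 8 athletes placed 42nd and 1st overall, yet both won their heat. Re-ranking within heats removes the spread between heats, which is why the heat-based analysis sees far less variance per lane.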
Final Takeaways
The bottom line is that there is too much variance in athlete performance per lane to conclude whether any runner gave an unfair advantage, and more samples are needed to address this issue. An alternative analysis using heat placement instead of overall Event 5 placement (thus reducing the range from 1-60 places to 1-10 places) did find the AssaultRunner in Lane 8 for the European competition to be significantly slower than some of the other lanes.
The question now becomes which analysis (i.e., using placement in Event 5 overall versus placement per heat for Event 5) is most representative of the population (i.e., NA West and European competitions). Arguably, the overall placement for the Event best represents the populations, as it precisely reflects placement for the entire population rather than just a sample of it (as seen in heats).
Looking at the graphs above, it does look like we should find significant differences in the means, particularly in NA West. But how much of an impact did the runners have in NA West if one female athlete places 1st and another places 42nd? Or given that males in lanes 8 and 9 had almost identical placement means (18.16 and 19.6 respectively)? The variance accounts for these performances and challenges conclusions that may be tempting to make based purely on the means.
For the Europe competition, people argued that Lane 8 had a slower runner because athletes who “should have placed well” in Event 5 didn’t finish as high as expected. But the average placement of the European women in Lane 8 is very similar to that of lanes 1-3 and 6. Did all these lanes have slower runners then? The variance of performance within and between lanes is too high to conclude anything about a potentially mis-calibrated AssaultRunner.
We can point at a few lanes that seemed faster or slower than the rest. But at the end of the day, athlete performance varied for each lane. On average it appears that lane assignment influenced the placements for Event 5, but multiple athletes demonstrated that the runner alone didn’t dictate placement. If the runner was having as much of an influence as claimed, we would see athletes assigned the same lane consistently finishing near each other with little variance, regardless of whether overall or heat placement was used to conduct the analysis. The means of the placements don’t tell the whole story and can’t be used to make any conclusions.
A final quote for consideration:
“A statistician confidently waded through a river that was on average 50 cm deep. He drowned.”
-Godfried Bomans