Discussion:
During my analysis of the data, I found a strong, statistically significant correlation between a Premier League player’s height and his ability to win headers. I can therefore reject the null hypothesis that a Premier League player’s height does not affect his ability to win headers.
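As a minimal sketch of how such a correlation and its significance could be checked, the example below computes Pearson's r and a permutation p-value. The numbers are made up for illustration and are not the data from this project; the function names and the choice of a permutation test are assumptions, not necessarily the test used in the analysis.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value: share of shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between height and win rate
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up illustrative values, NOT the Premier League data used in the analysis.
heights_cm = [170, 174, 178, 180, 183, 185, 188, 190, 193, 196]
aerial_win_pct = [31, 38, 35, 44, 47, 52, 50, 58, 61, 66]

r = pearson_r(heights_cm, aerial_win_pct)
p = permutation_p_value(heights_cm, aerial_win_pct)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

A small p-value here would be the basis for rejecting the null hypothesis of no association; it does not by itself show that height causes the higher win rate.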
I also found that this relationship persists when position is included as a factor variable. As expected, however, it appears to be stronger for strikers and defenders than for midfielders, since the ability to win aerial challenges is generally seen as more important in those positions.
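One simple way to compare the strength of the height effect across positions is to fit a separate least-squares line for each position and compare the slopes. The sketch below does this with made-up rows; the position labels, values, and helper name are hypothetical, and a single model with a height-by-position interaction term would be an equally reasonable choice.

```python
from collections import defaultdict


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up rows: (position, height in cm, aerial win rate in percent).
# These are NOT the values from the analysis.
players = [
    ("defender", 180, 48), ("defender", 186, 57), ("defender", 192, 66),
    ("midfielder", 172, 40), ("midfielder", 180, 43), ("midfielder", 188, 47),
    ("striker", 176, 35), ("striker", 184, 46), ("striker", 192, 58),
]

# Group the rows by position.
by_position = defaultdict(list)
for position, height, win_pct in players:
    by_position[position].append((height, win_pct))

# Fit one line per position and keep the slope (percentage points per cm).
slopes = {}
for position, rows in by_position.items():
    heights = [h for h, _ in rows]
    win_pcts = [w for _, w in rows]
    slope, _intercept = ols_slope_intercept(heights, win_pcts)
    slopes[position] = slope

for position, slope in sorted(slopes.items()):
    print(f"{position}: {slope:.3f} percentage points per cm")
```

With real data, the slopes should be compared together with their standard errors, since a visible difference between positions may not be statistically significant.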
I also found that a player’s ability to win headers does not seem to improve or change significantly with experience in the Premier League (measured by matches played).
Finally, I found no concrete evidence that, for attacking players, the ability to win aerial balls translates into goals or assists. However, I did observe very large differences between two groups of attacking players in their ability to translate aerial challenges into goals and assists. Further analysis is suggested: data on the roles of the attacking players would show whether this ability depends on a player being a winger or a striker.
Limits:
There are several limits to my analysis. First and foremost, the way missing values are handled may skew the data. The analysis treats players who engaged in 0 aerial battles throughout the season as N/A rather than as having a 0 percent win rate, so they are excluded from the analysis. This may bias the win rates upward compared with counting them as 0 percent. These players are still important to consider, as they may be avoiding aerial battles because of their height.
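The effect of this choice can be illustrated with a small sketch. The player names and duel counts below are made up, and the helper name is hypothetical; the point is only to show how excluding players with no aerial duels gives a higher average than counting them as 0 percent.

```python
# Made-up records: name -> (aerial duels won, aerial duels contested).
# These are NOT players or values from the analysis.
duels = {
    "player_a": (30, 50),
    "player_b": (12, 40),
    "player_c": (0, 0),  # never contested an aerial duel
    "player_d": (0, 0),  # never contested an aerial duel
}


def win_rate(won, contested):
    """Win rate in percent, or None (N/A) when no duels were contested."""
    if contested == 0:
        return None
    return 100.0 * won / contested


rates = {name: win_rate(won, contested) for name, (won, contested) in duels.items()}

# Option 1: drop N/A players, as the analysis does.
observed = [rate for rate in rates.values() if rate is not None]
mean_excluding_na = sum(observed) / len(observed)

# Option 2: count N/A players as a 0 percent win rate.
zero_filled = [rate if rate is not None else 0.0 for rate in rates.values()]
mean_zero_filled = sum(zero_filled) / len(zero_filled)

print(f"mean excluding N/A:  {mean_excluding_na:.1f}%")
print(f"mean with N/A as 0%: {mean_zero_filled:.1f}%")
```

Neither option is clearly correct: a player with no duels has no observed win rate, so reporting the number of excluded players alongside the results may be the most transparent approach.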
Further into the analysis, I found that there seem to be two very distinct groups of players within the attackers dataset. If I had identified this earlier, or had an idea of what separates the two groups, I might have been able to provide a better analysis of the dataset. Points from one group may act as outliers relative to the other, and linear regression is very sensitive to outliers.
I mainly used linear regression in this project, and linear regression is generally not a suitable model for nonlinear relationships. While one might expect the relationship to be approximately linear, the presence of two types of players means that a non-linear model, or a separate model for each group, may be a better fit if these groups of players can be separated and isolated.
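To illustrate why two distinct groups can hurt a single linear fit, the sketch below compares one pooled regression line with one line per group, using the residual sum of squares (SSE) as the measure of fit. The data and the group labels "group_1" and "group_2" are made up; the actual dataset does not record which group a player belongs to, so this assumes the groups could somehow be identified first.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(xs, ys):
    """Residual sum of squares of the least-squares line through (xs, ys)."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data: aerial duels won (x) versus goals plus assists (y).
# "group_1" converts aerial wins into output; "group_2" barely does.
groups = {
    "group_1": ([10, 20, 30, 40], [4, 8, 12, 16]),
    "group_2": ([10, 20, 30, 40], [9, 9, 10, 10]),
}

# One line fitted to all attackers together.
all_x = [x for xs, _ in groups.values() for x in xs]
all_y = [y for _, ys in groups.values() for y in ys]
sse_pooled = sse(all_x, all_y)

# One line per group, with the errors added up.
sse_grouped = sum(sse(xs, ys) for xs, ys in groups.values())

print(f"pooled SSE:  {sse_pooled:.2f}")
print(f"grouped SSE: {sse_grouped:.2f}")
```

Separate fits can never have a larger SSE than the pooled fit on the same data, so a lower value alone is not proof of two groups; a formal comparison would also need to account for the extra parameters.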
Conclusion:
It seems that Premier League clubs are justified in paying a premium for taller players, as taller players do tend to win significantly more aerial challenges. However, the ability to use these aerial victories effectively does not seem to be innate in all players; winning aerial challenges does not often translate into goals or assists. If a club is paying extra for a tall attacking player only for his goals and assists, it should analyze further whether he can effectively translate aerial challenges into goals and assists. It is also unwise to buy a tall player in the hope that his heading ability will improve through experience and learning in the league, as there is no evidence that game time and appearances have an effect on the ability to win headers.