Chapter 7. Web Scraping: Obtaining and Analyzing Draft Picks
One of the great triumphs in public analysis of American football is nflscrapR and, after that, nflfastR. These packages allow for easy analysis of the game we all love. Including data in your computing space is often as simple as downloading a package in Python or R, and away you go.
Sometimes it’s not that easy, though. Often you need to scrape data off the web yourself (use a computer program to download your data). While it is beyond the scope of this book to teach you all of web scraping in Python and R, some pretty easy commands can get you a significant amount of data to analyze.
In this chapter, you are going to scrape NFL Draft and NFL Scouting Combine data from Pro Football Reference. The site is a wonderful resource based in Philadelphia, Pennsylvania, and is owned by Sports Reference, which also provides free data for many other sports.
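To give a feel for what scraping involves before the details, here is a minimal sketch that turns an HTML table into a list of Python dictionaries using only the standard library. The `TableParser` class, the `parse_table` function, and the `SAMPLE_HTML` string are illustrative names and made-up placeholder content, not part of any package and not real draft data. The sketch assumes a simple table whose first row holds the column headers.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, or None
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up stand-in for a downloaded page; these are not real draft picks.
SAMPLE_HTML = """
<table>
  <tr><th>Rnd</th><th>Pick</th><th>Player</th></tr>
  <tr><td>1</td><td>1</td><td>Player A</td></tr>
  <tr><td>1</td><td>2</td><td>Player B</td></tr>
</table>
"""


def parse_table(html):
    """Return one dict per data row, keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


picks = parse_table(SAMPLE_HTML)
print(picks)
```

On a live page, the HTML string would come from a download step (for example, `urllib.request.urlopen`) rather than a hardcoded sample, and `pandas.read_html` is a common shortcut for the parsing step. Check a site's terms of use and request limits before scraping it.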
The NFL Draft is a yearly event held in various cities around the country. In the draft, teams select from a pool of players who are at least three years removed from high school. While it used to have more rounds, the NFL Draft currently consists of seven rounds. The draft order in each round is determined by how well each team played in the previous season: weaker teams pick earlier in each round than stronger teams. Teams can trade draft picks for other draft picks or players.
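The ordering rule described above can be sketched in a few lines of Python. This is a deliberate simplification: the team names and win totals are invented for illustration, and the hypothetical `draft_order` function ignores tie-breakers, playoff results, and traded picks, all of which affect the real order.

```python
# Hypothetical teams and win totals, invented purely for illustration.
wins_last_season = {"Team A": 12, "Team B": 4, "Team C": 9, "Team D": 7}


def draft_order(wins, rounds=7):
    """Return (round, pick_in_round, team) tuples, fewest wins first.

    Simplified: no tie-breakers, playoff seeding, or traded picks.
    """
    order = sorted(wins, key=wins.get)  # ascending wins: weakest team first
    return [
        (rnd, pick, team)
        for rnd in range(1, rounds + 1)
        for pick, team in enumerate(order, start=1)
    ]


picks = draft_order(wins_last_season)
print(picks[:4])  # the first round
```

With four teams and seven rounds, the sketch produces 28 picks, and the team with the fewest wins selects first in every round.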
The NFL Scouting Combine ...