The NBA Data Project started as an ECON 103 – Statistics final group project at Lewis & Clark College during the Spring 2015 semester. Visiting professor Zhaochen He limited the data sources for the project to American FactFinder, FRED and DatabaseSports. Our group found that DatabaseSports had a downloadable archive of ABA/NBA player and team stats current through the end of the 2008-09 season.

One of the goals of the final project was to do descriptive statistics, graphing and regression analysis using R, which was introduced in the last month of class. Instead of trying to transform and format the data in R, I leveraged my past database experience to make the regression analysis in R as straightforward as possible. The original CSVs were cleaned and then loaded into Microsoft Access for initial analysis. Once the initial analysis was complete, the data was loaded into a Microsoft SQL Server database. The raw data needed for regression analysis was transformed using a series of T-SQL stored procedures and then read into R via ODBC.
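
As a rough illustration of that flow, here is a minimal sketch in Python. It is not the project's actual code: the real pipeline used SQL Server, T-SQL stored procedures and R over ODBC, while this sketch swaps in an in-memory SQLite database, a view in place of a stored procedure, and a hand-written least-squares fit in place of R. The table name, column names and the numbers in the sample CSV are all made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical cleaned CSV. The columns and values are invented for this
# sketch and are NOT taken from the DatabaseSports archive.
CLEANED_CSV = """team,season,wins,points_per_game
AAA,2007,30,95.0
BBB,2007,40,100.0
CCC,2007,50,105.0
DDD,2006,20,90.0
"""


def load_rows(conn, csv_text):
    """Load the cleaned CSV into a database table (stand-in for SQL Server)."""
    conn.execute(
        "CREATE TABLE team_stats ("
        "team TEXT, season INTEGER, wins INTEGER, points_per_game REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    # Column type affinity converts the CSV strings to numbers on insert.
    conn.executemany(
        "INSERT INTO team_stats "
        "VALUES (:team, :season, :wins, :points_per_game)",
        reader,
    )
    # A view plays the role of a stored procedure here: it does the
    # filtering and reshaping so the analysis side gets ready-to-use rows.
    conn.execute(
        "CREATE VIEW regression_input AS "
        "SELECT points_per_game AS x, wins AS y "
        "FROM team_stats WHERE season = 2007"
    )


def fetch_regression_input(conn):
    """Read the already-transformed rows, as R would over ODBC."""
    return conn.execute("SELECT x, y FROM regression_input").fetchall()


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_rows(conn, CLEANED_CSV)
    slope, intercept = fit_line(fetch_regression_input(conn))
    print(f"wins = {intercept:.1f} + {slope:.1f} * points_per_game")
```

The point of the design is that all filtering and reshaping happens in the database, so the analysis step only has to read a finished table and fit the model.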

We were in a bit of a time crunch to get the final project done, so the final paper probably wasn’t our best work, but you can take a look at it here.

When I have time, I’ll post the cleaned data files and various scripts on GitHub at https://github.com/nbadataproject.