What is in this curriculum?

We have spent many (sort of) early mornings waxing existential over Dunkin' Donuts while trying to define what makes a "data scientist for social good," that enigmatic breed combining one part data scientist, one part consultant, one part educator, and one part bleeding heart idealist. We've come to a rough working definition in the form of the skills and knowledge one would need, which we categorize as follows:

  • Programming, because you'll need to tell your computer what to do, usually by writing code.

  • Computer science, because you'll need to understand how your data is (and should be) structured, as well as the algorithms you use to analyze it.

  • Math and stats, because everything else in life is just applied math, and numerical results are meaningless without some measure of uncertainty.

  • Machine learning, because you'll want to build predictive or descriptive models that can learn, evolve, and improve over time.

  • Social science, because you'll need to know how to design experiments that validate your models in the field, understand when correlation can plausibly suggest causation, and sometimes even do causal inference.

  • Problem and project scoping, because you'll need to be able to go from a vague and fuzzy project description to a problem you can solve, which means understanding the goals of the project, the interventions you are informing, the data you have and need, and the analysis that needs to be done.

  • Project management, because you'll need to make progress as a team, work effectively with your project partner, and make a useful solution actually happen.

  • Privacy and security, because data is people and needs to be kept secure and confidential.

  • Ethics, fairness, bias, and transparency, because your work has the potential to be misused or have a negative impact on people's lives, so you have to consider the biases in your data and analyses, the ethical and fairness implications, and how to make your work interpretable and transparent to the users and to the people impacted by it.

  • Communications, because you'll need to be able to tell a broad audience the story of why your work matters and which methods you're using.

  • Social issues, because you're doing this work to help people, and you don't live or work in a vacuum, so you need to understand the context and history surrounding the people, places, and issues you want to impact.