
Data Security Primer#


Why this is important#

  • We have a lot of sensitive information
  • Much of it is private data about individuals
  • Legal agreements with partners require us to keep data safe

Security 101#

  • No such thing as absolute security
    • Consider your home
    • Can a dedicated attacker break into your home?
    • Do you lock your door?
  • Goal: Reduce risk of disclosure

What We Care About#

  • Confidentiality of project data
  • Login credentials to the servers and databases (and places where these credentials are stored)

Common DSSG Challenges#

  • Avoid: Committing database credentials, API keys, SSH keys, etc. to GitHub repos
  • Maintain awareness: IPython notebooks used for exploratory data analysis often contain confidential data (talk with your team about this)
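
One mitigation some teams use for the notebook risk above is to clear cell outputs before committing. The sketch below assumes the notebook follows the standard nbformat 4 JSON layout; `strip_outputs` is a hypothetical helper name, and it does nothing about confidential values typed directly into cell source.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered tables, plots, printed rows
            cell["execution_count"] = None  # reset the execution counter
    return cleaned
```

To use it, load the .ipynb file with `json.load`, pass the result through `strip_outputs`, and write it back with `json.dump`. Whether to strip outputs at all is a team decision, since it also removes results that reviewers may want to see.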

Commit with Confidence!#

  • Use git add filename to stage files individually
  • Before you commit, run git diff --cached to verify that what you have staged is what you expect
  • If there are files you want to be sure you never commit, add them to your .gitignore
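
The manual review step above can be backed up by a rough automated check. The following is a minimal sketch, not a complete secret scanner: the two patterns and the names `SECRET_PATTERNS` and `find_secrets` are illustrative assumptions, and a heuristic like this will miss many kinds of credentials.

```python
import re

# Heuristic patterns only; illustrative assumptions, not an exhaustive list.
SECRET_PATTERNS = {
    # scheme://user:password@host style URLs
    "url_with_password": re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@'\"]+:[^\s@'\"]+@"),
    # PEM headers such as "-----BEGIN RSA PRIVATE KEY-----"
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list:
    """Return (line_number, pattern_name) pairs for lines that look like secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

It could be applied to the text produced by git diff --cached before committing. A clean result is not proof that nothing sensitive is staged, so it complements reading the diff rather than replacing it.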


  • Use unique, strong passwords
  • Use a password manager, e.g. KeePass, LastPass, or 1Password
  • Use two-factor authentication when available (e.g. on GitHub)
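
Password managers usually generate strong passwords for you. Where one is not at hand, Python's standard `secrets` module can do the same job. This is a minimal sketch: `generate_password`, the default length, and the 12-character floor are illustrative assumptions, not a policy stated in this primer.

```python
import secrets
import string


def generate_password(length: int = 20) -> str:
    """Generate a random password from letters, digits, and punctuation."""
    if length < 12:  # assumed minimum, chosen for illustration
        raise ValueError("use at least 12 characters")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice draws from the OS's cryptographically secure random source
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

The `secrets` module is used rather than `random` because `random` is not designed for security-sensitive values.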

Database: Don't#

Don't commit the following:

from sqlalchemy import create_engine
engine = create_engine('postgresql://')

Database: Do#

Store these credentials in a separate file


Add this file to your .gitignore to ensure that you don't commit it

You can commit an example file to your repo dbcreds.example:
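
As a sketch only, assuming the credentials file is a plain Python module, an example file could hold nothing but placeholders. Every variable name and value below is hypothetical; teammates would copy the file to the real, ignored credentials file and fill in their own values.

```python
# dbcreds.example: copy to the real credentials file (listed in .gitignore)
# and replace each placeholder. All names and values here are hypothetical.
user = 'your_username'
password = 'your_password'
host = 'your_database_host'
dbname = 'your_database_name'
```

Committing only placeholders documents which settings are required without exposing any real secret.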


Database: Do#

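import sqlalchemy
import dbcreds  # local module listed in .gitignore

# Attribute names other than `user` are assumptions about dbcreds.
engine = sqlalchemy.create_engine(
    'postgresql://{conf.user}:{conf.password}@{conf.host}/{conf.dbname}'.format(conf=dbcreds))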
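
Credentials interpolated into a database URL need escaping when they contain characters such as `@` or `/`, otherwise the URL is parsed incorrectly. A minimal sketch using only the standard library follows; `build_url` and its parameter names are hypothetical.

```python
from urllib.parse import quote_plus


def build_url(user: str, password: str, host: str, dbname: str) -> str:
    """Build a postgresql:// URL, percent-encoding the user and password."""
    return "postgresql://{}:{}@{}/{}".format(
        quote_plus(user), quote_plus(password), host, dbname
    )
```

The resulting string can be passed to `sqlalchemy.create_engine` in place of a hand-formatted URL.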

Database: Do#

An even simpler option is a config file, importable as dbcreds and kept out of the repo:

config = {'sqlalchemy.url': 'postgresql://'}

And then connect:

import sqlalchemy
from dbcreds import config

engine = sqlalchemy.engine_from_config(config)

Beyond Content#

Cleaning Repos#

Mistakes Happen#

  • Avoid cleaning by not putting sensitive data in your repos

Web Applications#

If you end up creating a web application, be aware of security best practices: