Skip to content

Software Setup Session#

Overview#

  1. Make sure you are connected to the university wifi with your correct credentials.
  2. Set up SSH keys for the server
  3. Connect to the server (through SSH) and test your connection
  4. Connect to the Postgres database (both command line through the server and through a GUI - DBeaver) and test your connection.
  5. Set up text editor (VSCode for example) to work over ssh.

Motivation#

Every team will settle on a specific setup with their tech mentors. This setup will determine, for example:

  • which Python version to use
  • versioning via virtual environments
  • maintaining package dependencies
  • continuous testing tools
  • your version control workflow.

Today, we're not doing that. We're making sure that everybody has some basic tools they will need for the tutorials and the beginning of the fellowship, and that you can log into the server and database. Hopefully you already got many of these tools set up on your computer before arriving, but we'll quickly check that your local enviroment is working (and resolving any lingering issues), and then talk a bit about some of the remote servers and tools we'll be using throughout the summer.

You might not understand every detail of the commands we go through today, but don't worry if not -- we'll go into a lot more detail in the following sessions throughout the week. For today, we just want to be sure everyone has the tools they'll need for working on their projects.

Getting help

Work through the prerequisites below, making sure that all the software you installed works.

Affix three kinds of post-it notes to your laptop:

  • one with your operating system, e.g. Ubuntu 18.04
  • if you get an app working (e.g. ssh), write its name on a green post-it and stick it to your screen
  • if you tried - but the app failed - write its name on a red post-it

If you're stuck with a step, ask a person with a corresponding green post-it (and preferrably your operating system) for help.

The tech mentors will hover around and help with red stickers.

You will need a few credentials for training accounts. We'll post them up front.

Important

The notes below aren't self-explanatory and will not cover all (or even the majority) of errors you might encounter. Make good use of the people in the room!

Let's get started!

Windows Users: Setting up WSL#

We'll be using a lot of unix-based tools, but fortunately windows now provides a tool called "windows subsystem for linux" (WSL) that provides support for them. If you're using windows and didn't already set up WSL, let's do that now:

WSL Instructions for Windows

SSH#

The data you'll be using for the summer is generally of a sensitive nature and needs to be protected by remaining in a secure compute environment. As such, all of your work directly with the data will be on one of the remote servers we've set up. Let's see how connecting to those resources works:

Creating an SSH Keypair#

The setup information we sent out before the summer had some details on creating an SSH keypair, but if you need to revisit it, you can find that here:

Reaching the Compute Server#

Now let's make sure you can connect to the server. In your terminal, type:

ssh {your_jhu_id}@training.dssg.io

If you got an error about your ssh key, you might need to tell ssh where to find the private key, which you can do with the -i option:

ssh -i {/path/to/your/private_key} {your_jhu_id}@training.dssg.io

If you connected successfully, you should see a welcome message and see a prompt like:

yourjhuid@mrpzdssgadmin01: ~$

To confirm that you're connected, let's look at the output of the hostname command:

  • Type hostname at the shell prompt and then hit return
  • Did you get mrpzdssgadmin01?
    • If so, you're all set! Put a green post-it on the back of your monitor!
    • If not, put a red post-it on the back of your monitor and we'll help you out.

PRO tip

Your life will be easier if you set up a .ssh/config file

Accessing the Database#

Reaching the Database from the Compute Server#

We'll be using a database running PostgreSQL for much of our project data.

One way to connect to the database is via the command line from the server using psql. Since we're already logged onto the server, let's give that a try:

$ psql -h db.dssg.io -U {your_jhu_id} -d baltimore_311

This should prompt you to type your password (we'll tell you what it is separately) and then if all goes well, you should see something like:

psql (14.22 (Ubuntu 14.22-0ubuntu0.22.04.1), server 18.3)
WARNING: psql major version 14, server major version 18.
         Some psql features might not work.
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

baltimore_311=>

Let's make sure you can interact with the server:

  • Type SELECT CURRENT_USER; and then hit return
  • Did you get back your andrew id?
    • If so, you're all set! Put a green post-it on the back of your monitor!
    • If not, put a red post-it on the back of your monitor and we'll help you out.

Finally, exit out of psql with \q:

baltimore_311=> \q

Reaching the Database from Your Computer#

Often it can be a little easier to access the database via GUI system from your computer. To do so, we recommend setting up DBeaver, since it offers a free version and works with every operating system (but if you already have a SQL client like dbvisualizer or data grip that you know how to use and prefer, feel free to use that instead).

Protecting your data

Although it's fine to run queries and look at the data from a SQL client like dbeaver running on your computer, you should never use that client to download an extract of the data (in whole or part) to your local computer, excel, google sheets, etc.

To get DBeaver, you can install it directly from the DBeaver Website. NOTE: you'll want to install the free "community edition" version.

DBeaver Version

If you've installed DBeaver previously, be sure you're using at least version 25 (if you're on an earlier version, please upgrade before proceeding with the setup)

Because the database is only accessible from our compute servers, we'll have to use an SSH Tunnel to connect to it via dbeaver. Here's how to set that up:

First, create a new connection to a postgres database:

creating a connection in dbeaver

Next, fill in the details for the database server on the window that pops up, following the example below:

postgres connection details

Note that your password here can be found in the .pgpass file that we looked at on the server.

Got an error like invalid privatekey: [B@7696c31f?

You may need to install a different SSH package called "SSHJ" -- under the Help menu, choose "Install new Software" then search for SSHJ and install the package (you'll need to restart dbeaver). After restarting, choose “SSHJ” in the drop-down under advanced (should be labeled either ”Implementation” or “Method”) when setting up the tunnel

Note: some versions of DBeaver have SSHJ pre-installed, so you might not find it in the "install new software" dialog. If so, check for it under the advanced settings when setting up the SSH tunnel.

Let's make sure you can connect to the database:

  • Try connecting to the database and opening a new sql script
  • Type SELECT CURRENT_USER; and then press Ctrl-Enter to run it
  • Did you get back your jhu id?
    • If so, you're all set! Put a green post-it on the back of your monitor!
    • If not, put a red post-it on the back of your monitor and we'll help you out.

Text Editor#

You'll want a good text editor to write code installed on your local machine. We recommend using VSCode because it's both free and allows you to directly edit code stored on a remote machine over SSH (this comes in very handy since all of your code will need to run on the server in order to keep the data in our secure environment). If you already have a different editor that you prefer (e.g., Sublime, PyCharm, emacs, etc.), you're welcome to use that (though note that you may need do some extra work to make sure you keep your local code in sync with the server).

We sent out some instructions on installing VSCode before the summer, but if you need to revisit them, you can find them below (be sure to install the Remote-SSH extension as well):

PRO tip

You might also want to install the microsoft python extension, which provides some additional features such as improved auto-completion when coding in python and the jupyter extension, which allows to use jupyter notebooks without the need of a jupyter server.

Editing Files Remotely Over SSH#

In the same way we set up DBeaver to use SSH to talk to our remote infrastructure, we can set up VSCode to remotely edit files stored on the server using the Remote-SSH extension you should have installed in the step above. Let's set that up and make sure we can talk to the server:

  1. Configure our training server as an SSH host:

    With the SSH plugin installed, we can tell VSCode how to log into the server. In this step we'll be entering our connection string and saving it in a file, making it easy to connect in the future.

  2. Press ctrl+shift+p (Linux/Windows) or ⌘+shift+p (MacOS) to open the command pallette, and select Remote-SSH: Connect to Host

  1. Select Add New SSH Host...

  1. Enter ssh {jhuid}@training.dssg.io

  1. Select the first option to store your login config:

  1. Connect VSCode to the course server:
  2. Press ctrl+shift+p (Linux/Windows) or ⌘+shift+p (MacOS) to open the command pallette, and select Remote-SSH: Connect to Host

  1. Select the ssh config we just created: training.dssg.io

  1. Enter your JHU ID password when VSCode prompts you to (it will open a box at the top of the screen). You'll need to authorize your connection via MFA on your Microsoft Authenticator App.

  2. You should be connected to the training server. This should be indicated in the bottom of your VSCode window:

  3. Open a workspace folder:

    Now that VSCode is connected via SSH, you can browse all of the files and folders on the server. In this step, we select a folder containing some code to edit and test.

  4. Select the folder menu button

  1. Select Open Folder

  2. Select a folder to work in

Let's make sure you can connect to the server through VSCode:

  • Were you able to SSH and open the remote baltimore_311 folder?
    • If so, you're all set! Put a green post-it on the back of your monitor!
    • If not, put a red post-it on the back of your monitor and we'll help you out.

Access from Off-Campus: JHU VPN#

For an additional layer of security, our infrastructure can only be reached from the JHU campus network. As such, if you're not on campus, you'll need to use the JHU VPN to reach the server:

  1. Download the Ivanti Secure Access VPN client from here and install it

  2. Make sure you have enrolled in Azure MFA. Instructions here

  3. Request VPN access here

  4. Open the Ivanti Secure Access Client

    • Add a new connection

  1. Enter login credentials (be sure to select the "Policy Secure (UAC) or Connect Secure (VPN)" option):
    • Type: Policy Secure (UAC) or Connect Secure (VPN)
    • Name: “JHU VPN”
    • Server URL: vpn.jh.edu

VPN URL

Note that the URL has jh not jhu

  1. Click connect
    • An MFA authentication window will show up to ask for yourJHU ID password
    • Enter your MFA code from your authenticator app

If you connect successfully, you will have on the connections window a connection with the "Disconnect" button and the status of the connection will be "Connected".

When you don't need to be connected to the VPN anymore, disconnect from the VPN by clicking the "Disconnect" button.

Other (Optional) Local Tools#

Most of your work this summer will be on the remote server for your project, but there are a few of other tools that might be handy to have installed locally if you feel inclined: