Using DataHub as a common computational environment
While in principle, it’s fine to use your laptop, it will be hard for us to diagnose all the different problems that might arise with so many participants. So we recommend using a common computational environment during the workshop. We’ll use the Berkeley DataHub for this. You can think of DataHub as providing user-specific virtual machines with identical software for all of us to use.
The exception is if you already have experience using your laptop with the shell/terminal, Git (including pushing to GitHub), and Python; in that case, working on your laptop is fine.
Click here to access your own machine/server with Python, git, and terminal access on the campus DataHub (also used for various data science, CS, and statistics courses). (Do NOT go directly to datahub.berkeley.edu the first time you are accessing DataHub. If you do, you won’t have the repository materials available in your virtual machine.)
That will get you onto your server, with the materials for the course (from the berkeley-scf/compute-skills-2024 repository) available on the server.
Check your environment
To check things look good:
- Click on Terminal (in the 3rd row of the Launcher tab, under Other). (If you have trouble finding the Terminal, go to File -> New -> Terminal.)
- Check that your working directory is ~/compute-skills-2024 (by looking at the terminal prompt or running pwd).
- Run ls to see the files in the repository.
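If you prefer to run the same check from Python (for example, in a notebook cell), here is a minimal sketch. The helper name is_in_repo is hypothetical, and the check compares only the directory name, so it assumes the notebook was started inside the repository directory.

```python
from pathlib import Path

# Directory name taken from the text above; everything else here is illustrative.
EXPECTED_DIR = "compute-skills-2024"


def is_in_repo(cwd: Path, expected: str = EXPECTED_DIR) -> bool:
    """Return True if cwd is the repository directory (compared by name only)."""
    return cwd.name == expected


cwd = Path.cwd()
print("Working directory:", cwd)  # roughly what `pwd` prints
print("In the workshop repository:", is_in_repo(cwd))

# Roughly what `ls` shows: the non-hidden entries, sorted by name.
for name in sorted(p.name for p in cwd.iterdir() if not p.name.startswith(".")):
    print(name)
```

If the second line printed is False, open a terminal and change to ~/compute-skills-2024 before continuing.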
Editing files
You can combine text and code in a notebook, which you can start with: New Launcher -> Notebook -> Python 3 (ipykernel), or File -> New -> Notebook.
To run the code in a cell, a shortcut is Ctrl + Enter/Return.
To create a script file containing just code, do File -> New -> Python file.
Stopping your server
You can stop your server via File -> Hub Control Panel -> Stop My Server.
If you only log out, any ongoing computations will still run in the background.
Accessing GitHub repositories from DataHub
(These instructions are needed only for the main sessions of the workshop, Thursday and Friday.)
Please don’t do the steps in this section until after we do the exercise where you create your newton-practice repository.
Authenticating with GitHub can be a bit tricky, particularly when using DataHub (i.e., JupyterHub).
If you’re on your laptop and already set up to use GitHub, you don’t need to do the steps in this section. If you’re using your laptop and are not set up to use GitHub, these instructions should work there too, although that may depend on the version of git you are using (e.g., version 2.34.1 is too old, but 2.39 should be fine). You’ll need to install gh-scoped-creds via pip install.
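To see whether the git on your laptop is recent enough, you can compare its reported version against 2.39. The sketch below is illustrative only: the function names are hypothetical, and using 2.39 as the cutoff is an assumption, since the text only says that 2.34.1 is too old and 2.39 should be fine (versions in between are untested here).

```python
import re
import subprocess

# Assumed cutoff: the text says 2.34.1 is too old and 2.39 should be fine.
MIN_VERSION = (2, 39)


def parse_git_version(text: str) -> tuple[int, ...]:
    """Extract the numeric version from output such as 'git version 2.39.2'."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(text: str, minimum: tuple[int, ...] = MIN_VERSION) -> bool:
    """Return True if the version found in text is at least `minimum`."""
    return parse_git_version(text) >= minimum


def check_installed_git() -> bool:
    """Run `git --version` and report whether it meets the assumed cutoff."""
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return is_new_enough(result.stdout)
```

Calling check_installed_git() in a Python session on your laptop returns True or False; it requires git to be installed and on your PATH.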
That said, if you haven’t already used Git on your laptop and pushed changes up to GitHub, it’s best if you work in DataHub so that we don’t have to troubleshoot laptop issues for multiple participants.
We’ll use a tool (an ‘app’) that helps us with this, provided in the gh_scoped_creds Python package.
You must do all three steps in order to be able to interact with your GitHub repository from DataHub.
1. Configure git
First configure git in the terminal:
git config --global user.name "Your Name"
git config --global user.email "your_name@berkeley.edu"
git config --global color.ui "auto"
git config --global pull.rebase false
That will modify your ~/.gitconfig file.
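To confirm that all four settings were recorded, you can inspect the output of git config --global --list, which prints one key=value pair per line. The following is a rough sketch with hypothetical helper names, not part of the workshop materials; it only checks that the keys exist, not that the values are correct.

```python
import subprocess

# The four settings configured in step 1 above.
REQUIRED_KEYS = ("user.name", "user.email", "color.ui", "pull.rebase")


def parse_config_listing(listing: str) -> dict[str, str]:
    """Turn `git config --list` output (one key=value per line) into a dict."""
    settings = {}
    for line in listing.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank or malformed lines without '='
            settings[key.strip()] = value
    return settings


def missing_keys(listing: str) -> list[str]:
    """Return the required keys that do not appear in the listing."""
    settings = parse_config_listing(listing)
    return [key for key in REQUIRED_KEYS if key not in settings]


def check_global_config() -> list[str]:
    """Run git and report which of the four settings are still missing."""
    result = subprocess.run(
        ["git", "config", "--global", "--list"], capture_output=True, text=True
    )
    return missing_keys(result.stdout)
```

An empty list from check_global_config() means all four settings are present; any names it returns correspond to git config commands that still need to be run.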
2. Give DataHub access to your GitHub account
Then run this in the terminal:
gh-scoped-creds
Or you could instead run this in IPython or a Jupyter notebook:
import gh_scoped_creds
%ghscopedcreds
When you do that you’ll see a link and a code. Go to the link in a browser (on your laptop, outside of DataHub), log in to GitHub (or click Continue if already logged in), input the code, and click Authorize Berkeley-DataHub-Git-Access. This will grant access to GitHub from DataHub (i.e., JupyterHub) for 8 hours (or until you stop your JupyterHub server).
3. Give access to the specific GitHub repository you are working with
Then (only the first time you are doing all this) go to the URL that is printed out in your terminal (which should be https://github.com/apps/berkeley-datahub-git-access) and click on Install. If you have access to multiple GitHub organizations, choose your personal org, namely your GitHub username. Choose Only select repositories and select the repository you are using for the workshop, namely newton-practice. Then click Install.
If you run git push and get a permission denied or 403 error, or Git asks for a password, something has gone wrong with one of the three steps above.