The basics of working on the UNIX command line

Published

October 31, 2024

1 Introduction

1.1 This Tutorial

This tutorial covers the basics of navigating in a UNIX-like (e.g., Linux or MacOS) environment. In particular, it covers using the UNIX command line interface, a powerful way to carry out operations on a computer and to automate tasks. Being familiar with operating on the command line will allow you (with some practice and training) to do things more quickly and in a way that can be reproduced later. That’s hard or impossible to do if you are doing point-and-click or drag-and-drop operations in a File Manager or Finder window.

Materials for this tutorial, including the Quarto Markdown file that was used to create this document, are available on GitHub.

Software Carpentry has a very nice introductory lesson on the basics of the shell. It also has an accompanying YouTube video. Episodes 1-3 (the first 20 minutes) cover the material that is in this tutorial.

License

This tutorial by Christopher Paciorek is licensed under a Creative Commons Attribution 3.0 Unported License.

1.2 The shell

Operating on the UNIX command line is also known as “using the terminal” and “using the shell”.

The shell is the UNIX program that you interact with in a terminal window when working with a UNIX-style operating system (e.g., Linux or MacOS). The shell sits between you and the operating system and provides useful commands and functionality. Basically, the shell is a program that serves to run other commands for you and show you the results. There are actually different shells that you can use, of which bash is very common and is the default on many systems. In recent versions of MacOS, zsh is the default shell. zsh is a separate shell rather than an extension of bash, but it behaves very similarly for basic usage, so you should be able to use zsh based on this tutorial.

I’ve generated this document based on using the bash shell on a computer running the Ubuntu Linux version 22.04 operating system, but you should be able to replicate most of the steps in this tutorial in other UNIX command line environments, ideally using the bash or zsh shells.

1.3 Accessing a UNIX command line interface

Here are some options for accessing a UNIX command line interface:

  • MacOS: If you’d like to work on your own Mac, you’ll find the Terminal under Applications -> Utilities -> Terminal.
  • Windows:
    • If you have a sufficiently new version of Windows 10, you can use the Windows Subsystem for Linux, which will provide you with an Ubuntu shell running bash on your own machine.
    • If you have access to remote machines running Linux, you can log in to them using programs such as MobaXterm and PuTTY. Once logged in, you’ll find yourself in a Terminal window on the remote machine.
  • JupyterHub: If you have access to a JupyterHub, you will likely be able to start a Terminal session under “New”.
  • Cloud-based options: You could also try a cloud service such as Google Cloud Shell.
Don’t use Git Bash for this tutorial

You probably shouldn’t use Git Bash to follow this tutorial as its functionality is limited.

Once you’re in a Terminal window, you’ll be interacting with the shell and you can enter commands to get information and work with the system. Commands often have optional arguments (flags) that are specified with a minus in front of them, as we’ll see.

1.4 Getting started

Once we are in a terminal, we’ll see the “prompt”, which indicates that the shell is waiting for us to enter commands. Sometimes the prompt is just $:

$

but often it contains information about the username of the current user and the directory on the filesystem that we are in. For example, here is a prompt that shows that the current user is ‘scflocal’, on the machine named ‘gandalf’, in the ‘tutorial-unix-basics’ (sub)directory of the user’s home directory (indicated by ~):

scflocal@gandalf:~/tutorial-unix-basics>

Tutorial code formatting

In the remainder of this tutorial, you won’t see the prompt in front of the commands. All commands will appear in a grey background, with the output (if any) following the code.

When the shell is waiting for more information

Note that if you simply see > instead of the usual prompt, that means the shell thinks you haven’t finished entering your command (usually that you haven’t finished entering a string) and is expecting more input from you. If you see a newline but nothing else, the shell probably expects you to enter some text for it to process.

If you’re not sure what to do, type Ctrl-c (the control key and ‘c’ at the same time) to get back to the usual prompt.
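As a small illustration of the continuation prompt (a sketch, not part of the session recorded in this tutorial; message is just an illustrative variable name): if you type the first line of the assignment below and press Enter, the quotation mark is still open, so an interactive shell shows > and waits. Typing the second line closes the string and the command completes.

```shell
# The string is not finished at the end of the first line, so an
# interactive shell displays > until the closing quotation mark is typed.
message="this string spans
two lines"

# Print the stored text; it appears on two lines.
echo "$message"
```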

Let’s start by running a command, whoami, that prints out the username of the current user:

whoami
scflocal

2 Using git for version control

We’ll discuss git briefly, both because it is an important and useful tool, and because it’s the easiest way for us to get a set of files to work with in this tutorial.

Git is an important tool to become familiar with, at least at the basic level. Git allows you to share files between different computers and different people and for people to collaborate on projects together. In particular, it is a version control tool that allows you to have different versions of your files and to go back to earlier versions of your files. Git stores the files for a project in a repository.

For our purposes here, we’ll simply use Git to download materials from GitHub, a website that stores Git repositories in the cloud.

First we’ll download the materials for this tutorial.

To clone (i.e., copy) a repository (in this case from GitHub) we do the following. Here berkeley-scf is the organization and tutorial-unix-basics is the repository. Note that in the code shown in this tutorial, everything that follows the # symbol is a comment and is not executed.

Here we’ll first use the cd command (for “change directory”) to go to our home directory and then use git clone to download materials to a subdirectory (which will be called tutorial-unix-basics) within our home directory.

cd
git clone https://github.com/berkeley-scf/tutorial-unix-basics
Cloning into 'tutorial-unix-basics'...
remote: Enumerating objects: 387, done.
remote: Counting objects: 100% (66/66), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 387 (delta 37), reused 46 (delta 19), pack-reused 321 (from 1)
Receiving objects: 100% (387/387), 779.53 KiB | 5.27 MiB/s, done.
Resolving deltas: 100% (199/199), done.

Now suppose that whoever controls the repository makes some changes to the materials in the repository online and you want an updated copy of the repository on your computer. Simply use cd to go into any directory in the repository materials on your computer and run git pull.

cd tutorial-unix-basics
git pull
Already up to date.

In this case, since no changes had been made, git simply reports that things are up-to-date.

We’ll discuss how to use cd in more detail in the next section.

3 Files and directories

3.1 Moving around and listing information

We’ll start by thinking about the filesystem, which organizes our information/data into files on the computer’s disk.

Anytime you are at the UNIX command line, you have a working directory, which is your current location in the file system.

Here’s how you can see where you are using the pwd (“print working directory”) command:

pwd
/home/scflocal/tutorial-unix-basics

and here’s how you use ls to list the files (and subdirectories) in the working directory…

ls
assets
_config.yml
example.text
example.txt
filename with spaces.txt
_freeze
_includes
index.qmd
index.rmarkdown
_layouts
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass
_site

Now suppose I want to be in a different directory so I can see what is there or do things to the files in that directory.

The command you need is cd and an important concept you need to become familiar with is the notion of ‘relative’ versus ‘absolute’ path. A path is the set of nested directories that specify a location of interest on the filesystem.

First let’s go to our home directory, which is generally where our files will be. Simply running cd will do that.

cd
pwd
/home/scflocal

Now let’s go into a subdirectory. We can use cd with the name of the subdirectory. The subdirectory is found ‘relative’ to our working directory, i.e., found from where we currently are.

cd tutorial-unix-basics
pwd

/home/scflocal/tutorial-unix-basics

We could also navigate through nested subdirectories. For example, after going back to our home directory, let’s go to the assets subdirectory of the tutorial-unix-basics subdirectory. The / is a separator character that distinguishes the nested subdirectories.

cd
cd tutorial-unix-basics/assets
pwd
/home/scflocal/tutorial-unix-basics/assets

You can access the parent directory of any directory using ..:

pwd
cd ..
pwd

/home/scflocal/tutorial-unix-basics/assets
/home/scflocal/tutorial-unix-basics

We can get more complicated in our use of .. with relative paths. Here we’ll go up a directory and then down to a different subdirectory.

cd assets
cd ../_includes 
pwd
/home/scflocal/tutorial-unix-basics/_includes

And here we’ll go up two directories and then down to another subdirectory.

cd ../../Desktop  # go up two directories and down
pwd

/home/scflocal/Desktop

All of the above examples used relative paths to navigate based on your working directory at the moment you ran the command.
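If you would like to practice these moves without touching your own files, here is a minimal sketch that builds a small directory tree in a throwaway location and repeats the same kinds of relative moves. The directory names mirror the ones above but are created fresh here, and mktemp -d and mkdir -p are used only as scaffolding.

```shell
# Create a throwaway directory and a small tree inside it.
top=$(mktemp -d)
mkdir -p "$top/project/assets" "$top/project/_includes" "$top/Desktop"

cd "$top/project/assets"
cd ..                 # up one level, to project
pwd
cd assets             # down into a subdirectory
cd ../_includes       # up one level, then down into a sibling directory
pwd
cd ../../Desktop      # up two levels, then down into Desktop
pwd
```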

We can instead use absolute paths so that it doesn’t matter where we are when we run the command. Specifying an absolute path is done by having your path start with /, such as /home/scflocal. If the path doesn’t start with / then it is interpreted as being a relative path, relative to your working directory. Here we’ll go to the assets subdirectory again, but this time using an absolute path.

cd /home/scflocal/tutorial-unix-basics/assets
pwd
/home/scflocal/tutorial-unix-basics/assets
Absolute paths are not robust

Note that using absolute paths in scripts is generally a bad idea because the script wouldn’t generally work correctly if run on a different machine (which will generally have a different filesystem structure) or as a different user (who will have a different home directory).
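One common way to reduce this fragility, sketched here under the assumption that the repository was cloned into your home directory, is to build the path from the HOME variable (which the shell sets to the current user’s home directory) rather than typing /home/scflocal. The variable name target is only illustrative.

```shell
# Fragile: this path exists only for a user whose home directory
# is /home/scflocal.
#   cd /home/scflocal/tutorial-unix-basics

# More portable: start from the current user's home directory.
target="$HOME/tutorial-unix-basics"
if [ -d "$target" ]; then
    cd "$target"
    pwd
else
    echo "No directory at $target; clone the repository first."
fi
```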

3.2 The filesystem

The filesystem is basically an upside-down tree.

For example, if we just consider the tutorial-unix-basics directory, we can see the tree structure using tree:

tree
.
├── assets
│   ├── css
│   │   └── style.scss
│   ├── fonts
│   │   ├── Noto-Sans-700
│   │   │   ├── Noto-Sans-700.eot
│   │   │   ├── Noto-Sans-700.svg
│   │   │   ├── Noto-Sans-700.ttf
│   │   │   ├── Noto-Sans-700.woff
│   │   │   └── Noto-Sans-700.woff2
│   │   ├── Noto-Sans-700italic
│   │   │   ├── Noto-Sans-700italic.eot
│   │   │   ├── Noto-Sans-700italic.svg
│   │   │   ├── Noto-Sans-700italic.ttf
│   │   │   ├── Noto-Sans-700italic.woff
│   │   │   └── Noto-Sans-700italic.woff2
│   │   ├── Noto-Sans-italic
│   │   │   ├── Noto-Sans-italic.eot
│   │   │   ├── Noto-Sans-italic.svg
│   │   │   ├── Noto-Sans-italic.ttf
│   │   │   ├── Noto-Sans-italic.woff
│   │   │   └── Noto-Sans-italic.woff2
│   │   └── Noto-Sans-regular
│   │       ├── Noto-Sans-regular.eot
│   │       ├── Noto-Sans-regular.svg
│   │       ├── Noto-Sans-regular.ttf
│   │       ├── Noto-Sans-regular.woff
│   │       └── Noto-Sans-regular.woff2
│   ├── img
│   │   ├── logo.svg
│   │   └── ls_format.png
│   ├── js
│   │   └── scale.fix.js
│   ├── stat_bear.png
│   └── styles.css
├── _config.yml
├── example.text
├── example.txt
├── filename with spaces.txt
├── _freeze
│   ├── index
│   │   └── execute-results
│   │       └── html.json
│   └── site_libs
│       └── clipboard
│           └── clipboard.min.js
├── _includes
│   └── toc.html
├── index.qmd
├── index.rmarkdown
├── _layouts
│   └── default.html
├── mv_assets.sh
├── myfile
├── name of my file with spaces.txt
├── _quarto.yml
├── README.md
├── _sass
│   ├── fonts.scss
│   ├── jekyll-theme-minimal.scss
│   ├── jekyll-theme-minimal.scss.bak
│   ├── minimal.scss
│   └── rouge-github.scss
└── _site

18 directories, 46 files

The dot (.) means “this directory”, so the top of the tree here is the tutorial-unix-basics directory itself, within which there are subdirectories, assets, _includes, _layouts, etc. Then within each of these are files and further subdirectories (as seen in the case of assets, which has subdirectories named css and fonts, among others).

If we consider the entire filesystem, the top, or root of the tree, is the / directory. Within / there are subdirectories, such as /home (which contains users’ home directories where all of the files owned by a user are stored) and /bin (containing UNIX programs, aka ‘binaries’). We’ll use ls again, this time telling it the directory to operate on:

ls /
accounts
app
bin
boot
dev
etc
home
lib
lib32
lib64
libx32
lost+found
media
mirror
mnt
opt
pool0
proc
root
run
sbin
scratch
server
srv
swap.img
sys
system
tmp
usr
var

If there is a user named scflocal, everything specific to that user would be stored in the user’s home directory. Here that is /home/scflocal, but the exact location may differ on different systems. The shortcut ~scflocal refers to the scflocal home directory, /home/scflocal. If you are the scflocal user, you can also refer to your home directory by the shortcut ~.

ls /home
scflocal
shiny
cd /home/scflocal
pwd
/home/scflocal

Go to the home directory of the current user (which happens to be the scflocal user):

cd ~
pwd
/home/scflocal

Go to the home directory of the scflocal user explicitly:

cd ~scflocal
pwd
/home/scflocal
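To see what the shell actually does with ~, you can print it with echo. This is a small sketch; home_from_tilde is a hypothetical variable name used only to hold the result.

```shell
echo ~                        # the shell replaces ~ with your home directory
echo ~/tutorial-unix-basics   # a path inside your home directory
echo "~"                      # inside quotes, ~ is left as a literal character

# The same replacement happens when ~ is stored in a variable.
home_from_tilde=~
echo "$home_from_tilde"
```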

Another useful directory is /tmp, which is a good place to put temporary files that you only need briefly and don’t need to save. On many systems, files in /tmp are deleted when the machine is rebooted.

cd /tmp
ls
assets
assets.tgz
quarto-session71f9ea197eacbf29
RtmpMQcgrV
Temp-76c6318d-4f54-44d2-8bd9-d9a42eeeb7ce
test

We can return to the most recent directory we were in like this:

cd -
pwd

/home/scflocal

4 Using commands

4.1 Overview

Let’s look more at various ways to use commands. We just saw the ls command. Here’s one way we can modify the behavior of the command by passing a command option. Here the -F option (also called a ‘flag’) shows directories by appending / to anything that is a directory (rather than a file) and a * to anything that is an executable (i.e., a program).

ls -F
assets/
_config.yml
example.text
example.txt
filename with spaces.txt
_freeze/
_includes/
index.qmd
index.rmarkdown
_layouts/
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass/
_site/

Next we’ll use multiple options to the ls command. -l shows extended information about files/directories. -t shows files/directories in order of the time at which they were last modified and -r shows in reverse order. Before I run ls, I’ll create an empty file using the touch command. Given this, what file do you expect to be displayed last when you do the following?

touch myfile
ls -lrt
total 112
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 _sass
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 _layouts
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 _includes
-rw-r--r-- 1 scflocal scflocal   291 Oct 29 15:17 _config.yml
-rw-r--r-- 1 scflocal scflocal   567 Oct 30 16:35 README.md
-rw-r--r-- 1 scflocal scflocal     6 Oct 30 16:38 name of my file with spaces.txt
-rw-r--r-- 1 scflocal scflocal    52 Oct 30 16:45 example.text
-rw-r--r-- 1 scflocal scflocal    91 Oct 30 17:00 mv_assets.sh
-rw-r--r-- 1 scflocal scflocal    51 Oct 30 17:03 example.txt
drwxr-xr-x 4 scflocal scflocal  4096 Oct 30 17:18 _freeze
drwxr-xr-x 6 scflocal scflocal  4096 Oct 31 13:54 assets
-rw-r--r-- 1 scflocal scflocal   531 Oct 31 14:34 _quarto.yml
-rw-r--r-- 1 scflocal scflocal    10 Oct 31 14:38 filename with spaces.txt
-rw-r--r-- 1 scflocal scflocal 26755 Oct 31 14:42 index.qmd
drwxr-xr-x 2 scflocal scflocal  4096 Oct 31 14:42 _site
-rw-r--r-- 1 scflocal scflocal 27079 Oct 31 14:42 index.rmarkdown
-rw-r--r-- 1 scflocal scflocal     0 Oct 31 14:42 myfile
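To experiment with the ordering options on files whose modification times you control, here is a minimal sketch in a throwaway directory. The file names are hypothetical, and touch -t (which sets a specific modification time in the form YYYYMMDDhhmm) is used only to make the ordering predictable.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Give two files fixed modification times in the past.
touch -t 202401010900 oldest.txt
touch -t 202406010900 middle.txt
touch newest.txt       # modified just now

ls -t                  # newest first
ls -rt                 # reversed: oldest first, newest last
ls -l -r -t            # same result as ls -lrt
```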

While each command has its own syntax, there are some rules usually followed. Generally, executing a command consists of four things:

  • the command
  • command option(s)
  • argument(s)
  • line acceptance

Here’s an example:

wc -l example.txt
4 example.txt

In the above example, wc is the command, -l is a command option specifying to count the number of lines, example.txt is the argument, and the line acceptance is indicated by hitting the Enter key at the end of the line.

So that invocation counts the number of lines in the file named example.txt.

The spaces are required and distinguish the different parts of the invocation. For this reason, it’s generally a bad idea to have spaces within file names on a UNIX system. But if you do, you can use quotation marks to distinguish the file name, e.g.,

echo "some text" > "filename with spaces.txt"
ls -l "filename with spaces.txt"
-rw-r--r-- 1 scflocal scflocal 10 Oct 31 14:42 filename with spaces.txt
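Here is a short sketch of the ways to refer to a file whose name contains a space, run in a throwaway directory. The file name my notes.txt is made up for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "some text" > "my notes.txt"

# Without quotes the shell would pass two separate arguments,
# my and notes.txt, and wc would report that neither file exists:
#   wc -l my notes.txt

wc -l "my notes.txt"      # double quotes
wc -l 'my notes.txt'      # single quotes
wc -l my\ notes.txt       # a backslash before the space
```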

Also, capitalization matters. For example -l and -L are different options.

Note that options, arguments, or both might not be included in some cases. Recall that we’ve used ls without either options or arguments.

Arguments are usually one or more files or directories.

4.2 Options

Often we can specify an option either in short form (as with -l here) or long form (--lines here), as seen in the following equivalent invocations:

wc -l example.txt
wc --lines example.txt
4 example.txt
4 example.txt

We can also ask for the number of characters with the -m option, which can be combined with the -l option equivalently in two ways:

wc -lm example.txt
wc -l -m example.txt
 4 51 example.txt
 4 51 example.txt

Options will often take values, e.g., if we want to get the first two lines of the file, the following invocations are equivalent:

head -n 2 example.txt
head --lines=2 example.txt
head --lines 2 example.txt
Hello there.
This is a file
Hello there.
This is a file
Hello there.
This is a file

4.3 Comments

Anything that follows # is a comment and is ignored.

# This is ignored
ls  # Everything after the # is ignored
assets
_config.yml
example.text
example.txt
filename with spaces.txt
_freeze
_includes
index.qmd
index.rmarkdown
_layouts
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass
_site

4.4 Getting help with UNIX commands

Essentially all UNIX commands have help information (called a man page), accessed using man. We won’t show the results here as they are rather long.

man ls

You should try it yourself to practice viewing man pages. Once you are in the man page, you can navigate by hitting the space bar (to scroll down) and the up and down arrows. You can search by typing /, typing the string you want to search for and hitting <Enter>. In less, the pager that man typically uses, you can use n and N for the next and previous search hits and q to quit out of the man page.

Unfortunately man pages are often quite long, hard to understand, and without examples. But the information you need is usually there if you take the time to look for it.

Also, UNIX commands as well as other programs run from the command line often provide help information via the --help option:

ls --help

Again, we’re not showing the output as it is rather long.

4.5 Seeing if a command or program is available

You can see if a command or program is installed (and where it is installed) using type.

type grep
type R
type python
grep is /usr/bin/grep
R is /usr/bin/R
python is /usr/local/linux/miniforge-3.12/bin/python
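Because type succeeds when a command is found and fails otherwise, it can also be used to make a decision. The following is a minimal sketch: the if syntax and the redirection to /dev/null (which hides the message that type prints) go beyond what this tutorial covers, and no_such_command_xyz is a made-up name that is assumed not to exist on your system.

```shell
# Check for a command that should be present on any UNIX-like system.
if type ls > /dev/null 2>&1; then
    ls_status="available"
else
    ls_status="missing"
fi
echo "ls is $ls_status"

# Check for a made-up command name.
if type no_such_command_xyz > /dev/null 2>&1; then
    other_status="available"
else
    other_status="missing"
fi
echo "no_such_command_xyz is $other_status"
```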

5 Working with files

5.1 Copying and removing files

You’ll often want to make a copy of a file, move it between directories, or remove it.

cp example.txt example-new.txt
mv example-new.txt /tmp/.
cd /tmp
ls -lrt
total 468
drwx------  2 scflocal scflocal   4096 Oct 29 15:21 Temp-76c6318d-4f54-44d2-8bd9-d9a42eeeb7ce
drwxr-xr-x  6 scflocal scflocal   4096 Oct 31 13:54 assets
drwxr-xr-x  2 scflocal scflocal   4096 Oct 31 14:38 test
-rw-r--r--  1 scflocal scflocal 453183 Oct 31 14:38 assets.tgz
drwx------ 37 scflocal scflocal   4096 Oct 31 14:42 quarto-session71f9ea197eacbf29
drwx------  2 scflocal scflocal   4096 Oct 31 14:42 RtmpMQcgrV
-rw-r--r--  1 scflocal scflocal     51 Oct 31 14:42 example-new.txt

When we moved the file, the use of /. in /tmp/. indicates we want to use the same name as the original file.

cd /tmp
rm example-new.txt
ls -lrt
total 464
drwx------  2 scflocal scflocal   4096 Oct 29 15:21 Temp-76c6318d-4f54-44d2-8bd9-d9a42eeeb7ce
drwxr-xr-x  6 scflocal scflocal   4096 Oct 31 13:54 assets
drwxr-xr-x  2 scflocal scflocal   4096 Oct 31 14:38 test
-rw-r--r--  1 scflocal scflocal 453183 Oct 31 14:38 assets.tgz
drwx------ 37 scflocal scflocal   4096 Oct 31 14:42 quarto-session71f9ea197eacbf29
drwx------  2 scflocal scflocal   4096 Oct 31 14:42 RtmpMQcgrV
rm is forever

I used rm above to remove the file. Be very careful about removing files. There is no Trash folder on the UNIX command line, so once a file is removed, it’s gone for good.

The mv command is also used if you want to rename a file.

cd ~/tutorial-unix-basics
mv example.txt silly_example.txt
ls
assets
_config.yml
example.text
filename with spaces.txt
_freeze
_includes
index.qmd
index.rmarkdown
_layouts
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass
silly_example.txt
_site
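If you want to try copying, moving, renaming, and removing without risking real files, here is a sketch that works entirely inside a throwaway directory. All file and directory names are hypothetical.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "some text" > original.txt

cp original.txt copy.txt       # copy: both files now exist
mkdir archive
mv copy.txt archive/.          # move into archive, keeping the name copy.txt
mv original.txt renamed.txt    # rename within the same directory
ls . archive

rm archive/copy.txt            # permanent: there is no undo
# rm -i renamed.txt            # -i would ask for confirmation first
ls . archive
```

Listing the directory before and after each step is a simple habit that helps catch mistakes before running rm.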

We can copy and remove entire directories. The -p flag preserves the time stamp and other information associated with the files/directories, while the -r option copies recursively, such that the directory and all its contents (all child files and directories) are also copied.

cp -pr assets /tmp/.  # Copy the assets directory into /tmp.
cd /tmp
mkdir -p test      # Create the test directory; -p avoids an error if it already exists.
mv assets test     # Move the assets directory into the test directory.
ls -l test/assets
total 112
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 css
drwxr-xr-x 6 scflocal scflocal  4096 Oct 29 15:17 fonts
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 img
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 js
-rw-r--r-- 1 scflocal scflocal 92106 Oct 30 17:10 stat_bear.png
-rw-r--r-- 1 scflocal scflocal    69 Oct 31 13:54 styles.css
rm -rf /tmp/test/assets   # Remove the assets directory and anything contained within it.
ls /tmp/test              # This should be empty now.
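The same pattern can be practiced on a small made-up directory tree. This sketch uses hypothetical names (project, data, values.csv) and uses rm -r rather than rm -rf; the additional -f option suppresses prompts and error messages, so leaving it off is a slightly more cautious choice.

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p project/data
echo "1,2,3" > project/data/values.csv

cp -pr project project-backup    # copy the directory and everything in it
ls -R project-backup             # -R lists the contents recursively

ls project                       # check what is there before removing it
rm -r project                    # remove the original directory tree
ls
```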

You can use a variant of cp named scp to copy files between different UNIX-like machines. Suppose I have access to the machine radagast.berkeley.edu and that my user name on that machine is scf1. I can copy a file to that machine or from that machine as follows.

(Note that I am not running the code in the process of generating this document.)

cd ~/tutorial-unix-basics

# FROM the machine you're on TO another machine
# Copy the file to the Desktop subdirectory of the scf1 home directory on the remote machine
scp example.txt scf1@radagast.berkeley.edu:~/Desktop/.

# FROM another machine TO the machine you're on
# Copy a file from the /tmp directory of the remote machine to a specific directory on this machine
scp scf1@radagast.berkeley.edu:/tmp/data.txt ~/Downloads/.

5.2 File names and extensions

The format a file is in is determined by the actual content of the file. You can determine the file format using file:

file index.qmd
file /usr/local/linux/miniforge-3.12/lib/python3.12/site-packages/numpy/dtypes.py
index.qmd: exported SGML document, ASCII text, with very long lines (615)
/usr/local/linux/miniforge-3.12/lib/python3.12/site-packages/numpy/dtypes.py: Python script, ASCII text executable

In many cases, files have extensions such as .csv (for comma-separated text files), .pdf (for PDFs), and .jpg (for JPEG files). The extension is a convention that helps us and programs distinguish different kinds of files and therefore know how to manipulate/interpret the files.

Filename extensions don’t determine the file type

The extension is just a convention – changing the file name doesn’t change the file format!

So if I make a copy of the silly_example.txt file (the renamed example.txt) but name it silly_example.pdf, we see that it’s still just a simple text file even though I gave it a name that would suggest it’s a PDF.

cp silly_example.txt silly_example.pdf
cat silly_example.pdf
Hello there.
This is a file
that contains
4 lines.
file silly_example.pdf
silly_example.pdf: ASCII text

However, changing the extension may prevent a program from using the file simply because the program was written to assume that files in a certain format have a certain extension.

6 Other useful tools and information

6.1 Compressing and uncompressing files

The zip utility compresses in a format compatible with zip files for Windows:

zip -r assets.zip assets
  adding: assets/ (stored 0%)
  adding: assets/css/ (stored 0%)
  adding: assets/css/style.scss (stored 0%)
  adding: assets/js/ (stored 0%)
  adding: assets/js/scale.fix.js (deflated 62%)
  adding: assets/fonts/ (stored 0%)
  adding: assets/fonts/Noto-Sans-regular/ (stored 0%)
  adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf (deflated 34%)
  adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg (deflated 66%)
  adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2 (stored 0%)
  adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot (deflated 0%)
  adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff (deflated 1%)
  adding: assets/fonts/Noto-Sans-700italic/ (stored 0%)
  adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf (deflated 32%)
  adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2 (stored 0%)
  adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot (deflated 0%)
  adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff (deflated 1%)
  adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg (deflated 68%)
  adding: assets/fonts/Noto-Sans-700/ (stored 0%)
  adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf (deflated 35%)
  adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.eot (deflated 0%)
  adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.svg (deflated 66%)
  adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2 (stored 0%)
  adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.woff (deflated 1%)
  adding: assets/fonts/Noto-Sans-italic/ (stored 0%)
  adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg (deflated 67%)
  adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2 (stored 0%)
  adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot (deflated 0%)
  adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf (deflated 31%)
  adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff (deflated 1%)
  adding: assets/stat_bear.png (deflated 18%)
  adding: assets/img/ (stored 0%)
  adding: assets/img/logo.svg (deflated 70%)
  adding: assets/img/ls_format.png (deflated 4%)
  adding: assets/styles.css (deflated 4%)
ls -l assets.zip
-rw-r--r-- 1 scflocal scflocal 444635 Oct 31 14:42 assets.zip

gzip is a standard UNIX compression utility to compress individual files:

cp assets/img/ls_format.png test.png
ls -l test.png
-rw-r--r-- 1 scflocal scflocal 52402 Oct 31 14:42 test.png

Here we see that gzip can’t compress the png file much, but it can help a lot with other formats.

gzip test.png
ls -l test.png.gz   # Not much smaller than the uncompressed file.
-rw-r--r-- 1 scflocal scflocal 50200 Oct 31 14:42 test.png.gz
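For contrast, here is a sketch of gzip applied to a highly repetitive text file, followed by gunzip to restore it. The file repeated.txt is made up for illustration, and the while loop is only scaffolding to generate 1000 identical lines.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Build a text file of 1000 identical lines.
i=0
while [ "$i" -lt 1000 ]; do
    echo "Hello there. This is a file."
    i=$((i + 1))
done > repeated.txt

ls -l repeated.txt
gzip repeated.txt          # replaces repeated.txt with repeated.txt.gz
ls -l repeated.txt.gz      # much smaller than the original
gunzip repeated.txt.gz     # restores repeated.txt and removes the .gz file
ls -l repeated.txt
```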

Finally, the tar utility will combine multiple files and directories into a single archive.

tar -cvf assets.tar assets
assets/
assets/css/
assets/css/style.scss
assets/js/
assets/js/scale.fix.js
assets/fonts/
assets/fonts/Noto-Sans-regular/
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff
assets/fonts/Noto-Sans-700italic/
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg
assets/fonts/Noto-Sans-700/
assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf
assets/fonts/Noto-Sans-700/Noto-Sans-700.eot
assets/fonts/Noto-Sans-700/Noto-Sans-700.svg
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff
assets/fonts/Noto-Sans-italic/
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff
assets/stat_bear.png
assets/img/
assets/img/logo.svg
assets/img/ls_format.png
assets/styles.css
ls -l assets.tar
-rw-r--r-- 1 scflocal scflocal 686080 Oct 31 14:42 assets.tar

Adding the -z flag also gzips the result. In this case the compression is more noticeable: the resulting assets.tgz is about 453 KB, compared with about 686 KB for assets.tar.

tar -cvzf assets.tgz assets
assets/
assets/css/
assets/css/style.scss
assets/js/
assets/js/scale.fix.js
assets/fonts/
assets/fonts/Noto-Sans-regular/
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff
assets/fonts/Noto-Sans-700italic/
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg
assets/fonts/Noto-Sans-700/
assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf
assets/fonts/Noto-Sans-700/Noto-Sans-700.eot
assets/fonts/Noto-Sans-700/Noto-Sans-700.svg
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff
assets/fonts/Noto-Sans-italic/
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff
assets/stat_bear.png
assets/img/
assets/img/logo.svg
assets/img/ls_format.png
assets/styles.css
ls -l assets.tgz
-rw-r--r-- 1 scflocal scflocal 453183 Oct 31 14:42 assets.tgz

Now let’s move that tarball (as it is called) to a new directory and unzip and expand it using the -x flag.

mv assets.tgz /tmp
cd /tmp
tar -xvzf assets.tgz
assets/
assets/css/
assets/css/style.scss
assets/js/
assets/js/scale.fix.js
assets/fonts/
assets/fonts/Noto-Sans-regular/
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff
assets/fonts/Noto-Sans-700italic/
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg
assets/fonts/Noto-Sans-700/
assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf
assets/fonts/Noto-Sans-700/Noto-Sans-700.eot
assets/fonts/Noto-Sans-700/Noto-Sans-700.svg
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff
assets/fonts/Noto-Sans-italic/
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff
assets/stat_bear.png
assets/img/
assets/img/logo.svg
assets/img/ls_format.png
assets/styles.css

You can see the whole directory structure of what was archived has been recovered in the new location:

ls -l /tmp/assets
total 112
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 css
drwxr-xr-x 6 scflocal scflocal  4096 Oct 29 15:17 fonts
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 img
drwxr-xr-x 2 scflocal scflocal  4096 Oct 29 15:17 js
-rw-r--r-- 1 scflocal scflocal 92106 Oct 30 17:10 stat_bear.png
-rw-r--r-- 1 scflocal scflocal    69 Oct 31 13:54 styles.css
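
Before extracting an archive, it can help to list its contents with -t in place of -x, and to choose where it expands with -C. The following is a minimal sketch rather than part of the session above: the demo directory and file names are made up for illustration, and everything happens in a temporary directory so none of your own files are touched.

```shell
# Work in a throwaway directory (hypothetical names, for illustration only).
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p demo/img
echo "body {}" > demo/styles.css
echo "placeholder" > demo/img/logo.svg

# Create a compressed archive of the demo directory.
tar -czf demo.tgz demo

# List the contents without extracting anything (-t instead of -x).
listing=$(tar -tzf demo.tgz)
echo "$listing"

# Extract into a different directory with -C (the directory must already exist).
mkdir extracted
tar -xzf demo.tgz -C extracted
ls extracted/demo
```

Listing first is a cheap way to check whether an archive will unpack into a single top-level directory or scatter files into the current one.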

6.2 Disk usage

You can see how much disk space is being used versus available as follows. The ‘Mounted on’ column will generally identify the parts of the filesystem in a more user-friendly way than the ‘Filesystem’ column.

df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/sda2                         59G   32G   25G  57% /
tmpfs                             63G   68M   63G   1% /dev/shm
tmpfs                             13G   23M   13G   1% /run
tmpfs                            5.0M     0  5.0M   0% /run/lock
tmpfs                             13G   28K   13G   1% /run/user/3173
tmpfs                             13G   28K   13G   1% /run/user/3520
tmpfs                             13G   28K   13G   1% /run/user/3417
tmpfs                             13G   28K   13G   1% /run/user/764
tmpfs                             13G   28K   13G   1% /run/user/3023
tmpfs                             13G   28K   13G   1% /run/user/3565
tmpfs                             13G   28K   13G   1% /run/user/3530
tmpfs                             13G   28K   13G   1% /run/user/3294
tmpfs                             13G   28K   13G   1% /run/user/3066
tmpfs                             13G   28K   13G   1% /run/user/3180
tmpfs                             13G   40K   13G   1% /run/user/3189
tmpfs                             13G   28K   13G   1% /run/user/3188
tmpfs                             13G   32K   13G   1% /run/user/3605
tmpfs                             13G   28K   13G   1% /run/user/3608
tmpfs                             13G   32K   13G   1% /run/user/3466
tmpfs                             13G   28K   13G   1% /run/user/3604
tmpfs                             13G   28K   13G   1% /run/user/3218
tmpfs                             13G   32K   13G   1% /run/user/3624
tmpfs                            4.0M     0  4.0M   0% /sys/fs/cgroup
/dev/sda3                         59G   34G   22G  61% /var
/dev/sda4                        472G  2.7G  445G   1% /var/tmp
/dev/sda5                        1.3T   76G  1.1T   7% /tmp
oz.berkeley.edu:/pool0/system    6.0T  4.8T  1.3T  80% /system
oz.berkeley.edu:/pool0/scratch    37T   33T  4.3T  89% /scratch
oz.berkeley.edu:/pool0/accounts   66T   20T   47T  30% /accounts

In general, you’ll want to look at the row whose ‘Mounted on’ value is ‘/’, and on standard UNIX machines possibly at the rows for ‘/tmp’, ‘/home’, and others.
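
When the full listing is long, df can also be given a path, in which case it reports only the filesystem that holds that path. A small sketch (the numbers will differ on your machine, and the variable name is only for illustration):

```shell
# Report only the filesystem that contains /tmp,
# rather than every mounted filesystem.
df -h /tmp

# Same idea for the current directory; count the lines of output
# (a header line plus the filesystem entry).
df_lines=$(df -h . | wc -l)
echo "lines of output: $df_lines"
```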

We can see usage in specific directories like this:

cd assets
du -h
8.0K    ./css
8.0K    ./js
140K    ./fonts/Noto-Sans-regular
140K    ./fonts/Noto-Sans-700italic
140K    ./fonts/Noto-Sans-700
132K    ./fonts/Noto-Sans-italic
556K    ./fonts
76K     ./img
748K    .

Here we see that the total usage is about 750 KB, with, for example, about 76 KB of that in the img subdirectory.

If we only want a summary of usage for each top-level subdirectory, rather than all nested subdirectories, we can limit the depth with -d 1:

cd ~/tutorial-unix-basics
du -h -d 1
28K     ./_sass
8.0K    ./_layouts
4.0K    ./_site
16K     ./_includes
84K     ./_freeze
748K    ./assets
1.5M    ./.git
172K    ./.quarto
3.8M    .
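
A common follow-up is to sort these summaries so that the largest directories are easy to spot. The sketch below assumes GNU sort, whose -h flag understands human-readable sizes such as 8.0K and 1.5M. The directory and file names are hypothetical and are created in a temporary directory.

```shell
# Build a small throwaway directory tree (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/small" "$workdir/big"
head -c 1000 /dev/zero > "$workdir/small/a.dat"      # about 1 KB
head -c 2000000 /dev/zero > "$workdir/big/b.dat"     # about 2 MB

# Summarize each top-level subdirectory, then sort by size, smallest first.
sorted=$(du -h -d 1 "$workdir" | sort -h)
echo "$sorted"

# -s reports a single total for the directory.
du -sh "$workdir"
```

Adding -r to sort would reverse the order so that the largest entries come first.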

6.3 Machine information

Linux machines (but not Macs) have system information provided in a few special files.

/proc/cpuinfo shows information on each processor.

head -n 30 /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 45
model name  : Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
stepping    : 7
microcode   : 0x71a
cpu MHz     : 2394.053
cache size  : 10240 KB
physical id : 0
siblings    : 4
core id     : 0
cpu cores   : 4
apicid      : 0
initial apicid  : 0
fpu     : yes
fpu_exception   : yes
cpuid level : 13
wp      : yes
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts md_clear flush_l1d
vmx flags   : vnmi preemption_timer invvpid ept_x_only ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
bugs        : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips    : 4800.18
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel

This indicates there are at least two processors, numbered 0 and 1 (we’d need to see the whole file to see if there are more). Each is an Intel Xeon E5-2609 running at 2.40 GHz.

This file has information on the memory available:

head -n 10 /proc/meminfo
MemTotal:       131966464 kB
MemFree:        22645656 kB
MemAvailable:   98414788 kB
Buffers:          588304 kB
Cached:         32359372 kB
SwapCached:      1249372 kB
Active:         28592240 kB
Inactive:       32685712 kB
Active(anon):    3933376 kB
Inactive(anon): 24488076 kB

The key line is the MemTotal line, which reports about 132 million kB. Because a kB here is 1024 bytes, that is roughly 126 GiB of RAM.

cat /etc/issue
Ubuntu 22.04.1 LTS \n \l

We’re running Ubuntu version 22.04.

We can also use commands to get information:

nproc  # how many processors?
8
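
The special files and the commands can be cross-checked against each other. As a sketch for Linux only: counting the lines that start with "processor" in /proc/cpuinfo gives the number of logical processors listed there, which should normally agree with nproc --all. Plain nproc can report fewer if your process is restricted to a subset of the processors.

```shell
# Linux only: /proc does not exist on macOS, so check before reading it.
if [ -r /proc/cpuinfo ]; then
    n_cpuinfo=$(grep -c '^processor' /proc/cpuinfo)
    echo "processors listed in /proc/cpuinfo: $n_cpuinfo"
    echo "processors reported by nproc --all: $(nproc --all)"
    # Total memory, taken from the MemTotal line of /proc/meminfo.
    grep '^MemTotal' /proc/meminfo
else
    echo "/proc/cpuinfo is not available on this system"
fi
```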

7 The shell

The shell provides a number of useful shortcuts and conveniences, of which we highlight a few here.

7.1 Tab completion

The shell will try to auto-complete the names of commands/programs or of files when you type part of the name and then hit <Tab>. This can save quite a bit of typing, particularly for long file names.

7.2 Keyboard shortcuts

You can navigate within a line using the usual arrows but also:

  • Ctrl-a moves to the beginning of the line
  • Ctrl-e moves to the end of the line
  • Ctrl-k deletes the rest of the line starting at the cursor
  • Ctrl-y pastes in whatever was deleted previously with Ctrl-k
  • Ctrl-r enables an interactive history search

7.3 Command history

The up and down arrow keys will move you through the history of commands you have entered in the terminal. So you can recover something you typed previously and then directly run it again, or edit it and then run the modified version. You run the command by pressing <Enter>, which you can do regardless of where your cursor currently is on the line you are editing.

There’s also lots more functionality along these lines that we won’t go into here.

7.4 Saving your code as a shell script

Often (particularly as you learn more sophisticated shell functionality) you will want to save your shell syntax in the form of a code file, called a script, that you could run another time.

For example, suppose you often need to do the following series of steps:

cd 
tar -cvzf assets.tgz assets
mv assets.tgz /tmp
cd /tmp
tar -xvzf assets.tgz

You can put those lines into a file, say, mv_assets.sh. By convention, shell script file names end in .sh.

Then we can run the code in the file as follows. (Results not shown here.)

chmod ugo+x mv_assets.sh  # Make the script executable by everyone.
./mv_assets.sh            # Run it.

The initial ./ is needed because the shell only looks for executables in the directories listed in the PATH environment variable, and the current directory is usually not one of them.

You’ll generally want to have the first line of your shell scripts indicate the shell to be used to execute the script, so you’d want to put #!/bin/bash as the first line of mv_assets.sh.
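
Putting these pieces together, here is a minimal sketch that writes a small script to a file, makes it executable, and runs it. The script name show_usage.sh and its contents are hypothetical stand-ins (not the mv_assets.sh steps above), chosen so that the example runs without needing an assets directory. It assumes bash is installed at /bin/bash.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

# Write the script. Quoting 'EOF' keeps the current shell from
# expanding $(...) before the text is written to the file.
cat > show_usage.sh <<'EOF'
#!/bin/bash
# Report the current directory and its total disk usage.
echo "Directory: $(pwd)"
du -sh .
EOF

chmod u+x show_usage.sh             # Make the script executable by its owner.
script_output=$(./show_usage.sh)    # Run it from the current directory.
echo "$script_output"
```

Here chmod u+x grants execute permission only to the file’s owner, which is often all that is needed for a personal script.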

8 Practice questions

  1. Try to run the following command: mkdir ~/projects/drought. It will fail (assuming you do not already have a projects directory). Look in the help information on mkdir to figure out how to make it work without first creating the projects directory.

  2. Figure out how to list the files in a directory in order of decreasing file size, as a way to see easily which big files are taking up the most space. Modify this command to get the result in ascending order.

  3. Use both zip and tar -cvzf to compress the tutorial-unix-basics directory. Is one much smaller than the other?

  4. Figure out how to print out free disk space in terms of megabytes.

  5. The ls command is itself an executable installed on the system. Where is it located?

  6. Where is gzip installed on the system? What are some other commands/executables that are installed in the same directory?

  7. Practice with moving/removing/copying. Make a copy of the tutorial-unix-basics directory (and all its contents) in /tmp. Now use cd to go into the copied directory. Remove the /tmp/tutorial-unix-basics/.git directory. Now run git status. Congratulations, you should discover that you’ve turned a directory that is a Git repository into a directory that is not considered a Git repository.