This tutorial covers the basics of navigating in a UNIX-like (e.g., Linux or macOS) environment. In particular, it covers using the UNIX command line interface, a powerful way to carry out operations on a computer and to automate tasks. Being familiar with operating on the command line will allow you (with some practice and training) to do things more quickly and in a way that can be reproduced later. That’s hard or impossible to do if you are doing point-and-click or drag-and-drop operations in a File Manager or Finder window.
Materials for this tutorial, including the Quarto Markdown file that was used to create this document, are available on GitHub.
Software Carpentry has a very nice introductory lesson on the basics of the shell. It also has an accompanying YouTube video. Episodes 1-3 (the first 20 minutes) cover the material that is in this tutorial.
This tutorial by Christopher Paciorek is licensed under a Creative Commons Attribution 3.0 Unported License.
Operating on the UNIX command line is also known as “using the terminal” and “using the shell”.
The shell is the UNIX program that you interact with in a terminal window on a UNIX-style operating system (e.g., Linux or macOS). The shell sits between you and the operating system and provides useful commands and functionality. Basically, the shell is a program that serves to run other commands for you and show you the results. There are actually different shells that you can use, of which bash is very common and is the default on many systems. In recent versions of macOS, zsh is the default shell. zsh is largely compatible with bash for the commands covered here, so you should be able to use zsh based on this tutorial.
I’ve generated this document based on using the bash shell on a computer running the Ubuntu Linux version 22.04 operating system, but you should be able to replicate most of the steps in this tutorial in other UNIX command line environments, ideally using the bash or zsh shells.
Here are some options for accessing a UNIX command line interface:
On a Mac, open Applications -> Utilities -> Terminal.
You probably shouldn’t use Git Bash to follow this tutorial as its functionality is limited.
Once you’re in a Terminal window, you’ll be interacting with the shell and you can enter commands to get information and work with the system. Commands often have optional arguments (flags) that are specified with a minus in front of them, as we’ll see.
Once we are in a terminal, we’ll see the “prompt”, which indicates that the shell is waiting for us to enter commands. Sometimes the prompt is just $:
$
but often it contains information about the username of the current user and the directory on the filesystem that we are in. For example, here is a prompt that shows that the current user is ‘scflocal’, on the machine named ‘gandalf’ in the ‘tutorial-unix-basics’ (sub)directory in the user’s home directory (indicated by ~):
scflocal@gandalf:~/tutorial-unix-basics>
In the remainder of this tutorial, you won’t see the prompt in front of the commands. All commands will appear in a grey background, with the output (if any) following the code.
Note that if you simply see > instead of the usual prompt, that means the shell thinks you haven’t finished entering your command (usually that you haven’t finished entering a string) and is expecting more input from you. If you see a newline but nothing else, the shell probably expects you to enter some text for it to process.
If you’re not sure what to do, type Ctrl-c (the control key and ‘c’ at the same time) to get back to the usual prompt.
Let’s start by running a command, whoami, that prints out the username of the current user:
whoami
scflocal
git for version control
We’ll discuss git briefly, both because it is an important and useful tool, and because it’s the easiest way for us to get a set of files to work with in this tutorial.
Git is an important tool to become familiar with, at least at the basic level. Git allows you to share files between different computers and different people and for people to collaborate on projects together. In particular, it is a version control tool that allows you to have different versions of your files and to go back to earlier versions of your files. Git stores the files for a project in a repository.
For our purposes here, we’ll simply use Git to download materials from GitHub, a website that stores Git repositories in the cloud.
First we’ll download the materials for this tutorial.
To clone (i.e., copy) a repository (in this case from GitHub) we do the following. Note that berkeley-scf is the organization and tutorial-unix-basics is the repository. Note that everything below that follows the # symbol is a comment and not executed.
Here we’ll first use the cd command (for “change directory”) to go to our home directory and then use git clone to download materials to a subdirectory (which will be called tutorial-unix-basics) within our home directory.
cd
git clone https://github.com/berkeley-scf/tutorial-unix-basics
Cloning into 'tutorial-unix-basics'...
remote: Enumerating objects: 387, done.
remote: Counting objects: 100% (66/66), done.
remote: Compressing objects: 100% (45/45), done.
remote: Total 387 (delta 37), reused 46 (delta 19), pack-reused 321 (from 1)
Receiving objects: 100% (387/387), 779.53 KiB | 5.27 MiB/s, done.
Resolving deltas: 100% (199/199), done.
Now suppose that whoever controls the repository makes some changes to the materials in the repository online and you want an updated copy of the repository on your computer. Simply use cd to go into any directory in the repository materials on your computer and run git pull.
cd tutorial-unix-basics
git pull
Already up to date.
In this case, since no changes had been made, git simply reports that things are up-to-date.
We’ll discuss how to use cd
in more detail in the next section.
We’ll start by thinking about the filesystem, which organizes our information/data into files on the computer’s disk.
Anytime you are at the UNIX command line, you have a working directory, which is your current location in the file system.
Here’s how you can see where you are using the pwd
(“print working directory”) command:
pwd
/home/scflocal/tutorial-unix-basics
and here’s how you use ls
to list the files (and subdirectories) in the working directory…
ls
assets
_config.yml
example.text
example.txt
filename with spaces.txt
_freeze
_includes
index.qmd
index.rmarkdown
_layouts
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass
_site
Now suppose I want to be in a different directory so I can see what is there or do things to the files in that directory.
The command you need is cd
and an important concept you need to become familiar with is the notion of ‘relative’ versus ‘absolute’ path. A path is the set of nested directories that specify a location of interest on the filesystem.
First let’s go to our home directory, which is generally where our files will be. Simply running cd
will do that.
cd
pwd
/home/scflocal
Now let’s go into a subdirectory. We can use cd
with the name of the subdirectory. The subdirectory is found ‘relative’ to our working directory, i.e., found from where we currently are.
cd tutorial-unix-basics
pwd
/home/scflocal/tutorial-unix-basics
We could also navigate through nested subdirectories. For example, after going back to our home directory, let’s go to the assets subdirectory of the tutorial-unix-basics subdirectory. The / is a separator character that distinguishes the nested subdirectories.
cd
cd tutorial-unix-basics/assets
pwd
/home/scflocal/tutorial-unix-basics/assets
You can access the parent directory of any directory using ..
:
pwd
cd ..
pwd
/home/scflocal/tutorial-unix-basics/assets
/home/scflocal/tutorial-unix-basics
We can get more complicated in our use of ..
with relative paths. Here we’ll go up a directory and then down to a different subdirectory.
cd assets
cd ../_includes
pwd
/home/scflocal/tutorial-unix-basics/_includes
And here we’ll go up two directories and then down to another subdirectory.
cd ../../Desktop # go up two directories and down
pwd
/home/scflocal/Desktop
All of the above examples used relative paths to navigate based on your working directory at the moment you ran the command.
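To experiment with relative paths without touching your real files, the same kind of navigation can be replayed in a throwaway directory. This is only a minimal sketch: the directory names (project, data, code) are made up for illustration and are not part of the tutorial materials.

```shell
# Build a small, hypothetical directory tree in a temporary location.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project/data" "$sandbox/project/code"

cd "$sandbox/project/data"   # start in the data subdirectory
cd ..                        # up one level, to project
up_one=$(basename "$PWD")

cd data                      # back down into data
cd ../code                   # up one level, then down into code
sibling=$(basename "$PWD")

echo "$up_one"               # prints: project
echo "$sibling"              # prints: code

# Clean up the sandbox.
cd /
rm -rf "$sandbox"
```

Each cd here is interpreted relative to wherever the previous cd left us, which is why the same command can behave differently depending on the working directory.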
We can instead use absolute paths so that it doesn’t matter where we are when we run the command. Specifying an absolute path is done by having your path start with /, such as /home/scflocal. If the path doesn’t start with / then it is interpreted as being a relative path, relative to your working directory. Here we’ll go to the assets subdirectory again, but this time using an absolute path.
cd /home/scflocal/tutorial-unix-basics/assets
pwd
/home/scflocal/tutorial-unix-basics/assets
Note that using absolute paths in scripts is generally a bad idea because the script wouldn’t generally work correctly if run on a different machine (which will generally have a different filesystem structure) or as a different user (who will have a different home directory).
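The rule for telling the two kinds of path apart, and one common way to avoid hard-coding a user name in a script, can be sketched as follows. The function name path_kind and the results directory are hypothetical, introduced only for this illustration; HOME is the standard environment variable holding the current user's home directory.

```shell
# Report whether a path is absolute (starts with /) or relative.
path_kind() {
  case "$1" in
    /*) echo "absolute" ;;
    *)  echo "relative" ;;
  esac
}

path_kind /home/scflocal          # prints: absolute
path_kind tutorial-unix-basics    # prints: relative
path_kind ../_includes            # prints: relative

# In a script, building paths from HOME avoids hard-coding a user name.
results_dir="$HOME/results"       # 'results' is a made-up directory name
echo "$results_dir"
```

A script written this way still produces an absolute path at run time, but the path adapts to whichever user runs it.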
The filesystem is basically an upside-down tree.
For example, if we just consider the tutorial-unix-basics
directory, we can see the tree structure using tree
:
tree
.
├── assets
│   ├── css
│   │   └── style.scss
│   ├── fonts
│   │   ├── Noto-Sans-700
│   │   │   ├── Noto-Sans-700.eot
│   │   │   ├── Noto-Sans-700.svg
│   │   │   ├── Noto-Sans-700.ttf
│   │   │   ├── Noto-Sans-700.woff
│   │   │   └── Noto-Sans-700.woff2
│   │   ├── Noto-Sans-700italic
│   │   │   ├── Noto-Sans-700italic.eot
│   │   │   ├── Noto-Sans-700italic.svg
│   │   │   ├── Noto-Sans-700italic.ttf
│   │   │   ├── Noto-Sans-700italic.woff
│   │   │   └── Noto-Sans-700italic.woff2
│   │   ├── Noto-Sans-italic
│   │   │   ├── Noto-Sans-italic.eot
│   │   │   ├── Noto-Sans-italic.svg
│   │   │   ├── Noto-Sans-italic.ttf
│   │   │   ├── Noto-Sans-italic.woff
│   │   │   └── Noto-Sans-italic.woff2
│   │   └── Noto-Sans-regular
│   │       ├── Noto-Sans-regular.eot
│   │       ├── Noto-Sans-regular.svg
│   │       ├── Noto-Sans-regular.ttf
│   │       ├── Noto-Sans-regular.woff
│   │       └── Noto-Sans-regular.woff2
│   ├── img
│   │   ├── logo.svg
│   │   └── ls_format.png
│   ├── js
│   │   └── scale.fix.js
│   ├── stat_bear.png
│   └── styles.css
├── _config.yml
├── example.text
├── example.txt
├── filename with spaces.txt
├── _freeze
│   ├── index
│   │   └── execute-results
│   │       └── html.json
│   └── site_libs
│       └── clipboard
│           └── clipboard.min.js
├── _includes
│   └── toc.html
├── index.qmd
├── index.rmarkdown
├── _layouts
│   └── default.html
├── mv_assets.sh
├── myfile
├── name of my file with spaces.txt
├── _quarto.yml
├── README.md
├── _sass
│   ├── fonts.scss
│   ├── jekyll-theme-minimal.scss
│   ├── jekyll-theme-minimal.scss.bak
│   ├── minimal.scss
│   └── rouge-github.scss
└── _site
18 directories, 46 files
The dot (.) means “this directory”, so the top of the tree here is the tutorial-unix-basics directory itself, within which there are subdirectories, assets, _includes, _layouts, etc. Then within each of these are files and further subdirectories (as seen in the case of assets, which has subdirectories such as css and fonts).
If we consider the entire filesystem, the top, or root of the tree, is the /
directory. Within /
there are subdirectories, such as /home
(which contains users’ home directories where all of the files owned by a user are stored) and /bin
(containing UNIX programs, aka ‘binaries’). We’ll use ls
again, this time telling it the directory to operate on:
ls /
accounts
app
bin
boot
dev
etc
home
lib
lib32
lib64
libx32
lost+found
media
mirror
mnt
opt
pool0
proc
root
run
sbin
scratch
server
srv
swap.img
sys
system
tmp
usr
var
If there is a user named scflocal
, everything specific to that user would be stored in the user’s home directory. Here that is /home/scflocal
, but the exact location may differ on different systems. The shortcut ~scflocal
refers to the scflocal
home directory, /home/scflocal
. If you are the scflocal
user, you can also refer to your home directory by the shortcut ~
.
ls /home
scflocal
shiny
cd /home/scflocal
pwd
/home/scflocal
Go to the home directory of the current user (which happens to be the scflocal
user):
cd ~
pwd
/home/scflocal
Go to the home directory of the scflocal user explicitly:
cd ~scflocal
pwd
/home/scflocal
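One detail worth knowing about the ~ shortcut is that the shell only expands it when it is unquoted. A short sketch (the variable names are arbitrary):

```shell
unquoted=~          # the shell replaces ~ with your home directory
quoted="~"          # inside quotes, ~ is kept as a literal character

echo "$unquoted"    # same value as $HOME
echo "$quoted"      # prints: ~
```

So a command such as cd "~" would look for a directory literally named ~ rather than going to your home directory.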
Another useful directory is /tmp, which is a good place to put temporary files that you only need briefly and don’t need to save. On many systems, these are deleted when the machine is rebooted.
cd /tmp
ls
assets
assets.tgz
quarto-session71f9ea197eacbf29
RtmpMQcgrV
Temp-76c6318d-4f54-44d2-8bd9-d9a42eeeb7ce
test
We can return to the most recent directory we were in like this:
cd -
pwd
/home/scflocal
Let’s look more at various ways to use commands. We just saw the ls
command. Here’s one way we can modify the behavior of the command by passing a command option. Here the -F
option (also called a ‘flag’) shows directories by appending /
to anything that is a directory (rather than a file) and a *
to anything that is an executable (i.e., a program).
ls -F
assets/
_config.yml
example.text
example.txt
filename with spaces.txt
_freeze/
_includes/
index.qmd
index.rmarkdown
_layouts/
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass/
_site/
Next we’ll use multiple options to the ls
command. -l
shows extended information about files/directories. -t
shows files/directories in order of the time at which they were last modified and -r
shows in reverse order. Before I run ls
, I’ll create an empty file using the touch
command. Given this, what file do you expect to be displayed last when you do the following?
touch myfile
ls -lrt
total 112
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 _sass
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 _layouts
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 _includes
-rw-r--r-- 1 scflocal scflocal 291 Oct 29 15:17 _config.yml
-rw-r--r-- 1 scflocal scflocal 567 Oct 30 16:35 README.md
-rw-r--r-- 1 scflocal scflocal 6 Oct 30 16:38 name of my file with spaces.txt
-rw-r--r-- 1 scflocal scflocal 52 Oct 30 16:45 example.text
-rw-r--r-- 1 scflocal scflocal 91 Oct 30 17:00 mv_assets.sh
-rw-r--r-- 1 scflocal scflocal 51 Oct 30 17:03 example.txt
drwxr-xr-x 4 scflocal scflocal 4096 Oct 30 17:18 _freeze
drwxr-xr-x 6 scflocal scflocal 4096 Oct 31 13:54 assets
-rw-r--r-- 1 scflocal scflocal 531 Oct 31 14:34 _quarto.yml
-rw-r--r-- 1 scflocal scflocal 10 Oct 31 14:38 filename with spaces.txt
-rw-r--r-- 1 scflocal scflocal 26755 Oct 31 14:42 index.qmd
drwxr-xr-x 2 scflocal scflocal 4096 Oct 31 14:42 _site
-rw-r--r-- 1 scflocal scflocal 27079 Oct 31 14:42 index.rmarkdown
-rw-r--r-- 1 scflocal scflocal 0 Oct 31 14:42 myfile
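To see the effect of -t and -r in isolation, here is a small sketch run in a temporary directory. It assumes the -t option of touch, which sets an explicit modification time; the file names are made up.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# touch -t sets an explicit modification time (format: YYYYMMDDhhmm).
touch -t 202001010000 older.txt
touch -t 202101010000 newer.txt

newest_first=$(ls -t | head -n 1)    # -t: most recently modified first
newest_last=$(ls -rt | tail -n 1)    # -r reverses, so the newest comes last

echo "$newest_first"    # prints: newer.txt
echo "$newest_last"     # prints: newer.txt

cd /
rm -rf "$sandbox"
```

This is why, in the listing above, the file that was just touched appears at the bottom when using -lrt.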
While each command has its own syntax, there are some rules usually followed. Generally, executing a command consists of four things: the command, command option(s), argument(s), and line acceptance (i.e., hitting Enter).
Here’s an example:
wc -l example.txt
4 example.txt
In the above example, wc is the command, -l is a command option specifying to count the number of lines, example.txt is the argument, and the line acceptance is indicated by hitting the Enter key at the end of the line.
So that invocation counts the number of lines in the file named example.txt.
The spaces are required and distinguish the different parts of the invocation. For this reason, it’s generally a bad idea to have spaces within file names on a UNIX system. But if you do, you can use quotation marks to distinguish the file name, e.g.,
echo "some text" > "filename with spaces.txt"
ls -l "filename with spaces.txt"
-rw-r--r-- 1 scflocal scflocal 10 Oct 31 14:42 filename with spaces.txt
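Besides quotation marks, a space in a file name can also be protected with a backslash. The following sketch shows both forms in a temporary directory; the file name is hypothetical.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

echo "some text" > "my notes.txt"    # quotes keep the name as one argument

# Two equivalent ways to refer to the file afterwards:
quoted=$(cat "my notes.txt")         # quoting the name
escaped=$(cat my\ notes.txt)         # escaping the space with a backslash

echo "$quoted"     # prints: some text
echo "$escaped"    # prints: some text

# Without quotes or escaping, the shell would pass two separate arguments,
# 'my' and 'notes.txt', and neither of those files exists.

cd /
rm -rf "$sandbox"
```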
Also, capitalization matters. For example -l
and -L
are different options.
Note that options, arguments, or both might not be included in some cases. Recall that we’ve used ls
without either options or arguments.
Arguments are usually one or more files or directories.
Often we can specify an option either in short form (as with -l
here) or long form (--lines
here), as seen in the following equivalent invocations:
wc -l example.txt
wc --lines example.txt
4 example.txt
4 example.txt
We can also ask for the number of characters with the -m
option, which can be combined with the -l
option equivalently in two ways:
wc -lm example.txt
wc -l -m example.txt
4 51 example.txt
4 51 example.txt
Options will often take values, e.g., if we want to get the first two lines of the file, the following invocations are equivalent:
head -n 2 example.txt
head --lines=2 example.txt
head --lines 2 example.txt
Hello there.
This is a file
Hello there.
This is a file
Hello there.
This is a file
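To try these invocations without the tutorial repository, the sketch below creates a four-line file with the same contents as example.txt (under the made-up name demo.txt). It sticks to short options, since long options such as --lines come from the GNU versions of these tools and may not be available on other systems such as macOS.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a four-line file with the same contents as example.txt.
printf 'Hello there.\nThis is a file\nthat contains\n4 lines.\n' > demo.txt

# Reading from standard input (<) makes wc print only the count, with no file name.
n_lines=$(wc -l < demo.txt)
n_chars=$(wc -m < demo.txt)
first_two=$(head -n 2 demo.txt)

echo "lines: $n_lines"
echo "characters: $n_chars"
echo "$first_two"

cd /
rm -rf "$sandbox"
```

The counts should agree with the 4 lines and 51 characters reported for example.txt above.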
Essentially all UNIX commands have help information (called a man page), accessed using man
. We won’t show the results here as they are rather long.
man ls
You should try it yourself to practice viewing man pages. Once you are in the man page, you can navigate by hitting the space bar (to scroll down) and the up and down arrows. You can search by typing /, typing the string you want to search for and hitting <Enter>. In the usual pager (less), you can use n and N (Shift-n) for the next and previous search hits and q to quit the man page.
Unfortunately man pages are often quite long, hard to understand, and without examples. But the information you need is usually there if you take the time to look for it.
Also, UNIX commands as well as other programs run from the command line often provide help information via the --help
option:
ls --help
Again, we’re not showing the output as it is rather long.
You can see if a command or program is installed (and where it is installed) using type
.
type grep
type R
type python
grep is /usr/bin/grep
R is /usr/bin/R
python is /usr/local/linux/miniforge-3.12/bin/python
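Inside a script, a commonly used way to make the same check is command -v, which succeeds only if the program can be found. A minimal sketch; the function name check_installed and the program name no-such-program-xyz are made up for illustration.

```shell
# Print a message saying whether a given program can be found.
check_installed() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "$1 is available"
  else
    echo "$1 was not found"
  fi
}

check_installed ls
check_installed no-such-program-xyz    # a made-up name, to show the failure case
```

A script can use a check like this to stop early with a helpful message rather than failing partway through.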
You’ll often want to make a copy of a file, move it between directories, or remove it.
cp example.txt example-new.txt
mv example-new.txt /tmp/.
cd /tmp
ls -lrt
total 468
drwx------ 2 scflocal scflocal 4096 Oct 29 15:21 Temp-76c6318d-4f54-44d2-8bd9-d9a42eeeb7ce
drwxr-xr-x 6 scflocal scflocal 4096 Oct 31 13:54 assets
drwxr-xr-x 2 scflocal scflocal 4096 Oct 31 14:38 test
-rw-r--r-- 1 scflocal scflocal 453183 Oct 31 14:38 assets.tgz
drwx------ 37 scflocal scflocal 4096 Oct 31 14:42 quarto-session71f9ea197eacbf29
drwx------ 2 scflocal scflocal 4096 Oct 31 14:42 RtmpMQcgrV
-rw-r--r-- 1 scflocal scflocal 51 Oct 31 14:42 example-new.txt
When we moved the file, the use of /.
in /tmp/.
indicates we want to use the same name as the original file.
cd /tmp
rm example-new.txt
ls -lrt
total 464
drwx------ 2 scflocal scflocal 4096 Oct 29 15:21 Temp-76c6318d-4f54-44d2-8bd9-d9a42eeeb7ce
drwxr-xr-x 6 scflocal scflocal 4096 Oct 31 13:54 assets
drwxr-xr-x 2 scflocal scflocal 4096 Oct 31 14:38 test
-rw-r--r-- 1 scflocal scflocal 453183 Oct 31 14:38 assets.tgz
drwx------ 37 scflocal scflocal 4096 Oct 31 14:42 quarto-session71f9ea197eacbf29
drwx------ 2 scflocal scflocal 4096 Oct 31 14:42 RtmpMQcgrV
rm is forever
I used rm
above to remove the file. Be very careful about removing files - there is no Trash folder in UNIX - once a file is removed, it’s gone for good.
The mv
command is also used if you want to rename a file.
cd ~/tutorial-unix-basics
mv example.txt silly_example.txt
ls
assets
_config.yml
example.text
filename with spaces.txt
_freeze
_includes
index.qmd
index.rmarkdown
_layouts
mv_assets.sh
myfile
name of my file with spaces.txt
_quarto.yml
README.md
_sass
silly_example.txt
_site
We can copy and remove entire directories. The -p
flag preserves the time stamp and other information associated with the files/directories, while the -r
option copies recursively, such that the directory and all its contents (all child files and directories) are also copied.
cp -pr assets /tmp/. # Copy the assets directory into /tmp.
cd /tmp
mkdir -p test # Create the test directory (no error if it already exists).
mv assets test # Move the assets directory into the test directory.
ls -l test/assets
total 112
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 css
drwxr-xr-x 6 scflocal scflocal 4096 Oct 29 15:17 fonts
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 img
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 js
-rw-r--r-- 1 scflocal scflocal 92106 Oct 30 17:10 stat_bear.png
-rw-r--r-- 1 scflocal scflocal 69 Oct 31 13:54 styles.css
rm -rf /tmp/test/assets # Remove the assets directory and anything contained within it.
ls /tmp/test # This should be empty now.
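The same copy, move, and remove steps can be practiced safely in a temporary directory. This is a sketch with made-up directory and file names, not the tutorial's own files.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Set up a hypothetical directory with one file in it.
mkdir -p original/sub
echo "hello" > original/sub/file.txt

cp -pr original copy          # recursive copy; -p preserves time stamps
mv copy renamed               # rename (move) the copied directory
copied_text=$(cat renamed/sub/file.txt)
echo "$copied_text"           # prints: hello

rm -r renamed                 # remove the copy and everything inside it
if [ -e renamed ]; then still_there=yes; else still_there=no; fi
echo "$still_there"           # prints: no

cd /
rm -rf "$sandbox"
```

Working in a directory created by mktemp means a mistaken rm can only affect the practice files.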
You can use a variant of cp named scp to copy files between different UNIX-like machines. Suppose I have access to the machine radagast.berkeley.edu and that my user name on that machine is scf1. I can copy a file to that machine or from that machine as follows.
(Note that I am not running the code in the process of generating this document.)
cd ~/tutorial-unix-basics
# FROM the machine you're on TO another machine
# Copy the file to the Desktop subdirectory of the scf1 home directory on the remote machine
scp example.txt scf1@radagast.berkeley.edu:~/Desktop/.
# FROM another machine TO the machine you're on
# Copy a file from the /tmp directory of the remote machine to a specific directory on this machine
scp scf1@radagast.berkeley.edu:/tmp/data.txt ~/Downloads/.
The format a file is in is determined by the actual content of the file. You can determine the file format using file
:
file index.qmd
file /usr/local/linux/miniforge-3.12/lib/python3.12/site-packages/numpy/dtypes.py
index.qmd: exported SGML document, ASCII text, with very long lines (615)
/usr/local/linux/miniforge-3.12/lib/python3.12/site-packages/numpy/dtypes.py: Python script, ASCII text executable
In many cases, files have extensions such as .csv (for comma-separated text files), .pdf (for PDFs), and .jpg (for JPEG files). The extension is a convention that helps us and programs distinguish different kinds of files and therefore know how to manipulate/interpret the files.
The extension is just a convention – changing the file name doesn’t change the file format!
So if I make a copy of the silly_example.txt file but name it silly_example.pdf, we see that it’s still just a simple text file even if I give it a name that would suggest it’s a PDF.
cp silly_example.txt silly_example.pdf
cat silly_example.pdf
Hello there.
This is a file
that contains
4 lines.
file silly_example.pdf
silly_example.pdf: ASCII text
However, changing the extension may prevent a program from using the file simply because the program was written to assume that files in a certain format have a certain extension.
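To illustrate the idea of judging a file by its content rather than its name: real PDF files begin with the characters %PDF. The sketch below checks only those first four bytes, which is far cruder than what the file command actually does; the function name looks_like_pdf and the file names are hypothetical.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# A plain text file that merely has a .pdf extension.
printf 'Hello there.\n' > pretend.pdf
# A file with a PDF-style first line but no .pdf extension (not a valid PDF).
printf '%%PDF-1.7\n' > header_only.dat

# Report whether a file starts with the %PDF signature.
looks_like_pdf() {
  if [ "$(head -c 4 "$1")" = "%PDF" ]; then
    echo "yes"
  else
    echo "no"
  fi
}

text_result=$(looks_like_pdf pretend.pdf)
header_result=$(looks_like_pdf header_only.dat)
echo "$text_result"      # prints: no
echo "$header_result"    # prints: yes

cd /
rm -rf "$sandbox"
```

The extension plays no role in either answer, which is the point made above.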
The zip
utility compresses in a format compatible with zip files for Windows:
zip -r assets.zip assets
adding: assets/ (stored 0%)
adding: assets/css/ (stored 0%)
adding: assets/css/style.scss (stored 0%)
adding: assets/js/ (stored 0%)
adding: assets/js/scale.fix.js (deflated 62%)
adding: assets/fonts/ (stored 0%)
adding: assets/fonts/Noto-Sans-regular/ (stored 0%)
adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf (deflated 34%)
adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg (deflated 66%)
adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2 (stored 0%)
adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot (deflated 0%)
adding: assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff (deflated 1%)
adding: assets/fonts/Noto-Sans-700italic/ (stored 0%)
adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf (deflated 32%)
adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2 (stored 0%)
adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot (deflated 0%)
adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff (deflated 1%)
adding: assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg (deflated 68%)
adding: assets/fonts/Noto-Sans-700/ (stored 0%)
adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf (deflated 35%)
adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.eot (deflated 0%)
adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.svg (deflated 66%)
adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2 (stored 0%)
adding: assets/fonts/Noto-Sans-700/Noto-Sans-700.woff (deflated 1%)
adding: assets/fonts/Noto-Sans-italic/ (stored 0%)
adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg (deflated 67%)
adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2 (stored 0%)
adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot (deflated 0%)
adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf (deflated 31%)
adding: assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff (deflated 1%)
adding: assets/stat_bear.png (deflated 18%)
adding: assets/img/ (stored 0%)
adding: assets/img/logo.svg (deflated 70%)
adding: assets/img/ls_format.png (deflated 4%)
adding: assets/styles.css (deflated 4%)
ls -l assets.zip
-rw-r--r-- 1 scflocal scflocal 444635 Oct 31 14:42 assets.zip
gzip
is a standard UNIX compression utility to compress individual files:
cp assets/img/ls_format.png test.png
ls -l test.png
-rw-r--r-- 1 scflocal scflocal 52402 Oct 31 14:42 test.png
Here we see that gzip can’t compress the png file much, but it can help a lot with other formats.
gzip test.png
ls -l test.png.gz # Not much smaller than the uncompressed file.
-rw-r--r-- 1 scflocal scflocal 50200 Oct 31 14:42 test.png.gz
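By contrast, repetitive text compresses very well. The sketch below builds such a file, compresses it with gzip -c (which writes to standard output and so leaves the original in place), and checks that decompressing gives back identical content. The file names are made up.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Create a highly repetitive (hence very compressible) text file.
i=0
while [ "$i" -lt 1000 ]; do
  echo "this line repeats many times"
  i=$((i + 1))
done > repeated.txt

# -c writes the compressed data to standard output, leaving repeated.txt in place.
gzip -c repeated.txt > repeated.txt.gz

size_before=$(wc -c < repeated.txt)
size_after=$(wc -c < repeated.txt.gz)
echo "before: $size_before bytes, after: $size_after bytes"

# gunzip -c decompresses to standard output; cmp -s confirms nothing was lost.
if gunzip -c repeated.txt.gz | cmp -s - repeated.txt; then
  round_trip=ok
else
  round_trip=failed
fi
echo "$round_trip"    # prints: ok

cd /
rm -rf "$sandbox"
```

PNG files are already compressed internally, which is why gzip gains so little on test.png.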
Finally, the tar
utility will combine multiple files and directories into a single archive.
tar -cvf assets.tar assets
assets/
assets/css/
assets/css/style.scss
assets/js/
assets/js/scale.fix.js
assets/fonts/
assets/fonts/Noto-Sans-regular/
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff
assets/fonts/Noto-Sans-700italic/
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg
assets/fonts/Noto-Sans-700/
assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf
assets/fonts/Noto-Sans-700/Noto-Sans-700.eot
assets/fonts/Noto-Sans-700/Noto-Sans-700.svg
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff
assets/fonts/Noto-Sans-italic/
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff
assets/stat_bear.png
assets/img/
assets/img/logo.svg
assets/img/ls_format.png
assets/styles.css
ls -l assets.tar
-rw-r--r-- 1 scflocal scflocal 686080 Oct 31 14:42 assets.tar
Adding the -z flag also gzips the result. Here the archive shrinks noticeably, from 686080 bytes for assets.tar to 453183 bytes for assets.tgz.
tar -cvzf assets.tgz assets
assets/
assets/css/
assets/css/style.scss
assets/js/
assets/js/scale.fix.js
assets/fonts/
assets/fonts/Noto-Sans-regular/
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff
assets/fonts/Noto-Sans-700italic/
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg
assets/fonts/Noto-Sans-700/
assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf
assets/fonts/Noto-Sans-700/Noto-Sans-700.eot
assets/fonts/Noto-Sans-700/Noto-Sans-700.svg
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff
assets/fonts/Noto-Sans-italic/
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff
assets/stat_bear.png
assets/img/
assets/img/logo.svg
assets/img/ls_format.png
assets/styles.css
ls -l assets.tgz
-rw-r--r-- 1 scflocal scflocal 453183 Oct 31 14:42 assets.tgz
Now let’s move that tarball (as it is called) to a new directory and unzip and expand it using the -x flag.
mv assets.tgz /tmp
cd /tmp
tar -xvzf assets.tgz
assets/
assets/css/
assets/css/style.scss
assets/js/
assets/js/scale.fix.js
assets/fonts/
assets/fonts/Noto-Sans-regular/
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.ttf
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.svg
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff2
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.eot
assets/fonts/Noto-Sans-regular/Noto-Sans-regular.woff
assets/fonts/Noto-Sans-700italic/
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.ttf
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff2
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.eot
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.woff
assets/fonts/Noto-Sans-700italic/Noto-Sans-700italic.svg
assets/fonts/Noto-Sans-700/
assets/fonts/Noto-Sans-700/Noto-Sans-700.ttf
assets/fonts/Noto-Sans-700/Noto-Sans-700.eot
assets/fonts/Noto-Sans-700/Noto-Sans-700.svg
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff2
assets/fonts/Noto-Sans-700/Noto-Sans-700.woff
assets/fonts/Noto-Sans-italic/
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.svg
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff2
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.eot
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.ttf
assets/fonts/Noto-Sans-italic/Noto-Sans-italic.woff
assets/stat_bear.png
assets/img/
assets/img/logo.svg
assets/img/ls_format.png
assets/styles.css
You can see the whole directory structure of what was archived has been recovered in the new location:
ls -l /tmp/assets
total 112
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 css
drwxr-xr-x 6 scflocal scflocal 4096 Oct 29 15:17 fonts
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 img
drwxr-xr-x 2 scflocal scflocal 4096 Oct 29 15:17 js
-rw-r--r-- 1 scflocal scflocal 92106 Oct 30 17:10 stat_bear.png
-rw-r--r-- 1 scflocal scflocal 69 Oct 31 13:54 styles.css
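Two further tar options are often handy: -t lists an archive's contents without extracting it, and -C extracts into a directory of your choice. A minimal sketch using a made-up project directory:

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# A small hypothetical directory to archive.
mkdir -p project/docs
echo "notes" > project/docs/notes.txt

tar -czf project.tgz project          # create a gzipped archive

# -t lists the archive's contents without extracting anything.
listing=$(tar -tzf project.tgz)
echo "$listing"

# -C extracts into the given (existing) directory instead of the working directory.
mkdir restored
tar -xzf project.tgz -C restored
restored_text=$(cat restored/project/docs/notes.txt)
echo "$restored_text"                 # prints: notes

cd /
rm -rf "$sandbox"
```

Listing first is a cheap way to check what an unfamiliar tarball will create before you expand it.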
You can see how much disk space is being used versus available as follows. The ‘Mounted on’ column will generally identify the parts of the filesystem in a more user-friendly way than the ‘Filesystem’ column.
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 59G 32G 25G 57% /
tmpfs 63G 68M 63G 1% /dev/shm
tmpfs 13G 23M 13G 1% /run
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 13G 28K 13G 1% /run/user/3173
tmpfs 13G 28K 13G 1% /run/user/3520
tmpfs 13G 28K 13G 1% /run/user/3417
tmpfs 13G 28K 13G 1% /run/user/764
tmpfs 13G 28K 13G 1% /run/user/3023
tmpfs 13G 28K 13G 1% /run/user/3565
tmpfs 13G 28K 13G 1% /run/user/3530
tmpfs 13G 28K 13G 1% /run/user/3294
tmpfs 13G 28K 13G 1% /run/user/3066
tmpfs 13G 28K 13G 1% /run/user/3180
tmpfs 13G 40K 13G 1% /run/user/3189
tmpfs 13G 28K 13G 1% /run/user/3188
tmpfs 13G 32K 13G 1% /run/user/3605
tmpfs 13G 28K 13G 1% /run/user/3608
tmpfs 13G 32K 13G 1% /run/user/3466
tmpfs 13G 28K 13G 1% /run/user/3604
tmpfs 13G 28K 13G 1% /run/user/3218
tmpfs 13G 32K 13G 1% /run/user/3624
tmpfs 4.0M 0 4.0M 0% /sys/fs/cgroup
/dev/sda3 59G 34G 22G 61% /var
/dev/sda4 472G 2.7G 445G 1% /var/tmp
/dev/sda5 1.3T 76G 1.1T 7% /tmp
oz.berkeley.edu:/pool0/system 6.0T 4.8T 1.3T 80% /system
oz.berkeley.edu:/pool0/scratch 37T 33T 4.3T 89% /scratch
oz.berkeley.edu:/pool0/accounts 66T 20T 47T 30% /accounts
In general, you’ll want to look at the ‘/’ line under Mounted on
, and on standard UNIX machines possibly at ‘/tmp’, ‘/home’, and others.
We can see usage in specific directories like this:
cd assets
du -h
8.0K ./css
8.0K ./js
140K ./fonts/Noto-Sans-regular
140K ./fonts/Noto-Sans-700italic
140K ./fonts/Noto-Sans-700
132K ./fonts/Noto-Sans-italic
556K ./fonts
76K ./img
748K .
Here we see that the total usage is about 750 KB, with, for example, about 76 KB of that in the img subdirectory.
If we only want a summary of usage for each top-level subdirectory, rather than showing all nested subdirectories, we can limit the depth with the -d option:
cd ~/tutorial-unix-basics
du -h -d 1
28K ./_sass
8.0K ./_layouts
4.0K ./_site
16K ./_includes
84K ./_freeze
748K ./assets
1.5M ./.git
172K ./.quarto
3.8M .
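If all you need is a single total for a directory, the `-s` (summarize) flag collapses the report to one line. The sketch below builds a small throwaway directory so that it can be run anywhere; the name `demo_dir` and the files inside it are made up for illustration.

```shell
# Create a small throwaway directory so the example is self-contained.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/css" "$demo_dir/img"
echo "body { margin: 0; }" > "$demo_dir/css/styles.css"

# Without -s: one line per directory, including nested ones.
du -h "$demo_dir"

# With -s: a single summary line for the whole directory.
summary=$(du -sh "$demo_dir")
echo "$summary"

# Clean up the throwaway directory.
rm -r "$demo_dir"
```

In the tutorial directory, the equivalent would be `du -sh assets`.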
Linux machines (but not Macs) have system information provided in a few special files.
/proc/cpuinfo shows information on each processor.
head -n 30 /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
stepping : 7
microcode : 0x71a
cpu MHz : 2394.053
cache size : 10240 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid xsaveopt dtherm arat pln pts md_clear flush_l1d
vmx flags : vnmi preemption_timer invvpid ept_x_only ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 4800.18
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
This indicates there are at least two processors, numbered 0 and 1 (we’d need to see the whole file to see if there are more). Each is an Intel Xeon E5-2609 running at 2.40 GHz.
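Rather than reading the whole file, one way to count the entries is with `grep`, a command that searches for lines matching a pattern and that this tutorial does not otherwise cover. A minimal sketch, assuming a Linux machine where each entry starts with a line beginning with `processor`:

```shell
# -c prints the number of matching lines instead of the lines themselves.
# The ^ anchors the match to the start of the line.
grep -c '^processor' /proc/cpuinfo
```

The count reflects logical processors, so it can exceed the number of physical cores on machines with hyperthreading.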
This file has information on the memory available:
head -n 10 /proc/meminfo
MemTotal: 131966464 kB
MemFree: 22645656 kB
MemAvailable: 98414788 kB
Buffers: 588304 kB
Cached: 32359372 kB
SwapCached: 1249372 kB
Active: 28592240 kB
Inactive: 32685712 kB
Active(anon): 3933376 kB
Inactive(anon): 24488076 kB
The key line is the MemTotal line, indicating about 132 million kB of RAM. Because the kernel counts each kB here as 1024 bytes, that is roughly 126 GiB.
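To read the total in more familiar units, the value can be converted in the shell. This is only a sketch: it assumes a Linux machine, treats each kB in the file as 1024 bytes, and uses `grep`, `tr`, and a variable named `mem_kb` purely for illustration.

```shell
# Keep only the digits from the MemTotal line, giving the size in kB.
mem_kb=$(grep '^MemTotal' /proc/meminfo | tr -dc '0-9')

# Shell arithmetic is integer-only, so the result is rounded down.
echo "$(( mem_kb / 1024 / 1024 )) GiB (rounded down)"
```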
cat /etc/issue
Ubuntu 22.04.1 LTS \n \l
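We’re running Ubuntu version 22.04. (The trailing \n and \l are placeholders that the login prompt replaces with the hostname and the terminal name.)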
We can also use commands to get information:
nproc # how many processors?
8
The shell provides a number of useful shortcuts, of which we highlight a couple here.
The shell will try to auto-complete the names of commands/programs or of files when you type part of the name and then hit <Tab>. This can save quite a bit of typing, particularly for long file names.
You can navigate within a line using the usual arrows but also:
Ctrl-a moves to the beginning of the line
Ctrl-e moves to the end of the line
Ctrl-k deletes the rest of the line starting at the cursor
Ctrl-y pastes in whatever was deleted previously with Ctrl-k
Ctrl-r enables an interactive history search
The up and down arrow keys will move you through the history of commands you have entered in the terminal. So you can recover something you typed previously and then directly run it again, or edit it and then run the modified version. You run the command by pressing <Enter>, which you can do regardless of where your cursor currently is on the line you are editing.
There’s also lots more functionality along these lines that we won’t go into here.
Often (particularly as you learn more sophisticated shell functionality) you will want to save your shell syntax in the form of a code file, called a script, that you could run another time.
For example, suppose you often need to do the following series of steps:
cd
tar -cvzf assets.tgz assets
mv assets.tgz /tmp
cd /tmp
tar -xvzf assets.tgz
You can put those lines into a file, say, mv_assets.sh. By convention, the names of shell script files end in .sh.
Then we can run the code in the file as follows. (Results not shown here.)
chmod ugo+x mv_assets.sh # Make the script executable by everyone.
./mv_assets.sh # Run it.
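The initial ./ is needed because the shell only looks for executables in a fixed list of directories (stored in the PATH variable), and the current directory is usually not on that list. Writing ./ tells the shell explicitly that the file is in the current directory.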
You’ll generally want to have the first line of your shell scripts indicate the shell to be used to execute the script, so you’d want to put #!/bin/bash as the first line of mv_assets.sh.
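Putting these pieces together, here is a minimal sketch of writing and running a script with a `#!/bin/bash` first line. To avoid touching your home directory or /tmp, it works in a throwaway directory and uses a shortened, hypothetical script named `archive_assets.sh` rather than `mv_assets.sh`. The `cat > file << 'EOF'` construct (a "here document") is just a way to create the file from the command line; you could equally well type the three lines into a text editor.

```shell
# Work in a throwaway directory containing a tiny assets directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir assets
echo "example" > assets/example.txt

# Create the script. Its first line names the shell that should run it.
cat > archive_assets.sh << 'EOF'
#!/bin/bash
# Archive the assets directory, then list what the archive contains.
tar -czf assets.tgz assets
tar -tzf assets.tgz
EOF

chmod u+x archive_assets.sh   # Make the script executable by its owner.
./archive_assets.sh           # Run it.
```

Here `chmod u+x` grants execute permission only to the file's owner, which is often all that is needed for a personal script.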
Try to run the following command: mkdir ~/projects/drought. It will fail. Look in the help information on mkdir to figure out how to make it work without first creating the projects directory.
Figure out how to list the files in a directory in order of decreasing file size, as a way to see easily which big files are taking up the most space. Modify this command to get the result in ascending order.
Use both zip and tar -cvzf to compress the tutorial-unix-basics directory. Is one much smaller than the other?
Figure out how to print out free disk space in terms of megabytes.
The ls command is itself an executable installed on the system. Where is it located?
Where is gzip installed on the system? What are some other commands/executables that are installed in the same directory?
Practice with moving/removing/copying. Make a copy of the tutorial-unix-basics directory (and all its contents) in /tmp. Now use cd to go into the copied directory. Remove the /tmp/tutorial-unix-basics/.git directory. Now run git status. Congratulations, you should discover that you’ve turned a directory that is a Git repository into a directory that is not considered a Git repository.
4.3 Comments
Anything that follows # is a comment and is ignored.
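A short sketch of how this plays out in practice; the variable name `greeting` is made up for illustration. Note that a # inside quotes is ordinary text, not the start of a comment.

```shell
# This whole line is a comment and is ignored.
greeting="hello"          # Text after the # on this line is ignored too.
echo "$greeting"

# Inside quotes, # is just another character.
echo "# not a comment"    # prints: # not a comment
```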