Managing processes
A process is a program that is being executed. Processes have the following attributes:
- A lifetime.
- A process ID (PID).
- A user ID (UID).
- A group ID (GID).
- A parent process with its own ID (PPID).
- An environment.
- A current working directory.
Anytime you do something on the computer, one or more processes will start up to carry out the activity.
1 Monitoring
1.1 Monitoring processes
Using ps
Examining subprocesses of your shell with ps
:
$ ps
PID TTY TIME CMD
19370 pts/3 00:00:00 bash
22846 pts/3 00:00:00 ps
Examining in more detail subprocesses of your shell with ps
:
$ ps -f
UID PID PPID C STIME TTY TIME CMD
jarrod 19370 19368 0 10:51 pts/3 00:00:00 bash
jarrod 22850 19370 0 14:57 pts/3 00:00:00 ps -f
Examining in more detail all processes on your computer:
$ ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Aug21 ? 00:00:05 /usr/lib/systemd
root 2 0 0 Aug21 ? 00:00:00 [kthreadd]
root 3 2 0 Aug21 ? 00:00:07 [ksoftirqd/0]
root 5 2 0 Aug21 ? 00:00:00 [kworker/0:0H]
<snip>
root 16210 1 0 07:19 ? 00:00:00 login -- jarrod
jarrod 16219 16210 0 07:19 tty1 00:00:00 -bash
jarrod 16361 16219 0 07:19 tty1 00:00:00 /bin/sh /bin/startx
<snip>
You can use the -u
option to see percent CPU and percent memory used by each process. You can use the -o
option to provide your own user-defined format; for example, :
$ ps -o pid,ni,pcpu,pmem,user,comm
PID NI %CPU %MEM USER COMMAND
18124 0 0.0 0.0 jarrod bash
22963 0 0.0 0.0 jarrod ps
To see the hierarchical process structure (i.e., which processes started which other processes), you can use the pstree
command.
Using top
Examining processes with top
:
$ top
top - 13:49:07 up 1:49, 3 users, load average: 0.10, 0.15, 0.18
Tasks: 160 total, 1 running, 158 sleeping, 1 stopped, 0 zombie
%Cpu(s): 2.5 us, 0.5 sy, 0.0 ni, 96.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 7893644 total, 5951552 free, 1085584 used, 856508 buff/cache
KiB Swap: 7897084 total, 7897084 free, 0 used. 6561548 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1607 jarrod 20 0 2333568 974888 212944 S 12.5 12.4 11:10.67 firefox
3366 jarrod 20 0 159828 4312 3624 R 6.2 0.1 0:00.01 top
1 root 20 0 193892 8484 5636 S 0.0 0.1 0:01.78 systemd
<snip>
The first few lines show general information about the machine, while the remaining lines show information for each process.
- The
RES
column indicates the amount of memory that a process is using (in bytes, if not otherwise indicated). - The
%MEM
shows that memory use relative to the physical memory available on the computer. - The
%CPU
column shows the proportion of a CPU core that the process is using (which can exceed 100% if a process is threaded). - The
TIME+
column shows the amount of time the process has been running.
To quit top
, type q
.
You can also kill jobs (see below for further details on killing jobs) from within top: just type r
or k
, respectively, and proceed from there.
1.2 Monitoring memory use
One of the main things to watch out for is a job that is using close to 100% of memory and much less than 100% of CPU. What is generally happening is that your program has run out of memory and is using virtual memory on disk, spending most of its time writing to/from disk, sometimes called paging or swapping. If this happens, it can be a very long time, if ever, before your job finishes.
Note that the per-process memory use reported by top
and ps
may “double count” memory that is being used simultaneously by multiple processes. To see the total amount of memory actually available on a machine:
$ free -h
total used free shared buff/cache available
Mem: 251G 998M 221G 2.6G 29G 247G
Swap: 7.6G 210M 7.4G
You’ll generally be interested in the Memory
row and in the total
, used
and available
columns. The free
column can be confusing and does not actually indicate how much memory is still available to be used, so you’ll want to focus on the available
column.
2 Job Control
2.1 Foreground and background jobs
When you run a command in a shell by simply typing its name, you are said to be running in the foreground. When a job is running in the foreground, you can’t type additional commands into that shell session, but there are two signals that can be sent to the running job through the keyboard. To interrupt a program running in the foreground, use Ctrl-c
; to quit a program, use Ctrl-\
. While modern windowed systems have lessened the inconvenience of tying up a shell with foreground processes, there are some situations where running in the foreground is not adequate.
The primary need for an alternative to foreground processing arises when you wish to have jobs continue to run after you log off the computer. In cases like this you can run a program in the background by simply terminating the command with an ampersand (&
). However, before putting a job in the background, you should consider how you will access its results, since stdout is not preserved when you log off from the computer. Thus, redirection (including redirection of stderr) is essential when running jobs in the background. As a simple example, suppose that you wish to run a Python script, and you don’t want it to terminate when you log off.
$ python code.py > code.pyout 2>&1 &
What does the inscrutable 2>&1
do? Recall from earlier that it says to send stderr to the same place as stdout, which in this case has been redirected to code.pyout
.
If you forget to put a job in the background when you first execute it, you can do it while it’s running in the foreground in two steps. First, suspend the job using the Ctrl-z
signal. After receiving the signal, the program will interrupt execution, but it will still have access to all files and other resources. Next, issue the bg
command, which will start the stopped job running in the background.
2.2 Listing and killing jobs
Since only foreground jobs will accept signals through the keyboard, if you want to terminate a background job you must first determine the unique process id (PID) for the process you wish to terminate through either ps
or top
. Here we’ll illustrate use of ps
again.
To see all processes owned by a specific user (e.g., jarrod
), I can use the -U jarrod
option:
$ ps -U jarrod
If I want to get more information (e.g., %CPU
and %MEM
), I can use add the -u
option:
$ ps -U jarrod -u
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
jarrod 16116 12.0 6.0 118804 5080 tty1 Ss 16:25 133:01 python
In this example, the ps
output tells us that this python job has a PID of 16116
, that it has been running for 133 minutes, is using 12% of CPU and 6% of memory, and that it started at 16:25. You could then issue the command:
$ kill 16116
or, if that doesn’t work:
$ kill -9 16116
to terminate the job. Another useful command in this regard is killall
, which accepts a program name instead of a process id, and will kill all instances of the named program (in this case, R):
$ killall R
Of course, it will only kill the jobs that belong to you, so it will not affect the jobs of other users. Note that the ps
and kill
commands only apply to the particular computer on which they are executed, not to the entire computer network. Thus, if you start a job on one machine, you must log back into that same machine in order to manage your job.
Finally, let’s see how to build up a command to kill firefox using some of the tools we’ve seen. First let’s pipe the output of ps -e
to grep
to select the line corresponding to firefox
:
$ ps -e | grep firefox
16517 ? 00:10:03 firefox
We can now use awk
to select the first column, which contains the process ID corresponding to firefox
:
$ ps -e | grep firefox | awk '{ print $1 }'
16517
Finally, we can pipe this to the kill
command using xargs
or command substitution:
$ ps -e | grep firefox | awk '{ print $1 }' | xargs kill
$ kill $(ps -e | grep firefox | awk '{ print $1 }')
As mentioned before, we can’t pipe the PID directly to kill
because kill
takes the PID(s) as argument(s) rather than reading them from stdin.
3 Screen
Screen allows you to create virtual terminals, which are not connected to your actual terminal or shell. This allows you to run multiple programs from the command line and leave them all in the foreground in their own virtual terminal. Screen provides facilities for managing several virtual terminals including:
- listing them,
- switching between them, and
- disconnecting from one machine and then reconnecting to an existing virtual terminal from another.
While we will only discuss its basic operation, we will cover enough to be of regular use.
tmux
is an alternative to screen
.
Calling screen:
$ screen
will open a single window and you will see a new bash prompt. You just work at this prompt as you normally would. The difference is that you can disconnect from this window by typing Ctrl-a d
and you will see something like this :
$ screen
[detached from 23974.pts-2.t430u]
All the screen key commands begin with the control key combination Ctrl-a
followed by another key. For instance, when you are in a screen session and type Ctrl-a ?
, screen will display a help screen with a list of its keybindings.
You can now list your screen sessions :
$ screen -ls
There is a screen on:
23974.pts-2.t430u (Detached)
To reconnect :
$ screen -r
You can start multiple screen sessions. This is what it might look like if you have 3 screen sessions:
$ screen -ls
There are screens on:
24274.pts-2.t430u (Attached)
24216.pts-2.t430u (Detached)
24158.pts-2.t430u (Detached)
with the first session active on a machine.
To specify that you want to reattach to session 24158.pts-2.t430u
, type:
$ screen -r 24158.pts-2.t430u
If you have several screen sessions, you will want to name your screen session something more informative than 24158.pts-2.t430u
. To name a screen session gene-analysis
you can use the -S
option when calling screen:
$ screen -S gene-analysis
While there are many more features and keybindings available for screen, you’ve already seen enough screen to be useful. For example, imagine you ssh to a remote machine from your laptop to run an analysis. The first thing you do at the bash prompt on the remote machine is:
$ screen -S dbox-study
Then you start your analysis script dbox-analysis.py
running:
$ dbox-analysis.py
Starting statistical analysis ...
Processing subject 1 ...
Processing subject 2 ...
If your study has 50 subjects and processing each subject takes 20 minutes, you will not want to sit there watching your monitor. So you use Ctrl-a d
to detach the session and you will then see:
$ screen -S dbox-study
[detached from 2799.dbox-study]
$
Now you can log off your laptop and go home. Sometime after dinner, you decide to check on your job. So you ssh from your home computer to the remote machine again and type the following at the bash prompt:
$ screen -r dbox-study
You should then be able to see the progress of your analysis script, as if you had kept a terminal open the whole time.