Overview

In this section we look at some tools for monitoring system resources on a Linux system.

top

The "top" utility shows general information about the resources on the system along with a list of processes. We run the utility with the command "top" and a full-screen display appears. A sample run on the hills server looks like:
top - 09:28:35 up 13 days, 19:41,  5 users,  load average: 0.08, 0.02, 0.01
Tasks: 337 total,   1 running, 336 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.1 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.1 hi,  0.0 si,  0.0 st
MiB Mem :  11697.6 total,   2661.2 free,   1465.5 used,   7570.9 buff/cache
MiB Swap:   8192.0 total,   8156.8 free,     35.2 used.   9627.2 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1260391 mcruzmur  20   0   17.9g  61064  27472 S   0.3   0.5  14:17.26 zellij
3041836 amittal   20   0   54640   4800   3768 R   0.3   0.0   0:00.02 top
      1 root      20   0  247312  10912   8044 S   0.0   0.1   1:51.59 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.35 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par+
      5 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 slub_fl+
      7 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker+
     10 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_perc+
     11 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tas+
     12 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tas+
     13 root      20   0       0      0      0 S   0.0   0.0   0:02.69 ksoftir+
     14 root      20   0       0      0      0 I   0.0   0.0   2:23.91 rcu_sch+
     15 root      rt   0       0      0      0 S   0.0   0.0   0:00.18 migrati+

We can exit the top display by hitting the 'q' key. Let us examine the output in more detail.
top - 09:28:35 up 13 days, 19:41,  5 users,  load average: 0.08, 0.02, 0.01

The first line shows the current time and how long the system has been
up. The next field shows the number of users that are logged in, followed by the
load averages over the last 1, 5 and 15 minutes.

The load average is the average number of processes that are running or waiting to run on the
CPU during the interval. On Linux it also counts processes in uninterruptible sleep, which are
typically waiting for disk I/O. It shows how busy the system is.
If we assume 1 core then a value of 0 means that no processes are using the CPU.
If the load average is 1 then the core is fully utilized. If the load average
is 2.0 (again with 1 core) then the system is overloaded and, on average, one process
is waiting for the CPU.
The load average is therefore not a value that must stay between 0 and 1; how to read it
depends on the number of cores. If the number of cores is, say, 4 then a load average
of 2 means that the CPU is not overloaded but has some load.
We can find out the number of cores using the "lscpu" command. Running this
on the hills server produces the following output.

[amittal@hills ~]$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           2
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
Stepping:            7
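
To relate the load average to the CPU count, the load can be divided by the number of CPUs. The following is only a minimal sketch: the function name load_per_cpu is made up for illustration, and the sample values are the 1 minute load (0.08) and the CPU count (4) from the outputs above. On a live Linux system the inputs are assumed to come from /proc/loadavg and the nproc command.

```shell
# load_per_cpu LOAD NCPU: print LOAD divided by NCPU with two decimals.
# (load_per_cpu is a hypothetical helper name used only in this sketch.)
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Values taken from the top and lscpu runs above.
load_per_cpu 0.08 4     # prints 0.02

# On a live Linux system the same check could use real values.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    load_per_cpu "$one" "$(nproc)"
fi
```

A result well below 1.00 suggests that the CPUs have spare capacity, while a value that stays above 1.00 suggests that processes are queuing.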

Let's review some of the CPU concepts.
The number of CPUs is 4 for the hills server in the above output: 2 sockets
with 2 cores each.
A socket is a physical slot on the motherboard that holds a processor chip. From the above output
we have 2 sockets on the motherboard. A socket can have several cores. A core is
like a CPU in that it can execute instructions independently. If there are 2
cores in a single socket then they can share a cache. There is also the
concept of a logical processor, or hyperthreading. It doesn't apply in the above
case of the hills server (Thread(s) per core is 1), but it is basically a core that keeps the
state of 2 or more threads at the same time and appears to the operating system as that
many CPUs. This makes the execution of 2 threads somewhat faster
than on a normal core; however, the threads still share the execution resources of a
single core.
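
The relationship between sockets, cores and threads can be checked with a little arithmetic. This is a sketch under the assumption that the lscpu output uses the same labels as the run above; cpu_count is a helper name invented for this example.

```shell
# Sample lines copied from the lscpu output above.
sample='Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           2'

# cpu_count: read lscpu-style text on stdin and print
# sockets x cores-per-socket x threads-per-core.
# (cpu_count is a hypothetical helper name.)
cpu_count() {
    awk -F: '
        /^Thread\(s\) per core/ { threads = $2 }
        /^Core\(s\) per socket/ { cores = $2 }
        /^Socket\(s\)/          { sockets = $2 }
        END { print sockets * cores * threads }
    '
}

printf '%s\n' "$sample" | cpu_count    # prints 4
```

On a machine with lscpu installed, "lscpu | cpu_count" should print the same number as the "CPU(s):" line.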

The next line in the top output is:

Tasks: 337 total,   1 running, 336 sleeping,   0 stopped,   0 zombie

The 337 refers to the total number of processes. A stopped process is a process that
has received a STOP signal; it will resume when it receives a CONT signal.
As an example we start a process using the sleep command and then stop it.

[amittal@fog ~]$ sleep 120
^Z
[1]+  Stopped                 sleep 120
[amittal@fog ~]$ ps aux | grep sleep
amittal   57737  0.0  0.0 108056   356 pts/0    T    09:57   0:00 sleep 120
amittal   57747  0.0  0.0 112816   980 pts/0    S+   09:58   0:00 grep --color=auto sleep
[amittal@fog ~]$

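We use Ctrl-Z to stop the process; the terminal sends it the SIGTSTP signal, which is the
keyboard variant of "STOP". We see that the state is "T" which means the stopped
state. To resume the process in the foreground we can type the "fg" command, which sends it the CONT signal.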


[amittal@fog ~]$ fg
sleep 60
[amittal@fog ~]$ ps aux | grep sleep
amittal   55316  0.0  0.0 112816   984 pts/0    S+   09:09   0:00 grep --color=auto sleep
[amittal@fog ~]$
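
The same stop and resume cycle can be driven without the keyboard by sending the signals with kill. This is a minimal sketch, assuming a POSIX shell and a ps that accepts the "-o stat=" and "-p" options; it uses SIGSTOP, which stops a process in the same way as the SIGTSTP sent by Ctrl-Z.

```shell
# Start a background process and remember its PID.
sleep 30 &
pid=$!

# Stop the process and look at its state.
kill -STOP "$pid"
sleep 1                      # give the signal a moment to take effect
ps -o stat= -p "$pid"        # state starts with T (stopped)

# Let it continue and look at the state again.
kill -CONT "$pid"
sleep 1
ps -o stat= -p "$pid"        # state starts with S (sleeping)

# Clean up: terminate the process and reap it, ignoring its exit status.
kill "$pid"
wait "$pid" 2>/dev/null || true
```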

We will cover what a zombie process is in the Processes section.
The next line shows how the CPU time is being spent:
%Cpu(s):  0.2 us,  0.1 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.1 hi,  0.0 si,  0.0 st

us -- percent of time spent running user processes.
sy -- percent of time spent running the kernel.
ni -- percent of time spent running user processes with a manually configured nice value.
id -- percent of time idle (if low, CPU may be overworked).
wa -- percent of time waiting for I/O to complete (if high, the CPU is being held up by I/O).
hi -- percent of time servicing hardware interrupts.
si -- percent of time servicing software interrupts.
st -- percent of time a virtual CPU spent waiting for access to the physical CPU ("steal" time).
      This only applies when Linux runs as a guest in a virtual machine, for example under VMware.

These values refresh every few seconds; the interval can be changed with the -d option described later in this section.
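
A quick way to summarize this line is "busy = 100 - idle". The sketch below pulls the idle value out of a %Cpu(s) line and does that subtraction. The helper name cpu_busy is an assumption made for this example, and the input is the sample line from above rather than a live reading.

```shell
# Summary line copied from the top output above.
line='%Cpu(s):  0.2 us,  0.1 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.1 hi,  0.0 si,  0.0 st'

# cpu_busy: read a %Cpu(s) line on stdin and print 100 minus the idle value.
# Commas are turned into spaces first so that a field printed as
# "ni,100.0" is still split correctly.
# (cpu_busy is a hypothetical helper name.)
cpu_busy() {
    awk '{
        gsub(/,/, " ")
        for (i = 2; i <= NF; i++)
            if ($i == "id") printf "%.1f\n", 100 - $(i - 1)
    }'
}

printf '%s\n' "$line" | cpu_busy    # prints 0.3
```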

The next 2 lines deal with memory.
MiB Mem :  11697.6 total,   2661.2 free,   1465.5 used,   7570.9 buff/cache
MiB Swap:   8192.0 total,   8156.8 free,     35.2 used.   9627.2 avail Mem

The last value on the second line, "avail Mem", actually belongs to the Mem line: it
describes physical memory, not swap. The
Swap line itself only has the first 3 values and then ends with the dot.
We can use the Shift-E key to cycle through the memory units. The output below shows the
units in KiB (kibibytes, where 1 KiB is 1024 bytes).

%Cpu(s):  0.4 us,  0.2 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.1 hi,  0.0 si,  0.0 st
KiB Mem : 11978356 total,  1478968 free,  1723136 used,  8776252 buff/cache
KiB Swap:  8388604 total,  8355116 free,    33488 used.  9578488 avail Mem

From the above we have around 11.4 GiB of RAM on the hills server. Around 1.6 GiB is
used, 8.4 GiB is used for the buff/cache and 9.1 GiB is available.
If the buff/cache is using roughly 8.4 GiB and the total memory is 11.4 GiB then
how could we possibly have 9.1 GiB of available memory? The available memory is an estimate of how
much memory the system could allocate to new programs without swapping, and it includes
memory from the buff/cache, which the kernel can reclaim if need be.

The swap line is easier to understand; the total swap is the sum of the
free and used swap.
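
The numbers in the KiB sample above can be checked with shell arithmetic. This is only a sketch that uses the values copied from that output; kib_to_gib is a helper name made up for the example.

```shell
# Values in KiB copied from the top output above.
total=11978356; free=1478968; used=1723136; cache=8776252
swap_total=8388604; swap_free=8355116; swap_used=33488

# In this sample free + used + buff/cache adds up to the total.
echo $((free + used + cache))        # prints 11978356

# Swap is simply free + used.
echo $((swap_free + swap_used))      # prints 8388604

# kib_to_gib KIB: convert a KiB count to GiB with one decimal place.
# (kib_to_gib is a hypothetical helper name.)
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / 1048576 }'
}
kib_to_gib "$total"                  # prints 11.4
```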

Next we have the processes sorted by the CPU being consumed by the
process:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1260391 mcruzmur  20   0   17.9g  61064  27472 S   0.3   0.5  14:17.26 zellij
3041836 amittal   20   0   54640   4800   3768 R   0.3   0.0   0:00.02 top
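PR   Scheduling priority. The value "rt" means the process runs under real-time scheduling.
NI   Nice value. It is 0 unless it was changed, for example by starting the process with the nice command.
VIRT How much virtual memory is being used.
RES  How much of the RAM is being used (resident memory).
SHR  Shared memory. Not all of it may be in the resident memory. Some of it
     may be in the swapped memory.
S represents the state.
	'D' = uninterruptible sleep
	'I' = idle (kernel thread)
	'R' = running
	'S' = sleeping
	'T' = traced or stopped
	'Z' = zombie
%CPU Percentage of CPU time being used. By default this is relative to a single CPU, so a
     multi-threaded process can show more than 100%. We can use the Shift-I command to show it
     relative to all CPUs instead.
%MEM Percentage of physical memory consumed by the process.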
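
To get a feel for the S column, the states can be tallied. The sketch below counts the state letters of the 14 sample processes listed at the top of this section. The helper name count_states is invented for this example; on a live system its input could come from "ps -eo stat=", assuming a ps that supports that option.

```shell
# count_states: read one process state per line and count how often
# each first letter occurs. Only the first character is used because
# ps may append modifiers such as "s" or "+" to the state letter.
# (count_states is a hypothetical helper name.)
count_states() {
    cut -c1 | sort | uniq -c
}

# The S column of the 14 sample processes, in the order shown by top.
printf '%s\n' S R S S I I I I I S S S I S | count_states
# counts: 6 I, 1 R, 7 S
```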

The "top" utility has command line options that are given when starting the program, and it also has
interactive options that work while the program is running.

Command line options

-u: Displays processes associated with a specific user.
top -u amittal

[amittal@hills ~]$ top -u amittal
top - 22:50:39 up 16 days,  9:03,  1 user,  load average: 0.00, 0.00, 0.00
Tasks: 325 total,   2 running, 323 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  0.1 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.1 hi,  0.1 si,  0.0 st
MiB Mem :  11697.6 total,   1458.1 free,   1668.9 used,   8570.6 buff/cache
MiB Swap:   8192.0 total,   8159.3 free,     32.7 used.   9367.9 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
3136993 amittal   20   0   89680   9772   8304 S   0.0   0.1   0:00.26 systemd
3136996 amittal   20   0  308992   3308      4 S   0.0   0.0   0:00.00 (sd-pam)
3137008 amittal   20   0  138976   6272   4944 S   0.0   0.1   0:00.03 sshd
3137009 amittal   20   0   29444   4816   4288 S   0.0   0.0   0:00.00 sftp-se+
3137035 amittal   20   0  138976   5612   4284 S   0.0   0.0   0:00.06 sshd
3137036 amittal   20   0   17000   5652   3228 S   0.0   0.0   0:00.01 bash
3167303 amittal   20   0   54536   4636   3740 R   0.0   0.0   0:00.02 top





-i: Hides idle processes. Only processes that have used CPU since the last update are displayed.
[amittal@hills ~]$ top -i
top - 06:14:08 up 16 days, 16:26,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 326 total,   1 running, 325 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.1 sy,  0.0 ni, 99.5 id,  0.0 wa,  0.1 hi,  0.0 si,  0.0 st
MiB Mem :  11697.6 total,   1440.2 free,   1642.2 used,   8615.2 buff/cache
MiB Swap:   8192.0 total,   8159.3 free,     32.7 used.   9386.6 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   1207 root      20   0  410636  29220  13164 S   0.3   0.2  20:30.08 tuned
   2103 apache    20   0 2774684  27208   8764 S   0.3   0.2   3:00.88 httpd
1260391 mcruzmur  20   0   17.9g  61712  27608 S   0.3   0.5  21:28.75 zellij
3098460 mcruzmur  20   0   17.6g  59812  27056 S   0.3   0.5   4:55.50 zellij
3156046 mcruzmur  20   0   17.6g  54860  27660 S   0.3   0.5   2:12.35 zellij
3156509 mcruzmur  20   0   17.6g  57540  27764 S   0.3   0.5   2:11.20 zellij
3192208 root      20   0       0      0      0 I   0.3   0.0   0:00.09 kworker+
3192235 amittal   20   0   54536   4552   3660 R   0.3   0.0   0:00.04 top



-d: Sets the update interval in seconds.
[amittal@hills ~]$ top -d 10

-n: Limits the number of iterations before top exits.

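-w: Sets the output width in columns.
-1: Toggles between a single summary line for all CPUs and a separate line for each CPU.

The version and a summary of the options can be displayed as follows: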
[amittal@hills ~]$ top -version
  procps-ng 3.3.15
Usage:
  top -hv | -bcEHiOSs1 -d secs -n max -u|U user -p pid(s) -o field -w [cols]
[amittal@hills ~]$
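
The -n option combines well with -b (batch mode, one of the letters in the usage line above), which makes top print plain text instead of a full-screen display so that a script can read it. The following is a sketch only: top_load is a helper name made up here, and it is applied to the header line from the sample run rather than to live output.

```shell
# Header line copied from the top run at the start of this section.
header='top - 09:28:35 up 13 days, 19:41,  5 users,  load average: 0.08, 0.02, 0.01'

# top_load: read a top header line on stdin and print the three load
# averages (1, 5 and 15 minutes) separated by spaces.
# (top_load is a hypothetical helper name.)
top_load() {
    sed -n 's/.*load average: *//p' | tr -d ','
}

printf '%s\n' "$header" | top_load    # prints 0.08 0.02 0.01
```

With the procps-ng top shown above, a live reading could presumably be taken with "top -b -n 1 | head -n 1 | top_load".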


Interactive Options

While top is running we can type the 'h' key to view all the interactive
options. Some useful ones are:

Shift+L: Allows searching the display for a string, such as a process name.
k: Allows killing a process by its process ID.
Shift+P: Sorts processes by CPU usage.
Shift+M: Sorts processes by memory usage.
Shift+T: Sorts processes by accumulated CPU time (the TIME+ column).
Shift+N: Sorts processes by process ID.

Exercise:
1) Start a process in the background:

sleep 120 &

Start the top utility.
Press Shift-L and type in "sleep" to locate the process.
Find the process ID in the display and kill the
process using the 'k' key.

netstat

The tool "netstat" can be used to view network activity.
[amittal@hills ~]$ netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 hills.ccsf.edu:ssh      99-7-60-213.light:64431 ESTABLISHED
tcp        0      0 hills.ccsf.edu:53292    server-216-137-39:https TIME_WAIT
tcp        0      0 hills.ccsf.edu:ssh      76-237-102-216.li:36978 ESTABLISHED

The first line is the connection from PuTTY on my computer to the hills server. We can check that because 99.7.60.213 is the public IP address of my modem. To find our own public IP address we can open a browser window and search for "what's my IPv4 address".

Some more netstat option examples:

List all TCP sockets, both listening and connected (-a for all sockets, -t for TCP).

[amittal@hills ~]$ netstat -at
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 localhost:smux          0.0.0.0:*               LISTEN
tcp        0      0 localhost:36777         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:http            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:ssh             0.0.0.0:*               LISTEN
tcp        0      0 localhost:smtp          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:https           0.0.0.0:*               LISTEN
tcp        0      0 hills.ccsf.edu:ssh      99-7-60-213.light:64431 ESTABLISHED
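
Output like this is often easier to read as a count per state. A small sketch, run here on the sample lines above instead of a live system; count_tcp_states is a name chosen only for this example.

```shell
# TCP lines copied from the "netstat -at" output above.
sample='tcp        0      0 localhost:smux          0.0.0.0:*               LISTEN
tcp        0      0 localhost:36777         0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:http            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:ssh             0.0.0.0:*               LISTEN
tcp        0      0 localhost:smtp          0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:https           0.0.0.0:*               LISTEN
tcp        0      0 hills.ccsf.edu:ssh      99-7-60-213.light:64431 ESTABLISHED'

# count_tcp_states: read netstat output on stdin and print each TCP
# state (the last column) followed by the number of sockets in it.
# (count_tcp_states is a hypothetical helper name.)
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$NF]++ } END { for (s in n) print s, n[s] }' | sort
}

printf '%s\n' "$sample" | count_tcp_states
# prints:
# ESTABLISHED 1
# LISTEN 6
```

On a live system the input would be "netstat -at | count_tcp_states".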

List the kernel routing table with numeric addresses instead of host names (-r for routes, -n for numeric).
[amittal@hills ~]$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         147.144.1.1     0.0.0.0         UG        0 0          0 ens192
147.144.1.0     0.0.0.0         255.255.255.0   U         0 0          0 ens192
[amittal@hills ~]$
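
In this table the row with destination 0.0.0.0 and the G flag is the default route, and its Gateway column is the router that traffic for other networks is sent to. The sketch below picks that row out of the sample output; default_gateway is a helper name invented for this example.

```shell
# Routing table copied from the "netstat -rn" output above.
routes='Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         147.144.1.1     0.0.0.0         UG        0 0          0 ens192
147.144.1.0     0.0.0.0         255.255.255.0   U         0 0          0 ens192'

# default_gateway: read "netstat -rn" output on stdin and print the
# gateway and interface of the default route (destination 0.0.0.0
# with the G flag set).
# (default_gateway is a hypothetical helper name.)
default_gateway() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2, $NF }'
}

printf '%s\n' "$routes" | default_gateway    # prints 147.144.1.1 ens192
```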


ifconfig

The tool "ifconfig" shows the configuration and the traffic counters of each network interface.
[amittal@hills ~]$ ifconfig
ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 147.144.1.2  netmask 255.255.255.0  broadcast 147.144.1.255
        ether 00:50:56:89:aa:f8  txqueuelen 1000  (Ethernet)
        RX packets 12137944  bytes 9068451651 (8.4 GiB)
        RX errors 0  dropped 568  overruns 0  frame 0
        TX packets 10067329  bytes 6173706483 (5.7 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 6090705  bytes 5765160422 (5.3 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6090705  bytes 5765160422 (5.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[amittal@hills ~]$
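
The inet and netmask values tie back to the routing table shown by "netstat -rn": combining each octet of the address with the matching octet of the netmask using a bitwise AND gives the network the interface is attached to. A minimal sketch, assuming a POSIX shell; network_of is a helper name made up for this example.

```shell
# network_of ADDR MASK: AND each octet of ADDR with the matching octet
# of MASK and print the resulting network address.
# The function body runs in a subshell so that the changed IFS does not
# leak into the calling shell.
# (network_of is a hypothetical helper name.)
network_of() (
    IFS=.
    set -- $1 $2    # split both dotted quads into 8 positional parameters
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

# Address and netmask of ens192 from the ifconfig output above.
network_of 147.144.1.2 255.255.255.0    # prints 147.144.1.0
```

The result, 147.144.1.0, is the destination of the second row in the routing table for ens192.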