City College of San Francisco - CS260A Linux System Administration Module: Processes
During our discussion of process creation, the astute reader may have been concerned about the efficiency of the fork and exec process. Yes, indeed, it does carry some overhead. However, this overhead is reduced significantly by paging.
[ The background discussion on paging and process scheduling has
been removed due to time constraints. We will concentrate just on
what you need to know about these topics in this section. By the
way, the term 'memory' in this discussion (and always in this
course) means RAM. ]
Paging involves breaking the memory image of any program in
execution (a process) into constant-sized pages, currently 4096
bytes. Only those portions of the process which are needed at any
point in time are kept in memory. The set of memory pages of a
process that is currently in-memory is called the run-set or
resident-set, and its size is termed the resident-set-size (RSS or
RSZ). The RSZ can be compared to the process' total size, called
its memory size or just size.
When a page of the process is referenced that is not currently in
memory, a page fault occurs. This causes the page to be read into
memory. When no free memory is available, pages that have not been
used in some amount of time are stolen from existing processes. The
aggressiveness of this page 'stealing' increases as memory becomes
more scarce. If pages must be stolen that are still needed, the
frequency of page faults could increase dramatically. This
situation is called thrashing, and will cause system performance to
decrease dramatically. You can verify thrashing behavior using the
vmstat or top process monitoring programs. We will show an example
of vmstat below and discuss both of these at length in the process
monitoring section.
I must interject one note of caution here, however: when
examining output that gives memory usage you must be aware of the
units involved. Memory usage is commonly expressed as either K
(remember I always use the power-of-two version of units) OR as
memory pages, which are 4K.
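To make the unit conversion concrete, here is a minimal sketch that reports the resident-set size and total size of the current shell in both K and pages. It assumes a Linux system with /proc mounted; the helper name kib_to_pages is made up for this illustration, and the VmRSS and VmSize fields are read from /proc/PID/status rather than from any tool discussed in this section.

```shell
#!/bin/sh
# Convert a size in K (KiB) to a count of memory pages, using the page
# size the running kernel reports (commonly 4096 bytes).
# kib_to_pages is a hypothetical helper name used only in this sketch.
kib_to_pages() {
    page_bytes=$(getconf PAGESIZE)
    echo $(( $1 * 1024 / page_bytes ))
}

# Resident-set size and total size of this shell, in K, as reported by
# the kernel in /proc/PID/status (assumes Linux with /proc mounted).
rss_kib=$(awk '/^VmRSS:/ { print $2 }' "/proc/$$/status")
size_kib=$(awk '/^VmSize:/ { print $2 }' "/proc/$$/status")

echo "page size: $(getconf PAGESIZE) bytes"
echo "RSS:  ${rss_kib} K = $(kib_to_pages "$rss_kib") pages"
echo "size: ${size_kib} K = $(kib_to_pages "$size_kib") pages"
```

The RSS line should never be larger than the size line, since the resident set is the part of the process image that is currently in memory.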
If a page of a program is selected for removal from memory due to disuse, what the system must do with it depends on whether it is code or data. Code (instructions) pages do not have to be saved (swapped-out) since they do not change and the page can be re-read from the program file. Data pages do have to be swapped-out so that the current state of the data can be preserved. Data that is swapped-out is saved on the swap device, if one is configured. Use of a swap device (a swap partition or swap file) for this purpose has been magnanimously termed 'virtual memory' on other systems, and that term has been adopted on linux, but it is traditionally just called paging (or 'swapping').
Let's look at the paging behavior of a linux system configured with adequate swap space and adequate RAM:
$ vmstat 1 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 1302304 116412 524856    0    0   122    13  910 2975 20  3 76  1  0
 1  0      0 1302360 116412 524884    0    0     0     0 1680 4760 17  1 81  0  0
We will discuss only the swap
fields for now. You can see that all swap fields are 0 (si = swap in, so = swap out). In fact, this
system has never swapped.
This is not too surprising since it is a personal system, has 4GB
of RAM, and we are not running any programs that use a lot of data
memory, such as a video editor or image manipulation program, nor
is it running as a webserver or database server. The surprising
thing is that hills has
never swapped either, and
it currently has 46 users, 846 processes and is running a
webserver, oracle database, and mysql server!
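One way to check a claim like "this system has never swapped" is to read the kernel's cumulative swap counters. The sketch below is an assumption about how you might do that, not the method used to produce the numbers above: it reads the pswpin and pswpout lines of /proc/vmstat, which count pages swapped in and out since the last boot. The function name swap_counts is made up for this illustration.

```shell
#!/bin/sh
# Print the cumulative swap-in and swap-out page counts from a
# vmstat-style counter file (default: /proc/vmstat). Both are 0 if
# nothing has been swapped since boot.
# swap_counts is a hypothetical helper name used only in this sketch.
swap_counts() {
    awk '$1 == "pswpin"  { in_pages = $2 }
         $1 == "pswpout" { out_pages = $2 }
         END { printf "%d %d\n", in_pages, out_pages }' "${1:-/proc/vmstat}"
}

# Split the two numbers into $1 and $2.
set -- $(swap_counts)
if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
    echo "no pages swapped since boot"
else
    echo "swapped in: $1 pages, swapped out: $2 pages"
fi
```

Note that these counters are in pages, not K, and that they reset at every reboot, so "never" here really means "not since the system was last started".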
Affecting Process Scheduling
Again, we will not discuss process scheduling due to time constraints. The interested reader is referred to discussions of the Completely Fair Scheduler (CFS), such as that on Wikipedia. We will limit our discussion to the single metric that is available to a user to alter process scheduling priority. That metric is called the nice number.
Traditionally, the nice number was created so that a user could indicate when she started a program that she didn't need the results right away, so she could make the program more 'nice' - or make it run with less priority (i.e., slower). This freed the system resources up for other users, making her task run in whatever extra resources were available, perhaps finishing overnight. However, this concept of a 'nice number' makes discussions of process priority and altering it very confusing, since the niceness of a process is inversely proportional to its priority. (So hang onto your seats and read this a couple of times and practice.)
Every process starts with a nice number (NI) whose default value is its parent's nice number. On linux, nice numbers are integers that range from 19 (very nice - a request for low priority) to -20 (not nice - a request for the highest priority possible). Normally, processes are started with a nice number of 0 (the middle). Normal users can increase the nice number of their processes, making them more nice (thus lowering their priority), but only root can decrease the nice number of a process, or raise its priority.
The nice number does not directly determine scheduling priority - that is up to the scheduler (the part of the kernel that arbitrates who gets the CPU). The nice number is simply a suggestion to the scheduler. The scheduler is free to honor or ignore nice numbers. Some schedulers only pay attention to nice numbers which have been decreased, since those are obviously requests by root to increase a process' priority. You can see the true scheduling priority (PRI) in the output of the ps command but you cannot affect it directly. You can only suggest a change in the priority by changing the nice number.
Setting the nice number
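There are two ways to set a nice number: when you start a process
or after it is running. To set it when you start a process, use the
nice command, which adjusts the nice number relative to that of the
shell that runs it. From a shell whose nice number is 0,
$ nice -10 gimp &
would start the gimp process with a nice number of 10. (Current
versions of nice also accept the equivalent form nice -n 10 gimp.)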
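The fact that nice applies an adjustment relative to the starting shell can be checked without launching a real program. This is a minimal sketch that relies on two behaviors of the standard nice command: run with no arguments it prints the nice number it is running with, and the -n option gives the adjustment explicitly. The adjustment of 5 is an arbitrary value chosen for the illustration.

```shell
#!/bin/sh
# With no arguments, nice prints the nice number it is running with,
# which here is the nice number of this shell.
base=$(nice)

# Start a child 5 steps nicer. The child is another nice, which simply
# reports the nice number it was started with.
child=$(nice -n 5 nice)

echo "shell: $base, child: $child"
```

From a shell whose nice number is 0 the child reports 5. From a shell that is already at 10, the same command reports 15, since the 5 is added to whatever the shell started with.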
This gets ugly if you are root and want to make a process less nice. Here, if root's nice number is 10 and she wants to start the program top to look at the current processes running on the system, she could give it increased priority by decreasing its nice number (altering it by -10):
# nice --10 top &
Notice the first dash for the option is followed by a -10, giving what looks like a linux double-dash (but it's not!)
Ugly, you say? Don't worry, it gets worse.
If a process is already running you alter its nice number using renice. The standard forms are
renice N -p pid1 [pid2 pid3 ... ]
to alter the nice number of a list of processes, or
renice N -u user1 [user2 user3 ... ]
to alter the nice number of all processes owned by the listed users.
Here N is the nice number you want to assign to the process (note: no dash!), pid is a process-id and user is a user name. As an example, here is a line from the output of the ps command on my system that shows firefox has a nice number of 0:
F S   UID   PID  PPID  C PRI  NI ADDR     SZ WCHAN  TTY          TIME CMD
0 S   501  3297  3244  4  80   0 -    334429 poll_s ?        00:06:32 firefox
After I issue the command
$ renice 10 -p 3297
3297: old priority 0, new priority 10
Here is the changed ps output:
F S   UID   PID  PPID  C PRI  NI ADDR     SZ WCHAN  TTY          TIME CMD
0 S   501  3297  3244  3  90  10 -    334515 poll_s ?        00:08:21 firefox
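If you would rather not experiment on a program you care about, the same renice sequence can be tried on a throwaway process. The sketch below is an illustration under a few assumptions: it uses a background sleep as the target, it reads the nice number from /proc/PID/stat instead of from ps, and the helper name nice_of is made up. It also assumes the shell's own nice number is 10 or less, since a normal user cannot lower a nice number.

```shell
#!/bin/sh
# Read a process's nice number from /proc/PID/stat (assumes Linux).
# Everything up to the last ")" (the pid and command name) is stripped
# first; the nice number is then the 17th remaining field.
# nice_of is a hypothetical helper name used only in this sketch.
nice_of() {
    sed 's/.*) //' "/proc/$1/stat" | awk '{ print $17 }'
}

sleep 30 &                       # throwaway background job to experiment on
pid=$!

before=$(nice_of "$pid")
renice 10 -p "$pid" >/dev/null   # N is the value to assign, as described above
after=$(nice_of "$pid")

echo "pid $pid: nice $before -> $after"
kill "$pid"                      # clean up the throwaway job
```

Started from a shell with a nice number of 0, the before value is 0 and the after value is 10, matching the NI column change in the ps output above.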