1. Introduction
Insufficient system resources such as storage, memory, and CPU (central processing unit) can significantly impact application performance. Therefore, it is critical to monitor these components.
Monitoring CPU usage on a Linux system is less straightforward than monitoring disk and memory. In this article, we will learn how to interpret CPU metrics and display them in a human-readable format.
2. CPU Load and CPU Utilization
Although CPU load and CPU utilization sound similar, they are not interchangeable. CPU load is defined as the number of processes using or waiting to use a core at a single point in time.
Suppose we have a single-core system whose average CPU load is always below 0.6. This means that every process that needs the CPU can use it immediately without waiting. If the average CPU load is greater than 1, some processes need the CPU but cannot use it right away because it is busy.
On a multiprocessor system, however, an average load higher than 1 is not necessarily a problem, because the load is spread across several cores. For example, a load of 3 on a four-core system still leaves one core free on average.
The uptime command shows the load average over the last 1, 5, and 15 minutes.
[root@localhost ~]# uptime
 12:40:05 up 2:29, 1 user, load average: 0.37, 0.08, 0.03
The average load cannot be interpreted without knowing the number of cores in the system.
[root@localhost ~]# cat /proc/cpuinfo | grep core
core id : 0
cpu cores : 1
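To relate the two numbers above, the load average can be divided by the core count. The following is a minimal sketch of that arithmetic, not a standard tool: load_per_core is a hypothetical helper name, and a result above 1 is only a rough hint that processes are waiting for a core.

```shell
#!/bin/sh
# Sketch: normalize a load average by the number of cores.
# load_per_core is a hypothetical helper name, not a standard command.
load_per_core() {
    # $1 = load average (e.g. 0.37), $2 = number of cores (e.g. 1)
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Example with the values shown above: a load of 0.37 on a single core.
load_per_core 0.37 1

# On a live system the inputs could be read from /proc/loadavg and nproc:
#   load_per_core "$(cut -d " " -f1 /proc/loadavg)" "$(nproc)"
```

With the sample values this prints 0.37, meaning the single core is busy or requested about a third of the time.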
CPU utilization, on the other hand, is the percentage of time the CPU spends on non-idle work. It can only be measured over a specified time interval. We can determine the CPU utilization by subtracting the percentage of idle time from 100.
3. Calculating CPU Usage
3.1. Using vmstat to Get CPU Usage
The vmstat command displays CPU activity in near real-time.
[root@localhost ~]# vmstat 3 4
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 4  0      0 1347080   6120 941464    0    0    68    11   72  137  1  2 97  0  0
 1  0      0 1347080   6120 941464    0    0     0     0   84  157  1  2 97  0  0
 1  0      0 1347080   6120 941464    0    0     0     0   59  107  1  1 98  0  0
 1  0      0 1347080   6120 941464    0    0     0     1   59  104  1  1 98  0  0
The columns under CPU provide an overview of where processor time is spent.
- us: time spent running non-kernel code
- sy: time spent running kernel code
- id: idle time
- wa: time spent waiting for I/O
- st: time stolen from a virtual machine
The id column is the one we are interested in. We run vmstat with a one-second delay and subtract the reported idle percentage from 100 to get the CPU usage.
[root@localhost ~]# echo "CPU Usage: $((100 - $(vmstat 1 2 | tail -1 | awk '{print $15}')))%"
CPU Usage: 2%
When vmstat is run with no arguments, it reports averages since boot, which do not reflect the current CPU usage. The first report of any vmstat run has the same limitation. Therefore, we pass a delay of 1 and a count of 2 and use the second report, which covers the one-second interval:
vmstat 1 2
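The one-liner above depends on id being the 15th column. As a hedged alternative, the sketch below looks up the id column by its header name, which should keep working if a vmstat version prints additional columns. The helper name vmstat_cpu_usage is hypothetical, and the function assumes the usual two header rows followed by numeric sample rows.

```shell
#!/bin/sh
# Sketch: compute CPU usage from vmstat output read on stdin, locating the
# "id" column by its header name instead of a fixed position.
# vmstat_cpu_usage is a hypothetical helper name.
vmstat_cpu_usage() {
    awk '
        # The column-name row starts with "r"; remember where "id" is.
        $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
        # Rows starting with a number are samples; keep the latest one.
        col && $1 ~ /^[0-9]+$/ { idle = $col; seen = 1 }
        END { if (seen) printf "CPU Usage: %d%%\n", 100 - idle }
    '
}
```

It can be used as vmstat 1 2 | vmstat_cpu_usage, which reports the usage measured by the second sample.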
3.2. Using /proc/stat to Get CPU Usage
CPU activity can also be extracted from the /proc/stat file. This file contains various metrics about the system since boot.
[root@localhost ~]# cat /proc/stat
cpu 3020 28 1863 22404 35 432 47 0 0 0
cpu0 3020 28 1863 22404 35 432 47 0 0 0
intr 96468 28 100 0 0 0 0 0 0 1 0 0 0 1263 0 0 0 3696 0 153 928 0 0 0 0 0 0 0 ...
ctxt 340950
btime 1628404433
processes 3276
procs_running 2
procs_blocked 0
softirq 112867 1 16857 56 269 510 0 261 0 0 94913
The first row, "cpu", aggregates the metrics of all cores in the system. It is followed by one row per core, so a system with 4 cores has 4 additional rows: cpu0, cpu1, cpu2, and cpu3. The columns in the "cpu" row indicate the cumulative time spent on different kinds of work, measured in units of USER_HZ (typically hundredths of a second).
- user: time spent in user mode
- nice: time spent in user mode running niced (low-priority) processes
- system: time spent executing kernel code
- idle: time spent idle
- iowait: time spent waiting for I/O to complete
- irq: time spent servicing hardware interrupts
- softirq: time spent servicing software interrupts
- steal: time stolen by the hypervisor for other virtual machines when running in a virtualized environment
- guest: time spent running a virtual CPU for a guest OS
- guest_nice: time spent running a virtual CPU for a niced guest OS
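The mapping between column positions and the names listed above can be made explicit with a short sketch. The helper name label_cpu_fields is hypothetical. It simply pairs each value with its name and stops early on lines that have fewer columns.

```shell
#!/bin/sh
# Sketch: print each column of a /proc/stat "cpu" line next to its name.
# label_cpu_fields is a hypothetical helper name.
label_cpu_fields() {
    # $1 = a line such as "cpu 3020 28 1863 22404 35 432 47 0 0 0"
    printf '%s\n' "$1" | awk '
        BEGIN {
            n = split("user nice system idle iowait irq softirq steal guest guest_nice", name, " ")
        }
        {
            # Field 1 is the row label ("cpu", "cpu0", ...); values start at 2.
            for (i = 1; i <= n && i + 1 <= NF; i++)
                printf "%s=%s\n", name[i], $(i + 1)
        }
    '
}

# Example with the aggregate line from the sample output above.
label_cpu_fields "cpu 3020 28 1863 22404 35 432 47 0 0 0"
```

For the sample line this prints ten name=value pairs, starting with user=3020 and including idle=22404.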
We will use these metrics to calculate the average idle percentage since boot and then derive the CPU usage from it. Note that older Linux kernels do not report the steal, guest, or guest_nice metrics. On such a system, we simply leave these metrics out of the calculation.
Average idle time (%) = (idle * 100) / (user + nice + system + idle + iowait + irq + softirq + steal + guest + guest_nice)
[root@localhost ~]# cat /proc/stat | grep cpu | tail -1 | awk '{print ($5*100)/($2+$3+$4+$5+$6+$7+$8+$9+$10+$11)}' | awk '{print "CPU Usage: " 100-$1}'
CPU Usage: 2.4219
Since we are working on a single-core system, the "cpu" line is identical to the "cpu0" line, so tail -1 simply retrieves one of the two. On multiprocessor systems, however, we must use the "cpu" line, since it aggregates the metrics of all cores.
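Because the counters in /proc/stat accumulate from boot, a single reading yields the average since boot. To measure usage over a specific interval, one common approach is to read the "cpu" line twice and apply the same formula to the differences. The following is a minimal sketch of that idea. The helper name cpu_usage_between is hypothetical, and it sums every time column present on the line, so it also accepts lines from kernels that lack the last columns.

```shell
#!/bin/sh
# Sketch: CPU usage over an interval from two samples of the aggregate
# "cpu" line of /proc/stat. cpu_usage_between is a hypothetical helper name.
cpu_usage_between() {
    # $1 = earlier "cpu ..." line, $2 = later "cpu ..." line
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Total time = sum of all time columns present (fields 2..NF).
            total = 0
            for (i = 2; i <= NF; i++) total += $i
            if (NR == 1) { t0 = total; idle0 = $5 }
            else         { t1 = total; idle1 = $5 }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print "CPU Usage: n/a"; exit }
            # Usage = 100 minus the idle share of the elapsed time.
            printf "CPU Usage: %.1f%%\n", 100 - (idle1 - idle0) * 100 / dt
        }
    '
}

# On a live system the two samples could be taken one second apart:
#   a=$(grep "^cpu " /proc/stat); sleep 1; b=$(grep "^cpu " /proc/stat)
#   cpu_usage_between "$a" "$b"
```

Matching on "cpu" followed by a space selects only the aggregate line, so the same call works on single-core and multiprocessor systems.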
3.3. Using top to Get CPU Usage
The top command is normally used to display the active processes on the system and the resources they consume. However, we can also use it to measure CPU usage.
[root@localhost ~]# top
top - 07:08:31 up 2:41, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 322 total, 2 running, 320 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.0 us, 15.0 sy, 0.0 ni, 97.8 id, 0.0 wa, 5.0 hi, 0.0 si, 0.0 st
MiB Mem : 3709.4 total, 1483.1 free, 1402.0 used, 824.4 buff/cache
MiB Swap: 2048.0 total, 2048.0 free, 0.0 used. 2053.4 avail Mem
Also, note that the per-process %CPU column in top is expressed relative to a single core. On multiprocessor systems, this percentage may exceed 100%. For example, a process that keeps 4 cores 75% busy is shown with a %CPU of 300%.
We need to get the idle time value so that we can subtract it from 100 to get the usage.
[root@localhost ~]# top -bn2 | grep '%Cpu' | tail -1 | grep -P '(....|...) id,' | awk '{print "CPU Usage: " 100-$8 "%"}'
CPU Usage: 2.2%
The -b option runs top in batch mode so that its output can be piped to other commands, and the -n option sets the number of iterations top performs before it exits. We skip the first iteration because its values are averages since boot. Therefore, we run two iterations and keep the second.
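The field position used by awk in the command above shifts when top prints the idle value without a leading space, as in "ni,100.0 id". As a hedged alternative, the sketch below extracts the number directly in front of "id" with sed. The helper name top_cpu_usage is hypothetical, and the pattern assumes a dot as the decimal separator.

```shell
#!/bin/sh
# Sketch: extract the idle percentage from top "%Cpu(s)" summary lines read
# on stdin and print the usage of the last one.
# top_cpu_usage is a hypothetical helper name.
top_cpu_usage() {
    # Capture the number right before "id", whether or not a space follows
    # the preceding comma (e.g. "ni, 97.8 id" or "ni,100.0 id").
    sed -n 's/.*[, ]\([0-9][0-9.]*\) *id.*/\1/p' | tail -1 |
        awk '{ printf "CPU Usage: %.1f%%\n", 100 - $1 }'
}
```

It can be used as top -bn2 | grep '%Cpu' | top_cpu_usage, which reports the usage from the second iteration.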
On a multiprocessor system, the %Cpu(s) summary line printed by the procps version of top is already averaged over all cores, so the command above works unchanged. If a top implementation instead reports the "id" value summed over the cores, we must divide it by the number of cores and then subtract the result from 100. For example, on a quad-core system with an "id" value of 304%, we calculate the CPU usage as follows:
CPU usage (%) = 100 - (304 / 4) = 24
[root@localhost ~]# top -bn2 | grep '%Cpu' | tail -1 | grep -P '(....|...) id,' | awk '{print "CPU Usage: " 100-($8/4) "%"}'