The Display of Percent CPU in top
Submitted by wnl on Tue, 2007-07-31 11:48 | Applications
The single most important piece of information processed by top is the
measure of a process's percentage cpu utilization, known as percent
cpu. Although top is perfectly capable of displaying and sorting on a
variety of information, by default it sorts by percent cpu. The
reason for this is that most people use top to find out what the cpu
is doing, or more specifically which process is hogging the cpu.
Percent cpu readily reveals this information.
But how exactly is this measured? Since the early days of BSD, the percent cpu utilization of each process has been calculated by the kernel and saved along with other process information in what's called the proc structure. The kernel maintains an array of proc structures to track its processes. The ps command and, in some cases, top simply display this value as a true representation of process utilization.

Unfortunately, that isn't the whole truth. You see, the only reason the kernel cares about this number is so that it can provide equitable access to the cpu. If a process is using too much cpu and there are other processes waiting to run, then the kernel will want to ensure that each process gets a fair shot at running. So it calculates a figure that takes recent history into account: a decaying average of the true cpu utilization for a process. In most BSD-derived systems, for example, the average is decayed so that 95% of a process's utilization from 60 seconds ago is no longer counted (but 5% is). If a process begins to accumulate a high average percentage then it has been sustaining such high use for at least 30 seconds, and probably longer. A process with a high average is likely a cpu hog and should be given lower priority when being scheduled for the cpu. In Unix, the decaying average is used when making scheduling decisions (see footnote).

You can see how this number behaves over time. Start an infinite loop on an otherwise idle system (my favorite is "while true; do true; done"). Then watch the %cpu figure in the output of ps. At first it will be quite low. In fact, it will take about 30 seconds before the percentage reaches 50. It won't get over 90% until about 55 seconds, and it won't reach 99% or 100% until a full minute after you start the loop. This is because it is still giving weight to the idleness of the process from a full minute in the past.
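To make the idea of a decaying average concrete, here is a minimal sketch of one, not the kernel's actual code. It assumes the average is updated once per second and that the per-second decay factor is chosen so that 95% of a sample's weight is gone after 60 seconds, as described above. The names decayed_cpu and DECAY_PER_SECOND are made up for illustration. The exact figures ps reports depend on how a given kernel and ps sample and scale the value, so the timings from this idealized model need not match what you observe on a real system.

```python
# Assumption: one update per second, and 95% of a sample's contribution
# is forgotten after 60 seconds (so 5% of it remains).
DECAY_PER_SECOND = 0.05 ** (1.0 / 60.0)


def decayed_cpu(usage_per_second, decay=DECAY_PER_SECOND):
    """Return the decaying average after each one-second sample.

    usage_per_second: iterable of instantaneous utilizations in [0.0, 1.0].
    """
    average = 0.0
    history = []
    for usage in usage_per_second:
        # Old history shrinks by `decay`; the new sample fills the rest.
        average = decay * average + (1.0 - decay) * usage
        history.append(average)
    return history


if __name__ == "__main__":
    # A previously idle process that spins at 100% for two minutes.
    trace = decayed_cpu([1.0] * 120)
    for second in (1, 10, 30, 60, 120):
        print(f"{second:3d}s  {100.0 * trace[second - 1]:5.1f}%")
```

The point to notice is that the reported figure lags well behind the true utilization: the process is using the whole cpu from the first second, but the average only creeps toward 100%.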
The decaying average is helpful to the kernel, and it may be helpful to someone who wants to get a single snapshot of the system with ps. But top is intended to provide information about what is happening with the system right now, not a minute ago. I think it is misleading for top to show this decaying average as percent cpu, as it doesn't always provide an accurate picture of the moment. Some ports of top now calculate this percentage from other information rather than simply parroting what the kernel has done. This provides a better picture of what happened over the past 5 seconds, while ignoring information that the kernel scheduler will use.

The calculation is simple. A process's total accumulated cpu time is also tracked in the proc structure, and is shown in the column labeled time. To calculate percent cpu, top will sample and remember every process's cpu time. At the next update, it will again sample cpu time, calculate the difference from the last reading, divide by the elapsed time between the samples, then multiply by 100 to convert to a percentage (multiprocessor systems complicate this, and we will get to that case next). This provides an accurate measurement of cpu utilization between updates, but it is not a decaying average.

If you are on a system where top performs this calculation, then start your infinite loop again and run top. You will see that within two updates the process is at or near 100% (assuming your system is otherwise idle and that there is only one processor).

There is one downside to this method: top has to take two samples before it can display percent cpu. You may notice on such systems that it takes about a second for top to display its first screen of data. This is entirely due to the need for recording two samples before calculating percent cpu.

There are cases where top will only show 50%, 25%, or even a lower figure for a looping process on an otherwise idle system.
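The two-sample calculation described above can be sketched as follows. This is an illustration, not top's source: top reads accumulated cpu time for every process from the kernel, whereas this sketch samples only its own process through Python's os.times(). The function names and the ncpus divisor are assumptions of mine; ncpus models the multiprocessor case discussed next and defaults to 1.

```python
import os
import time


def percent_cpu(prev_cpu, curr_cpu, elapsed, ncpus=1):
    """Percent cpu used between two samples of accumulated cpu time.

    prev_cpu, curr_cpu: accumulated cpu seconds at each sample.
    elapsed: wall-clock seconds between the two samples.
    ncpus: hypothetical divisor for systems with several processors.
    """
    if elapsed <= 0:
        return 0.0  # no interval yet, so nothing meaningful to report
    return (curr_cpu - prev_cpu) / elapsed / ncpus * 100.0


def sample():
    """Return (wall clock, accumulated cpu seconds) for this process."""
    t = os.times()
    return time.monotonic(), t.user + t.system


if __name__ == "__main__":
    wall0, cpu0 = sample()          # first sample: nothing to show yet
    deadline = wall0 + 1.0
    while time.monotonic() < deadline:
        pass                        # spin for about one second
    wall1, cpu1 = sample()          # second sample: now we can compute
    print(f"{percent_cpu(cpu0, cpu1, wall1 - wall0):.1f}%")
```

With the same five seconds of cpu time over a five-second interval, percent_cpu(10.0, 15.0, 5.0) gives 100.0, while passing ncpus=2 or ncpus=4 gives 50.0 or 25.0.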
If you are on a system that has more than one cpu, top will probably display cpu utilization as a percentage of the total available cpu. So if you have two processors, a single-threaded process can use only one of them and will show at most 50%. Likewise on a 4-processor system your infinite loop will only show 25%. However, a process with multiple threads will be able to utilize more than one processor simultaneously. Top will reflect such utilization with higher percentages. Think of it as a percentage of all available cpu cycles across all processors.

As I continue developing top, I plan to implement direct calculation of percent cpu in as many ports of top as I can, as I believe it provides a better picture of system activity. I am interested in feedback from the user community on this matter. What do you think?

Weighted CPU

While we are on the subject of percent cpu, we must mention something called weighted cpu. This is a bit of an embarrassment to me, but I feel that some sort of clarification is long past due. Top was originally written for BSD 4.1, which tracked percent cpu pretty much as described here. However, the scheduler would apply an additional calculation to that number before using it to influence scheduling. The result was known as weighted cpu percent, and was also reflected in the source for ps. I dutifully followed suit and, in that early version, displayed raw cpu and weighted cpu in two columns side-by-side.

As Unix grew, mutated, and developed, scheduling algorithms were changed and methods altered. Percent cpu is an easy measure to understand and it has significance to you and me, but the intent and meaning of weighted cpu became lost. Most Unix systems no longer used this calculation, preferring improved scheduling algorithms that no longer needed it. Nonetheless, when top was ported to these systems by well-intentioned individuals, they copied the code for calculating weighted cpu.
I think it is fair to say that at this point in time the number no longer has any meaning on any modern Unix system. Even BSD's version of ps calculates and displays something called weighted cpu, but that number really has no relevance to scheduling or anything that goes on in the kernel. Right now I consider weighted cpu nothing more than a waste of valuable screen space. In future versions of top you may no longer see that number.

Footnote

In actuality, BSD systems (starting with version 4.3) no longer use a traditional decaying average in their scheduling algorithms, and they only track the number to keep ps happy. The BSD scheduler instead uses a formula that decays cpu utilization by a factor of the load average.
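As a rough illustration of what "decaying by a factor of the load average" can look like, here is a sketch that assumes the filter has the form 2*load / (2*load + 1), applied once per second, which is how the 4.3BSD and 4.4BSD schedulers are usually described. The function names and the nice term are hypothetical, and the details vary between systems, so treat this as an assumption rather than a description of any particular kernel.

```python
def decay_factor(load_average):
    """Fraction of the cpu estimate kept at each one-second update.

    Assumed form: 2*load / (2*load + 1). A higher load average gives a
    factor closer to 1, so past usage is forgotten more slowly.
    """
    return (2.0 * load_average) / (2.0 * load_average + 1.0)


def decay_estimate(estcpu, load_average, nice=0):
    """Apply one decay step to a hypothetical per-process cpu estimate."""
    return decay_factor(load_average) * estcpu + nice


if __name__ == "__main__":
    for load in (0.5, 1.0, 4.0):
        print(f"load {load:3.1f}: keep {decay_factor(load):.3f} per second")
```

The design point is that on a busy system the scheduler remembers past cpu use for longer, while on an idle system a process's history is forgotten almost immediately.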