Discussion:
hrSWRunPerfCPU - 5.4.3 vs 5.7.2
Vincent Newell
2015-02-23 16:25:53 UTC
Greetings,

I have a small LAN of four systems for which I would like to monitor
specific processes' CPU and Memory utilization. I am able to determine the
PID and host on which each process is running so I would like to use SNMP
to make the actual queries for resource utilization. For Ubuntu 12.04 with
net-snmp 5.4.3, I wrote some software to find all of the processes (using
ROS), and dispatch SNMP queries and track performance using
HOST-RESOURCES-MIB::hrSWRunPerfCPU.<pid> and
HOST-RESOURCES-MIB::hrSWRunPerfMem. Each data point combines a timestamp
with an identifier and the encoded varbind. The data is passed through
some complicated system and, long story short, ends up in graphite. Using
the nonNegativeDerivative and scaleToSeconds functions, I created a CPU
graph to track all of the processes. Computation-heavy single-threaded
processes were hovering around 100%, so I think I did everything right.
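For reference, the rate calculation described above can be sketched in Python roughly as follows. This is only an illustration of the arithmetic, not the Graphite implementation: the function name `cpu_percent` and the sample layout are made up for this example. It relies on hrSWRunPerfCPU being a counter in centiseconds of CPU time, so one centisecond consumed per wall-clock second corresponds to 1% of one core.

```python
def cpu_percent(samples):
    """Turn hrSWRunPerfCPU samples into percent-of-one-core rates.

    samples: list of (unix_time_seconds, counter_centiseconds) tuples,
             ordered by time. (Hypothetical layout for this sketch.)
    Returns a list of (unix_time_seconds, percent or None).
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        delta = c1 - c0   # centiseconds of CPU used between the two polls
        dt = t1 - t0      # wall-clock seconds between the two polls
        if delta < 0 or dt <= 0:
            # Counter went backwards (e.g. PID reused) or bad timestamps:
            # drop the point, similar in spirit to a non-negative derivative.
            rates.append((t1, None))
            continue
        # centiseconds per second == percent of one core
        rates.append((t1, delta / dt))
    return rates


if __name__ == "__main__":
    # A process that keeps one core fully busy, polled every 15 seconds.
    print(cpu_percent([(0, 0), (15, 1500), (30, 3000)]))
```

A single-threaded process that saturates one core gains about 1500 centiseconds per 15-second poll, which comes out at 100%.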

I recently upgraded the systems to 14.04 and net-snmp 5.7.2 and noticed
that my data points were alternating between 0 and 200% ... I took a closer
look by launching 'yes' in the background, and comparing 5.4.3 and 5.7.2
with:

watch -d "snmpget -v2c -c public localhost HOST-RESOURCES-MIB::hrSWRunPerfCPU.6650"

I found that net-snmp 5.4.3 would update every time I made the request, whereas
net-snmp 5.7.2 was behaving more like a step function. I used strace and
found that net-snmp 5.4.3 reads /proc/6650/stat each time it's queried while
5.7.2 will read all of /proc/*/stat into memory every 30 seconds, and
respond to queries with the latest value. That made sense based on my
results because my original query interval was 15 seconds, so I would see
zero change in the hrSWRunPerfCPU every other time I queried.
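The alternating 0 and 200% pattern can be reproduced with a small Python simulation. This is a toy model under stated assumptions, not net-snmp code: it assumes the cache refreshes exactly every 30 seconds starting at t=0, that the process uses exactly one full core (100 centiseconds per second), and the names `cached_counter` and `percent_series` are invented for the example.

```python
def cached_counter(t, refresh=30, rate=100):
    """Counter value a caching agent would report at time t (seconds).

    The agent only re-reads the real counter every `refresh` seconds, so it
    reports the value as of the most recent refresh. `rate` is centiseconds
    of CPU consumed per wall-clock second (100 == one busy core).
    """
    last_refresh = (t // refresh) * refresh
    return last_refresh * rate


def percent_series(poll_interval, duration, refresh=30):
    """Per-interval CPU percent as seen by a poller with the given interval."""
    times = list(range(0, duration + 1, poll_interval))
    values = [cached_counter(t, refresh) for t in times]
    # Delta in centiseconds divided by seconds gives percent of one core.
    return [(b - a) / poll_interval for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    print(percent_series(15, 90))  # [0.0, 200.0, 0.0, 200.0, 0.0, 200.0]
    print(percent_series(30, 90))  # [100.0, 100.0, 100.0]
```

Polling at 15 seconds sees no change on one poll and a full 30 seconds of CPU time on the next, while polling at the assumed cache interval gives a steady 100%.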

Is there any way I can get net-snmp 5.7.2's process table to actually check
the procfs rather than reading from memory, or should I just make 30
seconds the minimum resolution/interval in my data?

Regards,
Vince
Vincent Newell
2015-02-23 16:27:34 UTC
Shoot, sorry about the duplicate - I thought the first message didn't go
through.