Hi!
Yes, lots of things to consider... I have done some information
gathering on CMTSes (cable modem routers), reading out several thousand
parameters. In general I can say that an snmpBULKwalk (command line) or
its equivalents in other languages are MUCH faster than reading
individual objects if you need several adjacent objects.
# time for ((i=1001 ; i<1025 ; i++)) ; do snmpget -v 2c -c public \
    192.168.8.51 IF-MIB::ifInOctets.$i; done
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
...
real 0m0.386s
user 0m0.316s
sys 0m0.020s
# time snmpbulkwalk -v 2c -c public 192.168.8.51 IF-MIB::ifInOctets
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
...
real 0m0.055s
user 0m0.008s
sys 0m0.008s
Above, the snmpbulkwalk even reads three extra OIDs. Obviously, CLI/bash
is not the way to go when you have a need for speed; I just used it to
demonstrate the difference. The waiting time is mostly in the switch
anyway. I have used PHP, and that's not optimal, but good enough for my
needs.
If you need, for instance, most info from the ifEntry-tree, like the red
ones below:
IF-MIB::ifIndex.1001 = INTEGER: 1001
IF-MIB::ifDescr.1001 = STRING: Alcatel-Lucent 1/1
IF-MIB::ifType.1001 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifMtu.1001 = INTEGER: 9216
IF-MIB::ifSpeed.1001 = Gauge32: 0
IF-MIB::ifPhysAddress.1001 = STRING: e8:e7:32:2:c5:2a
IF-MIB::ifAdminStatus.1001 = INTEGER: up(1)
IF-MIB::ifOperStatus.1001 = INTEGER: down(2)
IF-MIB::ifLastChange.1001 = Timeticks: (48888100) 5 days, 15:48:01.00
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
IF-MIB::ifInUcastPkts.1001 = Counter32: 159748417
IF-MIB::ifInNUcastPkts.1001 = Counter32: 0
IF-MIB::ifInDiscards.1001 = Counter32: 0
IF-MIB::ifInErrors.1001 = Counter32: 0
IF-MIB::ifInUnknownProtos.1001 = Counter32: 0
IF-MIB::ifOutOctets.1001 = Counter32: 1778801908
IF-MIB::ifOutUcastPkts.1001 = Counter32: 2041113484
IF-MIB::ifOutNUcastPkts.1001 = Counter32: 0
IF-MIB::ifOutDiscards.1001 = Counter32: 23
IF-MIB::ifOutErrors.1001 = Counter32: 0
IF-MIB::ifOutQLen.1001 = Gauge32: 0
IF-MIB::ifSpecific.1001 = OID: SNMPv2-SMI::zeroDotZero
I'd definitely recommend walking the entire ifEntry tree instead of
doing several separate walks (again, CLI only for educational purposes):
snmpbulkwalk -v 2c -c public 192.168.38.51 IF-MIB::ifEntry -m all
It will probably be much more efficient to discard the extra info (black
lines above) instead of doing multiple walks. I think your
switches/routers will agree too, but that's very vendor- and even
product-dependent. The downside is a little more traffic over the
network, but I'd say it's negligible.
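As an illustration of "walk everything, then discard", here is a minimal Python sketch that parses the text output of an ifEntry walk and keeps only selected columns. The function name parse_walk and the WANTED set are hypothetical, chosen only for this example; the sample lines are taken from the listing above.

```python
import re
from collections import defaultdict

# Matches lines such as "IF-MIB::ifInOctets.1001 = Counter32: 1105357170".
LINE_RE = re.compile(r"^IF-MIB::(\w+)\.(\d+) = (?:(\w+): ?)?(.*)$")

# Hypothetical selection of columns to keep; adjust to your needs.
WANTED = {"ifDescr", "ifOperStatus", "ifInOctets", "ifOutOctets"}


def parse_walk(text, wanted=WANTED):
    """Return {ifIndex: {column: raw value string}} for the wanted columns."""
    table = defaultdict(dict)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a "name.index = TYPE: value" line
        column, index, _type, value = match.groups()
        if column in wanted:
            table[int(index)][column] = value
    return dict(table)


SAMPLE = """\
IF-MIB::ifDescr.1001 = STRING: Alcatel-Lucent 1/1
IF-MIB::ifMtu.1001 = INTEGER: 9216
IF-MIB::ifOperStatus.1001 = INTEGER: down(2)
IF-MIB::ifInOctets.1001 = Counter32: 1105357170
IF-MIB::ifOutOctets.1001 = Counter32: 1778801908
"""

if __name__ == "__main__":
    print(parse_walk(SAMPLE))
```

In a real poller the text would come from the snmpbulkwalk command shown above (for example via subprocess.run with capture_output=True), or the parsing step would be replaced by whatever your SNMP library returns.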
I store all values retrieved in a database (I use MySQL or Postgres, but
choice is free) and then I can have the front-end pick out values
whenever convenient.
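A rough sketch of the "store everything, let the front-end pick" idea follows. It uses Python's built-in sqlite3 purely as a stand-in for MySQL or Postgres; the table layout and the function names (open_store, store_walk, latest) are assumptions made for the example, not the schema actually used.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) the sample store; SQLite is only a stand-in here."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " ts REAL NOT NULL,"
        " host TEXT NOT NULL,"
        " if_index INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " value TEXT NOT NULL)"
    )
    return db


def store_walk(db, host, table, ts=None):
    """Insert one walk result, given as {ifIndex: {name: value}}; return row count."""
    ts = time.time() if ts is None else ts
    rows = [
        (ts, host, if_index, name, value)
        for if_index, columns in table.items()
        for name, value in columns.items()
    ]
    with db:  # commits on success, rolls back on error
        db.executemany("INSERT INTO samples VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


def latest(db, host, if_index, name):
    """Return the most recent stored value, or None if there is none."""
    row = db.execute(
        "SELECT value FROM samples"
        " WHERE host = ? AND if_index = ? AND name = ?"
        " ORDER BY ts DESC LIMIT 1",
        (host, if_index, name),
    ).fetchone()
    return None if row is None else row[0]
```

The poller only ever calls store_walk, and the front-end only ever calls something like latest, so the two can run on independent schedules.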
I even use the "bulkwalk" strategy to monitor almost 250 emergency
phones spread over 28 AudioCodes phone-to-SIP concentrators for a
customer. We can detect a "hook off" event in less than 5 seconds (the
requirement) by using 4 or 5 parallel jobs that walk the AudioCodes
units constantly. There are 24 ports in each unit, so selecting only the
active ports and reading just those would have required far more time
and a lot more parallel jobs.
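The parallel-jobs approach could look roughly like the Python sketch below: a small pool of workers, each walking one unit at a time. The names walk_unit and poll_round, the default OID and the 5-second timeout are assumptions for illustration; the setup described above is not necessarily built this way.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def walk_unit(host, community="public", oid="IF-MIB::ifEntry", timeout=5):
    """Run snmpbulkwalk against one unit and return its raw text output."""
    result = subprocess.run(
        ["snmpbulkwalk", "-v", "2c", "-c", community, host, oid],
        capture_output=True,
        text=True,
        timeout=timeout,  # give up on a unit that does not answer in time
    )
    return result.stdout


def poll_round(hosts, walker=walk_unit, jobs=5):
    """Walk every host once with `jobs` workers; return {host: output or exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = {host: pool.submit(walker, host) for host in hosts}
        for host, future in futures.items():
            try:
                results[host] = future.result()
            except Exception as exc:  # one failing unit must not stop the others
                results[host] = exc
    return results
```

Calling poll_round in a loop gives the "walk constantly" behaviour. Passing a different walker makes it possible to try the scheduling without any SNMP device at hand.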
There are functions in sFlow and similar that can send/push info on the
amount of traffic to an sFlow server if traffic volumes are all you're
interested in. I'm often also interested in errors, queues and so on,
but I could settle for sFlow-based traffic every minute and poll errors
and such less often.
/Fredrik
Hi,
Like James, I wrote tools that handle several tens of thousands of
variables on a 5-minute interval. I am not at all trying to discourage
you in your project, but there are some concerns you should keep in
mind.
Please always remember that the primary role of the network and the
network equipment you are trying to manage is to transport data. I mean
payload data, not management data. Have you estimated or calculated the
share of the bandwidth you will consume "just" for management? (You did
not mention how many counters per router you intend to collect.)
Measurement should not bias what is measured (or only to a minimal
extent). Switches and routers are not designed to assist the management
station to this extent. SNMP is not the best method (high CPU load, low
priority on the equipment). How about NetFlow? (nfsen is a wonderful
free tool, if you accept a 5-minute period.)
There is a big difference between what can be done and what makes sense
to do. In any case, Get-bulk (if you need several counters per router)
seems more appropriate than Traps.
There are some questions you should consider and find answers to:
- How are the time-outs and retries of your SNMP requests configured?
  (You intend to address some "real" equipment that may respond with
  latency.) A two-second time-out and 3 retries do not really make
  sense when polling every second.
- What is the impact of a non-responding (faulty, defective) device, or
  of several faulty devices (let's say 10)? What happens to the 190
  others?
- What is the latency of the network that interconnects your 200
  routers?
- How do you time-stamp the collected data?
Good luck with your work.
JM
-----Message
counters from a huge amount of routers.
Hi,
Thanks for the replies. What about using Traps? Can I make the routers
send me the counters every second? Is it hard to set up?
Kind regards,
Richard
2016-05-07
mrtg cricket
They were fine for small numbers and long intervals, but will not scale
to your usage. I wrote an implementation from scratch that handles 10K
variables on a 5-minute interval. So it can be done.
Issues I think you will run into:
- Router CPU: hitting a device every second for a number of interfaces
  may consume too much CPU time to be practical.
- Storage I/O: if you are successful in retrieving the data, standard DB
  storage, even RRD, will not be able to handle the I/O load.
Things to consider:
- Bulk SNMP gets
- RMON
- Local device scripting: I believe Cisco routers can do local Tcl
  scripting, and Junos-based devices have local scripting as well.
- Directory-based queues with small files storing samples
Good luck with your work.
On Sat, May 07,
Hi folks,
For my master's thesis I am doing a load-balancing project and I need to
know the link usage, if possible every second. For that I set the
refresh interval to 1 second, so everything is good so far.
My problem is that I am working with big topologies and I may have 200
or more routers. If I get the counters by polling, it takes forever: I
cannot poll the routers one by one, and not even threads would help
(at some point it would not scale).
What would be the best way to get all the counters? Since I am
simulating everything on a single machine, I can do a trick and write
the counters to a file; however, that will not be useful when I test my
solution in a real network.
Kind regards,
Richard
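However the counters are collected, link usage has to be derived from two successive octet-counter readings and their time-stamps. The Python sketch below shows one way to do that for a Counter32 such as IF-MIB::ifInOctets, allowing for a single wrap between samples. The function names are hypothetical, and it assumes the time-stamps are taken by the poller when each reply arrives.

```python
COUNTER32_MOD = 2**32  # a Counter32 wraps back to 0 after 2^32 - 1


def counter_delta(old, new, modulus=COUNTER32_MOD):
    """Difference between two counter readings, allowing for one wrap."""
    return (new - old) % modulus


def link_usage_bps(old_octets, new_octets, old_ts, new_ts):
    """Average bits per second between two octet-counter samples."""
    elapsed = new_ts - old_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return counter_delta(old_octets, new_octets) * 8 / elapsed


if __name__ == "__main__":
    # 125000 octets in one second is 1 Mbit/s.
    print(link_usage_bps(1000, 126000, 0.0, 1.0))
```

At 1 Gbit/s a 32-bit octet counter wraps roughly every 34 seconds, so for fast links the 64-bit IF-MIB::ifHCInOctets and ifHCOutOctets counters are the safer choice; pass modulus=2**64 to counter_delta in that case.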
---------------------------------------------------------------------
--------- Find and fix application performance issues faster with
Applications Manager Applications Manager provides deep performance
insights into multiple tiers of your business applications. It
resolves application problems quickly and reduces your MTTR. Get
your free trial!
https://ad.doubleclick.net/ddm/clk/302982198;130105516;z
_______________________________________________ Net-snmp-users
https://lists.sourceforge.net/lists/listinfo/net-snmp-users