Showing posts with label OpenNMS. Show all posts

Tuesday, September 18, 2012

Temperature monitoring with OpenNMS and a Brocade FC switch

We are expanding, and we need to monitor the temperature in our new data center. We have:
  1. an existing OpenNMS monitoring system,
  2. a Brocade 300 SAN FC switch (or something very similar).
The task is the same as before: monitor the temperature and send an SMS when it gets too high. The latter part has already been covered in the earlier post on SMS notifications.
The FC switch has three temperature sensors whose readings are available over SNMP (OIDs .1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4.1 through .1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4.3). The only problem is that OpenNMS doesn't know about them. Let's teach it!
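Before changing any configuration, it may be worth confirming that the switch really answers for these OIDs. The following is a minimal sketch that assumes the net-snmp command-line tools are installed and that the switch accepts SNMP v1 with a read community; the host name and community string are placeholders, not values from this setup.

```shell
#!/bin/sh
# Base OID of the Brocade sensor-value column used in this post.
SENSOR_BASE=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4"

# Print the full OID of each of the three temperature sensors (instances 1-3).
sensor_oids() {
    for instance in 1 2 3; do
        echo "${SENSOR_BASE}.${instance}"
    done
}

# Query every sensor with net-snmp's snmpget.
# $1 = switch host name, $2 = read community (both are assumptions/placeholders).
query_sensors() {
    host=$1
    community=$2
    for oid in $(sensor_oids); do
        snmpget -v1 -c "$community" "$host" "$oid"
    done
}
```

Calling query_sensors fcswitch1 public (hypothetical host and community) should print one INTEGER value per sensor if the switch exposes them.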
First, we modify datacollection-config.xml and add the following to the "groups" section:
<group name="brocade-temperature" ifType="ignore">
  <mibObj oid=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4" instance="1" alias="temperature1" type="integer" />
  <mibObj oid=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4" instance="2" alias="temperature2" type="integer" />
  <mibObj oid=".1.3.6.1.4.1.1588.2.1.1.1.1.22.1.4" instance="3" alias="temperature3" type="integer" />
</group>
I know, the "temperature[1-3]" names are ugly, but I didn't care at first, and now that the whole thing is working I don't want to change anything :)
Then modify the "Brocade FC Switches" systemDef in the "systems" section:
     <systemDef name="Brocade FC Switches">
        <sysoidMask>.1.3.6.1.4.1.1588.</sysoidMask>
        <collect>
          <includeGroup>brocade-temperature</includeGroup>
          <includeGroup>brocade-switch-fctable</includeGroup>
        </collect>
     </systemDef>
Now we are ready to define thresholds. To make sure the requested parameters are actually being collected, look at the RRD repository. Its path is specified in the rrdRepository attribute of the datacollection-config element (at the top of datacollection-config.xml). If there are temperature[1-3].jrb files in ${rrdRepository}/${NodeID} and these files contain data, everything is OK. The node ID is shown in the URL when you view the node (/opennms/node.jsp?node=353), or you can query the "node" table in the OpenNMS database:
SELECT nodeid FROM node WHERE nodelabel='your label';

To make sure that the temperature[1-3].jrb files contain correct data, you can dump them (your version of the jrobin-*.jar file may be different):
# cd ${rrdRepository}/${NodeID}
# echo "dump temperature1.jrb" |  java -jar ${OPENNMS_HOME}/opennms/lib/jrobin-1.5.12.jar
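As a rough first check before dumping anything, the presence and freshness of the three files can be tested in one go. The sketch below is only an illustration: the function name check_jrb_files and the default 10-minute window are assumptions, and a recent modification time only shows that the collector is writing to the file, not that the values are sane.

```shell
#!/bin/sh
# check_jrb_files DIR [MAX_AGE_MINUTES]
# Succeeds only if temperature1..3.jrb all exist in DIR and were modified
# within the last MAX_AGE_MINUTES minutes (default 10, an assumed value).
check_jrb_files() {
    dir=$1
    max_age=${2:-10}
    status=0
    for n in 1 2 3; do
        file="$dir/temperature$n.jrb"
        if [ ! -f "$file" ]; then
            echo "missing: $file"
            status=1
        elif [ -z "$(find "$file" -mmin "-$max_age")" ]; then
            # find printed nothing, so the file is older than the window
            echo "stale: $file"
            status=1
        else
            echo "ok: $file"
        fi
    done
    return $status
}
```

It would be called with the node directory, for example check_jrb_files "${rrdRepository}/${NodeID}". Dumping the file as shown above remains the way to look at the stored values themselves.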
However, OpenNMS still doesn't know how to display this data. Let's help it. To define the new graphs, we add the following parts to snmp-graph.properties. At the beginning of the file, add the report names to the reports definition:
reports=mib2.HCbits, mib2.bits, mib2.percentdiscards, mib2.percenterrors, \
...
brocade.switch.temperature1, brocade.switch.temperature2, brocade.switch.temperature3, \
...
and define the reports further down (only one report is shown here; the other two are essentially the same, just change [Tt]emperature1 to [Tt]emperature[23]):

report.brocade.switch.temperature1.name=Brocade switch temperature1
report.brocade.switch.temperature1.columns=temperature1
report.brocade.switch.temperature1.type=nodeSnmp
report.brocade.switch.temperature1.command=--title="Brocade switch temperature1" \
 --vertical-label="degrees celsius" \
 DEF:temperature1={rrd1}:temperature1:AVERAGE \
 AREA:temperature1#0000ff:"Temperature1" \
 GPRINT:temperature1:AVERAGE:" Avg \\: %8.2lf %s" \
 GPRINT:temperature1:MIN:"Min \\: %8.2lf %s" \
 GPRINT:temperature1:MAX:"Max \\: %8.2lf %s\\n" 
.. 
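The "just change [Tt]emperature1 to [Tt]emperature[23]" step can be done mechanically. This is a small sketch using sed; the function name make_report and the idea of keeping the first report in a separate file are assumptions made for illustration.

```shell
#!/bin/sh
# make_report N: read the temperature1 report definition on stdin and
# print the same definition for sensor N (2 or 3).
make_report() {
    n=$1
    # \(...\) preserves the original capitalisation (temperature/Temperature).
    # {rrd1} is left alone because it does not contain the word "temperature".
    sed "s/\([Tt]emperature\)1/\1$n/g"
}
```

For example, if the report above were saved in a hypothetical file temperature1.report, then make_report 2 < temperature1.report would print the definition for the second sensor, ready to be pasted into snmp-graph.properties.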
Now, after "service opennms restart", we get pretty graphs when we build a graph from the "Node-level Performance Data" resources of the Brocade node.

P.S. Finally, I must give you a link to a good document on data collection configuration in OpenNMS and another one on SNMP configuration.

Wednesday, July 6, 2011

SMS notifications in OpenNMS

We ran into a lot of problems when the controller of our data center's air conditioning system went mad and failed to notice that two air conditioners had broken down. 60 degrees Celsius is not the best temperature for running servers...
After dealing with this, we decided to set up an SMS notification service, at least for the server room temperature and several other parameters. We already had an OpenNMS monitoring system configured, so we only had to add SMS notification to our setup.
This was done in several steps. First, we created a script that sends an SMS. We used Google Calendar for this. One dedicated user (let's say opennms) was created for the monitoring system, and all system administrators imported its calendar with the following notification settings:

  • Event reminders - By default, remind me via SMS 1 minute before each event

  • New Invitations - SMS

Of course, the system administrators had to register their phone numbers in Google Calendar.
When we want to send an SMS notification, we create a new event in the opennms calendar using gcalcli.
We used the following script to create a new event (the sleep was added to prevent mass event creation when everything goes wrong at once):

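#!/bin/sh
# Prefix the message with the current time (HH:MM), taken in a single date call.
now=$(/bin/date "+%H:%M")
# Crude throttle: wait a second so a burst of notifications is spread out.
sleep 1
# "$*" joins all arguments into one string ("$@" is not meant for assignments).
event_text="$now $*"
/usr/local/bin/gcalcli --user opennms --pw OurPassword --cals=owner quick "$event_text"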
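A one-second sleep only spreads events out; it does not cap how many are created. If a hard limit is wanted, one possible variant (not what we ran) is to remember when the last SMS was sent and skip anything that arrives too soon. In this sketch the stamp file path, the 60-second interval and the function name should_send are all assumptions, and the check is not safe against two notifications arriving at exactly the same moment.

```shell
#!/bin/sh
# Hypothetical throttle: allow at most one SMS per MIN_INTERVAL seconds.
# STAMP_FILE and MIN_INTERVAL are assumed values, not part of the original setup.
STAMP_FILE="${STAMP_FILE:-/tmp/send_sms.last}"
MIN_INTERVAL="${MIN_INTERVAL:-60}"

# should_send NOW
# NOW is the current time in seconds since the epoch.
# Succeeds (and records NOW) if the previous send was at least
# MIN_INTERVAL seconds ago; fails otherwise.
should_send() {
    now=$1
    last=0
    if [ -r "$STAMP_FILE" ]; then
        last=$(cat "$STAMP_FILE")
    fi
    last=${last:-0}
    if [ $((now - last)) -ge "$MIN_INTERVAL" ]; then
        echo "$now" > "$STAMP_FILE"
        return 0
    fi
    return 1
}
```

In the script above, the gcalcli call would then be run only when should_send "$(/bin/date +%s)" succeeds (date +%s is widely available but not strictly POSIX).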


Then we defined a notification command in /usr/local/opennms/etc/notificationCommands.xml:

<command binary="true">
  <name>SendSMS</name>
  <execute>/root/bin/send_sms.sh</execute>
  <comment>Send SMS by gcalcli</comment>
  <argument streamed="false">
    <switch>-subject</switch>
  </argument>
  <argument streamed="false">
    <switch>-tm</switch>
  </argument>
</command>

The argument elements describe the script's parameters; here we pass it the notice subject and the notice body (the full list of possible parameters may be found here).

Next, we created a destinationPath in /usr/local/opennms/etc/destinationPaths.xml:

<path name="SMS-Admins" initial-delay="0s">
  <target interval="0s">
    <name xmlns="">admin</name>
    <autoNotify xmlns="">auto</autoNotify>
    <command xmlns="">SendSMS</command>
  </target>
</path>

The name sub-element of target must refer to a valid OpenNMS user or group.

Now you can create notifications in the OpenNMS web UI. To be notified when a monitored parameter goes out of range, set the trigger event to uei.opennms.org/threshold/highThresholdExceeded and set its destinationPath to "SMS-Admins". You can also do it by hand by adding the following entry to /usr/local/opennms/etc/notifications.xml:

<notification name="High Threshold" status="on" writeable="yes">
<uei xmlns="">uei.opennms.org/threshold/highThresholdExceeded</uei>
<description xmlns="">High threshold exceeded</description>
<!-- some filter -->
<rule xmlns="">(NODELABEL = 'our label')</rule>
<destinationPath xmlns="">SMS-Admins</destinationPath>
<text-message xmlns="">The parameter %parm[ds]% is high on node: %nodelabel%, interface:%interface%. The parameter %parm[ds]% reached a value of %parm[value]% while the threshold is %parm[threshold]%. This threshold for this alert was %parm[threshold]%.</text-message>
<subject xmlns="">Notice #%noticeid%</subject>
<numeric-message xmlns="">111-%noticeid%</numeric-message>
</notification>
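To get a feel for what the resulting SMS text looks like and how long it is, the placeholders in text-message can be expanded by hand. The sketch below does this with sed for sample values; OpenNMS performs the real expansion itself, and the function name render_message and all sample values are assumptions made for illustration.

```shell
#!/bin/sh
# render_message NODE INTERFACE DS VALUE THRESHOLD
# Reads a message template on stdin and replaces the placeholders used in
# the notification above with the given sample values.
# Note: the values must not contain "/" or "&", which are special to sed.
render_message() {
    sed -e "s/%nodelabel%/$1/g" \
        -e "s/%interface%/$2/g" \
        -e "s/%parm\[ds\]%/$3/g" \
        -e "s/%parm\[value\]%/$4/g" \
        -e "s/%parm\[threshold\]%/$5/g"
}
```

For instance, piping the text-message template through render_message sw1 10.0.0.5 temperature1 61 55 | wc -c gives the approximate length of the message for those sample values.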


The only interesting open question is how to send notifications for just a few thresholds. For now I have deleted all thresholds that are unimportant to me. It would be nicer, however, to list in the notification definition only the thresholds you are interested in...