MRTG system status monitoring [CPU, memory, NIC traffic, system processes, disk space, TCP connections]
----------------------------
1) Install the required RPM packages
----------------------------
net-snmp-perl-5.1.2-11.EL4.6.x86_64.rpm
net-snmp-libs-5.1.2-11.EL4.6.x86_64.rpm
net-snmp-utils-5.1.2-11.EL4.6.x86_64.rpm
net-snmp-devel-5.1.2-11.EL4.6.x86_64.rpm
!! The following RPM packages must be installed first, as prerequisites:
beecrypt-devel-3.1.0-6.x86_64.rpm
elfutils-devel-0.97-5.x86_64.rpm (on installation CD 5)
net-snmp-5.1.2-11.EL4.6.x86_64.rpm
net-snmp-utils-5.1.2-11.EL4.6.x86_64.rpm
Finally, install mrtg-2.10.15-1.x86_64.rpm
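Before moving on, it can help to confirm that every package in the list is actually installed. The sketch below is one way to do that; the function names pkg_name and missing_pkgs are made up for this example, and it assumes the file names follow the usual name-version-release.arch.rpm pattern.

```shell
# Strip directory, ".rpm", architecture, release and version from an RPM file name.
pkg_name() {
  f=${1##*/}; f=${f%.rpm}; f=${f%.*}; f=${f%-*}; f=${f%-*}
  printf '%s\n' "$f"
}

# Print the name of every listed package that "rpm -q" does not report as installed.
missing_pkgs() {
  for file in "$@"; do
    name=$(pkg_name "$file")
    rpm -q "$name" >/dev/null 2>&1 || printf '%s\n' "$name"
  done
}
```

For example, missing_pkgs net-snmp-5.1.2-11.EL4.6.x86_64.rpm mrtg-2.10.15-1.x86_64.rpm prints nothing once both packages are in place.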
----------------------------
2) Edit /etc/snmp/snmpd.conf
----------------------------
Find these lines:
# Make at least snmpwalk -v 1 59.57.251.56 -c public system fast again
# name incl/excl subtree mask(optional)
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.25.1.1
Directly below the view lines, add the following:
# For Mrtg Add start ####################################
view all included .1.3.6
# For Mrtg Add end ####################################
Find these lines:
####
# Finally, grant the group read-only access to the systemview view.
# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact mib2 none none
In the access notConfigGroup line, change mib2 (on some systems it is systemview) to all.
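The same change can be made with sed instead of an editor. This is only a sketch: the function name set_read_view_all is invented here, and it assumes the access line has the stock layout shown above (the "exact" keyword immediately followed by the read view).

```shell
# Read snmpd.conf on stdin and print it with the read view of the
# "access notConfigGroup" line changed from mib2 or systemview to all.
set_read_view_all() {
  prefix='^\(access[[:space:]][[:space:]]*notConfigGroup[[:space:]].*[[:space:]]exact[[:space:]][[:space:]]*\)'
  sed -e "s/${prefix}mib2\\([[:space:]]\\)/\\1all\\2/" \
      -e "s/${prefix}systemview\\([[:space:]]\\)/\\1all\\2/"
}
```

A cautious way to use it is to write the result to a new file first, for example set_read_view_all < /etc/snmp/snmpd.conf > /tmp/snmpd.conf.new, and compare the two files with diff before replacing the original.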
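To monitor disk usage as well, the following additional steps are needed:
Use df -a to see how the disks you want to monitor are partitioned and how large they are (-am: sizes in MB; -ak: sizes in KB; for GB use -a -BG, since GNU df has no -g option).
For example:
# df -am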
Filesystem 1M-blocks Used Available Use% Mounted on
/dev/sda2 63993 10284 50459 17% /
/dev/sda1 981 24 908 3% /boot
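Then edit /etc/snmp/snmpd.conf again and add the lines below under the view lines changed earlier. (This procedure writes the partition size as the second field. Note that the net-snmp documentation describes that field as a minimum free-space threshold, in kB or as a percentage with %, so the number only affects the error flag, not what is monitored.)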
disk / 63993
disk /boot 981
Then restart the snmp service:
# service snmpd restart
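The disk lines can be derived from the df output rather than typed by hand. The following is a minimal sketch that follows the procedure above (mount point plus the size column); the function name df_to_disk_lines is hypothetical, and it assumes mount points contain no spaces and that only /dev/* filesystems are of interest.

```shell
# Read "df -am" style output on stdin and print one snmpd.conf "disk" line
# per /dev/* filesystem: mount point (last column) and size (second column).
df_to_disk_lines() {
  awk 'NR > 1 && $1 ~ /^\/dev\// { print "disk", $NF, $2 }'
}
```

Running df -am | df_to_disk_lines on the example host above prints the same two disk lines that were added by hand.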
----------------------------
3) Edit the MRTG configuration file mrtg.cfg
----------------------------
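This uses the mrtg shipped with Red Hat AS4 U5.
After installation, the configuration file is kept under /usr/local/apache/htdocs/mrtg (as mrtg.cfg).
The server is to be monitored for the following:
traffic on the public-facing network interface;
host uptime;
system load;
CPU load;
memory usage;
number of system processes;
disk space;
number of open TCP connections.
mrtg.cfg is configured as follows: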
- ###################### Configuration Begin #########################
- #---------------------------------
- # filename : mrtg.cfg
- # Note: if a directory named below does not exist, create it first
- #---------------------------------
- #---------------------------------
- ### Global Config Options
- # to get bits instead of bytes and graphs growing to the right
- #Options[_]: growright, bits
- EnableIPv6: no
- WorkDir: /usr/local/apache/htdocs/mrtg
- Language: Chinese
- HtmlDir: /usr/local/apache/htdocs/mrtg
- ImageDir: /usr/local/apache/htdocs/mrtg
- LogDir: /var/log/mrtg
- ThreshDir: /var/lib/mrtg
- LoadMIBs:/usr/share/snmp/mibs/UCD-SNMP-MIB.txt,/usr/share/snmp/mibs/HOST-RESOURCES-MIB.txt,/usr/share/snmp/mibs/TCP-MIB.txt
- #================================================================================
- # Monitor the eth0 interface (the public-facing one)
- #================================================================================
- Target[eth0_lan]: /59.57.251.56:public@59.57.251.56:
- Options[eth0_lan]: growright
- Directory[eth0_lan]: eth0
- MaxBytes[eth0_lan]: 100000000
- Kmg[eth0_lan]: ,k,M,G,T,P
- YLegend[eth0_lan]: Bytes per Second
- ShortLegend[eth0_lan]: B/s
- Legend1[eth0_lan]: Incoming traffic per second (Bytes)
- Legend2[eth0_lan]: Outgoing traffic per second (Bytes)
- LegendI[eth0_lan]: In:
- LegendO[eth0_lan]: Out:
- Title[eth0_lan]: eth0 network traffic [in + out]
- PageTop[eth0_lan]: <H1>eth0 network traffic [in + out]</H1>
- <TABLE>
- <TR><TD>Description:</TD><TD>Network traffic on LAN interface eth0 (Bytes/s)</TD></TR>
- </TABLE>
- #================================================================================
- # Monitor host uptime [days up]. In practice this section was left unconfigured; it is of little use
- #================================================================================
- Target[upday]: `/usr/local/mrtg/bin/mrtg-updays.pl`
- Options[upday]: gauge,nopercent,growright
- Directory[upday]: upday
- MaxBytes[upday]: 1000
- YLegend[upday]: Up Days
- ShortLegend[upday]: days
- Legend1[upday]: Host uptime (days)
- Legend2[upday]:
- LegendI[upday]: Uptime:
- LegendO[upday]:
- Title[upday]: Host uptime [days up]
- PageTop[upday]: <h1>Host uptime [days up]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>Time the host has been running continuously (days)</TD></TR>
- </TABLE>
- #================================================================================
- # Monitor system load [1 minute + 15 minutes]
- #================================================================================
- Target[systemload]: .1.3.6.1.4.1.2021.10.1.5.1&.1.3.6.1.4.1.2021.10.1.5.3:public@59.57.251.56:
- Options[systemload]: gauge,nopercent,growright
- Directory[systemload]: load
- MaxBytes[systemload]: 3000
- YLegend[systemload]: System Load
- ShortLegend[systemload]:
- Legend1[systemload]: System load over the last 1 minute (x100)
- Legend2[systemload]: System load over the last 15 minutes (x100)
- LegendI[systemload]: 1-minute load:
- LegendO[systemload]: 15-minute load:
- Title[systemload]: System load (x100) [1 minute + 15 minutes]
- PageTop[systemload]: <h1>System load (x100) [1 minute + 15 minutes]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>System load (x100) [1 minute + 15 minutes]</TD></TR>
- </TABLE>
- #================================================================================
- # Monitor CPU load [user + idle]
- #================================================================================
- Target[cpuload]: .1.3.6.1.4.1.2021.11.50.0&.1.3.6.1.4.1.2021.11.53.0:public@59.57.251.56:
- Options[cpuload]: nopercent,growright
- Directory[cpuload]: cpu
- MaxBytes[cpuload]: 100
- Unscaled[cpuload]: dwym
- YLegend[cpuload]: CPU Utilization
- ShortLegend[cpuload]: %
- Legend1[cpuload]: CPU user load (%)
- Legend2[cpuload]: CPU idle (%)
- LegendI[cpuload]: User:
- LegendO[cpuload]: Idle:
- Title[cpuload]: CPU load [user + idle]
- PageTop[cpuload]: <h1>CPU load [user + idle]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>CPU load [user + idle]</TD></TR>
- </TABLE>
- #================================================================================
- # Monitor memory usage [Mem + Swap]
- #================================================================================
- Target[memory]: .1.3.6.1.2.1.25.2.3.1.6.2&.1.3.6.1.2.1.25.2.3.1.6.3:public@59.57.251.56:
- Options[memory]: gauge,growright
- Directory[memory]: mem
- MaxBytes1[memory]: 4045336
- MaxBytes2[memory]: 2097152
- Kmg[memory]: k,M,G,T,P
- Kilo[memory]: 1024
- Unscaled[memory]: dwym
- YLegend[memory]: Bytes
- ShortLegend[memory]: B
- Legend1[memory]: Mem used (Bytes)
- Legend2[memory]: Swap used (Bytes)
- LegendI[memory]: Mem used:
- LegendO[memory]: Swap used:
- Title[memory]: Memory usage [Mem + Swap]
- PageTop[memory]: <h1>Memory usage [Mem + Swap]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>Memory and Swap usage (Bytes)</TD></TR>
- </TABLE>
- #================================================================================
- # Monitor the number of system processes [process count]
- #================================================================================
- Target[process]: .1.3.6.1.2.1.25.1.6.0&.1.3.6.1.2.1.25.1.6.0:public@59.57.251.56:
- Options[process]: gauge,nopercent,growright
- Directory[process]: process
- MaxBytes[process]: 1000
- YLegend[process]: Processes
- ShortLegend[process]: procs
- Legend1[process]: Number of system processes
- Legend2[process]:
- LegendI[process]: Processes:
- LegendO[process]:
- Title[process]: System processes [process count]
- PageTop[process]: <h1>System processes [process count]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>Number of system processes</TD></TR>
- </TABLE>
- #==================================================================================
- # Monitor disk space [system disk + data disk] !! See 2) above: this must match the disk entries added to snmpd.conf
- #==================================================================================
- Target[disk]: .1.3.6.1.4.1.2021.9.1.8.1&.1.3.6.1.4.1.2021.9.1.8.2:public@59.57.251.56:
- Options[disk]: gauge,growright
- Directory[disk]: disk
- MaxBytes1[disk]: 10080520
- MaxBytes2[disk]: 46251780
- Kmg[disk]: k,M,G,T,P
- Kilo[disk]: 1024
- Unscaled[disk]: dwym
- YLegend[disk]: Bytes
- ShortLegend[disk]: B
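- Legend1[disk]: System disk space used
- Legend2[disk]: Data disk space used
- LegendI[disk]: System used:
- LegendO[disk]: Data used:
- Title[disk]: Disk space [system disk + data disk]
- PageTop[disk]: <h1>Disk space [system disk + data disk]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>Disk space monitoring</TD></TR>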
- </TABLE>
- #================================================================================
- # Monitor the number of open TCP connections [TCP connections]
- #================================================================================
- Target[tcpopen]: .1.3.6.1.2.1.6.9.0&.1.3.6.1.2.1.6.9.0:public@59.57.251.56:
- Options[tcpopen]: gauge,nopercent,growright
- Directory[tcpopen]: tcpopen
- MaxBytes[tcpopen]: 1000
- YLegend[tcpopen]: Tcp Connections
- ShortLegend[tcpopen]: conns
- Legend1[tcpopen]: Number of open TCP connections
- Legend2[tcpopen]:
- LegendI[tcpopen]: TCP connections:
- LegendO[tcpopen]:
- Title[tcpopen]: TCP connections [TCP connections]
- PageTop[tcpopen]: <h1>TCP connections [TCP connections]</h1>
- <TABLE>
- <TR><TD>Description:</TD><TD>Number of open TCP connections</TD></TR>
- </TABLE>
- ###################### Configuration End #########################
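Before running mrtg against this file, it is worth checking that the agent really answers for each OID used in the targets above. The sketch below only builds the snmpget command lines so they can be reviewed first. The metric names and the functions oid_for and probe_cmd are invented for this example, and HOST and COMMUNITY default to the values used in this document.

```shell
# Numeric OIDs copied from the mrtg.cfg targets in this section.
oid_for() {
  case "$1" in
    load1)   echo .1.3.6.1.4.1.2021.10.1.5.1 ;;
    load15)  echo .1.3.6.1.4.1.2021.10.1.5.3 ;;
    cpuuser) echo .1.3.6.1.4.1.2021.11.50.0 ;;
    cpuidle) echo .1.3.6.1.4.1.2021.11.53.0 ;;
    procs)   echo .1.3.6.1.2.1.25.1.6.0 ;;
    tcpopen) echo .1.3.6.1.2.1.6.9.0 ;;
    *) echo "unknown metric: $1" >&2; return 1 ;;
  esac
}

# Print (but do not run) the snmpget command for one metric.
probe_cmd() {
  oid=$(oid_for "$1") || return 1
  printf 'snmpget -v 1 -c %s %s %s\n' "${COMMUNITY:-public}" "${HOST:-59.57.251.56}" "$oid"
}
```

For instance, probe_cmd tcpopen prints a command that can be pasted into a shell on the MRTG host; if the agent returns an error for it, the view/access change in 2) has probably not taken effect.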
Note <1>
The host uptime section of the configuration above refers to the file /usr/local/mrtg/bin/mrtg-updays.pl. It has to be created by hand, with the following content:
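- #!/usr/bin/perl
- # MRTG external probe: prints two values, an uptime string and a target name.
- use strict;
- use warnings;
- chomp(my $machine = `/bin/hostname`);
- chomp(my $uptime_raw = `/usr/bin/uptime`);
- # Whole days of uptime; 0 when the host has been up for less than a day.
- my ($upday) = $uptime_raw =~ /up\s+(\d+)\s+days?/;
- $upday = 0 unless defined $upday;
- # Human-readable uptime: the text between "up" and the user count.
- my ($uptime) = $uptime_raw =~ /up\s+(.*?),\s+\d+\s+users?/;
- $uptime = 'unknown' unless defined $uptime;
- print "$upday\n";    # line 1: first value
- print "$upday\n";    # line 2: second value
- print "$uptime\n";   # line 3: uptime string
- print "$machine\n";  # line 4: target name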
After saving it, make the file executable:
- # chmod +x /usr/local/mrtg/bin/mrtg-updays.pl
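MRTG expects an external probe like this one to print exactly four lines: two values, an uptime string and a name. A quick way to catch mistakes before wiring the script into mrtg.cfg is to check its output shape. The function name check_probe_output is made up for this sketch, and it assumes both values should be plain non-negative integers.

```shell
# Read a probe's output on stdin; succeed only if there are exactly four
# lines and the first two are non-negative integers.
check_probe_output() {
  awk 'NR <= 2 && $0 !~ /^[0-9]+$/ { bad = 1 }
       END { exit (NR != 4 || bad) }'
}
```

Typical use would be /usr/local/mrtg/bin/mrtg-updays.pl | check_probe_output && echo ok.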
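Note <2>
Take particular care with the disk parameters.
The format and values in snmpd.conf and mrtg.cfg must agree with each other and must match the output of the df -a* command exactly. Otherwise MRTG reports input errors.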
----------------------------
4) Create the working directories and related files
----------------------------
env LANG=C /usr/local/mrtg/bin/mrtg /usr/local/apache/htdocs/mrtg/mrtg.cfg
----------------------------
5) Generate the monitoring index page
----------------------------
env LANG=C /usr/local/mrtg/bin/indexmaker --output /usr/local/apache/htdocs/mrtg/index.html --title="System state Monitor" /usr/local/apache/htdocs/mrtg/mrtg.cfg
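----------------------------
6) Centralized management with multiple servers
----------------------------
MRTG home directory: /usr/local/apache/htdocs/mrtg
With multiple servers, create one directory per server, named after its IP, so that each is easy to identify.
Example:
Create a directory named after the IP (117.25.130.49), then:
(1) Generate the CFG configuration file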
/usr/local/mrtg/bin/cfgmaker public@117.25.130.49 --global 'workdir: /usr/local/apache/htdocs/mrtg/117.25.130.49' --output /usr/local/apache/htdocs/mrtg/117.25.130.49/mrtg.cfg
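A file generated this way contains only MRTG's default configuration. This step can be skipped: copy the configuration template above instead and change the IP in it to that of the server to be monitored.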
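Copying the template and swapping the IP can be scripted. The following is a sketch only; retarget_cfg is a hypothetical helper, and it assumes both arguments are plain dotted IPv4 addresses.

```shell
# Read a mrtg.cfg template on stdin and print it with every occurrence of
# the old IP ($1) replaced by the new IP ($2). The dots in the old IP are
# escaped so that they match literally rather than as "any character".
retarget_cfg() {
  old=$(printf '%s' "$1" | sed 's/\./\\./g')
  sed "s/${old}/$2/g"
}
```

For this example that would be retarget_cfg 59.57.251.56 117.25.130.49 < /usr/local/apache/htdocs/mrtg/mrtg.cfg > /usr/local/apache/htdocs/mrtg/117.25.130.49/mrtg.cfg. The WorkDir, HtmlDir and ImageDir lines in the copy still point at the main directory and need to be changed to the per-IP directory by hand.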
(2) Generate the monitoring index page
/usr/local/mrtg/bin/indexmaker --output=/usr/local/apache/htdocs/mrtg/117.25.130.49/index.html --title="System Monitor" /usr/local/apache/htdocs/mrtg/117.25.130.49/mrtg.cfg
(3) Create the working directories and related files
env LANG=C /usr/local/mrtg/bin/mrtg /usr/local/apache/htdocs/mrtg/117.25.130.49/mrtg.cfg
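(4) Change the character set of the generated pages
The HTML files generated above declare iso-8859-1 (or iso-8859-15) as their character set by default, which can show up as garbled text in the browser. Change it to utf-8, assuming mrtg.cfg itself is saved as UTF-8: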
# sed -i "s/charset=iso-8859-15/charset=utf-8/" filename
# sed -i "s/charset=iso-8859-1/charset=utf-8/" filename
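The order of the two substitutions matters: if the iso-8859-1 rule ran first, a page declaring iso-8859-15 would end up as charset=utf-85. The sketch below wraps both rules, in the safe order, in one filter; the function name fix_charset is invented for this example.

```shell
# Read HTML on stdin and rewrite the declared charset to utf-8.
# The longer name (iso-8859-15) is handled first so that its trailing "5"
# is not left behind by the shorter iso-8859-1 rule.
fix_charset() {
  sed -e 's/charset=iso-8859-15/charset=utf-8/' \
      -e 's/charset=iso-8859-1/charset=utf-8/'
}
```

It can be tried on a single page first, for example fix_charset < index.html > index.utf8.html, before applying the sed -i commands to all generated files.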
(5) Run on a schedule (crontab entry, every 5 minutes)
*/5 * * * * env LANG=C /usr/local/mrtg/bin/mrtg /usr/local/apache/htdocs/mrtg/117.25.130.49/mrtg.cfg
(6) View in a browser
http://59.57.251.56/mrtg/117.25.130.49/
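(7) Firewall (on each monitored server, allow SNMP from the MRTG host; UDP 161 carries the polling, UDP 162 is the trap port)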
iptables -A INPUT -s 59.57.251.56 -p udp --destination-port 161 -j ACCEPT
iptables -A INPUT -s 59.57.251.56 -p udp --destination-port 162 -j ACCEPT
- ### Global Config Options
- EnableIPv6: no
- Options[_]: growright, bits
- WorkDir: /usr/local/apache/htdocs/mrtg
- Language: Chinese
- HtmlDir: /usr/local/apache/htdocs/mrtg
- ImageDir: /usr/local/apache/htdocs/mrtg
- LogDir: /var/log/mrtg
- ThreshDir: /var/lib/mrtg
- LoadMIBs:/usr/share/snmp/mibs/UCD-SNMP-MIB.txt,/usr/share/snmp/mibs/HOST-RESOURCES-MIB.txt,/usr/share/snmp/mibs/TCP-MIB.txt
- # User vs Idle CPU usage
- Target[kontor.cpu]:ssCpuRawUser.0&ssCpuRawIdle.0:public@59.57.251.56
- RouterUptime[kontor.cpu]: public@59.57.251.56
- MaxBytes[kontor.cpu]: 100
- Title[kontor.cpu]: CPU LOAD
- PageTop[kontor.cpu]: <H1>User CPU Load %</H1>
- Unscaled[kontor.cpu]: ymwd
- ShortLegend[kontor.cpu]: %
- YLegend[kontor.cpu]: CPU Utilization
- Legend1[kontor.cpu]: User CPU in % (Load)
- Legend2[kontor.cpu]: Idle CPU in % (Load)
- Legend3[kontor.cpu]:
- Legend4[kontor.cpu]:
- LegendI[kontor.cpu]: User
- LegendO[kontor.cpu]: Idle
- Options[kontor.cpu]: growright,nopercent
- # User vs System CPU usage
- Target[kontor.usrsys]:ssCpuRawUser.0&ssCpuRawSystem.0:public@59.57.251.56
- RouterUptime[kontor.usrsys]: public@59.57.251.56
- MaxBytes[kontor.usrsys]: 100
- Title[kontor.usrsys]: CPU LOAD
- PageTop[kontor.usrsys]: <H1>CPU (user and system) Load %</H1>
- Unscaled[kontor.usrsys]: ymwd
- ShortLegend[kontor.usrsys]: %
- YLegend[kontor.usrsys]: CPU Utilization
- Legend1[kontor.usrsys]: User CPU in % (Load)
- Legend2[kontor.usrsys]: System CPU in % (Load)
- Legend3[kontor.usrsys]:
- Legend4[kontor.usrsys]:
- LegendI[kontor.usrsys]: User
- LegendO[kontor.usrsys]: System
- Options[kontor.usrsys]: growright,nopercent
- ### Active CPU usage
- Target[kontor.cpusum]:ssCpuRawUser.0&ssCpuRawUser.0:public@59.57.251.56 + ssCpuRawSystem.0&ssCpuRawSystem.0:public@59.57.251.56 + ssCpuRawNice.0&ssCpuRawNice.0:public@59.57.251.56
- RouterUptime[kontor.cpusum]: public@59.57.251.56
- MaxBytes[kontor.cpusum]: 100
- Title[kontor.cpusum]: CPU LOAD
- PageTop[kontor.cpusum]: <H1>Active CPU Load %</H1>
- Unscaled[kontor.cpusum]: ymwd
- ShortLegend[kontor.cpusum]: %
- YLegend[kontor.cpusum]: CPU Utilization
- Legend1[kontor.cpusum]: Active CPU in % (Load)
- Legend2[kontor.cpusum]:
- Legend3[kontor.cpusum]:
- Legend4[kontor.cpusum]:
- LegendI[kontor.cpusum]: Active
- LegendO[kontor.cpusum]:
- Options[kontor.cpusum]: growright,nopercent
- ###Monitoring DISK space
- ###Monitoring from dskTable
- Target[kontor.root]:dskPercent.1&dskPercent.2:public@59.57.251.56
- RouterUptime[kontor.root]: public@59.57.251.56
- MaxBytes[kontor.root]: 100
- Title[kontor.root]: DISK USAGE
- PageTop[kontor.root]: <H1>DISK / and /usr Usage %</H1>
- Unscaled[kontor.root]: ymwd
- ShortLegend[kontor.root]: %
- YLegend[kontor.root]: DISK Utilization
- Legend1[kontor.root]: Root disk
- Legend2[kontor.root]: /usr disk
- Legend3[kontor.root]:
- Legend4[kontor.root]:
- LegendI[kontor.root]: Root disk
- LegendO[kontor.root]: /usr disk
- Options[kontor.root]: growright,gauge,nopercent
- ###Monitoring from hrStorageTable
- Target[kontor.hrroot]:hrStorageSize.1&hrStorageUsed.1:public@59.57.251.56
- RouterUptime[kontor.hrroot]: public@59.57.251.56
- MaxBytes[kontor.hrroot]: 300000
- Title[kontor.hrroot]: DISK / USAGE
- PageTop[kontor.hrroot]: <H1>DISK / Usage</H1>
- ShortLegend[kontor.hrroot]: B
- kMG[kontor.hrroot]: k,M,G,T,P
- kilo[kontor.hrroot]: 1024
- YLegend[kontor.hrroot]: DISK / Utilization
- Legend1[kontor.hrroot]: Root disk size
- Legend2[kontor.hrroot]: Root disk usage
- Legend3[kontor.hrroot]:
- Legend4[kontor.hrroot]:
- LegendI[kontor.hrroot]: Root disk size
- LegendO[kontor.hrroot]: Root disk usage
- Options[kontor.hrroot]: growright,gauge,nopercent
- Two further examples that have been offered:
- ### Monitoring TCP connections
- Target[tcpopen]: .1.3.6.1.2.1.6.9.0&.1.3.6.1.2.1.6.9.0:public@localhost
- Options[tcpopen]: nopercent,growright,gauge,noinfo
- Title[tcpopen]: Open TCP connections
- PageTop[tcpopen]: Open TCP connections
- MaxBytes[tcpopen]: 1000000
- YLegend[tcpopen]: # conns
- ShortLegend[tcpopen]: connections
- LegendI[tcpopen]: Connections:
- LegendO[tcpopen]:
- Legend1[tcpopen]: Open TCP connections
- ### Monitoring Free Memory
- Target[freemem]: .1.3.6.1.4.1.2021.4.11.0&.1.3.6.1.4.1.2021.4.11.0:public@localhost
- Options[freemem]: nopercent,growright,gauge,noinfo
- Title[freemem]: Free Memory
- PageTop[freemem]: Free Memory
- MaxBytes[freemem]: 1000000
- kMG[freemem]: k,M,G,T,P,X
- YLegend[freemem]: bytes
- ShortLegend[freemem]: bytes
- LegendI[freemem]: Free Memory:
- LegendO[freemem]:
- Legend1[freemem]: Free memory, not including swap, in bytes