Avalon network monitoring tool

1 Introduction

Percival is a Network Monitoring and Capacity Planning front-end to the excellent RRDtool software. It is based on our experience of providing customized network monitoring solutions to ISPs, banks and large enterprises in Israel.

We found that existing commercial tools are too expensive and ill-designed for this task, while existing RRD front-ends lack essential features such as user profiles, a simple configuration model, a customizable GUI, performance, reports, etc.

We started with stock Cricket and over time essentially rewrote it to address its shortcomings.

We also had to add a number of bug fixes and improvements to RRDtool. Percival is a subset of our Lancelot Monitoring Framework that can be released under the GPL.

Percival Features:
  • WEB User Interface
  • Themable user interface
  • Support of MIB2
  • Cisco, Linux and Windows
  • Configuration is stored in hierarchical database
  • Database supports on the fly editing, links and multiple users
  • Reports: top utilized interfaces, errors, discards etc
  • Totals: average, sum, many small graphs on one page
  • Drill-down on all graphs
  • Moving average smoothing
  • Percentile 95%
  • User profiles. Each user can see only his part of the configuration tree
  • Each device polled only once even if it appears in multiple profiles
  • Consistent CLI tools for the system configuration and maintenance
  • Modified RRDtool is used for the data storage
  • Designed to hold around 500 interfaces with 120 sec polling frequency
  • OS: Linux, Solaris (must be built from source)

Percival is written in Perl so it should be pretty portable. The officially supported platforms are Linux and Solaris.

2.1 Installation from RPM

It is best to install Percival from RPM. The Percival RPM can be installed on RedHat, Mandrake and Turbolinux. RPMs can be obtained from SourceForge at http://percival.sourceforge.net. The installation is as simple as running

rpm -ihv percival-1-1.i386.rpm

After the command completes successfully, Percival is up and running.

2.1.1 Verifying Installation

Connect to http://<yourhost>:8181. You should see the system login screen. The system comes with user guest and password guest. It is a good idea to change them.

Then run the following commands:

su - avalon

cd /usr/local/percival/bin

./overlord.pl check

The command will produce the following output:

konfigd - ok

kollector - ok

thaw - ok

querymaker - ok

Another check is to rebuild the database:

./kompile

It will produce something like this:

[10-Mar-2003 17:11:29 :8336] Starting compile: Percival version 1.1.1 built on rothut.avalon-net.co.il at Sun Mar 9 16:05:50 IST 2003 by root features: light gpl

[10-Mar-2003 17:11:31 :8336] Processed 62 nodes (in 30 files) in 4 seconds.

2.2 Installing from Binaries

Binary installation is a little trickier. Percival must be installed in /usr/local/percival. It expects to find Perl and all supporting packages in /usr/local/avalon. Other configurations are not supported at the moment. It is possible to relocate Percival and its supporting packages, but you should be prepared to write some scripts.

First, you have to download the following tarballs:

  • Percival-Perl.tar.gz
  • Percival-Apache.tar.gz
  • Percival-RRD.tar.gz
  • Percival-Source.tar.gz

I am assuming you have put all tarballs in /tmp. The next step is to unpack them from the root directory:

cd /

gunzip -c /tmp/Percival-Perl.tar.gz | tar -xvf -

gunzip -c /tmp/Percival-Apache.tar.gz | tar -xvf -

gunzip -c /tmp/Percival-RRD.tar.gz | tar -xvf -

gunzip -c /tmp/Percival-Source.tar.gz | tar -xvf -

The last step is to add Percival to the OS startup scripts. This is OS dependent. An example of such a script is located in /usr/local/percival/bin/lancelotd.

2.3 Building from Sources

Percival builds are done automagically from the unified source base. Still, it should be perfectly possible to build Percival from its standalone components. Percival expects to find its supporting packages in /usr/local/avalon. First, you will need Perl and the following packages:

  • perl-5.6.1
  • Net-Telnet-3.02
  • Net-Radius-1.43
  • Statistics-Descriptive-2.4
  • AppConfig-1.52
  • Apache-Admin-Config-0.15
  • CGI-FastTemplate-1.09
  • CGI.pm-2.78
  • Cflow-1.025
  • Color-Object-0.1_02
  • Compress-Zlib-1.14
  • DB_File-1.76
  • Digest-MD5-2.12
  • File-Tail-0.98
  • HTML-Parser-3.23
  • HTML-Tagset-3.03
  • IO-stringy-1.220
  • MIME-Base64-2.12
  • MIME-tools-5.410
  • MailTools-1.15
  • Net-DNS-0.12
  • Net-Patricia-1.010
  • NetServer-Generic-1.03
  • Parse-Syslog-0.03
  • RadiusPerl-0.05
  • SNMP_Session-0.83
  • Storable-1.0.11
  • Template-Toolkit-2.02
  • Time-HiRes-01.20
  • TimeDate-1.10
  • URI-1.11
  • XML-Parser-2.30
  • libnet-1.0703
  • libwww-perl-5.50
  • IO-Tty-1.02
  • Expect-1.15
  • expat-1.95.2
  • zlib-1.1.3
  • db-4.0.14
  • BerkeleyDB-0.17

Then you need to have Apache and mod_perl installed. We use the following packages:

  • apache_1.3.26
  • mod_perl-1.26

Once you are done with Apache, you have to build the rrdtool that comes with Percival. Percival will not work with the standard rrdtool. You have to download rrdtool-1.0.28avalon.tar.gz from the Percival site and install it.

Finally, you have to untar Percival-Source.tar.gz.

3 Percival Operation Basics

Despite its very simple look, Percival is a complex system. In this chapter you will learn how to operate Percival from the command line. You will learn how to stop and start the system, which daemons (services) should be running, and what each of them does. You will also learn how to manage network elements using the konfne command, and the basic concepts of the Percival configuration database. In this chapter we try to keep things as simple as possible; advanced concepts will be handled in a separate chapter.

All Percival commands are located in /usr/local/percival/bin. In the next sections we assume that you are working as user avalon and that you have this directory in your PATH variable or have changed into it with cd. We also assume that Percival was installed from RPM on a RedHat-like Linux distribution.

It is very important to operate Percival as user avalon and not as root! You can switch to avalon by typing su - avalon

3.1 Starting/Stopping Percival

As root, execute

/etc/rc.d/init.d/lancelotd start

to start the system, and

/etc/rc.d/init.d/lancelotd stop

to stop the system.

3.2 Percival Daemons (Services) and Commands

Several daemons are required for normal Percival operation. The daemons are responsible for data collection, detection of unresponsive IPs, report generation, the web user interface and management of the Percival configuration by remote client applications such as Merlin. Every daemon except the web server must be managed with the overlord.pl command.

Percival has the following daemons:

kollector

does all data collection.

thawne.pl

detects unresponsive network devices.

querymaker.pl

generates all reports

konfigd

provides configuration API to the external clients

httpd

Apache webserver. Needed for Percival Web interface.

Apache is controlled with the apachectl command, which is located in /usr/local/avalon/bin. Apache must be managed as user root.

apachectl start

starts apache

apachectl stop

stops apache.

Every other daemon is controlled with overlord.

3.2.1 Controlling Daemons with Overlord

Overlord was developed because the number of Percival daemons was increasing rapidly. It was clear that some master daemon was needed to rule them all. You need to know the following basic overlord commands:

overlord.pl ping

tells status of each configured daemon

overlord.pl check

tells the status of each configured daemon and restarts any daemon that is not running

overlord.pl list

lists options for each configured daemon

overlord.pl tail <name>

shows last lines of the daemon log

overlord.pl --help

shows all available options and short help

overlord.pl modify <name> <param=value> …

tweaks daemon options. If you installed from RPM, the system has very reasonable default values. You should not change them unless you really know what you are doing.

3.2.2 Common Command Line Options

Every Percival process, be it a command or a daemon, understands some common options. These are:

--help

show help message and exit

--version

show version information and exit

--loglevel <level>

controls process output. Level can be either Debug, Info, Warn or Error. Default level is Info. Debug is the most verbose and must not be used in production.

--logfile <file>

controls where to send process output. By default all output goes to STDOUT

The following options are accepted by any process, but the process may ignore them. After all, it makes no sense to run the device configurator command (konfne) as a daemon.

--daemon

if specified, the process becomes a daemon (service)

--interval <seconds>

controls how often the daemon should do its work. For example, kollector runs every 120 seconds.

The next options are available in Lancelot only. In Percival they will generate an error message.

--cached

turns on the Perl-based in-memory cache. The cache is optimized to be very memory friendly and produces a nice speedup. This option is obsolete.

--hdb

turns on the alternative implementation of the configuration database (HDB). The HDB database is optimized for speed and provides an order of magnitude (and in some cases even more) speedup in comparison with the Percival implementation. It is also fully backward compatible with the old database at the API level.

3.2.3 Commands

Percival has only four commands, and you already know everything you need about overlord. That leaves three others. One, target.pl, is used for debugging the configuration database and is outside the scope of this chapter. Another, konfne, is how you add, update or remove network devices from Percival. In upcoming chapters we will deal with it a lot. The last one, kompile, produces the binary database from the Percival text-based configuration database and downloads device configuration files from device modules. You need to issue this command every time you edit the configuration database by hand or if you suspect that the Percival database is corrupted.

3.3 Managing Devices

By the time you get to this section you don't want to read anything more about processes, commands, daemons and options. You want to add your new Cisco router to Percival NOW. This is how you do it:

konfne -af --ip <your router ip> --community <snmp community> cfg /Tree/Routers/dummyname

That's all. You will now see the device once you log in to Percival as user guest.

Percival configuration is based on the concept of the “device”. A device can be a router, computer, switch or other network element. In this case we call it a “real” device. There is another class of devices, such as reports, summary graphs (totals), profiles, etc. These devices can be created based only on information in the database. We call such devices “virtual”.

3.3.1 Configuration Database Basics

Percival keeps its configuration information in a hierarchical (tree) database. The database is completely text based. This makes it very easy to back up, and you can modify it using standard Unix CLI tools. Each user has their own view of the database.

Percival converts the textual database into a binary format. Lancelot uses an alternative implementation of the database, called HDB, that works directly over text files. kompile converts the database into the binary form usable by Percival and konfne. konfne works on both the binary and the text database. Thus you don't need to worry about keeping database copies in sync with each other.

The database is located in the /usr/local/percival/etc/lancelot-config directory. A directory in the filesystem represents a directory in the database. However, one file can contain several database child nodes. Each database entry can have many properties in the form of attribute=value pairs. Node attributes can be inherited. A database node may have a reference (link) to another node. It works much like symbolic links in Unix or shortcuts in Windows.
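To make the inheritance and link behaviour concrete, here is a minimal Python sketch of a tree node with attribute=value pairs, inherited attributes and links. It is an illustrative model only, not Percival's implementation (which is written in Perl). The class name Node, its methods and the attribute names ip and community are hypothetical, and the sketch assumes that a node inherits from its enclosing directory node, which the text above does not spell out.

```python
class Node:
    """Hypothetical model of one configuration database node."""

    def __init__(self, name, parent=None, link=None, **attrs):
        self.name = name
        self.parent = parent      # enclosing node, None for the root
        self.link = link          # optional reference to another node
        self.attrs = dict(attrs)  # attribute=value pairs set on this node

    def path(self):
        """Return the absolute path of this node, e.g. /Devices/gw1."""
        if self.parent is None:
            return "/"
        return self.parent.path().rstrip("/") + "/" + self.name

    def resolve(self):
        """Follow links until a real node is reached (like a symlink)."""
        node = self
        while node.link is not None:
            node = node.link
        return node

    def get(self, key):
        """Look up an attribute on the resolved node, then on its ancestors."""
        node = self.resolve()
        while node is not None:
            if key in node.attrs:
                return node.attrs[key]
            node = node.parent  # assumed inheritance: fall back to the parent
        raise KeyError(key)


root = Node("")
devices = Node("Devices", root, community="public")
router = Node("gw1", devices, ip="10.0.0.1")
tree = Node("Tree", root)
shortcut = Node("gw1", tree, link=router)  # /Tree/gw1 is a link to /Devices/gw1

print(router.path())
print(shortcut.get("ip"), shortcut.get("community"))
```

The link under /Tree carries no attributes of its own: every lookup is answered by the node under /Devices, which is why a device can appear in several places while being configured only once.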

Percival and Lancelot come with several preconfigured database entries:

Defaults

has system wide settings such as skin, location of rrd files and others

SysProfiles

has default system profile. The only way to change administrator username or password is to edit this file.

daemons

contains settings for all system services. overlord.pl is the preferred way to manage it.

profiles

contains definitions of Percival users. Percival comes preconfigured with a guest account with the root at /Tree. konfne is the preferred way to manage users.

/Devices

directory contains all currently configured network devices. It also contains instructions how devices should be processed.

/Tree

is root of preconfigured guest profile.

NOTE: remember to run kompile if you changed the database manually.

3.3.2 konfne Basics

konfne will be your main tool for managing Percival. When you configure new or already existing devices several things happen:

  • device might be snmp scanned
  • device global configuration is placed under /Devices according to element classes. For example Router may have interfaces and chassis. Interfaces are placed under /Devices/Interfaces/<routername>/ and chassis configuration is placed under /Devices/Routers/. Some devices, such as report, may not have global configuration.
  • If global configuration already exists, it is updated.
  • Links and other needed device elements are created in the specified path

konfne has several standard options:

--devlist

shows all available devices

--autotype or -a

guess device type automagically

--ip or -i

device ip address or hostname

--community or -c

device SNMP community. If not specified defaults to public

--fetchname or -f

fetch device name from sysName. Everything after the last / in the path is replaced by the fetched name

--recursively or -r

apply command to all devices in specified subtree. Usually used for automatic reconfiguration of already configured devices. For example, konfne -r /Tree will reconfigure whole guest profile.

--tag or -t <attribute>=<value>

apply device specific parameters. Each device may have specific configuration options.
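The path rewriting done by the fetchname option can be illustrated with a few lines of Python. This is only a sketch of the rule as described above, not konfne's actual code; the function name apply_fetched_name and the sysName value core-gw1 are made up for the example.

```python
def apply_fetched_name(path, sys_name):
    """Replace everything after the last '/' in path with the fetched sysName."""
    head, _, _ = path.rpartition("/")
    return head + "/" + sys_name


# The placeholder name at the end of the path is discarded.
print(apply_fetched_name("/Tree/Routers/dummyname", "core-gw1"))
```

This is why the quick-start example can use a throwaway name such as dummyname as the last path component.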

konfne has following basic commands:

help

show help for specified device or path. To get help for profile configuration you can do:

konfne help Devices::Virtual::Profile

or

konfne help /Tree

cfg

will configure new device or update already configured device

del

deletes the device configuration visible in the profile. The device is still collected.

DELETE

deletes the device configuration visible in the profile and also deletes the global device configuration. There is no concept of a device usage count, so if you have the device configured in other profiles it will stop working there. Already collected data are not removed.

DEMOLISH

deletes device configuration from profile, from the database and removes all collected data.

probe

check if device is responding to SNMP

3.3.3 Managing User Profiles

Percival supports the concept of a user. Each user must have a different profile. For example, you can have one profile with access to all of your routers, while your customer's profile gives access only to a specific router interface. Profile creation is governed by several simple rules:

  • Nested profiles are not allowed
  • All devices, except folders, must be created under profile
  • Profiles with the same root are not allowed
  • Profile name must be unique

Profile has three basic parameters:

  1. Profile name. In this document it is also referred to as the user name.
  2. Profile password
  3. Profile root. The closest analogy to the profile root is the user home directory in Unix.
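The profile creation rules can be sketched as a small validation routine. The Python below is an illustration of the rules as stated in this section (unique name, no shared root, no nesting), not the check that konfne performs; the function names and the customer profile are hypothetical.

```python
def is_nested(root_a, root_b):
    """True if one profile root is equal to, or lies inside, the other."""
    a = root_a.rstrip("/") + "/"
    b = root_b.rstrip("/") + "/"
    return a.startswith(b) or b.startswith(a)


def check_new_profile(existing, name, root):
    """Return a list of rule violations for a proposed profile.

    existing maps profile name -> profile root.
    """
    problems = []
    if name in existing:
        problems.append("profile name must be unique")
    for other_name, other_root in existing.items():
        if other_root.rstrip("/") == root.rstrip("/"):
            problems.append(f"same root as profile {other_name}")
        elif is_nested(root, other_root):
            problems.append(f"nested with profile {other_name}")
    return problems


profiles = {"guest": "/Tree"}  # the preconfigured guest profile
print(check_new_profile(profiles, "foobar", "/MyProfiles/FoobarTree"))
print(check_new_profile(profiles, "customer", "/Tree/Routers"))
```

The trailing slash added in is_nested keeps /Tree and /TreeHouse from being mistaken for nested roots.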

Device Devices::Virtual::Profile provides all necessary profile management.

Profile has following device specific options:

profile

specifies profile name

auth

specifies the profile authentication mode. Only local mode, which is the default, is supported in Percival.

editable

if the option is present and equal to true, the profile user can use Merlin to manage the profile.

alt-legend

can be either true or false. If present and true, the graph legend is displayed under the graph in MRTG-like style.

su-allowed

the user of this profile can switch to another profile without authenticating. This is mostly useful for large installations where you want to have a ‘master’ account.

Example of creating a new profile foobar with the root at /MyProfiles/FoobarTree:

konfne --device Devices::Virtual::Profile -t profile=foobar -t password=secret -t 'alt-legend=true' cfg /MyProfiles/FoobarTree

3.3.4 Automatic Configuration of Network Devices

Percival has the ability to automatically detect the type of a network device and invoke the correct device module. Auto-detection works in many cases and is the easiest way to add new equipment. The downside of auto-detection is that you cannot pass device-specific options to konfne. Auto-detection will not work for virtual devices.

This is how you auto-detect a device:

konfne --autotype --ip <ip> --community <secret> cfg /Tree/Routers/myrouter

3.3.5 Standard Device Options

Every Percival device must support the following standard options:

display-name

if specified, it overrides the device name specified in the path. Unlike the path, it may contain embedded HTML tags and spaces.

3.3.6 Configuring Generic MIB2 Device

Almost any SNMP-manageable equipment implements MIB2. Percival uses MIB2 to obtain network interface statistics. If there is no specific Percival device for your equipment, you can use Devices::Routers::Generic to obtain traffic statistics.

Options supported by Devices::Routers::Generic must be supported by any other device dealing with network interfaces. The following options are supported:

namedonly

configure only named interfaces, that is, interfaces which have a description set in ifAlias.

config-v2c

if true, try to use 64-bit high performance counters (ifHCInOctets, ifHCOutOctets) for high-speed interfaces. The device checks whether the interface can really return these counters. In our experience there are a lot of problems with 64-bit counters on CISCO routers. Care must be taken when using this option.

config-v2c-speed

64-bit counters can be used if the interface speed is greater than the specified threshold. Speed is given in megabits. Default speed value is 100M.

use-if-name

by default ifDescr is used to get interface names. Some devices may have identical ifDescr but different ifName. In this case this option should be set to true.

if-types

list of symbolic interface types that should be configured. An interface will not be configured if this option is specified and the interface type does not match.

if-match-regexp

only configure interfaces that match the given regexp.

keepabsent

do not remove an interface from the configuration if it is no longer present on the router. Instead the interface is marked as “frozen”. It will have the word frozen added to the description, and its default graphs will end at the time the interface was “frozen”. This feature is useful for keeping graphs of old lines.
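The reason 64-bit counters matter on fast interfaces is simple arithmetic: a 32-bit octet counter holds at most 2^32 bytes, and at high line rates it can wrap more than once between two polls. The following Python sketch works this out for the 120-second polling interval mentioned earlier. It is a back-of-the-envelope illustration, not code from Percival; the function names are made up, and it assumes the interface runs at full line rate in one direction.

```python
def wrap_seconds(speed_mbit, counter_bits=32):
    """Seconds until an octet counter wraps at full line rate."""
    bytes_per_second = speed_mbit * 1_000_000 / 8
    return 2 ** counter_bits / bytes_per_second


def counter_delta(previous, current, counter_bits=32):
    """Octets transferred between two polls, allowing for at most one wrap."""
    return (current - previous) % (2 ** counter_bits)


POLL_INTERVAL = 120  # seconds, the kollector interval used in this document

for speed in (10, 100, 1000):
    t = wrap_seconds(speed)
    verdict = "safe" if t > POLL_INTERVAL else "may wrap between polls"
    print(f"{speed:5d} Mbit/s: 32-bit counter wraps in {t:8.1f} s ({verdict})")

# A single wrap can still be corrected when computing the difference.
print(counter_delta(4294967000, 500))
```

At 100 Mbit/s a 32-bit counter lasts several polling intervals, while at 1000 Mbit/s it can wrap well within a single interval, at which point the difference between two readings is no longer recoverable.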

3.3.7 Configuring CISCO Equipment

It is a well-known fact that the majority of network equipment is manufactured by CISCO. Percival and Lancelot have very good support for CISCO routers and switches, including advanced features such as SAA, Netflow and Quality of Service monitoring.

3.3.7.1 Cisco Routers

CISCO routers are configured with the Devices::Routers::Cisco device. The device has the following options:

setup-pptp-session

normally PPTP sessions are ignored unless the value of this option is true.

ppp-names

normally interfaces with ifType ppp are not configured. This option accepts a regular expression. If the expression matches the interface name as given in ifDescr and the interface type is ppp, then the interface will be configured.

config-virtual

normally interfaces with the word “virtual” in ifDescr are skipped unless this option is true.

telnet-login

user on the router for doing login. This is needed for configuring either BGP or Pings.

telnet-password

password of the user that was specified with the previous option

pings

configure pings from the Cisco router. The option accepts a comma-separated list of IPs or hostnames.

3.3.7.2 Cisco IOS Switches

CISCO IOS switches are configured with Devices::Switches::IOS. The device does not have any specific options.

3.3.7.3 Cisco Catalyst Switches

CISCO Catalyst switches are configured with Devices::Switches::Catalyst. There are no device specific options.

3.3.8 Configuring Linux

We support the UCD-SNMP and NET-SNMP agents on Linux. We have encountered problems with the packaged SNMP agent on RedHat 7.3. You can download our build of NET-SNMP that fixes that problem from the Percival site on SourceForge.

Linux computers are configured with Devices::Computers::Linux. There are no device-specific options. The Linux device supports monitoring of CPU load average, memory and disk usage in addition to interface monitoring.

3.3.9 Configuring Windows 2000

Percival can configure Windows 2000 with the Host MIB or with the Compaq Insight Manager MIB. The correct MIB is auto-detected. Windows 2000 computers are configured with Devices::Computers::Win2000. The device supports monitoring of CPU, memory and disk usage.

The device has the following options:

process-watchdog

gather service uptime statistics. Accepts a comma-separated list of services.

3.3.10 Configuring Windows NT

NT has very basic SNMP support. To get advanced statistics you must install SNMP4C from http://www.wtcs.org. Windows NT computers are configured with Devices::Computers::WinNT. There are no device-specific options.

3.3.11 Configuring Reports

Reports in Percival provide you with a high-level system summary. Using reports you will be able to quickly identify problems in your network and zoom in on the problem area to view detailed statistics. Reports are configured with Devices::Virtual::Report. The following options are supported:

type

report type. There are several builtin reports:

utilization

compares network interface utilization over a period of time. Utilization is computed as traffic/bandwidth, where the bandwidth value is taken from the ifSpeed of the interface. Results will not be valid if your interface speed is set wrong.

errors

sort interfaces by error count over a period of time. The presence of errors on an interface usually indicates hardware problems.

discards

sort interfaces by discarded packets over a period of time. Packets are discarded when the router queue is getting long. The presence of discards indicates routing problems or lack of bandwidth.

overloaded

show interfaces that are consistently utilized at over 70% of capacity.

idle

inverse of the previous report.

limit

how many results to show. Must be positive number.

desc

detailed report description. HTML tags may be used here.

archive

how to process data. Can be either AVERAGE or MAX.

sort

sort the report in either ascending or descending order. Can be either asc or desc.

range

range of the report in seconds.

subtrees

specifies which devices to report on. Subtrees are specified as absolute paths from the configuration root, in a comma-separated list.
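As an illustration of how the utilization report ranks interfaces, the Python sketch below computes traffic/bandwidth for a few interfaces and keeps the top entries, mirroring the limit and sort options. It is not the querymaker implementation. The interface names and rates are invented sample data, and the sketch assumes traffic is available as an average rate in octets per second and that ifSpeed is reported in bits per second (as MIB2 defines it).

```python
def utilization(octets_per_second, if_speed_bps):
    """Fraction of the interface capacity in use (ifSpeed is in bits per second)."""
    if if_speed_bps <= 0:
        raise ValueError("ifSpeed must be positive")
    return octets_per_second * 8 / if_speed_bps


def top_utilized(samples, limit=10, descending=True):
    """Rank (name, octets_per_second, if_speed_bps) tuples by utilization."""
    ranked = sorted(samples, key=lambda s: utilization(s[1], s[2]), reverse=descending)
    return [(name, utilization(rate, speed)) for name, rate, speed in ranked[:limit]]


# Invented sample data: name, average octets/s over the range, ifSpeed in bit/s.
samples = [
    ("gw1/Serial0", 200_000, 2_048_000),
    ("gw1/FastEthernet0", 1_250_000, 100_000_000),
    ("gw2/Ethernet0", 1_000_000, 10_000_000),
]

for name, u in top_utilized(samples, limit=2):
    flag = " overloaded" if u > 0.70 else ""  # 70% threshold of the overloaded report
    print(f"{name}: {u:.0%}{flag}")
```

The example also shows why a wrong ifSpeed invalidates the result: the speed is the denominator, so an interface configured with ten times its real bandwidth would appear ten times less utilized.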

3.3.12 Configuring Totals

Percival has the ability to combine several graphs into one. This is useful when you want to see the average utilization of several interfaces, your total international traffic, or all graphs on one page. The Devices::Virtual::Total device provides this functionality. It can be configured with the following options:

long-desc

defines long description of the total. HTML tags and spaces are allowed.

subtrees

list of subtrees to search for report targets.

regexp

match target name based on the given regexp. Must be used with the subtrees tag.

selection

comma-separated list of targets.

type

report type. Can be one of the following:

report

show small graphs for every interface or other target on one page.

sum

show graph that sums all information.

contrib

show stack graph for all interfaces

avg

show graph that averages all information.

Percival is smart enough to figure out how to aggregate information from different devices. The details of this process are outside the scope of this manual.

3.4 Logfiles

Percival daemons are supposed to write their log files under the /usr/local/percival/var/lancelot-logs/ directory. While it is possible to change that using the --logfile option and overlord.pl, it is probably best to stick to the convention. Overlord has a rotate command that can be used to rotate logs periodically. In particular, the kollector log can grow quite large.

The log format is:

[dd-Mon-yyyy hh:mm:ss :pid] free text
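A log line in this format is easy to split into its parts. The Python sketch below does so with a regular expression and is tried on the kompile output line shown in section 2.1.1. The function name parse_log_line is made up, and the sketch assumes English month abbreviations (the default C locale).

```python
import re
from datetime import datetime

# [dd-Mon-yyyy hh:mm:ss :pid] free text
LOG_LINE = re.compile(
    r"^\[(?P<stamp>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) :(?P<pid>\d+)\] (?P<text>.*)$"
)


def parse_log_line(line):
    """Return (timestamp, pid, text) for a log line, or None if it does not match."""
    m = LOG_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    when = datetime.strptime(m.group("stamp"), "%d-%b-%Y %H:%M:%S")
    return when, int(m.group("pid")), m.group("text")


sample = "[10-Mar-2003 17:11:31 :8336] Processed 62 nodes (in 30 files) in 4 seconds."
print(parse_log_line(sample))
```

Lines that do not start with the bracketed header yield None, so they can be skipped or appended to the preceding entry.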

Log of every kollector measurement is written in following format:

[time] Retrieved data for <path>(inst)[error] : <ds1>@timestamp[low_bound-upper_bound],<ds2>,…,<dsn>

where

time

log time stamp

path

location of the element in configuration database

inst

element instance as determined by mapping. Can be empty.

error

provides a precision estimation based on error in time measurement and database sampling interval.

timestamp

in milliseconds. Actual time that goes to database

low_bound

in fractional seconds. Shows when measurement was started.

upper_bound

in fractional seconds. Shows when measurement was completed.

ds1..dsn

data sources, such as ifInOctets, ifOutOctets, etc.

3.5 Backup Procedures

As a bare minimum you need to back up the configuration and data directories. They are located in /usr/local/percival/etc/lancelot-config and /usr/local/percival/var/lancelot-data. Since Percival keeps both configuration and data in files, there is no need for any special backup agent. Backup can be done with standard Unix tools like tar, cpio, or dump.

To perform a full restore:
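Because everything lives in plain files, a backup script can be very small. The Python sketch below writes the two directories into one gzip-compressed tar archive using the standard tarfile module, which is equivalent to what tar itself would do. The function name backup and the archive name percival-backup-<stamp>.tar.gz are arbitrary choices for this example and not part of Percival.

```python
import tarfile
import time
from pathlib import Path


def backup(paths, dest_dir, stamp=None):
    """Write a gzip-compressed tar archive of the given directories."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    archive = Path(dest_dir) / f"percival-backup-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for p in paths:
            tar.add(p)  # directories are added recursively
    return archive


# Directories named in this section; pass them to backup() together with
# a destination directory of your choice.
PERCIVAL_DIRS = [
    "/usr/local/percival/etc/lancelot-config",
    "/usr/local/percival/var/lancelot-data",
]
```

Calling backup(PERCIVAL_DIRS, <destination directory>) as a user that can read both directories produces a single time-stamped archive.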

  1. stop the system
  2. restore data and configuration
  3. use kompile to rebuild database
  4. start the system

3.6 Upgrading the system

Before you upgrade, make sure to back up your data. Upgrading from RPM is quite easy. First stop the system:

/etc/rc.d/init.d/lancelotd stop

Remove installed RPM:

rpm -e percival

This will not remove your configuration and data. You will see some messages about directories not being empty. Install the new RPM:

rpm -ihv percival-1-1.x.i386.rpm

Rebuild configuration database:

kompile

Restart the system:

/etc/rc.d/init.d/lancelotd stop

/etc/rc.d/init.d/lancelotd start


Posted on December 13, 2013, in Monitoring Tools, Uncategorized.
