Avalon network monitoring tool
Percival is a network monitoring and capacity planning front-end to the excellent RRDtool software. It is based on our experience of providing customized network monitoring solutions to ISPs, banks and large enterprises in Israel.
We found that existing commercial tools are too expensive and ill-designed for this task, while existing RRD front-ends lack essential features such as user profiles, a simple configuration model, a customizable GUI, performance and reports.
We started with stock Cricket and over time essentially rewrote it to address its shortcomings.
We also had to add a number of bug fixes and improvements to RRDtool. Percival is a subset of our Lancelot Monitoring Framework that can be released under the GPL.
- WEB User Interface
- Themable user interface
- Support of MIB2
- Cisco, Linux and Windows
- Configuration is stored in hierarchical database
- Database supports on the fly editing, links and multiple users
- Reports: top utilized interfaces, errors, discards etc
- Totals: average, sum, many small graphs on one page
- Drill-down on all graphs
- Moving average smoothing
- 95th percentile
- User profiles. Each user can see only his part of the configuration tree
- Each device polled only once even if it appears in multiple profiles
- Consistent CLI tools for system configuration and maintenance
- Modified RRDtool is used for the data storage
- Designed to hold around 500 interfaces with a 120-second polling interval
- OS: Linux, Solaris (must be built from source)
Percival is written in Perl so it should be pretty portable. The officially supported platforms are Linux and Solaris.
2.1 Installation from RPM
It is best to install Percival from RPM. The Percival RPM can be installed on RedHat, Mandrake and Turbolinux. RPMs can be obtained from SourceForge at http://percival.sourceforge.net. Installation is as simple as running:
rpm -ihv percival-1-1.i386.rpm
After successful completion of the command you have Percival up and running.
2.1.1 Verifying Installation
Connect to http://<yourhost>:8181. You should see the system login screen. The system comes with user guest and password guest. It is a good idea to change them.
Then run the following commands:
su - avalon
The command will produce the following output:
konfigd – ok
kollector – ok
thaw – ok
querymaker – ok
Another check is to rebuild the database:
It will produce something like this:
[10-Mar-2003 17:11:29 :8336] Starting compile: Percival version 1.1.1 built on rothut.avalon-net.co.il at Sun Mar 9 16:05:50 IST 2003 by root features: light gpl
[10-Mar-2003 17:11:31 :8336] Processed 62 nodes (in 30 files) in 4 seconds.
2.2 Installing from Binaries
Binary installation is a little trickier. Percival must be installed in /usr/local/percival. It expects to find Perl and all supporting packages in /usr/local/avalon. Other configurations are not supported at the moment. It is possible to relocate Percival and its supporting packages, but you should be prepared to write some scripts.
First, you have to download the following tarballs:
We assume you have put all tarballs in /tmp. The next step is to unpack them:
gunzip -c Percival-Perl.tar.gz | tar -xvf -
gunzip -c Percival-Apache.tar.gz | tar -xvf -
gunzip -c Percival-RRD.tar.gz | tar -xvf -
gunzip -c Percival-Source.tar.gz | tar -xvf -
The last step is to add Percival to the OS startup scripts. This is OS dependent. An example of such a script is located in /usr/local/percival/bin/lancelotd.
2.3 Building from Sources
Percival builds are done automagically from the unified source base. Still, it should be perfectly possible to build Percival from its standalone components. Percival expects to find its supporting packages in /usr/local/avalon. First, you will need Perl and the following list of packages:
Then you need to have Apache and mod_perl installed. We use the following packages:
After you are done with Apache, you have to build the rrdtool that comes with Percival. Percival will not work with the standard rrdtool. You have to download rrdtool-1.0.28avalon.tar.gz from the Percival site and install it.
Finally, you have to untar Percival-Source.tar.gz.
3 Percival Operation Basics
Despite its very simple look, Percival is a complex system. In this chapter you will learn how to operate Percival from the command line. You will learn how to stop and start the system, which daemons (services) should be running, and what each of them does. You will also learn how to manage network elements using the konfne command, along with the basic concepts of the Percival configuration database. In this chapter we try to keep things as simple as possible; advanced concepts are handled in a separate chapter.
All Percival commands are located in /usr/local/percival/bin. In the next sections we assume that you are working as user avalon and that you have this directory in your PATH variable or have done cd there. We also assume that Percival was installed from RPM on a RedHat-like Linux distribution.
It is very important to operate Percival as user avalon and not as root! You can switch to avalon by typing su - avalon
3.1 Starting/Stopping Percival
As root execute
to start the system
to stop the system.
3.2 Percival Daemons (Services) and Commands
Several daemons are required for normal Percival operation. The daemons are responsible for data collection, detection of unresponsive IP addresses, report generation, the web user interface, and management of the Percival configuration by remote client applications such as Merlin. Every daemon except the web server must be managed with the overlord.pl command.
The Percival daemons are:
does all data collection.
detects not responding network devices.
generates all reports
provides configuration API to the external clients
Apache webserver. Needed for Percival Web interface.
Apache is controlled with the apachectl command, located in /usr/local/avalon/bin. Apache must be managed as user root.
Every other daemon is controlled with overlord.
3.2.1 Controlling Daemons with Overlord
Overlord was developed because the number of Percival daemons was increasing rapidly. It was clear that some master daemon was needed to rule them all. You need to know the following basic overlord commands:
tells status of each configured daemon
tells status of each configured daemon and restarts the daemon if it is not running
lists options for each configured daemon
overlord.pl tail <name>
shows last lines of the daemon log
shows all available options and short help
overlord.pl modify <name> <param=value> …
tweaks daemon options. If you installed from RPM, the system has very reasonable default values. You should not change them unless you really know what you are doing.
3.2.2 Common Command Line Options
Every Percival process, be it a command or a daemon, understands some common options. These are:
show help message and exit
show version information and exit
controls process output. Level can be either Debug, Info, Warn or Error. Default level is Info. Debug is the most verbose and must not be used in production.
controls where to send process output. By default all output goes to STDOUT
The following options are accepted by any process, but the process may ignore them. After all, it makes no sense to run the device configurator command (konfne) as a daemon.
if specified, the process becomes a daemon (service)
controls how often the daemon should do its work. For example, kollector is run every 120s
The next options are available in Lancelot only. In Percival they will generate an error message.
turns on the Perl-based in-memory cache. The cache is optimized to be very memory friendly and produces a nice speedup. This option is obsolete.
turns on the alternative implementation of the configuration database (HDB). The HDB database is optimized for speed and provides an order of magnitude (and in some cases even more) speedup in comparison with the Percival implementation. It is also fully backward compatible with the old database at the API level.
Percival only has four commands, and you already know everything you need about overlord. That leaves three others. One, target.pl, is used for debugging the configuration database and is out of the scope of this chapter. Another, konfne, is how you add, update or remove network devices from Percival. In upcoming chapters we will deal with it a lot. The last one, kompile, produces a binary database from the Percival text-based configuration database and downloads device configuration files from device modules. You need to issue this command every time you edit the configuration database by hand or if you suspect that the Percival database is corrupted.
3.3 Managing Devices
By the time you get to this section you don't want to read anything more about processes, commands, daemons and options. You want to add your new Cisco router to Percival NOW. This is how you do it:
konfne -af --ip <your router ip> --community <snmp community> cfg /Tree/Routers/dummyname
That's all. You will now see the device once you log in to Percival as user guest.
Percival configuration is based on the concept of the “device”. A device can be a router, computer, switch or other network element. In this case we call it a “real” device. There is another class of devices, such as reports, summary graphs (totals), profiles etc. These devices can be created based only on information in the database. We call such devices “virtual”.
3.3.1 Configuration Database Basics
Percival keeps its configuration information in a hierarchical (tree) database. The database is completely text based. This makes it very easy to back up, and you can modify it using standard Unix CLI tools. Each user has their own view of the database.
Percival converts the textual database into a binary format. Lancelot uses an alternative implementation of the database, called HDB, that works directly over text files. kompile converts the database into the binary form usable by Percival and konfne. konfne works on both the binary and the text database. Thus you don't need to worry about keeping database copies in sync with each other.
The database is located in the /usr/local/percival/etc/lancelot-config directory. A directory in the filesystem represents a directory in the database. However, one file can hold several database child nodes. Each database entry can have many properties in the form of attribute=value pairs. Node attributes can be inherited. A database node may have a reference (link) to another node. It works much like symbolic links in Unix or shortcuts in Windows.
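The tree model described above (attribute=value pairs, inheritance, links) can be illustrated with a small Python sketch. This is not Percival's implementation or file format: the Node class, its method names, and the assumption that a link inherits through the ancestors of its target are all hypothetical and serve only to show the idea.

```python
from __future__ import annotations


class Node:
    """One entry in a toy hierarchical configuration tree (hypothetical)."""

    def __init__(self, name: str, parent: Node | None = None, **attrs: str) -> None:
        self.name = name
        self.parent = parent
        self.attrs = dict(attrs)   # attribute=value pairs set on this node
        self.link = None           # optional reference to another node
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    def path(self) -> str:
        """Absolute path of this node, e.g. /Devices/Interfaces/eth0."""
        if self.parent is None:
            return "/"
        prefix = self.parent.path().rstrip("/")
        return f"{prefix}/{self.name}"

    def resolve(self) -> Node:
        """Follow links until a real node is reached, like a symlink."""
        node, seen = self, set()
        while node.link is not None:
            if id(node) in seen:
                raise ValueError(f"link loop at {node.path()}")
            seen.add(id(node))
            node = node.link
        return node

    def get(self, key: str) -> str | None:
        """Look up an attribute, falling back to ancestors (inheritance)."""
        node = self.resolve()
        while node is not None:
            if key in node.attrs:
                return node.attrs[key]
            node = node.parent
        return None


# Build a tiny tree: a global entry under /Devices and a link to it under /Tree.
root = Node("")
devices = Node("Devices", root, community="public")
interfaces = Node("Interfaces", devices)
eth0 = Node("eth0", interfaces, speed="100")
tree = Node("Tree", root)
shortcut = Node("eth0", tree)
shortcut.link = eth0

print(shortcut.path(), "->", shortcut.resolve().path())
print("speed:", shortcut.get("speed"))          # set on the link target
print("community:", shortcut.get("community"))  # inherited from /Devices
```

The sketch keeps one copy of the device settings under /Devices and lets any number of profile entries point at it, which matches the description of links as symbolic links or shortcuts.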
Percival and Lancelot come with several preconfigured database entries:
has system wide settings such as skin, location of rrd files and others
has default system profile. The only way to change administrator username or password is to edit this file.
contains settings for all system services. overlord.pl is the preferred way to manage it.
contains definitions of Percival users. Percival comes preconfigured with a guest account with the root at /Tree. konfne is the preferred way to manage users.
directory contains all currently configured network devices. It also contains instructions on how devices should be processed.
is root of preconfigured guest profile.
NOTE: remember to run kompile if you changed the database manually.
3.3.2 konfne Basics
konfne will be your main tool for managing Percival. When you configure new or already existing devices several things happen:
- the device might be SNMP scanned
- the device's global configuration is placed under /Devices according to element classes. For example, a router may have interfaces and a chassis. Interfaces are placed under /Devices/Interfaces/<routername>/ and the chassis configuration is placed under /Devices/Routers/. Some devices, such as reports, may not have a global configuration.
- If the global configuration already exists, it is updated.
- Links and other needed device elements are created in the specified path
konfne has several standard options:
shows all available devices
--autotype or -a
guess the device type automagically
--ip or -i
device IP address or hostname
--community or -c
device SNMP community. If not specified, defaults to public
--fetchname or -f
fetch the device name from sysName. Everything after the last / in the path is replaced by the fetched name
--recursively or -r
apply the command to all devices in the specified subtree. Usually used for automatic reconfiguration of already configured devices. For example, konfne -r /Tree will reconfigure the whole guest profile.
--tag or -t <attribute>=<value>
apply device-specific parameters. Each device may have specific configuration options.
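The path rewriting done by the fetchname option can be pictured with a few lines of Python. This is only an illustration of the rule stated above ("everything after the last / is replaced"), not code from konfne; the function name apply_fetched_name and the sample sysName value are made up.

```python
def apply_fetched_name(path: str, sys_name: str) -> str:
    """Replace everything after the last '/' in path with sys_name."""
    head, _sep, _old_name = path.rpartition("/")
    return f"{head}/{sys_name}"


# The placeholder name from the path is dropped in favour of the fetched one.
print(apply_fetched_name("/Tree/Routers/dummyname", "core-gw1"))
```

This is why the earlier example can use a throwaway name such as dummyname in the path when -f is given.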
konfne has the following basic commands:
shows help for the specified device or path. To get help for profile configuration you can do:
konfne help Devices::Virtual::Profile
konfne help /Tree
configures a new device or updates an already configured device
deletes the device configuration visible in the profile. The device is still collected.
deletes the device configuration visible in the profile and deletes the global device configuration. There is no concept of a device usage count, so if the device is configured in other profiles it will stop working there. Already collected data are not removed.
deletes the device configuration from the profile and from the database, and removes all collected data.
checks whether the device is responding to SNMP
3.3.3 Managing User Profiles
Percival supports the concept of a user. Each user must have a different profile. For example, you can have one profile with access to all of your routers, while a customer profile gives access only to a specific router interface. Profile creation is governed by several simple rules:
- Nested profiles are not allowed
- All devices, except folders, must be created under profile
- Profiles with the same root are not allowed
- Profile name must be unique
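Three of these rules (no nesting, no shared root, unique names) can be expressed as a small check. The Python sketch below is hypothetical: it assumes "nested" means that one profile root lies inside another, and the function name check_new_profile and the dictionary layout are inventions for illustration, not part of konfne.

```python
from __future__ import annotations


def check_new_profile(existing: dict[str, str], name: str, root: str) -> list[str]:
    """Return rule violations for adding profile `name` rooted at `root`.

    `existing` maps profile name -> profile root path.
    """
    problems = []
    if name in existing:
        problems.append(f"profile name {name!r} is already in use")
    new_root = root.rstrip("/")
    for other_name, other_root in existing.items():
        other = other_root.rstrip("/")
        if new_root == other:
            problems.append(f"root {root!r} is already used by {other_name!r}")
        elif new_root.startswith(other + "/") or other.startswith(new_root + "/"):
            problems.append(f"root {root!r} would nest with {other_name!r}")
    return problems


existing = {"guest": "/Tree"}  # the preconfigured guest profile
print(check_new_profile(existing, "foobar", "/MyProfiles/FoobarTree"))  # no violations
print(check_new_profile(existing, "customer", "/Tree/Customers"))       # nesting
```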
A profile has three basic parameters:
- Profile name. In this document it is also referred to as the user name.
- Profile password
- Profile root. The closest analogy to the profile root is the user home directory in Unix.
The device Devices::Virtual::Profile provides all necessary profile management.
A profile has the following device-specific options:
specifies the profile name
specifies the profile authentication mode. Only local mode, which is the default, is supported in Percival
if the option is present and equal to true, the profile user can use Merlin to manage the profile.
can be either true or false. If present and true, the graph legend is displayed under the graph in MRTG-like style.
the user of this profile can switch to another profile without performing authentication. This is mostly useful for large installations where you want to have a ‘master’ account.
Example of creating new profile foobar with the root at /MyProfiles/FoobarTree:
konfne --device Devices::Virtual::Profile -t profile=foobar -t password=secret -t 'alt-legend=true' cfg /MyProfiles/FoobarTree
3.3.4 Automatic Configuration of Network Devices
Percival can automatically detect the type of a network device and invoke the correct device module. Auto-detection works in many cases and is the easiest way to add new equipment. The downside of auto-detection is that you cannot pass device-specific options to konfne. Auto-detection will not work for virtual devices.
This is how you auto-detect a device:
konfne --autotype --ip <ip> --community <secret> cfg /Tree/Routers/myrouter
3.3.5 Standard Device Options
Every Percival device must support the following standard options:
if specified it overrides device name specified in the path. Unlike path it may have embedded HTML tags and spaces.
3.3.6 Configuring Generic MIB2 Device
Almost any SNMP-manageable equipment implements MIB2. Percival uses MIB2 to obtain network interface statistics. If there is no specific Percival device for your equipment, you can use Devices::Routers::Generic to obtain traffic statistics.
Options supported by Devices::Routers::Generic must be supported by any other device dealing with network interfaces. The following options are supported:
configure only named interfaces, that is, interfaces which have a description set in ifAlias.
if true, try to use 64-bit high-capacity counters (ifHCInOctets, ifHCOutOctets) for high-speed interfaces. The device checks whether the interface can really return these counters. In our experience there are a lot of problems with 64-bit counters on Cisco routers. Care must be taken when invoking this option.
64-bit counters can be used if the interface speed is greater than the specified threshold. Speed is given in megabits. The default value is 100M.
by default ifDescr is used to get interface names. Some devices may have identical ifDescr but different ifName. In this case this option should be set to true.
list of symbolic interface types that should be configured. An interface will not be configured if this option was specified and the interface type does not match.
only configure interfaces that match the given regexp.
do not remove an interface from the configuration if it is no longer present on the router. Instead the interface is marked as “frozen”. It will have the word frozen added to its description, and its default graphs will end at the time the interface was “frozen”. This feature is useful for keeping graphs of old lines.
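The reason for tying 64-bit counters to interface speed becomes clear from a little arithmetic: a 32-bit octet counter wraps after 2^32 octets, and a poller can only correct for a single wrap between two samples. The Python sketch below computes the wrap time at sustained line rate; the function name wrap_seconds is made up, and the 120-second figure is the polling interval mentioned elsewhere in this manual.

```python
POLL_INTERVAL = 120  # seconds, the kollector interval mentioned in this manual


def wrap_seconds(counter_bits: int, speed_mbps: float) -> float:
    """Seconds until an octet counter wraps at sustained line rate."""
    octets_per_second = speed_mbps * 1_000_000 / 8
    return 2 ** counter_bits / octets_per_second


for speed in (10, 100, 1000):
    seconds = wrap_seconds(32, speed)
    verdict = "safe" if seconds > POLL_INTERVAL else "can wrap more than once per poll"
    print(f"{speed:>5} Mbit/s: 32-bit counter wraps after {seconds:8.1f} s ({verdict})")
```

At 100 Mbit/s a 32-bit counter lasts roughly 344 seconds, comfortably longer than one polling interval, while at 1 Gbit/s it can wrap in about 34 seconds, so interfaces above the threshold need the 64-bit counters.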
3.3.7 Configuring CISCO Equipment
It is a well-known fact that the majority of network equipment is manufactured by Cisco. Percival and Lancelot have very good support for Cisco routers and switches, including advanced features such as SAA, Netflow and Quality of Service monitoring.
3.3.7.1 Cisco Routers
Cisco routers are configured with the Devices::Routers::Cisco device. The device has the following options:
normally PPTP sessions are ignored unless the value of this option is true.
normally interfaces with ifType ppp are not configured. This option accepts a regular expression. If the expression matches the interface name as given in ifDescr and the interface type is ppp, then the interface will be configured.
normally interfaces with the word “virtual” in ifDescr are skipped unless this option is true.
user name for logging in to the router. This is needed for configuring either BGP or pings.
password of the user specified with the previous option
configure pings from the Cisco router. The option accepts a comma-separated list of IPs or hostnames.
3.3.7.2 Cisco IOS Switches
CISCO IOS switches are configured with Devices::Switches::IOS. The device does not have any specific options.
3.3.7.3 Cisco Catalyst Switches
CISCO Catalyst switches are configured with Devices::Switches::Catalyst. There are no device specific options.
3.3.8 Configuring Linux
We support UCD-SNMP or NET-SNMP agents on Linux. We have encountered problems with the packaged SNMP agent on RedHat 7.3. You can download our build of NET-SNMP that fixes that problem from the Percival site on SourceForge.
Linux computers are configured with Devices::Computers::Linux. There are no device-specific options. The Linux device supports monitoring of CPU load average, memory and disk usage in addition to interface monitoring.
3.3.9 Configuring Windows 2000
Percival can configure Windows 2000 with the Host MIB or with the Compaq Insight Manager MIB. The correct MIB is auto-detected. Windows 2000 computers are configured with Devices::Computers::Win2000. The device supports monitoring of CPU, memory and disk usage.
The device has the following options:
gather service uptime statistics. Accepts a comma-separated list of services.
3.3.10 Configuring Windows NT
NT has very basic SNMP support. To get advanced statistics you must install SNMP4C from http://www.wtcs.org. Windows NT computers are configured with Devices::Computers::WinNT. There are no device-specific options.
3.3.11 Configuring Reports
Reports in Percival provide you with a high-level system summary. Using reports you will be able to quickly spot problems in your network and zoom in on the problem area to view detailed statistics. Reports are configured with Devices::Virtual::Report. The following options are supported:
report type. There are several built-in reports:
compares network interface utilization over a period of time. Utilization is computed as traffic/bandwidth, where the bandwidth value is taken from the ifSpeed of the interface. Results will not be valid if your interface speed is set wrong.
sorts interfaces by error count over a period of time. The presence of errors on an interface usually indicates hardware problems.
sorts interfaces by discarded packets over a period of time. Packets are discarded when the router queue is getting long. The presence of discards indicates routing problems or lack of bandwidth.
shows interfaces that are consistently utilized at over 70% of capacity.
inverse of the previous report.
how many results to show. Must be a positive number.
detailed report description. HTML tags may be used here.
how to process data. Can be either AVERAGE or MAX.
sort the report in either ascending or descending order. Can be either asc or desc.
range of the report in seconds.
specifies which devices to report on. Subtree specifications are absolute paths from the configuration root. Subtrees are given as a comma-separated list.
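The utilization report boils down to dividing measured traffic by ifSpeed and sorting the result. The Python sketch below shows that calculation under stated assumptions: the function names, the count and order parameters, and the sample interface data are hypothetical and only mirror the options described above; they are not Percival's own code or option names.

```python
def utilization(bits_per_second: float, if_speed: float) -> float:
    """Traffic divided by bandwidth (ifSpeed), as a fraction of capacity."""
    if if_speed <= 0:
        raise ValueError("ifSpeed must be positive")
    return bits_per_second / if_speed


def top_utilized(samples, count=10, order="desc"):
    """Rank interfaces by utilization.

    samples: iterable of (name, average_bits_per_second, if_speed) tuples.
    """
    ranked = sorted(
        ((name, utilization(rate, speed)) for name, rate, speed in samples),
        key=lambda item: item[1],
        reverse=(order == "desc"),
    )
    return ranked[:count]


# Made-up averages over the report range.
samples = [
    ("Serial0", 50e6, 100e6),
    ("Ethernet0", 9e6, 10e6),
    ("GigabitEthernet0/1", 1e6, 1e9),
]
for name, value in top_utilized(samples, count=2):
    print(f"{name}: {value:.0%}")
```

If ifSpeed is wrong on the device (for example a 10 Mbit/s link reported as 100 Mbit/s), the same traffic yields a utilization ten times too low, which is why the report is only valid when interface speeds are set correctly.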
3.3.12 Configuring Totals
Percival can combine several graphs into one. This is useful when you want to see the average utilization of several interfaces, your total international traffic, or all graphs on one page. The device Devices::Virtual::Total provides this functionality. It can be configured with the following options:
defines a long description of the total. HTML tags and spaces are allowed.
list of subtrees to search for report targets.
match target names against the given regexp. Must be used with the subtree tag.
comma-separated list of targets.
report type. Can be one of the following:
show small graphs for every interface or other target on one page.
show a graph that sums all information.
show a stacked graph for all interfaces
show a graph that averages all information.
Percival is smart enough to figure out how to aggregate information from different devices. The details of this process are out of the scope of this manual.
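For targets that share the same sampling interval, the sum and average totals amount to point-by-point arithmetic over the individual series. The following Python sketch shows only that arithmetic; the function name combine is made up, and it says nothing about how Percival itself matches or aligns data from different devices.

```python
def combine(series, how="sum"):
    """Combine equally sampled series point by point."""
    if not series:
        raise ValueError("at least one series is required")
    length = len(series[0])
    if any(len(points) != length for points in series):
        raise ValueError("all series must have the same number of samples")
    totals = [sum(points) for points in zip(*series)]
    if how == "sum":
        return totals
    if how == "average":
        return [total / len(series) for total in totals]
    raise ValueError(f"unknown aggregation: {how}")


link_a = [10, 20, 30]  # made-up traffic samples
link_b = [30, 20, 10]
print(combine([link_a, link_b], "sum"))
print(combine([link_a, link_b], "average"))
```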
Percival daemons are supposed to write their log files under the /usr/local/percival/var/lancelot-logs/ directory. While it is possible to change that using the --logfile option and overlord.pl, it is probably best to stick to the convention. Overlord has a rotate command that can be used to rotate logs periodically. In particular, the kollector log can grow quite large.
The log format is:
[dd-Mon-yyyy hh:mm:ss :pid] free text
Every kollector measurement is logged in the following format:
[time] Retrieved data for <path>(inst)[error] : <ds1>@timestamp[low_bound-upper_bound],<ds2>,…,<dsn>
time
log time stamp
path
location of the element in the configuration database
inst
element instance as determined by mapping. Can be empty.
error
provides a precision estimate based on the error in time measurement and the database sampling interval.
timestamp
in milliseconds. Actual time that goes to the database
low_bound
in fractional seconds. Shows when the measurement was started.
upper_bound
in fractional seconds. Shows when the measurement was completed.
ds1 … dsn
data sources. Things like ifInOctets, ifOutOctets etc.
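The general log line layout, [dd-Mon-yyyy hh:mm:ss :pid] free text, is easy to parse when you want to filter or summarize logs. The Python sketch below is one way to do it; the function name parse_log_line is made up, the sample line is the kompile output shown in the installation chapter, and month names are assumed to be English abbreviations (strptime's %b depends on the locale).

```python
import re
from datetime import datetime

# [dd-Mon-yyyy hh:mm:ss :pid] free text
LOG_LINE = re.compile(
    r"^\[(?P<stamp>\d{1,2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) :(?P<pid>\d+)\] (?P<text>.*)$"
)


def parse_log_line(line):
    """Return (datetime, pid, text), or None if the line does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    when = datetime.strptime(match["stamp"], "%d-%b-%Y %H:%M:%S")
    return when, int(match["pid"]), match["text"]


sample = "[10-Mar-2003 17:11:31 :8336] Processed 62 nodes (in 30 files) in 4 seconds."
print(parse_log_line(sample))
```

Lines that do not follow the layout return None, so the parser can be run over a whole log file without stopping at stray output.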
3.5 Backup Procedures
As a bare minimum you need to back up the configuration and data directories. They are located in /usr/local/percival/etc/lancelot-config and /usr/local/percival/var/lancelot-data. Since Percival keeps both configuration and data in files, there is no need for any special agent. Backup can be done with standard Unix tools like tar, cpio, or dump.
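Because a backup is just a copy of two directories, it can also be scripted. The Python sketch below uses the standard tarfile module to write a timestamped archive; the function name backup, the archive naming scheme and the destination directory in the usage comment are assumptions, not part of Percival.

```python
import tarfile
import time
from pathlib import Path

# Directories named in this section.
BACKUP_DIRS = [
    Path("/usr/local/percival/etc/lancelot-config"),
    Path("/usr/local/percival/var/lancelot-data"),
]


def backup(dirs, dest_dir):
    """Write a timestamped .tar.gz containing each directory in dirs."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"percival-backup-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for directory in dirs:
            # Store each tree under its own directory name inside the archive.
            tar.add(str(directory), arcname=Path(directory).name)
    return archive


# Usage (hypothetical destination): backup(BACKUP_DIRS, "/var/backups/percival")
```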
To perform a full restore:
- stop the system
- restore data and configuration
- use kompile to rebuild database
- start the system
3.6 Upgrading the system
Before you upgrade, make sure to back up your data. Upgrading from RPM is quite easy. First stop the system:
Remove installed RPM:
rpm -e percival
This will not remove your configuration and data. You will see some messages about directories not being empty. Install the new RPM:
rpm -ihv percival-1-1.x.i386.rpm
Rebuild the configuration database:
Restart the system: