Updated Jun 05, 2009 by jwzurawski
NPToolkit  

perfSONAR Network Performance Toolkit

Introduction

The perfSONAR Network Performance Toolkit (NPToolkit) is a customized version of a Knoppix Live-CD bootable disk. The software consists of an ISO disk image that can be used in one of two ways:

  • Burn the image to a disk (recommended)
  • Mount the ISO file system and run as a virtual machine (e.g. under VMware, QEMU, or Xen)

After performing either step, the result is a fully configured system capable of running and using many advanced network tools and methodologies to serve as either a testing point or a means to solve performance issues.

Background

The original idea for the NPToolkit came from the Network Performance Workshop series sponsored by Internet2. The old format of this workshop consisted of a day of tool installation (e.g. installing the Web100-enhanced kernel and other performance tools) followed by some time to test and use them. This particular format did not allow enough time to get to the primary reason many participants choose to sign up: solving network performance problems.

To lower the bar on this complex topic, this product offers a complete network performance suite pre-installed (and in many cases pre-configured) that is ready to be dropped directly into a network operations center (NOC) of any size. With the integration of perfSONAR tools, the results of both passive and active measurements can be shared in a federated environment with your peers enabling the early detection of performance issues.

Purpose

When serving as a measurement point, a use case adopted by Large Hadron Collider (LHC) affiliated Tier 2 sites in the United States, the disk becomes a part of a global registry of perfSONAR tools. Affiliated sites and other interested parties may perform any number of tests (on demand, or in some cases regularly scheduled) to the installation including:

  • Ping
  • Traceroute
  • OWAMP (One-way Ping)
  • BWCTL
    • Iperf
    • Thrulay
    • Nuttcp
  • NDT
  • NPAD

When used in the role of a network debugging infrastructure, the various tools can be used to methodically find, diagnose, and aid in the correction of network performance issues.

More Information

For information regarding upcoming releases or important notices concerning quality assurance and security we encourage subscription to our announcement mailing list:

This list is expected to be low volume and subscription is open. To get (or contribute) help within the Performance Node community we encourage subscription to our users mailing list:

Non-subscribing users may still post without being a member. This list will be of a higher volume than the announcement list, and should be used as the first stop for help regarding installation, configuration, and use of the Performance Node software. A final piece of information is the developer's issue tracker which can be used to find, track, and file bugs related to this software:

If your problem is not easily solved by members of the mailing list, we encourage the use of this tool to inform the developers of potential problems. Before filing a bug, it is customary to search open issues to be sure the report is not a duplicate. Also be sure to use the Component-NPToolkit tag or indicate somewhere in the report that the bug is related to this product.

Previous Releases

The original version of the perfSONAR Network Performance Toolkit was released in the summer of 2008, at the Joint Techs conference in Lincoln, Nebraska.

Beta 1

Beta 2

Release

Installation and Configuration

Please visit NPToolkitQuickStart for a complete set of instructions regarding installation and configuration.

Disk Contents

The main feature is a Web100-enhanced Linux 2.6.23-9 kernel that facilitates the use of several network tools without the need to configure and compile this advanced component. This disk features several performance tools pre-installed and configured to a working state:

  • BWCTL - Bandwidth Test Controller
  • Cacti - Network data polling and graphing
  • NDT - Network Diagnostic Tool
  • NPAD - Automatic diagnostic server for troubleshooting end-systems and last-mile network problems
  • OWAMP - One-way Ping
  • perfSONAR-PS Tools
    • Lookup Service - perfSONAR registration and discovery
    • PingER Measurement Archive and Measurement Point - Perform and archive latency measurements
    • perfSONAR-BUOY Measurement Archive and Collection Framework - Perform and archive bandwidth and one way latency measurements
    • SNMP Measurement Archive - Archive SNMP data
  • Thrulay - Network capacity tester

Additional supporting components include:

Performance Tools

The following performance tools are packaged on the perfSONAR Network Performance Toolkit. Each tool comes configured to function with default options, although customization is possible and encouraged. Certain tools have begun the steps necessary to participate in the perfSONAR framework by exporting data and utilizing the information services. We expect many more will follow in the future as adoption continues to grow.

BWCTL

BWCTL is a command line client application and a scheduling and policy daemon that wraps tools such as Iperf, thrulay, and nuttcp. Currently BWCTL wraps these tools by executing the respective command line program on the system. The bwctl client application works by contacting a daemon instance on each of the two test endpoint systems. BWCTL also works as a 3-party application: the client can arrange a test between two servers on two different systems. If the local system is intended to be one of the endpoints of the test, a local daemon is not required; BWCTL will detect that there is no local server and execute the required functionality directly. The daemon manages and schedules the resources of the host on which it runs.

The perfSONAR Network Performance Toolkit contains a release candidate of version 1.3 of BWCTL. The daemon is started by default with a wide open authentication and resource protection scheme. It is recommended that the deploying party review this to ensure it matches local security policies.

Cacti

Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides an easy-to-use interface for polling data sources, managing the data, and graphing the results.

The perfSONAR Network Performance Toolkit contains version 0.8.7b of Cacti. Cacti is fully installed and will start by default. It can be reached by visiting http://HOST_OR_ADDRESS/admin/cacti. The interested user can follow the instructions in Cacti Configuration to poll a local or remote SNMP-enabled network device to gather passive measurements. The SNMP Measurement Archive is configured by default to read the Cacti data and deliver it.

NDT

The Network Diagnostic Tool (NDT) is a client/server program that provides network configuration and performance testing to a user's desktop or laptop computer. The system is composed of a client program (command line or Java applet) and a pair of server programs (a web server and a testing/analysis engine). Both command line and web-based clients communicate with a Web100-enhanced server to perform these diagnostic functions. Multi-level results allow novice and expert users to view and understand the test results.

The perfSONAR Network Performance Toolkit contains version 3.4.4a of NDT. All installation, including integration of the Web100-enabled Linux kernel, is complete and operational as the disk comes online. Minor configuration is necessary, but is performed as a part of the general configuration steps described in Step 2 - Console Configuration.

NPAD

The NPAD diagnostic server, Pathdiag, is designed to easily and accurately diagnose problems in the last-mile network and end-systems that are the most common causes of all severe performance degradation over long end-to-end paths. The overall goal is to make the test procedures easy enough and the report it generates clear enough to be suitable for end-users who are not networking experts.

The perfSONAR Network Performance Toolkit contains version 1.5.4 of NPAD. All installation, including integration of the Web100-enabled Linux kernel, is complete and operational as the disk comes online. Minor configuration is necessary, but is performed as a part of the general configuration steps described in Step 2 - Console Configuration.

OWAMP

OWAMP is a command line client application and a policy daemon used to determine one way latencies between hosts. It is an implementation of the OWAMP protocol as defined by http://www.rfc-editor.org/rfc/rfc4656.txt.

With roundtrip-based measurements, it is hard to isolate the direction in which congestion is experienced. One-way measurements solve this problem and make the direction of congestion immediately apparent. Since traffic can be asymmetric at many sites that are primarily producers or consumers of data, this allows for more informative measurements. One-way measurements allow the user to better isolate the effects of specific parts of a network on the treatment of traffic.
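The difference between round-trip and one-way measurements can be illustrated with a few timestamps. The following is a minimal sketch, not OWAMP's implementation: the ProbeTimestamps class and its field names are hypothetical, and the arithmetic is only meaningful when both hosts keep synchronized clocks (which is why NTP matters for these tests).

```python
from dataclasses import dataclass


@dataclass
class ProbeTimestamps:
    """Timestamps (seconds) for one probe in each direction between hosts A and B."""
    a_sent: float      # A's clock when the forward probe left A
    b_received: float  # B's clock when the forward probe arrived at B
    b_sent: float      # B's clock when the reverse probe left B
    a_received: float  # A's clock when the reverse probe arrived at A


def one_way_delays(p):
    """Return (forward, reverse) delay; assumes both clocks are synchronized."""
    return p.b_received - p.a_sent, p.a_received - p.b_sent


def round_trip(p):
    """Sum of both directions; this hides which direction is slow."""
    forward, reverse = one_way_delays(p)
    return forward + reverse


# Made-up numbers: the reverse path is congested, the forward path is not.
probe = ProbeTimestamps(a_sent=0.000, b_received=0.010, b_sent=0.011, a_received=0.071)
fwd, rev = one_way_delays(probe)
print(f"forward {fwd * 1000:.1f} ms, reverse {rev * 1000:.1f} ms, "
      f"round trip {round_trip(probe) * 1000:.1f} ms")
```

A round-trip tool would report only the 70 ms total here, while the one-way figures show that nearly all of the delay is on the path from B back to A.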

The perfSONAR Network Performance Toolkit contains version 3.0c of OWAMP. The daemon is started by default with a wide open authentication and resource protection scheme. It is recommended that the deploying party review this to ensure it matches local security policies.

Thrulay

Thrulay is used to measure the capacity, delay, and other performance metrics of a network by sending a bulk TCP or UDP stream over it.

Special features of thrulay include:

  • For TCP, ability to measure round-trip delay along with throughput
  • For UDP, ability to measure
    • one-way delay, with quantiles
    • packet loss
    • packet duplication
    • reordering
  • For UDP, the ability to send precisely positioned true Poisson streams (microsecond errors in sending times)
  • Human- and machine-readable output (ready to be fed to gnuplot)
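To make the idea of a Poisson stream concrete: in such a stream the gaps between consecutive packets are independent and exponentially distributed. The sketch below only generates a schedule of send times under that definition; it is an illustration, not thrulay's code, and the function name and parameters are hypothetical.

```python
import random


def poisson_send_times(rate_pps, duration_s, seed=None):
    """Return send times (seconds) for a Poisson stream averaging rate_pps packets/second."""
    rng = random.Random(seed)
    times = []
    # Exponentially distributed gaps between packets define a Poisson process.
    t = rng.expovariate(rate_pps)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_pps)
    return times


schedule = poisson_send_times(rate_pps=100, duration_s=10, seed=1)
gaps = [b - a for a, b in zip(schedule, schedule[1:])]
print(f"{len(schedule)} packets, mean gap {sum(gaps) / len(gaps) * 1000:.2f} ms")
```

A real sender additionally has to hit each scheduled time precisely, which is where the microsecond-level accuracy mentioned above comes in.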

The perfSONAR Network Performance Toolkit contains version 0.9 of thrulay. Thrulay is offered as an add-on tester when used with BWCTL and the thrulay daemon is not started as a service by default.

perfSONAR PS Services

perfSONAR-PS is a set of independent software services that implement the perfSONAR protocols for network performance monitoring. perfSONAR-PS services are designed to be compatible with all other perfSONAR software that implements the perfSONAR protocols. perfSONAR-PS is able to federate between deployments, particularly those that span multiple domains, making the job of solving end-to-end performance problems on paths crossing several networks much easier to address.

The perfSONAR-PS services provide Web Services (WS) based interfaces into already deployed network monitoring infrastructures and act as an intermediate layer between the performance measurement tools and the diagnostic or visualization applications. The target audience for these services is Network Operation Centers (NOCs) at universities and regional networks; however, the services are broadly useful to the general public.

The perfSONAR-PS software suite is developed entirely in the Perl programming language, taking full advantage of numerous language features and benefits including the Comprehensive Perl Archive Network (CPAN) distribution system. This software manager makes perfSONAR-PS the ideal choice for integration into typical NOC environments.

The following perfSONAR-PS services are included on the Network Performance Toolkit. Each version is a pre-release of the upcoming 0.10 series.

Lookup Service

The perfSONAR-PS Lookup Service (LS) addresses the always challenging problem of resource registration and discovery for the perfSONAR framework. Service instances that manage datasets are only useful when they can be contacted by consumers. Consumers can only function when there is data available. To manage these problems in a dynamic environment such as perfSONAR, it is necessary to register, maintain, and query for the services that may contain interesting data.

The advent of the perfSONAR Global Lookup Service (gLS) now delivers a worldwide view of all available perfSONAR and selected performance tools. Each LS instance is able to take an inventory of locally registered resources and share these with a well-established infrastructure of globally deployed gLS services. Through the use of established APIs, service and client applications alike can gain access to this powerful information source.

The perfSONAR Network Performance Toolkit contains a version of this software that starts up on disk boot. Additional configuration is provided via Step 2 - Console Configuration.

PingER Measurement Archive and Measurement Point

The PingER service is an evolution of the PingER project, which has more than 10 years of experience in collecting and analyzing network performance across the world. The perfSONAR-PS PingER service is composed of both a storage backend (Measurement Archive) and a measurement frontend (Measurement Point) to conduct and store ping measurements and to make the data available for consumption by interested parties. Network characteristics supported include availability, latency, and jitter, which together give a broad picture of end-to-end network performance.
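As a rough illustration of these three characteristics, the sketch below summarizes a series of ping round-trip times. It is not PingER's code: the function name is hypothetical, and jitter is taken here to be the mean absolute change between consecutive replies, which is only one of several definitions in use.

```python
from statistics import mean


def summarize_pings(rtts_ms):
    """Summarize ping results; None marks a probe that got no reply."""
    replies = [r for r in rtts_ms if r is not None]
    availability = len(replies) / len(rtts_ms) if rtts_ms else 0.0
    latency = mean(replies) if replies else None
    # Assumed definition: mean absolute change between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(replies, replies[1:])]
    jitter = mean(diffs) if diffs else None
    return {"availability": availability, "latency_ms": latency, "jitter_ms": jitter}


# Made-up sample: five probes, one of which was lost.
sample = [20.0, 22.0, None, 21.0, 25.0]
summary = summarize_pings(sample)
print(summary)
```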

The perfSONAR Network Performance Toolkit contains a version of this software that starts up on disk boot. Additional configuration is provided via Step 2 - Console Configuration. A default set of ping-able hosts is included on the installation; additional configuration can be performed via PingER Configuration.

perfSONAR BUOY Measurement Archive

The perfSONAR-BUOY Measurement Archive service (pSB MA) exposes active measurement data, making the results available through perfSONAR web services interfaces. All performance tests are performed via the BWCTL tool (throughput measurements) or the OWAMP tool (one way delay) and currently consist of regularly-scheduled tests to a configurable list of source and destination hosts.

Data collected is stored in a MySQL database. Active measurements between known hosts are particularly valuable when assuring connectivity, availability, and quality of the network. perfSONAR-BUOY allows for the easy configuration of a "mesh" of tests to hosts equipped with BWCTL or OWAMP. Based on a configurable schedule, the tests will be conducted autonomously and stored for consumption via the perfSONAR-BUOY interface as well as presentation through included web scripts. Using the same XML protocols as the other perfSONAR services, perfSONAR-BUOY provides uncomplicated access methods to retrieve the data in an unambiguous manner, thus eliminating the mystery associated with the backend storage.

The perfSONAR Network Performance Toolkit contains a version of this software that starts up on disk boot. Additional configuration is provided via the Console Configuration step. This software requires configuration via perfSONAR-BUOY Configuration before it can be used.

SNMP Measurement Archive

The perfSONAR-PS SNMP-based Measurement Archive (SNMP MA) exposes data collected via Simple Network Management Protocol (SNMP) variables found on networked devices and stored in Round Robin Database (RRD) archives. The measurements are collected through external means, normally through software such as MRTG, Cacti, or Cricket, and are commonly stored in RRD archives. A common way to diagnose network problems is to examine commonly requested data items (such as interface utilization, errors, and discards), but gaining external access to these items remains challenging due to physical and political boundaries. The purpose of the SNMP MA is to deliver these items transparently and securely.
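For readers unfamiliar with how a figure such as interface utilization is derived from SNMP data, the sketch below computes it from two readings of an octet counter (for example ifInOctets). It illustrates the arithmetic only and is not code from the SNMP MA or Cacti; the function name and sample values are hypothetical.

```python
def interface_utilization(octets_start, octets_end, interval_s, speed_bps, counter_bits=32):
    """Percent utilization between two readings of an SNMP octet counter."""
    delta = octets_end - octets_start
    if delta < 0:
        # The counter wrapped past its maximum between the two polls.
        delta += 2 ** counter_bits
    return delta * 8 / (interval_s * speed_bps) * 100


# Made-up readings taken 300 seconds apart on a 100 Mbit/s interface.
print(interface_utilization(1_000_000, 38_500_000, interval_s=300, speed_bps=100_000_000))
```

The wrap-around branch matters on busy links, where a 32-bit counter can roll over between polls.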

The perfSONAR Network Performance Toolkit contains a version of this software that starts up on disk boot. Additional configuration is provided via Step 2 - Console Configuration. This software requires configuration via Cacti Configuration before it can be used.

Supporting Software

The following extra software packages are installed on the perfSONAR Network Performance Toolkit to support the performance tools. Each package named below is configured by default to function when the disk starts, but interested parties may secure the disk further through additional configuration (beyond the scope of this document).

Apache2

Apache web server, version 2, is featured on this disk for use with the various GUIs and administrative tools. Apache has been configured with several major modules (including PHP, Perl, and MySQL) and matches the default upstream Knoppix version. By default all traffic is passed over https (i.e. SSL), and traffic sent to http is redirected.

Users may control apache via the /etc/init.d/apache2 interface.

K Desktop Environment

The K Desktop Environment (KDE) is included as the default XWindows environment. Many applications are included by default, including web browsers and graphical interfaces to certain configuration tools. Users who are more comfortable with this environment than with a console should run startx to enable XWindows.

MySQL

MySQL, an open source relational database, is the default method for storing data collected via PingER, perfSONAR-BUOY, and Cacti.

By default MySQL is configured to listen only on the local interface and does not feature a default root password. Users concerned with security are encouraged to set one immediately. All user accounts are protected by passwords.

NTP

NTP is a protocol designed to synchronize the clocks of computers over a network. NTP version 3 is an internet draft standard, formalized in RFC 1305. NTP version 4 is a significant revision of the NTP standard, and is the current development version, but has not been formalized in an RFC. Simple NTP (SNTP) version 4 is described in RFC 2030.

The perfSONAR Network Performance Toolkit comes pre-loaded with several NTP servers, and users may adjust which servers their specific disk uses in Step 7 - NTP. Note that a sane configuration of NTP consists of 4 to 5 servers geographically close to the deployment location.
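A quick way to check a configuration against this guideline is to count the "server" lines in ntp.conf-style text. The following is a minimal sketch; the helper names are hypothetical and the host names are placeholders, not the servers shipped on the disk.

```python
def ntp_servers(conf_text):
    """Return the hosts named on 'server' lines of ntp.conf-style text."""
    servers = []
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) >= 2 and fields[0] == "server":
            servers.append(fields[1])
    return servers


def enough_servers(conf_text, minimum=4):
    """True when the configuration names at least `minimum` servers."""
    return len(ntp_servers(conf_text)) >= minimum


# Placeholder hosts, for illustration only.
example = """\
# time sources
server ntp1.example.org iburst
server ntp2.example.org iburst
server ntp3.example.org
"""
print(ntp_servers(example), enough_servers(example))
```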

Oracle DB XML

Oracle Berkeley DB XML is an open source, embeddable XML database with XQuery-based access to documents stored in containers and indexed based on their content. Oracle Berkeley DB XML is built on top of Oracle Berkeley DB and inherits its rich features and attributes. Like Oracle Berkeley DB, it runs in process with the application with no need for human administration. Oracle Berkeley DB XML adds a document parser, XML indexer and XQuery engine on top of Oracle Berkeley DB to enable the fastest, most efficient retrieval of data.

Version 2.3.11 of this software comes pre-loaded, but does not need to be started (it functions more as a library than an executable).

FAQ

  • Q: How do I run the NPAD system?
  • A: NPAD (Network Path and Application Diagnosis) is a client/server program developed by the network research group at the Pittsburgh Supercomputing Center (PSC). At boot time, the NPToolkit starts the NPAD server process and leaves it listening on TCP port 8200. To use this server, a user starts a Java-enabled web browser and points it at the NPToolkit server (http://HOST:8200). The server automatically downloads a Java applet to the client. Then the user runs a test to begin the diagnostic process. Once the test has been completed, the server displays a results page on the client's browser. The user may examine these results and follow the recommendations to resolve problems. If the user is unable to repair a reported problem, the results page URL can be emailed to the appropriate system administrator or NOC operator. The server retains a complete record of the test results and the raw data used to derive these results. This allows post-processing of interesting results to determine what went wrong and to improve the reporting capabilities of the NPAD server.
  • Q: How do I use BWCTL?
  • A: BWCTL (Bandwidth Test Controller) is a client/server program developed to simplify Iperf, thrulay, and nuttcp testing between hosts. At boot time, the NPToolkit starts a BWCTL server process and leaves it listening on TCP port 4823. This server may then be accessed by remote BWCTL clients. Additionally, the disk contains BWCTL client applications that can be used to test to remote instances. The BWCTL server allows TCP tests with a maximum duration of 60 seconds. To run a test to a remote BWCTL server:
    1. Log on to the NPToolkit server using the knoppix or other valid userid
    2. Identify the remote server
    3. Run the bwctl -s remote-bwctl-server command to stream data for 10 seconds between the local instance and the remote BWCTL server. Results are displayed on the console or terminal window.
  • Q: How do I run the NDT system?
  • A: NDT (Network Diagnostic Tool) is a client/server program developed to simplify testing to desktop/laptop computers. At boot time, the NPToolkit starts a pair of NDT server processes and leaves them listening on TCP ports 7123 and 3001. To use this server, a client starts a Java-enabled web browser and points it at the NPToolkit server (http://HOST:7123). The server automatically downloads a Java applet to the client. The end-user can run a test to begin the diagnostic process. Once the test has been completed, the server displays a results page on the client's browser. The end-user may examine these results and follow the recommendations to resolve problems. If the end-user is unable to repair a reported problem, the user may click the Report Problems button to generate an email that will be addressed to the appropriate NPToolkit administrator. The server retains a record of the test results to allow the post-processing of interesting results to determine what went wrong and to improve the reporting capabilities of the NDT server.
  • Q: What is NTP?
  • A: NTP (Network Time Protocol) synchronizes a computer's clock to a global time source. An accurate clock is essential for running many of the measurement tests, including BWCTL and OWAMP. The NTP daemon must connect to several remote time servers, at least four (4), to accurately set the local clock. By default the NPToolkit server will synchronize to both Internet2 and public time sources. See Step 7 - NTP for information regarding changing the default time sources.
  • Q: How do I use OWAMP?
  • A: OWAMP (One-Way Ping) is a client/server program that was developed to provide delay and jitter measurements between two target computers. At boot time, the NPToolkit starts an OWAMP server process and leaves it listening on TCP port 861. This server may then be used by remote clients. Additionally, the disk contains OWAMP client applications that can be used to test to remote instances (including a Java client and a console-based application). By default, the OWAMP server sends a low-level data stream in each direction and measures the one-way delay and jitter between the two hosts. Separate measurements, one for each direction, are reported to the user at the end of the test.
    • To run a test to a remote OWAMP server:
      1. Log on to the NPToolkit server using the knoppix or other valid userid.
      2. Identify the remote server.
      3. Run the owping remote-owamp-server command to make a pair of 10 second delay measurements (one in each direction) between the remote OWAMP server and the local instance. Results are displayed on the console or terminal window.
  • Q: I get an error similar to DBD::mysql::db selectrow_array failed: Table 'pingerMA.data_XXXXXX' doesn't exist at /some/path/to/Base.pm line XXX.
  • A: Ignore this. Depending on OS scheduling, PingER might try to use a table before creation. It should only happen occasionally.
  • Q: Can I Use a Firewall?
  • A:
    • Yes. To enable one, first add all the desired rules to the firewall, then run the command "/etc/init.d/iptables save". The firewall should then come up automatically on the next boot. Note that there are some caveats to enabling a firewall, namely the number of holes that must be opened for the measurement tools included on the disk:
      • SNMP MA
        • open port tcp/8065
      • PingER
        • open port tcp/8075
      • perfSONAR-BUOY
        • open port tcp/8085
      • Lookup Service
        • open port tcp/8095
      • bwctl
        • open port tcp/4823
        • Edit /usr/local/etc/bwctld.conf, set peer_port to a value, open the tcp port for that value
        • Edit /usr/local/etc/bwctld.conf, set iperf_port, thrulay_port and nuttcp_port to a specific range, and open the tcp/udp ports for those ranges.
      • owamp
        • open port tcp/861
        • Edit /usr/local/etc/owampd.conf, set testports to range, open the udp ports for that range
      • NDT
        • open port tcp/3001
        • open port tcp/3002
        • open port tcp/3003
        • open port tcp/7123
      • NPAD
        • open port tcp/8100
        • open port tcp/8200
      • Apache HTTP Server
        • open port tcp/80
        • open port tcp/443
      • SSH
        • open port tcp/22
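The fixed TCP ports in the list above can be turned into firewall rules mechanically. The sketch below prints one iptables command per port; it assumes inbound traffic is filtered on the INPUT chain and that appending ACCEPT rules suits the local rule order, neither of which is stated in this document. The configurable BWCTL and OWAMP test ranges are left out because they are site-specific.

```python
# Ports copied from the list above; configurable test-port ranges are omitted.
SERVICE_PORTS = {
    "SNMP MA": [8065],
    "PingER": [8075],
    "perfSONAR-BUOY": [8085],
    "Lookup Service": [8095],
    "bwctl": [4823],
    "owamp": [861],
    "NDT": [3001, 3002, 3003, 7123],
    "NPAD": [8100, 8200],
    "Apache HTTP Server": [80, 443],
    "SSH": [22],
}


def iptables_rules(service_ports):
    """Build one 'accept inbound TCP' command per port, in listing order."""
    rules = []
    for ports in service_ports.values():
        for port in ports:
            rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    return rules


rules = iptables_rules(SERVICE_PORTS)
for rule in rules:
    print(rule)
```

The printed commands would be reviewed and run by an administrator, followed by the "/etc/init.d/iptables save" step described above.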
  • Q: How many ports will BWCTL need to operate effectively behind a firewall?
  • A:
    • BWCTL has several settings defined in the bwctld.conf file that dictate which ports it may use for testing. It is recommended that several ports be specified to allow for multiple test opportunities for the BWCTL scheduler, otherwise tests will not be completed in a timely fashion. It is also important not to overlap these ports to prevent contention from the tools. The configuration options are:
      • iperf_port - Port range (e.g. 5001-5005) to run the iperf receiver.
      • nuttcp_port - Port range (e.g. 5006-5010) to run the nuttcp receiver.
      • thrulay_port - Port range (e.g. 5011-5015) to run the thrulay receiver.
      • peer_port - Port range (e.g. 10100-10200) to run the server processes of the above tests.
    • Note that some simple calculations can be used to determine how many ports are expected for a full schedule of tests. For example, assuming a completely packed schedule of 10-second tests, a maximum of 6 tests can run per minute. BWCTL cycles through the port range one-by-one, assigning each scheduled test the next open port in line. The iperf/nuttcp/thrulay applications may leave sockets open for a while after the test has finished. If a test tries to use the same port as was used by a previous test, and that test's socket is still open, the newer test will fail. By default, Linux allows these sockets to stay open for up to a minute. So, to ensure that no tests fail, the minimum number of ports for 6 tests is 7. However, due to environmental factors, it is best to use a higher number. Since port ranges are reasonably easy to open, it is recommended that the final number be double the minimum: 14 ports.
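The arithmetic in the answer above can be written out as a short calculation. This is a sketch of the reasoning only; the function names and the safety_factor parameter are hypothetical, and the defaults mirror the 10-second tests and one-minute socket linger time quoted above.

```python
import math


def bwctl_port_count(test_seconds=10, linger_seconds=60, safety_factor=2):
    """Return (minimum, recommended) port counts for a fully packed test schedule."""
    # Tests that can start while an earlier test's socket may still be open.
    tests_in_linger_window = math.ceil(linger_seconds / test_seconds)
    minimum = tests_in_linger_window + 1  # one extra port for the next test in line
    return minimum, minimum * safety_factor


def port_range(first_port, count):
    """Format a range in the 'low-high' style used in the examples above."""
    return f"{first_port}-{first_port + count - 1}"


minimum, recommended = bwctl_port_count()
print(minimum, recommended, port_range(5001, recommended))
```

With the defaults this gives 7 and 14, and a 14-port range starting at 5001 would be written 5001-5014. When widening one tool's range, remember that the ranges for the other tools must not overlap it.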
  • Q: I'd like to PXE boot the NPToolkit. Is that possible?
  • A: Not currently, but this is a consideration of future releases.
  • Q: When I boot, I get the following error: Can't find knoppix file system, sorry. Dropping you to a very limited shell ....
  • A: This can be attributed to a bad CD burn or a bad ISO image. Check the MD5 sum of the ISO and match this to the posted MD5 value.
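One way to perform the MD5 check described above is sketched below using Python's standard hashlib module. The function names are hypothetical; the path to the downloaded ISO and the posted MD5 value have to be supplied by the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_posted_md5(path, posted_md5):
    """Compare against the posted value, ignoring case and surrounding whitespace."""
    return md5_of_file(path) == posted_md5.strip().lower()
```

Calling matches_posted_md5 with the path of the downloaded image and the posted checksum returns False when the download is corrupt. A matching checksum with a disk that still fails to boot points to a bad burn instead.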
  • Q: What should I enter for the Communities of interest configuration question?
  • A:
    • This question can be confusing to answer for new users. The goal is to associate some loosely coupled labels to the data that the perfSONAR NPToolkit disk will be making available to the larger world. Think of this step similar to assigning labels to photos or music. Some examples of valid answers are:
      • Internet2 - The data made available somehow connects the Internet2 backbone
      • LHC (CMS, ATLAS, etc.) - The system is part of the LHC perfSONAR infrastructure.
        • The USATLAS community has requested that peer sites use the following as the Communities of Interest string: LHC USATLAS
      • eVLBI - The disk is a part of the larger telescope community
      • MAX - A connector or member of the MAX gigapop
    • Use as many community names as necessary to properly categorize the data from the installation.
  • Q: How do I partition a new disk or re-partition an old disk?
  • A: Please see NPToolkitDiskFormatting for a primer on disk formatting and re-formatting using the tools on the NPToolkit.
  • Q: I see an expired certificate error when I visit the HTTPS pages, what should I do?
  • A: Older versions of the NPToolkit contained an expired, self-signed certificate that was originally created by Knoppix. We have updated this cert in new releases; on older releases, the following steps can be taken to address the problem:
    • Instruct browsers viewing the disk to accept the cert, regardless of its status.
    • Update the cert by hand:
    •      openssl req -new -x509 -days 365 -nodes -out /etc/apache2/apache.pem -keyout /etc/apache2/apache.pem


