Fabric Management System for Myri-10G and Myrinet-2000 Myrinet Networks

Introduction

The Fabric Management System (FMS) is a collection of tools and processes used to manage a Myri-10G or Myrinet-2000 Myrinet network. This system relies on a database formed by a collection of files which describe how the network is connected. Since there is a description of how the network is supposed to look, it is possible to report discrepancies between the observed network state and the desired network state. For example, without this description, a failed switch-to-switch link could be routed around, but could not easily be reported as missing, since there would be no way of knowing the link is supposed to be there.

The Fabric Management System is an important diagnostic tool for verifying the health of the Myrinet hardware. It also supersedes Mute.

The two primary processes in the Fabric Management System are the fm_server (Fabric Management Server) and the fma (Fabric Management Agent).

fm_server runs on one machine and serves as a focal point for all management activity on the fabric. All errors are reported to fm_server, and are available for viewing through a variety of means.

There is one fma process on each Myrinet node. This process replaces the previous mapper (mx_mapper or gm_mapper) and expands its functionality. Any errors noticed by the fma process are reported to the fms and are then made available to system operators.

There are two ways to use FMS:

  1. fmas in standalone mode (i.e., without the fm_server process). The fma is the default mapper in MX and can be used as a replacement for the gm_mapper in GM. Ethernet connectivity between the nodes in the cluster is not required for this fma-only mode.
  2. fmas and fm_server for diagnostics. Ethernet connectivity is currently required for the fm_server. An upcoming release of FMS will remove that restriction.

FMS relies on a fabric description database, consisting of a collection of text files as described below.

FMS Database

The FMS database consists of several files in a directory, all of which are easily human-readable and editable. These files are read and written by the fm_server process, and are located in the directory $FMS_RUN/database. The fm_server process must have write access to this directory, whereas the fma processes only require read access. By default, the environment variable FMS_RUN is set to /var/run/fms/, but it can be overridden by the customer.

When error conditions are detected in the fabric, alerts are generated by the fm_server process. These can be queried on a regular basis, or the fm_server can be configured to proactively report alerts through a user-defined mechanism.

Error Reporting by FMS

Errors detected by or reported to the FMS process are reported to system administrators through "alerts", which are discussed further in Appendix B below. Alerts can be queried remotely via the command fm_show_alerts, via web CGI scripts, or via log file monitoring. Alternatively, the FMS can be configured to run a user-specified command whenever an alert occurs.

The FMS can detect exceptional conditions, either errors or warnings, by monitoring the switch enclosures and through communication with the fma processes running on each node.

Errors Detected Directly By FMS

The FMS process periodically polls the switch enclosures to monitor link status and environmental conditions. The most common problems reported by the switches are down links, noisy links, and overly high operating temperatures.

The FMS monitors the connection status of all fma processes in the system. The absence of an fma connection from a host which is expected to be present generates an alert, as does the loss of connectivity to any fma.

Switch enclosures report the up/down status of each link. The FMS compares this with expected link status and generates an alert if appropriate. If a link transitions from up to down too many times in a given time period, the link is marked as "flaky" and an alert is generated.
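
As a rough illustration of the flaky-link rule above, the check amounts to counting up-to-down transitions inside a sliding time window. The sketch below is not the fm_server implementation: the function name is_flaky and the example window and limit are assumptions made for illustration, and the real thresholds are whatever fm_settings controls.

```shell
# is_flaky WINDOW_SECS MAX_TRANSITIONS   (hypothetical helper)
# Reads one up-to-down transition timestamp per line on stdin (seconds,
# ascending order). Prints "flaky" if more than MAX_TRANSITIONS fall inside
# any span shorter than WINDOW_SECS, otherwise prints "ok".
is_flaky() {
    awk -v window="$1" -v max="$2" '
        {
            t[NR] = $1
            # Drop transitions that are WINDOW_SECS or more older than this one.
            while (t[NR] - t[left + 1] >= window) left++
            # NR - left is the number of transitions still inside the window.
            if (NR - left > max) flaky = 1
        }
        END { print (flaky ? "flaky" : "ok") }
    '
}
```

For example, feeding it four transitions ten seconds apart with a 60-second window and a limit of 3 reports the link as flaky, while the same four transitions spaced 100 seconds apart do not.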

The badcrc counters on each link are monitored, and too many badcrcs within a given time period generate an alert.

If the temperature reported by a linecard exceeds a set threshold, an alert is generated. This threshold defaults to a value that is below the shutdown temperature of the linecard, but is higher than should be seen in practice.

The thresholds for all of these alerts can be controlled via the "fm_settings" command.

Errors Detected By FMA

The fma continuously monitors the NICs in a host for several conditions. A CRC error rate which exceeds a set threshold will generate an alert, as will SRAM parity errors in the NIC.

The fabric is continually verified by the fma processes, and any change in fabric topology is reported to the FMS. Depending on the source of the change, this may result in an alert being generated.


Installation

The Fabric Management System (FMS) for Myrinet networks may be used with either the MX or GM firmware.

Requirements

In order to install FMS, the following requirements must be met:

NOTE: FMS is not yet supported on 10G-SW32HSSM-16QP switches.

Installation Instructions

The FMS distribution is available in the MX distribution (MX-1.1 and later). If you would like to use FMS with GM-2 or GM-1, FMS is also available as a separate FMS tarball. We will soon integrate FMS into the GM-2 distribution.

Installing FMS within the MX distribution

As of MX-1.1, the FMS is integrated into the MX install package. To build and use FMS with MX, you need to do the following:

  1. Select a host to be the fm_server. This host does not need to be a Myrinet node, but it must have IP connectivity to all of the Myrinet compute nodes in the cluster and to the switch(es). It also needs access to the MX installation directory, <install_path>. Do not choose one of the Myrinet compute nodes to be the fm_server. Ideally, the fm_server process will run on the headnode of the cluster.

  2. Install and load MX (specifying the --with-fms-server=<fm_server> option at configure time) on the compute nodes, as instructed on the MX-2G Download page or MX-10G Download page.

  3. On the host chosen to be the fm_server, create a writeable directory for maintaining the FMS log and fabric database. The default directory location is /var/run/fms/.

       $ mkdir -p /var/run/fms/
    
  4. Define the Myricom switch enclosures with fm_switch.

       $ <install_path>/bin/fm_switch -a <switch_name>
    

    Repeat for each enclosure until all switches have been added.

  5. Start the program fm_server with the flag -d to force the process to run in the background as a daemon.

       $ <install_path>/bin/fm_server -d
    

    The fm_server process will print the name of its log file for confirmation.

    Note: If the host chosen to be the fm_server does not have read access to the MX installation directory, <install_path>, you will also need to configure and compile (but not load) MX on the FMS server in order to create the fms executable.

    The default directory for the FMS database is /var/run/fms/database/.

    The default location for the FMS log is /var/run/fms/fms.log.

    The default location for the FMA log (on each of the compute nodes) is /var/run/fms/fma.log.

    The default location for the FMS tools is <install_path>/bin/.
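
The server-side steps 3 to 5 above can be collected into a small wrapper. This is only a sketch: the function names fms_run and fms_server_setup and the DRY_RUN variable are hypothetical, and by default the wrapper prints the commands rather than running them. The commands themselves use only the options shown above (fm_switch -a and fm_server -d).

```shell
# Print the command when DRY_RUN is 1 (the default here), otherwise run it.
fms_run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fms_server_setup INSTALL_PATH SWITCH_NAME...   (hypothetical helper)
# Creates the run directory, registers each switch enclosure, and starts
# fm_server as a daemon, in the order given in steps 3 to 5.
fms_server_setup() {
    install_path=$1
    shift
    fms_run mkdir -p /var/run/fms/
    for switch_name in "$@"; do
        fms_run "$install_path/bin/fm_switch" -a "$switch_name"
    done
    fms_run "$install_path/bin/fm_server" -d
}
```

Calling fms_server_setup with the MX install path followed by the enclosure names prints the command sequence for review. Setting DRY_RUN=0 executes it instead.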

Installing FMS from the FMS tarball

  1. Download FMS.
  2. Unpack, build, and install the distribution:

     $ tar zxf fms.tar.gz
     $ cd fms
     $ autoconf
     $ ./configure
     $ make
     $ make install
    

    Note: A patch for FMS is required for interoperability with GM-1. This patch has only been tested with GM-1.6.5. Once the patch is applied, the code will only work on GM-1 (not GM-2 or MX). You should delete your old FMS database before starting up the new fms and fmas.

    By default, FMS assumes that MX is the low-level firmware and that MX is installed in the directory /opt/mx. (If the GM firmware is used, the default installation directory for GM is /opt/gm.) The FMS package is installed in the default directory, /opt/fms.

    If you would like to specify a different FMS installation directory, you need to pass the --prefix option to configure. E.g.,

     $ ./configure --prefix=<other_dir>
    

    where <other_dir> specifies the alternate directory for installation.

    To use GM instead of MX, or to specify an alternate install directory for GM or MX, pass the following option(s) to configure:

     $ ./configure --with-myri-api=gm --with-myri-install-dir=<myri_install_dir>
    

    where <myri_install_dir> specifies the binary/library installation directory for MX or GM.

  3. If FMS is installed in the default location, add /opt/fms/bin to your $PATH. E.g.,

     $ export PATH=/opt/fms/bin:$PATH
    

    Otherwise, add <other_dir>/bin to your PATH.

    Note: If you do not use the default directory (/opt/fms/) for the FMS installation, you will need to set and export the environment variable FMS_INSTALL. This will allow all of the FMS tools to find the directory automatically.

  4. Create a writeable directory for maintaining the FMS fabric database. The default directory location is /var/run/fms.

     $ mkdir -p /var/run/fms
    

    Note: If you do not use the default directory (/var/run/fms) for the FMS fabric database, you will need to set and export the environment variable FMS_RUN, specifying the location for this FMS fabric description directory.

  5. Define the Myricom switch enclosures by using the fm_switch command:

     $ fm_switch -a <switch_name>
    

    where <switch_name> is the DNS name or IP address for the monitoring line card within each switch enclosure in the Myrinet fabric. Repeat for each enclosure until all are added. To view a list of the enclosures currently defined, type:

     $ fm_switch
    

    If you need to remove a switch from the list, use the -d option:

     $ fm_switch -d <switch_name>
    
  6. Start the fm_server process with the flag -d on the machine on which you installed FMS. The -d flag causes it to run in the background as a daemon. E.g.,

     $ fm_server -d
    

    The fm_server should have read access to the installation directory FMS_INSTALL and the directory in which MX or GM is installed, and also needs write access to the run directory FMS_RUN.

    Note: The machine on which the fm_server process is installed does not need to contain a Myrinet NIC, but it must have IP access to the Myrinet switches and all compute nodes in the Myrinet fabric. Since FMS has a socket connection to each fma and also to each switch enclosure, make sure the system running fms allows enough open file descriptors for every node in your fabric, plus every enclosure, plus another fifty or more to be safe. (Sometimes a disconnected fma may reconnect before the OS realizes the previously used socket is now closed.)

  7. Stop the existing MX or GM mapper process on all Myrinet nodes in the fabric.

     $ <install_path>/sbin/mx_stop_mapper
    

    or

     $ killall gm_mapper
    

    where <install_path> is the MX install directory.

  8. Start the FMA agent process on each Myrinet node in the fabric.

     $ fma -d -s <fms-server>
    

    where <fms-server> is the hostname of the host running the FMS server process. Each compute node must have read access to the FMS_INSTALL directory and the directory where MX or GM is installed, but does not need write access to the FMS_RUN directory.

    Note: You can restart the fm_server or any or all of the fma processes independently.

    You should have mapping routes within about 10 seconds after starting the fms and fma processes.

  9. Use the fm_status command to see the current status of the FMS.

     $ fm_status
    

    Note: If you are using Myrinet-2000 M3-CLOS-ENCL or M3-SPINE-ENCL switches, it should not take longer than 30 seconds to map the Myrinet fabric.

    If you are using Myrinet-2000 M3-E* switches, it may take up to five minutes to map the Myrinet fabric.
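
The file-descriptor guidance in step 6 can be turned into a quick check before starting fm_server. The helper names fd_budget and fd_limit_ok are hypothetical, and the result is a lower bound only, since it does not count the descriptors the process needs for its own log and database files.

```shell
# fd_budget NODES ENCLOSURES [MARGIN]   (hypothetical helper)
# One socket per fma plus one per switch enclosure, plus a safety margin.
# The margin defaults to 50, following the "fifty or more" advice in step 6.
fd_budget() {
    echo $(( $1 + $2 + ${3:-50} ))
}

# fd_limit_ok NODES ENCLOSURES [MARGIN]   (hypothetical helper)
# Succeeds if the current soft limit on open files covers the budget.
fd_limit_ok() {
    limit=$(ulimit -n)
    [ "$limit" = "unlimited" ] || [ "$limit" -ge "$(fd_budget "$@")" ]
}
```

For a fabric of 512 nodes and 8 enclosures, fd_budget 512 8 prints 570. If fd_limit_ok fails for your fabric size, raise the open-file limit for the account that runs fm_server before starting it.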


Program Usage

Example Usage Scenarios

After the FMS system has been installed and the FMS database created, the operator (system administrator) should periodically check the "health" of the Myrinet fabric using the fm_show_alerts command. For example, there could be a screen on the operator's console that runs:

while true; do clear; fm_show_alerts; sleep 5; done

We also provide a web-based version of fm_show_alerts so that the health of the Myrinet fabric can be monitored remotely. If you need further information to explain a specific alert text message, refer to Appendix B: Alerts and the libfma/alerts.def file in the FMS distribution.

For a detailed discussion of the command-line arguments to all of the FMS tools, refer to Appendix A: Program Usage.

For a detailed listing of troubleshooting procedures to verify the health of a Myrinet installation, refer to the following FAQ entry.

In the following examples, MX has been loaded on all nodes in the Myrinet fabric, the fm_server server process and fma agent processes are running, and the FMS database has been created.

Example #1:

The following simple example demonstrates an FMS alert and how to acknowledge and remove it:

  1. An fma process disappears from one node, fog12.
  2. Run fm_status to check for alerts.
  3. Run fm_show_alerts to detail the alert and provide the alert id number.
  4. Restart the fma process on fog12.
  5. After the fma process has been restarted, the alert is marked as a relic (an alert which is no longer true, but not yet ACKed).
  6. The alert is ACKed using fm_ack_alert, and removed.

The actual output from the FMS tools would look like:

$ ssh fog12 sudo killall fma
$ fm_status
FMS Fabric status

33      hosts known
31      FMAs found
1       un-ACKed alerts
Mapping is complete, last map generated by fog20
Database is complete
$ fm_show_alerts
   34 Tue Oct 11 14:09:47 2005 Lost FMA contact from fog12
$ ssh fog12 sudo mx_start_stop start-mapper
fma: no process killed
$ fm_status
FMS Fabric status

33      hosts known
32      FMAs found
1       un-ACKed alerts
Mapping is complete, last map generated by fog20
Database is complete
$ fm_show_alerts
$ fm_show_alerts -r
   34 Tue Oct 11 14:09:47 2005 [R] Lost FMA contact from fog12
$ fm_ack_alert -i 34
$ fm_status
FMS Fabric status

33      hosts known
32      FMAs found
0       un-ACKed alerts
Mapping is complete, last map generated by fog20
Database is complete
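
For unattended checks, the alert count can be pulled out of the fm_status output shown above. This is a sketch that assumes the output keeps the layout of the transcript (a count followed by the text "un-ACKed alerts"). The function name unacked_alerts is hypothetical.

```shell
# unacked_alerts   (hypothetical helper)
# Reads fm_status output on stdin and prints the number from the
# "un-ACKed alerts" line. Prints nothing if no such line is present.
unacked_alerts() {
    awk '/un-ACKed alerts/ { print $1 }'
}
```

It would typically be used as fm_status | unacked_alerts, for instance from a cron job that notifies the operator when the result is not 0.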

Example #2:

To simulate a node losing link connectivity to the Myrinet fabric, perform the following experiment:

  1. Disconnect the fiber cable connecting one of the nodes to the switch. (It may take close to a minute for this disconnect to be noticed since FMS tries to not disturb the network with too many probes.)
  2. Run fm_status and you will see an un-ACKed alert.
  3. Run fm_show_alerts and you will see an alert saying something like "Link from node Y, nic N to enclosure X, slot Z, port A is down".
  4. Locate node Y in your cluster and the switch port A to which it was connected. You should notice that the green LEDs on the NIC in node Y and on switch port A are not illuminated. If the green LED is not illuminated on any connected port, there is no link connectivity from that node to the rest of the Myrinet fabric.
  5. Reconnect the fiber cable between node Y and the switch.
  6. Wait for the link reconnect to get noticed, and then run fm_show_alerts -r to see the "relic" alert (an alert which is no longer true, but not yet ACKed).
  7. Run fm_ack_alert -i <alert_id> to ACK the alert and remove it.
  8. Run fm_status and the alert has been removed.

Note: If this had been a real-world situation and a link had been detected as down, you would probably see other alerts such as badcrcs for this connection.
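
Steps 6 and 7 above can be scripted once several relic alerts have accumulated. The sketch below assumes the alert line layout seen in Example #1: an id, a five-field timestamp, then an optional [R] marker. The function name relic_alert_ids is hypothetical.

```shell
# relic_alert_ids   (hypothetical helper)
# Reads fm_show_alerts -r output on stdin and prints the id of every alert
# that carries the [R] (relic) marker, one id per line.
# Assumed layout: <id> <weekday> <month> <day> <time> <year> [R] <message>
relic_alert_ids() {
    awk '$7 == "[R]" { print $1 }'
}

# Possible use, ACKing every relic with the documented -i option:
#   for id in $(fm_show_alerts -r | relic_alert_ids); do
#       fm_ack_alert -i "$id"
#   done
```

Filtering on the marker rather than ACKing everything keeps alerts for conditions that are still true in front of the operator.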


Appendix A: Program Usage

The following is a list of programs that work within the FMS environment.

All tools will search for these files by default in the directory /var/run/fms/database ($FMS_RUN/$FMS_DB_NAME). By default, the environment variable FMS_RUN is set to /var/run/fms, and FMS_DB_NAME is set to database. The location and name of the database directory can be overridden by the environment variables, FMS_RUN and FMS_DB_NAME, respectively, or command line arguments to the tools.
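
The default lookup described above can be reproduced in a wrapper script with ordinary parameter expansion. The function name fms_db_dir is hypothetical, and the sketch covers only the two environment variables, not the command line arguments.

```shell
# fms_db_dir   (hypothetical helper)
# Prints the database directory the tools would search by default, using
# FMS_RUN and FMS_DB_NAME when they are set and the documented defaults
# (/var/run/fms and database) otherwise.
fms_db_dir() {
    echo "${FMS_RUN:-/var/run/fms}/${FMS_DB_NAME:-database}"
}
```

With neither variable set, fms_db_dir prints /var/run/fms/database.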

In order to easily support future Myricom hardware products, the description of all the hardware products is table driven. These tables are included as part of the fabric management installation, and their location can also be changed through environment variables or command line arguments. The default directory for the fabric management system is /opt/fms.

All FMS tools are located in $FMS_INSTALL/bin, and have the following in common:

  1. Server/Agent processes

  2. Database commands -- All of these commands are run on a node which has filesystem access to the database files.

  3. FMS Client Commands -- These programs make IP queries to the fms server and need only be run on nodes which have IP access to the fms. The fms must be running for these commands to work.


Appendix B: Alerts

Alerts are created when certain exceptional events occur and are reported to the fms. Alerts persist within the fms until they are cleared. Clearing usually requires both that the alert be acknowledged (ACKed) and that the condition which caused the alert has cleared.

Once the alert has been acknowledged, it is marked as "ACKed". Once the condition that caused the alert has cleared, we mark it as a "relic". Most alerts are deleted only after they have been both relic-ed and ACKed.

The following is a list of all alerts and their meanings. The Flags line for each alert type may contain NEED_ACK or ACK_ONLY or both. If NEED_ACK is present, once the alert becomes a relic, it still needs an ACK before it is deleted entirely. If NEED_ACK is not present, the alert is deleted as soon as it becomes a relic. If ACK_ONLY is specified, the event is deleted as soon as it is ACKed. Without this flag, the alert will persist until becoming a relic, even after it has been ACKed.
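
The flag rules above can be summarized as a single predicate. This is one reading of the text, not code taken from the fms: an alert is deleted if it has ACK_ONLY and has been ACKed, or if it is a relic and either lacks NEED_ACK or has been ACKed. The function name alert_deleted and its argument convention are hypothetical.

```shell
# alert_deleted FLAGS IS_RELIC IS_ACKED   (hypothetical helper)
# FLAGS is a space-separated list that may contain NEED_ACK and/or ACK_ONLY.
# IS_RELIC and IS_ACKED are 1 or 0.
# Exit status 0 means the alert would be deleted, non-zero means it persists.
alert_deleted() {
    flags=" $1 "
    relic=$2
    acked=$3
    # ACK_ONLY: an ACK alone is enough to delete the alert.
    case "$flags" in
        *" ACK_ONLY "*) [ "$acked" = 1 ] && return 0 ;;
    esac
    # Otherwise nothing is deleted before the condition has cleared.
    [ "$relic" = 1 ] || return 1
    # NEED_ACK: a relic still waits for an ACK; without it, a relic is deleted.
    case "$flags" in
        *" NEED_ACK "*) [ "$acked" = 1 ] ;;
        *) return 0 ;;
    esac
}
```

Under this reading, an alert with both flags is deleted exactly when it is ACKed, and an alert with neither flag is deleted exactly when it becomes a relic.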

Note: This list can also be found in the file libfma/alert.def in the FMS distribution.


Appendix C: Database File Formats

Every database file starts with 2 rows of column headers. The first row defines the data type of each column, and the second row defines the name of each column.
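
Given the two header rows described above, a script can pair each column name with its type before reading any data rows. The sketch below assumes whitespace-separated fields unless another separator is passed, which is an assumption about the file format rather than something stated here. The function name db_columns is hypothetical.

```shell
# db_columns [SEP]   (hypothetical helper)
# Reads a database file on stdin and prints "name<TAB>type" for each column,
# pairing row 1 (data types) with row 2 (column names). SEP is the assumed
# field separator. The default is runs of whitespace.
db_columns() {
    awk -v sep="${1:- }" '
        BEGIN { FS = sep }
        NR == 1 { n = split($0, types, FS) }
        NR == 2 { for (i = 1; i <= n; i++) printf "%s\t%s\n", $i, types[i]; exit }
    '
}
```

It would be run as db_columns < file against one of the files in the database directory. Check the separator against a real file first.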


Appendix D: FMS Settings

The following parameters may be set using fm_settings to control the behavior of the Fabric Management System.

Note: This list of parameters and instructions for modifying them can be found in the file libfma/lf_fms_settings_def.h.


Appendix E: Legacy Tools

The FMS legacy tools are:



Last updated: 13 May 2010
