THE MCP

The MCP (Myrinet Control Program) is the program that runs on the LANai chip on the host interface board. It is the MCP's job to transfer messages between the host and the network. To write an MCP you need to know how to write LANai software.

The "executable" file for the Myricom MCP is called "mcp4.dat", for the LANai 4, and "mcp.dat", for the LANai 2.

If you want your MCP to work with other Myricom MCPs on a Myrinet, then there are compatibility issues that you must deal with. The rest of this document explains how to deal with these issues.

Myrinet MCP and Host Requirements

MYRINET PACKET TYPES

Myrinet uses the first 16-bit quantity of a packet as a packet type. The highest-order bit of this type must be zero to indicate "to-host", rather than "to-switch".

A specialized MCP may ignore any packets that it isn't interested in. The Myricom MCP will occasionally send mapping packets into the Myrinet. If some other kind of MCP does not want to participate in mapping, it is free to drop any message it receives with the first half-word equal to 0x0002 (the packet type of the Myricom MCP Mapping Packet).

MYRINET MAPPING AND ROUTING

The standard Myrinet consists of a set of Myricom MCPs running on LANai boards linked by cables and switches. Each MCP has a unique 64-bit address. The MCP with the highest address is the Mapper. If the Mapper is disconnected, the MCP with the next highest address becomes the Mapper.

The Mapper sends out scout packets to map the network. To determine whether a host is connected to a certain port on a switch, the Mapper will send a scout message addressed to a (hypothetical) host connected to that port. If a host does exist at that port, the host MCP will receive the scout message and send an acknowledgement back to the Mapper. When the Mapper receives the acknowledgement, it knows that a host exists at that port.

If the Mapper receives no response, then either nothing is connected to the port, or another switch is connected to the port. To tell these cases apart, the Mapper sends a scout message with a route that goes through the port, into the (hypothetical) second switch, back through the port of the first switch, and back to the Mapper. If the Mapper receives this scout message, then another switch is attached to the first switch at the port the Mapper was looking at. Otherwise the port is unconnected. The Mapper recursively examines the whole network in this manner.
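The probing procedure described above can be sketched as a small host-side simulation. This is a toy model, not the Myricom Mapper: the topology table, the struct names, and the functions probeHost(), probeSwitch(), mapSwitch(), and mapNetwork() are all hypothetical, and real scout packets, routes, and timeouts are replaced by direct table lookups. It also identifies switches by a known index, which sidesteps the switch disambiguation that a real Mapper has to perform.

```c
#include <assert.h>

#define NPORTS    4
#define NSWITCHES 3

enum Kind { NOTHING, HOST, SWITCH };

struct Port { enum Kind kind; int id; };

/* Hypothetical three-switch network; ids are made up for illustration. */
static const struct Port topology[NSWITCHES][NPORTS] = {
    /* switch 0 */ { {HOST, 10},  {SWITCH, 1},  {NOTHING, 0}, {HOST, 11}   },
    /* switch 1 */ { {SWITCH, 0}, {HOST, 12},   {SWITCH, 2},  {NOTHING, 0} },
    /* switch 2 */ { {NOTHING, 0}, {SWITCH, 1}, {HOST, 13},   {NOTHING, 0} },
};

struct Map { int hosts; int switches; int seen[NSWITCHES]; };

/* Stand-in for "send a host scout and wait for the acknowledgement". */
static int probeHost(int sw, int port)
{
    return topology[sw][port].kind == HOST;
}

/* Stand-in for "send a scout that loops through the port and comes back". */
static int probeSwitch(int sw, int port)
{
    return topology[sw][port].kind == SWITCH;
}

/* Examine every port of one switch, recursing into newly found switches. */
void mapSwitch(struct Map *m, int sw)
{
    int port;

    assert(sw >= 0 && sw < NSWITCHES);
    if (m->seen[sw])
        return;                 /* already mapped; avoid looping forever */
    m->seen[sw] = 1;
    m->switches++;

    for (port = 0; port < NPORTS; port++) {
        if (probeHost(sw, port))
            m->hosts++;                          /* a host acknowledged */
        else if (probeSwitch(sw, port))
            mapSwitch(m, topology[sw][port].id); /* loop scout returned */
        /* otherwise neither probe succeeded: the port is unconnected */
    }
}

/* Map the whole toy network, starting at switch 0. */
struct Map mapNetwork(void)
{
    struct Map m = {0};
    mapSwitch(&m, 0);
    return m;
}
```

Calling mapNetwork() on this table visits all three switches and counts the four attached hosts. The order of the two probes mirrors the text: the host scout is tried first, and the loop-back scout is tried only when no host answers.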

Designers of MCPs must decide how they want their MCPs to participate in this Mapping algorithm. There are five levels of participation:

Level   Participation Requirement
0       MCP agrees to abide by the packet-type convention.
1       MCP will respond to Mapping Scout messages.
2       MCP will accept a set of routes (server not yet implemented).
3       MCP will accept a map and calculate its own routes.
4       MCP will act as a network mapper if required.

Myricom provides two libraries to implement levels 0, 1, and 3. To use the libraries, link to "libRoute.a" and "libList.a" on the LANai2.x, and "libRoute3.a" and "libList3.a" on the LANai3.x or LANai4.x. Include the file "route.h" in your C code. The example code fragments that follow are for the LANai 3. The libraries will use up 20 KBytes of LANai memory.

route.h

#define MYRI_SCOUT_SIZE (64 + 16)
#define MYRI_MAP_SIZE 7644
#define MYRI_MAP_PACKET_TYPE 2


typedef int (*MapSendFunction)(int sendAlignment, int *buffer, int bufferLength);
enum RouteFlag { MYRI_SCOUT = 1, MYRI_MAP = 2};


void routeInitialize();
int routeHandleMessage (int flags, void *scoutMessage, int *address, 
                        MapSendFunction send);
int routeLookup (int *address, int *sendAlignment, int *lengthInHalfs, short **route);

LEVEL 0

The very least that an MCP must do to co-exist with the Myricom MCP is to receive and discard Myrinet Mapping Scout messages. To drop scout messages the MCP must first reserve space to receive them. A scout message will not be larger than MYRI_SCOUT_SIZE bytes. There are two kinds of scout messages: switch scout messages and host scout messages.

SWITCH SCOUT MESSAGES

If the Level 0 MCP receives a message with the highest-order bit of the packet type set, then the message is a scout packet that the Mapper intended to loop through some set of switches and back to itself. Usually the Mapper determines whether there is a host at a switch port before it tries to send any switch scout messages through that port, and it sends switch scout messages only through ports where it believes no host is attached. In the process of disambiguating switches, however, a host may get a switch scout message by mistake.

Virtually the only way that a mapping message can be lost is if it is dropped by the receiving MCP. Dropping host scout messages is exactly what the Level 0 MCP does, so the Mapper never learns that a host is attached and goes on to probe the port with switch scout messages. The Level 0 MCP will therefore receive switch scout messages. It should just drop them.

HOST SCOUT MESSAGES

If the Level 0 MCP receives a message with a Packet Type of MYRI_MAP_PACKET_TYPE then the message is a host scout message from the Mapper. The Level 0 MCP should drop these messages.

EXAMPLE

#define SWITCH_BIT 0x8000

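int receive (short *p)
{
   short packetType = p[0];

   if ((packetType & SWITCH_BIT) || (packetType == MYRI_MAP_PACKET_TYPE))
   {
      /* switch scout or host scout: a Level 0 MCP discards both */
      drop(p);
      return 0;   /* nothing left to process (return value is illustrative) */
   }
   else
   {
      /* do whatever you do with one of your own messages */
      return 1;
   }
}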

LEVEL 1

The Level 1 MCP is a curious beast. It is gracious enough to reply to scout messages, but it doesn't use the map that the Mapper will send to it. A Level 1 MCP must be able to accept a large message containing the network map, and then discard it. The message from the Mapper that contains the network map is MYRI_MAP_SIZE bytes long. The Level 1 MCP must also be able to accept a small scout message, which is not more than MYRI_SCOUT_SIZE bytes in length.
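A minimal sketch of sizing a single receive buffer so that it can hold either kind of message. The two size macros are copied from route.h; the names RECEIVE_BUFFER_BYTES, receiveBuffer, fitsInReceiveBuffer(), and getReceiveBuffer() are hypothetical, and an actual MCP may well manage its receive memory differently.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes copied from route.h. */
#define MYRI_SCOUT_SIZE (64 + 16)
#define MYRI_MAP_SIZE   7644

/* Hypothetical: size the buffer for the larger of the two messages. */
#define RECEIVE_BUFFER_BYTES \
    (MYRI_MAP_SIZE > MYRI_SCOUT_SIZE ? MYRI_MAP_SIZE : MYRI_SCOUT_SIZE)

/* Declared as an int array so the buffer starts on a word boundary. */
static int receiveBuffer[(RECEIVE_BUFFER_BYTES + sizeof(int) - 1) / sizeof(int)];

/* Returns nonzero if a message of `length` bytes fits in the buffer. */
int fitsInReceiveBuffer(size_t length)
{
    return length <= sizeof receiveBuffer;
}

/* Returns the buffer as bytes after checking the sizing invariant. */
unsigned char *getReceiveBuffer(void)
{
    assert(sizeof receiveBuffer >= MYRI_MAP_SIZE);
    assert(sizeof receiveBuffer >= MYRI_SCOUT_SIZE);
    return (unsigned char *)receiveBuffer;
}
```

A Level 0 MCP only needs MYRI_SCOUT_SIZE bytes, so the same pattern with just that macro would be enough there.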

To reply to a scout message the Level 1 MCP must have a 64-bit MCP address. On the LANai 2 the 64-bit MCP address is an array of 4 16-bit integers. On the LANai 3 it is an array of 2 32-bit integers. These arrays are treated like big-endian, 64-bit, unsigned numbers. It is illegal for the most-significant bit of an MCP address to be set: that bit is reserved for multicast addresses. The Mapper uses the MCP addresses to label the host nodes on its Map graph. When calculating the routes from a map, the MCP addresses are used as indexes into the resulting route table. The MCP with the highest address becomes the Mapper. Since only a Level 4 MCP can be a Mapper, the MCP with the highest address in a Myrinet must be a Level 4 MCP.
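The address rules above can be illustrated with a short host-side sketch in the LANai 3 layout (two 32-bit words). It assumes that word 0 is the most significant word, which is how "big-endian" is read here. The function names mcpAddressCompare(), mcpAddressIsValid(), and electMapper() are hypothetical and are not part of the Myricom libraries.

```c
#include <assert.h>
#include <stdint.h>

/* Compare two addresses as 64-bit unsigned numbers, word 0 first.
 * Returns a negative, zero, or positive value, like strcmp(). */
int mcpAddressCompare(const uint32_t a[2], const uint32_t b[2])
{
    int i;

    for (i = 0; i < 2; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* The most-significant bit is reserved for multicast addresses,
 * so a valid MCP address must have it clear. */
int mcpAddressIsValid(const uint32_t a[2])
{
    return (a[0] & 0x80000000u) == 0;
}

/* Returns the index of the highest address, i.e. the MCP that
 * would become the Mapper among the `count` addresses given. */
int electMapper(uint32_t addrs[][2], int count)
{
    int i, best = 0;

    assert(count > 0);
    for (i = 1; i < count; i++) {
        if (mcpAddressCompare(addrs[i], addrs[best]) > 0)
            best = i;
    }
    return best;
}
```

Comparing word by word from the most significant end gives the same ordering as comparing the full 64-bit values, without needing a 64-bit integer type.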

If the Level 1 MCP receives a switch scout message it should just drop it.

If the Level 1 MCP receives a message with a Packet Type of MYRI_MAP_PACKET_TYPE then the message is either a host scout message or a network map message. The Level 1 MCP should call routeHandleMessage() with MYRI_SCOUT for its flag argument to deal with the message. If the message is a network map message, routeHandleMessage() will ignore it, and return 0. A flag argument of just MYRI_SCOUT means that the MCP is not interested in messages containing the network map. If the message is a scout message then routeHandleMessage() builds a reply message. routeHandleMessage() will use a callback, which you supply, to send this reply message back to the Mapper. Both the callback and routeHandleMessage() return 0 if there is an error.

The Level 1 MCP should call routeInitialize() to initialize the "libRoute.a" library whenever the MCP is started or restarted.

EXAMPLE

#include "route.h"

int myAddress[2]; /* My MCP address. Initialized somehow. */
 
#define SWITCH_BIT 0x8000

main()
{
   ...

   routeInitialize();

   ...
}

int receive (short *p)
{
   short packetType = p[0];
	
   if (packetType & SWITCH_BIT)
      drop(p);
   else if (packetType == MYRI_MAP_PACKET_TYPE)
   {

      /*
       * If it is a map message, ignore it; otherwise it is a scout message,
       * and we send a reply. It is assumed that it is safe to use SMP and SML
       * at this point to send the reply.
       */

      return routeHandleMessage (MYRI_SCOUT,p,myAddress,myReplyCallback);
   }
   else 
   {
      /*do whatever you do with one of your own messages*/
   }
  ...
}

int myReplyCallback (int sendAlignment, int *buffer, int bufferLength)
{

   /*assume we can send the message now.*/

   /*send the message*/
   SA = sendAlignment;
   SMP = buffer;
   SML = (char*)buffer + bufferLength - 4;

   /*wait for the message to be sent.*/
   while (!(ISR &  SEND_INT_BIT))
      touch (ISR);
   clear_SEND_INT_BIT();
	
   /*this return value will get passed back to you through routeHandleMessage().*/
   return 1;
}

LEVEL 2

The functionality to support this level is not yet implemented in the Myricom Mapper code. The Mapper would need to be capable of providing routes to a node. As of v3.00 of the Myricom code, the Mapper is only able to provide a map to other nodes.

LEVEL 3

The Level 3 MCP does not drop the map of the network that it receives from the Mapper. It uses the map to build a route table. To build a route table, the Level 3 MCP should call routeHandleMessage() with the flag MYRI_MAP included in its flag argument. To look up a route the Level 3 MCP should call routeLookup(). The message containing the map of the Myrinet is large: exactly MYRI_MAP_SIZE bytes. The Level 3 MCP must be able to accept messages this large. The map message can be discarded as soon as routeHandleMessage() has been called.
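As a summary of how the flag argument differs between levels, here is a small sketch. The enum values are copied from route.h; the helper routeFlagsForLevel() is hypothetical and is not part of libRoute. It returns -1 for levels 2 and 4, which the libraries do not implement according to this document.

```c
#include <assert.h>

/* Flag values copied from route.h. */
enum RouteFlag { MYRI_SCOUT = 1, MYRI_MAP = 2 };

/* Hypothetical helper: the flag argument an MCP at the given
 * participation level would pass to routeHandleMessage(). */
int routeFlagsForLevel(int level)
{
    assert(level >= 0 && level <= 4);

    switch (level) {
    case 0:
        return 0;                     /* drops everything, never calls the library */
    case 1:
        return MYRI_SCOUT;            /* answers scouts, ignores the map */
    case 3:
        return MYRI_SCOUT | MYRI_MAP; /* answers scouts and builds routes */
    default:
        return -1;                    /* levels 2 and 4: not covered by the libraries */
    }
}
```

Because the flags are single bits, a check such as (flags & MYRI_MAP) tells whether the MCP is interested in map messages.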

EXAMPLE

#include "route.h"

int myAddress[2]; /*our MCP address. Initialized somehow.*/
 
#define SWITCH_BIT 0x8000

main()
{
   ...

   routeInitialize();

   ...
}

int receive (short *p)
{
   short packetType = p[0];
	
   if (packetType & SWITCH_BIT)
      drop(p);
   else if (packetType == MYRI_MAP_PACKET_TYPE)
   {
       
      /* If it is a scout message, send a reply. 
            If it is a map message, build a route table */

      return routeHandleMessage ((MYRI_SCOUT | MYRI_MAP), p, 
                                myAddress, myReplyCallback);
   }
   else 
   {
      /*do whatever you do with one of your own messages*/
   }
  ...
}

int myReplyCallback (int sendAlignment, int *buffer, int bufferLength)
{

   /*assume we can send the message now.*/

   /*send the message*/
   SA = sendAlignment;
   SMP = buffer;
   SML = (char*)buffer + bufferLength - 4;

   /*wait for the message to be sent.*/
   while (!(ISR &  SEND_INT_BIT))
      touch (ISR);
   clear_SEND_INT_BIT();
	
   /* This return value will get passed back through routeHandleMessage().*/
   return 1;
}

/* A function that uses the route table built by routeHandleMessage()
      to send a message to an MCP at address "destination"*/

int send(int destination[2], char *message, int length)
{
   short *s, *p;   /* both must be pointers: p marks the start of the route */
   int sendAlignment,lengthInHalfs;
   short*hop;

   /*look up route to destination*/
   if (!routeLookup (destination,&sendAlignment,&lengthInHalfs,&hop))
      return 0;
 
   /* Get the place where the start of the route will be. Note
    * that we assume that there is room *in front* of the message
    * to prepend a route to.
    */

   s = p = (short*)((unsigned)message - lengthInHalfs*2);

   /*copy the route 16 bits at a time*/
   for (int i = 0; i< lengthInHalfs;i++)
      *s++ = hop[i];

   SA = sendAlignment;
   SMP = p;
   SML = (char*)p + length + lengthInHalfs*2 - 4;

   /*wait for the message to be sent. Or whatever.*/
   while (!(ISR &  SEND_INT_BIT))
      touch (ISR);
   clear_SEND_INT_BIT();
   

   return 1;

}
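The send() example relies on there being free space directly in front of the message. One way to guarantee that is to reserve the headroom in the buffer layout itself. The sketch below is an assumption-laden illustration: struct OutBuffer, MAX_ROUTE_HALFS, the payload size, and prependRoute() are all hypothetical, and the upper bound on route length is invented for the example rather than taken from the Myricom libraries.

```c
#include <assert.h>
#include <string.h>

#define MAX_ROUTE_HALFS 16   /* hypothetical upper bound on route length */

/* Hypothetical outgoing buffer: route headroom sits directly
 * in front of the payload, with no padding in between. */
struct OutBuffer {
    short headroom[MAX_ROUTE_HALFS];
    char  payload[256];
};

/* Copy the route so that it ends exactly where the payload begins.
 * Returns a pointer to the first half-word to transmit. */
short *prependRoute(struct OutBuffer *out, const short *route, int lengthInHalfs)
{
    short *start;

    assert(lengthInHalfs >= 0 && lengthInHalfs <= MAX_ROUTE_HALFS);

    /* Right-align the route inside the headroom. */
    start = out->headroom + (MAX_ROUTE_HALFS - lengthInHalfs);
    memcpy(start, route, (size_t)lengthInHalfs * sizeof(short));
    return start;
}
```

The pointer returned by prependRoute() plays the role of p in send(): transmission starts there and covers lengthInHalfs*2 route bytes followed by the payload.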

LEVEL 4

The Level 4 MCP is a Mapper. Mapping a Myrinet is too complicated for words. If you want a Level 4 MCP, use the Myricom MCP.
