This specification is now out-of-date.
Please go here for links to the current specifications.
We have retained this specification at this URL for historical / archival reasons.


MYRINET LINK SPECIFICATION

Myricom is publishing these specifications in order to assist others who may wish to develop systems that interoperate with Myricom's Myrinet products. Myricom makes no proprietary claims to these specifications or techniques, but reserves the right to change these specifications at any time and without notice. The information provided by Myricom in this document is believed to be accurate; however, Myricom disclaims responsibility or liability for its use, or for any infringement of patents or other rights of third parties resulting from its use.

1. INTRODUCTION

A Myrinet link is composed of a full-duplex pair of Myrinet channels. The connection of a link to a system is called a port. The specifications for Myrinet links include those for Myrinet channels and ports.


A Myrinet Link

The operation of a Myrinet link is described in sections 4 and 5. For this "top-down" exposition, it is necessary to understand initially that:

2. MYRINET COMPONENTS
2.1 COMPUTER INTERFACES


Computer interfaces have one port, and connect host computers to the network. The interface is powered by the host. A host may include more than one Myrinet interface, but from the network's viewpoint these interfaces operate independently.

Myricom computer interfaces include a small computer with 128KB of memory. The memory is used both for buffering packet data that is being transferred between the host and the network, and also for storing a control program. The processor of this computer, the interface to the Myrinet network, and a DMA engine are contained in a custom-VLSI chip called the LANai.

When the host is first turned on, the interface is held in a reset condition in which any packets directed to it are absorbed but ignored. After the host computer (device driver) loads the Myrinet Control Program, or MCP, into the interface, the host releases the reset and the computer within the interface starts executing the control program.

The program executed within the interface controls the transfer of packet data between the host's memory and the network. This program also provides other network-management functions, including translating between host addresses and Myrinet routes, and mapping and monitoring of the network.

Interfaces (and network-management nodes, if any) are the only points where new packets are injected into a Myrinet network. In addition, an interface that is powered is required to consume packets directed to its port. The interface is able to block the flow of packet data to its port receiver, but ideally will not block consumption for more than a few microseconds (for example, the time required to allocate a new receive buffer).

2.2 SWITCHES

Myrinet switches are multiple-port components that switch (route) a packet entering on an incoming channel of a port to the outgoing channel of the port selected by the packet header. Myrinet switches employ cut-through routing. If the selected outgoing channel is not already occupied by another packet, the head of the incoming packet is advanced into this outgoing channel as soon as the head of the packet is received and decoded. The packet is then spooled through this established path until the path is broken by the tail of the packet. If the selected outgoing channel is occupied by another packet or is blocked, the incoming packet is blocked.

Myricom's Myrinet switches have 4, 8, 12, 16, ... ports. Different switches in the network may have different numbers of ports. For a D-port (absolute-routing) switch, the ports have addresses {0, 1, ..., D-1}. Packets addressed outside of this range are dropped in the same way that packets addressed to ports that are unused, disconnected, or connected to an unpowered component are dropped.

Myricom's Myrinet switches are described as "perfect" because, for any switching permutation, there may be as many packets traversing a switch concurrently as the switch has ports. In a 4-port switch, for example, packets 0->1, 1->2, 2->3, 3->0 (or any other permutation) may traverse the switch concurrently. These switches are implemented using two types of custom-VLSI chips: crossbar-switch chips and Myrinet-interface chips.

Switches are powered separately from hosts, so that the network will continue to function even when some of the hosts are turned off.

2.3 NETWORK TOPOLOGY

The network topology may be viewed as an undirected graph. Any way of linking together computer interfaces and switches that forms a connected graph is allowed. The graph may contain cycles. Topologies with cycles are useful in large networks to provide multiple-path redundancy. The physical network may include unpowered host interfaces and unused ports.


Example Myrinet Topologies

The network connecting those hosts that are operating can, nevertheless, be mapped as shown. The network as mapped may exhibit oddities such as switches in which only one or two ports are in use.

3. MYRINET PACKETS AND ROUTING

A packet that traverses a Myrinet channel includes in sequence:

  1. the header, which may be one or more bytes.

  2. the payload, which may be zero or more bytes.

  3. the cyclic-redundancy-check (CRC) byte.

3.1 HEADER AND ROUTING

In addition to sending and receiving packets over known routes, Myrinet routing was designed to allow network mapping. Packets used for "scouting" or "exploring" a Myrinet network are relatively short, and the redundancy in the encoding of the source route in the header of the packet allows packets addressed to a possible host to be dropped if they encounter a switch, or vice versa.

When a packet is injected into a Myrinet network from an interface, it has an n-byte header, n > 0, in the form:

Byte     Contents   Meaning
1        1xxxxxxx   select port xxxxxxx of this switch (absolute routing)
...      ...        ...
n - 1    1xxxxxxx   select port xxxxxxx of this switch (absolute routing)
n        0yyyyyyy   enter interface with tag yyyyyyy


The header may be one byte long only for the network composed of two interfaces and no switches. For networks with three or more hosts, which necessarily contain at least one switch, the header must be two or more bytes long. The last byte of the header is marked by the most-significant bit (MSB) of the byte being 0. The MSB is redundant for packets that travel a known route, but this information is included to allow mapping packets to explore unknown routes.

Only the first byte of the header is examined when the packet enters a switch or interface. Once this byte is used to select an outgoing port of a switch, it is discarded (stripped) from the head of the packet. For example:


Header Stripping

The tag carried by the last byte of the header is used by the control program to distinguish between different types of packets, such as user packets and mapping packets. The tag is read by the control program when a packet is received, and is used to select the routine that handles this type of packet.

If the lead byte of a packet entering a switch has its MSB=0, the packet is dropped. If the lead byte of a packet entering an interface has its MSB=1, the packet is consumed but is handled as an error.
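
The lead-byte rules above can be sketched in a few lines of Python. This is an illustrative model only, not Myricom's implementation: the function names `switch_route` and `interface_receive` are hypothetical, the switch is assumed to use absolute routing, and the trailing CRC byte is simply carried along with the rest of the packet.

```python
from typing import Optional, Tuple

def switch_route(packet: bytes, num_ports: int) -> Optional[Tuple[int, bytes]]:
    """Absolute-routing switch: return (outgoing port, stripped packet),
    or None if the packet is dropped."""
    if not packet:
        return None
    lead = packet[0]
    if lead & 0x80 == 0:        # MSB=0: an interface byte arrived at a switch
        return None             # the packet is dropped
    port = lead & 0x7F
    if port >= num_ports:       # outside {0, ..., D-1}: dropped
        return None
    return port, packet[1:]     # the lead byte is stripped

def interface_receive(packet: bytes) -> Tuple[Optional[int], bytes]:
    """Interface: return (tag, rest of packet); tag is None on error."""
    if not packet:
        return None, b""
    lead = packet[0]
    if lead & 0x80:             # MSB=1: a switch byte arrived at an interface
        return None, packet[1:] # consumed, but handled as an error
    return lead & 0x7F, packet[1:]

# A packet routed through port 3 of one switch and port 1 of the next,
# then delivered to an interface with tag 5 (values chosen for illustration).
pkt = bytes([0x83, 0x81, 0x05]) + b"hi"
port_a, pkt = switch_route(pkt, 8)
port_b, pkt = switch_route(pkt, 8)
tag, payload = interface_receive(pkt)
```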

3.2 RELATIVE ADDRESSING SWITCHES

Myricom's M2-SW8A, M2F-SW8, M2F-SW4 and all future switch products use a relative-addressing scheme to route packets from port to port in a switch. The switch bit (MSB = 1) still applies, but the port "number" is interpreted as an address relative to the incoming port. Negative numbers are 6-bit twos-complement. The routing does NOT wrap around, so there are only N valid relative addresses for each port of an N-port switch. The valid ports (routes) for an 8-port switch are listed in the table below.


               Relative Addressing 
                 (8-port switch)

                     Destination Port
Source   0     1     2     3     4     5     6     7
Port
0       0x80  0x81  0x82  0x83  0x84  0x85  0x86  0x87
1       0xBF  0x80  0x81  0x82  0x83  0x84  0x85  0x86
2       0xBE  0xBF  0x80  0x81  0x82  0x83  0x84  0x85  
3       0xBD  0xBE  0xBF  0x80  0x81  0x82  0x83  0x84
4       0xBC  0xBD  0xBE  0xBF  0x80  0x81  0x82  0x83
5       0xBB  0xBC  0xBD  0xBE  0xBF  0x80  0x81  0x82
6       0xBA  0xBB  0xBC  0xBD  0xBE  0xBF  0x80  0x81
7       0xB9  0xBA  0xBB  0xBC  0xBD  0xBE  0xBF  0x80
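
The table entries follow directly from the 6-bit twos-complement rule, as the following Python sketch shows. The function names are hypothetical, and the treatment of bit 6 (always zero in the table, ignored on decode here) is an assumption.

```python
from typing import Optional

def relative_route_byte(src: int, dst: int) -> int:
    """Header byte that routes from port src to port dst of one switch."""
    delta = dst - src
    return 0x80 | (delta & 0x3F)    # switch bit plus 6-bit twos-complement

def resolve_relative(src: int, byte: int, num_ports: int) -> Optional[int]:
    """Destination port selected by a header byte, or None if invalid."""
    if byte & 0x80 == 0:
        return None                 # not a switch byte
    field = byte & 0x3F             # assumption: bit 6 is ignored
    delta = field - 64 if field & 0x20 else field
    dst = src + delta
    # Routing does not wrap around, so out-of-range results are invalid.
    return dst if 0 <= dst < num_ports else None

# Regenerate the 8-port table row by row.
table = [[relative_route_byte(s, d) for d in range(8)] for s in range(8)]
```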


3.3 CRC BYTE

Although the error rate on Myrinet links is extremely low, Myrinet local-area networks employ a cyclic-redundancy-check (CRC) byte to detect packets whose data has been corrupted by cable or connector faults. Packets corrupted by data-overrun or channel-reset conditions may additionally be detected at higher protocol levels by the packet being the wrong length.

The CRC used by Myrinet is the same CRC-8 that is used in the header of ATM packets; however, the vacuous exclusive-or with hexadecimal 55 (binary 01010101) at the end is omitted. Thus, a packet composed of all zero bytes has a CRC of zero.

The CRC is computed on the entire preceding part of the packet, including the header. Because the header is modified at each switch, the CRC is computed and checked for each link.

At an interface, CRC hardware within the LANai chip computes the CRC of the outgoing packet as the packet is sent, and the CRC byte is appended to the end of the packet.

At a switch, CRC hardware in Myrinet-interface chips computes the CRC of the incoming packet, and substitutes the exclusive-or of the computed and received CRC in the CRC byte position. A received packet whose CRC is correct will have a zero CRC at this point, whereas a packet whose CRC is incorrect will have a non-zero CRC. On the outgoing link, CRC hardware computes the CRC as the packet is sent, and exclusive-ors the computed CRC with the contents of the CRC byte. If the packet had a correct CRC when received, it will have a correct CRC when sent; whereas if the packet had an incorrect CRC when received, it will have an incorrect CRC when sent.

When the packet reaches an interface, CRC hardware in the LANai chip computes the CRC of the incoming packet and substitutes the exclusive-or of the computed and received CRC in the CRC byte position. Whether the CRC was correct (zero) or incorrect (non-zero) may be tested by the control program.
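
The per-link CRC handling described in this section can be modeled as follows. This is a sketch under stated assumptions: the generator polynomial is taken to be the ATM header-error-control polynomial x^8 + x^2 + x + 1 with a zero initial value and most-significant-bit-first processing (this document does not state the bit ordering), and all function names are hypothetical.

```python
def crc8(data: bytes) -> int:
    """CRC-8, polynomial 0x07, zero initial value, no final exclusive-or."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc

def interface_send(body: bytes) -> bytes:
    """Sending interface: append the CRC of header plus payload."""
    return body + bytes([crc8(body)])

def switch_hop(packet: bytes) -> bytes:
    """Switch: strip the lead header byte and regenerate the CRC so that a
    correct CRC stays correct and an incorrect CRC stays incorrect."""
    body, received = packet[:-1], packet[-1]
    syndrome = crc8(body) ^ received     # zero if the incoming CRC was correct
    out_body = body[1:]                  # lead routing byte is stripped
    return out_body + bytes([crc8(out_body) ^ syndrome])

def interface_crc_ok(packet: bytes) -> bool:
    """Receiving interface: the substituted CRC byte is zero when correct."""
    return (crc8(packet[:-1]) ^ packet[-1]) == 0

good = interface_send(bytes([0x83, 0x81, 0x05]) + b"payload")
bad = bytearray(good)
bad[4] ^= 0x10                           # corrupt one payload bit on the first link
delivered_good = switch_hop(switch_hop(good))
delivered_bad = switch_hop(switch_hop(bytes(bad)))
```

In this model an error introduced on any link remains visible to the destination interface even though every switch rewrites the CRC byte.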

3.4 ROUTING EXTENSIONS

Several extensions to the operation of Myrinet switches are being considered for future products. These extensions will be accessed or controlled through packets addressed to the unused switch-port addresses, which, for a D-port switch, are (D, ..., 0x7F).

Packets addressed to port 0x7F will allow mapping packets to determine the identity and configuration of the switch. In some cases, the 0x7F switch port may be attached to a computer that participates in network-management functions, and that monitors the state and utilization of its switch.

Another possible function of future Myrinet switches is cut-through-multicast routing. A multicast route (tree) would be set up by storing entries in a multicast table in each switch that is part of the multicast. Loading the table would be accomplished by sending packets with the appropriate tag to port 0x7F of each switch. For each switch-port address in a range such as (0x60, ..., 0x7E), the table entry would specify a list of (outgoing port, replacement header) items. Once the multicast tables are set up, packets addressed to these virtual switch-port addresses would be routed at once to the set of outgoing ports, with each header being replaced or stripped as specified by the table entries.

4. MYRINET LINKS AT THE LOGICAL LEVEL
4.1 PACKET DATA AND CONTROL SYMBOLS


A single Myrinet channel conveys a sequence of discrete (9-bit) characters from an alphabet composed of:

Control symbols are interleaved with the packet data in order to perform packet framing, flow control, and other functions. For example, the sequence:

... GAP, GO, IDLE, d0, d1, IDLE, GO, STOP, d2, d3, GAP, STOP, GAP, ...

includes the 4-byte packet (d0, d1, d2, d3), which is framed by GAP control symbols. The GO and STOP control symbols are inserted into this sequence in order to accomplish flow control on the opposite-going Myrinet channel, and may appear redundantly to fill unused cycles. The GAP control symbol may appear redundantly between packets. The IDLE control symbol may be used to fill unused cycles either within or between packets.
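
A receiver's separation of this stream into packets and flow-control symbols might be sketched as below. The representation is an assumption made for illustration: control symbols are strings, data characters are integers, and `split_stream` is a hypothetical name.

```python
def split_stream(chars):
    """Return (packets, flow_control) extracted from a character sequence."""
    packets, flow_control, current = [], [], []
    for ch in chars:
        if ch == "IDLE":
            continue                      # IDLE only fills unused cycles
        if ch in ("GO", "STOP"):
            flow_control.append(ch)       # applies to the opposite-going channel
        elif ch == "GAP":
            if current:                   # redundant GAPs between packets are harmless
                packets.append(current)
                current = []
        else:
            current.append(ch)            # packet data
    return packets, flow_control

d0, d1, d2, d3 = 0xD0, 0xD1, 0xD2, 0xD3   # placeholder data bytes
stream = ["GAP", "GO", "IDLE", d0, d1, "IDLE", "GO", "STOP",
          d2, d3, "GAP", "STOP", "GAP"]
packets, flow_control = split_stream(stream)
```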

4.2 CHANNEL DELAY AND DATA RATE

The physical-transport medium has a delay, D, the (one-way) time in flight of the characters. For example, the delay of a 25m cable in which the propagation velocity is ~0.6c is ~140ns.

A Myrinet sender produces characters at a fixed rate, B. Full-rate channels operate at B = 80M characters/s, corresponding to a maximum rate for packet data of 80MB/s or 640Mb/s. This output rate is regulated by the transitions of a symmetrical, 40MHz clock signal called 'sendclk' (section 5.1). Links may, however, operate at reduced rates in accordance with the frequency of the 'sendclk' signal. The open-link (section 4.5.3) and blocked-packet (section 4.6.2) timeouts are specified as multiples of the character period so that the timeout intervals will be larger at lower link rates. These timeouts are, accordingly, implemented by counting 'sendclk' transitions.

4.3 MYRINET INTERFACES

The control symbols GAP, GO, STOP, IDLE, and FRES (forward reset) are mandatory for all Myrinet-link implementations; all other control symbols are optional. The GAP, GO, STOP, and IDLE control symbols are typically generated or handled autonomously by an interface between the host system and the Myrinet channels. The following block diagram may be helpful for understanding the relationship between the various control symbols and their functions:


Myrinet Interfaces and Control Signals

The data paths shown in blue are Myrinet channels, those shown in black carry packet data and framing in a queue discipline, and the red paths indicate control information. If the Myrinet output channel is blocked from sending, the data source is blocked by the queue discipline. The data sink may, similarly, become blocked.

The Myrinet input circuit ignores IDLE control symbols, and separates other input characters into:

In order of priority, the Myrinet output circuit sends STOP/GO control symbols commanded from the slack buffer; any other control symbols commanded from the interface control circuit; and data and framing information from the data source if permitted by the flow control. Otherwise-unused cycles are filled with redundant STOP, GO, or GAP control symbols or with IDLE control symbols.

4.4 PACKET FRAMING (GAP)

The GAP control symbol indicates that the previous packet data on this channel is the tail of a packet, and that the next packet data on this channel is the head of a packet. Multiple, redundant GAP control symbols may appear between packets. Upon power-up initialization, a Myrinet channel does not require a GAP symbol to mark the first data byte as the head of the packet.

4.5 FLOW CONTROL (STOP, GO)
4.5.1 BASIC OPERATION

Flow control is accomplished on Myrinet channels by the receiver injecting STOP and GO control symbols into the stream being produced by the sender of the opposite-going channel. Flow control applies only to packet data; all control symbols are exempt from flow control and have priority over packet data. The STOP and GO control symbols used for flow control have priority over all other control symbols.

This flow-control mechanism is analogous to the simple X-off/X-on (^S/^Q) scheme used in RS-232 channels. The Myrinet receiver contains a "slack" buffer and the state information associated with the flow control. The Myrinet sender is simple, and its operation is independent of the amount of slack available in the receiver.

4.5.2 SLACK BUFFER

The slack buffer is a FIFO buffer of size r = k_g + h + k_s. The state of this buffer is characterized by the number of full cells, f; initially, f=0. The slack buffer can be visualized as a tank containing liquid:


Slack Buffer

The slack buffer generates a STOP control symbol when f increases to r - k_s, and generates a GO control symbol when f decreases to k_g. The k_s and k_g parts of the buffer provide the slack necessary for the delay between sender and receiver. The h part of the buffer provides hysteresis.
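
The threshold behavior can be illustrated with a small Python model. It is a literal transcription of the two rules above and not a description of the chip logic; the class name, method names, and the parameter values in the usage lines are hypothetical.

```python
from collections import deque

class SlackBuffer:
    """FIFO of size r = k_g + h + k_s that emits STOP/GO at its thresholds."""

    def __init__(self, k_g: int, h: int, k_s: int):
        self.k_g, self.k_s = k_g, k_s
        self.r = k_g + h + k_s
        self.cells = deque()

    @property
    def f(self) -> int:
        return len(self.cells)              # number of full cells

    def put(self, ch):
        """Store an arriving character; return "STOP" when f rises to r - k_s."""
        if self.f >= self.r:
            raise OverflowError("slack buffer overrun")
        self.cells.append(ch)
        return "STOP" if self.f == self.r - self.k_s else None

    def get(self):
        """Deliver a character to the data sink; report "GO" when f falls to k_g."""
        ch = self.cells.popleft()
        return ch, ("GO" if self.f == self.k_g else None)

buf = SlackBuffer(k_g=4, h=2, k_s=4)        # small illustrative sizes
fill_events = [buf.put(i) for i in range(6)]
drain_events = [buf.get()[1] for _ in range(2)]
```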

The parameter k_s is the slack available for stopping the flow on the Myrinet channel when the data sink becomes blocked, or when the data sink is operating at a lower bandwidth than the Myrinet channel. In the worst case in which the data sink becomes blocked, k_s must be large enough to stop the sender before the k_s buffer overflows. For B = 80MB/s and D = 140ns (section 4.2), there are 2*D*B = 23 characters in transit on the round-trip path. k_s must be at least 23 plus additional buffer positions required due to the latency in generating and interpreting the STOP control symbol. (At reduced data rates, the receiver buffer capacity will be adequate for proportionately larger D.)
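
The figures quoted above can be checked with a short calculation. The helper names are hypothetical; the cable length, velocity factor, and rate are the values given in section 4.2.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def cable_delay(length_m: float, velocity_factor: float) -> float:
    """One-way time in flight, D, in seconds."""
    return length_m / (velocity_factor * SPEED_OF_LIGHT)

def chars_in_round_trip(delay_s: float, rate_chars_per_s: float) -> int:
    """Characters in transit on the round-trip path, 2*D*B, rounded up."""
    return math.ceil(2 * delay_s * rate_chars_per_s)

D = cable_delay(25, 0.6)                      # about 139ns, quoted as ~140ns
in_flight = chars_in_round_trip(140e-9, 80e6) # 2 * 140ns * 80M/s = 22.4, so 23
```

Halving the character rate halves the number of characters in flight for the same cable, which is why the same buffer tolerates proportionately larger D at reduced rates.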

Similarly, k_g is the slack available for maintaining the flow into the data sink after f decreases to k_g, and the GO control symbol is sent. Unlike the k_s part of the slack buffer, which prevents data loss, the k_g part of the slack buffer is required only for performance reasons. If the bandwidths of the Myrinet channel and the data sink are similar, typical designs will make k_g = k_s = k, where k is said to be the slack of the channel.

The hysteresis parameter, h, is important only for reducing the number of STOP and GO symbols that must be sent on the opposite-going channel in cases in which the data sink takes packet data in short bursts. The usual value for this parameter is h = 16, which, even under the most adversarial conditions, limits the use of channel bandwidth for flow control to 6%.

In the Myricom LANai and Myrinet-interface chips, k_g and k_s provide sufficient buffering for flow control on 25m cables at 80MB/s data rates, and h = 16.

4.5.3 OPEN-LINK TIMEOUT

Myrinet channels must not block with faulty channels or host systems. If a Myrinet link is unused, disconnected, or connected to a failed or powered-off receiver, packets must be dropped rather than blocked.

To assure this property, Myrinet channels employ an open-link timeout in which either packet data or a control symbol other than IDLE must be sent at a minimum interval. The GO and STOP control symbols, which may appear redundantly either within or between packets, may be used to fill unused cycles on a blocked or idle channel. The stream of characters emitted by a Myrinet sender can be thought of as a carrier that is modulated to zero by IDLE characters, and to a detectable signal by non-IDLE characters. The open-link timeout is analogous to carrier detection. Receipt of any non-IDLE character resets a timeout counter. An overflow of the open-link-timeout counter in System A after 16 character periods (0.2us for 80MB/s channels) allows System A to determine that System B or the channel from B to A has failed.


Open Link Timeout

A Myrinet sender initializes to the GO condition. If System A receives a STOP control symbol and its sender enters the STOP condition, it will revert to the GO condition upon an open-link timeout.

If the fault in System B or in the cable from B to A occurred in the midst of a packet sent from B to A, the "downstream" path followed by that packet would be source-blocked (occupy a path that will not be released until the packet is terminated with a GAP control symbol). Accordingly, if an open-link timeout occurs in System A while its receiver is in the midst of a packet, the packet will be terminated with a GAP control symbol.

To satisfy the requirement that some non-IDLE character be sent within the minimum period, it suffices for the Myrinet-sending circuits to fill any otherwise-unused cycles with redundant GAP, STOP, or GO control symbols. The Myrinet interfaces in the Myricom LANai and Myrinet-interface chips fill unused cycles within packets with STOP/GO control symbols, and between packets with alternating GAP and STOP/GO control symbols. Single IDLE characters may be emitted within packets to satisfy internal timing requirements.

The open-link timeout, a nearly instantaneous mechanism for detecting unused or faulty links, is independent of the long-period, blocked-packet timeout (section 4.6.2).
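
A minimal model of the open-link timeout counter is sketched below. The class and attribute names are hypothetical, and the choice to clear the open-link indication as soon as a non-IDLE character arrives is an assumption rather than something this specification states.

```python
class OpenLinkDetector:
    """Counts character periods since the last non-IDLE character."""

    TIMEOUT_PERIODS = 16

    def __init__(self):
        self.idle_periods = 0
        self.link_open = False
        self.sender_state = "GO"            # a Myrinet sender initializes to GO

    def on_period(self, ch) -> bool:
        """Call once per character period with the received character
        ("IDLE" when nothing else arrived). Returns the open-link flag."""
        if ch != "IDLE":
            self.idle_periods = 0           # any non-IDLE character resets the counter
            self.link_open = False          # assumption: indication clears at once
            if ch in ("STOP", "GO"):
                self.sender_state = ch
        else:
            self.idle_periods += 1
            if self.idle_periods >= self.TIMEOUT_PERIODS:
                self.link_open = True
                self.sender_state = "GO"    # revert to GO on open-link timeout
        return self.link_open

det = OpenLinkDetector()
det.on_period("STOP")
flags = [det.on_period("IDLE") for _ in range(16)]
```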

4.6 RESETS
4.6.1 FORWARD RESET (FRES)


The forward-reset function, which may be initiated either at the control interface of a Myrinet interface or due to a blocked-packet timeout, causes the Myrinet sender to send an FRES control symbol and to enter the GO flow-control condition.

When the FRES control symbol is received, the input circuits and the slack buffer are placed into a reset condition in which data bytes are dropped and the slack buffer is held in a cleared condition, but STOP, GO, and other control symbols except for GAP are processed normally. The receiver remains in this reset condition until a GAP control symbol is received. The reset signal is also delivered to the control interface.

The GAP symbol that terminates the receiver's reset condition may be the GAP symbol that terminates a packet en route on the link, or a GAP symbol that fills unused cycles between packets. If the channel that is being reset is in the midst of a packet, the receiver thus discards the data bytes of the remainder of the packet up to and including the terminating GAP control symbol. Although the receiver has this responsibility, the GAP symbol that terminates the receiver's reset condition may alternatively be generated by the sender as part of the reset sequence, in which case the transmitter must, if it is in the midst of a packet, acknowledge but discard the remainder of the packet from the data source.

The actions of the sender or receiver dropping any en-route packet and the receiver clearing its slack buffer are the most effective way to achieve the reset function's main purpose, which is to clear network deadlocks.

In the Myricom LANai and Myrinet-interface chips, the sender generates a GAP symbol immediately after the FRES symbol, or one cycle later if there is a STOP or GO symbol to be sent. If the sender is in the midst of a packet, it discards the remainder of the packet.

In Myricom switches, the data sink (see the figure in section 4.3) is a cut-through routing circuit. The reset signal delivered to the control interface is used to reset parts of the routing circuit, and to append a GAP to any leading segment of a packet. In the Myricom LANai chip, the data sink is the LANai packet interface. Receipt of a forward reset sets an interrupt bit, and resets the processor to start executing at the initialization and bootstrap code.

4.6.2 BLOCKED-PACKET TIMEOUT

The flow control that is performed at each link allows the flow of packet data to be blocked either at the destination (downstream) or at the source (upstream). Destination blocking is normal. It occurs, for example, in Myrinet switches when the required outgoing channel is occupied by another packet. Destination blocking may occur also due to an erroneous or mis-addressed packet that causes the network to deadlock, or due to a failure in an interface. Source blocking is an abnormal and much less likely condition that could occur, for example, if the GAP symbol terminating a packet was not generated or was lost in transmission. In this case, the downstream path followed by the packet would remain occupied even though there was no packet data being sent on the links on this path.

In order to assure that the network will escape from these conditions, the Myrinet sender includes a long-period timeout. If the sender is held in the STOP flow-control condition or is in the process of sending a single packet for longer than 2^22 character periods (~4M character periods, or ~50ms for 80MB/s channels), it will, if in the stopped state, initiate a forward reset. If the sender is in the go state, it will terminate the current packet with a GAP symbol and consume the rest of the packet.
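
Because both timeouts are counted in character periods, their durations scale with the link rate. The following short calculation illustrates this; the constant and function names are hypothetical, and the 40M characters/s case is only an example of a reduced rate.

```python
OPEN_LINK_PERIODS = 16            # section 4.5.3
BLOCKED_PACKET_PERIODS = 2 ** 22  # this section

def timeout_seconds(periods: int, rate_chars_per_s: float) -> float:
    """Duration of a timeout counted in character periods."""
    return periods / rate_chars_per_s

for rate in (80e6, 40e6):
    open_link = timeout_seconds(OPEN_LINK_PERIODS, rate)
    blocked = timeout_seconds(BLOCKED_PACKET_PERIODS, rate)
    print(f"{rate / 1e6:.0f}M chars/s: open-link {open_link * 1e6:.2f}us, "
          f"blocked-packet {blocked * 1e3:.1f}ms")
```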

4.7 LINK-STATUS INDICATORS

The Myricom LANai and Myrinet-interface chips include two light-emitting-diode (LED) outputs to indicate the status of a port. These two outputs may be used either to drive separate red and green LEDs, or a single, two-color LED.

Green/Red LEDs   Two-color LED   Meaning
off/off          off             Not powered.
off/on           red             The link is disconnected or connected to a powered-off component (open-link timeout).
on/off           green           The link is connected but idle.
on/on            yellow          The link is connected and is sending or receiving data.

4.8 EXTENSIONS

None of the following extensions are required or implemented in the Myricom LANai and Myrinet-interface chips. They are described here only to indicate the nature of the extensions that may be employed.

4.8.1 FLOW-CONTROL OVERRUN (ORUN)

If the time-in-flight delay D is too large for the amount of k_s slack, the slack buffer can over-run. If an overrun occurs, the receiver may drop packet-data bytes or GAP control symbols.

Due to the asynchrony between the arrival and consumption of packet data, it is physically impossible in cases of nearly simultaneous arrival and consumption for the receiver circuits to determine with certainty whether an overrun has occurred. It is, however, possible to determine whether an overrun may have occurred, and in this case the receiver may generate an ORUN control symbol on the opposite-going channel, and produce an overrun alarm signal to the host-system control interface.

4.8.2 BACKWARD RESET (BRES)

The backward-reset function, when initiated at the control interface of a Myrinet circuit, causes the Myrinet sender to send a BRES control symbol. When the BRES is received, it is reported on its control interface, and the interface control circuit initiates a forward-reset function.

4.8.3 PROBES (PRB0-PRB7, REPL)

A set of 8 control symbols, PRB0-PRB7, may be generated by a Myrinet sender to probe the continuity of a pair of Myrinet channels, or to obtain status information or to command specific actions in the Myrinet receiver's host system. When a probe operation is commanded from the host-system control interface, the selected probe control symbol is sent on the outgoing Myrinet channel, and bit 4 of the interface control circuit's status word is set to 1. Upon receiving a probe, a Myrinet circuit may respond with a probe-reply (REPL) control symbol, in which up to 4 bits of status information are embedded. For example, in the standard byte encoding for control symbols (section 5.2):

 
                         bit   d 7 6 5 4 3 2 1 0
                   REPL code   0 1 1 1 1 X X X X

When a REPL control symbol is received, it is directed to the control interface, which copies bits 3-0 of the symbol into bits 3-0 of its status word, and sets bit 4 to 0. Bit 4 can, accordingly, be used to determine whether a reply to a previous probe has been received.
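
The status-word handshake can be sketched as follows. The 8-bit code values are read from the bit layout above (bits 7-0, with d = 0 carried separately); the function and constant names are hypothetical, and the width and other bits of the status word are not specified here, so the sketch leaves them untouched.

```python
REPL_PREFIX = 0xF0      # bits 7-4 = 1111, bits 3-0 carry the status
PROBE_PENDING = 0x10    # bit 4 of the status word

def make_repl(status: int) -> int:
    """Control-symbol code for a REPL carrying 4 bits of status."""
    return REPL_PREFIX | (status & 0x0F)

def is_repl(d: int, code: int) -> bool:
    return d == 0 and (code & 0xF0) == REPL_PREFIX

def on_probe_sent(status_word: int) -> int:
    """Sending a probe sets bit 4 of the status word to 1."""
    return status_word | PROBE_PENDING

def on_repl_received(status_word: int, code: int) -> int:
    """A REPL copies bits 3-0 into the status word and sets bit 4 to 0."""
    return (status_word & ~0x1F) | (code & 0x0F)

word = on_probe_sent(0x00)
pending = bool(word & PROBE_PENDING)
word = on_repl_received(word, make_repl(0b1010))
```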

The specific actions or status information returned by each of the 8 probe control symbols are system-dependent; however, by convention, PRB0 is a "simple probe" (or "ping") that can be used to generate a REPL that verifies the connectivity of a Myrinet link, and times the round-trip delay.

4.8.4 VIRTUAL CHANNELS

Systems that employ virtual channels that share the same physical channel may define virtual-channel selection control symbols (VC0, VC1, ...) that have the effect of selecting a particular slack buffer. It is then necessary to have a corresponding set of GO and STOP codes (GO0, STOP0, GO1, STOP1, ...).

4.8.5 OTHER EXTENSIONS

Systems that implement different subsets or supersets of these protocols are not necessarily incompatible. For example, the Myricom LANai and Myrinet-interface chips implement the functions associated with the IDLE, GAP, STOP, GO, and FRES control symbols. Only these control symbols can be generated, and any other control symbols received (such as ORUN, BRES, or probes) are ignored. These chips could communicate correctly over Myrinet links with systems that implemented a broader set of control-symbol functions.

Certain extensions are discouraged. For example, it may seem desirable to provide additional information about the packet format with control symbols -- eg, how is the header of this packet to be interpreted, and how is it to be routed? However, because this information must be routed together with a packet, the Myrinet input circuit would have to recognize these control symbols and direct them to the slack buffer. It would be more consistent to incorporate this information directly into the header of a packet.

5. MYRINET LINKS AT THE PHYSICAL-ELECTRONIC LEVEL
5.1 SIGNALS

The reference implementation of a Myrinet channel connects a sender (output) and receiver (input) through a 9-wire, parallel, communication medium:


Myrinet Channel Signals

If d=1, bits 7-0 convey the data byte. If d=0, bits 7-0 contain the code for a control symbol. The control-symbol codes are tabulated in section 5.2. The single-character signal names, {d, 7, ..., 0} are often used as subscripts, eg, on a port called "p3," a signal might be denoted as p3-out(4) or p3o4.

A binary 1 is encoded as a transition in either direction, and a binary 0 is encoded as the absence of a transition (non-return-to-zero (NRZ) encoding). Except for IDLE, whose code is all 0's, each character conveyed on the channel contains a transition on at least one of the 9 wires, allowing the arrival of a character to be detected.
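
The transition encoding amounts to an exclusive-or of each character with the current wire levels. The sketch below represents the 9 wires as a 9-bit integer with d as bit 8; that packing and the function names are assumptions made for illustration.

```python
def encode(chars, levels: int = 0):
    """Return the wire levels after each character: every 1 bit toggles its wire."""
    out = []
    for ch in chars:
        levels ^= ch
        out.append(levels)
    return out

def decode(level_seq, levels: int = 0):
    """Recover characters from successive wire levels: a changed wire is a 1."""
    chars = []
    for new in level_seq:
        chars.append(new ^ levels)
        levels = new
    return chars

GAP, IDLE = 0b000001100, 0b000000000
data = [0x100 | b for b in b"Hi"]          # d = 1 marks data bytes
sent = [GAP] + data + [IDLE, GAP]
wire = encode(sent)
received = decode(wire)
```

An IDLE character leaves every wire unchanged, which is why it cannot be distinguished from the absence of a character.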

The output rate of a Myrinet sender is determined by the _transitions_ of the sendclk input. For example, a symmetrical 40MHz clock signal is required to produce a rate of 80M characters/second (80MB/s data rate). In the Myricom LANai and Myrinet-interface chips, the transitions of {d, 7-0} occur ~5ns after the transitions of sendclk.


Signals and Transitions

The NRZ encoding has two particularly desirable properties. First, the maximum fundamental frequency of a data signal is only half the data rate, eg, a frequency of 40MHz at a data rate of 80MB/s. Of course, the spectrum of the signal contains harmonics that extend to higher frequencies. Second, the predominant interconnection fault, an open connector or cable wire, always causes the same error, a transmitted 1 received as a 0.

A Myrinet receiver operates asynchronously, and accepts characters whenever they may arrive.

At the Myrinet sender, the transitions that encode a character are nearly but not precisely coincident (within 0.5ns). Additional skew is inevitably introduced by different propagation delays in the transmission-line drivers (~1ns), different delays in the wires in the cable (4ns), and different delays in the receiver circuits (0.5ns). These variations include differences in the delays in circuits in this path in handling positive and negative transitions.

In order to accommodate this skew, which is typically less than 6ns, the receiver circuits employ a sampling-window technique. The window period w is nominally equal to 1/2 of the character period (6.25ns at 80MB/s). In the Myricom LANai or Myrinet-interface chips, the window width is determined by internal delays that are automatically adjusted by the period of the 'sendclk' signal. The width of the sampling window can also be adjusted by a digital input to these chips.

5.2 CHARACTER CODES
Character    Code (bits)          Function
             d 7 6 5 4 3 2 1 0
Data bytes   1 D D D D D D D D    Data byte D
IDLE         0 0 0 0 0 0 0 0 0    idle
GAP          0 0 0 0 0 1 1 0 0    packet gap
GO           0 0 0 0 0 0 0 1 1    flow-control (transmit-on)
STOP         0 0 0 0 0 1 1 1 1    flow-control (transmit-off)
ORUN         0 0 0 1 1 0 0 0 0    over-run alarm **
FRES         0 0 0 1 1 0 0 1 1    Forward Reset
BRES         0 0 0 1 1 1 1 0 0    Backward Reset **
PRBn         0 1 1 0 0 0 n n n    Probe nnn **
REPL         0 1 1 1 1 X X X X    Reply (XXXX) to a probe **


Functions marked ** are not included in current Myricom Myrinet components.

The codes for the critical control symbols allow for the correction of single 1->0 errors.
000001100, 000000100, 000001000 are decoded as GAP;
000000011, 000000010, 000000001 are decoded as GO;
000001111, 000001110, 000001101, 000001011, 000000111 are decoded as STOP; etc.
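The correction rule above can be sketched by building a lookup table that maps each control code, and every pattern reachable from it by clearing exactly one 1 bit, back to the intended symbol. This is a minimal sketch limited to GAP, GO, and STOP, the three symbols whose decodings are listed explicitly; the function names are illustrative, and a hardware decoder would presumably use combinational logic rather than a table.

```python
CONTROL_CODES = {
    "GAP":  0b000001100,
    "GO":   0b000000011,
    "STOP": 0b000001111,
}

def single_drop_variants(code: int, width: int = 9) -> set:
    """Return the code itself plus every value with exactly one 1 bit cleared."""
    variants = {code}
    for bit in range(width):
        if code & (1 << bit):
            variants.add(code & ~(1 << bit))
    return variants

def build_decode_table(codes: dict) -> dict:
    """Map every received pattern to its symbol, rejecting ambiguous patterns."""
    table = {}
    for name, code in codes.items():
        for received in single_drop_variants(code):
            if received in table:
                raise ValueError(f"{received:09b} is ambiguous")
            table[received] = name
    return table

DECODE = build_decode_table(CONTROL_CODES)

for received in sorted(DECODE, reverse=True):
    print(f"{received:09b} -> {DECODE[received]}")
```

The table builds without raising, which confirms that no single 1->0 error turns one of these three symbols into another or into one of the other's error patterns.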

5.3 ELECTRONIC CHARACTERISTICS

The Myricom LANai and Myrinet-interface chips are CMOS devices. Typical input-pin capacitance is 5pF, and inputs are protected against ESD and latchup to at least +/- 100mA. The input switching threshold at Vdd=5V is 2.4V +/- 0.1V. Timing measurements are between times at which signals cross the switching threshold.

All output pins are driven by a p-channel, n-channel MOSFET pair sized to produce the same transition and delay times for positive-switching and negative-switching signals. Outputs are ESD- and latchup-protected.

The following I-V characteristics of the output pins and of input pins without the ~10Kohm pulldowns were obtained by electrical measurement of a TYPICAL chip at 25C and Vdd = 5V. The sense of the currents shown is into the pin.

pin voltage (V)   high output (mA)   low output (mA)   input (mA)
     -2.0             -100.0            -100.0          -100.0
     -1.5             -100.0            -100.0           -75.10
     -1.0              -66.57            -46.30           -9.945
     -0.5              -35.18            -12.32           -0.0004
      0.0              -34.63             -0.37            0 *
      0.5              -34.00             10.55            0 *
      1.0              -33.21             19.16            0 *
      1.5              -32.14             24.93            0 *
      2.0              -30.60             28.20            0 *
      2.5              -28.36             29.69            0 *
      3.0              -25.16             30.24            0 *
      3.5              -20.79             30.47            0 *
      4.0              -15.13             30.57            0 *
      4.5               -8.13             30.63            0 *
      5.0                0.11             30.67            0 *
      5.5                9.44             30.73            0.00006
      6.0               50.19             66.18           11.69
      6.5             >100.0            >100.0            73.92
      7.0             >100.0            >100.0          >100.0

* 0 +/- 1nA


The static saturation currents differ (~35mA for the p-channel output driver to 0V versus ~31mA for the n-channel driver to Vdd=5V), but the devices also exhibit somewhat different nonlinearities. The net result of the transistor sizing is very similar (0.2ns difference) switching and delay times into typical (20pF) lumped-capacitive loads.
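The saturation currents quoted above can be read directly from the I-V table, and the same rows give a rough idea of each driver's resistance near its own rail. The sketch below copies four rows from the table. The chord-resistance estimate (delta-V over delta-I between adjacent rows) is an illustrative calculation added here, not a figure stated by Myricom.

```python
# pin voltage (V) -> (high-output current, low-output current) in mA, from the table.
IV_POINTS = {
    0.0: (-34.63, -0.37),
    0.5: (-34.00, 10.55),
    4.5: (-8.13, 30.63),
    5.0: (0.11, 30.67),
}
HIGH, LOW = 0, 1  # column indices into the tuples above

def chord_resistance_ohms(v1: float, v2: float, column: int) -> float:
    """Estimate resistance as delta-V / delta-I between two table rows."""
    i1_ma = IV_POINTS[v1][column]
    i2_ma = IV_POINTS[v2][column]
    return (v2 - v1) / ((i2_ma - i1_ma) * 1e-3)

p_saturation_ma = abs(IV_POINTS[0.0][HIGH])  # p-channel driver pulled to 0V
n_saturation_ma = IV_POINTS[5.0][LOW]        # n-channel driver pulled to Vdd

r_n_near_ground = chord_resistance_ohms(0.0, 0.5, LOW)
r_p_near_vdd = chord_resistance_ohms(4.5, 5.0, HIGH)

print(f"p-channel saturation ~{p_saturation_ma:.0f} mA, n-channel ~{n_saturation_ma:.0f} mA")
print(f"n-channel near 0V ~{r_n_near_ground:.1f} ohm, p-channel near Vdd ~{r_p_near_vdd:.1f} ohm")
```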

5.4 SIGNAL INTEGRITY

If the sender and receiver are physically close enough that the communication medium does not need to be treated as a transmission line, the output signals may be connected directly to the corresponding input signals. Transmission lines, however, require suitable drivers and termination, and may also employ signal-conditioning circuits on the receiver end of the cable.

Myricom Myrinet links are multiple-twisted-pair transmission lines whose characteristic impedance is in the range of 100 ohms to 110 ohms. AT&T 41MM transceiver chips provide the transmission-line drivers, receivers, and termination. The twisted-pair lines in the cable are driven in a balanced, differential mode in which each wire in the pair switches between ~3V and ~4V, the sum of the two voltages is nearly constant, and the difference between the voltages switches between ~-1V and ~+1V.
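The balanced drive described above can be illustrated with the approximate levels from the text. In this sketch the polarity (which wire goes high for a logical 1) is an assumption, and the 3V and 4V values are the nominal figures quoted, not measured ones.

```python
V_LOW, V_HIGH = 3.0, 4.0  # approximate single-wire levels quoted in the text

def drive_pair(bit: int) -> tuple:
    """Return (v_plus, v_minus) for one bit; the polarity is assumed."""
    return (V_HIGH, V_LOW) if bit else (V_LOW, V_HIGH)

for bit in (0, 1):
    v_plus, v_minus = drive_pair(bit)
    common_sum = v_plus + v_minus      # stays constant: the balanced property
    differential = v_plus - v_minus    # swings between -1V and +1V
    print(f"bit={bit}: +wire={v_plus}V -wire={v_minus}V "
          f"sum={common_sum}V diff={differential:+}V")
```

Because the sum of the two wire voltages does not change when the bit changes, the pair ideally produces no common-mode signal. Only the difference carries information.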

The characteristic impedance of the transmission line in the balanced mode is matched by a 110-ohm load resistor, which is built into the line receivers in the 41MM transceiver chip. The characteristic impedance of the longitudinal (common) mode of propagation is approximately matched in "split termination" by the outputs of the "pseudo-ECL" AT&T 41MM line drivers, which include 220-ohm pulldown resistors to ground.


Transmission Line

5.5 CABLES

Myrinet links between Myricom components may employ a variety of cable types in the 100-ohm to 110-ohm range of characteristic impedance. Depending upon the EMI requirements, either unshielded or shielded cable may be used. The cable should exhibit a skew (maximum difference in delay from line to line) less than 3ns.

The preferred cable for Myrinet links is a 20-twisted-pair AT&T Type 1249 Digital Interconnect Cable, AT&T part number 1249020A. This cable is shielded, 0.40 inches in diameter, quite flexible, and is UL-Listed Type CL-2. The balanced-mode characteristic impedance is ~105 ohms, and the propagation velocity is ~0.6c. This cable exhibits a relatively small skew, and may be used for 80MB/s links up to 25m.
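To give a feel for what a 25m cable means at 80MB/s, the following sketch converts the quoted propagation velocity (~0.6c) into a one-way delay and into the number of characters in transit on one channel. The function names are illustrative, and the result is only as accurate as the approximate velocity figure.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def one_way_delay_ns(length_m: float, velocity_factor: float = 0.6) -> float:
    """Propagation delay along the cable, in nanoseconds."""
    return length_m / (velocity_factor * SPEED_OF_LIGHT_M_PER_S) * 1e9

def chars_in_flight(length_m: float, chars_per_s: float = 80e6,
                    velocity_factor: float = 0.6) -> float:
    """Characters simultaneously in transit on one channel of the link."""
    return one_way_delay_ns(length_m, velocity_factor) * 1e-9 * chars_per_s

delay_25m = one_way_delay_ns(25.0)
in_flight_25m = chars_in_flight(25.0)
print(f"25 m cable: ~{delay_25m:.0f} ns one way, ~{in_flight_25m:.1f} characters in flight")
```

Under these assumptions a maximum-length cable holds roughly eleven characters per direction at any instant.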

Amphenol Twist-N-Flat cable, Amphenol 843-132-2601-040, is a ~105-ohm, unshielded cable that may be used in some applications, and offers the advantage of gang termination. Although the flat (ribbon) version of this cable may be used for links up to ~20m, the round, shielded type of Twist-N-Flat cable exhibits substantially larger skew, and may be used only over distances up to ~10m.

An evaluation of the suitability of other types of cable may be obtained by contacting Myricom.

5.6 CONNECTORS

The standard cable-connector for Myricom Myrinet ports is a 37-pin mini-D connector. The board-mounted connectors are female, and the cable-end connectors are male.

The standard pin numbering for the board-mounted connector as viewed looking into the connector and the signal-to-pin assignment in this same orientation are:


Pin Numbers

The signals s*+ and s*- are the differential outputs of the line drivers for bits * = {d, 7, 6, 5, 4, 3, 2, 1, 0}, and the signals r*+ and r*- are the corresponding differential inputs of the line receivers. The GND pin is the electrical-signal and chassis ground, which connects to the cable shield and unused conductors. GND is not a reference potential for the differential signals.

5.7 CABLE ASSEMBLY

The connection of each sender signal to the corresponding receiver signal is accomplished by assembling the connectors to the ends of the cable with a 180-degree reversal (mirror image). For example, the following procedure is used to assemble a Myrinet link using Amphenol Twist-N-Flat cable:

  1. Cut the cable to the desired length. Slip the connector hoods over the ends of the cables. Then, for each end:

  2. If the cable is shielded, remove 2.5 inches of the plastic outer covering, and unbraid the shield.

  3. Separate the two outer pairs, and split the remaining 18 pairs into two 9-pair segments with a 1-inch cut.

  4. Insert a 4-inch 28AWG wire into the middle of the two 9-pair segments while gang-terminating the ribbon cable to the cable-end connector. For the second end assembled, be sure that the ribbon cable is reversed from its orientation on the first end (note the colors of the wires).

  5. Strip the single 28AWG wire, and the 4 wires from the unused pairs. Solder all 5 wire ends together, and to the shield if any.

  6. After testing the resulting cable, assemble the connector hood and locking hardware to the cable connectors.

6. ACKNOWLEDGEMENTS

The logical-level specifications for Myrinet links are a simplification of the "dialog-channel" specifications developed in the Caltech Submicron Systems Architecture Project. The evolution of these protocols was a result of interactions over a period of two years between the Caltech project and the USC Information Sciences Institute ATOMIC project, which developed an experimental, high-speed, local-area network based on Caltech components developed for an experimental multicomputer. Both the Caltech Submicron Systems Architecture Project and the USC/ISI ATOMIC project are sponsored by the Advanced Research Projects Agency (ARPA) of the Department of Defense. The ATOMIC LAN is the research prototype of Myrinet.

All other specifications for Myrinet links and routing were developed by Myricom.


