(c)1994 Myricom, Inc., LANai 4.X, 17 January 1996 - page 0
==============================================================================

TABLE OF CONTENTS

0.  INTRODUCTION
    0.1 LANai 4 processor
    0.2 Special, Memory-Mapped Registers
    0.3 Memory Interface
    0.4 Data Communication
1.  PACKET SENDING
2.  PACKET RECEIVING
3.  EBUS-LBUS DATA TRANSFER
4.  INTERNET-CHECKSUM COMPUTATION
5.  TIMERS/COUNTERS
6.  MEMORY PROTECTION
7.  MISCELLANEOUS SPECIAL REGISTERS
8.  INTERRUPT-STATUS REGISTER
9.  INITIALIZATION
A.  SPECIAL-REGISTERS SUMMARY
B.  CHIP-VERSION-SPECIFIC INITIALIZATION

                                  LANai 4.X

0. INTRODUCTION

The LANai 4.X is a series of programmable communication devices that provide
an interface to the Myrinet local area network. As illustrated in Figure 1,
a LANai 4 chip consists of the LANai core, with an instruction-interpreting
processor and a packet interface, the Myrinet-link interface, the memory
interface, and a DMA/Checksum engine.
[Figure 1: A LANai 4 chip. Block diagram of the Myrinet-link interface
(connected to Myrinet), the LANai core (processor and packet interface),
the DMA/Checksum engine, and the memory interface (connected to the LBUS
and the EBUS).]

0.1 LANai 4 processor

This specification describes only the network-related features of the
LANai 4 processor; for instruction-set details, consult the LANai 4
Instruction Set specification.

The LANai processor is a 32-bit dual-context machine, with 24
general-purpose registers. One context is interruptable, and we shall refer
to it as the `user' context. An interrupt during the execution of the user
context causes the processor to switch to the other, `system', context. The
system context is not interruptable. The system context transfers execution
to the user context by executing the `punt' assembly instruction (see LANai 4
Instruction Set specification). Executing the `punt' instruction in the user
context similarly switches execution to the system context. Upon reset, the
LANai processor begins executing code in the system context, starting from
address 0.
The three special registers that control interrupt generation are described
in detail in Section 8. They are: the Interrupt Status Register (ISR), the
Interrupt Mask Register (IMR), and the External Interrupt Mask Register
(EIMR). When a bit of ISR is equal to 1 and the corresponding bit in the IMR
is equal to 1, an interrupt request is asserted for the on-chip processor.
When a bit of ISR is equal to 1 and the corresponding bit of EIMR is equal
to 1, the external-interrupt output pin of the LANai 4 chip is asserted.

The remainder of the special-purpose registers are used for data
communication and are described in this specification.

0.2 Special, Memory-Mapped Registers

All packet-interface special registers and interrupt-control special
registers except for the IMR are memory-mapped. The memory-mapped special
registers can be accessed both by the LANai on-chip processor and from the
external-access bus (EBUS). The IMR is an internal register of the LANai
on-chip processor and is not accessible from the EBUS. The summary of
special-register access modes, and the relative addresses of these registers
can be found in Appendix A.

To access a memory-mapped special register from the LANai processor one
should use the address of 0xFFFFFF00 plus the relative address of that
special register as specified in Appendix A. The base address for EBUS access
of memory-mapped special registers is application-specific; consult system
documentation for details.

When accessing the special, memory-mapped registers, the regular memory
arbitration mechanism described in Section 0.3 applies. The mutual exclusion
at any higher level is the responsibility of the programmer.

0.3 Memory Interface

In the remainder of this specification, we shall refer to 8-bit data units
as bytes, to 16-bit units as half-words, and to 32-bit units as words.
Although the internals of the LANai 4.X series of chips support 32-bit
addresses, pin-count limitations restrict the LBUS of the LANai 4.0 and 4.1
chips to a maximum of 1M bytes.

The LBUS operates at twice the chip-clock speed --- there are two memory
cycles for every clock cycle. The external-access bus (EBUS), the
packet-interface receive DMA, and the packet-interface send DMA each request
a maximum of one memory access per clock cycle. The on-chip processor
requests up to two memory accesses per clock cycle (instruction and data).
The two memory cycles within each clock cycle are assigned based on the
following priority (highest to lowest): EBUS, receive DMA, send DMA, and the
processor. Since every EBUS memory request is granted, the LANai 4 chip along
with the memory on its LBUS appears as a block of synchronous memory when
observed from the EBUS.

The word and half-word memory accesses on the LBUS must be aligned; any
least-significant bits of an address that would make a memory access
non-aligned are ignored. Both the LBUS and the EBUS addresses are byte
addresses, and the byte order is big-endian (the most significant byte of a
word is stored at the lowest byte address).

The LANai chip provides a rudimentary memory-protection mechanism that
allows a memory segment of programmable size to be write-protected from the
LANai core, i.e., writeable only from the EBUS (Section 6).

Although the LANai core cannot access the EBUS directly, the on-chip
processor can initiate a data transfer between the LBUS and the EBUS
(Section 3).

0.4 Data Communication

The LANai 4 chip provides the programmer with bidirectional access to the
Myrinet network. A data-communication, flow-control unit is called a flit,
and consists of eight data bits plus a tail bit. Packets are of arbitrary
length (in flits), and the tail bit marks the last flit of every packet.
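As a rough illustration of the flit and packet structure just described, the
following C sketch models a flit as eight data bits plus a tail bit, and finds
the length of a packet by scanning for the tail. The `struct flit` type and the
`packet_length` function are hypothetical names introduced here for
illustration only; they are not part of the LANai hardware or of any Myricom
software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of a flit: 8 data bits plus a tail bit. */
struct flit {
    uint8_t data;  /* the eight data bits */
    uint8_t tail;  /* 1 on the last flit of a packet, 0 otherwise */
};

/* Returns the length, in flits, of the first packet found in buf[0..n-1],
   counting the tail flit itself. Returns 0 if no tail flit is present,
   i.e., the packet is not yet complete. */
static size_t packet_length(const struct flit *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (buf[i].tail)
            return i + 1;
    }
    return 0;
}
```

Because the tail bit is the only packet delimiter, a packet of any length is
described without a separate length field; a receiver simply consumes flits
until it sees the tail.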
The byte order in the communication network is big-endian, i.e., the most
significant byte of a word (or of a half-word) appears first in the network.

Packets are injected into the network and consumed off of the network by the
packet interface. The packet interface is controlled by accessing special,
memory-mapped registers, described in the remainder of this specification.

After the LANai 4 chip is out of reset, and prior to any Myrinet-network
access, the TIMEOUT, MYRINET, and VERSION special registers (Section 7) must
be initialized.

1. PACKET SENDING

An outgoing packet is produced by writing into the following 32-bit special
registers:

SB (Send Byte) --- appends to the outgoing packet the least significant
    byte of the value written into SB.

SH (Send Half-Word) --- appends to the outgoing packet the least
    significant half-word written into SH (high-order byte first).

SW (Send Word) --- appends to the outgoing packet the word written into SW
    (most significant byte first).

ST (Send Tail) --- writing a 0 into ST completes the outgoing packet by
    appending a tail flit with its data field equal to the
    cyclic-redundancy-check (CRC-8) byte for that packet.

SML (Send Memory Limit) --- initiates a send-DMA transfer that appends to
    the outgoing packet, one word at a time, the contents of the memory
    buffer that starts at SMP and ends at SML (it is not possible to specify
    a zero-length buffer).

SMLT (Send Memory Limit, with the Tail) --- the same as SML, and, in
    addition, completes the outgoing packet by appending a tail flit with
    its data field equal to the CRC-8 byte for that packet.

SMP (Send Memory Pointer) --- specifies the beginning of the send-DMA
    memory buffer. This register is incremented by four by the packet
    interface as each word is appended to the outgoing packet, and, upon
    completion, equals SML+4.
SA (Send Align) --- the two least-significant bits of this register specify
    how many leading flits (0-3) of the contents of the next-specified
    send-DMA memory buffer should NOT be appended to the outgoing packet.
    Only the first send-DMA transfer following a write into this register
    is affected.

See Section 8 for information on how the completion of a send-DMA transfer
can be detected.

Since the send-DMA transfer specified by the SMP and SML accesses 32 bits at
a time, the send memory buffer must be aligned on a word boundary. Hence, the
two least significant bits of SMP and the two least significant bits of
SML(T) are hard-wired to zero.

SB, SH, SW, ST, and SA are not physical storage registers, and reading any
of them produces undefined values. Reading SMP and SML(T) are valid
operations, but one should note that SML and SMLT are stored in the same
physical register.

Writing any of the send registers during a send-DMA transfer may corrupt the
outgoing packet.

2. PACKET RECEIVING

An incoming packet is accepted from the network by accessing the following
special registers:

RB (Receive Byte) --- reading RB, an 8-bit special register, consumes one
    byte off of the incoming packet.

RH (Receive Half-Word) --- reading RH, a 16-bit special register, consumes
    one half-word off of the incoming packet (the first byte consumed
    becomes the most significant one).

RW (Receive Word) --- reading RW, a 32-bit special register, consumes one
    word off of the incoming packet (the first byte consumed becomes the
    most significant one).

RML (Receive Memory Limit) --- writing into RML, a 32-bit special register,
    enables a receive-DMA transfer and instructs the packet interface to
    put the (remainder of the) incoming packet, one word at a time, into
    the memory buffer that starts at RMP and ends at RML (it is not
    possible to specify a zero-length buffer).
RMP (Receive Memory Pointer) --- a 32-bit special register, specifies the
    beginning of the receive-DMA memory buffer. This register is
    incremented by four by the packet interface as each word is written
    into the buffer. After an entire packet has been received, RMP points
    to the first word past the end of the packet, unless the buffer has
    been exhausted (Section 8).

See Section 8 for information on how the completion of a receive-DMA
transfer can be detected.

Since the receive-DMA transfer specified by the RML and RMP accesses 32 bits
at a time, the receive memory buffer must be aligned on a word boundary.
Hence, the two least significant bits of RML and the two least significant
bits of RMP are hard-wired to zero.

Upon receipt, any packet with a non-zero tail flit has failed a cyclic
redundancy check (CRC-8). This should be checked in software after the packet
has been received.

It is possible to read past the tail flit of the incoming packet (with RH,
RW, or with receive DMA). In such cases, the packet following the currently
accessed packet is guaranteed not to be corrupted. However, the bytes
corresponding to the flits past the tail flit are undefined (Section 8).

RB, RH, and RW are not physical storage registers, and writing any of them
is a nop. Reading RMP or RML is a valid operation.

Initiating any receive operation during a receive-DMA transfer may corrupt
the incoming packet.

3. EBUS-LBUS DATA TRANSFER

In the typical operating regime, a LANai 4 chip operates as a slave device
on the EBUS. In this regime, the LANai 4 chip along with the memory on its
LBUS appears as a block of synchronous memory when observed from the EBUS.

The LANai 4 chip incorporates a DMA engine that can be instructed to perform
data transfer between the LBUS and the EBUS, and, in this regime only, the
chip acts as a master on the EBUS.
The LANai EBUS interface is simple and generic, and extra hardware is
necessary to connect it to any standard bus.

The EBUS-LBUS DMA engine is controlled by the following 32-bit, special
registers:

LAR (LBUS Address Register) --- points to the beginning of the DMA buffer on
    the LBUS. This register is incremented by four as each word is
    transferred, and, upon completion, points to the first word past the
    LBUS DMA buffer.

EAR (EBUS Address Register) --- points to the beginning of the DMA buffer on
    the EBUS. This register is incremented by four as each word is
    transferred, and, upon completion, points to the first word past the
    EBUS DMA buffer.

DMA_DIR (DMA Direction) --- the least-significant bit of this register
    controls the direction of the EBUS-LBUS DMA transfer.

        Bit 0 = 1: EBUS -> LBUS
        Bit 0 = 0: LBUS -> EBUS

DMA_CTR (DMA Counter) --- writing a non-zero value into the DMA_CTR register
    initiates the DMA transfer. This register is decremented by four as
    each word is transferred, and equals 0 upon completion.

DMA_STS (DMA Status) --- the four least-significant bits of this register
    specify the allowed burst sizes for the EBUS DMA. If a DMA_STS bit is
    equal to 1, the corresponding burst size of the EBUS DMA is enabled.
    The 1-word bursts are always enabled.

        Bit 3: 16-word bursts
        Bit 2:  8-word bursts
        Bit 1:  4-word bursts
        Bit 0:  2-word bursts

See Section 8 for information on how the completion of an EBUS-LBUS DMA
transfer can be detected.

Since the EBUS-LBUS DMA transfers 32 bits at a time, the LBUS memory buffer
must be aligned on a word boundary. Hence, the two least significant bits of
LAR and the two least significant bits of DMA_CTR are hard-wired to zero.

DMA_DIR and DMA_STS are write-only registers, and reading either of them
produces undefined values. Reading LAR, EAR, or DMA_CTR is a valid operation.

Writing any of the registers listed above during an EBUS-LBUS DMA transfer
may result in violation of the protocol on the I/O bus that the EBUS connects
to.
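The register sequence described in this section can be sketched in C as
follows. This is only an illustration under stated assumptions: the
`struct ebus_dma_regs` layout, the `dma_start` and `dma_done` helpers, and the
enumerator names are hypothetical, and the field order does not reflect the
real relative addresses (those are listed in Appendix A). The one ordering
constraint taken from the text is that DMA_CTR is written last, because that
write initiates the transfer. The dma_int position (bit 4) is taken from the
ISR layout in Section 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block. Field order is illustrative only; the real
   relative addresses of these registers are given in Appendix A. */
struct ebus_dma_regs {
    volatile uint32_t lar;      /* LBUS Address Register */
    volatile uint32_t ear;      /* EBUS Address Register */
    volatile uint32_t dma_dir;  /* bit 0: 1 = EBUS -> LBUS, 0 = LBUS -> EBUS */
    volatile uint32_t dma_sts;  /* bits 3..0: enabled burst sizes */
    volatile uint32_t dma_ctr;  /* byte count; a non-zero write starts DMA */
};

enum { DMA_LBUS_TO_EBUS = 0, DMA_EBUS_TO_LBUS = 1 };

/* DMA_STS burst-enable bits (1-word bursts are always enabled). */
enum { BURST_2 = 1 << 0, BURST_4 = 1 << 1, BURST_8 = 1 << 2, BURST_16 = 1 << 3 };

#define ISR_DMA_INT (1u << 4)   /* dma_int, bit 4 of ISR */

/* Programs one EBUS-LBUS transfer of nbytes bytes (a non-zero multiple of 4). */
static void dma_start(struct ebus_dma_regs *r, uint32_t lbus_addr,
                      uint32_t ebus_addr, uint32_t nbytes,
                      uint32_t dir, uint32_t bursts)
{
    /* The two low bits of LAR and DMA_CTR are hard-wired to zero, so
       unaligned values would be silently truncated by the hardware. */
    assert((lbus_addr & 3u) == 0);
    assert(nbytes != 0 && (nbytes & 3u) == 0);

    r->lar     = lbus_addr;
    r->ear     = ebus_addr;
    r->dma_dir = dir & 1u;
    r->dma_sts = bursts & 0xFu;
    r->dma_ctr = nbytes;        /* written last: this starts the transfer */
}

/* Given a value read from ISR, reports whether the transfer has completed. */
static int dma_done(uint32_t isr)
{
    return (isr & ISR_DMA_INT) != 0;
}
```

On real hardware the caller would poll ISR (or take the dma_int interrupt)
before touching any of these registers again, since writing them during a
transfer may violate the protocol of the attached I/O bus.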

4. INTERNET-CHECKSUM COMPUTATION

The LANai 4 chip includes a mechanism to compute a partial Internet
checksum. The partial checksum is stored in the CKS (ChecKSum) register. This
register is modified as a side effect of the EBUS-LBUS DMA transfers. Upon
completion of an EBUS-LBUS DMA transfer (Section 8), the CKS register
contains the result of the 32-bit, 1's-complement addition of its initial
value and the values of all transferred data items (the DMA engine transfers
32-bit data items only).

A typical 16-bit-Internet-checksum computation consists of: writing zero
into CKS; performing one or more EBUS-LBUS DMA transfers; and, then adding
the most and least significant half-word of CKS (in software) using 1's
complement addition.

Writing the CKS register during an EBUS-LBUS DMA transfer may corrupt the
checksum computation.

5. TIMERS/COUNTERS

There are two real-time counters on the LANai 4 chip, both of which use the
time reference that is equal to 40 times the period of the clock of the
Myrinet-link interface. Nominally, this is a 40 MHz clock, so the time
reference is equal to 1 microsecond.

5.1 The Real-Time Clock

The RTC special register is a 32-bit counter that is incremented every
time-reference period.

5.2 The Interrupt Timer

The IT special register is a 32-bit counter that is decremented every
time-reference period. Whenever this counter makes a transition from
0x00000000 to 0xFFFFFFFF, a timer interrupt occurs. Whenever it makes a
transition from 0x80000000 to 0x7FFFFFFF, a watchdog interrupt occurs. See
Sections 0.1 and 8 for information on interrupt handling.
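The software side of Sections 4 and 5 can be sketched in C as follows. The
function names are hypothetical. `ones_add32` is a software model of the
32-bit 1's-complement accumulation, assuming the usual end-around carry;
`cks_fold16` is the half-word addition that Section 4 leaves to software; and
`rtc_elapsed_ticks` shows that unsigned subtraction of two RTC readings stays
correct across a wrap of the 32-bit counter.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit 1's-complement addition (end-around carry). This models, in
   software, the accumulation that Section 4 attributes to the CKS register. */
static uint32_t ones_add32(uint32_t a, uint32_t b)
{
    uint64_t s = (uint64_t)a + b;
    return (uint32_t)((s & 0xFFFFFFFFu) + (s >> 32));
}

/* Adds the most and least significant half-words of a CKS value using
   16-bit 1's-complement addition. */
static uint16_t cks_fold16(uint32_t cks)
{
    uint32_t s = (cks >> 16) + (cks & 0xFFFFu);
    s = (s & 0xFFFFu) + (s >> 16);   /* fold the carry back in */
    return (uint16_t)s;
}

/* Number of time-reference periods (nominally microseconds) between two
   RTC readings; modular arithmetic handles counter wrap-around. */
static uint32_t rtc_elapsed_ticks(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Self-check using the byte sequence 00 01 f2 03 f4 f5 f6 f7 from the
   numerical example in RFC 1071, taken as two big-endian 32-bit words. */
static void cks_selftest(void)
{
    uint32_t cks = 0;
    cks = ones_add32(cks, 0x0001F203u);
    cks = ones_add32(cks, 0xF4F5F6F7u);
    assert(cks_fold16(cks) == 0xDDF2u);
}
```

Note that the value stored in an IP, TCP, or UDP header is the bitwise
complement of this folded sum (RFC 1071); that final step is outside what this
specification describes.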

6. MEMORY PROTECTION

NOTE: This feature is not provided in the LANai 4.0 chip.

The LANai chip provides a rudimentary memory-protection mechanism that
allows a memory segment of programmable size to be write-protected from the
LANai core, i.e., writeable only from the EBUS.

The WRITE_ENABLE special register contains the following bits:

    bit# :    31    30    29    28    27    26    25    24   most-significant
    value:    WE     -     -     -     -     -     -     -   byte

    bit# :    23    22    21    20    19    18    17    16
    value:     -     -     -     -   A19   A18   A17   A16

    bit# :    15    14    13    12    11    10     9     8
    value:   A15   A14   A13   A12     -     -     -     -

    bit# :     7     6     5     4     3     2     1     0   least-significant
    value:     -     -     -     -     -     -     -     -   byte

If the WE (Write Enable) bit is 1, the LANai is allowed to write to any
memory location (no memory protection). Upon reset, this is the default value
of the WE bit.

If the WE bit is 0, the A12-A19 bits (the A bits) define the region(s) of
memory in which the LANai core is allowed to write: a write to a memory
location is allowed if a bit in the address of that memory location is 1 and
the corresponding A bit in the WRITE_ENABLE special register is 1.

A typical use of this mechanism is to write-protect a memory segment at the
bottom of the memory (where the LANai code is usually kept), and allow writes
to addresses up to the highest available memory on the LBUS. For example,
writing the value 0x0007E000 to the WRITE_ENABLE special register
write-protects the lowest 8KB and allows writes to addresses up to 512KB.

The pin-count limitations restrict the LBUS of the LANai 4 chip to a maximum
of 1M bytes.

The WRITE_ENABLE special register is a write-only register, and reading it
produces an undefined value.

7. MISCELLANEOUS SPECIAL REGISTERS

TIMEOUT --- the TIMEOUT special register specifies the timeout period of
    the LANai watchdog timer. If the LANai 4.0 chip fails to consume an
    incoming message from the network for the duration of the timeout
    period, the watchdog timer asserts the NRES signal (network attempting
    to reset the LANai). The two least-significant bits of the TIMEOUT
    register determine the timeout period:

        TIMEOUT    timeout period
           0       1/16 second
           1       1/4 second
           2       1 second
           3       4 seconds

MYRINET --- the two least-significant bits of the MYRINET special register
    are:

    NRES_ENABLE (bit 0) - when the NRES signal is asserted, the nres_int
        bit of ISR is set (see Section 8). A value 1 in the
        least-significant bit of the MYRINET special register requests
        that the LANai chip be reset when NRES is asserted.

    CRC_ENABLE (bit 1) - as discussed in Sections 1 and 2, the packet
        interface completes every outgoing packet with the CRC-8 byte, and
        verifies the CRC-8 byte for every incoming packet. The tail flit
        contains the CRC-8 byte only if the CRC_ENABLE bit of the MYRINET
        special register is 1. When this bit is 0, the tail, CRC flit of
        every outgoing packet is set to 0, and no CRC verification is
        performed on the incoming packets.

The Myrinet-link interface part of the LANai 4 chip requires configuration
before use. After the chip is out of reset and prior to any Myrinet access,
the chip-version-specific value (Appendix B) must be written to this
register.

VERSION --- writing to the VERSION special register configures the
    Myrinet-link interface.

The LED special register can be used by the programmer to modify the values
of 11 dedicated LANai 4 output pins. The name of this register is only
suggestive of its actual usage which is application specific.

LED --- the 11 least-significant bits of the special register LED are driven
    to the LED_0 through LED_10 output pins.

The special register CLOCK controls the phase relationship of the on-chip
generated clocks.
During the power-on reset, the chip-version-specific value (Appendix B) must
be written to this register from the EBUS. It must not be modified while the
chip is out of reset.

CLOCK --- the special register CLOCK controls the on-chip clock generation.

The special registers described in this section are write-only registers,
and reading any of them produces undefined values.

8. INTERRUPT-STATUS REGISTER

The ISR bits with the _sig (signal) postfix are included for simple
host-LANai communication. A signal bit can be set only by the LANai processor
and reset only from the EBUS, or vice versa.

The ISR bits with the _rdy (ready) postfix are maintained by the packet
interface, and cannot be modified by the programmer. However, when such a bit
becomes 1, the packet interface may reset it only as a result of a
bit-specific request to the packet interface, as described later in this
section. Please read the warning about the timing of the clearing of _rdy
bits at the end of this section.

The ISR bits with the _int (interrupt) postfix are set by the packet
interface when their corresponding events occur. These bits can be reset
directly --- by writing a 1 into them, or indirectly --- in a bit-specific
way. Please read the warning about the timing of the bit-specific clearing of
_int bits at the end of this section.
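The write semantics of the three bit classes can be summarized in a small
software model. The sketch below is illustrative only: the macro and function
names are hypothetical, the masks are derived from the ISR bit layout given in
this section, and the model covers only direct writes to ISR. It does not model
the indirect, bit-specific clearing or the pipeline delays covered by the
warning at the end of this section.

```c
#include <assert.h>
#include <stdint.h>

/* Masks derived from the ISR bit layout in this section. */
#define ISR_DBG_BIT   0x80000000u  /* bit 31, always 1 */
#define ISR_HOST_SIG  0x40000000u  /* bit 30, host_sig */
#define ISR_LAN_SIGS  0x00FF0000u  /* bits 23..16, lan7_sig..lan0_sig */
#define ISR_RDY_BITS  0x0000E001u  /* word_rdy, half_rdy, send_rdy, byte_rdy */
#define ISR_INT_BITS  0x00000FFEu  /* bits 11..1, all _int bits */

/* Model of ISR after the LANai processor writes the value w into it. */
static uint32_t isr_after_lanai_write(uint32_t isr, uint32_t w)
{
    assert(isr & ISR_DBG_BIT);      /* bit 31 always reads as 1 */
    isr &= ~(w & ISR_INT_BITS);     /* _int bits: writing 1 clears */
    isr &= ~(w & ISR_LAN_SIGS);     /* lanN_sig: the LANai writes 1 to reset */
    isr |= (w & ISR_HOST_SIG);      /* host_sig: the LANai writes 1 to set */
    return isr;                     /* _rdy bits are never changed by a write */
}

/* Model of the _sig bits after a write of w from the EBUS. Only the signal
   bits are modeled for this direction. */
static uint32_t isr_sigs_after_ebus_write(uint32_t isr, uint32_t w)
{
    isr &= ~(w & ISR_HOST_SIG);     /* host_sig: the EBUS writes 1 to reset */
    isr |= (w & ISR_LAN_SIGS);      /* lanN_sig: the EBUS writes 1 to set */
    return isr;
}
```

Because a written 0 leaves every bit unchanged, an event can be acknowledged
by writing just its mask; no read-modify-write of ISR is needed.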
The information on the status of the packet interface is kept in the
Interrupt Status Register (ISR), which consists of the following bits:

    bit# :    31    30    29    28    27    26    25    24
    value:     1  host     0     0     0     0     0     0
                  _sig

    bit# :    23    22    21    20    19    18    17    16
    value:  lan7  lan6  lan5  lan4  lan3  lan2  lan1  lan0
            _sig  _sig  _sig  _sig  _sig  _sig  _sig  _sig

    bit# :    15    14    13    12    11    10     9     8
    value:  word  half  send     0  nres  wake orun2 orun1
            _rdy  _rdy  _rdy        _int  _int  _int  _int

    bit# :     7     6     5     4     3     2     1     0
    value:  tail  wdog  time   dma  send  buff  recv  byte
            _int  _int  _int  _int  _int  _int  _int  _rdy

Bit 31 of ISR is also referred to as the dbg_bit. This bit is always equal
to 1, and can be used for single-stepping the code that runs in user context
(see the LANai 4 Instruction Set specification).

host_sig --- this bit is set when the LANai processor writes a 1 into it,
    and reset when a 1 is written into it from the EBUS.

Bits 29 through 24 are equal to 0 and are reserved for future expansion.

lan7_sig ... lan0_sig --- each of these bits is set when a 1 is written
    into it from the EBUS, and reset when the LANai processor writes a 1
    into it.

send_rdy --- this bit is maintained by the packet interface, and it denotes
    that the outgoing channel is not blocked, so that an SB, SH, SW, or an
    ST operation can be issued. Writing SB, SH, SW, or ST while this bit
    is equal to 0 may corrupt the outgoing packet. During a send-DMA
    operation (from the time when the SML(T) is written until the time
    when send_int becomes one), send_rdy is equal to 0 regardless of the
    state of the outgoing channel. This bit cannot be modified by the
    programmer.

Once a send DMA has completed (see send_int bit), another send DMA may be
initiated regardless of the state of the send_rdy bit.
byte_rdy --- this bit is maintained by the packet interface and is equal to
    1 if there is at least one flit on the incoming channel, indicating
    that an RB operation can be issued. Reading RB while the byte_rdy bit
    is equal to 0 may corrupt the incoming packet. During a receive-DMA
    operation (from the time when the RML is written, until the time when
    recv_int or buff_int becomes one), this bit is equal to 0 regardless
    of the state of the incoming channel. This bit cannot be modified by
    the programmer.

half_rdy --- this bit is maintained by the packet interface and is equal to
    1 if: (1) there are at least two flits of the same packet on the
    incoming channel, or (2) the next available flit on the incoming
    channel is a tail. In either case, the half_rdy bit indicates that an
    RH operation can be issued. Reading RH while the half_rdy bit is equal
    to 0 may corrupt the incoming packet. During a receive-DMA operation
    (from the time when the RML is written, until the time when recv_int
    or buff_int becomes one), this bit is equal to 0 regardless of the
    state of the incoming channel. Note that half_rdy=1 implies
    byte_rdy=1. However, because of (2), half_rdy=1 does not imply that
    two RB operations can be issued. This bit cannot be modified by the
    programmer.

word_rdy --- this bit is maintained by the packet interface and is equal to
    1 if: (1) there are at least four flits of the same packet on the
    incoming channel, or (2) the first, the second, or the third available
    flit on the incoming channel is a tail. In either case, the word_rdy
    bit indicates that an RW operation can be issued. Reading RW while the
    word_rdy bit is equal to 0 may corrupt the incoming packet. During a
    receive-DMA operation (from the time when the RML is written, until
    the time when recv_int or buff_int becomes one), this bit is equal to
    0 regardless of the state of the incoming channel. Note that
    word_rdy=1 => half_rdy=1 => byte_rdy=1. However, because of (2),
    word_rdy=1 does not imply that, for example, two RH operations can be
    issued. This bit cannot be modified by the programmer.

Once a receive DMA has completed (see recv_int and buff_int bits), another
receive DMA may be initiated regardless of the state of the above three,
receive-ready bits.

recv_int --- this bit is set by the packet interface to signal the
    completion of a receive-DMA transfer, i.e., when an entire incoming
    packet has been transferred into the receive memory buffer. This bit
    is cleared by the programmer, either directly --- by writing a 1 into
    it ---, or indirectly --- when RML is written. After a receive-DMA
    operation is initiated, one must not initiate any receive operation
    until the recv_int bit, the buff_int bit, or both, become 1.

buff_int --- this bit is set by the packet interface when the receive-DMA
    buffer has been exhausted (the last word written is at the location
    pointed to by RML, and RMP=RML+4). This bit is cleared by the
    programmer, either directly --- by writing a 1 into it ---, or
    indirectly --- when RML is written. After a receive-DMA operation is
    initiated, one must not initiate any receive operation until the
    recv_int bit, the buff_int bit, or both, become 1.

send_int --- this bit is set by the packet interface to signal the
    completion of a send-DMA transfer, i.e., when the contents of the
    memory buffer specified by SMP and SML(T) have been appended to the
    outgoing packet. This bit is cleared by the programmer, either
    directly --- by writing a 1 into it, or indirectly --- when SML(T) is
    written. After a send-DMA operation is initiated, one must not
    initiate any send operation until the send_int bit becomes 1.

tail_int --- this bit is set by the packet interface when an RB, an RH, or
    an RW operation consumes a tail flit. This bit is cleared by the
    programmer, either directly --- by writing a 1 into it ---, or
    indirectly --- when another RB, RH, or RW operation is issued.

orun2_int, orun1_int --- these bits are set by the packet interface when an
    overrun condition is detected, i.e., when a receive operation reads
    past the tail flit of a packet. The two bits taken together represent
    the number of flits read past the tail flit (0 through 3). For
    example, if when using receive DMA (RMP and RML), the tail is received
    in the most-significant byte of a 4-byte word, both orun2_int and
    orun1_int will be set, indicating that the values of the 3 least
    significant flits are undefined. These bits are cleared by the
    programmer, either directly --- by writing a 1 into them ---, or
    indirectly --- when any receive operation is initiated.

Regardless of the unit of data transfer for packet receiving, after the
whole packet has been received the overrun conditions should be tested.
Although one may make simplifying assumptions about packet lengths and
thereby eliminate the need for overrun check, this approach relies on a
guarantee that the communication network is error-free.

wake_int --- this bit is set when the WAKE input pin is driven high. This
    bit is cleared by the programmer, only directly --- by writing a 1
    into it.

time_int --- this bit is set by the interrupt timer whenever it makes a
    transition from 0x00000000 to 0xFFFFFFFF. This bit is cleared by the
    programmer, either directly --- by writing a 1 into it, or indirectly
    --- when IT is written.

wdog_int --- this bit is set by the interrupt timer whenever it makes a
    transition from 0x80000000 to 0x7FFFFFFF. This bit is cleared by the
    programmer, either directly --- by writing a 1 into it, or indirectly
    --- when IT is written.
dma_int --- this bit is set by the EBUS-LBUS DMA engine when the DMA_CTR
reaches 0 to signal the completion of a DMA transfer. This bit is cleared by
the programmer, either directly --- by writing a 1 into it ---, or indirectly
--- when the DMA_CTR is written. After an EBUS-LBUS DMA operation is
initiated, one must not initiate another such operation until the dma_int
bit becomes 1.

nres_int --- this bit is set by the Myrinet-link interface whenever the LANai
chip fails to consume an incoming message from the network for the duration
of the period specified by the TIMEOUT special register. If the NRES_ENABLE
bit of the MYRINET special register is 1, the LANai chip is also reset. By
examining the nres_int bit, one can distinguish between a reset-pin-induced
and an NRES-induced reset. This bit is cleared by the programmer, only
directly --- by writing a 1 into it. Please read the warning on the
NRES-induced reset at the end of Section 9.

(c)1994 Myricom, Inc., LANai 4.X, 17 January 1996 - page 15
==============================================================================

WARNING: Because the special registers are memory-mapped, and because the
LANai processor is a pipelined machine, there is a delay between the time of
a special-register access and the time of the change of the corresponding
ISR bit(s). In tight polling loops, this delay can result in a race
condition, whereby the program could, for example, misinterpret a
not-yet-cleared ISR _int bit as an indication of a new event.

The following table illustrates how special-register accesses affect ISR
bits. A number in the table specifies for how many assembly instructions
following the special-register-read-or-write instruction the ISR bit is
undefined.

     clr  - always cleared
     set  - always set
     clr? - potentially cleared
     set? - potentially set

+----------------+---------+---------+---------+---------+---------+---------+
|                |  write  |  write  |  read   |write SB,|  write  |  write  |
|                |   RML   |SML,SMLT |RB,RH,RW |SH,SW,ST |   IT    | DMA_CTR |
+----------------+---------+---------+---------+---------+---------+---------+
| send_rdy clr   |         |    2    |         |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| send_rdy clr?  |         |         |         |    3    |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| byte_rdy\      |         |         |         |         |         |         |
| half_rdy >clr  |    2    |         |         |         |         |         |
| word_rdy/      |         |         |         |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| byte_rdy\      |         |         |         |         |         |         |
| half_rdy >clr? |         |         |    3    |         |         |         |
| word_rdy/      |         |         |         |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| send_int clr   |         |    1    |         |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| recv_int\      |         |         |         |         |         |         |
| buff_int/ clr  |    1    |         |         |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| orun1_int\     |         |         |         |         |         |         |
| orun2_int/clr  |    1    |         |    1    |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| orun1_int\     |         |         |         |         |         |         |
| orun2_int/set? |         |         |    3    |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| tail_int clr   |         |         |    1    |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| tail_int set?  |         |         |    3    |         |         |         |
+----------------+---------+---------+---------+---------+---------+---------+
| time_int\      |         |         |         |         |         |         |
| wdog_int/ clr  |         |         |         |         |    1    |         |
+----------------+---------+---------+---------+---------+---------+---------+
| dma_int clr    |         |         |         |         |         |    1    |
+----------------+---------+---------+---------+---------+---------+---------+

(c)1994 Myricom, Inc., LANai 4.X, 17 January 1996 - page 16
==============================================================================

9. INITIALIZATION

During the power-on reset, the chip-version-specific value (Appendix B) must
be written to the CLOCK special register from the EBUS, to initialize the
on-chip clock generation.

After chip reset, the on-chip processor begins executing code in the system
context, starting from address 0. The state of the nres_int bit in the ISR
upon reset indicates whether the reset was a regular, reset-pin
initialization (0), or an NRES-induced reset (1). All the remaining bits of
ISR are equal to 0, except the dbg_bit and send_rdy bit, which are equal
to 1.

The WRITE_ENABLE register is initialized to the no-memory-protection state.
The IMR and the EIMR special registers are undefined and should be
initialized by the programmer. (In the LANai 4.0 chip, the EIMR special
register is initialized to 0 upon reset.) All other special registers are
initialized to 0 upon reset.

After the chip is out of reset and prior to any Myrinet access, the
chip-version-specific value (Appendix B) must be written to the VERSION
register to configure the Myrinet-link interface.

WARNING: In case of an NRES-induced reset, if the chip has been reset while
sending a message (impossible to check), the first message sent by the LANai
after it comes out of reset will be concatenated to the interrupted message
and dropped. It is therefore advisable to emit a zero-length message upon
coming out of an NRES-induced reset (by writing a 0 into the ST special
register).

(c)1994 Myricom, Inc., LANai 4.X, 17 January 1996 - page 17
==============================================================================

A. SPECIAL-REGISTERS SUMMARY

+==============+===========+===========+=======================+========+====+
|              |   read    |   write   |                       |relative|    |
|   register   +-----+-----+-----+-----+       function        |address |page|
|              |EBUS |LANai|EBUS |LANai|                       |        |    |
+==============+=====+=====+=====+=====+=======================+========+====+
| CKS          |  +  |  +  |  +  |  +  | Internet checksum     |  0x38  |  7 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| CLOCK (1)    |     |     |  +  |     | initialization of on- |  0xFC  | 10 |
|              |     |     |     |     | chip clock generation |        |    |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| DMA_CTR      |  +  |  +  |  +  |  +  | initiate EBUS DMA     |  0x44  |  6 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| DMA_DIR      |     |     |  +  |  +  | direction of EBUS DMA |  0x80  |  6 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| DMA_STS      |     |     |  +  |  +  | EBUS-DMA configuration|  0x84  |  6 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| EAR          |  +  |  +  |  +  |  +  | EBUS-DMA host address |  0x3C  |  6 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| EIMR (2)     |  +  |  +  |  +  |  +  | host-interrupt mask   |  0x2C  |  2 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| IMR (3)      |     |  +  |     |  +  | LANai-interrupt mask  |   -    |  2 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| ISR (4)      |  +  |  +  |  +  |  +  | interrupt status      |  0x28  | 11 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| IT           |  +  |  +  |  +  |  +  | interrupt timer       |  0x30  |  8 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| LAR          |  +  |  +  |  +  |  +  | EBUS-DMA local address|  0x40  |  6 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| LED          |     |     |  +  |  +  | 11 output pins        |  0x94  | 10 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| MYRINET      |     |     |  +  |  +  | Myrinet-link configur.|  0x8C  | 10 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| RB           |  +  |  +  |     |     | receive a byte        |  0x60  |  5 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| RH           |  +  |  +  |     |     | receive a half-word   |  0x64  |  5 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| RML          |  +  |  +  |  +  |  +  | initiate receive DMA  |  0x4C  |  5 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| RMP          |  +  |  +  |  +  |  +  | receive-DMA buffer    |  0x48  |  5 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| RTC          |  +  |  +  |  +  |  +  | real-time clock       |  0x34  |  8 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| RW           |  +  |  +  |     |     | receive a word        |  0x68  |  5 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+

(1) The chip-version-specific value (Appendix B) must be written into this
    register during power-on reset.
(2) EIMR can be written only from the EBUS in the LANai 4.0 chip.
(3) IMR is an internal register of the on-chip CPU and cannot be accessed
    from the EBUS.
(4) ISR is an internal register of the on-chip CPU, but it can also be
    accessed from the EBUS as a memory-mapped register.
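
The access columns of the table above can be carried into driver code as
data. The following is a minimal sketch in C, not Myricom code: the names
lanai_reg, lanai_regs, lanai_find, and lanai_can_access are hypothetical,
only five rows are transcribed as a sample, and the offsets are the relative
addresses listed in the table (the base they are relative to is not given
here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Access flags mirroring the four read/write columns of the table. */
enum {
    RD_EBUS  = 1u << 0,
    RD_LANAI = 1u << 1,
    WR_EBUS  = 1u << 2,
    WR_LANAI = 1u << 3
};

/* Hypothetical descriptor type; not part of any Myricom header. */
typedef struct {
    const char *name;
    int         offset;  /* relative address, or -1 if none is listed */
    unsigned    access;  /* OR of the flags above */
} lanai_reg;

/* A sample of rows from the summary table. */
static const lanai_reg lanai_regs[] = {
    { "CLOCK", 0xFC, WR_EBUS },
    { "IMR",   -1,   RD_LANAI | WR_LANAI },
    { "ISR",   0x28, RD_EBUS | RD_LANAI | WR_EBUS | WR_LANAI },
    { "LED",   0x94, WR_EBUS | WR_LANAI },
    { "RB",    0x60, RD_EBUS | RD_LANAI },
};

/* Return the descriptor for `name`, or NULL if it is not in the sample. */
static const lanai_reg *lanai_find(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof lanai_regs / sizeof lanai_regs[0]; i++) {
        if (strcmp(lanai_regs[i].name, name) == 0)
            return &lanai_regs[i];
    }
    return NULL;
}

/* Return nonzero if every access flag in `wanted` is permitted. */
static int lanai_can_access(const char *name, unsigned wanted)
{
    const lanai_reg *r = lanai_find(name);

    return r != NULL && (r->access & wanted) == wanted;
}
```

For instance, lanai_can_access("RB", RD_LANAI) is nonzero, while
lanai_can_access("RB", WR_LANAI) is zero, matching the empty write columns
of the RB row.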
(c)1994 Myricom, Inc., LANai 4.X, 17 January 1996 - page 18
==============================================================================

+==============+===========+===========+=======================+========+====+
|              |   read    |   write   |                       |relative|    |
|   register   +-----+-----+-----+-----+       function        |address |page|
|              |EBUS |LANai|EBUS |LANai|                       |        |    |
+==============+=====+=====+=====+=====+=======================+========+====+
| SA           |     |     |  +  |  +  | send alignment        |  0x6C  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| SB           |     |     |  +  |  +  | send a byte           |  0x70  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| SH           |     |     |  +  |  +  | send a half-word      |  0x74  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| SML          |  +  |  +  |  +  |  +  | initiate send DMA     |  0x54  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| SMLT         |  +  |  +  |  +  |  +  | initiate send DMA     |  0x58  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| SMP          |  +  |  +  |  +  |  +  | send-DMA buffer       |  0x50  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| ST           |     |     |  +  |  +  | send the tail         |  0x7C  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| SW           |     |     |  +  |  +  | send a word           |  0x78  |  4 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| TIMEOUT      |     |     |  +  |  +  | NRES-timeout selection|  0x88  | 10 |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| VERSION (1)  |     |     |  +  |  +  | configuration of the  |  0x98  | 10 |
|              |     |     |     |     | Myrinet-link interface|        |    |
+--------------+-----+-----+-----+-----+-----------------------+--------+----+
| WRITE_ENABLE |     |     |  +  |  +  | memory protection     |  0x9C  |  9 |
| (2)          |     |     |     |     |                       |        |    |
+==============+=====+=====+=====+=====+=======================+========+====+

(1) The chip-version-specific value (Appendix B) must be written into this
    register during post-reset initialization, prior to any Myrinet access.
(2) This register does not exist in the LANai 4.0 chip.

(c)1994 Myricom, Inc., LANai 4.X, 17 January 1996 - page 19
==============================================================================

B. CHIP-VERSION-SPECIFIC INITIALIZATION

+==============+============+============+
| chip version |   CLOCK    |  VERSION   |
+==============+============+============+
| LANai 4.0    | 0x11371137 | 0x00000000 |
+--------------+------------+------------+
| LANai 4.1    | 0x50E450E4 | 0x00000003 |
+==============+============+============+
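
The values above combine with the Section 9 sequence roughly as follows.
This is a minimal sketch in C, not Myricom code. It runs against a simulated
register array so that it can be executed off-chip; real code would use
volatile accesses relative to the board's register base. The function names,
the simulated register model, and in particular NRES_INT_HYPOTHETICAL (a
placeholder for the real position of nres_int in the ISR) are assumptions.
The register offsets are the relative addresses from Appendix A.

```c
#include <assert.h>
#include <stdint.h>

/* Relative addresses taken from Appendix A. */
enum { REG_ISR = 0x28, REG_ST = 0x7C, REG_VERSION = 0x98, REG_CLOCK = 0xFC };

/* ASSUMPTION: placeholder mask. The real position of nres_int in the ISR
   must be taken from Section 8. */
#define NRES_INT_HYPOTHETICAL (1u << 0)

typedef enum { LANAI_4_0, LANAI_4_1 } lanai_chip;

/* Chip-version-specific values from Appendix B. */
static const struct { uint32_t clock, version; } lanai_init[] = {
    [LANAI_4_0] = { 0x11371137u, 0x00000000u },
    [LANAI_4_1] = { 0x50E450E4u, 0x00000003u },
};

/* Simulated register window (hypothetical model, for illustration only). */
static uint32_t sim_regs[0x100 / 4];
static int      sim_st_writes;   /* counts writes to ST */

static uint32_t reg_read(unsigned off)
{
    assert(off % 4 == 0 && off < 0x100);
    return sim_regs[off / 4];
}

static void reg_write(unsigned off, uint32_t value)
{
    assert(off % 4 == 0 && off < 0x100);
    if (off == REG_ISR) {
        /* Model the "write a 1 to clear" behavior of the _int bits. */
        sim_regs[off / 4] &= ~value;
        return;
    }
    if (off == REG_ST)
        sim_st_writes++;
    sim_regs[off / 4] = value;
}

/* Host side (EBUS), during power-on reset: initialize clock generation. */
static void host_power_on_init(lanai_chip chip)
{
    reg_write(REG_CLOCK, lanai_init[chip].clock);
}

/* After reset and before any Myrinet access. */
static void post_reset_init(lanai_chip chip)
{
    reg_write(REG_VERSION, lanai_init[chip].version);

    if (reg_read(REG_ISR) & NRES_INT_HYPOTHETICAL) {
        /* NRES-induced reset: emit a zero-length message, then clear
           nres_int directly by writing a 1 into it. */
        reg_write(REG_ST, 0);
        reg_write(REG_ISR, NRES_INT_HYPOTHETICAL);
    }
}
```

With the simulated nres_int bit set beforehand, calling
host_power_on_init(LANAI_4_1) and then post_reset_init(LANAI_4_1) leaves
0x50E450E4 in CLOCK and 0x00000003 in VERSION, performs exactly one write to
ST, and clears the simulated nres_int bit. Clearing nres_int at this point
is a design choice of the sketch; the specification only requires that it be
cleared by writing a 1 into it.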