This is /ufs/gmbin/src/doc/gm.info, produced by Makeinfo version 3.12h from /ufs/gmbin/src/doc/gm.texi. ************************************************************************* * * * Myricom GM myrinet software and documentation * * * * Copyright (c) 2000 by Myricom, Inc. * * All rights reserved. * * * * Myricom, Inc. Email: info@myri.com * * 325 N. Santa Anita Ave. World Wide Web: http://www.myri.com/ * * Arcadia, CA 91024 * *************************************************************************  File: gm.info, Node: Top, Next: Copyright Notice, Prev: (dir), Up: (dir) GM ** GM is a message-passing system for Myrinet networks. * Menu: * Copyright Notice:: * About This Document:: * GM Overview:: * Definitions:: * Programming Model:: * Initialization:: * Memory Setup:: * Sending Messages:: * Receiving Messages:: * Alarms:: * High Availability Extensions:: * Utility Modules:: * Additional Features:: * GM Constants and Macros:: * Function Summary:: * Token Reference:: * Glossary:: * Variable Index:: * Concept Index:: --- The Detailed Node Listing --- Utility Modules * CRC Functions:: * Hash Table:: * Lookaside List:: * Page Allocation:: Hash Table * Hash Table Introduction:: * Hash Table API::  File: gm.info, Node: Copyright Notice, Next: About This Document, Prev: Top, Up: Top Copyright Notice **************** Myricom GM myrinet software and documentation Copyright (c) 1994-2000 by Myricom, Inc. All rights reserved. Permission to use, copy, modify, and distribute this software and its documentation in source and binary forms for non-commercial purposes and without fee is hereby granted, provided that the modified software is returned to Myricom, Inc. for redistribution. 
The above copyright notice must appear in all copies and both the copyright notice and this permission notice must appear in supporting documentation, and any documentation, advertising materials, and other materials related to such distribution and use must acknowledge that the software was developed by Myricom, Inc. The name of Myricom, Inc. may not be used to endorse or promote products derived from this software without specific prior written permission. Myricom, Inc. makes no representations about the suitability of this software for any purpose. THIS FILE IS PROVIDED "AS-IS" WITHOUT WARRANTY OF ANY KIND, WHETHER EXPRESSED OR IMPLIED, INCLUDING THE WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. MYRICOM, INC. SHALL HAVE NO LIABILITY WITH RESPECT TO THE INFRINGEMENT OF COPYRIGHTS, TRADE SECRETS OR ANY PATENTS BY THIS FILE OR ANY PART THEREOF. In no event will Myricom, Inc. be liable for any lost revenue or profits or other special, indirect and consequential damages, even if Myricom has been advised of the possibility of such damages. Other copyrights might apply to parts of this software and are so noted when applicable. Myricom, Inc. Email: 325 N. Santa Anita Ave. World Wide Web: `http://www.myri.com/' Arcadia, CA 91024 Portions of this program are subject to the following copyright: Copyright (c) 1990 The Regents of the University of California. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. 
All advertising materials mentioning features or use of this software must display the following acknowledgement: This product includes software developed by the University of California, Berkeley and its contributors.

4. Neither the name of the University nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

File: gm.info, Node: About This Document, Next: GM Overview, Prev: Copyright Notice, Up: Top

About This Document
*******************

This document describes the GM message passing system. The document describes the GM-1.1 API, which is both simpler to use and more powerful than the GM-1.0 API. The 1.0 API will continue to be supported by the GM libraries for the foreseeable future, and GM-1.0 programs actually run significantly faster under GM-1.1 than under GM-1.0, but new programs should use the GM API as described in this document.

This document exists in the following formats:

Adobe Acrobat (`gm.pdf')
     best suited for printing (using Adobe Acrobat Reader (http://www.adobe.com/prodindex/acrobat/readstep.html)). This format produces publication quality output.
It can also be read online if you have installed the Adobe Acrobat Reader plug-in for your browser, but startup time can be quite large when doing so over a slow network connection. ASCII (`gm.txt') best suited for email and ASCII-only environments. All graphical figures are represented as "ASCII art". Gnu info format (`gm.info') best suited for interactive Unix online use with searching and indexing. Viewing this version of the documentation requires the Gnu `info' program, available as source from the Free Software Foundation's ftp server (ftp://prep.ai.mit.edu/pub/gnu/texinfo). All graphical figures are represented as "ASCII art". Hypertext Markup Language (`gm_toc.html') best suited for online interactive viewing with your favorite browser. All graphical figures are included as inline images. Monolithic Hypertext Markup Language (`gm.html') Second-best suited for printing if you can't print the Adobe Acrobat version. The following typeface conventions are used in this document: * Text like `this' represents user input. * Text like `this' represents code. * Text like THIS represents variables. * Text like `this' represents files. * Text like `this()' represents a function name, with the return type and parameters unspecified. Such references should not be interpreted to necessarily represent a function with no parameters and/or no return value. The absence of a return value or parameters is indicated only by the use of the keyword `void', as in "`void function(void)'", which indicates that `function()' returns nothing and requires no parameter. Numerical constants are represented in this document using the `C' language conventions.  File: gm.info, Node: GM Overview, Next: Definitions, Prev: About This Document, Up: Top GM Overview *********** GM is a message-based communication system for Myrinet. Like many messaging systems, GM's design objectives included low CPU overhead, portability, low latency, and high bandwidth. 
Additionally, GM has several distinguishing characteristics:

   * GM has extremely low overhead of about 1 microsecond per packet on all architectures.

   * GM can provide memory-protected, user-level, OS-bypass network interface access to several user-level applications simultaneously. (On systems that do not support memory protection, such as VxWorks, no memory protection is provided.)

   * GM provides reliable ordered delivery between hosts in the presence of network faults. GM will detect and retransmit lost and corrupted packets. GM will also reroute packets around network faults when alternate routes exist. Catastrophic network errors, such as crashed hosts or disconnected links, are nonfatal; the undeliverable packets are returned to the client with an error indication, although most client programs are unable to adapt in the presence of such severe errors.

   * GM supports clusters of over 10,000 nodes.

   * GM provides two levels of message priority to allow efficient deadlock-free bounded-memory forwarding.

   * GM allows clients to send messages up to 2**31 - 1 bytes long, under operating systems that allow sufficient amounts of DMAable memory to be allocated.

   * GM automatically maps Myrinet networks.

GM is a light-weight communication layer, and as such has limitations that can be addressed by layering a heavier-weight interface over GM. Some such limitations are the following:

   * GM is unable to send messages from or receive messages into nonDMAable memory.

   * The GM API does not yet support any gather or scatter operations directly.

From the client's point of view, GM consists of a library, `libgm.a', and a header file, `gm.h'. All externally visible GM identifiers in these files match the regular expression `^_*[Gg][Mm]_' to minimize name space pollution.

Additionally, GM has other parts that system administrators need to be concerned about:

`gm'
     The GM driver provides system services.
It is called ``gm'' under Unix, and is the `Myricom Myrinet Adapter' driver implemented in ``gm.sys'' under Windows NT. `mapper' The Myrinet mapper daemon maps the network. It is called ``sbin/mapper'' under Unix, and is the `Myricom Myrinet Mapper Daemon' service implemented in ``gm_mapper_service.exe'' under Windows NT.  File: gm.info, Node: Definitions, Next: Programming Model, Prev: GM Overview, Up: Top Definitions *********** This document attaches special meaning to a few commonly used words. The meaning of each of these words in the context of this document is defined here. _In particular, please note the special meanings of the words "size" and "length." Understanding the special meaning of these terms is critical to understanding this document._ _aligned_ A value is said to be aligned if it is a multiple of the required GM alignment. The required GM alignment is 1 on LANai7 hardware, 4 on LANai4 hardware, and 8 on LANai5 hardware. Pointers to memory allocated by GM are always automatically aligned. _client software_ _client_ The _client software_ or simply _client_ is the non-GM software that uses GM to provide a reliable ordered message delivery service. It can be an application, or a higher level networking layer, such as the MPICH over GM implementation provided by Myricom. _message_ A _message_ is an aligned array of bytes in DMAable memory. _buffer_ A _buffer_ is a contiguous region of DMAable memory into which a message may be copied. All GM buffers must be aligned. _length_ The _length_ of a message is the number of bytes of data that comprise the message. There is no alignment restriction on the length of any GM message. The _length_ of a receive buffer is the number of bytes that may be safely copied into the buffer. _packet_ A packet is an aggregation of bytes sent over the network. Packet lengths are limited to just over `GM_MTU' (usually 4096 bytes) to bound the time any packet can monopolize network resource. 
     Note that multiple packets are required to send large messages over the network, but the segmentation of messages into packets and reassembly of packets into messages is performed automatically by GM.

_size_
     The _size_ of the message is any integer greater than or equal to

          log2(LENGTH + 8)

     where LENGTH is the length of the message. The _size_ of a receive buffer is any positive integer _less_ than or equal to

          log2(LENGTH + 8)

     where LENGTH is the length of the buffer. Consequently, a buffer of size SIZE must have a LENGTH of at least

          2**SIZE - 8.

     A buffer having a longer length serves no useful purpose in GM, but is allowed. The function `gm_min_size_for_length(LENGTH)' can be used to compute the minimum size for any length, and the function `gm_max_length_for_size(SIZE)' can be used to compute the maximum length for any size.

_port_
     A _port_ is a GM communication endpoint, and serves as the interface between the client software and the network.

_user_
     A human using an application that uses GM.

_user virtual memory_
     Memory directly accessible by software running in a user application.

_kernel virtual memory_
     Memory directly accessible by the GM driver.

File: gm.info, Node: Programming Model, Next: Initialization, Prev: Definitions, Up: Top

Programming Model
*****************

The GM communication system provides reliable, ordered delivery between communication endpoints, called "ports," with two levels of priority. This model is "connectionless" in that there is no need for client software to establish a connection with a remote port in order to communicate with it: the client software simply builds a message and sends it to any port in the network. (This apparently paradoxical "connectionless reliability" is achieved by GM maintaining reliable connections between each pair of hosts in the network and multiplexing the traffic between ports over these reliable connections.)

[Figure: GM Endpoints (Ports). Two hosts, each containing processes that own ports; traffic between the ports is carried over a reliable connection between the hosts. Labels: Host, Process, Port, Reliable Connection.]

Under operating systems that provide memory protection, GM provides memory protected network access. It should be impossible for any non-privileged GM client application to use GM to access any memory other than the application's own memory, except as explicitly allowed by the GM API. The unforgeable source of each received message is available to the receiver, allowing the receiver to discard messages from untrusted sources.

The largest message GM can send or receive is limited to (2**31)-1 bytes. However, because send and receive buffers must reside in DMAable memory, the maximum message size is limited by the amount of DMAable memory the GM driver is allowed to allocate by the operating system. Most GM applications obtain DMAable memory using the straightforward `gm_dma_malloc()' and `gm_dma_free()' calls, but sophisticated applications with large memory requirements may perform DMA memory management using `gm_register_memory()' and `gm_deregister_memory()' to pin and unpin memory on operating systems that support memory registration.

Message order is preserved only for messages of the same priority, from the same sending port, and directed to the same receiving port. Messages with differing priority never block each other. Consequently, low priority messages may pass high priority messages, unlike in some other communication systems.
Typical GM applications will either use only one GM priority, or use the high priority channel for control messages (such as client-to-client acks) or for single-hop message forwarding.

Both sends and receives in GM are regulated by implicit tokens, representing space allocated to the client in various internal GM queues, as depicted in the following figure. At initialization, the client implicitly possesses `gm_num_send_tokens()' send tokens, and `gm_num_receive_tokens()' receive tokens. The client may call certain functions only when possessing an implicit send or receive token, and in calling that function, the client implicitly relinquishes the token(1). The client program is responsible for tracking the number of tokens of each type that it possesses, and must not call any GM function requiring a token when the client does not possess the appropriate token. Calling a GM API function without the required tokens has undefined results, but GM usually reports such errors, and such errors will not cause system security to be violated.

[Figure: User Token Flow. In LANai memory: the Send Queue (`gm_num_send_tokens()' slots) and the Receive Buffer Pool (`gm_num_receive_tokens()' slots). In user virtual memory: the Receive Event Queue (`gm_num_receive_tokens()' + `gm_num_send_tokens()' slots), which feeds the Client Software; the client in turn feeds the Send Queue and the Receive Buffer Pool.]

As stated above, sends are token regulated. A client of a port may send a message only when it possesses a send token for that port. By calling a GM API send function, the client implicitly relinquishes that send token. The client passes a `callback' and `context' pointer to the send function.
When the send completes, GM calls `callback', passing a pointer to the GM port, the client-supplied `context' pointer, and a status code indicating whether the send completed successfully or with an error. When GM calls the client's `callback' function, the send token is implicitly passed back to the client. Most GM programs, which rely on GM's fault tolerance to handle transient network faults, should consider a send completing with a status other than `GM_SUCCESS' to be a fatal error. However, more sophisticated programs may use the GM fault tolerance API extensions to handle such non-transient errors. These extensions are described in an appendix.

It is important to note that the client-supplied `callback' function will be called only within a client's call to `gm_unknown()', the GM unknown event handler function that the client must call when it receives an unrecognized event. The `gm_unknown()' function is described in more detail below.

[Figure: send data flow. The client calls `gm_send_with_callback(...,ptr,len,callback,context)', which places an entry in the Send Queue in LANai memory; the Send State Machine transmits the packet and posts an event to the Receive Event Queue in user process memory. The client loop `event=gm_receive(); switch(event.recv.type){ ... default: gm_unknown(port,event); }' passes that event to `gm_unknown()', which behind the scenes calls `callback(port,context,status)'.]

GM receives are also token regulated.
After a port is opened, the client implicitly possesses `gm_num_receive_tokens()' receive tokens, allowing it to provide GM with up to this many receive buffers using `gm_provide_receive_buffer()'. With each call to `gm_provide_receive_buffer()', the client implicitly relinquishes a receive token. With each buffer passed to `gm_provide_receive_buffer()', the client passes a corresponding integer SIZE indicating that the length of the receive buffer is at least `gm_max_length_for_size(SIZE)' bytes.

Before a client of a port can receive a message of a particular size and priority, the client software must provide GM with a receive token of matching size and priority. The receive token specifies the buffer in which to store the matching receive. When a message of matching size and priority is received, that message will be transferred into the receive buffer specified in the receive token. Note that multiple receive tokens of the same size and priority *may* be provided to the port.

After providing receive buffers with sizes matching the sizes of all packets that potentially could be received, the client must poll for receive events using a `gm_*receive*()' function. (Most developers who think polling is unacceptable in their application find that polling is fine as long as they do it in a separate thread.) The `gm_*receive*()' function will return a `gm_receive_event'. Events of type `GM_RECV_EVENT' and `GM_HIGH_RECV_EVENT' describe received packets of low and high priority, respectively. All other events should simply be passed to `gm_unknown()'. Such events are used internally by GM for sundry purposes, and the client need not be concerned with the contents of unrecognized receive events unless otherwise stated in this document.

[Figure: receive data flow. An arriving packet enters the Receive State Machine, which takes a buffer from the Receive Buffer Pool in LANai memory and posts an event to the Receive Event Queue in user process memory. The client fills the pool with `gm_provide_receive_buffer()' and drains the queue with `gm_receive()'.]

To avoid deadlock of the port, the client software must ensure that the port is never without a receive token for any acceptable combination of size and priority for more than a bounded amount of time, that the port is informed which combinations of size and priority are not acceptable for receives, and that the client not send to any remote port that does not do likewise. By convention, when a port runs out of *low* priority receive tokens for any combination of sizes, the client may defer replacing the receive tokens pending the completion of a bounded number of *high* priority sends, but must always replace exhausted types of high priority receive tokens without waiting for any sends to complete. Using this technique, reliable, deadlock-free, single-hop forwarding can be achieved.

---------- Footnotes ----------

(1) See *Note Token Reference:: for details.

File: gm.info, Node: Initialization, Next: Memory Setup, Prev: Programming Model, Up: Top

Initialization
**************

Before calling any other GM function, `gm_init()' should be called. `gm_finalize()' should be called after all other GM calls and before your program exits. Each call to `gm_init()' should be balanced by a call to `gm_finalize()' before the program exits.
Although GM automatically handles ungraceful program termination without such balanced calls on operating systems with memory protection, developers are strongly discouraged from relying on this feature because on some systems, such as those using the VxWorks embedded runtime system, the calls to `gm_finalize()' are required for proper shutdown of GM to allow ports to be reused without rebooting VxWorks.

A GM port is initialized by calling `gm_open(struct gm_port **PORT, unsigned int UNIT, unsigned int PORT_ID, char *PORT_NAME, enum gm_api_version VERSION)' to open port number PORT_ID of Myrinet interface number UNIT. The pointer returned at `*PORT' must be passed to subsequent GM API calls. PORT_NAME is a character string of up to `gm_max_port_name_length()' bytes describing the client. The name is currently used for debugging purposes only, but this information will eventually be available to all GM clients on the network through a mechanism that is yet to be determined. VERSION should be `GM_API_VERSION_1_1'.

Note that while the GM API uses "`struct gm_port *'" pointers throughout, these pointers are opaque to the client. The client should not attempt to dereference these pointers.

After opening a port, the client implicitly possesses `gm_num_send_tokens()' send tokens and `gm_num_receive_tokens()' receive tokens. Most GM programs will use most or all of the `gm_num_receive_tokens()' receive tokens immediately after opening a port to pass receive buffers to GM using `gm_provide_receive_buffer()'.

After the client has provided all receive buffers that it will provide during port initialization, the client should call `gm_set_acceptable_sizes()' for each priority (`GM_LOW_PRIORITY' and `GM_HIGH_PRIORITY') to indicate what GM receive sizes the client expects to receive on the port. While this call is not strictly required, calling it allows GM to immediately reject any contradictory sends, immediately generating a send error at the sender.
If these calls to `gm_set_acceptable_sizes()' are not made, then the error will not be reported until the sender experiences a GM long-period timeout, which takes about a minute to be generated by default. Therefore, calling `gm_set_acceptable_sizes()' can save much time during application development.

File: gm.info, Node: Memory Setup, Next: Sending Messages, Prev: Initialization, Up: Top

Memory Setup
************

GM will only send messages from memory allocated with a `gm_dma_*alloc()' function, or memory that has been registered for DMA transfers using `gm_register_memory()'. If the client attempts to send data from nonDMAable memory, GM will send bytes of value `0xaa' instead. If the client attempts to receive data into nonDMAable memory, the data will be silently discarded.

Note that some operating systems (e.g., Solaris) do not support `gm_register_memory()' due to operating system limitations, so the `gm_dma_*alloc()' functions must be used instead to obtain DMA memory.

Unless explicitly enabled using `gm_allow_remote_memory_access(PORT)', GM will not allow remote processes to use `gm_directed_send()' to modify the memory of the process. If remote memory access has been enabled, then this protection is disabled, and _any_ remote GM port may modify the contents of _any_ DMAable memory associated with that port. GM developers should be aware of this potential security risk, although it is usually not a concern.

File: gm.info, Node: Sending Messages, Next: Receiving Messages, Prev: Memory Setup, Up: Top

Sending Messages
****************

In GM, message sends are regulated by a simple token-passing mechanism to prevent GM's bounded-size internal queues from overflowing. The client software must possess a send token before calling `gm_send_with_callback()'.
After initialization, the client software implicitly possesses all `gm_num_send_tokens()' send tokens, and implicitly passes one token to the GM library with each call to `gm_send_with_callback()' or `gm_send_to_peer_with_callback()'. The token is retained by GM until the send completes, at which time GM calls the client-supplied callback, implicitly returning the send token to the client. The contents of the send message should not be modified in the interval between the call to `gm_send()' and the send completion, because doing so will cause undefined data to be delivered to the receiver.

The `gm_send_with_callback()' call requires the following parameters:

PORT
     a pointer to the GM port over which the message is to be sent

MESSAGE
     a pointer to the data to be sent

SIZE
     the size of the receive buffer in which to store the message on the remote node

LEN
     the number of bytes to send

PRIORITY
     the priority with which to send the message (`GM_HIGH_PRIORITY' or `GM_LOW_PRIORITY')

TARGET_NODE_ID
     the ID of the node to which the message should be sent

TARGET_PORT_ID
     the ID of the GM port to which the message should be sent

CALLBACK
     the client function to call when the send completes

CONTEXT
     a pointer to pass to the CALLBACK function when it is called

The order of messages with different priorities or with different destination ports is not preserved. Only the order of messages with the same priority and to the same destination port is preserved.

In the special case that the TARGET_PORT_ID is the same as the sending port ID (as is often the case), the streamlined `gm_send_to_peer_with_callback()' function may be used instead of `gm_send_with_callback()', allowing the TARGET_PORT_ID parameter to be omitted, and slightly improving small-message performance on 32-bit Myrinet interfaces.
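
The send-token discipline described in this section can be tracked by the client with a simple counter. The following is a minimal sketch and not GM code: `struct token_pool', `pool_take()', and `pool_give()' are hypothetical names introduced only for illustration, and the initial count would in practice be the value returned by `gm_num_send_tokens()'. The assumption is that a client calls `pool_take()' before each `gm_send_with_callback()' and calls `pool_give()' from its send callback.

```c
#include <assert.h>

/* Hypothetical client-side counter for implicit GM send tokens.
 * Nothing here is part of the GM API. */
struct token_pool {
    unsigned int available;   /* tokens the client currently possesses */
};

/* Take one token before issuing a send.
 * Returns 1 on success, or 0 if no token is left, in which case
 * the caller must defer the send until a callback returns a token. */
int pool_take(struct token_pool *p)
{
    assert(p != 0);
    if (p->available == 0)
        return 0;
    p->available--;
    return 1;
}

/* Give one token back; intended to be called from the send callback,
 * which is where GM implicitly returns the token to the client. */
void pool_give(struct token_pool *p)
{
    assert(p != 0);
    p->available++;
}
```

Keeping the count in one place makes it harder to call a send function without a token, which this document describes as having undefined results.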
File: gm.info, Node: Receiving Messages, Next: Alarms, Prev: Sending Messages, Up: Top

Receiving Messages
******************

Similarly to message sends, message receives in GM are regulated by a simple token-passing mechanism: before a message can be received, the client software must provide GM a receive token that allows the message to be received and specifies a buffer to hold the received data.

After initialization, the client implicitly possesses all `gm_num_receive_tokens()' receive tokens. The client software grants receive tokens to GM by calling `gm_provide_receive_buffer(PORT, BUFFER, SIZE, PRIORITY)', indicating that GM may receive any message into BUFFER as long as the `size' and `priority' fields of the received message exactly match the SIZE and PRIORITY fields passed to `gm_provide_receive_buffer()'. Eventually, GM will use the buffer indicated by BUFFER and SIZE to receive a message of the indicated SIZE and PRIORITY.

Unlike some messaging systems, GM requires that the SIZE of the received message match the token size exactly. GM will _not_ use the next larger sized receive buffer when a receive buffer of the correct size is not available.

All receive buffers passed to `gm_provide_receive_buffer()' must be DMAable. They must also be aligned or be within memory allocated using `gm_dma_*alloc()' to ensure that messages can be DMAed into the buffer, and must be at least `gm_max_length_for_size(SIZE)' bytes long.

Typical GM clients will provide at least 2 receive buffers for each size and priority of message that might be received to maximize performance by allowing one buffer to be processed and replaced while the network is filling the other. However, 1 receive buffer for each size-priority combination is sufficient for correct operation. Additionally, it is almost always a good idea to provide additional buffers for the smallest sizes, so that many small messages may be received while the host is busy computing.
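
Because a receive buffer must be at least `gm_max_length_for_size(SIZE)' bytes long and sizes must match exactly, it helps to see how SIZE and length relate numerically. The two functions below are hypothetical stand-ins written only from the definition of _size_ given in the Definitions node (a size is at least log2(LENGTH + 8), and a buffer of a given size holds up to 2**SIZE - 8 bytes). They are a sketch under that assumption, not the library's `gm_min_size_for_length()' and `gm_max_length_for_size()'.

```c
#include <assert.h>

/* Sketch of the minimum size for a message length: the smallest
 * integer SIZE such that 2**SIZE >= length + 8.
 * Hypothetical helper; not the GM library function. */
unsigned int sketch_min_size_for_length(unsigned long long length)
{
    unsigned int size = 0;
    while ((1ULL << size) < length + 8)
        size++;
    return size;
}

/* Sketch of the maximum length for a size: 2**size - 8.
 * Assumes 3 <= size < 64 so that the result is non-negative
 * and the shift is well defined. */
unsigned long long sketch_max_length_for_size(unsigned int size)
{
    assert(size >= 3 && size < 64);
    return (1ULL << size) - 8;
}
```

Under this definition a 4000-byte message has minimum size 12, because 2**12 = 4096 is the smallest power of two that is at least 4008, and a receive buffer provided with size 12 must be at least 4088 bytes long.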
There is no need to provide tokens for receives smaller than `gm_min_message_size()'.

After providing receive tokens, code may poll for pending receives using `gm_receive_pending(PORT)', which returns a nonzero value if a receive is pending or zero if no message has been received. The client may also poll for receives using `gm_receive(PORT)', which returns a pointer to an event structure of type `gm_event_t'. If no receive event is in the receive queue, a pointer to a fake receive event of type `GM_NO_RECV_EVENT' will be returned. The event returned by `gm_receive()' is only guaranteed to be valid until the next call to `gm_receive()'.

There are several variants of `gm_receive()' available, all of which can safely be used in the same program.

`gm_receive()'
     returns the first pending receive event or `GM_NO_RECV_EVENT' if none is pending.

`gm_blocking_receive()'
     returns the first pending receive, blocking if necessary. This function polls for receives for 1 millisecond before sleeping, so it should generally be used only if the polling thread has a dedicated processor.

`gm_blocking_receive_no_spin()'
     returns the first pending receive, blocking if necessary. This function sleeps immediately if no receive is pending. It should generally be used in environments with more than one thread per processor.

Once the client has obtained a receive event from a `gm_*receive*()' function, the client should either process the event if the client recognizes the event, or pass the event to `gm_unknown()' if the event is unrecognized. The client is not required to handle any receive events, and may simply pass all events to `gm_unknown()', but any useful GM program will handle `GM_RECV_EVENT's or `GM_HIGH_RECV_EVENT's in order to access the received data.

The receive event types that the client software may choose to recognize are as follows (GM internal events are not listed):

`GM_ALARM_EVENT'
     `GM_ALARM_EVENT's should be treated as an unknown event and passed to `gm_unknown()'.
     However, because client alarm handlers are called within `gm_unknown()' when `gm_unknown()' receives a `GM_ALARM_EVENT', it can be useful for a program to perform alarm polling only after passing `GM_ALARM_EVENT's to `gm_unknown()', as in the `test/gm_allsize.c' example program. See the documentation for `gm_set_alarm()' for more information.

`GM_RECV_EVENT'
`GM_HIGH_RECV_EVENT'
     This event indicates that a normal receive has occurred. The following information is available in the `event->recv' structure.

     `length'
          the number of bytes of received data

     `size'
          the size of the buffer into which the message was received

     `buffer'
          a pointer to the buffer passed in a call to `gm_provide_receive_buffer()', which allowed this receive to occur

     `sender_node_id'
          the GM identifier for the node that sent the message

     `sender_port_id'
          the GM identifier for the port that sent the message

     `tag'
          the tag passed to `gm_provide_receive_buffer_with_tag()' or 0 if `gm_provide_receive_buffer()' was used instead

     `type'
          `GM_HIGH_RECV_EVENT' indicates the receipt of a high-priority packet. `GM_RECV_EVENT' indicates the receipt of a low-priority packet.

`GM_PEER_RECV_EVENT'
`GM_HIGH_PEER_RECV_EVENT'
     These events may be safely ignored (passed to `gm_unknown()'), in which case the event will be converted to a normal `GM_RECV_EVENT' and passed to the client in the next call to a `gm_*receive*()' function. These events are just like the normal `GM_RECV_EVENT' and `GM_HIGH_RECV_EVENT' events, but indicate that the sender port id is the same as the receiver port id. Most GM programs should handle these events directly just like they handle normal receive events.
     `length'
          the number of bytes of received data

     `size'
          the size of the buffer into which the message was received

     `buffer'
          a pointer to the buffer passed in a call to `gm_provide_receive_buffer()', which allowed this receive to occur

     `sender_node_id'
          the GM identifier for the node that sent the message

     `sender_port_id'
          the GM identifier for the port that sent the message

     `tag'
          the tag passed to `gm_provide_receive_buffer_with_tag()' or 0 if `gm_provide_receive_buffer()' was used instead

     `type'
          The `PEER' event types indicate that the sender port number is the same as the receiver port number. The `HIGH' event types indicate that the message was sent with high priority.

`GM_FAST_RECV_EVENT'
`GM_FAST_HIGH_RECV_EVENT'
`GM_FAST_PEER_RECV_EVENT'
`GM_FAST_HIGH_PEER_RECV_EVENT'
     These events may be safely ignored (passed to `gm_unknown()'), in which case the event will be converted to a normal `GM_RECV_EVENT' and passed to the client in the next call to a `gm_*receive*()' function. The conversion process copies the received message from the receive queue into the receive buffer.

     These types indicate that a small-message receive occurred, with the small message stored in the receive queue for improved small-message performance. The `PEER' event types indicate that the sender port number is the same as the receiver port number. The `HIGH' event types indicate that the message was sent with high priority.

     If your program uses any small messages that are processed and discarded immediately upon receipt, it can improve performance by processing these messages directly. If, after examining the message, your program determines that it needs the data copied into the buffer, it can either call `gm_memorize_message()' to do so or pass the event to `gm_unknown()'.
     `message'
          a pointer to the received message, which is stored in the receive queue and is only guaranteed to be valid until the next call to `gm_receive()'

     `length'
          the number of bytes of received data

     `size'
          the size of the buffer into which the message was received

     `buffer'
          a pointer to the buffer passed in a call to `gm_provide_receive_buffer()', which allowed this receive to occur

     `sender_node_id'
          the GM identifier for the node that sent the message

     `sender_port_id'
          the GM identifier for the port that sent the message

     `tag'
          the tag passed to `gm_provide_receive_buffer_with_tag()' or 0 if `gm_provide_receive_buffer()' was used instead

     `type'
          The `PEER' types indicate that the sender port number is the same as the receiver port number. The `HIGH' types indicate that the message was sent with high priority.

     Note that although the received data is in the receive queue and no receive buffer was used to store the received message, the client must have provided an appropriate receive buffer before the receive could take place, and this buffer is passed back to the client in the fast receive event.

     If the client needs to keep the data `*message' past the next call to `gm_receive()', then the client should copy `*message' into `*buffer' using `gm_memorize_message()', which is simply a version of `bcopy()' optimized for copying aligned messages. After calling `gm_memorize_message()', the fast receive event becomes equivalent to a normal receive event.

`GM_NO_RECV_EVENT'
     No event is in the event queue.

`GM_RAW_RECV_EVENT'
     This type is for internal use by the GM mapper process and will never be received by normal GM clients. It provides the following information in the `event->recv' structure:

     `length'
          the number of bytes received

     `buffer'
          the location of the received bytes

`GM_SENT_EVENT'
     This type indicates that one or more sends completed.
     *Developers using the GM-1.1 API should never see this event type*, as it is generated only if the client calls the GM-1.0 `gm_send()' function, which is deprecated in favor of the superior `gm_send_with_callback()' functions. `event->sent.message_list' points to a null-terminated array of `void' pointers, which are message pointers from earlier `gm_send()' calls that have completed successfully. For each pointer in this array, a send token is implicitly returned to the client.

Although the number of receive events may seem daunting at first glance, almost all of the event types can be ignored. The following receive dispatch loop is fully functional for a nontrivial application that accepts messages from any port, accepts only small control messages sent with high priority, and accepts low-priority messages of any size:

     while (1)
       {
         gm_event_t *e;

         e = gm_receive (&my_port);
         switch (e->recv.type)
           {
           case GM_HIGH_RECV_EVENT:
             /* Handle high-priority control messages here in bounded time,
                then return the same buffer to GM. */
             gm_provide_receive_buffer (&my_port, gm_ntohp (e->recv.buffer),
                                        e->recv.size, e->recv.priority);
             break;
           case GM_RECV_EVENT:
             /* Handle data messages here in bounded time, then replace the
                consumed token with a buffer of the same size and priority. */
             gm_provide_receive_buffer (&my_port, some_buffer,
                                        e->recv.size, e->recv.priority);
             break;
           case GM_NO_RECV_EVENT:
             /* Do bounded-time processing here, if desired. */
             break;
           default:
             /* Let GM handle every event type not recognized above. */
             gm_unknown (&my_port, e);
           }
       }

However, the following implementation is slightly faster because it handles control messages without copying them into the receive buffer:

     while (1)
       {
         gm_event_t *e;

         e = gm_receive (&my_port);
         switch (e->recv.type)
           {
           case GM_FAST_HIGH_PEER_RECV_EVENT:
           case GM_FAST_HIGH_RECV_EVENT:
             /* Handle high-priority control messages here in bounded time,
                reading them directly from the receive queue. */
             gm_provide_receive_buffer (&my_port, gm_ntohp (e->recv.buffer),
                                        e->recv.size, e->recv.priority);
             break;
           case GM_FAST_PEER_RECV_EVENT:
           case GM_FAST_RECV_EVENT:
             /* Copy the small message out of the receive queue. */
             gm_memorize_message (e->recv.buffer, e->recv.message,
                                  e->recv.length);
             /* Fall through to the normal data-message handling. */
           case GM_PEER_RECV_EVENT:
           case GM_RECV_EVENT:
             /* Handle data messages here in bounded time */
             gm_provide_receive_buffer (&my_port, some_buffer,
                                        e->recv.size, e->recv.priority);
             break;
           case GM_NO_RECV_EVENT:
             /* Do bounded-time processing here, if desired. */
             break;
           default:
             gm_unknown (&my_port, e);
           }
       }

Any receive event not recognized by an application must be passed immediately to `gm_unknown()', as in the examples above. The function `gm_unknown()' frees any resources associated with the event that the client application would normally be expected to free if it recognized the type. In addition, undocumented event types will be received by an application and are handled by `gm_unknown()'. These events support features such as GM alarms and blocking receives.

The motivation for putting small messages in the receive queue, despite the fact that doing so might require a receive-side copy, is the following set of observations:

   * A large fraction of small received messages are control messages that can be processed immediately upon reception, and consequently do not need to be copied into the more permanent buffer to survive calls to `gm_receive()'.

   * For very small messages, performing an additional DMA to place the message in the buffer, rather than in the receive queue, is actually more expensive than having the host perform the copy.
Therefore, placing small received messages in the receive command queue rather than in the more permanent receive buffer enhances performance and is worth the added complexity.

To prevent program deadlock, the client software must ensure that GM is never without a receive token (buffer) for any potentially received message for more than a bounded amount of time. Generally, except for the case of message `forwarding' described in the next chapter, this means that after each successful call to `gm_receive()' the client should call `gm_provide_receive_buffer()' to replace the receive token (buffer) with one of the same SIZE and PRIORITY before the next call to `gm_receive()' or `gm_send()'. If such a deadlock condition exists for too long (over a minute), then the offending port will be closed.

File: gm.info, Node: Alarms, Next: High Availability Extensions, Prev: Receiving Messages, Up: Top

Alarms
******

GM provides the following simple alarm API. The alarm API allows the GM client to schedule a callback function to be called after a delay, specified in microseconds. An unbounded number of alarms may be set, although alarm overhead increases linearly with the number of set alarms, and the client must provide storage for each set alarm.

 - : void gm_initialize_alarm (gm_alarm_t *ALARM)
     Initialize a client-allocated `gm_alarm_t' structure for use with `gm_set_alarm()'. This function should be called after the structure is allocated but before a pointer to it is passed to `gm_set_alarm()' or `gm_cancel_alarm()'.

 - : void gm_cancel_alarm (gm_alarm_t *ALARM)
     Cancel a scheduled alarm, or do nothing if the alarm is not scheduled.

 - : void gm_set_alarm (struct gm_port *PORT, gm_alarm_t *ALARM, unsigned int USEC, void (*CALLBACK)(void *), void *CONTEXT)
     Schedule `CALLBACK(CONTEXT)' to be called after USEC microseconds (or later), or reschedule the alarm if it has already been scheduled and has not yet triggered. CALLBACK must be non-`NULL'.
     CONTEXT is treated as an opaque pointer by GM, and may be used to pass a pointer to the client-supplied CALLBACK function.

GM clients can also take advantage of the fact that an application is guaranteed to receive a single `GM_ALARM_EVENT' for each call to a client-supplied callback, with the corresponding callback occurring during the call to `gm_unknown()' that processes that alarm. This means that a case statement like the following in the client's event loop can be used to significantly reduce the overhead of polling for any effect of a client-supplied alarm callback:

     case GM_ALARM_EVENT:
       gm_unknown (port, event);
       /* poll for effect of alarm callbacks only here */
       break;
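
The alarm-polling pattern described above can be illustrated without a Myrinet installation. The following is only a minimal sketch: every `demo_' name is a hypothetical stand-in invented for this illustration and is not part of the GM API. It assumes, as stated above, that the alarm callback runs inside the `gm_unknown()'-like call that processes the alarm event, so the loop inspects the callback's effect only in that one case.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GM event types; NOT the real gm.h API. */
enum demo_event_type
{
  DEMO_NO_RECV_EVENT,
  DEMO_RECV_EVENT,
  DEMO_ALARM_EVENT
};

struct demo_event
{
  enum demo_event_type type;
};

/* A scripted event queue plus one registered alarm callback. */
struct demo_port
{
  const struct demo_event *queue;
  size_t count;
  size_t next;
  void (*alarm_callback) (void *);
  void *alarm_context;
};

/* State touched by the alarm callback and inspected by the event loop. */
struct demo_state
{
  int timed_out;      /* set by the callback */
  int flag_checks;    /* how often the loop looked at timed_out */
  int timeouts_seen;  /* how many timeouts the loop acted on */
};

static const struct demo_event demo_no_event = { DEMO_NO_RECV_EVENT };

/* Plays the role of gm_receive(): next event, or a no-event placeholder. */
static const struct demo_event *
demo_receive (struct demo_port *p)
{
  if (p->next < p->count)
    return &p->queue[p->next++];
  return &demo_no_event;
}

/* Plays the role of gm_unknown(): runs the alarm callback for alarm events. */
static void
demo_unknown (struct demo_port *p, const struct demo_event *e)
{
  if (e->type == DEMO_ALARM_EVENT && p->alarm_callback != NULL)
    p->alarm_callback (p->alarm_context);
}

/* Client-supplied alarm callback: only records that the alarm fired. */
static void
demo_on_alarm (void *context)
{
  ((struct demo_state *) context)->timed_out = 1;
}

/* Drain the scripted queue, polling the flag only after alarm events. */
static void
demo_run (struct demo_port *p, struct demo_state *s)
{
  for (;;)
    {
      const struct demo_event *e = demo_receive (p);

      switch (e->type)
        {
        case DEMO_NO_RECV_EVENT:
          return;               /* script exhausted; a real loop keeps going */
        case DEMO_ALARM_EVENT:
          demo_unknown (p, e);  /* the callback runs inside this call */
          s->flag_checks++;     /* poll for the callback's effect only here */
          if (s->timed_out)
            {
              s->timeouts_seen++;
              s->timed_out = 0;
            }
          break;
        default:
          demo_unknown (p, e);  /* all other events: no flag polling */
          break;
        }
    }
}
```

With a script of five events, two of which are alarm events, `demo_run()' inspects the flag twice rather than on every loop iteration, which is the saving the case statement above is meant to provide.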