INSTALLATION

Enabling Sockets-GM:

There are two ways to enable Sockets-GM functionality.

The first is the LD_PRELOAD mechanism.

- It can be set using setenv, e.g.:
      setenv LD_PRELOAD /absolute/path/libgmsocks.1.0
- Another option is to edit /etc/ld.so.preload. This maps *all*
  applications running on your cluster completely to Sockets-GM. This
  is important if you want to rsh/ssh to your cluster and start an
  application which should be mapped to Sockets-GM/Myrinet. Ensure
  that the shared library resides on the system. Problems can arise
  when the shared library is in a user's home directory.

The second is the Sockets-GM library. Add it to your link step and your
socket calls will automatically be mapped to Myrinet.

The default library uses the dlsym call to look up the original libc
functions. These function pointers are set only once, so there is no
recurring performance penalty. Another option is to take the libc
socket functions from the object modules in libc.a, which avoids
calling dlsym.

How Sockets-GM works:

Sockets-GM is a user-level library which provides the entry points for
any TCP or UDP socket function. It implements the same semantics but
maintains a reference to the original socket functions. If two
communication partners which exchange data through send/recv functions
are both connected via Myrinet, Sockets-GM uses GM to send and receive
data. To make this possible, the initial accept/connect sequence
between a pair of processes establishes a connection using the
traditional functions, and also exchanges the necessary GM node and
port information over traditional sockets.

To keep applications working transparently, Sockets-GM also has to work
for processes which are not connected via Myrinet. To decide whether
the direct GM path can be taken, GM lookup functions are called during
the accept/connect phase.
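
The "reference to the original socket functions" and the one-time dlsym
lookup described above can be illustrated with a minimal sketch. It is
not taken from the Sockets-GM sources: the names sgm_original_send and
sgm_send are hypothetical, the use of RTLD_NEXT is an assumption about
how such a lookup is usually done in a preloaded library, and the
wrapper always falls back to libc instead of choosing a GM path.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for RTLD_NEXT on glibc */
#endif
#include <dlfcn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Signature of libc's send(). */
typedef ssize_t (*send_fn)(int, const void *, size_t, int);

/* Cached pointer to the original send(); filled in on first use. */
static send_fn real_send = NULL;

/* Hypothetical helper: resolve the original send() once, then reuse it. */
send_fn sgm_original_send(void)
{
    if (real_send == NULL) {
        /* RTLD_NEXT skips the current object and returns the next
           definition in library search order (normally the libc one). */
        real_send = (send_fn)dlsym(RTLD_NEXT, "send");
    }
    return real_send;
}

/* Hypothetical wrapper: a real library would decide here between the
   GM path and the original path; this sketch always uses libc. */
ssize_t sgm_send(int fd, const void *buf, size_t len, int flags)
{
    send_fn original = sgm_original_send();
    if (original == NULL)
        return -1; /* lookup failed */
    return original(fd, buf, len, flags);
}
```

Only the first call pays for the dlsym lookup; later calls reuse the
cached pointer. The sketch is not thread-safe, and on older glibc
versions it has to be linked with -ldl.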
If a server accepts connections, the parameters returned by the accept
call reveal the client's IP address. When a client connects to the
server, the IP address of the server is provided as well. Either IP
address can then be used directly or used to determine the hostname.

Requirement:

Sockets-GM determines whether the IP address is a valid Sockets-GM
candidate. Use gm_board_info to check that the GM hostname, which is
queried to validate the GM connection, is set correctly. The GM
hostname can be set either to the AF_INET IP address (from eth0) or to
the AF_INET hostname. This information must be followed by the id of
the board in use; see the examples below for board 0. For
interoperability with the module version it is strongly advised to use
the AF_INET IP address. You can create a /etc/gm/hostname file
containing this information. You must restart GM for the change to
take effect.

Applications that use multiple fork()s may not work with the user-level
implementation. For those applications the module implementation is
best suited. The limitation comes from GM, which does not allow a port
to be shared. This will be different for MX.

If you use the hostname, it must match the GM hostname reported by
gm_board_info. Depending on the setup, this affects the performance of
the accept/connect negotiation: if a GM lookup function is not
successful, YP packets are sent out and it takes time until all nodes
have been queried.
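
The naming requirement above (address or hostname, followed by the
board id) can be sketched as a small string check. This is an
illustration only, not the library's code: the function names are
hypothetical, the "name:board" form is taken from the gm_board_info
examples in this README, and the retry with the name cut at the first
dot is an assumption based on the debug trace in this README.

```c
#include <stdio.h>
#include <string.h>

/* Build "<host>:<board>" from the first host_len characters of host.
   Returns 1 on success, 0 if the result does not fit into buf. */
int sgm_format_gm_name(char *buf, size_t len,
                       const char *host, size_t host_len, int board)
{
    int n = snprintf(buf, len, "%.*s:%d", (int)host_len, host, board);
    return n >= 0 && (size_t)n < len;
}

/* Hypothetical check: does gm_name (as listed by gm_board_info) match
   the given IP address or hostname for the given board?
   Returns 1 for a match, 0 otherwise. */
int sgm_gm_name_matches(const char *gm_name, const char *host, int board)
{
    char expected[256];
    const char *dot;

    /* First try the name exactly as given, e.g. "192.168.1.112:0". */
    if (sgm_format_gm_name(expected, sizeof expected,
                           host, strlen(host), board)
        && strcmp(expected, gm_name) == 0)
        return 1;

    /* Assumed fallback: retry with the name cut at the first dot,
       e.g. "atipa1.sw.myri.com" becomes "atipa1". */
    dot = strchr(host, '.');
    if (dot != NULL
        && sgm_format_gm_name(expected, sizeof expected,
                              host, (size_t)(dot - host), board)
        && strcmp(expected, gm_name) == 0)
        return 1;

    return 0;
}
```

With this sketch, sgm_gm_name_matches("192.168.1.112:0",
"192.168.1.112", 0) and sgm_gm_name_matches("atipa1:0",
"atipa1.sw.myri.com", 0) both return 1, while a different board id or
a different hostname returns 0.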
Example:

    host_a:                        host_b:
    server 192.168.1.111 5000,     client 192.168.1.112 5000
  or
    server host_a 5000,            client host_b 5000

will use GM as a transport when gm_board_info provides (for example)

    Route table for this node follows:
    gmID MAC Address       gmName                           Route
    ---- ----------------- -------------------------------- ---------------------
       1 00:60:dd:7e:d6:61 192.168.1.111:0                  (this node)
       2 00:60:dd:7e:d6:64 192.168.1.112:0                  (mapper)

or

    gmID MAC Address       gmName                           Route
    ---- ----------------- -------------------------------- ---------------------
       1 00:60:dd:7e:d6:61 host_a:0                         (this node)
       2 00:60:dd:7e:d6:64 host_b:0                         (mapper)

Debug output from a 'connect'ing application (here the GM hostname is
atipa1):

    connecting to port: 5000.
    querying host: 172.31.130.81
    val for host 172.31.130.81 : 0
    #(use the IP address to determine whether the accepting node is in Myri Land)
    gmname_match: socketsgm_candidate for 172.31.130.81 set to : 0
    setting hname to: 172
    gmname_match: socketsgm_candidate for 172 set to : 0
    querying host: atipa1.sw.myri.com
    val for host atipa1.sw.myri.com : 0
    #(use the full address to determine whether the accepting node is in Myri Land)
    gmname_match: socketsgm_candidate for atipa1.sw.myri.com set to : 0
    setting hname to: atipa1
    gmname_match: socketsgm_candidate for atipa1 set to : 1
    connect: socketsgm_candidate for atipa1 set to : 1
    connect: YES, a SGM Candidate could be found

End of debugging output.

Depending on the size of the cluster, this lookup may take several
seconds. Successful lookups are much faster.

In addition: IP over Ethernet and IP over GM

Sockets-GM requires one existing TCP/IP stack. Sockets-GM uses this
protocol for the communication setup, but does not use it again after a
connection has been set up. If you have TCP/IP over GM enabled and you
are using the nodes' Myri IP addresses, you also have to specify these
as the GM hostnames.
For example, if the hostnames are host_a-myri and host_b-myri, then
gm_board_info should also show these hostnames or their respective IP
addresses.

If you want proof that Sockets-GM is in use, you can either enable
debugging or enable a printf in the accept/connect code to see whether
the Sockets-GM candidates are set to true.

COMPILING

0) Edit the Makefile.in to locate GM, specify which board to use
   (typically 0), ...
   The Makefile.in contains further information on the available flags
   and their meanings.
1) cd to Sockets-GM_SHRD_LIBRARY
2) Call make. If you are running on ia64 or a recent Suse/RH
   distribution, look for an updated Makefile such as Makefile.Linux64
   or others.

Different Linux Distributions:

RH 8.0
    The libgmsocks.o library requires additional object files from
    libc.a:
        ar xv /usr/lib/libc.a mprotect.o libc-cancellation.o register-atfork.o

CONFIGURATION

Multiple Myrinet Boards:

You might have more than one Myrinet board. To use a different board,
you can override the board number specified at compile time using an
environment variable. Depending on the motherboard's PCI configuration,
you may be able to double the total aggregate bandwidth.

To use multiple boards, they must have different hostnames. If the
AF_INET hostname is 192.168.1.201 and is also specified to be the
hostname for board #1, you can use ifconfig to create an alias like:

    ifconfig eth0 add 192.168.2.201

Then you also need to set the GM hostname to the new IP address. The
application then needs to be invoked like:

    server 192.168.2.201 5000

EXAMPLES

The examples directory contains small examples as well as benchmarks to
help you get familiar with Sockets-GM. Additional Readme files there
provide further help.
If you get a message that the shared GM library cannot be found, either
extend the LD_LIBRARY_PATH environment variable, or add the directory
which contains the shared library to the /etc/ld.so.conf file and run
ldconfig. With the latter, the GM library is then found automatically.

DEBUGGING

- GMSOCKS_ENABLE_DEBUG_HANDLER: A debugging aid. It catches a signal
  and invokes the debugger automatically. Before enabling it, please
  modify the source code to use an appropriate debugger. If enabled, a
  warning message will appear.
- You can also enable DEBUG to get additional output.