************************************************************************
* Myricom GM networking software and documentation                    *
* Copyright (c) 2005 by Myricom, Inc.                                 *
* All rights reserved. See the file `COPYING' for copyright notice.    *
************************************************************************

README-MacOSX for gm-2.0

Supported OS/processors: Mac OS X 10.3 and 10.4 for PowerPC
Supported NICs: PCI64, PCI64A, PCI64B, PCI64C, PCIXD, PCIXF

Note: gm-2.0 does not interoperate with gm-1.x. Hosts running gm-1.x
      and hosts running gm-2.0 cannot talk to each other.

Table of Contents:
-----------------
   I. GM Binary Installation
        a. Unpacking the GM driver
        b. Installing the GM driver
        c. Testing the GM installation
  II. GM Source Installation
        a. Configuring and compiling GM
        b. Installing the GM driver
        c. Testing the GM installation
 III. Verifying the GM performance
  IV. Running IP over GM
   V. Improving IP Performance
  VI. Cautions/Caveats
 VII. Miscellaneous
        a. Uninstallation of the GM driver

************************************************************************
If difficulties are encountered, please consult the FAQ
   http://www.myri.com/scs/GM_FAQ.html
and direct all technical support questions to help@myri.com.

If you have problems on any architecture, please include the output
from /var/log/system.log with your report.
************************************************************************

======================================
I. GM Binary Installation Instructions
======================================

Current cautions and common problems:

 * Incorrectly written programs which attempt to free() or munmap()
   memory prior to deregistering it may "hang" indefinitely. For
   details, see the Cautions/Caveats section below.

GM installation is performed in three easy steps:

1. Unpacking the GM driver.
---------------------------

    tar zxvf gm-2.0_MacOSX-10.3-16ports.tar.gz

2. Installing the GM driver:
----------------------------

Select an installation directory path. It is usually best for this
path to name an NFS directory available on all machines that are to
share this GM installation. The directory must be accessible using
the same path on all machines that are to share the installation.
The path must be absolute; it must start with "/". However, it may
contain symbolic links.

    cd gm-2.0_MacOSX-10.3-16ports
    ./GM_INSTALL

You may omit the installation path to install the driver in /opt/gm/.

Next, you must run

    sudo /sbin/gm_install_drivers

on each machine to install the drivers on that machine.

If you wish for the driver to auto-load at boot, you must copy
StartupParameters.plist to /System/Library/StartupItems/GM:

    sudo cp /etc/StartupParameters.plist /System/Library/StartupItems/GM/

Alternatively, you may start and stop the drivers manually using:

    sudo /System/Library/StartupItems/GM/GM start
    sudo /System/Library/StartupItems/GM/GM stop
or
    sudo /System/Library/StartupItems/GM/GM restart

to start, stop, or restart the driver, respectively.

For directions on how to uninstall the GM driver, refer to the
"Miscellaneous" section.

Note: If the host is rebooted, you must reload the GM driver.

3. Testing the GM Installation.
-------------------------------

Once the GM software has been properly installed on all of the hosts
in your cluster, you are ready to validate your Myrinet installation
by performing the following sequence of tests.

 * Check the LEDs on each switch port and NIC port
 * Run gm_board_info on one host
 * Run gm_debug to test the PCI bandwidth
 * Run gm_allsize to test the links in the network
 * Run gm_stress to test the network

Each of these steps is detailed in the Troubleshooting section of the
FAQ
   http://www.myri.com/scs/FAQ/

The test scripts (gm_board_info, gm_debug, gm_allsize, gm_stress) are
available in /bin in your GM installation. A README describing each
of these tests can be found in /bin/README.

=======================================
II. GM Source Installation Instructions
=======================================

To compile GM on Mac OS X platforms, you will need GNU make and the
Apple C/C++ compiler. These tools are provided by Apple as part of
the Developer Tools package (also known as Xcode), available through
the Apple Developer Connection (http://www.apple.com/developer). The
Developer Tools package is also included in the Mac OS X 10.3 retail
box.

Current cautions and common problems:

 * Incorrectly written programs which attempt to free() or munmap()
   memory prior to deregistering it may "hang" indefinitely. For
   details, see the Cautions/Caveats section below.

GM installation is performed in three easy steps:

1. Configuring and compiling the GM driver:
-------------------------------------------

    tar zxvf gm-2.0_MacOSX.tar.gz
    cd gm-2.0_MacOSX
    ./configure
    make

2. Installing the GM driver:
----------------------------

Select an installation directory path. It is usually best for this
path to name an NFS directory available on all machines that are to
share this GM installation. The directory must be accessible using
the same path on all machines that are to share the installation.
The path must be absolute; it must start with "/". However, it may
contain symbolic links.

    cd binary
    ./GM_INSTALL

You may omit the installation path to install the driver in /opt/gm/.

Next, you must run

    sudo /sbin/gm_install_drivers

on each machine to install the drivers on that machine.

If you wish for the driver to auto-load at boot, you must copy
StartupParameters.plist to /System/Library/StartupItems/GM:

    sudo cp /etc/StartupParameters.plist /System/Library/StartupItems/GM/

Alternatively, you may start and stop the drivers manually using:

    sudo /System/Library/StartupItems/GM/GM start
    sudo /System/Library/StartupItems/GM/GM stop
or
    sudo /System/Library/StartupItems/GM/GM restart

to start, stop, or restart the driver, respectively.

For directions on how to uninstall the GM driver, refer to the
"Miscellaneous" section.

Note: If the host is rebooted, you must reload the GM driver.

3. Testing the GM Installation.
-------------------------------

Once the GM software has been properly installed on all of the hosts
in your cluster, you are ready to validate your Myrinet installation
by performing the following sequence of tests.

 * Check the LEDs on each switch port and NIC port
 * Run gm_board_info on one host
 * Run gm_debug to test the PCI bandwidth
 * Run gm_allsize to test the links in the network
 * Run gm_stress to test the network

Each of these steps is detailed in the Troubleshooting section of the
FAQ
   http://www.myri.com/scs/FAQ/

The test scripts (gm_board_info, gm_debug, gm_allsize, gm_stress) are
available in /bin in your GM installation. A README describing each
of these tests can be found in /bin/README.

=================================
III. Verifying the GM Performance
=================================

We recommend the following test to verify the GM performance.

View the results of the hardware benchmark test of the PCI bus with
the DMA engine of the Myrinet adapter.

    cd /bin
    ./gm_debug --no-counters

Note: The output of this command gives the maximum sustained
bandwidth that can be obtained from the PCI bus.

Refer to the section entitled "GM Performance" in the
{GM_HOME}/README for complete details on expected GM performance.

======================
IV. Running IP over GM
======================

Note: GM ethernet emulation ("IP over GM") is not supported in MacOSX
10.4. See "Caveats" below.

The IP device is accessed via

    ifconfig enX netmask broadcast up

where you must replace 'enX' with the appropriate name (en1, en2,
etc.).

Note that this is different from GM on other platforms, where the
device name is myriX. This is due to quirks in the Mac OS X network
stack, over which we have no control.

The first ethernet device number used by GM is obtained via sysctl:

    % sysctl net.gm.gm_base_en_unit
    net.gm.gm_base_en_unit: 1

This means that the first Myrinet NIC on this host is named "en1",
and subsequent NICs will be named "en2", "en3", etc.

Note that the GM network interface is not visible or configurable
through the Network control panel. You must use ifconfig to
configure it.

===========================
V. Improving IP performance
===========================

Note: GM ethernet emulation ("IP over GM") is not supported in MacOSX
10.4. See "Caveats" below.

If you don't see the speed that you expect, you might consider
adjusting the TCP send and receive window sizes. To adjust the
window sizes, modify the kern.ipc.maxsockbuf, net.inet.tcp.sendspace,
and net.inet.tcp.recvspace sysctl variables like this:

    % sudo sysctl -w kern.ipc.maxsockbuf=2097152
    % sudo sysctl -w net.inet.tcp.sendspace=262144
    % sudo sysctl -w net.inet.tcp.recvspace=262144

====================
VI. Cautions/Caveats
====================

*************************************************************************

64-BIT Applications on MacOSX 10.4:
-----------------------------------

GM does not support running 64-bit binaries on MacOSX 10.4. This is
due to two factors: MacOSX 10.4's IOKit is not 64-bit ready, and GM
does not support systems with 32-bit kernel pointers and 64-bit
application pointers.

Customers needing to run 64-bit applications may wish to consider
upgrading to MX. MX has experimental support for running 64-bit
applications.

ETHERNET EMULATION is NOT SUPPORTED on MacOSX 10.4:
---------------------------------------------------

The hooks that GM uses to do ethernet emulation have been removed
from the 10.4 kernel. Customers wishing to use ethernet emulation
over Myrinet on MacOSX 10.4 should consider upgrading to MX.

COMPILING from SOURCE:
----------------------

When building GM on MacOSX 10.4, you must use Apple gcc version 3.3.
When GM is compiled with Apple's gcc 4.0, we have seen the kextload
command core dump. Apple and Myricom are working to resolve this
issue. If GM fails to load, and the file
/Library/Logs/CrashReporter/kextload.crash.log has been modified at
the same time, check gcc_select to make sure you are building with
gcc 3.3.

You may not use the --enable-maintainer-mode flag on Mac OS X. This
is because the lanai cross compiler contains a bug which causes it to
crash when run on PowerPC. You must use the prebuilt firmware, or
build your firmware on another platform.

INCORRECTLY WRITTEN PROGRAMS "HANG"
-----------------------------------

Incorrectly written programs which attempt to free() or munmap()
memory prior to deregistering it may "hang" indefinitely. This
happens because the MacOSX/darwin kernel does not expect drivers to
leave memory regions "wired" in memory for long periods of time.
When the memory region is freed, the VM system blocks
uninterruptibly, waiting for the GM driver to unwire the memory. The
GM driver has no way of knowing that the memory system is waiting for
it to unwire a memory region.

To recover from this problem, you can unload and reload the module
(cd binary; sudo ./GM_INSTALL). At unload time, the GM driver will
free all memory associated with all ports and will send kill -9
signals to all processes with open ports. This will allow the hung
program to exit.

To diagnose this problem, run the program under sc_usage. If the
last system call the program made before "hanging" was vm_deallocate
or munmap, then your application has a bug which must be fixed before
it can run on MacOSX/darwin. See the sc_usage(1) man page for
instructions on how to run sc_usage.
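
The diagnosis rule above can be written down as a small shell helper.
This is only a sketch: the function name classify_last_syscall is
hypothetical, and it assumes you have already read the name of the
last system call off the sc_usage display by hand. It does not run or
parse sc_usage itself.

```shell
#!/bin/sh
# Hypothetical helper (not part of GM): apply the rule from this section
# to the name of the last system call seen before the program "hung".
classify_last_syscall() {
    case "$1" in
        vm_deallocate|munmap)
            # Matches the pattern described above: memory was released
            # while it was still registered with GM.
            echo "likely bug: memory freed before it was deregistered"
            return 0
            ;;
        *)
            # Any other call does not match this particular caveat.
            echo "last call '$1' does not match this caveat"
            return 1
            ;;
    esac
}
```

For example, "classify_last_syscall munmap" reports the likely bug and
returns 0, while any other system call name returns 1.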

Ethernet Unit Number "Stealing"
-------------------------------

Due to a quirk in the way the GM driver interacts with the
MacOSX/darwin network stack, users planning to use ethernet emulation
must load the GM module *after* loading all other network drivers.
If a network driver is loaded after GM, the newly loaded network
driver has no way of knowing what unit number the GM ethernet (en)
NIC is using, and it may "steal" the GM unit number. This could make
one or both devices unusable until the machine is rebooted.

DARWIN on INTEL
---------------

Darwin on Intel has not been tested and is not supported. We would
not be terribly surprised to hear that it worked, but we offer no
support.

MacOS 10.1 and 10.2
-------------------

MacOS 10.1 and 10.2 are not supported in this release of GM. We have
verified that the source release will compile and run on 10.2, but we
have not done extensive testing.

*************************************************************************

==================
VII. Miscellaneous
==================

------------------------------------
 a. Uninstallation of the GM driver
------------------------------------

The gm_install_drivers script generates the script
/sbin/gm_uninstall_drivers, which can be used to uninstall the
drivers.

The GM_INSTALL script generates the script /sbin/GM_UNINSTALL, which
can be used to uninstall GM.
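
As a rough illustration, the sketch below prints the two uninstall
commands for a given installation directory instead of running them,
so they can be reviewed first. Several things here are assumptions
rather than statements from this README: that the generated scripts
live under the sbin directory of the installation path, that the
driver script needs sudo (mirroring gm_install_drivers), and that the
drivers are removed before the GM tree. The variable GM_HOME and the
function name are hypothetical; /opt/gm is the default install
location mentioned earlier.

```shell
#!/bin/sh
# GM_HOME is a hypothetical variable naming the GM installation directory.
# /opt/gm is the default location used when no path is given to GM_INSTALL.
GM_HOME="${GM_HOME:-/opt/gm}"

# Print (do not run) the uninstall commands for the given install path.
print_gm_uninstall_steps() {
    prefix="${1:-$GM_HOME}"
    # Assumed to be run on each machine: removes the drivers that
    # gm_install_drivers installed there.
    echo "sudo $prefix/sbin/gm_uninstall_drivers"
    # Assumed to be run once: removes the files installed by GM_INSTALL.
    echo "$prefix/sbin/GM_UNINSTALL"
}

print_gm_uninstall_steps "$GM_HOME"
```

Passing a different directory as the first argument prints the
commands for that installation instead.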