Online Version of this Document (likely to be more recent):
--------------------------------------------------------------
http://www.myri.com/scs/READMES/NDPROV-MX/README_NDPROV-MX.txt

General Information:
--------------------

What is NDPROV-MX ?
---------------------------------------
NDPROV-MX is a proxy driver that plugs into the Network Direct
architecture available with Microsoft HPC Server.

This package includes
- the installer inst_ndprov_mx.exe, which registers ndprov-mx.dll
- ndprov-mx.dll and its lib
- a few binary test cases, e.g. a pingpong test that should give you
  an idea of the bandwidth and latency available at ND level

Low level software requirements (MX):
---------------------------------------
- Make sure the "Microsoft Visual C++ 2008 Redistributable Package (x64)"
  is installed on your hosts. If the package is missing, you will see
  the error "NdStartup failed with 800736b1".
- Before using NDPROV-MX, MX has to be installed on the Windows systems.
  The MX version must be MX 1.2.7 or higher.
  Run mx_info.exe, which should list the Windows nodes running MX.
  If a node is missing, use Remote Desktop to log into that machine
  and look for errors.
  MX can run over Myrinet (MXoM) or Ethernet (MXoE) (Layer 1).
  In both cases mx_info.exe should list the participating hosts.

Configuration:
---------------------------------------
No configuration is required. NDPROV-MX detects and probes the network
to determine candidates for NDPROV-MX.

Optionally, some behavior of the NDPROV-MX proxy is driven by
environment parameters. If you open an MS-DOS prompt and do

  cmd> set = xyz

you will either obtain a little debug output or change the runtime
behavior of the proxy.

Parameter:

NDPROV_MX_BLOCKING
  -- set NDPROV_MX_BLOCKING=1 for activation
  -- Causes the communication manager to wait for events immediately
     instead of polling first.
     Default: poll/wait

NDPROV_MX_SPIN_COUNT
  -- set NDPROV_MX_SPIN_COUNT=500 (number of loops before blocking)
  -- The value is used as the loop count when polling for messages.
     Default: 200
     This allows fine tuning with respect to CPU load (spinning isn't
     resource friendly, but it is key for lowest latency).

NDPROV_MX_ZCOPY_THRESHOLD
  -- set NDPROV_MX_ZCOPY_THRESHOLD=524000
  -- Overrides the default ND LargeRequestThreshold value, which is
     set to 128*1024 bytes.

NDPROV_MX_ENABLE_AFFINITY
  -- When set, NDPROV-MX performs SetProcessAffinityMask.

NDPROV_MX_SUBNET
  -- e.g.: set NDPROV_MX_SUBNET=192.168.0
  -- Avoids NDPROV-MX confusion about which IP to use for connections.

------------- UNDER DEBUG MODE ----------------------
The debug dll offers the additional environment variables described
under LOGGING INFORMATION.
The release dll is named ndprov-mx.dll; the debug dll is named
ndprov-mx.deb.dll.
To activate the debug dll, rename it to the name of the release dll.
Make sure you keep a copy of the release dll.

-------------- LOGGING INFORMATION --------------------
The ND Provider for MX uses the concept of a debug mask.
Add -env NDPROV_MX_DBG_LEVEL xyz to the parameters, where xyz is the
bitwise OR (the sum) of the following masks:

  NDPROV_MX_DBG_CONNECTOR        1
  NDPROV_MX_DBG_CQ               2
  NDPROV_MX_DBG_SEND             4
  NDPROV_MX_DBG_RECV             8
  NDPROV_MX_DBG_CADAPTER        16
  NDPROV_MX_DBG_UNEXP_HANDLER   32
  NDPROV_MX_DBG_CLISTEN         64
  NDPROV_MX_DBG_CENDPOINT      128

e.g. -env NDPROV_MX_DBG_LEVEL 6 will log operations on _SEND and _CQ

NDPROV_MX_LOG_FILE
  -- When set: information is written to a file named debug_.txt
     in the working directory.
     Default: Uses OutputDebugString

NDPROV_MX_LOG_STDOUT
  -- When set: information is printed to stdout.
     When NDPROV_MX_LOG_STDOUT is set, NDPROV_MX_LOG_FILE has no effect.
     Default: Uses OutputDebugString

Testing Platforms:
------------------
NDPROV-MX has been tested on x64 machines running Windows Server 2003
and Microsoft HPC Server 2008.

Remarks:
---------
With the debug dll, OutputDebugString output can be captured via DebugView.

Troubleshooting:
----------------
I am getting an error code such as:

Q: "NdStartup failed with 800736b1"
A: "Install Microsoft Visual C++ 2008 Redistributable Package"

Q: "INDConnector::Connect to www.xxx.yyy.zzz failed with 0xc000023d"
A: "Run mx_info.exe and check that the MX hostname corresponds with the Windows hostname"
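
Example (debug mask):
---------------------
The NDPROV_MX_DBG_LEVEL value described under LOGGING INFORMATION is a
combination of the listed mask values. The following is a minimal
sketch of how such a value could be computed and decoded. It is not
part of the NDPROV-MX package: the helper names dbg_level and dbg_names
and the DBG_MASKS table are hypothetical and introduced only for
illustration. The mask values are copied from the table in this README.

```python
# Mask values copied from the LOGGING INFORMATION table in this README.
# The dictionary and helper names are hypothetical, for illustration only.
DBG_MASKS = {
    "CONNECTOR": 1,
    "CQ": 2,
    "SEND": 4,
    "RECV": 8,
    "CADAPTER": 16,
    "UNEXP_HANDLER": 32,
    "CLISTEN": 64,
    "CENDPOINT": 128,
}


def dbg_level(*names):
    """Combine the named masks with bitwise OR into one debug level."""
    level = 0
    for name in names:
        level |= DBG_MASKS[name.upper()]
    return level


def dbg_names(level):
    """Return the mask names whose bit is set in the given debug level."""
    return [name for name, bit in DBG_MASKS.items() if level & bit]


if __name__ == "__main__":
    level = dbg_level("SEND", "CQ")
    # Prints the argument in the form used in this README.
    print("-env NDPROV_MX_DBG_LEVEL %d" % level)
    print("enabled:", ", ".join(dbg_names(level)))
```

For _SEND and _CQ this yields 6, which matches the example given under
LOGGING INFORMATION. Because every mask is a distinct power of two,
adding the values gives the same result as the bitwise OR.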