
Sockets-MX Performance Results for Myri-10G

Operating System: Linux

Performance

-- LATENCY:
NetPIPE is a well-known benchmark that reports both latency and
bandwidth.
The following output was measured between two Linux 2.6.16
hosts with Myri-10G, connected through a Myrinet switch.

./tests/NetPIPE_3.6.2/NPtcp -h rain12 -l 1 -u 4194304 -n 100
Send and receive buffers are 118784 and 118784 bytes
(A bug in Linux doubles the requested buffer sizes)
Now starting the main loop
  0:       1 bytes    100 times -->     1.78 Mbps in     4.28 usec
  1:       2 bytes    100 times -->     3.47 Mbps in     4.40 usec
  2:       3 bytes    100 times -->     5.20 Mbps in     4.40 usec
  3:       4 bytes    100 times -->     6.91 Mbps in     4.42 usec
  4:       6 bytes    100 times -->    10.30 Mbps in     4.44 usec
  5:       8 bytes    100 times -->    13.74 Mbps in     4.44 usec
  6:      12 bytes    100 times -->    20.46 Mbps in     4.47 usec
  7:      13 bytes    100 times -->    19.78 Mbps in     5.02 usec
  8:      16 bytes    100 times -->    24.44 Mbps in     4.99 usec
  9:      19 bytes    100 times -->    28.85 Mbps in     5.02 usec
 10:      21 bytes    100 times -->    31.92 Mbps in     5.02 usec
 11:      24 bytes    100 times -->    36.51 Mbps in     5.02 usec
 12:      27 bytes    100 times -->    40.83 Mbps in     5.04 usec
 13:      29 bytes    100 times -->    43.95 Mbps in     5.03 usec
 14:      32 bytes    100 times -->    48.35 Mbps in     5.05 usec
 15:      35 bytes    100 times -->    45.65 Mbps in     5.85 usec
 16:      45 bytes    100 times -->    58.79 Mbps in     5.84 usec
 17:      48 bytes    100 times -->    62.60 Mbps in     5.85 usec
 18:      51 bytes    100 times -->    66.63 Mbps in     5.84 usec
 19:      61 bytes    100 times -->    79.22 Mbps in     5.87 usec
 20:      64 bytes    100 times -->    83.03 Mbps in     5.88 usec
 21:      67 bytes    100 times -->    85.49 Mbps in     5.98 usec
 22:      93 bytes    100 times -->   116.89 Mbps in     6.07 usec
 23:      96 bytes    100 times -->   120.57 Mbps in     6.07 usec
 24:      99 bytes    100 times -->   123.63 Mbps in     6.11 usec
 25:     125 bytes    100 times -->   154.05 Mbps in     6.19 usec
 26:     128 bytes    100 times -->   158.02 Mbps in     6.18 usec
 27:     131 bytes    100 times -->   131.78 Mbps in     7.58 usec
 28:     189 bytes    100 times -->   185.83 Mbps in     7.76 usec
 29:     192 bytes    100 times -->   188.78 Mbps in     7.76 usec
 30:     195 bytes    100 times -->   188.43 Mbps in     7.90 usec
 31:     253 bytes    100 times -->   240.24 Mbps in     8.03 usec
 32:     256 bytes    100 times -->   242.80 Mbps in     8.04 usec
 33:     259 bytes    100 times -->   242.77 Mbps in     8.14 usec
 34:     381 bytes    100 times -->   343.78 Mbps in     8.46 usec
 35:     384 bytes    100 times -->   345.90 Mbps in     8.47 usec
 36:     387 bytes    100 times -->   346.94 Mbps in     8.51 usec
 37:     509 bytes    100 times -->   435.10 Mbps in     8.93 usec
 38:     512 bytes    100 times -->   436.73 Mbps in     8.94 usec
 39:     515 bytes    100 times -->   436.61 Mbps in     9.00 usec
 40:     765 bytes    100 times -->   588.39 Mbps in     9.92 usec
 41:     768 bytes    100 times -->   591.84 Mbps in     9.90 usec
 42:     771 bytes    100 times -->   591.80 Mbps in     9.94 usec
 43:    1021 bytes    100 times -->   714.69 Mbps in    10.90 usec
 44:    1024 bytes    100 times -->   715.46 Mbps in    10.92 usec
 45:    1027 bytes    100 times -->   681.90 Mbps in    11.49 usec
 46:    1533 bytes    100 times -->   926.02 Mbps in    12.63 usec
 47:    1536 bytes    100 times -->   929.33 Mbps in    12.61 usec
 48:    1539 bytes    100 times -->   926.67 Mbps in    12.67 usec
 49:    2045 bytes    100 times -->  1151.00 Mbps in    13.56 usec
 50:    2048 bytes    100 times -->  1155.23 Mbps in    13.53 usec
 51:    2051 bytes    100 times -->  1091.96 Mbps in    14.33 usec
 52:    3069 bytes    100 times -->  1437.37 Mbps in    16.29 usec
 53:    3072 bytes    100 times -->  1440.14 Mbps in    16.27 usec
 54:    3075 bytes    100 times -->  1384.94 Mbps in    16.94 usec
 55:    4093 bytes    100 times -->  1637.51 Mbps in    19.07 usec
 56:    4096 bytes    100 times -->  1641.68 Mbps in    19.04 usec
 57:    4099 bytes    100 times -->  1477.62 Mbps in    21.16 usec
 58:    6141 bytes    100 times -->  1886.91 Mbps in    24.83 usec
 59:    6144 bytes    100 times -->  1886.65 Mbps in    24.85 usec
 60:    6147 bytes    100 times -->  1828.70 Mbps in    25.65 usec
 61:    8189 bytes    100 times -->  2204.96 Mbps in    28.33 usec
 62:    8192 bytes    100 times -->  2208.92 Mbps in    28.29 usec
 63:    8195 bytes    100 times -->  2098.42 Mbps in    29.80 usec
 64:   12285 bytes    100 times -->  2607.16 Mbps in    35.95 usec
 65:   12288 bytes    100 times -->  2606.32 Mbps in    35.97 usec
 66:   12291 bytes    100 times -->  2564.88 Mbps in    36.56 usec
 67:   16381 bytes    100 times -->  2886.28 Mbps in    43.30 usec
 68:   16384 bytes    100 times -->  2888.16 Mbps in    43.28 usec
 69:   16387 bytes    100 times -->  2843.04 Mbps in    43.98 usec
 70:   24573 bytes    100 times -->  3254.23 Mbps in    57.61 usec
 71:   24576 bytes    100 times -->  3255.44 Mbps in    57.60 usec
 72:   24579 bytes    100 times -->  3235.08 Mbps in    57.97 usec
 73:   32765 bytes    100 times -->  3493.24 Mbps in    71.56 usec
 74:   32768 bytes    100 times -->  3480.63 Mbps in    71.83 usec
 75:   32771 bytes    100 times -->  3966.98 Mbps in    63.03 usec
 76:   49149 bytes    100 times -->  4721.18 Mbps in    79.42 usec
 77:   49152 bytes    100 times -->  4705.44 Mbps in    79.69 usec
 78:   49155 bytes    100 times -->  4672.04 Mbps in    80.27 usec
 79:   65533 bytes    100 times -->  4970.68 Mbps in   100.59 usec
 80:   65536 bytes    100 times -->  4972.91 Mbps in   100.54 usec
 81:   65539 bytes    100 times -->  5154.59 Mbps in    97.01 usec
 82:   98301 bytes    100 times -->  5868.61 Mbps in   127.79 usec
 83:   98304 bytes    100 times -->  5866.05 Mbps in   127.85 usec
 84:   98307 bytes    100 times -->  5860.00 Mbps in   127.99 usec
 85:  131069 bytes    100 times -->  6229.84 Mbps in   160.51 usec
 86:  131072 bytes    100 times -->  6227.81 Mbps in   160.57 usec
 87:  131075 bytes    100 times -->  6277.82 Mbps in   159.29 usec
 88:  196605 bytes    100 times -->  6516.40 Mbps in   230.18 usec
 89:  196608 bytes    100 times -->  6520.89 Mbps in   230.03 usec
 90:  196611 bytes    100 times -->  6495.31 Mbps in   230.94 usec
 91:  262141 bytes    100 times -->  6881.16 Mbps in   290.65 usec
 92:  262144 bytes    100 times -->  6877.23 Mbps in   290.81 usec
 93:  262147 bytes    100 times -->  6777.33 Mbps in   295.10 usec
 94:  393213 bytes    100 times -->  7194.11 Mbps in   417.00 usec
 95:  393216 bytes    100 times -->  7191.84 Mbps in   417.14 usec
 96:  393219 bytes    100 times -->  7203.36 Mbps in   416.48 usec
 97:  524285 bytes    100 times -->  7373.42 Mbps in   542.49 usec
 98:  524288 bytes    100 times -->  7374.08 Mbps in   542.44 usec
 99:  524291 bytes    100 times -->  7325.57 Mbps in   546.04 usec
100:  786429 bytes    100 times -->  7562.23 Mbps in   793.41 usec
101:  786432 bytes    100 times -->  7556.53 Mbps in   794.01 usec
102:  786435 bytes    100 times -->  7543.97 Mbps in   795.34 usec
103: 1048573 bytes    100 times -->  7666.29 Mbps in  1043.53 usec
104: 1048576 bytes    100 times -->  7655.46 Mbps in  1045.01 usec
105: 1048579 bytes    100 times -->  7699.77 Mbps in  1038.99 usec
106: 1572861 bytes    100 times -->  7770.40 Mbps in  1544.32 usec
107: 1572864 bytes    100 times -->  7786.06 Mbps in  1541.22 usec
108: 1572867 bytes    100 times -->  7765.22 Mbps in  1545.36 usec
109: 2097149 bytes    100 times -->  7779.51 Mbps in  2056.68 usec
110: 2097152 bytes    100 times -->  7830.95 Mbps in  2043.17 usec
111: 2097155 bytes    100 times -->  7800.97 Mbps in  2051.03 usec
112: 3145725 bytes    100 times -->  7841.98 Mbps in  3060.45 usec
113: 3145728 bytes    100 times -->  7830.97 Mbps in  3064.76 usec
114: 3145731 bytes    100 times -->  7844.21 Mbps in  3059.59 usec
115: 4194307 bytes    100 times -->  7816.82 Mbps in  4093.74 usec
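
The Mbps column can be cross-checked against the size and time columns. The Python sketch below assumes each result row has the form "N: S bytes R times --> X Mbps in T usec" and that "Mbps" here means 2**20 bits per second, which is what the figures above appear consistent with (1 byte in 4.28 usec gives roughly 1.78). The names parse_row and mbps are illustrative only and are not part of NetPIPE.

```python
import re

# One NetPIPE result row: index, message size, repetitions, throughput, time.
ROW = re.compile(
    r"\s*(\d+):\s+(\d+) bytes\s+(\d+) times -->\s+([\d.]+) Mbps in\s+([\d.]+) usec"
)


def parse_row(line):
    """Return (index, size_bytes, reps, reported_mbps, usec), or None if no match."""
    m = ROW.match(line)
    if m is None:
        return None
    return (int(m.group(1)), int(m.group(2)), int(m.group(3)),
            float(m.group(4)), float(m.group(5)))


def mbps(size_bytes, usec):
    """Throughput in units of 2**20 bits per second (assumed convention)."""
    return size_bytes * 8 / (usec * 1e-6) / 2**20


# Three rows copied from the output above.
sample = [
    "  0:       1 bytes    100 times -->     1.78 Mbps in     4.28 usec",
    " 56:    4096 bytes    100 times -->  1641.68 Mbps in    19.04 usec",
    "115: 4194307 bytes    100 times -->  7816.82 Mbps in  4093.74 usec",
]

for line in sample:
    _, size, _, reported, usec = parse_row(line)
    # Small differences are expected because the printed time is rounded.
    print(f"{size:>8} bytes: reported {reported:8.2f}, recomputed {mbps(size, usec):8.2f}")
```

Because the printed times are rounded to two decimals, the recomputed values agree with the reported ones only to within a fraction of a percent.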

-- Intel Pallas Benchmark
Sockets-MX can also speed up existing HPC application binaries that use TCP/IP.
For the following test, the IMB benchmark was compiled and run under LAM/MPI.
The binary was directed to use the AF_MYRI protocol and reports a latency (including MPI overhead) of 5.36 usec.
fischer@rain12:~$ lamnodes
n0      rain12.sw.myri.com:1:origin,this_node
n1      rain13.sw.myri.com:1:
fischer@rain12:~$ mpirun -np 2 ~/IMB_2.3/IMB_2.3/src/IMB-MPI1-lam706 Pingpong
#---------------------------------------------------
#    Intel (R) MPI Benchmark Suite V2.3, MPI-1 part
#---------------------------------------------------
# Date       : Wed Dec  6 11:54:25 2006
# Machine    : x86_64
# System     : Linux
# Release    : 2.6.17.11-lp3
# Version    : #5 SMP Fri Sep 1 23:46:17 EDT 2006

#
# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0          100         5.15         0.00
            1          100         8.23         0.15
            2          100         8.74         0.22
            4          100         8.74         0.44
            8          100         8.70         0.88
           16          100         8.75         1.74
           32          100         8.72         3.50
           64          100         8.95         6.82
          128          100         9.14        13.36
          256          100        10.11        24.15
          512          100        10.91        44.77
         1024          100        13.18        74.07
         2048          100        15.79       123.69
         4096          100        22.13       176.55
         8192          100        30.75       254.06
        16384          100        46.33       337.22
        32768          100        76.55       408.23
        65536          100       106.50       586.85
       131072          100       185.39       674.27
       262144          100       311.77       801.86
       524288           80       598.14       835.92
      1048576           40      1093.74       914.29
      2097152           20      2102.40       951.29
      4194304           10      4174.30       958.24

fischer@rain12:~$
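
To compare the PingPong table with NetPIPE's Mbps figures, the Mbytes/sec column can be converted to bits. The Python sketch below assumes that IMB computes Mbytes/sec as message size divided by t[usec], with 2**20 bytes per Mbyte; this matches the larger-message rows above but is an inference from the numbers, not a statement of IMB's implementation. The function names are illustrative.

```python
def imb_mbytes_per_sec(size_bytes, t_usec):
    """Bandwidth in units of 2**20 bytes per second (assumed IMB convention)."""
    return size_bytes / (t_usec * 1e-6) / 2**20


def mbytes_to_mbits(mbytes_per_sec):
    """Convert Mbytes/sec to Mbits/sec, keeping the same 2**20 base."""
    return mbytes_per_sec * 8


# (bytes, t[usec], reported Mbytes/sec) copied from the PingPong table above.
rows = [
    (1024, 13.18, 74.07),
    (65536, 106.50, 586.85),
    (4194304, 4174.30, 958.24),
]

for size, t_usec, reported in rows:
    recomputed = imb_mbytes_per_sec(size, t_usec)
    print(f"{size:>8} bytes: reported {reported:7.2f} MB/s, "
          f"recomputed {recomputed:7.2f} MB/s, "
          f"about {mbytes_to_mbits(reported):8.2f} Mbit/s")
```

Under these assumptions the 4 MB PingPong result corresponds to roughly 7666 Mbit/s, which can be set beside the NetPIPE figures for messages of similar size.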