
Sockets-GM Overview and Performance

Overview

Sockets-GM is a new middleware layer that mimics sockets semantics and replaces the traditional TCP/IP over Ethernet protocol to allow low-latency, high-speed data transfers. Current TCP/IP implementations impose a high system load. Sockets-GM avoids this by bypassing the TCP/IP protocol stack, which takes up to 50% of the time spent in communication.

Features

Sockets-GM achieves binary compatibility with existing applications through interception techniques.
These techniques differ from one operating system to another.
Sockets-GM is currently available for all Linux and Solaris versions, and for Windows 2000 Professional and Server as well as .NET.
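
Binary compatibility means that an application keeps using the ordinary sockets calls and does not need to be changed. As a minimal sketch of what such unmodified code looks like, the following echo exchange uses only the standard sockets API over the loopback interface. Nothing in it is specific to Sockets-GM; the function name `echo_once` is hypothetical, and whether a given runtime is actually intercepted depends on the installation.

```python
import socket
import threading


def echo_once(payload: bytes) -> bytes:
    """Send payload to a local echo server over a plain TCP socket and return the reply."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve() -> None:
        # Accept one connection and echo everything back until the peer stops sending.
        conn, _ = server.accept()
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                conn.sendall(chunk)

    worker = threading.Thread(target=serve)
    worker.start()

    reply = bytearray()
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)  # tell the server no more data is coming
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            reply.extend(chunk)

    worker.join()
    server.close()
    return bytes(reply)


if __name__ == "__main__":
    print(echo_once(b"hello sockets"))
```

The application only sees `socket`, `connect`, `send` and `recv`; which transport carries the bytes is decided underneath that interface.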

Test cases

Several projects have been validated using Sockets-GM. To name a few:
PVM 3.4, MPICH-2, Pallas PMB, Intel iSCSI, netperf, netpipe, Iperf, NTttcp, and more.

Concept

Depending on the installation, Sockets-GM can co-exist with the traditional TCP/IP protocol. Sockets-GM will create a companion socket which will be used after a connection between two endpoints has been established.

Figure: Sockets-GM Concept

The concept of Sockets-GM has been implemented using different approaches. Sockets-GM has been implemented as:
Sockets-GM Linux Module
Sockets-GM Library

For Windows, Sockets-GM has been implemented as a Layered Service Provider and as a System Area Network proxy for Winsock Direct.

Figure: Sockets-GM Windows

When using Sockets-GM, applications operate entirely in user mode. Costly traps into the kernel are avoided, which reduces latency significantly. Sockets-GM is implemented to mimic the existing socket interfaces, which allows for high efficiency and makes it possible to apply optimization techniques.

For this, Sockets-GM offers two communication concepts: buffered communication, in which the data is copied once into pre-registered buffers, and a zero-copy protocol, in which data is exchanged directly between application buffers using GM RDMA functions. The latter cuts down system load: approximately 200 MBytes/s are exchanged with a CPU load of 3%.

Another advantage of Sockets-GM is that it can be tuned for specific applications. Threshold values can be set dynamically to specify when the zero-copy protocol should be used. For latency-sensitive applications, Sockets-GM returns to the calling application much earlier, because only a copy of the data is made; the actual message delivery is then handled by the Myrinet NIC. Moreover, the performance boost is consistent on any given system. On operating systems that do not allow the protocol stack to be tuned, the performance increase is much higher.
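
The threshold-based choice between the two protocols can be sketched as follows. This is an illustration only, not the Sockets-GM implementation: the function `choose_protocol`, the constant names and the example threshold of 16384 bytes are all hypothetical, since the actual tuning parameters are not given here.

```python
BUFFERED = "buffered"
ZERO_COPY = "zero-copy"


def choose_protocol(message_size: int, zero_copy_threshold: int) -> str:
    """Pick a transfer path for one send call based on message size (illustrative only)."""
    if message_size < 0 or zero_copy_threshold < 0:
        raise ValueError("sizes must be non-negative")
    # Below the threshold: copy once into a pre-registered buffer and return early.
    # At or above it: skip the copy and move the data directly between application buffers.
    return ZERO_COPY if message_size >= zero_copy_threshold else BUFFERED


if __name__ == "__main__":
    threshold = 16384  # hypothetical value; the real setting is tunable per application
    for size in (512, 16384, 1048576):
        print(size, choose_protocol(size, threshold))
```

Lowering the threshold sends more traffic down the zero-copy path and reduces CPU load. Raising it favors the early return of the buffered path for small, latency-sensitive messages.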

Compared with TCP/IP implementations, Sockets-GM allows for higher throughput while requiring less CPU load. An application running under Sockets-GM can also communicate with applications that are not connected via Myrinet. In this case, the conventional TCP/IP over Ethernet protocol is used.

Performance

Depending on your configuration, a benchmark such as netperf achieves up to 3.9 Gbps with a CPU utilization of 8%.
Latency numbers based on round-trip communication are slightly higher than those of GM.
Detailed performance graphs are being updated. Some netperf results and results from the Pallas PMB benchmark are presented below:
fischer@atipa4:~/Sockets-GM_MODULE$ ./tests/netperf/netperf -l 1 -H atipa3 -- -m 4000 -M 4000
TCP STREAM TEST to atipa3
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65535  65535   4000    1.00     3388.58

fischer@atipa4:~/Sockets-GM_MODULE$ ./tests/netperf/netperf -l 1 -H atipa4 -- -m 8100 -M 8100
TCP STREAM TEST to atipa4
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 65535  65535   8100    1.00     3829.28
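
netperf reports throughput in units of 10^6 bits per second, while the prose above quotes MBytes/s. A small conversion sketch makes the two netperf runs comparable with those figures. The helper name is hypothetical, and the input values are copied from the output above.

```python
def mbits_to_mbytes_per_sec(throughput_mbits: float) -> float:
    """Convert netperf's 10^6 bits/sec into 10^6 bytes/sec (8 bits per byte)."""
    return throughput_mbits / 8.0


if __name__ == "__main__":
    # (message size in bytes, throughput in 10^6 bits/sec) from the two runs above
    runs = [(4000, 3388.58), (8100, 3829.28)]
    for size, mbits in runs:
        print(f"{size}-byte messages: {mbits_to_mbytes_per_sec(mbits):.2f} x 10^6 bytes/s")
```

The two runs work out to roughly 424 and 479 million bytes per second respectively.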


fischer@atipa3:~/Sockets-GM-1.7/Sockets-GM_MODULE$ ~/mpich2-0.97/bin/mpirun -hf ~/hostfile -np 2 ~/PMB2.2/SRC/PMB-MPI1
#---------------------------------------------------
#---------------------------------------------------
# Date       : Sat Aug 14 09:02:59 2004
# Machine    : i686
# System     : Linux
# Release    : 2.4.25generic
# Version    : #3 SMP Wed Mar 17 15:41:21 PST 2004

#
# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Alltoall
# Bcast
# Barrier

#---------------------------------------------------
# Benchmarking PingPong
# ( #processes = 2 )
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0           10        19.50         0.00
            1           10        25.89         0.04
            2           10        25.65         0.07
            4           10        26.25         0.15
            8           10        26.66         0.29
           16           10        26.39         0.58
           32           10        27.41         1.11
           64           10        26.89         2.27
          128           10        27.00         4.52
          256           10        28.15         8.67
          512           10        32.20        15.16
         1024           10        37.15        26.29
         2048           10        45.99        42.47
         4096           10        55.40        70.51
         8192           10        76.64       101.94
        16384           10       108.70       143.75
        32768           10       159.50       195.92
        65536           10       254.45       245.63
       131072           10       522.35       239.30
       262144           10      1023.45       244.27
       524288           10      1982.75       252.18
      1048576           10      3857.29       259.25
      2097152           10      7708.90       259.44
      4194304           10     15271.85       261.92
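
The Mbytes/sec column can be reproduced from the other two columns. The figures in the table are consistent with dividing the message size by the reported time and counting one MByte as 2^20 bytes. The sketch below checks this against three rows; the function name is hypothetical and the row values are copied from the table above.

```python
def pmb_mbytes_per_sec(num_bytes: int, t_usec: float) -> float:
    """Bandwidth in MBytes/s from message size and time, with 1 MByte = 2**20 bytes."""
    if t_usec <= 0:
        raise ValueError("time must be positive")
    bytes_per_sec = num_bytes / (t_usec * 1e-6)
    return bytes_per_sec / 2**20


if __name__ == "__main__":
    # (#bytes, t[usec], reported Mbytes/sec) taken from the PingPong table above
    rows = [(1024, 37.15, 26.29), (65536, 254.45, 245.63), (4194304, 15271.85, 261.92)]
    for num_bytes, t_usec, reported in rows:
        computed = pmb_mbytes_per_sec(num_bytes, t_usec)
        print(f"{num_bytes:>8} bytes: computed {computed:.2f}, reported {reported:.2f}")
```

Because of the 2^20 convention, 261.92 Mbytes/sec in this table corresponds to about 274.6 million bytes per second, which matters when comparing against the netperf numbers.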



The full Pallas PMB run between two Xeon 2.4 nodes equipped with Myrinet 2XP cards can be found here.