Ethernet Emulation (TCP/IP and UDP/IP) Performance for GM-2

In addition to its OS-bypass features, GM also presents itself to the host operating system as an ethernet interface. This "ethernet emulation" feature of GM allows Myrinet to carry any packet traffic and protocols that can be carried on ethernet, including TCP/IP and UDP/IP.

It is helpful to understand that when using ethernet emulation over GM, traffic goes from the application through the OS kernel to the GM driver, following the same path as it would for a "real" ethernet NIC; traffic does not go directly from the application to the NIC, as it does when using GM in its OS-bypass mode. Thus, the TCP/IP and UDP/IP performance over GM depends primarily on the host-CPU performance and the host-OS's IP protocol stack. This performance varies widely for different hosts and operating systems. Also, unlike GM's OS-bypass mode, which exhibits a very small host-CPU overhead, TCP/IP and UDP/IP protocol processing at high data-transfer rates may use a significant fraction of the host-CPU cycles.

The GM developers have streamlined ethernet emulation over GM wherever practical. For example, the ethernet-emulation code uses the PCIX-series NIC DMA engines to offload the receive-side IP-checksum computation for TCP/IP and UDP/IP in operating systems that support it (Linux, FreeBSD, MacOS-X, Tru64 5.1). This optimization results in less data being accessed in the host-OS kernel. GM supports 9000-Byte jumbo frames in addition to the standard 1500-Byte ethernet frames; indeed, the MTU (Maximum Transmission Unit) can be set to any value between 64 Bytes and 9000 Bytes. Larger frames result in fewer packets being sent to transfer the same amount of data. An optimization used in GM-2 but not provided in GM-1 is interrupt-coalescing, which reduces host overhead by batching multiple transmitted and received packets together, thereby reducing the number of interrupts the host needs to service.
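
To illustrate the frame-size point above, the following Python sketch counts how many frames a TCP transfer needs at the standard and jumbo MTU settings. It is a back-of-the-envelope estimate, not a description of GM internals: the function name is hypothetical, and it assumes 40 bytes of IPv4 and TCP headers per frame with no header options.

```python
import math

# Assumption: 20-byte IPv4 header + 20-byte TCP header, no options.
IP_TCP_HEADER_BYTES = 40


def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Return the number of frames needed to carry payload_bytes of TCP data."""
    payload_per_packet = mtu - IP_TCP_HEADER_BYTES
    if payload_per_packet <= 0:
        raise ValueError("MTU too small to carry any payload")
    return math.ceil(payload_bytes / payload_per_packet)


if __name__ == "__main__":
    transfer = 1_000_000  # bytes of application data
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {packets_needed(transfer, mtu)} packets")
```

Under these assumptions, a 1,000,000-byte transfer needs 685 frames at a 1500-Byte MTU but only 112 frames at a 9000-Byte MTU, which is roughly a sixfold reduction in per-packet work for the host.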

The tables below report the ethernet-emulation (TCP/IP and UDP/IP) performance of GM-2.1.9 and GM-2.0.19 between a pair of 3.06 GHz Intel Pentium-4 hosts that use the Serverworks Grand Champion chipset. The test machines were running Debian 3.0 and the kernel.org 2.6.11smp Linux kernel. Hyperthreading was enabled. The GM driver was configured to use a 9K MTU for ethernet emulation and to optimize interrupt coalescing for low-latency performance by issuing the command gm_ethertune -i 7.

The standard netperf 2.2pl4 benchmark produced the following bandwidth results for TCP and UDP. The TCP test requests 256K (262144-byte) socket buffers; the UDP test uses an 8K (8192-byte) message size.

NIC    Protocol  Bandwidth   CPU Utilization
                             Sender  Receiver
PCIXE  TCP       3674 Mb/s   37%     38%
       UDP       3964 Mb/s   36%     40%
PCIXD  TCP       1977 Mb/s   19%     23%
       UDP       1982 Mb/s   17%     19%

The following table shows the one-way (half-round-trip) latency for a 1-Byte message. The netperf request/response tests (TCP_RR and UDP_RR) report this data as a transaction rate per second ("Trans. Rate per sec"), where one transaction is a complete request/response round trip. We therefore divide 1 second by the transaction rate to get the full round-trip latency, then divide that by 2 to obtain the results below.

NIC    Protocol  One-way Latency  CPU Utilization
                                  Sender  Receiver
PCIXE  TCP       23 µs            17%     14%
       UDP       22 µs            15%     15%
PCIXD  TCP       25 µs            13%     12%
       UDP       24 µs            11%     12%
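
As a cross-check, the conversion from netperf's per-second rate to one-way latency can be sketched in a few lines of Python. The function name is illustrative and not part of netperf or GM; the rates are the "Trans. Rate" values taken from the raw netperf output attached to this page.

```python
def one_way_latency_us(transactions_per_sec: float) -> float:
    """Convert a netperf request/response rate to one-way latency in microseconds."""
    round_trip_s = 1.0 / transactions_per_sec  # one transaction = one full round trip
    one_way_s = round_trip_s / 2.0             # half of the round trip
    return one_way_s * 1e6                     # seconds -> microseconds


# "Trans. Rate per sec" values from the raw netperf output.
RATES = {
    ("PCIXE", "TCP"): 22026.08,
    ("PCIXE", "UDP"): 23033.84,
    ("PCIXD", "TCP"): 19996.15,
    ("PCIXD", "UDP"): 20881.64,
}

if __name__ == "__main__":
    for (nic, proto), rate in RATES.items():
        print(f"{nic} {proto}: {one_way_latency_us(rate):.1f} us")
```

Rounded to the nearest microsecond, these rates give 23, 22, 25, and 24 µs, matching the table.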

The "raw" netperf output for these tests is attached below.


Raw netperf output for PCIXE NICs:

>netperf224 -Hshout-my -l 60 -c -C -- -S262144 -s262144
TCP STREAM TEST to shout-my
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

217088 217088 217088    60.01      3673.80   36.48    38.18    1.627   1.703 


>netperf224 -Hshout-my -l 60 -c -C -tUDP_STREAM -- -m 8192
UDP UNIDIRECTIONAL SEND TEST to shout-my
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

108544    8192   60.01     3629130      0     3963.5     35.67    1.475 
108544           60.01     3629041            3963.4     39.89    1.649 


>netperf224 -Hshout-my -l 60 -c -C -tTCP_RR
TCP REQUEST/RESPONSE TEST to shout-my
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr

16384  87380  1       1      60.01   22026.08  16.58  13.91  15.059  12.628
16384  87380 

>netperf224 -Hshout-my -l 60 -c -C -tUDP_RR
UDP REQUEST/RESPONSE TEST to shout-my
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr

108544 108544 1       1      60.01   23033.84   15.34  15.00  13.322  13.024
108544 108544

Raw netperf output for PCIXD NICs:

>netperf224 -Hshout-my -l 60 -c -C -- -S262144 -s262144
TCP STREAM TEST to shout-my
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

217088 217088 217088    60.01      1976.77   19.46    22.78    1.613   1.888 

>netperf224 -Hshout-my -l 60 -c -C -tUDP_STREAM -- -m 8192
UDP UNIDIRECTIONAL SEND TEST to shout-my
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

108544    8192   60.01     1814677      0     1981.9     16.49    1.364 
108544           60.01     1814608            1981.8     19.24    1.591 

>netperf224 -Hshout-my -l 60 -c -C -tTCP_RR
TCP REQUEST/RESPONSE TEST to shout-my
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr

16384  87380  1       1      60.01   19996.15  13.14  12.06  13.147  12.067
16384  87380 

>netperf224 -Hshout-my -l 60 -c -C -tUDP_RR
UDP REQUEST/RESPONSE TEST to shout-my
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr

108544 108544 1       1      60.01   20881.64   10.96  11.79  10.496  11.296
108544 108544

02 June 2006