Welfenlab - Leibniz Universität Hannover

MPI on the Cell Cluster


To run programs over the high-speed InfiniBand network, we use MPI as the communication layer.
(For more information on MPI and its software structure, see the documentation in ...gdv/Forschung/Cell/Documents/XXXmpi.pdf.)

On the software side, InfiniBand uses OFED 1.4, the communication stack freely available from OpenFabrics.org. For systems running RHEL, Mellanox, the hardware vendor of the IB cards, provides a special ISO image for upgrading the communication stack (including firmware updates for the cards).
On Fedora, you must download the stack from OpenFabrics.org yourself. One package of the Fedora base system (iscsi-xxxx) conflicts with it; this conflict should be ignored.

Currently, we have two versions of MPI installed:


You can choose between them with the mpi-selector command (mpi-selector --list shows the available installations). It stores your selection in a file in your home directory, so that the same choice is used in future sessions.
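
A selection session might look like the following sketch. It assumes that mpi-selector is on the PATH and uses the --list, --set, --user and --query options from the mpi-selector manpage. The helper run and the DRY_RUN variable are only illustrative and not part of the tool; any installation name you pass must be one that mpi-selector --list actually prints.

```shell
#!/bin/sh
# Sketch: pick an MPI installation with mpi-selector.
# Assumption: mpi-selector is installed and on PATH.
# With DRY_RUN=1 the commands are only printed, not executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# List the MPI installations registered on this machine.
list_mpi() {
    run mpi-selector --list
}

# Select one installation for the current user only.
# $1 must be a name exactly as printed by list_mpi.
select_mpi() {
    if [ -z "$1" ]; then
        echo "usage: select_mpi <name>" >&2
        return 2
    fi
    run mpi-selector --set "$1" --user
}

# Show which installation is currently selected.
show_mpi() {
    run mpi-selector --query
}
```

After sourcing the functions, call list_mpi, then select_mpi with one of the printed names, and check the result with show_mpi. The new selection normally takes effect only in shells started afterwards.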


InfiniBand related problems


Q: I cannot reach the nodes via TCP/IP. The node does not respond to an ICMP ping.

Unlike an Ethernet switch, which learns the addresses of the attached hosts on its own, our InfiniBand switches do not discover the connected network adapters by themselves. This part of the logic is handled by a subnet manager, which has to run on one of the hosts connected to the switch before TCP/IP connections work.

You will have to start the so-called subnet manager opensm on one of the machines by typing
service opensmd start

Afterwards you should be able to ping a node.
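
The steps above can be sketched as a few shell functions. This is only a sketch under assumptions: the commands need root privileges, ibstat (from the InfiniBand diagnostic tools) is assumed to be installed, and the helper run with its DRY_RUN variable is illustrative. The node name you pass to check_node is whatever name or address the node has on the InfiniBand network.

```shell
#!/bin/sh
# Sketch: start the subnet manager and verify that a node answers.
# Assumption: run as root on one machine attached to the InfiniBand switch.
# With DRY_RUN=1 the commands are only printed, not executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start opensm through its init script (command as given on this page).
start_subnet_manager() {
    run service opensmd start
}

# Print the state of the local InfiniBand ports.
# Assumption: ibstat is installed; the port state should read "Active"
# once a subnet manager is running on the fabric.
show_port_state() {
    run ibstat
}

# Send three ICMP echo requests to the node given as $1.
check_node() {
    if [ -z "$1" ]; then
        echo "usage: check_node <node>" >&2
        return 2
    fi
    run ping -c 3 "$1"
}
```

A typical order is start_subnet_manager, then show_port_state, then check_node with the name of the node that did not answer before.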



Benchmarking the System

Common Benchmarks

For very general benchmarks of the network for diagnostic purposes, you can use

  • ib_rdma_[bw/lat], ib_tcp_[bw/lat] (coming from infiniband-diagnostics)
  • qperf (with various options, see manpage)
  • iperf
Typical results so far: about 1500 MiB/s on RDMA transfers.
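
As an illustration, a point-to-point measurement between two nodes with qperf and iperf could look like the sketch below. It assumes both tools are installed on both nodes. The test names rc_bw, rc_lat, tcp_bw and tcp_lat are taken from the qperf manpage; the helper run and the DRY_RUN variable are illustrative, and the server name is whatever name the server node has on the InfiniBand network.

```shell
#!/bin/sh
# Sketch: bandwidth and latency checks between two nodes.
# Assumption: qperf and iperf are installed on both nodes.
# With DRY_RUN=1 the commands are only printed, not executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the server node: qperf without arguments waits for a client.
qperf_server() {
    run qperf
}

# On the client node: RDMA (reliable connection) and TCP
# bandwidth and latency against the server given as $1.
qperf_client() {
    if [ -z "$1" ]; then
        echo "usage: qperf_client <server>" >&2
        return 2
    fi
    run qperf "$1" rc_bw rc_lat tcp_bw tcp_lat
}

# On the server node: iperf in server mode.
iperf_server() {
    run iperf -s
}

# On the client node: TCP throughput against the server given as $1.
iperf_client() {
    if [ -z "$1" ]; then
        echo "usage: iperf_client <server>" >&2
        return 2
    fi
    run iperf -c "$1"
}
```

Start qperf_server (or iperf_server) on one node first, then run the matching client function on the other node. Comparing the RDMA numbers with the TCP numbers gives a quick impression of whether the InfiniBand path is actually used.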

More advanced tests that run on top of MPI are also available:
  • OSU tests (from Ohio State University, a benchmark suite provided by mvapich)
  • IMB (Intel MPI Benchmark)
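
A two-node run of one of the OSU tests could be launched as in the sketch below. It assumes Open MPI's mpirun (other MPI implementations use a different launcher syntax) and an already built OSU binary such as osu_bw or osu_latency; the node names and the binary path in the usage note are hypothetical, as are the helper run and the DRY_RUN variable.

```shell
#!/bin/sh
# Sketch: run a two-process OSU benchmark between two nodes.
# Assumptions: Open MPI's mpirun is selected and the OSU binaries are built.
# With DRY_RUN=1 the command is only printed, not executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1, $2: the two node names; $3: path to the benchmark binary.
osu_pair() {
    if [ $# -ne 3 ]; then
        echo "usage: osu_pair <node1> <node2> <binary>" >&2
        return 2
    fi
    # One process per node, so the traffic crosses the network.
    run mpirun -np 2 -host "$1,$2" "$3"
}
```

For example, osu_pair node01 node02 ./osu_bw measures bandwidth between the two nodes, and the same call with ./osu_latency measures latency.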

Linpack

The reference benchmark for HPC systems is Linpack. You can download the benchmark from the links below:
Linpack Webpage (including patch for QS22)
Readme to QS22-optimized linpack

For the QS22 you will also need the following (as described in the readme):
  • ATLAS 3.8.1 (warning: long build, takes approx. 12 hours)
  • OpenMPI 1.3.3 (custom version for infiniband)
