MPI on the Cell Cluster
To run programs over the high-speed InfiniBand network, we use MPI as the communication layer.
(For more information on MPI and its software structure, see the documentation in ...gdv/Forschung/Cell/Documents/XXXmpi.pdf.)
On the software side, InfiniBand uses OFED 1.4, the communication stack freely available from OpenFabrics.org. On systems running RHEL, Mellanox, the hardware vendor of the IB cards, provides a special ISO image for upgrading the communication stack (including firmware updates for the cards).
On Fedora, you must download the stack from OpenFabrics.org. A package from the Fedora base (iscsi-xxxx) conflicts with it and should be ignored.
Currently, we have two versions of MPI installed:
- OpenMPI 1.3.3
- MVAPICH 1.1.0
You can choose between them with the mpi-selector command (mpi-selector --list shows the available installations). It creates a file in your home directory so that your selection is remembered for future sessions.
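The selection step can be sketched as a small shell function. This is a minimal sketch, not the cluster's official procedure: the helper names run and select_mpi are hypothetical, and the installation name you pass must be one of the names that mpi-selector --list actually prints on your machine.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# select_mpi NAME (hypothetical helper): list the registered MPI
# installations, store NAME as the per-user default, and show the result.
# NAME is an assumption; take it from the output of "mpi-selector --list".
select_mpi() {
    name="$1"
    run mpi-selector --list
    # --user writes the per-user setting; --yes answers the overwrite prompt.
    run mpi-selector --set "$name" --user --yes
    run mpi-selector --query
}
```

Call it as select_mpi NAME, or prefix the call with DRY_RUN=1 to see the commands without running them. The new selection normally applies only to shells started afterwards.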
InfiniBand related problems
Q: I cannot reach the nodes via TCP/IP. The node fails to respond to an ICMP ping.
The InfiniBand network switches have no logic for detecting the connected network adapters. In contrast to Ethernet, the switch does not provide a lookup similar to an ARP cache. A host connected to the switch therefore has to provide this logic before TCP/IP connections can work.
You will have to start the so-called subnet manager, opensm, on one of the machines by typing:
service opensmd start
Afterwards you should be able to ping a node.
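The fix above can be wrapped into one function for repeated use. This is a sketch under assumptions: the helper names run and start_sm_and_ping are hypothetical, the node name is whatever host you want to reach, and the sminfo check assumes that the InfiniBand diagnostic tools are installed. Starting the service usually requires root.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# start_sm_and_ping NODE (hypothetical helper): start the subnet manager
# on this machine, query which subnet manager is active, then ping NODE.
start_sm_and_ping() {
    node="$1"
    run service opensmd start
    # sminfo reports the active subnet manager (assumes the IB
    # diagnostic tools are installed on this machine).
    run sminfo
    # Three ICMP echo requests to the node.
    run ping -c 3 "$node"
}
```

Run it on the one machine that should host the subnet manager; the node argument is the host name or address you could not reach before.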
Benchmarking the System
Common Benchmarks
To run some very general benchmarks of the network for diagnostic purposes, you can use:
- ib_rdma_[bw/lat], ib_tcp_[bw/lat] (coming from infiniband-diagnostics)
- qperf (with various options, see manpage)
- iperf
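As a rough illustration of the qperf and iperf entries, the client side of a point-to-point test might look like the sketch below. The helper names run and bench_client are hypothetical, and the server name is an assumption. The sketch also assumes that qperf (started without arguments) and iperf -s are already running on the server node.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# bench_client SERVER (hypothetical helper): run bandwidth and latency
# tests against SERVER. Assumes "qperf" and "iperf -s" are already
# listening on SERVER.
bench_client() {
    server="$1"
    # TCP bandwidth and latency over IPoIB.
    run qperf "$server" tcp_bw tcp_lat
    # RDMA reliable-connection bandwidth and latency.
    run qperf "$server" rc_bw rc_lat
    # TCP throughput as seen by iperf.
    run iperf -c "$server"
}
```

Comparing the tcp_* and rc_* figures gives a quick impression of how much the TCP/IP layer costs relative to native InfiniBand transport.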
More advanced tests that run on top of MPI are also available:
- OSU tests (a benchmark suite from Ohio State University, provided by MVAPICH)
- IMB (Intel MPI Benchmark)
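A two-process run of one of these MPI benchmarks can be sketched as follows. The sketch assumes that Open MPI's mpirun is the selected launcher (MVAPICH uses a different launcher syntax). The helper names run and mpi_pair, the host names, and the paths to the benchmark binaries are all hypothetical.

```shell
# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# mpi_pair HOSTS PROGRAM [ARGS...] (hypothetical helper): start PROGRAM
# with two MPI processes on the comma-separated HOSTS.
# Assumes Open MPI's mpirun is first in PATH.
mpi_pair() {
    hosts="$1"
    shift
    run mpirun -np 2 -host "$hosts" "$@"
}
```

For example, mpi_pair node1,node2 ./osu_latency would start the OSU latency test, and mpi_pair node1,node2 ./IMB-MPI1 PingPong the IMB ping-pong test, provided the binaries exist at those paths on both nodes.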
Linpack
For HPC systems, the reference benchmark is Linpack. You can download the benchmark from the links below:
Linpack Webpage (including patch for QS22)
Readme to QS22-optimized linpack
For the QS22 you will also need the following (also described in the readme):
- ATLAS 3.8.1 (warning: long build, takes approx. 12 hours)
- OpenMPI 1.3.3 (custom version for infiniband)