Home/

Machine details: ziger

Machine: ziger

CPU: AMD Istanbul processor
Number of cores: 24
NUMA configuration: 4x6

Topology Information

TODO: likwid input here

Pairwise Data

The topology information gives us a rough understanding of the expected performance. We complement this with real measurements conducted on the hardware. For this purpose, we use ''pairwise'' a micro benchmark that ping-pongs messages between any combination of cores.

The benchmark measures the send, receive and roundtrip times, i.e. the time it takes until smlt_qp_send() or smlt_qp_recv() return.

Send latencies on AMD Istanbul processor
Send
RTT latencies on AMD Istanbul processor
RTT

Message Passing micro benchmark

A comparison of this benchmark can be found on this page.

We now show the results of our micro benchmarks. For reference, see bench/ab-bench in the Smelt directory.

Multicast benchmark

A comparison of this benchmark can be found on this page.

We now show the results of our micro benchmarks for multicasts. For reference, see bench/ab-bench-scale in the Smelt directory.

Showing plot ab.

Showing plot reduction.

Showing plot barriers.

Showing plot agreement.

Collective operations

A comparison of this benchmark can be found on this page.

The following is a benchmark for collective operations in MPI, OpenMP and Smelt.

EPCC benchmark

A comparison of this benchmark can be found on this page.

Execution of the EPCC benchmark with gcc's unmodified OpenMP compared to an instance using Smelt's barrier.

Showing plot csv.

PARSEC Streamcluster

A comparison of this benchmark can be found on this page.

PARSEC Streamcluster solves the online clustering problem. We execute it with various barrier implementations and report the runtime.