Machine details: nos

Machine: nos

CPU: AMD Opteron Dual Core Rev F 90nm processor
Number of cores: 4
NUMA configuration: 2x2

Topology Information

TODO: likwid input here

Pairwise Data

The topology information gives us a rough understanding of the expected performance. We complement this with real measurements conducted on the hardware. For this purpose, we use ''pairwise'', a micro benchmark that ping-pongs messages between every pair of cores.

The benchmark measures the send, receive, and round-trip times. The send and receive times are the time it takes until smlt_qp_send() or smlt_qp_recv(), respectively, returns.
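
The ping-pong measurement described above can be illustrated with a minimal sketch. It does not use Smelt: since the signatures of smlt_qp_send() and smlt_qp_recv() are not given here, a one-slot channel built on C11 atomics stands in for a queue pair. All names in the block (channel, pingpong, measure_rtt_ns, pairwise_rtt) are hypothetical, and the pinning call assumes Linux with glibc.

```c
#define _GNU_SOURCE          /* for pthread_setaffinity_np and CPU_SET */
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical one-slot channel; a stand-in for one direction of a queue pair. */
struct channel {
    _Atomic uint64_t seq;    /* sequence number of the latest message */
};

struct pingpong {
    struct channel ping;     /* initiator -> responder */
    struct channel pong;     /* responder -> initiator */
    int responder_core;
    uint64_t rounds;         /* timed rounds; one untimed warm-up round is added */
};

/* Best-effort pinning: the result is ignored, so an unavailable core
 * simply leaves the thread unpinned. */
static void pin_to_core(int core)
{
    cpu_set_t set;
    if (core < 0 || core >= CPU_SETSIZE)
        return;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    (void)pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void post(struct channel *c, uint64_t seq)
{
    atomic_store_explicit(&c->seq, seq, memory_order_release);
}

static void wait_for(struct channel *c, uint64_t seq)
{
    unsigned spins = 0;
    while (atomic_load_explicit(&c->seq, memory_order_acquire) != seq) {
        /* Mostly busy-wait; yield occasionally so the loop stays live
         * if both threads end up sharing a CPU. */
        if ((++spins & 0xfffu) == 0)
            sched_yield();
    }
}

/* Responder: echo every ping back as a pong. */
static void *responder(void *arg)
{
    struct pingpong *pp = arg;
    pin_to_core(pp->responder_core);
    for (uint64_t i = 1; i <= pp->rounds + 1; i++) {
        wait_for(&pp->ping, i);
        post(&pp->pong, i);
    }
    return NULL;
}

/* Average round-trip time in nanoseconds between core_a and core_b,
 * or -1.0 on error. Re-pins the calling thread to core_a. */
double measure_rtt_ns(int core_a, int core_b, uint64_t rounds)
{
    struct pingpong pp = { .responder_core = core_b, .rounds = rounds };
    pthread_t thread;

    if (rounds == 0)
        return -1.0;
    atomic_init(&pp.ping.seq, 0);
    atomic_init(&pp.pong.seq, 0);
    if (pthread_create(&thread, NULL, responder, &pp) != 0)
        return -1.0;
    pin_to_core(core_a);

    /* Warm-up round: keeps thread start-up out of the timed region. */
    post(&pp.ping, 1);
    wait_for(&pp.pong, 1);

    uint64_t start = now_ns();
    for (uint64_t i = 2; i <= rounds + 1; i++) {
        post(&pp.ping, i);
        wait_for(&pp.pong, i);
    }
    uint64_t end = now_ns();
    assert(end >= start);

    pthread_join(thread, NULL);
    return (double)(end - start) / (double)rounds;
}

/* Fill an ncores x ncores row-major matrix with average RTTs. */
void pairwise_rtt(int ncores, uint64_t rounds, double *rtt)
{
    for (int a = 0; a < ncores; a++)
        for (int b = 0; b < ncores; b++)
            rtt[a * ncores + b] = (a == b) ? 0.0 : measure_rtt_ns(a, b, rounds);
}
```

For a four-core machine such as this one, calling pairwise_rtt(4, 10000, matrix) with a 16-element array would fill a 4x4 matrix of average round-trip times. The sketch reports only round-trip time; per-call send and receive times would need timestamps taken around the individual send and receive operations.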

Figure: Send latencies on AMD Opteron Dual Core Rev F 90nm processor

Figure: RTT latencies on AMD Opteron Dual Core Rev F 90nm processor

Message Passing micro benchmark

A comparison of this benchmark can be found on this page.

We now show the results of our micro benchmarks. For reference, see bench/ab-bench in the Smelt directory.

Multicast benchmark

A comparison of this benchmark can be found on this page.

We now show the results of our micro benchmarks for multicasts. For reference, see bench/ab-bench-scale in the Smelt directory.

Plots: ab, reduction, barriers, agreement.

EPCC benchmark

A comparison of this benchmark can be found on this page.

We execute the EPCC benchmark with gcc's unmodified OpenMP and compare it to an instance that uses Smelt's barrier.

Plot: csv.

PARSEC Streamcluster

A comparison of this benchmark can be found on this page.

PARSEC Streamcluster solves the online clustering problem. We execute it with various barrier implementations and report the runtime.