> beckman@xxxxxxxxxxx wrote:
> --------
> While running some benchmarks, we found out that MPI_Allreduce has a
> serious performance bug in coprocessor mode on our machine (driver
> 202). Allreduce for virtual node mode is 100x faster (for twice the
> number of processors). Allreduce for co-processor mode is slow.

pete,

i think this may be an explanation for some strange POP behaviour
we've seen recently. CO mode runs from March scaled OK, and recent
VN mode runs were almost as good; however, recent CO mode runs do
not scale at all. There are plenty of Allreduce calls in the code.

ray
Attachment:
test.png