I brought up the Power3 vs. BG/L issue with IBM to see what their
experience is. The answer is below:
--- from Bob Walkup of IBM
If you are getting less performance than IBM Power3 @ 375 MHz, that is not
good. I do expect, and usually measure, performance that is somewhat
better on BG/L than on IBM Power3, on a per-CPU basis. What Power3 has,
but BG/L does not, is a big L2 cache (8 MB on Power3) and a bigger L1 cache (2x
larger). For the L2 to matter, the code would have to miss L1 a lot but
hit L2 on Power3. On BG/L an L1 miss carries a higher penalty. That
is one of the few ways that I know of for Power3 to come out ahead. This
may be a quick question, but it is an important one, and one that we want to
follow up on. Can you point me to one or two key codes where BG/L is not
up to expectations?
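
As a rough illustration of the cache argument above (this is not part of
IBM's reply), here is a minimal sketch of the usual "average memory access
time" estimate. The function name, the hit rates, and the cycle counts are
all made-up placeholders chosen for the example, not measured Power3 or BG/L
figures. It only shows the shape of the effect: a machine with a smaller L1
and a costlier L1 miss falls behind when the code misses L1 a lot, and the
gap nearly disappears when the code mostly hits L1.

```python
def avg_access_cycles(l1_hit_rate, l1_hit_cycles, l1_miss_penalty):
    """Average cycles per memory access for a simple one-level model.

    Every access pays the L1 hit cost; the fraction that misses L1
    additionally pays the miss penalty (the cost of going to the next level).
    """
    return l1_hit_cycles + (1.0 - l1_hit_rate) * l1_miss_penalty


# Hypothetical machine A: larger L1, misses are served by a big L2 (cheap miss).
# Hypothetical machine B: smaller L1, an L1 miss is more expensive.
# All numbers below are assumptions for illustration only.

# Case 1: working set misses L1 often.
a_miss_heavy = avg_access_cycles(l1_hit_rate=0.90, l1_hit_cycles=1, l1_miss_penalty=10)
b_miss_heavy = avg_access_cycles(l1_hit_rate=0.80, l1_hit_cycles=1, l1_miss_penalty=40)

# Case 2: working set fits in L1 on both machines.
a_l1_friendly = avg_access_cycles(l1_hit_rate=0.99, l1_hit_cycles=1, l1_miss_penalty=10)
b_l1_friendly = avg_access_cycles(l1_hit_rate=0.99, l1_hit_cycles=1, l1_miss_penalty=40)

print("miss-heavy:  A = %.2f cycles, B = %.2f cycles" % (a_miss_heavy, b_miss_heavy))
print("L1-friendly: A = %.2f cycles, B = %.2f cycles" % (a_l1_friendly, b_l1_friendly))
```

With these placeholder numbers, machine B is several times slower per access
in the miss-heavy case but only slightly slower in the L1-friendly case, which
is the pattern to look for in a code where BG/L underperforms.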