Re: [bgl-discuss] Comparison between BGL nodes and standard PPCs
The kernel of my code, where it spends more than 95% of its time on a
typical machine, uses no math intrinsics, not even square roots. I have
tried using the second FPU by specifying -qarch=440d -O3 -qhot=simd. I also
tried -qarch=440d -O4. Neither gave any improvement over -qarch=440 -O3,
and both perhaps made things worse. So I agree with Steve Pieper's assessment.
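
For reference, the three builds compared above could be laid out roughly as
in the dry-run sketch below. It only prints the compile commands; it does not
invoke the compiler. The flag sets are the ones quoted in this message. The
source file name kernel.f, the output names, and the build_cmd helper are
hypothetical placeholders, and the compiler driver is assumed to be plain
xlf (the actual driver name on a given BGL installation may differ).

```shell
#!/bin/sh
# Dry-run sketch: print the compile command for each flag set being compared.
# FC and SRC are assumptions for illustration, not taken from the thread.
FC="xlf"
SRC="kernel.f"

# build_cmd VARIANT -> echoes the command line for that variant
build_cmd() {
  case "$1" in
    baseline) flags="-qarch=440 -O3" ;;              # single FPU
    simd)     flags="-qarch=440d -O3 -qhot=simd" ;;  # second FPU + SIMD
    o4)       flags="-qarch=440d -O4" ;;             # second FPU, -O4
    *) echo "unknown variant: $1" >&2; return 1 ;;
  esac
  echo "$FC $flags -o kernel_$1 $SRC"
}

# List all three commands so each binary can be built and timed in turn.
for v in baseline simd o4; do
  build_cmd "$v"
done
```

Building each variant to a separate binary makes it straightforward to time
the same kernel under each flag set and compare against the baseline.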
Don Sinclair
On Thu, 10 Mar 2005, Andrew Siegel wrote:
>
> even though you're not using the second fpu, it's still a little worse
> than is typical. This is likely because of how the math intrinsics are
> implemented -- do you have a significant # of sqrts, logs, etc?
>
> On Thu, 10 Mar 2005, Donald Sinclair wrote:
>
> > What are the major differences other than the presence of communication
> > hardware between the BGL processors and other Power PC chips, Power 3, for
> > example? I ask, since despite the fact that both this chip and the
> > Power 3 have 2 floating point units and the rest of the unit has the
> > Power PC architecture, and I use the same compiler (xlf), the performance
> > I get from the 700 MHz BGL processor is less than half what I can get from
> > a 375 MHz Power 3, even when I use the -qarch=440d.
> > Don Sinclair
> >
> > - --------------------------------------------------------------------
> > To add or remove yourself from this mailing list, use the 'notifyme'
> > command on any BGL machine. To remove: notifyme -n, to add: notifyme -y.
> >
> >
>
>