
Re: [bgl-discuss] Fwd: [bgl-support #302] May I ask you two questions?



One thing that might cause some confusion is that IBM always refers to it as the "torus network", even when it is configured as a mesh. Even the debug messages will still say "on an (X,Y,Z,T) torus" when running on a small block.

-Andrew

On Feb 7, 2007, at 4:31 PM, Andrew Cherry wrote:

Nope, only 512 nodes (a midplane) or more; IBM is pretty clear in their documentation on this one. I don't think it's a matter of being 2^k in size; I think it's more a matter of not having the physical connections to complete the torus in sub-midplane blocks.

-Andrew

On Feb 7, 2007, at 4:08 PM, Paul Fischer wrote:


Andrew,

I got the impression (perhaps I misunderstood) from Ray Bair
that any of the viable subsets (i.e., 2^k-node partitions)
was reconfigured to a torus?

Paul


On Wed, 7 Feb 2007, Andrew Cherry wrote:

To answer (2), the minimum partition that can give you a torus is 512 nodes.


-Andrew

On Feb 7, 2007, at 3:57 PM, Yongzhi Chen wrote:

Hi Kamil,

Thank you for your response.

(1) So in your opinion, a 2% fluctuation is normal. I just want to
make sure of it.

(2) Can you please tell me the minimum partition that can guarantee
me a torus machine: 256 or above? Thanks a lot.

Best,
-Yongzhi

-----Original Message-----
From: Kamil Iskra [mailto:iskra@xxxxxxxxxxx]
Sent: Wednesday, February 07, 2007 12:18 PM
To: Yongzhi Chen
Cc: discuss@xxxxxxxxxxxxxxx; BG/L Support
Subject: Re: [bgl-discuss] Fwd: [bgl-support #302] May I ask you two questions?


So you get a distribution of some... 2%? That doesn't sound too bad.
Then again, I'm doing file I/O performance testing, so I'm used to
much larger distributions.


Your way of measuring time looks OK, BTW.


Perhaps the links utilized by process pairs in your test happened to be non-overlapping? How did you run this experiment? Which pairs did you test (surely not all 120 at the same time)? Did you use a custom BGLMPI_MAPPING?

Background info:

Each node on BG/L has 6 torus connections to its nearest neighbors,
each with a bandwidth of around 150 MB/s.  With small partitions
like the one you used, it's actually not a torus but a mesh, and
nodes use between 3 and 5 links.  There are a number of ways to find
the location of any particular process within the torus topology:

MPI_Get_processor_name() function puts it in the returned string,
so you can just print it out,

rts_coordinatesForRank(getpid(), &x, &y, &z, &t) (include <rts.h>;
it's in /bgl/BlueLight/ppcfloor/bglsys/include),

rts_get_personality(), followed by BGLPersonality_xCoord(),
BGLPersonality_yCoord(), etc.
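
Once you have a node's (x, y, z) coordinates, the mesh-versus-torus difference described above can be sketched with a small counting routine. This is only an illustration, not IBM code: the function names links_in_dim() and count_links() are made up for this example, and the 4x4x2 shape used for a small block is an assumption. It counts how many of the 6 nearest-neighbor links a node would use, depending on whether the wraparound connections exist.

```c
#include <assert.h>

/* Hypothetical helper (not part of rts.h): links used along one dimension
 * by the node at 0-based position `pos` in a dimension of extent `size`.
 * `wrap` is nonzero when the dimension is closed into a torus. */
int links_in_dim(int pos, int size, int wrap)
{
    assert(size >= 1 && pos >= 0 && pos < size);
    if (size == 1)
        return 0;   /* no neighbor at all in this dimension */
    if (wrap)
        return 2;   /* torus: both directions always have a neighbor
                       (simplification: counted as 2 even for size 2) */
    /* mesh: nodes on either end of the dimension lose one link */
    return (pos == 0 || pos == size - 1) ? 1 : 2;
}

/* Hypothetical helper: total links used by node (x, y, z) in an
 * nx * ny * nz block, as a mesh (wrap == 0) or a torus (wrap != 0). */
int count_links(int x, int y, int z, int nx, int ny, int nz, int wrap)
{
    return links_in_dim(x, nx, wrap)
         + links_in_dim(y, ny, wrap)
         + links_in_dim(z, nz, wrap);
}
```

With the assumed 4x4x2 mesh, count_links(0, 0, 0, 4, 4, 2, 0) gives 3 for a corner node and count_links(1, 1, 0, 4, 4, 2, 0) gives 5, which matches the "between 3 and 5" range; with wrap set on an 8x8x8 block, every node gets 6.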

Kamil

--
Kamil Iskra, PhD
Argonne National Laboratory, Mathematics and Computer Science Division
9700 South Cass Avenue, Building 221, Argonne, IL 60439, USA
phone: +1-630-252-7197 fax: +1-630-252-5986




--------------------------------------------------------------------
To add or remove yourself from this mailing list, use the 'notifyme'
command on any BGL machine. To remove: notifyme -n, to add: notifyme -y.



