
[bgl-discuss] A bunch of specific questions about BGL



In a previous email, Rajeev mentioned that setting the environment
variable BGLMPI_ALLGATHER=MPICH can give better performance.
However, I do not know how to confirm that I am setting that variable
correctly.  I tried 'setenv BGLMPI_ALLGATHER MPICH', but I was told that
shell settings are not transferred with cqsub.  So I tried
'cqsub ... BGLMPI_ALLGATHER=MPICH' instead.  Neither approach makes any
difference in performance.  What is the correct way to set that
environment variable?
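
To make the question concrete, here is a minimal sketch of how the
program itself could report whether the variable reached it.  It assumes
that getenv() behaves on the compute nodes as it does in standard C,
which I have not confirmed on BGL; the helper names env_or_default and
report_allgather_setting are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Return the value of an environment variable as seen by this process,
 * or the fallback string if the variable is not set. */
static const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);
    return (value != NULL) ? value : fallback;
}

/* Print what this process sees for BGLMPI_ALLGATHER.  Intended to be
 * called once early in the program, for example on rank 0 only.
 * Assumption: getenv() works on the compute nodes. */
static void report_allgather_setting(void)
{
    printf("BGLMPI_ALLGATHER=%s\n",
           env_or_default("BGLMPI_ALLGATHER", "(not set)"));
}
```

If a call to report_allgather_setting() prints "(not set)" for a job
submitted with the variable, the setting is not being passed through to
the job.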

I am doing work on collective communication on BGL, so getting this to
work is important.
So that leads me to my next question.  When I call any collective
operation such as MPI_Bcast or MPI_Allgather in my code, what is the
default implementation that is being called?
I compile just with mpicc with no fancy options, and run with cqsub.

I know there are three different collective implementations on BGL: the
tried and true MPICH collectives, a set that uses the global tree
network, and an optimized MPI_Bcast that uses an underlying hardware
mechanism.  The collectives that use the global tree network can only be
used when running on the entire rack, though.  And the optimized
MPI_Bcast can only be used when a user uses MPI_COMM_WORLD in a certain
partition, but not the whole rack of BGL.

So how do I call those different collective operations?  I am guessing
that the environment variable sets the collectives to the MPICH
implementation, but my hunch is that the default is MPICH.

Also, does anyone know how to use the BGL personality?
  BGLPersonality p;
  rts_get_personality(&p, sizeof(p));
  BGLPersonality_getLocationString(&p, personality_buffer);
  Universal_rank = rts_get_processor_id();
  Universal_x_coord = p.xCoord;
  Universal_y_coord = p.yCoord;
  Universal_z_coord = p.zCoord;
  Universal_x_size = p.xSize;
  Universal_y_size = p.ySize;
  Universal_z_size = p.zSize;

Someone at IBM gave me this code, but I do not know what I need to
include to access it and get the code to compile and work.  From the
BGLPersonality I am supposed to be able to find the physical partition
that a job is running on, irrespective of MPI.

The only other way I know to find out what partition a job is given is by
looking at MPI_Get_processor_name, which returns a string with the
partitioning encoded in it, but that is a dirty way of getting that
information.

Also, how are partitions formed?  When a partition is formed, I know the
nodes form a contiguous cube.  So if there are several jobs running on
the machine, are the different partitions physically touching each
other?

Also, how are packets routed?  Are they always routed intra-partition?  If
I am using a partition that is not a midplane or the entire rack, I cannot
take advantage of the torus network.  If I do try to send a packet around
the torus logically, how is that packet routed?  Does it go around the
partition, thus leaving it and maybe going through another partition with
a job running, or does it just get routed through my own partition?
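
To illustrate what is at stake, here is a small sketch of the hop count
along one dimension with and without wraparound links.  It says nothing
about how BGL actually routes packets; it only shows the difference
between a mesh and a torus for the same pair of coordinates.  The
function names are made up for illustration.

```c
#include <stdlib.h>

/* Hops between coordinates a and b along one dimension of a mesh
 * (no wraparound links). */
static int mesh_hops(int a, int b)
{
    return abs(a - b);
}

/* Hops along one dimension of a torus with `size` nodes: the shorter of
 * the direct path and the path through the wraparound link.
 * Assumes 0 <= a, b < size. */
static int torus_hops(int a, int b, int size)
{
    int direct = abs(a - b);
    int wrapped = size - direct;
    return (direct < wrapped) ? direct : wrapped;
}
```

For example, between coordinates 0 and 7 in a dimension of size 8,
mesh_hops gives 7 while torus_hops gives 1, so whether the wraparound
link is usable inside a partition matters a lot for neighbors that are
only logically adjacent.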

OK, I think those are the questions I have for now.
Thanks for any help you can give me, especially if you have read the email
this far...
