Re: [bgl-discuss] vectorization report



Hi Andrew,

No, the compilers are separate from the new driver.  You should ask IBM
about getting them.  What you want is the PTF1 compilers with new BGL
addons.

The actual upgrade wasn't too bad. We ran into a few problems, some of
which you might see:

1. The DB2 modification script gave errors.  Some were due to blank lines
   in the script; others came from modifications to a table that didn't
   exist for us.  According to IBM, that table can be ignored.

2. We were missing some rpms needed for the upgrade because the service
   node did not have a full install.  If you have a full install, I doubt
   you will have problems.  I keep a copy of the SLES8 iso images online,
   and had no problems installing the necessary rpms.

3. One of the upgrade rpms that I had pulled down off the IBM website was
   bad.  My guess is that it was bad because there was a failure during
   the download and it was missing a few bytes.  When I attempted to pull
   it down again, the web site gave me java errors.  IBM support also had
   problems when they tried.  They eventually ftp'd the files to our ftp
   server.  Verify your rpms before you attempt to install them.
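One way to do that check is sketched below.  It assumes bash and that the
rpm command is on your PATH; the function name and the example directory
are hypothetical.  rpm -K (--checksig) checks each package's digests and
signature, so it may also complain if the signing key has not been
imported, not only when a download was truncated.

```shell
# Sketch: check every rpm in a directory before installing any of them.
# verify_rpms is a hypothetical helper name, not part of the upgrade kit.
verify_rpms() {
    local dir="$1" bad=0 f
    for f in "$dir"/*.rpm; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if ! rpm -K "$f" >/dev/null 2>&1; then
            echo "FAILED: $f" >&2        # report each package that fails
            bad=1
        fi
    done
    return "$bad"                        # 0 only if every package passed
}
# Example (hypothetical path):
#   verify_rpms /tmp/bgl-upgrade && echo "all packages verified"
```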

The remaining problems were all with mpirun.  The new split mpirun is the
root of all evil.

Do you have the email from Robin Goldstone (LLNL) about the split mpirun?
If not, let me know and I will forward it to you.  Without it, I would
never have gotten as far as I did.  IBM says they have a document on
setting up the new mpirun but they said it is confidential and refused to
give it to me.  Without that document, Robin's mail, or help from someone,
you have very little chance of getting the new mpirun working at all.

You will need to set up env vars, paths, etc.  Robin's mail describes
what needs to be set or changed (entries in LD_LIBRARY_PATH and so on),
along with lots of other useful information.  In addition to Robin's
mail, you should know a few things:

The new mpirun uses an rsh from the frontend to the service node to start
the actual job.  Users will need to be able to rsh from the frontends to
the service node with no password.  We don't currently have rsh on our
system, so I've substituted ssh.  That may or may not cause some of our
remaining problems.  Robin has experienced some of them, so my guess is
no, but keep it in mind as you read through my problem list.  I will
probably talk to our security people about getting rsh set up so I can
verify that the problems still exist.  But I have to admit I'm very
unhappy about the behind-the-scenes rsh, as using it is basically
asking for problems.

The new mpirun is very, very sensitive to what is in your dot files (e.g.
.bashrc, .profile, .cshrc).  If you have problems, the first thing to do
is move any you have out of the way, both on the frontend and on the
backend, including the /etc shell setup files (e.g. /etc/profile.d/*,
/etc/csh.cshrc), and put in a very simple one that does only what you
need.  Do not put in any lines that produce output (e.g. no echo
statements to help you debug).
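A quick way to test a setup file for stray output is sketched below.  It
only covers bash (csh files would need a similar check with csh), and the
helper name is hypothetical.  It sources the file in a non-interactive
bash and reports whether anything was printed on stdout or stderr.

```shell
# Sketch: succeed only if sourcing the given file in a non-interactive
# bash prints nothing.  dotfile_is_quiet is a hypothetical helper name.
# Pass a path containing a slash so that "." does not search PATH.
dotfile_is_quiet() {
    local out
    out=$(bash --noprofile --norc -c '. "$1"' _ "$1" 2>&1 </dev/null)
    [ -z "$out" ]
}
# Example:
#   dotfile_is_quiet "$HOME/.bashrc" || echo ".bashrc prints something"
```

If you want to keep a message for interactive logins, a common bash idiom
is to guard it with a test on $- (for example: case $- in *i*) echo hi ;;
esac) so that non-interactive shells skip it.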

Possibly because of our use of ssh, users cannot forward their X11 when
they log onto the frontends.  If they do, they get a connection denied
error, even though they can ssh to the service node and run a command
(e.g. 'ssh sn uptime' works fine).

Because of the security implications of the new mpirun, we are using a
restricted shell similar to the one from Robin.  You will want to do that.
However, get things working without the restricted shell first as there
are lots of weird issues that need to be worked out before you add that
complication.  Once you have a restricted shell in place, the first thing
to check for a user having problems is that they can 'ssh sn <command>'
where <command> is a command available from the restricted area.  I just
copied uptime.  Keep in mind that the system /etc shell setup files are
not always sourced for non-interactive logins (such as the rsh sn
<command>).  Some shells source some of them and some source none.  If
you aren't familiar with which shells do what, the man page for each
shell usually has good information.

After cleaning up as much as we can, we have the remaining messages
from mpirun:

/bin/bash: SHELL: readonly variable
/bin/bash: PATH: readonly variable

You might get other bash errors similar to these. You should be able to
get rid of all of them except these two.  These can be ignored.

FE_MPI (Info) : Scheduler interface library not loaded

The library mpirun is looking for is libsched_if.so (strace is your
friend) which I do not have on my system.  According to IBM, this is just
an Info and can be safely ignored.  I have the name of the package that
contains it and plan on asking IBM for a copy so that I can verify that
none of our remaining problems are caused by the missing library.  I
believe this library is part of the IBM resource manager (extended
LoadLeveler).  If you have purchased that, you probably won't see this
message.

We have problems with orphaned mpirun_be's on the service node.  I'm not
sure what is causing the problem.  I believe that Robin is seeing that as
well.

The error messages users get when their env isn't set up right are not
very useful.  Many of them are lost connection errors.  The following one
shows up under lots of circumstances.  I have not yet tracked all the
causes down, but usually it means they have something funky in their env.

<Apr 15 17:50:47> BE_MPI (ERROR):  "lost contact with control node 0. Connection reset by peer"

We still get the following error occasionally.  We also got it
consistently when I hadn't cleaned up the various shell setup files
completely.  Now, it is intermittent.  The user's next job usually runs
fine.

<Apr 14 22:12:35> FE_MPI (ERROR): blk_receive_incoming_message() - !

If you use partition sizes different from full midplanes or full racks,
this next problem will bite you.

The new mpirun will not allow you to specify both the partition name and
the partition mode (i.e. coprocessor or virtual).  It also sometimes
changes the mode of a partition to coprocessor mode if -mode vn is not
specified and the partition was set to be in virtual mode.  This doesn't
appear to happen all the time, but I haven't spent enough time to state
that with full confidence. The combination of these two things hurts us a
lot, mostly because we work with 32, 64, 512 and 1024 node partitions.
We can't allow the software to generate partitions on the fly as that
prevents us from using anything smaller than 512.  The way we accomplished
what we needed with the previous mpirun was to set up all our partitions
with 2 versions for every one - one in coprocessor mode and one in
virtual mode.  Our
scheduler selected the partition for a job based on the mode the user
wanted and the number of nodes and then did a mpirun -partition
<partitionname>.  The scheduler did not have to specify -mode to the
mpirun because the mode was set for the partition.  As you can see, the
new mpirun's refusal to accept mpirun -partition <pn> -mode VN, combined
with its changing the mode of the partition on the fly, breaks our use of
32 and 64 node partitions.
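To make the old arrangement concrete, here is a rough sketch of the kind
of lookup the scheduler did.  The partition names and the co/vn labels are
made up for illustration; our real names and scheduler code differ.

```shell
# Sketch: map (mode, node count) to a pre-built partition name.
# The P<nodes>_<mode> naming scheme is hypothetical.
pick_partition() {
    local mode="$1" nodes="$2"
    case "$mode" in co|vn) ;; *) return 1 ;; esac              # known modes only
    case "$nodes" in 32|64|512|1024) ;; *) return 1 ;; esac    # sizes we define
    echo "P${nodes}_${mode}"
}
# With the previous mpirun the scheduler then ran the job with:
#   mpirun -partition "$(pick_partition vn 32)"
# and no -mode, because the mode was already set on the partition.
```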

Ok, well, that's probably enough for now.  If you want more info or need
help, feel free to contact me.

Susan.

On Sat, 16 Apr 2005, Andrew Elwell wrote:

> > Looks like the new compiler (login node 4) gives a vectorization report:
>
> Is this a new compiler that came with driver 100?
> Just wondering as we're going to that level on Tuesday
>
> A
>
> --
> Andrew Elwell, System Administrator EPCC
> Tel 0131 445 7833 (ACF Building)
> Tel 0131 650 5023 (Rm 3309, JCMB)
>
> --------------------------------------------------------------------
> To add or remove yourself from this mailing list, use the 'notifyme'
> command on any BGL machine. To remove: notifyme -n, to add: notifyme -y.