
Re: Fw: [bgl-discuss] MPI failure, simplified



Thanks for the input, Bob.  I'm putting in a PMR right now....

On Thu, 2 Feb 2006, Bob Walkup wrote:

> ----- Forwarded by Bob Walkup/Watson/IBM on 02/02/2006 03:57 PM -----
>
> I would suggest opening a PMR with IBM.  Although it might be possible to
> code around this, it would be preferable to get a fix in the library.
> There is a good test case ready to go - and working through IBM support is
> probably the best approach.  I think this is an important issue - it is
> just too easy to run out of memory with the current MPI implementation.
>
> Regards,
> Bob Walkup (walkup@us.ibm.com, 914-945-1512)
> --------------------------------------------------------------
>
> "Andrew Siegel" <siegela@mcs.anl.gov>
> Sent by: owner-discuss@bgl.mcs.anl.gov
> 02/02/2006 12:24 PM
>
>         To:     <discuss@bgl.mcs.anl.gov>
>         cc:
>         Subject:        RE: [bgl-discuss] MPI failure, simplified
>
>
> To be clear, this isn't an EAGER vs. RENDEZVOUS issue, since we are way
> past the EAGER limit (as the error message also indicates). My
> understanding of the MPI standard is that the implementation is free to
> buffer or not, but that whatever it does, this is guaranteed to work
> (at least for "small" numbers of outstanding messages). Anyhow, would
> it be wise to contact the IBM MPI people (George Almasi?) and ask some
> specific questions? Not being able to do such operations reliably totally
> kills several of the important algorithms that we're relying on.
> -andrew
>
> -----Original Message-----
> From: owner-discuss@bgl.mcs.anl.gov [mailto:owner-discuss@bgl.mcs.anl.gov]
> On Behalf Of William Gropp
> Sent: Thursday, February 02, 2006 8:12 AM
> To: Stephen Siegel
> Cc: discuss@bgl.mcs.anl.gov; support@bgl.mcs.anl.gov
> Subject: Re: [bgl-discuss] MPI failure, simplified
>
> At 10:31 PM 2/1/2006, Stephen Siegel wrote:
> >I posted an earlier message about an MPI failure I was getting on BGL
> >when passing some large messages.  I can now produce a similar failure
> >with a very simple program.  The code is below, followed by the
> >(excerpted) output to stderr when run on 2 procs (co-proc mode).
> >
> >Each proc allocates a 400 MB buffer.  Proc 0 posts a send to proc 1 of
> >the first 80 MB, waits for that send to complete, then posts a receive
> >into the next 160 MB and waits for that request to complete.  Proc 1
> >posts a recv from proc 0 for the first 80 MB, then posts a send of the
> >next 160 MB, then waits for both requests to complete.  It seems to me
> >that this is a correct "safe" MPI program, going by the MPI Standard.
> >
> >The error message, "...cannot allocate unexpected buffer from...
> >unexpected requests 1, Total Mem: 160 MB ..." suggests that the MPI
> >implementation is trying to allocate 160 MB and it can't.  It seems to
> >me that it shouldn't have to allocate this memory--it should just
> >deliver the message directly into the receive buffer.  (That is the
> >point of the rendezvous protocol.)
> >
> >Question: is this a bug in the MPI implementation on BGL, or am I
> >missing something?
>
> Bug might be too strong a statement, but I agree with your
> interpretation - the MPI implementation should not be allocating space
> for Irecvs and this program should work.  There's always some tension
> over where to set the eager vs. rendezvous threshold for both
> performance and space reasons, and the MPI standard doesn't specify when
> eager, rendezvous, or something else should be used.  But this program
> should work.
>
> Bill
>
>
> >Thanks,
> >
> >   Steve
> >
> >
> >---------------------------------------------------------------------
> >#include <assert.h>
> >#include <stdio.h>
> >#include <stdlib.h>
> >#include "mpi.h"
> >
> >int main (int argc, char *argv[]) {
> >   int myRank, numProcs;
> >   unsigned char* ptr;
> >   MPI_Request req0;
> >   MPI_Request req1;
> >
> >   MPI_Init(&argc, &argv);
> >   MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
> >   MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
> >   if (numProcs != 2) {
> >     fprintf(stderr, "Usage: mpiexec -np 2 ./exp2c\n");
> >     fflush(stderr);
> >     MPI_Finalize();
> >     return 1;
> >   }
> >   /* Each proc allocates one 400 MB buffer used for both directions. */
> >   ptr = (unsigned char*)malloc(400000000);
> >   assert(ptr);
> >   if (myRank == 0) {
> >     /* Send the first 80 MB and wait for that send to complete ... */
> >     MPI_Isend(ptr,80000000,MPI_BYTE,1,0,MPI_COMM_WORLD,&req0);
> >     MPI_Wait(&req0,MPI_STATUS_IGNORE);
> >     /* ... then receive 160 MB into the next part of the buffer. */
> >     MPI_Irecv(ptr+80000000,160000000,MPI_BYTE,1,0,MPI_COMM_WORLD,&req1);
> >     MPI_Wait(&req1,MPI_STATUS_IGNORE);
> >   } else {
> >     /* Post the 80 MB receive and the 160 MB send, then wait for both. */
> >     MPI_Irecv(ptr,80000000,MPI_BYTE,0,0,MPI_COMM_WORLD,&req0);
> >     MPI_Isend(ptr+80000000,160000000,MPI_BYTE,0,0,MPI_COMM_WORLD,&req1);
> >     MPI_Wait(&req0,MPI_STATUS_IGNORE);
> >     MPI_Wait(&req1,MPI_STATUS_IGNORE);
> >   }
> >   free(ptr);
> >   printf("Proc %d has completed successfully\n", myRank);
> >   fflush(stdout);
> >   MPI_Finalize();
> >   return 0;
> >}
> >
> >---------------------------------------------------------------------
> >
> >.
> >.
> >.
> ><Feb 01 22:11:58.663360> BE_MPI (Info) : IO - Threads initialized
> >Rzv:cannot allocate unexpected buffer from R:1 T:0 C:0
> >Dumping 9 frames
> >         Frame 0:  0x2078f0
> >         Frame 1:  0x209da8
> >         Frame 2:  0x23e25c
> >         Frame 3:  0x237c04
> >         Frame 4:  0x23a1b4
> >         Frame 5:  0x207b0c
> >         Frame 6:  0x2052b0
> >         Frame 7:  0x200614
> >         Frame 8:  0x20016c
> >Posted Queue:
> >-------------
> >Posted Requests 0, Total Mem: 0 bytes
> >Unexpected Queue:
> >-----------------
> >Unexpected Requests 1, Total Mem: 160000000 bytes
> >Fatal:  Cannot allocate buffer for unexpected message
> ><Feb 01 22:12:03.767341> BE_MPI (Info) : IO - Output thread terminated
> ><Feb 01 22:12:03.898684> BE_MPI (Info) : Job 44154 switched to state TERMINATED ('T')
> >.
> >.
> >.
> >
> >- --------------------------------------------------------------------
> >To add or remove yourself from this mailing list, use the 'notifyme'
> >command on any BGL machine. To remove: notifyme -n, to add: notifyme -y.
>
> William Gropp
> http://www.mcs.anl.gov/~gropp
>
>
