Re: Fw: [bgl-discuss] MPI failure, simplified
Thanks for the input, Bob. I'm putting in a PMR right now....
On Thu, 2 Feb 2006, Bob Walkup wrote:
> ----- Forwarded by Bob Walkup/Watson/IBM on 02/02/2006 03:57 PM -----
>
> I would suggest opening a PMR with IBM. Although it might be possible to
> code around this, it would be preferable to get a fix in the library.
> There is a good test case ready to go - and working through IBM support is
> probably the best approach. I think this is an important issue - it is
> just too easy to run out of memory with the current MPI implementation.
>
> Regards,
> Bob Walkup (walkup@us.ibm.com, 914-945-1512)
> --------------------------------------------------------------
>
> "Andrew Siegel" <siegela@mcs.anl.gov>
> Sent by: owner-discuss@bgl.mcs.anl.gov
> 02/02/2006 12:24 PM
>
> To: <discuss@bgl.mcs.anl.gov>
> cc:
> Subject: RE: [bgl-discuss] MPI failure, simplified
>
>
> To be clear, this isn't an EAGER vs RENDEZVOUS issue, since we are well
> past the EAGER limit (as the error message also indicates). My
> understanding of the MPI standard is that the implementation is free to
> buffer or not, but that whatever it does, this is guaranteed to
> work (at least for "small" numbers of outstanding messages). Anyhow, would
> it be wise to contact the IBM MPI people (George Almasi?) and ask some
> specific questions? Not being able to do such operations reliably totally
> kills several of the important algorithms that we're relying on.
> -andrew
>
> -----Original Message-----
> From: owner-discuss@bgl.mcs.anl.gov [mailto:owner-discuss@bgl.mcs.anl.gov]
> On Behalf Of William Gropp
> Sent: Thursday, February 02, 2006 8:12 AM
> To: Stephen Siegel
> Cc: discuss@bgl.mcs.anl.gov; support@bgl.mcs.anl.gov
> Subject: Re: [bgl-discuss] MPI failure, simplified
>
> At 10:31 PM 2/1/2006, Stephen Siegel wrote:
> >I posted an earlier message about an MPI failure I was getting on BGL
> >when passing some large messages. I can now produce a similar failure
> >with a very simple program. The code is below, followed by the
> >(excerpted) output to stderr when run on 2 procs (co-proc mode).
> >
> >Each proc allocates a 400 MB buffer. Proc 0 posts a send to proc 1 of
> >the first 80 MB, waits for that send to complete, then posts a receive
> >into the next 160 MB and waits for that request to complete. Proc 1
> >posts a recv from proc 0 for the first 80 MB, then posts a send of the
> >next 160 MB, then waits for both requests to complete. It seems to me
> >that this is a correct "safe" MPI program, going by the MPI Standard.
> >
> >The error message, "...cannot allocate unexpected buffer from...
> >unexpected requests 1, Total Mem: 160 MB ..." suggests that the MPI
> >implementation is trying to allocate 160 MB and it can't. It seems to
> >me that it shouldn't have to allocate this memory--it should just
> >deliver the message directly into the receive buffer. (That is the
> >point of the rendezvous protocol.)
> >
> >Question: is this a bug in the MPI implementation on BGL, or am I
> >missing something?
>
> Bug might be too strong a statement, but I agree with your interpretation -
> the MPI implementation should not be allocating space for Irecvs and this
> program should work. There's always some tension over where to set the
> eager vs. rendezvous threshold for both performance and space reasons, and
> the MPI standard doesn't specify when eager, rendezvous, or something else
> should be used. But this program should work.
>
> Bill
>
>
> >Thanks,
> >
> > Steve
> >
> >
> >---------------------------------------------------------------------
> >#include <stdlib.h>
> >#include <assert.h>
> >#include <stdio.h>
> >#include "mpi.h"
> >
> >int main (int argc, char *argv[]) {
> >  int myRank, numProcs;
> >  unsigned char* ptr;
> >  MPI_Request req0;
> >  MPI_Request req1;
> >
> >  MPI_Init(&argc, &argv);
> >  MPI_Comm_size(MPI_COMM_WORLD, &numProcs);
> >  MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
> >  if (numProcs != 2) {
> >    fprintf(stderr, "Usage: mpiexec -np 2 ./exp2c\n");
> >    fflush(stderr);
> >    MPI_Finalize();
> >    return 1;
> >  }
> >  /* 400 MB buffer; only the first 80 MB and the next 160 MB are used */
> >  ptr = (unsigned char*)malloc(400000000);
> >  assert(ptr);
> >  if (myRank == 0) {
> >    /* send the first 80 MB and wait, then receive into the next 160 MB */
> >    MPI_Isend(ptr,80000000,MPI_BYTE,1,0,MPI_COMM_WORLD,&req0);
> >    MPI_Wait(&req0,MPI_STATUS_IGNORE);
> >    MPI_Irecv(ptr+80000000,160000000,MPI_BYTE,1,0,MPI_COMM_WORLD,&req1);
> >    MPI_Wait(&req1,MPI_STATUS_IGNORE);
> >  } else {
> >    /* post the 80 MB receive and the 160 MB send, then wait for both */
> >    MPI_Irecv(ptr,80000000,MPI_BYTE,0,0,MPI_COMM_WORLD,&req0);
> >    MPI_Isend(ptr+80000000,160000000,MPI_BYTE,0,0,MPI_COMM_WORLD,&req1);
> >    MPI_Wait(&req0,MPI_STATUS_IGNORE);
> >    MPI_Wait(&req1,MPI_STATUS_IGNORE);
> >  }
> >  free(ptr);
> >  printf("Proc %d has completed successfully\n", myRank);
> >  fflush(stdout);
> >  MPI_Finalize();
> >  return 0;
> >}
> >
> >---------------------------------------------------------------------
> >
> >.
> >.
> >.
> ><Feb 01 22:11:58.663360> BE_MPI (Info) : IO - Threads initialized
> >Rzv:cannot allocate unexpected buffer from R:1 T:0 C:0
> >Dumping 9 frames
> > Frame 0: 0x2078f0
> > Frame 1: 0x209da8
> > Frame 2: 0x23e25c
> > Frame 3: 0x237c04
> > Frame 4: 0x23a1b4
> > Frame 5: 0x207b0c
> > Frame 6: 0x2052b0
> > Frame 7: 0x200614
> > Frame 8: 0x20016c
> >Posted Queue:
> >-------------
> >Posted Requests 0, Total Mem: 0 bytes
> >Unexpected Queue:
> >-----------------
> >Unexpected Requests 1, Total Mem: 160000000 bytes
> >Fatal: Cannot allocate buffer for unexpected message<Feb 01
> >22:12:03.767341> BE_MPI (Info) : IO - Output thread terminated
> ><Feb 01 22:12:03.898684> BE_MPI (Info) : Job 44154 switched to state
> >TERMINATED ('T')
> >.
> >.
> >.
> >
> >- --------------------------------------------------------------------
> >To add or remove yourself from this mailing list, use the 'notifyme'
> >command on any BGL machine. To remove: notifyme -n, to add: notifyme -y.
>
> William Gropp
> http://www.mcs.anl.gov/~gropp