
[bgl-discuss] Fwd: [bgl-support #302] May I ask you two questions?



Yongzhi-

I'm forwarding your message on to the bgl-discuss mailing list to see if anyone there has any more ideas....

Andrew Cherry
MCS HPC Systems Group
Argonne National Laboratory


Begin forwarded message:

From: "Yongzhi Chen" <yonchen@xxxxxxxxxxxxx>
Date: February 5, 2007 1:48:47 PM CST
To: "'Andrew Cherry'" <acherry@xxxxxxxxxxx>
Subject: RE: [bgl-support #302] May I ask you two questions?

No.
The job did not perform any file I/O between the start/finish times.

-----Original Message-----
From: Andrew Cherry [mailto:acherry@xxxxxxxxxxx]
Sent: Monday, February 05, 2007 11:40 AM
To: Yongzhi Chen
Cc: support@xxxxxxxxxxxxxxx; smc@xxxxxxxxxxx
Subject: Re: [bgl-support #302] May I ask you two questions?

Hello Yongzhi-

Does your job perform any file I/O between your start/finish times?


Dear expert:



I currently have the following two questions regarding the MCS BG/L
machine:



(1) Inconsistency of the timing results

Suppose I run the same code several times, on the same day or on
different days. Each time I submit the job with the command "cqsub -q
short -t 00:30:00 -n 16 executed_program". So far the timing results
have ranged from 114.760775 seconds to 117.027405 seconds. Does this
make sense? I expected only a very small difference between runs, on
the order of several centiseconds, since the code has exclusive use of
the machines every time it runs. I collect the wall-clock time in the
following way:



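double start, finish;

MPI_Barrier(comm);      /* line up all ranks before starting the clock */
start = MPI_Wtime();

/* ... timed section ... */

MPI_Barrier(comm);      /* wait until every rank has finished */
finish = MPI_Wtime();

if (my_rank == 0)
  printf("Elapsed time = %e seconds\n", finish - start);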



(2) MPI behavior on the MCS BG/L machine

I do not know how MPI behaves on the MCS BG/L machine, so I measured
the maximum bandwidth with a ping-pong program. For example, I ran it
on 16 processors to find the maximum bandwidth (MPI_Send and MPI_Recv
behavior) between each pair of them. To my surprise, the results
ranged only from 153 MB/s to 158 MB/s. To be honest, I expected a
bigger difference, since BG/L is a 3-D torus machine, so different
pairs of processors are separated by different numbers of hops, on
which the bandwidth should mainly depend. How can these close results
be explained? I had supposed that BG/L is a heterogeneous system, but
the results show that it is homogeneous as far as MPI behavior is
concerned.



Best Regards,

Yongzhi


