[Aces-support] problems 11/19/04

Yang Zhang yangz at MIT.EDU
Fri Nov 19 17:11:59 EST 2004


Hi Chris,

Last Tuesday we mentioned that we need a long time slot for
calculations, such as 7 days. I am wondering whether this has been
set up yet, because the longest time I can find is still 24 hours. Is that right?

Another thing: I ran into a problem today when I was running a LAM/MPI
program. The output is:
"
n-1<2956> ssi:boot:base:linear_windowed: booting n0 (aE34-500-063)
n-1<2956> ssi:boot:base:linear_windowed: booting n1 (aE34-500-035)
...........	...............		..........  (aE34-........)
n-1<2956> ssi:boot:base:linear_windowed: booting n9 (a48-206-m05)
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun chose a different RPI than its peers.  For example, at least
the following two processes mismatched in their RPI selections:

     MPI_COMM_WORLD rank 0: usysv (v7.1.0)
     MPI_COMM_WORLD rank 18: gm (v1.2.0)

All MPI processes must choose the same RPI module and version when
they start.  Check your SSI settings and/or the local environment
variables on each node.
"

What is causing this problem?

Thanks,

Yang



