[Aces-support] Problems on ao nodes

aces-admin at techsquare.com
Fri Aug 31 10:24:31 EDT 2007


hello jmc-

this machine was confiscated:

  a54-1727-037

[greg]

> Date: Fri, 31 Aug 2007 10:06:35 -0400
> From: Jean-Michel Campin <jmc at ocean.mit.edu>
> Mime-Version: 1.0
> Cc: 
> Reply-To: ACES-support at mitgcm.org
> 
> Hi Greg,
> 
> I ran into a similar problem last night (between 1 am & 3.30 am)
> with these 2 nodes:
>  a54-1727-037
>  a54-1727-045
> and the same error msg:
> > poll: protocol failure in circuit setup
> > p0_12955:  p4_error: Child process exited while making connection to remote process on a54-1727-037: 0
> > p0_12955: (2.009466) net_send: could not write to fd=4, errno = 32
> 
> Jean-Michel
> 
> On Tue, Aug 28, 2007 at 02:03:26PM -0400, aces-admin at techsquare.com wrote:
> > hello jmc-
> > 
> > yes, these two (2) nodes are alright now.
> > fwiw, a54-1727-033 was troubled earlier today.
> > 
> > [greg]
> > 
> > > Date: Tue, 28 Aug 2007 10:52:03 -0400
> > > From: Jean-Michel Campin <jmc at ocean.mit.edu>
> > > Mime-Version: 1.0
> > > Cc: 
> > > Reply-To: ACES-support at mitgcm.org
> > > 
> > > Hi,
> > > 
> > > I have 2 mpi jobs which did not run last night, between 1 am & 3.30 am.
> > > Both were on the same computing nodes:
> > > a54-1727-033
> > > a54-1727-035
> > > 
> > > and generated this error msg:
> > > poll: protocol failure in circuit setup
> > > p0_30263:  p4_error: Child process exited while making connection to remote process on a54-1727-035: 0
> > > p0_30263: (2.010777) net_send: could not write to fd=4, errno = 32
> > > 
> > > Are these 2 nodes OK now for mpi jobs ?
> > > Thanks,
> > > Jean-Michel
> > > _______________________________________________
> > > Aces-support mailing list
> > > Aces-support at acesgrid.org
> > > http://acesgrid.org/mailman/listinfo/aces-support
> > > 
> 

