[Aces-support] Problems on ao nodes

Jean-Michel Campin jmc at ocean.mit.edu
Tue Aug 28 10:52:03 EDT 2007


Hi,

I have 2 MPI jobs that failed to run last night, between 1 am and 3:30 am.
Both were on the same compute nodes:
a54-1727-033 
a54-1727-035

and both generated this error message:
poll: protocol failure in circuit setup
p0_30263:  p4_error: Child process exited while making connection to remote process on a54-1727-035: 0
p0_30263: (2.010777) net_send: could not write to fd=4, errno = 32

Are these 2 nodes OK now for MPI jobs?
Thanks,
Jean-Michel

