[Aces-support] cannot start new job on ao and geo
aces-admin at techsquare.com
Wed Aug 29 08:56:27 EDT 2007
hello lurr-
this is not strange at all, as long as the
"cluster environment" doesn't change.
compute nodes are chosen in a fixed order.
if you submit 2 jobs, quit them, and then
submit 2 more jobs, you will get the
same compute nodes as long as nothing
else has changed in the "cluster environment"
(e.g., other users' jobs, machines down for repairs, etc.).
so your second job keeps landing on the same node each time.
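
A minimal sketch of the fixed-order behaviour described above, assuming a simple first-fit policy over a static node list. The node names and the `pick_nodes` helper are hypothetical and only illustrate the idea; this is not the actual scheduler configuration on ao or geo.

```python
# Hypothetical illustration: first-fit node selection over a fixed-order list.
# Node names are made up; the real cluster's node list and policy may differ.
NODE_ORDER = ["node001", "node002", "node003", "node004"]


def pick_nodes(n_jobs, busy=(), down=()):
    """Assign one node per job, always scanning NODE_ORDER from the top."""
    unavailable = set(busy) | set(down)
    free = [node for node in NODE_ORDER if node not in unavailable]
    return free[:n_jobs]


# Two jobs submitted on an otherwise idle cluster.
first_round = pick_nodes(2)

# Both jobs quit, nothing else changes, and two more jobs are submitted.
second_round = pick_nodes(2)

# Same nodes again, so the second job lands on the same node both times.
assert first_round == second_round

# Once the environment changes (another user's job occupies node002),
# the second job is placed on a different node.
third_round = pick_nodes(2, busy=["node002"])
```

Under these assumptions, a single misbehaving node in the second slot would explain why the first job starts normally while the second one fails on every attempt.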
i am working to find the root cause of this
breakage, so please bear with me...
[greg]
> Date: Tue, 28 Aug 2007 16:28:46 -0400
> From: Richard Lu <lurr at mit.edu>
> MIME-Version: 1.0
> Cc:
> Reply-To: ACES-support at mitgcm.org
>
> It is strange: if I close all the jobs and then submit two job
> requests, the first one starts normally, but the second job
> has a problem. For example, I requested two jobs a moment ago; the
> first job in the queue, 68906.geo, got started, and the other job,
> 68907.geo, had a problem. I hope this gives you more clues about what's
> going on. Thanks.
>
> [lurr at geo:~]
> $ nt1
> qsub: waiting for job 68907.geo to start
> qsub: job 68907.geo ready
>
>
> qsub: job 68907.geo completed
>
>
>
>
> aces-admin at techsquare.com wrote:
> > hmm, and again, please ?
> >
> > [greg]
> >
> >> Date: Tue, 28 Aug 2007 10:30:33 -0400
> >> From: Richard Lu <lurr at mit.edu>
> >> MIME-Version: 1.0
> >> Cc:
> >> Reply-To: ACES-support at mitgcm.org
> >>
> >> No, it still has a problem:
> >>
> >> [lurr at geo:~/scratch/s40/deimos]
> >> $ nt1
> >> qsub: waiting for job 68882.geo to start
> >> qsub: job 68882.geo ready
> >>
> >>
> >> qsub: job 68882.geo completed
> >>
> >>
> >>
> >> aces-admin at techsquare.com wrote:
> >>> hello lurr-
> >>>
> >>> is this still happening for you ?
> >>> i've just checked both geo and ao
> >>> and they seem to be fine...
> >>>
> >>> actually, i just tweaked geo a bit.
> >>> does that help for you ?
> >>>
> >>> [greg]
> >>>
> >>> ps. i killed your MATLAB on the head-node.
> >>> please do not run computationally intensive
> >>> code on the head nodes, etc.
> >>>
> >>>> Date: Tue, 28 Aug 2007 10:05:23 -0400
> >>>> From: Richard Lu <lurr at mit.edu>
> >>>> MIME-Version: 1.0
> >>>> Cc:
> >>>> Reply-To: ACES-support at mitgcm.org
> >>>>
> >>>> Hi, there,
> >>>>
> >>>> I cannot start any new job on either ao or geo. When I submit a job
> >>>> request, it says the job is ready, and then the job immediately gets
> >>>> killed. Is anything wrong? Thanks.
> >>>>
> >>>> [lurr at ao:~] $ qsub -I -q long -l nodes=1
> >>>> qsub: waiting for job 86713.ao to start
> >>>> qsub: job 86713.ao ready
> >>>>
> >>>>
> >>>> qsub: job 86713.ao completed
> >>>>
> >>>>
> >>>> [lurr at geo:~]
> >>>> $ qsub -I -q long -l nodes=1
> >>>> qsub: waiting for job 68879.geo to start
> >>>> qsub: job 68879.geo ready
> >>>>
> >>>>
> >>>> qsub: job 68879.geo completed
> >>>>
> >>>>
> >>>> Rongrong Lu
> >>>>
> >>>> --------------------------------------------
> >>>> Earth Resources Laboratory, MIT
> >>>> 77 Massachusetts Ave., Bldg.54-1815, Cambridge, MA 02139
> >>>> Tel: 617-253-7835 (o) 617-230-6729 (m)
> >>>> Email: lurr at mit.edu
> >>>> Web: http://web.mit.edu/lurr
> >>>> --------------------------------------------
> >>>> _______________________________________________
> >>>> Aces-support mailing list
> >>>> Aces-support at acesgrid.org
> >>>> http://acesgrid.org/mailman/listinfo/aces-support
> >>>>
>