This is related to my parameter study management code. The problem is how to keep track of interrupted jobs and resubmit them.
The usage scenario I'll describe here illustrates the motivation of the code:
Create the task array (e.g. folders 0-3). Each task has a PBS script with "pbs_name 0/1/2/3" in the script to identify it in the PBS system. (You can create these unique names yourself or have them generated automatically by PSMT.)
cd into the parent of the folders. Submit all PBS scripts in the subdirectories with
    python pbsmgr.py
This will keep running and resubmit any jobs that are interrupted. So you should have your script rename or delete the PBS script when it has (actually) finished, so that it doesn't get picked up by pbsmgr again.
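To make the resubmit logic concrete, here is a minimal sketch of the bookkeeping such a manager needs. It is not the actual pbsmgr.py code. The function names, the ".pbs" file extension, the ".done" suffix and the use of the standard "#PBS -N" directive to carry the job name are all assumptions made for illustration. Getting the set of queued job names (for example by parsing qstat output) and the actual qsub call are left out.

```python
import re
from pathlib import Path

# Assumption: each task folder holds one script ending in ".pbs", and the job
# name is set with the standard "#PBS -N <name>" directive.
NAME_DIRECTIVE = re.compile(r"^#PBS\s+-N\s+(\S+)", re.MULTILINE)


def pending_scripts(parent):
    """Map job name -> script path for every *.pbs file one level below parent."""
    pending = {}
    for script in sorted(Path(parent).glob("*/*.pbs")):
        match = NAME_DIRECTIVE.search(script.read_text())
        if match:
            pending[match.group(1)] = script
    return pending


def scripts_to_submit(pending, queued_names):
    """Return the scripts whose job name is not currently known to the queue."""
    return [path for name, path in pending.items() if name not in queued_names]


def mark_finished(script):
    """Rename a finished task's script (hypothetical ".done" suffix) so later scans skip it."""
    script = Path(script)
    return script.rename(script.with_name(script.name + ".done"))
```

A manager loop would call pending_scripts and scripts_to_submit repeatedly and submit whatever comes back. A job that calls something like mark_finished as its last step drops out of the scan, which is why the rename or delete matters.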
You can watch the status of jobs with
    python pbsmon.py
from the same directory. At first it will list all jobs, but then it will only print /changes/ to job status.
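The "only print changes" behaviour amounts to comparing two snapshots of job status. The sketch below shows one way to do that. It is an illustration rather than the actual pbsmon.py code: the function name and the dict-of-status-letters representation are assumptions, and "absent" is a made-up label for a job that is missing from a snapshot.

```python
def status_changes(previous, current):
    """Return lines describing jobs whose status differs between two snapshots.

    Both arguments map job name -> status string (e.g. "Q" or "R").
    """
    lines = []
    for name in sorted(set(previous) | set(current)):
        old = previous.get(name)
        new = current.get(name)
        if old != new:
            # "absent" is a placeholder for a job missing from one snapshot.
            lines.append(f"{name}: {old or 'absent'} -> {new or 'absent'}")
    return lines
```

Starting from an empty previous snapshot makes the first call report every job, and later calls report only the differences.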