#!/bin/sh # This is a shell archive, meaning: # 1. Remove everything above the #!/bin/sh line. # 2. Save the resulting text in a file. # 3. Execute the file with /bin/sh (not csh) to create the files: # README # blkjac.f # cputm.c # d_and_c.f # data # data.blkjac # data.ts_dynamic # example.f # ftsubs.f # ftsubs.graph.f # indx0.h # indxj.h # maindp.f # make1 # makefile # maxparms.h # newevdp0.f # oldtest # putq.c # second.f # stateig.f # statses.f # stseswait.f # stuffspawn.f # testrun # ts_dynamic.f # This archive created: Wed Aug 30 14:34:47 1989 export PATH; PATH=/bin:$PATH if test -f 'README' then echo shar: over-writing existing file "'README'" fi cat << \SHAR_EOF > 'README' This package contains an enhanced version of the SCHEDULE Parallel Programming Package. By way of reusable or recycling queues, applications with the new ftsubs.f are now limited to 1000 active job processes, rather than a 1000 cumulative job processes (see the new version of the demonstration program ts_dynamic.f that illustrates the use of the new SCHEDULE subroutine GETTAG and see the description below; i.e. jobtags are now assigned by SCHEDULE instead of the user). This new version of ftsubs.f now also permits iterations of static dependency graphs (see the example blkjac.f that illustrates the use of the new SCHEDULE subroutines RESET and RSCHED). The new putq.c requires "include" files: maxparms.h, indx0.h and indxj.h (these may be easily changed from 20 to 60 parameters to allow an increased number of parameters in the calls to sched, putq and spawn; CAUTION: there is usually more overhead in subroutine argument passing than common argument passing, so rely more on common unless the job is big enough to underwrite the extra overhead or subroutine arguments can not be avoided). This revised version of SCHEDULE is currently only available on the Alliant, Balance, and Symmetry. ******************************************************************************* READ THIS PART!!! CAUTION: The first two arguments of NXTAG and SPAWN are now reversed from older versions to make them consistent with the static dependency subroutines DEP and PUTQ. ******************************************************************************* Caution: Calls to SCHED, PUTQ and SPAWN, should include at least one parameter in the argument list. To build the libraries, type: "make sched" or "make graph" make sched will produce the schedule library in file sched.a . This is the standard libarary for schedule. All examples begining with 'x' use this library. make graph will produce the trace version of schedule in file graph.a . This version produces output which can be viewed using the Schedule Trace Facility. All examples begining with 'g' use this library. To run the examples type: "make xtest; xtest" or "make gtest; gtest" This will make and run a test program which solves a triangular system of equations using the spawning capabilities of schedule. "make xdandc; xdandc" or "make gdandc; gdandc" This will make and run a test program which implements a recursive divide and conquer technique using the spawning capabilities of schedule. The output from gdandc will not work with the current trace facility because spawned processes also spawn other processes and the trace facility does not yet support more than one level of spawning. "make xeig; xeig" or "make geig; geig" This will make and run a test program which computes the eigenvalues of a symmetric tridiagonal eigenvalue problem using a divide and conquer technique. "make xwait; xwait" or "make gwait; gwait" This will make and run a test program which solves the same problem as above except that it uses the schedule wait command. "make xexample; xexample" or "make gexample; gexample" This will make and run a test program which computes the inner product of two vectors. "make xts_tsdynamic; xts_dynamic" or "make gts_tsdynamic; gts_dynamic" This will make and run an example program illustrating use of the new SCHEDULE package with the triangular stuffer demonstration program; note especially the new subroutine GETTAG that gets a SCHEDULE generated job for each of the user's processes; note that the arguments of NXTAG and SPAWN have been reordered to be more like that of DEP and PUTQ. The input to ts_dynamic the form: [n_processors] [n_array_size] [n_work_iterations]. There is a sample input file in data.ts_dynamic. "make xblkjac; xblkjac" or "make gblkjac; gblkjac" This will make and run asample FORTRAN static iteration driver for ftsubs.f with block Jacobi iteration of a variable coefficient EPDE; The input has the form: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] [max_iterations] [n_result_precision]. There is a sample input file in data.blkjac. Up to 10 X 10 blocks are permitted. The new SCHEDULE subroutine RSET marks processes that will take part in an iteration. Another new SCHEDULE subroutine RSCHED restores only those parameters, such as ICANGO, that have changed; NSLOTS have been increased to 105 to permit at most 10 X 10 block iterations. A complete make and test can be carried out by doing: testrun >& testout diff oldtest testout SHAR_EOF if test -f 'blkjac.f' then echo shar: over-writing existing file "'blkjac.f'" fi cat << \SHAR_EOF > 'blkjac.f' $STDUNIT program blkjac c Code Name: blkjac.f (STATIC VERSION) Code Data Input: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] Continued Input (single line assumed): [max_iterations] [n_result_precision] Change: Block Jacobi test for SCHEDULE rsched to restore STATIC & DYNAMIC cont: dependency graph for the next iteration. implicit real*8 (a-h,o-z) parameter(mdim=102, ndim=102, maxblk=11, maxprc=8) parameter(xmax1=100.d0,ymax1=100.d0,tol1=0.5d-4) parameter(nmyck=10*maxblk) parameter(mxprcs = 1000) integer itmp(maxblk),jtmp(maxblk),mychkn(nmyck),statag(mdim,ndim) integer itag(mxprcs) real t1,t2,t3,tt,second,foo real*8 u(mdim,ndim),v(mdim,ndim) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Caution: common block CONWRT is used in SCHEDULE for concurrent prints. COMMON /CONWRT/ WRLOCK EXTERNAL PARALG read(5,*) nprocs,m,n,mb,nb,mxiter,nprec write(6,6666) nprocs,m,n,mb,nb,mxiter,nprec 6666 format(' Static Block Jacobi Input:' & /3x,'nprocs =',i3,'; (m,n) = (',i4,',',i4 & ,'); (mblks,nblks) = (',i3,',',i3 & ,')'/3x,'; max iterations =',i6,'; nprec =',i3) mbpts = m/mb nbpts = n/nb tol = 0.5d0/10**nprec write(6,6667) mdim,ndim,maxblk,maxprc,nmyck,mbpts,nbpts & ,xmax1,ymax1,nprec,tol 6667 format(' Parameter input:'/3x,'(mdim,ndim) = (',i4,',',i4 & ,'); maxblkdim = ',i4 & /3x,'; maxprc =',i3,'; nmychk =',i6,'; (mbpts,nbpts) = (' & ,i4,',',i4,')' & /3x,' (xmax,ymax) = (',f7.2,',',f7.2,'); nprec =',i2 & ,'( tol =',d11.4,')') c if(m.gt.(mdim-2).or.n.gt.(ndim-2).or.mb.gt.maxblk.or.nb.gt.maxblk & .or.nprocs.gt.maxprc.or.nprocs.lt.1.or.mb*nb.gt.100) & then write(6,6668) mdim,ndim,maxblk,m,n,mb,nb,nprocs,mxiter 6668 format(' Improper inputs with limits exceeded; input was:' & ' mdim =',i5,'; ndim =',i5,'; maxblk =',i5 & /' m =',i5,'; n =',i5,';mb =',i5,';nb =',i5 & /' nprocs =',i5,'; max iterations =',i5) write(6,*) 'S T O P E X E C U T I O N I N M A I N' stop endif c xmax = xmax1 ymax = ymax1 c do 10 ib = 1,mb 10 itmp(ib) = ib do 11 jb = 1,nb 11 jtmp(jb) = jb c c: remove second.f timer t1 = second(foo) t2 = second(foo) c CALL SCHED(nprocs,paralg,mdim,ndim,itmp,jtmp,itag,statag,mychkn & ,u,v) c t3 = second(foo) tt = t3-t2-(t2-t1) c c output c amstep = amax1(m/10.,1.) anstep = amax1(n/10.,1.) mtop = min0(m+1,11) ntop = min0(n+1,11) do 1002 i = 1,mtop 1002 itmp(i) = 1 + (i-1)*amstep + .5 do 1003 j = 1,ntop 1003 jtmp(j) = 1 + (j-1)*anstep + .5 write(6,1001) (itmp(i),i = 1,mtop),m+2 j = n+2 write(6,1000) j,(u(itmp(i),n+2),i = 1,mtop),u(m+2,n+2) do 100 k = 1,ntop j = jtmp(ntop+1-k) write(6,1000) j,(u(itmp(i),j),i = 1,mtop),u(m+2,j) 100 continue 1000 format(i4,2x,12f5.2) 1001 format(' Static Block Jacobi -' & ,' Iteration SCHEDULE Final Results:' & /3x,'j/i',12i5) c mxjobs = 1+niter*(1+mb*nb+2)+1 if(nprocs.eq.1) write(6,664) mb,nb,niter,mxiter 664 format(' # Static Block Jacobi Schedule rsched & gettag & name ' & ,'program' & /' # ftsubs.f for iterated circular readyq & parmq & ' & ,'freeq version' & /' # with mblks, nblks =',2i5'; niter =',i8,'; mxiter=',i8) write(6,665) nprocs,tt 665 format(' #',i2,f12.5) write(*,666) m,n,mb,nb,niter,nprocs,mxjobs,tt,uvdiff 666 format(9x,'code',7x,'m',7x,'n',3x,'mblks',3x,'nblks',3x,'niter' & ,2x,'nprocs',1x,'maxjobs' & /1x,'BLOCK-JACOBI',7i8 & /1x,'STATIC VERSION',5x,'seconds =',f12.5,'; uvdiff =',d12.5) if(uvdiff.ge.tol) write(6,667) niter,mxiter,uvdiff,tol 667 format(3x,'Iteration UNSUCCESSFUL: niter =',i6, ' & mxiter =' & ,i6/5x,'while uvdiff =',d12.5,' .GE. tol = ',d12.5) c stop end c subroutine paralg(mdim,ndim,itmp,jtmp,itag,statag,mychkn & ,u,v) parameter(mxprcs = 1000) implicit real*8 (a-h,o-z) integer m,n,mb,nb,statag(mdim,*),mychkn(*),itmp(*),jtmp(*) integer itag(mxprcs) integer jobtag,initag,strtag,cnvtag,testag,stptag real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff EXTERNAL INIT,STARTT,JACOBI,CONVRG,TEST,STOPIT c c this is the parallel driver for the iterated dependency graph c c first, get all static job tags necessary to construct the c dependency graph. c CAUTION: At this point, execution is in parallel because sub paralg CAUTION: and its args are only passed to SCHEDULE by sub sched and is CAUTION: executed in concurrent mode by a copy of the sub work. CAUTION: It is essential that all subsequent sub args must be global CAUTION: variables, such as itag, statag, itmp and jtmp, else values CAUTION: passed will not be protected from concurrent overwrite. c CALL GETTAG(initag) itag(initag) = initag CALL GETTAG(strtag) itag(strtag) = strtag do 100 jb = 1,nb do 100 ib = 1,mb Caution: statag(ib,jb) gets the static job tag cont: for the block (ib,jb). CALL GETTAG(statag(ib,jb)) 100 continue CALL GETTAG(cnvtag) itag(cnvtag) = cnvtag CALL GETTAG(testag) itag(testag) = testag CALL GETTAG(stptag) itag(stptag) = stptag c jobtag = itag(initag) icango = 0 nchks = 1 nreset = 0 mychkn(1) = itag(strtag) c CAUTION: PUTQ does not call INIT, but only passes its name and args to CONT: SCHEDULE. c CALL name(jobtag,' init') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,init,itag(initag),mdim,u) c jobtag = itag(strtag) icango = 1 nchks = mb*nb Comment: Here nreset = 2 is used as the iteration set number, cont: but it may be any nonzero integer. nreset = 2 do 201 jb = 1,nb do 201 ib = 1,mb mychkn(ib+mb*(jb-1)) = statag(ib,jb) 201 continue c CALL name(jobtag,'startt') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,startt,itag(strtag),mdim,u,v) c do 301 jb = 1,nb do 301 ib = 1,mb jobtag = statag(ib,jb) icango = 1 nchks = 1 nreset = 2 mychkn(1) = itag(cnvtag) c CALL name(jobtag,'jacobi') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CAUTION: Make certain that global variables like itmp are passed as cont: arguments of subroutines that are passed to Schedule. CALL PUTQ(jobtag,jacobi,statag(ib,jb),mdim & ,itmp(ib),jtmp(jb),u,v) 301 continue c jobtag = itag(cnvtag) icango = mb*nb nchks = 1 nreset = 2 mychkn(1) = itag(testag) c CALL name(jobtag,'convrg') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,convrg,itag(cnvtag),mdim,u,v) c jobtag = itag(testag) icango = 1 nchks = 1 nreset = 2 mychkn(1) = itag(stptag) c CALL name(jobtag,' test') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,test,itag(testag),itag(strtag) & ,itag(stptag)) c jobtag = itag(stptag) icango = 1 nchks = 0 nreset = 0 c CALL name(jobtag,'stopit') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,stopit,itag(stptag)) c return end c subroutine init(initag,mdim,u) implicit real*8 (a-h,o-z) real*8 u(mdim,*) integer initag,mdim,m,n common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff niter = 0 dx = xmax/(m+1) dy = ymax/(n+1) do 200 i = 1,m+2 x = (i-1)*dx u(i,1) = (x/xmax)**2 u(i,n+2) = 0.5*(1+(x/xmax)**2) 200 continue do 300 j = 2,n+1 y = (j-1)*dy u(1,j)= 0.5*(y/ymax)**3 u(m+2,j) = 1.0 300 continue do 100 j = 2,n+1 do 100 i = 2,m+1 u(i,j) = ((n+2-j)*u(i,1)+(j-1)*u(i,n+2))/(n+1) 100 continue mstep=m/10 nstep=n/10 return end subroutine startt(strtag,mdim,u,v) implicit real*8 (a-h,o-z) integer strtag,mdim,m,n real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Code: saves current node values and is restarting point for iterations. niter = niter + 1 do 100 j = 1,n+2 do 100 i = 1,m+2 v(i,j) = u(i,j) 100 continue return end subroutine jacobi(statag,mdim,ib,jb,u,v) implicit real*8 (a-h,o-z) integer statag,mdim,m,n,mb,nb,ib,jb real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff r = dy/dx do 100 js = 1,nbpts j = js + 1 + nbpts*(jb-1) y = (j-1)*dy do 100 is = 1,mbpts i = is + 1 + mbpts*(ib-1) x = (i-1)*dx a = 1.d0/dsqrt(1.d0+x**2+y**2) b = dexp(-x**2 -y**2) den = 2*(r**2*a + b) u(i,j) = (r**2*a*(v(i+1,j)+v(i-1,j)) & + b*(v(i,j+1)+v(i,j-1)))/den 100 continue mstep=m/10 nstep=n/10 Code: computes Block Jacobi updates for block (ib,jb) return end subroutine convrg(cnvtag,mdim,u,v) implicit real*8 (a-h,o-z) integer cnvtag,mdim,m,n,idone real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Code: computes the Cauchy Convergence crierion in the inf-norm and cont: passes the flag idone as 0 for reset and 1 for stop uvdiff = 0 do 100 j = 2,n+1 do 100 i = 2,m+1 dumax = abs(u(i,j)-v(i,j)) if(dumax.gt.uvdiff) uvdiff = dumax 100 continue if(uvdiff.lt.tol.or.niter.ge.mxiter) then idone = 1 else idone = 0 endif c return end subroutine test(testag,strtag,stptag) implicit real*8 (a-h,o-z) integer testag,strtag,stptag,reset common /comint/ m,n,mb,nb,mbpkts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff CTERM INTEGER WRLOCK CTERM COMMON /CONWRT/ WRLOCK if(idone.eq.0) then Comment: If iteration is unfinished, Reset SCHEDULE sub RSCHED is called Cont: and my check in is changed to iteration start tag strtag. Comment: In this example, the iteration set integer is 2. kreset = 2 CALL RSCHED(testag,strtag,kreset) else Comment: Else, reset my check in to the iteration stop tag stptag. kreset = 0 CALL RSCHED(testag,stptag,kreset) endif return end subroutine stopit(stptag) integer stptag continue return end SHAR_EOF if test -f 'cputm.c' then echo shar: over-writing existing file "'cputm.c'" fi cat << \SHAR_EOF > 'cputm.c' #include #include #include long cputm_() { long seconds, microseconds, milliseconds; struct rusage buffer; getrusage(RUSAGE_SELF, &buffer); seconds = buffer.ru_utime.tv_sec; microseconds = buffer.ru_utime.tv_usec; milliseconds = 1000*seconds + microseconds/1000; return(milliseconds); } SHAR_EOF if test -f 'd_and_c.f' then echo shar: over-writing existing file "'d_and_c.f'" fi cat << \SHAR_EOF > 'd_and_c.f' $STDUNIT integer a(256),klevl(8),myid(256) external top write(6,*) ' input nprocs nlevls ' read (5,*) nprocs,nlevls do 50 j = 1,8 klevl(j) = j 50 continue call sched(nprocs,top,a,nlevls,klevl,myid) do 100 j = 1,2**nlevls-1 write(6,*) a(j) 100 continue stop end subroutine top(a,nlevls,klevl,myid) integer a(*),klevl(*),myid(*) character*6 subnam external split c write(6,*) ' from top ' , a(1) call gettag(jobtag) icango = 0 nchks = 0 myid(1) = jobtag a(1) = 1 c subnam = 'split' call name(jobtag,subnam) call dep(jobtag,icango,nchks,mychkn) call putq(jobtag,split,myid,a,nlevls,klevl) c return end subroutine split(myid,a,nlevls,klevl) integer a(*),klevl(*),myid(*),rnode external clone character*6 subnam c write(6,*) ' from split ',a(1) c if (klevl(1) .ge. nlevls) return c lnode = 2*a(1) rnode = lnode + 1 indx = lnode - a(1) + 1 a(indx) = lnode a(indx+1) = rnode mytag = myid(a(1)) c call gettag(jobtag) subnam = 'split' call name(jobtag,subnam) call nxtag(jobtag,mytag) myid(lnode) = jobtag call spawn(jobtag,mytag,clone,myid,a(indx),nlevls,klevl(2)) c call gettag(jobtag) subnam = 'split' call name(jobtag,subnam) call nxtag(jobtag,mytag) myid(rnode) = jobtag call spawn(jobtag,mytag,clone,myid,a(indx+1),nlevls,klevl(2)) c return end SHAR_EOF if test -f 'data' then echo shar: over-writing existing file "'data'" fi cat << \SHAR_EOF > 'data' 12 4 SHAR_EOF if test -f 'data.blkjac' then echo shar: over-writing existing file "'data.blkjac'" fi cat << \SHAR_EOF > 'data.blkjac' 8 100 100 10 10 100 2 SHAR_EOF if test -f 'data.ts_dynamic' then echo shar: over-writing existing file "'data.ts_dynamic'" fi cat << \SHAR_EOF > 'data.ts_dynamic' 8 43 1000 SHAR_EOF if test -f 'example.f' then echo shar: over-writing existing file "'example.f'" fi cat << \SHAR_EOF > 'example.f' $STDUNIT program main integer n, k c external parprd c real a(1000), b(1000), temp(50), sigma write (6,*) 'Input number of processors' read (5,*) nprocs n = 1000 k = 20 c do 100 j = 1, n a(j) = j b(j) = 1 100 continue c call sched(nprocs, parprd, n, k, a, b, temp, sigma) c write (6,*) ' sigma = ', sigma stop end c subroutine parprd(n, k, a, b, temp, sigma) c integer n, k real a(*), b(*), temp(*), sigma c integer m1, m2, index, j, jobtag, icango, ncheks, mychkn(2) integer itags(500) c character*6 subnam external inprod, addup save m1, m2 c do 150 j = 1, k + 1 call gettag(jobtag) itags(j) = jobtag 150 continue c m1 = n/k index = 1 do 200 j = 1, k - 1 jobtag = itags(j) icango = 0 ncheks = 1 mychkn(1) = itags(k + 1) subnam = 'inprod' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, inprod, m1, a(index), b(index), temp(j)) index = index + m1 200 continue c m2 = n - index + 1 jobtag = itags(k) icango = 0 ncheks = 1 mychkn(1) = itags(k + 1) subnam = 'inprod' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, inprod, m2, a(index), b(index), temp(k)) index = index + m1 c jobtag = itags(k + 1) icango = k ncheks = 0 subnam = 'addup' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, addup, k, sigma, temp) c return end c c subroutine inprod(m, a, b, sigma) integer m real a(*), b(*), sigma sigma = 0.0 do 100 j = 1, m sigma = sigma + a(j)*b(j) 100 continue return end c c subroutine addup(k, sigma, temp) integer k real sigma, temp(*) sigma = 0.0 do 100 j = 1, k sigma = sigma + temp(j) 100 continue return end SHAR_EOF if test -f 'ftsubs.f' then echo shar: over-writing existing file "'ftsubs.f'" fi cat << \SHAR_EOF > 'ftsubs.f' $STDUNIT $ALIGNWARN CVD$G NOINLINE (DUMP,DUMP2,LOCKON,LOCKOFF,NOPS,SECOND,WORK) subroutine chekin(jobtag) Code path: balance:/bfs2/brewer/SCHED/HANSON/ftsubs.f Comment: integrated iteration version of ftsubs.f and ftsubs.iter.f cont: with option to iterate a set of nodes with reset dependencies. Comment: combined graphics and terminal trace version of ftsubt.f Code parent: alliant:/afs1/hanson/dirsched/ftsubs.f change(1): iprcs = 200 <- 120; change(2): automatic return stmt removed out of loop do 20 in chekin; change(3): installed vector-circular ready queue, cont: vector <= nproc sub-qs, elastic with nproc processors; cont: circular <= readyq free space wraps around from rtail to rhead, cont: with the top end of readyq connected to the bottom end; cont: ready(rhead(id)+ndmrsq*(id-1)) <- readyq(rhead(id),id); cont: most mxces replaced by nprocc = nproc = no. sub-qs; cont: ldimrq = leading dim of readyq = iprcs*mxces cont: ndmrsq = dim of a ready-sub-q = ldimrq/nproc cont: idrsq = id of ready-sub-q <- iwrkr; dummy iw used in do's; cont: installed SCHED ERROR flags for readyq over-runs (mtail cond.); cont: round robin test in getprb reduced to single statement. Change: corrected next in nxtag & intspn in start2 to recover lost tag. CAUTION: nxtag and spawn arguments are consistent with dep and putq cont: now, but order of arguments may not be consistent with older cont: versions of ftsubs.f. Change(4): installed circular parm queue, jobtag is the circular cont: (reusable) job tag with 1.le.jobtag.le.mxprcs, cont: snext is the schedule or sum or cumulative jobtag. Change(5): install super next tag, whereby user gets schedule job tags cont: from new schedule sub gettag; hence schedule has no knowledge cont: of user tags and consequently the principal restriction on user cont: is that there be less than "mxprcs" undone jobs at any time. cont: integer array "unitag" keeps a unique job tag for undone jobs. Change(6): install rest and save arrays for jobtags that will be cont: iterated more than once with original dependencies: ireset, cont: icnsav. install sub rsched to reset icangoes cont: and call sub place on iteration. Change(6a): nslots = 105 <- 30 to handle multiple dependencies. Change(7): installed common block CONWRT with key WRLOCK for concurrent cont: writes for use in both ftsubs.f and the user's driver code. Change(8): installed c-include indx*.h files to enable the passing of cont: up to 60 parameters with sched, putq and spawn calls (via m. cont: johnson, ssi). Change(9): installed lock initializations in libopn to make porting cont: to other machines without automatic variable initialization. CAUTION: subroutine second uses machine dependent timer, which must be cont: changed when porting to other machines. cgraphChange: install write nproc in sub libopn. cgraphChange: installed extra traces in chekin & place. cgraphChange: replaced qlock(mxprcs) by glock as igraph's own lock. cgraphChange: installed process names for Dongarra/Brewer's sched.trace. cgraphChange(8): cgraph lines made compatible for SCHED.TRACE/sched.trace. cgraphcdirectory: /usr/alcaid/brewer/SCHED.TRACE/sched.trace cgraphcomment: for graphics trace, change 'cgraph' to null '' and run. ctermComment: for terminal trace, change 'cterm' to null '' and run. Change(9): Conversion of ftsubs.f to run on Sequent Balance 21000 cont: Add $STDUNIT & $ALIGNWARN compiler directives at beginning of file. cont: Change parameter "mxces" from 8 to 24. cont: Add line "ierr = m_set_procs(nproc)." cont: All locks are integer*1. cont: Use microtasking calls s_init_lock, s_lock, & s_unlock. cont: Add routines lckasn, lockon, lockoff, & nops. CVD$R NOCONCUR integer jobtag c*********************************************************************** c c this subroutine reports unit of computation labeled by c jobtag has completed to all dependent nodes. these nodes are c recorded in parmq(i,jobtag) where 6 .le. i .le. nchks+5 c checkin consists of decrementing the value in each of these c locations by 1. each of these is done in a critical section c protected by qlock(jobtag) c c if the value in parmq(2,jobtag) is 0 where jobtag is a process c dependent upon this one then jobtag is placed on the readyq c by entering the critical section protected by trlock. the c pointer rtail to the tail of the readyq is incremented c unless task done is to be recorded. task done is placed on c the ready q and the pointer rtail left in place if nchks .eq. 0 c is found. c c see the common block description in libopn for more detail. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) Caution: common block CONWRT is used for concurrent writes, cont: with WRLOCK as the key to the LOCK. INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c a complete common block description is given in the routine libopn c c**************************************************************************** c c check to see if this process has completed. if not do not checkin c mtail = 0 idrsq = 0 c c first ask if any kids spawned by jobtag c if (parmq(4,jobtag) .ne. 0 .or. parmq(5,jobtag) .ne. 0 ) then c c either kids have been spawned or ientry has been referenced c indicating reentry is required c c c find out how many are waiting to complete c if (parmq(4,jobtag) .ne. 0) then call s_lock(qlock(jobtag)) parmq(2,jobtag) = parmq(2,jobtag) - parmq(4,jobtag) call s_unlock(qlock(jobtag)) endif c c reset number of kids c parmq(4,jobtag) = 0 c c update the number of times this procedure has been c entered c parmq(1,jobtag) = parmq(1,jobtag) + 1 c c return without checkin if all the kids have not c checked in to jobtag yet or if a reentry is required. c process jobtag is not done in either case. c comment: extra trace data. if (parmq(2,jobtag) .ne. 0) then cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 7 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph else cgraphc trace for chekin/parent (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 6 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph endif return endif c c if ientry has been called but jobtag is not waiting c on any kids then jobtag is placed back on the readyq c if ( parmq(5,jobtag) .ne. 0) then idrsq = mod((jobtag-1),nprocc) + 1 call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 10 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(8,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 9 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(8,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif return endif endif c c the process has completed so chekin proceeds c cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 5 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph else cgraphc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 2 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 5 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 2 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c c if (mtail .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nchks = parmq(3,jobtag) c c if this is the final process (indicated by nchks .eq. 0) then c record task done. do not advance the tail so task done sequence c is set. all subsequent gtprb queries will leave rhead .eq. rtail c with readyq(rhead+ndmrsq*(i-1)) .eq. done. c if (nchks .eq. 0) then do 20 iw = 1,nprocc call s_lock(trlock(iw)) readyq(rtail(iw)+ndmrsq*(iw-1)) = done call s_unlock(trlock(iw)) 20 continue cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 12 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 11 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif Change(2): removed following return from end of above loop do 20. return endif do 50 j = 6,nchks+5 mychek = parmq(j,jobtag) c c get unique access to the checkin node mychek c and checkin by decrementing the appropriate counter c mchkgo = 1 call s_lock(qlock(mychek)) parmq(2,mychek) = parmq(2,mychek) - 1 mchkgo = parmq(2,mychek) call s_unlock(qlock(mychek)) c c place mychek on readyq if parmq(2,mychek) is 0 c if (mchkgo .eq. 0 ) then idrsq = mod((mychek-1),nprocc) + 1 call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = mychek rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif 50 continue c c place finished process at the end of the free list freeq c provided it will not be reset for another iteration. c if(ireset(jobtag).eq.0) then call s_lock(tflock) ftail = mod(ftail,mxprcs) + 1 if(fhead.eq. ftail) free = 0 freeq(ftail) = jobtag call s_unlock(tflock) endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 8 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 7 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c if ( free .eq. 0 ) then call s_lock(WRLOCK) inext = unitag(jobtag) write(6,*) '*************SCHED ERROR*************************' write(6,*) ' more processes have checked into sub chekin,' write(6,*) ' than should be active for free slots in the' write(6,*) ' parmq parameter queue. jobs are too many.' write(6,*) ' total number of jobtags were:',inext write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call s_unlock(WRLOCK) c endif c return c c last card of chekin c end subroutine gettag(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c this subroutine gets a schedule jobtag for problem on the queue, c provided a free column is available in parmq. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c if ( free .eq. 0 ) then call s_lock(WRLOCK) write(6,*) '*************SCHED LIMIT ERROR*******************' write(6,*) ' user attempt to create to many active ' write(6,*) ' processes ; total number of jobs =',snext write(6,*) ' too many unfinished jobs while in gettag ' write(6,*) ' and no free slots on the parameter queue ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call s_unlock(WRLOCK) c endif c c get tag for process on the next free column in the problem queue c call s_lock(hflock) jobtag = freeq(fhead) snext = snext + 1 if(fhead.eq. ftail) free = 0 fhead = mod(fhead,mxprcs) + 1 if(jobtag.ge.1.and.jobtag.le.mxprcs) unitag(jobtag) = snext call s_unlock(hflock) cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for gettag cterm igraph(1,insrt) = 15 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = jobtag cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail c if ( jobtag .le. 0 .or. jobtag .gt. mxprcs ) then write(6,*) '*************SCHED ERROR***********************' write(6,*) ' illegal jobtag for parmq column;' write(6,*) ' need 1 .le. jobtag .le. ',mxprcs,';' write(6,*) ' current jobtag =',jobtag,' in gettag' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c return c c last card of gettag c end subroutine rsched(jobtag,settag,kreset) CVD$R NOCONCUR integer jobtag,settag,kreset c************************************************************************* c comment: usage c subroutine paralg() c integer strtag,stptag,itag(*) c external start,test c . c . c call gettag(strtag) c itag(strtag) = strtag c . c . c call gettag(stptag) c itag(stptag) = stptag c . c . comment: start iteration or time stepping c jobtag = strtag c icango = 1 c nchks = ... c nreset = c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,start,itag(strtag)) c . c . comment: test and continue iteration at start if undone c jobtag = testag c icango = ... c nreset = c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,test,itag(strtag),itag(stptag)) c . c . c subroutine test(jobtag,strtag,stptag) c common /