#!/bin/sh # This is a shell archive, meaning: # 1. Remove everything above the #!/bin/sh line. # 2. Save the resulting text in a file. # 3. Execute the file with /bin/sh (not csh) to create the files: # README # blkjac.f # d_and_c.f # data # data.blkjac # data.ts_dynamic # example.f # ftsubs.f # ftsubs.graph.f # indx0.h # indxj.h # make1 # makefile # maxparms.h # oldtest # putq.c # second.f # stuffspawn.f # testrun # ts_dynamic.f # This archive created: Fri Sep 15 13:06:43 1989 export PATH; PATH=/bin:$PATH if test -f 'README' then echo shar: over-writing existing file "'README'" fi cat << \SHAR_EOF > 'README' This package contains an enhanced version of the SCHEDULE Parallel Programming Package for the Sun workstation. By way of reusable or recycling queues, applications with the new ftsubs.f are now limited to 1000 active job processes, rather than a 1000 cumulative job processes (see the new version of the demonstration program ts_dynamic.f that illustrates the use of the new SCHEDULE subroutine GETTAG and see the description below; i.e. jobtags are now assigned by SCHEDULE instead of the user). This new version of ftsubs.f now also permits iterations of static dependency graphs (see the example blkjac.f that illustrates the use of the new SCHEDULE subroutines RESET and RSCHED). The new putq.c requires "include" files: maxparms.h, indx0.h and indxj.h (these may be easily changed from 20 to 60 parameters to allow an increased number of parameters in the calls to sched, putq and spawn; CAUTION: there is usually more overhead in subroutine argument passing than common argument passing, so rely more on common unless the job is big enough to underwrite the extra overhead or subroutine arguments can not be avoided). This revised version of SCHEDULE is currently only available on the Alliant, Balance, Symmetry, and Sun. ******************************************************************************* READ THIS PART!!! CAUTION: The first two arguments of NXTAG and SPAWN are now reversed from older versions to make them consistent with the static dependency subroutines DEP and PUTQ. ******************************************************************************* Caution: Calls to SCHED, PUTQ and SPAWN, should include at least one parameter in the argument list. To build the libraries, type: "make sched" or "make graph" make sched will produce the schedule library in file sched.a . This is the standard libarary for schedule. All examples begining with 'x' use this library. make graph will produce the trace version of schedule in file graph.a . This version produces output which can be viewed using the Schedule Trace Facility. All examples begining with 'g' use this library. To run the examples type: "make xtest; xtest" or "make gtest; gtest" This will make and run a test program which solves a triangular system of equations using the spawning capabilities of schedule. "make xdandc; xdandc" or "make gdandc; gdandc" This will make and run a test program which implements a recursive divide and conquer technique using the spawning capabilities of schedule. The output from gdandc will not work with the current trace facility because spawned processes also spawn other processes and the trace facility does not yet support more than one level of spawning. "make xexample; xexample" or "make gexample; gexample" This will make and run a test program which computes the inner product of two vectors. "make xts_tsdynamic; xts_dynamic" or "make gts_tsdynamic; gts_dynamic" This will make and run an example program illustrating use of the new SCHEDULE package with the triangular stuffer demonstration program; note especially the new subroutine GETTAG that gets a SCHEDULE generated job for each of the user's processes; note that the arguments of NXTAG and SPAWN have been reordered to be more like that of DEP and PUTQ. The input to ts_dynamic the form: [n_processors] [n_array_size] [n_work_iterations]. There is a sample input file in data.ts_dynamic. "make xblkjac; xblkjac" or "make gblkjac; gblkjac" This will make and run asample FORTRAN static iteration driver for ftsubs.f with block Jacobi iteration of a variable coefficient EPDE; The input has the form: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] [max_iterations] [n_result_precision]. There is a sample input file in data.blkjac. Up to 10 X 10 blocks are permitted. The new SCHEDULE subroutine RSET marks processes that will take part in an iteration. Another new SCHEDULE subroutine RSCHED restores only those parameters, such as ICANGO, that have changed; NSLOTS have been increased to 105 to permit at most 10 X 10 block iterations. A complete make and test can be carried out by doing: testrun >& testout diff oldtest testout SHAR_EOF if test -f 'blkjac.f' then echo shar: over-writing existing file "'blkjac.f'" fi cat << \SHAR_EOF > 'blkjac.f' program blkjac c Code Name: blkjac.f (STATIC VERSION) Code Data Input: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] Continued Input (single line assumed): [max_iterations] [n_result_precision] Change: Block Jacobi test for SCHEDULE rsched to restore STATIC & DYNAMIC cont: dependency graph for the next iteration. implicit real*8 (a-h,o-z) parameter(mdim=102, ndim=102, maxblk=11, maxprc=8) parameter(xmax1=100.d0,ymax1=100.d0,tol1=0.5d-4) parameter(nmyck=10*maxblk) parameter(mxprcs = 1000) integer itmp(maxblk),jtmp(maxblk),mychkn(nmyck),statag(mdim,ndim) integer itag(mxprcs) real t1,t2,t3,tt,second,foo real*8 u(mdim,ndim),v(mdim,ndim) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Caution: common block CONWRT is used in SCHEDULE for concurrent prints. COMMON /CONWRT/ WRLOCK EXTERNAL PARALG read(5,*) nprocs,m,n,mb,nb,mxiter,nprec write(6,6666) nprocs,m,n,mb,nb,mxiter,nprec 6666 format(' Static Block Jacobi Input:' & /3x,'nprocs =',i3,'; (m,n) = (',i4,',',i4 & ,'); (mblks,nblks) = (',i3,',',i3 & ,')'/3x,'; max iterations =',i6,'; nprec =',i3) mbpts = m/mb nbpts = n/nb tol = 0.5d0/10**nprec write(6,6667) mdim,ndim,maxblk,maxprc,nmyck,mbpts,nbpts & ,xmax1,ymax1,nprec,tol 6667 format(' Parameter input:'/3x,'(mdim,ndim) = (',i4,',',i4 & ,'); maxblkdim = ',i4 & /3x,'; maxprc =',i3,'; nmychk =',i6,'; (mbpts,nbpts) = (' & ,i4,',',i4,')' & /3x,' (xmax,ymax) = (',f7.2,',',f7.2,'); nprec =',i2 & ,'( tol =',d11.4,')') c if(m.gt.(mdim-2).or.n.gt.(ndim-2).or.mb.gt.maxblk.or.nb.gt.maxblk & .or.nprocs.gt.maxprc.or.nprocs.lt.1.or.mb*nb.gt.100) & then write(6,6668) mdim,ndim,maxblk,m,n,mb,nb,nprocs,mxiter 6668 format(' Improper inputs with limits exceeded; input was:' & ' mdim =',i5,'; ndim =',i5,'; maxblk =',i5 & /' m =',i5,'; n =',i5,';mb =',i5,';nb =',i5 & /' nprocs =',i5,'; max iterations =',i5) write(6,*) 'S T O P E X E C U T I O N I N M A I N' stop endif c xmax = xmax1 ymax = ymax1 c do 10 ib = 1,mb 10 itmp(ib) = ib do 11 jb = 1,nb 11 jtmp(jb) = jb c c: remove second.f timer t1 = second(foo) t2 = second(foo) c CALL SCHED(nprocs,paralg,mdim,ndim,itmp,jtmp,itag,statag,mychkn & ,u,v) c t3 = second(foo) tt = t3-t2-(t2-t1) c c output c amstep = amax1(m/10.,1.) anstep = amax1(n/10.,1.) mtop = min0(m+1,11) ntop = min0(n+1,11) do 1002 i = 1,mtop 1002 itmp(i) = 1 + (i-1)*amstep + .5 do 1003 j = 1,ntop 1003 jtmp(j) = 1 + (j-1)*anstep + .5 write(6,1001) (itmp(i),i = 1,mtop),m+2 j = n+2 write(6,1000) j,(u(itmp(i),n+2),i = 1,mtop),u(m+2,n+2) do 100 k = 1,ntop j = jtmp(ntop+1-k) write(6,1000) j,(u(itmp(i),j),i = 1,mtop),u(m+2,j) 100 continue 1000 format(i4,2x,12f5.2) 1001 format(' Static Block Jacobi -' & ,' Iteration SCHEDULE Final Results:' & /3x,'j/i',12i5) c mxjobs = 1+niter*(1+mb*nb+2)+1 if(nprocs.eq.1) write(6,664) mb,nb,niter,mxiter 664 format(' # Static Block Jacobi Schedule rsched & gettag & name ' & ,'program' & /' # ftsubs.f for iterated circular readyq & parmq & ' & ,'freeq version' & /' # with mblks, nblks =',2i5'; niter =',i8,'; mxiter=',i8) write(6,665) nprocs,tt 665 format(' #',i2,f12.5) write(*,666) m,n,mb,nb,niter,nprocs,mxjobs,tt,uvdiff 666 format(9x,'code',7x,'m',7x,'n',3x,'mblks',3x,'nblks',3x,'niter' & ,2x,'nprocs',1x,'maxjobs' & /1x,'BLOCK-JACOBI',7i8 & /1x,'STATIC VERSION',5x,'seconds =',f12.5,'; uvdiff =',d12.5) if(uvdiff.ge.tol) write(6,667) niter,mxiter,uvdiff,tol 667 format(3x,'Iteration UNSUCCESSFUL: niter =',i6, ' & mxiter =' & ,i6/5x,'while uvdiff =',d12.5,' .GE. tol = ',d12.5) c stop end c subroutine paralg(mdim,ndim,itmp,jtmp,itag,statag,mychkn & ,u,v) parameter(mxprcs = 1000) implicit real*8 (a-h,o-z) integer m,n,mb,nb,statag(mdim,*),mychkn(*),itmp(*),jtmp(*) integer itag(mxprcs) integer jobtag,initag,strtag,cnvtag,testag,stptag real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff EXTERNAL INIT,STARTT,JACOBI,CONVRG,TEST,STOPIT c c this is the parallel driver for the iterated dependency graph c c first, get all static job tags necessary to construct the c dependency graph. c CAUTION: At this point, execution is in parallel because sub paralg CAUTION: and its args are only passed to SCHEDULE by sub sched and is CAUTION: executed in concurrent mode by a copy of the sub work. CAUTION: It is essential that all subsequent sub args must be global CAUTION: variables, such as itag, statag, itmp and jtmp, else values CAUTION: passed will not be protected from concurrent overwrite. c CALL GETTAG(initag) itag(initag) = initag CALL GETTAG(strtag) itag(strtag) = strtag do 100 jb = 1,nb do 100 ib = 1,mb Caution: statag(ib,jb) gets the static job tag cont: for the block (ib,jb). CALL GETTAG(statag(ib,jb)) 100 continue CALL GETTAG(cnvtag) itag(cnvtag) = cnvtag CALL GETTAG(testag) itag(testag) = testag CALL GETTAG(stptag) itag(stptag) = stptag c jobtag = itag(initag) icango = 0 nchks = 1 nreset = 0 mychkn(1) = itag(strtag) c CAUTION: PUTQ does not call INIT, but only passes its name and args to CONT: SCHEDULE. c CALL name(jobtag,' init') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,init,itag(initag),mdim,u) c jobtag = itag(strtag) icango = 1 nchks = mb*nb Comment: Here nreset = 2 is used as the iteration set number, cont: but it may be any nonzero integer. nreset = 2 do 201 jb = 1,nb do 201 ib = 1,mb mychkn(ib+mb*(jb-1)) = statag(ib,jb) 201 continue c CALL name(jobtag,'startt') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,startt,itag(strtag),mdim,u,v) c do 301 jb = 1,nb do 301 ib = 1,mb jobtag = statag(ib,jb) icango = 1 nchks = 1 nreset = 2 mychkn(1) = itag(cnvtag) c CALL name(jobtag,'jacobi') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CAUTION: Make certain that global variables like itmp are passed as cont: arguments of subroutines that are passed to Schedule. CALL PUTQ(jobtag,jacobi,statag(ib,jb),mdim & ,itmp(ib),jtmp(jb),u,v) 301 continue c jobtag = itag(cnvtag) icango = mb*nb nchks = 1 nreset = 2 mychkn(1) = itag(testag) c CALL name(jobtag,'convrg') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,convrg,itag(cnvtag),mdim,u,v) c jobtag = itag(testag) icango = 1 nchks = 1 nreset = 2 mychkn(1) = itag(stptag) c CALL name(jobtag,' test') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,test,itag(testag),itag(strtag) & ,itag(stptag)) c jobtag = itag(stptag) icango = 1 nchks = 0 nreset = 0 c CALL name(jobtag,'stopit') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,stopit,itag(stptag)) c return end c subroutine init(initag,mdim,u) implicit real*8 (a-h,o-z) real*8 u(mdim,*) integer initag,mdim,m,n common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff niter = 0 dx = xmax/(m+1) dy = ymax/(n+1) do 200 i = 1,m+2 x = (i-1)*dx u(i,1) = (x/xmax)**2 u(i,n+2) = 0.5*(1+(x/xmax)**2) 200 continue do 300 j = 2,n+1 y = (j-1)*dy u(1,j)= 0.5*(y/ymax)**3 u(m+2,j) = 1.0 300 continue do 100 j = 2,n+1 do 100 i = 2,m+1 u(i,j) = ((n+2-j)*u(i,1)+(j-1)*u(i,n+2))/(n+1) 100 continue mstep=m/10 nstep=n/10 return end subroutine startt(strtag,mdim,u,v) implicit real*8 (a-h,o-z) integer strtag,mdim,m,n real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Code: saves current node values and is restarting point for iterations. niter = niter + 1 do 100 j = 1,n+2 do 100 i = 1,m+2 v(i,j) = u(i,j) 100 continue return end subroutine jacobi(statag,mdim,ib,jb,u,v) implicit real*8 (a-h,o-z) integer statag,mdim,m,n,mb,nb,ib,jb real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff r = dy/dx do 100 js = 1,nbpts j = js + 1 + nbpts*(jb-1) y = (j-1)*dy do 100 is = 1,mbpts i = is + 1 + mbpts*(ib-1) x = (i-1)*dx a = 1.d0/dsqrt(1.d0+x**2+y**2) b = dexp(-x**2 -y**2) den = 2*(r**2*a + b) u(i,j) = (r**2*a*(v(i+1,j)+v(i-1,j)) & + b*(v(i,j+1)+v(i,j-1)))/den 100 continue mstep=m/10 nstep=n/10 Code: computes Block Jacobi updates for block (ib,jb) return end subroutine convrg(cnvtag,mdim,u,v) implicit real*8 (a-h,o-z) integer cnvtag,mdim,m,n,idone real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Code: computes the Cauchy Convergence crierion in the inf-norm and cont: passes the flag idone as 0 for reset and 1 for stop uvdiff = 0 do 100 j = 2,n+1 do 100 i = 2,m+1 dumax = abs(u(i,j)-v(i,j)) if(dumax.gt.uvdiff) uvdiff = dumax 100 continue if(uvdiff.lt.tol.or.niter.ge.mxiter) then idone = 1 else idone = 0 endif c return end subroutine test(testag,strtag,stptag) implicit real*8 (a-h,o-z) integer testag,strtag,stptag,reset common /comint/ m,n,mb,nb,mbpkts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff CTERM INTEGER WRLOCK CTERM COMMON /CONWRT/ WRLOCK if(idone.eq.0) then Comment: If iteration is unfinished, Reset SCHEDULE sub RSCHED is called Cont: and my check in is changed to iteration start tag strtag. Comment: In this example, the iteration set integer is 2. kreset = 2 CALL RSCHED(testag,strtag,kreset) else Comment: Else, reset my check in to the iteration stop tag stptag. kreset = 0 CALL RSCHED(testag,stptag,kreset) endif return end subroutine stopit(stptag) integer stptag continue return end SHAR_EOF if test -f 'd_and_c.f' then echo shar: over-writing existing file "'d_and_c.f'" fi cat << \SHAR_EOF > 'd_and_c.f' integer a(256),klevl(8),myid(256) external top write(6,*) ' input nprocs nlevls ' read (5,*) nprocs,nlevls do 50 j = 1,8 klevl(j) = j 50 continue call sched(nprocs,top,a,nlevls,klevl,myid) do 100 j = 1,2**nlevls-1 write(6,*) a(j) 100 continue stop end subroutine top(a,nlevls,klevl,myid) integer a(*),klevl(*),myid(*) character*6 subnam external split c write(6,*) ' from top ' , a(1) call gettag(jobtag) icango = 0 nchks = 0 myid(1) = jobtag a(1) = 1 c subnam = 'split' call name(jobtag,subnam) call dep(jobtag,icango,nchks,mychkn) call putq(jobtag,split,myid,a,nlevls,klevl) c return end subroutine split(myid,a,nlevls,klevl) integer a(*),klevl(*),myid(*),rnode external clone character*6 subnam c write(6,*) ' from split ',a(1) c if (klevl(1) .ge. nlevls) return c lnode = 2*a(1) rnode = lnode + 1 indx = lnode - a(1) + 1 a(indx) = lnode a(indx+1) = rnode mytag = myid(a(1)) c call gettag(jobtag) subnam = 'split' call name(jobtag,subnam) call nxtag(jobtag,mytag) myid(lnode) = jobtag call spawn(jobtag,mytag,clone,myid,a(indx),nlevls,klevl(2)) c call gettag(jobtag) subnam = 'split' call name(jobtag,subnam) call nxtag(jobtag,mytag) myid(rnode) = jobtag call spawn(jobtag,mytag,clone,myid,a(indx+1),nlevls,klevl(2)) c return end SHAR_EOF if test -f 'data' then echo shar: over-writing existing file "'data'" fi cat << \SHAR_EOF > 'data' 1 3 SHAR_EOF if test -f 'data.blkjac' then echo shar: over-writing existing file "'data.blkjac'" fi cat << \SHAR_EOF > 'data.blkjac' 1 100 100 10 10 100 2 SHAR_EOF if test -f 'data.ts_dynamic' then echo shar: over-writing existing file "'data.ts_dynamic'" fi cat << \SHAR_EOF > 'data.ts_dynamic' 1 43 1000 SHAR_EOF if test -f 'example.f' then echo shar: over-writing existing file "'example.f'" fi cat << \SHAR_EOF > 'example.f' program main integer n, k c external parprd c real a(1000), b(1000), temp(50), sigma write (6,*) 'Input number of processors' read (5,*) nprocs n = 1000 k = 20 c do 100 j = 1, n a(j) = j b(j) = 1 100 continue c call sched(nprocs, parprd, n, k, a, b, temp, sigma) c write (6,*) ' sigma = ', sigma stop end c subroutine parprd(n, k, a, b, temp, sigma) c integer n, k real a(*), b(*), temp(*), sigma c integer m1, m2, index, j, jobtag, icango, ncheks, mychkn(2) integer itags(500) c character*6 subnam external inprod, addup save m1, m2 c do 150 j = 1, k + 1 call gettag(jobtag) itags(j) = jobtag 150 continue c m1 = n/k index = 1 do 200 j = 1, k - 1 jobtag = itags(j) icango = 0 ncheks = 1 mychkn(1) = itags(k + 1) subnam = 'inprod' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, inprod, m1, a(index), b(index), temp(j)) index = index + m1 200 continue c m2 = n - index + 1 jobtag = itags(k) icango = 0 ncheks = 1 mychkn(1) = itags(k + 1) subnam = 'inprod' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, inprod, m2, a(index), b(index), temp(k)) index = index + m1 c jobtag = itags(k + 1) icango = k ncheks = 0 subnam = 'addup' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, addup, k, sigma, temp) c return end c c subroutine inprod(m, a, b, sigma) integer m real a(*), b(*), sigma sigma = 0.0 do 100 j = 1, m sigma = sigma + a(j)*b(j) 100 continue return end c c subroutine addup(k, sigma, temp) integer k real sigma, temp(*) sigma = 0.0 do 100 j = 1, k sigma = sigma + temp(j) 100 continue return end SHAR_EOF if test -f 'ftsubs.f' then echo shar: over-writing existing file "'ftsubs.f'" fi cat << \SHAR_EOF > 'ftsubs.f' subroutine chekin(jobtag) Comment: integrated iteration version of ftsubs.f and ftsubs.iter.f cont: with option to iterate a set of nodes with reset dependencies. Comment: combined graphics and terminal trace version of ftsubt.f Code parent: alliant:/afs1/hanson/dirsched/ftsubs.f change(1): iprcs = 200 <- 120; change(2): automatic return stmt removed out of loop do 20 in chekin; change(3): installed vector-circular ready queue, cont: vector <= nproc sub-qs, elastic with nproc processors; cont: circular <= readyq free space wraps around from rtail to rhead, cont: with the top end of readyq connected to the bottom end; cont: ready(rhead(id)+ndmrsq*(id-1)) <- readyq(rhead(id),id); cont: most mxces replaced by nprocc = nproc = no. sub-qs; cont: ldimrq = leading dim of readyq = iprcs*mxces cont: ndmrsq = dim of a ready-sub-q = ldimrq/nproc cont: idrsq = id of ready-sub-q <- iwrkr; dummy iw used in do's; cont: installed SCHED ERROR flags for readyq over-runs (mtail cond.); cont: round robin test in getprb reduced to single statement. Change: corrected next in nxtag & intspn in start2 to recover lost tag. CAUTION: nxtag and spawn arguments are consistent with dep and putq cont: now, but order of arguments may not be consistent with older cont: versions of ftsubs.f. Change(4): installed circular parm queue, jobtag is the circular cont: (reusable) job tag with 1.le.jobtag.le.mxprcs, cont: snext is the schedule or sum or cumulative jobtag. Change(5): install super next tag, whereby user gets schedule job tags cont: from new schedule sub gettag; hence schedule has no knowledge cont: of user tags and consequently the principal restriction on user cont: is that there be less than "mxprcs" undone jobs at any time. cont: integer array "unitag" keeps a unique job tag for undone jobs. Change(6): install rest and save arrays for jobtags that will be cont: iterated more than once with original dependencies: ireset, cont: icnsav. install sub rsched to reset icangoes cont: and call sub place on iteration. Change(6a): nslots = 105 <- 30 to handle multiple dependencies. Change(7): installed common block CONWRT with key WRLOCK for concurrent cont: writes for use in both ftsubs.f and the user's driver code. Change(8): installed c-include indx*.h files to enable the passing of cont: up to 60 parameters with sched, putq and spawn calls (via m. cont: johnson, ssi). Change(9): installed lock initializations in libopn to make porting cont: to other machines without automatic variable initialization. CAUTION: subroutine second uses machine dependent timer, which must be cont: changed when porting to other machines. cgraphChange: install write nproc in sub libopn. cgraphChange: installed extra traces in chekin & place. cgraphChange: replaced qlock(mxprcs) by glock as igraph's own lock. cgraphChange: installed process names for Dongarra/Brewer's sched.trace. cgraphChange(8): cgraph lines made compatible for SCHED.TRACE/sched.trace. cgraphcdirectory: /usr/alcaid/brewer/SCHED.TRACE/sched.trace cgraphcomment: for graphics trace, change 'cgraph' to null '' and run. ctermComment: for terminal trace, change 'cterm' to null '' and run. Change(9): Conversion of ftsubs.f to run on Sun cont: Change parameter "mxces" from 8 to 1. cont: Change parameter "iprcs" from 200 to 1000. cont: Add routines lckasn, lockon, lockoff, & nops. CVD$R NOCONCUR integer jobtag c*********************************************************************** c c this subroutine reports unit of computation labeled by c jobtag has completed to all dependent nodes. these nodes are c recorded in parmq(i,jobtag) where 6 .le. i .le. nchks+5 c checkin consists of decrementing the value in each of these c locations by 1. each of these is done in a critical section c protected by qlock(jobtag) c c if the value in parmq(2,jobtag) is 0 where jobtag is a process c dependent upon this one then jobtag is placed on the readyq c by entering the critical section protected by trlock. the c pointer rtail to the tail of the readyq is incremented c unless task done is to be recorded. task done is placed on c the ready q and the pointer rtail left in place if nchks .eq. 0 c is found. c c see the common block description in libopn for more detail. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 1000,mxces = 1,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) Caution: common block CONWRT is used for concurrent writes, cont: with WRLOCK as the key to the LOCK. INTEGER WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c a complete common block description is given in the routine libopn c c**************************************************************************** c c check to see if this process has completed. if not do not checkin c mtail = 0 idrsq = 0 c c first ask if any kids spawned by jobtag c if (parmq(4,jobtag) .ne. 0 .or. parmq(5,jobtag) .ne. 0 ) then c c either kids have been spawned or ientry has been referenced c indicating reentry is required c c c find out how many are waiting to complete c if (parmq(4,jobtag) .ne. 0) then call lockon(qlock(jobtag)) parmq(2,jobtag) = parmq(2,jobtag) - parmq(4,jobtag) call lockoff(qlock(jobtag)) endif c c reset number of kids c parmq(4,jobtag) = 0 c c update the number of times this procedure has been c entered c parmq(1,jobtag) = parmq(1,jobtag) + 1 c c return without checkin if all the kids have not c checked in to jobtag yet or if a reentry is required. c process jobtag is not done in either case. c comment: extra trace data. if (parmq(2,jobtag) .ne. 0) then cgraph call lockon(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call lockoff(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 7 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph else cgraphc trace for chekin/parent (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 6 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph endif return endif c c if ientry has been called but jobtag is not waiting c on any kids then jobtag is placed back on the readyq c if ( parmq(5,jobtag) .ne. 0) then idrsq = mod((jobtag-1),nprocc) + 1 call lockon(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call lockoff(trlock(idrsq)) cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 10 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(8,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 9 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(8,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif return endif endif c c the process has completed so chekin proceeds c cgraph call lockon(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call lockoff(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 5 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph else cgraphc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 2 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph endif c cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 5 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 2 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c c if (mtail .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nchks = parmq(3,jobtag) c c if this is the final process (indicated by nchks .eq. 0) then c record task done. do not advance the tail so task done sequence c is set. all subsequent gtprb queries will leave rhead .eq. rtail c with readyq(rhead+ndmrsq*(i-1)) .eq. done. c if (nchks .eq. 0) then do 20 iw = 1,nprocc call lockon(trlock(iw)) readyq(rtail(iw)+ndmrsq*(iw-1)) = done call lockoff(trlock(iw)) 20 continue cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 12 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 11 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif Change(2): removed following return from end of above loop do 20. return endif do 50 j = 6,nchks+5 mychek = parmq(j,jobtag) c c get unique access to the checkin node mychek c and checkin by decrementing the appropriate counter c mchkgo = 1 call lockon(qlock(mychek)) parmq(2,mychek) = parmq(2,mychek) - 1 mchkgo = parmq(2,mychek) call lockoff(qlock(mychek)) c c place mychek on readyq if parmq(2,mychek) is 0 c if (mchkgo .eq. 0 ) then idrsq = mod((mychek-1),nprocc) + 1 call lockon(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = mychek rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call lockoff(trlock(idrsq)) endif 50 continue c c place finished process at the end of the free list freeq c provided it will not be reset for another iteration. c if(ireset(jobtag).eq.0) then call lockon(tflock) ftail = mod(ftail,mxprcs) + 1 if(fhead.eq. ftail) free = 0 freeq(ftail) = jobtag call lockoff(tflock) endif c cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 8 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 7 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c if ( free .eq. 0 ) then call lockon(WRLOCK) inext = unitag(jobtag) write(6,*) '*************SCHED ERROR*************************' write(6,*) ' more processes have checked into sub chekin,' write(6,*) ' than should be active for free slots in the' write(6,*) ' parmq parameter queue. jobs are too many.' write(6,*) ' total number of jobtags were:',inext write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) call lockoff(WRLOCK) stop c endif c return c c last card of chekin c end subroutine gettag(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c this subroutine gets a schedule jobtag for problem on the queue, c provided a free column is available in parmq. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 1000,mxces = 1,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c if ( free .eq. 0 ) then call lockon(WRLOCK) write(6,*) '*************SCHED LIMIT ERROR*******************' write(6,*) ' user attempt to create to many active ' write(6,*) ' processes ; total number of jobs =',snext write(6,*) ' too many unfinished jobs while in gettag ' write(6,*) ' and no free slots on the parameter queue ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) call lockoff(WRLOCK) stop c endif c c get tag for process on the next free column in the problem queue c call lockon(hflock) jobtag = freeq(fhead) snext = snext + 1 if(fhead.eq. ftail) free = 0 fhead = mod(fhead,mxprcs) + 1 if(jobtag.ge.1.and.jobtag.le.mxprcs) unitag(jobtag) = snext call lockoff(hflock) cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) ctermc trace for gettag cterm igraph(1,insrt) = 15 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = jobtag cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail c if ( jobtag .le. 0 .or. jobtag .gt. mxprcs ) then write(6,*) '*************SCHED ERROR***********************' write(6,*) ' illegal jobtag for parmq column;' write(6,*) ' need 1 .le. jobtag .le. ',mxprcs,';' write(6,*) ' current jobtag =',jobtag,' in gettag' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c return c c last card of gettag c end subroutine rsched(jobtag,settag,kreset) CVD$R NOCONCUR integer jobtag,settag,kreset c************************************************************************* c comment: usage c subroutine paralg() c integer strtag,stptag,itag(*) c external start,test c . c . c call gettag(strtag) c itag(strtag) = strtag c . c . c call gettag(stptag) c itag(stptag) = stptag c . c . comment: start iteration or time stepping c jobtag = strtag c icango = 1 c nchks = ... c nreset = c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,start,itag(strtag)) c . c . comment: test and continue iteration at start if undone c jobtag = testag c icango = ... c nreset = c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,test,itag(strtag),itag(stptag)) c . c . c subroutine test(jobtag,strtag,stptag) c common /