#!/bin/sh # This is a shell archive, meaning: # 1. Remove everything above the #!/bin/sh line. # 2. Save the resulting text in a file. # 3. Execute the file with /bin/sh (not csh) to create the files: # README # blkjac.f # cputm.c # d_and_c.f # data # data.blkjac # data.ts_dynamic # example.f # ftsubs.f # ftsubs.graph.f # indx0.h # indxj.h # maindp.f # make1 # makefile # maxparms.h # newevdp0.f # oldtest # putq.c # second.f # stateig.f # statses.f # stseswait.f # stuffspawn.f # testrun # ts_dynamic.f # This archive created: Wed Aug 30 14:34:47 1989 export PATH; PATH=/bin:$PATH if test -f 'README' then echo shar: over-writing existing file "'README'" fi cat << \SHAR_EOF > 'README' This package contains an enhanced version of the SCHEDULE Parallel Programming Package. By way of reusable or recycling queues, applications with the new ftsubs.f are now limited to 1000 active job processes, rather than a 1000 cumulative job processes (see the new version of the demonstration program ts_dynamic.f that illustrates the use of the new SCHEDULE subroutine GETTAG and see the description below; i.e. jobtags are now assigned by SCHEDULE instead of the user). This new version of ftsubs.f now also permits iterations of static dependency graphs (see the example blkjac.f that illustrates the use of the new SCHEDULE subroutines RESET and RSCHED). The new putq.c requires "include" files: maxparms.h, indx0.h and indxj.h (these may be easily changed from 20 to 60 parameters to allow an increased number of parameters in the calls to sched, putq and spawn; CAUTION: there is usually more overhead in subroutine argument passing than common argument passing, so rely more on common unless the job is big enough to underwrite the extra overhead or subroutine arguments can not be avoided). This revised version of SCHEDULE is currently only available on the Alliant, Balance, and Symmetry. ******************************************************************************* READ THIS PART!!! CAUTION: The first two arguments of NXTAG and SPAWN are now reversed from older versions to make them consistent with the static dependency subroutines DEP and PUTQ. ******************************************************************************* Caution: Calls to SCHED, PUTQ and SPAWN, should include at least one parameter in the argument list. To build the libraries, type: "make sched" or "make graph" make sched will produce the schedule library in file sched.a . This is the standard libarary for schedule. All examples begining with 'x' use this library. make graph will produce the trace version of schedule in file graph.a . This version produces output which can be viewed using the Schedule Trace Facility. All examples begining with 'g' use this library. To run the examples type: "make xtest; xtest" or "make gtest; gtest" This will make and run a test program which solves a triangular system of equations using the spawning capabilities of schedule. "make xdandc; xdandc" or "make gdandc; gdandc" This will make and run a test program which implements a recursive divide and conquer technique using the spawning capabilities of schedule. The output from gdandc will not work with the current trace facility because spawned processes also spawn other processes and the trace facility does not yet support more than one level of spawning. "make xeig; xeig" or "make geig; geig" This will make and run a test program which computes the eigenvalues of a symmetric tridiagonal eigenvalue problem using a divide and conquer technique. "make xwait; xwait" or "make gwait; gwait" This will make and run a test program which solves the same problem as above except that it uses the schedule wait command. "make xexample; xexample" or "make gexample; gexample" This will make and run a test program which computes the inner product of two vectors. "make xts_tsdynamic; xts_dynamic" or "make gts_tsdynamic; gts_dynamic" This will make and run an example program illustrating use of the new SCHEDULE package with the triangular stuffer demonstration program; note especially the new subroutine GETTAG that gets a SCHEDULE generated job for each of the user's processes; note that the arguments of NXTAG and SPAWN have been reordered to be more like that of DEP and PUTQ. The input to ts_dynamic the form: [n_processors] [n_array_size] [n_work_iterations]. There is a sample input file in data.ts_dynamic. "make xblkjac; xblkjac" or "make gblkjac; gblkjac" This will make and run asample FORTRAN static iteration driver for ftsubs.f with block Jacobi iteration of a variable coefficient EPDE; The input has the form: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] [max_iterations] [n_result_precision]. There is a sample input file in data.blkjac. Up to 10 X 10 blocks are permitted. The new SCHEDULE subroutine RSET marks processes that will take part in an iteration. Another new SCHEDULE subroutine RSCHED restores only those parameters, such as ICANGO, that have changed; NSLOTS have been increased to 105 to permit at most 10 X 10 block iterations. A complete make and test can be carried out by doing: testrun >& testout diff oldtest testout SHAR_EOF if test -f 'blkjac.f' then echo shar: over-writing existing file "'blkjac.f'" fi cat << \SHAR_EOF > 'blkjac.f' $STDUNIT program blkjac c Code Name: blkjac.f (STATIC VERSION) Code Data Input: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] Continued Input (single line assumed): [max_iterations] [n_result_precision] Change: Block Jacobi test for SCHEDULE rsched to restore STATIC & DYNAMIC cont: dependency graph for the next iteration. implicit real*8 (a-h,o-z) parameter(mdim=102, ndim=102, maxblk=11, maxprc=8) parameter(xmax1=100.d0,ymax1=100.d0,tol1=0.5d-4) parameter(nmyck=10*maxblk) parameter(mxprcs = 1000) integer itmp(maxblk),jtmp(maxblk),mychkn(nmyck),statag(mdim,ndim) integer itag(mxprcs) real t1,t2,t3,tt,second,foo real*8 u(mdim,ndim),v(mdim,ndim) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Caution: common block CONWRT is used in SCHEDULE for concurrent prints. COMMON /CONWRT/ WRLOCK EXTERNAL PARALG read(5,*) nprocs,m,n,mb,nb,mxiter,nprec write(6,6666) nprocs,m,n,mb,nb,mxiter,nprec 6666 format(' Static Block Jacobi Input:' & /3x,'nprocs =',i3,'; (m,n) = (',i4,',',i4 & ,'); (mblks,nblks) = (',i3,',',i3 & ,')'/3x,'; max iterations =',i6,'; nprec =',i3) mbpts = m/mb nbpts = n/nb tol = 0.5d0/10**nprec write(6,6667) mdim,ndim,maxblk,maxprc,nmyck,mbpts,nbpts & ,xmax1,ymax1,nprec,tol 6667 format(' Parameter input:'/3x,'(mdim,ndim) = (',i4,',',i4 & ,'); maxblkdim = ',i4 & /3x,'; maxprc =',i3,'; nmychk =',i6,'; (mbpts,nbpts) = (' & ,i4,',',i4,')' & /3x,' (xmax,ymax) = (',f7.2,',',f7.2,'); nprec =',i2 & ,'( tol =',d11.4,')') c if(m.gt.(mdim-2).or.n.gt.(ndim-2).or.mb.gt.maxblk.or.nb.gt.maxblk & .or.nprocs.gt.maxprc.or.nprocs.lt.1.or.mb*nb.gt.100) & then write(6,6668) mdim,ndim,maxblk,m,n,mb,nb,nprocs,mxiter 6668 format(' Improper inputs with limits exceeded; input was:' & ' mdim =',i5,'; ndim =',i5,'; maxblk =',i5 & /' m =',i5,'; n =',i5,';mb =',i5,';nb =',i5 & /' nprocs =',i5,'; max iterations =',i5) write(6,*) 'S T O P E X E C U T I O N I N M A I N' stop endif c xmax = xmax1 ymax = ymax1 c do 10 ib = 1,mb 10 itmp(ib) = ib do 11 jb = 1,nb 11 jtmp(jb) = jb c c: remove second.f timer t1 = second(foo) t2 = second(foo) c CALL SCHED(nprocs,paralg,mdim,ndim,itmp,jtmp,itag,statag,mychkn & ,u,v) c t3 = second(foo) tt = t3-t2-(t2-t1) c c output c amstep = amax1(m/10.,1.) anstep = amax1(n/10.,1.) mtop = min0(m+1,11) ntop = min0(n+1,11) do 1002 i = 1,mtop 1002 itmp(i) = 1 + (i-1)*amstep + .5 do 1003 j = 1,ntop 1003 jtmp(j) = 1 + (j-1)*anstep + .5 write(6,1001) (itmp(i),i = 1,mtop),m+2 j = n+2 write(6,1000) j,(u(itmp(i),n+2),i = 1,mtop),u(m+2,n+2) do 100 k = 1,ntop j = jtmp(ntop+1-k) write(6,1000) j,(u(itmp(i),j),i = 1,mtop),u(m+2,j) 100 continue 1000 format(i4,2x,12f5.2) 1001 format(' Static Block Jacobi -' & ,' Iteration SCHEDULE Final Results:' & /3x,'j/i',12i5) c mxjobs = 1+niter*(1+mb*nb+2)+1 if(nprocs.eq.1) write(6,664) mb,nb,niter,mxiter 664 format(' # Static Block Jacobi Schedule rsched & gettag & name ' & ,'program' & /' # ftsubs.f for iterated circular readyq & parmq & ' & ,'freeq version' & /' # with mblks, nblks =',2i5'; niter =',i8,'; mxiter=',i8) write(6,665) nprocs,tt 665 format(' #',i2,f12.5) write(*,666) m,n,mb,nb,niter,nprocs,mxjobs,tt,uvdiff 666 format(9x,'code',7x,'m',7x,'n',3x,'mblks',3x,'nblks',3x,'niter' & ,2x,'nprocs',1x,'maxjobs' & /1x,'BLOCK-JACOBI',7i8 & /1x,'STATIC VERSION',5x,'seconds =',f12.5,'; uvdiff =',d12.5) if(uvdiff.ge.tol) write(6,667) niter,mxiter,uvdiff,tol 667 format(3x,'Iteration UNSUCCESSFUL: niter =',i6, ' & mxiter =' & ,i6/5x,'while uvdiff =',d12.5,' .GE. tol = ',d12.5) c stop end c subroutine paralg(mdim,ndim,itmp,jtmp,itag,statag,mychkn & ,u,v) parameter(mxprcs = 1000) implicit real*8 (a-h,o-z) integer m,n,mb,nb,statag(mdim,*),mychkn(*),itmp(*),jtmp(*) integer itag(mxprcs) integer jobtag,initag,strtag,cnvtag,testag,stptag real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff EXTERNAL INIT,STARTT,JACOBI,CONVRG,TEST,STOPIT c c this is the parallel driver for the iterated dependency graph c c first, get all static job tags necessary to construct the c dependency graph. c CAUTION: At this point, execution is in parallel because sub paralg CAUTION: and its args are only passed to SCHEDULE by sub sched and is CAUTION: executed in concurrent mode by a copy of the sub work. CAUTION: It is essential that all subsequent sub args must be global CAUTION: variables, such as itag, statag, itmp and jtmp, else values CAUTION: passed will not be protected from concurrent overwrite. c CALL GETTAG(initag) itag(initag) = initag CALL GETTAG(strtag) itag(strtag) = strtag do 100 jb = 1,nb do 100 ib = 1,mb Caution: statag(ib,jb) gets the static job tag cont: for the block (ib,jb). CALL GETTAG(statag(ib,jb)) 100 continue CALL GETTAG(cnvtag) itag(cnvtag) = cnvtag CALL GETTAG(testag) itag(testag) = testag CALL GETTAG(stptag) itag(stptag) = stptag c jobtag = itag(initag) icango = 0 nchks = 1 nreset = 0 mychkn(1) = itag(strtag) c CAUTION: PUTQ does not call INIT, but only passes its name and args to CONT: SCHEDULE. c CALL name(jobtag,' init') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,init,itag(initag),mdim,u) c jobtag = itag(strtag) icango = 1 nchks = mb*nb Comment: Here nreset = 2 is used as the iteration set number, cont: but it may be any nonzero integer. nreset = 2 do 201 jb = 1,nb do 201 ib = 1,mb mychkn(ib+mb*(jb-1)) = statag(ib,jb) 201 continue c CALL name(jobtag,'startt') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,startt,itag(strtag),mdim,u,v) c do 301 jb = 1,nb do 301 ib = 1,mb jobtag = statag(ib,jb) icango = 1 nchks = 1 nreset = 2 mychkn(1) = itag(cnvtag) c CALL name(jobtag,'jacobi') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CAUTION: Make certain that global variables like itmp are passed as cont: arguments of subroutines that are passed to Schedule. CALL PUTQ(jobtag,jacobi,statag(ib,jb),mdim & ,itmp(ib),jtmp(jb),u,v) 301 continue c jobtag = itag(cnvtag) icango = mb*nb nchks = 1 nreset = 2 mychkn(1) = itag(testag) c CALL name(jobtag,'convrg') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,convrg,itag(cnvtag),mdim,u,v) c jobtag = itag(testag) icango = 1 nchks = 1 nreset = 2 mychkn(1) = itag(stptag) c CALL name(jobtag,' test') CALL DEP(jobtag,icango,nchks,mychkn) CALL RESET(jobtag,nreset) CALL PUTQ(jobtag,test,itag(testag),itag(strtag) & ,itag(stptag)) c jobtag = itag(stptag) icango = 1 nchks = 0 nreset = 0 c CALL name(jobtag,'stopit') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,stopit,itag(stptag)) c return end c subroutine init(initag,mdim,u) implicit real*8 (a-h,o-z) real*8 u(mdim,*) integer initag,mdim,m,n common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff niter = 0 dx = xmax/(m+1) dy = ymax/(n+1) do 200 i = 1,m+2 x = (i-1)*dx u(i,1) = (x/xmax)**2 u(i,n+2) = 0.5*(1+(x/xmax)**2) 200 continue do 300 j = 2,n+1 y = (j-1)*dy u(1,j)= 0.5*(y/ymax)**3 u(m+2,j) = 1.0 300 continue do 100 j = 2,n+1 do 100 i = 2,m+1 u(i,j) = ((n+2-j)*u(i,1)+(j-1)*u(i,n+2))/(n+1) 100 continue mstep=m/10 nstep=n/10 return end subroutine startt(strtag,mdim,u,v) implicit real*8 (a-h,o-z) integer strtag,mdim,m,n real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Code: saves current node values and is restarting point for iterations. niter = niter + 1 do 100 j = 1,n+2 do 100 i = 1,m+2 v(i,j) = u(i,j) 100 continue return end subroutine jacobi(statag,mdim,ib,jb,u,v) implicit real*8 (a-h,o-z) integer statag,mdim,m,n,mb,nb,ib,jb real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff r = dy/dx do 100 js = 1,nbpts j = js + 1 + nbpts*(jb-1) y = (j-1)*dy do 100 is = 1,mbpts i = is + 1 + mbpts*(ib-1) x = (i-1)*dx a = 1.d0/dsqrt(1.d0+x**2+y**2) b = dexp(-x**2 -y**2) den = 2*(r**2*a + b) u(i,j) = (r**2*a*(v(i+1,j)+v(i-1,j)) & + b*(v(i,j+1)+v(i,j-1)))/den 100 continue mstep=m/10 nstep=n/10 Code: computes Block Jacobi updates for block (ib,jb) return end subroutine convrg(cnvtag,mdim,u,v) implicit real*8 (a-h,o-z) integer cnvtag,mdim,m,n,idone real*8 u(mdim,*),v(mdim,*) common /comint/ m,n,mb,nb,mbpts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff Code: computes the Cauchy Convergence crierion in the inf-norm and cont: passes the flag idone as 0 for reset and 1 for stop uvdiff = 0 do 100 j = 2,n+1 do 100 i = 2,m+1 dumax = abs(u(i,j)-v(i,j)) if(dumax.gt.uvdiff) uvdiff = dumax 100 continue if(uvdiff.lt.tol.or.niter.ge.mxiter) then idone = 1 else idone = 0 endif c return end subroutine test(testag,strtag,stptag) implicit real*8 (a-h,o-z) integer testag,strtag,stptag,reset common /comint/ m,n,mb,nb,mbpkts,nbpts,niter,mxiter,idone common /comdat/ xmax,ymax,dx,dy,tol,uvdiff CTERM INTEGER WRLOCK CTERM COMMON /CONWRT/ WRLOCK if(idone.eq.0) then Comment: If iteration is unfinished, Reset SCHEDULE sub RSCHED is called Cont: and my check in is changed to iteration start tag strtag. Comment: In this example, the iteration set integer is 2. kreset = 2 CALL RSCHED(testag,strtag,kreset) else Comment: Else, reset my check in to the iteration stop tag stptag. kreset = 0 CALL RSCHED(testag,stptag,kreset) endif return end subroutine stopit(stptag) integer stptag continue return end SHAR_EOF if test -f 'cputm.c' then echo shar: over-writing existing file "'cputm.c'" fi cat << \SHAR_EOF > 'cputm.c' #include <sys/time.h> #include <sys/resource.h> #include <stdio.h> long cputm_() { long seconds, microseconds, milliseconds; struct rusage buffer; getrusage(RUSAGE_SELF, &buffer); seconds = buffer.ru_utime.tv_sec; microseconds = buffer.ru_utime.tv_usec; milliseconds = 1000*seconds + microseconds/1000; return(milliseconds); } SHAR_EOF if test -f 'd_and_c.f' then echo shar: over-writing existing file "'d_and_c.f'" fi cat << \SHAR_EOF > 'd_and_c.f' $STDUNIT integer a(256),klevl(8),myid(256) external top write(6,*) ' input nprocs nlevls ' read (5,*) nprocs,nlevls do 50 j = 1,8 klevl(j) = j 50 continue call sched(nprocs,top,a,nlevls,klevl,myid) do 100 j = 1,2**nlevls-1 write(6,*) a(j) 100 continue stop end subroutine top(a,nlevls,klevl,myid) integer a(*),klevl(*),myid(*) character*6 subnam external split c write(6,*) ' from top ' , a(1) call gettag(jobtag) icango = 0 nchks = 0 myid(1) = jobtag a(1) = 1 c subnam = 'split' call name(jobtag,subnam) call dep(jobtag,icango,nchks,mychkn) call putq(jobtag,split,myid,a,nlevls,klevl) c return end subroutine split(myid,a,nlevls,klevl) integer a(*),klevl(*),myid(*),rnode external clone character*6 subnam c write(6,*) ' from split ',a(1) c if (klevl(1) .ge. nlevls) return c lnode = 2*a(1) rnode = lnode + 1 indx = lnode - a(1) + 1 a(indx) = lnode a(indx+1) = rnode mytag = myid(a(1)) c call gettag(jobtag) subnam = 'split' call name(jobtag,subnam) call nxtag(jobtag,mytag) myid(lnode) = jobtag call spawn(jobtag,mytag,clone,myid,a(indx),nlevls,klevl(2)) c call gettag(jobtag) subnam = 'split' call name(jobtag,subnam) call nxtag(jobtag,mytag) myid(rnode) = jobtag call spawn(jobtag,mytag,clone,myid,a(indx+1),nlevls,klevl(2)) c return end SHAR_EOF if test -f 'data' then echo shar: over-writing existing file "'data'" fi cat << \SHAR_EOF > 'data' 12 4 SHAR_EOF if test -f 'data.blkjac' then echo shar: over-writing existing file "'data.blkjac'" fi cat << \SHAR_EOF > 'data.blkjac' 8 100 100 10 10 100 2 SHAR_EOF if test -f 'data.ts_dynamic' then echo shar: over-writing existing file "'data.ts_dynamic'" fi cat << \SHAR_EOF > 'data.ts_dynamic' 8 43 1000 SHAR_EOF if test -f 'example.f' then echo shar: over-writing existing file "'example.f'" fi cat << \SHAR_EOF > 'example.f' $STDUNIT program main integer n, k c external parprd c real a(1000), b(1000), temp(50), sigma write (6,*) 'Input number of processors' read (5,*) nprocs n = 1000 k = 20 c do 100 j = 1, n a(j) = j b(j) = 1 100 continue c call sched(nprocs, parprd, n, k, a, b, temp, sigma) c write (6,*) ' sigma = ', sigma stop end c subroutine parprd(n, k, a, b, temp, sigma) c integer n, k real a(*), b(*), temp(*), sigma c integer m1, m2, index, j, jobtag, icango, ncheks, mychkn(2) integer itags(500) c character*6 subnam external inprod, addup save m1, m2 c do 150 j = 1, k + 1 call gettag(jobtag) itags(j) = jobtag 150 continue c m1 = n/k index = 1 do 200 j = 1, k - 1 jobtag = itags(j) icango = 0 ncheks = 1 mychkn(1) = itags(k + 1) subnam = 'inprod' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, inprod, m1, a(index), b(index), temp(j)) index = index + m1 200 continue c m2 = n - index + 1 jobtag = itags(k) icango = 0 ncheks = 1 mychkn(1) = itags(k + 1) subnam = 'inprod' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, inprod, m2, a(index), b(index), temp(k)) index = index + m1 c jobtag = itags(k + 1) icango = k ncheks = 0 subnam = 'addup' call name(jobtag,subnam) call dep(jobtag, icango, ncheks, mychkn) call putq(jobtag, addup, k, sigma, temp) c return end c c subroutine inprod(m, a, b, sigma) integer m real a(*), b(*), sigma sigma = 0.0 do 100 j = 1, m sigma = sigma + a(j)*b(j) 100 continue return end c c subroutine addup(k, sigma, temp) integer k real sigma, temp(*) sigma = 0.0 do 100 j = 1, k sigma = sigma + temp(j) 100 continue return end SHAR_EOF if test -f 'ftsubs.f' then echo shar: over-writing existing file "'ftsubs.f'" fi cat << \SHAR_EOF > 'ftsubs.f' $STDUNIT $ALIGNWARN CVD$G NOINLINE (DUMP,DUMP2,LOCKON,LOCKOFF,NOPS,SECOND,WORK) subroutine chekin(jobtag) Code path: balance:/bfs2/brewer/SCHED/HANSON/ftsubs.f Comment: integrated iteration version of ftsubs.f and ftsubs.iter.f cont: with option to iterate a set of nodes with reset dependencies. Comment: combined graphics and terminal trace version of ftsubt.f Code parent: alliant:/afs1/hanson/dirsched/ftsubs.f change(1): iprcs = 200 <- 120; change(2): automatic return stmt removed out of loop do 20 in chekin; change(3): installed vector-circular ready queue, cont: vector <= nproc sub-qs, elastic with nproc processors; cont: circular <= readyq free space wraps around from rtail to rhead, cont: with the top end of readyq connected to the bottom end; cont: ready(rhead(id)+ndmrsq*(id-1)) <- readyq(rhead(id),id); cont: most mxces replaced by nprocc = nproc = no. sub-qs; cont: ldimrq = leading dim of readyq = iprcs*mxces cont: ndmrsq = dim of a ready-sub-q = ldimrq/nproc cont: idrsq = id of ready-sub-q <- iwrkr; dummy iw used in do's; cont: installed SCHED ERROR flags for readyq over-runs (mtail cond.); cont: round robin test in getprb reduced to single statement. Change: corrected next in nxtag & intspn in start2 to recover lost tag. CAUTION: nxtag and spawn arguments are consistent with dep and putq cont: now, but order of arguments may not be consistent with older cont: versions of ftsubs.f. Change(4): installed circular parm queue, jobtag is the circular cont: (reusable) job tag with 1.le.jobtag.le.mxprcs, cont: snext is the schedule or sum or cumulative jobtag. Change(5): install super next tag, whereby user gets schedule job tags cont: from new schedule sub gettag; hence schedule has no knowledge cont: of user tags and consequently the principal restriction on user cont: is that there be less than "mxprcs" undone jobs at any time. cont: integer array "unitag" keeps a unique job tag for undone jobs. Change(6): install rest and save arrays for jobtags that will be cont: iterated more than once with original dependencies: ireset, cont: icnsav. install sub rsched to reset icangoes cont: and call sub place on iteration. Change(6a): nslots = 105 <- 30 to handle multiple dependencies. Change(7): installed common block CONWRT with key WRLOCK for concurrent cont: writes for use in both ftsubs.f and the user's driver code. Change(8): installed c-include indx*.h files to enable the passing of cont: up to 60 parameters with sched, putq and spawn calls (via m. cont: johnson, ssi). Change(9): installed lock initializations in libopn to make porting cont: to other machines without automatic variable initialization. CAUTION: subroutine second uses machine dependent timer, which must be cont: changed when porting to other machines. cgraphChange: install write nproc in sub libopn. cgraphChange: installed extra traces in chekin & place. cgraphChange: replaced qlock(mxprcs) by glock as igraph's own lock. cgraphChange: installed process names for Dongarra/Brewer's sched.trace. cgraphChange(8): cgraph lines made compatible for SCHED.TRACE/sched.trace. cgraphcdirectory: /usr/alcaid/brewer/SCHED.TRACE/sched.trace cgraphcomment: for graphics trace, change 'cgraph' to null '' and run. ctermComment: for terminal trace, change 'cterm' to null '' and run. Change(9): Conversion of ftsubs.f to run on Sequent Balance 21000 cont: Add $STDUNIT & $ALIGNWARN compiler directives at beginning of file. cont: Change parameter "mxces" from 8 to 24. cont: Add line "ierr = m_set_procs(nproc)." cont: All locks are integer*1. cont: Use microtasking calls s_init_lock, s_lock, & s_unlock. cont: Add routines lckasn, lockon, lockoff, & nops. CVD$R NOCONCUR integer jobtag c*********************************************************************** c c this subroutine reports unit of computation labeled by c jobtag has completed to all dependent nodes. these nodes are c recorded in parmq(i,jobtag) where 6 .le. i .le. nchks+5 c checkin consists of decrementing the value in each of these c locations by 1. each of these is done in a critical section c protected by qlock(jobtag) c c if the value in parmq(2,jobtag) is 0 where jobtag is a process c dependent upon this one then jobtag is placed on the readyq c by entering the critical section protected by trlock. the c pointer rtail to the tail of the readyq is incremented c unless task done is to be recorded. task done is placed on c the ready q and the pointer rtail left in place if nchks .eq. 0 c is found. c c see the common block description in libopn for more detail. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) Caution: common block CONWRT is used for concurrent writes, cont: with WRLOCK as the key to the LOCK. INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c a complete common block description is given in the routine libopn c c**************************************************************************** c c check to see if this process has completed. if not do not checkin c mtail = 0 idrsq = 0 c c first ask if any kids spawned by jobtag c if (parmq(4,jobtag) .ne. 0 .or. parmq(5,jobtag) .ne. 0 ) then c c either kids have been spawned or ientry has been referenced c indicating reentry is required c c c find out how many are waiting to complete c if (parmq(4,jobtag) .ne. 0) then call s_lock(qlock(jobtag)) parmq(2,jobtag) = parmq(2,jobtag) - parmq(4,jobtag) call s_unlock(qlock(jobtag)) endif c c reset number of kids c parmq(4,jobtag) = 0 c c update the number of times this procedure has been c entered c parmq(1,jobtag) = parmq(1,jobtag) + 1 c c return without checkin if all the kids have not c checked in to jobtag yet or if a reentry is required. c process jobtag is not done in either case. c comment: extra trace data. if (parmq(2,jobtag) .ne. 0) then cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 7 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph else cgraphc trace for chekin/parent (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 6 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph endif return endif c c if ientry has been called but jobtag is not waiting c on any kids then jobtag is placed back on the readyq c if ( parmq(5,jobtag) .ne. 0) then idrsq = mod((jobtag-1),nprocc) + 1 call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 10 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(8,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 9 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(8,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif return endif endif c c the process has completed so chekin proceeds c cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 5 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph else cgraphc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 2 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 5 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 2 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c c if (mtail .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nchks = parmq(3,jobtag) c c if this is the final process (indicated by nchks .eq. 0) then c record task done. do not advance the tail so task done sequence c is set. all subsequent gtprb queries will leave rhead .eq. rtail c with readyq(rhead+ndmrsq*(i-1)) .eq. done. c if (nchks .eq. 0) then do 20 iw = 1,nprocc call s_lock(trlock(iw)) readyq(rtail(iw)+ndmrsq*(iw-1)) = done call s_unlock(trlock(iw)) 20 continue cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 12 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 11 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif Change(2): removed following return from end of above loop do 20. return endif do 50 j = 6,nchks+5 mychek = parmq(j,jobtag) c c get unique access to the checkin node mychek c and checkin by decrementing the appropriate counter c mchkgo = 1 call s_lock(qlock(mychek)) parmq(2,mychek) = parmq(2,mychek) - 1 mchkgo = parmq(2,mychek) call s_unlock(qlock(mychek)) c c place mychek on readyq if parmq(2,mychek) is 0 c if (mchkgo .eq. 0 ) then idrsq = mod((mychek-1),nprocc) + 1 call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = mychek rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif 50 continue c c place finished process at the end of the free list freeq c provided it will not be reset for another iteration. c if(ireset(jobtag).eq.0) then call s_lock(tflock) ftail = mod(ftail,mxprcs) + 1 if(fhead.eq. ftail) free = 0 freeq(ftail) = jobtag call s_unlock(tflock) endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 8 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 7 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c if ( free .eq. 0 ) then call s_lock(WRLOCK) inext = unitag(jobtag) write(6,*) '*************SCHED ERROR*************************' write(6,*) ' more processes have checked into sub chekin,' write(6,*) ' than should be active for free slots in the' write(6,*) ' parmq parameter queue. jobs are too many.' write(6,*) ' total number of jobtags were:',inext write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call s_unlock(WRLOCK) c endif c return c c last card of chekin c end subroutine gettag(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c this subroutine gets a schedule jobtag for problem on the queue, c provided a free column is available in parmq. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c if ( free .eq. 0 ) then call s_lock(WRLOCK) write(6,*) '*************SCHED LIMIT ERROR*******************' write(6,*) ' user attempt to create to many active ' write(6,*) ' processes ; total number of jobs =',snext write(6,*) ' too many unfinished jobs while in gettag ' write(6,*) ' and no free slots on the parameter queue ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call s_unlock(WRLOCK) c endif c c get tag for process on the next free column in the problem queue c call s_lock(hflock) jobtag = freeq(fhead) snext = snext + 1 if(fhead.eq. ftail) free = 0 fhead = mod(fhead,mxprcs) + 1 if(jobtag.ge.1.and.jobtag.le.mxprcs) unitag(jobtag) = snext call s_unlock(hflock) cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for gettag cterm igraph(1,insrt) = 15 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = jobtag cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail c if ( jobtag .le. 0 .or. jobtag .gt. mxprcs ) then write(6,*) '*************SCHED ERROR***********************' write(6,*) ' illegal jobtag for parmq column;' write(6,*) ' need 1 .le. jobtag .le. ',mxprcs,';' write(6,*) ' current jobtag =',jobtag,' in gettag' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c return c c last card of gettag c end subroutine rsched(jobtag,settag,kreset) CVD$R NOCONCUR integer jobtag,settag,kreset c************************************************************************* c comment: usage c subroutine paralg(<subargs>) c integer strtag,stptag,itag(*) c external start,test c . c . c call gettag(strtag) c itag(strtag) = strtag c . c . c call gettag(stptag) c itag(stptag) = stptag c . c . comment: start iteration or time stepping c jobtag = strtag c icango = 1 c nchks = ... c nreset = <positive number for iteration set> c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,start,itag(strtag)) c . c . comment: test and continue iteration at start if undone c jobtag = testag c icango = ... c nreset = <positive number for iteration set> c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,test,itag(strtag),itag(stptag)) c . c . c subroutine test(jobtag,strtag,stptag) c common /<label>/ <finished> c . c . c if(<finished>) then c kreset = <positive number for iteration set> c call rsched(jobtag,strtag,kreset) c else c kreset = 0 c call rsched(jobtag,stptag,kreset) c endif c return c end c c this subroutine restores the icangoes of jobtags that work in c an iteration of a loop and calls place to place the reset jobtags c back on the ready queue. only those jobtags with c ireset(*) = kreset are reset. c c c jobtag is an integer job tag of the calling test subroutine, c that tests whether or not the iteration is done. c c strtag is an integer job tag of the iteration starting node. c c stptag is an integer job tag of the iteration stopping node. c c settag is an integer job tag of the iteration reset node, strtag if c kreset = <nonzero> and stptag if kreset = 0. c c kreset is an integer iteration number specifying how a resetting of c parmq and replacement on the readyq is in progress. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c c common block description: c c for a complete common block description see the subroutine libopn c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c mtail = 0 if(kreset.ne.0) then parmq(6,jobtag) = settag jmax = min0(mxprcs,snext) do 1111 j = 1,jmax if(ireset(j).eq.kreset) then if(j.ne.settag) then parmq(2,j) = icnsav(j) else parmq(2,j) = 1 endif icango = parmq(2,j) Caution: dynamic spawning nodes must have nentries reset to 1 parmq(1,j) = 1 parmq(5,j) = 0 c caution: what about race condition for dynamically spawned jobs? caution: what about resetting nkids = parmq(4,j)? c idrsq = mod((j-1),nprocc) + 1 if (icango .eq. 0 ) then call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = j rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif c endif 1111 continue else comment: kreset = 0 and the stop tag must be restored. parmq(6,jobtag) = settag endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine RSCHED' cgraph call dump(endgrf,igraph) stop endif c return c c last card of rsched c end subroutine dep(jobtag,icango,nchks,mychkn) CVD$R NOCONCUR integer jobtag,icango,nchks,mychkn(*) c************************************************************************* c c warning - this routine may only be used by driver in a static definition c of the data dependencies in the task. c c c usage c subroutine xxx(<parms>) c external yyy c . c . c . c call dep(jobtag,icango,nchks,mychkn) c call putq(jobtag,yyy,<parms2>) c . c . c . c c this subroutine puts data dependencies for problem on the queue. c no synchronization is necessary because each index of a column of c parmq is associated with a jobtag specified by the user and c associated with a unique unit of computation. the arguments of c dep specify a the data dependencies associated with the unit of c computation labeled by jobtag and are placed in a column of parmq c to the menu specified below. c c c jobtag is an integer specifying a unique column of parmq obtained c from subprogram gettag and is reused when the process jobtag c becomes done. c c icango is a positive integer specifying how many processes must check c in to this process before it can be placed on the readyq. c c nchks is the number of processes that depend upon the completion of c this process. c c mychkn is an integer array specifying schedule jobtags of the c processes which depend upon completion of this process. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c c place process jobtag on the problem queue c no synchronization required since c only one program work executes this code. c if( icango .lt. 0 .or. nchks .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' incorrect specification of dependencies ' write(6,*) ' DEP parameters icango and nchks ' write(6,*) ' must be non-negative' write(6,*) ' input was ' write(6,*) ' jobtag ',jobtag write(6,*) ' icango ',icango write(6,*) ' nchks ',nchks write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED in subroutine DEP' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c parmq(1,jobtag) = 1 parmq(2,jobtag) = icango parmq(3,jobtag) = nchks parmq(4,jobtag) = 0 c c check to see that exactly one node has nchks set to 0 c if (nchks .eq. 0 .and. done .eq. 0) then done = -2 else if (nchks .eq. 0) done = 0 endif c c specify identifiers of processes which depend on this one c if there are too many abort c if (nchks .gt. nslots - 5) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' attempt to place too many dependencies ' write(6,*) ' on chekin list during call to dep ' write(6,*) ' with jobtag ',jobtag write(6,*) ' ' write(6,*) ' user tried to place ',nchks ,' dependencies ' write(6,*) ' the maximum number is ',nslots - 5 write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED in subroutine DEP' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif do 50 j = 1,nchks parmq(j+5,jobtag) = mychkn(j) c if (mychkn(j) .le. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' incorrect specification of dependencies ' write(6,*) ' all mychkn entries must be positive' write(6,*) ' input was ' write(6,*) ' mychkn(',j,') = ',mychkn(j) write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED in subroutine DEP' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c 50 continue cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraphc trace for dep cgraph igraph(1,insrt) = 0 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = icango cgraph igraph(4,insrt) = nchks cgraph do 9001 jnsrt = 5,nchks + 4 cgraph igraph(jnsrt,insrt) = parmq(jnsrt+1,jobtag) cgraph 9001 continue cgraph gnames(insrt) = names(jobtag) c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for dep cterm igraph(1,insrt) = 0 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = icango cterm igraph(4,insrt) = nchks cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm do 9001 jnsrt = 8,nchks + 7 cterm igraph(jnsrt,insrt) = parmq(jnsrt-2,jobtag) cterm 9001 continue cterm gnames(insrt) = names(jobtag) c return c c last card of dep c end subroutine reset(jobtag,nreset) CVD$R NOCONCUR integer jobtag,nreset c************************************************************************** c c this subroutine saves reset values of icango if nreset .ne. 0. c c nreset is the integer flag specifing that job jobtag can have its c dependencies reset to the originals for the next iteration. c c************************************************************************** c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) if (nreset .ne. 0) then ireset(jobtag) = nreset icnsav(jobtag) = parmq(2,jobtag) endif c return c c last card of reset c end integer function gtprb(id,jobtag) CVD$R NOCONCUR c************************************************************************** c c this function gets unique access to the head of the readyq c pointed to by id and then claims the pointer to the next c schedulable process if there is one and returns with a nonzero c value when there is a process to schedule. if there are no entries c in the readyq indexed by id then the remaning ready ques are c polled in a round robin manner until schedulable process is found c or task done is recorded. if task done has been recorded the value c zero is returned in gtprb. if a nonzero value is returned in gtprb, c the integer jobtag will contain the identifier of the unit of c computation that is to be executed. c c input parameter c c id an integer specifying which readyq to access first c for work to do. c c output parameters c c jobtag an integer containing the next process to be executed c c gtprb the return value of this integer function is: c c 0 if task done has been posted c c nonzero if a schedulable process has been claimed. c c c*************************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c common block description: c c for a complete common block description see the routine libopn c c nspins = 0 fsave = second(foo) idrsq = id 10 continue mhead = -1 call s_lock(hrlock(idrsq)) c c gain access to head of readyq. if task done has not been recorded c then increment the head of the readyq. otherwise the head pointer c is left fixed so the next active process will receive task done. c if (rhead(idrsq) .ne. rtail(idrsq)) then mhead = rhead(idrsq) rhead(idrsq) = mod(rhead(idrsq),ndmrsq) + 1 endif call s_unlock(hrlock(idrsq)) if (mhead .gt. 0) then c c there was a work unit on the readyq c jobtag = readyq(mhead+ndmrsq*(idrsq-1)) Change: events 1 & 4 changed from here to if/else below. c if (jobtag .ne. done) then c c record the subroutine call identifier in gtprb and return c the process identifier in jobtag. c gtprb = parmq(1,jobtag) if (gtprb .gt. 1 .and. parmq(5,jobtag) .eq. 0) then gtprb = -1 else cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for grprb/child(mhead.gt.0) cgraph igraph(1,insrt) = 4 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph igraph(5,insrt) = id cgraph gnames(insrt) = names(jobtag) cgraph else cgraphc trace for grprb/parent(mhead.gt.0) cgraph igraph(1,insrt) = 1 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph igraph(4,insrt) = id cgraph gnames(insrt) = names(jobtag) cgraph endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for grprb/child(mhead.gt.0) cterm igraph(1,insrt) = 4 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = id cterm igraph(8,insrt) = jobtag cterm igraph(9,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for grprb/parent(mhead.gt.0) cterm igraph(1,insrt) = 1 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = id cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) endif c else c c task done has been indicated. request a return from subroutine work c by returning the value 0 in gtprb. c gtprb = 0 c endif else c jobtag = readyq(rhead(idrsq)+ndmrsq*(idrsq-1)) if (jobtag .eq. done) then c c task done has been posted c gtprb = 0 c else c c there was not any work on the readyq c Change(3a): round robin test replaced by single statement. idrsq = mod(idrsq,nprocc) + 1 nspins = nspins + 1 if (mod(nspins,nprocc) .eq. 0) call nops go to 10 c endif cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for grprb/child(mhead.le.0) cterm igraph(1,insrt) = 14 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = id cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for grprb/parent(mhead.le.0) cterm igraph(1,insrt) = 13 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = id cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c endif return c c last card of gtprb c end subroutine libopn(nproc) integer nproc c************************************************************************ c c this routine sets locks and initializes variables c and then spawns nproc generic work routines. c c nproc is a positive integer. care should be taken to c match nproc to the number of physical processors c available. c c************************************************************************ parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c integer ispace(mxces) c c common block description: c c common/qdata/ c c parmq is a two dimensional integer array. each column of c this array represents a schedulable process. a process is c identified by its jobtag which corresponds to a unique c column of parmq. a column of parmq has the following c entries c c parmq(1,jobtag) = nentries c a nonzero integer. if process jobtag c is on the readyq then this integer c is equal to the one plus number of times c process jobtag has been entered. c thus when work executes this process c the integer is equal to the number c of times the process has been entered. c c parmq(2,jobtag) = icango c an integer specifying the number c of processes that must check in c before this process may scheduled c (ie. be placed on the ready queue) c c parmq(3,jobtag) = nchks c an integer specifying the number c of processes that this process c must checkin to. identifiers of c these processes are recorded below. c if nchks .eq. 0 then completion of c this process signifies completion of c task. c c parmq(4,jobtag) = the number of kids spawned by this c process. if this value is zero c then this process has not spawned c any subprocesses. c c parmq(5,jobtag) = entry_flag c has the value 1 if ientry was called c has the value 0 otherwise c c parmq(6:5+nchks,jobtag) is reserved for identifiers of the nchks c processes that must wait for completion c of this process before they can execute. c c fhead integer pointer to head of freeq. c c ftail integer pointer to tail of freeq. c c free integer flag so that there are free columns on parmq if c free = 1, while there are no free columns if free = 0. c c freeq one dimensional free list of free columns of parmq, with c free columns starting at fhead and ending at ftail in a c circular order. once a job is finished at the end of c chekin, its column or slot is added back onto freeq, c incrementing ftail mod mxprcs. c c snext integer counter holding the cumulative number of job tags c given out by gettag. c c unitag integer array holding the unique job tags "snext"s c corresponding to each current jobtag. c c intspn pointer to first spawned process. all jobtags with values c greater than or equal to intspn will be spawned processes. c c readyq a one dimensional integer array that holds the jobtags of c those processes that are ready to execute. the k-th block c of this array serves as a readyq for the k-th work routine. c on executing gtprb, the k-th work routine will look for work c in the k-th readyq first and then the others (round robin). c if readyq(*) .eq. done has been set then a return from c subroutine work(*,*) is indicated. c c rhead an integer array. the i-th entry of rhead is a pointer to the c head of the i-th block of readyq c c rtail an integer array. the i-th entry of rtail is a pointer to the c tail of the i-th block of readyq c c c common/qsync/ c c qlock is an integer array of locks. there is one lock for each c column of parmq. the purpose of this lock is to ensure c unique access to a column of parmq during the checkin operation. c c hrlock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer rhead to the head of the readyq. c c trlock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer rtail to the tail of the readyq. c c hflock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer fhead to the head of the freeq. c c tflock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer ftail to the tail of the freeq. c c common /qreset/ c c ireset is an integer flag array with ireset(j) .ne. 0 if job j c dependency will be reset, else ireset(j) = 0. c c icnsav is an integer array where icango will be caved for each job c that will be reset. c c common /CONWRT/ c c WRLOCK is an integer lock. the purpose of this lock is to ensure c a unique write during concurrent execution. c c done is a unique non positive integer set in libopn to indicate c task done. c c common /gphout/ c c endgrf is an integer pointing to the next available c slot in igraph c c glock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer endgrf of a column of igraph. c c igraph is a two dimensional integer array c used as a buffer for graphics output c each column of igraph records an event. c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c Change(3): nproc passed in common as nprocc nprocc = nproc Change(3): ndmrsq is the size of each sub-q, corresp. one proc. ndmrsq = ldimrq/nprocc c if (nproc .gt. mxces-1 .or. nproc .lt. 1) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user asking for non-physical processors' write(6,*) ' on this system: nprocs = ',nproc write(6,*) ' the maximum allowed is nproc = ',mxces-1 write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine LIBOPN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif ierr = m_set_procs(nproc) c done = -1 c c set readyq locks off c initialize readyq(*) = -1 to set done sequence c do 50 j = 1,nprocc call s_init_lock(hrlock(j)) call s_init_lock(trlock(j)) rhead(j) = 1 rtail(j) = 1 do 20 i = 1,ndmrsq readyq(i+ndmrsq*(j-1)) = -1 20 continue 50 continue c c set freeq pointers and locks c set qlocks off c initialize reentry indicator in parmq(5,*) c initial circular freeq with all parmq columns c free = 1 fhead = 1 ftail = mxprcs call s_init_lock(hflock) call s_init_lock(tflock) call s_init_lock(WRLOCK) cgraph call s_init_lock(glock) cterm call s_init_lock(glock) do 100 j = 1,mxprcs call s_init_lock(qlock(j)) parmq(5,j) = 0 freeq(j) = j ireset(j) = 0 icnsav(j) = 0 100 continue c c initialize queue pointers c intspn = 1 snext = 0 cgraph endgrf = 1 cgraph open( file='trace.graph',unit=3) cgraphc cgraphChange: Output nproc for sched.trace format cgraph write(3,30000) nproc cgraph30000 format(i8) cterm endgrf = 1 cterm open( file='term.trace',unit=3) ctermc ctermChange: Output nproc for terminal trace format cterm write(3,30000) nproc cterm30000 format('nprocs = ',i1/) c c set lock on pointer to head of readyq so c no process may start until all process and data dependencies c have been specified by the user supplied routine driver. c do 150 j = 1,nprocc call s_lock(hrlock(j)) 150 continue c c now spawn virtual processors. these generic work routines will c assume the identity of any schedulable process specified by driver. c C$DOACROSS share(ispace) do 200 j = 1,nproc call work(j,ispace(j)) 200 continue cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) return c c last card of libopn c end subroutine nxtag(jobtag,mypar) CVD$R NOCONCUR CAUTION: nxtag arguments are consistent with dep now, but order of cont: arguments may not be consistent with older versions of ftsubs.f. integer jobtag,mypar c*********************************************************************** c c c this subroutine puts parental dependencies for problem on the c queue. the arguments of spawn specify a process for this job. c c jobtag is an integer specifying a unique column of parmq. c c mypar is an integer specifying the parent of the dynamically c spawned process jobtag. c c c*********************************************************************** c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c c c place this process on the free slot in the problem queue c obtained from subprogram gettag. c parmq(1,jobtag) = 1 parmq(2,jobtag) = 0 parmq(3,jobtag) = 1 parmq(6,jobtag) = mypar c cgraph call s_lock(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call s_unlock(glock) cgraph inext = unitag(jobtag) cgraphc trace for nxtag cgraph igraph(1,insrt) = 3 cgraph igraph(2,insrt) = mypar cgraph igraph(3,insrt) = inext cgraph gnames(insrt) = names(jobtag) c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for nxtag cterm igraph(1,insrt) = 3 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = mypar cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = jobtag cterm gnames(insrt) = names(jobtag) c c update the icango counter of the parent process c by adding 2 to parmq(2,mypar)... prevents race condition. c add 1 to the number of kids spawned by parent mypar c call s_lock(qlock(mypar)) parmq(2,mypar) = parmq(2,mypar) + 2 parmq(4,mypar) = parmq(4,mypar) + 1 call s_unlock(qlock(mypar)) c c set number of kids spawned by jobtag to zero c parmq(4,jobtag) = 0 c c c return c c last card of nxtag c end subroutine start2 c c this routine allows parallel processing to start after user supplied c driver has completed by unlocking the head of the readyq c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) logical nostrt cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c for common block description see subroutine libopn. c if (done .ne. 0) then write(6,*) '*************SCHED USER ERROR********************' if (done .eq. -1 ) then write(6,*) ' no process has set nchks equal to 0 ' else write(6,*) ' more than one process has set nchks to 0 ' endif write(6,*) ' SCHEDULE will not be able to terminate job' write(6,*) ' correctly ' write(6,*) ' ' write(6,*) ' check subroutine passed to initial call to' write(6,*) ' to see that at exactly one call to DEP has ' write(6,*) ' set nchks = 0 ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine START2' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nostrt = .true. do 100 iw = 1,nprocc if (rhead(iw) .ne. rtail(iw)) nostrt = .false. 100 continue if (nostrt) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' no process had an intitial icango of 0 ' write(6,*) ' SCHEDULE could not begin ' write(6,*) ' ' write(6,*) ' check subroutine passed to initial call to' write(6,*) ' to see that at least one call to DEP has ' write(6,*) ' set icango = 0 ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine START2' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif Change: intspn correction to recover lost jobtag. c intspn is the unique tag of the first or initially spawned process. intspn = snext + 1 do 200 iw = 1,nprocc call s_unlock(hrlock(iw)) 200 continue c return c c last card of start2 c end subroutine place(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c c this subroutine places a problem on the readyq c c jobtag is an integer specifying a unique column of parmq. c c c icango is a positive integer specifying how many processes must check c into this process before it can be placed on the readyq. c c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c c common block description: c c for a complete common block description see the subroutine libopn c c place this process on readyq if icango is 0 c when icango .eq. 0 this process does not depend on any c others. c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c mtail = 0 icango = parmq(2,jobtag) idrsq = mod((jobtag-1),nprocc) + 1 if (icango .eq. 0 ) then call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif Change: cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for place cterm igraph(1,insrt) = 6 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = icango cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine PLACE' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c return c c last card of place c end integer function ientry(mypar,nentrs) c integer mypar c***************************************************************************** c c this routine will allow process mypar to continue after c spawned processes have all checked in. it should only be called if c processes have been spawned by mypar through the use of c the subroutine spawn. c c go to (1000,2000,...,N000), ientry(mypar,N) c 1000 continue c . c . c . c do 10 j = 1,nproc c . c . (set parameters to define spawned process) c . c call nxtag(jobtag,mypar) c call spawn(jobtag,mypar,subname,<parms>) c 10 continue c return c 2000 continue c . c . c . c return c N000 continue c <statements> c return c end c c this subroutine returns the number of times process mypar c has been entered. if that number is equal to the total c number nentrs of expected reentries then parmq(5,mypar) c is set to 0 indicating no more reentries required. c c***************************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) cgraph integer endgrf cgraph integer*1 glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c report the entry point where process jobtag should resume c computation c inext = unitag(mypar) if (nentrs .lt. 2) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user call to IENTRY with number of ' write(6,*) ' labels in nentrs set less than 2 ' write(6,*) ' from parent process ',inext write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif ientry = parmq(1,mypar) if (ientry .lt. nentrs) then parmq(5,mypar) = nentrs else parmq(5,mypar) = 0 endif c return c c last card of ientry c end logical function wait(mypar,ienter) c integer mypar,ienter c***************************************************************************** c c this routine will allow process mypar to continue after c spawned processes have all checked in. it should only be called if c processes have been spawned by mypar through the use of c the subroutine spawn. this routine must be used in conjunction with c subroutine prtspn. the required syntax is c c go to (1000,...,L000,...,N000), ientry(mypar,N) c 1000 continue c . c . c . c do 100 j = 1,nproc c . c . (set parameters to define spawned process) c . c call nxtag(jobtag,mypar) c call spawn(jobtag,mypar,subname,<parms>) c 100 continue c label = L c if (wait(mypar,label)) return c L000 continue c . c . c . c c if this subroutine returns a value of .true. then the calling process c mypar should issue a return. if a value of .false. is returned then c the calling process mypar should resume execution at the c statement immediately following the reference to wait (ie. at L000 in c the example above. a return value .true. indicates that some spawned c process has not yet completed and checked in. a return value .false. c indicates all spawned processes have checked in. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) c c c check the icango counter to see if all spawned processes (kids) c have checked in. c inext = unitag(mypar) icango = 1 call s_lock(qlock(mypar)) icango = parmq(2,mypar) - parmq(4,mypar) call s_unlock(qlock(mypar)) c if (icango .eq. 0) then c c all kids are done ... dont wait (ie return false) c wait = .false. c c record re_entry label where computation is to c resume after wait is complete c parmq(1,mypar) = ienter c if (ienter .gt. parmq(5,mypar)) then write(6,*) '*************SCHED LIMIT ERROR*****************' write(6,*) ' executing SCHEDULE function WAIT ' write(6,*) ' return label larger than the maximum ' write(6,*) ' specified by user in call to ientry ' write(6,*) ' from parent process ', inext write(6,*) ' ' write(6,*) ' the maximum reentry number is ' write(6,*) ' ', parmq(5,mypar) write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c c set last re_entry indication (parmq(5,mypar) = 0) c if this reentry point corresponds to last one c (recorded in parmq(5,mypar) during call to ientry) c if (ienter .eq. parmq(5,mypar)) parmq(5,mypar) = 0 c else c c kids are not done c wait = .true. c c a checkin will be made so set the number of c entries to return label ienter - 1 to get c correct entry point after checkin c parmq(1,mypar) = ienter - 1 c endif c return c c last card of wait c end subroutine dump(endgrf,igraph) CVD$R NOCONCUR Change: combined SUN SCHED.TRACE/sched.trace and terminal version of dump. parameter (nslots = 105,nbuffr = 500) parameter (mxprcs = 1000) integer endgrf real igraph(nslots,nbuffr) character*6 gnames,aname common /gphnam/ gnames(nbuffr) integer ievent(nslots) c*********************************************************************** c c this routine writes graphics and terminal output to a file c and resets endgrf to 1 c c*********************************************************************** do 300 j = 1,endgrf-1 do 302 i = 1,nslots ievent(i) = igraph(i,j) 302 continue inext = ievent(2) if( ievent(1) .eq. 0 ) then aname = gnames(j) cgraph write(3,30000) (ievent(i),i=1,ievent(4)+4) cgraph write(3,30010) aname cterm write(3,3000) j,(ievent(i),i=1,7) cterm & ,aname,(ievent(i),i=8,ievent(4)+7) endif if( ievent(1) .eq. 1 ) then aname = gnames(j) cgraph write(3,30001) (ievent(i),i=1,2),igraph(3,j) cgraph & ,ievent(4) cterm write(3,3001) j,(ievent(i),i=1,7),aname,igraph(8,j) endif if( ievent(1) .eq. 2 ) then aname = gnames(j) cgraph write(3,30002) (ievent(i),i=1,2),igraph(3,j) cterm write(3,3002) j,(ievent(i),i=1,4),aname,igraph(5,j) endif if( ievent(1) .eq. 3 ) then aname = gnames(j) cgraph write(3,30003) (ievent(i),i=1,3),aname cterm write(3,3003) j,(ievent(i),i=1,6),aname endif if( ievent(1) .eq. 4 ) then aname = gnames(j) cgraph write(3,30004) (ievent(i),i=1,3),igraph(4,j) cgraph & ,ievent(5) cterm write(3,3004) j,(ievent(i),i=1,8),aname,igraph(9,j) endif if( ievent(1) .eq. 5 ) then aname = gnames(j) cgraph write(3,30005) (ievent(i),i=1,3),igraph(4,j) cterm write(3,3005) j,(ievent(i),i=1,5),aname,igraph(6,j) endif if( ievent(1) .eq. 6 ) then cgraph write(3,30002) (ievent(i),i=1,2),igraph(3,j) endif if( ievent(1) .eq. 7 ) then cgraph write(3,30005) (ievent(i),i=1,3),igraph(4,j) endif cterm if( ievent(1) .eq. 6 ) then cterm aname = gnames(j) cterm write(3,3006) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 7 ) then cterm aname = gnames(j) cterm write(3,3007) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 8 ) then cterm aname = gnames(j) cterm write(3,3008) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 9 ) then cterm aname = gnames(j) cterm write(3,3009) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 10 ) then cterm aname = gnames(j) cterm write(3,3010) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 11 ) then cterm aname = gnames(j) cterm write(3,3011) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 12 ) then cterm aname = gnames(j) cterm write(3,3012) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 13 ) then cterm if ( ievent(4) .ne. 0 ) then cterm aname = gnames(j) cterm else cterm aname = ' work' cterm endif cterm write(3,3013) j,(ievent(i),i=1,4),aname,igraph(5,j) cterm endif cterm if( ievent(1) .eq. 14 ) then cterm if ( ievent(5) .ne. 0 ) then cterm aname = gnames(j) cterm else cterm aname = ' work' cterm endif cterm write(3,3014) j,(ievent(i),i=1,5),aname,igraph(6,j) cterm endif cterm if( ievent(1) .eq. 15 ) then cterm write(3,3015) j,(ievent(i),i=1,5) cterm endif cgraph30000 format(14i8) cgraph30010 format(2x,a) cgraph30001 format(2i8,1pe16.8,i8) cgraph30002 format(2i8,1pe16.8) cgraph30003 format(3i8,2x,a) cgraph30004 format(3i8,1pe16.8,i8) cgraph30005 format(3i8,1pe16.8) cterm3000 format(i4,'. dep:',i2,';jobtag=',i4,';icango=',i4 cterm & ,'; nchks=',i4,';fhead,ftail=',i4,',',i4 cterm & /21x,12x,';idparm=',i4,';mytask= ',a6 cterm & /21x,'; mychkn(s)=',5i4,(/21x,10i4)) cterm3001 format(i4,'. gtprb/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (mhead.gt.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3002 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4 cterm & /4x,' (entryflag.eq.0)',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3003 format(i4,'. nxtag:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,12x,';fhead,ftail=',i4,',',i4 cterm & /21x,12x,';idparm=',i4,';mytask= ',a6) cterm3004 format(i4,'. gtprb/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (mhead.gt.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3005 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,' idrsq=',i4 cterm & /4x,' (entryflag.eq.0)',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3006 format(i4,'. place:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /21x,';icango=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3007 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';fhead,ftail=',i4,',',i4 cterm & /4x,' (nchks.ne.0) ',2x,';mychek=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3008 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';fhead,ftail=',i4,',',i4 cterm & /4x,' (nchks.ne.0) ',2x,12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3009 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (entryflag.ne.0)',';mychek=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3010 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (entryflag.ne.0)',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3011 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (nchks.eq.0) ',2x,';mychek=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3012 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /,4x,' (nchks.eq.0) ',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3013 format(i4,'. gtprb/parent:',i2,';jobtag=',i4,12x cterm & /4x,' (mhead.le.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3014 format(i4,'. gtprb/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & /4x,' (mhead.lt.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3015 format(i4,'. gettag:',i2,';jobtag=',i4,';idparm=',i4 cterm & ,12x,';fhead,ftail=',i4,',',i4) 300 continue c endgrf = 1 c return c c last line of dump c end subroutine name(jobtag,myname) parameter (mxprcs = 1000) character*6 names,myname common /calls/ names(mxprcs) names(jobtag) = myname return c c last card of name c end c subroutine lckasn(ilock) integer*1 ilock call s_init_lock(ilock) return end c subroutine lockon(ilock) integer*1 ilock call s_lock(ilock) return end c subroutine lockoff(ilock) integer*1 ilock call s_unlock(ilock) return end c subroutine nops j = 1 return end SHAR_EOF if test -f 'ftsubs.graph.f' then echo shar: over-writing existing file "'ftsubs.graph.f'" fi cat << \SHAR_EOF > 'ftsubs.graph.f' $STDUNIT $ALIGNWARN CVD$G NOINLINE (DUMP,DUMP2,LOCKON,LOCKOFF,NOPS,SECOND,WORK) subroutine chekin(jobtag) Code path: balance:/bfs2/brewer/SCHED/HANSON/ftsubs.f Comment: integrated iteration version of ftsubs.f and ftsubs.iter.f cont: with option to iterate a set of nodes with reset dependencies. Comment: combined graphics and terminal trace version of ftsubt.f Code parent: alliant:/afs1/hanson/dirsched/ftsubs.f change(1): iprcs = 200 <- 120; change(2): automatic return stmt removed out of loop do 20 in chekin; change(3): installed vector-circular ready queue, cont: vector <= nproc sub-qs, elastic with nproc processors; cont: circular <= readyq free space wraps around from rtail to rhead, cont: with the top end of readyq connected to the bottom end; cont: ready(rhead(id)+ndmrsq*(id-1)) <- readyq(rhead(id),id); cont: most mxces replaced by nprocc = nproc = no. sub-qs; cont: ldimrq = leading dim of readyq = iprcs*mxces cont: ndmrsq = dim of a ready-sub-q = ldimrq/nproc cont: idrsq = id of ready-sub-q <- iwrkr; dummy iw used in do's; cont: installed SCHED ERROR flags for readyq over-runs (mtail cond.); cont: round robin test in getprb reduced to single statement. Change: corrected next in nxtag & intspn in start2 to recover lost tag. CAUTION: nxtag and spawn arguments are consistent with dep and putq cont: now, but order of arguments may not be consistent with older cont: versions of ftsubs.f. Change(4): installed circular parm queue, jobtag is the circular cont: (reusable) job tag with 1.le.jobtag.le.mxprcs, cont: snext is the schedule or sum or cumulative jobtag. Change(5): install super next tag, whereby user gets schedule job tags cont: from new schedule sub gettag; hence schedule has no knowledge cont: of user tags and consequently the principal restriction on user cont: is that there be less than "mxprcs" undone jobs at any time. cont: integer array "unitag" keeps a unique job tag for undone jobs. Change(6): install rest and save arrays for jobtags that will be cont: iterated more than once with original dependencies: ireset, cont: icnsav. install sub rsched to reset icangoes cont: and call sub place on iteration. Change(6a): nslots = 105 <- 30 to handle multiple dependencies. Change(7): installed common block CONWRT with key WRLOCK for concurrent cont: writes for use in both ftsubs.f and the user's driver code. Change(8): installed c-include indx*.h files to enable the passing of cont: up to 60 parameters with sched, putq and spawn calls (via m. cont: johnson, ssi). Change(9): installed lock initializations in libopn to make porting cont: to other machines without automatic variable initialization. CAUTION: subroutine second uses machine dependent timer, which must be cont: changed when porting to other machines. cgraphChange: install write nproc in sub libopn. cgraphChange: installed extra traces in chekin & place. cgraphChange: replaced qlock(mxprcs) by glock as igraph's own lock. cgraphChange: installed process names for Dongarra/Brewer's sched.trace. cgraphChange(8): cgraph lines made compatible for SCHED.TRACE/sched.trace. cgraphcdirectory: /usr/alcaid/brewer/SCHED.TRACE/sched.trace cgraphcomment: for graphics trace, change 'cgraph' to null '' and run. ctermComment: for terminal trace, change 'cterm' to null '' and run. Change(9): Conversion of ftsubs.f to run on Sequent Balance 21000 cont: Add $STDUNIT & $ALIGNWARN compiler directives at beginning of file. cont: Change parameter "mxces" from 8 to 24. cont: Add line "ierr = m_set_procs(nproc)." cont: All locks are integer*1. cont: Use microtasking calls s_init_lock, s_lock, & s_unlock. cont: Add routines lckasn, lockon, lockoff, & nops. CVD$R NOCONCUR integer jobtag c*********************************************************************** c c this subroutine reports unit of computation labeled by c jobtag has completed to all dependent nodes. these nodes are c recorded in parmq(i,jobtag) where 6 .le. i .le. nchks+5 c checkin consists of decrementing the value in each of these c locations by 1. each of these is done in a critical section c protected by qlock(jobtag) c c if the value in parmq(2,jobtag) is 0 where jobtag is a process c dependent upon this one then jobtag is placed on the readyq c by entering the critical section protected by trlock. the c pointer rtail to the tail of the readyq is incremented c unless task done is to be recorded. task done is placed on c the ready q and the pointer rtail left in place if nchks .eq. 0 c is found. c c see the common block description in libopn for more detail. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) Caution: common block CONWRT is used for concurrent writes, cont: with WRLOCK as the key to the LOCK. INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c a complete common block description is given in the routine libopn c c**************************************************************************** c c check to see if this process has completed. if not do not checkin c mtail = 0 idrsq = 0 c c first ask if any kids spawned by jobtag c if (parmq(4,jobtag) .ne. 0 .or. parmq(5,jobtag) .ne. 0 ) then c c either kids have been spawned or ientry has been referenced c indicating reentry is required c c c find out how many are waiting to complete c if (parmq(4,jobtag) .ne. 0) then call s_lock(qlock(jobtag)) parmq(2,jobtag) = parmq(2,jobtag) - parmq(4,jobtag) call s_unlock(qlock(jobtag)) endif c c reset number of kids c parmq(4,jobtag) = 0 c c update the number of times this procedure has been c entered c parmq(1,jobtag) = parmq(1,jobtag) + 1 c c return without checkin if all the kids have not c checked in to jobtag yet or if a reentry is required. c process jobtag is not done in either case. c comment: extra trace data. if (parmq(2,jobtag) .ne. 0) then call s_lock(glock) if (endgrf .gt. nbuffr) call dump(endgrf,igraph) insrt = endgrf endgrf = endgrf + 1 call s_unlock(glock) inext = unitag(jobtag) if (inext .ge. intspn) then c trace for chekin/child (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) igraph(1,insrt) = 7 igraph(2,insrt) = parmq(6,jobtag) igraph(3,insrt) = inext igraph(4,insrt) = second(foo) else c trace for chekin/parent (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) igraph(1,insrt) = 6 igraph(2,insrt) = inext igraph(3,insrt) = second(foo) endif return endif c c if ientry has been called but jobtag is not waiting c on any kids then jobtag is placed back on the readyq c if ( parmq(5,jobtag) .ne. 0) then idrsq = mod((jobtag-1),nprocc) + 1 call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 10 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(8,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 9 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(8,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif return endif endif c c the process has completed so chekin proceeds c call s_lock(glock) if (endgrf .gt. nbuffr) call dump(endgrf,igraph) insrt = endgrf endgrf = endgrf + 1 call s_unlock(glock) inext = unitag(jobtag) if (inext .ge. intspn) then c trace for chekin/child (entry_flag.eq.0 & nkids = 0) igraph(1,insrt) = 5 igraph(2,insrt) = parmq(6,jobtag) igraph(3,insrt) = inext igraph(4,insrt) = second(foo) gnames(insrt) = names(jobtag) else c trace for chekin/parent (entry_flag.eq.0 & nkids = 0) igraph(1,insrt) = 2 igraph(2,insrt) = inext igraph(3,insrt) = second(foo) gnames(insrt) = names(jobtag) endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 5 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 2 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c c if (mtail .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nchks = parmq(3,jobtag) c c if this is the final process (indicated by nchks .eq. 0) then c record task done. do not advance the tail so task done sequence c is set. all subsequent gtprb queries will leave rhead .eq. rtail c with readyq(rhead+ndmrsq*(i-1)) .eq. done. c if (nchks .eq. 0) then do 20 iw = 1,nprocc call s_lock(trlock(iw)) readyq(rtail(iw)+ndmrsq*(iw-1)) = done call s_unlock(trlock(iw)) 20 continue cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 12 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 11 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif Change(2): removed following return from end of above loop do 20. return endif do 50 j = 6,nchks+5 mychek = parmq(j,jobtag) c c get unique access to the checkin node mychek c and checkin by decrementing the appropriate counter c mchkgo = 1 call s_lock(qlock(mychek)) parmq(2,mychek) = parmq(2,mychek) - 1 mchkgo = parmq(2,mychek) call s_unlock(qlock(mychek)) c c place mychek on readyq if parmq(2,mychek) is 0 c if (mchkgo .eq. 0 ) then idrsq = mod((mychek-1),nprocc) + 1 call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = mychek rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif 50 continue c c place finished process at the end of the free list freeq c provided it will not be reset for another iteration. c if(ireset(jobtag).eq.0) then call s_lock(tflock) ftail = mod(ftail,mxprcs) + 1 if(fhead.eq. ftail) free = 0 freeq(ftail) = jobtag call s_unlock(tflock) endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 8 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 7 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c if ( free .eq. 0 ) then call s_lock(WRLOCK) inext = unitag(jobtag) write(6,*) '*************SCHED ERROR*************************' write(6,*) ' more processes have checked into sub chekin,' write(6,*) ' than should be active for free slots in the' write(6,*) ' parmq parameter queue. jobs are too many.' write(6,*) ' total number of jobtags were:',inext write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call s_unlock(WRLOCK) c endif c return c c last card of chekin c end subroutine gettag(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c this subroutine gets a schedule jobtag for problem on the queue, c provided a free column is available in parmq. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c if ( free .eq. 0 ) then call s_lock(WRLOCK) write(6,*) '*************SCHED LIMIT ERROR*******************' write(6,*) ' user attempt to create to many active ' write(6,*) ' processes ; total number of jobs =',snext write(6,*) ' too many unfinished jobs while in gettag ' write(6,*) ' and no free slots on the parameter queue ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call s_unlock(WRLOCK) c endif c c get tag for process on the next free column in the problem queue c call s_lock(hflock) jobtag = freeq(fhead) snext = snext + 1 if(fhead.eq. ftail) free = 0 fhead = mod(fhead,mxprcs) + 1 if(jobtag.ge.1.and.jobtag.le.mxprcs) unitag(jobtag) = snext call s_unlock(hflock) cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for gettag cterm igraph(1,insrt) = 15 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = jobtag cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail c if ( jobtag .le. 0 .or. jobtag .gt. mxprcs ) then write(6,*) '*************SCHED ERROR***********************' write(6,*) ' illegal jobtag for parmq column;' write(6,*) ' need 1 .le. jobtag .le. ',mxprcs,';' write(6,*) ' current jobtag =',jobtag,' in gettag' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c return c c last card of gettag c end subroutine rsched(jobtag,settag,kreset) CVD$R NOCONCUR integer jobtag,settag,kreset c************************************************************************* c comment: usage c subroutine paralg(<subargs>) c integer strtag,stptag,itag(*) c external start,test c . c . c call gettag(strtag) c itag(strtag) = strtag c . c . c call gettag(stptag) c itag(stptag) = stptag c . c . comment: start iteration or time stepping c jobtag = strtag c icango = 1 c nchks = ... c nreset = <positive number for iteration set> c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,start,itag(strtag)) c . c . comment: test and continue iteration at start if undone c jobtag = testag c icango = ... c nreset = <positive number for iteration set> c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,test,itag(strtag),itag(stptag)) c . c . c subroutine test(jobtag,strtag,stptag) c common /<label>/ <finished> c . c . c if(<finished>) then c kreset = <positive number for iteration set> c call rsched(jobtag,strtag,kreset) c else c kreset = 0 c call rsched(jobtag,stptag,kreset) c endif c return c end c c this subroutine restores the icangoes of jobtags that work in c an iteration of a loop and calls place to place the reset jobtags c back on the ready queue. only those jobtags with c ireset(*) = kreset are reset. c c c jobtag is an integer job tag of the calling test subroutine, c that tests whether or not the iteration is done. c c strtag is an integer job tag of the iteration starting node. c c stptag is an integer job tag of the iteration stopping node. c c settag is an integer job tag of the iteration reset node, strtag if c kreset = <nonzero> and stptag if kreset = 0. c c kreset is an integer iteration number specifying how a resetting of c parmq and replacement on the readyq is in progress. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c c common block description: c c for a complete common block description see the subroutine libopn c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c mtail = 0 if(kreset.ne.0) then parmq(6,jobtag) = settag jmax = min0(mxprcs,snext) do 1111 j = 1,jmax if(ireset(j).eq.kreset) then if(j.ne.settag) then parmq(2,j) = icnsav(j) else parmq(2,j) = 1 endif icango = parmq(2,j) Caution: dynamic spawning nodes must have nentries reset to 1 parmq(1,j) = 1 parmq(5,j) = 0 c caution: what about race condition for dynamically spawned jobs? caution: what about resetting nkids = parmq(4,j)? c idrsq = mod((j-1),nprocc) + 1 if (icango .eq. 0 ) then call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = j rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif c endif 1111 continue else comment: kreset = 0 and the stop tag must be restored. parmq(6,jobtag) = settag endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine RSCHED' call dump(endgrf,igraph) stop endif c return c c last card of rsched c end subroutine dep(jobtag,icango,nchks,mychkn) CVD$R NOCONCUR integer jobtag,icango,nchks,mychkn(*) c************************************************************************* c c warning - this routine may only be used by driver in a static definition c of the data dependencies in the task. c c c usage c subroutine xxx(<parms>) c external yyy c . c . c . c call dep(jobtag,icango,nchks,mychkn) c call putq(jobtag,yyy,<parms2>) c . c . c . c c this subroutine puts data dependencies for problem on the queue. c no synchronization is necessary because each index of a column of c parmq is associated with a jobtag specified by the user and c associated with a unique unit of computation. the arguments of c dep specify a the data dependencies associated with the unit of c computation labeled by jobtag and are placed in a column of parmq c to the menu specified below. c c c jobtag is an integer specifying a unique column of parmq obtained c from subprogram gettag and is reused when the process jobtag c becomes done. c c icango is a positive integer specifying how many processes must check c in to this process before it can be placed on the readyq. c c nchks is the number of processes that depend upon the completion of c this process. c c mychkn is an integer array specifying schedule jobtags of the c processes which depend upon completion of this process. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c c place process jobtag on the problem queue c no synchronization required since c only one program work executes this code. c if( icango .lt. 0 .or. nchks .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' incorrect specification of dependencies ' write(6,*) ' DEP parameters icango and nchks ' write(6,*) ' must be non-negative' write(6,*) ' input was ' write(6,*) ' jobtag ',jobtag write(6,*) ' icango ',icango write(6,*) ' nchks ',nchks write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED in subroutine DEP' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c parmq(1,jobtag) = 1 parmq(2,jobtag) = icango parmq(3,jobtag) = nchks parmq(4,jobtag) = 0 c c check to see that exactly one node has nchks set to 0 c if (nchks .eq. 0 .and. done .eq. 0) then done = -2 else if (nchks .eq. 0) done = 0 endif c c specify identifiers of processes which depend on this one c if there are too many abort c if (nchks .gt. nslots - 5) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' attempt to place too many dependencies ' write(6,*) ' on chekin list during call to dep ' write(6,*) ' with jobtag ',jobtag write(6,*) ' ' write(6,*) ' user tried to place ',nchks ,' dependencies ' write(6,*) ' the maximum number is ',nslots - 5 write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED in subroutine DEP' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif do 50 j = 1,nchks parmq(j+5,jobtag) = mychkn(j) c if (mychkn(j) .le. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' incorrect specification of dependencies ' write(6,*) ' all mychkn entries must be positive' write(6,*) ' input was ' write(6,*) ' mychkn(',j,') = ',mychkn(j) write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED in subroutine DEP' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c 50 continue call s_lock(glock) if (endgrf .gt. nbuffr) call dump(endgrf,igraph) insrt = endgrf endgrf = endgrf + 1 call s_unlock(glock) inext = unitag(jobtag) c trace for dep igraph(1,insrt) = 0 igraph(2,insrt) = inext igraph(3,insrt) = icango igraph(4,insrt) = nchks do 9001 jnsrt = 5,nchks + 4 igraph(jnsrt,insrt) = parmq(jnsrt+1,jobtag) 9001 continue gnames(insrt) = names(jobtag) c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for dep cterm igraph(1,insrt) = 0 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = icango cterm igraph(4,insrt) = nchks cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm do 9001 jnsrt = 8,nchks + 7 cterm igraph(jnsrt,insrt) = parmq(jnsrt-2,jobtag) cterm 9001 continue cterm gnames(insrt) = names(jobtag) c return c c last card of dep c end subroutine reset(jobtag,nreset) CVD$R NOCONCUR integer jobtag,nreset c************************************************************************** c c this subroutine saves reset values of icango if nreset .ne. 0. c c nreset is the integer flag specifing that job jobtag can have its c dependencies reset to the originals for the next iteration. c c************************************************************************** c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) if (nreset .ne. 0) then ireset(jobtag) = nreset icnsav(jobtag) = parmq(2,jobtag) endif c return c c last card of reset c end integer function gtprb(id,jobtag) CVD$R NOCONCUR c************************************************************************** c c this function gets unique access to the head of the readyq c pointed to by id and then claims the pointer to the next c schedulable process if there is one and returns with a nonzero c value when there is a process to schedule. if there are no entries c in the readyq indexed by id then the remaning ready ques are c polled in a round robin manner until schedulable process is found c or task done is recorded. if task done has been recorded the value c zero is returned in gtprb. if a nonzero value is returned in gtprb, c the integer jobtag will contain the identifier of the unit of c computation that is to be executed. c c input parameter c c id an integer specifying which readyq to access first c for work to do. c c output parameters c c jobtag an integer containing the next process to be executed c c gtprb the return value of this integer function is: c c 0 if task done has been posted c c nonzero if a schedulable process has been claimed. c c c*************************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c common block description: c c for a complete common block description see the routine libopn c c nspins = 0 fsave = second(foo) idrsq = id 10 continue mhead = -1 call s_lock(hrlock(idrsq)) c c gain access to head of readyq. if task done has not been recorded c then increment the head of the readyq. otherwise the head pointer c is left fixed so the next active process will receive task done. c if (rhead(idrsq) .ne. rtail(idrsq)) then mhead = rhead(idrsq) rhead(idrsq) = mod(rhead(idrsq),ndmrsq) + 1 endif call s_unlock(hrlock(idrsq)) if (mhead .gt. 0) then c c there was a work unit on the readyq c jobtag = readyq(mhead+ndmrsq*(idrsq-1)) Change: events 1 & 4 changed from here to if/else below. c if (jobtag .ne. done) then c c record the subroutine call identifier in gtprb and return c the process identifier in jobtag. c gtprb = parmq(1,jobtag) if (gtprb .gt. 1 .and. parmq(5,jobtag) .eq. 0) then gtprb = -1 else call s_lock(glock) if (endgrf .gt. nbuffr) call dump(endgrf,igraph) insrt = endgrf endgrf = endgrf + 1 call s_unlock(glock) inext = unitag(jobtag) if (inext .ge. intspn) then c trace for grprb/child(mhead.gt.0) igraph(1,insrt) = 4 igraph(2,insrt) = parmq(6,jobtag) igraph(3,insrt) = inext igraph(4,insrt) = second(foo) igraph(5,insrt) = id gnames(insrt) = names(jobtag) else c trace for grprb/parent(mhead.gt.0) igraph(1,insrt) = 1 igraph(2,insrt) = inext igraph(3,insrt) = second(foo) igraph(4,insrt) = id gnames(insrt) = names(jobtag) endif c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for grprb/child(mhead.gt.0) cterm igraph(1,insrt) = 4 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = id cterm igraph(8,insrt) = jobtag cterm igraph(9,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for grprb/parent(mhead.gt.0) cterm igraph(1,insrt) = 1 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = id cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) endif c else c c task done has been indicated. request a return from subroutine work c by returning the value 0 in gtprb. c gtprb = 0 c endif else c jobtag = readyq(rhead(idrsq)+ndmrsq*(idrsq-1)) if (jobtag .eq. done) then c c task done has been posted c gtprb = 0 c else c c there was not any work on the readyq c Change(3a): round robin test replaced by single statement. idrsq = mod(idrsq,nprocc) + 1 nspins = nspins + 1 if (mod(nspins,nprocc) .eq. 0) call nops go to 10 c endif cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for grprb/child(mhead.le.0) cterm igraph(1,insrt) = 14 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = id cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for grprb/parent(mhead.le.0) cterm igraph(1,insrt) = 13 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = id cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c endif return c c last card of gtprb c end subroutine libopn(nproc) integer nproc c************************************************************************ c c this routine sets locks and initializes variables c and then spawns nproc generic work routines. c c nproc is a positive integer. care should be taken to c match nproc to the number of physical processors c available. c c************************************************************************ parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER*1 WRLOCK COMMON /CONWRT/ WRLOCK integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c integer ispace(mxces) c c common block description: c c common/qdata/ c c parmq is a two dimensional integer array. each column of c this array represents a schedulable process. a process is c identified by its jobtag which corresponds to a unique c column of parmq. a column of parmq has the following c entries c c parmq(1,jobtag) = nentries c a nonzero integer. if process jobtag c is on the readyq then this integer c is equal to the one plus number of times c process jobtag has been entered. c thus when work executes this process c the integer is equal to the number c of times the process has been entered. c c parmq(2,jobtag) = icango c an integer specifying the number c of processes that must check in c before this process may scheduled c (ie. be placed on the ready queue) c c parmq(3,jobtag) = nchks c an integer specifying the number c of processes that this process c must checkin to. identifiers of c these processes are recorded below. c if nchks .eq. 0 then completion of c this process signifies completion of c task. c c parmq(4,jobtag) = the number of kids spawned by this c process. if this value is zero c then this process has not spawned c any subprocesses. c c parmq(5,jobtag) = entry_flag c has the value 1 if ientry was called c has the value 0 otherwise c c parmq(6:5+nchks,jobtag) is reserved for identifiers of the nchks c processes that must wait for completion c of this process before they can execute. c c fhead integer pointer to head of freeq. c c ftail integer pointer to tail of freeq. c c free integer flag so that there are free columns on parmq if c free = 1, while there are no free columns if free = 0. c c freeq one dimensional free list of free columns of parmq, with c free columns starting at fhead and ending at ftail in a c circular order. once a job is finished at the end of c chekin, its column or slot is added back onto freeq, c incrementing ftail mod mxprcs. c c snext integer counter holding the cumulative number of job tags c given out by gettag. c c unitag integer array holding the unique job tags "snext"s c corresponding to each current jobtag. c c intspn pointer to first spawned process. all jobtags with values c greater than or equal to intspn will be spawned processes. c c readyq a one dimensional integer array that holds the jobtags of c those processes that are ready to execute. the k-th block c of this array serves as a readyq for the k-th work routine. c on executing gtprb, the k-th work routine will look for work c in the k-th readyq first and then the others (round robin). c if readyq(*) .eq. done has been set then a return from c subroutine work(*,*) is indicated. c c rhead an integer array. the i-th entry of rhead is a pointer to the c head of the i-th block of readyq c c rtail an integer array. the i-th entry of rtail is a pointer to the c tail of the i-th block of readyq c c c common/qsync/ c c qlock is an integer array of locks. there is one lock for each c column of parmq. the purpose of this lock is to ensure c unique access to a column of parmq during the checkin operation. c c hrlock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer rhead to the head of the readyq. c c trlock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer rtail to the tail of the readyq. c c hflock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer fhead to the head of the freeq. c c tflock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer ftail to the tail of the freeq. c c common /qreset/ c c ireset is an integer flag array with ireset(j) .ne. 0 if job j c dependency will be reset, else ireset(j) = 0. c c icnsav is an integer array where icango will be caved for each job c that will be reset. c c common /CONWRT/ c c WRLOCK is an integer lock. the purpose of this lock is to ensure c a unique write during concurrent execution. c c done is a unique non positive integer set in libopn to indicate c task done. c c common /gphout/ c c endgrf is an integer pointing to the next available c slot in igraph c c glock is an integer lock. the purpose of this lock is to ensure c unique access to the pointer endgrf of a column of igraph. c c igraph is a two dimensional integer array c used as a buffer for graphics output c each column of igraph records an event. c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c Change(3): nproc passed in common as nprocc nprocc = nproc Change(3): ndmrsq is the size of each sub-q, corresp. one proc. ndmrsq = ldimrq/nprocc c if (nproc .gt. mxces-1 .or. nproc .lt. 1) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user asking for non-physical processors' write(6,*) ' on this system: nprocs = ',nproc write(6,*) ' the maximum allowed is nproc = ',mxces-1 write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine LIBOPN' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif ierr = m_set_procs(nproc) c done = -1 c c set readyq locks off c initialize readyq(*) = -1 to set done sequence c do 50 j = 1,nprocc call s_init_lock(hrlock(j)) call s_init_lock(trlock(j)) rhead(j) = 1 rtail(j) = 1 do 20 i = 1,ndmrsq readyq(i+ndmrsq*(j-1)) = -1 20 continue 50 continue c c set freeq pointers and locks c set qlocks off c initialize reentry indicator in parmq(5,*) c initial circular freeq with all parmq columns c free = 1 fhead = 1 ftail = mxprcs call s_init_lock(hflock) call s_init_lock(tflock) call s_init_lock(WRLOCK) call s_init_lock(glock) cterm call s_init_lock(glock) do 100 j = 1,mxprcs call s_init_lock(qlock(j)) parmq(5,j) = 0 freeq(j) = j ireset(j) = 0 icnsav(j) = 0 100 continue c c initialize queue pointers c intspn = 1 snext = 0 endgrf = 1 open( file='trace.graph',unit=3) c Change: Output nproc for sched.trace format write(3,30000) nproc 30000 format(i8) cterm endgrf = 1 cterm open( file='term.trace',unit=3) ctermc ctermChange: Output nproc for terminal trace format cterm write(3,30000) nproc cterm30000 format('nprocs = ',i1/) c c set lock on pointer to head of readyq so c no process may start until all process and data dependencies c have been specified by the user supplied routine driver. c do 150 j = 1,nprocc call s_lock(hrlock(j)) 150 continue c c now spawn virtual processors. these generic work routines will c assume the identity of any schedulable process specified by driver. c C$DOACROSS share(ispace) do 200 j = 1,nproc call work(j,ispace(j)) 200 continue call dump(endgrf,igraph) cterm call dump(endgrf,igraph) return c c last card of libopn c end subroutine nxtag(jobtag,mypar) CVD$R NOCONCUR CAUTION: nxtag arguments are consistent with dep now, but order of cont: arguments may not be consistent with older versions of ftsubs.f. integer jobtag,mypar c*********************************************************************** c c c this subroutine puts parental dependencies for problem on the c queue. the arguments of spawn specify a process for this job. c c jobtag is an integer specifying a unique column of parmq. c c mypar is an integer specifying the parent of the dynamically c spawned process jobtag. c c c*********************************************************************** c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c common block description: c c for a complete common block description see the subroutine libopn c c c c place this process on the free slot in the problem queue c obtained from subprogram gettag. c parmq(1,jobtag) = 1 parmq(2,jobtag) = 0 parmq(3,jobtag) = 1 parmq(6,jobtag) = mypar c call s_lock(glock) if (endgrf .gt. nbuffr) call dump(endgrf,igraph) insrt = endgrf endgrf = endgrf + 1 call s_unlock(glock) inext = unitag(jobtag) c trace for nxtag igraph(1,insrt) = 3 igraph(2,insrt) = mypar igraph(3,insrt) = inext gnames(insrt) = names(jobtag) c cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for nxtag cterm igraph(1,insrt) = 3 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = mypar cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = jobtag cterm gnames(insrt) = names(jobtag) c c update the icango counter of the parent process c by adding 2 to parmq(2,mypar)... prevents race condition. c add 1 to the number of kids spawned by parent mypar c call s_lock(qlock(mypar)) parmq(2,mypar) = parmq(2,mypar) + 2 parmq(4,mypar) = parmq(4,mypar) + 1 call s_unlock(qlock(mypar)) c c set number of kids spawned by jobtag to zero c parmq(4,jobtag) = 0 c c c return c c last card of nxtag c end subroutine start2 c c this routine allows parallel processing to start after user supplied c driver has completed by unlocking the head of the readyq c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) logical nostrt integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c c for common block description see subroutine libopn. c if (done .ne. 0) then write(6,*) '*************SCHED USER ERROR********************' if (done .eq. -1 ) then write(6,*) ' no process has set nchks equal to 0 ' else write(6,*) ' more than one process has set nchks to 0 ' endif write(6,*) ' SCHEDULE will not be able to terminate job' write(6,*) ' correctly ' write(6,*) ' ' write(6,*) ' check subroutine passed to initial call to' write(6,*) ' to see that at exactly one call to DEP has ' write(6,*) ' set nchks = 0 ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine START2' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nostrt = .true. do 100 iw = 1,nprocc if (rhead(iw) .ne. rtail(iw)) nostrt = .false. 100 continue if (nostrt) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' no process had an intitial icango of 0 ' write(6,*) ' SCHEDULE could not begin ' write(6,*) ' ' write(6,*) ' check subroutine passed to initial call to' write(6,*) ' to see that at least one call to DEP has ' write(6,*) ' set icango = 0 ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine START2' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif Change: intspn correction to recover lost jobtag. c intspn is the unique tag of the first or initially spawned process. intspn = snext + 1 do 200 iw = 1,nprocc call s_unlock(hrlock(iw)) 200 continue c return c c last card of start2 c end subroutine place(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c c this subroutine places a problem on the readyq c c jobtag is an integer specifying a unique column of parmq. c c c icango is a positive integer specifying how many processes must check c into this process before it can be placed on the readyq. c c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c c common block description: c c for a complete common block description see the subroutine libopn c c place this process on readyq if icango is 0 c when icango .eq. 0 this process does not depend on any c others. c cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc c mtail = 0 icango = parmq(2,jobtag) idrsq = mod((jobtag-1),nprocc) + 1 if (icango .eq. 0 ) then call s_lock(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call s_unlock(trlock(idrsq)) endif Change: cterm call s_lock(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call s_unlock(glock) cterm inext = unitag(jobtag) ctermc trace for place cterm igraph(1,insrt) = 6 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = icango cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine PLACE' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c return c c last card of place c end integer function ientry(mypar,nentrs) c integer mypar c***************************************************************************** c c this routine will allow process mypar to continue after c spawned processes have all checked in. it should only be called if c processes have been spawned by mypar through the use of c the subroutine spawn. c c go to (1000,2000,...,N000), ientry(mypar,N) c 1000 continue c . c . c . c do 10 j = 1,nproc c . c . (set parameters to define spawned process) c . c call nxtag(jobtag,mypar) c call spawn(jobtag,mypar,subname,<parms>) c 10 continue c return c 2000 continue c . c . c . c return c N000 continue c <statements> c return c end c c this subroutine returns the number of times process mypar c has been entered. if that number is equal to the total c number nentrs of expected reentries then parmq(5,mypar) c is set to 0 indicating no more reentries required. c c***************************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) integer endgrf integer*1 glock real igraph character*6 names,gnames common /calls/ names(mxprcs) common /gphnam/ gnames(nbuffr) common /gphout/ endgrf,igraph(nslots,nbuffr),glock cterm integer endgrf cterm integer*1 glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,igraph(nslots,nbuffr),glock c c report the entry point where process jobtag should resume c computation c inext = unitag(mypar) if (nentrs .lt. 2) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user call to IENTRY with number of ' write(6,*) ' labels in nentrs set less than 2 ' write(6,*) ' from parent process ',inext write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif ientry = parmq(1,mypar) if (ientry .lt. nentrs) then parmq(5,mypar) = nentrs else parmq(5,mypar) = 0 endif c return c c last card of ientry c end logical function wait(mypar,ienter) c integer mypar,ienter c***************************************************************************** c c this routine will allow process mypar to continue after c spawned processes have all checked in. it should only be called if c processes have been spawned by mypar through the use of c the subroutine spawn. this routine must be used in conjunction with c subroutine prtspn. the required syntax is c c go to (1000,...,L000,...,N000), ientry(mypar,N) c 1000 continue c . c . c . c do 100 j = 1,nproc c . c . (set parameters to define spawned process) c . c call nxtag(jobtag,mypar) c call spawn(jobtag,mypar,subname,<parms>) c 100 continue c label = L c if (wait(mypar,label)) return c L000 continue c . c . c . c c if this subroutine returns a value of .true. then the calling process c mypar should issue a return. if a value of .false. is returned then c the calling process mypar should resume execution at the c statement immediately following the reference to wait (ie. at L000 in c the example above. a return value .true. indicates that some spawned c process has not yet completed and checked in. a return value .false. c indicates all spawned processes have checked in. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 24,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,intspn,rhead,rtail, & done,free,fhead,ftail,snext,unitag & ,ireset,icnsav integer*1 qlock,hrlock,trlock,hflock,tflock common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) c c c check the icango counter to see if all spawned processes (kids) c have checked in. c inext = unitag(mypar) icango = 1 call s_lock(qlock(mypar)) icango = parmq(2,mypar) - parmq(4,mypar) call s_unlock(qlock(mypar)) c if (icango .eq. 0) then c c all kids are done ... dont wait (ie return false) c wait = .false. c c record re_entry label where computation is to c resume after wait is complete c parmq(1,mypar) = ienter c if (ienter .gt. parmq(5,mypar)) then write(6,*) '*************SCHED LIMIT ERROR*****************' write(6,*) ' executing SCHEDULE function WAIT ' write(6,*) ' return label larger than the maximum ' write(6,*) ' specified by user in call to ientry ' write(6,*) ' from parent process ', inext write(6,*) ' ' write(6,*) ' the maximum reentry number is ' write(6,*) ' ', parmq(5,mypar) write(6,*) ' ' write(6,*) ' EXECUTION TERMINATED BY SCHED ' call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c c set last re_entry indication (parmq(5,mypar) = 0) c if this reentry point corresponds to last one c (recorded in parmq(5,mypar) during call to ientry) c if (ienter .eq. parmq(5,mypar)) parmq(5,mypar) = 0 c else c c kids are not done c wait = .true. c c a checkin will be made so set the number of c entries to return label ienter - 1 to get c correct entry point after checkin c parmq(1,mypar) = ienter - 1 c endif c return c c last card of wait c end subroutine dump(endgrf,igraph) CVD$R NOCONCUR Change: combined SUN SCHED.TRACE/sched.trace and terminal version of dump. parameter (nslots = 105,nbuffr = 500) parameter (mxprcs = 1000) integer endgrf real igraph(nslots,nbuffr) character*6 gnames,aname common /gphnam/ gnames(nbuffr) integer ievent(nslots) c*********************************************************************** c c this routine writes graphics and terminal output to a file c and resets endgrf to 1 c c*********************************************************************** do 300 j = 1,endgrf-1 do 302 i = 1,nslots ievent(i) = igraph(i,j) 302 continue inext = ievent(2) if( ievent(1) .eq. 0 ) then aname = gnames(j) write(3,30000) (ievent(i),i=1,ievent(4)+4) write(3,30010) aname cterm write(3,3000) j,(ievent(i),i=1,7) cterm & ,aname,(ievent(i),i=8,ievent(4)+7) endif if( ievent(1) .eq. 1 ) then aname = gnames(j) write(3,30001) (ievent(i),i=1,2),igraph(3,j) & ,ievent(4) cterm write(3,3001) j,(ievent(i),i=1,7),aname,igraph(8,j) endif if( ievent(1) .eq. 2 ) then aname = gnames(j) write(3,30002) (ievent(i),i=1,2),igraph(3,j) cterm write(3,3002) j,(ievent(i),i=1,4),aname,igraph(5,j) endif if( ievent(1) .eq. 3 ) then aname = gnames(j) write(3,30003) (ievent(i),i=1,3),aname cterm write(3,3003) j,(ievent(i),i=1,6),aname endif if( ievent(1) .eq. 4 ) then aname = gnames(j) write(3,30004) (ievent(i),i=1,3),igraph(4,j) & ,ievent(5) cterm write(3,3004) j,(ievent(i),i=1,8),aname,igraph(9,j) endif if( ievent(1) .eq. 5 ) then aname = gnames(j) write(3,30005) (ievent(i),i=1,3),igraph(4,j) cterm write(3,3005) j,(ievent(i),i=1,5),aname,igraph(6,j) endif if( ievent(1) .eq. 6 ) then write(3,30002) (ievent(i),i=1,2),igraph(3,j) endif if( ievent(1) .eq. 7 ) then write(3,30005) (ievent(i),i=1,3),igraph(4,j) endif cterm if( ievent(1) .eq. 6 ) then cterm aname = gnames(j) cterm write(3,3006) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 7 ) then cterm aname = gnames(j) cterm write(3,3007) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 8 ) then cterm aname = gnames(j) cterm write(3,3008) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 9 ) then cterm aname = gnames(j) cterm write(3,3009) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 10 ) then cterm aname = gnames(j) cterm write(3,3010) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 11 ) then cterm aname = gnames(j) cterm write(3,3011) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 12 ) then cterm aname = gnames(j) cterm write(3,3012) j,(ievent(i),i=1,7),aname,igraph(8,j) cterm endif cterm if( ievent(1) .eq. 13 ) then cterm if ( ievent(4) .ne. 0 ) then cterm aname = gnames(j) cterm else cterm aname = ' work' cterm endif cterm write(3,3013) j,(ievent(i),i=1,4),aname,igraph(5,j) cterm endif cterm if( ievent(1) .eq. 14 ) then cterm if ( ievent(5) .ne. 0 ) then cterm aname = gnames(j) cterm else cterm aname = ' work' cterm endif cterm write(3,3014) j,(ievent(i),i=1,5),aname,igraph(6,j) cterm endif cterm if( ievent(1) .eq. 15 ) then cterm write(3,3015) j,(ievent(i),i=1,5) cterm endif 30000 format(14i8) 30010 format(2x,a) 30001 format(2i8,1pe16.8,i8) 30002 format(2i8,1pe16.8) 30003 format(3i8,2x,a) 30004 format(3i8,1pe16.8,i8) 30005 format(3i8,1pe16.8) cterm3000 format(i4,'. dep:',i2,';jobtag=',i4,';icango=',i4 cterm & ,'; nchks=',i4,';fhead,ftail=',i4,',',i4 cterm & /21x,12x,';idparm=',i4,';mytask= ',a6 cterm & /21x,'; mychkn(s)=',5i4,(/21x,10i4)) cterm3001 format(i4,'. gtprb/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (mhead.gt.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3002 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4 cterm & /4x,' (entryflag.eq.0)',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3003 format(i4,'. nxtag:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,12x,';fhead,ftail=',i4,',',i4 cterm & /21x,12x,';idparm=',i4,';mytask= ',a6) cterm3004 format(i4,'. gtprb/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (mhead.gt.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3005 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,' idrsq=',i4 cterm & /4x,' (entryflag.eq.0)',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3006 format(i4,'. place:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /21x,';icango=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3007 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';fhead,ftail=',i4,',',i4 cterm & /4x,' (nchks.ne.0) ',2x,';mychek=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3008 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';fhead,ftail=',i4,',',i4 cterm & /4x,' (nchks.ne.0) ',2x,12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3009 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (entryflag.ne.0)',';mychek=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3010 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (entryflag.ne.0)',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3011 format(i4,'.chekin/parent:',i2,';jobtag=',i4,12x cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /4x,' (nchks.eq.0) ',2x,';mychek=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3012 format(i4,'. chekin/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & ,'; idrsq=',i4,';rhead,rtail=',i4,',',i4 cterm & /,4x,' (nchks.eq.0) ',12x cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3013 format(i4,'. gtprb/parent:',i2,';jobtag=',i4,12x cterm & /4x,' (mhead.le.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3014 format(i4,'. gtprb/child:',i2,';jobtag=',i4,'; mypar=',i4 cterm & /4x,' (mhead.lt.0) ',';idwork=',i4 cterm & ,';idparm=',i4,';mytask= ',a6,'; time=',1pe16.8) cterm3015 format(i4,'. gettag:',i2,';jobtag=',i4,';idparm=',i4 cterm & ,12x,';fhead,ftail=',i4,',',i4) 300 continue c endgrf = 1 c return c c last line of dump c end subroutine name(jobtag,myname) parameter (mxprcs = 1000) character*6 names,myname common /calls/ names(mxprcs) names(jobtag) = myname return c c last card of name c end c subroutine lckasn(ilock) integer*1 ilock call s_init_lock(ilock) return end c subroutine lockon(ilock) integer*1 ilock call s_lock(ilock) return end c subroutine lockoff(ilock) integer*1 ilock call s_unlock(ilock) return end c subroutine nops j = 1 return end SHAR_EOF if test -f 'indx0.h' then echo shar: over-writing existing file "'indx0.h'" fi cat << \SHAR_EOF > 'indx0.h' indx[0].subname(indx[0].parms[0], indx[0].parms[1], indx[0].parms[2], indx[0].parms[3], indx[0].parms[4], indx[0].parms[5], indx[0].parms[6], indx[0].parms[7], indx[0].parms[8], indx[0].parms[9], indx[0].parms[10], indx[0].parms[11], indx[0].parms[12], indx[0].parms[13], indx[0].parms[14], indx[0].parms[15], indx[0].parms[16], indx[0].parms[17], indx[0].parms[18], indx[0].parms[19]); /* For more parms, remove comments and move paren/semicolon. indx[0].parms[20], indx[0].parms[21], indx[0].parms[22], indx[0].parms[23], indx[0].parms[24], indx[0].parms[25], indx[0].parms[26], indx[0].parms[27], indx[0].parms[28], indx[0].parms[29], indx[0].parms[30], indx[0].parms[31], indx[0].parms[32], indx[0].parms[33], indx[0].parms[34], indx[0].parms[35], indx[0].parms[36], indx[0].parms[37], indx[0].parms[38], indx[0].parms[39], indx[0].parms[40], indx[0].parms[41], indx[0].parms[42], indx[0].parms[43], indx[0].parms[44], indx[0].parms[45], indx[0].parms[46], indx[0].parms[47], indx[0].parms[48], indx[0].parms[49], indx[0].parms[50], indx[0].parms[51], indx[0].parms[52], indx[0].parms[53], indx[0].parms[54], indx[0].parms[55], indx[0].parms[56], indx[0].parms[57], indx[0].parms[58], indx[0].parms[59], indx[0].parms[60], indx[0].parms[61], indx[0].parms[62], indx[0].parms[63], indx[0].parms[64], indx[0].parms[65], indx[0].parms[66], indx[0].parms[67], indx[0].parms[68], indx[0].parms[69], indx[0].parms[70], indx[0].parms[71], indx[0].parms[72], indx[0].parms[73], indx[0].parms[74], indx[0].parms[75], indx[0].parms[76], indx[0].parms[77], indx[0].parms[78], indx[0].parms[79], indx[0].parms[80], indx[0].parms[81], indx[0].parms[82], indx[0].parms[83], indx[0].parms[84], indx[0].parms[85], indx[0].parms[86], indx[0].parms[87], indx[0].parms[88], indx[0].parms[89], indx[0].parms[90], indx[0].parms[91], indx[0].parms[92], indx[0].parms[93], indx[0].parms[94], indx[0].parms[95], indx[0].parms[96], indx[0].parms[97], indx[0].parms[98], indx[0].parms[99]); */ SHAR_EOF if test -f 'indxj.h' then echo shar: over-writing existing file "'indxj.h'" fi cat << \SHAR_EOF > 'indxj.h' indx[j].subname(indx[j].parms[0], indx[j].parms[1], indx[j].parms[2], indx[j].parms[3], indx[j].parms[4], indx[j].parms[5], indx[j].parms[6], indx[j].parms[7], indx[j].parms[8], indx[j].parms[9], indx[j].parms[10], indx[j].parms[11], indx[j].parms[12], indx[j].parms[13], indx[j].parms[14], indx[j].parms[15], indx[j].parms[16], indx[j].parms[17], indx[j].parms[18], indx[j].parms[19]); /* For more parms, remove comments and move paren/semicolon. indx[j].parms[20], indx[j].parms[21], indx[j].parms[22], indx[j].parms[23], indx[j].parms[24], indx[j].parms[25], indx[j].parms[26], indx[j].parms[27], indx[j].parms[28], indx[j].parms[29], indx[j].parms[30], indx[j].parms[31], indx[j].parms[32], indx[j].parms[33], indx[j].parms[34], indx[j].parms[35], indx[j].parms[36], indx[j].parms[37], indx[j].parms[38], indx[j].parms[39], indx[j].parms[40], indx[j].parms[41], indx[j].parms[42], indx[j].parms[43], indx[j].parms[44], indx[j].parms[45], indx[j].parms[46], indx[j].parms[47], indx[j].parms[48], indx[j].parms[49], indx[j].parms[50], indx[j].parms[51], indx[j].parms[52], indx[j].parms[53], indx[j].parms[54], indx[j].parms[55], indx[j].parms[56], indx[j].parms[57], indx[j].parms[58], indx[j].parms[59], indx[j].parms[60], indx[j].parms[61], indx[j].parms[62], indx[j].parms[63], indx[j].parms[64], indx[j].parms[65], indx[j].parms[66], indx[j].parms[67], indx[j].parms[68], indx[j].parms[69], indx[j].parms[70], indx[j].parms[71], indx[j].parms[72], indx[j].parms[73], indx[j].parms[74], indx[j].parms[75], indx[j].parms[76], indx[j].parms[77], indx[j].parms[78], indx[j].parms[79], indx[j].parms[80], indx[j].parms[81], indx[j].parms[82], indx[j].parms[83], indx[j].parms[84], indx[j].parms[85], indx[j].parms[86], indx[j].parms[87], indx[j].parms[88], indx[j].parms[89], indx[j].parms[90], indx[j].parms[91], indx[j].parms[92], indx[j].parms[93], indx[j].parms[94], indx[j].parms[95], indx[j].parms[96], indx[j].parms[97], indx[j].parms[98], indx[j].parms[99]); */ SHAR_EOF if test -f 'maindp.f' then echo shar: over-writing existing file "'maindp.f'" fi cat << \SHAR_EOF > 'maindp.f' $STDUNIT double precision dd(500),ee(500),qq(500,500) double precision esave(500),dsave(500) double precision s,s2,t,t2,tnorm,tnorm2,res,res2 integer icase c double precision d(500),e(500),q(500,500) integer n,ldq common /prbdef/n,ldq,q,d,e common/prfpms/ksect,kgran double precision enorm,dummy external treeql c ldq = 500 c write(6,*) ' input nproc ... the number of processors ' read(5,*) nproc do 9999 n = 100,100 ksect = n/10 kgran = n/20 write(6,*) 'nprocs = ',nproc write(6,*) 'ksect = ',ksect, ' kgran = ',kgran write(6,*)'==============================' write(6,*)' n = ',n do 9998 icase = 1,1 write(6,*)'++++++++++++++++++++++++++++++' if( icase .eq. 1 ) write(6,*)' twos on diagonal' if( icase .eq. 2 ) write(6,*)' random numbers on diagonal' if( icase .eq. 3 ) write(6,*)' glued wilks eps = 1.d-8 ' if( icase .eq. 4 ) write(6,*)' glued wilks eps = 1.d-14 ' nsect = 2**ksect nd2 = n/2 go to (112,113,114,115), icase 112 do 13 i = 1,n d(i) = 2.0 e(i) = 1 13 continue go to 445 113 do 14 i = 1,n d(i) = rand(foo) e(i) = rand(foo) 14 continue go to 445 114 continue eps = 1.e-8 do 151 inum = 1,num do 15 i = 1,21 d((inum-1)*21+i) = iabs(11-i) e((inum-1)*21+i) = 1 15 continue 151 continue go to 444 115 continue eps = 1.e-14 do 161 inum = 1,num do 16 i = 1,n d((inum-1)*21+i) = iabs(11-i) e((inum-1)*21+i) = 1 16 continue 161 continue 444 continue do 443 inum = 1,num e(inum*21+1) = eps 443 continue 445 continue do 10 j = 1,n do 5 i = 1,n q(i,j) = 0.0 qq(i,j) = 0.0 5 continue q(j,j) = 1.0 qq(j,j) = 1.0 dd(j) = d(j) ee(j) = e(j) dsave(j) = d(j) esave(j) = e(j) 10 continue c if (n .eq. 50) go to 9999 t1 = second(gtime) call tql2(ldq,n,dd,ee,qq,ierr) t2t = second(gtime) - t1 write(6,*) ' time for tql ',t2t c c the test problem has been defined now comes the numerical c refinements that will avoid cancellation c t1 = second(gtime) c call sched(nproc,treeql,n,ldq,q,d,e,ifail) c t2 = second(gtime) - t1 write(6,*) ' time for sesupd ',t2 if( ifail .gt. 1 ) write(6,*)' deflate from sesupd',ifail write(6,*)' ratio of tql2/new',t2t/t2 tnorm = 0.0d0 tnorm2 = 0.0d0 res = 0.0d0 res2 = 0.0d0 do 530 j = 1,n e(1) = dsave(1)*q(1,j) + $ esave(2)*q(2,j) - d(j)*q(1,j) ee(1) = dsave(1)*qq(1,j) + $ esave(2)*qq(2,j) - dd(j)*qq(1,j) do 400 i = 2,n-1 e(i) = esave(i)*q(i-1,j) + dsave(i)*q(i,j) + $ esave(i+1)*q(i+1,j) - d(j)*q(i,j) ee(i) = esave(i)*qq(i-1,j) + dsave(i)*qq(i,j) + $ esave(i+1)*qq(i+1,j) - dd(j)*qq(i,j) 400 continue e(n) = esave(n)*q(n-1,j) + dsave(n)*q(n,j) $ - d(j)*q(n,j) ee(i) = esave(n)*qq(n-1,j) + dsave(n)*qq(n,j) $ - dd(j)*qq(n,j) t = enorm(n,e) t2 = enorm(n,ee) res = max(res,t) res2 = max(res2,t2) if (t .gt. 1.0d-13) then c write(6,*)' j ',j, ' d ',d(j),' er ',t,' dd ',dd(j),' eer ',t2 c write(6,*) ' ev ',q(1,j),' eev ',qq(1,j) endif do 520 i = 1,n t = 0.0 s = 0.0 t2 = 0.0 s2 = 0.0 do 510 k = 1,n s2 = s2 + qq(k,i)*qq(k,j) s = s + q(k,i)*q(k,j) 510 continue if (i .eq. j) s2 = s2 - 1.0 if (i .eq. j) s = s - 1.0 t2 = abs(s2) t = abs(s) tnorm2 = max(tnorm2,t2) tnorm = max(tnorm,t) c if (t .gt. 1.0d-13) write(6,*) ' ij ',i,j,' tnorm ',tnorm 520 continue 530 continue write(6,*)' the residual for the tql values and vectors',res2 write(6,*)' the residual for the updated values and vectors',res write(6,*)' the tql norm of q*q sup t is',tnorm2 write(6,*)' the upd norm of q*q sup t is',tnorm write (6,*) ' spectrum [ ',d(1) ,',',d(n),']' 9998 continue 9999 continue stop end SHAR_EOF if test -f 'make1' then echo shar: over-writing existing file "'make1'" fi cat << \SHAR_EOF > 'make1' FILES = ftsubs.o putq.o second.o cputm.o FILES2 = ftsubs.graph.o putq.o second.o cputm.o sched : $(FILES) rm -f sched.a; ar q sched.a $(FILES); ranlib sched.a graph : $(FILES2) rm -f graph.a; ar q graph.a $(FILES2); ranlib graph.a .f.o : ; fortran -g -c -mp $*.f .c.o : ; cc -g -c $*.c SHAR_EOF if test -f 'makefile' then echo shar: over-writing existing file "'makefile'" fi cat << \SHAR_EOF > 'makefile' FILES1 = stuffspawn.o xtest : $(FILES1) fortran -g -mp $(FILES1) sched.a -o xtest gtest : $(FILES1) fortran -g -mp $(FILES1) graph.a -o gtest FILES2 = d_and_c.o xdandc : $(FILES2) fortran -g -mp $(FILES2) sched.a -o xdandc gdandc : $(FILES2) fortran -g -mp $(FILES2) graph.a -o gdandc FILES4 = maindp.o newevdp0.o stateig.o statses.o xeig : $(FILES4) fortran -g -mp $(FILES4) sched.a -o xeig geig : $(FILES4) fortran -g -mp $(FILES4) graph.a -o geig FILES5 = maindp.o newevdp0.o stateig.o stseswait.o xwait : $(FILES5) fortran -g -mp $(FILES5) sched.a -o xwait gwait : $(FILES5) fortran -g -mp $(FILES5) graph.a -o gwait FILES6 = example.o xexample : $(FILES6) fortran -g -mp $(FILES6) sched.a -o xexample gexample : $(FILES6) fortran -g -mp $(FILES6) graph.a -o gexample FILES7 = ts_dynamic.o xts_dynamic : $(FILES7) fortran -g -mp $(FILES7) sched.a -o xts_dynamic gts_dynamic : $(FILES7) fortran -g -mp $(FILES7) graph.a -o gts_dynamic FILES8 = blkjac.o xblkjac : $(FILES8) fortran -g -mp $(FILES8) sched.a -o xblkjac gblkjac : $(FILES8) fortran -g -mp $(FILES8) graph.a -o gblkjac sched: make -f make1 sched graph: make -f make1 graph clean: /bin/rm -f *.o *.fpp *.dbg *~ .f.o : ; fortran -g -c -mp $*.f SHAR_EOF if test -f 'maxparms.h' then echo shar: over-writing existing file "'maxparms.h'" fi cat << \SHAR_EOF > 'maxparms.h' #define MAXPARMS 20 SHAR_EOF if test -f 'newevdp0.f' then echo shar: over-writing existing file "'newevdp0.f'" fi cat << \SHAR_EOF > 'newevdp0.f' $STDUNIT subroutine evupd(n,i,d,z,delta,rho,dlam,ifail) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c*********************************************************************** c this subroutine computes the updated eigenvalues of a c rank one modification to a symmetric matrix. it is assumed c that the eigenvalues are in the array d, and that c c d(i) .ne. d(j) for i .ne. j c c it is also assumed that the eigenvalues are in increasing c order and that the value of rho is positive. this is c arranged by the calling subroutine sesupd, and is no loss c in generality. c it is also assumed that the values in the array z are c the squares of the components of the updatingvector. c c c the method consists of approximating the rational functions c c rho = sum(z(j)/((d(j)-d(i))/rho - lamda): j = i+1,n) c c phi =sum(z(j)/(d(j)-d(i))/rho - lamda): j = 1,i) c c by simple interpolating rational functions. this avoids c the need for safeguarding by bisection since c the convergence is monotone, and quadratic from any starting c point that is greater than zero but less than the solution c c c input variables... c n the length of all arrays c c i the i - th eigenvalue is computed c c d the original eigenvalues. it is assumed that they are c in order, d(i) .lt. d(j) for i .lt. j. c c z this array of length n contains the squares of c of the components of the updating vector c c delta this array of length n contains (d(j) - lamda(i))/rho c in its j - th component. these values will be used c to update the eigenvectors in sesupd. c c rho this is the scalar in the the symmetric updating c formula. c c dlam this scalar will contain the value of the i - th c updated eigenvalue on return c c ifail this integer variable indicates failure of the c updating process with value 1, and success c with value 0. c c c c************************************************************************** integer i,n,im1,ip1,ip2,niter dimension d(n),z(n), delta(n) double precision d,z,rho,delta,zero,one,two,tsave, 1 del,phi,dphi,psi,dpsi,lambda,oldlam, 2 t,temp,a,b,d1,w,eps,eps1,eta,dlam,dmax double precision epslon,enorm zero = 0.0d0 one = 1.0d0 two = 2.0d0 c c eps is machine precision c eta is the relative accuracy requirement on the roots. c c these values should be adjusted for the particular machine. c eps = epslon(one) eta = eps*8. c im1 = i - 1 ip1 = i + 1 ip2 = i + 2 niter = 1 lambda = zero oldlam = zero del = d(i) do 100 j = 1,n delta(j) = (d(j) - del)/rho 100 continue dmax = max(abs(d(1)), abs(d(n))) dmax = abs(rho) + dmax c c calculate initial guess c if (i .lt. n) * then del = d(ip1) else del = d(n) + rho endif a = zero do 200 j = 1,im1 a = a + rho*z(j)/(d(j) - del) 200 continue b = zero do 220 j = ip2,n b = b + rho*z(j)/(d(j) - del) 220 continue a = a + (b + one) if (ip1 .gt. n) * then t = z(i)/abs(a) else t = a*delta(ip1) b = t + z(i) + z(ip1) if (b .ge. zero) * then t = two*z(i)*delta(ip1)/ * (b + sqrt(abs(b*b - 4.0d0*t*z(i)))) else t = (b - sqrt(b*b - 4.0d0*t*z(i)))/(two*a) endif t = t/two endif c c test to see that the initial guess is not too close to endpoint c if (ip1 .le. n .and. t .ge. .9*delta(ip1)) * t = .9*delta(ip1) c c update the values of the array delta c 250 continue tsave = abs(t) do 300 j = 1,n delta(j) = delta(j) - t 300 continue lambda = lambda + t dlam = d(i) + rho*lambda c c evaluate psi and the derivative dpsi c dpsi = zero psi = zero do 400 j = 1,i t = z(j)/delta(j) psi = psi + t dpsi = dpsi + t/delta(j) 400 continue c c evaluate phi and the derivative dphi c dphi = zero phi = zero if (i .eq. n) go to 600 do 500 j = ip1,n t = z(j)/delta(j) phi = phi + t dphi = dphi + t/delta(j) 500 continue c c test for convergence c return if the test is satisfied c 600 continue w = one + phi + psi eps1 = eps*dmax*sqrt(dphi + dpsi) if ((abs(w) .le. eps1) .and. (abs(lambda - oldlam) .le. 1 eta*oldlam) ) then return endif c c return with ifail = -1 if convergence has not ocurred c within 45 iterations. c if (niter .lt. 45) go to 650 ifail = -1 return 650 continue niter = niter + 1 oldlam = lambda c c calculate the new step c if (i .ne. n) go to 700 c c otherwise c calculate the step for the special case i = n c t = (w*psi)/dpsi go to 250 700 continue del = delta(ip1) temp = psi/dpsi d1 = one + phi - del*dphi a = (del*(one + phi) + psi*temp)/d1 + temp b = (two*temp*del*w)/d1 t = sqrt(abs(a*a - two*b)) t = b/(a + t) if (t .lt. -eps) t = -tsave/two go to 250 c c last card of evupd c end subroutine deflat(j,jlam,jdel,indx,n,ldq,q,d,z,tol1) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c c c rotate z components to 0 c assumes norm(z) .eq. 1 c integer indx(*),jdel,j,jlam,n,ldq double precision q(ldq,*),d(*),z(*),gamma,sigma,tau,t1,t2, * tol,tol1 c tol = tol1 gamma = z(jlam) sigma = z(j) tau = pythag(gamma,sigma) jdel = -1 if (tau .gt. 0.0d0) * then delta = d(j) - d(jlam) gamma = gamma/tau sigma = sigma/tau if (abs(delta*gamma*sigma) .le. tol) * then if (abs(gamma) .le. abs(sigma)) * then z(j) = tau z(jlam) = 0.0d0 jjlam = indx(jlam) jj = indx(j) do 100 i = 1,n t1 = q(i,jjlam) t2 = q(i,jj) q(i,jjlam) = -t1*sigma + t2*gamma q(i,jj) = t1*gamma + t2*sigma 100 continue temp = d(jlam)*sigma**2 + d(j)*gamma**2 d(j) = d(jlam)*gamma**2 + d(j)*sigma**2 d(jlam) = temp jdel = jlam jlam = j else z(jlam) = tau z(j) = 0.0d0 jjlam = indx(jlam) jj = indx(j) do 200 i = 1,n t1 = q(i,jjlam) t2 = q(i,jj) q(i,jjlam) = t1*gamma + t2*sigma q(i,jj) = -t1*sigma + t2*gamma 200 continue temp = d(jlam)*gamma**2 + d(j)*sigma**2 d(j) = d(jlam)*sigma**2 + d(j)*gamma**2 d(jlam) = temp jdel = j endif endif endif c return c c last card of deflat end function rand(foo) CVD$G NOCONCUR data init/1365/ init = mod(3125*init,65536) rand = (init - 32768.)/(2.**15) return end double precision function enorm(n,x) CVD$G NOCONCUR integer n double precision x(n),temp,tmax tmax = 0.0d0 do 100 j = 1,n temp = max(tmax,abs(x(j))) if (temp .gt. tmax) tmax = temp 100 continue if (tmax .eq. 0.0d0) then enorm = tmax else temp = 0.0d0 do 200 j = 1,n temp = temp + (x(j)/tmax)**2 200 continue enorm = tmax*sqrt(temp) endif return end subroutine tql2(nm,n,d,e,z,ierr) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c integer i,j,k,l,m,n,ii,l1,l2,nm,mml,ierr double precision d(n),e(n),z(nm,n) double precision c,c2,c3,dl1,el1,f,g,h,p,r,s,s2,tst1,tst2,pythag c c this subroutine is a translation of the algol procedure tql2, c num. math. 11, 293-306(1968) by bowdler, martin, reinsch, and c wilkinson. c handbook for auto. comp., vol.ii-linear algebra, 227-240(1971). c c this subroutine finds the eigenvalues and eigenvectors c of a symmetric tridiagonal matrix by the ql method. c the eigenvectors of a full symmetric matrix can also c be found if tred2 has been used to reduce this c full matrix to tridiagonal form. c c on input c c nm must be set to the row dimension of two-dimensional c array parameters as declared in the calling program c dimension statement. c c n is the order of the matrix. c c d contains the diagonal elements of the input matrix. c c e contains the subdiagonal elements of the input matrix c in its last n-1 positions. e(1) is arbitrary. c c z contains the transformation matrix produced in the c reduction by tred2, if performed. if the eigenvectors c of the tridiagonal matrix are desired, z must contain c the identity matrix. c c on output c c d contains the eigenvalues in ascending order. if an c error exit is made, the eigenvalues are correct but c unordered for indices 1,2,...,ierr-1. c c e has been destroyed. c c z contains orthonormal eigenvectors of the symmetric c tridiagonal (or full) matrix. if an error exit is made, c z contains the eigenvectors associated with the stored c eigenvalues. c c ierr is set to c zero for normal return, c j if the j-th eigenvalue has not been c determined after 30 iterations. c c calls pythag for sqrt(a*a + b*b) . c c questions and comments should be directed to burton s. garbow, c mathematics and computer science div, argonne national laboratory c c this version dated august 1983. c c ------------------------------------------------------------------ c ierr = 0 if (n .eq. 1) go to 1001 c do 100 i = 2, n 100 e(i-1) = e(i) c f = 0.0d0 tst1 = 0.0d0 e(n) = 0.0d0 c do 240 l = 1, n j = 0 h = abs(d(l)) + abs(e(l)) if (tst1 .lt. h) tst1 = h c .......... look for small sub-diagonal element .......... do 110 m = l, n tst2 = tst1 + abs(e(m)) if (tst2 .eq. tst1) go to 120 c .......... e(n) is always zero, so there is no exit c through the bottom of the loop .......... 110 continue c 120 if (m .eq. l) go to 220 130 if (j .eq. 30) go to 1000 j = j + 1 c .......... form shift .......... l1 = l + 1 l2 = l1 + 1 g = d(l) p = (d(l1) - g) / (2.0d0 * e(l)) r = pythag(p,1.0d0) d(l) = e(l) / (p + sign(r,p)) d(l1) = e(l) * (p + sign(r,p)) dl1 = d(l1) h = g - d(l) if (l2 .gt. n) go to 145 c do 140 i = l2, n 140 d(i) = d(i) - h c 145 f = f + h c .......... ql transformation .......... p = d(m) c = 1.0d0 c2 = c el1 = e(l1) s = 0.0d0 mml = m - l c .......... for i=m-1 step -1 until l do -- .......... do 200 ii = 1, mml c3 = c2 c2 = c s2 = s i = m - ii g = c * e(i) h = c * p r = pythag(p,e(i)) e(i+1) = s * r s = e(i) / r c = p / r p = c * d(i) - s * g d(i+1) = h + s * (c * g + s * d(i)) c .......... form vector .......... do 180 k = 1, n h = z(k,i+1) z(k,i+1) = s * z(k,i) + c * h z(k,i) = c * z(k,i) - s * h 180 continue c 200 continue c p = -s * s2 * c3 * el1 * e(l) / dl1 e(l) = s * p d(l) = c * p tst2 = tst1 + abs(e(l)) if (tst2 .gt. tst1) go to 130 220 d(l) = d(l) + f 240 continue c .......... order eigenvalues and eigenvectors .......... do 300 ii = 2, n i = ii - 1 k = i p = d(i) c do 260 j = ii, n if (d(j) .ge. p) go to 260 k = j p = d(j) 260 continue c if (k .eq. i) go to 300 d(k) = d(i) d(i) = p c do 280 j = 1, n p = z(j,i) z(j,i) = z(j,k) z(j,k) = p 280 continue c 300 continue c go to 1001 c .......... set error -- no convergence to an c eigenvalue after 30 iterations .......... 1000 ierr = l 1001 return end double precision function pythag(a,b) double precision a,b c c finds sqrt(a**2+b**2) without overflow or destructive underflow c double precision p,r,s,t,u p = max(abs(a),abs(b)) if (p .eq. 0.0d0) go to 20 r = (min(abs(a),abs(b))/p)**2 10 continue t = 4.0d0 + r if (t .eq. 4.0d0) go to 20 s = r/t u = 1.0d0 + 2.0d0*s p = u*p r = (s/u)**2 * r go to 10 20 pythag = p return end double precision function epslon (x) double precision x c c estimate unit roundoff in quantities of size x. c double precision a,b,c,eps c c this program should function properly on all systems c satisfying the following two assumptions, c 1. the base used in representing floating point c numbers is not a power of three. c 2. the quantity a in statement 10 is represented to c the accuracy used in floating point variables c that are stored in memory. c the statement number 10 and the go to 10 are intended to c force optimizing compilers to generate code satisfying c assumption 2. c under these assumptions, it should be true that, c a is not exactly equal to four-thirds, c b has a zero for its last bit or digit, c c is not exactly equal to one, c eps measures the separation of 1.0 from c the next larger floating point number. c the developers of eispack would appreciate being informed c about any systems where these assumptions do not hold. c c this version dated 4/6/83. c a = 4.0d0/3.0d0 10 b = a - 1.0d0 c = b + b + b eps = abs(c-1.0d0) if (eps .eq. 0.0d0) go to 10 epslon = eps*abs(x) epslon = 1.0d-16 return end subroutine evdrv(k,kstart,kstop,ldq,n,q,d,rho,z,dlamda,q2, * w,indxp,indx,ksnrho,ifail) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c integer k,kstart,kstop,ldq,n,ifail integer indx(n),indxp(n),ksnrho logical rhoge0 double precision * q(ldq,n),d(n),z(n),q2(ldq,n),dlamda(n), * w(n),rho c c local variables c double precision x(300),delta(300) double precision t,s,dlam double precision enorm rhoge0 = .true. if (ksnrho .lt. 0) rhoge0 = .false. c do 2700 j = kstart,kstop i = j if (.not. rhoge0) i = k - i + 1 call evupd(k,i,dlamda,w,delta,rho,dlam,ifail) c c if the zero finder failed the computation is terminated c if (ifail .eq. 1) return c if (rhoge0) * then jp = indxp(j) else dlam = -dlam jp = indxp(k-j+1) endif c d(jp) = dlam c c compute the updated eigenvectors c do 2200 i1 = 1,n x(i1) = 0.0d0 2200 continue do 2500 jj = 1,k j3 = jj if (.not. rhoge0) j3 = k - jj + 1 jp = indxp(j3) jjpp = indx(jp) s = z(jp)/delta(j3) c qq(jj,j) = s do 2300 i1 = 1,n x(i1) = x(i1) + q(i1,jjpp)*s 2300 continue 2500 continue t = enorm(n,x) j3 = j if (.not. rhoge0) j3 = k - j + 1 jp = indxp(j3) do 2600 i1 = 1,n q2(i1,jp) = x(i1)/t 2600 continue 2700 continue c return c c last card of evdrv c end SHAR_EOF if test -f 'oldtest' then echo shar: over-writing existing file "'oldtest'" fi cat << \SHAR_EOF > 'oldtest' make -f make1 sched fortran -g -c -mp ftsubs.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 1039 lines. File ftsubs.fpp cc -g -c putq.c fortran -g -c -mp second.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 23 lines. File second.fpp cc -g -c cputm.c rm -f sched.a; ar q sched.a ftsubs.o putq.o second.o cputm.o; ranlib sched.a ar: creating sched.a make -f make1 graph fortran -g -c -mp ftsubs.graph.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 1229 lines. File ftsubs.graph.fpp rm -f graph.a; ar q graph.a ftsubs.graph.o putq.o second.o cputm.o; ranlib graph.a ar: creating graph.a fortran -g -c -mp stuffspawn.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 151 lines. File stuffspawn.fpp fortran -g -mp stuffspawn.o sched.a -o xtest input nprocs time = +4.250000 nprocs = 12 +1.00000000000000 -4.00000000000000 -6.00000000000000 -8.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +2.00000000000000 -6.00000000000000 -8.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +3.00000000000000 -8.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +4.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +5.00000000000000 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +6.00000000000000 -1.40000000000000E+001 -1.60000000000000E+001 +7.00000000000000 -1.60000000000000E+001 +8.00000000000000 Programmed STOP fortran -g -mp stuffspawn.o graph.a -o gtest input nprocs time = +7.480000 nprocs = 12 +1.00000000000000 -4.00000000000000 -6.00000000000000 -8.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +2.00000000000000 -6.00000000000000 -8.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +3.00000000000000 -8.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +4.00000000000000 -1.00000000000000E+001 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +5.00000000000000 -1.20000000000000E+001 -1.40000000000000E+001 -1.60000000000000E+001 +6.00000000000000 -1.40000000000000E+001 -1.60000000000000E+001 +7.00000000000000 -1.60000000000000E+001 +8.00000000000000 Programmed STOP fortran -g -c -mp d_and_c.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 65 lines. File d_and_c.fpp fortran -g -mp d_and_c.o sched.a -o xdandc input nprocs nlevls 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Programmed STOP fortran -g -mp d_and_c.o graph.a -o gdandc input nprocs nlevls 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Programmed STOP fortran -g -c -mp maindp.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 150 lines. File maindp.fpp fortran -g -c -mp newevdp0.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 463 lines. File newevdp0.fpp fortran -g -c -mp stateig.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 124 lines. File stateig.fpp fortran -g -c -mp statses.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 280 lines. File statses.fpp fortran -g -mp maindp.o newevdp0.o stateig.o statses.o sched.a -o xeig input nproc ... the number of processors nprocs = 12 ksect = 10 kgran = 5 ============================== n = 100 ++++++++++++++++++++++++++++++ twos on diagonal time for tql +2.017100E+02 time for sesupd +7.58999633789063 ratio of tql2/new+2.65757712829049E+001 the residual for the tql values and vectors+9.53307622954643E-015 the residual for the updated values and vectors+3.55342003405926E-015 the tql norm of q*q sup t is+7.77156117237610E-015 the upd norm of q*q sup t is+1.42929017146048E-015 spectrum [ +3.86880573281076E-003,+3.99613119426719] Programmed STOP fortran -g -mp maindp.o newevdp0.o stateig.o statses.o graph.a -o geig input nproc ... the number of processors nprocs = 12 ksect = 10 kgran = 5 ============================== n = 100 ++++++++++++++++++++++++++++++ twos on diagonal time for tql +2.007200E+02 time for sesupd +9.33000183105469 ratio of tql2/new+2.15133935507506E+001 the residual for the tql values and vectors+9.53307622954643E-015 the residual for the updated values and vectors+3.55342003405926E-015 the tql norm of q*q sup t is+7.77156117237610E-015 the upd norm of q*q sup t is+1.42929017146048E-015 spectrum [ +3.86880573281076E-003,+3.99613119426719] Programmed STOP fortran -g -c -mp stseswait.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 300 lines. File stseswait.fpp fortran -g -mp maindp.o newevdp0.o stateig.o stseswait.o sched.a -o xwait input nproc ... the number of processors nprocs = 12 ksect = 10 kgran = 5 ============================== n = 100 ++++++++++++++++++++++++++++++ twos on diagonal time for tql +2.003900E+02 time for sesupd +7.77999877929688 ratio of tql2/new+2.57570733716437E+001 the residual for the tql values and vectors+9.53307622954643E-015 the residual for the updated values and vectors+3.55342003405926E-015 the tql norm of q*q sup t is+7.77156117237610E-015 the upd norm of q*q sup t is+1.42929017146048E-015 spectrum [ +3.86880573281076E-003,+3.99613119426719] Programmed STOP fortran -g -mp maindp.o newevdp0.o stateig.o stseswait.o graph.a -o gwait input nproc ... the number of processors nprocs = 12 ksect = 10 kgran = 5 ============================== n = 100 ++++++++++++++++++++++++++++++ twos on diagonal time for tql +2.004700E+02 time for sesupd +9.26998901367188 ratio of tql2/new+2.16256999792599E+001 the residual for the tql values and vectors+9.53307622954643E-015 the residual for the updated values and vectors+3.55342003405926E-015 the tql norm of q*q sup t is+7.77156117237610E-015 the upd norm of q*q sup t is+1.42929017146048E-015 spectrum [ +3.86880573281076E-003,+3.99613119426719] Programmed STOP fortran -g -c -mp example.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 99 lines. File example.fpp fortran -g -mp example.o sched.a -o xexample Input number of processors sigma = +5.005000E+05 Programmed STOP fortran -g -mp example.o graph.a -o gexample Input number of processors sigma = +5.005000E+05 Programmed STOP fortran -g -c -mp ts_dynamic.f FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 202 lines. File ts_dynamic.fpp fortran -g -mp ts_dynamic.o sched.a -o xts_dynamic 1. 47. 5. 51. 209. 9. 55. 213. 355. 13. 59. 217. 359. 485. 17. 63. 221. 363. 489. 599. 21. 67. 225. 367. 493. 603. 697. 25. 71. 229. 371. 497. 607. 701. 779. 29. 75. 233. 375. 501. 611. 705. 783. 845. 33. 79. 237. 379. 505. 615. 709. 787. 849. 895. 37. 83. 241. 383. 509. 619. 713. 791. 853. 899. 929. 41. 85. 243. 385. 511. 621. 715. 793. 855. 901. 931. 945. 43. # 8 4.43000 code norder niter nprocs maxjobs seconds ts_dynamic.f 43 1000 8 946 4.43000 Programmed STOP fortran -g -mp ts_dynamic.o graph.a -o gts_dynamic 1. 47. 5. 51. 209. 9. 55. 213. 355. 13. 59. 217. 359. 485. 17. 63. 221. 363. 489. 599. 21. 67. 225. 367. 493. 603. 697. 25. 71. 229. 371. 497. 607. 701. 779. 29. 75. 233. 375. 501. 611. 705. 783. 845. 33. 79. 237. 379. 505. 615. 709. 787. 849. 895. 37. 83. 241. 383. 509. 619. 713. 791. 853. 899. 929. 41. 85. 243. 385. 511. 621. 715. 793. 855. 901. 931. 945. 43. # 8 31.08000 code norder niter nprocs maxjobs seconds ts_dynamic.f 43 1000 8 946 31.08000 Programmed STOP fortran -g -c -mp blkjac.f "blkjac.f", line 239: already typed mdim is already typed integer initag,mdim,m,n ^ FORTRAN 77 V3.2 (C) Copyright 1981, 1988 Silicon Valley Software Inc. 0 errors. 337 lines. File blkjac.fpp fortran -g -mp blkjac.o sched.a -o xblkjac Static Block Jacobi Input: nprocs = 8; (m,n) = ( 100, 100); (mblks,nblks) = ( 10, 10) ; max iterations = 100; nprec = 2 Parameter input: (mdim,ndim) = ( 102, 102); maxblkdim = 11 ; maxprc = 8; nmychk = 110; (mbpts,nbpts) = ( 10, 10) (xmax,ymax) = ( 100.00, 100.00); nprec = 2( tol = .5000D-02) Static Block Jacobi - Iteration SCHEDULE Final Results: j/i 1 11 21 31 41 51 61 71 81 91 101 102 102 .50 .50 .52 .54 .58 .62 .68 .74 .81 .90 .99 1.00 101 .49 .50 .52 .54 .58 .62 .67 .74 .81 .90 .99 1.00 91 .35 .45 .47 .50 .53 .58 .64 .71 .79 .89 .99 1.00 81 .25 .40 .42 .45 .49 .55 .61 .69 .78 .88 .99 1.00 71 .17 .35 .37 .41 .45 .51 .58 .66 .76 .87 .99 1.00 61 .10 .30 .33 .36 .41 .47 .55 .64 .74 .86 .99 1.00 51 .06 .25 .28 .32 .37 .43 .51 .61 .72 .85 .99 1.00 41 .03 .20 .23 .27 .33 .40 .48 .58 .70 .84 .98 1.00 31 .01 .16 .18 .23 .28 .36 .45 .56 .68 .83 .98 1.00 21 .00 .11 .14 .18 .24 .32 .42 .53 .67 .82 .98 1.00 11 .00 .06 .09 .14 .20 .28 .39 .51 .65 .81 .98 1.00 1 .00 .01 .04 .09 .16 .25 .35 .48 .63 .79 .98 1.00 # 8 39.23000 code m n mblks nblks niter nprocs maxjobs BLOCK-JACOBI 100 100 10 10 19 8 1959 STATIC VERSION seconds = 39.23000; uvdiff = .48139D-02 Programmed STOP fortran -g -mp blkjac.o graph.a -o gblkjac Static Block Jacobi Input: nprocs = 8; (m,n) = ( 100, 100); (mblks,nblks) = ( 10, 10) ; max iterations = 100; nprec = 2 Parameter input: (mdim,ndim) = ( 102, 102); maxblkdim = 11 ; maxprc = 8; nmychk = 110; (mbpts,nbpts) = ( 10, 10) (xmax,ymax) = ( 100.00, 100.00); nprec = 2( tol = .5000D-02) Static Block Jacobi - Iteration SCHEDULE Final Results: j/i 1 11 21 31 41 51 61 71 81 91 101 102 102 .50 .50 .52 .54 .58 .62 .68 .74 .81 .90 .99 1.00 101 .49 .50 .52 .54 .58 .62 .67 .74 .81 .90 .99 1.00 91 .35 .45 .47 .50 .53 .58 .64 .71 .79 .89 .99 1.00 81 .25 .40 .42 .45 .49 .55 .61 .69 .78 .88 .99 1.00 71 .17 .35 .37 .41 .45 .51 .58 .66 .76 .87 .99 1.00 61 .10 .30 .33 .36 .41 .47 .55 .64 .74 .86 .99 1.00 51 .06 .25 .28 .32 .37 .43 .51 .61 .72 .85 .99 1.00 41 .03 .20 .23 .27 .33 .40 .48 .58 .70 .84 .98 1.00 31 .01 .16 .18 .23 .28 .36 .45 .56 .68 .83 .98 1.00 21 .00 .11 .14 .18 .24 .32 .42 .53 .67 .82 .98 1.00 11 .00 .06 .09 .14 .20 .28 .39 .51 .65 .81 .98 1.00 1 .00 .01 .04 .09 .16 .25 .35 .48 .63 .79 .98 1.00 # 8 77.14000 code m n mblks nblks niter nprocs maxjobs BLOCK-JACOBI 100 100 10 10 19 8 1959 STATIC VERSION seconds = 77.14000; uvdiff = .48139D-02 Programmed STOP SHAR_EOF if test -f 'putq.c' then echo shar: over-writing existing file "'putq.c'" fi cat << \SHAR_EOF > 'putq.c' #include <stdio.h> #include "maxparms.h" /* Code: putq.c for number of active jobs up to 1000 (indx[1001]). Caution: spawn (& nxtag) are reordered to be consistent with putq (& dep) arguments. */ struct parms { int static_link; int (*subname)(); long *parms[MAXPARMS]; }; shared struct parms indx[1001]; sched_(nprocs,parms) int *nprocs; struct parms parms; /* this procedure obtains nprocs physical processors devoted to the the execution of the parallel program indicated through parms which is a structure whose first entry is a subroutine name and whose remaining entries are parameters appearing in the calling sequence of that subroutine. */ { int libopn_(); bcopy(&parms, &indx[0], sizeof(struct parms)); /* the subroutine name and prameter list have been copied and placed in a special slot on the parmq then libopn is invoked to initialize pointers, grab physical processors and begin the computation */ libopn_(nprocs); return(0); } putq_(jobtag,parms) int *jobtag; struct parms parms; /* this procedure puts the descriptor of a schedulable process <jobtag> onto the problem queue. this process will be scheduled for execution when its data dependencies have been satisfied (indicated by icango==0). the argument parms is a structure whose first entry is a subroutine name and whose remaining entries are parameters appearing in the calling sequence of that subroutine. */ { register int j; int place_(); j = *jobtag; bcopy(&parms, &indx[j], sizeof(struct parms)); /* first the parms block is copied into the slot pointed to by by jobtag and then this descriptor is placed on the problem queue */ place_(jobtag); return(0); } spawn_(jobtag,parent,parms) int *jobtag,*parent; struct parms parms; /* this procedure puts the descriptor of a schedulable process <jobtag> onto the problem queue. this process will be scheduled for execution when its data dependencies have been satisfied (indicated by icango==0). the argument parms is a structure whose first entry is a subroutine name and whose remaining entries are parameters appearing in the calling sequence of that subroutine. the action of this procedure differs from putq in that the user does not assign jobtags or data dependencies. a parent may spawn any number of children but these child processes only report to the parent. Caution: First two arguments of NXTAG and SPAWN are reversed from older versions. */ { register int j,i; int place_(),clone_(); j = *jobtag; i = *parent; bcopy(&parms, &indx[j], sizeof(struct parms)); /* first the parms block is copied into the slot pointed to by by jobtag and then this descriptor is placed on the problem queue */ if (indx[j].subname == clone_) indx[j].subname = indx[i].subname; /* here we ask if this is a recursive spawning. if so the name clone has been called instead of subname so we replace the name clone by subname. */ place_(jobtag); return(0); } clone_() { /* this is a dummy routine to satisfy unresolved external */ return(0); } work_(id,jobtag) int *id,*jobtag; { int start2_(),gtprb_(); register int j,myjob; j = *id; if (j == 1) /* the worker whose id is 1 will execute the subroutine passed to sched. this subroutine executes the static data dependency graph. this graph must have at least one node. */ { #include "indx0.h" start2_(); } myjob = gtprb_(id,jobtag); while (myjob != 0) { j = *jobtag; if (myjob <= -1 ) { /* reenter... simple spawning was done all kids completed and no reentry is required. this indicates jobtag is all done and checkin can proceed. */ chekin_(jobtag); myjob = gtprb_(id,jobtag); } else { /* call subname(<parms>).......... */ #include "indxj.h" chekin_(jobtag); myjob = gtprb_(id,jobtag); } } return(0); } SHAR_EOF if test -f 'second.f' then echo shar: over-writing existing file "'second.f'" fi cat << \SHAR_EOF > 'second.f' $SYSTEM $STDUNIT real function second(t) c c this routine will gather the user time for a process. c it has resolution of a millisecond c and uses the unix system call getrusage. c see the unix manual for details c returns time in seconds. c integer cputm c itime = cputm() second = float(itime)/1000.0 c c this statement is here to bump the time by a bit c incase no the interval was too small. c c second = second + second*1.0e-6 c return end SHAR_EOF if test -f 'stateig.f' then echo shar: over-writing existing file "'stateig.f'" fi cat << \SHAR_EOF > 'stateig.f' $STDUNIT subroutine treeql(n,ldq,q,d,e,ifail) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c * integer n,ldq double precision d(*),e(*),q(ldq,*) c double precision z(500),dlamda(500),x(500),q2(500,500),w(500), * delta(500),rho(300) integer indx(500),indxp(500),nn(3,300),kwork(50,300),ifail integer itags(300) common /wspace/z,dlamda,x,q2,w,delta,rho,indx,indxp,nn,kwork, * itags common/prfpms/nsmall,kgran external sesupd,tql2 character*6 subnam c c this subroutine splits a problem in two parts c nn(1,1) = n nn(2,1) = 1 nn(3,1) = 1 ifail = 0 nsize = n nlevl = -1 1111 continue nlevl = nlevl + 1 if (nsize .gt. 2*nsmall) then c c the problem is large enough to split one more level c define the splits here in the 100 loop and then c place a call to sesupd on the queue to glue the results c together c nsize = nsize/2 do 100 id = 2**nlevl,2**(nlevl+1) - 1 c idl = 2*id idr = idl + 1 n1 = nn(1,id)/2 c nn(1,idl) = n1 nn(2,idl) = idl nn(3,idl) = nn(3,id) c nn(1,idr) = nn(1,id) - n1 nn(2,idr) = idr nn(3,idr) = nn(3,id) + n1 c isplt1 = nn(3,idr) isplt = isplt1 - 1 n1p1 = n1 + 1 rho(id) = e(isplt1) alpha = d(isplt) alphap = d(isplt1) sigma = 1.0d0 c if (sign(1.0d0,alpha)*sign(1.0d0,alphap) .ge. 0.0d0) * then if (alpha .lt. 0.0d0 .or. alphap .lt. 0.0d0) * then if(rho(id) .ne. 0.0d0) * then rho(id) = - rho(id) sigma = -sigma endif endif endif kwork(50,id) = sigma d(isplt) = d(isplt) - rho(id) d(isplt1) = d(isplt1) - rho(id) c c the problem is large enough to split into two parts c isplt = nn(3,id) c call gettag(jobtag) itags(id) = jobtag icango = 2 nchks = 1 if (id .eq. 1) nchks = 0 list = itags(id/2) subnam = 'sesupd' call name(jobtag,subnam) call dep(jobtag,icango,nchks,list) call putq(jobtag,sesupd,itags(id),ldq,nn(1,id), * q(isplt,isplt),d(isplt),rho(id),z(isplt), * x(isplt),dlamda(isplt),q2(isplt,isplt), * delta(isplt),w(isplt),indxp(isplt), * indx(isplt),kwork(1,id),ifail) c 100 continue go to 1111 else c c we have reached the lowest level c granularity now warrants call to tql2 c do 200 id = 2**nlevl,2**(nlevl+1) - 1 c isplt = nn(3,id) c call gettag(jobtag) itags(id) = jobtag icango = 0 ncheks = 1 list = itags(id/2) subnam = 'tql2' call name(jobtag,subnam) call dep(jobtag,icango,ncheks,list) call putq(jobtag,tql2,ldq,nn(1,id),d(isplt),e(isplt), * q(isplt,isplt),ifail) c 200 continue endif c return c c last card of split c end SHAR_EOF if test -f 'statses.f' then echo shar: over-writing existing file "'statses.f'" fi cat << \SHAR_EOF > 'statses.f' $STDUNIT subroutine sesupd(myid,ldq,n,q,d,rho,z,x,dlamda,q2,delta,w,indxp, * indx,kwork,ifail) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c*********************************************************************** c c c this subroutine will compute the updated eigensystem of a c of a symmetric matrix after modification by a rank one c symmetric matrix. c c a = qdq' + rho*z*z' c c it is assumed that the eigenvectors of the original matrix c are stored in q, and the eigenvalues are in d. c the algorithm consists of three stages... c c c the first stage constists of deflating the size of c the problem when there multiple eigenvalues or if there c zero of the vector q'z. for each such ocurrence the dimension c is reduced by one. c c the second stage consists of calculating the updated c eigenvalues of the reduced problem. this requires a call c to the zero finding routine evupd. c c the final stage consists of computing the updated eigenvectors c directly using the updated eigenvalue. c c c the algorithm requires o(n**2) operations to update the c eigenvectors, but n**3 + o(n**2) to update the eigenvectors. c c c input variables... c c n the dimension of the problem. q is n x n c c q an n x n matrix that contains the eigenvectors of c the original matrix on input and the updated c eigenvectors on output. c c d a vector of length n. the original eigenvalues are c contained in d on input. the updated eigenvectors c are contained on output. c c rho a scalar c c z a vector of length n. on input this vector c containes the updating vector. the contents of z c are destroyed during the updating process. c c x a working array of length n. c c dlamda a working array of length n c c q2 a working array of dimension n x n c c delta a working array of length n c c w a working array of length n. c c indx an integer array of length n. c c ifail this integer variable indicates failure of the c updating process with value 1, and success c with value 0. c c called subroutines... c c evupd a subroutine for calculating the updated eigenvalues. c c c*********************************************************************** integer ldq,n,ifail integer indx(n),indxp(n),kwork(*) integer ksize(20),kstart(20),kstop(20),kbins,msize,krem,prcsys integer kgran,ksect common/prfpms/ksect,kgran logical rhoge0 double precision * q(ldq,n),d(n),z(n),x(n),dlamda(n),q2(ldq,n),delta(n) * ,w(n),rho double precision eps,zero,one,two,s,t,evsprd,dmax,dlam double precision epslon,enorm external evdrv character*6 subnam c c eps is machine precision c ifail = 0 zero = 0.0d0 one = 1.0d0 two = 2.0d0 eps = epslon(one) rhoge0 = (rho .ge. zero) go to (1111,2222),ientry(myid,2) 1111 continue c c compute q(transpose)*z c n1 = n/2 n1p1 = n1 + 1 do 100 j = 1,n1 z(j) = q(n1,j) 100 continue sigma = kwork(50) if (sigma .gt. 0.0) then do 200 j = n1p1,n z(j) = q(n1p1,j) 200 continue else do 201 j = n1p1,n z(j) = -q(n1p1,j) 201 continue endif c c normalize z so that norm(z) = 1 c t = enorm(n,z) do 300 j = 1,n z(j) = z(j)/t indx(j) = j 300 continue rho = rho*t*t c c calculate the allowable deflation tolerence c tol = (1.0d2)*eps*max(abs(d(1)),abs(d(n))) if (abs(rho) .le. tol) return c c c order the eigenvalues c nm1 = n - 1 do 600 j = 1,nm1 t = d(1) s = z(1) inx = indx(1) k = n - j + 1 do 500 i = 2,k im1 = i - 1 if (d(i) .lt. t) go to 400 t = d(i) s = z(i) inx = indx(i) go to 500 400 continue d(im1) = d(i) d(i) = t z(im1) = z(i) z(i) = s indx(im1) = indx(i) indx(i) = inx 500 continue 600 continue c c if there multiple eigenvalues then the problem deflates. c here the number of equal eigenvalues are found. c then an elementary reflector is computed to rotate the c corresponding eigensubspace so that certain components of c z are zero in this new basis. c if (rhoge0) *then k = 0 k2 = n + 1 do 670 j = 1,n if(rho*abs(z(j)) .le. tol) * then c c deflate due to small z component c k2 = k2 - 1 indxp(k2) = j if (j .eq. n) go to 1111 else jlam = j go to 700 endif 670 continue 700 continue j = j + 1 if (j .gt. n) go to 800 if(rho*abs(z(j)) .le. tol) c if(rho*abs(z(j)) .le. 1.0d-10) * then c c deflate due to small z component c k2 = k2 - 1 indxp(k2) = j else call deflat(j,jlam,jdef,indx,n,ldq,q,d,z,tol) if (jdef .le. 0) * then k = k + 1 w(k) = z(jlam)**2 dlamda(k) = d(jlam) indxp(k) = jlam jlam = j else k2 = k2 - 1 indxp(k2) = jdef endif endif go to 700 800 continue c c record the last eigenvalue c k = k + 1 w(k) = z(jlam)**2 dlamda(k) = d(jlam) indxp(k) = jlam else jlam = n k = n + 1 k2 = 0 do 671 j = n,1,-1 if(abs(rho*z(j)) .le. tol) c if(abs(rho*z(j)) .le. 1.0d-10) * then c c deflate due to small z component c k2 = k2 + 1 indxp(n-k2+1) = j if (j .eq. 1) go to 1111 else jlam = j go to 701 endif 671 continue 701 continue j = j - 1 if (j .lt. 1) go to 801 if(abs(rho*z(j)) .le. tol) * then c c deflate due to small z component c k2 = k2 + 1 indxp(n-k2+1) = j else call deflat(j,jlam,jdef,indx,n,ldq,q,d,z,tol) if (jdef .le. 0) * then k = k - 1 w(n-k+1) = z(jlam)**2 dlamda(n-k+1) = -d(jlam) indxp(n-k+1) = jlam jlam = j else k2 = k2 + 1 indxp(n-k2+1) = jdef endif endif go to 701 801 continue c c record the last eigenvalue c k = k - 1 w(n-k+1) = z(jlam)**2 dlamda(n-k+1) = -d(jlam) indxp(n-k+1) = jlam k = n - k + 1 rho = -rho endif c c compute the updated eigenvalues of the deflated problem c c kbins = k/kgran + 1 kbins = min(kbins,20) krem = mod(k,kbins) msize = (k - krem)/kbins do 900 j = 1,kbins ksize(j) = msize 900 continue do 910 j = 1,krem ksize(j) = ksize(j) + 1 910 continue kstart(1) = 1 kstop(1) = ksize(1) kwork(49) = k kwork(48) = -1 if (rhoge0) kwork(48) = 1 if ( kbins .gt. 1 ) then c do 920 j = 2,kbins kstart(j) = kstop(j-1) + 1 kstop(j) = kstop(j-1) + ksize(j) kwork(j) = kstart(j) kwork(kbins+j) = kstop(j) call gettag(jtemp) subnam = 'evdrv' call name(jtemp,subnam) call nxtag(jtemp,myid) call spawn(jtemp,myid,evdrv,kwork(49), * kwork(j),kwork(kbins+j), * ldq,n,q,d,rho,z,dlamda,q2, * w,indxp,indx,kwork(48),ifail) 920 continue call evdrv(k,kstart(1),kstop(1),ldq,n,q,d,rho,z, * dlamda,q2,w,indxp,indx,kwork(48),ifail) c c kbins-1 processes were spawned c place a join point here and resume when all spawned c processes have checked in c c else c c there was not sufficient granularity to warrant spawning c of additional processes to compute roots c kstop(1) = k call evdrv(k,kstart(1),kstop(1),ldq,n,q,d,rho,z, * dlamda,q2,w,indxp,indx,kwork(48),ifail) endif return c 2222 continue c c store the updated eigenvectors back into q c k = kwork(49) do 1100 j = k+1,n jp = indxp(j) jjpp = indx(jp) do 1000 i = 1,n q2(i,jp) = q(i,jjpp) 1000 continue 1100 continue do 1300 j = 1,n do 1200 i = 1,n q(i,j) = q2(i,j) 1200 continue 1300 continue c return c c last card of sesupd c end SHAR_EOF if test -f 'stseswait.f' then echo shar: over-writing existing file "'stseswait.f'" fi cat << \SHAR_EOF > 'stseswait.f' $STDUNIT subroutine sesupd(myid,ldq,n,q,d,rho,z,x,dlamda,q2,delta,w,indxp, * indx,kwork,ifail) implicit real*8 (a-h,o-z) CVD$G NOCONCUR c*********************************************************************** c c c this subroutine will compute the updated eigensystem of a c of a symmetric matrix after modification by a rank one c symmetric matrix. c c a = qdq' + rho*z*z' c c it is assumed that the eigenvectors of the original matrix c are stored in q, and the eigenvalues are in d. c the algorithm consists of three stages... c c c the first stage constists of deflating the size of c the problem when there multiple eigenvalues or if there c zero of the vector q'z. for each such ocurrence the dimension c is reduced by one. c c the second stage consists of calculating the updated c eigenvalues of the reduced problem. this requires a call c to the zero finding routine evupd. c c the final stage consists of computing the updated eigenvectors c directly using the updated eigenvalue. c c c the algorithm requires o(n**2) operations to update the c eigenvectors, but n**3 + o(n**2) to update the eigenvectors. c c c input variables... c c n the dimension of the problem. q is n x n c c q an n x n matrix that contains the eigenvectors of c the original matrix on input and the updated c eigenvectors on output. c c d a vector of length n. the original eigenvalues are c contained in d on input. the updated eigenvectors c are contained on output. c c rho a scalar c c z a vector of length n. on input this vector c containes the updating vector. the contents of z c are destroyed during the updating process. c c x a working array of length n. c c dlamda a working array of length n c c q2 a working array of dimension n x n c c delta a working array of length n c c w a working array of length n. c c indx an integer array of length n. c c ifail this integer variable indicates failure of the c updating process with value 1, and success c with value 0. c c called subroutines... c c evupd a subroutine for calculating the updated eigenvalues. c c c*********************************************************************** integer ldq,n,ifail integer indx(n),indxp(n),kwork(*) integer ksize(20),kstart(20),kstop(20),kbins,msize,krem,prcsys integer kgran,ksect common/prfpms/ksect,kgran common/wsync/iwrite logical rhoge0,wait double precision eps,zero,one,two,s,t,evsprd,dmax,dlam double precision epslon,enorm double precision * q(ldq,n),d(n),z(n),x(n),dlamda(n),q2(ldq,n),delta(n) * ,w(n),rho character*6 subnam external evdrv c c c eps is machine precision c ifail = 0 zero = 0.0d0 one = 1.0d0 two = 2.0d0 eps = epslon(one) rhoge0 = (rho .ge. zero) go to (1111,2222),ientry(myid,2) 1111 continue c c compute q(transpose)*z c n1 = n/2 n1p1 = n1 + 1 do 100 j = 1,n1 z(j) = q(n1,j) 100 continue sigma = kwork(50) c write(6,*) ' sigma at 2 ',sigma if (sigma .gt. 0.0) then do 200 j = n1p1,n z(j) = q(n1p1,j) 200 continue else do 201 j = n1p1,n z(j) = -q(n1p1,j) 201 continue endif c c normalize z so that norm(z) = 1 c t = enorm(n,z) c call lockon(iwrite) c write(6,*) ' n t z1 zn ',n,t,z(1),z(n) c call lockoff(iwrite) do 300 j = 1,n z(j) = z(j)/t indx(j) = j 300 continue rho = rho*t*t c c calculate the allowable deflation tolerence c tol = (1.0d2)*eps*max(abs(d(1)),abs(d(n))) if (abs(rho) .le. tol) return c c order the eigenvalues c c write(6,*) ' before order evs ',d(1),d(n) nm1 = n - 1 do 600 j = 1,nm1 t = d(1) s = z(1) inx = indx(1) k = n - j + 1 do 500 i = 2,k im1 = i - 1 if (d(i) .lt. t) go to 400 t = d(i) s = z(i) inx = indx(i) go to 500 400 continue d(im1) = d(i) d(i) = t z(im1) = z(i) z(i) = s indx(im1) = indx(i) indx(i) = inx 500 continue 600 continue c c if there multiple eigenvalues then the problem deflates. c here the number of equal eigenvalues are found. c then an elementary reflector is computed to rotate the c corresponding eigensubspace so that certain components of c z are zero in this new basis. c if (rhoge0) *then k = 0 k2 = n + 1 do 670 j = 1,n if(rho*abs(z(j)) .le. tol) * then c c deflate due to small z component c k2 = k2 - 1 indxp(k2) = j if (j .eq. n) go to 1111 else jlam = j go to 700 endif 670 continue 700 continue j = j + 1 if (j .gt. n) go to 800 if(rho*abs(z(j)) .le. tol) c if(rho*abs(z(j)) .le. 1.0d-10) * then c c deflate due to small z component c k2 = k2 - 1 indxp(k2) = j else call deflat(j,jlam,jdef,indx,n,ldq,q,d,z,tol) if (jdef .le. 0) * then k = k + 1 w(k) = z(jlam)**2 dlamda(k) = d(jlam) indxp(k) = jlam jlam = j else k2 = k2 - 1 indxp(k2) = jdef endif endif go to 700 800 continue c c record the last eigenvalue c k = k + 1 w(k) = z(jlam)**2 dlamda(k) = d(jlam) indxp(k) = jlam else jlam = n k = n + 1 k2 = 0 do 671 j = n,1,-1 if(abs(rho*z(j)) .le. tol) c if(abs(rho*z(j)) .le. 1.0d-10) * then c c deflate due to small z component c k2 = k2 + 1 indxp(n-k2+1) = j if (j .eq. 1) go to 1111 else jlam = j go to 701 endif 671 continue 701 continue j = j - 1 if (j .lt. 1) go to 801 if(abs(rho*z(j)) .le. tol) * then c c deflate due to small z component c k2 = k2 + 1 indxp(n-k2+1) = j else call deflat(j,jlam,jdef,indx,n,ldq,q,d,z,tol) if (jdef .le. 0) * then k = k - 1 w(n-k+1) = z(jlam)**2 dlamda(n-k+1) = -d(jlam) indxp(n-k+1) = jlam jlam = j else k2 = k2 + 1 indxp(n-k2+1) = jdef endif endif go to 701 801 continue c c record the last eigenvalue c k = k - 1 w(n-k+1) = z(jlam)**2 dlamda(n-k+1) = -d(jlam) indxp(n-k+1) = jlam k = n - k + 1 rho = -rho endif c write(6,*) ' begin a problem ------------------------------- ' c write(6,*) ' ' c write(6,*) k,rho c do 6666 jj = 1,k c write(6,*) z(indxp(jj)),w(jj),dlamda(jj) c6666 continue c write(6,*) ' begin a problem ------------------------------- ' c c c compute the updated eigenvalues of the deflated problem c c kbins = k/kgran + 1 kbins = min(kbins,20) krem = mod(k,kbins) msize = (k - krem)/kbins do 900 j = 1,kbins ksize(j) = msize 900 continue do 910 j = 1,krem ksize(j) = ksize(j) + 1 910 continue c call lockon(iwrite) c write(6,*) ' from ses w ',w(1),w(2),w(k) c call lockoff(iwrite) kstart(1) = 1 kstop(1) = ksize(1) kwork(49) = k kwork(48) = -1 if (rhoge0) kwork(48) = 1 if ( kbins .gt. 1 ) then do 920 j = 2,kbins kstart(j) = kstop(j-1) + 1 kstop(j) = kstop(j-1) + ksize(j) kwork(j) = kstart(j) kwork(kbins+j) = kstop(j) c call lockon(iwrite) c write(6,*) ' about to spawn evdrv ',k,kstart(j),kstop(j) c call lockoff(iwrite) call gettag(jtemp) subnam = 'evdrv' call name(jtemp,subnam) call nxtag(jtemp,myid) call spawn(jtemp,myid,evdrv,kwork(49), * kwork(j),kwork(kbins+j), * ldq,n,q,d,rho,z,dlamda,q2, * w,indxp,indx,kwork(48),ifail) 920 continue c call lockon(iwrite) c write(6,*) ' about to call evdrv ',k,kstart(1),kstop(1) c call lockoff(iwrite) call evdrv(k,kstart(1),kstop(1),ldq,n,q,d,rho,z, * dlamda,q2,w,indxp,indx,kwork(48),ifail) c c kbins-1 processes were spawned c place a join point here and resume when all spawned c processes have checked in c c else c c there was not sufficient granularity to warrant spawning c of additional processes to compute roots c kstop(1) = k c write(6,*) 'bef rho q d z x ',rho,q(1,1),d(1),z(1),x(1) c call lockon(iwrite) c write(6,*) ' about to call evdrv ',k,kstart(1),kstop(1) c call lockoff(iwrite) call evdrv(k,kstart(1),kstop(1),ldq,n,q,d,rho,z, * dlamda,q2,w,indxp,indx,kwork(48),ifail) endif if( wait(myid,2) ) return c 2222 continue c c store the updated eigenvectors back into q c k = kwork(49) do 1100 j = k+1,n jp = indxp(j) jjpp = indx(jp) do 1000 i = 1,n q2(i,jp) = q(i,jjpp) 1000 continue 1100 continue do 1300 j = 1,n do 1200 i = 1,n q(i,j) = q2(i,j) 1200 continue 1300 continue c call lockon(iwrite) c write(6,*) ' after ret in ses ',k,n c write(6,*) ' k n z1 zn ',k,n,q(1,1),q(n,1) c call lockoff(iwrite) c ifail = n-k c return c c last card of sesupd c end SHAR_EOF if test -f 'stuffspawn.f' then echo shar: over-writing existing file "'stuffspawn.f'" fi cat << \SHAR_EOF > 'stuffspawn.f' $STDUNIT double precision a,b common /prbdef/ a(1000),b(100),itmp(10),jtmp(10),myid(10) EXTERNAL PARALG nblks = 8 do 10 j = 1,10 itmp(j) = j jtmp(j) = j 10 continue write(6,*) ' input nprocs ' read (5,*) nprocs t1 = second(foo) c CALL SCHED(nprocs,paralg,nblks,a,b,itmp,jtmp,myid) c t2 = second(foo) write(6,*) ' time = ',t2-t1,' nprocs = ',nprocs nn = 36 do 100 j = 1,nn write(6,*) a(j) 100 continue stop end c subroutine paralg(n,a,b,itmp,jtmp,myid) integer itmp(*),jtmp(*),myid(*) double precision a(*),b(*) integer mychkn(1) character*6 subnam EXTERNAL STUFF1 c c this is the driver for filling a packed triangular matrix with c i on the i-th diagonal and -(i+j) in the (i,j) off diagonal position c do 150 j = 1, n call GETTAG(jobtag) myid(j) = jobtag 150 continue c icount = 1 do 200 j = 1, n c c the j-th diagonal waits for the diagonal above to complete c the j-th diagonal completion will allow c the (j+1)-st diagonal to start c jobtag = myid(j) if (j .eq. 1) then icango = 0 else icango = 1 endif if (j .eq. n) then nchks = 0 else nchks = 1 mychkn(1) = myid(j + 1) endif c c we just set up data dependencies and are ready to put c this process on the queue c subnam = 'stuff1' CALL NAME(jobtag,subnam) CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,stuff1,myid(j),n,a(icount),jtmp(j),itmp(1)) c c when the data dependencies for process jobtag are satisfied c the following call will be made c c call stuff1(myid....,itmp(1)) c icount = icount + (n-j+1) 200 continue c return end c subroutine stuff1(myid,n,a,j,itmp) double precision a(*) integer myid,n,j,itmp(*) logical wait character*6 subnam EXTERNAL STUFF2,STUFF3 c go to (1111,2222,3333),ientry(myid,3) 1111 continue ii = 2 do 100 i = 2,n c CALL GETTAG(jobtag) subnam = 'stuff2' CALL NAME(jobtag,subnam) CALL NXTAG(jobtag,myid) CALL SPAWN(jobtag,myid,stuff2,a(ii + 1),itmp(i),itmp(j)) c c this spawns a process that will execute a call to stuff2 c and report completion to process MYID c ii = ii + 1 100 continue call stuff2(a(2),itmp(1),itmp(j)) if (wait(myid,2)) return c c return to help out and then return here (at label 2222) c on the next reentry c 2222 continue ii = 2 do 200 i = 2,n c CALL GETTAG(jobtag) subnam = 'stuff3' CALL NAME(jobtag,subnam) CALL NXTAG(jobtag,myid) CALL SPAWN(jobtag,myid,stuff3,a(ii + 1),itmp(i),itmp(j)) c c this spawns a process that will execute a call to stuff3 c and report completion to process MYID c ii = ii + 1 200 continue call stuff3(a(2),itmp(1),itmp(j)) if (wait(myid,3)) return 3333 continue a(1) = j c return end subroutine stuff2(a,i,j) double precision a(*),one one = 1.0d0 a(1) = 0.0d0 do 100 kk = 1,10000 a(1) = a(1) + one 100 continue a(1) = -(i+j) return end subroutine stuff3(a,i,j) double precision a(*),one,save one = 1.0d0 save = a(1) do 100 kk = 1,10000 a(1) = a(1) + one 100 continue a(1) = -(i+j) + save return end c SHAR_EOF if test -f 'testrun' then echo shar: over-writing existing file "'testrun'" fi cat << \SHAR_EOF > 'testrun' make sched; make graph make xtest; xtest < data make gtest; gtest < data make xdandc; xdandc < data make gdandc; gdandc < data make xeig; xeig < data make geig; geig < data make xwait; xwait < data make gwait; gwait < data make xexample; xexample < data make gexample; gexample < data make xts_dynamic; xts_dynamic < data.ts_dynamic make gts_dynamic; gts_dynamic < data.ts_dynamic make xblkjac; xblkjac < data.blkjac make gblkjac; gblkjac < data.blkjac SHAR_EOF chmod +x 'testrun' if test -f 'ts_dynamic.f' then echo shar: over-writing existing file "'ts_dynamic.f'" fi cat << \SHAR_EOF > 'ts_dynamic.f' $STDUNIT program dynamc c Code Name: ts_dynamic.f Code Input Data: [nprocessors] [narraysize] [nworkiterations] Code Note: variation on tridiagonal stuffer program ts_dynamic.f Change: Modification for circular parmq & super nxtag. c: modification of ts_dynamic.f to correspond to ts_static.f c...or old stuffspawn.f: the triangular array stuffer. parameter(maxsiz=1000,mxszsq=500500,maxprc=8) c: mxszsq .ge. maxsiz*(maxsiz+1)/2 double precision a,b common /comitr/ niter common /prbdef/ a(mxszsq),b(maxsiz),itmp(maxsiz),jtmp(maxsiz) & ,statag(maxsiz) EXTERNAL PARALG C write(6,*) ' input order of array .le. 44, but dim =',maxsiz C read (5,*) nblks C write(6,*) ' input nprocs .le. ',maxprc C read (5,*) nprocs read(5,*) nprocs,nblks,niter c if(nblks.gt.maxsiz) then write(6,*) 'order of array, nblks =',nblks,' .gt. ',maxsiz & ,' = maxsize' write(6,*) 'S T O P E X E C U T I O N I N M A I N' stop endif c mxjobs = nblks*(nblks+1)/2 jstep = (nblks)/(10-0) c do 10 j = 1,nblks itmp(j) = j jtmp(j) = j 10 continue c c: add second.f timer t1=second(foo) t2=second(foo) c do 111 jj = 1,100 c CALL SCHED(nprocs,paralg,nblks,a,b,itmp,jtmp,statag) c c111 continue t3=second(foo) tt=t3-t2-(t2-t1) c c output c lower triangle of a matrix of order n c do 100 j = 1,nblks k = j b(1)= a(j) do 50 i = 1,j-1 b(i+1) = a(k+nblks-i) k = k+nblks-i 50 continue if(mxjobs.lt.100) then write(6,1000) (b(i),i=1,j) else if(mod(j-1,jstep).eq.0.or.j.eq.nblks) & write(6,2000) (b(i),i=1,j-1,jstep),b(j) endif 100 continue 1000 format(16f5.0) 2000 format(11f7.0) c if(nprocs.eq.1) write(6,664) nblks,niter,mxjobs 664 format(' # ts_dynamic.f = pgm.f schedule gettag & name program' & /' # ftsubs.f for circular readyq & parmq & freeq version' & /' # with nblks =',i5'; niter =',i8,'; maxjobs =',i8) write(6,665) nprocs,tt 665 format(' #',i2,f12.5) write(*,666) nblks,niter,nprocs,mxjobs,tt 666 format(11x,'code',4x,'norder',4x,' niter',4x,'nprocs' & ,3x,'maxjobs',5x,'seconds' & /3x,'ts_dynamic.f',4i10,f12.5) c stop end c subroutine paralg(n,a,b,itmp,jtmp,statag) integer n,itmp(*),jtmp(*),statag(*) double precision a(*),b(*) integer mychkn(1) EXTERNAL STUFF1 c c this is the driver for filling a packed triangular matrix with c j on the j-th diagonal and (j*n+i-j*(j+1)/2) in the (i,j) off c diagonal position c c first, get all static job tags necessary to construct the c dependency graph. c do 100 j = 1,n Caution: statag(j) gets the schedule output static job tag. CALL GETTAG(statag(j)) 100 continue c icount = 1 do 200 j = 1,n-1 c c c the j-th diagonal waits for the diagonal above to complete c c the j-th diagonal completion will allow c the (j+1)-st diagonal to start c c jobtag = j icango = 1 if (jobtag .eq. 1) icango = 0 nchks = 1 Caution: jobtag = j 's chekin is defined in terms of schedule static tags. mychkn(1) = statag(j+1) c c we just set up data dependencies and are ready to put c this process on the queue c jobtag = statag(j) CALL name(jobtag,'stuff1') CALL DEP(jobtag,icango,nchks,mychkn) CAUTION: Make certain that all arguments of the subroutine whose name cont: is passed to SCHEDULE, are global variables, as jtmp(j) is for cont: sub name stuff1. CALL PUTQ(jobtag,stuff1,statag(j),n, & a(icount),jtmp(j),itmp(1)) c c when the data dependencies for process statag(j) are satisfied c the following call will be made c c call stuff1(jobtag,....,itmp(1)) c icount = icount + (n-j+1) 200 continue c icango = 1 nchks = 0 Caution: mychkn gets dummy value only. mychkn(1) = n+1 c jobtag = statag(n) CALL name(jobtag,'stuff1') CALL DEP(jobtag,icango,nchks,mychkn) CALL PUTQ(jobtag,stuff1,statag(n),n, & a(icount),jtmp(n),itmp(1)) c return end c subroutine stuff1(mypar,n,a,j,itmp) double precision a(*) integer mypar,n,j,itmp(*) logical wait EXTERNAL STUFF2 c c write(6,*) ' enter stuff1 ',mypar,j c c write(6,*) ' enter stuff1 ',ientry(mypar),mypar nentrs=2 go to (1111,2222),ientry(mypar,nentrs) 1111 continue ii = 1 do 100 i = j+1,n c CALL GETTAG(jobtag) CALL name(jobtag,'stuff2') CAUTION: ARGUMENTS OF NXTAG & SPAWN ARE REORDERED FROM OLDER VERSIONS, CAUTION: MAKING THEM MORE CONSISTENT WITH DEP & PUTQ. CALL NXTAG(jobtag,mypar) c write(6,*) ' about to spawn jobtag, mypar ',jobtag,mypar CALL SPAWN(jobtag,mypar,stuff2,a(ii + 1),itmp(i),itmp(j),n) c c this spawns a process that will execute a call to stuff2 c and report completion to parent process MYPAR c ii = ii + 1 100 continue iexit=2 if (wait(mypar,iexit)) return c c return to help out and then return here (at label 2222) c on the next reentry c 2222 continue c a(1) = j c return end subroutine stuff2(a,i,j,n) double precision a(*) common /comitr/ niter do 99999 kk = 1,niter a(1) = a(1) + kk 99999 continue a(1) = j*n + i - j*(j+1)/2 return end SHAR_EOF # End of shell archive exit 0