$(F77) $(F77NO_OPTFLAGS) -c $*.f
to:
$(F77) $(F77NO_OPTFLAGS) -fno-globals -fno-f90 -fugly-complex -w -c $*.f
Flags necessary to compile the BLACS tester with Intel's
Fortran compiler
If you compile the tester with Intel's Fortran compiler, it will hang while
determining epsilon unless you add -fp_port to F77NO_OPTFLAGS
in your Bmake.inc file.
MPIBLACS SECTION:
Error in most MPI implementations of MPI_Abort.
This error was last confirmed in MPICH 1.0.13 and MPICH 1.1. MPI_Abort
does not kill any other processes at all, but seems to behave much
like a local call to exit(). This causes the BLACS tester to
hang on the BLACS_ABORT test in the auxiliary test. Here is straight
MPI code demonstrating the error:
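   #include "mpi.h"
   int main(int narg, char **args)
   {
      int i, Iam, Np;
      MPI_Init(&narg, &args);
      MPI_Comm_size(MPI_COMM_WORLD, &Np);
      MPI_Comm_rank(MPI_COMM_WORLD, &Iam);
      /* Only the last process aborts; a correct MPI_Abort kills all of them */
      if (Iam == Np-1) MPI_Abort(MPI_COMM_WORLD, -2);
      /* With the error, the remaining processes spin here forever */
      while(1);
      MPI_Finalize();
      return 0;
   }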
Problems compiling dwalltime00
There is an undiagnosed problem that causes some users' dwalltime00
routine to return bad values. It appears likely that there is a problem with
macro name overruns, but errors in cpp or the code have not been ruled out.
If you get bad return values from dwalltime00, overwrite
BLACS/SRC/MPI/dwalltime00_.c with:
   #include "Bdef.h"
   #if (INTFACE == C_CALL)
   double Cdwalltime00(void)
   #else
   F_DOUBLE_FUNC dwalltime00_(void)
   #endif
   {
      return(MPI_Wtime());
   }
Sun f77 and gcc compiler mismatch.
Users of Sun's f77 compilers may need to throw the -f
flag to force 8-byte double precision scalar alignment, which
gcc-compiled BLACS expect. Therefore, add -f to the
NOPT macro in SLmake.inc and to the
F77NO_OPTFLAGS in Bmake.inc.
NOTE: this is an old entry, and may no longer be needed.
T3E MPI error in handling zero-length segments
mpt.1.2.0.0.6beta couldn't handle 0-length segments used with
MPI_Type_indexed. To work around this problem, throw the
T3ETrError flag in your Bmake.inc of patched MPIBLACS
(as shown in the example Bmake.T3E supplied with the patch).
NOTE: this is an old entry, and may no longer be needed.
T3E MPI error in handling mixed types
mpt.1.2.0.0.6beta couldn't handle certain reductions where you mix types
with an MPI data type. To work around this problem, apply the patch and throw
the T3EReductErr flag in your Bmake.inc
(as shown in the example Bmake.T3E supplied with the patch).
NOTE: this is an old entry, and may no longer be needed.
Include file scoping problem.
This appears to be a compiler problem with including files within the
brackets of a routine. System files must be included before the scope of the
routine starts. Therefore, in BLACS/SRC/PVM/blacs_setup_.c, move the line:
   #include "string.h"
to the second line of the file (i.e., after #include "Bdef.h").
.SUFFIXES: .o .C
.c.C:
	$(CC) -c $(CCFLAGS) -o C$*.o $(BLACSDEFS) -DCallFromC $<
	mv C$*.o $*.C
SGI error workaround:
.SUFFIXES: .o .C
.c.C:
	ln -s $*.c C$*.c
	$(CC) -c $(CCFLAGS) $(BLACSDEFS) -DCallFromC C$*.c
	mv C$*.o $*.C
	rm -f C$*.c
      program tst
      integer k, iam, Np, ictxt, i, j
      call mpc_environ(Np, Iam);
      k = Iam + 100
      print*,'start'
      if (iam.eq.1) then
         call mp_send(Iam, 4, 0, 2, i)
         call mp_send(k, 4, 0, 3, j)
         print*,mp_status(i)
         print*,mp_status(j)
      else if (iam .eq. 0) then
         call mp_brecv(k, 4, 1, 3, j)
         call mp_brecv(k, 4, 1, 2, j)
      end if
      print*,'done'
      stop
      end
When this is run, the output is:
   xtst2 -procs 2
   start
   start
   4
   4
   done
So both sends complete, but the receives still hang.
   long iaddr;
   iaddr = (long) A;
/*
 * If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
 * can use double sized pointers for faster packing
 */
   if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
      mvcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
 * Otherwise, must use 4 byte packing
 */
   else
You also need to delete basically the same lines from BLACS/SRC/NX/INTERNAL/vmcopy4.c:
   long iaddr;
   iaddr = (long) A;
/*
 * If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
 * can use double sized pointers for faster packing
 */
   if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
      vmcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
 * Otherwise, must use 4 byte packing
 */
   else