$(F77) $(F77NO_OPTFLAGS) -c $*.f
to:
        $(F77) $(F77NO_OPTFLAGS) -fno-globals -fno-f90 -fugly-complex -w -c $*.f
 
Flags necessary to compile the BLACS tester with Intel's Fortran compiler
If you compile the BLACS tester with Intel's Fortran compiler, it will hang while
determining epsilon unless you add -fp_port to F77NO_OPTFLAGS
in your Bmake.inc file.
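For context, a machine-epsilon search is usually a short halving loop like the C sketch below. This is only an illustration: the tester's own routine is Fortran and may differ, and find_epsilon is a hypothetical name, not a BLACS routine. The volatile store forces each sum to be rounded to double when it is assigned, which is roughly the behavior -fp_port is understood to request from the compiler.

```c
#include <float.h>   /* DBL_EPSILON, for comparing against the result */

/* Hypothetical helper, not part of the BLACS: returns the smallest power of
 * two eps for which 1.0 + eps still differs from 1.0 in double precision. */
double find_epsilon(void)
{
   volatile double sum;   /* volatile: each sum is stored, and so rounded, as a double */
   double eps = 1.0;

   for (;;)
   {
      sum = 1.0 + eps / 2.0;
      if (sum == 1.0) break;   /* halving once more would be lost in rounding */
      eps /= 2.0;
   }
   return eps;
}
```

On a machine with IEEE double precision and round-to-nearest, the result can be compared against DBL_EPSILON from float.h.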
MPIBLACS SECTION:
Error in most MPI implementations of MPI_Abort.
This error was last confirmed in MPICH 1.0.13 and MPICH 1.1.  MPI_Abort
does not kill any other processes at all; it seems to behave much like a
local call to exit().  This causes the BLACS tester to
hang on the BLACS_ABORT test in the auxiliary test.  The following plain
MPI code demonstrates the error:
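#include "mpi.h"
int main(int narg, char **args)
{
   int i, Iam, Np;
   MPI_Init(&narg, &args);
   MPI_Comm_size(MPI_COMM_WORLD, &Np);
   MPI_Comm_rank(MPI_COMM_WORLD, &Iam);
   /* The last process aborts; a correct MPI_Abort kills all the others too */
   if (Iam == Np-1) MPI_Abort(MPI_COMM_WORLD, -2);
   while(1);   /* with the error, the remaining processes spin here forever */
   MPI_Finalize();
}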
Problems compiling dwalltime00
There is an undiagnosed problem that causes some users' dwalltime00
routine to return bad values.  It appears likely that there is a problem with
macro name overruns, but errors in cpp or the code have not been ruled out.
If you get bad return values from dwalltime00, overwrite
BLACS/SRC/MPI/dwalltime00_.c with:
#include "Bdef.h"
#if (INTFACE == C_CALL)
double Cdwalltime00(void)
#else
F_DOUBLE_FUNC dwalltime00_(void)
#endif
{
   return(MPI_Wtime());
}
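One way to judge whether a timer such as dwalltime00 is returning bad values is to sample it repeatedly and check that the readings are valid numbers that never decrease. The sketch below assumes that definition of "bad", which this entry does not spell out. The names timer_looks_sane, wall_clock_fn, and the two example clocks are hypothetical and not part of the BLACS.

```c
/* Any timer taking no arguments and returning seconds as a double */
typedef double (*wall_clock_fn)(void);

/* Hypothetical check: returns 1 if nsamples successive readings are
 * non-decreasing and never NaN, and 0 otherwise. */
int timer_looks_sane(wall_clock_fn now, int nsamples)
{
   double prev, t;
   int i;

   prev = now();
   if (prev != prev) return 0;          /* NaN compares unequal to itself */
   for (i = 1; i < nsamples; i++)
   {
      t = now();
      if (t != t || t < prev) return 0; /* NaN, or time ran backwards */
      prev = t;
   }
   return 1;
}

/* Stand-in timers so the check can be tried without MPI */
double example_steady_clock(void) { static double t = 0.0;   t += 0.5; return t; }
double example_broken_clock(void) { static double t = 100.0; t -= 1.0; return t; }
```

In a C-interface build, passing Cdwalltime00 as the first argument would exercise the real timer instead of the stand-ins.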
Sun f77 and gcc compiler mismatch.
Users of Sun's f77 compilers may need to throw the -f
flag to force 8-byte double precision scalar alignment, which
gcc-compiled BLACS expect.  Therefore, add -f to the 
NOPT macro in SLmake.inc and to the 
F77NO_OPTFLAGS in Bmake.inc.
NOTE: this is an old entry, and may no longer be needed.
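To see whether an address handed over from Fortran actually meets the 8-byte alignment that gcc-compiled code expects, the address can be tested directly. This is a minimal sketch; is_aligned8 is a hypothetical helper and not part of the BLACS.

```c
#include <stdint.h>   /* uintptr_t */

/* Hypothetical helper: returns 1 if p lies on an 8-byte boundary, else 0 */
int is_aligned8(const void *p)
{
   return ((uintptr_t) p % 8) == 0;
}
```

Calling this on the address of a double precision scalar received from f77-compiled code would show whether the -f flag is still needed on a given system.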
T3E MPI error in handling zero-length segments
mpt.1.2.0.0.6beta couldn't handle 0-length segments used with 
MPI_Type_indexed.  To work around this problem, throw the
T3ETrError flag in your Bmake.inc of patched MPIBLACS
(as shown in the example Bmake.T3E supplied with the patch).
NOTE: this is an old entry, and may no longer be needed.
T3E MPI error in handling mixed types
mpt.1.2.0.0.6beta couldn't handle certain reductions where you mix types
with an MPI data type.  To work around this problem, apply the patch and throw
the T3EReductErr flag in your Bmake.inc 
(as shown in the example Bmake.T3E supplied with the patch).
NOTE: this is an old entry, and may no longer be needed.
 
Include file scoping problem.
This appears to be a compiler problem with including files inside the
braces of a routine.  System files must be included before the routine's scope
begins.  Therefore, in BLACS/SRC/PVM/blacs_setup_.c, move the line
#include "string.h"
to the second line of the file (i.e., after #include "Bdef.h").
.SUFFIXES: .o .C
.c.C:
        $(CC) -c $(CCFLAGS) -o C$*.o $(BLACSDEFS) -DCallFromC $<
        mv C$*.o $*.C
SGI error workaround:
.SUFFIXES: .o .C
.c.C:
        ln -s $*.c C$*.c
        $(CC) -c $(CCFLAGS) $(BLACSDEFS) -DCallFromC C$*.c
        mv C$*.o $*.C
        rm -f C$*.c
      program tst
      integer k, iam, Np, ictxt, i, j
      call mpc_environ(Np, Iam);
      k = Iam + 100
      print*,'start'
      if (iam.eq.1) then
         call mp_send(Iam, 4, 0, 2, i)
         call mp_send(k,   4, 0, 3, j)
         print*,mp_status(i)
         print*,mp_status(j)
      else if (iam .eq. 0) then
         call mp_brecv(k, 4, 1, 3, j)
         call mp_brecv(k, 4, 1, 2, j)
      end if
      print*,'done'
      stop
      end
When this is run, the output is:
xtst2 -procs 2
 start
 start
 4
 4
 done
So both sends complete, but the receives still hang.
   long iaddr;
   iaddr = (long) A;
/*
 * If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
 * can use double sized pointers for faster packing
 */
   if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
      mvcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
 * Otherwise, must use 4 byte packing
 */
   else
You also need to delete basically the same lines from BLACS/SRC/NX/INTERNAL/vmcopy4.c:
   long iaddr;
   iaddr = (long) A;
/*
 * If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
 * can use double sized pointers for faster packing
 */
   if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
      vmcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
 * Otherwise, must use 4 byte packing
 */
   else
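With the 8-byte fast path deleted, every copy goes through the 4-byte code. The sketch below shows the general shape of such a copy in the matrix-to-buffer direction: it packs an m-by-n column-major array with leading dimension lda into a contiguous buffer, one 4-byte element at a time. It assumes the 4-byte routines work roughly this way; pack4 is a hypothetical name, and the real routines' signatures may differ.

```c
#include <stdint.h>   /* int32_t */

/* Hypothetical 4-byte packing routine, not the BLACS source:
 * copies the m-by-n array A (column-major, leading dimension lda)
 * into the contiguous buffer buff. */
void pack4(int m, int n, const int32_t *A, int lda, int32_t *buff)
{
   int i, j;

   for (j = 0; j < n; j++)
   {
      for (i = 0; i < m; i++) buff[i] = A[i];   /* copy one column */
      A += lda;     /* next column of A starts lda elements later */
      buff += m;    /* columns are stored back to back in buff */
   }
}
```

Because each element is moved as a 4-byte quantity, this form makes no assumption about 8-byte alignment of A or about lda and m being even.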