Re: SSE Level 3 drop in gemm
Greetings!
R Clint Whaley <rwhaley@cs.utk.edu> writes:
> Camm,
>
> No knowledge/understanding of the register reservation, unfortunately . . .
>
> >Otherwise, the kernel is working fine. Performance fluctuates on the
> >short timer runs, but is somewhere between 670 and 700 MFLOPS for the
> >beta=0 case, and about 670 for arbitrary beta.
>
> Great, that represents something like a 1.9 speedup over ATLAS's kernel,
> doesn't it?
>
> >On another front -- Do you have any word on the complex compilation
> >procedure, Clint? The deal is that all beta cases seem to be
> >referenced by the same timer (fc.c) program, regardless of beta= flag.
>
> Yep, ATLAS/doc/atlas_contrib.ps explains this in the section on complex
> matmul: it's done with 4 calls to essentially a real matmul. Even the
> case of beta=1 requires a real beta=X, because you need the -1.0 case:
> the two imaginary elements multiply to a negative contribution to the
> real component (notice steps 1 and 3 on page 14 use negative). The timer compiles your
> complex code 3 times to get the b1, b0, and bX cases. What exactly is
> the problem you are having with it?
>
I guess my problem is that make mmutstcase pre=c nb=?? mmrout=... only
compiles the kernel once (with beta=1), so that xcmmtst fails to link,
with an undefined reference to the bX routine. The pre=s run passes;
the pre=c run does not:
=============================================================================
</atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic$ make mmutstcase pre=s nb=56 mmrout=../CASES/ATL_sgemm_SSE.c
rm -f smm.c smm.[o,c]
./xemit_mm -p s -b 1 -M 56 -N 56 -K 56 -R -3 \
> smm.c
pre=s, CU=0, ma=0, ff=0, if=-1, nf=-1, lo=1, ta=112, tb=111, lat=4, mu=4, nu=4, ku=1, m=56, n=56, k=56, lda=0, ldb=0, ldc=0, csA=1, csB=1, csC=1, alpha=1, beta=1
cat ../CASES/ATL_sgemm_SSE.c >> smm.c
/usr/bin/gcc -DL2SIZE=524288 -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/Linux_fpic -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/contrib -DAdd__ -DStringSunStyle -fomit-frame-pointer -O -fPIC -c smm.c
make mmtstcase0 pre=s ta=t tb=n muladd=1 lat=4 loopO=JIK M=56 N=56 K=56 mb=56 nb=56 kb=56 mu=4 nu=4 ku=1 lda=56 ldb=56 ldc=0 csA=1 csB=1 csC=1 alpha=1 beta=1 moves="-DMoveA -DMoveB" cleanup=0 mmobjs=smm.o
make[1]: Entering directory `/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic'
rm -f smmtst.o
/usr/bin/gcc -DL2SIZE=524288 -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/Linux_fpic -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/contrib -DAdd__ -DStringSunStyle -fomit-frame-pointer -O3 -funroll-all-loops -fPIC -DsREAL -DtranAt -DtranBn \
-DMULADD=1 -DLAT=4 -DJIK \
-DMB0=56 -DNB0=56 -DKB0=56 \
-DMB=56 -DNB=56 -DKB=56 \
-DKU=1 -DNU=4 -DMU=4 \
-DLDA=56 -DLDB=56 -DLDC=0 \
-DcsA=1 -DcsB=1 -DcsC=1 \
-DALPHA=1 -DBETA=1 -DMoveA -DMoveB \
-DCLEANUP=0 \
-o smmtst.o -c ../mmtst.c
/usr/bin/gcc -DL2SIZE=524288 -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/Linux_fpic -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/contrib -DAdd__ -DStringSunStyle -fomit-frame-pointer -O3 -funroll-all-loops -fPIC -o xsmmtst smmtst.o smm.o
/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/bin/Linux_fpic/ATLrun.sh /mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic xsmmtst
PASSED TEST
make[1]: Leaving directory `/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic'
</atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic$ make mmutstcase pre=c nb=56 mmrout=../CASES/ATL_sgemm_SSE.c
rm -f cmm.c cmm.[o,c]
./xemit_mm -p c -b 1 -M 56 -N 56 -K 56 -R -3 \
> cmm.c
pre=c, CU=0, ma=0, ff=0, if=-1, nf=-1, lo=1, ta=112, tb=111, lat=4, mu=4, nu=4, ku=1, m=56, n=56, k=56, lda=0, ldb=0, ldc=0, csA=1, csB=1, csC=1, alpha=1, beta=1
cat ../CASES/ATL_sgemm_SSE.c >> cmm.c
/usr/bin/gcc -DL2SIZE=524288 -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/Linux_fpic -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/contrib -DAdd__ -DStringSunStyle -fomit-frame-pointer -O -fPIC -c cmm.c
make mmtstcase0 pre=c ta=t tb=n muladd=1 lat=4 loopO=JIK M=56 N=56 K=56 mb=56 nb=56 kb=56 mu=4 nu=4 ku=1 lda=56 ldb=56 ldc=0 csA=1 csB=1 csC=1 alpha=1 beta=1 moves="-DMoveA -DMoveB" cleanup=0 mmobjs=cmm.o
make[1]: Entering directory `/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic'
rm -f cmmtst.o
/usr/bin/gcc -DL2SIZE=524288 -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/Linux_fpic -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/contrib -DAdd__ -DStringSunStyle -fomit-frame-pointer -O3 -funroll-all-loops -fPIC -DcREAL -DtranAt -DtranBn \
-DMULADD=1 -DLAT=4 -DJIK \
-DMB0=56 -DNB0=56 -DKB0=56 \
-DMB=56 -DNB=56 -DKB=56 \
-DKU=1 -DNU=4 -DMU=4 \
-DLDA=56 -DLDB=56 -DLDC=0 \
-DcsA=1 -DcsB=1 -DcsC=1 \
-DALPHA=1 -DBETA=1 -DMoveA -DMoveB \
-DCLEANUP=0 \
-o cmmtst.o -c ../mmtst.c
/usr/bin/gcc -DL2SIZE=524288 -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/Linux_fpic -I/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/include/contrib -DAdd__ -DStringSunStyle -fomit-frame-pointer -O3 -funroll-all-loops -fPIC -o xcmmtst cmmtst.o cmm.o
cmmtst.o: In function `mmtst':
cmmtst.o(.text+0xbce): undefined reference to `ATL_cJIK56x56x56TN56x56x0_a1_bX'
cmmtst.o(.text+0xc41): undefined reference to `ATL_cJIK56x56x56TN56x56x0_a1_bX'
collect2: ld returned 1 exit status
make[1]: *** [mmtstcase0] Error 1
make[1]: Leaving directory `/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic'
make: *** [mmutstcase] Error 2
=============================================================================
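To make the beta requirement concrete, here is a minimal sketch of the
four-call scheme as I understand it from Clint's description. It is not
copied from atlas_contrib.ps, and the names real_mm, complex_mm_b1 and
check_complex_mm_b1 are made up for illustration; real_mm is a naive
stand-in for the real kernel, not the SSE code, and it skips the
transposed-A layout the tester uses. The point is only that a complex
beta=1 update already needs a real kernel that accepts beta=-1, which is
the bX symbol the linker cannot find above.

```c
/* Hypothetical stand-in for a real kernel: C = A*B + beta*C on n x n
 * row-major matrices whose elements sit 2 floats apart, so it can walk
 * the real or the imaginary plane of interleaved complex storage. */
static void real_mm(int n, const float *A, const float *B, float *C,
                    float beta)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            float sum = 0.0f;
            for (int k = 0; k < n; k++)
                sum += A[2 * (i * n + k)] * B[2 * (k * n + j)];
            C[2 * (i * n + j)] = sum + beta * C[2 * (i * n + j)];
        }
}

/* Complex C += A*B (beta=1) built from four real calls. The kernel can
 * only add A*B, so the Ai*Bi term is subtracted by negating Cr twice. */
static void complex_mm_b1(int n, const float *A, const float *B, float *C)
{
    const float *Ar = A, *Ai = A + 1;   /* real / imaginary planes of A */
    const float *Br = B, *Bi = B + 1;
    float *Cr = C, *Ci = C + 1;

    real_mm(n, Ai, Bi, Cr, -1.0f);  /* Cr = Ai*Bi - Cr          (bX) */
    real_mm(n, Ai, Br, Ci,  1.0f);  /* Ci = Ai*Br + Ci          (b1) */
    real_mm(n, Ar, Br, Cr, -1.0f);  /* Cr = Ar*Br - Ai*Bi + Cr  (bX) */
    real_mm(n, Ar, Bi, Ci,  1.0f);  /* Ci = Ar*Bi + Ai*Br + Ci  (b1) */
}

/* Compares the four-call result with a direct complex multiply-add on a
 * fixed 2x2 example and returns the largest absolute difference. */
float check_complex_mm_b1(void)
{
    const int n = 2;
    const float A[8]  = { 1, 2,   3, -1,   0, 1,   2, 0 };
    const float B[8]  = { 2, 1,   0, -2,   1, 1,  -1, 3 };
    const float C0[8] = { 1, 0,   0,  1,   2, 2,  -1, 0 };
    float C[8], maxerr = 0.0f;

    for (int i = 0; i < 8; i++)
        C[i] = C0[i];
    complex_mm_b1(n, A, B, C);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            float re = C0[2 * (i * n + j)];
            float im = C0[2 * (i * n + j) + 1];
            for (int k = 0; k < n; k++) {
                float ar = A[2 * (i * n + k)], ai = A[2 * (i * n + k) + 1];
                float br = B[2 * (k * n + j)], bi = B[2 * (k * n + j) + 1];
                re += ar * br - ai * bi;
                im += ar * bi + ai * br;
            }
            float dr = C[2 * (i * n + j)] - re;
            float di = C[2 * (i * n + j) + 1] - im;
            if (dr < 0) dr = -dr;
            if (di < 0) di = -di;
            if (dr > maxerr) maxerr = dr;
            if (di > maxerr) maxerr = di;
        }
    return maxerr;
}
```

Calling check_complex_mm_b1() from a small main and comparing the result
against a tolerance is enough to exercise it. Steps 1 and 3 are the ones
that need beta=-1, so a single beta=1 compile of the kernel cannot cover
the complex case.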
Take care,
> Cheers,
> Clint
>
>
--
Camm Maguire camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah