Re: SSE Level 3 drop in gemm
Greetings!
R Clint Whaley <rwhaley@cs.utk.edu> writes:
> Camm,
>
> >I guess my problem is that make mmutstcase pre=c nb=?? mmrout=... only
> >compiles the kernel once, so that xsmmtst fails to link, with an
> >undefined reference to the bX routine.
>
> Ah, I see. The complex test is cmmutstcase; mmutstcase is real only.
> The names are not well chosen, to say the least, but I've just not got
> around to cleaning them up. Page 15 of atlas_contrib.ps shows the procedure
> for complex tests . . .
>
Thanks Clint! That's it. Sorry, I should have read more carefully.
The kernel now works for both real and complex.
Now I have a different issue. My kernel likes nb=56 best, while the
standard ATLAS kernel likes nb=64. This is what I get in sMMRES:
intech20:/mnt/i19/f/debian/mm/atlas/tmp/atlas-3.1.2D/tune/blas/gemm/Linux_fpic$ cat res/sMMRES
MULADD  LAT  NB  MU  NU  KU  FFTCH  IFTCH  NFTCH   MFLOP
     0    2  64   5   1  64      0      5      1  371.46
16
ATL_sgemm_SSE.c "CM"
1 1 64 2 2 64 0 4 1 617.61
At nb=56, the performance is just under 700 MFLOPS.
Also, any suggestions on the unrolling issue? If I put in macros to
unroll k at differing levels depending on KB, would that confuse the
search engine? That should install faster than having 3 different
k-unroll kernels to time.
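
For concreteness, the macro approach could look roughly like the sketch
below. It is only an illustration under stated assumptions: KB is assumed
to be a compile-time constant (for example passed as -DKB=56), and the
names KUNROLL and dot_kb, as well as the unroll depths 8, 4 and 1, are
hypothetical and not taken from ATLAS or from the SSE kernel discussed
here. A plain dot product stands in for the real inner k loop.

```c
#ifndef KB
#define KB 56   /* assumed compile-time block size, e.g. -DKB=56 */
#endif

/* Pick the deepest unroll (hypothetical choices: 8, 4, 1) that divides KB,
 * so the unrolled loop never needs a cleanup tail. */
#if (KB % 8) == 0
#define KUNROLL 8
#elif (KB % 4) == 0
#define KUNROLL 4
#else
#define KUNROLL 1
#endif

/* Dot product over KB elements; stands in for the k loop of a gemm kernel.
 * Only the branch matching KUNROLL is compiled. */
static float dot_kb(const float *a, const float *b)
{
    float sum = 0.0f;
    int k;

    for (k = 0; k < KB; k += KUNROLL) {
#if KUNROLL == 8
        sum += a[k]   * b[k]   + a[k+1] * b[k+1]
             + a[k+2] * b[k+2] + a[k+3] * b[k+3]
             + a[k+4] * b[k+4] + a[k+5] * b[k+5]
             + a[k+6] * b[k+6] + a[k+7] * b[k+7];
#elif KUNROLL == 4
        sum += a[k]   * b[k]   + a[k+1] * b[k+1]
             + a[k+2] * b[k+2] + a[k+3] * b[k+3];
#else
        sum += a[k] * b[k];
#endif
    }
    return sum;
}
```

With this layout a single source file yields a differently unrolled loop
for each KB it is compiled with, instead of three separate kernel files.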
Take care,
> Sorry for the confusion,
> Clint
>
>
--
Camm Maguire camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah