Re: Altivec matmul kernel (attachment)
Here's a question for the group:
AltiVec floating-point instructions execute in one of two modes:
In "Java" mode, denormalized results are handled correctly, and
multiply-add instructions have a 5-cycle latency.
In "non-Java" mode, denormalized results may not be handled correctly,
and multiply-add instructions have a 4-cycle latency. All other
computations are IEEE compliant. My matmul kernel gains roughly 150-200
Mflops (from about 1650 to about 1850) when switching from Java mode to
non-Java mode.
Should I leave the choice between Java and non-Java mode to the user,
or should I turn off Java mode explicitly? (The submitted version
doesn't touch the Java mode bit.)
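
For reference, here is a minimal sketch of what "turning off Java mode
explicitly" could look like. It is not the submitted kernel. It assumes
the mode is controlled by the NJ bit of the VSCR (mask 0x00010000 in the
32-bit register image), a GCC-style <altivec.h>, and brace-style vector
literals. The names vscr_with_nj, vscr_is_non_java,
matmul_enter_non_java and matmul_restore_vscr are hypothetical. The
AltiVec part is compiled only when __ALTIVEC__ is defined; the plain
integer helpers just show the bit manipulation.

```c
#include <stdint.h>

/* Assumed position of the NJ (non-Java) bit in the 32-bit VSCR image. */
#define VSCR_NJ_MASK UINT32_C(0x00010000)

/* Hypothetical helper: return a VSCR image with NJ set or cleared,
 * leaving every other bit (for example the saturation bit) untouched. */
static inline uint32_t vscr_with_nj(uint32_t vscr, int non_java)
{
    return non_java ? (vscr | VSCR_NJ_MASK) : (vscr & ~VSCR_NJ_MASK);
}

/* Hypothetical helper: 1 if the image selects non-Java mode, else 0. */
static inline int vscr_is_non_java(uint32_t vscr)
{
    return (vscr & VSCR_NJ_MASK) != 0;
}

#ifdef __ALTIVEC__
#include <altivec.h>

/* Switch to non-Java mode and return the previous VSCR contents so the
 * caller can put them back afterwards. The mask is replicated into all
 * four words because only the low-order word is written to the VSCR. */
static inline vector unsigned short matmul_enter_non_java(void)
{
    vector unsigned short saved = vec_mfvscr();
    vector unsigned int nj = { VSCR_NJ_MASK, VSCR_NJ_MASK,
                               VSCR_NJ_MASK, VSCR_NJ_MASK };
    vec_mtvscr(vec_or((vector unsigned int)saved, nj));
    return saved;
}

/* Restore whatever mode the caller was using before the kernel ran. */
static inline void matmul_restore_vscr(vector unsigned short saved)
{
    vec_mtvscr(saved);
}
#endif /* __ALTIVEC__ */
```

Saving and restoring the register around the kernel would keep the
caller's setting intact, which is one possible middle ground between
the two options above: the kernel runs in non-Java mode, but a user who
depends on Java mode elsewhere is not affected.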
--
Nicholas Coult, Ph.D., web: http://melby.augsburg.edu/~coult
Assistant Professor, Department of Mathematics, Augsburg College
coult@augsburg.edu, phone: (612) 330-1064 office: Science Hall 137B