Latest Athlon test results
"diff" of 3.2.1 vs. 3.3.1 SUMMARY.LOG: < is Atlas 3.2.1, > is Atlas 3.3.1
M. Edward Borasky, Borasky Research, 3 July 2001
Atlas options: 3DNow yes, all others defaults
Environment: 1.333 GHz Athlon Thunderbird, 512 MB DDR RAM
*Stock* Red Hat Linux 7.1, gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-81)
-------------------------------------------------------------------------------
5c5
< * BEGAN ATLAS INSTALL OF SECTION 0-0-0 ON 07/02/2001 AT 22:00 *
---
> * BEGAN ATLAS INSTALL OF SECTION 0-0-0 ON 07/02/2001 AT 19:29 *
19c19
< Apparent peak=1059.34MFLOPS
---
> Apparent peak=1061.83MFLOPS
22c22
< Apparent peak=1061.76MFLOPS
---
> Apparent peak=1059.31MFLOPS
33c33
< This gave performance of 786.35 (74.23% of apparent peak)
---
> This gave performance of 784.60 (73.89% of apparent peak)
35c35
< Performance = 171.72 (21.84% of copy matmul, 16.21% of peak)
---
> Performance = 172.81 (22.03% of copy matmul, 16.27% of peak)
37c37
< Performance = 171.06 (21.75% of copy matmul, 16.15% of peak)
---
> Performance = 171.82 (21.90% of copy matmul, 16.18% of peak)
39c39
< Performance = 780.63 (99.27% of copy matmul, 73.69% of peak)
---
> Performance = 777.79 (99.13% of copy matmul, 73.25% of peak)
41c41
< Performance = 164.40 (20.91% of copy matmul, 15.52% of peak)
---
> Performance = 165.47 (21.09% of copy matmul, 15.58% of peak)
63c63
< Performance = 244.32 (31.07% of copy matmul, 23.06% of peak)
---
> Performance = 213.35 (27.19% of copy matmul, 20.09% of peak)
66c66
< Performance = 151.74 (19.30% of copy matmul, 14.32% of peak)
---
> Performance = 153.38 (19.55% of copy matmul, 14.44% of peak)
71,72c71,72
< mu=32, nu=2, using 87.00% of L1 Cache
< Performance = 105.94 (13.47% of copy matmul, 10.00% of peak)
---
> mu=32, nu=2, using 89.00% of L1 Cache
> Performance = 93.74 (11.95% of copy matmul, 8.83% of peak)
79,80c79,80
< The best matmul kernel was ATL_mm_3dnow_100.c, written by Peter Soendergaard
< This gave performance of 3254.55MFLOPS (306.52% of apparent peak)
---
> The best matmul kernel was ATL_smm_3dnow_100.c, written by Peter Soendergaard
> This gave performance of 3208.61MFLOPS (302.90% of apparent peak)
82c82
< Performance = 889.87 (27.34% of copy matmul, 83.81% of peak)
---
> Performance = 886.31 (27.62% of copy matmul, 83.67% of peak)
84c84
< Performance = 966.71 (29.70% of copy matmul, 91.05% of peak)
---
> Performance = 964.42 (30.06% of copy matmul, 91.04% of peak)
86c86
< Performance = 882.56 (27.12% of copy matmul, 83.12% of peak)
---
> Performance = 879.05 (27.40% of copy matmul, 82.98% of peak)
88c88
< Performance = 936.38 (28.77% of copy matmul, 88.19% of peak)
---
> Performance = 940.73 (29.32% of copy matmul, 88.81% of peak)
110,113c110,113
< Performance = 208.95 ( 6.42% of copy matmul, 19.68% of peak)
< gemvT : chose routine ATL_gemvT_mm.c written by R. Clint Whaley
< Yunroll=0, Xunroll=0, using 100.00% of L1
< Performance = 193.37 ( 5.94% of copy matmul, 18.21% of peak)
---
> Performance = 208.00 ( 6.48% of copy matmul, 19.64% of peak)
> gemvT : chose routine ATL_gemvT_2x16_1.c written by R. Clint Whaley
> Yunroll=2, Xunroll=16, using 100.00% of L1
> Performance = 159.27 ( 4.96% of copy matmul, 15.04% of peak)
117,119c117,119
< ger : chose routine ATL_ger1_4x4_1.c written by R. Clint Whaley
< mu=4, nu=4, using 94.00% of L1 Cache
< Performance = 150.00 ( 4.61% of copy matmul, 14.13% of peak)
---
> ger : chose routine ATL_ger1_1x4_0.c written by R. Clint Whaley
> mu=1, nu=4, using 75.00% of L1 Cache
> Performance = 137.59 ( 4.29% of copy matmul, 12.99% of peak)
127c127
< This gave performance of 794.41 (74.99% of apparent peak)
---
> This gave performance of 790.37 (74.43% of apparent peak)
129c129
< Performance = 185.89 (23.40% of copy matmul, 17.55% of peak)
---
> Performance = 185.49 (23.47% of copy matmul, 17.47% of peak)
131c131
< Performance = 185.50 (23.35% of copy matmul, 17.51% of peak)
---
> Performance = 185.55 (23.48% of copy matmul, 17.47% of peak)
133c133
< Performance = 180.69 (22.75% of copy matmul, 17.06% of peak)
---
> Performance = 180.99 (22.90% of copy matmul, 17.05% of peak)
135c135
< Performance = 179.06 (22.54% of copy matmul, 16.90% of peak)
---
> Performance = 180.87 (22.88% of copy matmul, 17.03% of peak)
155,160c155,160
< gemvN : chose routine ATL_cgemvN_mm.c written by R. Clint Whaley
< Yunroll=0, Xunroll=0, using 93.00% of L1
< Performance = 129.62 (16.32% of copy matmul, 12.24% of peak)
< gemvT : chose routine ATL_cgemvT_mm.c written by R. Clint Whaley
< Yunroll=0, Xunroll=0, using 93.00% of L1
< Performance = 121.36 (15.28% of copy matmul, 11.46% of peak)
---
> gemvN : chose routine ATL_gemvN_SSE.c written by Camm Maguire
> Yunroll=16, Xunroll=2, using 81.00% of L1
> Performance = 392.09 (49.61% of copy matmul, 36.93% of peak)
> gemvT : chose routine ATL_gemvT_SSE.c written by Camm Maguire
> Yunroll=2, Xunroll=16, using 81.00% of L1
> Performance = 396.76 (50.20% of copy matmul, 37.37% of peak)
164c164
< ger : chose routine ATL_cger1_axpy.c written by R. Clint Whaley
---
> ger : chose routine ATL_ger1_SSE.c written by Camm Maguire
166c166
< Performance = 166.29 (20.93% of copy matmul, 15.70% of peak)
---
> Performance = 187.47 (23.72% of copy matmul, 17.66% of peak)
173,174c173,174
< The best matmul kernel was ATL_mm_3dnow_100.c, written by Peter Soendergaard
< This gave performance of 3498.94MFLOPS (329.54% of apparent peak)
---
> The best matmul kernel was ATL_smm_3dnow_100.c, written by Peter Soendergaard
> This gave performance of 3476.51MFLOPS (328.19% of apparent peak)
176c176
< Performance = 918.73 (26.26% of copy matmul, 86.53% of peak)
---
> Performance = 911.95 (26.23% of copy matmul, 86.09% of peak)
178c178
< Performance = 963.17 (27.53% of copy matmul, 90.71% of peak)
---
> Performance = 952.44 (27.40% of copy matmul, 89.91% of peak)
180c180
< Performance = 895.74 (25.60% of copy matmul, 84.36% of peak)
---
> Performance = 898.19 (25.84% of copy matmul, 84.79% of peak)
182c182
< Performance = 928.23 (26.53% of copy matmul, 87.42% of peak)
---
> Performance = 927.75 (26.69% of copy matmul, 87.58% of peak)
203,204c203,204
< Yunroll=0, Xunroll=0, using 75.00% of L1
< Performance = 386.43 (11.04% of copy matmul, 36.40% of peak)
---
> Yunroll=0, Xunroll=0, using 100.00% of L1
> Performance = 388.95 (11.19% of copy matmul, 36.72% of peak)
206,207c206,207
< Yunroll=0, Xunroll=0, using 75.00% of L1
< Performance = 383.12 (10.95% of copy matmul, 36.08% of peak)
---
> Yunroll=0, Xunroll=0, using 100.00% of L1
> Performance = 383.38 (11.03% of copy matmul, 36.19% of peak)
212,213c212,213
< mu=16, nu=1, using 75.00% of L1 Cache
< Performance = 225.38 ( 6.44% of copy matmul, 21.23% of peak)
---
> mu=16, nu=1, using 50.00% of L1 Cache
> Performance = 433.18 (12.46% of copy matmul, 40.89% of peak)
222c222
< * FINISHED ATLAS INSTALL OF SECTION 0-0-0 ON 07/02/2001 AT 23:07 *
---
> * FINISHED ATLAS INSTALL OF SECTION 0-0-0 ON 07/02/2001 AT 20:43 *
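
For anyone who wants to rank the hunks above by size of change rather than eyeball them, here is a rough Python sketch. It is not part of ATLAS or of the original test run. The function name performance_changes and the SAMPLE text are my own illustration (SAMPLE copies two hunks from the diff above). It assumes normal-format "diff" output and pairs the "<" and ">" "Performance =" lines within each hunk in order; hunks where the two sides have different counts are skipped.

```python
import re

# Matches the number in lines such as "Performance = 244.32 (...)".
PERF = re.compile(r"Performance\s*=\s*([0-9.]+)")
# Matches normal-diff hunk headers such as "63c63" or "212,213c212,213".
HUNK = re.compile(r"^\d+(,\d+)?[acd]\d+(,\d+)?$")


def performance_changes(diff_text):
    """Return (hunk_header, old, new, percent_change) tuples.

    Hypothetical helper: pairs the '<' and '>' Performance values of each
    hunk in order, skipping hunks whose two sides do not line up.
    """
    results = []
    header, old, new = None, [], []

    def flush():
        # Record the finished hunk only if both sides pair up one-to-one.
        if header is not None and old and len(old) == len(new):
            for o, n in zip(old, new):
                results.append((header, o, n, (n - o) / o * 100.0))

    for line in diff_text.splitlines():
        if HUNK.match(line):
            flush()
            header, old, new = line, [], []
            continue
        m = PERF.search(line)
        if not m:
            continue
        if line.startswith("<"):
            old.append(float(m.group(1)))
        elif line.startswith(">"):
            new.append(float(m.group(1)))
    flush()
    return results


# Two hunks copied from the diff above, used only as sample input.
SAMPLE = """\
63c63
< Performance = 244.32 (31.07% of copy matmul, 23.06% of peak)
---
> Performance = 213.35 (27.19% of copy matmul, 20.09% of peak)
212,213c212,213
< mu=16, nu=1, using 75.00% of L1 Cache
< Performance = 225.38 ( 6.44% of copy matmul, 21.23% of peak)
---
> mu=16, nu=1, using 50.00% of L1 Cache
> Performance = 433.18 (12.46% of copy matmul, 40.89% of peak)
"""

if __name__ == "__main__":
    for header, old, new, pct in performance_changes(SAMPLE):
        print(f"{header}: {old:.2f} -> {new:.2f} ({pct:+.1f}%)")
```

Feeding it the whole diff instead of SAMPLE should separate the large swings (for example the gemvN, gemvT and ger hunks) from run-to-run noise of a percent or so.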
--
M. Edward (Ed) Borasky, Chief Scientist, Borasky Research
http://www.borasky-research.net http://www.aracnet.com/~znmeb
mailto:znmeb@borasky-research.com mailto:znmeb@aracnet.com
Q: How do you get an elephant out of a theatre?
A: You can't. It's in their blood.
- References:
- RE: 3dnow
- From: R Clint Whaley <rwhaley@cs.utk.edu>