PII & PPRO MKL v ATLAS timings
Below are MKL 5.0 and ATLAS 3.2.1 timings on the PII and PPRO platforms.
There's very little difference between the two libraries, but MKL seems to be
better overall on these architectures.

The most interesting part of the timings for me is the confirmation of my
earlier theory on Level 1 performance. If you recall, I said I thought
MKL beat us badly on Level 1 for the PIII because of two factors: prefetch
and 1-cycle ABS.

PII and PPRO do not have prefetch, and we see that ATLAS now essentially ties
MKL for all the routines except NRM2, ASUM and AMAX, all of which use ABS . . .

Cheers,
Clint
*******************************************************************************
* 300Mhz PII, 512K L2, WinNT 4.0 *
*******************************************************************************
           100    200    300    400    500    600    700    800    900   1000
         =====  =====  =====  =====  =====  =====  =====  =====  =====  =====
MKL dMM  194.2  192.0  192.2  195.1  197.5  194.7  201.4  202.3  199.4  200.9
ATL dMM  152.2  192.0  191.5  199.7  202.6  207.9  210.0  208.0  210.6  212.6
MKL sMM  228.6  240.9  241.1  241.1  242.5  242.6  245.3  247.3  246.0  247.6
ATL sMM  194.0  219.4  230.4  234.4  235.4  236.3  239.9  240.0  240.5  242.4
MKL cMM   15.4  245.8  246.9  248.3  247.1  248.5  247.3  249.2  247.8  234.2
ATL cMM  131.9  231.9  234.5  237.5  239.8  240.4  240.6  241.8  242.1  240.8
MKL dLU  111.4  151.0  164.1  165.1  166.4  170.4  172.0  176.1  176.6  177.6
ATL dLU  100.5  129.6  143.6  151.3  161.6  164.4  170.0  171.8  176.6  178.4
MKL sLU  128.1  170.0  191.4  188.0  190.4  200.0  197.6  202.1  200.5  203.0
ATL sLU  116.5  155.4  179.6  188.0  197.2  200.0  206.0  205.9  210.0  214.3
MKL cLU  181.8  209.8  214.0  218.3  219.7  222.0  223.3  224.0  223.7  225.1
ATL cLU  142.5  176.0  191.8  198.3  205.0  209.3  213.5  215.7  217.8  221.0
MKL zLU  159.9  181.6  183.9  188.2  190.3  193.9  195.0  196.3  197.1  198.4
ATL zLU  122.6  147.4  158.7  168.0  174.8  177.9  183.4  184.6  187.9  187.7
                    HEMM    HERK   HER2K
            GEMM    SYMM    SYRK   SYR2K    TRMM    TRSM
          ======  ======  ======  ======  ======  ======
MKL s500   231.9   205.1   205.7   190.4   228.5   222.4
ATL s500   228.5   231.9   190.9   235.4   216.3   222.4
MKL d500   190.4   172.1   167.0   156.8   186.0   195.3
ATL d500   200.0   192.8   160.4   200.0   186.0   195.0
MKL c500   241.5   221.4   215.2   212.6   222.4   230.4
ATL c500   236.2   236.2   200.4   237.9   219.3   206.6
MKL z500   219.9   196.9   185.3   178.8   212.2   206.6
ATL z500   209.2   204.5   163.6   208.5   195.4   184.1
                    HEMV                    GERU     HER    HER2
            GEMV    SYMV    TRMV    TRSV     GER     SYR    SYR2
          ======  ======  ======  ======  ======  ======  ======
MKL s500    68.4    68.4    64.0    65.6    34.7    34.1    56.4
ATL s500    72.3    81.5    67.3    64.0    34.7    37.0    57.6
MKL d500    37.2    48.7    36.1    36.1    19.5    19.8    39.7
ATL d500    34.2    56.2    33.3    32.4    20.4    19.6    34.3
MKL c500    97.3    90.4    83.2    96.2    65.7    57.1   111.0
ATL c500    97.3   121.6    92.5    83.4    56.6    55.8    84.3
MKL z500    78.6    81.1    52.2    73.4    38.6    35.3    61.1
ATL z500    59.5    86.9    56.7    55.5    33.4    31.7    53.1
            ROTM    SWAP    SCAL    COPY    AXPY     DOT    NRM2    ASUM    AMAX
          ======  ======  ======  ======  ======  ======  ======  ======  ======
MKL d500    34.6    10.5    32.0     7.5    15.1    18.6   141.8   160.0    67.3
ATL d500    33.7    10.4    33.7     7.7    14.2    18.8     9.6    31.2    25.6
*******************************************************************************
* 180Mhz PPRO, 256K L2, WinNT 4.0 *
*******************************************************************************
           100    200    300    400    500    600    700    800    900   1000
         =====  =====  =====  =====  =====  =====  =====  =====  =====  =====
MKL sLU   77.1   97.0  104.7  113.6  118.2  119.6  120.8  124.7  125.3  126.9
ATL sLU   69.1   97.0  104.4  108.9  118.4  119.6  121.8  124.7  126.3  127.6
MKL dLU   69.1   85.0  100.0  108.9  108.8  109.6  113.3  115.5  117.7  119.8
ATL dLU   62.5   79.9   85.1   93.8   96.8  101.1  104.5  104.9  109.4  110.2
MKL cLU  107.3  123.8  128.0  130.0  130.8  133.5  135.1  135.6  136.4  138.3
ATL cLU   85.0  104.9  112.4  119.9  122.5  127.5  129.7  131.1  133.0  134.7
MKL zLU   97.2  109.2  112.4  117.3  119.8  122.4  124.2  122.0  125.6  125.7
ATL zLU   72.8   90.8  100.0  104.0  107.1  112.0  114.5  115.8  117.9  118.6
                    HEMM    HERK   HER2K
            GEMM    SYMM    SYRK   SYR2K    TRMM    TRSM
          ======  ======  ======  ======  ======  ======
MKL s500   148.1   128.0   123.3   116.0   142.9   135.6
ATL s500   139.2   139.1   117.8   139.1   129.0   133.4
MKL d500   131.1   114.3   111.3   104.6   123.0   129.0
ATL d500   123.1   121.2    92.2   123.1   116.0   116.0
MKL c500   147.5   134.5   130.3   129.8   133.5   138.7
ATL c500   142.9   142.5   121.9   143.2   132.9   124.6
MKL z500   136.5   124.5   117.0   112.9   129.2   128.1
ATL z500   128.5   125.3    99.6   128.0   119.1   113.2
                    HEMV                    GERU     HER    HER2
            GEMV    SYMV    TRMV    TRSV     GER     SYR    SYR2
          ======  ======  ======  ======  ======  ======  ======
MKL s500    48.0    48.1    44.2    44.2    24.5    23.5    43.2
ATL s500    48.1    56.9    45.7    42.6    24.5    23.9    40.4
MKL d500    25.5    32.9    24.2    24.6    14.5    14.1    27.3
ATL d500    27.8    35.7    25.6    24.6    14.4    14.1    23.6
MKL c500    72.1    60.7    55.4    67.7    46.2    43.7    83.0
ATL c500    60.7    76.7    61.0    55.3    41.3    42.2    55.2
MKL z500    60.7    55.0    35.8    52.8    30.3    24.4    50.3
ATL z500    33.9    60.7    33.8    32.1    25.1    24.0    38.6
                                                    DOTU
            ROTM    SWAP    SCAL    COPY    AXPY     DOT    NRM2    ASUM    AMAX
          ======  ======  ======  ======  ======  ======  ======  ======  ======
MKL s500    37.6    12.5    14.2    11.0    16.0    23.8    53.5    58.1    30.4
ATL s500    37.7    12.3    12.5     8.8    16.0    24.6     5.7    17.8    14.9
MKL d500    18.8     6.3     7.3     5.7     9.0    13.1    35.6    35.6    22.1
ATL d500    18.3     6.3     7.4     4.6     8.9    13.1     5.5    13.9    12.3
MKL c500            12.6    33.7    11.2    33.7    40.0    91.7    58.1    32.1
ATL c500            12.3    31.9     8.8    30.5    40.0    11.4    17.8    14.5
MKL z500             6.3    22.1     5.6    16.8    22.8    70.9    37.6    23.7
ATL z500             6.2    18.3     4.5    17.3    26.7    11.0    13.6    11.0
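
For anyone who wants to check the Level 1 observation against the numbers,
here is a minimal Python sketch that computes the MKL/ATLAS ratio for the d500
Level 1 rows on both machines and lists the routines where MKL is clearly
ahead. The figures are copied from the tables in this message, and larger
numbers are treated as better, which is how the message reads them. The names
(ROUTINES, TIMINGS, speed_ratios, outliers) and the 1.5x cutoff are
assumptions made for illustration only; they are not part of ATLAS or MKL.

```python
# Level 1 d500 rows copied from the tables in this message.
ROUTINES = ["ROTM", "SWAP", "SCAL", "COPY", "AXPY", "DOT", "NRM2", "ASUM", "AMAX"]

TIMINGS = {
    "PII": {
        "MKL": [34.6, 10.5, 32.0, 7.5, 15.1, 18.6, 141.8, 160.0, 67.3],
        "ATL": [33.7, 10.4, 33.7, 7.7, 14.2, 18.8, 9.6, 31.2, 25.6],
    },
    "PPRO": {
        "MKL": [18.8, 6.3, 7.3, 5.7, 9.0, 13.1, 35.6, 35.6, 22.1],
        "ATL": [18.3, 6.3, 7.4, 4.6, 8.9, 13.1, 5.5, 13.9, 12.3],
    },
}


def speed_ratios(mkl, atl):
    """Return MKL/ATLAS for each routine; values above 1 favour MKL."""
    return [m / a for m, a in zip(mkl, atl)]


def outliers(platform, threshold=1.5):
    """Routines where MKL beats ATLAS by more than `threshold` times.

    The 1.5 default is an arbitrary cutoff chosen for this sketch.
    """
    rows = TIMINGS[platform]
    ratios = speed_ratios(rows["MKL"], rows["ATL"])
    return [name for name, r in zip(ROUTINES, ratios) if r > threshold]


if __name__ == "__main__":
    for platform, rows in TIMINGS.items():
        ratios = speed_ratios(rows["MKL"], rows["ATL"])
        print(platform)
        for name, r in zip(ROUTINES, ratios):
            print(f"  {name:5s} {r:6.2f}")
        print("  MKL clearly ahead on:", ", ".join(outliers(platform)))
```

With that cutoff, both machines flag only the three ABS-based routines named
above; every other ratio stays within roughly 25% of a tie.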