We were discussing how 3DNow! could reach a peak of 4*MHz (four flops per cycle) even though it only does two ops per vector instruction, rather than four as with SSE. The trick is that it can issue an add and a multiply in the same clock cycle, just as with normal flops. This means that separate multiply and add instructions will be key . . .

Cheers,
Clint
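
For reference, the peak-rate arithmetic in this message can be sketched in a few lines of C. This is only a back-of-the-envelope model, not a measurement: the helper names are hypothetical, and it assumes the SSE figure corresponds to one 4-wide vector operation per cycle, and the 3DNow! figure to a 2-wide add plus a 2-wide multiply per cycle, as described above.

```c
/* Hypothetical helper: peak flops per cycle is the number of elements
 * in a vector times the number of vector instructions that can be
 * issued in one clock cycle. */
static int peak_flops_per_cycle(int elems_per_vector, int vector_ops_per_cycle)
{
    return elems_per_vector * vector_ops_per_cycle;
}

/* Hypothetical helper: peak MFLOPS is flops per cycle times the clock
 * rate in MHz. */
static double peak_mflops(int elems_per_vector, int vector_ops_per_cycle,
                          double mhz)
{
    return peak_flops_per_cycle(elems_per_vector, vector_ops_per_cycle) * mhz;
}
```

Under these assumptions, peak_flops_per_cycle(2, 2) for 3DNow! and peak_flops_per_cycle(4, 1) for SSE both come to 4, which is the 4*MHz figure. The 3DNow! number only holds if the adds and multiplies are separate instructions that can be paired in the same cycle; a stream of only adds or only multiplies would give half of it in this model.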