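On CPUs that support SIMD extensions but have no dedicated hardware block for them, SIMD is still a win: at the very least it saves the CPU instruction-decoding work, because one instruction covers several data elements instead of one. SSE is supported on pretty much every x86 CPU you will run into today, and SSE2 is part of the x86-64 baseline.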
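To make the decoding point concrete, here is a minimal sketch in C, assuming a compiler that ships `<xmmintrin.h>` (GCC, Clang, and MSVC do on x86). The function name `add_floats` is made up for illustration. Each pass of the SSE loop handles four floats with one add instruction, where the scalar loop needs four; on targets without SSE the code falls back to the scalar loop only.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_add_ps, ... */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n). */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* One decoded add instruction per four floats. Unaligned loads and
       stores are used so the caller need not align the buffers. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail (or the whole array when SSE is unavailable):
       one add instruction per float. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Whether the four lanes are actually added in parallel or split internally depends on the CPU, but either way the front end fetches and decodes one add instead of four.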