This patch calculates four hashes at once on one core using vector instructions. A test program is included that validates the new hash function against the old one, so it should be correct.
The patch is against 0.3.6. It improves khash/s by roughly 115%.
So are you saying you use 128-bit registers to process four 32-bit values at once with SIMD? I’ve wondered about that for a long time, but I didn’t think it would be possible due to addition carrying into the neighbour’s value.
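On the carry question, here is a minimal sketch, assuming an x86 CPU with SSE2. The packed 32-bit add treats the 128-bit register as four independent lanes, so each lane wraps modulo 2^32 and nothing carries into its neighbour. The helper name `add4x32` is made up for illustration, and whether the patch uses these exact intrinsics is an assumption on my part.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: adds four pairs of
   32-bit words with a single packed add. */
static void add4x32(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    /* Unaligned loads, so the arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* Each 32-bit lane is added separately and wraps mod 2^32;
       an overflow in one lane does not touch the lane next to it. */
    __m128i sum = _mm_add_epi32(va, vb);

    _mm_storeu_si128((__m128i *)out, sum);
}
```

For example, adding {0xFFFFFFFF, 1, 0x80000000, 7} and {1, 2, 0x80000000, 0} gives {0, 3, 0, 7}: lanes 0 and 2 overflow and wrap to 0, while lane 1 is still 3 rather than 4.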
https://bitcointalk.org/index.php?topic=648.msg6751#msg6751