Re: 4 hashes parallel on SSE2 CPUs for 0.3.6

Well, reporting back.

I got it to compile by passing -msse and -msse2 to gcc. At first I was hashing at about 692 kh/s (roughly 50% of SVN r130, which gets ~1400 kh/s), but I recompiled and am now getting about 1120 kh/s. That's roughly the equivalent of using both of my CPUs without HyperThreading, though I can verify that it IS using HyperThreading. With HyperThreading turned off, I get ~1350 kh/s, which is pretty close to the stock build.

Also, does the git repo contain the patched and updated code?

Code:
// SVN r130 Using HT.
08/14/10 19:02 hashmeter   4 CPUs   1392 khash/s
08/14/10 19:32 hashmeter   4 CPUs   1387 khash/s
08/14/10 20:02 hashmeter   4 CPUs   1386 khash/s
08/14/10 20:32 hashmeter   4 CPUs   1380 khash/s
08/14/10 21:02 hashmeter   4 CPUs   1363 khash/s
// With -msse -msse2, first run. Using HT.
08/14/10 21:32 hashmeter   4 CPUs    692 khash/s
08/14/10 22:06 hashmeter   4 CPUs   1011 khash/s
08/14/10 22:11 hashmeter   4 CPUs   1104 khash/s
08/14/10 22:16 hashmeter   4 CPUs   1120 khash/s
// NOT using HT.
08/14/10 22:21 hashmeter   2 CPUs   1359 khash/s
08/14/10 22:26 hashmeter   2 CPUs   1340 khash/s
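
For anyone who wants to compare their own hashmeter lines against a baseline, here is a rough sketch of the arithmetic. The helper names are made up for illustration and are not part of the miner source; the numbers fed in are just the ones from the log above.

```cpp
// Hypothetical helper (not from the Bitcoin source): express a measured
// hash rate as a percentage of a baseline rate, both in khash/s.
inline double percent_of_baseline(double measured_khs, double baseline_khs) {
    return 100.0 * measured_khs / baseline_khs;
}

// Hypothetical helper: average rate per hashmeter "CPU". With HT on, the
// hashmeter counts logical CPUs, so this is a per-thread figure.
inline double per_cpu_khs(double total_khs, int cpus) {
    return total_khs / cpus;
}
```

Plugging in the log values against the first r130 reading (1392 khash/s): 692 comes out just under 50%, 1120 at about 80%, and the 2-CPU run at 1359 at about 98%. Per hashmeter CPU that is 280 khash/s with HT on versus 679.5 khash/s with HT off.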

Just wanted to tell my story and help with whatever information I could.

On both MinGW GCC 4.4.1 and 4.5.0 I have it working with test.cpp, but it hits a SIGSEGV when called by BitcoinMiner. So it no longer looks like the GCC version is the cause; it's something else, maybe just the luck of how the stack is aligned.

I have it working fine on GCC 4.3.3 on Ubuntu 32-bit.
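
If stack alignment really is the culprit (that is only a guess at this point), one way to check is to test the addresses handed to the SSE2 code. Aligned SSE2 loads and stores such as movdqa fault when the address is not a multiple of 16. The sketch below is not from the Bitcoin or Crypto++ source; the names are hypothetical, and it uses C++11 alignas, which an older GCC would need to spell as __attribute__((aligned(16))).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p sits on a 16-byte boundary, which is what
// aligned SSE2 loads/stores require.
inline bool is_aligned_16(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical 64-byte block that the compiler is asked to place on a
// 16-byte boundary, instead of relying on how the caller's stack is aligned.
struct alignas(16) AlignedBlock64 {
    unsigned char bytes[64];
};

// Copy 64 bytes from a possibly misaligned source into an aligned block,
// so the SIMD code only ever sees an aligned address.
inline AlignedBlock64 copy_to_aligned(const void* src) {
    AlignedBlock64 out;
    std::memcpy(out.bytes, src, sizeof(out.bytes));
    assert(is_aligned_16(out.bytes));  // sanity check in debug builds
    return out;
}
```

Dropping an is_aligned_16 check in front of the call in both test.cpp and the miner thread would at least show whether the two callers pass differently aligned buffers.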

I found the problem with Crypto++ on MinGW 4.5.0.  Here’s the patch for that:

Code:
--- \old\sha.cpp Mon Jul 26 13:31:11 2010
+++ \new\sha.cpp Sat Aug 14 20:21:08 2010
@@ -336,7 +336,7 @@
  ROUND(14, 0, eax, ecx, edi, edx)
  ROUND(15, 0, ecx, eax, edx, edi)
 
- ASL(1)
+    ASL(label1)   // Bitcoin: fix for MinGW GCC 4.5
  AS2(add WORD_REG(si), 4*16)
  ROUND(0, 1, eax, ecx, edi, edx)
  ROUND(1, 1, ecx, eax, edx, edi)
@@ -355,7 +355,7 @@
  ROUND(14, 1, eax, ecx, edi, edx)
  ROUND(15, 1, ecx, eax, edx, edi)
  AS2( cmp WORD_REG(si), K_END)
- ASJ( jne, 1, b)
+    ASJ(    jne,    label1,  )   // Bitcoin: fix for MinGW GCC 4.5
 
  AS2( mov WORD_REG(dx), DATA_SAVE)
  AS2( add WORD_REG(dx), 64)
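
To make it clearer what the patch changes textually, here is a small sketch of how label and jump macros of this kind can expand into assembler text. The DEMO_ macros are simplified stand-ins written for this post; they are assumed to resemble the GCC-side definitions in Crypto++, but that has not been checked against the headers, so treat the exact output as illustrative.

```cpp
#include <string>

// Hypothetical stand-ins for label/jump macros that build inline-asm text
// by stringizing their arguments. Assumption: not copied from Crypto++.
#define DEMO_ASL(x)       "\n" #x ":"
#define DEMO_ASJ(x, y, z) #x " " #y #z ";"

// Before the patch: numeric local label "1:" and a backward reference "1b".
inline std::string old_label() { return DEMO_ASL(1); }
inline std::string old_jump()  { return DEMO_ASJ(jne, 1, b); }

// After the patch: a named label, so the direction suffix is left empty.
inline std::string new_label() { return DEMO_ASL(label1); }
inline std::string new_jump()  { return DEMO_ASJ(    jne,    label1,  ); }
```

Under these assumptions the old pair expands to "1:" and "jne 1b;", and the patched pair to "label1:" and "jne label1;". The extra spaces in the patched ASJ call do not matter, since stringizing drops leading and trailing whitespace in each argument.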


https://bitcointalk.org/index.php?topic=648.msg9359#msg9359