CUDA Mysql/Sha1 Cracker
#1
Posted 28 March 2009 - 12:35 PM
#2
Posted 29 March 2009 - 03:42 AM
#3 Guest_DiabloHorn_*
Posted 30 March 2009 - 02:23 AM
Maybe you could combine the two, so that cracking a hash uses both the CPU and the GPU instead of the GPU only?
#4
Posted 30 March 2009 - 03:44 AM
That's indeed a lot faster than the one posted at http://blog.distracted.nl/.
Maybe you could combine the two, so that cracking a hash uses both the CPU and the GPU instead of the GPU only?
That's what I was working on, but I'm a little short on time atm.
I'll have to get my damn university coursework finished first. After that I'll try to add, in this order: CPU support (Intel), GPU support (ATI), multi-GPU support (Nvidia).
I've ordered a second graphics card for my Nvidia rig, and I'll be needing an ATI card soon.
Once the SHA-1 part is done I might have time to add P2P features or MD5 cracking.
#5
Posted 30 March 2009 - 03:53 AM
#6
Posted 02 April 2009 - 10:40 AM
ATI GPU support will most likely come this weekend, including multi-GPU support. After that the P2P architecture will be done. (I need hosting! I'm not hosting this on my home connection.)
After that I'll add CPU support.
#7 Guest_DiabloHorn_*
Posted 02 April 2009 - 11:24 PM
Are you going to improve the current code to make it more efficient?
#8
Posted 03 April 2009 - 12:41 AM
The hard limit so far is 80 million hashes/second on a GeForce 8800 GTS 320. That assumes the passwords are shorter than 12 characters.
I know I'm hitting a memory limit somewhere; it might be that the DATA##num structures aren't all properly cached in registers.
Also, please note that this app compares up to 128 hashes simultaneously, and I'm NOT giving that up in favor of single-hash speed. I might write a single-hash benchmarking version tomorrow to see how high I can go, but atm I have _NO_ clue what is limiting performance.
Plus, there are no proper disassemblers for GPU code, which doesn't really help.
The bottleneck might be that the GPU is paging DWORDs in and out of shared memory from global memory for each round. Shared memory is fast (about 4 cycles), global memory is slow (about 400 cycles). Also, using the CPU approach from that blog post on the GPU would kill performance. Why? Using an array costs 34% of performance (paging!), and using loops costs 8% (overhead!).
So any tools to measure paging in and out of memory on the GPU would be useful, and any links or articles too. But I might just be hitting an instruction limit, and that would make it a dead end.
As you might be able to tell from the above, I'm just guessing, and I can't look into it until I get home.
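
For readers following along, the "compare many hashes at once" trade-off mentioned in this post can be illustrated on the CPU. This is a minimal Python sketch, not the author's CUDA code: `crack_many` and its parameters are hypothetical names, and a Python set stands in for whatever structure the GPU kernel uses to hold its (up to 128) target hashes. The point is only that each candidate is hashed once and that single digest is checked against every target.

```python
import hashlib
from itertools import product

def crack_many(targets_hex, charset, max_len):
    """Hash each candidate once and test it against every target digest.

    Hypothetical helper for illustration; returns {target_hex: plaintext}.
    """
    # Raw digests in a set: one membership test covers all targets.
    remaining = {bytes.fromhex(h) for h in targets_hex}
    found = {}
    for length in range(1, max_len + 1):
        for chars in product(charset, repeat=length):
            candidate = "".join(chars).encode("ascii")
            digest = hashlib.sha1(candidate).digest()  # computed once per candidate
            if digest in remaining:
                found[digest.hex()] = candidate.decode("ascii")
                remaining.discard(digest)
                if not remaining:  # every target cracked, stop early
                    return found
    return found
```

The cost of the SHA-1 computation is shared across all targets, which is why giving up multi-hash comparison for single-hash speed is a real trade-off and not a pure win.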
#9
Posted 03 April 2009 - 11:17 AM
#10
Posted 03 April 2009 - 01:08 PM
I had a short look at your code... The SHA-1 part looks alright, but did I see correctly that you are constantly generating plaintexts using integer divisions? If so, that might actually kill your performance.
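
The pattern being asked about can be sketched as follows. The original source is not shown in this thread, so this is a generic illustration rather than the author's code: `index_to_plaintext` is a hypothetical name, and the digit order (least significant character first) is an assumption. Each candidate number is converted to a string with one integer division per character, which lets every GPU thread derive its own candidate independently from its index.

```python
def index_to_plaintext(index, charset, length):
    """Map a candidate number to a fixed-length string via repeated div/mod.

    Hypothetical helper; the first character is the least significant digit.
    """
    base = len(charset)
    chars = []
    for _ in range(length):
        # One integer division (and remainder) per output character.
        index, digit = divmod(index, base)
        chars.append(charset[digit])
    return "".join(chars)
```

For example, with the charset `"ab"` and length 2, the indices 0 to 3 map to `aa`, `ba`, `ab`, `bb`.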
#11
Posted 04 April 2009 - 01:01 AM
Hi there, nice work and thanks for sharing the code.
I had a short look at your code... The SHA-1 part looks alright, but did I see correctly that you are constantly generating plaintexts using integer divisions? If so, that might actually kill your performance.
Benchmarked that; it's only responsible for killing 7% of my performance. Branching if-statements in a loop do worse than this. But thanks for the heads-up. I'm going to put a reworked version online today, so I'm busy squeezing every optimization out of it atm.
As far as I know, this way of generating plaintexts uses the fewest asm instructions, but I might be wrong. Got any ideas for code that serves a similar purpose?
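
One commonly suggested alternative to per-candidate division is an odometer-style counter that increments the previous candidate in place. The sketch below is a hypothetical illustration in Python (`next_plaintext` and `enumerate_plaintexts` are made-up names), not a drop-in replacement for the CUDA code. Note that it trades the divisions for a carry branch inside a loop, which is exactly the kind of construct reported above as costing more than the divisions on this GPU, so whether it helps would have to be benchmarked.

```python
def next_plaintext(digits, base):
    """Advance a little-endian digit list in place; return False on wrap-around."""
    for i in range(len(digits)):
        digits[i] += 1
        if digits[i] < base:
            return True
        digits[i] = 0  # overflow: reset this position and carry into the next
    return False

def enumerate_plaintexts(charset, length):
    """Yield every fixed-length candidate without any integer division."""
    digits = [0] * length
    while True:
        yield "".join(charset[d] for d in digits)
        if not next_plaintext(digits, len(charset)):
            return
```

A hybrid is also possible: divide once per thread to find its starting candidate, then increment from there, so the division cost is paid once per batch and not once per candidate.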
#12
Posted 04 April 2009 - 01:03 AM
Edit: Compile error, left bracket at line 2 was not matched in post.htm
#13
Posted 04 April 2009 - 09:58 AM
#14
Posted 25 May 2009 - 01:50 AM
#15
Posted 25 May 2009 - 03:52 AM