Government Security
Network Security Resources

Jump to content

Photo

How To Generate Dictionary

- - - - - algorithm dictionary
  • Please log in to reply
9 replies to this topic

#1 ShadowRun

ShadowRun

    Corporal

  • Sergeant Major
  • 170 posts

Posted 06 March 2005 - 12:51 PM

i wanted to ask for possible algorithms used to generate dictionary
the simple one looks like this
we have two arrays
A=[a,b,c] and B=[a,b,c]
we have length of word as parameter
and we do sth like this
we concatenate A with B as long as we get desired word length
we could also make some improvements by removing words shorter than
maximum word length at the moment(say we have words 1 and 2 letters long
we remove those 1 letter before generating 3 letter long ones)

now the question is
what is the fastest solution(trees or sth else)?
some sample would be nice

#2 Guest_SilverSandStorm_*

Guest_SilverSandStorm_*
  • Guests

Posted 06 March 2005 - 05:48 PM

i wanted to ask for possible algorithms used to generate dictionary
the simple one looks like this
we have two arrays
A=[a,b,c] and B=[a,b,c]
we have length of word as parameter
and we do sth like this
we concatenate A with B as long as we get desired word length
we could also make some improvements by removing words shorter than
maximum word length at the moment(say we have words 1 and 2 letters long
we remove those 1 letter before generating 3 letter long ones)

now the question is
what is the fastest solution(trees or sth else)?
some sample would be nice

<{POST_SNAPBACK}>

Not sure if I understand you...your word should have 1 char from A, then 1 from B, then 1 from A and so on?

#3 ShadowRun

ShadowRun

    Corporal

  • Sergeant Major
  • 170 posts

Posted 07 March 2005 - 04:41 AM

A you can treat as an output words
B is a dictionary
to make it simplier
1 A is empty B=[a,b,c]
2 A=B
3 A=A+B=[a,b,c,aa,ab,ac,ba,bb,bc,ca,cb,cc]
4 and so on

2nd solution is to create a tree
we bind all the elements of B to root
then to all those binded elements we bind B elements again
and the deepth of such tree is our word length
by traveling through that tree we create words by passing all the nodes from
root to the bottom we actually get all the combinations

i've found free library
i'll take a look at src code
http://www.password-crackers.com/pcl.html

and a good paper
http://packetstorm.linuxsecurity.com/papers/password/art_of_brute_forcing.txt


#4 ShadowRun

ShadowRun

    Corporal

  • Sergeant Major
  • 170 posts

Posted 07 March 2005 - 05:57 PM

here's a sample code
maybe you'll understand my idea this way ;)

void generate(int length)
{  
	vector<string> passes;
	string letters = "abcdefghijklmnopqrstyz1234567890@#$!|%&";
	string temp;
	vector<string> vtmp;
	//initialize vector
	for(int i = 0; i < letters.size(); i++ ){
  temp = letters[i];
  passes.push_back(temp);
  count++;  
	}
	//generate desired word length
	for(int i = 1; i < length; i++ ){
  //empty old ones (shorter)
  if (i > 1){
 	 for(unsigned int x = 1; x <= passes.size(); x++ )
    if (passes[x].size() == i)
   	 vtmp.push_back(passes[x]);  
 	 passes = vtmp;
  }
  for(unsigned int j = 1; j <= passes.size(); j++ ){
 	 if (passes[j].size() == i){ //do not repeat same length
                for(unsigned int k = 0; k < letters.size(); k++ ){
                    temp = (string)passes[j];
   	 temp += letters[k];
   	 if (login(temp) == true) {
      cout << "the password is " << temp;
      return;
   	 }
   	 passes.push_back(temp);   	 
   	 count++;
    }
 	 }
  }
	}
}

this code worked 4 me
it hangs my computer (after using login(pass) function) :PPP
but it seems like app i'm brute forcing is not prepared
for such things :P

greetz

#5 gman24

gman24

    Specialist

  • Sergeant Major
  • 643 posts

Posted 12 March 2005 - 04:21 PM

I don't know if this is what your thinking about:

Use:
Enter the charset: EX: abcdefghijklmnopqrstuvwxyz
Enter where to start, can be any length EX: aaa
Enter where to end: ex: zzzzz
Enter the filename:

The charset is validated and matched in the order the objects were entered.

If you entered zcdefghijklmnopqrstuvwxyba as the charset.
Then the start for a 3-6 char generation would be zzz and the end would be aaaaaa.


I modded some functions from my p2p cracker and added some new code.

Attached Files



#6 Guest_SilverSandStorm_*

Guest_SilverSandStorm_*
  • Guests

Posted 13 March 2005 - 04:35 AM

Ah okay,

What you are looking to do is generate all possible words from a given character set, of all possible lengths. That is an exponential time algorithm, that is

Time(length of word = n + 1) = (n+1) Time(length of word = n) (approximately).


The idea is to spend minimum extra time on other operations, since atleast one operation per actual word is require (assignment !)


In my opinion the best balance would be something of this sort:-

For a length n word, create an array (or just use int a[102212] beforehand)
int a[n]

Let your charset be
char charset[size]

Initially let a[i] = 0 for all i.

Then use simple arithmetic.....just keep incrementing a[0]. If it hits size, then set it to 0, and add 1 to a[1]. If that hits size, then set a[1] to 0 and add 1 to a[2] and so on....until a[n] = size, which means we are done!

Each time you increment a[0], you will get one word = charset[a[0]] + charset[a[1]] + charset[a[2]]....charset[a[n]].

When will we spend more than 1 operation for a word? When it is time to increment.
When we spend i operations per word (i > 1) that means i places are being incremented in the array. That will happen when the first i places are all = size. So the remaining n - i are free, so the possible combinations with this occuring = size^(n - i).

Take a summation over all i > 1, we get 1 + size^1 + size^2 + size^3...size^n-1

That is an estimate of how efficient the process is :)

Some code off my head...recursive and unoptimized but one method :)


#include <stdio.h>
#define LENGTH 20
#define SIZE 26
main()
{

void Increment(char iarray[], int n);

char charset[SIZE] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char array[LENGTH];
char word[LENGTH];

memset(array, 0, LENGTH);

while (array[LENGTH] < SIZE)
{
        for (int i = 0; i < LENGTH; i++)
              word[i] = charset[(int)array[i]];
        UsePassWord(word);
        Increment(array, 0);
}

}

void Increment(char iarray[], int n)
{
   iarray[n]++;
   if (iarray[n] == SIZE)
   {
        if (n < LENGTH)
        {
             Increment(iarray, n + 1)
             iarray[n] = 0;
        }
   }
}


#7 plinius

plinius

    Private First Class

  • Members
  • 70 posts

Posted 13 March 2005 - 05:48 AM

http://www.insidepro.../doc/003e.shtml
->a optimized version (nice technique they use b.t.w. )


#include "stdio.h"
#include "windows.h"

int main(int argc, char* argv[])
{
        static char szPassword[256];
        ZeroMemory(szPassword, sizeof(szPassword));

        static char szAlphabet[256];
        strcpy(szAlphabet, "ABC");

        static unsigned char bAlphabet[256];
        ZeroMemory(bAlphabet, sizeof(bAlphabet));

        int i = 0, k = 0;
        while (TRUE)
        {
                 bAlphabet[k] = (unsigned char)szAlphabet[i];
                 if (!szAlphabet[i])
                         brea k;
                 k = (unsigned char)szAlphabet[i];
                 i++;
        }

        while (TRUE)
        {
                 __asm
                 {
                     pushad
                     mov edi,offset szPassword
                     mov ebx,offset bAlphabet
                 L1: movzx eax,byte ptr [edi]
                     xlat
                     cmp al,0

                     je L3
                     mov [edi],al
                     jmp L5

                 L3: xlat
                     stosb
                     jmp L1

                 L5: popad
                 }
                 printf("%s\n", szPassword);
        }
        return 0;
}


#8 Guest_SilverSandStorm_*

Guest_SilverSandStorm_*
  • Guests

Posted 13 March 2005 - 10:13 AM

@Plinius...

Thanks for your post, was initially very interested but it turned out to be just a minor modification of the algorithm ( to deal with chars rather than indices )

The assembly language code is interesting though...not my area of expertise, haven't touched it in 9 odd years.

#9 ShadowRun

ShadowRun

    Corporal

  • Sergeant Major
  • 170 posts

Posted 14 March 2005 - 10:02 AM

i tried plinius code
it runs quite fast in a few sec it runs out of memory ;)

i came with other solution consumes CPU instead of memory
which is always limited

void generate(string prefix, unsigned int max_len){
	string newword;
	if (!found){
  for (unsigned int i = 0; i < alphabet.size(); i++){
 	 newword = prefix;
 	 newword += alphabet[i];
 	 if (newword.size() < max_len){      
    if (login(newword)){
   	 found = true;
   	 cout << " the password is " << newword << "\n";
   	 return;

    }
    generate(newword, max_len);
 	 }
 	 else 
    if (newword.size() == max_len){
   	 if (login(newword)){
      cout << " the password is " << newword << "\n";
      return;
   	 }
    }
  }
	}
}
int main(int argc, char* argv[])
{
.............
   for (unsigned int i = 0; i < alphabet.size(); i++){
      temp = alphabet[i];
      generate(temp,3);
   }
.............
}

waiting for your opinions
i want to add multithreading to this algorithm so it should run faster
(don't know how at the moment :P )

greetz

#10 gman24

gman24

    Specialist

  • Sergeant Major
  • 643 posts

Posted 14 March 2005 - 01:06 PM

The original use of the function I modded to generate to file was to do split cracks on different peers.

You can easily mod the version I gave to split equally among threads, if you have an algorithm to figure out how to split the work.

You can probably use the function from my p2p cracker when I post it.

It has limitations on how far it can be split dependant on the charset however. Bigger charsets can be split more.
Edit: the above sentence is in reference to my distro function in my p2p cracker. Not to the function posted.

BTW:
It doesn't have to write the pass to file, you can do whatever with it.





Also tagged with one or more of these keywords: algorithm, dictionary