Killaloop
I needed this today, and since link.box.sk and newdata.box.sk are not online (all the tools I found link to them), I wrote a simple script. Have fun.

#!/usr/bin/perl
#
# dupes.pl
#
# Removes dupes from wordlists. Each word must be on
# its own line. The result is written to duped.txt.

use strict;
use warnings;

die "Usage: $0 wordlist\n" if @ARGV != 1;
open(my $in, '<', $ARGV[0]) or die "wordlist not found: $!\n";
open(my $out, '>', 'duped.txt') or die "cannot write duped.txt: $!\n";

# Read the whole wordlist into memory, one line per element.
my @words = <$in>;
close($in);

my %seen;
foreach my $word (@words) {
    chomp $word;    # strip the trailing newline
    # Print a word only the first time it is seen.
    unless ($seen{$word}) {
        print $out "$word\n";
        $seen{$word} = 1;
    }
}

close($out);
ShadowRun
Nice, but I'd rather use RAPTOR ;)
or the simple sort function in TextPad or UltraEdit.

Greetz
Killaloop
QUOTE (ShadowRun @ May 27 2004, 09:44 AM)
Nice, but I'd rather use RAPTOR ;)
or the simple sort function in TextPad or UltraEdit.

Greetz

I'll check out Raptor some day. For now it was too much for what I needed :)
UltraEdit and TextPad are somewhat slow at removing dupes from bigger files because they work differently from this script. They sort the file, which takes far more comparisons than one simple "seen" lookup per word, and that's way too slow for my 200 MB wordlist :)
The script might still be of use for playing around with wordlists.
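To illustrate the "seen" lookup described above, here is a minimal sketch that checks each word against a hash as it is read, so only the distinct words are held in memory instead of the whole list. The name dedupe_stream and the sample words are made up for illustration; this is not the script posted above, just one way the idea could look.

```perl
use strict;
use warnings;

# Copy each distinct line from $in to $out, keeping first occurrences
# in their original order. Only the %seen hash is held in memory,
# not the whole wordlist. (dedupe_stream is a hypothetical name.)
sub dedupe_stream {
    my ($in, $out) = @_;
    my %seen;
    while (my $line = <$in>) {
        chomp $line;
        # $seen{$line}++ is false the first time a word appears.
        print {$out} "$line\n" unless $seen{$line}++;
    }
    return scalar keys %seen;    # number of distinct words
}

# Demo on an in-memory wordlist (sample data, assumed for illustration).
my $wordlist = "admin\nroot\nadmin\nguest\nroot\n";
open(my $in, '<', \$wordlist) or die "cannot open in-memory list: $!";
my $count = dedupe_stream($in, \*STDOUT);
close($in);
```

For a real wordlist, open the file and pass its handle instead of the in-memory one. The hash still grows with the number of distinct words, so a list that is mostly unique will still need a lot of memory.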
whiskah
Yep, it's better to sort in UltraEdit when you have a huge wordlist (40+ MB), because Raptor crashes or takes too much CPU power and memory when opening huge wordlists.

/edit
I tested your Perl script and I see the same effect as with Raptor when sorting huge wordlists (I tried it on a smaller 28 MB wordlist): it takes a lot of memory and CPU power.

There's also a typo: $dube should have been $dupe.
AgentOrange
Naa, sort -u is the best. sort is so popular and powerful that it comes with pretty much every *nix distro. There is nothing wrong with writing your own code even if better alternatives exist. It shows that you like to program. I do it sometimes, to really see how something works.

Peace out
Killaloop
QUOTE (whiskah @ May 27 2004, 01:15 PM)

/edit
I tested your Perl script and I see the same effect as with Raptor when sorting huge wordlists (I tried it on a smaller 28 MB wordlist): it takes a lot of memory and CPU power.

There's also a typo: $dube should have been $dupe.

Thanks. I changed the code to make it easier to understand and mistyped it :)

Processing a big list takes a lot of CPU power, yep, but the script doesn't hang, nor does it produce strange output errors like Raptor. It's just one little example; for a better dupe check I would have to use more arrays.
I haven't come across one good program that does this stuff on Windows without being a resource hog.
For the programs I have seen, I was not sure whether they are case-sensitive when removing dupes.
So I took five minutes to write this and thought you guys might need it for something.
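On the case-sensitivity question: with a hash-based check, the choice comes down to what is used as the hash key. A small sketch, assuming a made-up helper name (dedupe_words) and sample words, which keeps the first spelling it sees when case is ignored:

```perl
use strict;
use warnings;

# Return the distinct words in first-seen order.
# With $ignore_case set, "Admin" and "admin" count as the same word
# and only the first spelling is kept. (dedupe_words is a hypothetical name.)
sub dedupe_words {
    my ($words, $ignore_case) = @_;
    my %seen;
    return grep {
        my $key = $ignore_case ? lc($_) : $_;
        !$seen{$key}++;
    } @$words;
}

my @list = qw(Admin admin ROOT root guest);
print join(',', dedupe_words(\@list, 0)), "\n";    # prints Admin,admin,ROOT,root,guest
print join(',', dedupe_words(\@list, 1)), "\n";    # prints Admin,ROOT,guest
```

For password wordlists the case-sensitive mode is probably the one you want, since "Admin" and "admin" are different passwords.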

@AgentOrange
sort -u sure is the best, but there's no program like that for Windows.
And you're right, I wrote it to see how it works, because removing dupes is often needed in programs that work with lists (they should do it, but most scan programs and bruteforcers are "stupid" and don't check).
There is always better code than your own, but only by trying to write it yourself can you learn :)
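For anyone without sort at hand, a rough Perl approximation of what sort -u produces could look like the sketch below. It assumes plain string comparison (similar to sort under the C locale, not locale-aware collation), and sort_unique is a made-up name. Unlike the script above, it does not keep the original order of the words.

```perl
use strict;
use warnings;

# Return the distinct words in sorted order, roughly like `sort -u`
# under the C locale. (sort_unique is a hypothetical name.)
sub sort_unique {
    my @words = @_;
    my %seen;
    $seen{$_} = 1 for @words;    # hash keys are unique by definition
    return sort keys %seen;      # default sort: plain string comparison
}

my @list = qw(root admin guest admin root);
print "$_\n" for sort_unique(@list);    # prints admin, guest, root on separate lines
```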

ShadowRun
QUOTE
sort -u sure is the best, but there's no program like that for Windows.


That's not true:

hxxp://unxutils.sourceforge.net/UnxUtils.zip
Size: 3.3 MB (3,365,638 bytes)

Check it out :D
I've attached a complete list of the tools included in this zip.
Killaloop
QUOTE (ShadowRun @ May 28 2004, 10:02 AM)
QUOTE
sort -u sure is the best, but there's no program like that for Windows.


That's not true:

hxxp://unxutils.sourceforge.net/UnxUtils.zip
Size: 3.3 MB (3,365,638 bytes)

Check it out :D
I've attached a complete list of the tools included in this zip.

Thanks, useful stuff.
I still love to write my own little scripts, since I'm just starting out with all this.

PS: check out my bannerfilter script :)
whiskah
Your script is indeed very useful, and I was totally wrong to compare it to a commercial program. If you find a way to speed it up or multithread it, please let us know ;)
starsky32
Well, I think you should try this little one: OnceIsEnough, v4.
It is free, it is fast, and I use it to remove dupes from my 500 MB wordlist, so I think it can handle big lists. It can also sort or use a reference list.
I haven't tested it with bigger lists, but I think it will be OK.

Starsky32.
Iltis
QUOTE (starsky32 @ May 29 2004, 10:19 PM)
Well, I think you should try this little one: OnceIsEnough, v4.
It is free, it is fast, and I use it to remove dupes from my 500 MB wordlist, so I think it can handle big lists. It can also sort or use a reference list.
I haven't tested it with bigger lists, but I think it will be OK.

Starsky32.

Really nice one, starsky.
This prog really kicks ass :D
It's easy to use and very fast :)
That's what I had been looking for for a long time :)
Nothing left to say except thank you :)