November 12, 2022 at 8:08 PM
[align=justify]Sometimes you need to test scripts that parse databases, distribute content, scrape data and the like. Maybe you don't want to use real databases for whatever reason, so let's create a script that generates databases for these kinds of tests. Warning: do not try to use this code to sell fake databases. The emails will be easily recognizable and you will have problems if you try to trick someone.[/align]
- This tutorial requires basic knowledge of Python.
[align=justify]So here is the plan: we will create a script that 1) generates a set of emails across one or more domains, 2) adds a password hash to each one and 3) adds a dummy salt. Each line of the database will have the structure email:password_hash,salt. First, we import all the necessary modules:[/align]
from hashlib import md5
from base64 import b64encode
from secrets import token_urlsafe
from random import choice
[align=justify]I'm using MD5 as the hash algorithm. Our dummy salt will be saved in Base64, for no particular reason. The choice function picks randomly among the pre-defined domains. Finally, token_urlsafe from the secrets module is used to generate random passwords. If you prefer, you can use a wordlist of real passwords instead. Now let's configure some parameters:[/align]
total = 60000
domain = ["test.com", "test.net", "test.org"]
tmp_list = []
total defines the number of emails to be generated. domain is a list of all possible domains for your emails. If you want to use a single domain, leave the list with a single element. tmp_list is a temporary list that holds the lines while they are being generated. The next step is to load wordlists of first and last names to build the user part of each email. I'm using two specific wordlists and the link to them is at the end of the post. Also, if you are going to use a wordlist of passwords instead of randomly generating them, this is the time to open it:
with open("first.txt", "r") as fl:
    first = fl.read().splitlines()  # one first name per line
with open("last.txt", "r") as fl:
    last = fl.read().splitlines()  # one surname per line
We are ready to generate the emails now. The snippet below follows this logic:
- We randomly pick a first name from the "first" list and a surname from the "last" list.
- From the first name we keep only the first letter. This is a very common pattern in corporate emails: a user "Will Smith" gets "wsmith" as the username.
- Then we join the generated username to a randomly chosen domain and we have an email address.
- In the next step we generate our :password_hash,salt pair. If you are using a wordlist of passwords, use the choice function to randomly pick a password from it.
- Then we add the complete line to our tmp_list variable.
counter = 0
while counter < total:  # "<=" would generate one extra line
    first_name = choice(first).strip().lower()
    surname = choice(last).strip().lower()
    if not first_name or not surname:
        continue  # blank wordlist entry: draw again
    # First initial + surname, e.g. "Will" + "Smith" -> "wsmith"
    email = "{0}{1}@{2}".format(first_name[0], surname, choice(domain))
    # Dummy salt, stored as Base64
    salt = token_urlsafe(16)
    salt64 = b64encode(salt.encode())
    # MD5 of a random password concatenated with the salt
    password = md5((token_urlsafe(64) + salt).encode()).hexdigest()
    pass_pkt = ":{0},{1}".format(password, salt64.decode())
    tmp = email + pass_pkt
    tmp_list.append(tmp)
    print(tmp)  # Optional
    counter += 1
[align=justify]Now you just need to save the contents of tmp_list to a file. Here we use a txt file. To save as CSV, remember to change the ":" separator in the snippet above.[/align]
[align=justify]
with open("output.txt", "w") as fl:
    for i in tmp_list:
        fl.write(i + "\n")
[/align]
[align=justify]Basically, that's all. By modifying a few lines you can generate databases with different formats, more columns, no salt, etc. The code and the name wordlists used are at this link: https://github.com/rf-peixoto/fakedb Let me warn you one more time: don't try to use this script to sell fake databases! Anyone can check whether the emails are real with a quick search.[/align]
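To give an idea of what "modifying a few lines" could look like, here is a rough sketch that wraps the generation of a single line in a function. The function name make_record and its parameters (passwords, with_salt, sep) are my own illustration and are not part of the script in the repository. It assumes the same email pattern and the same MD5 + Base64 salt scheme used in this post.

```python
from base64 import b64encode
from hashlib import md5
from random import choice
from secrets import token_urlsafe


def make_record(first, last, domains, passwords=None, with_salt=True, sep=":"):
    """Build one fake line. Names and parameters here are illustrative only."""
    # First initial + surname, e.g. "Will" + "Smith" -> "wsmith"
    user = choice(first).lower()[0] + choice(last).lower()
    email = "{0}@{1}".format(user, choice(domains))
    # Take a password from a wordlist if one was given, else generate a random one
    plain = choice(passwords) if passwords else token_urlsafe(64)
    if not with_salt:
        return email + sep + md5(plain.encode()).hexdigest()
    salt = token_urlsafe(16)
    digest = md5((plain + salt).encode()).hexdigest()
    return email + sep + digest + "," + b64encode(salt.encode()).decode()


# Default format: email:password_hash,salt
record = make_record(["Will"], ["Smith"], ["test.com"])

# No salt, a different separator and a (hypothetical) one-word password list
unsalted = make_record(["Will"], ["Smith"], ["test.com"],
                       passwords=["hunter2"], with_salt=False, sep=";")
```

Calling make_record in a loop and appending each result to a list gives the same kind of output as the script above, but the format can be switched per call.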

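For the CSV output mentioned earlier, one possible approach is to let Python's csv module do the writing instead of changing the separator by hand. This is only a sketch: the function name write_csv, the header row and the sample line are my own assumptions, and it expects lines in the email:password_hash,salt format generated in this post.

```python
import csv
import io


def write_csv(records, handle):
    """Write 'email:password_hash,salt' lines as three-column CSV rows."""
    writer = csv.writer(handle)
    writer.writerow(["email", "password_hash", "salt"])
    for record in records:
        # Split on the first ":" only, then on the first "," only
        email, rest = record.split(":", 1)
        password_hash, salt = rest.split(",", 1)
        writer.writerow([email, password_hash, salt])


# Made-up sample line in the same format the generator produces
sample = ["wsmith@test.com:00000000000000000000000000000000,c2FsdA=="]
buffer = io.StringIO()
write_csv(sample, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of memory, pass a file opened with open("output.csv", "w", newline="") as the handle. The newline="" argument is what the csv module documentation recommends for files handed to csv.writer.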