October 7, 2022 at 11:49 PM
Hello folks,
I'm parsing data breaches and loading them into a program to search locally, and cit0day is giving me a fucking headache. It seems to be upwards of 90% garbage. Missing salts, truncated hashes, mis-titled files: the entire collection is a train wreck. Even when I sort by the largest files, many of them are just millions of lines of fake records created by spam bots.
Are any of these breaches even worth keeping around? Which ones? For something the media hyped up so much, it doesn't seem worth 10 minutes of my day to write a parsing script for it.


