Index for faster searching on multiple databases / datasets
Isn't it easier to parse the files individually into a database and then use SQL magic?
The best way to do this in most cases is to use a search engine like Elasticsearch.
Hey, another question related to this: I have multiple databases scattered across my VM. How can I merge them more efficiently? What tech stack should I use?
(September 1, 2022, 05:41 PM)trollinator321 Wrote: Isn't it easier to parse the files individually into a database and then use SQL magic?


How would you set up the DB structure? Would you use individual tables for each breach? Would you consolidate certain fields in an index table?
No, I would use one table for all breaches, with columns for every possible attribute (left empty when a breach doesn't include them).
Of course, preprocessing is needed, but afterwards you can cover that one table with as many indexes as you like ;)
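The single-table approach described above can be sketched like this in Python with SQLite. Note the column names (`email`, `username`, etc.) are hypothetical examples, not a fixed schema from this thread:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # use a file path for a persistent DB
cur = con.cursor()

# One wide table for all breaches; columns a given breach doesn't
# supply are simply left NULL. Column names here are illustrative.
cur.execute("""
    CREATE TABLE breaches (
        id       INTEGER PRIMARY KEY,
        source   TEXT NOT NULL,   -- which breach the row came from
        email    TEXT,
        username TEXT,
        password TEXT,
        phone    TEXT
    )
""")

# Index every column you expect to search on.
for col in ("source", "email", "username", "phone"):
    cur.execute(f"CREATE INDEX idx_breaches_{col} ON breaches({col})")

cur.execute(
    "INSERT INTO breaches (source, email, password) VALUES (?, ?, ?)",
    ("example_breach", "alice@example.com", "hunter2"),
)
con.commit()

row = cur.execute(
    "SELECT source, password FROM breaches WHERE email = ?",
    ("alice@example.com",),
).fetchone()
print(row)  # ('example_breach', 'hunter2')
```

Each lookup on an indexed column is then a B-tree search instead of a full scan, which is where the speedup comes from.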
Wow, thanks, I have the same question.
I did some indexing with Python, but it doesn't seem to work well.
I've used an SQLite FTS5 virtual table with the trigram tokenizer before.
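For reference, a minimal sketch of that FTS5 + trigram setup (table and column names are made up; the trigram tokenizer requires SQLite 3.34 or newer, and query terms must be at least 3 characters):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# FTS5 virtual table with the trigram tokenizer: supports fast
# substring-style searches (like LIKE '%...%', but indexed).
cur.execute(
    "CREATE VIRTUAL TABLE entries USING fts5(email, password, tokenize='trigram')"
)

cur.executemany(
    "INSERT INTO entries (email, password) VALUES (?, ?)",
    [
        ("alice@example.com", "hunter2"),
        ("bob@test.org", "letmein"),
    ],
)

# Substring match: 'xample' hits alice's row only.
hits = cur.execute(
    "SELECT email FROM entries WHERE entries MATCH ?", ("xample",)
).fetchall()
print(hits)
```

The trigram tokenizer breaks both the indexed text and the query into 3-character sequences, so MATCH behaves like an indexed substring search rather than whole-word search.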
Nice.
Apart from SQL, you could also use Elasticsearch or similar, but with appropriate indexes on all the columns you query, common SQL tooling should be fine.
(September 13, 2022, 06:37 PM)trollinator321 Wrote: No, I would use one table for all breaches, with columns for every possible attribute (left empty when a breach doesn't include them).
Of course, preprocessing is needed, but afterwards you can cover that one table with as many indexes as you like ;)


Would you be willing to share the CREATE TABLE query for your DB so we can get a better idea? Or just list the columns you use? Thanks in advance.

