BeenVerified scraped data
This looks like fun.
thanks
nice project
(April 15, 2022, 09:44 PM) Martinabel007 Wrote:
File type: Multiple folders containing JSON files
File size: 4 GB compressed (~30 GB uncompressed)
Date: fairly recent, I guess (early 2020)
Source: RF

My little project now is working out how to make the files accessible through a database.

I hope MySQL is a good start.

Converting to CSV with Python would have been an option, but I'm going with a database :D


Credits: the RF user who posted it (I forget the username)
Data from the first file
[
  {
    "address": {
      "city": "HICKORY",
      "state": "KY",
      "street": "3919 STATE ROUTE 301",
      "zip": "42051"
    },
    "age": 22,
    "aka": [
      {
        "firstname": "ABBY",
        "lastname": "PADGETT",
        "middlename": "A"
      },
      {
        "firstname": "ABBY",
        "lastname": "SPENCER",
        "middlename": "A"
      }
    ],
    "autos": [
      {
        "make": "TOYOTA",
        "model": "COROLLA",
        "year": 2011
      }
    ],
    "court": {
      "judgements": true
    },
    "dateOfBirth": "1996-04-08",
    "dateOfDeath": "1997-12-18",
    "education": {
      "educationLevel": "High School"
    },
    "emails": [
      "[email protected]"
    ],
    "firstname": "ABBY",
    "gender": "F",
    "id": "5c6e3ed2d20c6db42860a479",
    "jobs": [
      {
        "title": "Student"
      }
    ],
    "lastname": "BRYANT",
    "middlename": "A",
    "pastAddresses": [
      {
        "city": "Billings",
        "dateRange": {
          "endYear": 2013,
          "startYear": 2010
        },
        "loc": "45.7813,-108.5727",
        "state": "MT",
        "street": "3909 Swallow Ln",
        "zip": "59102"
      }
    ],
    "phone": "7326951761",
    "politicalParty": "Republican",
    "professionalLicense": "NURSING",
    "profilePics": [
      {
        "url": "https://media.licdn.com/mpr/mprx/0_YSs4eLVFQfq_ou1wVDo9IXMFGfE_mmPwVJXvmkwWWwIhW73LZgRq2mTCzRj"
      },
      {
        "url": "https://graph.facebook.com/1179099847/picture?type=large"
      },
      {
        "url": "https://s-media-cache-ak0.pinimg.com/avatars/booth_abby-12_140.jpg"
      },
      {
        "url": "https://lh4.googleusercontent.com/-EPRX48-Rp8A/AAAAAAAAAAI/AAAAAAAAAAA/Te7uAkwYeuE/photo.jpg"
      },
      {
        "url": "https://media.licdn.com/mpr/mpr/shrinknp_400_400/p/8/000/272/397/1e30b4b.jpg"
      },
      {
        "url": "http://media.licdn.com/mpr/mpr/p/8/000/272/397/1e30b4b.jpg"
      }
    ],
    "race": "Caucasian",
    "religion": "Christian",
    "social": [
      {
        "domain": "facebook",
        "url": "https://www.facebook.com/abbywonwon"
      },
      {
        "domain": "linkedin",
        "url": "https://www.linkedin.com/pub/abby-shroyer/53/a92/605"
      },
      {
        "domain": "facebook",
        "url": "https://www.facebook.com/people/_/100000628123662"
      }
    ],
    "title": "MS"
  }
]



Will post more samples when I get to my PC.
Thank you!
MongoDB can process this data.
thank you so much
Thank you
Don't convert to CSV; you'll ruin it the way the apollo.io DBs got ruined.
`jq -c . *.json > ../merged.json && sort -u ../merged.json -o deduped.json`
Then you can just compress it with zstd to keep read speeds fast.

