Script to extract data dumps from the Caselaw Access Project into sqlite

mdlincoln/caselaw-to-sqlite

case.law to sqlite

This script parses one or more data export files from the Caselaw Access Project into a single structured sqlite database, making the data easier to work with in analysis tools like R or Python.

Requirements

  • jq
  • sqlite3 command line shell

Usage

To run, pass the path to the data.jsonl.xz file inside the unzipped export you want to add to your database:

./extract.sh Illinois-20180829-text/data/data.jsonl.xz

The script will first use jq to write out several CSV files to disk, and then load and index them into an sqlite database named caselaw.sqlite. You may repeat this process with multiple different dumps from the project website, adding further data into caselaw.sqlite.
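The two stages described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual contents: the function name `load_dump`, the `cases` table, the `cases.csv` file name, and the choice of the `id`, `name_abbreviation`, and `decision_date` fields are all assumptions made for the example, and it presumes `xz` is available alongside jq and sqlite3.

```shell
# Hypothetical sketch: load one dump into a "cases" table.
# Table, column, and file names are illustrative assumptions.
load_dump() {
  dump=$1                    # path to a data.jsonl.xz file
  db=${2:-caselaw.sqlite}    # target database
  csv=${3:-cases.csv}        # intermediate CSV, overwritten on every run

  # Stage 1: decompress the dump and write one CSV row per case.
  xz -dc "$dump" |
    jq -r '[.id, .name_abbreviation, .decision_date] | @csv' > "$csv"

  # Stage 2: append the rows to the database and index them.
  sqlite3 "$db" <<SQL
CREATE TABLE IF NOT EXISTS cases (id INTEGER, name_abbreviation TEXT, decision_date TEXT);
.mode csv
.import "$csv" cases
CREATE INDEX IF NOT EXISTS cases_decision_date ON cases (decision_date);
SQL
}
```

Because the table is created with `IF NOT EXISTS` and rows are imported into the existing table, calling the function again with a different dump appends to the same database, which matches the repeatable behaviour described above.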

If you have downloaded several states' files into one directory (here, a directory named downloads/), you can use this command to run the script over all of the dumps:

find downloads -name data.jsonl.xz -exec ./extract.sh {} \+
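With the `+` terminator, find appends every matching path to a single invocation of the script. If the script turns out to process only one path per run (the README does not say either way), a per-file variant may be safer. The sketch below wraps that variant in a helper; the function name `run_per_dump` is hypothetical.

```shell
# Hypothetical helper: run a script once for each dump found under a directory.
run_per_dump() {
  dir=$1      # directory containing the unzipped exports
  script=$2   # script to run, e.g. ./extract.sh

  # The \; terminator makes find start one process per matching file.
  find "$dir" -name data.jsonl.xz -exec "$script" {} \;
}
```

For the layout described above, this would be called as `run_per_dump downloads ./extract.sh`.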

Note that the CSV files are only intermediaries, and will be overwritten every time you run the script with a new data file.
