Sample: COVID19 publication data parsing
Chris Mattmann edited this page Apr 17, 2023 · 7 revisions
Using the CORD-19 data provided by the Office of Science and Technology Policy and Kaggle, you can run a very cool demo of MEMEX GeoParser.
docker pull nasajplmemex/geo-parser
docker-compose up -d
pip install jupyterlab && pip install notebook
pip install pandas pysolr requests tqdm
Assuming that you have checked out GeoParser to $GEOPARSER_HOME, then:
- cd $GEOPARSER_HOME/examples/covid19
- ./download-metadata.sh
- ./create-core.sh (make sure that you can see http://localhost:8983/solr/; if not, wait a few seconds for Solr to start up.)
- ./add-fields.sh
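If you would rather script the "wait a few seconds for Solr" check than refresh the browser, a small polling helper along these lines may do. It is only a sketch: `solr_is_up` and `wait_until` are hypothetical helper names, and the retry count and delay are arbitrary defaults rather than values taken from GeoParser.

```python
import http.client
import time
import urllib.error
import urllib.request

SOLR_URL = "http://localhost:8983/solr/"  # the URL mentioned in the steps above


def solr_is_up(url=SOLR_URL, timeout=3.0):
    """Return True if an HTTP GET on `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a half-started server: not up yet.
        return False


def wait_until(check, attempts=30, delay=2.0, sleep=time.sleep):
    """Call `check()` up to `attempts` times, pausing `delay` seconds between tries.

    Returns True as soon as `check()` succeeds, False if every attempt fails.
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling `wait_until(solr_is_up)` returns `True` once Solr responds and `False` if it never does within the assumed 30 attempts, in which case checking the container logs is the next step.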
cd $GEOPARSER_HOME/examples/covid19 && jupyter notebook
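The `Ingest COVID data.ipynb` notebook in this directory is the supported way to load the metadata into Solr. For readers who want a rough picture of what such an ingest step involves, here is a minimal sketch that is not the notebook's actual code. It assumes the metadata is a CSV file, that the column names in `FIELDS` exist in it and in the Solr schema, and that `pysolr` is installed; `row_to_doc`, `batched`, and `index_csv` are hypothetical names.

```python
import csv

# Assumed column names; check them against the metadata file you downloaded.
FIELDS = ("cord_uid", "title", "abstract", "publish_time")


def row_to_doc(row, fields=FIELDS):
    """Keep only the chosen columns and drop empty or missing values."""
    return {name: row[name] for name in fields if row.get(name)}


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_csv(csv_path, solr_url, batch_size=500):
    """Read a CSV file and send its rows to Solr in batches; return the row count."""
    import pysolr  # imported lazily so the helpers above work without it

    solr = pysolr.Solr(solr_url, timeout=60)
    total = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        docs = (row_to_doc(row) for row in csv.DictReader(handle))
        non_empty = (doc for doc in docs if doc)
        for batch in batched(non_empty, batch_size):
            solr.add(batch, commit=False)  # defer the commit until the end
            total += len(batch)
    solr.commit()
    return total
```

Sending documents in batches with a single commit at the end is a common way to keep a long ingest from spending most of its time on commits; the batch size of 500 is an arbitrary choice.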
- Run Ingest COVID data.ipynb (will take ~30-40 minutes)
- Click on Configure Index Tab
- Set Domain Name to covid19_index.
- Set Index Path to http://localhost:8983/solr/covid19/
- Click add index to store the index of the domain in the database.
- Click on Database Icon Tab
- Click on GeoParse button, and then wait (takes ~10 minutes)
- Click on View button
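If the GeoParse step appears to do nothing, it can help to confirm that the covid19 core actually holds documents. The sketch below queries Solr's standard `/select` handler with `rows=0` and reads `numFound` from the JSON response. The helper names are hypothetical, and the code assumes the core is reachable at the Index Path used above.

```python
import json
import urllib.parse
import urllib.request

INDEX_PATH = "http://localhost:8983/solr/covid19/"  # Index Path from the steps above


def select_url(index_path, query="*:*", rows=0):
    """Build a URL for the core's /select handler that asks for JSON output."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return index_path.rstrip("/") + "/select?" + params


def num_found(response_text):
    """Extract the total match count from a Solr JSON select response."""
    return json.loads(response_text)["response"]["numFound"]


def count_documents(index_path=INDEX_PATH, timeout=10.0):
    """Ask Solr how many documents the core currently holds."""
    with urllib.request.urlopen(select_url(index_path), timeout=timeout) as resp:
        return num_found(resp.read().decode("utf-8"))
```

A result of zero from `count_documents()` suggests the ingest notebook did not finish or wrote to a different core.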