These instructions assume you're using Linux (macOS should also work) with Docker
installed.
NOTE: If you receive a "permission denied" error, prefix the commands below with sudo (i.e., sudo docker ...).
To follow along, you'll first need to build an Odinson index using the provided data.
docker run \
--name="odinson-extras" \
-it \
--rm \
-e "HOME=/app" \
-v "$PWD/odinson:/app/data/odinson" \
--entrypoint "bin/index-documents" \
"lumai/odinson-extras:latest"
Next, launch the REST API service:
docker run \
--name="odinson-rest-api" \
-it \
--rm \
-e "HOME=/app" \
-p "0.0.0.0:9001:9000" \
-v "$PWD/data:/app/data/odinson" \
"lumai/odinson-rest-api:latest"
If the service launched correctly, you should be able to view the OpenAPI docs for the REST API at the following URL:
Let's test a basic pattern which identifies grammatical subjects:
(?<subject> [tag=/(NN|JJ).*/]* [incoming=nsubj] [tag=/(NN|JJ).*/]*)
We can apply it by executing a GET request to the /api/execute/pattern endpoint.
The following URL will display all matches for this pattern in our test corpus:
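A request of this kind can also be assembled programmatically, which avoids percent-encoding the pattern by hand. The following is a minimal sketch using only the Python standard library. The host and port come from the port mapping used when launching the REST API; the query parameter name odinsonQuery and the helper names build_pattern_url and fetch_matches are assumptions made for illustration, so check the OpenAPI docs of your running service for the exact parameter names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host and port follow the "-p 0.0.0.0:9001:9000" mapping used to launch the REST API.
BASE_URL = "http://0.0.0.0:9001"

# The subject pattern from the text above.
PATTERN = "(?<subject> [tag=/(NN|JJ).*/]* [incoming=nsubj] [tag=/(NN|JJ).*/]*)"


def build_pattern_url(pattern, base_url=BASE_URL):
    """Return a GET URL for /api/execute/pattern with the pattern percent-encoded."""
    # ASSUMPTION: the pattern travels in a query parameter named "odinsonQuery".
    query = urlencode({"odinsonQuery": pattern})
    return f"{base_url}/api/execute/pattern?{query}"


def fetch_matches(pattern, base_url=BASE_URL):
    """Send the GET request and decode the JSON response.

    Requires the REST API container to be running.
    """
    with urlopen(build_pattern_url(pattern, base_url)) as response:
        return json.load(response)
```

With the service running, fetch_matches(PATTERN) should return the decoded JSON response; the structure of that response is not assumed here.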
For most nontrivial applications, we'll want to use a grammar (i.e., a set of interacting rules defining an information need). Odinson grammars are written in YAML. In the following example, we'll take a very simple grammar defined in grammar.yml and apply it with a POST request.
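To give a sense of what such a POST request involves, here is a rough sketch using only the Python standard library. It is not the script shipped in the Docker image. The endpoint path /api/execute/grammar, the JSON field names (grammar, pageSize, parentQuery), and the helper names are all assumptions made for illustration; consult the OpenAPI docs of your running service for the actual request schema.

```python
import json
from urllib.request import Request, urlopen

# Host and port follow the "-p 0.0.0.0:9001:9000" mapping used to launch the REST API.
BASE_URL = "http://0.0.0.0:9001"


def build_grammar_request(grammar_text, page_size=1, parent_query=None, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying a YAML grammar as JSON."""
    # ASSUMPTION: the endpoint path and the field names below are illustrative.
    payload = {"grammar": grammar_text, "pageSize": page_size}
    if parent_query is not None:
        payload["parentQuery"] = parent_query
    return Request(
        url=f"{base_url}/api/execute/grammar",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def apply_grammar(grammar_path, **kwargs):
    """Read a grammar file, send it to the service, and decode the JSON response.

    Requires the REST API container to be running.
    """
    with open(grammar_path, encoding="utf-8") as handle:
        request = build_grammar_request(handle.read(), **kwargs)
    with urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running service.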
We'll be using Python to make our POST request. First, we need to build our Docker image:
docker build -f python/Dockerfile -t "parsertongue/odinson-example:python" python/
Filter using a regex on the title:
docker run -it \
--network="host" \
-v "$PWD:/data" \
"parsertongue/odinson-example:python" \
--grammar /data/grammar.yml \
--host http://0.0.0.0:9001 \
--page-size 1 \
--parent-query "title:Subcell.*"
Filter using an exact match on pubType:
docker run -it \
--network="host" \
-v "$PWD:/data" \
"parsertongue/odinson-example:python" \
--grammar /data/grammar.yml \
--host http://0.0.0.0:9001 \
--page-size 1 \
--parent-query "pubType:\"epreprint\""
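The backslash-escaped quotes in the last command exist only to get literal double quotes through the shell. When the parent query is built in Python instead, the two filter strings shown above can be produced directly. The helper names regex_filter and exact_filter below are hypothetical, and the escaping rule in exact_filter assumes the service accepts a Lucene-style quoted phrase.

```python
def regex_filter(field, pattern):
    """Return a field:pattern parent query, e.g. title:Subcell.*"""
    return f"{field}:{pattern}"


def exact_filter(field, value):
    """Return a field:"value" parent query with the value wrapped in double quotes."""
    # ASSUMPTION: backslashes and double quotes inside the value are backslash-escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# The two parent queries used in the commands above.
TITLE_QUERY = regex_filter("title", "Subcell.*")
PUBTYPE_QUERY = exact_filter("pubType", "epreprint")
```

Either string can then be supplied as the parent query of a request, with no shell quoting involved.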