Prerequisites

These instructions assume you're using Linux (macOS should also work) with Docker installed.

NOTE: If you receive a "permission denied" error, prefix the commands below with sudo (i.e., sudo docker ...).

Launching Odinson

To follow along, you'll first need to build an Odinson index using the provided data.

docker run \
  --name="odinson-extras" \
  -it \
  --rm \
  -e "HOME=/app" \
  -v "$PWD/odinson:/app/data/odinson" \
  --entrypoint "bin/index-documents" \
  "lumai/odinson-extras:latest"

Once the index has been built, launch the REST API:

docker run \
  --name="odinson-rest-api" \
  -it \
  --rm \
  -e "HOME=/app" \
  -p "0.0.0.0:9001:9000" \
  -v "$PWD/data:/app/data/odinson" \
  "lumai/odinson-rest-api:latest"

If the service launched correctly, you should be able to view the OpenAPI docs for the REST API at the following URL:

http://localhost:9001/api

Let's test a basic pattern that identifies grammatical subjects:

(?<subject> [tag=/(NN|JJ).*/]* [incoming=nsubj] [tag=/(NN|JJ).*/]*)

We can apply it by executing a GET request to the /api/execute/pattern endpoint.
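
As a minimal sketch, the request URL for that endpoint can be assembled in Python with the standard library, which takes care of URL-encoding the pattern. The query parameter name `odinsonQuery` and the helper `build_pattern_url` are assumptions made for illustration; check the OpenAPI docs at http://localhost:9001/api for the exact parameter names.

```python
from urllib.parse import urlencode

# Base URL of the REST API launched above (host port 9001).
BASE_URL = "http://localhost:9001"

# The subject pattern from this section.
PATTERN = "(?<subject> [tag=/(NN|JJ).*/]* [incoming=nsubj] [tag=/(NN|JJ).*/]*)"


def build_pattern_url(pattern, base_url=BASE_URL):
    """Return a GET URL for /api/execute/pattern with the pattern URL-encoded.

    NOTE: the parameter name "odinsonQuery" is an assumption, not something
    stated in this README.
    """
    query = urlencode({"odinsonQuery": pattern})
    return f"{base_url}/api/execute/pattern?{query}"


if __name__ == "__main__":
    # Print the URL so it can be pasted into a browser or passed to curl.
    print(build_pattern_url(PATTERN))
```

With the service running, the resulting URL can be opened in a browser or fetched with `urllib.request.urlopen`.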

The following URL will display all matches for this pattern in our test corpus:

Querying Odinson

For most nontrivial applications, we'll want to use a grammar (i.e., a set of interacting rules that define an information need). Odinson grammars are written in YAML. In the following example, we'll take a very simple grammar defined in grammar.yml and apply it by sending a POST request from Python.
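
To give a rough idea of what such a POST request might look like, here is a sketch that uses only the Python standard library. The endpoint path `/api/execute/grammar`, the JSON field names (`grammar`, `pageSize`, `parentQuery`), and both helper names are assumptions for illustration, not taken from this README; the OpenAPI docs served by the REST API are the authoritative reference.

```python
import json
import urllib.request


def build_grammar_request(grammar_text, host="http://localhost:9001",
                          page_size=1, parent_query=None):
    """Build (but do not send) a POST request carrying an Odinson grammar.

    ASSUMPTIONS: the endpoint path and the JSON field names used here are
    illustrative and should be checked against the OpenAPI docs.
    """
    payload = {"grammar": grammar_text, "pageSize": page_size}
    if parent_query is not None:
        payload["parentQuery"] = parent_query
    return urllib.request.Request(
        url=f"{host}/api/execute/grammar",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute_grammar(request):
    """Send the request and decode the JSON response (needs a running service)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Reading grammar.yml into a string and passing it to `build_grammar_request`, then handing the result to `execute_grammar`, would send the grammar to the service started earlier. Keeping request construction separate from sending makes the payload easy to inspect before anything goes over the network.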

1. Build the Docker image

We'll be using Python to make our POST request. First, we need to build our Docker image:

docker build -f python/Dockerfile -t "parsertongue/odinson-example:python" python/

2. Querying the Odinson index from Python

Using parentQuery filters

Filter using a regex on the title:

docker run -it \
  --network="host" \
  -v "$PWD:/data" \
  "parsertongue/odinson-example:python" \
  --grammar /data/grammar.yml \
  --host http://0.0.0.0:9001 \
  --page-size 1 \
  --parent-query "title:Subcell.*"

Filter using an exact match on pubType:

docker run -it \
  --network="host" \
  -v "$PWD:/data" \
  "parsertongue/odinson-example:python" \
  --grammar /data/grammar.yml \
  --host http://0.0.0.0:9001 \
  --page-size 1 \
  --parent-query "pubType:\"epreprint\""
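
The two filters above differ only in the string passed to --parent-query: a bare field:pattern for the regex case and a double-quoted value for the exact match. As a small sketch, the hypothetical helpers below (not part of Odinson or of this repository) build both forms in Python, so the quoting is handled in one place rather than in the shell. Backslash-escaping of embedded quotes is an assumption about the query syntax.

```python
def regex_filter(field, pattern):
    """Return a parent query of the form field:pattern."""
    return f"{field}:{pattern}"


def exact_filter(field, value):
    """Return a parent query that wraps the value in double quotes.

    Embedded backslashes and double quotes are backslash-escaped
    (an assumption about the query syntax, labeled as such).
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


if __name__ == "__main__":
    print(regex_filter("title", "Subcell.*"))    # title:Subcell.*
    print(exact_filter("pubType", "epreprint"))  # pubType:"epreprint"
```

The returned strings correspond to the values given to --parent-query in the two commands above, before shell quoting is applied.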