This package started as a migration of a set of finite-state grammars for the morphological analysis of German words, delivered with SFST, a finite-state transducer (FST) toolkit by Helmut Schmid, to Pynini, another FST toolkit. The latter has the advantage of being implemented as a Python library, which allows for seamless interaction with many other useful Python packages. By now, a number of morphological operations have been added and some analysis strategies have been adjusted in comparison to the original rule set.
timur is implemented in Python 3. In the following, we assume a working Python 3 installation (tested with versions 3.5 and 3.6) as well as a working C++ compiler supporting C++11.
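The prerequisites above can be checked with a short script before building anything. This is only a minimal sketch: the minimum version is taken from the oldest tested version mentioned above, the list of compiler names is an assumption that may not match your platform, and the script does not verify that the compiler it finds actually supports C++11.

```python
import shutil
import sys

# Assumption: the oldest Python version reported as tested serves as the minimum.
MIN_PYTHON = (3, 5)
# Assumption: common C++ compiler executable names; adjust for your platform.
CXX_CANDIDATES = ("c++", "g++", "clang++")


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def find_cxx_compiler(candidates=CXX_CANDIDATES):
    """Return the path of the first candidate compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print("Python version sufficient:", python_ok())
    print("C++ compiler:", find_cxx_compiler() or "not found")
```

If no compiler is reported, install one through your system's package manager before continuing.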
The underlying FST toolkit Pynini is itself based on OpenFST, a C++ library for constructing, combining, optimizing, and searching weighted FSTs. Get the latest version of OpenFST that works with the current version of Pynini (finding a working combination can be a little tricky since Pynini usually lags a bit behind OpenFST; comparing the release dates helps), unpack the archive, then build and install via
$ ./configure --enable-grm
$ make
$ [sudo] make install && [sudo ldconfig]
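After the build, it can be useful to confirm that the OpenFST installation is visible to the system. The following is a rough sketch rather than an official check: it looks for the `fstinfo` command-line tool on `PATH` and asks `ctypes.util.find_library` for the `fst` shared library. The helper name `openfst_status` is made up for this example, and library lookup behaves differently across platforms.

```python
import ctypes.util
import shutil


def openfst_status():
    """Report where the OpenFST CLI tool and shared library are found.

    Each value is a string if the item was located, or None otherwise.
    """
    return {
        # fstinfo is one of the command-line tools installed with OpenFST.
        "fstinfo": shutil.which("fstinfo"),
        # Looks up libfst through the platform's usual library search.
        "libfst": ctypes.util.find_library("fst"),
    }


if __name__ == "__main__":
    for name, location in openfst_status().items():
        print(name, "->", location or "not found")
```

If the tool is found but the library is not, the linker cache may be stale; on Linux, running `ldconfig` as shown in the install step usually refreshes it.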
TODO
Using virtualenv is highly recommended, although not strictly necessary for installing timur. It may be installed via:
$ [sudo] pip install virtualenv
Create a virtual environment in a subdirectory of your choice (e.g. env) using
$ virtualenv -p python3 env
and activate it:
$ . env/bin/activate
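To confirm that the activation took effect, the interpreter itself can be asked whether it is running inside a virtual environment. This is a small sketch; the function name `in_virtual_environment` is chosen for illustration only.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.base_prefix point at the
    # original installation, so it differs from sys.prefix when active.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtual_environment())
```

Run it with the `python` found on your `PATH` after activation; if it prints `False`, the environment is probably not active in the current shell.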
timur uses various third-party Python packages (including Pynini), which are best installed using pip:
(env) $ pip install -r requirements.txt
Finally, timur itself can be installed via pip:
(env) $ pip install .
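Once both pip commands have finished, a quick import check shows whether the packages are visible to the active interpreter. The sketch below uses `importlib.util.find_spec`, which locates a module without importing it. It assumes that the importable module names are `pynini` and `timur`; the latter is an assumption based on the package name and may differ.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the top-level module names match the package names.
    for name in ("pynini", "timur"):
        print(name, "->", "found" if is_installed(name) else "missing")
```

A missing `pynini` usually points to a failed build against OpenFST, so the pip output from the requirements step is the first place to look.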