Autobuild Python/Pytorch module for derivative code #166

Open · wants to merge 94 commits into base: second_derivative
Conversation

@fymue (Collaborator) commented Jan 6, 2023

Differential Dynamic Programming

Motivation (some of this might not be 100% correct, ignore if that's the case)


Recently, certain DP algorithms such as Needleman-Wunsch (NW) have been proven to be differentiable. This discovery has opened new doors for the usage of such DP algorithms, which can now be employed in deep learning models like DeepBLAST, which aims to learn protein structural similarity from sequence alone. This particular model is written in Pytorch and uses Pytorch's autograd functionality (automatic differentiation) for the backward pass through the NW algorithm. As currently implemented, however, this is extremely slow, because it takes many kernel calls to complete the backward pass. It would be far more efficient if the entire backward pass could be calculated in one function/kernel call. This is where GAPC can help: GAPC can generate C++ code that calculates the score matrices for e.g. differentiable NW in one kernel call. If we were able to call this generated code directly from Python/Pytorch, the entire backward pass could be executed in a single kernel call, which has the potential to significantly speed up the training process of the model.

Implementation

In order to make our GAPC-generated derivative code Python-/Pytorch-compatible and usable in e.g. DeepBLAST, I implemented a method which uses the Pytorch C++ Extension API to automatically create a Python module from our generated code. The module provides a forward and backward function for the first and second derivative (if requested). After installation, it can be imported/used like any other Python module. Currently, these functions take two strings/sequences as arguments (just like the GAP-L NW/Gotoh implementations) and output a list of Pytorch tensors containing all alignment score matrices (1 for NW, 3 for Gotoh).

These functions are provided through an interface (pytorch_interface.cc), which contains generic definitions for the forward/backward functions as well as binding instructions for the module build step. (Side note: since our GAPC-generated code for NW/Gotoh uses double vectors to store all values, I had to use float64 tensors to store these values. This might cause problems if this is supposed to run on a GPU later, so we should maybe check whether our GAPC algorithm also works with 32-bit floats.) The module creation is handled via setuptools. An adjusted Makefile will generate a setup.py file (as well as the pytorch_interface.cc file) with all necessary instructions for the module creation.
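To illustrate the shape of such a binding, here is a minimal sketch using the standard Pytorch C++ extension macros; this is not the actual generated pytorch_interface.cc, which uses generic definitions and supports more features:

// Minimal sketch of a Pytorch C++ extension binding. The forward_D1
// body is a stub; the real function runs the generated DP algorithm.
#include <torch/extension.h>
#include <string>
#include <vector>

std::vector<torch::Tensor> forward_D1(torch::Tensor t,
                                      const std::string &s1,
                                      const std::string &s2) {
  std::vector<torch::Tensor> score_matrices;
  // ... run the GAPC-generated forward pass and collect its matrices ...
  return score_matrices;
}

// TORCH_EXTENSION_NAME is defined during the setup.py build step
PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("forward_D1", &forward_D1, "forward pass of the 1st derivative");
}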

The GAPC-generated code is only minimally modified. The out object(s) get the additional methods get_forward_score_matrices and get_backward_score_matrices, which convert the internal score matrices (std::vector) to torch::Tensor objects and return them. This means that as of right now, the core DP algorithm is unmodified and calculates everything regularly; I merely convert the score matrices to tensors after the algorithm has finished its calculations.
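For illustration, such a conversion can be done roughly like this (a sketch, not the exact generated code):

// Sketch: copy an n x m score matrix stored as a flat std::vector<double>
// into a float64 tensor. torch::from_blob wraps the existing buffer
// without copying, so we clone() to give the tensor its own storage.
#include <torch/extension.h>
#include <vector>

torch::Tensor matrix_to_tensor(std::vector<double> &matrix,
                               int64_t n, int64_t m) {
  return torch::from_blob(matrix.data(), {n, m}, torch::kFloat64).clone();
}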

Prerequisites, Usage, Example

In order to create a Python-/Pytorch module with GAPC instead of the regular binary, you can use the --as-torchmod MODULE_NAME option in combination with the --derivative N option (if the latter is not set, the module creation cannot work, so the user is instructed to set it), e.g.:

gapc -i bothD --derivative 2 --as-torchmod "nw_gapc" alignments.gap

With this option enabled, an adjusted Makefile will be produced, which contains rules that generate the setup.py file as well as the pytorch_interface.cc file, through which all functions of the module will be provided.

The command

make -f out.mf

will generate the setup.py and pytorch_interface.cc files, the command

make -f out.mf install

will generate the Python module and install it locally.

For the installation to work, the user must have the Python modules torch and pip installed (as well as Python itself, obviously). The Python.h header file is also required. This file does not usually come with the default Python installation, as it is part of the python-dev/python-devel package. It is available, though, if the user installs Python/Pytorch in a conda environment (which is recommended for Pytorch anyway). As an example, I will provide a short installation guide on how I set everything up on my system. Since I don't have a GPU on my machine, I installed the CPU-only version of torch; this should also work for CUDA versions of Pytorch, though.

conda create -n pytorch python=3.8  # Pytorch recommends python3.6, python3.7 or python3.8
conda activate pytorch
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu  # for cpu-only
# OR pip3 install torch torchvision torchaudio  # for cuda 11.7
gapc -i bothD --derivative 2 --as-torchmod "nw_gapc" alignments.gap
make -f out.mf
make -f out.mf install  # install "nw_gapc" module locally

After that, you can use the nw_gapc module in Python:

import torch  # need to import torch first so certain linker symbols become visible
import nw_gapc

s1 = "Freizeit"
s2 = "Zeitgeist"
tensor = torch.zeros(1)  # example torch tensor

# all functions currently take a tensor as the first argument
# as an illustration that we can easily pass inputs from Python to these functions
# and process the inputs in our C++ code

forward_D1 = nw_gapc.forward_D1(tensor, s1, s2)  # forward pass 1st derivative (equivalent to run method in out_main.cc)
backward_D1 = nw_gapc.backward_D1(tensor, s1, s2)  # backward pass 1st derivative (equivalent to report_derivative method in out_main.cc)

forward_D2 = nw_gapc.forward_D2(tensor, s1, s2)  # forward pass  2nd derivative
backward_D2 = nw_gapc.backward_D2(tensor, s1, s2)  # backward pass 2nd derivative

# forward_DX and backward_DX return Lists of Pytorch Tensors, which are the score matrices for the respective pass

fymue added 14 commits December 27, 2022 22:16
…ivative GAPC code (still need to add pytorch_interface.cc generation)
…core matrices to pytorch Tensors; add generic_pytorch_interface.cc; can now calculate score matrices in Python for NW
…low errors when adding C++ double vals to tensor (probably need to change our code to use float so tensor can be float32 for GPU compatibility); use make_blob function to make tensors (way faster)
… matrices were returned instead of forward matrices
@fymue requested a review from sjanssen2 · January 6, 2023 14:46
fymue and others added 15 commits January 7, 2023 14:18
…rything if forward gets re-called with different inputs; fix comment
…parisons; module compiles and can be built now
…omments; add some more constructors; refactor indexing
@fymue mentioned this pull request · Jun 13, 2023
@fymue (Collaborator Author) commented Jul 4, 2023

Update (04.07.2023):

I didn't post an update for a while, but here are all the changes since the last update:

  • The 2nd derivative can now be executed as well; the generic interface provides the forward_D2 and backward_D2 functions, which execute the 2nd forward and backward pass and return the respective DP matrices as tensors
  • The dimensions of the tensors that are created from the computed DP matrices are now set individually for every DP table; before, I simply assumed that all DP tables have the same size and dimensions (pretty lazy)
  • I also added a small mechanism to the generated code that checks if a DP table is quadratic (NxN), linear (N) or just a triangular matrix; since the exposure of DP matrices as tensors only works for quadratic or linear tables, the new function convert_triu_to_tensor will properly convert a triangular matrix to a tensor if a triangular DP matrix is detected (see the sketch after this list)
  • If users want to generate an algorithm that can process batched inputs, they will be warned if the sizes of the batch dimension of multiple input tensors are unequal
  • I adjusted the make_index / index_components mechanism to handle an arbitrary number of indices; to ensure that this stays fast and efficient, I had to implement a way to add template arguments to GAP-C-generated functions (not too much new code, most of it is in src/expr/template.hh). This allows us to add template arguments to GAP-C Fn_Def objects, so the objects inside those functions can be provided with template arguments
  • The max batch size can be adjusted by the user at compilation time: e.g. MAX_BATCH_SIZE=32 make -f out.mf would limit the max batch size to 32 instead of the default 128; this can save a huge amount of memory if the user knows that a bigger max batch size isn't required
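For illustration, converting a triangular DP matrix into a full (zero-padded) tensor could look roughly like the following sketch; the storage layout is an assumption here, and the actual convert_triu_to_tensor implementation may differ:

// Sketch: expand an upper-triangular n x n DP matrix, stored row-wise as
// a flat vector of its n*(n+1)/2 entries, into a full float64 tensor with
// zeros below the diagonal.
#include <torch/extension.h>
#include <cstddef>
#include <vector>

torch::Tensor convert_triu_to_tensor(const std::vector<double> &triu,
                                     int64_t n) {
  torch::Tensor t = torch::zeros({n, n}, torch::kFloat64);
  auto acc = t.accessor<double, 2>();
  std::size_t k = 0;
  for (int64_t i = 0; i < n; ++i) {
    for (int64_t j = i; j < n; ++j) {
      acc[i][j] = triu[k++];
    }
  }
  return t;
}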

@fymue (Collaborator Author) commented Jul 9, 2023

Update (09.07.2023):

I added some minor bugfixes and overall improvements:

  • GAP-C will throw an error if users used batch types (e.g. F32batch) in algebra functions but didn't specify batched input tensors; similarly, GAP-C will also report an error if batch-incompatible types were used in combination with batched input tensors
  • Fixed an edge case where the internal DP matrices were deallocated prematurely (the commit msg mistakenly says "premature allocation")
  • Cleaned up the GAP-C help msg for this feature a bit
  • Added the option to use regular dynamic memory allocation for Batch objects with malloc instead of using the memory pool; this option can be turned on at compile time by setting the MALLOC_BATCH environment variable (e.g. MALLOC_BATCH=1 make -f out.mf). This is useful when users aren't sure how big the batch dimension of the input tensors can become or if they simply don't want to set a compile-time limit for the maximum batch size (see the sketch below)
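Conceptually, the switch looks roughly like this (an illustrative sketch with hypothetical names, not the actual rtlib code; MALLOC_BATCH would be forwarded as a preprocessor define by the generated Makefile):

#include <cstdlib>

// assumed memory-pool allocator, provided elsewhere in the runtime library
double *pool_alloc(std::size_t n);

// Sketch of the compile-time switch between malloc and the memory pool
double *alloc_batch(std::size_t n) {
#ifdef MALLOC_BATCH
  return static_cast<double*>(std::malloc(n * sizeof(double)));
#else
  return pool_alloc(n);
#endif
}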

@sjanssen2 (Member) left a comment:

I am not yet through. Here are my first comments.

bool batched = input.find("batched") != input.npos;

// check the number of dims of the input Tensor (default: 2)
std::regex regex("[a-zA-Z0-9]*([0-9]{1,2})D[a-zA-Z0-9]*");
@sjanssen2 (Member): why limit to 0-99? Is this a limit for PyTorch?

@fymue (Collaborator Author): I don't think PyTorch tensors have a dimension limit. But how realistic is it that someone has a tensor with e.g. 90 dimensions? Since I wanted to support at least 10D tensors, this regex needs to parse a 2-digit number, so 99 is the natural hard limit. Would you prefer this to be unlimited instead?

src/tensor.hh Outdated

try {
  n_dims = std::stoi(match[1].str());
} catch (const std::exception &e) {
  std::cerr << "Couldn't convert " << match[1].str() << " to a number.\n";
@sjanssen2 (Member): could you please use the gapc log mechanism to report this error? It would also help if you point the user to the input declaration line (the Loc(ation) mechanism).

@fymue (Collaborator Author): Sure, I will adjust this.

break;
}
}

@sjanssen2 (Member): we should also warn/error the user if he/she provides a "tensor" that is not supported at all.

@fymue (Collaborator Author): If the user misspells "tensor" in the input declaration, this will already be caught in str_to_mode in input.cc. I added another warning for the specified tensor type if the parser can't determine it.

Makefile Outdated
Comment on lines 338 to 344

test-pytorch:
	cd testdata/regresstest &&\
	$(SHELL) test_pytorch_module.sh $(TRUTH_DIR)

test-pytorch-batched:
	cd testdata/regresstest &&\
	$(SHELL) test_pytorch_module.sh $(TRUTH_DIR) ""
@sjanssen2 (Member): could we merge both tests into just one target, to reduce the risk of missing execution of a test a little?

@fymue (Collaborator Author): Sure, I will merge them.


GAPC=../../../gapc
MAKE="make"
PYTHON="python3"
@sjanssen2 (Member): why is it necessary to specify 3 at the end? Shouldn't python be sufficient if we install the correct dependency?

@fymue (Collaborator Author): It should work with just python as well. I think the official interpreter binary name is still python3, so that should always work. Just python probably works in most shells as well, though.

equal: bool = True
for matrix, reference in zip(bw_matrices, reference_matrices):
    # make sure both matrices have the same dtype (float32)
    # and don't contain any inf vals (bug if so?)
@sjanssen2 (Member): you should check what happens for triangular matrices, i.e. single track, for cells i > j. These cells might be flagged as +/-inf.

@fymue (Collaborator Author): I will eventually, but as of right now I don't know of a test program that makes use of derivatives and uses just one track / has triangular matrices. But any inf values will be set to 0 in one of the next couple of lines in this loop, so they wouldn't cause any trouble here.

# tests for regular (single) input and batched inputs
# for 1st and 2nd derivative
# (Truth files are in gapc-test-suite/Truth/Regress)
TESTS = {
@sjanssen2 (Member): could we extend this great framework to variable source input? Sometime in the future, I'd like to also process RNA folding algorithms this way. How much effort would it be to also subject testdata/grammars_outside/nodangle.gap to your mechanism?

@fymue (Collaborator Author): I'm not sure what you mean. Are you asking if this feature can also convert programs like nodangle to Python packages? In general, this feature should be able to convert any GAP-L program that also works with the --derivative option into a Python package. If nodangle can be rewritten in a differentiable way that makes use of the --derivative option, then this feature should also be able to handle it.

  return x * exp(-2.0);
}
float Ers(<alphabet a, alphabet b>, <tensorslice locA, tensorslice locB>, float x) {
  if (equal(a, b)) {
@sjanssen2 (Member): is it possible to overload the == operator appropriately, as this would be more natural for programmers?

@fymue (Collaborator Author): Done.
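For illustration, such an overload could look roughly like this (a sketch that assumes tensorchar wraps a single character; the actual rtlib types differ):

// Sketch: an operator== overload so algebra functions can write `a == b`
// instead of `equal(a, b)`. The member layout of the real tensorchar
// type is an assumption here.
struct tensorchar {
  char value;  // assumed underlying alphabet character
};

inline bool operator==(const tensorchar &lhs, const tensorchar &rhs) {
  return lhs.value == rhs.value;
}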

algebra alg_count auto count;

algebra alg_hessian implements sig_alignments(alphabet=tensorchar, answer=F64batch) {
  F64batch Ins(<alphabet a, void>, <tensorslice locA, tensorslice locB>, F64batch x) {
@sjanssen2 (Member): note to myself (@sjanssen2): shouldn't it be possible to type answer and alphabet here in the source code and later replace all answer occurrences with what is provided in the first line of the algebra definition in the AST, here F64batch?!

rtlib/traces.hh Outdated
#include <utility>
#include <cassert>

#define MAX_INDEX_COMPONENTS 3 // max indices for an index_components object
@sjanssen2 (Member): In general, we need to consider multi track programs for which some DP matrices can be quadratic, linear or even constant. I feel that restricting the number of index components to 3 might be too restrictive.

@fymue (Collaborator Author): This is already fixed in the latest traces.hh adjustment.
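Conceptually, the fix replaces the hard-coded limit with a per-table number of index components; a minimal sketch of the idea (not the actual traces.hh code) could look like this:

#include <array>
#include <cstddef>

// Sketch: an index tuple whose number of components is a template
// parameter instead of a hard-coded MAX_INDEX_COMPONENTS, so quadratic
// (2), linear (1) or constant (0) tables each get exactly the number
// of indices they need. Illustrative only.
template <typename T, std::size_t N>
struct index_components {
  std::array<T, N> idx;
};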

fymue added 6 commits July 10, 2023 21:22
…e PyTorch/Tensor-related GAP-L test programs to testdata/grammar_pytorch; bundle batched/non-batched test into 1; improve tensor input declaration error messages
…tes with differently-sized index components; substitute some copies for moves in the trace mechanism