A multi-backend graph learning library.

YijianLiu/GammaGL


Gamma Graph Library(GammaGL)

Documentation | OpenI community

GammaGL is a multi-backend graph learning library based on TensorLayerX. It supports TensorFlow, PyTorch, PaddlePaddle, and MindSpore as backends.

Version 0.1.0 was released on 20 June.

A development tutorial in Chinese is available on the wiki.

Highlighted Features

Multi-backend

GammaGL supports multiple deep learning backends, such as TensorFlow, PyTorch, Paddle and MindSpore. Unlike DGL, GammaGL implements its examples with the same code on every backend, so users can run the same code on different hardware such as Nvidia GPUs and Huawei Ascend. Users can also call a particular framework's own API when they prefer it.

PyG-Like

Following PyTorch Geometric (PyG), GammaGL uses a tensor-centric API. If you are familiar with PyG, GammaGL will feel familiar, and it can serve as a TensorFlow Geometric, Paddle Geometric, or MindSpore Geometric.
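
As a rough illustration of what a tensor-centric graph representation looks like, the sketch below stores a tiny graph as a node feature matrix plus a `[2, num_edges]` connectivity list. It uses plain Python lists instead of backend tensors, and it assumes the PyG convention that row 0 holds source nodes and row 1 holds target nodes. The `neighbors` helper is hypothetical and not part of GammaGL.

```python
# Illustrative only: plain Python lists stand in for backend tensors.
# A small undirected path graph with 3 nodes: 0 - 1 - 2.
x = [[1.0, 0.0],   # features of node 0
     [0.0, 1.0],   # features of node 1
     [1.0, 1.0]]   # features of node 2

# Each undirected edge is stored as two directed edges.
# Assumed convention (as in PyG): row 0 = source nodes, row 1 = target nodes.
edge_index = [[0, 1, 1, 2],
              [1, 0, 2, 1]]

num_nodes = len(x)
num_edges = len(edge_index[0])


def neighbors(node, edge_index):
    """Return the source nodes of all edges that point at `node`."""
    src, dst = edge_index
    return [s for s, d in zip(src, dst) if d == node]


print(num_nodes, num_edges)
print(neighbors(1, edge_index))
```

Because the graph is just a pair of arrays, the same data can be handed to any backend that can hold a float matrix and an integer matrix.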

Quick Tour for New Users

In this quick tour, we highlight the ease of creating and training a GNN model with only a few lines of code.

Train your own GNN model

As a first look at GammaGL, we train a GNN that classifies papers in a citation graph. For this, we load the Cora dataset and create a simple 2-layer GCN model using the pre-defined GCNConv:

import tensorlayerx as tlx
from gammagl.layers.conv import GCNConv
from gammagl.datasets import Planetoid

dataset = Planetoid(root='.', name='Cora')

class GCN(tlx.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, out_channels)
        self.relu = tlx.ReLU()

    def forward(self, x, edge_index):
        # x: Node feature matrix of shape [num_nodes, in_channels]
        # edge_index: Graph connectivity matrix of shape [2, num_edges]
        x = self.conv1(x, edge_index)
        x = self.relu(x)
        x = self.conv2(x, edge_index)
        return x

model = GCN(dataset.num_features, 16, dataset.num_classes)
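
For readers new to GCNs, the following is a minimal pure-Python sketch of the propagation rule from Kipf and Welling's GCN paper, with the learnable weight matrix left out. It is not GammaGL's GCNConv implementation, whose internals are not shown here; `gcn_propagate` is a hypothetical helper and the graph is a toy example.

```python
import math


def gcn_propagate(x, edge_index):
    """Symmetric-normalized aggregation: D^-1/2 (A + I) D^-1/2 X.

    The learnable weight matrix of a real GCN layer is omitted.
    """
    num_nodes = len(x)
    src, dst = edge_index
    # Add one self-loop per node so each node keeps its own features.
    src = list(src) + list(range(num_nodes))
    dst = list(dst) + list(range(num_nodes))

    # Degree counts incoming edges, self-loop included.
    deg = [0] * num_nodes
    for d in dst:
        deg[d] += 1

    out = [[0.0] * len(x[0]) for _ in range(num_nodes)]
    for s, d in zip(src, dst):
        norm = 1.0 / math.sqrt(deg[s] * deg[d])
        for k, value in enumerate(x[s]):
            out[d][k] += norm * value
    return out


# Toy path graph 0 - 1 - 2 with one feature per node.
x = [[1.0], [2.0], [3.0]]
edge_index = [[0, 1, 1, 2],
              [1, 0, 2, 1]]
h = gcn_propagate(x, edge_index)
print(h)
```

A two-layer model such as the one above applies this kind of aggregation twice, with a learned linear map and a ReLU in between.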
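
We can now optimize the model in a training loop, similar to the standard TensorLayerX training procedure.

import tensorlayerx as tlx

data = dataset[0]
loss_fn = tlx.losses.softmax_cross_entropy_with_logits
optimizer = tlx.optimizers.Adam(learning_rate=1e-3)
train_weights = model.trainable_weights

# GraphLoss is an illustrative name. The stock WithLoss is assumed to pass a
# single input to the model, so x and edge_index are bundled into one dict.
class GraphLoss(tlx.model.WithLoss):
    def forward(self, inputs, label):
        logits = self.backbone_network(inputs['x'], inputs['edge_index'])
        return self._loss_fn(logits, label)

net_with_loss = GraphLoss(model, loss_fn)
train_one_step = tlx.model.TrainOneStep(net_with_loss, optimizer, train_weights)

for epoch in range(200):
    loss = train_one_step({'x': data.x, 'edge_index': data.edge_index}, data.y)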

We can now optimize the model in a training loop, similar to the standard PyTorch training procedure.

import torch
import torch.nn.functional as F

data = dataset[0]
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(200):
    pred = model(data.x, data.edge_index)
    # Compute the loss on the training nodes only
    loss = F.cross_entropy(pred[data.train_mask], data.y[data.train_mask])

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

We can now optimize the model in a training loop, similar to the standard TensorFlow training procedure.

import tensorflow as tf

# `model`, `images` and `labels` are placeholders for a Keras model
# and one batch of inputs and integer targets.
optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
for epoch in range(200):
    # Record the forward pass so gradients can be computed
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

We can now optimize the model in a training loop, similar to the standard PaddlePaddle training procedure.

import paddle

data = dataset[0]
optim = paddle.optimizer.Adam(parameters=model.parameters())
loss_fn = paddle.nn.CrossEntropyLoss()

model.train()
for epoch in range(200):
    predicts = model(data.x, data.edge_index)
    loss = loss_fn(predicts, data.y)

    # Backpropagation
    loss.backward()
    optim.step()
    optim.clear_grad()

We can now optimize the model in a training loop, similar to the standard MindSpore training procedure.

from mindspore import nn

# `create_dataset` and `LinearNet` are placeholders for a user-defined
# dataset builder and network.
epochs = 200

# 1. Generate the training dataset
train_dataset = create_dataset(num_data=160, batch_size=16)
steps = train_dataset.get_dataset_size()

# 2. Build a model and define the loss function
net = LinearNet()
loss = nn.MSELoss()

# 3. Connect the network with the loss function, and define the optimizer
net_with_loss = nn.WithLossCell(net, loss)
opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)

# 4. Define the training network
train_net = nn.TrainOneStepCell(net_with_loss, opt)

# 5. Set the model to training mode
train_net.set_train()

# 6. Training procedure
for epoch in range(epochs):
    step = 0
    for d in train_dataset.create_dict_iterator():
        result = train_net(d['data'], d['label'])
        print(f"Epoch: [{epoch} / {epochs}], "
              f"step: [{step} / {steps}], "
              f"loss: {result}")
        step = step + 1

More information about evaluating final model performance can be found in the corresponding example.
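
In semi-supervised node classification, the loss and the accuracy are usually computed only over the nodes selected by a mask, as the PyTorch-style loop does with `train_mask`. The sketch below shows that idea in plain Python. The helper names and the toy numbers are hypothetical and are not taken from the GammaGL examples.

```python
import math


def masked_cross_entropy(logits, labels, mask):
    """Mean softmax cross-entropy over the nodes where mask is True."""
    total, count = 0.0, 0
    for row, label, keep in zip(logits, labels, mask):
        if not keep:
            continue
        # Subtract the row maximum for numerical stability.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[label]
        count += 1
    if count == 0:
        raise ValueError("mask selects no nodes")
    return total / count


def masked_accuracy(logits, labels, mask):
    """Fraction of masked nodes whose highest logit matches the label."""
    hits = [max(range(len(row)), key=row.__getitem__) == label
            for row, label, keep in zip(logits, labels, mask) if keep]
    return sum(hits) / len(hits)


# Three nodes, two classes. Node 2 is held out for testing.
logits = [[2.0, 0.0], [0.0, 0.0], [0.0, 5.0]]
labels = [0, 1, 0]
train_mask = [True, True, False]
test_mask = [False, False, True]

loss = masked_cross_entropy(logits, labels, train_mask)
test_acc = masked_accuracy(logits, labels, test_mask)
print(loss, test_acc)
```

The held-out node does not influence the training loss even though the model misclassifies it, which is the point of masking.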

Create your own GNN layer

In addition to the easy application of existing GNNs, GammaGL makes it simple to implement custom Graph Neural Networks (see here for the accompanying tutorial). For example, this is all it takes to implement the edge convolutional layer from Wang et al.:

$$x_i^{\prime} ~ = ~ \max_{j \in \mathcal{N}(i)} ~ \textrm{MLP}_{\theta} \left( [ ~ x_i, ~ x_j - x_i ~ ] \right)$$

import tensorlayerx as tlx
from tensorlayerx.nn import Sequential as Seq, Linear, ReLU
from gammagl.layers import MessagePassing

class EdgeConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.mlp = Seq(Linear(2 * in_channels, out_channels),
                       ReLU(),
                       Linear(out_channels, out_channels))

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        return self.propagate(x=x, edge_index=edge_index, aggr_type='max')

    def message(self, x_i, x_j):
        # x_i has shape [E, in_channels]
        # x_j has shape [E, in_channels]

        tmp = tlx.concat([x_i, x_j - x_i], axis=1)  # tmp has shape [E, 2 * in_channels]
        return self.mlp(tmp)
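
To make the formula concrete, here is a framework-free sketch of the same computation on a toy graph. It assumes edges are stored as (source j, target i) pairs and uses an identity function as a hypothetical stand-in for the learned MLP. It illustrates the math only and does not reproduce how `MessagePassing` works internally.

```python
def edge_conv(x, edge_index, mlp):
    """x_i' = element-wise max over neighbors j of mlp([x_i, x_j - x_i])."""
    src, dst = edge_index
    out = [None] * len(x)
    for j, i in zip(src, dst):  # edge j -> i carries a message to node i
        x_i, x_j = x[i], x[j]
        # Concatenate x_i with the difference x_j - x_i, then apply the MLP.
        message = mlp(x_i + [b - a for a, b in zip(x_i, x_j)])
        if out[i] is None:
            out[i] = list(message)
        else:
            out[i] = [max(a, b) for a, b in zip(out[i], message)]
    # Nodes without incoming edges receive zeros in this sketch.
    width = next((len(o) for o in out if o is not None), 0)
    return [o if o is not None else [0.0] * width for o in out]


def identity(features):
    """Hypothetical stand-in for the learned MLP."""
    return features


# Toy path graph 0 - 1 - 2 with one feature per node.
x = [[0.0], [1.0], [3.0]]
edge_index = [[0, 1, 1, 2],
              [1, 0, 2, 1]]
out = edge_conv(x, edge_index, identity)
print(out)
```

Node 1 has two neighbors, so its output is the element-wise maximum of two messages, while nodes 0 and 2 each receive a single message.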

Get Started

  1. Python environment (optional): we recommend using the Conda package manager.

    # python=3.7.5 or 3.9.0 is suitable for mindspore.
    conda create -n ggl python=3.7.5
    conda activate ggl
  2. Install Backend

    # For tensorflow
    pip install tensorflow-gpu # GPU version
    pip install tensorflow # CPU version
    
    # For torch, version 1.10
    # https://pytorch.org/get-started/locally/
    pip3 install torch==1.10.2
    
    # For paddle, any latest stable version
    # https://www.paddlepaddle.org.cn/
    python -m pip install paddlepaddle-gpu
    
    # For mindspore, GammaGL only supports version 1.6.1, GPU-CUDA 11.1 and python 3.7.5
    # https://www.mindspore.cn/install
    pip install https://ms-release.obs.cn-north-4.myhuaweicloud.com/1.6.1/MindSpore/gpu/x86_64/cuda-11.1/mindspore_gpu-1.6.1-cp37-cp37m-linux_x86_64.whl --trusted-host ms-release.obs.cn-north-4.myhuaweicloud.com -i https://pypi.tuna.tsinghua.edu.cn/simple

    For other backends or specific versions, please check whether TensorLayerX (TLX) supports them.

    Install TensorLayerX

    pip install git+https://github.com/tensorlayer/tensorlayerx.git

    Users in mainland China who run into network problems are advised to install from the OpenI community instead:

    pip install git+https://git.openi.org.cn/OpenI/TensorLayerX.git

  3. Download GammaGL

    git clone https://github.com/BUPT-GAMMA/GammaGL.git
    cd GammaGL
    python setup.py install

    Users in mainland China who run into network problems are advised to clone from the OpenI community instead:

    git clone https://git.openi.org.cn/GAMMALab/GammaGL.git
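
After installation, it can help to confirm which backend packages Python can actually find. The sketch below uses only the standard library. `installed_backends` is a hypothetical helper, and the import names listed are the usual ones for each framework (for example, PaddlePaddle is imported as `paddle`).

```python
import importlib.util

# Backend name -> Python import name.
BACKEND_MODULES = {
    "tensorflow": "tensorflow",
    "torch": "torch",
    "paddle": "paddle",
    "mindspore": "mindspore",
}


def installed_backends():
    """Map each backend name to True if its package can be located."""
    return {name: importlib.util.find_spec(module) is not None
            for name, module in BACKEND_MODULES.items()}


for name, ok in installed_backends().items():
    print(f"{name:<12}{'installed' if ok else 'missing'}")
```

This check only locates the packages. It does not verify versions or GPU support.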

How to Run

Take GCN as an example:

# cd ./examples/gcn
# set parameters if necessary
python gcn_trainer.py --dataset cora --lr 0.01

To use a specific backend or GPU, set the corresponding environment variables, for example:

CUDA_VISIBLE_DEVICES="1" TL_BACKEND="paddle" python gcn_trainer.py

Note

The default backend is tensorflow and the default GPU is 0. By default, the TensorFlow backend takes up all remaining GPU memory.

The candidate backends are tensorflow, paddle, torch and mindspore.

Set CUDA_VISIBLE_DEVICES=" " if you want to run on the CPU.
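
The same settings can be applied from inside a Python script. The sketch below assumes that TensorLayerX reads `TL_BACKEND` when it is imported, so the variables are set before the import. `configure_backend` is a hypothetical helper, not part of GammaGL or TensorLayerX.

```python
import os

CANDIDATE_BACKENDS = ("tensorflow", "paddle", "torch", "mindspore")


def configure_backend(backend="tensorflow", gpu="0"):
    """Set the environment variables described in the note above.

    Pass gpu=" " to hide all GPUs and run on the CPU.
    """
    if backend not in CANDIDATE_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    os.environ["TL_BACKEND"] = backend
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu


configure_backend("paddle", gpu="1")
# Assumption: the backend is chosen at import time, so import afterwards.
# import tensorlayerx as tlx
```

Validating the name up front turns a typo such as "pytorch" into an immediate error instead of a failure later during import.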

Supported Models

TensorFlow PyTorch Paddle MindSpore
GCN [ICLR 2017] ✔️ ✔️ ✔️
GAT [ICLR 2018] ✔️ ✔️ ✔️
GraphSAGE [NeurIPS 2017] ✔️ ✔️ ✔️
ChebNet [NeurIPS 2016] ✔️ ✔️
GCNII [ICML 2020] ✔️ ✔️ ✔️
JKNet [ICML 2018] ✔️ ✔️ ✔️
DiffPool [NeurIPS 2018]
SGC [ICML 2019] ✔️ ✔️ ✔️
GIN [ICLR 2019]
APPNP [ICLR 2019] ✔️ ✔️ ✔️
AGNN [arxiv] ✔️ ✔️ ✔️
SIGN [ICML 2020 Workshop] ✔️ ✔️ ✔️
DropEdge [ICLR 2020] ✔️ ✔️ ✔️
GATv2 [ICLR 2022] ✔️ ✔️ ✔️
GPRGNN [ICLR 2021] ✔️
FAGCN [AAAI 2021] ✔️ ✔️
Contrastive Learning TensorFlow PyTorch Paddle MindSpore
DGI [ICLR 2019] ✔️ ✔️ ✔️
GRACE [ICML 2020 Workshop] ✔️ ✔️ ✔️
MVGRL [ICML 2020] ✔️ ✔️ ✔️
InfoGraph [ICLR 2020] ✔️ ✔️ ✔️
MERIT [IJCAI 2021] ✔️ ✔️
Heterogeneous Graph Learning TensorFlow PyTorch Paddle MindSpore
RGCN [ESWC 2018] ✔️ ✔️ ✔️
HAN [WWW 2019] ✔️ ✔️ ✔️
HGT [WWW 2020]
SimpleHGN [KDD 2021] ✔️

Note

The models can be run with the mindspore backend. However, the experimental results are not yet satisfactory because of an issue in the training component, which will be fixed in the future.

Contributors

GammaGL Team [GAMMA LAB] and Peng Cheng Laboratory.

See more in CONTRIBUTING.

Contributions are always welcome. Please feel free to open an issue or send an email to [email protected].
