diff --git a/README.md b/README.md index d4f2996b..d9ea353a 100644 --- a/README.md +++ b/README.md @@ -62,7 +62,7 @@ poetry install -E arrow # Support for the CSV, Parquet, ORC and IPC/Feather/Arro poetry install -E dgl # DGL support (also includes torch) ``` -To run the tests, make sure you have an [active Memgraph instance](/memgraph), and execute one of the following commands: +To run the tests, make sure you have an [active Memgraph instance](https://memgraph.com/docs/getting-started), and execute one of the following commands: ```bash poetry run pytest . -k "not slow" # If all extras installed @@ -76,199 +76,6 @@ If you’ve installed only certain extras, it’s also possible to run their ass poetry run pytest . -k "arrow" poetry run pytest . -k "dgl" ``` - -## GQLAlchemy capabilities - -
-🗺️ Object graph mapper -
- -Below is an example of how to create `User` and `Language` node classes and a relationship class of type `SPEAKS`. It also shows how to create a new node and relationship, save them to the database, and then load those nodes and relationships back from the database.
-
-

```python
from gqlalchemy import Memgraph, Node, Relationship, Field

db = Memgraph()

class User(Node, index=True, db=db):
    id: str = Field(index=True, exists=True, unique=True, db=db)

class Language(Node):
    name: str = Field(unique=True, db=db)

class Speaks(Relationship, type="SPEAKS"):
    pass

user = User(id="3", username="John").save(db)
language = Language(name="en").save(db)
speaks_rel = Speaks(
    _start_node_id=user._id,
    _end_node_id=language._id
).save(db)

loaded_user = User(id="3").load(db=db)
print(loaded_user)
loaded_speaks = Speaks(
    _start_node_id=user._id,
    _end_node_id=language._id
).load(db)
print(loaded_speaks)
```
- -
-🔨 Query builder -
-When building a Cypher query, you can use a set of methods that are wrappers around Cypher clauses. -
-
-

```python
from gqlalchemy import create, match
from gqlalchemy.query_builder import Operator

query_create = (
    create()
    .node(labels="Person", name="Leslie")
    .to(relationship_type="FRIENDS_WITH")
    .node(labels="Person", name="Ron")
    .execute()
)

query_match = (
    match()
    .node(labels="Person", variable="p1")
    .to()
    .node(labels="Person", variable="p2")
    .where(item="p1.name", operator=Operator.EQUAL, literal="Leslie")
    .return_(results=["p1", ("p2", "second")])
    .execute()
)
```
- -
-🚰 Manage streams -
-

You can create and start a Kafka or Pulsar stream using GQLAlchemy.
- -**Kafka stream** -```python -from gqlalchemy import MemgraphKafkaStream - -stream = MemgraphKafkaStream(name="ratings_stream", topics=["ratings"], transform="movielens.rating", bootstrap_servers="localhost:9093") -db.create_stream(stream) -db.start_stream(stream) -``` - - -**Pulsar stream** -```python -from gqlalchemy import MemgraphPulsarStream - -stream = MemgraphPulsarStream(name="ratings_stream", topics=["ratings"], transform="movielens.rating", service_url="localhost:6650") -db.create_stream(stream) -db.start_stream(stream) -``` - -
- -
-🗄️ Import table data from different sources -
- -**Import table data to a graph database** - -You can translate table data from a file to graph data and import it to Memgraph. Currently, we support reading of CSV, Parquet, ORC and IPC/Feather/Arrow file formats via the PyArrow package. - -Read all about it in [table to graph importer how-to guide](https://memgraph.com/docs/gqlalchemy/how-to-guides/table-to-graph-importer). - -**Make a custom file system importer** - -If you want to read from a file system not currently supported by GQLAlchemy, or use a file type currently not readable, you can implement your own by extending abstract classes `FileSystemHandler` and `DataLoader`, respectively. - -Read all about it in [custom file system importer how-to guide](https://memgraph.com/docs/gqlalchemy/how-to-guides/custom-file-system-importer). - -
- -
-⚙️ Manage Memgraph instances -
-

You can start, stop, connect to and monitor Memgraph instances with GQLAlchemy.

**Manage Memgraph Docker instance**

```python
from gqlalchemy.instance_runner import (
    DockerImage,
    MemgraphInstanceDocker
)

memgraph_instance = MemgraphInstanceDocker(
    docker_image=DockerImage.MEMGRAPH, docker_image_tag="latest", host="0.0.0.0", port=7687
)
memgraph = memgraph_instance.start_and_connect(restart=False)

list(memgraph.execute_and_fetch("RETURN 'Memgraph is running' AS result"))[0]["result"]
```

**Manage Memgraph binary instance**

```python
from gqlalchemy.instance_runner import MemgraphInstanceBinary

memgraph_instance = MemgraphInstanceBinary(
    host="0.0.0.0", port=7698, binary_path="/usr/lib/memgraph/memgraph", user="memgraph"
)
memgraph = memgraph_instance.start_and_connect(restart=False)

list(memgraph.execute_and_fetch("RETURN 'Memgraph is running' AS result"))[0]["result"]
```
- -
-🔫 Manage database triggers -
- -Because Memgraph supports database triggers on `CREATE`, `UPDATE` and `DELETE` operations, GQLAlchemy also implements a simple interface for maintaining these triggers. - -```python -from gqlalchemy import Memgraph, MemgraphTrigger -from gqlalchemy.models import ( - TriggerEventType, - TriggerEventObject, - TriggerExecutionPhase, -) - -db = Memgraph() - -trigger = MemgraphTrigger( - name="ratings_trigger", - event_type=TriggerEventType.CREATE, - event_object=TriggerEventObject.NODE, - execution_phase=TriggerExecutionPhase.AFTER, - statement="UNWIND createdVertices AS node SET node.created_at = LocalDateTime()", -) - -db.create_trigger(trigger) -triggers = db.get_triggers() -print(triggers) -``` -
- -
-💽 On-disk storage -
-

Since Memgraph is an in-memory graph database, the GQLAlchemy library provides an on-disk storage solution for large properties not used in graph algorithms. This is useful when nodes or relationships carry metadata that doesn't have to take part in any of the graph algorithms run in Memgraph, but should still be fetchable afterwards. Learn all about it in the [on-disk storage how-to guide](https://memgraph.com/docs/gqlalchemy/how-to-guides/on-disk-storage).
- -
- -If you want to learn more about OGM, query builder, managing streams, importing data from different source, managing Memgraph instances, managing database triggers and using on-disk storage, check out the GQLAlchemy [how-to guides](https://memgraph.com/docs/gqlalchemy/how-to-guides). - ## Development (how to build) ```bash @@ -279,14 +86,22 @@ poetry run pytest . -k "not slow and not extras" ## Documentation -The GQLAlchemy documentation is available on [memgraph.com/docs/gqlalchemy](https://memgraph.com/docs/gqlalchemy/). +The GQLAlchemy documentation is available on [GitHub](https://github.com/memgraph/gqlalchemy). -The documentation can be generated by executing: +The reference guide can be generated from the code by executing: ``` pip3 install pydoc-markdown pydoc-markdown ``` +Other parts of the documentation are written and located at docs directory. To test the documentation locally execute: +``` +pip3 install mkdocs +pip3 install mkdocs-material +pip3 install pymdown-extensions +mkdocs serve +``` + ## License Copyright (c) 2016-2022 [Memgraph Ltd.](https://memgraph.com) diff --git a/docs/assets/favicon.png b/docs/assets/favicon.png new file mode 100644 index 00000000..f87d136e Binary files /dev/null and b/docs/assets/favicon.png differ diff --git a/docs/assets/memgraph-logo.svg b/docs/assets/memgraph-logo.svg new file mode 100644 index 00000000..8101972f --- /dev/null +++ b/docs/assets/memgraph-logo.svg @@ -0,0 +1,468 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/docs/changelog.md b/docs/changelog.md new file mode 100644 index 00000000..9a81c6da --- /dev/null +++ b/docs/changelog.md @@ -0,0 +1,126 @@ +# Changelog + +## v1.4.1 - April 19, 2023 + +### Features and improvements + +- Installing and testing GQLAlchemy is now easier because Apache Arrow, PyTorch Geometric and DGL dependencies have been made optional. [#235](https://github.com/memgraph/gqlalchemy/pull/235) + +### Bug fixes + +- Removed unnecessary extra argument in the call of the `escape_value` method and fixed a bug in query creation for the `Map` property type. [#198](https://github.com/memgraph/gqlalchemy/pull/198/files) + +## v1.4 - March 10, 2023 + +### Features and improvements + +- Data from Memgraph can now be [imported from](reference/gqlalchemy/transformations/importing/graph_importer.md) and [exported to](reference/gqlalchemy/transformations/export/graph_transporter.md) `NetworkX`, `DGL` and `PyG` graph formats. [#215](https://github.com/memgraph/gqlalchemy/pull/215) +- Now you can execute procedures from query modules on a subgraph [using the project feature](how-to-guides/query-builder/graph-projection.md). [#210](https://github.com/memgraph/gqlalchemy/pull/210) +- Now you can pass values from Python variables as parameters in Cypher queries. [#217](https://github.com/memgraph/gqlalchemy/pull/217) +- Besides BSF, DSF and WSHORTEST, now you can also run the All shortest paths algorithm with GQLAlchemy. 
[#200](https://github.com/memgraph/gqlalchemy/pull/200) + +## v1.3.3 - Dec 15, 2022 + +### Bug fixes + +- Added initial support for NumPy arrays (`ndarray`) and scalars (`generic`) [#208](https://github.com/memgraph/gqlalchemy/pull/208) + +## v1.3.2 - Sep 15, 2022 + +### Bug fixes + +- Fixed Unicode serialisation [#189](https://github.com/memgraph/gqlalchemy/pull/189) +- Fixed `GQLAlchemyWaitForConnectionError` and `GQLAlchemyDatabaseError` [#188](https://github.com/memgraph/gqlalchemy/pull/188) +- Fixed `Datetime` serialisation [#185](https://github.com/memgraph/gqlalchemy/pull/185) + +### Updates + +- Bumped `pyarrow` [#193](https://github.com/memgraph/gqlalchemy/pull/193) +- Updated `poetry` to 1.2.0 and `pymgclient` to 1.3.1 [#191](https://github.com/memgraph/gqlalchemy/pull/191) +- Updated all dependencies [#194](https://github.com/memgraph/gqlalchemy/pull/194) + +## v1.3 - Jun 14, 2022 +!!! warning + ### Breaking Changes + + - Renamed keyword argument `edge_label` to `relationship_type` in `to()` and `from()` methods in the query builder. [#145](https://github.com/memgraph/gqlalchemy/pull/145) + +### Major Features and Improvements + +- Added option to suppress warning `GQLAlchemySubclassNotFoundWarning`. [#121](https://github.com/memgraph/gqlalchemy/pull/121) +- Added the possibility to import `Field` from `gqlalchemy.models`. [#122](https://github.com/memgraph/gqlalchemy/pull/122) +- Added `set_()` method to the query builder. [#128](https://github.com/memgraph/gqlalchemy/pull/128) +- Added wrapper class for query modules. [#130](https://github.com/memgraph/gqlalchemy/pull/130) +- Added `foreach()` method to the query builder. [#135](https://github.com/memgraph/gqlalchemy/pull/135) +- Added `load_csv()` and `return()` methods from the query builder to base classes list. [#139](https://github.com/memgraph/gqlalchemy/pull/139) +- Added new argument types in `return_()`, `yield_()` and `with_()` methods in the query builder. [#146](https://github.com/memgraph/gqlalchemy/pull/146) +- Added `IntegratedAlgorithm` class instance as argument in `to()` and `from()` methods in the query builder. [#141](https://github.com/memgraph/gqlalchemy/pull/141) +- Extended `IntegratedAlgorithm` class with the Breadth-first search algorithm. [#142](https://github.com/memgraph/gqlalchemy/pull/142) +- Extended `IntegratedAlgorithm` class with the Weighted shortest path algorithm. [#143](https://github.com/memgraph/gqlalchemy/pull/143) +- Extended `IntegratedAlgorithm` class with the Depth-first search algorithm. [#144](https://github.com/memgraph/gqlalchemy/pull/144) +- Removed the usage of `sudo` from the `instance_runner` module. [#148](https://github.com/memgraph/gqlalchemy/pull/148) +- Added support for Neo4j in the Object-Graph Mapper and the query builder. [#149](https://github.com/memgraph/gqlalchemy/pull/149) +- Changed string variables for Blob and S3 keyword arguments. [#151](https://github.com/memgraph/gqlalchemy/pull/151) +- Added variable support for node and relationship properties. [#154](https://github.com/memgraph/gqlalchemy/pull/154) +- Added `Tuple` as new argument type in query modules. [#155](https://github.com/memgraph/gqlalchemy/pull/155/) +- Changed `host` and `port` `Memgraph` properties to readonly. [#156](https://github.com/memgraph/gqlalchemy/pull/156) +- Changed `Memgraph.new_connection()` to be a private method. [#157](https://github.com/memgraph/gqlalchemy/pull/157) +- Added `push()` query modules for Kafka streams and Power BI. 
[#158](https://github.com/memgraph/gqlalchemy/pull/158) +- Added argument `lazy` for configuring lazy loading in the `Memgraph` class. [#159](https://github.com/memgraph/gqlalchemy/pull/159) +- Added `datetime` support for property types. [#161](https://github.com/memgraph/gqlalchemy/pull/161) +- Added `Operator` enum which can be used as `operator` value in `set_()` and `where()` methods in the query builder. [#165](https://github.com/memgraph/gqlalchemy/pull/165) +- Added an extension to the `QueryBuilder` class to support and autocomplete integrated and MAGE query modules. [#168](https://github.com/memgraph/gqlalchemy/pull/168) + + +### Bug fixes + +- Fixed the unbound variable error in the return statement of the Cypher query in `memgraph.save_relationship_with_id()`. [#166](https://github.com/memgraph/gqlalchemy/pull/166) +- Fixed checking if `None` for `Optional` properties. [#167](https://github.com/memgraph/gqlalchemy/pull/167) + + +## v1.2 - Apr 12, 2022 + +!!! warning + ### Breaking Changes + + - Ordering query results as in GQLAlchemy older than 1.2 will not be possible. + - `where()`, `and_where()` and `or_where()` methods can't be used as in + GQLAlchemy older than 1.2. + - Setting up the `bootstrap_servers` argument when creating a stream as in + GQLAlchemy older than 1.2 will not be possible. + +### Major Features and Improvements + +- Improved `where()`, `and_where()`, `or_where()` and `xor_where()` methods. [#114](https://github.com/memgraph/gqlalchemy/pull/114) +- Added `where_not()`, `and_not()`, `or_not()` and `xor_not()` methods. [#114](https://github.com/memgraph/gqlalchemy/pull/114) +- Improved `order_by()` method from query builder by changing its argument types. [#114](https://github.com/memgraph/gqlalchemy/pull/114) +- Added Docker and Binary Memgraph instance runners. [#91](https://github.com/memgraph/gqlalchemy/pull/91) +- Added methods for dropping all indexes (`drop_all_indexes()`) and dropping all triggers (`drop_all_triggers()`). [#100](https://github.com/memgraph/gqlalchemy/pull/100) +- Added table to graph importer and Amazon S3 importer. [#100](https://github.com/memgraph/gqlalchemy/pull/100) +- Added Azure Blob and local storage importers. [#104](https://github.com/memgraph/gqlalchemy/pull/104) +- Added an option to create a label index. [#113](https://github.com/memgraph/gqlalchemy/pull/113) +- Added batch save methods for saving nodes (`save_nodes()`) and saving relationships (`save_relationships()`). [#106](https://github.com/memgraph/gqlalchemy/pull/106) +- Added label filtering in `where()` method in query builder. [#103](https://github.com/memgraph/gqlalchemy/pull/103) +- Added support for creating a trigger without `ON` keyword in query builder. [#90](https://github.com/memgraph/gqlalchemy/pull/90) +- Added `execute()` option in query builder. [#92](https://github.com/memgraph/gqlalchemy/pull/92) +- Added `load_csv()` and `xor_where()` methods to query builder. [#90](https://github.com/memgraph/gqlalchemy/pull/90) + +### Bug fixes + +- Fixed `save_node_with_id()` signature in the `save_node()` method. [#109](https://github.com/memgraph/gqlalchemy/pull/109) +- Constraints and indexes defined in `Field` now work correctly. Before, when they were added to the `Field` of the property, they were always set to `True`, regardless of their actual value. [#90](https://github.com/memgraph/gqlalchemy/pull/90) +- Fixed label inheritance to get all labels of base class. 
[#105](https://github.com/memgraph/gqlalchemy/pull/105) +- Removed extra argument called `optional` from the `Merge` class. [#118](https://github.com/memgraph/gqlalchemy/pull/118) +- Removed unnecessary quotes from the `bootstraps_servers` argument when creating a stream. [#98](https://github.com/memgraph/gqlalchemy/pull/98) + +## v1.1 - Jan 19, 2022 + +### Major Features and Improvements + +- Added graph schema definition and validation. +- Added new methods to the query builder: `merge()`, `create()`, + `unwind()`,`with_()`, `return_()`, `yield_()`, `order_by()`, `limit()`, + `skip()`, `call()`, `delete()` and `remove()`. +- Added on-disk storage for large properties that don't need to be stored in the + graph database. +- Added support for managing streams and database triggers. diff --git a/docs/getting-started.md b/docs/getting-started.md new file mode 100644 index 00000000..405940d5 --- /dev/null +++ b/docs/getting-started.md @@ -0,0 +1,29 @@ +# Getting started with GQLAlchemy + +[![GQLAlchemy](https://img.shields.io/badge/source-GQLAlchemy-FB6E00?style=for-the-badge&logo=github&logoColor=white)](https://github.com/memgraph/gqlalchemy) + +**GQLAlchemy** is an open-source Python library and an **Object Graph Mapper** (OGM) - a link between graph database objects and Python objects. GQLAlchemy supports **Memgraph** and **Neo4j**. + +An Object Graph Mapper or OGM provides a developer-friendly workflow for writing object-oriented notation to communicate to a graph database. Instead of writing Cypher queries, you can write object-oriented code, which the OGM will automatically translate into Cypher queries. + +## Quick start + +### 1. Install GQLAlchemy + +Either install GQLAlchemy through [pip](installation.md#pip) or [build it from +source](installation.md#source). If you are using [Conda](https://docs.conda.io/en/latest/) for Python environment management, you can install GQLAlchemy through [pip](installation.md#pip). + +!!! danger + GQLAlchemy can't be installed with Python 3.11 [(#203)](https://github.com/memgraph/gqlalchemy/issues/203) and on Windows with Python > 3.9 [(#179)](https://github.com/memgraph/gqlalchemy/issues/179). If this is currently a blocker for you, please let us know by commenting on opened issues. + +### 2. Connect to Memgraph + +Check the [Python quick start guide](https://memgraph.com/docs) to learn how to connect to Memgraph using GQLAlchemy. + +### 3. Learn how to use GQLAlchemy + +With the help of the [How-to guides](how-to-guides/overview.md) you can learn how to use GQLAlchemy's features, such as object graph mapper and query builder. + +### 3. Check the reference guide + +Don't forget to check the [Reference guide](reference/gqlalchemy/overview.md) if you want to find out which methods GQLAlchemy has and how to use it. If the reference guide is not clear enough, head over to the [GQLAlchemy repository](https://github.com/memgraph/gqlalchemy) and inspect the source code. While you're there, feel free to give us a star or contribute to this open-source Python library. 
diff --git a/docs/how-to-guides/data/dgl-example.png b/docs/how-to-guides/data/dgl-example.png new file mode 100644 index 00000000..7c97f0da Binary files /dev/null and b/docs/how-to-guides/data/dgl-example.png differ diff --git a/docs/how-to-guides/data/networkx-example-2.png b/docs/how-to-guides/data/networkx-example-2.png new file mode 100644 index 00000000..000630ba Binary files /dev/null and b/docs/how-to-guides/data/networkx-example-2.png differ diff --git a/docs/how-to-guides/data/pyg-example.png b/docs/how-to-guides/data/pyg-example.png new file mode 100644 index 00000000..bc359162 Binary files /dev/null and b/docs/how-to-guides/data/pyg-example.png differ diff --git a/docs/how-to-guides/instance-runner/memgraph-binary-instance.md b/docs/how-to-guides/instance-runner/memgraph-binary-instance.md new file mode 100644 index 00000000..bd703948 --- /dev/null +++ b/docs/how-to-guides/instance-runner/memgraph-binary-instance.md @@ -0,0 +1,83 @@ +# How to manage Memgraph binary instances in Python + +Through this guide, you will learn how to start, stop, connect to and monitor +Memgraph instances with GQLAlchemy. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + +First, perform all the necessary imports: + +```python +from gqlalchemy.instance_runner import MemgraphInstanceBinary +``` + +## Start the Memgraph instance + +!!! warning + In order to start a Memgraph instance that you installed using `dpkg`, you need + to run the binary file as user `memgraph`. Otherwise, the process won't have the + right access rights to the needed directories and files. + +The following code will create a Memgraph instance, start it and return a +connection object: + +```python +memgraph_instance = MemgraphInstanceBinary( + host="0.0.0.0", port=7698, binary_path="/usr/lib/memgraph/memgraph", user="memgraph" +) +memgraph = memgraph_instance.start_and_connect(restart=False) +``` + +We used the default values for the arguments: + +- `host="0.0.0.0"`: This is the wildcard address which indicates that the + instance should accept connections from all interfaces. +- `port=7687`: This is the default port Memgraph listens to. +- `binary_path="/usr/lib/memgraph/memgraph"`: The default location of the + Memgraph binary file on Ubuntu. +- `user="memgraph"`: The user that will start the Memgraph process. +- `restart=False`: If the instance is already running, it won't be stopped and + started again. + +After we have created the connection, we can start querying the database: + +```python +memgraph.execute_and_fetch("RETURN 'Memgraph is running' AS result"))[0]["result"] +``` + +## Pass configuration flags + +You can pass [configuration flags](htps://memgraph.com/docs/configuration/configuration-settings) +using a dictionary: + +```python +config={"--log-level": "TRACE"} +memgraph_instance = MemgraphInstanceBinary(config=config) +``` + +## Stop the Memgraph instance + +To stop a Memgraph instance, call the `stop()` method: + +```python +memgraph_instance.stop() +``` + +## Check if a Memgraph instance is running + +To check if a Memgraph instance is running, call the `is_running()` method: + +```python +memgraph_instance.is_running() +``` + +## Where to next? + +Hopefully, this guide has taught you how to manage Memgraph Docker instances. If +you have any more questions, join our community and ping us on +[Discord](https://discord.gg/memgraph). 
diff --git a/docs/how-to-guides/instance-runner/memgraph-docker-instance.md b/docs/how-to-guides/instance-runner/memgraph-docker-instance.md new file mode 100644 index 00000000..5eab3f90 --- /dev/null +++ b/docs/how-to-guides/instance-runner/memgraph-docker-instance.md @@ -0,0 +1,83 @@ +# How to manage Memgraph Docker instances in Python + +Through this guide, you will learn how to start, stop, connect to and monitor +Memgraph instances with GQLAlchemy. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + + +First, perform all the necessary imports: + +```python +from gqlalchemy.instance_runner import ( + DockerImage, + MemgraphInstanceDocker +) +``` + +## Start the Memgraph instance + +The following code will create a Memgraph instance, start it and return a +connection object: + +```python +memgraph_instance = MemgraphInstanceDocker( + docker_image=DockerImage.MEMGRAPH, docker_image_tag="latest", host="0.0.0.0", port=7687 +) +memgraph = memgraph_instance.start_and_connect(restart=False) +``` + +We used the default values for the arguments: + +- `docker_image=DockerImage.MEMGRAPH`: This will start the `memgraph/memgraph` + Docker image. +- `docker_image_tag="latest"`: We use the `latest` tag to start the most recent + version of Memgraph. +- `host="0.0.0.0"`: This is the wildcard address which indicates that the + instance should accept connections from all interfaces. +- `port=7687`: This is the default port Memgraph listens to. +- `restart=False`: If the instance is already running, it won't be stopped and + started again. + +After we have created the connection, we can start querying the database: + +```python +memgraph.execute_and_fetch("RETURN 'Memgraph is running' AS result"))[0]["result"] +``` + +## Pass configuration flags + +You can pass [configuration flags](htps://memgraph.com/docs/configuration/configuration-settings) +using a dictionary: + +```python +config={"--log-level": "TRACE"} +memgraph_instance = MemgraphInstanceDocker(config=config) +``` + +## Stop the Memgraph instance + +To stop a Memgraph instance, call the `stop()` method: + +```python +memgraph_instance.stop() +``` + +## Check if a Memgraph instance is running + +To check if a Memgraph instance is running, call the `is_running()` method: + +```python +memgraph_instance.is_running() +``` + +## Where to next? + +Hopefully, this guide has taught you how to manage Memgraph Docker instances. If +you have any more questions, join our community and ping us on +[Discord](https://discord.gg/memgraph). diff --git a/docs/how-to-guides/loaders/import-table-data-to-graph-database.md b/docs/how-to-guides/loaders/import-table-data-to-graph-database.md new file mode 100644 index 00000000..d3b655c7 --- /dev/null +++ b/docs/how-to-guides/loaders/import-table-data-to-graph-database.md @@ -0,0 +1,139 @@ +# How to import table data to a graph database + +This guide will show you how to use `loaders.py` to translate table data from a +file to graph data and import it to **Memgraph**. Currently, we support reading +of CSV, Parquet, ORC and IPC/Feather/Arrow file formats via the **PyArrow** package. + +> Make sure you have a running Memgraph instance. If you're not sure how to run +> Memgraph, check out the Memgraph [Quick start](https://memgraph.com/docs/getting-started). + +The `loaders.py` module implements loading data from the local file system, as +well as Azure Blob and Amazon S3 remote file systems. 
Depending on where your +data is located, here are two guides on how to import it to Memgraph: + +- [Loading a CSV file from the local file + system](#loading-a-csv-file-from-the-local-file-system) +- [Using a cloud storage solution](#using-a-cloud-storage-solution) + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + +!!! info + The features below aren’t included in the default GQLAlchemy installation. To use them, make sure to [install GQLAlchemy](../../installation.md) with the relevant optional dependencies. + +## Loading a CSV file from the local file system + +Let's say you have a simple table data in a CSV file stored at +`/home/user/table_data`: + +```csv +name,surname,grade +Ivan,Horvat,4 +Marko,Andric,5 +Luka,Lukic,3 +``` + +To create a translation from table to graph data, you need to define a **data +configuration object**. This can be done inside your code by defining a +dictionary, but it is recommended to use a YAML file structured like this: + +```yaml +indices: # indices to be created for each table + individuals: # name of table containing individuals with ind_id + - ind_id + address: + - add_id + + +name_mappings: # how we want to name node labels + individuals: + label: INDIVIDUAL # nodes made from individuals table will have INDIVIDUAL label + address: + label: ADDRESS + column_names_mapping: {"current_column_name": "mapped_name"} # (optional) map column names + + +one_to_many_relations: + address: [] # currently needed, leave [] if no relations to define + individuals: + - foreign_key: # foreign key used for mapping; + column_name: add_id # specifies its column + reference_table: address # name of table from which the foreign key is taken + reference_key: add_id # column name in reference table from which the foreign key is taken + label: LIVES_IN # label applied to relationship created + from_entity: False # (optional) define direction of relationship created + + +many_to_many_relations: # intended to be used in case of associative tables + example: + foreign_key_from: # describes the source of the relationship + column_name: + reference_table: + reference_key: + foreign_key_to: # describes the destination of the relationship + column_name: + reference_table: + reference_key: + label: + +``` + +For this example, you don't need all of those fields. You only need to define +`indices` and `one_to_many_relations`. Hence, you have the following YAML file: + +```yaml +indices: + example: + - name + +name_mappings: + example: + label: PERSON + +one_to_many_relations: + example: [] +``` + +In order to read the data configuration from the YAML file, run: + +```python +with open("./example.yaml", "r") as stream: + try: + parsed_yaml = yaml.load(stream, Loader=SafeLoader) + except yaml.YAMLError as exc: + print(exc) +``` + +Having defined the data configuration for the translation, all you need to do is +make an instance of an `Importer` and call `translate()`. + +```python +importer = CSVLocalFileSystemImporter( + data_configuration=parsed_yaml, + path="/home/user/table_data", +) + +importer.translate(drop_database_on_start=True) +``` + +## Using a cloud storage solution + +To connect to Azure Blob, simply change the Importer object you are using. 
Like +above, first, define a data configuration object and then simply call: + +```python +importer = ParquetAzureBlobFileSystemImporter( + container_name="test", + data_configuration=parsed_yaml, + account_name="your_account_name", + account_key="your_account_key", +) +``` + +Hopefully, this guide has taught you how to import table data into Memgraph. If +you have any more questions, join our community and ping us on +[Discord](https://discord.gg/memgraph). diff --git a/docs/how-to-guides/loaders/make-a-custom-file-system-importer.md b/docs/how-to-guides/loaders/make-a-custom-file-system-importer.md new file mode 100644 index 00000000..f657ce81 --- /dev/null +++ b/docs/how-to-guides/loaders/make-a-custom-file-system-importer.md @@ -0,0 +1,101 @@ +# How to make a custom file system importer + +> To learn how to import table data from a file to the Memgraph database, head +> over to the [How to import table +> data](import-table-data-to-graph-database.md) guide. + +If you want to read from a file system not currently supported by +**GQLAlchemy**, or use a file type currently not readable, you can implement +your own by extending abstract classes `FileSystemHandler` and `DataLoader`, +respectively. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + +!!! info + The features below aren’t included in the default GQLAlchemy installation. To use them, make sure to [install GQLAlchemy](../../installation.md) with the relevant optional dependencies. + +## Implementing a new `FileSystemHandler` + +For this guide, you will use the existing `PyArrowDataLoader` capable of reading +CSV, Parquet, ORC and IPC/Feather/Arrow file formats. The PyArrow loader class +supports [fsspec](https://filesystem-spec.readthedocs.io/en/latest/)-compatible +file systems, so to implement an **Azure Blob** file system, you need to follow +these steps. + +### 1. Extend the `FileSystemHandler` class + +This class holds the connection to the file system service and handles the path +from which the `DataLoader` object reads files. To get a fsspec-compatible instance of +an Azure Blob connection, you can use the [adlfs](https://github.com/fsspec/adlfs) package. We are going to pass `adlfs`-specific parameters such as `account_name` and `account_key` via kwargs. All that's left to do +is to override the `get_path` method. + +```python +import adlfs + +class AzureBlobFileSystemHandler(FileSystemHandler): + + def __init__(self, container_name: str, **kwargs) -> None: + """Initializes connection and data container.""" + super().__init__(fs=adlfs.AzureBlobFileSystem(**kwargs)) + self._container_name = container_name + + def get_path(self, collection_name: str) -> str: + """Get file path in file system.""" + return f"{self._container_name}/{collection_name}" +``` + +### 2. Wrap the `TableToGraphImporter` + +Next, you are going to wrap the `TableToGraphImporter` class. This is optional since you can use the class directly, but it will be easier to use if we extend it with our custom importer class. Since we will be using PyArrow for data loading, you can extend the `PyArrowImporter` class (which extends the `TableToGraphImporter`) and make your own +`PyArrowAzureBlobImporter`. This class should initialize the `AzureBlobFileSystemHandler` and leave the rest to the `PyArrowImporter` class. It should also receive a `file_extension_enum` argument, which defines the file type that you are going to be reading. 
+ +```python +class PyArrowAzureBlobImporter(PyArrowImporter): + """PyArrowImporter wrapper for use with Azure Blob File System.""" + + def __init__( + self, + container_name: str, + file_extension_enum: PyArrowFileTypeEnum, + data_configuration: Dict[str, Any], + memgraph: Optional[Memgraph] = None, + **kwargs, + ) -> None: + super().__init__( + file_system_handler=AzureBlobFileSystemHandler( + container_name=container_name, **kwargs + ), + file_extension_enum=file_extension_enum, + data_configuration=data_configuration, + memgraph=memgraph, + ) +``` + +### 3. Call `translate()` + +Finally, to use your custom file system, initialize the Importer class and call +`translate()` + +```python +importer = PyArrowAzureBlobImporter( + container_name="test" + file_extension_enum=PyArrowFileTypeEnum.Parquet, + data_configuration=parsed_yaml, + account_name="your_account_name", + account_key="your_account_key", +) + +importer.translate(drop_database_on_start=True) +``` + +If you want to see the full implementation of the `AzureBlobFileSystem` and +other loader components, have a look [at the +code](https://github.com/memgraph/gqlalchemy). Feel free to create a PR on the +GQLAlchemy repository if you think of a new feature we could use. If you have +any more questions, join our community and ping us on +[Discord](https://discord.gg/memgraph). diff --git a/docs/how-to-guides/ogm.md b/docs/how-to-guides/ogm.md new file mode 100644 index 00000000..2001483b --- /dev/null +++ b/docs/how-to-guides/ogm.md @@ -0,0 +1,468 @@ +# How to use object graph mapper + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + +Through this guide, you will learn how to use GQLAlchemy object graph mapper to: +- [**Map nodes and relationships**](#map-nodes-and-relationships) +- [**Save nodes and relationships**](#save-nodes-and-relationships) +- [**Load nodes and relationships**](#load-nodes-and-relationships) + - [**Find node properties**](#find-node-properties) + - [**Create relationship between existing nodes**](#create-relationship-between-existing-nodes) + - [**Merge nodes and relationships**](#merge-nodes-and-relationships) +- [**Create indexes**](#create-indexes) +- [**Create constraints**](#create-constraints) + +>Hopefully, this guide will teach you how to properly use GQLAlchemy object graph mapper. If you +>have any more questions, join our community and ping us on [Discord](https://discord.gg/memgraph). + +!!! info + To test the above features, you must [install GQLAlchemy](../installation.md) and have a running Memgraph instance. If you're unsure how to run Memgraph, check out the Memgraph [Quick start](https://memgraph.com/docs/getting-started)). + +## Map nodes and relationships + +First, we need to import all the necessary classes from GQLAlchemy: + +```python +from gqlalchemy import Memgraph, Node, Relationship +``` + +After that, instantiate Memgraph and **create classes representing nodes**. + +```python +db = Memgraph() + +class User(Node): + id: str + username: str + +class Streamer(User): + id: str + username: str + followers: int + +class Language(Node): + name: str +``` + + + +`Node` is a Python class which maps to a graph object in Memgraph. `User`, `Streamer` and `Language` are classes which inherit from `Node` and they map to a label in a graph database. 
Class `User` maps to a single `:User` label with properties `id` and `username`, class `Streamer` maps to multiple labels `:Streamer:User` with properties `id`, `username` and `followers`, and class Language maps to a single `:Language` label with `name` property. + +In a similar way, you can **create relationship classes**: + +```python +class ChatsWith(Relationship, type="CHATS_WITH"): + last_chatted: str + +class Speaks(Relationship): + since: str +``` + +The code above maps to a relationship of type `CHATS_WITH` with the string property `last_chatted` and to a relationship of type `SPEAKS` with the string property since. There was no need to add type argument to `Speaks` class, since the label it maps to will automatically be set to uppercase class name in a graph database. + +If you want to **create a node class without any properties**, use `pass` statement: + +```python +class User(Node): + pass +``` + +For **relationships without any properties** also use `pass` statement: + +```python +class ChatsWith(Relationship, type="CHATS_WITH"): + pass +``` + +!!! info + Objects are modeled using GQLAlchemy’s Object Graph Mapper (OGM) which provides schema validation, so you can be sure that the data inside Memgraph is accurate. If you tried saving data that is not following the defined schema, you will get a `ValidationError`. + +To use the above classes, you need to [save](#save-nodes-and-relationships) or [load](#load-nodes-and-relationships) data first. + +## Save nodes and relationships + +In order to save a node using the object graph mapper, first define node classes: + +```python +from gqlalchemy import Memgraph, Node, Relationship + +db = Memgraph() + +class User(Node): + id: str + username: str + +class Language(Node): + name: str +``` + +The above classes map to `User` and `Language` nodes in the database. `User` nodes have properties `id` and `username` and `Language` nodes have property `name`. + + + +To **create and save node objects** use the following code: + +```python +john = User(id="1", username="John").save(db) +jane = Streamer(id="2", username="janedoe", followers=111).save(db) +language = Language(name="en").save(db) +``` + +There is **another way of creating and saving node objects**: + +```python +john = User(id="1", username="John") +db.save_node(john) + +jane = Streamer(id="2", username="janedoe", followers=111) +db.save_node(jane) + +language = Language(name="en") +db.save_node(language) +``` + +!!! danger + The `save()` and `save_node()` procedures will save nodes in Memgraph even if they already exist. This means that if you run the above code twice, you will have duplicate nodes in the database. To avoid that, [add constraints](#create-constraints) for properties or first [load](#load-nodes-and-relationships) the node from the database to check if it already exists. + +To **save relationships** using the object graph mapper, first define relationship classes: + +```python +class ChatsWith(Relationship, type="CHATS_WITH"): + last_chatted: str + +class Speaks(Relationship): + since: str +``` + +The code above maps to a relationship of type `CHATS_WITH` with the string property `last_chatted` and to a relationship of type `SPEAKS` with the string property since. There was no need to add type argument to `Speaks` class, since the label it maps to will automatically be set to uppercase class name in a graph database. 
+ +To save relationships, create them with appropriate start and end nodes and then use the `save()` procedure: + +```python +ChatsWith( + _start_node_id=john._id, _end_node_id=jane._id, last_chatted="2023-02-14" +).save(db) + +Speaks(_start_node_id=john._id, _end_node_id=language._id, since="2023-02-14").save(db) +``` + +The property `_id` is an **internal Memgraph id** - an id given to each node upon saving to the database. This means that you have to first load nodes from the database or save them to variables in order to create a relationship between them. + +!!! info + Objects are modeled using GQLAlchemy’s Object Graph Mapper (OGM) which provides schema validation, so you can be sure that the data inside Memgraph is accurate. If you tried saving data that is not following the defined schema, you will get `ValidationError`. + +**Another way of saving relationships** is by using the `save_relationship()` procedure: + +```python +db.save_relationship( + ChatsWith(_start_node_id=john._id, _end_node_id=jane._id, last_chatted="2023-02-14") +) + +db.save_relationship( + Speaks(_start_node_id=user._id, _end_node_id=language._id, since="2023-02-14") +) +``` + +!!! danger + The `save()` and `save_relationship()` procedures will save relationships in Memgraph even if they already exist. This means that if you run the above code twice, you will have duplicate relationships in the database. To avoid that, first [load](#load-nodes-and-relationships) the relationship from the database to check if it already exists. + +## Load nodes and relationships + +Let's continue with the previously defined classes: + +```python +class User(Node): + id: str + username: str + + +class Streamer(User): + id: str + username: str + followers: int + + +class Language(Node): + name: str + + +class ChatsWith(Relationship, type="CHATS_WITH"): + last_chatted: str + + +class Speaks(Relationship, type="SPEAKS"): + since: str +``` + +For this example, we will also use previously saved nodes: + +```python +jane = Streamer(id="2", username="janedoe", followers=111).save(db) +language = Language(name="en").save(db) +``` + +There are many examples of when **loading a node** from the database may come in +handy, but let's cover the two most common. + +### Find node properties + +Suppose you just have the `id` of the streamer and you want to know the +streamer's name. You have to load that node from the database to check its +`name` property. If you try running the following code: + +```python +loaded_streamer = Streamer(id="2").load(db=db) +``` + +you will get a `ValidationError`. This happens because the schema you defined expects `username` and `followers` properties for the `Streamer` instance. To avoid that, define Streamer class like this: + +```python +class Streamer(User): + id: str + username: Optional[str] + followers: Optional[str] +``` + +The above class definition is not ideal, since it is not enforcing schema as before. To do that, [add constraints](#create-constraints). + +If you try loading the node again, the following code: + +```python +loaded_streamer = Streamer(id="2").load(db=db) +``` + +will print out the username of the streamer whose `id` equals `"2"`, that is, `"janedoe"`. 
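To check the value explicitly, you can print the property of the loaded object (a small sketch using the objects defined above):

```python
loaded_streamer = Streamer(id="2").load(db=db)
print(loaded_streamer.username)  # janedoe
```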
+ +### Create relationship between existing nodes + +To create a new relationship of type `SPEAKS`, between already saved streamer and language you need to first load those nodes: + +```python +loaded_streamer = Streamer(id="2").load(db=db) +loaded_language = Language(name="en").load(db=db) +``` + +The load() method returns one result above, since it matches unique database objects. When the matching object is not unique, the `load()` method will return a list of matching results. + +To **create a relationship** between `loaded_streamer` and `loaded_language` nodes run: + +```python +Speaks( + _start_node_id=loaded_streamer._id, + _end_node_id=loaded_language._id, + since="2023-02-15", +).save(db) +``` + +In the above example, the relationship will be created even if it existed before. To avoid that, check [merging nodes and relationships section](#merging-nodes-and-relationships). + +To **load a relationship** from the database based on its start and end node, first mark its property as optional: + +```python +class Speaks(Relationship, type="SPEAKS"): + since: Optional[str] +``` + +The above class definition is not ideal, since it is not enforcing schema as before. To do that, [add constraints](#create-constraints). + +To load the relationship, run the following: + +```python +loaded_speaks = Speaks( + _start_node_id=streamer._id, + _end_node_id=language._id + ).load(db) +``` + +It's easy to get its `since` property: + +```python +print(loaded_speaks.since) +``` +The output of the above print is `2023-02-15`. + +### Merge nodes and relationships + +To **merge nodes**, first try loading them from the database to see if they exist, and if not, save them: + +```python +try: + streamer = Streamer(id="3").load(db=db) +except: + print("Creating new Streamer node in the database.") + streamer = Streamer(id="3", username="anne", followers=222).save(db=db) +``` + +To **merge relationships** first try loading them from the database to see if they exist, and if not, save them: + +```python +try: + speaks = Speaks(_start_node_id=streamer._id, _end_node_id=language._id).load(db) +except: + print("Creating new Speaks relationship in the database.") + speaks = Speaks( + _start_node_id=streamer._id, + _end_node_id=language._id, + since="2023-02-20", + ).save(db) +``` + +## Create indexes + +To create indexes you need to do one additional import: + +```python +from gqlalchemy import Field +``` + +The `Field` class originates from `pydantic`, a Python library data validation and settings management. Here is the example of how `Field` class helps in creating label and label-property indexes: + +```python +class User(Node): + id: str = Field(index=True, db=db) + username: str + +class Language(Node, index=True, db=db): + name: str +``` + +The indexes will be set on class definition, before instantiation. This ensures that the index creation is run only once for each index type. To check which indexes were created, run: + +```python +print(db.get_indexes()) +``` + +The other way to create indexes is by creating an instance of `MemgraphIndex` class. 
For example, to create label index `NodeOne` and label-property index `NodeOne(name)`, run the following code: + +```python +from gqlalchemy import Memgraph +from gqlalchemy.models import MemgraphIndex + +db = Memgraph() + +index1 = MemgraphIndex("NodeOne") +index2 = MemgraphIndex("NodeOne", "name") + +db.create_index(index1) +db.create_index(index2) +``` + +To learn more about indexes, head over to the [indexing reference guide](https://memgraph.com/docs/fundamentals/indexes). + +## Create constraints + +Uniqueness constraint enforces that each `label`, `property_set` pair is unique. Here is how you can **enforce uniqueness constraint** with GQLAlchemy's OGM: + +```python +class Language(Node): + name: str = Field(unique=True, db=db) +``` + +The above is the same as running the Cypher query: + +```cypher +CREATE CONSTRAINT ON (n:Language) ASSERT n.name IS UNIQUE; +``` + +Read more about it at [uniqueness constraint how-to guide](https://memgraph.com/docs/fundamentals/constraints). + +Existence constraint enforces that each vertex that has a specific label also must have the specified property. Here is how you can **enforce existence constraint** with GQLAlchemy's OGM: + +```python +class Streamer(User): + id: str + username: Optional[str] = Field(exists=True, db=db) + followers: Optional[str] +``` + +The above is the same as running the Cypher query: + +```cypher +CREATE CONSTRAINT ON (n:Streamer) ASSERT EXISTS (n.username); +``` + +Read more about it at [existence constraint how-to guide](https://memgraph.com/docs/fundamentals/constraints). + +To check which constraints have been created, run: + +```python +print(db.get_constraints()) +``` + +## Full code example + +The above mentioned examples can be merged into a working code example which you can run. 
Here is the code: + +```python +from gqlalchemy import Memgraph, Node, Relationship, Field +from typing import Optional + +db = Memgraph() + +class User(Node): + id: str = Field(index=True, db=db) + username: str = Field(exists=True, db=db) + +class Streamer(User): + id: str + username: Optional[str] = Field(exists=True, db=db) + followers: Optional[str] + +class Language(Node, index=True, db=db): + name: str = Field(unique=True, db=db) + +class ChatsWith(Relationship, type="CHATS_WITH"): + last_chatted: str + +class Speaks(Relationship, type="SPEAKS"): + since: Optional[str] + +john = User(id="1", username="John").save(db) +jane = Streamer(id="2", username="janedoe", followers=111).save(db) +language = Language(name="en").save(db) + +ChatsWith( + _start_node_id=john._id, _end_node_id=jane._id, last_chatted="2023-02-14" +).save(db) + +Speaks(_start_node_id=john._id, _end_node_id=language._id, since="2023-02-14").save(db) + +streamer = Streamer(id="2").load(db=db) +language = Language(name="en").load(db=db) + +speaks = Speaks( + _start_node_id=streamer._id, + _end_node_id=language._id, + since="2023-02-20", +).save(db) + +speaks = Speaks(_start_node_id=streamer._id, _end_node_id=language._id).load(db) +print(speaks.since) + +try: + streamer = Streamer(id="3").load(db=db) +except: + print("Creating new Streamer node in the database.") + streamer = Streamer(id="3", username="anne", followers=222).save(db=db) + +try: + speaks = Speaks(_start_node_id=streamer._id, _end_node_id=language._id).load(db) +except: + print("Creating new Speaks relationship in the database.") + speaks = Speaks( + _start_node_id=streamer._id, + _end_node_id=language._id, + since="2023-02-20", + ).save(db) + +print(db.get_indexes()) +print(db.get_constraints()) +``` + +>Hopefully, this guide has taught you how to properly use GQLAlchemy object graph mapper. If you +>have any more questions, join our community and ping us on [Discord](https://discord.gg/memgraph). diff --git a/docs/how-to-guides/on-disk-storage/on-disk-storage.md b/docs/how-to-guides/on-disk-storage/on-disk-storage.md new file mode 100644 index 00000000..bf5a3034 --- /dev/null +++ b/docs/how-to-guides/on-disk-storage/on-disk-storage.md @@ -0,0 +1,65 @@ +# How to use on-disk storage + +Since Memgraph is an in-memory graph database, the GQLAlchemy library provides +an on-disk storage solution for large properties not used in graph algorithms. +This is useful when nodes or relationships have metadata that doesn’t need to be +used in any of the graph algorithms that need to be carried out in Memgraph, but +can be fetched after. In this how-to guide, you'll learn how to use an SQL +database to store node properties seamlessly as if they were being stored in +Memgraph. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + + +## Connect to Memgraph and an SQL database + +First you need to do all necessary imports and connect to the running Memgraph +and SQL database instance: + +```python +from gqlalchemy import Memgraph, SQLitePropertyDatabase, Node, Field +from typing import Optional + +graphdb = Memgraph() +SQLitePropertyDatabase('path-to-my-db.db', graphdb) +``` + +The `graphdb` creates a connection to an in-memory graph database and +`SQLitePropertyDatabase` attaches to `graphdb` in its constructor. + +## Define schema + +For example, you can create the class `User` which maps to a node object in the +graph database. 
+ +```python +class User(Node): + id: int = Field(unique=True, exists=True, index=True, db=graphdb) + huge_string: Optional[str] = Field(on_disk=True) +``` + +Here the property `id` is a required `int` that creates uniqueness and existence +constraints inside Memgraph. You can notice that the property `id` is also +indexed on label `User`. The `huge_string` property is optional, and because the +`on_disk` argument is set to `True`, it will be saved into the SQLite database. + +## Create data + +Next, you can create some huge string, which won't be saved into the graph +database, but rather into the SQLite databse. + +```python +my_secret = "I LOVE DUCKS" * 1000 +john = User(id=5, huge_string=my_secret).save(db) +john2 = User(id=5).load(db) +print(john2.huge_string) # prints I LOVE DUCKS, a 1000 times +``` + +Hopefully this guide has taught you how to use on-disk storage along with the +in-memory graph database. If you have any more questions, join our community and +ping us on [Discord](https://discord.gg/memgraph). diff --git a/docs/how-to-guides/overview.md b/docs/how-to-guides/overview.md new file mode 100644 index 00000000..3a7d1c0f --- /dev/null +++ b/docs/how-to-guides/overview.md @@ -0,0 +1,87 @@ +# How-to guides overview + +This section will teach you how to use object graph mapper (OGM) and query +builder from the GQLAlchemy. Here you will find step-by-step guides for the most +common usage of OGM and query builder, depending on the current GQLAlchemy +capabilities. If you are a Python developer not that familiar with Cypher query +language, you will find the how-to guides very useful. + +## Object graph mapper + +Object graph mapper (OGM) in GQLAlchemy maps Python classes to nodes and +relationships in graph database and converts function calls to Cypher queries. +To learn more about how to use OGM, take at [**OGM how-to guide**](ogm.md). + +## Query builder + +When working with GQLAlchemy, you can connect to the database and execute Cypher +queries using the query builder. To learn more about how to create a query using +query builder, check out the [**query builder how-to guide**](query-builder.md). + +## Stream & trigger support + +You can create streams and database triggers directly from GQLAlchemy. Check out +the following guides: + +- [**Kafka streams**](streams/kafka-streams.md) +- [**Pulsar streams**](streams/pulsar-streams.md) +- [**Triggers**](triggers/triggers.md) + +## Import data from different sources + +!!! info + The features below aren’t included in the default GQLAlchemy installation. To use them, make sure to [install GQLAlchemy](../installation.md) with the relevant optional dependencies. + + +You can translate table data from a file to graph data and import it to +Memgraph. Currently, we support reading of CSV, Parquet, ORC and +IPC/Feather/Arrow file formats via the PyArrow package. 
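As a quick preview (the guides linked below walk through this in detail), a minimal local CSV import could look roughly like the sketch below. It assumes the importer classes are exposed through `gqlalchemy.loaders` and that `parsed_yaml` holds a data configuration parsed from a YAML file:

```python
from gqlalchemy.loaders import CSVLocalFileSystemImporter

# parsed_yaml is the data configuration dictionary parsed from a YAML file
importer = CSVLocalFileSystemImporter(
    data_configuration=parsed_yaml,
    path="/home/user/table_data",
)
importer.translate(drop_database_on_start=True)
```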
+ +You can use `loaders.py` which implements loading data from the local file +system, as well as Azure Blob and Amazon S3 remote file systems: + +- **[Import table data to a graph + database](loaders/import-table-data-to-graph-database.md)** + +The other way to import data is to implement a custom file system importer: + +- **[Implement a custom file system + importer](loaders/make-a-custom-file-system-importer.md)** + +## Instance runner + +There are two ways of managing a Memgraph instance with the `instance_runner` +module: + +- **[Manage a Memgraph instance with +Docker](instance-runner/memgraph-docker-instance.md)** +- **[Manage a Memgraph instance from a + binary](instance-runner/memgraph-binary-instance.md)** + +## On-disk storage + +Since Memgraph is an in-memory graph database, the GQLAlchemy library provides +an on-disk storage solution for large properties that don’t need to be used in +any of the graph algorithms. Learn how to use on-disk storage in the following +guide: + +- [**On-disk storage**](on-disk-storage/on-disk-storage.md) + +## Graph projections + +As subgraphs are mainly used with Memgraph's query modules (graph algorithms), +QueryBuilder's `call()` method enables specifying the subgraph to use with a +certain algorithm. + +- [**Create a graph projection**](query-builder/graph-projection.md) + +## Transform Python graphs into Memgraph graphs + +GQLAlchemy holds transformations that can transform NetworkX, PyG and DGL graphs +into Memgraph graphs. These transformations take the source graph object and +translate it to the appropriate Cypher queries. The Cypher queries are then +executed to create a graph inside Memgraph. + +- [**Import NetworkX graph into Memgraph**](#import-networkx-graph-into-memgraph) +- [**Import PyG graph into Memgraph**](#import-pyg-graph-into-memgraph) +- [**Import DGL graph into Memgraph**](#import-dgl-graph-into-memgraph) diff --git a/docs/how-to-guides/query-builder.md b/docs/how-to-guides/query-builder.md new file mode 100644 index 00000000..536b0119 --- /dev/null +++ b/docs/how-to-guides/query-builder.md @@ -0,0 +1,1528 @@ +# How to use query builder + +import Tabs from '@theme/Tabs'; +import TabItem from '@theme/TabItem'; + +Through this guide, you will learn how to use GQLAlchemy query builder to: +- [**Create nodes and relationships**](#create-nodes-and-relationships) + - [**Create a node**](#create-a-node) + - [**Create a relationship**](#create-a-relationship) +- [**Merge nodes and relationships**](#merge-nodes-and-relationships) + - [**Merge a node**](#merge-a-node) + - [**Merge a relationship**](#merge-a-relationship) +- [**Set or update properties and labels**](#set-or-update-properties-and-labels) + - [**Set a property**](#set-a-property) + - [**Set a label**](#set-a-label) + - [**Replace all properties**](#replace-all-properties) + - [**Update all properties**](#update-all-properties) +- [**Filter data**](#filter-data) + - [**Filter data by property comparison**](#filter-data-by-property-comparison) + - [**Filter data by property value**](#filter-data-by-property-value) + - [**Filter data by label**](#filter-data-by-label) +- [**Return results**](#return-results) + - [**Return all variables from a query**](#return-all-variables-from-a-query) + - [**Return specific variables from a query**](#return-specific-variables-from-a-query) + - [**Limit the number of returned results**](#limit-the-number-of-returned-results) + - [**Order the returned results**](#order-the-returned-results) + - [**Order by a list of 
values**](#order-by-a-list-of-values)
+- [**Delete and remove objects**](#delete-and-remove-objects)
+  - [**Delete a node**](#delete-a-node)
+  - [**Delete a relationship**](#delete-a-relationship)
+  - [**Remove properties**](#remove-properties)
+- [**Call procedures**](#call-procedures)
+  - [**Call procedure with no arguments**](#call-procedure-with-no-arguments)
+  - [**Call procedure with arguments**](#call-procedure-with-arguments)
+- [**Load CSV file**](#load-csv-file)
+
+>Hopefully, this guide will teach you how to properly use the GQLAlchemy query builder. If you
+>have any more questions, join our community and ping us on [Discord](https://discord.gg/memgraph).
+
+!!! info
+    To test the above features, you must install [GQLAlchemy](../installation.md) and have a running Memgraph instance. If you're unsure how to run Memgraph, check out the Memgraph [Quick start](https://memgraph.com/docs/getting-started).
+
+
+## Create nodes and relationships
+
+The methods [`create()`](../reference/gqlalchemy/query_builders/declarative_base.md#create), [`merge()`](../reference/gqlalchemy/query_builders/declarative_base.md#merge), [`match()`](../reference/gqlalchemy/query_builders/declarative_base.md#match), [`node()`](../reference/gqlalchemy/query_builders/declarative_base.md#node), [`to()`](../reference/gqlalchemy/query_builders/declarative_base.md#to) and [`from_()`](../reference/gqlalchemy/query_builders/declarative_base.md#from_) are most often used when building a query to create or merge nodes and relationships.
+
+### Create a node
+
+To **create a node** with the label `Person` and a property `name` of value "Ron", run the following code:
+
+
+
+
+```python
+from gqlalchemy import create
+
+query = create().node(labels="Person", name="Ron").execute()
+```
+
+
+
+
+```cypher
+CREATE (:Person {name: 'Ron'});
+```
+
+
+
+
+### Create a relationship
+
+To **create a relationship** of type `FRIENDS_WITH` with the property `since` from one `Person` node to another, run the following code:
+
+
+
+
+```python
+from gqlalchemy import create
+
+query = (
+    create()
+    .node(labels="Person", name="Leslie")
+    .to(relationship_type="FRIENDS_WITH", since="2023-02-16")
+    .node(labels="Person", name="Ron")
+    .execute()
+)
+```
+
+
+
+```cypher
+CREATE (:Person {name: 'Leslie'})-[:FRIENDS_WITH {since: '2023-02-16'}]->(:Person {name: 'Ron'});
+```
+
+
+
+
+Since you are creating a relationship between two nodes without first matching the existing nodes or merging the relationships, the nodes will be created too.
+
+To **create a relationship** of type `FRIENDS_WITH` from one `Person` node to another **in the opposite direction**, run the following code:
+
+
+
+
+```python
+from gqlalchemy import create
+
+query = (
+    create()
+    .node(labels="Person", name="Leslie")
+    .from_(relationship_type="FRIENDS_WITH")
+    .node(labels="Person", name="Ron")
+    .execute()
+)
+```
+
+
+
+```cypher
+CREATE (:Person {name: 'Leslie'})<-[:FRIENDS_WITH]-(:Person {name: 'Ron'});
+```
+
+
+
+
+Again, since you are creating a relationship between two nodes without first matching the existing nodes or merging the relationships, the nodes will be created too.
+
+To **create a relationship between existing nodes**, first match the existing nodes and then create a relationship between them by running the following code:
+
+
+
+
+```python
+from gqlalchemy import match
+
+query = (
+    match()
+    .node(labels="Person", name="Leslie", variable="leslie")
+    .match()
+    .node(labels="Person", name="Ron", variable="ron")
+    .create()
+    .node(variable="leslie")
+    .to(relationship_type="FRIENDS_WITH")
+    .node(variable="ron")
+    .execute()
+)
+```
+
+
+
+```cypher
+MATCH (leslie:Person {name: 'Leslie'})
+MATCH (ron:Person {name: 'Ron'})
+CREATE (leslie)-[:FRIENDS_WITH]->(ron);
+```
+
+
+
+
+Read more about the `CREATE` clause in the [Cypher manual](https://memgraph.com/docs/querying/clauses/create).
+
+## Merge nodes and relationships
+
+### Merge a node
+
+To **merge a node**, run the following code:
+
+
+
+
+```python
+from gqlalchemy import merge
+
+query = merge().node(labels="Person", name="Leslie").execute()
+```
+
+
+
+
+```cypher
+MERGE (:Person {name: 'Leslie'});
+```
+
+
+
+
+### Merge a relationship
+
+To **merge a relationship**, first match the existing nodes and then merge the relationship by running the following code:
+
+
+
+
+
+```python
+from gqlalchemy import match, merge
+
+query = (
+    match()
+    .node(labels="Person", name="Leslie", variable="leslie")
+    .match()
+    .node(labels="Person", name="Ron", variable="ron")
+    .merge()
+    .node(variable="leslie")
+    .to(relationship_type="FRIENDS_WITH")
+    .node(variable="ron")
+    .execute()
+)
+```
+
+
+
+```cypher
+MATCH (leslie:Person {name: 'Leslie'})
+MATCH (ron:Person {name: 'Ron'})
+MERGE (leslie)-[:FRIENDS_WITH]->(ron);
+```
+
+
+
+
+Read more about the `MERGE` clause in the [Cypher manual](https://memgraph.com/docs/querying/clauses/merge).
+
+## Set or update properties and labels
+
+The [`set_()`](../reference/gqlalchemy/query_builders/declarative_base.md#set_) method is used to set labels on nodes, and properties on nodes and relationships. When being set, labels and properties can be updated or created, depending on the operator used as the argument of the `set_()` method.
+
+### Set a property
+
+To **set a property** of a graph object, use the **assignment operator** from the query builder or a simple equals sign as a string - `"="`.
+
+
+
+
+```python
+from gqlalchemy import create
+from gqlalchemy.query_builders.memgraph_query_builder import Operator
+
+query = (
+    create()
+    .node(labels="Country", variable="c", name="Germany")
+    .set_(item="c.population", operator=Operator.ASSIGNMENT, literal=83000001)
+    .execute()
+)
+```
+
+
+
+
+```cypher
+CREATE (c:Country {name: 'Germany'}) SET c.population = 83000001;
+```
+
+
+
+
+!!! info
+    `Operator` is an enumeration class defined in the
+    [`declarative_base.py`](https://github.com/memgraph/gqlalchemy/blob/main/gqlalchemy/query_builders/declarative_base.py#L84-L94). It can be imported from `gqlalchemy.query_builders.memgraph_query_builder`.
+
+    If you don't want to import it, you can use strings `"="`, `">="`, `">"`, `"<>"`, `":"`, `"<"`, `"<="`, `"!="` or `"+="` instead.
+
+
+To **set a property of an already existing node**, first match the node and then set its property.
+ + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +query = ( + match() + .node(labels="Country", variable="c", name="Germany") + .set_(item="c.population", operator=Operator.ASSIGNMENT, literal=10000) + .execute() +) +``` + + + + +```cypher +MATCH (c:Country {name: 'Germany'}) SET c.population = 10000; +``` + + + + +To **set multiple properties of a node**, run the following code: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +query = ( + match() + .node(variable="n") + .where(item="n.name", operator="=", literal="Germany") + .set_(item="n.population", operator=Operator.ASSIGNMENT, literal=83000001) + .set_(item="n.capital", operator=Operator.ASSIGNMENT, literal="Berlin") + .execute() +) +``` + + + + +```cypher +MATCH (n) WHERE n.name = 'Germany' SET n.population = 83000001 SET n.capital = 'Berlin'; +``` + + + + +If a node already has the properties we are setting, they will be updated to a new value. Otherwise, the properties will be created and their value will be set. + +### Set a label + +To **set a label of a node**, run the following code: + + + + + +```python +from gqlalchemy import Match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +query = Match() + .node(variable="c", name="Germany") + .set_(item="c", operator=Operator.LABEL_FILTER, expression="Land") + .return_() + .execute() +``` + + + + +```cypher +MATCH (c {name: 'Germany'}) SET c:Land RETURN *; +``` + + + + +If a node already has a label, then it will have both old and new label. + +### Replace all properties + +With Cypher, it is possible to **replace all properties using a map** within a `SET` clause. Here is how to do it with query builder: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +query = ( + match() + .node(variable="c", labels="Country") + .where(item="c.name", operator="=", literal="Germany") + .set_( + item="c", + operator=Operator.ASSIGNMENT, + literal={"country_name": "Germany", "population": 85000000}, + ) + .execute() +) +``` + + + +```cypher +MATCH (c:Country) WHERE c.name = 'Germany' SET c = {country_name: 'Germany', population: 85000000}; +``` + + + + +The properties that are not a part of the graph objects, but are in the map, will be set. The properties that are not in the map, but are a part of the graph objects, will be removed. If a property is both in map and a graph object property, it will be updated to a new value set in map. + +### Update all properties + +With Cypher, it is also possible to **update all properties using a map** within a `SET` clause by using the **increment operator** (`+=`). Here is how to do it with query builder: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +query = ( + match() + .node(variable="c", labels="Country") + .where(item="c.country_name", operator="=", literal="Germany") + .set_( + item="c", + operator=Operator.INCREMENT, + literal={"population": "85000000"}, + ) + .execute() +) +``` + + + +```cypher +MATCH (c:Country) WHERE c.country_name = 'Germany' SET c += {population: '85000000'}; +``` + + + + +All the properties in the map (value of the `literal` argument) that are on a graph object will be updated. The properties that are not on a graph object but are in the map will be added. 
Properties that are not present in the map will be left as is. + +## Filter data + +You can use the methods [`where()`](../reference/gqlalchemy/query_builders/declarative_base.md#where), [`where_not()`](../reference/gqlalchemy/query_builders/declarative_base.md#where_not), [`or_where()`](../reference/gqlalchemy/query_builders/declarative_base.md#or_where), +[`or_where_not()`](../reference/gqlalchemy/query_builders/declarative_base.md#or_where_node), [`and_where()`](../reference/gqlalchemy/query_builders/declarative_base.md#and_where), [`and_where_not()`](../reference/gqlalchemy/query_builders/declarative_base.md#and_where_not), [`xor_where()`](../reference/gqlalchemy/query_builders/declarative_base.md#xor_where) and +[`xor_where_not()`](../reference/gqlalchemy/query_builders/declarative_base.md#xor_where_not) to construct queries that will filter data. + + + +### Filter data by property comparison + +To **filter data by comparing properties** of two nodes, run the following code: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +results = list( + match() + .node(labels="Person", variable="p1") + .to(relationship_type="FRIENDS_WITH") + .node(labels="Person", variable="p2") + .where(item="p1.name", operator=Operator.LESS_THAN, expression="p2.name") + .return_() + .execute() +) + +print(results) +``` + + + + +```cypher +MATCH (p1:Person)-[:FRIENDS_WITH]->(p2:Person) WHERE p1.name < p2.name RETURN *; +``` + + + + +Keyword arguments that can be used in filtering methods are `literal` and `expression`. Usually we use `literal` for property values and `expression` for property names and labels. That is because property names and labels shouldn't be quoted in Cypher statements. + +!!! info + You will probably see the `GQLAlchemySubclassNotFoundWarning` warning. This happens if you did not define a Python class which maps to a graph object in the database. To do that, check the [object graph mapper how-to guide](ogm.md). To ignore such warnings, you can do the following before query execution: + + ```python + from gqlalchemy import models + + models.IGNORE_SUBCLASSNOTFOUNDWARNING = True + ``` + +Standard boolean operators like `NOT`, `AND`, `OR` and `XOR` are used in the +Cypher query language. To have `NOT` within `WHERE` clause, you need to use +`where_not()` method. + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +results = list( + match() + .node(labels="Person", variable="p1") + .to(relationship_type="FRIENDS_WITH") + .node(labels="Person", variable="p2") + .where_not(item="p1.name", operator=Operator.LESS_THAN, expression="p2.name") + .return_() + .execute() +) + +print(results) +``` + + + + +```cypher +MATCH (p1:Person)-[:FRIENDS_WITH]->(p2:Person) WHERE NOT p1.name < p2.name RETURN *; +``` + + + + +In a similar way, you can use `AND` and `AND NOT` clauses which correspond to +the methods `and_where()` and `and_not_where()`. Using the query below you can +find all persons with the same `address` and `last_name`, but different +`name`. 
+ + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +results = list( + match() + .node(labels="Person", variable="p1") + .to(relationship_type="FRIENDS_WITH") + .node(labels="Person", variable="p2") + .where(item="p1.address", operator=Operator.EQUAL, expression="p2.address") + .and_where(item="p1.last_name", operator=Operator.EQUAL, expression="p2.last_name") + .and_not_where(item="p1.name", operator=Operator.EQUAL, expression="p2.name") + .return_() + .execute() +) + +print(results) +``` + + + + +```cypher +MATCH (p1:Person)-[:FRIENDS_WITH]->(p2:Person) +WHERE p1.address = p2.address +AND p1.last_name = p2.last_name +AND NOT p1.name = p2.name +RETURN *; +``` + + + + +The same goes for the `OR`, `OR NOT`, `XOR` and `XOR NOT` clauses, which +correspond to the methods `or_where()`, `or_not_where()`, `xor_where()` and +`xor_not_where()`. + +### Filter data by property value + +You can **filter data by comparing the property of a graph object to some value** (a +literal). Below you can see how to compare `age` property of a node to the +integer. + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +results = list( + match() + .node(labels="Person", variable="p") + .where(item="p.age", operator=Operator.GREATER_THAN, literal=18) + .return_() + .execute() +) +``` + + + + +```cypher +MATCH (p:Person) WHERE p.age > 18 RETURN *; +``` + + + + +The third keyword argument is `literal` since we wanted the property `age` to be saved as an integer. If we used `expression` keyword argument instead of `literal`, then the `age` property would be a string (it would be quoted in Cypher query). Instead of `Operator.GREATER_THAN`, a simple string of value `">"` can be used. + +Just like in [property comparison](#filter-data-by-property-comparison), it is possible to use different boolean operators to further filter the data. + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +results = list( + match() + .node(labels="Person", variable="p") + .where(item="p.age", operator=Operator.GREATER_THAN, literal=18) + .or_where(item="p.name", operator=Operator.EQUAL, literal="John") + .return_() + .execute() +) +``` + + + + +```cypher +MATCH (p:Person) WHERE p.age > 18 OR p.name = "John" RETURN *; +``` + + + + +The `literal` keyword is used again since you want `John` to be quoted in the +Cypher query (to be saved as a string in the database). + +### Filter data by label + +Nodes can be filtered by their label using the `WHERE` clause instead of +specifying it directly in the `MATCH` clause. You have to use `expression` as +the third keyword argument again since you don't want the quotes surrounding the +label in the Cypher clause. + +To **filter data by label** use the following code: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Operator + +results = list( + match() + .node(variable="p") + .where(item="p", operator=Operator.LABEL_FILTER, expression="Person") + .return_() + .execute() +) + +print(results) +``` + + + + +```cypher +MATCH (p) WHERE p:Person RETURN *; +``` + + + + +Just like in [property comparison](#filter-data-by-property-comparison), it is possible to use different boolean operators to further filter the data. 
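+
+For example, the sketch below (assuming the `Person` nodes from the previous examples exist in the database) combines a label filter with a property filter using `and_where()`:
+
+```python
+from gqlalchemy import match
+from gqlalchemy.query_builders.memgraph_query_builder import Operator
+
+# Keep only nodes labeled Person, then narrow the results down to adults.
+results = list(
+    match()
+    .node(variable="p")
+    .where(item="p", operator=Operator.LABEL_FILTER, expression="Person")
+    .and_where(item="p.age", operator=Operator.GREATER_THAN, literal=18)
+    .return_()
+    .execute()
+)
+
+print(results)
+```
+
+The query above corresponds to the following Cypher statement:
+
+```cypher
+MATCH (p) WHERE p:Person AND p.age > 18 RETURN *;
+```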
+ +## Return results + +You can use the methods [`return_()`](../reference/gqlalchemy/query_builders/declarative_base.md#return_), [`limit()`](../reference/gqlalchemy/query_builders/declarative_base.md#limit), [`skip()`](../reference/gqlalchemy/query_builders/declarative_base.md#skip) and [`order_by()`](../reference/gqlalchemy/query_builders/declarative_base.md#order_by) to +construct queries that will return data from the database. + +### Return all variables from a query + +To **return all the variables from a query**, use the `return_()` method at the +end of the query: + + + + +```python +from gqlalchemy import match + +results = list(match().node(labels="Person", variable="p").return_().execute()) +print(results) +``` + + + + +```cypher +MATCH (p:Person) RETURN *; +``` + + + + +### Return specific variables from a query + +To **return only a subset of variables** from a query, specify them in the +`return_()` method: + + + + +```python +from gqlalchemy import match + +results = list( + match() + .node(labels="Person", variable="p1") + .to() + .node(labels="Person", variable="p2") + .return_(results=[("p1", "first"), "p2"]) + .execute() +) + +for result in results: + print("Here is one pair:") + print(result["first"]) + print(result["p2"]) +``` + + + + +```cypher +MATCH (p1:Person)-[]->(p2:Person) RETURN p1 AS first, p2; +``` + + + + +### Limit the number of returned results + +To **limit the number of returned results**, use the `limit()` method after the +`return_()` method: + + + + +```python +from gqlalchemy import match + +results = list(match().node(labels="Person", variable="p").return_().limit(3).execute()) +print(results) +``` + + + + +```cypher +MATCH (p:Person) RETURN * LIMIT 3; +``` + + + + +### Order the returned results + +The default ordering in the Cypher query language is ascending (`ASC` or +`ASCENDING`), and if you want the descending order, you need to add the `DESC` +or `DESCENDING` keyword to the `ORDER BY` clause. + +To **order the return results by one value**, use the `order_by(properties)` method, +where `properties` can be a string (a property) or a tuple of two strings (a +property and an order). + +The following query will order the results in an ascending (default) order by +the property `name` of a node. + + + + +```python +from gqlalchemy import match + +results = list( + match().node(variable="n").return_().order_by(properties="n.name").execute() +) +print(results) + +``` + + + + +```cypher +MATCH (n) RETURN * ORDER BY n.name; +``` + + + + +You can also emphasize that you want an ascending order: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Order + +results = list( + match() + .node(variable="n") + .return_() + .order_by(properties=("n.name", Order.ASC)) + .execute() +) +print(results) +``` + + + + +```cypher +MATCH (n) RETURN * ORDER BY n.name ASC; +``` + + + + +The same can be done with the keyword `ASCENDING`: + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Order + +results = list( + match() + .node(variable="n") + .return_() + .order_by(properties=("n.name", Order.ASCENDING)) + .execute() +) +print(results) +``` + + + + +```cypher +MATCH (n) RETURN * ORDER BY n.name ASCENDING; +``` + + + + +!!! info + `Order` is an enumeration class defined in the + [`declarative_base.py`](https://github.com/memgraph/gqlalchemy/blob/main/gqlalchemy/query_builders/declarative_base.py#L97-L101). 
It can be imported from `gqlalchemy.query_builders.memgraph_query_builder`. + + If you don't want to import it, you can use strings `"ASC"`, `"ASCENDING"`, `"DESC"` or `"DESCENDING"` instead. + +To order the query results in descending order, you need to specify the `DESC` +or `DESCENDING` keyword. Hence, the argument of the `order_by()` method must be +a tuple. + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Order + +results = list( + match() + .node(variable="n") + .return_() + .order_by(properties=("n.name", Order.DESC)) + .execute() +) + +print(results) +``` + + + + +```cypher +MATCH (n) RETURN * ORDER BY n.name DESC; +``` + + + + +Similarly, you can use `Order.DESCENDING` to get `DESCENDING` keyword in `ORDER BY` clause. + +### Order by a list of values + +To **order the returned results by more than one value**, use the +`order_by(properties)` method, where `properties` can be a list of strings or +tuples of strings (list of properties with or without order). + +The following query will order the results in ascending order by the property +`id`, then again in ascending (default) order by the property `name` of a node. +After that, it will order the results in descending order by the property +`last_name`, then in ascending order by the property `age` of a node. Lastly, +the query will order the results in descending order by the node property +`middle_name`. + + + + +```python +from gqlalchemy import match +from gqlalchemy.query_builders.memgraph_query_builder import Order + +results = list( + match() + .node(variable="n") + .return_() + .order_by( + properties=[ + ("n.id", Order.ASC), + "n.name", + ("n.last_name", Order.DESC), + ("n.age", Order.ASCENDING), + ("n.middle_name", Order.DESCENDING), + ] + ) + .execute() +) + +print(results) +``` + + + + +```cypher +MATCH (n) +RETURN * +ORDER BY n.id ASC, n.name, n.last_name DESC, n.age ASCENDING, n.middle_name DESCENDING; +``` + + + + +## Delete and remove objects + +You can use the methods [`delete()`](../reference/gqlalchemy/query_builders/declarative_base.md#delete) and [`remove()`](../reference/gqlalchemy/query_builders/declarative_base.md#remove) to construct queries that will +remove nodes and relationships or properties and labels. 
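+
+If a node still has relationships attached, a plain `DELETE` will fail, so Cypher offers `DETACH DELETE`. Below is a minimal sketch, assuming the `delete()` method accepts a `detach` flag (check the reference for your GQLAlchemy version); the subsections that follow cover the plain cases.
+
+```python
+from gqlalchemy import match
+
+# Assumption: detach=True maps to DETACH DELETE, removing the node
+# together with all of its relationships.
+match().node(labels="Person", name="Harry", variable="p").delete(
+    variable_expressions="p", detach=True
+).execute()
+```
+
+The generated query would correspond to `MATCH (p:Person {name: 'Harry'}) DETACH DELETE p;`.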
+ +### Delete a node + +To **delete a node** from the database, use the `delete()` method: + + + + +```python +from gqlalchemy import match + +match().node(labels="Person", name="Harry", variable="p").delete( + variable_expressions="p" +).execute() +``` + + + + +```cypher +MATCH (p:Person {name: 'Harry'}) DELETE p; +``` + + + + +### Delete a relationship + +To **delete a relationship** from the database, use the `delete()` method: + + + + +```python +from gqlalchemy import match + +match().node(labels="Person", name="Leslie").to( + relationship_type="FRIENDS_WITH", variable="f" +).node(labels="Person").delete(variable_expressions="f").execute() +``` + + + + +```cypher +MATCH (:Person {name: 'Leslie'})-[f:FRIENDS_WITH]->(:Person) DELETE f; +``` + + + + +### Remove properties + +To remove a property (or properties) from the database, use the `remove()` method: + + + + +```python +from gqlalchemy import match + +match().node(labels="Person", name="Jane", variable="p").remove( + items=["p.name", "p.last_name"] +).execute() +``` + + + + +```cypher +MATCH (p:Person {name: 'Jane'}) REMOVE p.name, p.last_name; +``` + + + + + +## Call procedures + +You can use the methods [`call()`](../reference/gqlalchemy/query_builders/declarative_base.md#call) and [`yield_()`](../reference/gqlalchemy/query_builders/declarative_base.md#yield_) to construct queries that will +call procedure and return results from them. + +### Call procedure with no arguments + +To call a procedure with no arguments, don't specify the arguments in the +`call()` method: + + + + +```python +from gqlalchemy import call + +results = list(call("pagerank.get").yield_().return_().execute()) +print(results) +``` + + + + +```cypher +CALL pagerank.get() YIELD * RETURN *; +``` + + + + +### Call procedure with arguments + +To call a procedure with arguments, specify the arguments as a string in the +`call()` method: + + + + +```python +from gqlalchemy import call + +results = list( + call( + "json_util.load_from_url", + "'https://download.memgraph.com/asset/mage/data.json'", + ) + .yield_("objects") + .return_(results="objects") + .execute() +) + +print("Load from URL with argument:", results, "\n") +``` + + + + +```cypher +CALL json_util.load_from_url('https://download.memgraph.com/asset/mage/data.json') +YIELD objects +RETURN objects; +``` + + + + +
+ Code example using all of the above mentioned queries + +```python +from gqlalchemy import create, merge, Memgraph, match, models, call +from gqlalchemy.query_builders.memgraph_query_builder import Operator, Order + + +db = Memgraph() +# clean database +db.drop_database() + +# create nodes and a relationship between them + +create().node(labels="Person", name="Leslie").to(relationship_type="FRIENDS_WITH").node( + labels="Person", name="Ron" +).execute() + + +# merge a node +merge().node(labels="Person", name="Leslie").execute() + +# create nodes and a relationship between them +create().node( + labels="Person", name="Jane", last_name="James", address="street", age=19 +).from_(relationship_type="FRIENDS_WITH", since="2023-02-16").node( + labels="Person", name="John", last_name="James", address="street", age=8 +).execute() + + +# merge a relationship between existing nodes + +match().node(labels="Person", name="Leslie", variable="leslie").match().node( + labels="Person", name="Ron", variable="ron" +).merge().node(variable="leslie").to(relationship_type="FRIENDS_WITH").node( + variable="ron" +).execute() + + +# set a property +create().node(labels="Country", variable="c", name="Germany").set_( + item="c.population", operator=Operator.ASSIGNMENT, literal=83000001 +).execute() + +# update a property +match().node(labels="Country", variable="c", name="Germany").set_( + item="c.population", operator=Operator.ASSIGNMENT, literal=10000 +).execute() + + +# update multiple properties +match().node(variable="n").where(item="n.name", operator="=", literal="Germany").set_( + item="n.population", operator=Operator.ASSIGNMENT, literal=83000001 +).set_(item="n.capital", operator=Operator.ASSIGNMENT, literal="Berlin").execute() + + +# replace all properties +match().node(variable="c", labels="Country").where( + item="c.name", operator="=", literal="Germany" +).set_( + item="c", + operator=Operator.ASSIGNMENT, + literal={"country_name": "Germany", "population": 85000000}, +).execute() + + +# update multiple properties + +match().node(variable="c", labels="Country").where( + item="c.country_name", operator="=", literal="Germany" +).set_( + item="c", + operator=Operator.INCREMENT, + literal={"population": "85000000"}, +).execute() + + +models.IGNORE_SUBCLASSNOTFOUNDWARNING = True + +results = list( + match() + .node(labels="Person", variable="p1") + .to(relationship_type="FRIENDS_WITH") + .node(labels="Person", variable="p2") + .where(item="p1.name", operator=Operator.LESS_THAN, expression="p2.name") + .return_() + .execute() +) + +print("Filter by property comparison:", results, "\n") + +results = list( + match() + .node(labels="Person", variable="p1") + .to(relationship_type="FRIENDS_WITH") + .node(labels="Person", variable="p2") + .where_not(item="p1.name", operator=Operator.LESS_THAN, expression="p2.name") + .return_() + .execute() +) + +print("Filter by property comparison (negation):", results, "\n") + +results = list( + match() + .node(labels="Person", variable="p1") + .to(relationship_type="FRIENDS_WITH") + .node(labels="Person", variable="p2") + .where(item="p1.address", operator=Operator.EQUAL, expression="p2.address") + .and_where(item="p1.last_name", operator=Operator.EQUAL, expression="p2.last_name") + .and_not_where(item="p1.name", operator=Operator.EQUAL, expression="p2.name") + .return_() + .execute() +) + +print("Filter by property comparison + logical operators:", results, "\n") + +results = list( + match() + .node(labels="Person", variable="p") + .where(item="p.age", 
operator=Operator.GREATER_THAN, literal=18) + .return_() + .execute() +) + +print("Filter by property value:", results, "\n") + +results = list( + match() + .node(labels="Person", variable="p") + .where(item="p.age", operator=Operator.GREATER_THAN, literal=18) + .or_where(item="p.name", operator=Operator.EQUAL, literal="John") + .return_() + .execute() +) + +print("Filter by property value + logical operators:", results, "\n") + +results = list( + match() + .node(variable="p") + .where(item="p", operator=Operator.LABEL_FILTER, expression="Person") + .return_() + .execute() +) + +print("Filter by label:", results, "\n") + + +results = list(match().node(labels="Person", variable="p").return_().execute()) +print("Return all:", results, "\n") + +results = list( + match() + .node(labels="Person", variable="p1") + .to() + .node(labels="Person", variable="p2") + .return_(results=[("p1", "first"), "p2"]) + .execute() +) + +for result in results: + print("Here is one pair:") + print(result["first"]) + print(result["p2"]) + +print() + +results = list(match().node(labels="Person", variable="p").return_().limit(3).execute()) +print("Limit results:", results, "\n") + + +results = list( + match().node(variable="n").return_().order_by(properties="n.name").execute() +) +print("Order descending:", results, "\n") + +results = list( + match() + .node(variable="n") + .return_() + .order_by(properties=("n.name", Order.ASCENDING)) + .execute() +) +print("Order ascending:", results, "\n") + +results = list( + match() + .node(variable="n") + .return_() + .order_by(properties=("n.name", Order.DESC)) + .execute() +) + +print("Order descending with ordering:", results, "\n") + +results = list( + match() + .node(variable="n") + .return_() + .order_by( + properties=[ + ("n.id", Order.ASC), + "n.name", + ("n.last_name", Order.DESC), + ("n.age", Order.ASCENDING), + ("n.middle_name", Order.DESCENDING), + ] + ) + .execute() +) + +print("Mix of ordering:", results, "\n") + + +# create a node to delete +create().node(labels="Person", name="Harry").execute() + +# delete a node +match().node(labels="Person", name="Harry", variable="p").delete( + variable_expressions="p" +).execute() + +# delete a relationship between Leslie and her friends +match().node(labels="Person", name="Leslie").to( + relationship_type="FRIENDS_WITH", variable="f" +).node(labels="Person").delete(variable_expressions="f").execute() + +# remove name and last_name properties from Jane +match().node(labels="Person", name="Jane", variable="p").remove( + items=["p.name", "p.last_name"] +).execute() + +# calculate PageRank +results = list(call("pagerank.get").yield_().return_().execute()) +print("PageRank:", results, "\n") + +# Load JSON from URL with arguments +results = list( + call( + "json_util.load_from_url", + "'https://download.memgraph.com/asset/mage/data.json'", + ) + .yield_("objects") + .return_(results="objects") + .execute() +) + +print("Load from URL with argument:", results, "\n") +``` +
+ +## Load CSV file + +To load a CSV file using query builder, use the `load_csv()` procedure. Here is an example CSV file: +``` +id,name,age,city +100,Daniel,30,London +101,Alex,15,Paris +102,Sarah,17,London +103,Mia,25,Zagreb +104,Lucy,21,Paris +``` + +To load it, run the following code: + +```python +from gqlalchemy import load_csv, Memgraph +from gqlalchemy.utilities import CypherVariable + +db = Memgraph() + +load_csv( + path="/path-to/people_nodes.csv", header=True, row="row" + ) + .create() + .node( + variable="n", + labels="Person", + id=CypherVariable(name="row.id"), + name=CypherVariable(name="row.name"), + age=CypherVariable(name="ToInteger(row.age)"), + city=CypherVariable(name="row.city"), + ) + .execute() +``` + +>Hopefully, this guide has taught you how to properly use GQLAlchemy query builder. If you +>have any more questions, join our community and ping us on [Discord](https://discord.gg/memgraph). + diff --git a/docs/how-to-guides/query-builder/graph-projection.md b/docs/how-to-guides/query-builder/graph-projection.md new file mode 100644 index 00000000..ddde895d --- /dev/null +++ b/docs/how-to-guides/query-builder/graph-projection.md @@ -0,0 +1,68 @@ +# How to create a graph projection + +[![Related - +How-to](https://img.shields.io/static/v1?label=Related&message=How-to&color=blue&style=for-the-badge)](https://memgraph.com/docs/advanced-algorithms/run-algorithms) +[![Related - Under the +Hood](https://img.shields.io/static/v1?label=Related&message=Under%20the%20hood&color=orange&style=for-the-badge)](https://memgraph.com/blog/how-we-designed-and-implemented-graph-projection-feature) + +As subgraphs are mainly used with Memgraph's query modules (graph algorithms), +`QueryBuilder`'s `call()` method enables specifying the subgraph to use with a certain algorithm. + +To call a procedure named `test_query_module` with argument `"arg"`, and run +it on a subgraph containing only nodes with label `:LABEL` and their mutual +relationships build the following query: + +```Python +from gqlalchemy import QueryBuilder + +label = "LABEL" + +query_builder = QueryBuilder().call(procedure="test_query_module", + arguments=("arg"), node_labels=label) + +query_builder.execute() +``` + +The above code executes the following Cypher query: +```Cypher +MATCH p=(a)-->(b) +WHERE (a:LABEL) +AND (b:LABEL) +WITH project(p) AS graph +CALL test_query_module(graph, 'arg') +``` + +`WHERE` and `AND` clauses are used to allow for more generalization. To expand +on this code you can use multiple relationship types and node +labels. Node labels and relationship types can be passed as a single string, in +which case that string is used for all labels or types. To specify different +labels and types for entities on a path, you need to pass a list of lists, +containing a list of labels for every node on a path, and likewise for relationships. 
You can use this as following: + +```Python +node_labels = [["COMP", "DEVICE"], ["USER"], ["SERVICE", "GATEWAY"]] +relationship_types = [["OWNER", "RENTEE"], ["USES", "MAKES"]] +relationship_directions = [RelationshipDirection.LEFT, RelationshipDirection.RIGHT] +arguments = ("arg0", 5) + +query_builder = QueryBuilder().call(procedure="test_query_module", + arguments = arguments, + node_labels=node_labels, + relationship_types=relationship_types, + relationship_directions=relationship_directions) + +query_builder.execute() +``` + +The above code executes the following Cypher query: +```Cypher +MATCH p=(a)<-[:OWNER | :RENTEE]-(b)-[:USES | :MAKES]->(c) +WHERE (a:COMP or a:DEVICE) +AND (b:USER) +AND (c:SERVICE or c:GATEWAY) +WITH project(p) AS graph +CALL test_query_module(graph, "arg0", 5) +``` + +This query calls `test_query_module` on a subgraph containing all nodes labeled +`USER` that have an outgoing relationship of types either `OWNER` or `RENTEE` towards nodes labeled `COMP` or `DEVICE` and also a relationship of type `USES` or `MAKES` towards nodes labeled `SERVICE` or `GATEWAY`. diff --git a/docs/how-to-guides/streams/kafka-streams.md b/docs/how-to-guides/streams/kafka-streams.md new file mode 100644 index 00000000..cde425df --- /dev/null +++ b/docs/how-to-guides/streams/kafka-streams.md @@ -0,0 +1,62 @@ +# How to manage Kafka streams + +The stream functionality enables Memgraph to connect to a Kafka, Pulsar or +Redpanda cluster and run graph analytics on the data stream. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + + +## 1. Create a Kafka stream in Memgraph + +To set up the streams, first, create a `MemgraphKafkaStream` object with all the +required arguments: + +- `name: str` ➡ The name of the stream. +- `topics: List[str]` ➡ List of topic names. +- `transform: str` ➡ The transformation procedure for mapping incoming messages + to Cypher queries. +- `consumer_group: str` ➡ Name of the consumer group in Memgraph. +- `batch_interval: str = None` ➡ Maximum wait time in milliseconds for consuming + messages before calling the transform procedure. +- `batch_size: str = None` ➡ Maximum number of messages to wait for before + calling the transform procedure. +- `bootstrap_servers: str = None` ➡ Comma-separated list of bootstrap servers. + +Now you just have to call the `create_stream()` method with the newly created +`MemgraphKafkaStream` object: + +```python +from gqlalchemy import MemgraphKafkaStream + +stream = MemgraphKafkaStream(name="ratings_stream", topics=["ratings"], transform="movielens.rating", bootstrap_servers="localhost:9093") +db.create_stream(stream) +``` + +## 2. Start the stream + +To start the stream, just call the `start_stream()` method: + +```python +db.start_stream(stream) +``` + +## 3. Check the status of the stream + +To check the status of the stream in Memgraph, just run the following command: + +```python +check = db.get_streams() +``` + +## 4. 
Delete the stream + +You can use the `drop_stream()` method to delete a stream: + +```python +check = db.drop_stream(stream) +``` diff --git a/docs/how-to-guides/streams/pulsar-streams.md b/docs/how-to-guides/streams/pulsar-streams.md new file mode 100644 index 00000000..0650933b --- /dev/null +++ b/docs/how-to-guides/streams/pulsar-streams.md @@ -0,0 +1,61 @@ +# How to manage Pulsar streams + +The stream functionality enables Memgraph to connect to a Kafka, Pulsar or +Redpanda cluster and run graph analytics on the data stream. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + + +## 1. Create a Pulsar stream in Memgraph + +To set up the streams, first, create a `MemgraphPulsarStream` object with all +the required arguments: + +- `name: str` ➡ The name of the stream. +- `topics: List[str]` ➡ List of topic names. +- `transform: str` ➡ The transformation procedure for mapping incoming messages + to Cypher queries. +- `batch_interval: str = None` ➡ Maximum wait time in milliseconds for consuming + messages before calling the transform procedure. +- `batch_size: str = None` ➡ Maximum number of messages to wait for before + calling the transform procedure. +- `service_url: str = None` ➡ URL to the running Pulsar cluster. + +Now you just have to call the `create_stream()` method with the newly created +`MemgraphPulsarStream` object: + +```python +from gqlalchemy import MemgraphPulsarStream + +stream = MemgraphPulsarStream(name="ratings_stream", topics=["ratings"], transform="movielens.rating", service_url="localhost:6650") +db.create_stream(stream) +``` + +## 2. Start the stream + +To start the stream, just call the `start_stream()` method: + +```python +db.start_stream(stream) +``` + +## 3. Check the status of the stream + +To check the status of the stream in Memgraph, just run the following command: + +```python +check = db.get_streams() +``` + +## 4. Delete the stream + +You can use the `drop_stream()` method to delete a stream: + +```python +check = db.drop_stream(stream) +``` diff --git a/docs/how-to-guides/translators/export-python-graphs.md b/docs/how-to-guides/translators/export-python-graphs.md new file mode 100644 index 00000000..cff451e4 --- /dev/null +++ b/docs/how-to-guides/translators/export-python-graphs.md @@ -0,0 +1,192 @@ +# How to export data from Memgraph into Python graphs + +GQLAlchemy holds translators that can export Memgraph graphs into Python graphs ([NetworkX](https://networkx.org/), [PyG](https://pytorch-geometric.readthedocs.io/en/latest/) or [DGL](https://www.dgl.ai/) graphs). These translators create a Python graph instance from the graph stored in Memgraph. 
+ +[![docs-source](https://img.shields.io/badge/source-examples-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/tree/main/tests/transformations/translators) +[![docs-source](https://img.shields.io/badge/source-translators-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/tree/main/gqlalchemy/transformations/translators) +[![Related - Under the +hood](https://img.shields.io/static/v1?label=Related&message=Under%20the%20hood&color=orange&style=for-the-badge)](../../under-the-hood/python-graph-translators.md) + +In this guide you will learn how to: +- [**Export data from Memgraph into NetworkX graph**](#export-data-from-memgraph-into-networkx-graph) +- [**Export data from Memgraph into PyG graph**](#import-pyg-graph-into-memgraph) +- [**Export data from Memgraph into DGL graph**](#import-dgl-graph-into-memgraph) + +## General prerequisites +You need a running **Memgraph Platform instance**, which includes both the MAGE library and Memgraph Lab, a visual interface. To run the image, open a command-line interpreter and run the following Docker command: + +``` +docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 memgraph/memgraph-platform:latest +``` + +
+To export data from Memgraph, you first have to create a graph in Memgraph. To do that, expand this section and run the given Python script. + +```python +from gqlalchemy import Memgraph + +memgraph = Memgraph() +memgraph.drop_database() + +queries = [] +queries.append(f"CREATE (m:Node {{id: 1, num: 80, edem: 30, lst: [2, 3, 3, 2]}})") +queries.append(f"CREATE (m:Node {{id: 2, num: 91, edem: 32, lst: [2, 2, 3, 3]}})") +queries.append( + f"CREATE (m:Node {{id: 3, num: 100, edem: 34, lst: [3, 2, 2, 3, 4, 4]}})" +) +queries.append(f"CREATE (m:Node {{id: 4, num: 12, edem: 34, lst: [2, 2, 2, 3, 5, 5]}})") +queries.append( + f"MATCH (n:Node {{id: 1}}), (m:Node {{id: 2}}) CREATE (n)-[r:CONNECTION {{edge_id: 1, edge_num: 99, edge_edem: 12, edge_lst: [0, 1, 0, 1, 0, 1, 0, 1]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 2}}), (m:Node {{id: 3}}) CREATE (n)-[r:CONNECTION {{edge_id: 2, edge_num: 99, edge_edem: 12, edge_lst: [0, 1, 0, 1]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 3}}), (m:Node {{id: 4}}) CREATE (n)-[r:CONNECTION {{edge_id: 3, edge_num: 99, edge_edem: 12, edge_lst: [1, 0, 1, 0, 1, 0, 1]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 4}}), (m:Node {{id: 1}}) CREATE (n)-[r:CONNECTION {{edge_id: 4, edge_num: 99, edge_edem: 12, edge_lst: [0, 1, 0, 1]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 1}}), (m:Node {{id: 3}}) CREATE (n)-[r:CONNECTION {{edge_id: 5, edge_num: 99, edge_edem: 12, edge_lst: [0, 1, 0, 1]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 2}}), (m:Node {{id: 4}}) CREATE (n)-[r:CONNECTION {{edge_id: 6, edge_num: 99, edge_edem: 12, edge_lst: [0, 1, 0, 1, 0, 0]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 4}}), (m:Node {{id: 2}}) CREATE (n)-[r:CONNECTION {{edge_id: 7, edge_num: 99, edge_edem: 12, edge_lst: [1, 1, 0, 0, 1, 1, 0, 1]}}]->(m)" +) +queries.append( + f"MATCH (n:Node {{id: 3}}), (m:Node {{id: 1}}) CREATE (n)-[r:CONNECTION {{edge_id: 8, edge_num: 99, edge_edem: 12, edge_lst: [0, 1, 0, 1]}}]->(m)" +) + +for query in queries: + memgraph.execute(query) +``` + +
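+
+Optionally, you can check that the dataset is in place before exporting it. Here is a quick sanity check, assuming the script above was run against a Memgraph instance on the default `localhost:7687`:
+
+```python
+from gqlalchemy import Memgraph
+
+memgraph = Memgraph()
+
+# The script above creates 4 nodes and 8 relationships.
+print(next(memgraph.execute_and_fetch("MATCH (n) RETURN count(n) AS nodes")))
+print(next(memgraph.execute_and_fetch("MATCH ()-[r]->() RETURN count(r) AS relationships")))
+```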
+ +## Export data from Memgraph into NetworkX graph + +### Prerequisites + +Except for the [**general prerequisites**](#general-prerequisites), you also need to install [**NetworkX Python library**](https://pypi.org/project/networkx/). + +### Create and run a Python script + +Create a new Python script `memgraph-to-nx.py`, in the code editor of your choice, with the following code: + +```python +from gqlalchemy.transformations.translators.nx_translator import NxTranslator + +translator = NxTranslator() +graph = translator.get_instance() + +print(graph.number_of_edges()) +print(graph.number_of_nodes()) +``` + +To run it, open a command-line interpreter and run the following command: + +```python +python3 memgraph-to-nx.py +``` + +You will get the following output: +``` +8 +4 +``` + +This means that the NetworkX graph has the correct number of nodes and edges. You can explore it more to see if it has all the required features. + +## Export data from Memgraph into PyG graph + +### Prerequisites + +Except for the [**general prerequisites**](#general-prerequisites), you also need to install [**Pytorch Geometric Python library**](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html). + +### Create and run a Python script + +Create a new Python script `memgraph-to-pyg.py`, in the code editor of your choice, with the following code: + +```python +from gqlalchemy.transformations.translators.pyg_translator import PyGTranslator + +translator = PyGTranslator() +graph = translator.get_instance() + +print(len(graph.edge_types)) +print(len(graph.node_types)) + +source_node_label, edge_type, dest_node_label = ("Node", "CONNECTION", "Node") +can_etype = (source_node_label, edge_type, dest_node_label) +print(graph[source_node_label].num_nodes) +print(graph[can_etype].num_edges) +``` + +To run it, open a command-line interpreter and run the following command: + +```python +python3 memgraph-to-pyg.py +``` + +You will get the following output: +``` +1 +1 +4 +8 +``` + +This means that the PyG graph has the correct number of node and edge types, as well as correct total number of nodes and edges. You can explore it more to see if it has all the required features. + + +## Export data from Memgraph into DGL graph + +### Prerequisites + +Except for the [**general prerequisites**](#general-prerequisites), you also need to install [**Deep Graph Library**](https://www.dgl.ai/pages/start.html). + +### Create and run a Python script + +Create a new Python script `memgraph-to-dgl.py`, in the code editor of your choice, with the following code: + +```python +from gqlalchemy.transformations.translators.dgl_translator import DGLTranslator + +translator = DGLTranslator() +graph = translator.get_instance() + +print(len(graph.canonical_etypes)) +print(len(graph.ntypes)) + +source_node_label, edge_type, dest_node_label = ("Node", "CONNECTION", "Node") +can_etype = (source_node_label, edge_type, dest_node_label) +print(graph[can_etype].number_of_nodes()) +print(graph[can_etype].number_of_edges()) +print(len(graph.nodes[source_node_label].data.keys())) +print(len(graph.edges[(source_node_label, edge_type, dest_node_label)].data.keys())) +``` + +To run it, open a command-line interpreter and run the following command: + +```python +python3 memgraph-to-dgl.py +``` + +You will get the following output: +``` +1 +1 +4 +8 +3 +3 +``` + +This means that the DGL graph has the correct number of node and edge types, total number of nodes and edges, as well as node and edge features. 
You can explore it more to see if it has all the required features. + +## Learn more + +Head over to the [**Under the hood**](../../under-the-hood/python-graph-translators.md) section to read about implementation details. If you want to learn more about using NetworkX with Memgraph with interesting resources and courses, head over to the [**Memgraph for NetworkX developers**](https://memgraph.com/memgraph-for-networkx?utm_source=docs&utm_medium=referral&utm_campaign=networkx_ppp&utm_term=docsgqla%2Bhowto&utm_content=textlink) website. If you have any questions or want to connect with the Memgraph community, [**join our Discord server**](https://www.discord.gg/memgraph). diff --git a/docs/how-to-guides/translators/import-python-graphs.md b/docs/how-to-guides/translators/import-python-graphs.md new file mode 100644 index 00000000..270b6d68 --- /dev/null +++ b/docs/how-to-guides/translators/import-python-graphs.md @@ -0,0 +1,211 @@ +# How to import Python graphs into Memgraph + +GQLAlchemy holds translators that can import Python graphs ([NetworkX](https://networkx.org/), [PyG](https://pytorch-geometric.readthedocs.io/en/latest/) or [DGL](https://www.dgl.ai/) graphs) into Memgraph. These translators take the Python graph object and translate it to the appropriate Cypher queries. The Cypher queries are then executed to create a graph inside Memgraph. + +[![docs-source](https://img.shields.io/badge/source-examples-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/tree/main/tests/transformations/translators) +[![docs-source](https://img.shields.io/badge/source-translators-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/tree/main/gqlalchemy/transformations/translators) +[![Related - Under the +hood](https://img.shields.io/static/v1?label=Related&message=Under%20the%20hood&color=orange&style=for-the-badge)](../../under-the-hood/python-graph-translators.md) + +In this guide you will learn how to: +- [**Import NetworkX graph into Memgraph**](#import-networkx-graph-into-memgraph) +- [**Import PyG graph into Memgraph**](#import-pyg-graph-into-memgraph) +- [**Import DGL graph into Memgraph**](#import-dgl-graph-into-memgraph) + +## General prerequisites +You need a running **Memgraph Platform instance**, which includes both the MAGE library and Memgraph Lab, a visual interface. To run the image, open a command-line interpreter and run the following Docker command: + +``` +docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 memgraph/memgraph-platform:latest +``` + +## Import NetworkX graph into Memgraph + +### Prerequisites + +Except for the [**general prerequisites**](#general-prerequisites), you also need to install [**NetworkX Python library**](https://pypi.org/project/networkx/). + +### Create and run a Python script + +Create a new Python script `networkx-graph.py` in the code editor of your choice, with the following code: + +```python +import networkx as nx +from gqlalchemy import Memgraph +from gqlalchemy.transformations.translators.nx_translator import NxTranslator + +memgraph = Memgraph() +memgraph.drop_database() + +graph = nx.Graph() +graph.add_nodes_from([(1, {"labels": "First"}), (2, {"name": "Kata"}), 3]) +graph.add_edges_from([(1, 2, {"type": "EDGE_TYPE", "date": "today"}), (1, 3)]) + +translator = NxTranslator() + +for query in list(translator.to_cypher_queries(graph)): + memgraph.execute(query) +``` + +First, connect to a running Memgraph instance. Next, drop the database to be sure that it's empty. 
After that, create a simple NetworkX graph and add nodes and edges to it. In the end, call `to_cypher_queries` procedure on `NxTranslator` instance to transform the NetworkX graph to Cypher queries which will be executed in Memgraph. + +To run it, open a command-line interpreter and run the following command: + +```python +python3 networkx-graph.py +``` + +### Explore the graph + +[Connect to Memgraph](htps://memgraph.com/docs/data-visualization/install-and-connect) via Memgraph Lab which is running at `localhost:3000`. Open the **Query Execution** section and write the following query: + +```cypher +MATCH (n)-[r]->(m) +RETURN n, r, m; +``` + +Click **Run Query** button to see the results. + +networkx-example-1 + +The NetworkX node identification number maps to the `id` node property in Memgraph. The `labels` key is reserved for the node label in Memgraph, while the edge `type` key is reserved for the relationship type in Memgraph. If no `type` is defined, then the relationship will be of type `TO` in Memgraph. You can notice that the node with the property `name` Kata and property `id` 2 doesn't have a label. This happened because the node property key `labels` was not defined. + +## Import PyG graph into Memgraph + +### Prerequisites + +Except for the [**general prerequisites**](#general-prerequisites), you also need to install [**Pytorch Geometric Python library**](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html). + +### Create and run a Python script + +Create a new Python script `pyg-graph.py` in the code editor of your choice, with the following code: + +```python +import torch +from gqlalchemy import Memgraph +from gqlalchemy.transformations.translators.pyg_translator import PyGTranslator +from torch_geometric.data import HeteroData + + +memgraph = Memgraph() +memgraph.drop_database() + +graph = HeteroData() + +graph[("user", "PLUS", "movie")].edge_index = torch.tensor( + [[0, 0, 1], [0, 1, 0]], dtype=torch.int32 +) +graph[("user", "MINUS", "movie")].edge_index = torch.tensor( + [[2], [1]], dtype=torch.int32 +) +# Set node features +graph["user"].prop1 = torch.randn(size=(3, 1)) +graph["user"].prop2 = torch.randn(size=(3, 1)) +graph["movie"].prop1 = torch.randn(size=(2, 1)) +graph["movie"].prop2 = torch.randn(size=(2, 1)) +graph["movie"].prop3 = torch.randn(size=(2, 1)) +graph["movie"].x = torch.randn(size=(2, 1)) +graph["movie"].y = torch.randn(size=(2, 1)) +# Set edge features +graph[("user", "PLUS", "movie")].edge_prop1 = torch.randn(size=(3, 1)) +graph[("user", "PLUS", "movie")].edge_prop2 = torch.randn(size=(3, 1)) +graph[("user", "MINUS", "movie")].edge_prop1 = torch.randn(size=(1, 1)) + +translator = PyGTranslator() + +for query in list(translator.to_cypher_queries(graph)): + memgraph.execute(query) +``` + +First, connect to a running Memgraph instance. Next, drop the database to be sure that it's empty. After that, create a simple PyG heterogeneous graph and add nodes and edges along with their features to it. The graph consist of three `user` nodes and two `movie` nodes, as well as two types of edges - `PLUS` and `MINUS`. The `edge_index` of a graph determines which nodes are connected by which edges. Provide a tensor, that is a multi-dimensional matrix, as a value of `edge_index`, to define edges. Each tensor element maps to one graph node - first row of matrix maps to `user`, while the second one to the `movie` nodes. 
Hence, `user` node 0 is connected to the `movie` node 0, `user` node 0 is connected to the `movie` node 1, and `user` node 1 is connected to the `movie` node 0, with edge of type `PLUS`. These integers are mapping to the values of the `pyg_id` nodes' property in Memgraph. Similarly, the edge of type `MINUS` is created between `user` node 2 and `movie` node 1. In the end, call `to_cypher_queries` procedure on `PyGTranslator` instance to transform the PysG graph to Cypher queries which will be executed in Memgraph. + +To run it, open a command-line interpreter and run the following command: + +```python +python3 pyg-graph.py +``` + +### Explore the graph + +[Connect to Memgraph](htps://memgraph.com/docs/data-visualization/install-and-connect) via Memgraph Lab which is running at `localhost:3000`. Open the **Query Execution** section and write the following query: + +```cypher +MATCH (n)-[r]->(m) +RETURN n, r, m; +``` + +Click **Run Query** button to see the results. + +pyg-example + +You can notice that we have nodes labeled with `user` and `movie` and relationships of type `PLUS` and `MINUS`. Besides that, nodes and relationships have randomized array properties as well as `pyg_id` property. + +## Import DGL graph into Memgraph + +### Prerequisites + +Except for the [**general prerequisites**](#general-prerequisites), you also need to install [**Deep Graph Library**](https://www.dgl.ai/pages/start.html). + +### Create and run a Python script + +Create a new Python script `dgl-graph.py` in the code editor of your choice, with the following code: + +```python +import numpy as np +import dgl +import torch +from gqlalchemy import Memgraph +from gqlalchemy.transformations.translators.dgl_translator import DGLTranslator + +memgraph = Memgraph() +memgraph.drop_database() + +graph = dgl.heterograph( + { + ("user", "PLUS", "movie"): (np.array([0, 0, 1]), np.array([0, 1, 0])), + ("user", "MINUS", "movie"): (np.array([2]), np.array([1])), + } +) +# Set node features +graph.nodes["user"].data["prop1"] = torch.randn(size=(3, 1)) +graph.nodes["user"].data["prop2"] = torch.randn(size=(3, 1)) +graph.nodes["movie"].data["prop1"] = torch.randn(size=(2, 1)) +graph.nodes["movie"].data["prop2"] = torch.randn(size=(2, 1)) +graph.nodes["movie"].data["prop3"] = torch.randn(size=(2, 1)) +# Set edge features +graph.edges[("user", "PLUS", "movie")].data["edge_prop1"] = torch.randn(size=(3, 1)) +graph.edges[("user", "PLUS", "movie")].data["edge_prop2"] = torch.randn(size=(3, 1)) +graph.edges[("user", "MINUS", "movie")].data["edge_prop1"] = torch.randn(size=(1, 1)) + +translator = DGLTranslator() + +for query in list(translator.to_cypher_queries(graph)): + memgraph.execute(query) +``` + +First, connect to a running Memgraph instance. Next, drop the database to be sure that it's is empty. After that, create a simple DGL heterogeneous graph and add nodes and edges along with their features to it. The graph consist of three `user` nodes and two `movie` nodes, as well as two types of edges - `PLUS` and `MINUS`. To define nodes and edge between them we are providing appropriate NumPy arrays. Hence, `user` node 0 is connected to the `movie` node 0, `user` node 0 is connected to the `movie` node 1, and `user` node 1 is connected to the `movie` node 0, with edge of type `PLUS`. These integers are mapping to the values of the `dgl_id` properties in Memgraph. Similarly, the edge of type `MINUS` is created between `user` node 2 and `movie` node 1. 
In the end, call the `to_cypher_queries()` method on the `DGLTranslator` instance to transform the DGL graph into Cypher queries that will be executed in Memgraph. + +To run it, open a command-line interpreter and run the following command: + +```bash +python3 dgl-graph.py +``` + +### Explore the graph + +[Connect to Memgraph](https://memgraph.com/docs/data-visualization/install-and-connect) via Memgraph Lab, which is running at `localhost:3000`. Open the **Query Execution** section and write the following query: + +```cypher +MATCH (n)-[r]->(m) +RETURN n, r, m; +``` + +Click the **Run Query** button to see the results. + +pyg-example + +You can notice that we have nodes labeled with `user` and `movie` and relationships of type `PLUS` and `MINUS`. Besides that, nodes and relationships have randomized array properties as well as the `dgl_id` property. + +## Learn more + +Head over to the [**Under the hood**](../../under-the-hood/python-graph-translators.md) section to read about implementation details. If you want to learn more about using NetworkX with Memgraph, including interesting resources and courses, head over to the [**Memgraph for NetworkX developers**](https://memgraph.com/memgraph-for-networkx?utm_source=docs&utm_medium=referral&utm_campaign=networkx_ppp&utm_term=docsgqla%2Bhowto&utm_content=textlink) website. If you have any questions or want to connect with the Memgraph community, [**join our Discord server**](https://www.discord.gg/memgraph). diff --git a/docs/how-to-guides/triggers/triggers.md b/docs/how-to-guides/triggers/triggers.md new file mode 100644 index 00000000..7b27e413 --- /dev/null +++ b/docs/how-to-guides/triggers/triggers.md @@ -0,0 +1,77 @@ +# How to manage database triggers + +Because Memgraph supports database triggers on `CREATE`, `UPDATE` and `DELETE` +operations, GQLAlchemy also implements a simple interface for maintaining these +triggers. + +!!! info + You can also use this feature with Neo4j: + + ```python + db = Neo4j(host="localhost", port="7687", username="neo4j", password="test") + ``` + + +## 1. Create the trigger + +To set up the trigger, first create a `MemgraphTrigger` object with all the +required arguments: +- `name: str` ➡ The name of the trigger. +- `event_type: TriggerEventType` ➡ The type of event that will trigger the + execution. The options are: `TriggerEventType.CREATE`, + `TriggerEventType.UPDATE` and `TriggerEventType.DELETE`. +- `event_object: TriggerEventObject` ➡ The objects that are affected by the + `event_type`. The options are: `TriggerEventObject.ALL`, + `TriggerEventObject.NODE` and `TriggerEventObject.RELATIONSHIP`. +- `execution_phase: TriggerExecutionPhase` ➡ The phase when the trigger should + be executed in regard to the transaction commit. The options are: `BEFORE` and + `AFTER`. +- `statement: str` ➡ The Cypher query that should be executed when the trigger + fires. + +Now, let's create a trigger in GQLAlchemy: + +```python +from gqlalchemy import Memgraph, MemgraphTrigger +from gqlalchemy.models import ( + TriggerEventType, + TriggerEventObject, + TriggerExecutionPhase, +) + +db = Memgraph() + +trigger = MemgraphTrigger( + name="ratings_trigger", + event_type=TriggerEventType.CREATE, + event_object=TriggerEventObject.NODE, + execution_phase=TriggerExecutionPhase.AFTER, + statement="UNWIND createdVertices AS node SET node.created_at = LocalDateTime()", +) + +db.create_trigger(trigger) +``` + +The trigger named `ratings_trigger` will be executed every time a node is +created in the database.
After the transaction that created the node in question +finishes, the Cypher query `statement` will execute, and in this case, it will +set the property `created_at` of the newly created node to the current date and +time. + +## 2. Check the status of a trigger + +You can return all of the triggers from the database with the `get_triggers()` +method: + +```python +triggers = db.get_triggers() +print(triggers) +``` + +## 3. Delete the trigger + +You can use the `drop_trigger()` method to delete a trigger: + +```python +db.drop_trigger(trigger) +``` diff --git a/docs/import-data.md b/docs/import-data.md new file mode 100644 index 00000000..2636e8c2 --- /dev/null +++ b/docs/import-data.md @@ -0,0 +1,45 @@ +# Import data + +You can import data in the following formats: +- [**CSV**](#csv) +- [**JSON**](#json) +- [**Parquet, ORC or IPC/Feather/Arrow**](#parquet-orc-or-ipcfeatherarrow) +- [**Python graphs - NetworkX, PyG or DGL graph**](#python-graphs---networkx-pyg-or-dgl-graph) +- [**Kafka, RedPanda or Pulsar data stream**](#kafka-redpanda-or-pulsar-data-stream) + +Besides that, you can create data directly from code using the [**object graph mapper**](how-to-guides/ogm.md) or [**query builder**](how-to-guides/query-builder.md). + + +!!! tip + The fastest way to import data into Memgraph is by using the [LOAD CSV clause](https://memgraph.com/docs/data-migration/csv). It's recommended to first [create indexes](https://memgraph.com/docs/fundamentals/indexes) using the `CREATE INDEX` clause. You can create them by [executing the Cypher query](https://memgraph.com/docs/client-libraries/python) or using the [object graph mapper](how-to-guides/ogm.md#create-indexes). + +## CSV + +To import a CSV file into Memgraph via GQLAlchemy, you can use the [`LOAD CSV` clause](https://memgraph.com/docs/data-migration/csv). That clause can be used by [executing the Cypher query](https://memgraph.com/docs/client-libraries/python) or by [building the query with the query builder](how-to-guides/query-builder.md#load-csv-file). Another way of importing CSV data into Memgraph is by [translating it into a graph](how-to-guides/loaders/import-table-data-to-graph-database.md). + +## JSON + +To import JSON files into Memgraph via GQLAlchemy, you can call procedures from the [`json_util` module](https://memgraph.com/docs/advanced-algorithms/available-algorithms/json_util) available in the MAGE library. If the JSON data is formatted in a particular style, you can call the [`import_util.json()` procedure](https://memgraph.com/docs/advanced-algorithms/available-algorithms/json_util#jsonpath) from MAGE. The procedures can be called by [executing Cypher queries](https://memgraph.com/docs/client-libraries/python) or [using the query builder](how-to-guides/query-builder.md#call-procedures). + + +## Parquet, ORC or IPC/Feather/Arrow + +To import a Parquet, ORC or IPC/Feather/Arrow file into Memgraph via GQLAlchemy, [transform table data from a file into a graph](how-to-guides/loaders/import-table-data-to-graph-database.md). + +!!! note + If you want to read from a file system not currently supported by GQLAlchemy, or use a file type currently not readable, you can implement your own by [making a custom file system importer](how-to-guides/loaders/make-a-custom-file-system-importer.md). + +## Python graphs - NetworkX, PyG or DGL graph + +To import a NetworkX, PyG or DGL graph into Memgraph via GQLAlchemy, [transform the source graph into a Memgraph graph](how-to-guides/translators/import-python-graphs.md), as in the sketch below.
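For illustration, a minimal sketch of importing a small NetworkX graph could look like the following (the node and relationship data are made up for the example):

```python
import networkx as nx
from gqlalchemy import Memgraph
from gqlalchemy.transformations.translators.nx_translator import NxTranslator

memgraph = Memgraph()

# Build a small NetworkX graph; the `labels` node key and the `type` edge key
# map to the Memgraph node label and relationship type (example data only).
nx_graph = nx.DiGraph()
nx_graph.add_node(1, labels="Person", name="Anna")
nx_graph.add_node(2, labels="Person", name="Kata")
nx_graph.add_edge(1, 2, type="FRIENDS_WITH")

# Translate the graph into Cypher queries and execute them in Memgraph.
for query in NxTranslator().to_cypher_queries(nx_graph):
    memgraph.execute(query)
```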
+ +## Kafka, RedPanda or Pulsar data stream + +To consume a Kafka, RedPanda or Pulsar data stream, you can write [appropriate Cypher queries](https://memgraph.com/docs/data-streams/manage-streams-query) and [execute](https://memgraph.com/docs/client-libraries/python) them, or use the GQLAlchemy stream manager for [Kafka, RedPanda](how-to-guides/streams/kafka-streams.md) or [Pulsar](how-to-guides/streams/pulsar-streams.md) streams. + + +## Learn more + +To learn how to utilize the GQLAlchemy library with Memgraph, check out the [how-to guides](how-to-guides/overview.md) or sign up for the [Getting started with Memgraph and Python course](https://app.livestorm.co/memgraph/getting-started-with-memgraph-and-python-on-demand). + + diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 00000000..000ea345 --- /dev/null +++ b/docs/index.md @@ -0,0 +1,17 @@ +# Welcome to MkDocs + +For full documentation visit [mkdocs.org](https://www.mkdocs.org). + +## Commands + +* `mkdocs new [dir-name]` - Create a new project. +* `mkdocs serve` - Start the live-reloading docs server. +* `mkdocs build` - Build the documentation site. +* `mkdocs -h` - Print help message and exit. + +## Project layout + + mkdocs.yml # The configuration file. + docs/ + index.md # The documentation homepage. + ... # Other markdown pages, images and other files. diff --git a/docs/installation.md b/docs/installation.md new file mode 100644 index 00000000..46868abd --- /dev/null +++ b/docs/installation.md @@ -0,0 +1,75 @@ +# How to install GQLAlchemy + +There are two main ways of installing GQLAlchemy: with package managers such +as pip and Poetry, and by building it from source. + +## Prerequisites + +To install GQLAlchemy, you will need the following: + +- **Python 3.8 - 3.10** +- GQLAlchemy is built on top of Memgraph's low-level Python client `pymgclient`, so you need to install `pymgclient` [build prerequisites](https://memgraph.github.io/pymgclient/introduction.html#build-prerequisites). + +!!! danger + GQLAlchemy can't be installed with Python 3.11 [(#203)](https://github.com/memgraph/gqlalchemy/issues/203) and on Windows with Python >= 3.10 [(#179)](https://github.com/memgraph/gqlalchemy/issues/179). If this is currently a blocker for you, please let us know by commenting on opened issues. + +## Install with pip {#pip} + +After you’ve installed the prerequisites, run the following command to install +GQLAlchemy: + +```bash +pip install gqlalchemy +``` + +With the above command, you get the default GQLAlchemy installation which +doesn’t include import/export support for certain formats (see below). To get +additional import/export capabilities, use one of the following install options: + +```bash +pip install gqlalchemy[arrow] # Support for the CSV, Parquet, ORC and IPC/Feather/Arrow formats +pip install gqlalchemy[dgl] # DGL support (also includes torch) + +pip install gqlalchemy[all] # All of the above +``` + +!!! note + If you are using the zsh terminal, you need to pass literal square brackets as an argument to a command: + ``` + pip install 'gqlalchemy[arrow]' + ``` + +## Build from source + +Clone or download the [GQLAlchemy source code](https://github.com/memgraph/gqlalchemy) locally and run the following command to build it from source with Poetry: + +```bash +poetry install --all-extras +``` + +The ``poetry install --all-extras`` command installs GQLAlchemy with all extras +(optional dependencies).
Alternatively, you can use the ``-E`` option to define +what extras to install: + +```bash +poetry install # No extras + +poetry install -E arrow # Support for the CSV, Parquet, ORC and IPC/Feather/Arrow formats +poetry install -E dgl # DGL support (also includes torch) + +``` + +To run the tests, make sure you have an [active Memgraph instance](https://memgraph.com/docs/getting-started/install-memgraph), and execute one of the following commands: + +```bash +poetry run pytest . -k "not slow" # If all extras installed + +poetry run pytest . -k "not slow and not extras" # Otherwise +``` + +If you’ve installed only certain extras, it’s also possible to run their associated tests: + +```bash +poetry run pytest . -k "arrow" +poetry run pytest . -k "dgl" +``` diff --git a/docs/reference/gqlalchemy/connection.md b/docs/reference/gqlalchemy/connection.md index a2606070..0b690491 100644 --- a/docs/reference/gqlalchemy/connection.md +++ b/docs/reference/gqlalchemy/connection.md @@ -1,8 +1,3 @@ ---- -sidebar_label: connection -title: gqlalchemy.connection ---- - ## Connection Objects ```python diff --git a/docs/reference/gqlalchemy/disk_storage.md b/docs/reference/gqlalchemy/disk_storage.md index 16dacf5c..87c4a885 100644 --- a/docs/reference/gqlalchemy/disk_storage.md +++ b/docs/reference/gqlalchemy/disk_storage.md @@ -1,8 +1,3 @@ ---- -sidebar_label: disk_storage -title: gqlalchemy.disk_storage ---- - ## OnDiskPropertyDatabase Objects ```python diff --git a/docs/reference/gqlalchemy/exceptions.md b/docs/reference/gqlalchemy/exceptions.md index bf8ada6f..722a21e9 100644 --- a/docs/reference/gqlalchemy/exceptions.md +++ b/docs/reference/gqlalchemy/exceptions.md @@ -1,8 +1,3 @@ ---- -sidebar_label: exceptions -title: gqlalchemy.exceptions ---- - #### connection\_handler ```python diff --git a/docs/reference/gqlalchemy/graph_algorithms/integrated_algorithms.md b/docs/reference/gqlalchemy/graph_algorithms/integrated_algorithms.md index b3ccec90..c0d43c8e 100644 --- a/docs/reference/gqlalchemy/graph_algorithms/integrated_algorithms.md +++ b/docs/reference/gqlalchemy/graph_algorithms/integrated_algorithms.md @@ -1,8 +1,3 @@ ---- -sidebar_label: integrated_algorithms -title: gqlalchemy.graph_algorithms.integrated_algorithms ---- - ## IntegratedAlgorithm Objects ```python diff --git a/docs/reference/gqlalchemy/graph_algorithms/query_builder.md b/docs/reference/gqlalchemy/graph_algorithms/query_builder.md index 4a668b6e..a946d6d8 100644 --- a/docs/reference/gqlalchemy/graph_algorithms/query_builder.md +++ b/docs/reference/gqlalchemy/graph_algorithms/query_builder.md @@ -1,8 +1,3 @@ ---- -sidebar_label: query_builder -title: gqlalchemy.graph_algorithms.query_builder ---- - ## MemgraphQueryBuilder Objects ```python diff --git a/docs/reference/gqlalchemy/graph_algorithms/query_modules.md b/docs/reference/gqlalchemy/graph_algorithms/query_modules.md index 487cd3c7..4091346c 100644 --- a/docs/reference/gqlalchemy/graph_algorithms/query_modules.md +++ b/docs/reference/gqlalchemy/graph_algorithms/query_modules.md @@ -1,8 +1,3 @@ ---- -sidebar_label: query_modules -title: gqlalchemy.graph_algorithms.query_modules ---- - ## QueryModule Objects ```python diff --git a/docs/reference/gqlalchemy/instance_runner.md b/docs/reference/gqlalchemy/instance_runner.md index 0692ba44..3078e8f1 100644 --- a/docs/reference/gqlalchemy/instance_runner.md +++ b/docs/reference/gqlalchemy/instance_runner.md @@ -1,8 +1,3 @@ ---- -sidebar_label: instance_runner -title: gqlalchemy.instance_runner ---- - #### wait\_for\_port ```python diff 
--git a/docs/reference/gqlalchemy/loaders.md b/docs/reference/gqlalchemy/loaders.md index afaa6b5a..4401065c 100644 --- a/docs/reference/gqlalchemy/loaders.md +++ b/docs/reference/gqlalchemy/loaders.md @@ -1,8 +1,3 @@ ---- -sidebar_label: loaders -title: gqlalchemy.loaders ---- - ## ForeignKeyMapping Objects ```python diff --git a/docs/reference/gqlalchemy/models.md b/docs/reference/gqlalchemy/models.md index ba1c9af0..62292523 100644 --- a/docs/reference/gqlalchemy/models.md +++ b/docs/reference/gqlalchemy/models.md @@ -1,8 +1,3 @@ ---- -sidebar_label: models -title: gqlalchemy.models ---- - ## TriggerEventType Objects ```python diff --git a/docs/reference/gqlalchemy/overview.md b/docs/reference/gqlalchemy/overview.md new file mode 100644 index 00000000..c4cbea09 --- /dev/null +++ b/docs/reference/gqlalchemy/overview.md @@ -0,0 +1,35 @@ +# GQLAlchemy Reference + +This are the topics covered in the GQLAlchemy Reference: + +- Connection +- Disk Storage +- Exceptions +- Instance Runner +- Loaders +- Models +- Transformations +- Utilities +- Graph Algorithms + - Integrated Algorithms + - Query Builder + - Query Modules +- Query Builders + - Declarative Base + - Memgraph Query Builder +- Transformations + - Export + - Graph Transporter + - Transporter + - Importing + - Graph Importer + - Loaders + - Translators + - DGL Translator + - NX Translator + - PyG Translator + - Translator +- Vendors + - Database Client + - Memgraph + - Neo4j \ No newline at end of file diff --git a/docs/reference/gqlalchemy/query_builders/declarative_base.md b/docs/reference/gqlalchemy/query_builders/declarative_base.md index f36d58ee..8280e39e 100644 --- a/docs/reference/gqlalchemy/query_builders/declarative_base.md +++ b/docs/reference/gqlalchemy/query_builders/declarative_base.md @@ -1,8 +1,3 @@ ---- -sidebar_label: declarative_base -title: gqlalchemy.query_builders.declarative_base ---- - ## WhereConditionPartialQuery Objects ```python diff --git a/docs/reference/gqlalchemy/query_builders/memgraph_query_builder.md b/docs/reference/gqlalchemy/query_builders/memgraph_query_builder.md index 5453e16f..f12c2c70 100644 --- a/docs/reference/gqlalchemy/query_builders/memgraph_query_builder.md +++ b/docs/reference/gqlalchemy/query_builders/memgraph_query_builder.md @@ -1,8 +1,3 @@ ---- -sidebar_label: memgraph_query_builder -title: gqlalchemy.query_builders.memgraph_query_builder ---- - ## QueryBuilder Objects ```python diff --git a/docs/reference/gqlalchemy/transformations.md b/docs/reference/gqlalchemy/transformations.md index b8067a0f..7cb31124 100644 --- a/docs/reference/gqlalchemy/transformations.md +++ b/docs/reference/gqlalchemy/transformations.md @@ -1,8 +1,3 @@ ---- -sidebar_label: transformations -title: gqlalchemy.transformations ---- - #### nx\_to\_cypher ```python diff --git a/docs/reference/gqlalchemy/transformations/export/graph_transporter.md b/docs/reference/gqlalchemy/transformations/export/graph_transporter.md index 7955690d..1dd40df1 100644 --- a/docs/reference/gqlalchemy/transformations/export/graph_transporter.md +++ b/docs/reference/gqlalchemy/transformations/export/graph_transporter.md @@ -1,8 +1,3 @@ ---- -sidebar_label: graph_transporter -title: gqlalchemy.transformations.export.graph_transporter ---- - ## GraphTransporter Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/export/transporter.md b/docs/reference/gqlalchemy/transformations/export/transporter.md index 1d6298c1..3cefd13c 100644 --- a/docs/reference/gqlalchemy/transformations/export/transporter.md +++ 
b/docs/reference/gqlalchemy/transformations/export/transporter.md @@ -1,8 +1,3 @@ ---- -sidebar_label: transporter -title: gqlalchemy.transformations.export.transporter ---- - ## Transporter Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/importing/graph_importer.md b/docs/reference/gqlalchemy/transformations/importing/graph_importer.md index 74eceedb..0a9edaed 100644 --- a/docs/reference/gqlalchemy/transformations/importing/graph_importer.md +++ b/docs/reference/gqlalchemy/transformations/importing/graph_importer.md @@ -1,8 +1,3 @@ ---- -sidebar_label: graph_importer -title: gqlalchemy.transformations.importing.graph_importer ---- - ## GraphImporter Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/importing/loaders.md b/docs/reference/gqlalchemy/transformations/importing/loaders.md index 4197d72f..d1fd8d2a 100644 --- a/docs/reference/gqlalchemy/transformations/importing/loaders.md +++ b/docs/reference/gqlalchemy/transformations/importing/loaders.md @@ -1,8 +1,3 @@ ---- -sidebar_label: loaders -title: gqlalchemy.transformations.importing.loaders ---- - ## ForeignKeyMapping Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/translators/dgl_translator.md b/docs/reference/gqlalchemy/transformations/translators/dgl_translator.md index fbe51a39..e88b1cda 100644 --- a/docs/reference/gqlalchemy/transformations/translators/dgl_translator.md +++ b/docs/reference/gqlalchemy/transformations/translators/dgl_translator.md @@ -1,8 +1,3 @@ ---- -sidebar_label: dgl_translator -title: gqlalchemy.transformations.translators.dgl_translator ---- - ## DGLTranslator Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/translators/nx_translator.md b/docs/reference/gqlalchemy/transformations/translators/nx_translator.md index c71457a9..75f15e85 100644 --- a/docs/reference/gqlalchemy/transformations/translators/nx_translator.md +++ b/docs/reference/gqlalchemy/transformations/translators/nx_translator.md @@ -1,8 +1,3 @@ ---- -sidebar_label: nx_translator -title: gqlalchemy.transformations.translators.nx_translator ---- - ## NetworkXCypherBuilder Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/translators/pyg_translator.md b/docs/reference/gqlalchemy/transformations/translators/pyg_translator.md index 9f186abc..1f6fd57d 100644 --- a/docs/reference/gqlalchemy/transformations/translators/pyg_translator.md +++ b/docs/reference/gqlalchemy/transformations/translators/pyg_translator.md @@ -1,8 +1,3 @@ ---- -sidebar_label: pyg_translator -title: gqlalchemy.transformations.translators.pyg_translator ---- - ## PyGTranslator Objects ```python diff --git a/docs/reference/gqlalchemy/transformations/translators/translator.md b/docs/reference/gqlalchemy/transformations/translators/translator.md index 5877ff3c..6f3da35d 100644 --- a/docs/reference/gqlalchemy/transformations/translators/translator.md +++ b/docs/reference/gqlalchemy/transformations/translators/translator.md @@ -1,8 +1,3 @@ ---- -sidebar_label: translator -title: gqlalchemy.transformations.translators.translator ---- - ## Translator Objects ```python diff --git a/docs/reference/gqlalchemy/utilities.md b/docs/reference/gqlalchemy/utilities.md index e522d67b..58776e7f 100644 --- a/docs/reference/gqlalchemy/utilities.md +++ b/docs/reference/gqlalchemy/utilities.md @@ -1,8 +1,3 @@ ---- -sidebar_label: utilities -title: gqlalchemy.utilities ---- - #### to\_cypher\_value ```python diff --git a/docs/reference/gqlalchemy/vendors/database_client.md 
b/docs/reference/gqlalchemy/vendors/database_client.md index eab44496..5286737c 100644 --- a/docs/reference/gqlalchemy/vendors/database_client.md +++ b/docs/reference/gqlalchemy/vendors/database_client.md @@ -1,8 +1,3 @@ ---- -sidebar_label: database_client -title: gqlalchemy.vendors.database_client ---- - ## DatabaseClient Objects ```python diff --git a/docs/reference/gqlalchemy/vendors/memgraph.md b/docs/reference/gqlalchemy/vendors/memgraph.md index 73b161f4..e0165231 100644 --- a/docs/reference/gqlalchemy/vendors/memgraph.md +++ b/docs/reference/gqlalchemy/vendors/memgraph.md @@ -1,8 +1,3 @@ ---- -sidebar_label: memgraph -title: gqlalchemy.vendors.memgraph ---- - ## Memgraph Objects ```python diff --git a/docs/reference/gqlalchemy/vendors/neo4j.md b/docs/reference/gqlalchemy/vendors/neo4j.md index 7fe42802..f4720333 100644 --- a/docs/reference/gqlalchemy/vendors/neo4j.md +++ b/docs/reference/gqlalchemy/vendors/neo4j.md @@ -1,8 +1,3 @@ ---- -sidebar_label: neo4j -title: gqlalchemy.vendors.neo4j ---- - ## Neo4j Objects ```python diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css new file mode 100644 index 00000000..aee34545 --- /dev/null +++ b/docs/stylesheets/extra.css @@ -0,0 +1,4 @@ +:root { + --md-primary-fg-color: #FB6E00; + --md-accent-fg-color: #720096; +} \ No newline at end of file diff --git a/docs/under-the-hood/data/networkx-example-1.png b/docs/under-the-hood/data/networkx-example-1.png new file mode 100644 index 00000000..f3a2160e Binary files /dev/null and b/docs/under-the-hood/data/networkx-example-1.png differ diff --git a/docs/under-the-hood/overview.md b/docs/under-the-hood/overview.md new file mode 100644 index 00000000..b8d89f69 --- /dev/null +++ b/docs/under-the-hood/overview.md @@ -0,0 +1,5 @@ +Look under the hood and have a glimpse at the inner workings of GQLAlchemy. If you +are advanced GQLAlchemy user or graph database enthusiast we hope you will +enjoy reading about the following topics: + + * [**Python graph translators**](python-graph-translators.md) diff --git a/docs/under-the-hood/python-graph-translators.md b/docs/under-the-hood/python-graph-translators.md new file mode 100644 index 00000000..9fbffc0f --- /dev/null +++ b/docs/under-the-hood/python-graph-translators.md @@ -0,0 +1,116 @@ +In this under the hood content you can learn more about GQLAlchemy **Python graph translators**. 
+ +[![Related - +How-to](https://img.shields.io/static/v1?label=Related&message=How%20to%20import&color=blue&style=for-the-badge)](../how-to-guides/translators/import-python-graphs.md) +[![Related - +How-to](https://img.shields.io/static/v1?label=Related&message=How%20to%20export&color=blue&style=for-the-badge)](../how-to-guides/translators/export-python-graphs.md) +[![docs-source](https://img.shields.io/badge/source-examples-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/tree/main/tests/transformations/translators) +[![docs-source](https://img.shields.io/badge/source-translators-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/tree/main/gqlalchemy/transformations/translators) + + +Within the code, translators are divided into the following parts, depending on the Python graph type you want to translate: + +- [**NetworkX graph translator**](#networkx-graph-translator) +- [**PyG graph translator**](#pyg-graph-translator) +- [**DGL graph translator**](#dgl-graph-translator) + + +## NetworkX graph translator + +The `NxTranslator` class implements the NetworkX graph translator and inherits from the `Translator` class. The `NxTranslator` class can be imported from the `gqlalchemy.transformations.translators.nx_translator` module. + +[![docs-source](https://img.shields.io/badge/source-NetworkX%20Translator-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/blob/main/gqlalchemy/transformations/translators/nx_translator.py) + +Translating the graph means that you can **import** a NetworkX graph into Memgraph, as well as **export** data from Memgraph into a NetworkX graph in your Python code. The `NxTranslator` defines three important methods: + +- [`to_cypher_queries()`](#to_cypher_queries-method) - The method which generates Cypher queries to create a graph in Memgraph. +- [`nx_graph_to_memgraph_parallel()`](#nx_graph_to_memgraph_parallel-method) - The method which generates Cypher queries to insert data into Memgraph in parallel. +- [`get_instance()`](#get_instance-method) - The method which creates a NetworkX instance from the graph stored in Memgraph. + + +### `to_cypher_queries()` method + +The `to_cypher_queries()` method yields queries from the `NetworkXCypherBuilder` object. These queries create nodes (with indexes) and relationships. To create nodes with indexes, `create_index` in `config` must be set to `True`. In that case, label-property indexes will be created on the `id` property of each node. With or without indexes, node creation follows the same set of rules. The value of the `labels` key in a NetworkX node will be translated into Memgraph node labels. Other properties will be translated into the same key-value pairs in Memgraph. Every node will have an `id` property matching its NetworkX identification number. After the Cypher queries for node creation are generated, the Cypher queries for relationship creation are generated. Those Cypher queries will match nodes by their label and `id` property and create a relationship between them. The value of the `TYPE` key in a NetworkX edge will be translated into the relationship type in Memgraph. Any other property in a NetworkX edge will be translated into the same key-value pair in Memgraph.
To run the generated queries, the following code can be used: + +```python +for query in NxTranslator().to_cypher_queries(nx_graph): + memgraph.execute(query) +``` + +### `nx_graph_to_memgraph_parallel()` method + +The `nx_graph_to_memgraph_parallel()` method is similar to the [`to_cypher_queries()`](#to_cypher_queries-method) method. It creates a graph inside Memgraph following the same set of rules, but it writes in parallel. To do that, it splits the generated queries into query groups and opens up a new connection to Memgraph in order to run the queries. It will warn you if you did not set `create_index` in `config` to `True`, because otherwise, the write process might take longer than expected. To run the generated queries, the following code can be used: + +```python +for query in NxTranslator().nx_graph_to_memgraph_parallel(nx_graph): + memgraph.execute(query) +``` + +### `get_instance()` method + +The `get_instance()` method translates data stored inside Memgraph into a NetworkX graph. It traverses the graph and stores node and relationship objects along with their properties in a NetworkX DiGraph object. Since NetworkX doesn't support node labels and relationship types in the way Memgraph does, they are encoded as node and edge properties, as values of the `label` and `type` keys. To create a NetworkX graph from data stored in Memgraph, the following code can be run: + +```python +graph = NxTranslator().get_instance() +``` + +## PyG graph translator + +The `PyGTranslator` class implements the PyG graph translator and inherits from the `Translator` class. The `PyGTranslator` class can be imported from the `gqlalchemy.transformations.translators.pyg_translator` module. + +[![docs-source](https://img.shields.io/badge/source-PyG%20Translator-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/blob/main/gqlalchemy/transformations/translators/pyg_translator.py) + +Translating the graph means that you can **import** a PyG graph into Memgraph, as well as **export** data from Memgraph into a PyG graph in your Python code. The `PyGTranslator` defines two important methods: + +- [`to_cypher_queries()`](#to_cypher_queries-method-1) - The method which generates Cypher queries to create a graph in Memgraph. +- [`get_instance()`](#get_instance-method-1) - The method which creates a PyG instance from the graph stored in Memgraph. + +### `to_cypher_queries()` method + +The `to_cypher_queries()` method produces Cypher queries to create graph objects in Memgraph for both homogeneous and heterogeneous graphs. This method can translate one-dimensional as well as multidimensional features to Memgraph properties. Isolated nodes in the graph won't get translated into Memgraph. Nodes and relationships will have the `pyg_id` property set to the ID they have as part of the PyG graph, for consistency reasons. To run the generated queries, the following code can be used: + +```python +for query in PyGTranslator().to_cypher_queries(pyg_graph): + memgraph.execute(query) +``` + + +### `get_instance()` method + +The `get_instance()` method returns an instance of a PyG heterograph from all relationships stored in Memgraph. Isolated nodes are ignored because they don't contribute to message passing neural networks. Only numerical properties that are set on all nodes and relationships are translated to the PyG instance since that is a PyG requirement. Hence, any string properties, as well as numerical properties that aren't set on all nodes or relationships, won't be translated to the PyG instance.
However, properties of type list will be translated to the PyG instance as a feature. Regardless of how data is connected in Memgraph, the returned PyG graph will be a heterograph instance. To create a PyG graph from data stored in Memgraph, the following code can be run: + +```python +graph = PyGTranslator().get_instance() +``` + +## DGL graph translator + +The `DGLTranslator` class implements the DGL graph translator and inherits from the `Translator` class. The `DGLTranslator` class can be imported from the `gqlalchemy.transformations.translators.dgl_translator` module. + +[![docs-source](https://img.shields.io/badge/source-DGL%20Translator-FB6E00?logo=github&style=for-the-badge)](https://github.com/memgraph/gqlalchemy/blob/main/gqlalchemy/transformations/translators/dgl_translator.py) + +Translating the graph means that you can **import** a DGL graph into Memgraph, as well as **export** data from Memgraph into a DGL graph in your Python code. The `DGLTranslator` defines two important methods: + +- [`to_cypher_queries()`](#to_cypher_queries-method-2) - The method which generates Cypher queries to create a graph in Memgraph. +- [`get_instance()`](#get_instance-method-2) - The method which creates a DGL instance from the graph stored in Memgraph. + +### `to_cypher_queries()` method + +The `to_cypher_queries()` method produces Cypher queries to create graph objects in Memgraph for both homogeneous and heterogeneous graphs. If the graph is homogeneous, the default `_N` node label and `_E` relationship type will be used. This method can translate one-dimensional as well as multidimensional features to Memgraph properties. Isolated nodes in the graph won't get translated into Memgraph. Nodes and relationships will have the `dgl_id` property set to the ID they have as part of the DGL graph, for consistency reasons. To run the generated queries, the following code can be used: + +```python +for query in DGLTranslator().to_cypher_queries(dgl_graph): + memgraph.execute(query) +``` + +### `get_instance()` method + +The `get_instance()` method returns an instance of a DGL heterograph from all relationships stored in Memgraph. Isolated nodes are ignored because they don't contribute to message passing neural networks. Only numerical properties that are set on all nodes and relationships are translated to the DGL instance since that is a DGL requirement. Hence, any string properties, as well as numerical properties that aren't set on all nodes or relationships, won't be translated to the DGL instance. However, properties of type list will be translated to the DGL instance as a feature. Regardless of how data is connected in Memgraph, the returned DGL graph will be a heterograph instance. To create a DGL graph from data stored in Memgraph, the following code can be run: + +```python +graph = DGLTranslator().get_instance() +``` + +## Where to next? + +If you want to learn more about using NetworkX with Memgraph, including interesting resources and courses, head over to the [**Memgraph for NetworkX developers**](https://memgraph.com/memgraph-for-networkx?utm_source=docs&utm_medium=referral&utm_campaign=networkx_ppp&utm_term=docsgqla%2Bhowto&utm_content=textlink) website. If you have any questions or want to connect with the Memgraph community, [**join our Discord server**](https://www.discord.gg/memgraph).
diff --git a/mkdocs.yml b/mkdocs.yml new file mode 100644 index 00000000..86e21f03 --- /dev/null +++ b/mkdocs.yml @@ -0,0 +1,81 @@ + +site_name: GQLAlchemy Documentation + +markdown_extensions: + - admonition + - pymdownx.details + - pymdownx.superfences +theme: + name: 'material' + logo: 'assets/memgraph-logo.svg' + site_favicon: 'aassets/favicon.png' + font: + text: 'Roboto' + code: 'Roboto Mono' + icon: + repo: fontawesome/brands/github + palette: + primary: 'custom' + accent: 'custom' +nav: + - Home: 'index.md' + - Getting Started: 'getting-started.md' + - Installation: 'installation.md' + - Import Data: 'import-data.md' + - Changelog: 'changelog.md' + - How-To Guides: + - Overview: 'how-to-guides/overview.md' + - OGM: 'how-to-guides/ogm.md' + - Query Builder: 'how-to-guides/query-builder.md' + - Graph Projection: 'how-to-guides/query-builder/graph-projection.md' + - Data: + - Memgraph Binary Instance: 'how-to-guides/instance-runner/memgraph-binary-instance.md' + - Memgraph Docker Instance: 'how-to-guides/instance-runner/memgraph-docker-instance.md' + - Import Table Data to Graph Database: 'how-to-guides/loaders/import-table-data-to-graph-database.md' + - Make a Custom File System Importer: 'how-to-guides/loaders/make-a-custom-file-system-importer.md' + - On-Disk Storage: 'how-to-guides/on-disk-storage/on-disk-storage.md' + - Kafka Streams: 'how-to-guides/streams/kafka-streams.md' + - Pulsar Streams: 'how-to-guides/streams/pulsar-streams.md' + - Export Python Graphs: 'how-to-guides/translators/export-python-graphs.md' + - Import Python Graphs: 'how-to-guides/translators/import-python-graphs.md' + - Triggers: 'how-to-guides/triggers/triggers.md' + - Under the Hood: + - Overview: 'under-the-hood/overview.md' + - Python Graph Translators: 'under-the-hood/python-graph-translators.md' + - Reference: + - Overview: 'reference/gqlalchemy/overview.md' + - Connection: 'reference/gqlalchemy/connection.md' + - Disk Storage: 'reference/gqlalchemy/disk_storage.md' + - Exceptions: 'reference/gqlalchemy/exceptions.md' + - Instance Runner: 'reference/gqlalchemy/instance_runner.md' + - Loaders: 'reference/gqlalchemy/loaders.md' + - Models: 'reference/gqlalchemy/models.md' + - Transformations: 'reference/gqlalchemy/transformations.md' + - Utilities: 'reference/gqlalchemy/utilities.md' + - Graph Algorithms: + - Integrated Algorithms: 'reference/gqlalchemy/graph_algorithms/integrated_algorithms.md' + - Query Builder: 'reference/gqlalchemy/graph_algorithms/query_builder.md' + - Query Modules: 'reference/gqlalchemy/graph_algorithms/query_modules.md' + - Query Builders: + - Declarative Base: 'reference/gqlalchemy/query_builders/declarative_base.md' + - Memgraph Query Builder: 'reference/gqlalchemy/query_builders/memgraph_query_builder.md' + - Transformations: + - Export: + - Graph Transporter: 'reference/gqlalchemy/transformations/export/graph_transporter.md' + - Transporter: 'reference/gqlalchemy/transformations/export/transporter.md' + - Importing: + - Graph Importer: 'reference/gqlalchemy/transformations/importing/graph_importer.md' + - Loaders: 'reference/gqlalchemy/transformations/importing/loaders.md' + - Translators: + - DGL Translator: 'reference/gqlalchemy/transformations/translators/dgl_translator.md' + - NX Translator: 'reference/gqlalchemy/transformations/translators/nx_translator.md' + - PyG Translator: 'reference/gqlalchemy/transformations/translators/pyg_translator.md' + - Translator: 'reference/gqlalchemy/transformations/translators/translator.md' + - Vendors: + - Database Client: 
'reference/gqlalchemy/vendors/database_client.md' + - Memgraph: 'reference/gqlalchemy/vendors/memgraph.md' + - Neo4j: 'reference/gqlalchemy/vendors/neo4j.md' +repo_name: 'memgraph/gqlalchemy' +repo_url: 'https://github.com/memgraph/gqlalchemy' +extra_css: + - stylesheets/extra.css