Adds properties on edges (#306)
Define relationship properties in many_to_many via name_mappings and properties field
andrejtonev authored Jul 5, 2024
1 parent 69180c2 commit 651240c
Showing 7 changed files with 158 additions and 20 deletions.
99 changes: 86 additions & 13 deletions docs/how-to-guides/loaders/import-table-data-to-graph-database.md
@@ -27,14 +27,21 @@ data is located, here are two guides on how to import it to Memgraph:

## Loading a CSV file from the local file system

Let's say you have a simple table data in a CSV file stored at
`/home/user/table_data`:
Let's say you have a simple dataset stored in CSV files:

`/home/user/table_data/individual.csv`:
```csv
name,surname,grade
Ivan,Horvat,4
Marko,Andric,5
Luka,Lukic,3
ind_id, name, surname, add_id
1, Ivan, Horvat, 2
2, Marko, Andric, 2
3, Luka, Lukic, 1
```

`/home/user/table_data/address.csv`:
```csv
add_id, street, num, city
1, Ilica, 2, Zagreb
2, Broadway, 12, New York
```

To create a translation from table to graph data, you need to define a **data
@@ -78,24 +85,37 @@ many_to_many_relations: # intended to be used in case of associative table
column_name:
reference_table:
reference_key:
label:
label: # relationship's label
properties: # list of properties to add to the relationship

```

### One to many

For this example, you don't need all of those fields. You only need to define
`indices` and `one_to_many_relations`. Hence, you have the following YAML file:

```yaml
indices:
example:
- name
address:
- add_id
individual:
- ind_id

name_mappings:
example:
label: PERSON
individual:
label: INDIVIDUAL
address:
label: ADDRESS

one_to_many_relations:
example: []
address: []
individual:
- foreign_key:
column_name: add_id
reference_table: address
reference_key: add_id
label: LIVES_IN
```
In order to read the data configuration from the YAML file, run:
@@ -114,12 +134,65 @@ make an instance of an `Importer` and call `translate()`.
```python
importer = CSVLocalFileSystemImporter(
data_configuration=parsed_yaml,
path="/home/user/table_data",
path="/home/user/table_data/",
)
importer.translate(drop_database_on_start=True)
```
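
To make the mapping concrete, here is a minimal sketch in plain Python of what the `one_to_many_relations` entry describes: each `individual` row is paired with the `address` row whose `add_id` matches. It does not use GQLAlchemy or Memgraph. The helper names `read_rows` and `resolve_one_to_many` and the inline CSV strings are assumptions made for illustration, and the sketch ignores relationship direction (`from_entity`). It passes `skipinitialspace=True` because the sample files have a space after each comma.

```python
import csv
import io

# Inline copies of the sample files, so the sketch runs without touching the disk.
INDIVIDUAL_CSV = """ind_id, name, surname, add_id
1, Ivan, Horvat, 2
2, Marko, Andric, 2
3, Luka, Lukic, 1
"""

ADDRESS_CSV = """add_id, street, num, city
1, Ilica, 2, Zagreb
2, Broadway, 12, New York
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, skipping the space after each comma."""
    return list(csv.DictReader(io.StringIO(text), skipinitialspace=True))


def resolve_one_to_many(rows, reference_rows, column_name, reference_key, label):
    """Hypothetical helper: pair each row with the row its foreign key points to."""
    by_key = {ref[reference_key]: ref for ref in reference_rows}
    return [
        (row, label, by_key[row[column_name]])
        for row in rows
        if row[column_name] in by_key  # rows with a dangling key are skipped
    ]


relationships = resolve_one_to_many(
    rows=read_rows(INDIVIDUAL_CSV),
    reference_rows=read_rows(ADDRESS_CSV),
    column_name="add_id",
    reference_key="add_id",
    label="LIVES_IN",
)

for person, label, address in relationships:
    print(person["name"], label, address["street"], address["city"])
```

The importer itself writes nodes and relationships to Memgraph; this sketch only shows which rows the foreign key connects.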

### Many to many

Relationships can also be defined using a third, associative table.

`/home/user/table_data/tenant.csv`:
```csv
ind_id, add_id, duration
1, 2, 21
2, 2, 3
3, 1, 5
```

We need to extend our data configuration YAML file to include the `many_to_many_relations`, like so:

```yaml
indices:
  address:
    - add_id
  individual:
    - ind_id
name_mappings:
  individual:
    label: INDIVIDUAL
  address:
    label: ADDRESS
  tenant:
    column_names_mapping:
      duration: years
one_to_many_relations:
  address: []
  individual: []
many_to_many_relations:
  tenant:
    foreign_key_from:
      column_name: ind_id
      reference_table: individual
      reference_key: ind_id
    foreign_key_to:
      column_name: add_id
      reference_table: address
      reference_key: add_id
    label: LIVES_IN
    properties:
      - duration
```

From here the procedure is the same as before.
In addition to importing the nodes and connecting individuals to their addresses, this configuration adds a property to each edge.
The property value is read from the `duration` column of the associative table, and its name follows `name_mappings`, so it is stored as `years`.
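
The renaming step can be sketched in plain Python. This is an illustration only: `edge_properties` is a hypothetical stand-in for the importer's internal name mapping, it assumes that unmapped columns keep their column name, and the values appear as strings because they come from the `csv` module here.

```python
import csv
import io

# Inline copy of the associative table from the example.
TENANT_CSV = """ind_id, add_id, duration
1, 2, 21
2, 2, 3
3, 1, 5
"""

# Mirrors the `tenant` entries of the example configuration.
COLUMN_NAMES_MAPPING = {"duration": "years"}
PROPERTIES = ["duration"]


def edge_properties(row, properties, column_names_mapping):
    """Hypothetical helper: pick the listed columns and rename them via the mapping."""
    return {
        column_names_mapping.get(column, column): row[column]
        for column in (properties or [])  # no listed properties means no edge properties
    }


rows = list(csv.DictReader(io.StringIO(TENANT_CSV), skipinitialspace=True))
edges = [
    (row["ind_id"], "LIVES_IN", row["add_id"],
     edge_properties(row, PROPERTIES, COLUMN_NAMES_MAPPING))
    for row in rows
]

for edge in edges:
    print(edge)
```

Each tuple holds the source key, the relationship label, the destination key, and the property dictionary that would be attached to the edge.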

## Using a cloud storage solution

To connect to Azure Blob, simply change the Importer object you are using. Like
@@ -27,7 +27,6 @@ Class that holds the full description of a single one to many mapping in a table
- `foreign_key` - Foreign key used for mapping.
- `label` - Label which will be applied to the relationship created from this object.
- `from_entity` - Direction of the relationship created from the mapping object.
- `parameters` - Parameters that will be added to the relationship created from this object (Optional).

## ManyToManyMapping Objects

@@ -44,7 +43,7 @@ Many to many mapping is intended to be used in case of associative tables.
- `foreign_key_from` - Describes the source of the relationship.
- `foreign_key_to` - Describes the destination of the relationship.
- `label` - Label to be applied to the newly created relationship.
- `parameters` - Parameters that will be added to the relationship created from this object (Optional).
- `properties` - List of properties that will be added to the relationship created from this object (Optional).

## TableMapping Objects

20 changes: 15 additions & 5 deletions gqlalchemy/transformations/importing/loaders.py
@@ -97,13 +97,11 @@ class OneToManyMapping:
foreign_key: Foreign key used for mapping.
label: Label which will be applied to the relationship created from this object.
from_entity: Direction of the relationship created from the mapping object.
parameters: Parameters that will be added to the relationship created from this object (Optional).
"""

foreign_key: ForeignKeyMapping
label: str
from_entity: bool = False
parameters: Optional[Dict[str, str]] = None


@dataclass(frozen=True)
@@ -115,13 +113,13 @@ class ManyToManyMapping:
foreign_key_from: Describes the source of the relationship.
foreign_key_to: Describes the destination of the relationship.
label: Label to be applied to the newly created relationship.
parameters: Parameters that will be added to the relationship created from this object (Optional).
properties: Properties that will be added to the relationship created from this object (Optional).
"""

foreign_key_from: ForeignKeyMapping
foreign_key_to: ForeignKeyMapping
label: str
parameters: Optional[Dict[str, str]] = None
properties: Optional[List[str]] = None


Mapping = Union[List[OneToManyMapping], ManyToManyMapping]
@@ -494,6 +492,8 @@ def _load_cross_relationships(self) -> None:
property_from=mapping_from.reference_key,
property_to=mapping_to.reference_key,
relation_label=many_to_many_mapping.mapping.label,
table_name=many_to_many_mapping.table_name,
properties=many_to_many_mapping.mapping.properties,
row=row,
)

@@ -606,6 +606,8 @@ def _save_row_as_relationship(
property_from: str,
property_to: str,
relation_label: str,
table_name: str,
properties: Optional[List[str]],
row: Dict[str, Any],
) -> None:
"""Translates a row to a relationship and writes it to Memgraph.
@@ -616,6 +618,8 @@ def _save_row_as_relationship(
property_from: Property of the source node.
property_to: Property of the destination node.
relation_label: Label for the relationship.
table_name: Name of the table the properties are read from.
properties: Names of the columns to add as relationship properties.
row: The row to be translated.
"""
(
@@ -642,7 +646,13 @@ def _save_row_as_relationship(
)
.create()
.node(variable=NODE_A)
.to(relation_label)
.to(
relationship_type=relation_label,
**{
self._name_mapper.get_property_name(collection_name=table_name, column_name=prop): row[prop]
for prop in properties or []
},
)
.node(variable=NODE_B)
.execute()
)
5 changes: 5 additions & 0 deletions tests/transformations/loaders/data/address.csv
@@ -0,0 +1,5 @@
add_id,street,street_num,city
1,Ilica,2,Zagreb
2,Death Valley,0,Knowhere
3,Horvacanska,3,Horvati
4,Broadway,12,New York
3 changes: 3 additions & 0 deletions tests/transformations/loaders/data/i2a.csv
@@ -0,0 +1,3 @@
add_id,ind_id,duration
1,2,12
2,1,5
6 changes: 6 additions & 0 deletions tests/transformations/loaders/data/individual.csv
@@ -0,0 +1,6 @@
ind_id,name,surname,add_id
1,Tomislav,Petrov,1
2,Ivan,Horvat,3
3,Marko,Horvat,3
4,John,Doe,2
5,John,Though,4
42 changes: 42 additions & 0 deletions tests/transformations/loaders/test_loaders.py
@@ -106,6 +106,48 @@ def test_local_table_to_graph_importer_csv(memgraph):
    importer = CSVLocalFileSystemImporter(path=path, data_configuration=my_configuration, memgraph=memgraph)
    importer.translate(drop_database_on_start=True)

    conf_with_edge_params = {
        "indices": {"address": ["add_id"], "individual": ["ind_id"]},
        "name_mappings": {"individual": {"label": "INDIVIDUAL"}, "address": {"label": "ADDRESS"}},
        "one_to_many_relations": {
            "address": [],
            "individual": [
                {
                    "foreign_key": {"column_name": "add_id", "reference_table": "address", "reference_key": "add_id"},
                    "label": "LIVES_IN",
                }
            ],
        },
    }
    importer = CSVLocalFileSystemImporter(path=path, data_configuration=conf_with_edge_params, memgraph=memgraph)
    importer.translate(drop_database_on_start=True)

    conf_with_many_to_many = {
        "indices": {"address": ["add_id"], "individual": ["ind_id"]},
        "name_mappings": {
            "individual": {"label": "INDIVIDUAL"},
            "address": {"label": "ADDRESS"},
            "i2a": {
                "column_names_mapping": {"duration": "years"},
            },
        },
        "one_to_many_relations": {"address": [], "individual": []},
        "many_to_many_relations": {
            "i2a": {
                "foreign_key_from": {
                    "column_name": "ind_id",
                    "reference_table": "individual",
                    "reference_key": "ind_id",
                },
                "foreign_key_to": {"column_name": "add_id", "reference_table": "address", "reference_key": "add_id"},
                "label": "LIVES_IN",
                "properties": ["duration"],
            }
        },
    }
    importer = CSVLocalFileSystemImporter(path=path, data_configuration=conf_with_many_to_many, memgraph=memgraph)
    importer.translate(drop_database_on_start=True)


@pytest.mark.extras
@pytest.mark.arrow