Merge step #2392

Open
wants to merge 4 commits into
base: master

adrianlzt
Contributor

Aggregates elements from different revisions of a node into a new
metadata key.
Given a metadata element, which must be a map[string]interface{},
the step aggregates its distinct values across revisions into another
metadata key with the format map[string][]interface{}.
E.g.:
Metadata.data V1: {"a":{x}, "b":{y}}
Metadata.data V2: {"a":{z}, "b":{y}}
Metadata.agg: {"a":[{x},{z}], "b":[{y}]}
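The aggregation in the example above can be sketched in Go as follows. This is only an illustration, not the code from this PR: mergeRevisions and contains are hypothetical names, the revisions are passed as plain maps rather than graph nodes, and dropping repeated values with reflect.DeepEqual is an assumption inferred from "b" ending up with a single {y}.

```go
package main

import (
	"fmt"
	"reflect"
)

// mergeRevisions (hypothetical name) aggregates the values found under each
// key across several revisions of the same metadata map. A value that is
// already collected for a key is skipped, so an unchanged value appears once.
func mergeRevisions(revisions []map[string]interface{}) map[string][]interface{} {
	agg := make(map[string][]interface{})
	for _, rev := range revisions {
		for key, value := range rev {
			if !contains(agg[key], value) {
				agg[key] = append(agg[key], value)
			}
		}
	}
	return agg
}

// contains reports whether value is deeply equal to an element of values.
func contains(values []interface{}, value interface{}) bool {
	for _, v := range values {
		if reflect.DeepEqual(v, value) {
			return true
		}
	}
	return false
}

func main() {
	// Strings stand in for the {x}, {y}, {z} values of the example.
	v1 := map[string]interface{}{"a": "x", "b": "y"}
	v2 := map[string]interface{}{"a": "z", "b": "y"}
	fmt.Println(mergeRevisions([]map[string]interface{}{v1, v2}))
	// prints: map[a:[x z] b:[y]]
}
```

Revisions are visited in the order given, so the values under each key keep the order in which they first appeared.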

Its purpose is to show data from past revisions of the same node.
Example:
G.At(1479899809,3600).V().Merge('data','agg')

It can also be called with the time slice defined in the parameters
(since, from):
G.V().Merge('A','B',1500000000,1500099999)

This step returns a modified copy of the last node, with all the
aggregated data, not the node stored in the graph.
This avoids modifying the node stored in the graph.

This PR also modifies the Reduce method of the Neighbors step.
The Merge step only needs the node IDs, so the Neighbors step can skip
retrieving the full content of the nodes.

Do just one call to get all the edges for the given nodes, instead of
one call for each node.

This new call has been added as a new method of Graph and Backends:
GetNodesEdges returns the list of all edges for a list of nodes.

Batching is used to avoid hitting the max number of clauses set by ES
(is set to the default value of 512).

The Neighbors step is like Descendants, but follows edges in any direction
(Descendants only uses edges from parent to child, Neighbors uses edges
from parent to child and from child to parent).

The difference from the Both step is that Neighbors accumulates the nodes it has seen.
Example (pseudo-syntax):
A -> B -> C
G.V(A).Out().Out() returns: C.

But:
G.V(A).Neighbors(2) returns: A,B,C

The parameters allowed are the same as in Descendants.
Example:
G.V('foo').Neighbors('RelationType',Within('ownership','foobar'),2)
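The traversal described above can be sketched in Go as a breadth-first search over an undirected view of the edges. This is a simplified illustration, not the Neighbors step from this PR: the edge type and the neighbors function are hypothetical, and the edge filters (such as RelationType) are left out.

```go
package main

import "fmt"

// edge is a hypothetical directed parent -> child link between node IDs.
type edge struct{ parent, child string }

// neighbors returns every node reachable from start within maxDepth hops,
// following edges in either direction. The start node is part of the result,
// and each node is reported once, in the order it was first seen.
func neighbors(edges []edge, start string, maxDepth int) []string {
	// Build an adjacency list that ignores the edge direction.
	adj := make(map[string][]string)
	for _, e := range edges {
		adj[e.parent] = append(adj[e.parent], e.child)
		adj[e.child] = append(adj[e.child], e.parent)
	}

	seen := map[string]bool{start: true}
	result := []string{start}
	frontier := []string{start}
	for depth := 0; depth < maxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, id := range frontier {
			for _, n := range adj[id] {
				if !seen[n] {
					seen[n] = true
					result = append(result, n)
					next = append(next, n)
				}
			}
		}
		frontier = next
	}
	return result
}

func main() {
	edges := []edge{{"A", "B"}, {"B", "C"}} // A -> B -> C
	fmt.Println(neighbors(edges, "A", 2))   // prints: [A B C]
	fmt.Println(neighbors(edges, "C", 1))   // prints: [C B]
}
```

The second call shows the difference from Descendants: starting from C, the child-to-parent edge is followed as well.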

To improve speed and reduce backend load when using persistent backends,
a new method, GetNodesFromIDs, is implemented in Graph and Backends.
This method only uses one call (or a few if we have hundreds of nodes,
see batching) to get all nodes from the backend.

Batching is used to avoid hitting the max number of clauses set by ES
(is set to the default value of 512).
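As a rough illustration of the batching, the list of IDs can be cut into chunks before each backend query. This is a sketch under assumptions, not the code from this PR: batches is a hypothetical helper, and using 512 as the chunk size is taken from the figure quoted above.

```go
package main

import "fmt"

// batchSize is assumed here to match the limit of 512 mentioned above.
const batchSize = 512

// batches (hypothetical helper) splits ids into consecutive slices of at
// most size elements. It returns nil when size is not positive.
func batches(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		out = append(out, ids[start:end])
	}
	return out
}

func main() {
	ids := make([]string, 1200)
	for i := range ids {
		ids[i] = fmt.Sprintf("node-%d", i)
	}
	// Each batch would become one backend query instead of one per node.
	for i, b := range batches(ids, batchSize) {
		fmt.Printf("batch %d: %d ids\n", i, len(b))
	}
}
```

With 1200 IDs this yields three batches of 512, 512 and 176 IDs, so three backend calls instead of 1200.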
Node.Copy() creates a copy of the node.
It can be useful for returning the modified content of a node in a
query output without actually modifying the stored node.
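The idea of copying before modifying can be sketched in Go like this. It is an assumption-laden illustration, not the Node.Copy() of this PR: the node type is a hypothetical stand-in with only an ID and a metadata map, and the copy only recurses into map[string]interface{} and []interface{} values.

```go
package main

import "fmt"

// node is a hypothetical, reduced stand-in for a graph node.
type node struct {
	ID       string
	Metadata map[string]interface{}
}

// deepCopy duplicates nested maps and slices so that the copy shares no
// mutable container with the original. Other values are returned as is.
func deepCopy(v interface{}) interface{} {
	switch t := v.(type) {
	case map[string]interface{}:
		m := make(map[string]interface{}, len(t))
		for k, val := range t {
			m[k] = deepCopy(val)
		}
		return m
	case []interface{}:
		s := make([]interface{}, len(t))
		for i, val := range t {
			s[i] = deepCopy(val)
		}
		return s
	default:
		return v
	}
}

// Copy returns a new node whose metadata can be changed freely.
func (n *node) Copy() *node {
	return &node{ID: n.ID, Metadata: deepCopy(n.Metadata).(map[string]interface{})}
}

func main() {
	stored := &node{ID: "n1", Metadata: map[string]interface{}{
		"data": map[string]interface{}{"a": "x"},
	}}

	// Add an aggregated key to the copy only.
	out := stored.Copy()
	out.Metadata["agg"] = map[string]interface{}{"a": []interface{}{"x", "z"}}

	fmt.Println(len(stored.Metadata), len(out.Metadata)) // prints: 1 2
}
```

The stored node keeps its single metadata key, while the copy returned to the query carries the extra "agg" key.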
@adrianlzt
Contributor Author

It needs PR #2389 (GetNodesFromIDs method) and PR #2388 (Node.Copy)

@adrianlzt
Contributor Author

I am not sure if this step makes sense to anyone else but me :)

The full picture is that we are adding/removing events to/from the nodes and we want to be able to see all of those events in the past 24 hours.

@adrianlzt force-pushed the feature/step_merge branch 2 times, most recently from eee82a9 to b3bbeb8 on August 9, 2021 11:38
@lebauce
Member

lebauce commented Aug 11, 2021

Thanks ! It does make sense to me too :-)

@safchain did something similar when working on the new historical view of the new web UI and added a Valuemap step for this so he may have an opinion on this.

Just thinking out loud, I was wondering if we could do something like :
G.Context("-24h").V().Diff() that would give the difference between every revision of the node.

@lebauce
Member

lebauce commented Oct 4, 2021

@adrianlzt Hello Adrian. Sorry for the delay. I was thinking that a somewhat similar implementation could be to add the Merge step on the output of the ValueMap step :

Metadata.data V1: {"a":{x}, "b":{y}}
Metadata.data V2: {"a":{z}, "b":{y}}

G.At(1479899809,3600).V().ValueMap("a", "b").Merge() would return
=>
{"a":[x, z], "b": [y]}

Would it suit your use case ?

@adrianlzt
Contributor Author

Hi @lebauce. More delay here!
The thing is that I want the full node with some aggregated fields.
If I have understood correctly, your solution only returns some aggregated fields.

But maybe you have another better approach.
For me the idea is to show the nodes in Grafana and the time selector will choose the period in which you could see events (Metadata.Events) in that node.
So for example, you see the node and its relations and you want to know if one week ago it was having events (problems).

But I always need the full node to draw it (name, type, etc).

Does it make sense?
