tree() Node Equality

While working with the tree() step in Gremlin, I noticed an important aspect of how the resulting Tree object is constructed, one that can significantly affect your query results. As a reminder, tree() aggregates the paths traversers take through the graph into a hierarchical tree structure, capturing the branching nature of those traversals.
A basic example:
gremlin> g.V().out().out().tree().next()
==>v[1]={v[4]={v[3]={}, v[5]={}}}
The raw Tree object can be hard to interpret, but conceptually, the output above represents:
|--v[marko]
   |--v[josh]
      |--v[ripple]
      |--v[lop]
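To make the aggregation concrete, here is a minimal Python sketch of folding paths into nested maps. It is not TinkerPop's implementation: `build_tree` is a hypothetical helper, and the two paths are hard-coded to mirror the ones behind the output above.

```python
def build_tree(paths):
    """Fold a list of paths into nested dicts, one dict level per path position."""
    root = {}
    for path in paths:
        node = root
        for element in path:
            # setdefault reuses the existing branch when an equal key is already there
            node = node.setdefault(element, {})
    return root


# Hard-coded stand-ins for the two paths taken by g.V().out().out()
paths = [
    ("v[1]", "v[4]", "v[5]"),  # marko -> josh -> ripple
    ("v[1]", "v[4]", "v[3]"),  # marko -> josh -> lop
]

tree = build_tree(paths)
print(tree)  # {'v[1]': {'v[4]': {'v[5]': {}, 'v[3]': {}}}}
```

Both paths share the `v[1]` and `v[4]` prefix, so they collapse into one branch that only forks at the last level.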
Consider a more involved traversal:
g.V(1).outE().inV().bothE().otherV().tree()
This produces a tree structure like:
|--v[marko]
   |--e[marko-knows->vadas]
      |--v[vadas]
         |--e[marko-knows->vadas]
            |--v[marko]
   |--e[marko-knows->josh]
      |--v[josh]
         |--e[josh-created->ripple]
            |--v[ripple]
         |--e[josh-created->lop]
            |--v[lop]
         |--e[marko-knows->josh]
            |--v[marko]
   |--e[marko-created->lop]
      |--v[lop]
         |--e[marko-created->lop]
            |--v[marko]
         |--e[josh-created->lop]
            |--v[josh]
         |--e[peter-created->lop]
            |--v[peter]
The tree() step supports by() modulation, allowing you to control what is stored at each level of the tree. For example, you can extract the “name” property for vertices and the label for edges:
g.V(1).outE().inV().bothE().otherV().
  tree().by("name").by(T.label)
This changes the structure of the output:
|--marko
   |--knows
      |--vadas
         |--knows
            |--marko
      |--josh
         |--created
            |--ripple
            |--lop
         |--knows
            |--marko
   |--created
      |--lop
         |--created
            |--marko
            |--josh
            |--peter
The crucial point is that the tree() step builds its structure based on node equality at each level. If two traversers arrive at nodes that are considered equal under the same parent, they share a single branch in the tree, even if they got there through different underlying elements. In the example above, the root node (“marko”) has only two children (“knows” and “created”), corresponding to the two unique edge labels rather than to marko's three outgoing edges. As a result, “vadas” and “josh” now sit side by side under the same “knows” branch. This rule applies recursively throughout the tree and can lead to structural differences that may be surprising if you expect the tree to reflect the number of paths rather than the uniqueness of values at each step.
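The merging effect can be reproduced outside of Gremlin with a small Python sketch. This is only a model of the behavior described here, not TinkerPop's code: `Vertex`, `Edge`, and `build_tree` are hypothetical names, the seven paths are hard-coded from the trees shown earlier, and I assume the by() projections are applied round-robin across path positions (vertex, edge, vertex, ...).

```python
from collections import namedtuple
from itertools import cycle

# Hypothetical stand-ins for graph elements; namedtuples compare by value
Vertex = namedtuple("Vertex", "name")
Edge = namedtuple("Edge", "label out_v in_v")


def build_tree(paths, projections=None):
    """Fold paths into nested dicts, keyed by the (optionally projected) element."""
    root = {}
    for path in paths:
        node = root
        # Assumption: projections cycle over positions, like by("name").by(T.label)
        keys = cycle(projections) if projections else None
        for element in path:
            key = next(keys)(element) if keys else element
            node = node.setdefault(key, {})  # equal keys share one branch
    return root


marko, vadas, josh = Vertex("marko"), Vertex("vadas"), Vertex("josh")
lop, ripple, peter = Vertex("lop"), Vertex("ripple"), Vertex("peter")
m_knows_v = Edge("knows", "marko", "vadas")
m_knows_j = Edge("knows", "marko", "josh")
m_created_l = Edge("created", "marko", "lop")
j_created_r = Edge("created", "josh", "ripple")
j_created_l = Edge("created", "josh", "lop")
p_created_l = Edge("created", "peter", "lop")

# The seven paths of g.V(1).outE().inV().bothE().otherV(), written out by hand
paths = [
    (marko, m_knows_v, vadas, m_knows_v, marko),
    (marko, m_knows_j, josh, j_created_r, ripple),
    (marko, m_knows_j, josh, j_created_l, lop),
    (marko, m_knows_j, josh, m_knows_j, marko),
    (marko, m_created_l, lop, m_created_l, marko),
    (marko, m_created_l, lop, j_created_l, josh),
    (marko, m_created_l, lop, p_created_l, peter),
]

raw = build_tree(paths)
projected = build_tree(paths, [lambda v: v.name, lambda e: e.label])

print(len(raw[marko]))            # 3: one child per distinct edge
print(len(projected["marko"]))    # 2: one child per distinct edge label
print(sorted(projected["marko"]["knows"]))  # ['josh', 'vadas']
```

If you need the tree to keep one branch per edge, the sketch suggests projecting to a value that stays unique per element (for example an id) rather than to a label that several edges share.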