Gremlin Snippets are typically short and fun dissections of some aspect of the Gremlin language. For a full list of all steps in the Gremlin language see the Reference Documentation of Apache TinkerPop™. This snippet is based on Gremlin 3.7.3. This snippet demonstrates its lesson using the data of the "modern" toy graph (image). Please consider bringing any discussion or questions about this snippet to Discord or the Gremlin Users Mailing List.



Understanding the tree() Step in Gremlin

While helping someone with a Gremlin traversal that made use of tree(), I realized that there are some aspects of this step that could use further explanation. To start, tree() can be used in two different ways:

gremlin> g.V().out().tree()
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]],v[4]:[v[3]:[],v[5]:[]],v[6]:[v[3]:[]]]
gremlin> g.V().out().tree('x').select('x')
==>[v[1]:[v[3]:[]]]
==>[v[1]:[v[2]:[],v[3]:[]]]
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]]]
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]],v[4]:[v[5]:[]]]
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]],v[4]:[v[3]:[],v[5]:[]]]
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]],v[4]:[v[3]:[],v[5]:[]],v[6]:[v[3]:[]]]

Eager vs. Lazy Execution and Key Differences

In the first example, tree() acts as a barrier which eagerly consumes the stream and then outputs the resulting Tree object. In the second example, with tree('x') which provides a side-effect key, the step consumes the stream lazily, such that each traverser reaching select() shows the Tree building itself as traversers pass through.
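The difference between the two forms can be sketched outside of Gremlin with plain Python generators. This is only a rough model under stated assumptions, not TinkerPop code: the function names (eager_tree, lazy_tree, select) are made up for illustration, a Tree is approximated by a nested dict, and the path order is taken from the console output above. The deep copy in select is there only so that each intermediate state can be inspected afterwards.

```python
import copy

# Paths produced by g.V().out() on the "modern" graph, in the order
# implied by the console output above (vertex ids only).
PATHS = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3)]


def add_path(tree, path):
    """Merge one path into a nested-dict tree, creating branches as needed."""
    node = tree
    for vertex in path:
        node = node.setdefault(vertex, {})


def eager_tree(paths):
    """Model of tree(): drain the whole stream, then emit a single tree."""
    tree = {}
    for path in paths:
        add_path(tree, path)
    yield tree


def lazy_tree(paths, side_effects, key):
    """Model of tree('x'): update the side-effect and pass each traverser on."""
    tree = side_effects.setdefault(key, {})
    for path in paths:
        add_path(tree, path)
        yield path


def select(traversers, side_effects, key):
    """Model of select('x'): emit the side-effect's current state per traverser."""
    for _ in traversers:
        yield copy.deepcopy(side_effects[key])


eager = list(eager_tree(PATHS))

side_effects = {}
snapshots = list(select(lazy_tree(PATHS, side_effects, "x"), side_effects, "x"))

print(len(eager), len(snapshots))  # 1 6
print(snapshots[0])                # {1: {3: {}}}
```

In this model the eager form produces one result and the lazy form produces six, one per traverser, with only the last snapshot matching the complete tree.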

Focusing on the second example, a tempting approach to get the final Tree might be to try to introduce limit(1), but the following example demonstrates that this will not work:

gremlin> g.V().out().tree('x').limit(1).select('x')
==>[v[1]:[v[3]:[]]]

Since limit(1) only pulls a single traverser through tree(), it becomes the only one evaluated by the step. Earlier in this post, I mentioned that tree('x') does not have a barrier to force it to eagerly consume all the traversers. Adding barrier() to the prior traversal does help produce the output expected:

gremlin> g.V().out().tree('x').barrier().limit(1).select('x')
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]],v[4]:[v[3]:[],v[5]:[]],v[6]:[v[3]:[]]]
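The effect of limit(1) with and without barrier() can be imitated with Python generators. As before, this is a simplified sketch and not how TinkerPop is implemented: lazy_tree, barrier and run are hypothetical names, itertools.islice stands in for limit(1), and a nested dict stands in for the Tree.

```python
import copy
from itertools import islice

# Paths produced by g.V().out() on the "modern" graph (vertex ids only).
PATHS = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3)]


def lazy_tree(paths, tree):
    """Model of tree('x'): add each path to the tree as it passes through."""
    for path in paths:
        node = tree
        for vertex in path:
            node = node.setdefault(vertex, {})
        yield path


def barrier(traversers):
    """Model of barrier(): drain everything upstream before releasing anything."""
    yield from list(traversers)


def run(with_barrier):
    tree = {}
    stream = lazy_tree(PATHS, tree)
    if with_barrier:
        stream = barrier(stream)
    # islice(stream, 1) plays the role of limit(1); copying the tree plays
    # the role of select('x') for the single traverser that gets through.
    return [copy.deepcopy(tree) for _ in islice(stream, 1)]


without_barrier = run(with_barrier=False)
with_barrier = run(with_barrier=True)

print(without_barrier)  # [{1: {3: {}}}]
print(with_barrier)
```

Without the barrier, only the first path is ever pulled through lazy_tree, so the tree holds a single branch. With the barrier, all six paths are pulled before the first traverser is released, so the one result that survives the limit sees the complete tree.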

That said, a more idiomatic approach to this issue is to simply use cap():

gremlin> g.V().out().tree('x').cap('x')
==>[v[1]:[v[2]:[],v[3]:[],v[4]:[]],v[4]:[v[3]:[],v[5]:[]],v[6]:[v[3]:[]]]
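In the same generator-based style, cap('x') can be pictured as a step that drains its input and then emits the side-effect once. This is an illustrative assumption about the behavior shown above, with made-up function names, rather than a description of TinkerPop internals.

```python
# Paths produced by g.V().out() on the "modern" graph (vertex ids only).
PATHS = [(1, 3), (1, 2), (1, 4), (4, 5), (4, 3), (6, 3)]


def lazy_tree(paths, side_effects, key):
    """Model of tree('x'): update the side-effect and pass each traverser on."""
    tree = side_effects.setdefault(key, {})
    for path in paths:
        node = tree
        for vertex in path:
            node = node.setdefault(vertex, {})
        yield path


def cap(traversers, side_effects, key):
    """Model of cap('x'): consume every traverser, then emit the side-effect once."""
    for _ in traversers:
        pass
    yield side_effects[key]


side_effects = {}
results = list(cap(lazy_tree(PATHS, side_effects, "x"), side_effects, "x"))

print(len(results))  # 1
print(results[0])
```

Unlike the barrier-plus-limit combination, the draining and the read of "x" happen in one step here, which is why no separate limit(1) or select('x') is needed.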

Best Practices and Key Takeaways

The cap() step is the barrier here: it forces consumption of all the traversers in the pipeline so that the Tree is fully built as a side-effect in “x”. It then grabs the value in “x” and outputs it in the stream as the result. When working with more complex traversals, paying attention to how traversers flow through each step, and to where the barriers sit, is essential for shaping results as intended.