Commit 8d6957a

kenhuuu and Cole-Greer committed
TINKERPOP-3200 Make repeat() act as a global parent
Co-authored-by: Cole-Greer <[email protected]>
1 parent 57ed046 commit 8d6957a

15 files changed: +564 -47 lines changed

CHANGELOG.asciidoc

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -101,6 +101,7 @@ This release also includes changes from <<release-3-7-XXX, 3.7.XXX>>.
101101
* Removed the `@RemoteOnly` testing tag in Gherkin as lambda tests have all been moved to the Java test suite.
102102
* Updated gremlin-javascript to use GraphBinary as default instead of GraphSONv3
103103
* Added the `asNumber()` step to perform number conversion.
104+
* Changed `repeat()` to make `repeatTraversal` global rather than a mix of local and global.
104105
* Renamed many types in the grammar for consistent use of terms "Literal", "Argument", and "Varargs".
105106
* Changed `gremlin-net` so that System.Text.Json is only listed as an explicit dependency when it is not available from the framework.
106107
* Fixed translation of numeric literals for Go losing type definitions.

docs/src/dev/developer/for-committers.asciidoc

Lines changed: 2 additions & 0 deletions
@@ -624,6 +624,8 @@ mid-`E()` step is not supported.
624624
mid-`V()` step is not supported.
625625
* `@GraphComputerVerificationOneBulk` - The scenario will not work because `withBulk(false)` is configured and that
626626
is not compatible with `GraphComputer`
627+
* `@GraphComputerVerificationOrderingNotSupported` - The scenario will not work with `GraphComputer` because ordering
628+
within `repeat()` is not supported.
627629
* `@GraphComputerVerificationReferenceOnly` - The scenario itself is not written to support `GraphComputer` because it
628630
tries to reference inaccessible properties that are on elements only available by "reference" (i.e `T.id` only).
629631
* `@GraphComputerVerificationStrategyNotSupported` - The scenario uses a traversal strategy that is not supported by

docs/src/dev/provider/gremlin-semantics.asciidoc

Lines changed: 4 additions & 1 deletion
@@ -2004,7 +2004,7 @@ link:https://tinkerpop.apache.org/docs/x.y.z/reference/#project-step[reference]
20042004
[[repeat-step]]
20052005
=== repeat()
20062006
2007-
*Description:* Iteratively applies a traversal (the "loop body") to each incoming traverser until a stopping
2007+
*Description:* Iteratively applies a traversal (the "loop body") to all incoming traversers until a stopping
20082008
condition is met. Optionally, it can emit traversers on each iteration according to an emit predicate. The
20092009
repeat step supports loop naming and a loop counter via `loops()`.
20102010
@@ -2045,6 +2045,9 @@ predicates are evaluated before the first iteration (pre) or after each iteratio
20452045
`do/while` semantics respectively:
20462046
- Pre-check / pre-emit: when the modulator appears before `repeat(...)`.
20472047
- Post-check / post-emit: when the modulator appears after `repeat(...)`.
2048+
- Global traversal scope: The `repeatTraversal` is a global child. This means all traversers entering the repeat body
2049+
are processed together as a unified stream with global semantics. `Barrier` steps (`order()`, `sample()`, etc.) within
2050+
the repeat traversal operate across all traversers collectively rather than in isolation per traverser.
20482051
- Loop counter semantics:
20492052
- The loop counter for a given named or unnamed repeat is incremented once per completion of the loop body (i.e.,
20502053
after the body finishes), not before. Therefore, `loops()` reflects the number of completed iterations.
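
The pre-check/post-check and loop counter bullets in this hunk map onto ordinary `while` and `do/while` loops. The following is a minimal plain-Java sketch for a single traverser, not TinkerPop code: the class `RepeatModulatorSketch` and its method names are hypothetical, and the `int` value stands in for the traverser's current object.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

// Hypothetical single-traverser model of until()/repeat() placement.
final class RepeatModulatorSketch {

    // until(...).repeat(...): the predicate is checked before the first
    // iteration, which corresponds to while semantics.
    static int[] untilThenRepeat(final int start, final IntUnaryOperator body, final IntPredicate until) {
        int value = start;
        int loops = 0;
        while (!until.test(value)) {
            value = body.applyAsInt(value);
            loops++; // incremented only after the body has completed
        }
        return new int[]{value, loops};
    }

    // repeat(...).until(...): the body runs once before the predicate is
    // checked, which corresponds to do/while semantics.
    static int[] repeatThenUntil(final int start, final IntUnaryOperator body, final IntPredicate until) {
        int value = start;
        int loops = 0;
        do {
            value = body.applyAsInt(value);
            loops++; // incremented only after the body has completed
        } while (!until.test(value));
        return new int[]{value, loops};
    }

    public static void main(final String[] args) {
        final int[] pre = untilThenRepeat(10, x -> x + 1, x -> x >= 10);
        final int[] post = repeatThenUntil(10, x -> x + 1, x -> x >= 10);
        System.out.println("pre-check:  value=" + pre[0] + " loops=" + pre[1]);
        System.out.println("post-check: value=" + post[0] + " loops=" + post[1]);
    }
}
```

When the start value already satisfies the predicate, the pre-check form returns it untouched with a loop count of 0, while the post-check form still runs the body once and reports 1 completed iteration.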

docs/src/upgrade/release-3.8.x.asciidoc

Lines changed: 154 additions & 0 deletions
@@ -262,6 +262,104 @@ gremlin> g.inject([Float.MAX_VALUE, Float.MAX_VALUE], [Double.MAX_VALUE, Double.
262262
263263
See link:https://issues.apache.org/jira/browse/TINKERPOP-3115[TINKERPOP-3115]
264264
265+
==== repeat() Step Global Children Semantics Change
266+
267+
The `repeat()` step has been updated to treat the repeat traversal as a global child in all cases. Previously, the
268+
repeat traversal behaved as a hybrid between local and global semantics, which could lead to unexpected results in
269+
certain scenarios. The repeat traversal started off as a local child but as traversers were added back per iteration,
270+
it behaved more like a global child.
271+
272+
With this change, the repeat traversal now consistently operates with global semantics, meaning that all traversers
273+
are processed together rather than being processed per traverser. This provides more predictable behavior and aligns
274+
with the semantics of other steps.
275+
276+
[source,text]
277+
----
278+
// In 3.7.x and earlier, the order() was applied separately for each starting traverser.
279+
// Notice how the results are grouped by marko, then vadas, then lop
280+
gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
281+
......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
282+
==>[marko,lop,josh]
283+
==>[marko,josh,lop]
284+
==>[marko,lop,peter]
285+
==>[marko,josh,ripple]
286+
==>[vadas,marko,josh]
287+
==>[vadas,marko,lop]
288+
==>[lop,marko,josh]
289+
==>[lop,josh,marko]
290+
==>[lop,josh,ripple]
291+
==>[lop,marko,vadas]
292+
293+
// In 3.8.0, the repeat now consistently uses global semantics
294+
// Results are ordered by the name reached in the final iteration; ties keep the order from the previous iteration
295+
gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
296+
......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
297+
==>[marko,lop,josh]
298+
==>[vadas,marko,josh]
299+
==>[lop,marko,josh]
300+
==>[marko,josh,lop]
301+
==>[vadas,marko,lop]
302+
==>[lop,josh,marko]
303+
==>[marko,lop,peter]
304+
==>[marko,josh,ripple]
305+
==>[lop,josh,ripple]
306+
==>[lop,marko,vadas]
307+
----
308+
309+
This change may affect traversals that relied on the previous hybrid behavior, particularly those using side effects
310+
or barrier steps within `repeat()`. Review any traversals using `repeat()` with steps like `aggregate()`, `store()`,
311+
or other barrier steps to ensure they produce the expected results.
312+
313+
If you would like `repeat()` to behave similarly to how it did in 3.7.x, then you should wrap the repeat inside a
314+
`local()`. The following example demonstrates this:
315+
316+
[source,text]
317+
----
318+
// In 3.7.x
319+
gremlin> g.V().repeat(both().simplePath().order().by("name")).times(2).path().by("name")
320+
==>[marko,lop,josh]
321+
==>[marko,josh,lop]
322+
==>[marko,lop,peter]
323+
==>[marko,josh,ripple]
324+
==>[vadas,marko,josh]
325+
==>[vadas,marko,lop]
326+
==>[lop,marko,josh]
327+
==>[lop,josh,marko]
328+
==>[lop,josh,ripple]
329+
==>[lop,marko,vadas]
330+
==>[josh,marko,lop]
331+
==>[josh,lop,marko]
332+
==>[josh,lop,peter]
333+
==>[josh,marko,vadas]
334+
==>[ripple,josh,lop]
335+
==>[ripple,josh,marko]
336+
==>[peter,lop,josh]
337+
==>[peter,lop,marko]
338+
339+
// In 3.8.0, placing the repeat inside a local will again cause the repeat traversal to apply per traverser (locally)
340+
gremlin> g.V().local(repeat(both().simplePath().order().by("name")).times(2)).path().by("name")
341+
==>[marko,lop,josh]
342+
==>[marko,josh,lop]
343+
==>[marko,lop,peter]
344+
==>[marko,josh,ripple]
345+
==>[vadas,marko,josh]
346+
==>[vadas,marko,lop]
347+
==>[lop,marko,josh]
348+
==>[lop,josh,marko]
349+
==>[lop,josh,ripple]
350+
==>[lop,marko,vadas]
351+
==>[josh,marko,lop]
352+
==>[josh,lop,marko]
353+
==>[josh,lop,peter]
354+
==>[josh,marko,vadas]
355+
==>[ripple,josh,lop]
356+
==>[ripple,josh,marko]
357+
==>[peter,lop,josh]
358+
==>[peter,lop,marko]
359+
----
360+
361+
See: link:https://issues.apache.org/jira/browse/TINKERPOP-3200[TINKERPOP-3200]
362+
265363
==== Prefer OffsetDateTime
266364
267365
The default implementation for date type in Gremlin is now changed from the `java.util.Date` to the more encompassing
@@ -1168,6 +1266,62 @@ The `ChooseStep` now provides a `ChooseSemantics` enum which helps indicate if t
11681266
11691267
See: link:https://issues.apache.org/jira/browse/TINKERPOP-3178[TINKERPOP-3178]
11701268
1269+
===== repeat() Step Global Children Semantics Change
1270+
1271+
The `RepeatStep` has been updated to consistently treat the repeat traversal as a global child rather than using
1272+
hybrid local/global semantics. This change affects how the repeat traversal processes traversers and interacts with
1273+
the parent traversal.
1274+
1275+
Previously, `RepeatStep` would start with local semantics for the first iteration and then switch to global semantics
1276+
for the subsequent iterations, which created inconsistencies in how side effects and barriers behaved within the repeat
1277+
traversal. The biggest change will be to `Barrier` steps in the repeat traversal as they will now have access to all
1278+
the starting traversers.
1279+
1280+
[source,text]
1281+
----
1282+
// In 3.7.x and earlier, the order() was applied separately for each starting traverser.
1283+
// Notice how the results are grouped by marko, then vadas, then lop
1284+
gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
1285+
......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
1286+
==>[marko,lop,josh]
1287+
==>[marko,josh,lop]
1288+
==>[marko,lop,peter]
1289+
==>[marko,josh,ripple]
1290+
==>[vadas,marko,josh]
1291+
==>[vadas,marko,lop]
1292+
==>[lop,marko,josh]
1293+
==>[lop,josh,marko]
1294+
==>[lop,josh,ripple]
1295+
==>[lop,marko,vadas]
1296+
1297+
// In 3.8.0, the repeat now consistently uses global semantics
1298+
// Results are ordered by the name reached in the final iteration; ties keep the order
1299+
// produced by the previous iteration
1300+
gremlin> g.withoutStrategies(RepeatUnrollStrategy).V(1, 2, 3).
1301+
......1> repeat(both().simplePath().order().by("name")).times(2).path().by("name")
1302+
==>[marko,lop,josh]
1303+
==>[vadas,marko,josh]
1304+
==>[lop,marko,josh]
1305+
==>[marko,josh,lop]
1306+
==>[vadas,marko,lop]
1307+
==>[lop,josh,marko]
1308+
==>[marko,lop,peter]
1309+
==>[marko,josh,ripple]
1310+
==>[lop,josh,ripple]
1311+
==>[lop,marko,vadas]
1312+
----
1313+
1314+
Providers implementing custom optimizations or strategies around `RepeatStep` should verify that their
1315+
implementations account for the repeat traversal being a global child. This particularly affects:
1316+
1317+
- Strategies that analyze or transform repeat traversals
1318+
- Optimizations that depend on the scope semantics of child traversals
1319+
1320+
The last point about optimizations may be particularly important for providers that have memory constraints as this
1321+
change may bring about higher memory usage due to more traversers needing to be held in memory.
1322+
1323+
See: link:https://issues.apache.org/jira/browse/TINKERPOP-3200[TINKERPOP-3200]
1324+
11711325
===== Prefer OffsetDateTime
11721326
11731327
The default implementation for date type in Gremlin is now changed from the deprecated `java.util.Date` to the more
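
The 3.7.x and 3.8.0 listings in the `repeat()` upgrade notes in this file can be reproduced without a graph engine. The following is a plain-Java sketch, not TinkerPop code: it hard-codes the adjacency of the "modern" toy graph and models `repeat(both().simplePath().order().by("name")).times(n)`. The class `RepeatScopeSketch`, its method names, and the assumption that the barrier sorts stably are illustrative choices, not part of the commit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of repeat(both().simplePath().order().by("name")).times(n).
final class RepeatScopeSketch {

    // Undirected adjacency of the "modern" toy graph, keyed by vertex name.
    static final Map<String, List<String>> BOTH = new LinkedHashMap<>();
    static {
        BOTH.put("marko", List.of("lop", "vadas", "josh"));
        BOTH.put("vadas", List.of("marko"));
        BOTH.put("lop", List.of("marko", "josh", "peter"));
        BOTH.put("josh", List.of("ripple", "lop", "marko"));
        BOTH.put("ripple", List.of("josh"));
        BOTH.put("peter", List.of("lop"));
    }

    // One pass of the loop body over a whole frontier: both(), simplePath(),
    // then a single order().by("name") barrier across every path in it.
    static List<List<String>> body(final List<List<String>> frontier) {
        final List<List<String>> next = new ArrayList<>();
        for (final List<String> path : frontier) {
            for (final String neighbor : BOTH.get(path.get(path.size() - 1))) {
                if (!path.contains(neighbor)) { // simplePath()
                    final List<String> extended = new ArrayList<>(path);
                    extended.add(neighbor);
                    next.add(extended);
                }
            }
        }
        // List.sort is stable, so ties keep the order of the incoming frontier.
        next.sort(Comparator.comparing((List<String> p) -> p.get(p.size() - 1)));
        return next;
    }

    // Global child: all starts share one frontier, so the barrier sees them all.
    static List<List<String>> globalRepeat(final List<String> starts, final int times) {
        List<List<String>> frontier = new ArrayList<>();
        for (final String start : starts) {
            frontier.add(List.of(start));
        }
        for (int i = 0; i < times; i++) {
            frontier = body(frontier);
        }
        return frontier;
    }

    // local(repeat(...)): the whole loop runs separately for each start.
    static List<List<String>> localRepeat(final List<String> starts, final int times) {
        final List<List<String>> results = new ArrayList<>();
        for (final String start : starts) {
            results.addAll(globalRepeat(List.of(start), times));
        }
        return results;
    }

    public static void main(final String[] args) {
        final List<String> starts = List.of("marko", "vadas", "lop"); // V(1, 2, 3)
        globalRepeat(starts, 2).forEach(System.out::println);
        System.out.println("--");
        localRepeat(starts, 2).forEach(System.out::println);
    }
}
```

For the starts `marko`, `vadas`, `lop` and two iterations, `globalRepeat` yields the paths in the order of the 3.8.0 listing and `localRepeat` yields them grouped by start as in the 3.7.x listing. The global variant also holds every start's paths in one frontier at the barrier, which is the memory consideration raised for providers.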

gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/branch/RepeatStep.java

Lines changed: 31 additions & 12 deletions
@@ -21,9 +21,12 @@
2121
import org.apache.tinkerpop.gremlin.process.traversal.Step;
2222
import org.apache.tinkerpop.gremlin.process.traversal.Traversal;
2323
import org.apache.tinkerpop.gremlin.process.traversal.Traverser;
24+
import org.apache.tinkerpop.gremlin.process.traversal.step.Barrier;
2425
import org.apache.tinkerpop.gremlin.process.traversal.step.TraversalParent;
2526
import org.apache.tinkerpop.gremlin.process.traversal.step.util.ComputerAwareStep;
2627
import org.apache.tinkerpop.gremlin.process.traversal.traverser.TraverserRequirement;
28+
import org.apache.tinkerpop.gremlin.process.traversal.util.FastNoSuchElementException;
29+
import org.apache.tinkerpop.gremlin.process.traversal.util.TraversalHelper;
2730
import org.apache.tinkerpop.gremlin.process.traversal.util.TraversalUtil;
2831
import org.apache.tinkerpop.gremlin.structure.util.StringFactory;
2932
import org.apache.tinkerpop.gremlin.util.iterator.IteratorUtils;
@@ -43,6 +46,7 @@ public final class RepeatStep<S> extends ComputerAwareStep<S, S> implements Trav
4346
private Traversal.Admin<S, S> repeatTraversal = null;
4447
private Traversal.Admin<S, ?> untilTraversal = null;
4548
private Traversal.Admin<S, ?> emitTraversal = null;
49+
private boolean first = true;
4650
private String loopName = null;
4751
public boolean untilFirst = false;
4852
public boolean emitFirst = false;
@@ -206,20 +210,20 @@ protected Iterator<Traverser.Admin<S>> standardAlgorithm() throws NoSuchElementE
206210
throw new IllegalStateException("The repeat()-traversal was not defined: " + this);
207211

208212
while (true) {
209-
if (this.repeatTraversal.getEndStep().hasNext()) {
213+
if (!first && this.repeatTraversal.getEndStep().hasNext()) {
210214
return this.repeatTraversal.getEndStep();
211215
} else {
212-
final Traverser.Admin<S> start = this.starts.next();
213-
start.initialiseLoops(this.getId(), this.loopName);
214-
if (doUntil(start, true)) {
215-
start.resetLoops();
216-
return IteratorUtils.of(start);
217-
}
218-
this.repeatTraversal.addStart(start);
219-
if (doEmit(start, true)) {
220-
final Traverser.Admin<S> emitSplit = start.split();
221-
emitSplit.resetLoops();
222-
return IteratorUtils.of(emitSplit);
216+
this.first = false;
217+
if (TraversalHelper.hasStepOfAssignableClassRecursively(Barrier.class, repeatTraversal)) {
218+
// If the repeatTraversal has a Barrier then make sure that all starts are added to the
219+
// repeatTraversal before it is iterated so that RepeatStep always has "global" children.
220+
if (!this.starts.hasNext())
221+
throw FastNoSuchElementException.instance();
222+
while (this.starts.hasNext()) {
223+
processTraverser(this.starts.next());
224+
}
225+
} else {
226+
return processTraverser(this.starts.next());
223227
}
224228
}
225229
}
@@ -249,6 +253,21 @@ protected Iterator<Traverser.Admin<S>> computerAlgorithm() throws NoSuchElementE
249253
}
250254
}
251255

256+
private Iterator<Traverser.Admin<S>> processTraverser(final Traverser.Admin<S> start) {
257+
start.initialiseLoops(this.getId(), this.loopName);
258+
if (doUntil(start, true)) {
259+
start.resetLoops();
260+
return IteratorUtils.of(start);
261+
}
262+
this.repeatTraversal.addStart(start);
263+
if (doEmit(start, true)) {
264+
final Traverser.Admin<S> emitSplit = start.split();
265+
emitSplit.resetLoops();
266+
return IteratorUtils.of(emitSplit);
267+
}
268+
return Collections.emptyIterator();
269+
}
270+
252271
/////////////////////////
253272

254273
public static <A, B, C extends Traversal<A, B>> C addRepeatToTraversal(final C traversal, final Traversal.Admin<B, B> repeatTraversal) {
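
The control-flow change in `standardAlgorithm()` above, where every available start is added to the repeat traversal before it is iterated when a `Barrier` is present, can be pictured with a small plain-Java model. This is a sketch under stated assumptions rather than the actual `RepeatStep` logic: `DrainStartsSketch`, `SortingBody`, and both method names are hypothetical, and the body simply expands each start into two children and sorts whatever it has buffered.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

final class DrainStartsSketch {

    // Hypothetical stand-in for a repeat body that ends in a barrier: each
    // start expands to two children, and everything buffered so far is
    // released in sorted order when the body is drained.
    static final class SortingBody {
        private final List<String> buffered = new ArrayList<>();

        void addStart(final String start) {
            buffered.add(start + "2");
            buffered.add(start + "1");
        }

        List<String> drain() {
            final List<String> out = new ArrayList<>(buffered);
            buffered.clear();
            Collections.sort(out);
            return out;
        }
    }

    // Pre-change shape: hand over one start, drain the body, then pull the
    // next start, so the barrier only ever sees one start's children.
    static List<String> oneStartAtATime(final Iterator<String> starts) {
        final SortingBody body = new SortingBody();
        final List<String> results = new ArrayList<>();
        while (starts.hasNext()) {
            body.addStart(starts.next());
            results.addAll(body.drain());
        }
        return results;
    }

    // Post-change shape for bodies containing a barrier: add every start
    // first, then drain once, so the barrier sees all children together.
    static List<String> allStartsFirst(final Iterator<String> starts) {
        final SortingBody body = new SortingBody();
        while (starts.hasNext()) {
            body.addStart(starts.next());
        }
        return body.drain();
    }

    public static void main(final String[] args) {
        System.out.println(oneStartAtATime(List.of("b", "a").iterator()));
        System.out.println(allStartsFirst(List.of("b", "a").iterator()));
    }
}
```

With the starts `b` then `a`, the first method sorts within each start's children only, while the second sorts across all of them.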
