Commit 504bd9c

kenhuuu and Cole-Greer committed
TINKERPOP-3196 Split bulked traversers for LocalStep
Local was behaving as a per traverser traversal when it should be a per object traversal. This means that there should be no bulked traversers that are added as starts to local traversals.

Co-authored-by: Cole-Greer <[email protected]>
1 parent 65a438a commit 504bd9c

7 files changed: +178 -27 lines changed

CHANGELOG.asciidoc

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ This release also includes changes from <<release-3-7-XXX, 3.7.XXX>>.
 * Renamed `none()` step to `discard()`.
 * Repurposed `none()` step as a list filtering step with the signature `none(P)`.
 * Modified mathematical operators to prevent overflows in steps such as `sum()` and 'sack()' to prefer promotion to the next highest number type.
+* Modified `local()` to be "object-local" rather than "traverser-local".
 * Added `DateTime` ontop of the existing 'datetime' grammar.
 * Added `UUID()` and `UUID(value)` to grammar.
 * Deprecated the `UnifiedChannelizer`.

docs/src/dev/provider/gremlin-semantics.asciidoc

Lines changed: 30 additions & 0 deletions
@@ -1589,6 +1589,36 @@ See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/j
 link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/LengthLocalStep.java[source (local)],
 link:https://tinkerpop.apache.org/docs/x.y.z/reference/#length-step[reference]
 
+[[local-step]]
+=== local()
+
+*Description:* Executes the provided traversal in an object-local manner.
+
+*Syntax:* `local(Traversal localTraversal)`
+
+[width="100%",options="header"]
+|=========================================================
+|Start Step |Mid Step |Modulated |Domain |Range
+|N |Y |N |`any` |`any`
+|=========================================================
+
+*Arguments:*
+
+* `localTraversal` - The traversal that processes each single-object traverser individually.
+
+*Modulation:*
+
+None
+
+*Considerations:*
+
+The `local()` step enforces object-local execution. As a branching step with local children, it implements strict lazy
+evaluation by passing a single traverser at a time to the local traversal (bulk of exactly one, if bulking is supported)
+and resetting the traversal to clean state between executions.
+
+See: link:https://github.com/apache/tinkerpop/tree/x.y.z/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/map/LocalStep.java[source],
+link:https://tinkerpop.apache.org/docs/x.y.z/reference/#local-step[reference]
+
 [[intersect-step]]
 === intersect()
 

docs/src/reference/the-traversal.asciidoc

Lines changed: 4 additions & 19 deletions
@@ -2557,30 +2557,15 @@ in an object-local traversal. As such, the `order().by()` and the `limit()` refe
 stream as a whole.
 
 Local Step is quite similar in functionality to <<general-steps,Flat Map Step>> where it can often be confused.
-`local()` propagates the traverser through the internal traversal as is without splitting/cloning it. Thus, its
-a “global traversal” with local processing. Its use is subtle and primarily finds application in compilation
-optimizations (i.e. when writing `TraversalStrategy` implementations. As another example consider:
+The primary distinction between these steps is that while `local()` preserves the path history of traversers as they
+pass through its child traversal, `flatMap()` does not. As another example consider:
 
 [gremlin-groovy,modern]
 ----
-g.V().both().barrier().flatMap(groupCount().by("name"))
-g.V().both().barrier().local(groupCount().by("name"))
+g.V().local(outE().inV()).path()
+g.V().flatMap(outE().inV()).path()
 ----
 
-Use of `local()` is often a mistake. This is especially true when its argument contains a reducing step. For example,
-let's say the requirement was to count the number of properties per `Vertex` in:
-
-[gremlin-groovy,modern]
-----
-g.V().both().local(properties('name','age').count()) <1>
-g.V().both().map(properties('name','age').count()) <2>
-----
-
-<1> The output here seems impossible because no single vertex in the "modern" graph can have more than two properties
-given the "name" and "age" filters, but because the counting is happening object-local the counting is occurring unique
-to each object rather than each global traverser.
-<2> Replacing `local()` with `map()` returns the result desired by the requirement.
-
 WARNING: The anonymous traversal of `local()` processes the current object "locally." In OLAP, where the atomic unit
 of computing is the vertex and its local "star graph," it is important that the anonymous traversal does not leave
 the confines of the vertex's star graph. In other words, it can not traverse to an adjacent vertex's properties or edges.

docs/src/upgrade/release-3.8.x.asciidoc

Lines changed: 28 additions & 0 deletions
@@ -425,6 +425,34 @@ compatibility.
 
 See: link:https://issues.apache.org/jira/browse/TINKERPOP-3161[TINKERPOP-3161]
 
+==== Split bulked traversers for `local()`
+
+Prior to 3.8.0, local() exhibited "traverser-local" semantics, where the local traversal would apply independently to
+each individual `Traverser`. This often led to confusion, especially in the presence of reducing barrier steps, as
+bulked traversers would cause multiple objects to be processed at once. local() has been updated to automatically split
+any bulked traversers and thus now exhibits true "object-local" semantics.
+
+[source,groovy]
+----
+// 3.7.4
+gremlin> g.V().out().barrier().local(count())
+==>3
+==>1
+==>1
+==>1
+
+// 3.8.0
+gremlin> g.V().out().barrier().local(count())
+==>1
+==>1
+==>1
+==>1
+==>1
+==>1
+----
+
+See: link:https://issues.apache.org/jira/browse/TINKERPOP-3196[TINKERPOP-3196]
+
 ==== Removal of P.getOriginalValue()
 
 `P.getOriginalValue()` has been removed as it was not offering much value and was often confused with `P.getValue()`.
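
The change in `local(count())` output shown in the "Split bulked traversers for `local()`" upgrade note above can be reproduced in miniature without TinkerPop. The following is a minimal sketch, not TinkerPop code: the class `LocalCountSemantics` and its methods are hypothetical names, `barrier()` is modeled as nothing more than merging equal objects into a bulk count, and the input list assumes that `g.V().out()` on the modern graph reaches `lop` three times and `vadas`, `josh` and `ripple` once each.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: contrasts "traverser-local" and "object-local" counting. */
final class LocalCountSemantics {

    /** Hypothetical model of barrier(): merge equal objects into (object -> bulk). */
    static Map<String, Long> barrier(final List<String> objects) {
        final Map<String, Long> bulks = new LinkedHashMap<>();
        for (final String object : objects) {
            bulks.merge(object, 1L, Long::sum);
        }
        return bulks;
    }

    /** "Traverser-local": count() runs once per bulked traverser and reports its bulk. */
    static List<Long> traverserLocalCount(final Map<String, Long> bulks) {
        return new ArrayList<>(bulks.values());
    }

    /** "Object-local": every traverser is split to bulk 1 first, so each run counts one object. */
    static List<Long> objectLocalCount(final Map<String, Long> bulks) {
        final List<Long> counts = new ArrayList<>();
        for (final long bulk : bulks.values()) {
            for (long i = 0L; i < bulk; i++) {
                counts.add(1L);
            }
        }
        return counts;
    }
}
```

Under those assumptions, feeding `lop, vadas, josh, ripple, lop, lop` through `barrier()` and then `traverserLocalCount` gives `[3, 1, 1, 1]`, while `objectLocalCount` gives six ones, which lines up with the 3.7.4 and 3.8.0 console output in the note.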

gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/branch/LocalStep.java

Lines changed: 19 additions & 3 deletions
@@ -23,6 +23,7 @@
 import org.apache.tinkerpop.gremlin.process.traversal.step.TraversalParent;
 import org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep;
 import org.apache.tinkerpop.gremlin.process.traversal.traverser.TraverserRequirement;
+import org.apache.tinkerpop.gremlin.process.traversal.traverser.util.EmptyTraverser;
 import org.apache.tinkerpop.gremlin.process.traversal.util.FastNoSuchElementException;
 import org.apache.tinkerpop.gremlin.structure.util.StringFactory;
 
@@ -38,6 +39,7 @@ public final class LocalStep<S, E> extends AbstractStep<S, E> implements Travers
 
     private Traversal.Admin<S, E> localTraversal;
     private boolean first = true;
+    private Traverser.Admin<S> currentStart = EmptyTraverser.instance();
 
     public LocalStep(final Traversal.Admin traversal, final Traversal.Admin<S, E> localTraversal) {
         super(traversal);
@@ -58,20 +60,34 @@ public Set<TraverserRequirement> getRequirements() {
     protected Traverser.Admin<E> processNextStart() throws NoSuchElementException {
         if (this.first) {
             this.first = false;
-            this.localTraversal.addStart(this.starts.next());
+            this.localTraversal.addStart(nextStart());
         }
         while (true) {
             if (this.localTraversal.hasNext())
                 return this.localTraversal.nextTraverser();
-            else if (this.starts.hasNext()) {
+            else if (hasStartRemaining()) {
                 this.localTraversal.reset();
-                this.localTraversal.addStart(this.starts.next());
+                this.localTraversal.addStart(nextStart());
             } else {
                 throw FastNoSuchElementException.instance();
             }
         }
     }
 
+    private boolean hasStartRemaining() {
+        return (currentStart.bulk() > 0L) || this.starts.hasNext();
+    }
+
+    private Traverser.Admin<S> nextStart() throws NoSuchElementException {
+        if (currentStart.bulk() == 0L) {
+            currentStart = starts.next();
+        }
+        final Traverser.Admin<S> split = currentStart.split();
+        split.setBulk(1L);
+        currentStart.setBulk(currentStart.bulk() - 1L);
+        return split;
+    }
+
     @Override
     public void reset() {
         super.reset();
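
The `hasStartRemaining()` / `nextStart()` pair added in this file can be looked at in isolation. Below is a minimal sketch of the same idea outside TinkerPop: keep the current bulked start around, hand out one unit-bulk copy per call, and only pull from upstream once the leftover bulk reaches zero. `UnitBulkSplitter`, `Bulked` and `drain` are hypothetical names invented for this illustration; the sketch tracks a `remaining` counter instead of calling `split()` and `setBulk()` on a real `Traverser.Admin`, and it rejects bulks below one, which is an assumption of the sketch rather than something the commit states.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Illustration only: emits each upstream value once per unit of bulk. */
final class UnitBulkSplitter<T> implements Iterator<T> {

    /** Hypothetical stand-in for a bulked traverser: a value plus a count of identical copies. */
    static final class Bulked<V> {
        final V value;
        final long bulk;

        Bulked(final V value, final long bulk) {
            if (bulk < 1L) throw new IllegalArgumentException("bulk must be >= 1");
            this.value = value;
            this.bulk = bulk;
        }
    }

    private final Iterator<Bulked<T>> starts;
    private T currentValue;       // value of the start currently being split
    private long remaining = 0L;  // unit-bulk copies still owed for currentValue

    UnitBulkSplitter(final Iterator<Bulked<T>> starts) {
        this.starts = starts;
    }

    @Override
    public boolean hasNext() {
        // Mirrors hasStartRemaining(): leftover bulk first, then upstream.
        return remaining > 0L || starts.hasNext();
    }

    @Override
    public T next() {
        // Mirrors nextStart(): pull a new start only when the current one is used up.
        if (remaining == 0L) {
            final Bulked<T> start = starts.next(); // throws NoSuchElementException when exhausted
            currentValue = start.value;
            remaining = start.bulk;
        }
        remaining -= 1L;
        return currentValue;
    }

    /** Convenience helper: fully splits a list of bulked starts. */
    static <V> List<V> drain(final List<Bulked<V>> input) {
        final List<V> out = new ArrayList<>();
        final UnitBulkSplitter<V> splitter = new UnitBulkSplitter<>(input.iterator());
        while (splitter.hasNext()) {
            out.add(splitter.next());
        }
        return out;
    }
}
```

Draining a start with value `lop` and bulk 3 followed by a start with value `vadas` and bulk 1 yields `lop, lop, lop, vadas`: four unit-bulk starts instead of two bulked ones, which is the shape of input the local traversal now receives.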

gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/branch/Choose.feature

Lines changed: 8 additions & 4 deletions
@@ -337,10 +337,14 @@ Feature: Step - choose()
     When iterated to list
     Then the result should be unordered
       | result |
-      | l[marko,marko] |
-      | l[vadas,vadas] |
-      | l[josh,josh] |
-      | l[peter,peter] |
+      | l[marko] |
+      | l[marko] |
+      | l[vadas] |
+      | l[vadas] |
+      | l[josh] |
+      | l[josh] |
+      | l[peter] |
+      | l[peter] |
 
   @GraphComputerVerificationMidVNotSupported
   Scenario: g_unionXV_VXhasLabelXpersonX_barrier_mapXchooseXageX_optionXbetweenX26_30X_name_foldX_optionXnone_name_foldXX

gremlin-test/src/main/resources/org/apache/tinkerpop/gremlin/test/features/branch/Local.feature

Lines changed: 88 additions & 1 deletion
@@ -183,4 +183,91 @@ Feature: Step - local()
       | m[{"name":"marko","project":"lop"}] |
       | m[{"name":"josh","project":"lop"}] |
       | m[{"name":"peter","project":"lop"}] |
-      | m[{"name":"josh","project":"ripple"}] |
+      | m[{"name":"josh","project":"ripple"}] |
+
+  # Barrier should show that bulked traversers don't affect local() traversal
+  Scenario: g_V_in_barrier_localXcountX
+    Given the modern graph
+    And the traversal of
+      """
+      g.V().in().barrier().local(__.count())
+      """
+    When iterated to list
+    Then the result should be unordered
+      | result |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+
+  # Path of traversers isn't hidden in local
+  @GraphComputerVerificationStarGraphExceeded
+  Scenario: g_V_localXout_in_simplePathX_path
+    Given the modern graph
+    And the traversal of
+      """
+      g.V().local(__.out().in().simplePath()).path()
+      """
+    When iterated to list
+    Then the result should be unordered
+      | result |
+      | p[v[josh],v[lop],v[marko]] |
+      | p[v[josh],v[lop],v[peter]] |
+      | p[v[marko],v[lop],v[josh]] |
+      | p[v[marko],v[lop],v[peter]] |
+      | p[v[peter],v[lop],v[marko]] |
+      | p[v[peter],v[lop],v[josh]] |
+
+  # Traverser's sack value should be carried over from local traversal
+  Scenario: g_withSackX0LX_V_in_barrier_localXsackXsumX_byXageXX_sack
+    Given the modern graph
+    And the traversal of
+      """
+      g.withSack(0L).V().in().barrier().local(__.sack(sum).by("age")).sack()
+      """
+    When iterated to list
+    Then the result should be unordered
+      | result |
+      | d[29].l |
+      | d[29].l |
+      | d[29].l |
+      | d[32].l |
+      | d[32].l |
+      | d[35].l |
+
+  # Nested local should return proper local count
+  Scenario: g_V_localXout_localXcountXX
+    Given the modern graph
+    And the traversal of
+      """
+      g.V().local(__.out().local(__.count()))
+      """
+    When iterated to list
+    Then the result should be unordered
+      | result |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+      | d[1].l |
+
+  # Local should be applied to union's global child
+  Scenario: g_V_unionXoutE_count_localXinE_countXX
+    Given the modern graph
+    And the traversal of
+      """
+      g.V().union(__.outE().count(), __.local(inE().count()))
+      """
+    When iterated to list
+    Then the result should be unordered
+      | result |
+      | d[6].l |
+      | d[0].l |
+      | d[1].l |
+      | d[3].l |
+      | d[1].l |
+      | d[1].l |
+      | d[0].l |
