feat: Use "natural" stop sequences to condense GL derived limits #1271
base: master
Outside of the code, I would like @shantigonzales and @fsaid90 to chime in on this. Whether this is better really depends on what this feature is designed to be.

Basically, my concern is this: limits (as in limits we put in Arrow, not (necessarily) these derived limits) are a fairly simple, crude concept. If I have a limit between A and B, that limit looks for trips that visit both A and B and cuts out the A->B segment, replacing each such trip with 0 - 2 new trips. A limit won't affect trips that don't visit both stops. So if, for example, we have a limit between Government Center and Heath Street (E line), it won't affect B, C, D trips between Government Center and Kenmore because they don't hit Heath Street, even though they share a segment that's probably closed (Government Center -> Copley). So, to fully model the example outage, you need two limits: GC->Copley (B, C, D, E) and Copley->Heath (E).

So, looking at the "new" derived limits of this feature, we see:

If we think of this feature as helping someone looking at a track diagram understand what's closed, this is definitely easier to follow than the original (more so if the redundant Babcock -> Copley is removed, since it's a subset of Babcock -> North Station, but let's not get obsessive here).

On the other hand, if we think of this as a literal documentation of the equivalent "limits", someone trying to apply these limits manually to replicate the outage would end up missing the C and D line trips, because none of those trips hit both endpoints of any of these limits. Which is what is being captured in the original implementation:
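To make the limit semantics described in this comment concrete, here is a minimal Python sketch. It is not Arrow's implementation: the function names (`visits_both`, `apply_limit`, `affected_routes`) and the heavily shortened stop lists are hypothetical, chosen only to reproduce the Government Center / Heath Street example.

```python
# Illustrative model of a "limit" as described in the comment above.
# Stop lists are heavily simplified and hypothetical; real trips visit many more stops.
TRIPS = {
    "B": ["Government Center", "Copley", "Kenmore", "Boston College"],
    "C": ["Government Center", "Copley", "Kenmore", "Cleveland Circle"],
    "D": ["Government Center", "Copley", "Kenmore", "Riverside"],
    "E": ["Government Center", "Copley", "Heath Street"],
}


def visits_both(stops, limit):
    """A limit only applies to a trip that visits both of its endpoints."""
    a, b = limit
    return a in stops and b in stops


def apply_limit(stops, limit):
    """Cut the segment between the limit's endpoints out of one trip.

    Returns 0 to 2 remaining trips. A trip that does not visit both
    endpoints is returned unchanged.
    """
    if not visits_both(stops, limit):
        return [stops]
    a, b = limit
    i, j = sorted((stops.index(a), stops.index(b)))
    pieces = [stops[: i + 1], stops[j:]]
    # A leftover piece with a single stop is not a usable trip.
    return [piece for piece in pieces if len(piece) >= 2]


def affected_routes(limits, trips):
    """Routes whose trip is touched by at least one of the given limits."""
    return sorted(
        route
        for route, stops in trips.items()
        if any(visits_both(stops, limit) for limit in limits)
    )


single_limit = [("Government Center", "Heath Street")]
split_limits = [("Government Center", "Copley"), ("Copley", "Heath Street")]
```

Under these assumptions, `affected_routes(single_limit, TRIPS)` only finds the E trip, while `affected_routes(split_limits, TRIPS)` finds all four branches, which is the gap the comment is pointing at.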
No code concerns. For now, I'm withholding the ✅ pending confirmation from our stakeholders that they prefer this version, but otherwise I think this is good to go.
Those are good points, Walton, and I'd definitely want to hear Shanti's thoughts as well before I give my own (I don't want to bias her!), BUT I do also wonder: since we are deriving the actual GL branch, perhaps grouping them in the UI per branch (and indicating the actual branch) might make things more neat and tidy in general (per Jon's original limits derivation implementation). What do you both think? (Jon and Walton, while Shanti's out :) )
One factor that occurred to me in the discussion this morning: this new version is much closer to how we talk about limits and put them in tickets. So if the purpose is just validating that the HASTUS export does what we expect it to do, this is no doubt superior.
My 2c, which echoes what Walton said to some extent: this is much more intuitive to me as a non-technical user, and will be helpful in streamlining validation. In Walton's example, I'm not entirely sure how those C + D line trips get captured, but that's more technical than program. From a non-code perspective, I like how this is approaching the problem.
Summary of changes
Asana Ticket: 🏹 Improve limits derivation for complex GL disruptions
Summary of change
The overall idea is to compare stops visited by exported service against "natural" stop sequences, instead of canonical stop sequences.
For example, a disruption extending from Babcock to North Station is not possible to describe against one GL route's stop sequence because Green-B turns around at Gov Ctr. The old logic would produce 2 (or more) limits for this--Babcock to Gov Ctr, and Gov Ctr to North Station.
The new logic uses @arkadyan's unrooted_polytree data structure to convert a set of canonical stop sequences for a line to a set of "natural" stop sequences that represent the longest possible runs from one end of the line to the other, ignoring how many intra-line transfers you'd need to make.
Then, we compare exported service against these and do some additional steps to condense the resulting limits into a minimal number of maximally-long segments.
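As a rough illustration of the idea (not the unrooted_polytree implementation itself, and written in Python rather than this project's code), the sketch below treats canonical stop sequences as directed edges and walks every end-to-end path through the resulting graph. The stop lists are abbreviated and hypothetical, the function names are made up for this example, and it assumes the combined graph has no cycles.

```python
from collections import defaultdict

# Abbreviated, hypothetical canonical stop sequences, all in the same direction.
CANONICAL = [
    ["Boston College", "Babcock", "Kenmore", "Copley", "Government Center"],  # B
    ["Heath Street", "Copley", "Government Center", "North Station", "Medford/Tufts"],  # E
    ["Riverside", "Kenmore", "Copley", "Government Center", "North Station", "Union Square"],  # D
]


def natural_stop_sequences(canonical_sequences):
    """Every path from a stop with no predecessor to a stop with no successor.

    Assumes the graph formed by the canonical sequences is acyclic.
    """
    successors = defaultdict(set)
    has_predecessor = set()
    stops = set()
    for sequence in canonical_sequences:
        stops.update(sequence)
        for current, following in zip(sequence, sequence[1:]):
            successors[current].add(following)
            has_predecessor.add(following)

    def walk(path):
        next_stops = sorted(successors.get(path[-1], ()))
        if not next_stops:
            yield path  # reached the far end of the line
        for stop in next_stops:
            yield from walk(path + [stop])

    sources = sorted(stops - has_predecessor)
    return [path for source in sources for path in walk([source])]


def covers(sequence, run):
    """True if `run` appears as a contiguous stretch of `sequence`."""
    n = len(run)
    return any(sequence[i : i + n] == run for i in range(len(sequence) - n + 1))


NATURAL = natural_stop_sequences(CANONICAL)

# The Babcock -> North Station disruption from the example above.
CLOSED = ["Babcock", "Kenmore", "Copley", "Government Center", "North Station"]
```

With this toy data, no canonical sequence covers `CLOSED` as a single run, but a natural sequence starting at Boston College and continuing past Government Center does, so the closure can be described as one limit instead of two.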
Example
Here's how this improves limits derivation for 2025-spring-GLBabcockNorthStation-v2.zip -- see the associated Asana ticket.
Left is the output of the existing logic in the main branch; right is the output of the new logic.

TO DO
Update ExportUploadTest.build_gtfs/1 to add platform <-> parent station relations for all inserted stops, since these are required by the new logic.
Reviewer Checklist