Conversation

@jzimbel-mbta
Member

@jzimbel-mbta jzimbel-mbta commented Jun 13, 2025

Summary of changes

Asana Ticket: 🏹 Improve limits derivation for complex GL disruptions

Summary of change

The overall idea is to compare stops visited by exported service against "natural" stop sequences, instead of canonical stop sequences.

For example, a disruption extending from Babcock to North Station is not possible to describe against one GL route's stop sequence because Green-B turns around at Gov Ctr. The old logic would produce 2 (or more) limits for this--Babcock to Gov Ctr, and Gov Ctr to North Station.

The new logic uses @arkadyan's unrooted_polytree data structure to convert a set of canonical stop sequences for a line to a set of "natural" stop sequences that represent the longest possible runs from one end of the line to the other, ignoring how many intra-line transfers you'd need to make.
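As a rough illustration of the idea (not the `unrooted_polytree` implementation itself), the sketch below merges canonical stop sequences into one directed graph and enumerates every path from a stop with no predecessor to a stop with no successor. It is written in Python for illustration only; the function name `natural_sequences`, the abridged stop lists, and the assumption that all sequences run in the same direction without loops are hypothetical.

```python
from collections import defaultdict


def natural_sequences(canonical):
    """Return every maximal start-to-end path through the merged stop graph.

    Assumes all canonical sequences run in the same direction and that the
    merged graph has no cycles.
    """
    successors = defaultdict(set)
    has_predecessor = set()
    stops = []  # first-seen order, so the output is deterministic
    for sequence in canonical.values():
        for stop in sequence:
            if stop not in stops:
                stops.append(stop)
        for here, there in zip(sequence, sequence[1:]):
            successors[here].add(there)
            has_predecessor.add(there)

    def walk(path):
        nexts = sorted(successors[path[-1]])
        if not nexts:
            yield path  # reached a stop with nowhere further to go
        for stop in nexts:
            yield from walk(path + [stop])

    starts = [stop for stop in stops if stop not in has_predecessor]
    return [path for start in starts for path in walk([start])]


# Abridged, illustrative stop lists (not the real canonical sequences).
canonical = {
    "Green-B": ["Babcock", "Kenmore", "Copley", "Government Center"],
    "Green-E": ["Heath Street", "Copley", "Government Center", "North Station"],
}

for path in natural_sequences(canonical):
    print(" -> ".join(path))
```

With these inputs, one of the resulting paths runs from Babcock through Government Center to North Station, even though no single canonical sequence covers that whole run.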

Then, we compare exported service against these and do some additional steps to condense the resulting limits into a minimal number of maximally-long segments.
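The condensing step is not spelled out here, so the following is only a guess at its general shape: a Python sketch that treats each limit as an index range along one natural stop sequence and merges ranges that overlap or share an endpoint. The name `condense` and the abridged stop list are assumptions made for illustration.

```python
def condense(natural_sequence, limits):
    """Merge limits that overlap or touch along a single natural sequence."""
    index = {stop: i for i, stop in enumerate(natural_sequence)}
    # Normalize each limit to an ordered (start, end) pair of positions.
    spans = sorted(tuple(sorted((index[a], index[b]))) for a, b in limits)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlaps or shares a stop with the previous span: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(natural_sequence[s], natural_sequence[e]) for s, e in merged]


# Abridged, illustrative natural sequence.
natural = ["Babcock", "Kenmore", "Copley", "Government Center", "North Station"]
old_limits = [("Babcock", "Government Center"), ("Government Center", "North Station")]

print(condense(natural, old_limits))  # [('Babcock', 'North Station')]
```

This reproduces the Babcock to North Station example: the two limits the old logic would emit collapse into a single limit once they are measured against a natural sequence.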

Example

Here's how this improves limits derivation for 2025-spring-GLBabcockNorthStation-v2.zip (see the associated Asana ticket).

Left is output of the existing logic in main branch, right is output of the new logic.
[Image: side-by-side comparison of derived limits, existing logic (left) vs. new logic (right)]

TO DO

  • Update ExportUploadTest.build_gtfs/1 to add platform <-> parent station relations for all inserted stops, since these are required by the new logic.

Reviewer Checklist

  • Meets ticket's acceptance criteria
  • Any new or changed functions have typespecs
  • Tests were added for any new functionality (don't just rely on Codecov)
  • This branch was deployed to the staging environment and is currently running with no unexpected increase in warnings, and no errors or crashes.

@jzimbel-mbta jzimbel-mbta marked this pull request as ready for review June 23, 2025 11:35
@jzimbel-mbta jzimbel-mbta requested a review from a team as a code owner June 23, 2025 11:35
@jzimbel-mbta jzimbel-mbta requested review from Whoops and removed request for a team June 23, 2025 11:35
@Whoops
Contributor

Whoops commented Jun 23, 2025

Outside of the code, I would like @shantigonzales and @fsaid90 to chime in on this. Whether this is better really depends on what this feature is designed to be.

Basically, my concern is this: limits (as in the limits we put in Arrow, not necessarily these derived limits) are a fairly simple, crude concept. If I have a limit between A and B, that limit looks for trips that visit both A and B and cuts out the A->B segment, replacing each such trip with 0 to 2 new trips. A limit won't affect trips that don't visit both stops. So if, for example, we have a limit between Government Center and Heath Street (E line), it won't affect B, C, D trips between Government Center and Kenmore because they don't hit Heath Street, even though they share a segment that's probably closed (Government Center -> Copley). So, to fully model the example outage, you need two limits: GC->Copley (B, C, D, E) and Copley->Heath (E).
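The limit semantics described above can be sketched in a few lines of Python. This is an illustration only, not Arrow's implementation: the name `apply_limit`, the abridged stop lists, and the choice to keep the boundary stops A and B on the surviving pieces are all assumptions.

```python
def apply_limit(trip, a, b):
    """Return the trips that remain after closing the segment between a and b."""
    if a not in trip or b not in trip:
        return [trip]  # the trip does not visit both stops, so it is untouched
    start, end = sorted((trip.index(a), trip.index(b)))
    # Cut out the closed segment, keeping what runs up to A and on from B.
    pieces = [trip[: start + 1], trip[end:]]
    # A leftover single stop is not a trip, so 0, 1, or 2 trips survive.
    return [piece for piece in pieces if len(piece) > 1]


# Abridged, illustrative stop lists.
e_trip = ["Heath Street", "Copley", "Boylston", "Government Center", "North Station"]
b_trip = ["Babcock", "Kenmore", "Copley", "Boylston", "Government Center"]

print(apply_limit(e_trip, "Government Center", "Heath Street"))
print(apply_limit(b_trip, "Government Center", "Heath Street"))
```

The E trip is cut back to its Government Center to North Station piece, while the B trip comes back unchanged because it never visits Heath Street, which is the gap the comment points out.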

So, looking at the "new" derived limits of this feature, we see:

Babcock (B) -> Copley (B,C,D,E) = B trips affected
Babcock (B) -> North Station (D, E) = No trips affected
Heath Street (E) -> North Station (D, E) = E trips affected

If we think of this feature as helping someone looking at a track diagram understand what's closed, this is definitely easier to follow than the original (more so if the redundant Babcock -> Copley is removed, since it's a subset of Babcock -> North Station, but let's not get obsessive here).

On the other hand, if we think of this as literal documentation of the equivalent "limits", someone trying to apply these limits manually to replicate the outage would end up missing the C and D line trips, because none of those trips hit both endpoints of any of these limits. That is what is being captured in the original implementation:

North Station (D, E) -> Government Center (B, C, D, E) = (D, E)
Government Center (B, C, D, E) -> Boylston (B, C, D, E) = (B, C, D, E)
Copley (B,C,D,E) -> Heath Street (E) = (E)
Government Center (B, C, D, E) -> Babcock (B) = (B)
Government Center (B, C, D, E) -> Kenmore (B, C, D) = (B, C, D)
North Station (D, E) -> Kenmore (B, C, D) = (D)
North Station (D, E) -> Heath Street (E) = (E)
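The comparison above can be checked mechanically. The Python sketch below uses abridged branch stop lists containing only the stops named in this thread (an assumption, not the real stop sequences) and reports which branches visit both endpoints of each limit. The names `BRANCHES`, `affected`, and `covered` are hypothetical.

```python
# Abridged branch stop lists, limited to stops named in this thread.
BRANCHES = {
    "B": ["Babcock", "Kenmore", "Copley", "Boylston", "Government Center"],
    "C": ["Kenmore", "Copley", "Boylston", "Government Center"],
    "D": ["Kenmore", "Copley", "Boylston", "Government Center", "North Station"],
    "E": ["Heath Street", "Copley", "Boylston", "Government Center", "North Station"],
}


def affected(limit):
    """Branches whose trips visit both endpoints of the limit."""
    a, b = limit
    return sorted(name for name, stops in BRANCHES.items() if a in stops and b in stops)


def covered(limits):
    """Every branch touched by at least one limit in the set."""
    return sorted({name for limit in limits for name in affected(limit)})


new_limits = [
    ("Babcock", "Copley"),
    ("Babcock", "North Station"),
    ("Heath Street", "North Station"),
]
original_limits = [
    ("North Station", "Government Center"),
    ("Government Center", "Boylston"),
    ("Copley", "Heath Street"),
    ("Government Center", "Babcock"),
    ("Government Center", "Kenmore"),
    ("North Station", "Kenmore"),
    ("North Station", "Heath Street"),
]

print(covered(new_limits))       # ['B', 'E']
print(covered(original_limits))  # ['B', 'C', 'D', 'E']
```

Under these assumptions, the new limit set never touches C or D trips, while the original set reaches all four branches.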

@Whoops
Contributor

Whoops commented Jun 23, 2025

No code concerns. For now, I'm withholding the ✅ pending confirmation from our stakeholders that they prefer this version, but otherwise I think this is good to go.

@fsaid90

fsaid90 commented Jun 23, 2025

Those are good points Walton - and I'd definitely want to hear Shanti's thoughts as well before I give my own thoughts (I don't want to bias her!), BUT I do also wonder:

Since we are deriving the actual GL branch, perhaps grouping the limits in the UI per branch (and indicating the actual branch) might make things neater and tidier in general (per Jon's original limits derivation implementation).

What do you both think? (Jon and Walton, while Shanti's out :) )

@Whoops
Contributor

Whoops commented Jul 7, 2025

One factor that occurred to me in the discussion this morning: this new version is much closer to how we talk about limits and put them in tickets. So if the purpose is just validating that the HASTUS export does what we expect it to do, this is no doubt superior.

@shantigonzales

My 2c, which echoes what Walton said to some extent: this is much more intuitive to me as a non-technical user, and will be helpful in streamlining validation. In Walton's example, I'm not entirely sure how those C + D line trips get captured, but that's more technical than program. From a non-code perspective, I like how this is approaching the problem.

@jzimbel-mbta jzimbel-mbta marked this pull request as draft July 8, 2025 14:42