Can we have a scoped representation for cleanups?

CIR currently attaches cleanup blocks to calls. For example, from https://godbolt.org/z/f4se8MPG7, the following code:

```cpp
struct S { ~S(); };
void f();
void g() {
    S s;
    f();
    f();
}
```

produces IR with a separate cleanup block for each call:

```
cir.try synthetic cleanup {
  cir.call exception @_Z1fv() : () -> () cleanup {
    cir.call @_ZN1SD1Ev(%0) : (!cir.ptr<!ty_S>) -> ()
    cir.yield
  }
  cir.yield
} catch [#cir.unwind {
  cir.resume
}]
cir.try synthetic cleanup {
  cir.call exception @_Z1fv() : () -> () cleanup {
    cir.call @_ZN1SD1Ev(%0) : (!cir.ptr<!ty_S>) -> ()
    cir.yield
  }
  cir.yield
} catch [#cir.unwind {
  cir.resume
}]
```

This leads to duplicate landing pads (#1052). From what I understand though, cleanups should always form a stack (and I believe Clang uses a stack to store them as well), and I'm wondering if we can take advantage of that and use scopes to represent cleanups instead. The example above would become something like (I'm omitting the `catch` clause here just for simplicity, but it would also be nice if we could omit no-op catch clauses in the actual IR as well):

```
cir.try synthetic cleanup {
  cir.call exception @_Z1fv()
  cir.call exception @_Z1fv()
  cir.yield
} cleanup {
  cir.call @_ZN1SD1Ev(%0) : (!cir.ptr<!ty_S>) -> ()
  cir.yield
}
```

It's more interesting when the cleanups actually stack. CIR currently fails on that (https://godbolt.org/z/Gz7rETEbY), but an example would be:

```cpp
struct S { ~S(); };
struct T { ~T(); };
void f();
void g() {
    S s;
    f();
    T t;
    f();
}
```

I'm envisioning it being represented using something like the following:

```
cir.try synthetic cleanup {
  cir.call exception @_Z1fv()
  cir.try synthetic cleanup {
    cir.call exception @_Z1fv()
    cir.yield
  } cleanup {
    cir.call @_ZN1TD1Ev(%1): (!cir.ptr<!ty_T>) -> ()
    cir.yield
  }
  cir.yield
} cleanup {
  cir.call @_ZN1SD1Ev(%0) : (!cir.ptr<!ty_S>) -> ()
  cir.yield
}
```

The idea is that each `cir.try` should correspond to a single call site (as in an entry in the call site table in the LSDA) and landing pad. ٓA `cir.yield` inside a cleanup cascades to an outer cleanup if there's any or else resumes unwinding. This would translate pretty well to the LLVM generated by Clang (https://godbolt.org/z/Tnd7r9vKo).

The interaction with catch blocks is another interesting design consideration. For example (this also fails with CIR today):

```cpp
struct S { ~S(); };
struct T { ~T(); };
void f();
void g() {
    try {
        S s;
        f();
        T t;
        f();
    } catch (int) {
        f();
    }
}
```

I'm picturing as (the catch syntax is just for simplicity, not something I'm actually suggesting):

```
cir.try cleanup {
  cir.call exception @_Z1fv()
  cir.try synthetic cleanup {
    cir.call exception @_Z1fv()
    cir.yield
  } cleanup {
    cir.call @_ZN1TD1Ev(%1): (!cir.ptr<!ty_T>) -> ()
    cir.yield
  }
  cir.yield
} cleanup {
  cir.call @_ZN1SD1Ev(%0) : (!cir.ptr<!ty_S>) -> ()
  cir.yield
} catch (int) {
  cir.call exception @_Z1fv()
  cir.yield
}
```

This is the same concept as before: each try corresponds to a single call site and landing pad, and cleanups cascade as before. The landing pad for the inner try in LLVM needs to also catch the `int`, but it delegates the actual catching work to the outer landing pad (see https://godbolt.org/z/Y1Gxr3McG). We could either explicit add a catch region to that try to indicate this or just have it be implicit based on nesting. Note that this generalizes across more nesting too, e.g. in https://godbolt.org/z/5xW9jxWor, the inner landing pad catches both int and bool, but it delegates to the outer landing pad for int.

Another interesting case to consider is the ordering of cleanups and catches. Consider:

```cpp
struct S { ~S(); };
void f();
void g() {
    S s1;
    try {
        S s2;
        f();
    } catch (int) {
        f();
    }
}
```

`s2` needs to be destroyed before the catch logic, whereas `s1` needs to be destroyed by the landing pad only if the catch isn't entered. https://godbolt.org/z/fnzTGPvr8 shows how Clang handles this (including the additional complication of also destroying `s1` if the call to `f` inside the catch throws). I can think of two ways to represent this. If we want to keep our property of each try corresponding to a single site and landing pad, we can have the ordering of cleanup and catch regions determine the order in which they run, i.e.

```
cir.try synthetic cleanup {
  cir.call exception @_Z1fv()
  cir.yield
} cleanup {
  cir.call @_ZN1SD1Ev(%s2): (!cir.ptr<!ty_T>) -> ()
  cir.yield
} catch (int) {
  cir.try synthetic cleanup {
    cir.call exception @_Z1fv()
    cir.yield
  } cleanup {
    cir.yield
  }
} cleanup {
  cir.call @_ZN1SD1Ev(%s1) : (!cir.ptr<!ty_S>) -> ()
  cir.yield
}
```

A `cir.yield` inside a cleanup would delegate to any subsequent cleanups on the outer try. Alternatively, we could abandon the one try = one call site and landing pad property, and instead have each try correspond to either a cleanup region or a catch region and rely on nesting to convey the correct sequence:

```
cir.try synthetic cleanup {
  cir.try {
    cir.try synthetic cleanup {
      cir.call exception @_Z1fv()
      cir.yield
    } cleanup {
      cir.call @_ZN1SD1Ev(%s2): (!cir.ptr<!ty_T>) -> ()
      cir.yield
    }
  } catch (int) {
    cir.try synthetic cleanup {
      cir.call exception @_Z1fv()
      cir.yield
    } cleanup {
      cir.yield
    }
  }
} cleanup {
  cir.call @_ZN1SD1Ev(%s1): (!cir.ptr<!ty_T>) -> ()
  cir.yield
}
```

I can see pros and cons for both. Either way though the general point is that cleanups become scoped instead of attached to individual calls, which should let us generate better code for them (including potentially outlining cleanup sequences so they can be shared by the normal and exceptional paths in the future for code size purposes). @bcardosolopes, what do you think are the pros and cons of this compared to the current representation?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Can we have a scoped representation for cleanups? #1123

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Can we have a scoped representation for cleanups? #1123

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions