Support for multiarm ancova #520

gowerc · 2025-08-18T09:55:46Z

Closes #145

This contains my initial code proposal for implementing multiarm ancova support (note that it is currently missing documentation and tests).

Some key points:

To preserve backwards compatibility I've left the original ancova function unchanged. Supporting multiple groups requires a new naming scheme for the parameters, which would break any existing code that looks for the current names like trt / lsm_ref.
That is, if users want to use the new multi-group ancova they will have to explicitly set fun=ancova_m_groups in the call to analyse()
For the new naming convention I've just used the level numbers, e.g. lsm_L0 = least squares means for the reference, lsm_L1 = least squares means for the first offset / coefficient. I've also used a more explicit trt_L1_L0 to mean the coefficient of the first offset from the reference.
As I think we agreed I'm currently only extracting the treatment effects against the reference e.g. L1 - L0 & L2 - L0. I'm not calculating / extracting all pairwise combinations.
For the naming convention I didn't want to use the user-provided group/level values, as that can lead to a ton of special edge cases with whitespace / special characters that are hard to predict and make value extraction very error prone. Instead I opted for just renaming the group variable to rbmiGroup and re-leveling it to be L0, L1, etc. The only complication here was that the formula provided can in theory have interaction terms, so I had to create a function to recursively traverse the AST, updating any references of group to rbmiGroup. I will need to put in extensive unit testing to make sure there are no edge cases here with things like I() functions.
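As a rough illustration of the renaming approach described in the last point, here is a minimal base R sketch. The helper names relabel_group, replace_symbol and rename_in_formula are hypothetical and are not the functions in this PR. The sketch assumes the formula contains no empty arguments (such as x[, 1]).

```r
# Hypothetical helper: relabel factor levels to L0, L1, ... in their
# existing order, so the first level (the reference) becomes L0.
relabel_group <- function(x) {
    x <- as.factor(x)
    levels(x) <- paste0("L", seq_len(nlevels(x)) - 1L)
    x
}

# Hypothetical helper: recursively walk a language object and replace
# every occurrence of the symbol `old` with the symbol `new`.
replace_symbol <- function(expr, old, new) {
    if (is.symbol(expr)) {
        if (identical(expr, as.symbol(old))) as.symbol(new) else expr
    } else if (is.call(expr)) {
        # Rebuild the call from its (recursively processed) components
        as.call(lapply(as.list(expr), replace_symbol, old = old, new = new))
    } else {
        expr # constants such as numbers or strings are left untouched
    }
}

# Hypothetical wrapper: apply the replacement to a formula while keeping
# its class and environment.
rename_in_formula <- function(frm, old, new) {
    stopifnot(inherits(frm, "formula"))
    bare <- frm
    attributes(bare) <- NULL # work on the plain call
    out <- replace_symbol(bare, old, new)
    attributes(out) <- attributes(frm)
    out
}

frm <- y ~ group * visit + I(group^2) + group_2
new_frm <- rename_in_formula(frm, "group", "rbmiGroup")
arms <- relabel_group(factor(
    c("Placebo", "Drug A", "Drug B"),
    levels = c("Placebo", "Drug A", "Drug B")
))
```

Because the replacement works on symbols rather than on the deparsed text, a variable such as group_2 is not affected, and terms wrapped in I() or interactions are handled by the same recursion.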

If there are no major objections here I'll start adding docs and tests.

danielinteractive · 2025-08-18T19:44:02Z

Hi @gowerc , cool, thanks for working on this!

Sorry if I had a look too late - my only initial high level comment would be that I don't really "like" that we would split out ANCOVA with multiple arms from the version with two arms. I would hope that we could extend the current naming scheme from ancova also to multiple arms. This would make maintenance as well as usage easier long term.

As a possible inspiration, please see https://github.com/johnsonandjohnson/junco/blob/main/R/ancova_rbmi.R where I implemented an ancova function with multiple arms (not focused on backwards compatibility for the naming scheme).

I think in the "worst" case one could just use a condition in the ancova function and have a different naming schedule for 2 arms vs. multiple arms.

What do you think?

gowerc · 2025-08-19T10:08:11Z

> Hi @gowerc , cool, thanks for working on this!
>
> Sorry if I had a look too late - my only initial high level comment would be that I don't really "like" that we would split out ANCOVA with multiple arms from the version with two arms. I would hope that we could extend the current naming scheme from ancova also to multiple arms. This would make maintenance as well as usage easier long term.
>
> As a possible inspiration, please see https://github.com/johnsonandjohnson/junco/blob/main/R/ancova_rbmi.R where I implemented an ancova function with multiple arms (not focused on backwards compatibility for the naming scheme).
>
> I think in the "worst" case one could just use a condition in the ancova function and have a different naming schedule for 2 arms vs. multiple arms.
>
> What do you think?

Interestingly, despite what their documentation says, I don't think they actually support more than 2 arms. In particular, near the top of the function they have checkmate::assert_factor(data[[group]], n.levels = 2L), which would error if there were more than 2 arms.

I am very conflicted here. I agree with everything you said and I am also not happy with splitting it out. That being said, I don't see an easy way of merging them without breaking backwards compatibility, as the naming scheme has to be different in order for it to make sense. We currently have:

lsm_alt_visit_1
lsm_ref_visit_1
trt_visit_1

In hindsight these are already bad names that confuse users. If we were to extend this whilst maintaining backwards compatibility we'd end up with something like:

lsm_alt_visit_1
lsm_alt2_visit_1
lsm_ref_visit_1
trt_visit_1
trt_alt2_visit_1

Which I would argue is even worse.
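To make the extended scheme concrete, the sketch below generates these names for an arbitrary number of arms at a single visit. The function legacy_style_names is hypothetical and exists only for illustration; it is not part of rbmi, and the numbering of the additional arms (alt2, alt3, ...) is an assumption extrapolated from the list above.

```r
# Hypothetical helper: build parameter names for `n_arms` arms at one visit
# by extending the existing two-arm names (ref / alt / trt).
legacy_style_names <- function(n_arms, visit) {
    stopifnot(n_arms >= 2)
    # Additional arms beyond the first alternative get a numeric suffix
    extra <- if (n_arms > 2) seq(2, n_arms - 1) else NULL
    alt <- c("alt", if (n_arms > 2) paste0("alt", extra))
    trt <- c("trt", if (n_arms > 2) paste0("trt_alt", extra))
    c(
        paste0("lsm_", c("ref", alt), "_", visit),
        paste0(trt, "_", visit)
    )
}

two_arm_names <- legacy_style_names(2, "visit_1")
three_arm_names <- legacy_style_names(3, "visit_1")
```

With two arms this reproduces the three current names, and with three arms it adds lsm_alt2_visit_1 and trt_alt2_visit_1.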

Alternatively we could just have some code that dispatches to a different ANCOVA function based on the number of levels in the group variable, but this makes the documentation / explanation clunky, e.g.

"If you have 2 groups then it will be formatted like this, but if you have more than 2 it would be formatted like this even though half the values are the same"

Which is why I settled on just a separate function.
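For reference, the dispatch alternative could look roughly like the sketch below, which picks the output names from the number of group levels. The function ancova_param_names is hypothetical, and the exact names returned for each case are assumptions based on the examples in this thread rather than the behaviour of any released function.

```r
# Hypothetical sketch: choose the naming scheme from the number of arms.
ancova_param_names <- function(group) {
    group <- as.factor(group)
    k <- nlevels(group)
    stopifnot(k >= 2)
    if (k == 2) {
        # Existing two-arm names, kept for backwards compatibility
        return(c("trt", "lsm_ref", "lsm_alt"))
    }
    # Level-based names: every non-reference level is compared against L0
    idx <- seq_len(k - 1)
    c(
        paste0("trt_L", idx, "_L0"),
        paste0("lsm_L", c(0, idx))
    )
}

names_2 <- ancova_param_names(c("control", "active"))
names_3 <- ancova_param_names(c("control", "low", "high"))
```

The two branches return differently shaped results for what is conceptually the same analysis, which is the documentation burden described above.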

One option to minimise maintenance, though, is that we could deprecate ancova with a note that it will be removed in the future and replaced with ancova_m_groups. Then in 6 months to a year's time we could fully remove it. We could also create an alias of ancova = ancova_m_groups after removing it so we don't have such an awkward name.

(Despite my tone I am not confident on the best path forward here as all options appear bad to me so please do challenge if you still disagree)

danielinteractive · 2025-08-20T09:21:54Z

> Interestingly, despite what their documentation says, I don't think they actually support more than 2 arms. In particular, near the top of the function they have checkmate::assert_factor(data[[group]], n.levels = 2L), which would error if there were more than 2 arms.

Hmm... ok weird. But I am pretty sure we wanted this to work with multiple arms, that is also what the other code suggests... I will need to check in September when I am back.

> I am very conflicted here. I agree with everything you said and I am also not happy with splitting it out. That being said, I don't see an easy way of merging them without breaking backwards compatibility, as the naming scheme has to be different in order for it to make sense. We currently have:
>
> lsm_alt_visit_1
> lsm_ref_visit_1
> trt_visit_1

Yeah I understand the need for backwards compatibility. But I wonder if an alternative solution here could be to have this old naming scheme used for 2 arms, and the new naming scheme used for more than 2 arms?

> In hindsight these are already bad names that confuse users. If we were to extend this whilst maintaining backwards compatibility we'd end up with something like:
>
> lsm_alt_visit_1
> lsm_alt2_visit_1
> lsm_ref_visit_1
> trt_visit_1
> trt_alt2_visit_1
>
> Which I would argue is even worse.

Personally I think that would also be acceptable. The user will not see these names too much anyway if they just use the rbmi functions.

> Alternatively we could just have some code that dispatches to a different ANCOVA function based on the number of levels in the group variable, but this makes the documentation / explanation clunky, e.g.

Yeah I would not have two functions necessarily but just two naming schemes. Or just go with the naming scheme just described which I think is fine.

> One option to minimise maintenance, though, is that we could deprecate ancova with a note that it will be removed in the future and replaced with ancova_m_groups. Then in 6 months to a year's time we could fully remove it. We could also create an alias of ancova = ancova_m_groups after removing it so we don't have such an awkward name.
>
> (Despite my tone I am not confident on the best path forward here as all options appear bad to me so please do challenge if you still disagree)

Personally I would just go with the naming scheme you suggested, I think that is reasonable. Just for the sake of a better naming scheme I would not go the route of two functions or function deprecation. Just my 2 cents 😄

tobiasmuetze · 2025-09-01T06:58:56Z

Hi @gowerc / @danielinteractive,
I finally got around to looking at this.

I'd prefer to have a single ancova() function.

As for the naming convention, I agree with Daniel that the following

lsm_alt_visit_1
lsm_alt2_visit_1
lsm_ref_visit_1
trt_visit_1
trt_alt2_visit_1

is acceptable.

I am wondering if long term, it would be better to allow the user to match the group levels to ref, alt, etc.

initial

79d7c2b

gowerc requested a review from danielinteractive August 18, 2025 09:55

improve formula find and replace

acc41d1

gowerc linked an issue Aug 18, 2025 that may be closed by this pull request

Ancova to support > 2 arms #145

Open

gowerc added 11 commits August 18, 2025 11:18

added tests for frm_find_and_replace

316c6df

added rest of frm_find_and_replace tests

5762a55

all hail lintr

c520ece

run roxygen

9bc47d4

added ancova unit tests

b4dae5f

fix examples for non-exported function

cf47a1f

updated news.md file

28b0a1e

added docs

4173749

fix broken test

6507320

updated docs / specs

d3dddb7

updated faq

cf9ce32

gowerc marked this pull request as ready for review August 18, 2025 11:43

gowerc and others added 3 commits August 18, 2025 12:44

update version number

d291fb3

roxygen

65266fe

Merge branch 'main' into 145-multi-ancova

ca06a82
