Skip to content

Feature eair: Extra combined statistic required to be able to present "n (EAIR per 100 SY)"#362

Draft
iaugusty wants to merge 31 commits into
devfrom
feature/hotfix_eair
Draft

Feature eair: Extra combined statistic required to be able to present "n (EAIR per 100 SY)"#362
iaugusty wants to merge 31 commits into
devfrom
feature/hotfix_eair

Conversation

@iaugusty

@iaugusty iaugusty commented May 29, 2026

Copy link
Copy Markdown
Collaborator

Pull Request

Fixes #361

Checks

  • (Have you updated the NEWS.md ?)
  • (Have you added proper tests for new functions/features ?)
  • (Have you added new functions to the pkgdown.yml ?)
  • (Have you run document() on new functions ?)

@iaugusty iaugusty marked this pull request as ready for review May 29, 2026 12:53
@iaugusty iaugusty requested review from jkovach2 and munoztd0 May 29, 2026 12:53
@iaugusty iaugusty linked an issue May 29, 2026 that may be closed by this pull request
Comment thread tests/testthat/test-patyrs-eair100.R
@munoztd0 munoztd0 changed the title Feature/hotfix eair Feature eair Jun 2, 2026
@munoztd0 munoztd0 changed the title Feature eair Feature eair: Extra combined statistic required to be able to present "n (EAIR per 100 SY)" Jun 2, 2026
@munoztd0 munoztd0 marked this pull request as draft June 2, 2026 15:08
@iaugusty iaugusty marked this pull request as ready for review June 9, 2026 13:35
@iaugusty iaugusty marked this pull request as draft June 11, 2026 06:39

@nikolas-burkoff nikolas-burkoff left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it's looking good. A few comments from me (note I have not gone into detail to check the maths is correct).

I think if both eair is needed by subject and by events having two separate calls is fine - though I would add that as a note in the roxygen comment - it feels unlikely the same output would need both

I think the row labels behaviour makes sense though I don't know if there's an expectation from others for how it should behave

Comment thread DESCRIPTION
lifecycle,
dplyr
dplyr,
ratesci (>= 1.0.0.9000)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is only used for tests now. If it's going to stay like this (and the custom implementation is going to be used) I might actually remove the dependency here and create test cases like below.

We can then have control over when we update our test cases rather than tests failing because something changed in ratesci (the argument against this is if we think it's likely there are bugs and ratesci and our implementation and ratesci fixes them, then we might miss them)

testthat::test_that("our implementation matches ratesci", {

   our_result <- ...

   # These values have been created by running:
   #  ratesci::package_function(...) using version xxxx of the ratesci package
   ratesci_result <- c("0.0045", "0.33544")

   expect_equal(our_result, ratesci_result)
})

Comment thread DESCRIPTION Outdated
insightsengineering/formatters@main,
insightsengineering/tern@main
insightsengineering/tern@main,
petelaud/ratesci

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of keeping our custom implementation v using ratesci - once ratesci is updated on CRAN ( I wouldn't use ratesci until it is), personally I'd prefer to use ratesci rather than maintaining an implementation ourselves as this simplifies thing.

However, adding a dependency on a stats package adds risks in terms of validation. I don't know junco/tern's policy on this so it may be that the control we gain from our own implementation actually outweights the benefits of adding the dependency in our case (especially as the custom implementation is already written)

Comment thread R/a_eair_j.R Outdated
#' Defaults to `100`, yielding a rate per 100 person-years.
#' For example, use `1` for a rate per person-year or `1000` for a rate per
#' 1000 person-years.
#' @param count_events (`logical`)\cr if `TRUE`, counts the total number of events

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about count_multiple_events_per_subj being TRUE or FALSE or count_events_per_subject being "first" (or "single") or "all"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think the default label should change dependent on the value of this, like tern - as it makes it very clear to users what's going on

Comment thread R/a_eair_j.R Outdated
#' with at least one occurrence of a specified adverse event divided by the total
#' at-risk exposure time across all subjects.
#'
#' At-risk exposure time is calculated for each subject

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs updating with the new count_events argument

Comment thread R/a_eair_j.R
#' @param fup_var (`string`)\cr follow-up variable name.
#' @param occ_var (`string`)\cr occurrence variable name.
#' @param occ_dy (`string`)\cr occurrence day variable name.
#' @param fup_var (`string`)\cr name of the variable containing the total follow-up duration

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These add a lot more detail which I think is great

@iaugusty iaugusty marked this pull request as ready for review June 19, 2026 11:37
@iaugusty iaugusty marked this pull request as draft June 19, 2026 11:52
@iaugusty

Copy link
Copy Markdown
Collaborator Author

@munoztd0 can you take a look into failed checks and pkgdown issues, these seem to be from files not touched in this PR?

@iaugusty

Copy link
Copy Markdown
Collaborator Author

@danielinteractive could you take a look into this PR as well, especially your expertise on derivations of confidence interval for difference in rates - own derivations vs dependency on other statistical package (here ratesci) would be appreciated

current approach is : perform own calculations, in tests confirm results with ratesci package

Comment thread R/a_eair_diff.R

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @iaugusty , thanks for reaching out, I had a look at this file in particular. In general I would say it is ok to implement the calculations ourselves and then compare with (hardcoded results from) another package.
However, I think we should be careful not to have more locations where we put differences of rates code - currently we have https://github.com/insightsengineering/tern/blob/main/R/prop_diff.R and also in our junco package here we have in principle https://github.com/johnsonandjohnson/junco/blob/main/R/estimate_proportion_diff.R. If we could use the tern functions as much as possible here that would be good - if the interface from tern is not convenient we could also propose a change to that there. If we need something faster then we could also add it here in the junco file, but then if possible follow similar conventions for the function names, such that they can also be reused again in other places later.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielinteractive
incidence rate code is in https://github.com/insightsengineering/tern/blob/v0.9.10/R/incidence_rate.R, rather than proportions.

The design of incidence rate functions, esp how input datasets are utilized, is quite different between tern and junco. In tern, input dataset requires event and non events in the same input dataset, we have only events in input, non events information is from adsl/alt_counts_df dataset. Our design is easier to generate incidence rates per aedecod, and input dataset requirements is very similar to ADAM ADAE dataset.
As this difference in design exists, it seemed harder to integrate our function within tern.

wrt confidence intervals, in tern only confidence intervals for the incidence rates itself are available, not for difference in rates. This would be something we can propose in tern.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah those are incidence rates, ok, I did not see that quickly.

Maybe then for now we could put them in a similar incidence_rates_diff.R file in junco and propose an inclusion upstream in tern later. But then we will need more documentation of the calculation functions, including literature references where applicable.

Comment thread R/a_eair_diff.R
Comment on lines +9 to +11
waldcc = h_s_eair_diff_wald(x1, n1, x2, n2, conf_level, cc = TRUE),
wald = h_s_eair_diff_wald(x1, n1, x2, n2, conf_level, cc = FALSE),
mn = h_s_eair_diff_mn(x1, n1, x2, n2, conf_level, skew = FALSE),

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be relatively optimistic that we have these 3 in tern already implemented - did you have a look already?

Comment thread R/a_eair_diff.R
waldcc = h_s_eair_diff_wald(x1, n1, x2, n2, conf_level, cc = TRUE),
wald = h_s_eair_diff_wald(x1, n1, x2, n2, conf_level, cc = FALSE),
mn = h_s_eair_diff_mn(x1, n1, x2, n2, conf_level, skew = FALSE),
scas = h_s_eair_diff_mn(x1, n1, x2, n2, conf_level, skew = TRUE)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This skewness correction could be interesting and might not be in tern yet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request]: a_eair100_j add extra statistics

4 participants