Skip to content

Support S7 extension of S4 classes and vice-versa#659

Open
lawremi wants to merge 77 commits into
mainfrom
issue-456-s4-extends-s7
Open

Support S7 extension of S4 classes and vice-versa#659
lawremi wants to merge 77 commits into
mainfrom
issue-456-s4-extends-s7

Conversation

@lawremi

@lawremi lawremi commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Fixes #456.

First iteration on enabling an S7 class to extend an S4 class and vice-versa.

The concrete, fail-fast goal is to enable SummarizedExperiment, a complex, central package in Bioconductor, to be rewritten in S7, based on S4 classes defined lower in the stack, like those from S4Vectors and IRanges, while not breaking compatibility with dependent packages, like SingleCellExperiment, via S4 shims extending the S7 classes. SingleCellExperiment is complex and deeply dependent on SummarizedExperiment, so if it continues working, it is a very good indicator.

What has been achieved outside of these improvements to S7:

  • Fixed a bug in the methods package related to upcasting S4 objects derived old classes. Should be part of the next patch release, but it will cause release failures in the checks.
  • Modified S4Vectors devel branch to use a class1() helper to be explicit when it wants the scalar name of the object's class and to ensure that it is robust to S3-style class vectors. Also improved the setValidity2() helper so that validity functions are enclosed in the S4Vectors namespace, enabling S4 class objects to remain identical across lazy loading. This is important since S4 class objects end up in the S7 class object, which is carried by S7 objects, and we want them to pass identity checks across packages.
  • Rewrote SummarizedExperiment (branch) to use S7 classes, S7 generics and S7 syntax for defining methods on upstream S4 generics. Also defined S4 classes with the original names that derive from their corresponding S7 classes as a compatibility layer. It passes its tests, which were ported to S7 while preserving the original logic.
  • Ensured SingleCellExperiment passes tests without any modification.
  • The path to Bioconductor adoption of S7 is now wide open.

For S7 classes extending S4 classes, the current approach is to instantiate S7 objects, meaning that the S4 bit is not set. The justification is that there are S4 object invariants that an S7 object cannot satisfy, and vice versa. The big one being the return value of class(). Instead we have an S7 shim that does a reasonable job of mimicking an S4 object, including slot access, validity checking, and initialization, but not class() compatibility. See the evolving man page insert for more details.

When an S7 class is made to inherit from an S4 class, S4_register() is automatically called on the S7 class. This branch extends S4_register() to declare the S7 properties and S7_class attribute as S4 slots, which is helpful for making the S4 class more self-describing but critical in case the S7 class needs to be later extended by S4.

To extend an S7 class using setClass(), the S7 class first needs to be registered via S4_register() and S4_contains() should be called to obtain a suitable value for the contains= argument. S4_contains() is an abstraction that currently just requires the S7 properties to be static, ie not using getters and setters, since S4 code using slot() would bypass those. Any instances of an S7-derived S4 class are proper S4 objects, with the bit set and a scalar class vector.

Besides adding and changing hundreds of lines of logic around S4 classes, significant changes to support this in S7 include:

  • Made S7 property storage compatible with S4 NULL slot sentinels. Until now, setting an S7 property to NULL removed the attribute, and S4 does not like it when its slots disappear from the attributes.
  • The inheritance checks in prop.c need to be robust to scalar class vectors, using @.S3Class instead.
  • Added S4 method registration for internal generics when S7 method signatures include classes with S4 ancestry. S4 methods on internal generics take precedence over S3 methods, so when S7 defines a method on an internal generic, it should define an S4 method if any of the classes in the signature are S4-derived classes, since otherwise it might be overridden by an S4 method defined for a parent class. S7 must also support multi-dispatch methods on internal generics where S4 supports it. The dimnames<-() generic is an example.
  • @<-.S7_object() needs to support slots defined on S4 children that are not S7 properties, because S4 classes that extend S7_object will dispatch to the method, even when the S4 bit is set.

Resolving the two incompatible class() return values, scalar vs. non-scalar, is a larger issue. My proposal is that we export methods:::.class1() as methods::class1() and encourage all S4 code to use class1() to make the expectation of a scalar return value explicit. Practically, we could update the core of Bioconductor to adhere to that policy and thus enable new Bioconductor packages to extend the infrastructure with S7.

@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch 6 times, most recently from 87b33e2 to 5e5d36d Compare June 3, 2026 00:26
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 5e5d36d to 3650492 Compare June 7, 2026 20:14
@lawremi lawremi changed the title Support S7 extension of S4 classes Support S7 extension of S4 classes and vice-versa Jun 7, 2026
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 3650492 to eb1092b Compare June 7, 2026 21:40
@lawremi lawremi requested review from hadley and t-kalinowski June 7, 2026 21:42

@hadley hadley left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you know infinitely more about S4 than me 😄 so I focussed on docs and style.

Also need an update to NEWS.md?

Comment thread man/rmd/S4-compatibility.Rmd Outdated
Comment thread man/rmd/S4-compatibility.Rmd Outdated
Comment thread man/rmd/S4-compatibility.Rmd Outdated
old-class object, not as an S4-bit object, so `isS4(child)` is `FALSE`.
This avoids advertising full S4 object invariants that the object does not
satisfy.
* The S3 class vector is not scalar. S4 code that needs one primary class name

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How common is this use of scalar class()?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty common, unfortunately. I would guess most uses are in show() methods.

Comment thread man/rmd/S4-compatibility.Rmd Outdated
Comment thread man/rmd/S4-registration.Rmd Outdated
Comment thread R/method-register.R Outdated
Comment thread R/method-register.R Outdated
Comment thread R/S4.R Outdated
Comment thread tests/testthat/test-class.R Outdated
Comment thread tests/testthat/test-convert.R Outdated
})

it("can convert_up() an S4-derived S7 object to an S4 object ", {
on.exit(S4_remove_classes(c("ParentS4", "ChildS7")))

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth considering extening local_S4_class() to handle more cases so we could skip the explicit cleanup steps (which are easy to forget)?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local_S4_class() is convenient and robust, but it obfuscates the code, in my opinion. I think readers expect to see setClass(), setOldClass(), and setClassUnion(). I'd prefer something like S4_remove_after(setClass(...)).

How about we defer this to a future change after merging this branch and discussing? Since the changes are so extensive, rebasing is becoming a chore. It might be better to iterate over multiple MRs.

@t-kalinowski t-kalinowski left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lawremi I took a cursory look and didn’t see any obvious issues. Could you please tag me again once you’ve had a chance to resolve Hadley’s comments? I’ll take a more thorough look then. That way I can avoid duplicating effort or leaving comments that might become stale.

lawremi added 17 commits June 19, 2026 15:59
…cing to an S3 class, and use methods::extends() to capture the entire class vector, not just the first class
…@() behavior, which needs to change in base R.
…S4_register() can now communicate any S7 class structure through the S4class= argument.
…t initialize() except uses the S7 property setting path.
lawremi and others added 24 commits June 19, 2026 16:25
…ults. We evaluate these if they are language objects, so S7 classes extending or being extended by S4 classes need to ensure that evaluation can happen at build time. Also mark it VIRTUAL because it should never be constructed directly.
… an S4 instance of that class (old classes are virtual normally but not in this case since we pass an S4Class prototype)
…ic is internal and an element of the signature has an S4 ancestor. This is needed because internal generics will favor S4 methods on S4 objects, so there is potential for inheriting overrides.
…ary S4 classes instead of old classes, because an old class implies an S3 instance of the object can exist, ie, there can be an S3Part().
… it to get stripped during S4 upcasting, breaking the S7 object
…in principle, there should only be one ::S4Slots shim for each S4 derivative of an S7 class
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
Co-authored-by: Hadley Wickham <h.wickham@gmail.com>
…S4 to inherit from S7; there's just two types of classes: the stub old class and the reified old class. The former for when only dispatch is needed, the latter for when inheritance (in either direction) is needed.
@lawremi lawremi force-pushed the issue-456-s4-extends-s7 branch from 7a09607 to 238130f Compare June 20, 2026 00:37
lawremi added 5 commits June 19, 2026 21:10
…ontains=TRUE used to do), which ensures that classes are always able to be used as S4 superclasses. Introduce a separate helper S4_contains() that just ensures the S7 class is a suitable parent of an S4 class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mixing S4 and S7 inheritance hierarchies

3 participants