feat(implicit): add theoretical bridge between VCMs and neural models#89

Open
cnellington wants to merge 1 commit into main from theoretical-bridge

Conversation

@cnellington
Collaborator

Summary

Adds a brief theoretical bridge in the Foundations of Implicit Adaptation section, showing that differentiable models with context inputs (e.g. neural networks) recover the varying-coefficient regression solution. The intermediate regression parameters $\beta_i$ can be obtained post-hoc by differentiating the model with respect to $c_i$, which amounts to a first-order Taylor approximation of the kind often used in post-hoc interpretation.

Renames the relevant subheading to "Theoretical Bridge: Architectural Conditioning via Context Inputs".

Split out from #88 to keep the amortized-estimation discussion separate from the theoretical bridge content.

Test plan

  • Manubot build renders the new math block without errors
  • Citation @doi:10.48550/arXiv.1602.04938 resolves

Adds a varying-coefficient regression view of differentiable models with context inputs, showing the intermediate regression parameters can be recovered by differentiating with respect to context — a first-order Taylor approximation.
Copilot AI review requested due to automatic review settings May 10, 2026 21:40
Contributor

Copilot AI left a comment

Pull request overview

Adds a short theoretical link in the “Foundations of Implicit Adaptation” section arguing that context-as-input differentiable models (e.g., neural nets) can be related to varying-coefficient regression via local linearization / post-hoc derivatives.

Changes:

  • Renames the subsection to “Theoretical Bridge: Architectural Conditioning via Context Inputs”.
  • Adds a math-based bridge describing how intermediate coefficients could be recovered from a differentiable model via differentiation, with a citation to post-hoc interpretation work.

Comment thread content/07.implicit.md
Comment on lines +20 to +26
The connection is explicit for differentiable models $g$. Consider the model $P(Y | X, C)$ as a varying-coefficient regression model. An explicit estimator for regression parameters will solve for the regression parameter map $\beta_i = f(c_i)$ through
$$\hat{f} = \text{argmin}_f \sum_i (y_i - x_i \cdot f(c_i))^2,$$
while a differentiable model (e.g. a neural network) will solve
$$\hat{\Phi} = \text{argmin}_\Phi \sum_i (y_i - g([x_i, c_i]; \Phi))^2.$$
Under mild assumptions, these yield identical solutions for the intermediate regression parameters $\beta$. The varying-coefficient model solves for them explicitly, whereas the differentiable model yields them post-hoc by differentiating with respect to $c_i$:
$$\beta_i = \frac{\partial}{\partial c_i} g([x_i, c_i]; \hat{\Phi}).$$
This derivative is the coefficient of the model's first-order Taylor expansion, a locally linear approximation [@doi:10.48550/arXiv.1602.04938] often used in post-hoc interpretation methods.
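
The post-hoc differentiation step in the added text can be tried on a toy model. The following is a minimal sketch, not part of the PR: it assumes a hypothetical model of exact varying-coefficient form, $g([x, c]) = x \cdot f(c)$ with a linear map $f(c) = Wc + b$, and the values of `W`, `b`, `x_i`, and `c_i` are made up for illustration. Central finite differences stand in for automatic differentiation so that the sketch needs only the Python standard library.

```python
# Hypothetical toy model of varying-coefficient form: g([x, c]) = x . f(c),
# with f(c) = W c + b. W and b are illustrative values, not from the PR.
W = [[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]]  # 3 features, 2 context dimensions
b = [0.1, 0.2, 0.3]


def f(c):
    """Coefficient map: returns beta = W c + b for a context vector c."""
    return [sum(w * cj for w, cj in zip(row, c)) + bk for row, bk in zip(W, b)]


def g(x, c):
    """Context-input model evaluated on the pair (x, c)."""
    return sum(xk * bk for xk, bk in zip(x, f(c)))


def numerical_grad(fn, v, eps=1e-4):
    """Central finite-difference gradient of a scalar function fn at the list v."""
    grad = []
    for k in range(len(v)):
        up = v[:k] + [v[k] + eps] + v[k + 1:]
        down = v[:k] + [v[k] - eps] + v[k + 1:]
        grad.append((fn(up) - fn(down)) / (2 * eps))
    return grad


x_i = [1.0, 2.0, 3.0]
c_i = [0.5, -0.5]

beta_i = f(c_i)                                     # coefficients from the explicit map
grad_x = numerical_grad(lambda x: g(x, c_i), x_i)   # derivative with respect to x_i
grad_c = numerical_grad(lambda c: g(x_i, c), c_i)   # derivative with respect to c_i
```

In this toy model, `grad_x` matches `beta_i = f(c_i)` up to rounding, while `grad_c` equals $W^\top x_i$, the sensitivity of the prediction to the context. Comparing the two vectors is a quick way to check which derivative plays the role of the regression coefficients for a given model form.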

2 participants