feat(edm): implement generic prediction framework and Simplex Projection #1561
aman-raj-srivastva wants to merge 2 commits into
80c0d37 to 3b443a3
|
Reverted the S-Map commit temporarily for further review before re-pushing.
|
@stevemunch @nathanvaughan-NOAA while implementing the S-Map predictor, I came across a few design questions where I'd appreciate feedback before I push the implementation.
I have implemented the standard S-Map formulation described in Section 2.1 of Esguerra & Munch (2024), i.e., the local linear model that serves as the baseline/M-step component of the full HMS-map.
For each local neighborhood, the implementation constructs a design matrix:
$$X_i = [1, x_t, x_{t-\tau}, \ldots, x_{t-(E-1)\tau}]$$
and solves the weighted least-squares system:
$$(X^T W X)\beta = X^T W y$$
using a CppAD-safe Gaussian elimination routine.
Before finalizing the implementation, I wanted to get your thoughts on a few design choices.
1. Weighting Kernel / Distance Metric
Classic S-Map implementations (e.g., Sugihara 1994 and rEDM) typically use an exponential kernel based on Euclidean distance:
$$w_i = \exp\left(-\theta \frac{d_i}{\bar d}\right)$$
However, Section 2.1 of Esguerra & Munch (2024) defines the weighting function as:
$$w_i = \exp\left(-\theta^2 \left(\frac{d_i}{D}\right)^2\right)$$
which is effectively a Gaussian kernel based on squared Euclidean distance.
For the generic FIMS S-Map predictor, would you prefer:
- the formulation used in the HMS-map paper, or
- compatibility with the more traditional rEDM-style weighting?
Using squared distances has the additional benefit of avoiding a sqrt() inside the AD tape.
2. Target Alignment in Delay Embedding
The paper formulates the delay embedding map as a one-step-ahead forecast:
$$x_{t+1} = f(x_t, \ldots, x_{t-E+1})$$
While wiring the implementation, I noticed that the current DelayEmbeddingMatrix assigns target_values[row] and embedded_values[row][0] to the same observation, which appears to correspond to a 0-step-ahead target.
Am I interpreting this correctly, or is the intended target shift handled elsewhere in the workflow?
If not, would aligning the embedding construction with the standard one-step-ahead formulation be the preferred approach?
3. Scope of This PR
My understanding is that this PR should focus on the standard S-Map predictor only:
- local weighted linear regression
- prediction interface integration
- testing and validation infrastructure
The EM iteration and state-estimation machinery required for the full HMS-map would then be a separate follow-up effort.
Just wanted to confirm that this matches the intended project scope before I proceed further.
Thanks! I currently have the S-Map implementation and associated tests working locally, and I wanted to verify these design details before pushing the next commit.
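The weighted least-squares step described in this comment can be illustrated with a minimal sketch. It is not the PR's code: the names `Vec`, `Mat`, `solve`, and `wls_fit` are hypothetical, the weights are assumed to be precomputed, and it uses plain `double` with data-dependent pivoting, so it is not the CppAD-safe routine mentioned above.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Solve A x = b by Gaussian elimination with partial pivoting.
// A and b are taken by value because they are modified in place.
Vec solve(Mat A, Vec b) {
  const std::size_t n = b.size();
  for (std::size_t k = 0; k < n; ++k) {
    // Pick the row with the largest pivot candidate in column k.
    std::size_t p = k;
    for (std::size_t r = k + 1; r < n; ++r)
      if (std::fabs(A[r][k]) > std::fabs(A[p][k])) p = r;
    if (std::fabs(A[p][k]) < 1e-12) throw std::runtime_error("singular system");
    std::swap(A[k], A[p]);
    std::swap(b[k], b[p]);
    // Eliminate column k from the rows below the pivot.
    for (std::size_t r = k + 1; r < n; ++r) {
      const double f = A[r][k] / A[k][k];
      for (std::size_t c = k; c < n; ++c) A[r][c] -= f * A[k][c];
      b[r] -= f * b[k];
    }
  }
  // Back substitution on the upper-triangular system.
  Vec x(n);
  for (std::size_t i = n; i-- > 0;) {
    double s = b[i];
    for (std::size_t c = i + 1; c < n; ++c) s -= A[i][c] * x[c];
    x[i] = s / A[i][i];
  }
  return x;
}

// Form and solve the normal equations (X^T W X) beta = X^T W y.
// Each row of X is [1, x_t, x_{t-tau}, ...]; w holds the diagonal of W.
// Assumes X is non-empty and X, y, w have matching lengths.
Vec wls_fit(const Mat& X, const Vec& y, const Vec& w) {
  const std::size_t n = X.size(), p = X[0].size();
  Mat A(p, Vec(p, 0.0));
  Vec b(p, 0.0);
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < p; ++j) {
      b[j] += w[i] * X[i][j] * y[i];
      for (std::size_t k = 0; k < p; ++k) A[j][k] += w[i] * X[i][j] * X[i][k];
    }
  return solve(A, b);
}
```

Calling `wls_fit(X, y, w)` returns the local coefficients `beta`, with the intercept first. An AD-aware version would presumably need to replace the value-dependent pivot comparison with something the tape can record, which is outside the scope of this sketch.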
|
Aman, in my experience, using distance vs. squared distance in the weighting kernel doesn't make a huge difference in performance except in rare cases. I'd use whichever is easier/faster.
In the state-estimation step of the HMS algorithm, we are updating all of the states based on the current estimates of the S-map coefficients and the observations. There's no time lag in this step. In the S-map step, the model is fit one step ahead as usual.
Regarding HMS-map being a separate project, I will let others chime in. For my two cents, that's the part that requires the most care in implementation. The rest is pretty straightforward. Critically, 'standard EDM' does not handle observation error explicitly. The statistically minded folks at NMFS generally care about observation uncertainty, so it would be most valuable if you completed the HMS-map code or some other algorithm for handling observation uncertainty.
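The two weighting kernels discussed in this thread can be compared with a small sketch. The function names are hypothetical, the scale `D` is treated as a caller-supplied value because its definition is not given here, and the mean distance is assumed to be nonzero.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Exponential kernel on Euclidean distance: w_i = exp(-theta * d_i / d_bar),
// where d_bar is the mean of the supplied distances (assumed > 0).
std::vector<double> exponential_kernel(const std::vector<double>& d, double theta) {
  const double d_bar =
      std::accumulate(d.begin(), d.end(), 0.0) / static_cast<double>(d.size());
  std::vector<double> w(d.size());
  for (std::size_t i = 0; i < d.size(); ++i)
    w[i] = std::exp(-theta * d[i] / d_bar);
  return w;
}

// Gaussian-type kernel: w_i = exp(-theta^2 * (d_i / D)^2).
// Takes squared distances d_i^2 directly, so no sqrt() is evaluated.
std::vector<double> gaussian_kernel(const std::vector<double>& d_sq, double theta,
                                    double D) {
  std::vector<double> w(d_sq.size());
  for (std::size_t i = 0; i < d_sq.size(); ++i)
    w[i] = std::exp(-theta * theta * d_sq[i] / (D * D));
  return w;
}
```

Both functions give weight 1 at zero distance and decay monotonically with distance. The Gaussian form falls off more slowly near the focal point and faster in the tail, which is one way to see why the choice rarely changes the fit much.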
|
|
Hey @aman-raj-srivastva, great job on your progress so far. For these questions I would suggest
|
Summary
This PR begins the implementation of prediction algorithms for Empirical Dynamic Modeling (EDM) in FIMS as part of the GSoC 2026 project: "Add Empirical Dynamic Modeling to FIMS."
This work builds on the delay embedding infrastructure introduced in PR #1528 and addresses the initial components of the prediction framework described in Issue #1487.
main-edm branch.
Note: This PR will be developed incrementally as additional prediction algorithms and validation components are implemented.
What this PR adds
Generic EDM Prediction Framework
Implemented a reusable prediction interface for EDM algorithms.
Features include:
Simplex Projection
Implemented the initial Simplex Projection prediction algorithm.
Features include:
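For orientation only, Simplex Projection as it is commonly described can be sketched as follows; this is not the PR's implementation. The sketch assumes a one-step-ahead target, E + 1 nearest neighbors, and weights exp(-d_i / d_min). The names `Library`, `build_library`, and `simplex_predict` are hypothetical, E is assumed to be at least 1, and the library is assumed to be non-empty.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Library {
  std::vector<std::vector<double>> points;  // rows: [x_t, x_{t-tau}, ...]
  std::vector<double> targets;              // x_{t+1} for each row
};

// Delay-embed a series so that each row is paired with the next observation.
Library build_library(const std::vector<double>& x, std::size_t E, std::size_t tau) {
  Library lib;
  const std::size_t first = (E - 1) * tau;  // earliest t with a full history
  for (std::size_t t = first; t + 1 < x.size(); ++t) {
    std::vector<double> row(E);
    for (std::size_t j = 0; j < E; ++j) row[j] = x[t - j * tau];
    lib.points.push_back(row);
    lib.targets.push_back(x[t + 1]);  // one-step-ahead target
  }
  return lib;
}

// Predict the target for `query` from its E + 1 nearest library neighbors.
double simplex_predict(const Library& lib, const std::vector<double>& query) {
  const std::size_t E = query.size();
  std::vector<std::pair<double, std::size_t>> dist;  // (distance, row index)
  for (std::size_t i = 0; i < lib.points.size(); ++i) {
    double s = 0.0;
    for (std::size_t j = 0; j < E; ++j) {
      const double diff = lib.points[i][j] - query[j];
      s += diff * diff;
    }
    dist.emplace_back(std::sqrt(s), i);
  }
  const std::size_t k = std::min(E + 1, dist.size());
  std::partial_sort(dist.begin(), dist.begin() + k, dist.end());
  const double d_min = dist[0].first;
  double num = 0.0, den = 0.0;
  for (std::size_t n = 0; n < k; ++n) {
    // When d_min is zero, exact matches get weight 1 and all others 0.
    const double w = d_min > 0.0 ? std::exp(-dist[n].first / d_min)
                                 : (dist[n].first == 0.0 ? 1.0 : 0.0);
    num += w * lib.targets[dist[n].second];
    den += w;
  }
  return num / den;
}
```

With the series 1, 2, ..., 6 and E = 2, tau = 1, the first library row is [2, 1] with target 3, and a query that exactly matches a library row returns that row's target.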
Planned Follow-Up Work
The following components will be added in subsequent commits to this PR:
Testing
Current testing includes:
Additional tests will be added as the remaining prediction algorithms are implemented.
Checklist
Related Issues
Instructions for code reviewer
👋Hello reviewer👋, thank you for taking the time to review this PR!
nit: (for nitpicking) as the comment type. For example, nit: I prefer using a data.frame() instead of a matrix because ...
This PR is now ready to be merged.
Checklist