feat(edm): implement generic prediction framework and Simplex Projection #1561

Open
aman-raj-srivastva wants to merge 2 commits into NOAA-FIMS:main-edm from aman-raj-srivastva:feature/edm-prediction-functors

@aman-raj-srivastva aman-raj-srivastva commented Jun 23, 2026

Contributor

Summary

This PR begins the implementation of prediction algorithms for Empirical Dynamic Modeling (EDM) in FIMS as part of the GSoC 2026 project "Add Empirical Dynamic Modeling to FIMS."

This work builds on the delay embedding infrastructure introduced in PR #1528 and addresses the initial components of the prediction framework described in Issue #1487.

  • Target branch: the newly created main-edm branch.

Note: This PR will be developed incrementally as additional prediction algorithms and validation components are implemented.

What this PR adds

Generic EDM Prediction Framework

Implemented a reusable prediction interface for EDM algorithms.

Features include:

  • common prediction inputs and outputs
  • reusable prediction result structures
  • extensible architecture for future EDM prediction methods
  • support for algorithm-specific prediction implementations

Simplex Projection

Implemented the initial Simplex Projection prediction algorithm.

Features include:

  • nearest-neighbor based prediction workflow
  • distance-weighted forecasting
  • configurable prediction horizon support
  • integration with existing delay embedding structures
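The nearest-neighbor, distance-weighted workflow listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `euclidean_distance` and `simplex_predict` are hypothetical names, the library is passed as plain vectors rather than the FIMS delay embedding structures, and `targets` is assumed to already be shifted by the chosen prediction horizon. It uses the textbook choice of E + 1 neighbors and weights of exp(-d_i / d_min), and it simplifies the exact-match case by returning the matching target directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not the FIMS API): Euclidean distance between two
// embedded points of equal dimension.
inline double euclidean_distance(const std::vector<double>& a,
                                 const std::vector<double>& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double diff = a[i] - b[i];
    sum += diff * diff;
  }
  return std::sqrt(sum);
}

// Simplex Projection sketch: predict the target of `query` from its E + 1
// nearest library points, each weighted by exp(-d_i / d_min).
// `targets[i]` is assumed to be the value observed `horizon` steps after
// `library[i]`; the shift itself is done by the caller.
inline double simplex_predict(const std::vector<std::vector<double>>& library,
                              const std::vector<double>& targets,
                              const std::vector<double>& query) {
  assert(!library.empty() && library.size() == targets.size());
  const std::size_t k = std::min(library.size(), query.size() + 1);

  std::vector<double> dist(library.size());
  for (std::size_t i = 0; i < library.size(); ++i) {
    dist[i] = euclidean_distance(library[i], query);
  }

  // Indices of the k nearest neighbors, closest first.
  std::vector<std::size_t> order(library.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::partial_sort(order.begin(),
                    order.begin() + static_cast<std::ptrdiff_t>(k),
                    order.end(), [&dist](std::size_t a, std::size_t b) {
                      return dist[a] < dist[b];
                    });

  const double d_min = dist[order[0]];
  if (d_min == 0.0) {
    // Simplification: an exact match returns its own target.
    return targets[order[0]];
  }

  double weighted_sum = 0.0;
  double weight_total = 0.0;
  for (std::size_t j = 0; j < k; ++j) {
    const double w = std::exp(-dist[order[j]] / d_min);
    weighted_sum += w * targets[order[j]];
    weight_total += w;
  }
  return weighted_sum / weight_total;
}
```

With a one-dimensional library `{0, 1, 2, 10}` whose targets equal the library values, a query at `0.5` is equidistant from its two nearest neighbors, so the two weights are equal and the prediction is their mean.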

Planned Follow-Up Work

The following components will be added in subsequent commits to this PR:

  • Distance weighting infrastructure
  • S-Map prediction
  • GP-EDM prediction
  • Validation against rEDM reference implementation
  • Additional GoogleTest coverage
  • Additional testthat coverage
  • Documentation updates

Testing

Current testing includes:

  • GoogleTest coverage for prediction framework behavior
  • Simplex Projection unit tests

Additional tests will be added as the remaining prediction algorithms are implemented.

Checklist

  • Generic EDM prediction framework
  • Initial Simplex Projection implementation
  • Distance weighting infrastructure
  • S-Map prediction
  • GP-EDM prediction
  • rEDM validation
  • Additional unit tests
  • Documentation updates

Related Issues


Instructions for code reviewer

👋Hello reviewer👋, thank you for taking the time to review this PR!

  • Please use this checklist during your review, checking off items that you have verified are complete but feel free to skip over items that are not relevant!
  • See the GitHub documentation for how to comment on a PR to indicate where you have questions or changes are needed before approving the PR.
  • Please use standard conventional messages for both commit messages and comments.
  • PR reviews are a great way to learn so feel free to share your tips and tricks. However, when suggesting changes to the PR that are optional please include nit: (for nitpicking) as the comment type. For example, nit: I prefer using a data.frame() instead of a matrix because ...
  • Engage with the developer. Make it clear when the PR is approved by selecting the approved status, and potentially commenting on the PR with something like This PR is now ready to be merged.

Checklist

  • The code is well-designed
  • The code is designed well for both users and developers
  • Code coverage remains high
  • Comments are clear, useful, and explain why instead of what
  • Code is appropriately documented (doxygen and roxygen)

@aman-raj-srivastva aman-raj-srivastva force-pushed the feature/edm-prediction-functors branch from 80c0d37 to 3b443a3 Compare June 24, 2026 18:02
@aman-raj-srivastva

Contributor Author

Reverted the S-Map commit temporarily for further review before re-pushing

@aman-raj-srivastva

Contributor Author

@stevemunch @nathanvaughan-NOAA while implementing the S-Map predictor, I came across a few design questions where I'd appreciate feedback before I push the implementation.

I have currently implemented the standard S-Map formulation described in Section 2.1 of Esguerra & Munch (2024), i.e., the local linear model that serves as the baseline/M-step component of the full HMS-map.

For each local neighborhood, the implementation constructs a design matrix:

$$X_i = [1, x_t, x_{t-\tau}, \ldots, x_{t-(E-1)\tau}]$$

and solves the weighted least-squares system:

$$(X^T W X)\beta = X^T W y$$

using a CppAD-safe Gaussian elimination routine.
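For readers unfamiliar with the normal-equation form, the fit can be sketched as below. This is an illustration under stated assumptions, not the routine in this PR: `Matrix`, `solve_spd`, and `smap_coefficients` are hypothetical names, the scalar type is plain `double` rather than an AD type, and the elimination skips pivoting on the assumption that $X^T W X$ is symmetric positive definite (full-rank design, positive weights). Skipping pivoting keeps the control flow independent of the data, which is one way such a routine could be made tape-friendly; how the PR achieves CppAD safety is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solve A * beta = b by Gaussian elimination without pivoting.
// Assumption: A is symmetric positive definite, so every pivot is positive.
inline std::vector<double> solve_spd(Matrix a, std::vector<double> b) {
  const std::size_t n = b.size();
  assert(a.size() == n);
  // Forward elimination to upper-triangular form.
  for (std::size_t col = 0; col < n; ++col) {
    for (std::size_t row = col + 1; row < n; ++row) {
      const double factor = a[row][col] / a[col][col];
      for (std::size_t j = col; j < n; ++j) {
        a[row][j] -= factor * a[col][j];
      }
      b[row] -= factor * b[col];
    }
  }
  // Back substitution.
  std::vector<double> beta(n);
  for (std::size_t i = n; i-- > 0;) {
    double sum = b[i];
    for (std::size_t j = i + 1; j < n; ++j) {
      sum -= a[i][j] * beta[j];
    }
    beta[i] = sum / a[i][i];
  }
  return beta;
}

// Build (X^T W X) and (X^T W y) with an intercept column prepended to each
// row of lag coordinates, then solve for beta = [intercept, slopes...].
inline std::vector<double> smap_coefficients(const Matrix& x,
                                             const std::vector<double>& y,
                                             const std::vector<double>& w) {
  const std::size_t n = x.size();
  assert(n > 0 && y.size() == n && w.size() == n);
  const std::size_t p = x[0].size() + 1;  // +1 for the intercept
  Matrix xtwx(p, std::vector<double>(p, 0.0));
  std::vector<double> xtwy(p, 0.0);
  for (std::size_t i = 0; i < n; ++i) {
    std::vector<double> row(p, 1.0);  // row[0] stays 1 (intercept)
    for (std::size_t j = 0; j + 1 < p; ++j) {
      row[j + 1] = x[i][j];
    }
    for (std::size_t r = 0; r < p; ++r) {
      for (std::size_t c = 0; c < p; ++c) {
        xtwx[r][c] += w[i] * row[r] * row[c];
      }
      xtwy[r] += w[i] * row[r] * y[i];
    }
  }
  return solve_spd(xtwx, xtwy);
}
```

When the data lie exactly on a line, for example $y = 1 + 2x$, the recovered coefficients are the intercept and slope regardless of the (positive) weights, which makes a convenient unit test.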

Before finalizing the implementation, I wanted to get your thoughts on a couple of design choices.

1. Weighting Kernel / Distance Metric

Classic S-Map implementations (e.g., Sugihara 1994 and rEDM) typically use an exponential kernel based on Euclidean distance:

$$w_i = \exp\left(-\theta \frac{d_i}{\bar d}\right)$$

However, Section 2.1 of Esguerra & Munch (2024) defines the weighting function as:

$$w_i = \exp\left(-\theta^2 \left(\frac{d_i}{D}\right)^2\right)$$

which is effectively a Gaussian kernel based on squared Euclidean distance.

For the generic FIMS S-Map predictor, would you prefer:

  • the formulation used in the HMS-map paper, or
  • compatibility with the more traditional rEDM-style weighting?

Using squared distances has the additional benefit of avoiding a sqrt() inside the AD tape.
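To make the comparison concrete, here is a small sketch that evaluates both weighting functions behind one selector. The names `KernelType` and `kernel_weights` are hypothetical and not part of FIMS or rEDM; `scale` stands in for $\bar d$ or $D$, and the function takes unsquared distances for simplicity, so it does not show the `sqrt()`-free variant.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical selector for the two kernels discussed above.
enum class KernelType { Exponential, Gaussian };

// Compute S-Map weights from distances d_i.
//   Exponential: w_i = exp(-theta * d_i / scale)
//   Gaussian:    w_i = exp(-theta^2 * (d_i / scale)^2)
inline std::vector<double> kernel_weights(const std::vector<double>& distances,
                                          double theta, double scale,
                                          KernelType kernel) {
  assert(scale > 0.0);
  std::vector<double> weights(distances.size());
  for (std::size_t i = 0; i < distances.size(); ++i) {
    const double r = distances[i] / scale;
    weights[i] = (kernel == KernelType::Exponential)
                     ? std::exp(-theta * r)
                     : std::exp(-theta * theta * r * r);
  }
  return weights;
}
```

With `theta = 1` and `scale = 1`, the two kernels agree at distances 0 and 1 and diverge beyond that: at distance 2 the exponential weight is exp(-2) while the Gaussian weight is exp(-4).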

2. Target Alignment in Delay Embedding

The paper formulates the delay embedding map as a one-step-ahead forecast:

$$x_{t+1} = f(x_t, \ldots, x_{t-E+1})$$

While wiring the implementation, I noticed that the current DelayEmbeddingMatrix assigns target_values[row] and embedded_values[row][0] to the same observation, which appears to correspond to a 0-step-ahead target.

Am I interpreting this correctly, or is the intended target shift handled elsewhere in the workflow?

If the shift is not handled elsewhere, would aligning the embedding construction with the standard one-step-ahead formulation be the preferred approach?
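The alignment in question can be illustrated with a small sketch. `EmbeddingSketch` and `build_embedding` are hypothetical; only the field names `embedded_values` and `target_values` are taken from the description above, and the actual layout of DelayEmbeddingMatrix may differ. Setting `horizon = 0` reproduces the 0-step-ahead behavior described, and `horizon = 1` gives the one-step-ahead target.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container; not the FIMS DelayEmbeddingMatrix.
struct EmbeddingSketch {
  std::vector<std::vector<double>> embedded_values;  // rows: [x_t, x_{t-tau}, ...]
  std::vector<double> target_values;                 // x_{t+horizon} per row
};

// Build delay vectors of length `dimension` with lag `tau`, pairing each with
// the observation `horizon` steps ahead of the most recent coordinate.
inline EmbeddingSketch build_embedding(const std::vector<double>& series,
                                       std::size_t dimension, std::size_t tau,
                                       std::size_t horizon) {
  assert(dimension > 0 && tau > 0);
  EmbeddingSketch out;
  // Earliest time index with a complete lag history.
  const std::size_t first = (dimension - 1) * tau;
  for (std::size_t t = first; t + horizon < series.size(); ++t) {
    std::vector<double> row(dimension);
    for (std::size_t lag = 0; lag < dimension; ++lag) {
      row[lag] = series[t - lag * tau];
    }
    out.embedded_values.push_back(row);
    out.target_values.push_back(series[t + horizon]);
  }
  return out;
}
```

For the series `{10, 11, 12, 13, 14}` with `dimension = 2` and `tau = 1`, a horizon of 1 pairs the row `{11, 10}` with target `12`, whereas a horizon of 0 pairs it with `11`, the same observation as the first embedded coordinate.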

3. Scope of This PR

My understanding is that this PR should focus on the standard S-Map predictor only:

  • local weighted linear regression
  • prediction interface integration
  • testing and validation infrastructure

The EM iteration and state-estimation machinery required for the full HMS-map would then be a separate follow-up effort.

Just wanted to confirm that this matches the intended project scope before I proceed further.

Thanks! I currently have the S-Map implementation and associated tests working locally, and I wanted to verify these design details before pushing the next commit.

@stevemunch

stevemunch commented Jun 24, 2026 via email


@nathanvaughan-NOAA

Contributor

Hey @aman-raj-srivastva, great job on your progress so far. For these questions I would suggest:

  1. Can you add an option for the user to choose the kernel function? That seems like it would offer the most utility for comparing with other approaches and leave options open for different functions to be added in the future. @stevemunch should be able to explain the rationale for the change in kernel. I think the main difference is just that the Gaussian has lighter tails than the exponential, but I thought both methods generally estimated a scale factor that may minimize the impact of the kernel choice. I just saw @stevemunch say basically the same thing above.

  2. Good catch on the indexing; I missed that the first time. I think it's best to change that to the one-step-ahead index. From @stevemunch's comment above, the state-estimation step would update the base values (which is why it's so beneficial to set this all up as linked pointers), and then the S-Map step would use the embedded and target objects to make predictions.

  3. @stevemunch is correct that the HMS uncertainty integration is critical to applying this in practice and is an important feature. However, the mechanics will be complex and, I think, fairly isolated from this initial structure development, so it's fine to compartmentalize for now and keep this pull request focused.
