feat(portfolio): Read data and calculate mean-variance for returns#81
SaurabhJamadagni wants to merge 11 commits into
Codecov Report
❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@            Coverage Diff             @@
##           master      #81      +/-   ##
==========================================
- Coverage   93.28%   89.76%   -3.53%
==========================================
  Files          29       30       +1
  Lines        2754     2862     +108
==========================================
  Hits         2569     2569
- Misses        185      293     +108
|
|
In the changes following this, I would also like to separate out the functions that calculate returns as they aren't really struct functions. Could move them to a more general utilities module or something that the whole package can make use of. |
|
Thanks for putting this together @SaurabhJamadagni; already looks really solid! Two quick notes:
Appreciate all the work so far! P.S. Just merged #65 🚀 |
|
No worries on the delay @carlobortolan. Hope you have a stress free move!
Noted. I'll give it a look.
Thanks on this one! Appreciate you pushing the final few commits to get it merged. Do we wanna keep this PR open till the whole optimization is implemented or are you looking to merge it after some of the above changes? No rush of course, I just wanted to clarify where you stand on this. |
|
Sorry for the long delay - finally finished moving (and all the bureaucracy that comes with it 🙄), so I'll have some time to review it over the weekend.
This PR is already quite well-sized with ~300 added LOC, so it's fine to leave it as is. (However, I'd say that it would be better to have the whole optimization merged as one PR, so that |
No worries @carlobortolan! I know the headache that comes with moving. I hope everything went well.
I agree with this and hence was curious. I also wanted to check if you would consider having a dev and a release branch instead of merging with main. Incomplete features could stay on dev, and after a certain number of features are added or after a certain period of time you could merge dev into main as a release, which could be documented through the current CHANGELOG or something. Do you think there's a benefit to such a separation? |
@SaurabhJamadagni: Yes, having a dev branch could make things more organized, especially for tracking unreleased features. That said, I think that setup usually benefits larger projects more. If we had However, what we could do is keep master as the release branch and handle ongoing work through feature branches. For example, in this case we could have This way, features that are less modular / need multiple PRs can be split into smaller branches & PRs, while others can still be merged in a single PR. Does this approach sound good? (P.S.: I took the liberty of fixing the merge conflicts here, since I just updated some dependencies in another PR which might cause some issues with our |
…o portfolio_optimization
Thanks for the PR and again sorry for the long review-time: The PR looks mostly good to me; just added a few very small comments 👍
edit: another thing I noticed: The weights field is Option<Vec<f64>> but currently never populated. Is this intended? It might be a good idea to add a method stub/placeholder for computing optimal weights, so we don't forget to implement this later.
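To make the placeholder idea concrete, here is a minimal sketch of what such a stub could look like. The struct below is a pared-down, hypothetical stand-in for the PR's Portfolio (the `mean_returns` field, the `PortfolioError` type and the `optimize_weights` name are all assumptions, not code from this branch); the only point is that `weights` stays `None` and callers get an explicit "not implemented" error until the optimizer lands.

```rust
/// Hypothetical, pared-down stand-in for the PR's `Portfolio` struct;
/// only the fields needed to illustrate the stub are included.
#[derive(Debug)]
pub struct Portfolio {
    /// Mean return per asset (assumed field name).
    pub mean_returns: Vec<f64>,
    /// Optimal weights; stays `None` until an optimizer fills it in.
    pub weights: Option<Vec<f64>>,
}

/// Assumed error type, so callers can tell "not implemented yet"
/// apart from a real optimization failure later on.
#[derive(Debug, PartialEq)]
pub enum PortfolioError {
    NotImplemented,
}

impl Portfolio {
    pub fn new(mean_returns: Vec<f64>) -> Self {
        Self {
            mean_returns,
            weights: None,
        }
    }

    /// Placeholder for the mean-variance optimization step.
    /// TODO: compute the weights and store them in `self.weights`.
    pub fn optimize_weights(&mut self) -> Result<(), PortfolioError> {
        Err(PortfolioError::NotImplemented)
    }
}
```

Returning an error instead of calling `todo!()` keeps the stub from panicking if something calls it before the optimization is implemented; either choice would serve as a reminder.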
|
|
#[warn(unused_variables)]
fn main() {
    let data_path = "/Users/moneymaker/Downloads/ETFprices.csv";
Use std::env::args() or a relative path to an example file (e.g., examples/data/ETFprices.csv) to avoid hard coded paths.
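A rough sketch of that suggestion, using only the standard library: take the first command-line argument if one is given, otherwise fall back to the relative example path. The helper name `resolve_data_path` and the `DEFAULT_DATA_PATH` constant are made up for illustration, and the example file is assumed to exist at that location in the repo.

```rust
use std::path::PathBuf;

/// Fallback path taken from the review comment; the file itself is
/// assumed to be checked in at this location.
const DEFAULT_DATA_PATH: &str = "examples/data/ETFprices.csv";

/// Hypothetical helper: returns the first CLI argument after the program
/// name as the data path, or the relative default if none was passed.
pub fn resolve_data_path<I>(mut args: I) -> PathBuf
where
    I: Iterator<Item = String>,
{
    // `nth(1)` skips the program name and yields the first real argument.
    args.nth(1)
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(DEFAULT_DATA_PATH))
}
```

In `main` this would be called as `let data_path = resolve_data_path(std::env::args());`, so running the example with no arguments still works from the repository root.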
fn calculate_simple_returns(prices: &Array2<f64>) -> Array2<f64> {
    let simple_returns =
        (&prices.slice(s![1.., ..]) - &prices.slice(s![..-1, ..])) / prices.slice(s![..-1, ..]);
    simple_returns.to_owned()
calculate_simple_returns and calculate_log_returns currently create new arrays using .to_owned(). Consider in-place operations or preallocating arrays for large datasets.
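For illustration, this is roughly what the preallocation idea looks like. It is a std-only sketch on a flat, row-major slice rather than on `Array2<f64>`, so it compiles without ndarray; the function names and the `cols` parameter are assumptions and not the PR's API. The caller allocates the output buffer once and each function writes into it without creating intermediate arrays.

```rust
/// Writes simple returns (next - prev) / prev into `out`.
/// `prices` is a row-major rows x cols grid; `out` must hold (rows - 1) x cols values.
pub fn simple_returns_into(prices: &[f64], cols: usize, out: &mut [f64]) {
    assert!(cols > 0 && prices.len() % cols == 0, "prices must be a full rows x cols grid");
    assert_eq!(out.len() + cols, prices.len(), "out must hold rows - 1 rows");
    for (i, r) in out.iter_mut().enumerate() {
        let prev = prices[i]; // same column, earlier row
        let next = prices[i + cols]; // same column, following row
        *r = (next - prev) / prev;
    }
}

/// Writes log returns ln(next / prev) into `out`, with the same layout as above.
pub fn log_returns_into(prices: &[f64], cols: usize, out: &mut [f64]) {
    assert!(cols > 0 && prices.len() % cols == 0, "prices must be a full rows x cols grid");
    assert_eq!(out.len() + cols, prices.len(), "out must hold rows - 1 rows");
    for (i, r) in out.iter_mut().enumerate() {
        // ln(next / prev) equals ln(next) - ln(prev) without a temporary log-price array.
        *r = (prices[i + cols] / prices[i]).ln();
    }
}
```

A caller would create the buffer once, e.g. `let mut out = vec![0.0; (rows - 1) * cols];`, and could reuse it across calls. Neither function guards against zero or negative prices; that has to be handled when the data is read.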
rand_distr = "0.5.1"
rayon = "1.10.0"
statrs = "0.18.0"
ndarray = "0.16.1"
I think the latest version is 0.17.0 - is there any reason for using 0.16.x? If not, we should probably upgrade to the newer version
Not really. I think I just copied what was on crates.io but I might be remembering it incorrectly. I'll update it.
fn calculate_log_returns(prices: &Array2<f64>) -> Array2<f64> {
    let log_prices = prices.mapv(|x| x.ln());
    let log_returns = &log_prices.slice(s![1.., ..]) - &log_prices.slice(s![..-1, ..]);
    log_returns.to_owned()
|
Hey @carlobortolan! Apologies for the giant break in between! I was wrapping up my final semester and graduated a couple of weeks back. Had to go through the thread again a bit cause I lost touch with the previous commits.
Agree with this, what I was thinking before is that I would immediately implement the function. I did not anticipate such a large break in between. I'll go ahead and complete the implementation for the optimization that's left for this feature. |
|
Hey there @SaurabhJamadagni , no worries at all and congratulations on graduating!
Sounds good 👍 |
* chore: initial issue structure setup * feat: trait structure for data sources * fix(linting): string formatting (carlobortolan#64) * refactor: string formatting * refactor: update string formatting * fix: changes to fix linting issues and scope changes * fix: changed variable with camel case causing linting error * fix: allowing for dead code to fix linting checks * test: Adding initial stage tests for data module * Update tests/data.rs Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> * feat: trait structure for data sources * fix: changes to fix linting issues and scope changes * feat: saving current state of code based on a parent trait * feat: fetch global quote for alpha vantage * feat(data): traits setup for data sources (carlobortolan#63) * chore: initial issue structure setup * feat: trait structure for data sources * fix(linting): string formatting (carlobortolan#64) * refactor: string formatting * refactor: update string formatting * fix: changes to fix linting issues and scope changes * fix: changed variable with camel case causing linting error * fix: allowing for dead code to fix linting checks * test: Adding initial stage tests for data module * Update tests/data.rs Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> --------- Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> Co-authored-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> * chore(deps): bump criterion from 0.6.0 to 0.7.0 (carlobortolan#67) * chore(deps): bump criterion from 0.6.0 to 0.7.0 Bumps [criterion](https://github.com/bheisler/criterion.rs) from 0.6.0 to 0.7.0. - [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md) - [Commits](bheisler/criterion.rs@0.6.0...0.7.0) --- updated-dependencies: - dependency-name: criterion dependency-version: 0.7.0 dependency-type: direct:production update-type: version-update:semver-minor ... 
Signed-off-by: dependabot[bot] <support@github.com> * Update Cargo.lock and Cargo.lock.MSRV --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: carlobortolan <carlobortolan@gmail.com> * chore(deps): bump rand from 0.9.1 to 0.9.2 (carlobortolan#66) * chore(deps): bump rand from 0.9.1 to 0.9.2 Bumps [rand](https://github.com/rust-random/rand) from 0.9.1 to 0.9.2. - [Release notes](https://github.com/rust-random/rand/releases) - [Changelog](https://github.com/rust-random/rand/blob/master/CHANGELOG.md) - [Commits](rust-random/rand@rand_core-0.9.1...rand_core-0.9.2) --- updated-dependencies: - dependency-name: rand dependency-version: 0.9.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * Update Cargo.lock.MSRV --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: carlobortolan <carlobortolan@gmail.com> * feat(fixed_income): add basic FI structure (carlobortolan#74) * Add FI module definition * Add trait module for FI * Add PriceResult struct * Add FI types and custom pricing errors * Update day_count to include year_fraction and day_count methods * Add TODOs * Remove out unused future bond imports * Add basic tests * Add cashflow tests for schedule generation and price result validation * Remove unused imports and bond_price function stub from bond_pricing.rs * Update README (carlobortolan#76) Merged the Outlook section into the Contributing section. 
Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> * feat(fixed_income): implement zero-coupon bond pricing (carlobortolan#75) * Add FI module definition * Add trait module for FI * Add PriceResult struct * Add FI types and custom pricing errors * Add ZeroCouponBond struct and implement Bond trait for pricing * Update day_count to include year_fraction and day_count methods * Add TODOs * Refactor ZeroCouponBond pricing logic to use DayCountConvention * Refactor bond module documentation and add test cases for ZeroCouponBond * Remove out unused future bond imports * Add basic tests * Add cashflow tests for schedule generation and price result validation * Remove unused imports and bond_price function stub from bond_pricing.rs * Update README and add Zero Coupon Bond example * update zero coupon bond tests and add day count tests * add maturity handling to FI day_counts * fix linting * add validation tests for ZeroCouponBond pricing errors * update tests * remove unused leap year check * update icma daycount tests * feat: trait structure for data sources * fix: changes to fix linting issues and scope changes * fix: allowing for dead code to fix linting checks * feat: trait structure for data sources * fix: changes to fix linting issues and scope changes * feat: saving current state of code based on a parent trait * feat: fetch global quote for alpha vantage * fix: Draft PR suggestions v1 * feat: adding company overview fundamental data * fix: merge conflict fixes * feat(data): traits setup for data sources (carlobortolan#63) * chore: initial issue structure setup * feat: trait structure for data sources * fix(linting): string formatting (carlobortolan#64) * refactor: string formatting * refactor: update string formatting * fix: changes to fix linting issues and scope changes * fix: changed variable with camel case causing linting error * fix: allowing for dead code to fix linting checks * test: Adding initial stage tests for data module 
* Update tests/data.rs Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> --------- Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> Co-authored-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> * feat: trait structure for data sources * fix: changes to fix linting issues and scope changes * feat: trait structure for data sources * fix: changes to fix linting issues and scope changes * feat: saving current state of code based on a parent trait * feat: fetch global quote for alpha vantage * fix: merge conflict fixes * Update MSRV and dependencies - Bump minimum supported Rust version from 1.77.0 to 1.82.0 in Cargo.toml and README.md. - Update reqwest dependency from version 0.12 to 0.12.23. - Upgrade tokio dependency from version 1.46.1 to 1.47.1. * update main function to use Tokio runtime and improve output formatting for stock quotes * test: async call success check for get_stock_quote & get_company_overview * test: removing redundant import * update deps in Cargo.lock and Cargo.lock.MSRV * update AlphaVantageSource initialization and add request fail test case --------- Signed-off-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: Saurabh Jamadagni <68957712+SaurabhJamadagni@users.noreply.github.com> Co-authored-by: Carlo Bortolan <106114526+carlobortolan@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: carlobortolan <carlobortolan@gmail.com>
* Add Python bindings and examples for quantrs library - Implemented Python bindings using PyO3 for fixed income functionalities including DayCount and ZeroCouponBond. - Created README-py.md to document the Python bindings and usage examples. - Added basic usage examples in examples/python/basic_usage.py. - Integrated pandas for bulk calculations in examples/python/pandas_integration.py. - Developed pytest-compatible tests for Python bindings in tests-py/fixed_income.py. - Updated pyproject.toml to include project metadata and dependencies for Python. - Added requirements.txt for testing dependencies. - Enhanced Cargo.toml and Cargo.lock with new dependencies for Python integration. - Improved documentation and error handling in Python bindings. * move python files to bindings/python * update ci.yml to include python tests * update Cargo.lock.MSRV * update ci.yml * update ci.yml * rename python tests * feat: update version of quantrs to 0.1.7 in Cargo.toml and Cargo.lock files * implement fixed income bindings for DayCount and ZeroCouponBond
* chore(deps): bump tokio from 1.47.1 to 1.48.0 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.47.1 to 1.48.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](tokio-rs/tokio@tokio-1.47.1...tokio-1.48.0) --- updated-dependencies: - dependency-name: tokio dependency-version: 1.48.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * Update Cargo.lock and Cargo.lock.MSRV --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: carlobortolan <carlobortolan@gmail.com>
|
Hey @carlobortolan! Just pushed some code to resolve:
Wanted to check if this is what you meant by in-place operations. I was thinking I would move the functions to utils once I am done with the optimization code first. After coming back to the code after some time, I was trying to remember some of the Rust syntax as well as work with merge conflicts, and that really took some time. I think I want to focus on getting the implementation done before I refactor code, if that's okay. I wanted to ask, how do you deal with merge conflicts on the Cargo.lock or Cargo.lock.MSRV files? Do you manually pick between incoming and current changes, or is there a more efficient way to do it, as the Cargo.lock file is auto-generated? Thank you for your patience on this one. I really appreciate it! |
|
Hey @SaurabhJamadagni thanks for your update and sorry for the late reply! (I somehow missed the notification for your message) I'm currently on vacation, but I'll review the code once I've access to a laptop/PC again later this week.
👍
What I usually do is first delete the current lock files and then:
# remove conflicted file
rm Cargo.lock
# generate MSRV lockfile
rustup install 1.83.0
rustup override set 1.83.0
cargo generate-lockfile
mv Cargo.lock Cargo.lock.MSRV
# generate stable lockfile
rustup override unset
cargo generate-lockfile
Edit: Also took the liberty of cleaning up the merge conflicts on your branch, since I just released a new version - hope that's alright. |
@@ -114,16 +114,42 @@
/// Function to calculate log returns
fn calculate_log_returns(prices: &Array2<f64>) -> Array2<f64> {
Wanted to check if this is what you meant by in-place operations
Yes exactly! This looks perfect 👍
result
    .expect("Failed to read record")
    .iter()
    .map(|s| s.parse::<f64>().unwrap_or(0.0))
Just noticed an edge case: If there is a missing or malformed value in the CSV, it currently defaults to 0.0, but this would cause issues afterwards:
- In calculate_simple_returns, if prv is 0.0, the calculation (nxt - prv) / prv will result in inf or NaN (division by zero)
- In calculate_log_returns, 0.0.ln() will result in -inf
If these inf or NaN values feed into the cov matrix, they will most likely ruin the portfolio optimization results. I think we should handle bad data differently, e.g., by filtering out bad rows entirely or carrying forward the previous day's price (or maybe you also have any other ideas?).
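To make the two options concrete, here is a std-only sketch of both: dropping any row with a bad value, and carrying the previous row's price forward. It works on already-split string fields instead of the csv crate's records so that it stands alone; the function names are hypothetical, and treating non-positive or non-finite prices as "bad" is an assumption on top of what is described above.

```rust
/// Parses one field as a usable price: finite and strictly positive.
fn parse_price(field: &str) -> Option<f64> {
    field
        .trim()
        .parse::<f64>()
        .ok()
        .filter(|p| p.is_finite() && *p > 0.0)
}

/// Option 1: keep only rows in which every field is a usable price.
pub fn drop_bad_rows(records: &[Vec<&str>]) -> Vec<Vec<f64>> {
    records
        .iter()
        .filter_map(|record| {
            // Collecting into Option<Vec<_>> yields None if any field is bad.
            record.iter().map(|s| parse_price(s)).collect::<Option<Vec<f64>>>()
        })
        .collect()
}

/// Option 2: replace a bad field with the value from the previous kept row.
/// Rows that cannot be repaired (e.g. a bad value in the very first row) are dropped.
pub fn forward_fill(records: &[Vec<&str>]) -> Vec<Vec<f64>> {
    let mut rows: Vec<Vec<f64>> = Vec::with_capacity(records.len());
    for record in records {
        let row: Option<Vec<f64>> = record
            .iter()
            .enumerate()
            .map(|(col, s)| {
                parse_price(s)
                    .or_else(|| rows.last().and_then(|prev| prev.get(col).copied()))
            })
            .collect();
        if let Some(row) = row {
            rows.push(row);
        }
    }
    rows
}
```

Forward filling produces a zero return for the filled day, which slightly understates variance, while dropping rows shortens the sample; which trade-off is acceptable probably depends on how sparse the bad data is. Rows with a different number of columns are not handled here.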
…o portfolio_optimization
The PR is in reference to issue:
PR makes the following changes:
- Portfolio struct which holds:
- .csv file which contains price data.
- fn new() will read from the data file and perform return calculations and produce a covariance matrix.
- ndarray and ndarray_stats are used to store records from the csv and perform operations.
Example output:

Next steps:
@carlobortolan please let me know your feedback on this. I haven't written tests for this yet. Was planning to include them when I add the optimization part. Can add something before as well if you would like that before merging. Let me know :)