fix(data): implement market data layer; stop .gitignore swallowing src/data (#1) #19
Open
bradsmithmba wants to merge 1 commit into
…/data
src/data was missing __init__.py, providers.py, models.py, cache.py, and
schema.py, so importing FeatureEngineering (and the whole feature pipeline)
failed at runtime. Root cause: an unanchored `data/` rule in .gitignore
matched src/data/ at every level, so modules added under src/data were
silently ignored and never committed.
- Anchor the rule to `/data/` (ignores only the root runtime cache dir,
  config cache_db_path = data/cache.db; no longer src/data/).
- Implement the data layer to the contract in the existing, previously
  unrunnable tests/test_providers.py and tests/test_cache.py:
  - models.py: PriceData, OptionChainData
  - providers.py: YFinanceProvider, RateLimiter, error hierarchy,
    MarketDataRequest, get_default_provider
  - schema.py: SQLAlchemy ORM (PriceHistory, OptionsData, CacheMetadata)
  - cache.py: SQLite-backed CacheManager with TTL, stats, cleanup
  - __init__.py: package exports consumed by src/features/base.py
Unblocks src.features and src.data.regime_labeler. 21 passed.
Closes cloudtrainerwork#1
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This was referenced Jun 10, 2026
Summary
`src/data/` was missing `__init__.py`, `providers.py`, `models.py`, `cache.py`, and `schema.py`. Any import of `FeatureEngineering`, and therefore the entire feature-engineering and regime-detection pipeline, failed at runtime with `ImportError`.

Closes #1.
Root cause: a .gitignore footgun
This was not simply unwritten code. `.gitignore` line 165 had an unanchored rule, `data/`. An unanchored directory pattern matches a directory of that name at every level, so it matched `src/data/` as well as the intended root-level runtime cache directory (`config.cache.cache_db_path = data/cache.db`). Any module added under `src/data/` was silently ignored by git and never committed.

Evidence it was written and then lost:
- `tests/test_providers.py` and `tests/test_cache.py` already exist and specify a detailed API for `YFinanceProvider`, `RateLimiter`, `CacheManager`, and the ORM schema.
- `requirements.txt` already declares `yfinance>=0.2.28` and `sqlalchemy>=2.0.0`.
- The modules already in `src/data/` (`data_utils.py`, `regime_labeler.py`, `training_data.py`) predate the rule, so they remained tracked, which is why the directory looked partially populated.

The fix anchors the rule to the repository root, changing it to `/data/`. This still ignores the runtime cache (`data/cache.db`) but no longer touches `src/data/`. Without this change, any future module added under `src/data/` would silently vanish again.

Changes
- `.gitignore`: `data/` → `/data/` (anchor to repo root)
- `src/data/models.py`: `PriceData`, `OptionChainData` dataclasses
- `src/data/providers.py`: `YFinanceProvider`, `RateLimiter`, `DataProviderError` / `InvalidSymbol` / `DataNotAvailable`, `MarketDataRequest`, `get_default_provider()`
- `src/data/schema.py`: SQLAlchemy ORM (`PriceHistory`, `OptionsData`, `CacheMetadata`)
- `src/data/cache.py`: `CacheManager` with TTL freshness, hit/miss stats, and cleanup
- `src/data/__init__.py`: package exports consumed by `src/features/base.py`

The implementation was written to the contract defined by the pre-existing test files, not invented; the tests are the spec.
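For reviewers who want to see the root cause in isolation, here is a minimal sketch of the matching difference between `data/` and `/data/`. It is a simplified model written for this description, not git's actual algorithm: the helper name `dir_rule_ignores` is hypothetical, and it only handles single-name directory rules with no wildcards, negation, or nested `.gitignore` files.

```python
from pathlib import PurePosixPath


def dir_rule_ignores(rule: str, path: str) -> bool:
    """Simplified model of a single-name .gitignore directory rule.

    Handles only rules shaped like 'data/' (unanchored) or '/data/'
    (anchored to the repository root). Hypothetical helper, not git itself.
    """
    anchored = rule.startswith("/")
    name = rule.strip("/")
    # A directory rule can only match directory components, so drop the file name.
    dirs = PurePosixPath(path).parts[:-1]
    if anchored:
        # Anchored: the directory must sit directly under the repository root.
        return bool(dirs) and dirs[0] == name
    # Unanchored: a directory with that name matches at any depth.
    return name in dirs


if __name__ == "__main__":
    for rule in ("data/", "/data/"):
        for path in ("data/cache.db", "src/data/providers.py"):
            print(f"{rule!r:10} ignores {path!r}: {dir_rule_ignores(rule, path)}")
```

On a real checkout, `git check-ignore -v src/data/providers.py` reports which `.gitignore` line, if any, matches a given path.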
Testing
The exact reproduction from the issue now imports cleanly.

Broader run across `tests/{test_providers,test_cache}.py`, `tests/features/`, and `tests/models/` (excluding one unrelated missing module): 262 passed. Almost none of these collected before this change.

Related findings (separate issues, not addressed here)
- The same unanchored pattern appears for `models/` (line 168), which silently ignores `src/models/`. Filed separately.
- `src/models/integrated_selector.py` is genuinely absent and breaks the `recommendation_engine` import. Filed separately.
- `transfer_trainer` has a torch API drift (`ReduceLROnPlateau` constructor arg). Filed separately.

🤖 Generated with Claude Code