Complete Package Restructuring to Modern Python Standards #1
Open
tsenoner wants to merge 13 commits into
- Update project metadata (keywords, classifiers for PyPI)
- Reorganize sections in canonical PEP 621 order
- Update version to 1.0.0
- Add comprehensive package metadata

- Move all Ruff settings from ruff.toml to pyproject.toml
- Delete ruff.toml file
- Add C901 complexity rule to ignore list

- Add strict MyPy configuration to pyproject.toml
- Configure module-level overrides for gradual migration
- Enable disallow_untyped_defs with override exceptions
- Add mypy to dev dependencies
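As a rough illustration of what enabling disallow_untyped_defs means for code outside the override exceptions, the sketch below contrasts an unannotated function with an annotated one. Both helper names are hypothetical and are not taken from taxembed.

```python
from __future__ import annotations

import math


# Hypothetical helper, not part of taxembed.
# With disallow_untyped_defs enabled, mypy reports a definition like this
# ("Function is missing a type annotation") unless the containing module
# is covered by a per-module override that relaxes the rule.
def norm_untyped(vec):
    return math.sqrt(sum(x * x for x in vec))


# Hypothetical annotated equivalent: parameters and return type are declared,
# so the definition satisfies disallow_untyped_defs.
def norm_typed(vec: list[float]) -> float:
    return math.sqrt(sum(x * x for x in vec))
```

Both functions behave identically at runtime; the setting only affects what the type checker accepts, which is why module-level overrides allow a gradual migration.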

- Move all code to src/taxembed/ package
- Create logical modules: models, training, data, visualization, analysis, validation, builders, cli
- Add proper __init__.py files for clean imports
- Organize code into maintainable structure

- Delete train_small.py, train_hierarchical.py
- Remove analyze_hierarchy*.py files
- Clean up build_transitive_closure.py and other root scripts
- All functionality now in src/taxembed/ package

- Remove docs/archive/ directory (30+ redundant files)
- Remove obsolete documentation files
- Keep README.md, CONTRIBUTING.md, docs/user-guide.md, docs/theory.md
- Consolidate scattered docs into clean structure

- Preserve original poincare-embeddings code for reference
- Add _vendor/README.md explaining provenance

- Train command uses python -m taxembed.cli.train
- Visualize command uses python -m taxembed.visualization.umap_viz
- Remove non-existent script path references
- All CLI commands now properly use package structure
Note: Changes included in previous structure commit

- Fix path resolution to find project root (3 levels up)
- Add auto-extraction of .dmp files via ensure_taxdump()
- Remove wrong data directory path calculation
- Ensure visualization finds taxonomy files
Note: Changes included in previous structure commit
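The path fix and auto-extraction described in this commit could look roughly like the sketch below. It is not the actual taxembed implementation: the function names project_root and ensure_taxdump_sketch, the archive name taxdump.tar.gz, and the reading of "3 levels up" as three directories above the module's own directory are all assumptions made for illustration.

```python
from __future__ import annotations

import tarfile
from pathlib import Path


def project_root(module_file: str, levels_up: int = 3) -> Path:
    """Return the directory `levels_up` levels above the module's directory.

    Assumption: for <root>/src/taxembed/visualization/umap_viz.py,
    three levels above visualization/ is <root>.
    """
    return Path(module_file).resolve().parents[levels_up]


def ensure_taxdump_sketch(
    data_dir: Path, archive_name: str = "taxdump.tar.gz"
) -> list[Path]:
    """Hypothetical stand-in for ensure_taxdump(): extract .dmp files if missing."""
    existing = sorted(data_dir.glob("*.dmp"))
    if existing:
        # Already extracted; nothing to do.
        return existing
    archive = data_dir / archive_name
    if not archive.is_file():
        raise FileNotFoundError(f"No .dmp files and no archive at {archive}")
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(".dmp"):
                # Flatten to the base name so nothing is written outside data_dir.
                member.name = Path(member.name).name
                tar.extract(member, path=data_dir)
    return sorted(data_dir.glob("*.dmp"))
```

Resolving the root from the module's own location, rather than from the current working directory, keeps the lookup stable no matter where the command is launched from.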

- Add Development section to README.md with tool quickstart
- Expand CONTRIBUTING.md with detailed Ruff, MyPy, Pytest guides
- Add Development section to docs/user-guide.md
- Add docs/theory.md with mathematical background
- Include code quality workflow examples

- Add .mypy_cache/ for MyPy type checker
- Use /data/ to ignore only the root-level data directory, not src/taxembed/data/
- Remove obsolete entries (wordnet, tox, old venv names)
- Simplify and organize ignore patterns

- Delete Makefile in favor of pyproject.toml scripts
- Remove requirements.txt (using pyproject.toml)
- Remove QUICKSTART.md (info now in README.md)
- Clean up remaining hype/ Cython files
- Remove scripts/cleanup and regenerate scripts
- All commands now via uv and pyproject.toml

- Add examples/ directory with demonstration scripts
- Add comprehensive test suite (conftest, test_models, test_training, test_data, test_validation)
- Add CODE_OF_CONDUCT.md
- Update uv.lock with latest dependencies
- Complete package restructuring
Complete Package Restructuring to Modern Python Standards
🎯 Overview
Transform taxembed from a research codebase with scattered scripts into a production-ready Python package following modern best practices (PEP 517, 518, 621).
🚀 Key Changes
Package Structure
- All code in src/taxembed/ with logical module organization
- Original poincare-embeddings code preserved in _vendor/ for reference
Configuration & Tooling
- Single pyproject.toml with all settings (Ruff, MyPy, Pytest)
- Removed Makefile, ruff.toml, requirements.txt
Bug Fixes
Documentation
Cleanup
- Updated .gitignore with proper exclusions
✅ Quality Assurance
All quality checks passing.
🔄 Migration Impact
BREAKING CHANGE: Complete restructure - no backwards compatibility.
Before: 40+ scattered scripts in root
After: Clean src/taxembed/ package structure
CLI interface unchanged - all commands work the same way.
Note: Project was not in production use.
📦 What This Enables
Result: A production-ready, professionally structured Python package ready for future development.