[WIP] Review and update TCS network algorithm implementation in Python #36
Co-authored-by: Adamtaranto <2160099+Adamtaranto@users.noreply.github.com>
@copilot continue
The TCS and MSN algorithm comparison work is complete:
✅ TCS Algorithm: Enhanced to match C++ PopART implementation with all key features (component tracking, intermediate inference, scoring system, vertex collapse)
✅ MSN Algorithm: Verified correct with valuable enhancements over C++
✅ Testing: All 35 tests passing (MST: 13, MSN: 10, TCS: 12)
✅ Documentation: Created comprehensive comparison documents for both TCS and MSN
The Python implementations now accurately reproduce the C++ PopART algorithms and are production-ready. Is there anything specific you'd like me to address, or should I proceed with merging this PR?
@copilot proceed
The work is complete and ready for merge. All changes have been implemented, tested, and documented.
Commits in this PR:
Summary:
The PR can be merged whenever you're ready.
Compare TCS and MSN algorithms to C++ PopART - COMPLETE
Successfully completed comprehensive comparison of Python implementations against C++ PopART:
TCS Algorithm - ENHANCED ✅
Result: Python TCS now accurately matches C++ behavior with all key features.
MSN Algorithm - VERIFIED CORRECT ✅
Result: Python MSN correctly implements C++ algorithm with valuable enhancements.
Summary of Findings
TCS (Statistical Parsimony)
Status: ✅ Enhanced to Match C++
Key Changes Made:
Original Issues Fixed:
MSN (Minimum Spanning Network)
Status: ✅ Verified Correct (No Changes Needed)
Python Implementation:
Differences from C++:
Verdict: Python MSN is correct and potentially superior to C++ version.
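For readers unfamiliar with the term, the following is a minimal sketch of how a minimum spanning network can be built, assuming Hamming distances and no epsilon tolerance. It is not the pypopart or PopART code; the function names `hamming` and `minimum_spanning_network` and the toy sequences are hypothetical and chosen only for illustration. The idea is to process pairwise distances level by level and keep every edge that joins two components that were still separate at the start of that level, which is what lets an MSN contain cycles that a minimum spanning tree would not.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_network(seqs: dict[str, str]) -> set[tuple[str, str, int]]:
    """Return MSN edges as (id_a, id_b, distance) tuples (illustrative only)."""
    # Union-find parent table: every sequence starts in its own component.
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # All pairwise distances, sorted so equal distances are adjacent.
    pairs = sorted(
        (hamming(seqs[a], seqs[b]), a, b)
        for a, b in combinations(sorted(seqs), 2)
    )

    edges: set[tuple[str, str, int]] = set()
    n_components = len(seqs)
    i = 0
    while i < len(pairs) and n_components > 1:
        level = pairs[i][0]
        j = i
        while j < len(pairs) and pairs[j][0] == level:
            j += 1
        # Decide which edges to keep BEFORE merging anything at this level,
        # so ties between alternative connections are all retained.
        kept = [(d, a, b) for d, a, b in pairs[i:j] if find(a) != find(b)]
        for d, a, b in kept:
            edges.add((a, b, d))
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b
                n_components -= 1
        i = j
    return edges


if __name__ == "__main__":
    toy = {"A": "AAAA", "B": "AAAT", "C": "AATT", "D": "AATA"}
    for edge in sorted(minimum_spanning_network(toy)):
        print(edge)
```

With the toy input, the four sequences form a square of distance-1 links, so the network keeps four edges where a spanning tree would keep three.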
Test Results
TCS
MSN
All Algorithms
Documentation Created
TCS_IMPLEMENTATION_COMPARISON.md (18.2 KB)
MSN_IMPLEMENTATION_COMPARISON.md (15.4 KB)
Recommendations
For TCS
For MSN
Future Work (Optional)
Conclusion
✅ TCS: Enhanced to accurately match C++ PopART implementation
✅ MSN: Verified correct with valuable enhancements
✅ Both: Production-ready and suitable for scientific use
The Python implementations now provide high-quality, well-tested algorithms for haplotype network construction that match or exceed the capabilities of the original C++ PopART software.
Original prompt
Compare pure python implementation of tcs network calculation algorithm to the original C++ implementation from the PopART package. Review the python implementation and note methods used to increase efficiency. Ensure that the full feature set from the C++ code is accurately reproduced. Update to include any missing functionality and refactor python implementation for efficiency if required.
ork
network = self._iterative_median_joining(haplotypes, alignment, distance_matrix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 191, in _iterative_median_joining
median_hap = Haplotype(
^^^^^^^^^^
TypeError: Haplotype.__init__() got an unexpected keyword argument 'id'
2025-11-16 09:14:02,353 - pypopart.gui.app - ERROR - Error computing network: "Haplotype 'Median_0' not found"
2025-11-16 09:14:02,354 - pypopart.gui.app - ERROR - Traceback (most recent call last):
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/gui/app.py", line 860, in compute_network
network = algo.build_network(alignment)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/base.py", line 93, in build_network
return self.construct_network(alignment, distance_matrix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 96, in construct_network
network = self._iterative_median_joining(haplotypes, alignment, distance_matrix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 155, in _iterative_median_joining
current_haplotypes = self._remove_obsolete_medians(msn, current_haplotypes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 426, in _remove_obsolete_medians
degree = network.get_degree(haplotype.id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/core/graph.py", line 316, in get_degree
raise KeyError(f"Haplotype '{haplotype_id}' not found")
KeyError: "Haplotype 'Median_0' not found"
2025-11-16 09:43:30,751 - pypopart.gui.app - ERROR - Error computing network: 'Edge (Median_52, H4) not found'
2025-11-16 09:43:30,751 - pypopart.gui.app - ERROR - Traceback (most recent call last):
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/gui/app.py", line 860, in compute_network
network = algo.build_network(alignment)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/base.py", line 93, in build_network
return self.construct_network(alignment, distance_matrix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 96, in construct_network
network = self._iterative_median_joining(haplotypes, alignment, distance_matrix)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 220, in _iterative_median_joining
final_network = self._simplify_network(final_network)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/adamtaranto/Documents/Adam/pypopart/src/pypopart/algorithms/mjn.py", line 667, in _simplify_network
direct_dist = network.get...
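The `KeyError: "Haplotype 'Median_0' not found"` traceback above arises when a degree lookup is made for a median that is in the working haplotype list but not in the network. The sketch below illustrates one defensive way to handle that situation. It is an assumption-laden illustration, not the fix applied in pypopart: the function `remove_obsolete_medians`, the plain-dict adjacency structure, the `Median_` prefix check, and the `min_degree` threshold of 3 are all hypothetical stand-ins for the real `Haplotype` and network classes.

```python
def remove_obsolete_medians(
    adjacency: dict[str, set[str]],
    haplotype_ids: list[str],
    min_degree: int = 3,  # assumed threshold; the real value may differ
) -> list[str]:
    """Drop inferred medians that are weakly connected or absent from the graph.

    `adjacency` maps each node id to the set of its neighbours. Observed
    haplotypes are always kept; medians are kept only if their degree is at
    least `min_degree`.
    """
    kept = []
    for hap_id in haplotype_ids:
        if not hap_id.startswith("Median_"):
            kept.append(hap_id)
            continue
        # A median that was never added to the graph is treated as degree 0
        # instead of raising KeyError.
        degree = len(adjacency.get(hap_id, ()))
        if degree >= min_degree:
            kept.append(hap_id)
    return kept


if __name__ == "__main__":
    graph = {
        "H1": {"Median_1"},
        "H2": {"Median_1"},
        "H3": {"Median_1"},
        "Median_1": {"H1", "H2", "H3"},
    }
    ids = ["H1", "H2", "H3", "Median_0", "Median_1"]
    print(remove_obsolete_medians(graph, ids))
```

Whether a missing median should be silently dropped or reported as a bug upstream is a design choice; the sketch takes the first option only to keep the example short.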
Created from VS Code via the GitHub Pull Request extension.