Skip to content

Refactor issue90#106

Open
marrip wants to merge 24 commits into
InPreD:develop_issue90from
marrip:refactor-issue90
Open

Refactor issue90#106
marrip wants to merge 24 commits into
InPreD:develop_issue90from
marrip:refactor-issue90

Conversation

@marrip
Copy link
Copy Markdown
Collaborator

@marrip marrip commented May 7, 2026

Hey Xiaoli,

My first attempt at compressing the changes you made by using pandas. I would like to refactor a bit more but I want you to be on board first so let me know if this looks ok. 🙏

OBS! The branch comes from my fork again so when merged we still need to merge your branch, just fyi 😉

@marrip marrip requested a review from xiaoliz0 May 7, 2026 13:35
@marrip marrip self-assigned this May 7, 2026
@marrip marrip changed the base branch from main to develop_issue90 May 7, 2026 13:35
@xiaoliz0 xiaoliz0 requested a review from tonjegul May 12, 2026 12:10
@xiaoliz0
Copy link
Copy Markdown
Contributor

Hi Martin, I start to check and test your codes now.
Emm, a small bug I fixed is in line 1354 in Script/PRONTO.py "pandas.read.csv" to "pandas.read_csv". More errors come from reading the small_variant_table file due to the formatting issue?
Screenshot 2026-05-12 at 14 10 35

@tonjegul It is probably better we prepare a testing sample data to Martin, then he could test and fix the codes in a much shorter time.

@marrip
Copy link
Copy Markdown
Collaborator Author

marrip commented May 12, 2026

Hi Martin, I start to check and test your codes now. Emm, a small bug I fixed is in line 1354 in Script/PRONTO.py "pandas.read.csv" to "pandas.read_csv". More errors come from reading the small_variant_table file due to the formatting issue?

Hey Xiaoli! Sorry about that, I think I fixed both issues now. But I agree, some testdata would be highly appreciated ☺️

@xiaoliz0
Copy link
Copy Markdown
Contributor

Hi @marrip, I added a fake sample data files in the "test_data/ous/test_files_for_PRONTO_newFunctions" which was designed by Tonje. The ppt report in the output folder is the output. Could you use it to test your codes for debugging? Thanks! :)

@marrip
Copy link
Copy Markdown
Collaborator Author

marrip commented May 20, 2026

Hi @marrip, I added a fake sample data files in the "test_data/ous/test_files_for_PRONTO_newFunctions" which was designed by Tonje. The ppt report in the output folder is the output. Could you use it to test your codes for debugging? Thanks! :)

🍐 fect! Great work, guys, will test and get back to you 👍

@marrip marrip force-pushed the refactor-issue90 branch from c68fd7d to 74c3d88 Compare May 21, 2026 08:04
@marrip
Copy link
Copy Markdown
Collaborator Author

marrip commented May 21, 2026

Hey, we tested with the test data locally and that worked fine now. We mitigated all bugs we found. There seems to be a discrepancy between the test raw data and the expected output ppt or maybe I misunderstood something. Please let me know.

@xiaoliz0
Copy link
Copy Markdown
Contributor

Hey, we tested with the test data locally and that worked fine now. We mitigated all bugs we found. There seems to be a discrepancy between the test raw data and the expected output ppt or maybe I misunderstood something. Please let me know.

Thanks Martin! Great work! Emm, about the output ppt for the testing sample, I will take a further check. Maybe this is due to the name of the small_variant_table file. Which makes the scripts did not read it, so only the meta data is updated in the output ppt. I will upload a new output ppt later.

@xiaoliz0
Copy link
Copy Markdown
Contributor

Hi @marrip, I try to test your new codes in our side. PRONTO failes when there is variant in small_variant_table_forQC.tsv fitting for the FILTER0-1 conditions in configure_PRONTO.ini
The error is like: "root:Column CPSR_ACMG_class not found in data".

@marrip
Copy link
Copy Markdown
Collaborator Author

marrip commented May 22, 2026

Hi @marrip, I try to test your new codes in our side. PRONTO failes when there is variant in small_variant_table_forQC.tsv fitting for the FILTER0-1 conditions in configure_PRONTO.ini The error is like: "root:Column CPSR_ACMG_class not found in data".

I think I have fixed this on my branch?

@xiaoliz0
Copy link
Copy Markdown
Contributor

Hey, we tested with the test data locally and that worked fine now. We mitigated all bugs we found. There seems to be a discrepancy between the test raw data and the expected output ppt or maybe I misunderstood something. Please let me know.

Thanks Martin! Great work! Emm, about the output ppt for the testing sample, I will take a further check. Maybe this is due to the name of the small_variant_table file. Which makes the scripts did not read it, so only the meta data is updated in the output ppt. I will upload a new output ppt later.

Just did a quick check. The scripts did read small_variant_table_forQC.tsv file. This probably caused by the issue you mentioned earlier " all_data will always be filtered according to the last item in the list".

Screenshot 2026-05-22 at 09 49 06

@xiaoliz0
Copy link
Copy Markdown
Contributor

Hi @marrip, I try to test your new codes in our side. PRONTO failes when there is variant in small_variant_table_forQC.tsv fitting for the FILTER0-1 conditions in configure_PRONTO.ini The error is like: "root:Column CPSR_ACMG_class not found in data".

I think I have fixed this on my branch?

I checked out the codes with this PR106. Do you have some codes not committed here?

@marrip
Copy link
Copy Markdown
Collaborator Author

marrip commented May 22, 2026

Hi @marrip, I try to test your new codes in our side. PRONTO failes when there is variant in small_variant_table_forQC.tsv fitting for the FILTER0-1 conditions in configure_PRONTO.ini The error is like: "root:Column CPSR_ACMG_class not found in data".

I think I have fixed this on my branch?

I checked out the codes with this PR106. Do you have some codes not committed here?

ah, I might have an idea. Are you using your test data or the test data in the repo? There was some formatting problems.

@xiaoliz0
Copy link
Copy Markdown
Contributor

Hi @marrip, I try to test your new codes in our side. PRONTO failes when there is variant in small_variant_table_forQC.tsv fitting for the FILTER0-1 conditions in configure_PRONTO.ini The error is like: "root:Column CPSR_ACMG_class not found in data".

I think I have fixed this on my branch?

I checked out the codes with this PR106. Do you have some codes not committed here?

ah, I might have an idea. Are you using your test data or the test data in the repo? There was some formatting problems.

Yes, I use the test data which is the same in the repo. What is the formatting problem? Could you provide a right format file I could take a further check?

@marrip marrip closed this May 26, 2026
@marrip marrip reopened this May 26, 2026
@xiaoliz0
Copy link
Copy Markdown
Contributor

Hi @marrip thanks for your updates. I tested with the new codes and got some errors like:
Screenshot 2026-05-27 at 09 18 57

I can not find '0,657' in the "_TMB_coding.txt" file, but only see a '657' in the "AF_tumor_DNA" column. Is this something about the way to read the files with panda? Or some settings need to be changed from European numbers?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants