Hi there! I'm working with a dataset in which a given feature (an m/z value, but equivalent to a protein readout) can have up to 90% missing values, so in many cases a sample has NAs in all 3 of its replicates.
What would you recommend in terms of input filtering? Would you require that a feature has at least 1 replicate present in every sample? Or would you set some kind of maximum missing-value threshold (i.e., exclude features that are missing in more than X% of samples)?
Currently I remove all m/z features that are missing in more than 90% of samples, but that still leaves me with approx. 60 000 features and approx. 1500 samples.
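To make the two options concrete, here is a rough sketch of what I mean, in Python with NumPy. The layout is an assumption for illustration only: one row per replicate run, one column per m/z feature, NaN for a missing value, and a `sample_ids` vector saying which sample each row belongs to. The function names are made up and do not come from any package.

```python
import numpy as np


def filter_by_missing_threshold(X, max_missing=0.9):
    """Keep features (columns) whose fraction of NaN rows is at most max_missing."""
    missing_fraction = np.isnan(X).mean(axis=0)
    keep = missing_fraction <= max_missing
    return X[:, keep], keep


def filter_by_replicate_presence(X, sample_ids):
    """Keep features with at least one observed replicate in every sample."""
    sample_ids = np.asarray(sample_ids)
    keep = np.ones(X.shape[1], dtype=bool)
    for sample in np.unique(sample_ids):
        rows = X[sample_ids == sample]
        # A feature survives only if this sample has at least one non-NaN replicate.
        keep &= (~np.isnan(rows)).any(axis=0)
    return X[:, keep], keep


# Toy data: 2 samples x 3 replicates (rows), 3 features (columns).
nan = np.nan
X = np.array([
    [1.0, nan, nan],
    [nan, nan, nan],
    [2.0, nan, nan],
    [3.0, 1.0, nan],
    [nan, nan, nan],
    [nan, nan, nan],
])
sample_ids = ["A", "A", "A", "B", "B", "B"]

_, keep_threshold = filter_by_missing_threshold(X, max_missing=0.9)
_, keep_presence = filter_by_replicate_presence(X, sample_ids)
print(keep_threshold)
print(keep_presence)
```

The second filter is much stricter than the first: a single sample with all replicates missing is enough to drop a feature, which is why I'm unsure it is workable at 90% missingness.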
I can run the algorithm on this input, but it does not converge (at least not within 48 hours).
Thanks :)