displays pvalues for all the covariates in the model by jazberna · Pull Request #21 · iansealy/DETCT

jazberna · 2014-07-28T16:10:46Z

No description provided.

iansealy · 2014-07-31T12:08:30Z

Thanks for this. At the moment, this will all be ignored because there's nothing in DETCT::Misc::R to handle the extra p values. Have you got any thoughts on how best to display this to users? Should we just add the extra p values as extra columns in all.tsv, all.csv, etc.? I worry that users will just ignore the extra columns. Is there any way we can combine them so we still have one p value per region?

jazberna · 2014-08-06T10:42:38Z

The reported p values (before and after FDR) are obtained using a likelihood ratio test that compares the full model (which contains the interactions between condition and group) against the intercept-only model.
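
For readers unfamiliar with the test, here is a minimal sketch of a likelihood ratio test of a full model against an intercept-only model. It is not the DETCT or DESeq2 code: it assumes a base R Poisson GLM as a stand-in for DESeq2's negative binomial model, and the data, the `sib`/`mut` and `a`/`b`/`c` labels, and all variable names are made up for illustration.

```r
# Illustrative only: a Poisson GLM stands in for DESeq2's negative binomial fit.
# The sample sheet and counts below are simulated, not real DETCT data.
set.seed(1)
samples <- data.frame(
  condition = factor(rep(c("sib", "mut"), each = 6), levels = c("sib", "mut")),
  group     = factor(rep(c("a", "b", "c"), times = 4))
)
# Simulate counts for a single region with a condition effect
samples$count <- rpois(12, lambda = ifelse(samples$condition == "mut", 200, 100))

# Full model (main effects plus interaction) and intercept-only model
full    <- glm(count ~ group * condition, family = poisson, data = samples)
reduced <- glm(count ~ 1, family = poisson, data = samples)

# Likelihood ratio statistic: drop in deviance between the two nested models
lr_stat <- deviance(reduced) - deviance(full)
# Degrees of freedom: number of extra parameters in the full model
lr_df   <- df.residual(reduced) - df.residual(full)
# One p value for the region, from the chi-squared distribution
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# The same test via anova(), as a cross-check
anova_p <- anova(reduced, full, test = "Chisq")[2, "Pr(>Chi)"]
```

Note that this yields a single p value per region that says only whether the full model fits better than no covariates at all, not which term is responsible.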

iansealy · 2014-08-06T12:20:41Z

Two things:

The code as is won't work if there's only one factor (i.e. there are conditions, but not groups).
It sounds, from what you said in your email, that simplifying this to one p value per region isn't appropriate. So maybe we should go back to presenting all the numbers and just give some guidance to users about how to interpret them. That is, add a bunch more columns to all.tsv (and the other output files). What do you think? Would you be able to write the kind of guidance that anyone in the lab could use to decipher their data?

jazberna · 2014-08-06T13:34:36Z

Hi Ian.

On 6 Aug 2014, at 13:20, Ian Sealy notifications@github.com wrote:

> Two things:
>
> The code as is won't work if there's only one factor (i.e. there are conditions, but not groups).

I will run it with one factor to see if it crashes, but as far as I can see the saturated model is only used when there are two factors:

# Create DESeqDataSet (with design according to number of factors)
dds <- DESeqDataSetFromMatrix(countData, samples, design = ~ condition)
if (numFactors == 2) {
    design(dds) <- formula(~ group * condition) # <--- the asterisk only appears when there are two factors
}

> It sounds, from what you said in your email, that simplifying this to one p value per region isn't appropriate.

Yes, the thing is that the LR test is for model selection. In my last commit, that single p value tells you whether the model with condition and group is significantly better than a model without the information coming from condition and group. But since you are already interested in the condition and want to control for group, that is your model and no selection is needed.

> So maybe we should go back to presenting all the numbers and just give some guidance to users about how to interpret them. That is, add a bunch more columns to all.tsv (and the other output files). What do you think?

Yep, if I were the user I would at least want to see the p values for each factor in the model. Also, I know we all know this, but if the interaction is significant, the region counts even if the condition itself is not significant, in the same way that stats books tell you not to remove the main effects from the model if the interaction is significant. For example, from this link:
http://www.ssc.wisc.edu/sscc/pubs/sfr-stats.htm
"It's almost always a mistake to include interactions in a regression without the main effects,.."
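
To make "p values for each factor in the model" concrete, here is a rough sketch of pulling one p value per model term, with the main effects kept alongside the interaction. Again this assumes a base R Poisson GLM rather than DESeq2, and the data and labels are invented; in DESeq2 the corresponding coefficients would be the ones listed by `resultsNames(dds)`.

```r
# Illustrative only: simulated counts for one region, Poisson GLM as a stand-in.
set.seed(2)
samples <- data.frame(
  condition = factor(rep(c("sib", "mut"), each = 6), levels = c("sib", "mut")),
  group     = factor(rep(c("a", "b"), times = 6))
)
samples$count <- rpois(12, lambda = 100)

# group * condition expands to group + condition + group:condition,
# so the main effects stay in the model alongside the interaction
fit <- glm(count ~ group * condition, family = poisson, data = samples)

# One Wald p value per coefficient (intercept, group, condition, interaction)
term_p <- summary(fit)$coefficients[, "Pr(>|z|)"]
print(term_p)
```

Each of these could become its own column in the output files, named after the term it belongs to.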

> Would you be able to write the kind of guidance that anyone in the lab could use to decipher their data?

Yes, I can write some notes working through a given example.

Jorge

displays pvalues for all the covariates in the model

b8bcae5

jazberna added 2 commits August 6, 2014 11:26

reports LTR against intercept only nmodel

aea2b6c

Merge remote-tracking branch 'upstream/master'

35457a4

jazberna closed this Aug 6, 2014

iansealy reopened this Aug 6, 2014

jazberna added 5 commits August 6, 2014 15:56

added LTR test

17c554e

had to resolve conflicts

0e7557c

added interactions

7fba6d0

call to resultsNames(dds) was in wrong place

04c62b2

fixed bug in the selection of the main condition term

e9ea85c

jazberna wants to merge 8 commits into iansealy:master from jazberna:master
