add missing-tolerant adjust method #117

pdimens · 2020-07-20T19:21:31Z

This PR addresses issue #108 of including a method ignorant of missing values in a vector of P values. The reason this method is necessary/useful is that when using split-apply-combine methods for categorical data grouping when missing values are present, situations may arise where one of the grouped vectors contains only missing values and thus the resulting P value would be missing as well.

The introduced method to adjust creates a copy of the Pvalue vector, performs the correction method on all the non-missing Pvalues, and overwrites the original nonmissing Pvalues in the copied vector, returning the modified copy with all the missing values preserved in their original locations.

julia> a = [0.1, 0.2, missing, 0.4] ;

julia> adjust(a, Holm())
4-element Array{Union{Missing, Float64},1}:
 0.30000000000000004
 0.4
  missing
 0.4

This method should leave the original unaffected:

julia> b = adjust(rand(10), Holm())
10-element Array{Float64,1}:
 1.0
 1.0
 1.0
 1.0
 1.0
 0.18431430876447497
 1.0
 1.0
 1.0
 1.0

nignatiadis · 2020-07-20T19:41:38Z

Thanks for this!

I have just one concern, that was already pointed out in #108 by @juliangehring. The behaviour here should depend a lot on how the missing came about. If, say the i-th test is null, then is the missingness independent of the p-value? If this is true for all nulls tests, then this PR is correct.

Otherwise, it is not. For example, iirc some eQTL software only computes p-values that are smaller than t for some small t; the rationale is that large p-values will never be rejected by a multiple testing procedure so it is not worth spending effort to compute them. In this case the missing p-values should be replaced by 1 and not filtered out ( or they can be filtered out and then one can use the adjust(pvals, n, PValueAdjustmentMethod) functionality in this package).

In any case, I do not mind having this PR as the default, but I think it should be made explicit to users.

pdimens · 2020-07-20T20:19:24Z

That's a very good point I had not considered. It could, in theory, have a kwarg of a Bool to specify the handling of missing as the ignore or include-as-1 method, with a suitable docstring explaining the difference. What are your thoughts on that?

pdimens · 2020-07-21T00:06:05Z

@nignatiadis the committed change suggests a method accounting for the scenarios your describe:

julia> a = [0.1, 0.2, missing, 0.4] ;

julia> adjust(a, Holm())
4-element Array{Union{Missing, Float64},1}:
 0.30000000000000004
 0.4
  missing
 0.4

julia> adjust(a, Holm(), include_missing = false)    # verbose version of default kwarg
4-element Array{Union{Missing, Float64},1}:
 0.30000000000000004
 0.4
  missing
 0.4

julia> adjust(a, Holm(), include_missing = true)
4-element Array{Float64,1}:
 0.4
 0.6000000000000001
 1.0
 0.8

… into develop

add missing-tolerant adjust method

8154c0c

Update pval-adjustment.jl

2f4f801

pdimens added 2 commits July 20, 2020 19:06

update method

2272b2f

Merge branch 'develop' of https://github.com/pdimens/MultipleTesting.jl…

52d8013

… into develop

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add missing-tolerant adjust method #117

add missing-tolerant adjust method #117

pdimens commented Jul 20, 2020

nignatiadis commented Jul 20, 2020

pdimens commented Jul 20, 2020

pdimens commented Jul 21, 2020

add missing-tolerant adjust method #117

Are you sure you want to change the base?

add missing-tolerant adjust method #117

Conversation

pdimens commented Jul 20, 2020

nignatiadis commented Jul 20, 2020

pdimens commented Jul 20, 2020

pdimens commented Jul 21, 2020