matchability function #5
Comments
I'd suggest reporting this information in a |
How would that work with matrices? Right now I have (albeit not implemented) |
I'm not sure we should bother with a matrix summary method. Somebody might
On Thu, Mar 17, 2016 at 12:19 PM, Josh Errickson wrote:
|
How's this for a mockup of the results of
The unmatchable sections would drop if empty. (And always drop in For (I'm thinking of having |
I like the way this is going. I think we could wind up with something very useful to users. Let me lay out some general comments first, then specific suggestions.
General thoughts
Comments on @josherrickson's mockup:
summary(mybISM)$blockA would invoke
Possible additional features
|
Given the expanding scope of this function, is it safe to say this is no longer targeted for the 0.9-6 release? Or is there some more limited version you'd like to try and push through by next week? |
Correct, let's not target this for the 0.9-6 release. Maybe we can get some user feedback before we release it out into the wild. I could probably get some students in my class this term to try out our working version of the thing, particularly if we got it up and running over the next several days. The operation of determining the closest match for each treatment group member could be done with |
Various changes related to comments on #5
Here's what I've got so far.
Wall of text incoming.
> data(nuclearplants)
> m1 <- match_on(pr ~ cost, data=nuclearplants)
> summary(m1)
Membership: 10 treatment, 22 control
Total eligible potential matches: 220
Total eligible potential matches: 0
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.006858 0.420900 1.002000 1.102000 1.539000 3.858000
> m2 <- match_on(pr ~ cost, data=nuclearplants, caliper=1)
> summary(m2)
Membership: 10 treatment, 22 control
Total eligible potential matches: 109
Total eligible potential matches: 111
1 unmatchable control member:
V
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.006858 0.181000 0.413200 0.453300 0.724000 0.993000
> m3 <- match_on(pr ~ cost, data=nuclearplants, caliper=.05)
> summary(m3)
Membership: 10 treatment, 22 control
Total eligible potential matches: 9
Total eligible potential matches: 211
5 unmatchable treatment members:
A, B, F, G, b
16 unmatchable control members:
J, K, M, N, O, ...
See summary(m3)$unmatchable$control for a complete list.
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.006858 0.017030 0.026270 0.025840 0.027780 0.047190
> summary(m3)$unmatchable$control
[1] "J" "K" "M" "N" "O" "P" "Q" "S" "T" "U" "V" "W" "X" "Y" "Z" "d"
> m4 <- match_on(pr ~ cost + strata(pt), data=nuclearplants)
> summary(m4)
Summary across all blocks:
Membership: 10 treatment, 22 control
Total eligible potential matches: 118
Total eligible potential matches: 102
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.01703 0.30750 0.81410 0.90400 1.35000 3.43800
To see summaries for individual blocks, call for example summary(m4)$`0`.
> summary(m4)$`1`
Membership: 3 treatment, 3 control
Total eligible potential matches: 9
Total eligible potential matches: 0
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.02627 0.05736 0.10330 0.21250 0.39230 0.42340
|
Fixed the double statement above; new version below.
> nuclearplants$strat <- rep(letters[1:3], times=15)[1:32]
> m5 <- match_on(pr ~ cost + strata(strat), data=nuclearplants, caliper=.2)
> summary(m5)
Summary across all blocks:
Membership: 10 treatment, 22 control
Total eligible potential matches: 2
Total eligible potential matches: 218
6 unmatchable treatment members:
A, C, D, G, a, ...
See summary(m5)$unmatchable$treatment for a complete list.
17 unmatchable control members:
H, I, K, L, M, ...
See summary(m5)$unmatchable$control for a complete list.
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.02732 0.07945 0.08236 0.09011 0.12080 0.14060
To see summaries for individual blocks, call for example summary(m5)$`a`.
> summary(m5)$`a`
Membership: 2 treatment, 9 control
Total eligible potential matches: 2
Total eligible potential matches: 16
1 unmatchable treatment member:
b
7 unmatchable control members:
H, L, Q, T, V, ...
See summary(object)$`a`$unmatchable$control for a complete list.
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.07945 0.09475 0.11000 0.11000 0.12530 0.14060
> print(summary(m5), printAllBlocks=TRUE)
Summary across all blocks:
Membership: 10 treatment, 22 control
Total eligible potential matches: 2
Total eligible potential matches: 218
6 unmatchable treatment members:
A, C, D, G, a, ...
See summary(m5)$unmatchable$treatment for a complete list.
17 unmatchable control members:
H, I, K, L, M, ...
See summary(m5)$unmatchable$control for a complete list.
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.02732 0.07945 0.08236 0.09011 0.12080 0.14060
Indiviual blocks:
$a
Membership: 2 treatment, 9 control
Total eligible potential matches: 2
Total eligible potential matches: 16
1 unmatchable treatment member:
b
7 unmatchable control members:
H, L, Q, T, V, ...
See summary(object)$`a`$unmatchable$control for a complete list.
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.07945 0.09475 0.11000 0.11000 0.12530 0.14060
$b
Membership: 3 treatment, 8 control
Total eligible potential matches: 3
Total eligible potential matches: 21
5 unmatchable control members:
I, M, O, U, Z
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.02732 0.05484 0.08236 0.07681 0.10160 0.12080
$c
Membership: 5 treatment, 5 control
Total eligible potential matches: 0
Total eligible potential matches: 25
5 unmatchable treatment members:
A, C, D, G, a
5 unmatchable control members:
K, P, S, W, d
|
Just noting I've pushed up these changes finally, feel free to play around. |
Also, pointing out that I addressed Ben's concerns about
> data(nuclearplants)
> np <- subset(nuclearplants, pt==1)
> m <- match_on(pr ~ cost, data=np, caliper=1)
> m
control
treated d e f
a Inf 0.2016450 0.1122457
b 0.2451029 Inf Inf
c Inf 0.4412846 0.3518854
> summary(m)
Membership: 3 treatment, 3 control
Total eligible potential matches: 5
Total eligible potential matches: 4
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.1122 0.2016 0.2451 0.2704 0.3519 0.4413
> m[3] <- Inf
> m@.Data
[1] 0.2016450 0.1122457 Inf 0.4412846 0.3518854
> summary(m)
Membership: 3 treatment, 3 control
Total eligible potential matches: 4
Total eligible potential matches: 5
1 unmatchable treatment member:
b
1 unmatchable control member:
d
Summary of distances:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.1122 0.1793 0.2768 0.2768 0.3742 0.4413
|
Coming along nicely here! Suggestion: relabel at least one of the two "Total eligible potential matches" lines to clarify what they're giving back. (I'm not getting this myself.) |
"Total eligible potential matches" is basically |
Changing the second one to "Total ineligible..." will clear my issue right up! |
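For anyone reading along, here is one reading of the two counts that agrees with the 3x3 example printed earlier in the thread (5 and 4 before the extra `Inf` was added): the first line counts finite entries, the second counts infinite ones. This is only a sketch on a plain base-R matrix, under that assumption; it is not the package's code for its sparse classes.

```r
# The 3x3 distance matrix shown earlier in the thread (caliper = 1, pt == 1),
# rebuilt here as an ordinary dense matrix for illustration.
m <- matrix(c(Inf,       0.2016450, 0.1122457,
              0.2451029, Inf,       Inf,
              Inf,       0.4412846, 0.3518854),
            nrow = 3, byrow = TRUE,
            dimnames = list(treated = c("a", "b", "c"),
                            control = c("d", "e", "f")))

# Assumed meaning: a finite distance is an eligible pairing,
# an infinite distance is an ineligible one.
eligible   <- sum(is.finite(m))
ineligible <- sum(!is.finite(m))

cat("Total eligible potential matches:", eligible, "\n")
cat("Total ineligible potential matches:", ineligible, "\n")
```

The two counts always add up to the number of treatment-by-control pairs, which is a quick sanity check on the output.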
New version of BISM has been pushed, fixing the typo above, and adding a table describing block structure (only appears when
Block structure:
    Matchable Txt Matchable Ctl Unmatchable Txt Unmatchable Ctl
`a`             1             2               1               7
`b`             3             3               0               5
`c`             0             0               5               5
I shrunk the titles from "Treatment" and "Control" to "Txt" and "Ctl" to make it less wide; alternatively I could mess with
Also, the only item left on your original comments, Ben, is the distances changes. |
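A table of this shape can be sketched in base R as below. The function name `block_structure`, the list-of-dense-matrices input, and the toy distances are all hypothetical; the version that was pushed works on the package's blocked sparse class and may compute this differently.

```r
# Hypothetical helper: `blocks` is a named list of dense distance matrices,
# one per block, with treated units in rows and controls in columns.
block_structure <- function(blocks) {
  counts <- vapply(blocks, function(d) {
    finite <- is.finite(d)
    has_match_txt <- rowSums(finite) > 0   # treated unit has some finite distance
    has_match_ctl <- colSums(finite) > 0   # control unit has some finite distance
    c("Matchable Txt"   = sum(has_match_txt),
      "Matchable Ctl"   = sum(has_match_ctl),
      "Unmatchable Txt" = sum(!has_match_txt),
      "Unmatchable Ctl" = sum(!has_match_ctl))
  }, integer(4))
  t(counts)  # one row per block
}

# Toy blocks (made-up distances) shaped like blocks `a` and `c` above.
blk_a <- matrix(Inf, nrow = 2, ncol = 9)
blk_a[1, 1:2] <- c(0.1, 0.2)
blk_c <- matrix(Inf, nrow = 5, ncol = 5)

res <- block_structure(list(a = blk_a, c = blk_c))
print(res)
```

With these toy inputs the rows for `a` and `c` come out with the same counts as in the table above.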
First version of the change to distances summary is up. I'm using built-in
Overall, speed may be an issue; the same 5000x5000 matrix takes several seconds to run. On first pass, there's no obvious bottleneck, rather a series of slowdowns.
Summary of minimum matchable distance per treatment member:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.006858 0.019340 0.042050 0.041290 0.058320 0.079450
Any feedback on that description? |
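To make the description concrete, here is a minimal sketch of "minimum matchable distance per treatment member" on a dense base-R matrix. It assumes rows are treated units, treats `Inf` as "not matchable", and drops treated units with no finite distance; the pushed code operates on the sparse classes and may differ.

```r
# The 3x3 example from earlier in the thread, after `m[3] <- Inf`,
# rebuilt as an ordinary dense matrix (row b is now entirely Inf).
m <- matrix(c(Inf, 0.2016450, 0.1122457,
              Inf, Inf,       Inf,
              Inf, 0.4412846, 0.3518854),
            nrow = 3, byrow = TRUE,
            dimnames = list(treated = c("a", "b", "c"),
                            control = c("d", "e", "f")))

# Closest control for each treated unit; an all-Inf row yields Inf.
mins <- apply(m, 1, min)

# Keep only treated units that have at least one eligible match.
mins <- mins[is.finite(mins)]

cat("Summary of minimum matchable distance per treatment member:\n")
print(summary(mins))
```

Note that this summarizes one number per matchable treated unit, not every finite entry of the matrix, so it answers "how close is each treated unit's best option" rather than "how are all eligible distances distributed".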
Great! Built in
On Mon, Mar 21, 2016 at 7:21 PM, Josh Errickson wrote:
|
PS: One of the "various ways it might be speeded up" would be to encode the
On Mon, Mar 21, 2016 at 7:31 PM, Ben Hansen wrote:
|
Added flag to turn off distance summary. Also moved flags from
summary.BlockedInfinitySparseMatrix <- function(object, ...,
                                                distanceSummary=TRUE,
                                                printAllBlocks=FALSE,
                                                blockStructure=TRUE)
|
This set of functions has been moved over to Optmatch. |
After a discussion with Ben, created a function tentatively called matchability. The idea is to take some version of a distance matrix with some Infs (from caliper or otherwise) and identify which observations are completely unmatchable. E.g.
Starting issue for commentary, especially on function name, what the output should look like, etc.
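As a rough illustration of the idea (not the function under discussion), the sketch below flags rows and columns of a dense matrix that contain no finite distance. The name `find_unmatchable` and the toy matrix are hypothetical, and it assumes treated units in rows and controls in columns.

```r
# Hypothetical helper on a plain base-R matrix: a unit is "completely
# unmatchable" when every distance in its row (or column) is Inf.
find_unmatchable <- function(d) {
  stopifnot(is.matrix(d))
  finite <- is.finite(d)
  list(
    treatment = rownames(d)[rowSums(finite) == 0],
    control   = colnames(d)[colSums(finite) == 0]
  )
}

# Toy distance matrix with made-up values; Inf marks a forbidden pairing.
d <- matrix(c(Inf, 0.2, 0.1,
              Inf, Inf, Inf,
              Inf, 0.4, 0.3),
            nrow = 3, byrow = TRUE,
            dimnames = list(treated = c("a", "b", "c"),
                            control = c("d", "e", "f")))

res <- find_unmatchable(d)
str(res)
```

Here treated unit `b` and control `d` have no finite distances, so they are the only units reported.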