---
title: "Reaction time distributions: an interactive overview"
output:
  html_document:
    df_print: default
    number_sections: yes
    self_contained: no
    toc: yes
    toc_depth: 2
    toc_float:
      collapsed: no
runtime: shiny
---
<!--
For everyone looking here: setting up my own shiny server caused major
issues. See more at https://github.com/lindeloev/shiny-rt.
TO DO for later and probably never...
* add brms families for shifted Weibull and shifted Wald. And LBA?
* (Minor) The dataset selector does not update the trims and bin values when these are set in set_data
-->
<!-- Distributions and plotting -->
```{r, echo=FALSE, message=FALSE, warning=FALSE}
# SETTINGS
FIG_HEIGHT = 290
FIG_WIDTH = 530
RESOLUTION = 1024 # Number of points to evaluate densities at
NSAMPLES = 10000 # Number of samples for theoretical distributions
UPDATE_INTERVAL = 1000 # ms between each applet checking for updates in dataset
# # Uncomment if you haven't got these packages yet:
# install.packages('brms')
# install.packages('RWiener') # For dwiener
# install.packages('shiny')
# install.packages('rtdists')
# Import stuff
library(shiny)
library(rtdists) # For dLBA
# I had problems installing brms on Google Cloud, so I did a manual replacement:
#library(brms)
source('brms_simulator/misc.R'); source('brms_simulator/distributions.R')
# Density and random generation for (Shifted) Inverse Gaussian / (Shifted) Wald.
dshifted_inv_gaussian = function(x, mu, shape, shift) {
dens = dinv_gaussian(x - shift, mu, shape)
dens[is.na(dens)] = 0 # NaN means 0 density during shift.
dens
}
rshifted_inv_gaussian = function(n, mu, shape, shift) shift + rinv_gaussian(n, mu, shape)
# Density and random generation for three-parameter Weibull
dshifted_weibull = function(x, shape, scale, shift) dweibull(x - shift, shape, scale)
rshifted_weibull = function(n, shape, scale, shift) shift + rweibull(n, shape, scale)
# Score as scaled mean squared difference between empirical and theoretical density
# This score is a bit arbitrary, but it should be good enough
score_fit = function(ud, dfun) {
empirical_density = density(ud$rt_data, from=0, to=ud$TRIM_MAX, n = 1024, bw=0.02)$y # From actual observations
theoretical_density = dfun(seq(1/RESOLUTION, ud$TRIM_MAX, length.out = 1024)) # From density function. Start at 1 so invgauss doesn't fail
round(mean((theoretical_density - empirical_density)^2) * 10000) # Mean squares. Scaled and rounded to digestible number
}
# Show a plot, theoretical density, and score
show_plot = function(ud, dfun = NULL) {
# Histogram
par(mar = c(5, 0, 0, 0) + 0.1)
hist(ud$rt_data, breaks=seq(0, ud$TRIM_MAX, ud$TRIM_MAX/ud$BINS), xlim=c(0, ud$TRIM_MAX),
freq = FALSE,
axes = FALSE, # To remove y-axis, we need to remove both first...
xlab = 'Reaction Time', ylab = NULL, main = NULL)
axis(1) # Add x-axis
if(!is.null(dfun)) {
# Add theoretical curve
curve(dfun, add=TRUE, lwd=4, col='red')
# Present score as legend
error = score_fit(ud, dfun)
legend(
'topright',
inset = 0.05,
cex = 2,
box.col = NA,
bg = NA,
# Set text and text color based on error
text.col = ifelse(error < 40, '#AA9944', ifelse(
error < 150, '#00BB00', ifelse(error > 300, '#BB0000', 'black')
)),
legend = paste0('MSE: ', error, ifelse(error < 40, '\nNailed it!', ''))
)
}
}
```
<!-- Generate datasets -->
```{r, echo=FALSE}
data("rr98")
data("speed_acc")
mrt_data = read.csv('mrt_data.csv')
rt_types = c(
# Real RTs
'Brightness discrimination (Ratcliff & Rouder, 1998)' = 'rr98',
'Discriminate words/non-words - be accurate (Wagenmakers et al., 2008)' = 'accuracy',
'Discriminate words/non-words - be fast (Wagenmakers et al., 2008)' = 'speed',
'Mental Rotation of 2D figures - 0-30 degrees' = 'mrt_short',
'Mental Rotation of 2D figures - 150-180 degrees' = 'mrt_long',
# Theoretical RTs
'Theoretical: Gaussian (Normal distribution)' = 'gauss',
'Theoretical: Ex-gaussian' = 'exgauss',
'Theoretical: Wiener diffusion' = 'wiener',
'Theoretical: Inverse Gaussian (Wald)' = 'invgauss',
'Theoretical: Shifted Log-normal' = 'slog',
'Theoretical: Skew-normal' = 'skew',
'Theoretical: Gamma' = 'gamma',
'Theoretical: Weibull' = 'sweibull'
)
#last_data_update = Sys.time() # For init
set_data = function(rt_type, ud) {
ud$last_data_update = Sys.time() # Makes "update_this" function react
# Typical use case: choose by label
if(is.character(rt_type)) {
set.seed(42)
rt_data = switch(rt_type,
# Real datasets
'accuracy' = subset(speed_acc, condition == 'accuracy' & id == 3)$rt,
'speed' = subset(speed_acc, condition == 'speed' & id == 3)$rt,
'rr98' = subset(rr98, correct == TRUE & id == 'nh' & source == 'light')$rt,
'mrt_short' = subset(mrt_data, rotation_cat == 'short')$rt,
'mrt_long' = subset(mrt_data, rotation_cat == 'long')$rt,
# Theoretical datasets
'gauss' = rnorm(NSAMPLES, mean=0.7, sd=0.2),
'exgauss' = rexgaussian(NSAMPLES, mu=0.7, sigma=0.1, beta=0.3),
'wiener' = rwiener(NSAMPLES, alpha=1.5, tau=0.2, beta=0.6, delta=1)$q,
'invgauss' = rshifted_inv_gaussian(NSAMPLES, mu=0.6, shape=3, shift=0),
#'log' = rlnorm(NSAMPLES, meanlog=-0.6, sdlog=0.3),
'slog' = rshifted_lnorm(NSAMPLES, meanlog=-1, sdlog=0.5, shift=0.2),
'skew' = rskew_normal(NSAMPLES, mu=0.7, sigma=0.2, alpha=6),
'gamma' = rgamma(NSAMPLES, shape=15, rate=15),
#'weibull' = rweibull(NSAMPLES, shape=3.5, scale=0.6),
'sweibull' = rshifted_weibull(NSAMPLES, shape=5, scale=0.8, shift=0)
)
if(rt_type == 'rr98') ud$TRIM_MAX = 1.2
if(rt_type %in% c('mrt_short', 'mrt_long')) {
ud$TRIM_MAX = 8
ud$BINS = 20
}
# But one can also insert one's own data:
} else if(is.numeric(rt_type)) {
rt_data = rt_type[1:min(NSAMPLES, length(rt_type))] # A max on data length
}
# Make sure that it stays within bounds, and assign to global scope
ud$rt_data = rt_data[rt_data > ud$TRIM_MIN & rt_data < ud$TRIM_MAX]
ud$rt_type = rt_type
#return(rt_data)
}
# Updates the calling context if data is updated
# (detected by a change in userData$last_data_update)
update_this = function(session) {
reactivePoll(
intervalMillis = UPDATE_INTERVAL,
session = session,
checkFunc = function() session$userData$last_data_update,
valueFunc = function() ''
)()
}
```
<!-- CSS styles -->
<link rel="stylesheet" type="text/css" href="include/style.css">
<!-- Visible meta-data -->
By Jonas Kristoffer Lindeløv ([blog](https://lindeloev.net), [profile](https://vbn.aau.dk/da/persons/117060)).<br />
Updated: `r format(Sys.time(), '%d %B, %Y')` ([source](https://github.com/lindeloev/shiny-rt/), [changes](https://github.com/lindeloev/shiny-rt/commits/master)).
<!-- Social sharing. From simplesharebuttons.com -->
<style type="text/css">
#share-buttons img {
width: 40px;
padding-right: 15px; vertical-align: top;
border: 0;
box-shadow: 0;
display: inline;
vertical-align: top;
}
</style>
<!-- For some reason, the code below ignores this style.
So I've inserted styles manually. Ugly. -->
<div id="share-buttons">
<!-- Twitter -->
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<a href="https://twitter.com/intent/tweet?text=Reaction time distributions: an interactive overview https://lindeloev.net/shiny/rt/ via %40jonaslindeloev" class="twitter-hashtag-button" data-size="large" data-related="jonaslindeloev" data-show-count="false">Share on Twitter</a>
<!-- Facebook -->
<a href="http://www.facebook.com/sharer.php?u=https://lindeloev.net/shiny/rt/" target="_blank"><img src="https://simplesharebuttons.com/images/somacro/facebook.png" style="width: 40px; padding-right: 10px; vertical-align: top;" alt="Facebook" /></a>
<!-- LinkedIn -->
<a href="http://www.linkedin.com/shareArticle?mini=true&url=https://lindeloev.net/shiny/rt/" target="_blank"><img src="https://simplesharebuttons.com/images/somacro/linkedin.png" style="width: 40px; padding-right: 10px; vertical-align: top;" alt="LinkedIn" /></a>
<!-- Digg -->
<a href="http://www.digg.com/submit?url=https://lindeloev.net/shiny/rt/" target="_blank"><img src="https://simplesharebuttons.com/images/somacro/diggit.png" style="width: 40px; padding-right: 10px; vertical-align: top;" alt="Digg" /></a>
<!-- Reddit -->
<a href="http://reddit.com/submit?url=https://lindeloev.net/shiny/rt/&title=Reaction time distributions: an interactive overview" target="_blank"><img src="https://simplesharebuttons.com/images/somacro/reddit.png" style="width: 40px; padding-right: 10px; vertical-align: top;" alt="Reddit" /></a>
<!-- Email -->
<a href="mailto:?Subject=Reaction time distributions: an interactive overview&Body=https://lindeloev.net/shiny/rt/"><img src="https://simplesharebuttons.com/images/somacro/email.png" style="width: 40px; padding-right: 10px; vertical-align: top;" alt="Email" /></a>
</div>
<br />
All too often, researchers use the normal distribution to model reaction times. Hopefully, this document will lower the burden of shifting to better distributions. Because the normal distribution is such a poor fit for RTs (as you will see), the reward of shifting will be great!
The cheat sheet below gives an overview (get it in [PNG](https://github.com/lindeloev/shiny-rt/raw/master/rt_cheat_sheet.png) or [PDF](https://github.com/lindeloev/shiny-rt/raw/master/rt_cheat_sheet.pdf)). You can also skip straight to the [interactive part](#data) or to the [applied code example](#code).
<hr />
[![](rt_cheat_sheet.png)](https://github.com/lindeloev/shiny-rt/raw/master/rt_cheat_sheet.pdf)
<hr />
# Opinion: Three important types of parameters
For reaction times, I'll argue that there are three useful parameters to describe their distribution: Difficulty^["Difficulty" is a term borrowed from [Wagenmakers & Brown (2007)](http://ejwagenmakers.com/2007/WagenmakersBrown2007.pdf), but it is not standard], Onset^[Some distributions are parameterized using *rate* which is the inverse of *location center*.], and Scale.^[Many distributions include a *shape* parameter, but since all RT distributions have approximately the same shape, I'd rather pick a distribution which inherently has the "right" shape, than model this using a parameter.] In the figure below, the blue curves show the effect of *increasing* each of these parameters. [Try manipulating them yourself here](#slog).
<br />
<!-- The image was generated using this code, which is currently disabled using eval=FALSE -->
```{r, eval=F, echo=F, out.width= "100%", fig.height=1.3}
# Function to make a plot with 1-5 shifted log-normal curves,
# depending on the length of mu, sigma, etc.
show_curve = function(mu, sigma, shift, title, colors=c('red', 'blue', 'darkgreen', 'black', 'grey')) {
# Create the first curve without any axes, texts, etc.
par(mar = c(0, 0, 0, 0) + 0.1)
curve(dshifted_lnorm(x, mu[1], sigma[1], shift[1]),
lwd=3, col=colors[1], from=0, to=1.5,
axes=FALSE, xlab='', ylab='', main='')
# Add more curves
for(i in 2:length(mu)) {
curve(dshifted_lnorm(x, mu[i], sigma[i], shift[i]), lwd=3, col=colors[i], add=T)
}
# Finally, label it.
legend('topright', cex=2.5, box.col = NA, bg = NA, legend = title)
}
# Plot it! Three in one row
par(mfrow = c(1, 3))
show_curve(mu=c(-1, -0.5), sigma=c(0.5, 0.5), shift=c(0.2, 0.2), title='Difficulty')
show_curve(mu=c(-1, -1), sigma=c(0.5, 0.5), shift=c(0.0, 0.2), title='Shift')
show_curve(mu=c(-1, -1), sigma=c(0.5, 1.0), shift=c(0.2, 0.2), title='Scale')
par(mfrow = c(1, 1))
```
![](images/parameter_types.png)
## Difficulty
A dominant feature of RTs is that the standard deviation increases approximately linearly with the mean [(Wagenmakers et al., 2007)](http://ejwagenmakers.com/2007/WagenmakersBrown2007.pdf). Reaction times are more dispersed for conditions with longer reaction times (e.g., incongruent trials) and are more condensed for conditions with shorter reaction times (e.g., congruent trials). For interpretability, it is convenient to have one parameter that changes both rather than having to regress on mean and SD separately and always interpret their interaction. Ideally, the value of the difficulty parameter should be somewhere around the **center** (highest density region, e.g., the median RT or the mode RT).
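Here is a minimal sketch of this mean-SD relationship: it simulates shifted log-normal RTs at increasing difficulties and checks that the standard deviation grows roughly in proportion to the mean. It assumes `rshifted_lnorm()` from `brms` (as used elsewhere in this document); the parameter values are made up for illustration.

```{r, eval=FALSE}
# Sketch: the SD of RTs grows roughly linearly with the mean as difficulty (mu) increases.
library(brms)
set.seed(42)
mus = seq(-1.5, 0, by = 0.25)  # increasing difficulty
stats = sapply(mus, function(mu) {
  rt = rshifted_lnorm(1e4, meanlog = mu, sdlog = 0.5, shift = 0.2)
  c(mean = mean(rt), sd = sd(rt))
})
plot(stats['mean', ], stats['sd', ], xlab = 'Mean RT', ylab = 'SD of RT')  # near-linear
```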
## Location: onset and center
Moves the whole distribution towards the left (shorter RTs) or right (longer RTs) without altering its shape.
1. **Onset:** This is the earliest possible reaction time (for descriptive models where it's often called **shift**) or the earliest possible start of the decision process (for mechanistic models where it's called **non-decision time**). In humans, this is often around 200 ms. It's good to have an onset-parameter since no animal can perceive or respond in 0.0 ms. Models without an onset-parameter often predict positive probabilities of impossibly short RTs.
2. **Center:** Somewhere "in the middle" of the distribution (e.g., $\mu$ in the normal distribution). I argued above that we should prefer a **difficulty**-like parameter to model the center.
## Scale
Stretches or squeezes the distribution over and above **difficulty**, without (severely) changing the **location**. This can be used to tune the distribution to population-specific or experiment-specific characteristics. For example, you would expect wider distributions in patient samples and for dual-tasks.
## Which distribution is better?
It is nice when each parameter does only one thing. If they do several things at once, or if multiple parameters do the *same* thing, I call them **messy**, because you will need to tune many parameters to achieve one of the above effects. And that's a mess for interpretability. However, if this behavior at the surface level is justified by an underlying theory that gives meaning to each parameter, I call them **mechanism** parameters.
With this terminology on the table, here are our three commandments:
> 1. Bad: Distributions with messy parameters.
> 2. Good: Distributions with mechanism parameters.
> 3. Good: Distributions with an onset, a difficulty, AND a scale parameter.
According to these criteria, [DDM](#wiener), [LBA](#lba), the [shifted log-normal](#slog), and the [inverse Gaussian](#invgauss) are the winners on this list. If we also consider how easy it is to interpret the parameter values, I'd say that [DDM](#wiener) and the [shifted log-normal](#slog) come out on top, but this is up for discussion.
# Select data {#data}
All of the following datasets are from a single participant in a single condition.
<!--- TESTING --->
```{r, echo=FALSE}
selectInput('rt_type', label = 'Choose a dataset:', choices = rt_types, selected = 'accuracy', width = '100%')
textInput('rts_input', label = '... or paste your own RTs (separator = space/,/;/tab):', width='100%')
# Trim on one line
div(style="display: inline-block;vertical-align:top;", numericInput('trim_min', label = 'Min RT', value=0.180, min=0, max=10, step=0.1, width='80px'))
div(style="display: inline-block;vertical-align:top;", numericInput('trim_max', label = 'Max RT', value=2, min=0, max=10, step=0.1, width='80px'))
div(style="display: inline-block;vertical-align:top;", numericInput('bins', label = 'Bins', value=70, min=2, max=1000, step=10, width='80px'))
# Plot
renderPlot(height=FIG_HEIGHT, expr = {
# Store in session
ud = session$userData # Shorthand
ud$TRIM_MIN = input$trim_min
ud$TRIM_MAX = input$trim_max
ud$BINS = input$bins
# If custom data has been pasted, parse it to a numeric vector.
if(input$rts_input != '') {
set_data(rt_type = as.numeric(strsplit(input$rts_input, '[;,\t ]+')[[1]]), ud) # Split by semicolon, comma, tab, and space
}
# Else just set data by label
else {
set_data(rt_type = input$rt_type, ud)
}
# Finally! Show the plot
show_plot(ud)
})
```
<div class="vertical_spacer"></div>
# Drag those sliders!
Here are some fun challenges:
* **One-parameter champ:** For the speed/accuracy and MRT data, try (1) fitting the parameters to one dataset, (2) change the dataset, (3) see how close you can get by *only changing one parameter now*.
* **Fit champ:** There is a special reward if you can fit the data so well that the [mean squared error (MSE)](https://en.wikipedia.org/wiki/Mean_squared_error) drops below 40. Use the keyboard arrows to fine-tune parameters. The MSE in the plots is multiplied by 10,000 to give easy-to-read numbers.
* **Optimal champ:** Use [`brms`](https://github.com/paul-buerkner/brms) to find the optimal fits as shown in the code snippets. The empirical datasets are in the `rtdists` package: use `library(rtdists)`, `data('rr98')`, and `data('speed_acc')` to load them (see the sketch below). <a href="https://raw.githubusercontent.com/lindeloev/shiny-rt/master/mrt_data.csv" download>Download the MRT data here</a>. You can also try the (frequentist) [`gamlss` package](https://www.gamlss.com/), though its results are harder to interpret.
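For the last challenge, loading the empirical datasets could look something like this:

```{r, eval=FALSE}
library(rtdists)
data('rr98')       # Ratcliff & Rouder (1998): brightness discrimination
data('speed_acc')  # Wagenmakers et al. (2008): word/non-word discrimination
mrt_data = read.csv('https://raw.githubusercontent.com/lindeloev/shiny-rt/master/mrt_data.csv')  # Mental rotation
```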
<br /><br />
## Gaussian (normal distribution) {#gauss}
Models responses from a single mean with a symmetrical dispersion towards faster and slower RTs. In most cases, the normal distribution is the worst distribution to fit reaction times. For example, for typical reaction times, your fitted model will predict that more than 5% of the reaction times will be *faster* than the fastest RT ever recorded!
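To make this concrete, here is a small sketch with made-up but typical parameter values: a Gaussian with a mean of 0.5 seconds and an SD of 0.2 seconds assigns non-trivial probability to impossibly fast and even negative RTs.

```{r, eval=FALSE}
# With made-up but typical values, a fitted Gaussian predicts impossible RTs:
pnorm(0.1, mean = 0.5, sd = 0.2)  # P(RT < 100 ms) ~ 2.3%
pnorm(0.0, mean = 0.5, sd = 0.2)  # P(RT < 0 ms)   ~ 0.6%
```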
Yet, the Gaussian is by far the most frequently used model for analyzing reaction times because it is the default in most statistical tests, packages, and software. For example, t-tests, ANOVA, linear regression, `lm`, `lme4::lmer`, and `brms::brm`. Also, if your pre-processing includes computing means for each participant or condition and you use *those* means as data, you have subscribed to the Gaussian model (and thrown away a lot of wonderful data).
**Parameters**
* $\mu$ (**center**): the mean. It is heavily affected by the tail.
* $\sigma$ (**scale**): the standard deviation. This models (symmetric) variability in RTs.
It is easy to infer these parameters from data. Here, we model $\mu$ by `condition`. $\sigma$ is fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=gaussian())
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("gauss_mu", label = NULL, pre='<i>μ</i> = ',
min = 0, max = 3, value = 1, step = 0.01),
sliderInput('gauss_sigma', label = NULL, pre='<i>σ</i> = ',
min = 0, max = 1.5, step = 0.01, value = 0.4)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dnorm(x, input$gauss_mu, input$gauss_sigma)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## Ex-gaussian {#exgauss}
If you believe that responses are generated by the sum of two independent processes, (1) a [Gaussian](#gauss) and (2) a decaying exponential, the ex-Gaussian is exactly that. Most people don't believe this, but they use the ex-Gaussian in a purely descriptive way because it usually fits RTs excellently. See [Wikipedia](https://en.wikipedia.org/wiki/Exponentially_modified_Gaussian_distribution).
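Here is a minimal sketch of that idea: summing a normally distributed component and an exponentially distributed component produces the characteristic ex-Gaussian shape. The component values are made up for illustration.

```{r, eval=FALSE}
# Sketch: an ex-Gaussian RT is the sum of a Gaussian and an exponential component.
set.seed(42)
rt = rnorm(1e4, mean = 0.5, sd = 0.1) +  # Gaussian component
     rexp(1e4, rate = 1 / 0.25)          # exponential tail (mean 0.25 s)
hist(rt, breaks = 60, freq = FALSE, xlab = 'Reaction time (s)', main = '')
```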
**Parameters:**
* $\mu$ (**center**): the mean of the gaussian. This models shorter/longer mean RTs.
* $\sigma$ (**scale**): the standard deviation of the gaussian. It models symmetrical variability around $\mu$.
* $\lambda$ (**messy**): the scale (mean) of the exponential component. This models the tail of long RTs. Higher $\lambda$ --> more tail dominance over the Gaussian.
It is easy to infer these parameters from data. Here, we model $\mu$ by `condition`. $\sigma$ and $\lambda$ are fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=exgaussian())
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("exgauss_mu", label = NULL, pre='<i>μ</i> = ',
min = 0, max = 3, value = 1, step = 0.01),
sliderInput('exgauss_sigma', label = NULL, pre='<i>σ</i> = ',
min = 0, max = 1, step = 0.01, value = 0.15),
sliderInput('exgauss_lambda', label = NULL, pre='<i>λ</i> = ',
min = 0, max = 2, step = 0.01, value = 0.3)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dexgaussian(x, input$exgauss_mu, input$exgauss_sigma, input$exgauss_lambda)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## Skew normal {#skew}
Adds a skew to the normal (Gaussian) distribution via an extra skew parameter. In my experience, it usually fits reaction times worse than the other distributions on this list, though it is superior to the Gaussian.
**Parameters:**
* $\mu$ (**center**): the mean.
* $\sigma$ (**scale**): the standard deviation. Changing $\sigma$ preserves the mean $\mu$ but not the median. Note that when $\alpha \neq 0$, the distribution is not symmetric, so don't interpret it as the SD of a normal distribution.
* $\alpha$ (**messy**): the skewness. When $\alpha = 0$, it's a Gaussian.
It is easy to infer these parameters from data. Here, we model $\mu$ by `condition`. $\sigma$ and $\alpha$ are fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=skew_normal())
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("skew_mu", label = NULL, pre='<i>μ</i> = ',
min = 0, max = 3, value = 1, step = 0.01),
sliderInput('skew_sigma', label = NULL, pre='<i>σ</i> = ',
min = 0, max = 1.5, step = 0.01, value = 0.3),
sliderInput('skew_alpha', label = NULL, pre='<i>α</i> = ',
min = -15, max = 15, step = 0.1, value = 3.5)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dskew_normal(x, input$skew_mu, input$skew_sigma, input$skew_alpha)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## (Shifted) Log-normal {#slog}
When using logarithms, you model *percentage increase or decrease* instead of absolute differences. That is, rather than saying "RTs are 40 ms slower in incongruent trials", your parameter(s) say that "RTs are 7% slower in incongruent trials". This embodies the intuition that a difference of 40 ms can be huge for simple reaction times but insignificant for RTs around 10 seconds. Mathematically, it is about *multiplicative* relationships rather than *additive* relationships, and this makes some sense when talking about RTs, so I'd say that the shifted log-normal is somewhere between being descriptive and mechanistic. In general, log-like models [make a lot of sense for most positive-only variables](https://statmodeling.stat.columbia.edu/2019/08/21/you-should-usually-log-transform-your-positive-data/) like RT.
It's called "log-normal" because the parameters are the mean and SD of the log-transformed RTs, which are assumed to follow a normal (Gaussian) distribution. For example, for simple RTs you will often see an intercept near `exp(-1.5) = 0.22 seconds`.
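Here is a small sketch of the multiplicative interpretation with made-up coefficients (ignoring the shift for simplicity): a log-scale effect of 0.07 corresponds to roughly a 7% slower median RT, regardless of the baseline.

```{r, eval=FALSE}
intercept = -1.5         # log median RT in the baseline condition
effect    = 0.07         # log-scale effect of, say, incongruent trials
exp(intercept)           # baseline median: ~0.22 s
exp(intercept + effect)  # incongruent median: ~0.24 s
exp(effect) - 1          # ~7% increase, whatever the baseline is
```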
**Parameters:**
* $\mu$ (**difficulty**): the mean of the log-transformed RTs. $exp(\mu)$ is the median of the unshifted distribution, so the median RT is $shift + exp(\mu)$.
* $\sigma$ (**scale**): the standard deviation of the log-transformed RTs. Increasing it increases the mean RT but not the median RT.
* $shift$ (**shift**): time of the earliest possible response. When $shift = 0$, it's a straight-up *log-normal distribution* with two parameters.
It is easy to infer these parameters from data. Here, we model $\mu$ by `condition`. It is log, so use `exp()` to back-transform. $\sigma$ and $shift$ are fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=shifted_lognormal())
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput('slog_mu', label = NULL, pre='<i>μ</i> = ',
min = -2, max = 2, step = 0.05, value = -0.6),
sliderInput('slog_sigma', label = NULL, pre='<i>σ</i> = ',
min = 0, max = 2, step = 0.01, value = 0.6),
sliderInput("slog_shift", label = NULL, pre='shift = ',
min = 0, max = 1.5, value = 0, step = 0.01)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dshifted_lnorm(x, input$slog_mu, input$slog_sigma, input$slog_shift)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## (Shifted) Inverse Gaussian {#invgauss}
... also known as the *(Shifted) Wald*. It is not as "inverse" of the Gaussian as it sounds. It is closely related to the [Wiener / diffusion decision model](#wiener) and the [LBA model](#lba) in the following way (quote from [Wikipedia](https://en.wikipedia.org/wiki/Inverse_Gaussian_distribution)):
> The name can be misleading: it is an "inverse" only in that, while the Gaussian describes a Brownian motion's level at a fixed time, the inverse Gaussian describes the distribution of the time a Brownian motion with positive drift takes to reach a fixed positive level.
In other words, if you have time on the x-axis and start a lot of walkers at $t_0 = shift$, the Gaussian describes the distribution of their vertical location at a fixed time (`t_i`) while the inverse Gaussian describes the distribution of times when they cross a certain threshold on the y-axis. Like Wiener and LBA, the inverse Gaussian can be thought of as modeling a decision process, without incorporating the correctness of the response.
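Here is a rough sketch of that first-passage idea (drift, noise, and threshold values are made up): walkers drift upwards with noise, and the times at which they first cross the threshold pile up in an inverse-Gaussian-like distribution.

```{r, eval=FALSE}
set.seed(42)
first_passage = function(drift = 2, threshold = 1, dt = 0.001) {
  x = 0; t = 0
  while (x < threshold) {
    x = x + drift * dt + rnorm(1, sd = sqrt(dt))  # noisy upward drift
    t = t + dt
  }
  t
}
rts = replicate(1000, first_passage())
hist(rts, breaks = 60, freq = FALSE, xlab = 'First-passage time (s)', main = '')
```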
**Parameters:**
* $\mu$ (**difficulty**): The mean. Notice that this will be heavily influenced by long RTs.
* $\lambda$ (**scale**): higher values decrease the variance, but it is harder to interpret since $\sigma = \sqrt{\mu^3/\lambda}$.
* $shift$ (**onset**): time of earliest possible response. When $shift = 0$ it is a plain *Inverse Gaussian* or *Wald* distribution, i.e., only two parameters.
It is easy to infer the parameters of the (non-shifted) inverse Gaussian / Wald. Here, we model $\mu$ by `condition`. $\lambda$ is fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=inverse.gaussian())
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("sinvgauss_mu", label = NULL, pre='<i>μ</i> = ',
min = 0, max = 2.5, value = 0.8, step = 0.01),
sliderInput('sinvgauss_lambda', label = NULL, pre='<i>λ</i> = ',
min = 0, max = 20, value = 4, step = 0.1),
sliderInput('sinvgauss_shift', label = NULL, pre='shift = ',
min = 0, max = 2, value = 0, step = 0.01)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dshifted_inv_gaussian(x, input$sinvgauss_mu, input$sinvgauss_lambda, input$sinvgauss_shift)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## Wiener diffusion {#wiener}
Also called the *Diffusion Decision Model (DDM)*, this is probably the most popular model focused on *mechanisms*. It applies to two-alternative forced-choice (2afc) scenarios and requires data on *accuracy* as well (scored as correct/wrong). It models RTs as a decision being made when a random walker crosses one of two thresholds (boundaries). Its clear mechanistic interpretability is a strength, but it requires some effort to understand and use. Read [Ratcliff & McKoon (2008)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2474742/) for a motivation for the model. I like [this visualization](https://images.app.goo.gl/b31JzswrSQK1f5zf8).
The diffusion idea also underlies the [inverse Gaussian](#invgauss) and the [LBA model](#lba).
**Parameters:**
All of these are mechanism-like:
* $\alpha$ (**difficulty**): Boundary separation. The distance between thresholds. Longer distance --> more varied paths --> dispersed RTs.
* $\tau$ (**onset**): Non-decision time (ndt). The time at which the random walker starts.
* $\beta$ (**difficulty**): Bias. How close the walker starts to the upper threshold ($\beta = 0.5$ is in the middle). Higher = closer = faster and more condensed RTs.
* $\delta$ (**difficulty**): Drift Rate. The speed of the walker. Higher --> faster and more condensed RTs.
See [this three-part brms tutorial](http://singmann.org/wiener-model-analysis-with-brms-part-i/) by Henrik Singmann on how to fit it using `brms::brm` and do regression on these parameters. It usually takes days or weeks to fit, but [Wagenmakers (2007)](https://www.ejwagenmakers.com/2007/EZ.pdf) made the [EZ diffusion calculator](https://www.ejwagenmakers.com/EZ.html) which approximates the parameters in the simplest scenario. The [HDDM python package](http://ski.clps.brown.edu/hddm_docs/) can do it as well.
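A minimal sketch of what such a `brms` fit could look like, following the parameterization in Singmann's tutorial. The column names `rt`, `response`, and `condition` are placeholders, and `dec()` tells `brms` which boundary (0/1) each RT belongs to.

```{r, eval=FALSE}
library(brms)
formula = bf(rt | dec(response) ~ condition,  # drift rate (delta) by condition
             bs ~ 1,                          # boundary separation (alpha)
             ndt ~ 1,                         # non-decision time (tau)
             bias ~ 1)                        # starting point (beta)
fit = brm(formula, data = rt_data, family = wiener())
```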
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("wiener_alpha", label = "Boundary separation", pre='<i>α</i> = ',
min = 0.5, max = 3.5, value = 1.5, step = 0.01),
sliderInput('wiener_tau', label = 'Non-decision time', pre='<i>τ</i> = ',
min = 0, max = 1, step = 0.01, value = 0.4),
sliderInput('wiener_beta', label = 'Bias', pre='<i>β</i> = ',
min = 0, max = 1, step = 0.01, value = 0.3),
sliderInput('wiener_delta', label = 'Drift rate', pre='<i>δ</i> = ',
min = -1, max = 4, step = 0.01, value = 1.5)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dwiener(x, input$wiener_alpha, input$wiener_tau, input$wiener_beta, input$wiener_delta)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## Linear Ballistic Accumulator {#lba}
This model has some resemblance to the [Wiener diffusion model](#wiener) in that it models latent "movements" towards decision thresholds, and is very much focused on *mechanisms*. It likewise applies to two-alternative forced-choice (2afc) scenarios, requiring data on accuracy as well (correct/incorrect response). Although it involves a lot of parameters, the model itself is (at least supposed to be) simpler than the diffusion model [as explained in Brown & Heathcote (2008)](https://linkinghub.elsevier.com/retrieve/pii/S0010-0285(07)00072-2). A nice visualization from that paper [escaped the paywall here](https://www.semanticscholar.org/paper/The-simplest-complete-model-of-choice-response-Brown-Heathcote/6a23d45269edd7783fa6edd8c469ca8fb5d53ac6/figure/4).
**Parameters:**
All of these are mechanism-like:
* $A$: upper bound of the starting point. The starting point is modeled as randomly drawn in this interval.
* $b$ (**difficulty**): response threshold (usually, $b > A$).
* $t0$ (**shift**): the time before a "movement" is initiated, i.e., *non-decision time*.
* $mean\_v$ (**difficulty**): the mean drift rate and its variability (SD).
* $sd\_v$: the mean standard deviation of the drift rate and *its* variability (SD)
Fit it using `glba::lba()`. There are some helpful distributions in the `rtdists` package.
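For example, here is a small sketch of simulating LBA trials with `rtdists::rLBA()` (the parameter values are made up); each row is one trial with its RT and which accumulator won.

```{r, eval=FALSE}
library(rtdists)
set.seed(42)
sim = rLBA(1000, A = 0.5, b = 1, t0 = 0.2,
           mean_v = c(2.4, 1.6), sd_v = c(1, 1))
head(sim)                # columns: rt, response
mean(sim$response == 1)  # proportion of trials won by accumulator 1
```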
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("lba_A", label = NULL, pre='A = ',
min = 0, max = 2, value = 1, step = 0.01),
sliderInput('lba_b', label = NULL, pre='b = ',
min = 1, max = 10, value = 2, step = 0.05),
sliderInput('lba_t0', label = NULL, pre='t0 = ',
min = -1, max = 2.5, value = 0.2, step = 0.01),
sliderInput('lba_drift_mm', label = NULL, pre='mean(mean_v) = ',
min = 0, max = 10, value = 3, step = 0.1),
sliderInput('lba_drift_msd', label = NULL, pre='sd(mean_v) = ',
min = 0, max = 6, value = 1, step = 0.1),
sliderInput('lba_drift_sdm', label = NULL, pre='mean(sd_v) = ',
min = 0, max = 6, value = 1, step = 0.1),
sliderInput('lba_drift_sdsd', label = NULL, pre='sd(sd_v) = ',
min = 0, max = 6, value = 1, step = 0.1)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
# Density for LBA
dfun = function(x) {
rtdists::dLBA(
x,
response = 1,
A = input$lba_A,
b = input$lba_b,
t0 = input$lba_t0,
mean_v = c(input$lba_drift_mm, input$lba_drift_msd),
sd_v = c(input$lba_drift_sdm, input$lba_drift_sdsd),
silent = TRUE
)
}
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## Gamma {#gamma}
The gamma distribution is so important in statistics that it has probably spilled over to RT research as well. Although it can fit the data, it is very hard to interpret the parameters in relation to RTs, so I have little more to say. One could easily make a shifted gamma, but I haven't seen it used for the analysis of reaction times.
**Parameters:**
Both parameters in isolation are **difficulty**-like so their interaction determines **scale** and **location**. I think that this makes them **messy**.
* $\alpha$ (**messy**): a shape parameter.
* $\beta$ (**messy**): a rate parameter.
It is easy to infer these parameters from data. Here, we model $\alpha$ by `condition`. $\beta$ is fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=Gamma(link='log'))
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput("gamma_alpha", label = NULL, pre='<i>α</i> = ',
min = 0, max = 50, value = 5, step = 0.1),
sliderInput('gamma_beta', label = NULL, pre='<i>β</i> = ',
min = 0, max = 50, value = 10, step = 0.1)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dgamma(x, input$gamma_alpha, input$gamma_beta)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
## (Shifted) Weibull {#sweibull}
... also called the *three-parameter Weibull* when there is a shift parameter. For RTs, it can be understood as modeling an underlying rate with which non-response processes are "turned into" responses over time. If $k = 1$, this rate is constant over time; if $k > 1$, it increases over time. The parameter values do not correspond to anything intuitive, though.
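Here is a small sketch of that interpretation, with the scale fixed at a made-up 0.8 s: the Weibull hazard function (the instantaneous response rate given that no response has occurred yet) is constant when $k = 1$ and increases over time when $k > 1$.

```{r, eval=FALSE}
hazard = function(x, shape, scale) {
  dweibull(x, shape, scale) / (1 - pweibull(x, shape, scale))
}
curve(hazard(x, shape = 1, scale = 0.8), from = 0.01, to = 1, ylim = c(0, 15),
      xlab = 'Time (s)', ylab = 'Response rate')          # constant
curve(hazard(x, shape = 2, scale = 0.8), add = TRUE, lty = 2)  # increasing
curve(hazard(x, shape = 5, scale = 0.8), add = TRUE, lty = 3)  # increasing faster
```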
**Parameters:**
* $\lambda$ (**difficulty**): It's called a scale parameter for some reason, but it affects the mean, median, and mode.
* $k$ (**messy**): A shape parameter.
* $shift$ (**onset**): when $shift = 0$, it's a straight-up *Weibull distribution* with the above two parameters.
It is easy to infer the parameters of the (non-shifted) Weibull from data. Here, we model $\lambda$ by `condition`. It is log, so use `exp()` to back-transform. $k$ is fixed across conditions. See [notes on modeling](#models) and the [applied example](#code) for more details.
```{r, eval=FALSE}
library(brms)
fit = brm(rt ~ condition, rt_data, family=weibull())
```
<hr />
```{r, echo=FALSE}
# Sliders
sidebarPanel(width=4,
sliderInput('sweibull_lambda', label = NULL, pre='<i>λ</i> = ',
min = 0, max = 2.5, value = 0.5, step = 0.01),
sliderInput("sweibull_k", label = NULL, pre='k = ',
min = 0, max = 10, value = 2.2, step = 0.1),
sliderInput('sweibull_shift', label = NULL, pre='shift = ',
min = 0, max = 2, value = 0, step = 0.01)
)
# Histogram
mainPanel(width=8,
renderPlot(height=FIG_HEIGHT, width=FIG_WIDTH, expr = {
update_this(session) # Responsive to dataset change
dfun = function(x) dshifted_weibull(x, input$sweibull_k, input$sweibull_lambda, input$sweibull_shift)
show_plot(session$userData, dfun)
})
)
```
<div class="vertical_spacer"></div>
# Applied code example {#code}
We load the Wagenmakers et al. (2008) speed-accuracy data and pre-process it:
```{r, eval=FALSE}
# Load data
library(rtdists)
data(speed_acc)
# Remove bad trials
library(tidyverse)
data = speed_acc %>%
filter(rt > 0.18, rt < 3) %>% # Between 180 and 3000 ms
mutate_at(c('stim_cat', 'response'), as.character) %>% # from factor to char
filter(response != 'error', stim_cat == response) # Only correct responses
```
Now we use the [shifted log-normal](#slog) to ask: "Does median RT differ by condition?". In other words, we model $\mu$ here, leaving the scale $\sigma$ and the onset $shift$ fixed. We include a random intercept per participant, but [see my notes on formulas for RT models](#models). In this example, it takes some time to run because we have around 28,000 data points.
```{r, eval=FALSE}
library(brms)
fit = brm(formula = rt ~ condition + (1|id),
data = data,
family = shifted_lognormal(),
file = 'fit_slog') # Save all that hard work.
```
Done!
## Checking the fit
Much like the interactive applets above, we can see how the parameters predict the actual data:
```{r, eval=FALSE}
#plot(fit) # Check convergence and see posteriors
pp_check(fit) # Posterior predictive check
```
![](images/code_ppcheck.png)
This is *much* better than a Gaussian, but there is room for improvement, mostly because $shift$ is estimated close to the earliest RT *in the whole dataset*, which for one participant (id #8) in one condition was 0.18 s - just above the trim. The mean shortest RT across participants was 0.28 seconds, so that participant is not representative. Setting one $shift$ per participant would make more sense:
```{r, eval = FALSE}
formula = bf(rt ~ condition + (1|id),
ndt ~ id)
```
Also, the shortest N RTs could be removed for each participant, if you can argue that they were accidental responses:
```{r, eval = FALSE}
data %>%
group_by(id) %>%
arrange(rt) %>%
slice(4:n()) # Remove three shortest RTs
```
## Understanding the parameter estimates
Leaving that aside, let's see what we learned about the parameters of the shifted log-normal model. `Intercept` and `conditionspeed` together define $\mu$ (the difficulty). You can see `sigma` (the scale) and `ndt` (the onset) too, so we have posteriors for all of our three parameters!
```{r, eval=FALSE}
summary(fit) # See parameters
```
```{r, eval=FALSE}
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -0.69 0.04 -0.76 -0.61 1.01 508 672
conditionspeed -0.40 0.00 -0.41 -0.39 1.00 2184 2143
Family Specific Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.37 0.00 0.36 0.37 1.00 1839 2088
ndt 0.17 0.00 0.17 0.18 1.00 1576 1724
```
Because $\mu$ and $\sigma$ define the *log*-normal distribution, they are on a log scale. This means that we expect a median RT of $ndt + exp(\mu) = 0.17 + exp(-0.69) = 0.67$ seconds for condition "accuracy" (the intercept), and a median RT of $ndt + exp(-0.69 - 0.40) = 0.17 + exp(-1.09) = 0.51$ seconds for condition "speed". If you get confused about these transformations, worry not. `brms` has you covered, no matter the distribution:
```{r, eval=FALSE}
marginal_effects(fit) # Back-transformed parameter estimates
marginal_effects(fit, method='predict') # Same, but for responses
```
We are very certain that they respond faster for "speed" because the upper bound of the credible interval is much smaller than zero. We could test this more directly:
```{r, eval=FALSE}
hypothesis(fit, 'conditionspeed < 0') # Numerically
```
## Priors and descriptives
I cut corners here for the sake of brevity. Instead of going with the default priors, I would have specified something sensible and run `brm(..., prior=prior)`. Mostly as [regularization](https://en.wikipedia.org/wiki/Regularization_(mathematics)). There is so much data that most priors will be overwhelmed.
```{r, eval=FALSE}
# See which priors you can set for this model
get_prior(rt ~ condition + (1|id), speed_acc, family = shifted_lognormal())
# Define them
prior = c(
set_prior('normal(-1, 0.5)', class = 'Intercept'), # around exp(-1) = 0.36 secs
set_prior('normal(0.4, 0.3)', class = 'sigma'), # SD of individual rts in log-units
set_prior('normal(0, 0.3)', class = 'b'), # around exp(-1) - exp(-1 + 0.3) = 150 ms effect in either direction
set_prior('normal(0.3, 0.1)', class = 'sd') # some variability between participants
)
```
I would probably also have checked that the pre-processing worked by plotting the response distributions of a few participants:
```{r, eval=FALSE}
data_plot = filter(data, id %in% c(1, 8, 15))
ggplot(data_plot, aes(x=rt)) +
geom_histogram(aes(fill=condition), alpha=0.5, bins=60) +
facet_grid(~id) + # One panel per id
coord_cartesian(xlim=c(0, 1.6)) # Zoom in
```
![](images/code_histograms.png)
# Notes on good models for RT data {#models}
Throughout this document, I have used a simple model, `rt ~ condition`, to demo the code. For real-world data, you probably know a lot more about the data-generating process. Here is a more typical, better recipe:
```{r, eval=FALSE}
rt ~ condition + covariates + (condition | id) + (1|nuisance)
```
Here is an example of applying this recipe:
```{r, eval=FALSE}
rt ~ congruent + trial_no + bad_lag1 + (congruent|id) + (1|counterbalancing)
```
Let's unpack this:
* `condition` is our contrast of interest. This can be a variable coded as congruent/incongruent, the number of items in a visual search, or something else. It can also be an interaction. I just did an analysis where I had a `congruent:condition` term to model a congruency effect per condition.
* `covariates` could be, e.g., `trial_no + bad_lag1`, i.e., the effect of time (people typically get faster with practice) and of longer RTs on the first trial in a block (you need to code this variable). It is often a good idea to *scale* these to help the model fit, and to *center* them (possible for continuous and binary predictors) so that the effect of `condition` represents the mean of the other conditions rather than a certain combination. You can use `scale()` to scale and center, as sketched after this list.
* `(condition | id)` models the expected individual differences in general `rt`, and allows the effect of `condition` to vary between participants. The practical effect of specifying a random effect rather than a fixed `condition:id` effect is that [you get *shrinkage* in the former](https://lindeloev.net/lets-rename-fixed-to-population-level-and-random-to-varying/), naturally imposing a bit of control for outliers. Unfortunately, complicated random effects sometimes increase computation time considerably.
* `(1 | nuisance)` is a way to let the model "understand" a structure in the data, without having it affect the fixed parameter estimates. If you have a factorial design, the otherwise non-modeled design variables would generally go here.
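Here is a small sketch of scaling and centering the covariates from the recipe above, assuming a data frame `data` with a continuous `trial_no` and a 0/1-coded `bad_lag1`:

```{r, eval=FALSE}
library(dplyr)
data = data %>%
  mutate(trial_no = as.numeric(scale(trial_no)),  # center and scale a continuous covariate
         bad_lag1 = bad_lag1 - mean(bad_lag1))    # center a 0/1 covariate
```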