Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/36991
Appears in Collections: Psychology Journal Articles
Peer Review Status: Refereed
Title: Sample size matters when estimating test–retest reliability of behaviour
Author(s): Williams, Brendan
FitzGibbon, Lily
Brady, Daniel
Christakou, Anastasia
Contact Email: lily.fitzgibbon@stir.ac.uk
Keywords: Reliability
Test retest
Sample size
Reinforcement learning
Computational modelling
Reversal learning
Cognitive flexibility
Issue Date: 21-Mar-2025
Date Deposited: 27-Mar-2025
Citation: Williams B, FitzGibbon L, Brady D & Christakou A (2025) Sample size matters when estimating test–retest reliability of behaviour. Behavior Research Methods, 57, Art. No.: 123. https://doi.org/10.3758/s13428-025-02599-1
Abstract: Intraclass correlation coefficients (ICCs) are a commonly used metric in test–retest reliability research to assess a measure’s ability to quantify systematic between-subject differences. However, estimates of between-subject differences are also influenced by factors including within-subject variability, random errors, and measurement bias. Here, we use data collected from a large online sample (N = 150) to (1) quantify test–retest reliability of behavioural and computational measures of reversal learning using ICCs, and (2) use our dataset as the basis for a simulation study investigating the effects of sample size on variance component estimation and the association between estimates of variance components and ICC measures. In line with previously published work, we find reliable behavioural and computational measures of reversal learning, a commonly used assay of behavioural flexibility. Reliable estimates of between-subject, within-subject (across-session), and error variance components for behavioural and computational measures (with ± .05 precision and 80% confidence) required sample sizes ranging from 10 to over 300 (behavioural median N: between-subject = 167, within-subject = 34, error = 103; computational median N: between-subject = 68, within-subject = 20, error = 45). These sample sizes exceed those often used in reliability studies, suggesting that sample sizes larger than are commonly used for reliability studies (circa 30) are required to robustly estimate reliability of task performance measures. Additionally, we found that ICC estimates showed highly positive and highly negative correlations with between-subject and error variance components, respectively, as might be expected, which remained relatively stable across sample sizes. However, ICC estimates were weakly or not correlated with within-subject variance, providing evidence for the importance of variance decomposition for reliability studies.
DOI Link: 10.3758/s13428-025-02599-1
Rights: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Licence URL(s): http://creativecommons.org/licenses/by/4.0/
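
The abstract relates ICC estimates to between-subject, within-subject (across-session), and error variance components. As a rough illustration of that relationship, the sketch below decomposes a subjects-by-sessions score table with a two-way ANOVA and forms two common single-measurement ICCs from the resulting components. This is a minimal sketch under stated assumptions, not the authors' method: the abstract does not say which ICC form or estimator the paper uses, the function names are hypothetical, and treating the session main effect as the "within-subject (across-session)" component is an assumption made here.

```python
def variance_components(scores):
    """Two-way ANOVA variance components (hypothetical helper).

    scores: list of rows, one row per subject, one column per session,
    with no missing values.
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, sessions, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Method-of-moments estimates; these can be negative in small samples.
    return {
        "between": (ms_rows - ms_err) / k,  # between-subject variance
        "within": (ms_cols - ms_err) / n,   # session effect (assumed mapping)
        "error": ms_err,                    # residual variance
    }


def icc_agreement(vc):
    """Absolute-agreement ICC: session variance counts against reliability."""
    return vc["between"] / (vc["between"] + vc["within"] + vc["error"])


def icc_consistency(vc):
    """Consistency ICC: the session variance is left out of the denominator."""
    return vc["between"] / (vc["between"] + vc["error"])


# Toy data: three subjects, two sessions, every score shifted by +1 at retest.
toy = [[1, 2], [3, 4], [5, 6]]
components = variance_components(toy)
print(components, icc_agreement(components), icc_consistency(components))
```

In the toy data every subject shifts by the same amount between sessions, so the rank order is preserved exactly: the consistency ICC is unaffected by the shift, while the agreement ICC is lowered by it. Reporting the components alongside the ICC makes that kind of difference visible.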

Files in This Item:
File: s13428-025-02599-1.pdf
Description: Fulltext - Published Version
Size: 8.82 MB
Format: Adobe PDF



This item is protected by original copyright



A file in this item is licensed under a Creative Commons License.
