Non-parametric (NP) analysis of analogue report data

This code relates to my 2016 paper [1] on non-parametric analysis of swap errors in working memory tasks. I've made the relevant MATLAB functions available for download here [zip]. If you're looking for my older analysis code based on mixture modelling, click here.

This code is released under a GNU General Public License: feel free to use and adapt these functions as you like, but credit should be given if they contribute to your work, by citing:

Bays PM. Evaluating and excluding swap errors in analogue tests of working memory. Scientific Reports 6: 19203 (2016)

If you refer to this code directly, please also include the URL of this webpage, or my homepage bayslab.com.

Introduction

The following functions provide non-parametric methods for estimating and excluding the influence of swap errors in analogue report tasks. These functions were developed for analyzing recall tasks in which both stimuli and responses are chosen from a circular parameter space. For correct estimation, it is important that the stimuli are chosen independently and at random from the circular space (e.g. no minimum separations between items).

The data available for analysis on each trial will consist of a response value (the reported feature value), a target value (the feature value of the cued item), and a set of non-target values (the feature values of the uncued items). The following MATLAB functions require that each of these values be given in radians, in the range [−π, π].

Estimating swap frequency

Swap (or "non-target") errors occur when an observer incorrectly reports the feature value of one of the non-cued items in the memory array. Previous methods of estimating the frequency of swap errors have relied on specifying a model of the distribution of errors (see e.g. my 2009 paper on swap errors [2]). The NP method does not make any assumptions about this distribution.

The function NP16_alpha estimates the frequency with which different stimuli are reported. It can be used to estimate swap frequency as follows:

A = NP16_alpha(X,T)

The input X is an (nx1) vector of responses; input T is an (nxm) matrix of stimulus values (the first column should contain target values, the remaining columns non-target values). Output A is a (1xm) vector of mixture parameters summing to 1. The swap frequency can be obtained by summing the values in A corresponding to non-target inputs (all but the first value), or equivalently as one minus the first value in A (the target frequency).

Estimating the true error distribution

Swap errors appear randomly-distributed relative to the target feature value, and so distort the "true" distribution of errors in the report feature dimension. The function NP16_pdf can be used to exclude the influence of swap errors and uncover the true distribution:

P = NP16_pdf(X,T)

Inputs as above. By default the distribution is evaluated at 25 evenly-spaced points on the circle. Use NP16_pdf(X,T,Y) to specify evaluation points.

Estimating circular moments

Similarly, swap errors distort estimates of parameters of the distribution, such as standard deviation. The function NP16_moment can be used to recover the true circular moments of the error distribution:

M = NP16_moment(X,T,J)

returns the Jth circular moment. For example, the true circular standard deviation of errors in the report dimension can be estimated by:

sqrt(-2*log(abs(NP16_moment(X,T,1)))

Notes on use

By making fewer assumptions, the NP method has a substantial theoretical advantage over the older model-fitting approach [2], which may be strongly biased if the assumed distribution of errors is wrong. However, the NP method requires more data to achieve a given level of variability in its estimates. As a result, I would not recommend using this method with n < 100 trials.

The NP estimates occasionally fall outside the bounds of interpretable values, e.g. the range of probabilities [0, 1] for NP16_alpha. This is a desirable property: a bounded estimator, unless it is perfectly accurate, is necessarily biased in proximity to its bounds. However, a large out-of-bounds estimate can excessively influence group means. A compromise is to constrain the estimator beyond the true range: bounding individual observer estimates at [−1, 2] before calculating group means was found in simulations to provide a good balance between bias and variance for this estimator.

Demo

A demonstration of the use of these functions is included in the package (demo.m).

References

[1] Bays PM. Evaluating and excluding swap errors in analogue tests of working memory. Scientific Reports 6: 19203 (2016) [pdf]

[2] Bays PM, Catalao RFG & Husain M. The precision of visual working memory is set by allocation of a shared resource. Journal of Vision 9(10): 7, 1-11 (2009) [pdf] [code]