Measures and limits of models of fixation selection

Standard

Measures and limits of models of fixation selection. / Wilming, Niklas; Betz, Torsten; Kietzmann, Tim C; König, Peter.

In: PLOS ONE, Vol. 6, No. 9, 2011, p. e24038.

Research output: Contribution to journal › Journal article › Research › peer-review

Harvard

Wilming, N, Betz, T, Kietzmann, TC & König, P 2011, 'Measures and limits of models of fixation selection', PLOS ONE, vol. 6, no. 9, p. e24038. https://doi.org/10.1371/journal.pone.0024038

APA

Wilming, N., Betz, T., Kietzmann, T. C., & König, P. (2011). Measures and limits of models of fixation selection. PLOS ONE, 6(9), e24038. https://doi.org/10.1371/journal.pone.0024038

Vancouver

Wilming N, Betz T, Kietzmann TC, König P. Measures and limits of models of fixation selection. PLOS ONE. 2011;6(9):e24038. https://doi.org/10.1371/journal.pone.0024038

Bibtex

@article{31db6f06d4e844009c4924c0825cf2d7,
title = "Measures and limits of models of fixation selection",
abstract = "Models of fixation selection are a central tool in the quest to understand how the human mind selects relevant information. Using this tool in the evaluation of competing claims often requires comparing different models' relative performance in predicting eye movements. However, studies use a wide variety of performance measures with markedly different properties, which makes a comparison difficult. We make three main contributions to this line of research: First, we argue for a set of desirable properties, review commonly used measures, and conclude that no single measure unites all desirable properties. However, the area under the ROC curve (a classification measure) and the KL-divergence (a distance measure of probability distributions) combine many desirable properties and allow a meaningful comparison of critical model performance. We give an analytical proof of the linearity of the ROC measure with respect to averaging over subjects and demonstrate an appropriate correction of entropy-based measures like KL-divergence for small sample sizes in the context of eye-tracking data. Second, we provide a lower bound and an upper bound of these measures, based on image-independent properties of fixation data and between-subject consistency, respectively. Based on these bounds it is possible to give a reference frame to judge the predictive power of a model of fixation selection. We provide open-source Python code to compute the reference frame. Third, we show that the upper, between-subject consistency bound holds only for models that predict averages of subject populations. Departing from this, we show that incorporating subject-specific viewing behavior can generate predictions that surpass that upper bound. Taken together, these findings lay out the required information that allows a well-founded judgment of the quality of any model of fixation selection and should therefore be reported when a new model is introduced.",
keywords = "Adult, Behavior, Discrimination (Psychology), Eye Movements, Female, Humans, Male, Models, Neurological, Photic Stimulation, Young Adult, Journal Article, Research Support, Non-U.S. Gov't",
author = "Niklas Wilming and Torsten Betz and Kietzmann, {Tim C} and Peter K{\"o}nig",
year = "2011",
doi = "10.1371/journal.pone.0024038",
language = "English",
volume = "6",
pages = "e24038",
journal = "PLOS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "9",
}
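As an aside for readers of the abstract: the two recommended measures — the area under the ROC curve and the KL-divergence between fixation density maps — can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' released code (their open-source Python package computes the full reference frame); function names here are ours.

```python
import numpy as np

def roc_auc(pos, neg):
    """AUC: the probability that a fixated location receives a higher
    model score than a control (non-fixated) location, ties counted half."""
    pos = np.asarray(pos, float)[:, None]   # scores at fixated locations
    neg = np.asarray(neg, float)[None, :]   # scores at control locations
    n_pairs = pos.size * neg.size
    return ((pos > neg).sum() + 0.5 * (pos == neg).sum()) / n_pairs

def kl_divergence(p, q, eps=1e-12):
    """KL divergence (in nats) between two fixation density maps,
    each normalised to a probability distribution; eps guards log(0)."""
    p = np.asarray(p, float).ravel()
    q = np.asarray(q, float).ravel()
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))
```

A model that perfectly separates fixated from control locations scores an AUC of 1.0, chance performance scores 0.5, and identical density maps yield a KL-divergence of 0.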

RIS

TY - JOUR

T1 - Measures and limits of models of fixation selection

AU - Wilming, Niklas

AU - Betz, Torsten

AU - Kietzmann, Tim C

AU - König, Peter

PY - 2011

Y1 - 2011

N2 - Models of fixation selection are a central tool in the quest to understand how the human mind selects relevant information. Using this tool in the evaluation of competing claims often requires comparing different models' relative performance in predicting eye movements. However, studies use a wide variety of performance measures with markedly different properties, which makes a comparison difficult. We make three main contributions to this line of research: First, we argue for a set of desirable properties, review commonly used measures, and conclude that no single measure unites all desirable properties. However, the area under the ROC curve (a classification measure) and the KL-divergence (a distance measure of probability distributions) combine many desirable properties and allow a meaningful comparison of critical model performance. We give an analytical proof of the linearity of the ROC measure with respect to averaging over subjects and demonstrate an appropriate correction of entropy-based measures like KL-divergence for small sample sizes in the context of eye-tracking data. Second, we provide a lower bound and an upper bound of these measures, based on image-independent properties of fixation data and between-subject consistency, respectively. Based on these bounds it is possible to give a reference frame to judge the predictive power of a model of fixation selection. We provide open-source Python code to compute the reference frame. Third, we show that the upper, between-subject consistency bound holds only for models that predict averages of subject populations. Departing from this, we show that incorporating subject-specific viewing behavior can generate predictions that surpass that upper bound. Taken together, these findings lay out the required information that allows a well-founded judgment of the quality of any model of fixation selection and should therefore be reported when a new model is introduced.

AB - Models of fixation selection are a central tool in the quest to understand how the human mind selects relevant information. Using this tool in the evaluation of competing claims often requires comparing different models' relative performance in predicting eye movements. However, studies use a wide variety of performance measures with markedly different properties, which makes a comparison difficult. We make three main contributions to this line of research: First, we argue for a set of desirable properties, review commonly used measures, and conclude that no single measure unites all desirable properties. However, the area under the ROC curve (a classification measure) and the KL-divergence (a distance measure of probability distributions) combine many desirable properties and allow a meaningful comparison of critical model performance. We give an analytical proof of the linearity of the ROC measure with respect to averaging over subjects and demonstrate an appropriate correction of entropy-based measures like KL-divergence for small sample sizes in the context of eye-tracking data. Second, we provide a lower bound and an upper bound of these measures, based on image-independent properties of fixation data and between-subject consistency, respectively. Based on these bounds it is possible to give a reference frame to judge the predictive power of a model of fixation selection. We provide open-source Python code to compute the reference frame. Third, we show that the upper, between-subject consistency bound holds only for models that predict averages of subject populations. Departing from this, we show that incorporating subject-specific viewing behavior can generate predictions that surpass that upper bound. Taken together, these findings lay out the required information that allows a well-founded judgment of the quality of any model of fixation selection and should therefore be reported when a new model is introduced.

KW - Adult

KW - Behavior

KW - Discrimination (Psychology)

KW - Eye Movements

KW - Female

KW - Humans

KW - Male

KW - Models, Neurological

KW - Photic Stimulation

KW - Young Adult

KW - Journal Article

KW - Research Support, Non-U.S. Gov't

U2 - 10.1371/journal.pone.0024038

DO - 10.1371/journal.pone.0024038

M3 - Journal article

C2 - 21931638

VL - 6

SP - e24038

JO - PLOS ONE

JF - PLOS ONE

SN - 1932-6203

IS - 9

ER -
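The abstract also notes a small-sample correction for entropy-based measures in eye-tracking data. The paper's exact correction is not reproduced in this record; as a hedged illustration of the general idea, the classical Miller-Madow correction adds a bias term of (m − 1)/2n to the plug-in entropy estimate, where m is the number of occupied bins and n the number of fixations:

```python
import numpy as np

def miller_madow_entropy(counts):
    """Plug-in entropy (in nats) of binned fixation counts, with the
    Miller-Madow small-sample bias correction added back."""
    counts = np.asarray(counts, float)
    n = counts.sum()                      # total number of fixations
    p = counts[counts > 0] / n            # empirical bin probabilities
    h_plugin = -np.sum(p * np.log(p))     # naive (downward-biased) estimate
    m = (counts > 0).sum()                # number of occupied bins
    return float(h_plugin + (m - 1) / (2 * n))
```

With few fixations per image the naive estimate underestimates entropy (and thus distorts KL-divergence); the correction term shrinks as 1/n, so it matters most exactly in the small-sample regime the abstract describes.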