Reviews
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp ii-ii
- Chapter
- Open access

Index
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp 221-224
- Chapter
- Open access

5 - Methods for Identifying Paradata for Data Reuse
- By Ying-Hsang Liu, Isto Huvila
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp 116-150
- Chapter
- Open access
Summary

This chapter introduces a selection of methods for identifying and extracting paradata from existing datasets and data documentation, which can then be used to complement existing formal documentation of practices and processes. Data reuse, in its multiple forms, enables researchers to build upon the foundations laid by previous studies. Retrospective methods for eliciting paradata, including qualitative and quantitative backtracking and data forensics, provide means to gain insights into past research practices and processes for data-driven analysis. The methods discussed in this chapter enhance the understanding of data-related practices and processes and the reproducibility of findings by facilitating the replication and verification of results through data reuse. Key references and further reading are provided after each method description.

Copyright page
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp iv-iv
- Chapter
- Open access

4 - Methods for Generating and Documenting Paradata
- By Ying-Hsang Liu, Isto Huvila
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp 75-115
- Chapter
- Open access
Summary

This chapter introduces methods for generating and documenting paradata before and during data creation practices and processes (i.e. prospective and in-situ approaches, respectively). It introduces formal metadata-based paradata documentation using standards and controlled vocabularies to contribute to paradata consistency and interoperability. Narrative descriptions and recordings are advantageous for providing contextual richness and detailed documentation of data generation processes. Logging methods, including log files and blockchain technology, allow for automatic paradata generation and for maintaining the integrity of the record. Data management plans and registered reports are examples of measures to prospectively generate potential paradata on forthcoming activities. Finally, facilitative workflow-based approaches are introduced for step-by-step modelling of practices and processes. Rather than suggesting that a single approach to generating and documenting paradata will suffice, we encourage users to consider a selective combination of approaches, facilitated by adequate institutional resources, technical and subject expertise, to enhance the understanding, transparency, reproducibility and credibility of paradata describing practices and processes.

Contents
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp v-viii
- Chapter
- Open access

Acknowledgements
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp xi-xii
- Chapter
- Open access

Preface
Isto Huvila, Uppsala Universitet, Sweden, Lisa Andersson, Uppsala Universitet, Sweden, Zanna Friberg, Uppsala Universitet, Sweden, Ying-Hsang Liu, Uppsala Universitet, Sweden, Olle Sköld, Uppsala Universitet, Sweden
Book:

Paradata

Published online:

05 August 2025

Print publication:

07 August 2025, pp ix-x
- Chapter
- Open access

Paradata

Documenting Data Creation, Curation and Use
Isto Huvila, Lisa Andersson, Zanna Friberg, Ying-Hsang Liu, Olle Sköld
Published online:

05 August 2025

Print publication:

07 August 2025
- Book
- Open access
To make sense of data and use it effectively, it is essential to know where it comes from and how it has been processed and used. This is the domain of paradata, an emerging interdisciplinary field with wide applications. As digital data rapidly accumulates in repositories worldwide, this comprehensive introductory book, the first of its kind, shows how to make that data accessible and reusable. In addition to covering basic concepts of paradata, the book supports practice with coverage of methods for generating, documenting, identifying and managing paradata, including formal metadata, narrative descriptions and qualitative and quantitative backtracking. The book also develops a unifying reference model to help readers contextualise the role of paradata within a wider system of knowledge, practices and processes, and provides a vision for the future of the field. This guide to general principles and practice is ideal for researchers, students and data managers. This title is also available as open access on Cambridge Core.

Designing Learning Intervention Studies: Identifiability of Heterogeneous Hidden Markov Models
Ying Liu, Steven Culpepper
Journal:

Psychometrika

Published online by Cambridge University Press:

22 July 2025, pp. 1-26
- Article
- Open access
Hidden Markov models (HMMs) are popular for modeling complex, longitudinal data. Existing identifiability theory for conventional HMMs assumes that emission probabilities are constant over time and that the Markov chain governing transitions among the hidden states is irreducible, assumptions that may not hold in all educational and psychological research settings. We generalize existing conditions on homogeneous HMMs by considering heterogeneous HMMs with time-varying emission probabilities and the potential for absorbing states. Researchers are investigating a family of models known as restricted HMMs (RHMMs), which combine HMMs and restricted latent class models (RLCMs) to provide fine-grained classification of educationally and psychologically relevant attribute profiles over time. These RHMMs leverage the benefits of RLCMs and HMMs to understand changes in attribute profiles within longitudinal designs. The identifiability of RHMM parameters is a critical issue for ensuring successful applications and accurate statistical inference regarding factors that impact outcomes in intervention studies. We establish identifiability conditions for RHMMs. The new identifiability conditions for heterogeneous HMMs and RHMMs provide researchers with insights for designing interventions. We discuss different types of assessment designs and the implications for practice. We present an application of a heterogeneous HMM to daily measures of positive and negative affect.

The association between coffee intake and femoral neck bone mineral density based on the NHANES and Mendelian randomisation study
Ke Wang, Guoxin Huang, Ying Liu, Beibei Zhang, Da Qian, Bin Pei
Journal:

Journal of Nutritional Science / Volume 14 / 2025

Published online by Cambridge University Press:

18 July 2025, e51
- Article
- Open access
Femoral neck bone mineral density (FNBMD) is a high risk factor for femoral head fractures, and coffee intake affects bone mineral density, but the effect on FNBMD remains to be explored. First, we conducted an observational study in the National Health and Nutrition Examination Survey and collected data on coffee intake, FNBMD, and sixteen covariates. Weighted linear regression was used to explore the association of coffee intake with FNBMD. Then, Mendelian randomisation (MR) was used to explore the causal relationship between coffee intake and FNBMD, with coffee intake as the exposure factor and FNBMD as the outcome factor. The inverse variance weighting (IVW) method was used for the analysis, while heterogeneity tests, sensitivity, and pleiotropy analysis were performed. A total of 5 915 people were included in the cross-sectional study, including 3 178 men and 2 737 women. In the completely adjusted model, no coffee was used as a reference. The ORs for the overall population at ‘< 1’, ‘1–<2’, ‘2–<4’, and ‘4+’ (95% CI) were 0.02 (–0.01, 0.04), 0.00 (–0.01, 0.02), –0.01 (–0.02, 0.00), and 0.00 (–0.01, 0.02), respectively. The male and female populations showed no statistically significant differences in either univariate or multivariate linear regressions. In the MR study, the IVW results showed an OR (95% CI) of 1.06 (0.88–1.27), a P-value of 0.55, and an overall F-value of 80.31. The heterogeneity, sensitivity, and pleiotropy analyses showed no statistical significance. Our study used cross-sectional studies and MR to demonstrate that there is no correlation or causal relationship between coffee intake and FNBMD.

Thickness model for viscous impinging liquid sheets
Ziyang Peng, Xuan Liu, Zhuo-Yang Song, Bo Wang, Zhengxuan Cao, Erjun Wu, Jiarui Zhao, Ying Gao, Xiaodong Chen, Wenjun Ma
Journal:

Journal of Fluid Mechanics / Volume 1014 / 10 July 2025

Published online by Cambridge University Press:

30 June 2025, R3
- Article
Ultra-thin liquid sheets generated by impinging two liquid jets are crucial high-repetition-rate targets for laser ion acceleration and ultra-fast physics, and serve widely as barrier-free samples for structural biochemistry. The impact of liquid viscosity on sheet thickness should be comprehended fully to exploit its potential. Here, we demonstrate experimentally that viscosity significantly influences thickness distribution, while surface tension primarily governs shape. We propose a thickness model based on momentum exchange and mass transport within the radial flow, which agrees well with the experiments. These results provide deeper insights into the behaviour of liquid sheets and enable accurate thickness control for various applications, including atomization nozzles and laser-driven particle sources.

Spatial metabolomics to profile metabolic reprogramming of liver in Schistosoma japonicum-infected mice
Yu Zhang, Luo Ming, Junhui Li, Chen Guo, Jie Jiang, Ying Zhang, Gao Tan, Xiaoli Liu, Yingzi Ming
Journal:

Parasitology / Accepted manuscript

Published online by Cambridge University Press:

28 March 2025, pp. 1-31
- Article
- Open access

Resting-state network alterations in depression: a comprehensive meta-analysis of functional connectivity
Zhihui Zhang, Yijing Zhang, He Wang, Minghuan Lei, Yifan Jiang, Di Xiong, Yayuan Chen, Yujie Zhang, Guoshu Zhao, Yao Wang, Wanwan Zhang, Jinglei Xu, Ying Zhai, Qi An, Shen Li, Xiaoke Hao, Feng Liu
Journal:

Psychological Medicine / Volume 55 / 2025

Published online by Cambridge University Press:

26 February 2025, e63
- Article
- Open access
Background
Depression has been linked to disruptions in resting-state networks (RSNs). However, inconsistent findings on RSN disruptions, with variations in reported connectivity within and between RSNs, complicate the understanding of the neurobiological mechanisms underlying depression.
Methods
A systematic literature search of PubMed and Web of Science identified studies that employed resting-state functional magnetic resonance imaging (fMRI) to explore RSN changes in depression. Studies using seed-based functional connectivity analysis or independent component analysis were included, and coordinate-based meta-analyses were performed to evaluate alterations in RSN connectivity both within and between networks.
Results
A total of 58 studies were included, comprising 2321 patients with depression and 2197 healthy controls. The meta-analysis revealed significant alterations in RSN connectivity, both within and between networks, in patients with depression compared with healthy controls. Specifically, within-network changes included both increased and decreased connectivity in the default mode network (DMN) and increased connectivity in the frontoparietal network (FPN). Between-network findings showed increased DMN–FPN and limbic network (LN)–DMN connectivity, decreased DMN–somatomotor network and LN–FPN connectivity, and varied ventral attention network (VAN)–dorsal attentional network (DAN) connectivity. Additionally, a positive correlation was found between illness duration and increased connectivity between the VAN and DAN.
Conclusions
These findings not only provide a comprehensive characterization of RSN disruptions in depression but also enhance our understanding of the neurobiological mechanisms underlying depression.

Overlapping and differential neuropharmacological mechanisms of stimulants and nonstimulants for attention-deficit/hyperactivity disorder: a comparative neuroimaging analysis
Nanfang Pan, Tianyu Ma, Yixi Liu, Shufang Zhang, Samantha Hu, Aniruddha Shekara, Hengyi Cao, Qiyong Gong, Ying Chen
Journal:

Psychological Medicine / Volume 54 / Issue 16 / December 2024

Published online by Cambridge University Press:

14 January 2025, pp. 4676-4690
- Article
- Open access
Background
Psychostimulants and nonstimulants have partially overlapping pharmacological targets on attention-deficit/hyperactivity disorder (ADHD), but whether their neuroimaging underpinnings differ is elusive. We aimed to identify overlapping and medication-specific brain functional mechanisms of psychostimulants and nonstimulants on ADHD.
Methods
After a systematic literature search and database construction, the imputed maps of separate and pooled neuropharmacological mechanisms were meta-analyzed by Seed-based d Mapping toolbox, followed by large-scale network analysis to uncover potential coactivation patterns and meta-regression analysis to examine the modulatory effects of age and sex.
Results
Twenty-eight whole-brain task-based functional MRI studies (396 cases in the medication group and 459 cases in the control group) were included. Possible normalization effects of stimulant and nonstimulant administration converged on increased activation patterns of the left supplementary motor area (Z = 1.21, p < 0.0001, central executive network). Stimulants, relative to nonstimulants, increased brain activations in the left amygdala (Z = 1.30, p = 0.0006), middle cingulate gyrus (Z = 1.22, p = 0.0008), and superior frontal gyrus (Z = 1.27, p = 0.0006), which are within the ventral attention network. Neurodevelopmental trajectories emerged in activation patterns of the right supplementary motor area and left amygdala, with the left amygdala also presenting a sex-related difference.
Conclusions
Convergence in the left supplementary motor area may delineate novel therapeutic targets for effective interventions, and distinct neural substrates could account for different therapeutic responses to stimulants and nonstimulants.

Asymptotics for crank of overpartitions
Part of
- Combinatorics
- Additive number theory; partitions
Edward Y.S. Liu, Helen W.J. Zhang, Ying Zhong
Journal:

Canadian Journal of Mathematics, First View

Published online by Cambridge University Press:

09 January 2025, pp. 1-34
- Article
Let $\overline{M}(a,c,n)$ denote the number of overpartitions of $n$ with first residual crank congruent to $a$ modulo $c$, with $c\geq 3$ being odd and $0\leq a<c$. The central objective of this paper is twofold: firstly, to establish an asymptotic formula for the crank of overpartitions; and secondly, to establish several inequalities concerning $\overline{M}(a,c,n)$ that encompass crank differences, positivity, and strict log-subadditivity.

PD80 Exploring Key Factors Influencing Patients’ Antidepressant Preferences: Insights From A Multicenter Best-Worst Scaling Survey
Ying Tao, Yanfeng Ren, Shimeng Liu, Yingyao Chen
Journal:

International Journal of Technology Assessment in Health Care / Volume 40 / Issue S1 / December 2024

Published online by Cambridge University Press:

07 January 2025, p. S127
- Article
- Open access
Introduction
Depression is associated with serious disease burden. Despite the multitude of antidepressant options available, the adherence rate is often low. Accounting for patient preferences can potentially boost adherence to antidepressant medication and elevate patient satisfaction. However, limited evidence exists regarding patient preferences for antidepressant selection. This study aims to elicit patient preferences regarding the benefits, risks, and cost attributes of antidepressants in China.
Methods
A best-worst scaling profile case experiment was conducted using a face-to-face survey administered to patients diagnosed with depression. Patients were recruited from general and psychiatric hospitals. We utilized a multiphase approach that integrated literature review, expert consultation, and best-worst scaling to develop attributes within choice sets. The attributes, each varying across two or three levels, encompassed remission rate, sleep disorders, risk of headache or dizziness, risk of gastrointestinal adverse events, risk of liver or kidney injury, and monthly out-of-pocket costs. Each respondent answered seven choice tasks, including a dominant task. Data were analyzed using conditional logit, mixed logit, and generalized multinomial logit models. Subgroup analyses were conducted to explore preference heterogeneity.
Results
A total of 331 participants completed the survey and met the inclusion criteria. Almost all attribute levels were statistically significant. Overall, the most desirable characteristics of antidepressant medications were higher remission rates (80% and 55% rates; p<0.05), lower risk of liver or kidney injury (1% rate; p<0.05), and lower monthly out-of-pocket costs (CNY100 [USD13.93, EUR12.75]; p<0.05). Risks of gastrointestinal adverse events (60% and 35% rates) and insomnia were the least preferred features. Regarding attributes, efficacy, the risk of gastrointestinal adverse events, and sleep disorders were relatively important factors influencing patient choice. Preferences differed slightly by age, degree of education, personal annual income, and treatments currently received.
Conclusions
Our study suggests that efficacy, gastrointestinal adverse effects, sleep disorders, and treatment costs are critical drivers behind medication choices among patients with depression. Preference heterogeneity also exists regarding individual and therapeutic characteristics, which need more samples and further analyses to identify. These discoveries hold the potential to enrich the shared decision-making process between physicians and patients within healthcare settings.

Latent Feature Extraction for Process Data via Multidimensional Scaling
Xueying Tang, Zhi Wang, Qiwei He, Jingchen Liu, Zhiliang Ying
Journal:

Psychometrika / Volume 85 / Issue 2 / June 2020

Published online by Cambridge University Press:

01 January 2025, pp. 378-397
- Article
Computer-based interactive items have become prevalent in recent educational assessments. In such items, the detailed human–computer interaction process, known as the response process, is recorded in a log file. The recorded response processes provide great opportunities to understand individuals’ problem-solving processes. However, difficulties exist in analyzing these data as they are high-dimensional sequences in a nonstandard format. This paper aims at extracting useful information from response processes. In particular, we consider an exploratory analysis that extracts latent variables from process data through a multidimensional scaling framework. A dissimilarity measure is described to quantify the discrepancy between two response processes. The proposed method is applied to both simulated data and real process data from 14 PSTRE items in PIAAC 2012. A prediction procedure is used to examine the information contained in the extracted latent variables. We find that the extracted latent variables preserve a substantial amount of information in the process and have reasonable interpretability. We also show empirically that process data contain more information than classic binary item responses in terms of out-of-sample prediction of many variables.

Regularized Latent Class Analysis with Application in Cognitive Diagnosis
Yunxiao Chen, Xiaoou Li, Jingchen Liu, Zhiliang Ying
Journal:

Psychometrika / Volume 82 / Issue 3 / September 2017

Published online by Cambridge University Press:

01 January 2025, pp. 660-692
- Article
Diagnostic classification models are confirmatory in the sense that the relationship between the latent attributes and responses to items is specified or parameterized. Such models are readily interpretable, with each component of the model usually having a practical meaning. However, parameterized diagnostic classification models are sometimes too simple to capture all the data patterns, resulting in significant model lack of fit. In this paper, we attempt to obtain a compromise between interpretability and goodness of fit by regularizing a latent class model. Our approach starts with minimal assumptions on the data structure, followed by suitable regularization to reduce complexity, so that a readily interpretable yet flexible model is obtained. An expectation–maximization-type algorithm is developed for efficient computation. It is shown that the proposed approach enjoys good theoretical properties. Results from simulation studies and a real application are presented.

On the Identifiability of Diagnostic Classification Models
Guanhua Fang, Jingchen Liu, Zhiliang Ying
Journal:

Psychometrika / Volume 84 / Issue 1 / 15 March 2019

Published online by Cambridge University Press:

01 January 2025, pp. 19-40
- Article
This paper establishes fundamental results for statistical analysis based on diagnostic classification models (DCMs). The results are developed at a high level of generality and are applicable to essentially all diagnostic classification models. In particular, we establish identifiability results for various modeling parameters, notably item response probabilities, attribute distribution, and Q-matrix-induced partial information structure. These results are stated under a general setting of latent class models. Through a nonparametric Bayes approach, we construct an estimator that can be shown to be consistent when the identifiability conditions are satisfied. Simulation results show that these estimators perform well under various model settings. We also apply the proposed method to a dataset from the National Epidemiological Survey on Alcohol and Related Conditions (NESARC).

Search Results

299 results