3 - Unsupervised Learning Warm-Up
James Burridge, University of Portsmouth, Nick Tosh, University of Galway
Book:

Inference in Statistical Modelling and Machine Learning

Published online:

22 May 2026

Print publication:

23 July 2026, pp 24-35
- Chapter
Summary

In this chapter, we explore an unsupervised learning problem: estimating a distribution function from two-dimensional data. Although there is no response variable, the workflow mirrors that of supervised learning. We select the best-fitting function within a family by maximising the sum of the log of the distribution's values at the observed data points. As in supervised learning, excessive flexibility leads to overfitting, while insufficient flexibility leads to underfitting. We use cross-validation to identify a function family that achieves a happy medium.
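The fitting criterion described in this summary can be written compactly. The notation here is an assumption for illustration and is not taken from the chapter: $f_\theta$ is a candidate density in a family indexed by $\theta \in \Theta$, $x_1, \dots, x_n$ are the two-dimensional observations, and $F_1, \dots, F_K$ are cross-validation folds. The held-out score shown is one common choice; the summary does not say which cross-validation criterion the authors use.

```latex
\hat{\theta} = \arg\max_{\theta \in \Theta} \sum_{i=1}^{n} \log f_{\theta}(x_i),
\qquad x_i \in \mathbb{R}^2

% Held-out log-likelihood for comparing families (hypothetical notation):
% \hat{\theta}^{(-k)} is fitted with fold F_k removed.
\mathrm{CV} = \frac{1}{n} \sum_{k=1}^{K} \sum_{i \in F_k} \log f_{\hat{\theta}^{(-k)}}(x_i)
```

Under this sketch, a more flexible family always does at least as well on the first criterion, so families would be compared on the second.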

2 - Supervised Learning Warm-Up
James Burridge, University of Portsmouth, Nick Tosh, University of Galway
Book:

Inference in Statistical Modelling and Machine Learning

Published online:

22 May 2026

Print publication:

23 July 2026, pp 8-23
- Chapter
Summary

In this chapter, we examine our first supervised learning problem, focusing on how to construct prediction functions and assess their performance. Given data consisting of predictor–response pairs, we can learn the parameters of a prediction function by minimising a loss, such as the residual sum of squares, which measures the discrepancy between actual and predicted responses. Using more flexible families of prediction functions typically reduces loss on the training data, but excessive flexibility can lead to overfitting: fitting to noise rather than the systematic component of the relationship. Overfitting results in poor prediction performance on new, unseen data. To estimate how a prediction method will perform on unseen data, we use cross-validation. However, when we compare many prediction methods using cross-validation, the best-performing method often appears better than it truly is; its apparent performance is an unreliable guide to its future accuracy. Prior knowledge is crucial for selecting plausible prediction methods to compare. Finally, we can use bootstrapping to quantify uncertainty in prediction functions and their predictions.

3 - Prediction Error, Cross-Validation, and Model Selection
Isabella Verdinelli, Carnegie Mellon University, Pennsylvania, Larry Wasserman, Carnegie Mellon University, Pennsylvania
Book:

All of Regression

Published online:

08 May 2026

Print publication:

04 June 2026, pp 44-55
- Chapter
Summary

In this chapter, we explain how to estimate the prediction error of a regression model. The training error (the average of the squared residuals) underestimates the prediction error. Instead, we use cross-validation, which involves separating the data into one part for fitting the model and one part for estimating the prediction error. We can use the estimated prediction error to choose among a set of possible regression models.
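A minimal sketch of the idea in this summary, assuming polynomial regression on synthetic data and plain NumPy. The function names `training_mse` and `kfold_mse`, the data-generating model, and the choice of five folds are all illustrative assumptions, not the chapter's own code.

```python
import numpy as np

def training_mse(x, y, degree):
    """Average squared residual when fitting and evaluating on the same data."""
    coeffs = np.polyfit(x, y, degree)
    return float(np.mean((y - np.polyval(coeffs, x)) ** 2))

def kfold_mse(x, y, degree, k=5, seed=0):
    """Estimate prediction error by k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))          # shuffle before splitting into folds
    folds = np.array_split(idx, k)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)            # fit on the other folds
        coeffs = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coeffs, x[held_out])         # predict the held-out fold
        errors.append(np.mean((y[held_out] - pred) ** 2))
    return float(np.mean(errors))

# Synthetic data (assumption): a quadratic signal plus Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Compare candidate models (polynomial degrees 1..8) on both error measures.
results = {d: (training_mse(x, y, d), kfold_mse(x, y, d)) for d in range(1, 9)}
best_degree = min(results, key=lambda d: results[d][1])

for d, (train_err, cv_err) in results.items():
    print(f"degree {d}: training MSE {train_err:.3f}, CV MSE {cv_err:.3f}")
print("degree chosen by cross-validation:", best_degree)
```

The training error can only fall as the degree rises, so it cannot be used to choose among the models; the cross-validated error is the quantity to compare.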

Mixed-Effects XGBoost with Group-Aware Permutation Importance and Cross-Validation for Multilevel Cross-Classified Continuous Outcomes
Sun-Joo Cho, Sophia Mueller
Journal:

Psychometrika

Published online by Cambridge University Press:

10 April 2026, pp. 1-40
- Article
- Open access
This article proposes a mixed-effects machine-learning framework for modeling complex, nonlinear relations between predictors and continuous outcomes in multilevel cross-classified data. The proposed method, termed LMM–XGBoost, embeds extreme gradient boosting (XGBoost) within a linear mixed model (LMM) to combine flexible modeling of nonlinear and interaction effects with random effects that model dependence. In addition, an iterative estimation procedure for LMM–XGBoost is developed, a group-aware permutation importance measure that respects multilevel dependence is proposed, and a combined-group cross-validation (CV) strategy for hyperparameter tuning, out-of-fold (OOF) prediction, and importance estimation is developed for cross-classified designs. The simulation study shows that the proposed estimation method for LMM–XGBoost yields good parameter recovery under non-zero random-effect variances. In addition, relative to standard LMM and XGBoost, LMM–XGBoost achieves lower OOF prediction error and more accurate recovery of variable importance. The study further shows that combined-group CV and group-aware permutation importance yield less biased error estimates and substantially higher agreement with the true importance rankings than conventional permutation measures. An empirical application using the Add Health study illustrates how the proposed methods can identify important factors across multiple domains associated with adolescent depressive symptoms.

A novel adaptive sampling approach with batch selection for the automatic generation of surrogate models in geotechnical engineering
Part of
- Data-driven Techniques in Geoscience, Geomechanics, and Geotechnical Engineering
Yunxiang Yang, Agustín Ruiz López, Aikaterini Tsiampousi, David M.G. Taborda
Journal:

Data-Centric Engineering / Volume 7 / 2026

Published online by Cambridge University Press:

28 January 2026, e2
- Article
- Open access
Surrogate models have gained widespread popularity for their effectiveness in replacing computationally expensive numerical analyses, particularly in scenarios such as design optimization procedures, requiring hundreds or thousands of simulations. While one-shot sampling methods—where all samples are generated in a single stage without prior knowledge of the required sample size—are commonly adopted in the creation of surrogate models, these methods face significant limitations. Given that the characteristics of the underlying system are generally unknown prior to training, adopting one-shot sampling can lead to suboptimal model performance or unnecessary computational costs, especially in complex or high-dimensional problems. This paper addresses these challenges by proposing a novel, model-independent adaptive sampling approach with batch selection, termed Cross-Validation Batch Adaptive Sampling for High-Efficiency Surrogates (CV-BASHES). CV-BASHES is first validated using two analytical functions to explore its flexibility and accuracy under different configurations, confirming its robustness. Comparative studies on the same functions with two state-of-the-art methods, maximum projection (MaxPro) and scalable adaptive sampling (SAS), demonstrate the superior accuracy and robustness of CV-BASHES. Its applicability is further demonstrated through a geotechnical application, where CV-BASHES is used to develop a surrogate model to predict the horizontal deformation of a diaphragm wall supporting a deep excavation. Results show that CV-BASHES efficiently selects training samples, reducing the dataset size while maintaining high surrogate accuracy. By offering more efficient sampling strategies, CV-BASHES streamlines and enhances the process of creating machine learning models as surrogates for tackling complex problems in general engineering disciplines.

10 - Data Collection, Experimentation, and Evaluation
from Part IV - Applications, Evaluations, and Methods
Chirag Shah, University of Washington
Book:

A Hands-On Introduction to Data Science with R

Published online:

07 February 2026

Print publication:

22 January 2026, pp 285-310
- Chapter
Summary

This chapter focuses on data collection methods, analysis approaches, and evaluation techniques in data science. It covers various data collection methods including surveys (with different question types like multiple-choice, Likert scales, and open-ended questions), interviews, focus groups, diary studies, and user studies in lab and field settings.
The chapter distinguishes between quantitative methods (using numerical measurements and statistical analysis) and qualitative methods (observing behaviors, attitudes, and opinions through techniques like grounded theory and constant comparison). It also discusses mixed-method approaches that combine both methodologies.
For evaluation, the chapter explains model comparison metrics including precision, recall, F-measure, ROC curves, AIC, and BIC. It covers validation techniques like training-testing splits, A/B testing, and cross-validation methods. The chapter emphasizes that data science involves pre-data collection planning and post-analysis evaluation, not just data processing.

10 - Data Collection, Experimentation, and Evaluation
from Part IV - Applications, Evaluations, and Methods
Chirag Shah, University of Washington
Book:

A Hands-On Introduction to Data Science with Python

Published online:

07 February 2026

Print publication:

22 January 2026, pp 303-328
- Chapter
Summary

This chapter focuses on data collection methods, analysis approaches, and evaluation techniques in data science. It covers various data collection methods including surveys (with different question types like multiple-choice, Likert scales, and open-ended questions), interviews, focus groups, diary studies, and user studies in lab and field settings.
The chapter distinguishes between quantitative methods (using numerical measurements and statistical analysis) and qualitative methods (observing behaviors, attitudes, and opinions through techniques like grounded theory and constant comparison). It also discusses mixed-method approaches that combine both methodologies.
For evaluation, the chapter explains model comparison metrics including precision, recall, F-measure, ROC curves, AIC, and BIC. It covers validation techniques like training-testing splits, A/B testing, and cross-validation methods. The chapter emphasizes that data science involves pre-data collection planning and post-analysis evaluation, not just data processing.

Regularized Joint Maximum Likelihood Estimation of Latent Space Item Response Models
Dylan Molenaar, Minjeong Jeon
Journal:

Psychometrika / Volume 91 / Issue 1 / March 2026

Published online by Cambridge University Press:

09 January 2026, pp. 335-359
- Article
- Open access
In latent space item response models (LSIRMs), subjects and items are embedded in a low-dimensional Euclidean latent space. As such, interactions among persons and/or items can be revealed that are unmodeled in conventional item response theory models. The current estimation approach for LSIRMs is a fully Bayesian procedure with Markov chain Monte Carlo, which, while practical, is computationally challenging and hampers applied researchers from using the models in a wide range of settings. Therefore, we propose an LSIRM based on two variants of regularized joint maximum likelihood (JML) estimation: penalized JML and constrained JML. Owing to the absence of integrals in the likelihood, the JML methods allow for various models to be fit in a limited amount of time. This computational speed facilitates a practical extension of LSIRMs to ordinal data, and the possibility to select the dimensionality of the latent space using cross-validation. In this study, we derive the two JML approaches and address different issues that arise when using maximum likelihood to estimate the LSIRM. We present a simulation study demonstrating acceptable parameter recovery and adequate performance of the cross-validation procedure. In addition, we estimate different binary and ordinal LSIRMs on real datasets pertaining to deductive reasoning and personality. All methods are implemented in the R package ‘LSMjml’, which is available from CRAN.

4 - Bias–Variance Tradeoff and Overfitting vs. Underfitting
from Part II - Regression
Ruye Wang, Harvey Mudd College, California
Book:

Introduction to Machine Learning

Published online:

05 February 2026

Print publication:

18 December 2025, pp 113-123
- Chapter
Summary

This chapter considers some basic concepts of essential importance in supervised learning, whose fundamental task is to model the given dataset (the training set) so that the model's predictions match the given data optimally in some sense. As the form of the model is typically predetermined, the task of supervised learning is essentially to find the optimal parameters of the model in either of two ways: (a) the least squares estimation (LSE) method, which minimizes the squared error between the model prediction and the observed data, or (b) the maximum a posteriori (MAP) method, which maximizes the posterior probability of the model parameters given the data. The chapter further considers some important issues faced by all supervised learning methods based on noisy data, including overfitting, underfitting, and the bias-variance tradeoff, and then some specific methods to address such issues, including cross-validation, regularization, and ensemble learning.
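The two estimation routes named in this summary can be stated side by side. The symbols are assumptions introduced for illustration and are not taken from the chapter: $f(x; \theta)$ is the model with parameters $\theta$, and $\mathcal{D} = \{(x_i, y_i)\}_{i=1}^{n}$ is the training set.

```latex
% (a) Least squares estimation
\hat{\theta}_{\mathrm{LSE}} = \arg\min_{\theta} \sum_{i=1}^{n} \bigl( y_i - f(x_i; \theta) \bigr)^2

% (b) Maximum a posteriori estimation, using Bayes' rule
\hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta} \, p(\theta \mid \mathcal{D})
  = \arg\max_{\theta} \, p(\mathcal{D} \mid \theta) \, p(\theta)
```

The second equality holds because the evidence $p(\mathcal{D})$ does not depend on $\theta$. The prior $p(\theta)$ is what separates MAP from maximum likelihood, and its logarithm acts as a regularization term.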

8 - Applications for Multiple Networks
from Part III - Applications
Eric W. Bridgeford, The Johns Hopkins University, Alexander R. Loftus, The Johns Hopkins University, Joshua T. Vogelstein, The Johns Hopkins University
Book:

Hands-On Network Machine Learning with Python

Published online:

23 September 2025

Print publication:

18 September 2025, pp 357-376
- Chapter
Summary

This chapter explores advanced applications of network machine learning for multiple networks. We introduce anomaly detection in time series of networks, identifying significant structural changes over time. The chapter then focuses on signal subnetwork estimation for network classification tasks. We present both incoherent and coherent approaches, with incoherent methods identifying edges that best differentiate between network classes, and coherent methods leveraging additional network structure to improve classification accuracy. Practical applications, such as classifying brain networks, are emphasized throughout. These techniques apply to collections of networks, providing a toolkit for analyzing and classifying complex, multinetwork datasets. By integrating previous concepts with new methodologies, we offer a framework for extracting insights and making predictions from diverse network structures with associated attributes.

How detailed do measures of bilingual language experience need to be? A cost–benefit analysis using the Q-BEx questionnaire
Cécile De Cat, Arief Gusnanto, Draško Kašćelan, Philippe Prévost, Ludovica Serratrice, Laurie Tuller, Sharon Unsworth
Journal:

Bilingualism: Language and Cognition, First View

Published online by Cambridge University Press:

12 September 2025, pp. 1-12
- Article
- Open access
What is the optimal level of questionnaire detail required to measure bilingual language experience? This empirical evaluation compares alternative measures of language exposure of varying cost (i.e., questionnaire detail) in terms of their performance as predictors of oral language outcomes. The alternative measures were derived from Q-BEx questionnaire data collected from a diverse sample of 121 heritage bilinguals (5–9 years of age) growing up in France, the Netherlands and the UK. Outcome data consisted of morphosyntax and vocabulary measures (in the societal language) and parental estimates of oral proficiency (in the heritage language). Statistical modelling exploited information-theoretic and cross-validation approaches to identify the optimal language exposure measure. Optimal cost–benefit was achieved with cumulative exposure (for the societal language) and current exposure in the home (for the heritage language). The greatest level of questionnaire detail did not yield more reliable predictors of language outcomes.

Bayesian Comparison of Latent Variable Models: Conditional Versus Marginal Likelihoods
Edgar C. Merkle, Daniel Furr, Sophia Rabe-Hesketh
Journal:

Psychometrika / Volume 84 / Issue 3 / September 2019

Published online by Cambridge University Press:

01 January 2025, pp. 802-829
- Article
Typical Bayesian methods for models with latent variables (or random effects) involve directly sampling the latent variables along with the model parameters. In high-level software code for model definitions (using, e.g., BUGS, JAGS, Stan), the likelihood is therefore specified as conditional on the latent variables. This can lead researchers to perform model comparisons via conditional likelihoods, where the latent variables are considered model parameters. In other settings, however, typical model comparisons involve marginal likelihoods where the latent variables are integrated out. This distinction is often overlooked despite the fact that it can have a large impact on the comparisons of interest. In this paper, we clarify and illustrate these issues, focusing on the comparison of conditional and marginal Deviance Information Criteria (DICs) and Watanabe–Akaike Information Criteria (WAICs) in psychometric modeling. The conditional/marginal distinction corresponds to whether the model should be predictive for the clusters that are in the data or for new clusters (where “clusters” typically correspond to higher-level units like people or schools). Correspondingly, we show that marginal WAIC corresponds to leave-one-cluster out cross-validation, whereas conditional WAIC corresponds to leave-one-unit out. These results lead to recommendations on the general application of the criteria to models with latent variables.

Restricted Recalibration of Item Response Theory Models
Yang Liu, Ji Seung Yang, Alberto Maydeu-Olivares
Journal:

Psychometrika / Volume 84 / Issue 2 / June 2019

Published online by Cambridge University Press:

01 January 2025, pp. 529-553
- Article
In item response theory (IRT), it is often necessary to perform restricted recalibration (RR) of the model: A set of (focal) parameters is estimated holding a set of (nuisance) parameters fixed. Typical applications of RR include expanding an existing item bank, linking multiple test forms, and associating constructs measured by separately calibrated tests. In the current work, we provide full statistical theory for RR of IRT models under the framework of pseudo-maximum likelihood estimation. We describe the standard error calculation for the focal parameters, the assessment of overall goodness-of-fit (GOF) of the model, and the identification of misfitting items. We report a simulation study to evaluate the performance of these methods in the scenario of adding a new item to an existing test. Parameter recovery for the focal parameters as well as Type I error and power of the proposed tests are examined. An empirical example is also included, in which we validate the pediatric fatigue short-form scale in the Patient-Reported Outcome Measurement Information System (PROMIS), compute global and local GOF statistics, and update parameters for the misfitting items.

A Systematic Study into the Factors that Affect the Predictive Accuracy of Multilevel VAR(1) Models
Ginette Lafit, Kristof Meers, Eva Ceulemans
Journal:

Psychometrika / Volume 87 / Issue 2 / June 2022

Published online by Cambridge University Press:

01 January 2025, pp. 432-476
- Article
The use of multilevel VAR(1) models to unravel within-individual process dynamics is gaining momentum in psychological research. These models accommodate the structure of intensive longitudinal datasets in which repeated measurements are nested within individuals. They estimate within-individual auto- and cross-regressive relationships while incorporating and using information about the distributions of these effects across individuals. An important quality feature of the obtained estimates pertains to how well they generalize to unseen data. Bulteel and colleagues (Psychol Methods 23(4):740–756, 2018a) showed that this feature can be assessed through a cross-validation approach, yielding a predictive accuracy measure. In this article, we follow up on their results by performing three simulation studies that allow us to systematically study five factors that likely affect the predictive accuracy of multilevel VAR(1) models: (i) the number of measurement occasions per person, (ii) the number of persons, (iii) the number of variables, (iv) the contemporaneous collinearity between the variables, and (v) the distributional shape of the individual differences in the VAR(1) parameters (i.e., normal versus multimodal distributions). Simulation results show that pooling information across individuals and using multilevel techniques prevent overfitting. Also, we show that when variables are expected to show strong contemporaneous correlations, performing multilevel VAR(1) in a reduced variable space can be useful. Furthermore, results reveal that multilevel VAR(1) models with random effects have a better predictive performance than person-specific VAR(1) models when the sample includes groups of individuals that share similar dynamics.

External Analyses of Preference Models
Mark L. Davison
Journal:

Psychometrika / Volume 41 / Issue 4 / December 1976

Published online by Cambridge University Press:

01 January 2025, pp. 557-558
- Article
Using Carroll's external analysis, several studies have found that unfolding models account for more, although seldom significantly more, variance in preferences than Tucker's vector model. In studies of sociometric ratings and political preferences, the unfolding model again rarely outpredicted the vector model by a significant amount. Yet on cross-validation, the unfolding model consistently accounted for more variance. Results suggest that sometimes significance tests are less sensitive than cross-validation procedures to the small but consistent superiority of the unfolding model. Future researchers may wish to use significance tests and cross-validation techniques in comparing models.

A Cautionary Note on using Internal Cross Validation to Select the Number of Clusters
Abba M. Krieger, Paul E. Green
Journal:

Psychometrika / Volume 64 / Issue 3 / September 1999

Published online by Cambridge University Press:

01 January 2025, pp. 341-353
- Article
A highly popular method for examining the stability of a data clustering is to split the data into two parts, cluster the observations in Part A, assign the objects in Part B to their nearest centroid in Part A, and then independently cluster the Part B objects. One then examines how close the two partitions are (say, by the Rand measure). Another proposal is to split the data into k parts, and see how their centroids cluster. By means of synthetic data analyses, we demonstrate that these approaches fail to identify the appropriate number of clusters, particularly as sample size becomes large and the variables exhibit higher correlations.

What Can We Learn from a Semiparametric Factor Analysis of Item Responses and Response Time? An Illustration with the PISA 2015 Data
Yang Liu, Weimeng Wang
Journal:

Psychometrika / Volume 89 / Issue 2 / June 2024

Published online by Cambridge University Press:

27 December 2024, pp. 386-410
- Article
It is widely believed that a joint factor analysis of item responses and response time (RT) may yield more precise ability scores than are conventionally predicted from responses only. For this purpose, a simple-structure factor model is often preferred as it only requires specifying an additional measurement model for item-level RT while leaving the original item response theory (IRT) model for responses intact. The added speed factor indicated by item-level RT correlates with the ability factor in the IRT model, allowing RT data to carry additional information about respondents’ ability. However, parametric simple-structure factor models are often restrictive and fit poorly to empirical data, which prompts under-confidence in the suitability of a simple factor structure. In the present paper, we analyze the 2015 Programme for International Student Assessment mathematics data using a semiparametric simple-structure model. We conclude that a simple factor structure attains a decent fit after further parametric assumptions in the measurement model are sufficiently relaxed. Furthermore, our semiparametric model implies that the association between latent ability and speed/slowness is strong in the population, but the form of association is nonlinear. It follows that scoring based on the fitted model can substantially improve the precision of ability scores.

4 - Model Selection
Philip Hans Franses, Erasmus University
Book:

Ethics in Econometrics

Published online:

14 November 2024

Print publication:

28 November 2024, pp 88-112
- Chapter
Summary

We first discuss a phenomenon called data mining. This can involve multiple tests on which variables or correlations are relevant. If used improperly, data mining may be associated with scientific misconduct. Next, we discuss one way to arrive at a single final model, involving stepwise methods. We see that various stepwise methods lead to different final models. Next, we see that various configurations in test situations, here illustrated for testing for cointegration, lead to different outcomes. It may be possible to see which configurations make most sense and can be used for empirical analysis. However, we suggest that it is better to keep various models and somehow combine inferences. This is illustrated by an analysis of the losses in airline revenues in the United States owing to 9/11. We see that out of four different models, three estimate a similar loss, while the fourth model suggests only 10 percent of that figure. We argue that it is better to maintain various models, that is, models that withstand various diagnostic tests, for inference and for forecasting, and to combine what can be learned from them.

10 - Validity
from Part III - Methodological Challenges of Experimentation in Sociology
Davide Barrera, Università degli Studi di Torino, Italy, Klarita Gërxhani, Vrije Universiteit, Amsterdam, Bernhard Kittel, Universität Wien, Austria, Luis Miller, Institute of Public Goods and Policies, Spanish National Research Council, Tobias Wolbring, School of Business, Economics and Society at the Friedrich-Alexander-University Erlangen-Nürnberg
Book:

Experimental Sociology

Published online:

23 November 2024

Print publication:

21 November 2024, pp 119-131
- Chapter
Summary

This chapter addresses the often-misunderstood concept of validity. Much of the methodological discussion around sociological experiments is framed in terms of internal and external validity. The standard view is that the more we ensure that the experimental treatment is isolated from potential confounds (internal validity), the more unlikely it is that the experimental results can be representative of phenomena of the outside world (external validity). However, other accounts describe internal validity as a prerequisite of external validity: Unless we ensure internal validity of an experiment, little can be said of the outside world. We contend in this chapter that problems of either external or internal validity do not necessarily depend on the artificiality of experimental settings or on the laboratory–field distinction between experimental designs. We discuss the internal–external distinction and propose instead a list of potential threats to the validity of experiments that includes "usual suspects" like selection, history, attrition, and experimenter demand effects and elaborate on how these threats can be productively handled in experimental work. Moreover, in light of the different types of experiments, we also discuss the strengths and weaknesses of each regarding threats to internal and external validity.

9 - Predictor Importance and Model Selection in Multiple Regression Models
Gerry P. Quinn, Deakin University, Victoria, Michael J. Keough, University of Melbourne
Book:

Experimental Design and Data Analysis for Biologists

Published online:

04 September 2023

Print publication:

07 September 2023, pp 174-193
- Chapter
Summary

We can easily find ourselves with lots of predictors. This situation has been common in ecology and environmental science but has spread to other biological disciplines as genomics, proteomics, metabolomics, etc., become widespread. Models can become very complex, and with many predictors, collinearity is more likely. Fitting the models is tricky, particularly if we’re looking for the “best” model, and the way we approach the task depends on how we’ll use the model results. This chapter describes different model selection approaches for multiple regression models and discusses ways of measuring the importance of specific predictors. It covers stepwise procedures, all subsets, information criteria, model averaging and validation, and introduces regression trees, including boosted trees.

Search Results

40 results
