
Comparative Analysis of Conformal Prediction: Split, Full, and Adaptive Approaches for Statistical and Neural Network Models

Published online by Cambridge University Press:  27 August 2025

Yuwei Zong
Affiliation:
Department of Mechanical Engineering, The University of Texas at Dallas, US
Yanwen Xu*
Affiliation:
Department of Mechanical Engineering, The University of Texas at Dallas, US

Abstract:

Conformal prediction (CP) is a framework that equips predictive models with uncertainty quantification in the form of valid marginal coverage. The main approaches currently in use divide into Bayesian methods and statistical inference methods. Among the statistical inference methods, split, full, and adaptive conformal prediction are the basic techniques. Although numerous variations of these methods exist, a clear comparison among them is lacking. In this paper, the three basic conformal prediction methods are compared on low-dimensional and high-dimensional datasets to illustrate the advantages and disadvantages of each. The experiments show that split conformal prediction delivers stable coverage but leaves data partitioning as a key issue to solve; full conformal prediction can shorten the prediction interval but does not reliably achieve the expected coverage; and adaptive conformal prediction suffers from quantile distribution deviation under complex models. This paper also outlines directions for future research.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Author(s) 2025

1. Introduction

In high-risk applications such as aerospace, medical efficacy, and credit prediction, even small prediction errors can lead to catastrophic consequences, which are unacceptable (Shafer and Vovk, 2005). Both system analysis and predictive models face significant uncertainty. Uncertainty quantification (UQ) requires the use of mathematical models and computational methods to quantitatively analyze this uncertainty (Saltelli et al., 2000). Providing the level of uncertainty associated with point estimates allows for more confident decision-making. Therefore, in uncertainty quantification problems, conformal prediction is introduced to address the uncertainty in the output. Conformal prediction (CP) is a framework that provides uncertainty quantification outputs as valid marginal coverage for predictive models (Tibshirani et al., 2019). In the process of conformal prediction, there is no need to consider the specific details of the predictive model; instead, the assumption of data exchangeability is made, and valid prediction intervals are generated based on a specified significance level. The significance level here limits the frequency at which the conformal prediction model can make errors. Therefore, a well-performing conformal prediction model should provide prediction intervals as small as possible, while still adhering to the specified significance level, to give more accurate estimations of the predictive model’s performance (Vovk et al., 2005).

The concept of conformal prediction was first introduced in 1988. Since then, several methods of conformal prediction have been developed and matured, including Bayesian methods and statistical approaches. The book published by Vovk, Gammerman, and Shafer in 2005 provides a detailed introduction to the fundamental theories, methods, and practical applications of conformal prediction (Vovk et al., 2005).

The article by Matteo Fontana et al. in 2023 offers an excellent recent review of the field and highlights future development trends (Fontana et al., 2023). Currently, the statistical methods applied are mainly based on split conformal prediction, full conformal prediction, and adaptive conformal prediction. Split conformal prediction (SCP) was first proposed in 2002 (Papadopoulos et al., 2002). In some situations, the dataset for training is limited, and splitting the dataset may consume useful information, especially with a small dataset with a relatively simple structure. To avoid splitting the dataset, Vladimir Vovk, Glenn Shafer, and others proposed other CP methods, which they further developed in 2014. Ryan Tibshirani detailed full conformal prediction in his lecture notes (Tibshirani, 2023). Later, building on previous methods, Ryan Tibshirani et al. introduced adaptive ideas for conformal prediction in 2019 (Tibshirani et al., 2019). Data-Dependent Weights (DDW) is also a type of adaptive conformal prediction, based on the jackknife and cross-validation, proposed by Parisa Hajibabaee et al. in 2024 to address situations where the prediction model may experience a shift (Hajibabaee et al., 2024). This idea was discussed in the context of conformal prediction by Isaac Gibbs and Emmanuel Candès in their 2021 work (Gibbs and Candès, 2021).

Although previous works have provided detailed summaries of different CP methods, and various articles have applied these methods to real-world data, a comprehensive and standardized comparison of the different CP methods is still lacking. Furthermore, the advantages and disadvantages of these methods are not sufficiently clear. Therefore, this paper aims to compare different CP methods through standardized tests on the same dataset to clarify the application scenarios suggested by each method. By examining coverage rate, forecast interval size, and other indicators, this study identifies unresolved issues and outlines potential directions for future improvements.

Section 2 of this paper explains the methodologies of split, full, and adaptive conformal prediction. Section 3 describes the details of the experiment, including dataset preparation and the experimental process. Section 4 discusses the final experimental results, and the last section presents the conclusion of this paper.

2. Methodology

Given a prediction model and a set of known data $(x_1, y_1), \ldots, (x_m, y_m)$ that constitute the training dataset, the CP method aims to utilize this information along with the input of a new point $x_{m+1}$. It generates the conformity score function $\hat S(x)$ to construct the prediction interval $\hat C_{k,m}(x_{m+1})$, which covers the true value $y_{m+1}$ (Fontana et al., 2023). Common formulations of conformal prediction problems are based on the assumption of data exchangeability, a statistical concept. Unlike Bayesian methods, which infer the posterior distribution of unknown quantities using prior information and data, CP provides a confidence interval whose probability of containing the true result is known, rather than just a point prediction (Vovk et al., 2005; Bernardo and Smith, 1994).

2.1. Split Conformal Prediction

Given a dataset containing n data points, first divide it into a training set with $n_1$ points and a calibration set with the remaining $n - n_1$ points. The prediction model $\hat f$ is fitted on the training set, and a conformity score function $\hat S_t(y)$ is defined to evaluate its performance; in regression problems this is usually the residual. Using the scores $\hat S_t(y_i)$ computed on the calibration set for $\hat f$, the quantile $\hat q_k^*$ is defined as the $\lceil (1-k)(n - n_1) \rceil$-th smallest value in the score sequence. The prediction interval is then

(1) $$\hat C_{k,m} (X_t ) = \{ y:\hat S_t (y) \le \hat q_k^ * \} ,$$

Here k is a fixed parameter: the significance level, so the expected coverage of the prediction interval is $1 - k$ (Lei et al., 2018).
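As a concrete illustration, the split procedure can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes a scikit-learn-style model with `fit`/`predict` methods, uses the absolute residual as the conformity score, and applies the common $\lceil (1-k)(m+1) \rceil$ finite-sample rank for the calibration quantile.

```python
import numpy as np

def split_conformal_interval(model, X_train, y_train, X_cal, y_cal, X_new, k=0.05):
    """Split conformal prediction with absolute-residual conformity scores (sketch)."""
    model.fit(X_train, y_train)                    # fit once on the proper training set
    scores = np.abs(y_cal - model.predict(X_cal))  # conformity scores on calibration set
    m = len(scores)
    # (1 - k)-quantile of the calibration scores with the (m + 1) finite-sample rank
    rank = int(np.ceil((1 - k) * (m + 1)))
    q = np.sort(scores)[min(rank, m) - 1]
    pred = model.predict(X_new)
    return pred - q, pred + q                      # interval C_{k,m}(x) = f̂(x) ± q̂
```

Because the model is trained once and only scored on the calibration set, the cost is a single fit regardless of how many new points are queried.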

2.2. Full conformal prediction

When the point $(x_{n+1}, y_{n+1})$ is given at the current step, consider all the previous points $(x_1, y_1), \ldots, (x_n, y_n)$ as the training set. Unlike split conformal prediction, full conformal prediction retrains the prediction model on the augmented training set defined as $M_{n + 1} := \{ (x_1, y_1), \ldots, (x_n, y_n), (x_{n + 1}, y) \}$ to obtain $\hat f_{n}^{(x_{n + 1}, y)}$. The conformity score $\hat S_{i}^{(x_{n + 1}, y)}$ is calculated on $\hat f_{n}^{(x_{n + 1}, y)}$ and $M_{n+1}$ to determine the $(1 - k)$ quantile $\hat q_{n,k}^{(x_{n + 1}, y)}$.

Finally, the prediction interval $\hat C_{k,n} (x_{n + 1})$ is produced by Eq. 1. Aside from the partitioning of the dataset, the biggest difference between split and full conformal prediction is whether the prediction model is fixed. Both split conformal prediction and full conformal prediction follow a “fixed” rule to generate prediction intervals, rather than adapting with the model and data. Considering the possibility of model shift, adaptive conformal prediction is introduced (Papadopoulos et al., 2002).

2.3. Adaptive conformal prediction

Adaptive conformal prediction updates the quantile value based solely on historical coverage, without considering the details of the prediction model; the method focuses on adapting the quantile rather than the model itself. Suppose that for the verification point $(x_{n+1}, y_{n+1})$, the prediction model for point estimation is $\hat f$, and the calibration set at the current step is $D_{n + 1}^{{\text{cal}}}$. The quantile estimate is denoted $\hat q(1 - k)$. An indicator function is defined to flag uncovered points.

(2) $$e_i = \begin{cases} 0, & y_i \in \hat C_k (x_i ) \\ 1, & \text{otherwise} \end{cases}$$

Suppose that $(x_n, y_n, \hat f(x_n))$ and the prediction interval $\hat C_k (x_n )$ are given. The current significance level $k_{n + 1}^*$ at $(x_{n+1}, y_{n+1})$ is updated according to the update equation:

(3) $$k_{n + 1}^* = k_n^* + \lambda (k - e_n )$$

The parameter λ represents the step size, which limits the speed of adjustment, and an initial value of $k_n^*$ is also provided. The direction of adjustment of the prediction interval is determined by the previous point: if the model performed well at the previous point, the prediction interval is appropriately tightened; otherwise, it is widened. The adjustment at each step occurs only once, and the prediction interval is recorded as the output of adaptive conformal prediction (Gibbs and Candès, 2021).
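The update rule in Eq. 3 is a one-line computation; the hypothetical helper below makes the mechanics explicit (the `covered` flag corresponds to $e_n = 0$):

```python
def adaptive_update(k_star, k, covered, lam=0.05):
    """One step of the adaptive significance-level update of Eq. 3 (sketch).

    k_star  -- current significance level k_n^*
    k       -- target significance level
    covered -- True if y_n fell inside the last interval (i.e. e_n = 0)
    lam     -- step size lambda
    """
    e = 0.0 if covered else 1.0
    return k_star + lam * (k - e)
```

A covered point raises $k^*$ (a lower quantile, hence a narrower next interval), while a miss lowers it and widens the interval; in practice $k^*$ is usually clipped to (0, 1) by the caller.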

3. Experiment Description

3.1. Preparation

3.1.1. Dataset Description

This experiment is conducted on both low-dimensional and high-dimensional data. The low-dimensional data is collected from a Combined Cycle Power Plant and published in the UCI Machine Learning Repository (Tüfekci and Kaya, 2014). In this dataset, each record contains four features: Ambient Temperature (AT), Ambient Pressure (AP), Relative Humidity (RH), and Exhaust Vacuum (V), which are used to predict the net hourly electrical energy output (PE). The features of the dataset are detailed in Table 1. The low-dimensional tests apply linear regression models (LR), random forest regression models (RF), artificial neural networks (ANN), and convolutional neural networks (CNN).

Table 1: Low-dimensional data characteristics

Table 2: High-dimensional data characteristics

The high-dimensional data is generated using the 10-dimensional Griewank function. During the data generation process, each feature is randomly selected from the interval [−600, 600], and the complete data is generated by

(4) $$f(x) = 1 + \sum\limits_{i = 1}^{10} \frac{x_i^2}{4000} - \prod\limits_{i = 1}^{10} \cos \left( \frac{x_i}{\sqrt{i + 1}} \right)$$
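Under the stated sampling scheme, the dataset can be generated as follows. This is a sketch: the function name and seed are illustrative, and the cosine term follows Eq. 4's $\sqrt{i+1}$, which differs from the classical Griewank function's $\sqrt{i}$.

```python
import numpy as np

def griewank_dataset(n, d=10, low=-600.0, high=600.0, seed=0):
    """Generate n samples of the d-dimensional dataset defined by Eq. 4 (sketch).

    Each feature is drawn uniformly from [low, high]; the target uses the
    paper's variant with sqrt(i + 1) inside the cosine product.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(low, high, size=(n, d))
    i = np.arange(1, d + 1)                               # feature indices 1..d
    y = (1 + np.sum(X**2 / 4000.0, axis=1)
           - np.prod(np.cos(X / np.sqrt(i + 1)), axis=1))
    return X, y
```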

The features of the dataset can be found in Table 2, where MSE is the mean square error. Since the 10 features of the data are generated in the same way, only one of the features and the target are shown there. Considering the complexity of high-dimensional data, the experiment is conducted only with the Gaussian process (GP), ANN, and CNN models.

3.1.2. Model Accuracy Test

To ensure the model’s accuracy is sufficiently high, a model test is conducted at the beginning of the entire experiment. The evaluation metrics include the relative Root Mean Square Error (r-RMSE), which is defined as

(5) $$\mathrm{r\text{-}RMSE} = \sqrt{\frac{1}{n}\sum\limits_{i = 1}^n \left(\frac{\hat y_i - y_i}{y_i}\right)^2}$$

The r-RMSE and the coefficient of determination $R^2$, which measures how well the model fits the data, are used to evaluate the accuracy of the model.
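Eq. 5 translates directly to code; a minimal NumPy sketch:

```python
import numpy as np

def r_rmse(y_pred, y_true):
    """Relative RMSE of Eq. 5: root mean square of the elementwise relative errors."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    rel = (y_pred - y_true) / y_true   # relative error per point (y_true must be nonzero)
    return float(np.sqrt(np.mean(rel**2)))
```

For example, predictions that are each off by 10% give an r-RMSE of about 0.1.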

Through tuning and cross-validation, specific parameters for each model are determined. In the low-dimensional model test, all models are trained on datasets with sizes ranging from 200 to 1500 and validated on the remaining data, with results obtained from 10 repeated experiments. In the high-dimensional model test, models are trained on datasets with sizes ranging from 300 to 1900. After removing outliers, the results are shown in Fig. 1a. If the r-RMSE is not larger than 0.1, the accuracy of the model is considered acceptable. Based on the experimental data, we chose 400 data points for the low-dimensional experiment and 600 data points for the high-dimensional experiment, where the accuracy performance of all four models is acceptable and sufficiently stable. Additionally, in all experiments, we set the expected coverage rate $1 - k = 0.95$. The conformity score is defined as the residual:

(6) $$\hat S_i^{(x_{n + 1} ,y)} = |y_i - \hat f_n^{(x_{n + 1} ,y)} (x_i )|$$

Figure 1: Model accuracy test result

3.2. Low-dimensional Data Experiment

3.2.1. Split conformal prediction on unlimited dataset

In this section, the prediction model is trained on a sufficiently large dataset to ensure high accuracy, demonstrating the best performance of SCP under ideal conditions. A partition coefficient $\alpha_1 \in (0,1)$ is defined as the proportion of the available data allocated to training; 400 data points are used in the training set to train the prediction model. Thus, the size of the calibration set is $m = 400 \times \frac{1 - \alpha_1}{\alpha_1}$.

The conformity score sequence is calculated according to Eq. 6, and the $\lceil (1-k)m \rceil$-th smallest value in the score sequence is taken as the quantile $\hat q_k^*$. The prediction interval $\hat C_{k,n} (X_* )$ is generated according to Eq. 1. With 10 repetitions for each partition coefficient, the average value is calculated after removing outliers, as shown in Fig. 2a.

Throughout the entire test, the coverage rate of the models at each $\alpha_1$ stays at the expected 0.95. Across different $\alpha_1$, the prediction interval length for LR and RF fluctuates within [16, 17], while for ANN and CNN it varies within [34, 40]. Additionally, the CNN model shows a decreasing trend in prediction interval length, whereas the ANN model exhibits no clear trend.

3.2.2. Split conformal prediction on limited dataset

Here, the core question is how to divide the available information when data is limited. Data partitioning ensures that the training and calibration processes do not influence each other, thereby reducing computational costs. The division problem arises because the model's training and evaluation require completely independent information, even though the data comes from the same source. With 400 data points, the training set has $400 \times \alpha_2$ points, while the calibration set contains $400 \times (1 - \alpha_2)$ points, where $\alpha_2$ ranges from 0.1 to 0.9. The rest of the process is the same as in the unlimited-data SCP experiment, and the results are shown in Fig. 3.

Figure 2: Mean interval length of SCP with unlimited data

Figure 3: Mean interval length and coverage rate of SCP with low-dimensional limited training data

The prediction interval coverage rates of all four models fluctuate around the expected value of 0.95, in contrast to their stable results in the unlimited scenario. All four models perform best and stabilize around 95% coverage when the split ratio $\alpha_2$ is between 0.3 and 0.5. However, the coverage rate drops suddenly as $\alpha_2$ reaches 0.9 for the RF, ANN, and CNN models.

Regarding the prediction interval length, two trends appear across the four models: the mean interval length with LR still fluctuates within [16, 17], with minimal change, while the RF, ANN, and CNN models all show decreasing interval lengths as $\alpha_2$ increases.

3.2.3. Full conformal prediction

In full conformal prediction, the entire training dataset is used to train the prediction model and obtain the prediction interval without partitioning. As before, 400 data points constitute the training set. First, the prediction model $\hat f_{{\rm{full}}}$ is trained on the training set to obtain the point prediction $\hat f_{{\rm{full}}} (x_t)$. Since the unknown target cannot be fitted directly to the predicted value distribution of the preceding points through the prediction model, $\hat f_{{\rm{full}}} (x_t)$ is taken as the starting point for generating a backup set $\{y_{\rm backup}\}$, defined in Eq. 7, to simulate the distribution of candidate targets.

(7) $$U_{t,\beta } = \{ y_{{\rm{backup}}} = \hat f_{{\rm{full}}} (x_t ) + n\beta ,\; n \in \mathbb{Z}^+ \} \qquad L_{t,\beta } = \{ y_{{\rm{backup}}} = \hat f_{{\rm{full}}} (x_t ) - n\beta ,\; n \in \mathbb{Z}^+ \}$$

Here, n indexes the backup candidates; the size of $\{y_{\rm backup}\}$ is set to 200 in all experiments, and β is a fixed step size. For each $y_{\rm backup} \in U_{t,\beta} \cup L_{t,\beta}$, define the augmented training dataset $R_{\beta ,{\rm{backup}}} = \{ (x_{{\rm{train}}} ,y_{{\rm{train}}} )\} \cup \{(x_t ,y_{{\rm{backup}}} )\}$. The model $\hat f_{t,{\rm backup}}(\cdot)$ is trained on this augmented dataset and used to calculate its conformity scores. The quantile of the conformity score at $(x_t, y_{\rm backup})$ is used to produce the prediction interval according to Eq. 1. The candidates whose quantiles are closest to the expected significance level in the upper and lower sets are denoted $y_{\rm up}$ and $y_{\rm lo}$, respectively.

The choice of step size affects the final performance of the experiment. The step size used in the experiment is based on the model accuracy data, and requires the spacing of $y_{\rm backup}$ to be scaled according to the average error. In the experiment, 50 data points are used for verification, and the results are shown in Table 3.
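The grid-based procedure of Eqs. 1 and 7 can be sketched as follows. This is a schematic reimplementation under assumptions, not the authors' code: `make_model` is a hypothetical factory returning a fit/predict model, absolute residuals (Eq. 6) serve as conformity scores, and a candidate target is admitted when its own score does not exceed the $(1-k)$ quantile of the augmented scores.

```python
import numpy as np

def full_conformal_interval(make_model, X_train, y_train, x_new, k=0.05,
                            step=0.8, n_grid=200):
    """Full conformal prediction via a backup grid of candidate targets (sketch)."""
    base = make_model()
    base.fit(X_train, y_train)
    center = float(base.predict(x_new.reshape(1, -1))[0])
    # backup candidates of Eq. 7: center ± n*step for n = 0..n_grid
    grid = center + step * np.arange(-n_grid, n_grid + 1)

    X_aug = np.vstack([X_train, x_new.reshape(1, -1)])
    accepted = []
    for y_cand in grid:
        y_aug = np.append(y_train, y_cand)
        m = make_model()
        m.fit(X_aug, y_aug)                         # retrain on the augmented set
        scores = np.abs(y_aug - m.predict(X_aug))   # conformity scores (Eq. 6)
        rank = int(np.ceil((1 - k) * len(y_aug)))
        q = np.sort(scores)[rank - 1]
        if scores[-1] <= q:                         # candidate conforms -> keep it
            accepted.append(y_cand)
    return (min(accepted), max(accepted)) if accepted else (center, center)
```

The per-candidate retraining is what makes full conformal prediction expensive: with 200 backup points on each side of the center, the model is refit hundreds of times for every validation point.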

Table 3: Result of full conformal prediction with low-dimensional data

Table 4: Result of full conformal prediction on high-dimensional data

Full conformal prediction does not consistently achieve the expected 0.95 coverage rate, with all four models either exceeding or falling short of the desired value. Although the LR and RF models exhibit highly similar accuracy levels, under the same step size there are differences in coverage and interval length, reflecting inherent differences between the two models.

3.2.4. Comparison between split and full conformal prediction on limited dataset

In this section, the same 400 training points are used for both split conformal prediction and full conformal prediction. Additionally, the same 50 validation points are used. Results are shown in Fig.4.

Figure 4: Comparison of mean interval length and coverage rate between split and full conformal prediction with low-dimensional data. The red columns represent split CP and the orange columns full CP; the blue line represents the coverage rate of split CP and the green line that of full CP

The results of the split conformal prediction for LR and RF are relatively stable, and the full conformal prediction results are very close to the split conformal prediction results, but worse. However, when dealing with ANN, full conformal prediction is relatively more stable. The CNN model shows more stable results in the split conformal prediction, with more acceptable coverage. However, full conformal prediction holds a poor coverage rate of 0.8, despite providing very narrow prediction intervals.

3.2.5. Adaptive-split conformal prediction

In this part, adaptive conformal prediction is applied to split conformal prediction using update function Eq. 3, where the update step size λ is 0.05 for each model. A set of 400 data points is used for training the prediction model and quantile calculation, and 200 data points are used for validation. The other settings are the same as in the previous experiment. The results are shown in Fig. 5.

Figure 5: Comparison between with vs without adaptive quantile of split conformal prediction on low-dimensional data

With the adaptive process, the length of the prediction interval for the LR and RF models decreases when $\alpha_2 \le 0.6$ and $\alpha_2 \le 0.5$, respectively. As the partition coefficient increases, the coverage of these two models fails to reach the expected 0.95, even though adaptive conformal prediction provides a larger prediction interval. For the CNN and ANN, after introducing the adaptive quantile, the prediction interval length decreases but no longer stabilizes at the expected coverage level of 0.95, with the coverage eventually rising close to 1.

3.2.6. Adaptive-full conformal prediction

Similarly, 400 data points are used for training and 50 for validation; the results are shown in Table 5.

Table 5: Comparison of adaptive vs. fixed-quantile full conformal prediction on low-dimensional data

The results of the four models are quite similar. Adaptive conformal prediction does not significantly affect the coverage of the generated prediction intervals but does influence their length. Since the true values of the data themselves are not large, the optimization effect of this method is not obvious.

3.3. High-dimensional Data Experiment

In the high-dimensional data experiment, 10-dimensional data generated by the Griewank equation is applied to GP, ANN, and CNN models. The training data size increases from 400 to 600. The experimental process is exactly the same as in the low-dimensional experiment, with only the experimental results shown.

3.3.1. Split conformal prediction on unlimited dataset

The coverage rate of the three models is fixed at 0.95 for any $\alpha_1$. According to the results in Fig. 2b, when the total amount of training data is 600, changes in $\alpha_1$ do not significantly impact the length and coverage of the prediction intervals. The prediction interval lengths of the GP and ANN models are similar, fluctuating within [55, 65], while that of the CNN model fluctuates within [30, 40].

3.3.2. Split conformal prediction on limited dataset

Based on Fig. 6, the three models exhibit similar behaviour as the partition ratio changes: as $\alpha_2$ increases, their prediction intervals become shorter, but the differences among the models remain. The CNN model typically produces the smallest prediction intervals, while the GP model generates relatively larger ones. However, once the partition ratio reaches 0.4, the change in prediction interval length becomes small, indicating that there is sufficient data for the whole process.

Figure 6: Mean interval length and coverage rate of SCP with high-dimensional limited training data

3.3.3. Full conformal prediction

For all models, the step size is 0.8. According to the results in Table 4, the performance of full conformal prediction with the same step size varies significantly across models. Similar to the low-dimensional experiments, this method struggles to guarantee the expected 95% coverage. Even when the accuracy levels of different models are similar, the coverage and average interval length of full conformal prediction can still differ markedly.

3.3.4. Comparison between split and full conformal prediction on limited dataset

Based on results in Fig. 7, when using the same experimental dataset and validation set, split conformal prediction tends to achieve the expected coverage more effectively across all three models, albeit with larger prediction intervals. This is especially true for the ANN and CNN models, where the coverage is nearly identical to 0.95. In contrast, full conformal prediction shows a considerable deviation from the expected coverage, despite having shorter prediction intervals. Notably, when the partition coefficient is large enough with sufficient data, the prediction interval lengths of full conformal prediction approach those of split conformal prediction.

Figure 7: Comparison of mean interval length and coverage rate between split and full conformal prediction with high-dimensional data

3.3.5. Adaptive-split conformal prediction

Based on the results in Fig. 8, the optimization effect of adaptive conformal prediction on split conformal prediction is not particularly significant, and it does not necessarily reduce the length of the prediction interval. The main role of the adaptive quantile is to stabilize the coverage rate of the conformal prediction interval at the expected 95% level. Based on Table 6, applying the adaptive quantile leads to a noticeable improvement for the GP and ANN models, but not for the CNN. For GP and ANN, not only does the coverage rate get closer to the expected value, but the prediction interval lengths also decrease by 12% and 7%, respectively. For the CNN model, adaptive conformal prediction shortens the prediction interval length by 6%, but the coverage rate drops.

Figure 8: Comparison between with vs without adaptive quantile of split conformal prediction on high-dimensional data

Table 6: Comparison of adaptive vs. fixed-quantile full conformal prediction on high-dimensional data

4. Discussion

4.1. Split conformal prediction

Split conformal prediction focuses on the training dataset and generates prediction intervals with minimal computational cost. It maintains good coverage rates, especially when the predictive model is accurate. However, when using complex models or when model accuracy decreases, the prediction interval length increases due to model bias. A key issue is the partitioning of the dataset, which may introduce overlap between training and calibration information, potentially violating the assumption of no prior knowledge about the model. In low-dimensional data, simpler models like LR and RF show minimal differences in results, but for more complex models, model accuracy should take priority over interval precision. Data partitioning inevitably loses information, and full conformal prediction can be considered to avoid this issue.

4.2. Full conformal prediction

Full conformal prediction avoids data loss from partitioning but introduces challenges in constructing target value distributions. The step size for fitting the model must be carefully selected: too small leads to underfitting, while too large risks missing valuable details. Although larger step sizes may increase coverage, full conformal prediction struggles with consistent coverage in both low- and high-dimensional settings. Additionally, its computational demands remain high, limiting its practicality. Optimizing this method to reduce computation time and improve coverage stability is an important future direction.

4.3. Adaptive conformal prediction

Adaptive conformal prediction adjusts prediction intervals based on past coverage, aiming for narrower intervals when predictions are accurate and wider ones when deviations occur. However, it may lead to excessively high coverage rates, sacrificing prediction accuracy. The quantile adjustment algorithm can result in large fluctuations in coverage, particularly when previous points are not covered. Additionally, while adaptive methods can improve coverage, they might increase interval lengths, especially in complex models where the quantile distribution does not follow the expected trend, as shown in Fig. 9, particularly when the calibration data cluster in a narrow range. Given the actual distribution of $y_{\rm backup}$, a small change in the quantile value may lead to an unexpected adjustment of the interval. Balancing coverage and interval precision remains a challenge for this approach.

Figure 9: The ybackup distribution diagram

5. Conclusion

In this study, three main conformal prediction methods—split, full, and adaptive conformal prediction—are compared using both low-dimensional and high-dimensional datasets. Split conformal prediction performs best in terms of coverage rate, while full conformal prediction struggles to maintain consistent coverage, particularly in high-dimensional settings, and requires significant resources. Adaptive conformal prediction improves coverage stability but needs further refinement to balance coverage and model complexity. In the future, adaptive conformal prediction will have greater potential in practical application scenarios, and the adaptive method should consider adding more model information to the update decision. Currently, the update decision is simple and fixed, and more flexible adaptation strategies with lower computational power consumption need to be explored.

References

Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. Wiley.
Chemali, E., Kollmeyer, P. J., Preindl, M. and Emadi, A. (2018). State-of-charge estimation of Li-ion batteries using deep neural networks: A machine learning approach. Journal of Power Sources, 400, 242–255.
Fontana, M., Zeni, G. and Vantini, S. (2023). Conformal prediction: a unified review of theory and new challenges. Bernoulli, 29(1), 1–23.
Gibbs, I. and Candès, E. (2021). Adaptive conformal inference under distribution shift. Advances in Neural Information Processing Systems, 34, 1660–1672.
Hajibabaee, P., Pourkamali-Anaraki, F. and Hariri-Ardebili, M. (2024). Adaptive conformal prediction intervals using data-dependent weights with application to seismic response prediction. IEEE Access.
Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J. and Wasserman, L. (2018). Distribution-free predictive inference for regression. Journal of the American Statistical Association, 113(523), 1094–1111.
Papadopoulos, H., Proedrou, K., Vovk, V. and Gammerman, A. (2002). Inductive confidence machines for regression. Machine Learning: ECML 2002, Springer, 345–356.
Saltelli, A., Tarantola, S. and Campolongo, F. (2000). Sensitivity analysis as an ingredient of modeling. Statistical Science, 15(4), 377–395.
Shafer, G. and Vovk, V. (2005). Probability and Finance: It’s Only a Game! John Wiley & Sons.
Tibshirani, R. J., Barber, R. F., Candès, E. and Ramdas, A. (2019). Conformal prediction under covariate shift. Advances in Neural Information Processing Systems, 32.
Tibshirani, R. (2023). Conformal Prediction. UC Berkeley.
Tüfekci, P. and Kaya, H. (2014). Combined Cycle Power Plant. UCI Machine Learning Repository. https://doi.org/10.24432/C5002N
Vovk, V., Gammerman, A. and Shafer, G. (2005). Algorithmic Learning in a Random World. Springer.