Human leukocyte antigen distributions do not share a copula across sub-populations

Subject: Life Science and Biomedicine

Published online by Cambridge University Press: 10 October 2022

Dan Schellhas
Affiliation:
Data Science Program, Bowling Green State University, Bowling Green, OH 43402, USA
Robert C. Green II*
Affiliation:
Department of Computer Science, Bowling Green State University, Bowling Green, OH 43402, USA
*Corresponding author. Email: greenr@bgsu.edu

Abstract

The distribution of human leukocyte antigens in the population assists in matching solid organ donors and recipients when the typing methods used do not provide sufficiently precise information. This is made possible by linkage disequilibrium (LD), where alleles co-occur more often than random chance would suggest. There is a trade-off between the high bias and low variance of a broad sample from the population and the low bias but high variance of a focused sample. Some of this trade-off could be alleviated if sub-populations shared LD despite having different allele frequencies. These experiments show that Bayesian estimation can balance bias and variance by tuning the effective sample size of the reference panel, but the LD as represented by an additive or multiplicative copula is not shared.
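
As a rough illustration of how an effective sample size can trade bias against variance, the sketch below shrinks a focused panel's haplotype counts toward a broad panel's frequencies. It assumes a Dirichlet prior built from the broad panel and scaled to `effective_n` pseudo-observations; that model, the function name `shrink_toward_broad_panel`, and all numbers are illustrative assumptions, not the authors' implementation or data.

```python
def shrink_toward_broad_panel(focused_counts, broad_freqs, effective_n):
    """Posterior-mean haplotype frequencies under an assumed Dirichlet prior.

    focused_counts: observed haplotype counts in the focused (sub-population) panel.
    broad_freqs:    haplotype frequencies from the broad panel (should sum to 1).
    effective_n:    number of pseudo-observations given to the broad panel.
    """
    if effective_n < 0:
        raise ValueError("effective_n must be non-negative")
    if len(focused_counts) != len(broad_freqs):
        raise ValueError("panels must list the same haplotypes in the same order")
    total = sum(focused_counts) + effective_n
    if total == 0:
        raise ValueError("need at least one real or pseudo observation")
    # Each estimate is (observed count + prior pseudo-count) / (all counts).
    return [
        (count + effective_n * freq) / total
        for count, freq in zip(focused_counts, broad_freqs)
    ]


# Made-up numbers for four hypothetical haplotypes (not data from the article).
focused_counts = [6, 3, 1, 0]
broad_freqs = [0.4, 0.3, 0.2, 0.1]

for effective_n in (0, 10, 1000):
    print(effective_n, shrink_toward_broad_panel(focused_counts, broad_freqs, effective_n))
```

With `effective_n = 0` the estimate is just the focused panel's raw frequencies (low bias, high variance). As `effective_n` grows, the estimate moves toward the broad panel (higher bias, lower variance). Under this assumed model, tuning `effective_n` is what sets the balance between the two.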

Information

Type: Research Article
Result type: Negative result
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. The benefit of Bayesian estimation varies significantly by group, but is clearly useful for reference panels with fewer than 80,000 samples.

Figure 2. When the loci are reduced to only A, B, DRB1, and DQB1, the sampling error is decreased but the results of Bayesian estimation are nearly identical.

Figure 3. When the loci are reduced to A, B, and DRB1, the results are nearly indistinguishable from the 4-locus version.

Table 1. Optimal effective sample sizes vary by ethnicity from 1,000 to 12,000, and the largest panel size that benefits from additional references varies from 86,000 to 650,000. Native American is the only broad reference panel whose size falls within the range where Bayesian estimation would help.

Reviewing editor: Vitor Francisco
Minor revisions requested.

Review 1: Human leukocyte antigen distributions do not share a copula across sub-populations

Conflict of interest statement

Reviewer declares none.

Comments

Comments to the Author: This computational study is overall very well presented, and the conclusions appear to be justified by the data. However, I recommend that the manuscript be sent to a statistical editor for further validation since I am not a statistics expert.

Minor issues to be fixed:

1. The authors should increase the size of all figure panels. The font size in the figures is too small and the labels are difficult to read.

2. The authors provide little if any discussion of their results and do not highlight the limitations. Thus, a brief Discussion should be added, either as a separate section or as part of the Results.

3. The first sentence in the legend to Table 1 (“...and largest panel size that benefits from additional references vary...”) appears to be grammatically incorrect.

Presentation

Overall score 3.9 out of 5
Is the article written in clear and proper English? (30%)
5 out of 5
Is the data presented in the most useful manner? (40%)
3 out of 5
Does the paper cite relevant and related articles appropriately? (30%)
4 out of 5

Context

Overall score 5 out of 5
Does the title suitably represent the article? (25%)
5 out of 5
Does the abstract correctly embody the content of the article? (25%)
5 out of 5
Does the introduction give appropriate context? (25%)
5 out of 5
Is the objective of the experiment clearly defined? (25%)
5 out of 5

Analysis

Overall score 3.4 out of 5
Does the discussion adequately interpret the results presented? (40%)
3 out of 5
Is the conclusion consistent with the results and discussion? (40%)
4 out of 5
Are the limitations of the experiment as well as the contributions of the experiment clearly outlined? (20%)
3 out of 5

Review 2: Human leukocyte antigen distributions do not share a copula across sub-populations

Conflict of interest statement

Reviewer declares none.

Comments

Comments to the Author: The article discusses the distribution of HLA in five sub-populations and its application to tuning bias and variance in high-resolution HLA estimation. The simulations showed that a larger reference panel in the Bayesian estimation would be useful to balance the bias of the small panel, but that no LD copula was shared across the sub-populations. Overall, the conclusions could benefit the future development of HLA estimation algorithms.

However, the manuscript lacks many details that the authors should provide:

(1) How are the larger and smaller panels defined?

(2) The details of the first and second experiments, e.g., how were the simulations performed, and how was the effective sample size obtained from them?

(3) The formulas in the manuscript are not well defined. For instance, in Formula (1), what do θ, Χ, P, and L mean? They need to be explicitly defined in the manuscript.

Presentation

Overall score 4.3 out of 5
Is the article written in clear and proper English? (30%)
4 out of 5
Is the data presented in the most useful manner? (40%)
4 out of 5
Does the paper cite relevant and related articles appropriately? (30%)
5 out of 5

Context

Overall score 5 out of 5
Does the title suitably represent the article? (25%)
5 out of 5
Does the abstract correctly embody the content of the article? (25%)
5 out of 5
Does the introduction give appropriate context? (25%)
5 out of 5
Is the objective of the experiment clearly defined? (25%)
5 out of 5

Analysis

Overall score 4.8 out of 5
Does the discussion adequately interpret the results presented? (40%)
5 out of 5
Is the conclusion consistent with the results and discussion? (40%)
5 out of 5
Are the limitations of the experiment as well as the contributions of the experiment clearly outlined? (20%)
4 out of 5