Gaussian random polytopes have received a lot of attention, especially in the case where the dimension is fixed and the number of points goes to infinity. Our focus is on the less-studied case where the dimension goes to infinity and the number of points is proportional to the dimension d. We study several natural quantities associated with Gaussian random polytopes in this setting. First, we show that the expected number of facets is equal to $C(\alpha)^{d+o(d)}$, where $C(\alpha)$ is a constant depending on the constant of proportionality $\alpha$. We also extend this result to the expected number of k-facets. We then consider the more difficult problem of the asymptotics of the expected number of pairs of estranged facets of a Gaussian random polytope. When the number of points is 2d, we determine the constant C such that the expected number of pairs of estranged facets is equal to $C^{d+o(d)}$.
Our primary result concerns the positivity of specific kernels constructed using the q-ultraspherical polynomials. In other words, it concerns a two-parameter family of bivariate, compactly supported distributions. Moreover, this family has the property that all its conditional moments are polynomials in the conditioning random variable. The result is of immediate interest mainly to researchers working on distribution theory, orthogonal polynomials, q-series theory, and the so-called quantum polynomials, and may therefore reach a limited audience. That is why we place our results in a broader context: we recall the theory of Hilbert–Schmidt operators and the idea of Lancaster expansions (LEs) of bivariate distributions that are absolutely continuous with respect to the product of their marginal distributions. Applications of LEs can be found in mathematical statistics and in the construction of Markov processes with polynomial conditional moments (the best known of these processes is the Wiener process).
In network science, community detection is a significant and challenging problem. Modularity [1] is a measure of community structure that compares connectivity in the network with the expected connectivity in a graph sampled from a random null model. Its optimisation is a common approach to the community detection problem. We present a new method for modularity maximisation, based on the observation that modularity can be expressed in terms of total variation on the graph and signless total variation on the null model. The resulting algorithm is of Merriman–Bence–Osher (MBO) type. Unlike earlier methods of this type, the new method can easily accommodate different choices of the null model. Besides theoretical investigations of the method, we include in this paper numerical comparisons with other community detection methods, including the MBO-type methods of Hu et al. [2] and Boyd et al. [3] and the Leiden algorithm [4].
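To make the quantity being maximised concrete, here is a minimal sketch of the standard Newman–Girvan modularity with the configuration null model; the toy graph and partition are illustrative and not taken from the paper.

```python
import numpy as np

def modularity(A, labels):
    """Q = (1/2m) * sum_ij (A_ij - k_i*k_j/2m) * [c_i == c_j]."""
    k = A.sum(axis=1)                 # node degrees
    two_m = k.sum()                   # 2m: total number of edge endpoints
    delta = labels[:, None] == labels[None, :]   # same-community indicator
    B = A - np.outer(k, k) / two_m    # modularity matrix (observed - expected)
    return (B * delta).sum() / two_m

# Two triangles joined by a single edge: a clear two-community structure.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

print(round(modularity(A, np.array([0, 0, 0, 1, 1, 1])), 4))  # → 0.3571
```

The MBO-type method of the paper optimises this same objective via a different (total-variation-based) reformulation; the direct formula above is only the definition being optimised.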
Graph-based semi-supervised learning methods combine the graph structure and labeled data to classify unlabeled data. In this work, we study the effect of a noisy oracle on classification. In particular, we derive the maximum a posteriori (MAP) estimator for clustering a degree-corrected stochastic block model when a noisy oracle reveals a fraction of the labels. We then propose an algorithm derived from a continuous relaxation of the MAP estimator, and we establish its consistency. Numerical experiments show that our approach achieves promising performance on synthetic and real data sets, even when the labeled data are very noisy.
Novel prediction methods should always be compared to a baseline to determine their performance. Without this frame of reference, the performance score of a model is basically meaningless. What does it mean when a model achieves an $F_1$ of 0.8 on a test set? A proper baseline is, therefore, required to evaluate the ‘goodness’ of a performance score. Comparing results with the latest state-of-the-art model is usually insightful. However, the state of the art is a moving target, as newer models are continuously developed. Instead of an advanced model, it is also possible to use a simple dummy classifier; however, such a model could be beaten too easily, making the comparison less valuable. Furthermore, most existing baselines are stochastic and need to be computed repeatedly to obtain a reliable expected performance, which can be computationally expensive. We present a universal baseline method for all binary classification models, named the Dutch Draw (DD). This approach evaluates a family of simple classifiers and selects the best one to use as a baseline. Theoretically, we derive the DD baseline for many commonly used evaluation measures and show that in most situations it reduces to (almost) always predicting either zero or one. In summary, the DD baseline is general, as it is applicable to any binary classification problem; simple, as it can be determined quickly without training or parameter tuning; and informative, as insightful conclusions can be drawn from the results. The DD baseline serves two purposes. First, it is a robust and universal baseline that enables comparisons across research papers. Second, it provides a sanity check during the prediction model’s development process. When a model does not outperform the DD baseline, it is a major warning sign.
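Since the abstract states that the DD baseline often reduces to (almost) always predicting zero or one, the reduced baseline for the $F_1$ score can be sketched as the better of those two constant classifiers; the function names and data below are illustrative, not the paper's implementation.

```python
def f1(y_true, y_pred):
    """Plain F1 score; returns 0.0 when there are no true positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

def dd_baseline_f1(y_true):
    """Reduced Dutch-Draw-style baseline for F1: best constant prediction."""
    n = len(y_true)
    all_ones = f1(y_true, [1] * n)    # = 2*pos / (n + pos)
    all_zeros = f1(y_true, [0] * n)   # = 0 whenever a positive exists
    return max(all_ones, all_zeros)

y = [1] * 30 + [0] * 70               # 30% positives
print(round(dd_baseline_f1(y), 4))    # → 0.4615, i.e. 2*30/(100+30)
```

Any model whose $F_1$ on this data does not clearly exceed 0.4615 has not demonstrated real predictive skill, which is exactly the sanity check the baseline is meant to provide.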
We study the community detection problem on a Gaussian mixture model, in which vertices are divided into $k\geq 2$ distinct communities. The major difference in our model is that the intensities of the Gaussian perturbations differ across entries of the observation matrix, and we do not assume that every community has the same number of vertices. We explicitly find the necessary and sufficient conditions for the exact recovery of the maximum likelihood estimation, which can give a sharp phase transition for the exact recovery even though the Gaussian perturbations are not identically distributed; see Section 7. Applications include community detection on hypergraphs.
We explore the limiting spectral distribution of large-dimensional random permutation matrices, assuming the underlying population distribution possesses a general dependence structure. Let $\textbf{X} = (\textbf{x}_1,\ldots,\textbf{x}_n) \in \mathbb{C}^{m \times n}$ be an $m \times n$ data matrix after self-normalization ($n$ samples and $m$ features), where $\textbf{x}_j = (x_{1j}^{*},\ldots, x_{mj}^{*})^{*}$. Specifically, we generate a permutation matrix $\textbf{X}_\pi$ by permuting the entries of $\textbf{x}_j$ $(j=1,\ldots,n)$ and demonstrate that the empirical spectral distribution of $\textbf{B}_n = ({m}/{n})\textbf{U}_{n} \textbf{X}_\pi \textbf{X}_\pi^{*} \textbf{U}_{n}^{*}$ converges weakly to the generalized Marčenko–Pastur distribution with probability 1, where $\textbf{U}_n$ is a sequence of $p \times m$ non-random complex matrices. The conditions we require are $p/n \to c > 0$ and $m/n \to \gamma > 0$.
Measures of uncertainty are a topic of considerable and growing interest. Recently, the introduction of extropy as a measure of uncertainty, dual to Shannon entropy, has opened up interest in new aspects of the subject. Since there are many versions of entropy, a unified formulation has been introduced to work with all of them in an easy way. Here we consider the possibility of defining a unified formulation for extropy by introducing a measure depending on two parameters. For particular choices of parameters, this measure provides the well-known formulations of extropy. Moreover, the unified formulation of extropy is also analyzed in the context of the Dempster–Shafer theory of evidence, and an application to classification problems is given.
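For readers unfamiliar with extropy, a minimal sketch of the discrete case may help: for a probability vector $(p_1,\ldots,p_n)$, Shannon entropy is $H(p) = -\sum_i p_i \ln p_i$ and its dual, extropy (Lad, Sanfilippo and Agrò, 2015), is $J(p) = -\sum_i (1-p_i)\ln(1-p_i)$; by definition their sum equals the sum of the binary entropies of the individual probabilities. This is only the classical formulation, not the two-parameter unified measure introduced in the paper.

```python
import math

def entropy(p):
    """Shannon entropy H(p) = -sum p_i * ln(p_i) (0*ln(0) taken as 0)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def extropy(p):
    """Extropy J(p) = -sum (1 - p_i) * ln(1 - p_i), the dual of entropy."""
    return -sum((1 - pi) * math.log(1 - pi) for pi in p if pi < 1)

p = [0.5, 0.25, 0.25]
h, j = entropy(p), extropy(p)
# Duality check: H(p) + J(p) = sum of binary entropies of each p_i.
binary = sum(entropy([pi, 1 - pi]) for pi in p)
print(abs(h + j - binary) < 1e-12)   # → True
```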
We derive an explicit formula for the N-point correlation $F_N(s)$ of the van der Corput sequence in base $2$ for all $N \in \mathbb {N}$ and $s \geq 0$. The formula can be evaluated without explicit knowledge about the elements of the van der Corput sequence. This constitutes the first example of an exact closed-form expression of $F_N(s)$ for all $N \in \mathbb {N}$ and all $s \geq 0$ which does not require explicit knowledge about the involved sequence. Moreover, it can be immediately read off that $\lim _{N \to \infty } F_N(s)$ exists only for $0 \leq s \leq 1/2$.
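For context, the van der Corput sequence in base 2 is obtained by reversing the binary digits of $n$ about the radix point. The short generator below shows the standard construction; it is not the paper's correlation formula, whose point is precisely that no sequence values are needed.

```python
def van_der_corput(n):
    """n-th term of the base-2 van der Corput sequence (binary digit reversal)."""
    x, denom = 0.0, 1.0
    while n:
        denom *= 2
        x += (n & 1) / denom   # least significant bit goes to the largest place
        n >>= 1
    return x

print([van_der_corput(n) for n in range(1, 8)])
# → [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
```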
Copulas provide a powerful and flexible tool for modeling the dependence structure of random vectors, and they have many applications in finance, insurance, engineering, hydrology, and other fields. One well-known class of copulas in two dimensions is the Farlie–Gumbel–Morgenstern (FGM) copula, since its simple analytic shape enables closed-form solutions to many problems in applied probability. However, the classical definition of the high-dimensional FGM copula does not enable a straightforward understanding of the effect of the copula parameters on the dependence, nor a geometric understanding of their admissible range. We circumvent this issue by analyzing the FGM copula from a probabilistic approach based on multivariate Bernoulli distributions. This paper examines high-dimensional exchangeable FGM copulas, a subclass of FGM copulas. We show that the set of dependence parameters of exchangeable FGM copulas is the convex hull of a finite number of extreme points. We also leverage the probabilistic interpretation to develop efficient sampling and estimating procedures and provide a simulation study. Throughout, we discover geometric interpretations of the copula parameters that assist one in decoding the dependence of high-dimensional exchangeable FGM copulas.
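To fix ideas in the bivariate case, the FGM copula is $C(u,v) = uv\,(1 + \theta(1-u)(1-v))$ with $|\theta| \le 1$, and exact sampling follows from inverting the conditional distribution $C_{2|1}(v\mid u) = v\,(1 + \theta(1-2u)(1-v))$. The sketch below uses this classical 2-d conditional-inversion construction, not the paper's high-dimensional Bernoulli-based sampler.

```python
import math
import random

def sample_fgm(theta, rng=random):
    """Draw (U, V) from the bivariate FGM copula with parameter |theta| <= 1."""
    u = rng.random()
    w = rng.random()              # uniform target for the conditional CDF
    a = theta * (1 - 2 * u)
    if abs(a) < 1e-12:            # conditional law is uniform in this case
        return u, w
    # Invert C_{2|1}(v|u) = w, i.e. solve a*v^2 - (1+a)*v + w = 0
    # for the root lying in [0, 1] (written in a numerically stable form).
    v = 2 * w / (1 + a + math.sqrt((1 + a) ** 2 - 4 * a * w))
    return u, v

random.seed(0)
pairs = [sample_fgm(0.8) for _ in range(5)]
print(all(0.0 <= u <= 1.0 and 0.0 <= v <= 1.0 for u, v in pairs))
```

Positive $\theta$ induces mild positive dependence (Spearman's rho is $\theta/3$), which is the bounded-dependence feature that makes the high-dimensional exchangeable extension and its parameter polytope interesting.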
The Minkowski functionals, including the Euler characteristic statistics, are standard tools for morphological analysis in cosmology. Motivated by cosmic research, we examine the Minkowski functional of the excursion set for an isotropic central limit random field, whose k-point correlation functions (kth-order cumulants) have the same structure as that assumed in cosmic research. Using 3- and 4-point correlation functions, we derive the asymptotic expansions of the Euler characteristic density, which is the building block of the Minkowski functional. The resulting formula reveals the types of non-Gaussianity that cannot be captured by the Minkowski functionals. As an example, we consider an isotropic chi-squared random field and confirm that the asymptotic expansion accurately approximates the true Euler characteristic density.
In this paper, we consider a financial or insurance system with a finite number of individual risks described by real-valued random variables. We focus on two kinds of risk measures, referred to as the tail moment (TM) and the tail central moment (TCM), which are defined as the conditional moment and conditional central moment of some individual risk in the event of system crisis. The first-order TM and the second-order TCM coincide with the popular risk measures called the marginal expected shortfall and the tail variance, respectively. We derive asymptotic expressions for the TM and TCM with any positive integer orders, when the individual risks are pairwise asymptotically independent and have distributions from certain classes that contain both light-tailed and heavy-tailed distributions. The formulas obtained possess concise forms unrelated to dependence structures, and hence enable us to estimate the TM and TCM efficiently. To demonstrate the wide application of our results, we revisit some issues related to premium principles and optimal capital allocation from the asymptotic point of view. We also give a numerical study on the relative errors of the asymptotic results obtained, under some specific scenarios when there are two individual risks in the system. The corresponding asymptotic properties of the degenerate univariate versions of the TM and TCM are discussed separately in an appendix at the end of the paper.
Risk measurements are clearly central to risk management, in particular for banks, (re)insurance companies, and investment funds. The question of the appropriateness of risk measures for evaluating the risk of financial institutions has been heavily debated, especially after the financial crisis of 2008/2009. Another concern for financial institutions is the pro-cyclicality of risk measurements. In this paper, we extend existing work on the pro-cyclicality of the Value-at-Risk to its main competitors, Expected Shortfall and Expectile: we compare the pro-cyclicality of historical quantile-based risk estimation, taking into account the market state. To characterise the latter, we propose various estimators of the realised volatility. Considering the family of augmented GARCH(p, q) processes (containing well-known GARCH models and iid models as special cases), we prove that the strength of pro-cyclicality depends on three factors: the choice of risk measure and its estimator, the realised volatility estimator, and the model considered; but, no matter the choices, pro-cyclicality is always present. We complement this theoretical analysis by performing simulation studies in the iid case and developing a case study on real data.
Measuring and quantifying dependencies between random variables (RVs) can give critical insights into a dataset. Typical questions are: ‘Do underlying relationships exist?’, ‘Are some variables redundant?’, and ‘Is some target variable Y highly or weakly dependent on variable X?’ Interestingly, despite the evident need for a general-purpose measure of dependency between RVs, in practice most data analysts use the Pearson correlation coefficient to quantify dependence between RVs, even though it is recognized that the correlation coefficient is essentially a measure of linear dependency only. Although many attempts have been made to define more generic dependency measures, there is no consensus yet on a standard, general-purpose dependency function. In fact, several ideal properties of a dependency function have been proposed, but without much argumentation. Motivated by this, we discuss and revise the list of desired properties and propose a new dependency function that meets all these requirements. This general-purpose dependency function provides data analysts with a powerful means to quantify the level of dependence between variables. To this end, we also provide Python code to determine the dependency function for use in practice.
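The limitation of Pearson correlation mentioned above is easy to demonstrate: take $Y = X^2$ with $X$ symmetric about zero, so $Y$ is fully determined by $X$ yet $\mathrm{Cov}(X, X^2) = \mathbb{E}[X^3] = 0$. This small check is illustrative only and uses nothing from the paper's proposed dependency function.

```python
import numpy as np

# Y is a deterministic (nonlinear) function of X, yet the Pearson
# correlation is zero because the dependence has no linear component.
x = np.linspace(-1, 1, 2001)   # symmetric grid around 0
y = x ** 2                     # perfect dependence: Y = X^2
r = np.corrcoef(x, y)[0, 1]
print(abs(r) < 1e-10)          # → True: Pearson reports "no dependence"
```

A general-purpose dependency function should instead return a value near its maximum here, since $Y$ is completely predictable from $X$.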
Negative dependence of sequences of random variables is often an interesting characteristic of their distribution, as well as a useful tool for studying various asymptotic results, including central limit theorems, Poisson approximations, the rate of increase of the maximum, and more. In the study of probability models of tournaments, negative dependence of participants’ outcomes arises naturally, with application to various asymptotic results. In particular, the property of negative orthant dependence was proved in several articles for different tournament models, with a special proof for each model. In this note we unify these results by proving a stronger property, negative association, a generalization leading to a very simple proof. We also present a natural example of a knockout tournament where the scores are negatively orthant dependent but not negatively associated. The proof requires a new result on a preservation property of negative orthant dependence that is of independent interest.
Gromov–Wasserstein distances were proposed a few years ago to compare distributions which do not lie in the same space. In particular, they offer an interesting alternative to the Wasserstein distances for comparing probability measures living on Euclidean spaces of different dimensions. We focus on the Gromov–Wasserstein distance with a ground cost defined as the squared Euclidean distance, and we study the form of the optimal plan between Gaussian distributions. We show that when the optimal plan is restricted to Gaussian distributions, the problem has a very simple linear solution, which is also a solution of the linear Gromov–Monge problem. We also study the problem without restriction on the optimal plan, and provide lower and upper bounds for the value of the Gromov–Wasserstein distance between Gaussian distributions.