THE ANALYSIS OF SINGLETONS IN GENERALIZED BIRTHDAY PROBLEMS

Published online by Cambridge University Press:  27 April 2012

Matthijs R. Koot
Affiliation:
Informatics Institute, University of Amsterdam, The Netherlands E-mail: koot@cyberwar.nl
Michel Mandjes
Affiliation:
Korteweg-de Vries Institute for Mathematics, University of Amsterdam, The Netherlands; Eurandom, Eindhoven University of Technology, The Netherlands; CWI, Amsterdam, The Netherlands E-mail: m.r.h.mandjes@uva.nl

Abstract

This paper describes techniques to characterize the number of singletons in the setting of the generalized birthday problem, that is, the birthday problem in which the birthdays are non-uniformly distributed over the year. Approximations for the mean and variance are presented that explicitly indicate the impact of the heterogeneity (expressed in terms of the Kullback–Leibler distance with respect to the homogeneous distribution). Then an iterative scheme is presented for determining the distribution of the number of singletons. The approximations are validated by experiments with demographic data.
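
To make the quantity studied in the abstract concrete, the following is a minimal Python sketch, not the paper's method. It assumes n individuals whose birthdays are drawn independently from a probability vector over the days, and counts a singleton as a day on which exactly one individual is born. Under that assumption, linearity of expectation gives the exact mean as the sum over days i of n p_i (1 - p_i)^(n-1). The paper's approximations in terms of the Kullback–Leibler distance and its iterative scheme are not reproduced here. The function names `expected_singletons` and `simulate_singletons` are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def expected_singletons(n, probs):
    """Exact mean number of singletons among n independent draws.

    Day i holds exactly one birthday with probability
    n * p_i * (1 - p_i)**(n - 1); the mean is the sum over all days.
    """
    return sum(n * p * (1.0 - p) ** (n - 1) for p in probs)


def simulate_singletons(n, probs, trials=2000, seed=0):
    """Monte Carlo estimate of the same mean, as a sanity check."""
    rng = random.Random(seed)
    days = range(len(probs))
    total = 0
    for _ in range(trials):
        # Draw n birthdays and count the days that occur exactly once.
        counts = Counter(rng.choices(days, weights=probs, k=n))
        total += sum(1 for c in counts.values() if c == 1)
    return total / trials


if __name__ == "__main__":
    uniform = [1.0 / 365] * 365
    print(expected_singletons(23, uniform))
    print(simulate_singletons(23, uniform))
```

Passing a non-uniform `probs` vector (for example, empirical birth frequencies) to both functions shows how heterogeneity changes the mean relative to the uniform case.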

Type
Research Article
Copyright
Copyright © Cambridge University Press 2012
