
Valid Generalisation from Approximate Interpolation

Martin Anthony, Peter Bartlett, Yuval Ishai and John Shawe-Taylor
Abstract

Let H and C be sets of functions from domain X to ℝ. We say that H validly generalises C from approximate interpolation if and only if for each η > 0 and ε, δ ∈ (0,1) there is m₀(η, ε, δ) such that for any function t ∈ C and any probability distribution P on X, if m > m₀ then with Pᵐ-probability at least 1 – δ, a sample x = (x₁, x₂, …, xₘ) ∈ Xᵐ satisfies

    for all h ∈ H:  |h(xᵢ) – t(xᵢ)| < η (1 ≤ i ≤ m)  ⟹  P({x ∈ X : |h(x) – t(x)| ≥ η}) < ε.

We find conditions that are necessary and sufficient for H to validly generalise C from approximate interpolation, and we obtain bounds on the sample length m₀(η, ε, δ) in terms of various parameters describing the expressive power of H.
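
The definition can be made concrete with a small numerical sketch. Nothing below comes from the paper: the uniform distribution on [0, 1), the zero target function, the finite family of step functions, and all function names are assumptions chosen only for illustration. The sketch finds every hypothesis that approximately interpolates the target to within η on a random sample and reports a Monte Carlo estimate of the largest probability that such a hypothesis is off by η or more.

```python
import random

def interpolates(h, t, sample, eta):
    """True if h is within eta of the target t at every sample point."""
    return all(abs(h(x) - t(x)) < eta for x in sample)

def estimated_error(h, t, draw, eta, rng, trials=5000):
    """Monte Carlo estimate of the probability that |h(x) - t(x)| >= eta."""
    bad = 0
    for _ in range(trials):
        x = draw(rng)
        if abs(h(x) - t(x)) >= eta:
            bad += 1
    return bad / trials

def worst_interpolating_error(hypotheses, t, sample, draw, eta, rng):
    """Largest estimated error among hypotheses that interpolate on the sample."""
    errors = [
        estimated_error(h, t, draw, eta, rng)
        for h in hypotheses
        if interpolates(h, t, sample, eta)
    ]
    return max(errors, default=0.0)

def step(a):
    """Hypothetical hypothesis: 1 to the left of threshold a, 0 elsewhere."""
    return lambda x: 1.0 if x < a else 0.0

# Illustrative setup (all assumed): uniform distribution on [0, 1), target 0.
rng = random.Random(0)
draw = lambda r: r.random()
target = lambda x: 0.0
hypotheses = [step(k / 20) for k in range(21)]
eta = 0.5

for m in (5, 50, 500):
    sample = [draw(rng) for _ in range(m)]
    worst = worst_interpolating_error(hypotheses, target, sample, draw, eta, rng)
    print(f"m = {m:3d}: worst interpolating hypothesis has estimated error {worst:.3f}")
```

In this toy setting a step function with threshold a interpolates the sample exactly when no sample point falls below a, so the worst surviving error is governed by the smallest sample point and tends to shrink as m grows. This is only a finite, simulated illustration of the quantity the definition bounds; it says nothing about the sample length bounds obtained in the paper.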

Combinatorics, Probability and Computing
  • ISSN: 0963-5483
  • EISSN: 1469-2163
  • URL: /core/journals/combinatorics-probability-and-computing