Summary In this chapter we summarise some important mathematical results which are frequently required to solve numerical and statistical problems arising in scientific work. First, we discuss some Taylor series expansions for functions of one variable; we continue with the notion of functions of two or more variables and of partial differentiation. Lastly, we outline the concept of a matrix and the basic matrix operations: addition, subtraction, multiplication and the derivation of an inverse.
The level of mathematics required
The level of mathematics necessary for understanding the techniques described in this book is quite modest. Yet, using these techniques we are able to solve many difficult numerical and statistical problems. We are able to solve non-linear equations in one or more unknowns, integrate and differentiate a given function (including empirical functions), smooth crude data, fit curves and interpolate. Some of the techniques we develop can be used to solve other complex problems. For example, the method of finite differences (chapter 6) is used extensively for solving differential equations.
Readers will be familiar with the mathematical concepts of differentiation (obtaining the slope of a curve) and integration (finding the area under a curve). There are certain other basic concepts and formulae which bear repeating and these are summarised in the remainder of this chapter.
The Taylor series expansion
This formula uses the value of a function f(x) and its derivatives at a particular point x to produce the value of that function at a neighbouring point x + h.
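The idea can be sketched in a few lines of code. The function below (an illustrative helper, not from the text) sums the first few terms of the expansion f(x + h) = f(x) + h f′(x) + h²f″(x)/2! + …, given the derivatives of f at x; it is tested on f(x) = eˣ, whose derivatives are all eˣ.

```python
import math

def taylor_shift(derivs, h):
    """Approximate f(x + h) from the value and derivatives of f at x.

    derivs[k] is the kth derivative of f evaluated at x
    (so derivs[0] is f(x) itself).
    """
    return sum(d * h**k / math.factorial(k) for k, d in enumerate(derivs))

# For f(x) = e^x every derivative equals e^x, so at x = 0 all are 1.
approx = taylor_shift([1.0] * 8, 0.5)   # eight terms of the expansion
exact = math.exp(0.5)
```

With eight terms the truncation error is of the order of the first neglected term, h⁸/8!, which is already negligible for h = 0.5.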
Summary The normal, chi-square, t- and F-distributions all play prominent roles in statistical theory. The importance of the normal distribution is a direct result of the central limit theorem, and the other distributions are defined in terms of the normal. In this chapter, we summarise the important properties of each of these distributions. We also describe the log-normal distribution and the multivariate normal distribution.
The normal distribution
Random errors or fluctuations in the results of scientific experiments frequently follow a distribution which approximates the normal (or Gaussian) form. There are good reasons to expect this to be so. The central limit theorem states that the distribution function of a random variable which is the sum of n independent identically distributed random variables with mean μ and variance σ² approaches the normal distribution function with mean nμ and variance nσ² as n becomes large. A particular variable which we measure is often the result of combining a large number of variables which we cannot or do not measure; the deviation of an observation from the mean value or expected value is thus a sum of a number of deviations, some positive and some negative, and often of comparable magnitude.
The normal distribution is a continuous distribution completely defined by its mean μ and variance σ².
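The central limit theorem is easy to verify empirically. The sketch below (an illustration using hypothetical parameters, not an example from the text) sums n uniform random variables, each with mean ½ and variance 1/12, and checks that the sums have mean close to n/2 and variance close to n/12.

```python
import random
import statistics

random.seed(1)
n = 48           # number of summands in each sum
trials = 2000    # number of sums generated

# Each summand is uniform on [0, 1): mean 1/2, variance 1/12.
sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

mean_of_sums = statistics.mean(sums)      # expect about n * 1/2  = 24
var_of_sums = statistics.variance(sums)   # expect about n * 1/12 = 4
```

A histogram of `sums` would show the characteristic bell shape, even though each individual summand is far from normally distributed.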
Summary The result of a numerical calculation may differ from the exact answer because of truncation errors, round-off errors and mistakes. In this chapter, we describe some simple techniques for detecting and reducing truncation and round-off errors. We also suggest ways of avoiding mistakes.
Introduction
The result of a numerical calculation may differ from the exact answer to the mathematical problem for one or more of three basic reasons:
The calculation formula may be derived by cutting off an infinite series after a finite number of terms; the errors introduced in this manner are called truncation errors.
A calculating device is only able to retain a certain number of decimal digits and the less significant digits are dropped; errors introduced in this manner are called round-off errors.
Mistakes may be made by man or machine in performing the calculation or recording the result. The word ‘mistake’ is used to distinguish this source of discrepancy due to human or mechanical fallibility from the largely unavoidable ‘errors’ caused by the necessity to truncate an infinite series or the finite capacity of the calculating device.
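The first two sources of error are easy to exhibit. The sketch below (illustrative examples, not taken from the text) shows a truncation error, from cutting off the infinite series for e after five terms, and a round-off error, from the fact that 0.1 has no exact binary representation on the calculating device.

```python
import math

# Truncation error: cut off the series e = 1 + 1/1! + 1/2! + ... after n terms.
def exp1_truncated(n_terms):
    return sum(1 / math.factorial(k) for k in range(n_terms))

trunc_err = abs(math.e - exp1_truncated(5))   # the dropped tail of the series

# Round-off error: 0.1 is not exactly representable in binary, so repeated
# addition accumulates a small representation error.
total = sum(0.1 for _ in range(10))
roundoff_err = abs(total - 1.0)               # small, but not zero
```

The truncation error here is about 0.01 and shrinks as more terms are kept; the round-off error is tiny but irreducible on a fixed-precision machine.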
The research worker must ensure that the final results of a calculation are not rendered useless by errors or mistakes. Intermediate checks are advisable in a long calculation.
Summary The Pearson system of probability-density functions is defined by a differential equation similar in form to the difference equation of the hypergeometric distribution. By varying the parameters in the equation, it is possible to produce continuous distributions with different levels of skewness and kurtosis. The parameters are completely defined in terms of the first four moments, and the first four sample moments are used for curve-fitting purposes. The technique is convenient if one just wants to fit a curve without worrying about the justification of the functional form.
Introduction
Many different shapes of discrete distribution can be obtained by varying the parameters R, B and n in the hypergeometric distribution (section 10.13). When, for example, R = B and both are very large, and a reasonably large random sample of size n is drawn, the distribution of the number of red objects chosen is effectively binomial with parameters n (large) and p = ½. We then have a symmetric discrete distribution closely resembling the normal. By varying the hypergeometric parameters, we are able to produce distributions with different levels of skewness and kurtosis (section 8.6). Karl Pearson observed this, and he devised the Pearson system of probability-density functions by using a differential equation similar to the difference equation of the discrete hypergeometric distribution.
The process of fitting a Pearson probability-density function usually involves a large amount of arithmetic, but the work is far less onerous than it once was, now that electronic desk calculators are available with exponential, logarithmic and trigonometric keys.
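Since the Pearson parameters are determined by the first four moments, the starting point of any fit is the computation of those moments from the sample. A minimal sketch (using made-up data and the simple divide-by-n moment estimators, rather than any particular formulae from the text):

```python
def sample_moments(xs):
    """First four sample moments about the mean, plus the skewness
    and kurtosis coefficients derived from them."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # second moment (variance)
    m3 = sum((x - mean) ** 3 for x in xs) / n   # third moment
    m4 = sum((x - mean) ** 4 for x in xs) / n   # fourth moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return mean, m2, skewness, kurtosis

# A small symmetric sample: skewness should be zero.
mean, var, skew, kurt = sample_moments([1, 2, 2, 3, 3, 3, 4, 4, 5])
```

For this symmetric sample the mean is 3, the variance 4/3, the skewness exactly zero, and the kurtosis 2.25; the corresponding values for a normal distribution are skewness 0 and kurtosis 3.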
Summary In this chapter we describe several different methods for finding the real roots of non-linear equations. Each has its own advantages and disadvantages. The method of false position (section 3.2) is the only one which always converges. The Newton-Raphson method of section 3.6 for solving two simultaneous non-linear equations can be generalised to three or more variables.
Introduction
Non-linear equations often arise in scientific work. Graphical methods may be used to solve these equations, but the order of accuracy is usually such that the answer obtained can only be regarded as a first approximation. Trial-and-error methods can also be used. Iterative methods are usually most convenient. These are procedures whereby the nth approximation to the root, x_n, is obtained by evaluating a function of the earlier approximation x_(n-1).
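The general iterative scheme can be sketched as follows (an illustrative fixed-point iteration on the equation x = cos x, not one of the chapter's worked examples):

```python
import math

def fixed_point(g, x0, tol=1e-10, max_iter=200):
    """Iterate x_n = g(x_(n-1)) until successive approximations agree
    to within the tolerance tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Solve x = cos(x); the root lies near 0.739.
root = fixed_point(math.cos, 1.0)
```

This particular iteration converges because |d(cos x)/dx| < 1 near the root; as the chapter warns, such schemes can fail when that condition does not hold.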
The iterative procedures described in this chapter are usually quite satisfactory, but the reader is warned that they may fail in certain circumstances; for example, with multiple roots and close roots. The practical research worker will try a reasonable method. If this fails, he will try another method. If this also fails, he may be advised to seek the assistance of a numerical analyst.
Summary In this chapter we describe a variety of useful numerical techniques. We begin with the problem of regrouping grouped data, and then move on to Hardy's formula, which allows us to estimate the central ordinate of an area. The next important topic is a technique for fusing together two smooth curves to form a single curve. The method of steepest descent for minimising a function of several variables is described in section 7.5; this technique finds applications in many areas, including non-linear least squares (chapter 18). The chapter ends with a simple trick for increasing the data storage capacity of a programmable calculator.
Regrouping grouped data
Data are often grouped. This may be done to produce a concise table, because the data are sparse, or for some other reason. The user of a particular table may find that the grouping in the table is not suitable and he needs to regroup the data. This can be a very difficult task, particularly when the numbers in certain groups considerably exceed those in neighbouring groups. The best approach is to apply an interpolation method to the cumulative total.
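The cumulative-total idea can be sketched briefly. The function below uses hypothetical data and simple linear interpolation of the cumulative curve; in practice one of the more accurate interpolation formulae developed in the book would be substituted at the marked step.

```python
def regroup(boundaries, counts, new_boundaries):
    """Regroup grouped counts by interpolating the cumulative total.

    boundaries: group limits b0 < b1 < ... < bk
    counts:     counts in the intervals [b0,b1), ..., [b(k-1),bk)
    """
    # Cumulative totals at each original boundary.
    cum = [0]
    for c in counts:
        cum.append(cum[-1] + c)

    def cum_at(x):
        # Linear interpolation of the cumulative curve at x
        # (a better interpolation formula would be used in practice).
        if x <= boundaries[0]:
            return 0.0
        if x >= boundaries[-1]:
            return float(cum[-1])
        for i in range(len(boundaries) - 1):
            lo, hi = boundaries[i], boundaries[i + 1]
            if lo <= x < hi:
                frac = (x - lo) / (hi - lo)
                return cum[i] + frac * (cum[i + 1] - cum[i])

    # New counts are differences of the interpolated cumulative totals.
    return [cum_at(b) - cum_at(a)
            for a, b in zip(new_boundaries, new_boundaries[1:])]

# Hypothetical counts 30, 50, 40 in the groups [0,10), [10,20), [20,40).
new_counts = regroup([0, 10, 20, 40], [30, 50, 40], [0, 5, 15, 25, 40])
```

Whatever interpolation formula is used, the regrouped counts must sum to the original total, which provides a useful intermediate check.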
Example 7.1.1. Births to unmarried mothers in Australia in 1968 totalled 18980. The ages of these mothers are shown in table 7.1.1, in broad age groups. Estimate the number of births to unmarried mothers in each of the quinquennial age groups 12–16, 17–21, 22–26, …, 47–51.