The need to solve large linear systems of algebraic equations arises in almost every mathematical model, as illustrated by the examples below. In particular, we go into some detail regarding electrical networks. The most common method used to solve such linear systems is based on the factorization of the matrix into triangular factors, which is discussed and shown to be equivalent to the Gaussian elimination method (learned in almost any elementary course in algebra).
We present this method in a form that can be the basis for a computer algorithm. When one wants to solve very large systems of equations, it is important to know how the computational complexity grows with the size of the problem. This topic is also addressed, including the case where the matrix has a special structure in the form of a band matrix. The solution of tridiagonal systems is considered in particular detail. In addition, some basic dimension theory for matrices, such as the relation between the rank and the dimension of the nullspace, is discussed and derived using the factored form of the matrix.
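As a minimal sketch of the factorization viewpoint (assuming a dense NumPy array whose leading principal minors are nonzero, so that no pivoting is needed), Gaussian elimination produces the triangular factors L and U, after which the system is solved by forward and back substitution:

    import numpy as np

    def lu_factor(A):
        """Factor A = L U by Gaussian elimination (no pivoting)."""
        n = A.shape[0]
        L = np.eye(n)
        U = A.astype(float)
        for k in range(n - 1):
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]      # elimination multiplier
                U[i, k:] -= L[i, k] * U[k, k:]   # subtract multiple of row k
        return L, U

    def lu_solve(L, U, b):
        """Solve L U x = b by forward and back substitution."""
        n = len(b)
        y = np.zeros(n)
        for i in range(n):                        # forward substitution (L has unit diagonal)
            y[i] = b[i] - L[i, :i] @ y[:i]
        x = np.zeros(n)
        for i in reversed(range(n)):              # back substitution
            x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
        return x

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    b = np.array([3.0, 5.0])
    L, U = lu_factor(A)
    print(lu_solve(L, U, b))   # agrees with np.linalg.solve(A, b)

For a full n-by-n matrix the factorization costs on the order of n^3 operations, whereas for a band matrix only entries within the band need to be processed; for a tridiagonal matrix the work reduces to order n.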
Let us now consider various iterative methods for the numerical computation of the solution of a linear system of equations. Iterative methods for solving linear systems (used originally by Gauss in 1823, Liouville in 1837, and Jacobi in 1845) embody an approach quite different from that behind direct methods such as Gaussian elimination (see Chapter 1). In 1823 Gauss wrote, “Fast jeden Abend mache ich eine neue Auflage des Tableau, wo immer leicht nachzuhelfen ist. Bei der Einförmigkeit des Messungsgeschäfts gibt dies immer eine angenehme Unterhaltung; man sieht daran auch immer gleich, ob etwas Zweifelhaftes eingeschlichen ist, was noch wünschenswert bleibt usw. Ich empfehle Ihnen diesen Modus zur Nachahmung. Schwerlich werden Sie je wieder direct eliminieren, wenigstens nicht, wenn Sie mehr als zwei Unbekannte haben. Das indirecte Verfahren läßt sich halb im Schlafe ausführen oder man kann während desselben an andere Dingen denken.” (Freely translated: “Almost every evening I make a new edition of the tableau, wherever it can easily be improved. Given the monotony of the surveying work, this always provides a pleasant diversion; one also sees at once whether anything doubtful has crept in, what still remains to be desired, and so on. I recommend this modus operandi to you for imitation. You will hardly ever eliminate directly again, at least not when you have more than two unknowns. The indirect procedure can be carried out half asleep, or one can think of other things while doing it.”)
In this chapter, we define the concept of positive definite matrices and present some properties of real symmetric positive definite matrices. Next, some particularly important properties of Schur complement matrices are discussed, condition numbers for positive definite matrices are analyzed, and some estimates of eigenvalues of generalized eigenvalue problems based on the Courant-Fischer theorem are derived. We conclude with a discussion of congruence transformations and quasisymmetric matrices.
The following definitions are introduced in this chapter:
Definition 3.1
(a) A matrix A is said to be positive definite (positive semidefinite) in ℂⁿ if its quadratic form is real and (Ax, x) > 0 [(Ax, x) ≥ 0] for all x ≠ 0, x ∈ ℂⁿ.
(b) A real matrix A is said to be positive definite (positive semidefinite) in ℝⁿ if (Ax, x) > 0 [(Ax, x) ≥ 0] for all x ≠ 0, x ∈ ℝⁿ.
Definition 3.2 A matrix A is said to be positive stable if all its eigenvalues have positive real parts.
Definition 3.3
(a) B = 1/2(A + A*) is called the Hermitian part of A.
(b) C = 1/2(A − A*) is called the anti-Hermitian part of A.
For a real matrix A,
(c) B = 1/2(A + Aᵀ) is called the symmetric part of A.
(d) C = 1/2(A − Aᵀ) is called the antisymmetric (also called the skew-symmetric) part of A.
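The following small numerical sketch (with an arbitrarily chosen matrix, purely for illustration) relates Definition 3.1(b) to Definition 3.3: for a real matrix A, the quadratic form satisfies (Ax, x) = (Bx, x), so A is positive definite in ℝⁿ exactly when its symmetric part B is.

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [3.0, 5.0]])                 # illustrative nonsymmetric matrix

    B = 0.5 * (A + A.T)                        # symmetric part, Definition 3.3(c)
    C = 0.5 * (A - A.T)                        # antisymmetric part, Definition 3.3(d)
    assert np.allclose(A, B + C)

    # (Cx, x) = 0 for every real x, hence (Ax, x) = (Bx, x); positive
    # definiteness of A in R^n can therefore be checked via the eigenvalues
    # of the symmetric part B.
    print(np.all(np.linalg.eigvalsh(B) > 0))   # True for this A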
Algorithms for the solution of linear systems of algebraic equations arise in one way or another in almost every scientific problem. This is because such systems are of a fundamental nature: for example, nonlinear problems are typically reduced to a sequence of linear problems, and differential equations are discretized into finite-dimensional systems of equations.
The present book deals primarily with the numerical solution of linear systems. The solution algorithms considered are mainly iterative methods. Some results related to the estimation of eigenvalues (of importance, for instance, for estimating the rate of convergence of iterative solution methods) are also presented. Both the algorithms and their theory are discussed. Many phenomena that can occur in the numerical solution of the above problems require a good understanding of the theoretical background of the methods. This background is also necessary for the further development of algorithms. It is assumed that the reader has a basic knowledge of linear algebra, such as properties of sets of linearly independent vectors, elementary matrix algebra, and basic properties of determinants.
The first six or seven chapters and Appendix A can be (and have been) used as a textbook for an introductory course in numerical linear algebra, but this material demands students who are not afraid of theory. The theory is presented so that it can be followed even in self-study.
A crucial task in the construction of efficient basic iterative methods is the choice of a splitting of the given matrix, A = C − R, into two matrices, C and R. At each iteration step of such a method, we must solve a linear system with C. This matrix is sometimes called a preconditioning matrix; it must be simple (in a certain respect) but still effective in increasing the rate of convergence. The case where the corresponding preconditioning matrix is triangular or block-triangular and A is a so-called M-matrix is of particular importance because it appears frequently in practice. Let us first study some general properties of M-matrices and general families of convergent splittings. Several principles for comparing the rates of convergence of different splittings are also presented.
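As a minimal sketch of such a splitting (the Jacobi splitting, in which C is simply the diagonal of A; the matrices are assumed here to be dense NumPy arrays with nonzero diagonal):

    import numpy as np

    def jacobi_iteration(A, b, x0, maxit=100, tol=1e-10):
        """Basic iterative method for the splitting A = C - R with C = diag(A):
        at every step solve C x_new = R x + b."""
        C = np.diag(np.diag(A))
        R = C - A
        x = x0.astype(float)
        for _ in range(maxit):
            x_new = np.linalg.solve(C, R @ x + b)
            if np.linalg.norm(x_new - x) < tol:
                return x_new
            x = x_new
        return x

The iteration converges when the spectral radius of C⁻¹R is less than one; this is the case, for instance, when A is strictly diagonally dominant.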
Even if a given matrix is not an M-matrix, we show next that any symmetric positive definite matrix can be reduced to a Stieltjes matrix by using a method of diagonal compensation of reduced positive entries. Splittings for the reduced matrix can then be used as splittings for the original matrix.
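A minimal sketch of one common form of this compensation (assuming a symmetric positive definite NumPy matrix; the exact variant analyzed in the chapter may differ): each positive off-diagonal entry is set to zero and added to the diagonal entry of its row, so the modification adds a positive semidefinite term and the result is a Stieltjes matrix, i.e. symmetric positive definite with nonpositive off-diagonal entries.

    import numpy as np

    def diagonal_compensation(A):
        """Zero out positive off-diagonal entries and compensate on the diagonal.
        For symmetric A, each removed pair a_ij = a_ji > 0 contributes
        a_ij * (e_i - e_j)(e_i - e_j)^T, a positive semidefinite correction, so
        the modified matrix M satisfies M >= A in the positive semidefinite
        ordering and has only nonpositive off-diagonal entries."""
        M = A.astype(float)
        n = M.shape[0]
        for i in range(n):
            for j in range(n):
                if i != j and M[i, j] > 0:
                    M[i, i] += M[i, j]
                    M[i, j] = 0.0
        return M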
By introducing a relaxation parameter (ω), as in the successive overrelaxation (SOR) method, we show that a proper choice of this parameter can sometimes speed up the rate of convergence by an order of magnitude. It turns out that the theory for this method can be based on some of the results already derived in the previous chapter.
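A minimal sketch of the SOR iteration itself (componentwise form, again for a dense NumPy matrix with nonzero diagonal; the choice omega = 1 recovers the Gauss-Seidel method):

    import numpy as np

    def sor(A, b, x0, omega=1.5, maxit=200, tol=1e-10):
        """Successive overrelaxation: corresponds to the splitting A = C - R
        with C = (1/omega) * D + L, where D is the diagonal and L the strictly
        lower triangular part of A."""
        n = len(b)
        x = x0.astype(float)
        for _ in range(maxit):
            x_old = x.copy()
            for i in range(n):
                sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
                x[i] = (1 - omega) * x_old[i] + omega * (b[i] - sigma) / A[i, i]
            if np.linalg.norm(x - x_old) < tol:
                break
        return x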
For various applications, we need to quantify errors and distances; that is, we need to measure the size of a vector and of a matrix. We shall find that the Euclidean measure of a vector is not the only one that is appropriate in practice. Accordingly, we introduce vector and matrix norms, which are real-valued functions of vectors and matrices, respectively. It is shown how to calculate matrix norms associated with certain vector norms and how to apply them in the estimation of eigenvalues. We shall find that properties of positive definite matrices play a crucial role in the calculation of the matrix norm associated with the Euclidean vector norm. We also show that matrix norms are useful when estimating inherent errors (errors caused by errors in the given data) in solutions of systems of linear algebraic equations. Finally, it is shown how to estimate the errors in eigenvalue computations by a posteriori estimates and how to compute sequences of approximations of eigenvalues by matrix power methods. The following definitions are introduced in this appendix.
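A minimal sketch of these quantities in NumPy (the matrix and vector are arbitrary illustrations; the perturbation bound quoted in the comment is the standard first-order estimate in terms of the condition number):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
    x = np.array([1.0, -2.0])

    # Vector norms and the matrix norm induced by the Euclidean vector norm
    # (the spectral norm, i.e. the largest singular value).
    print(np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf))
    print(np.linalg.norm(A, 2))

    # Inherent (data) errors: if A x = b and (A + dA)(x + dx) = b + db, then to
    # first order ||dx||/||x|| <= cond(A) * (||dA||/||A|| + ||db||/||b||).
    print(np.linalg.cond(A, 2))

    # Power method: successive approximations of the eigenvalue of largest modulus.
    v = np.ones(2)
    for _ in range(50):
        v = A @ v
        v /= np.linalg.norm(v)
    print(v @ (A @ v))        # Rayleigh quotient, approximates the dominant eigenvalue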
In the two previous chapters, it was shown that conjugate gradient-type methods have many advantages, one being simplicity of implementation: no method parameter need be estimated, and the method consists mainly of matrix-vector multiplications and vector operations. In the present chapter, we shall analyze the rate of convergence of these methods. This will be done not only for the case of a symmetric positive definite matrix but also for general matrices. First, it will be shown that if the matrix is normal, then the norm of the residual can be estimated by the error in a best polynomial approximation problem. It will then be shown that the rate of convergence depends on the distribution of the eigenvalues and, to some extent, also on the initial residual. If the matrix is not normal, the estimate also involves the condition number of the matrix that transforms the given matrix to Jordan canonical form. In general, we are unable to give exact estimates of the rate of convergence, but we shall derive various upper bounds using different methods of estimation.
The rate of convergence can be measured in various norms. When we use a relative measure, say the ratio of the norms of the current and initial residuals, it will be shown that the condition number of the matrix can frequently be used to give a sufficiently accurate estimate.
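For orientation, the standard estimate of this type for the conjugate gradient method applied to a symmetric positive definite matrix (stated here in the A-norm of the error; the chapter derives bounds of this kind in more general settings) reads ‖x − x_k‖_A ≤ 2((√κ − 1)/(√κ + 1))^k ‖x − x_0‖_A, where κ = κ(A) is the spectral condition number. The number of iterations needed to reduce the error by a fixed factor thus grows at most proportionally to √κ.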
As we saw in the previous chapter, there exist two types of methods with short recurrence relations for solving systems of equations: (1) minimization algorithms, such as the conjugate gradient-type algorithms, and (2) Lanczos-type algorithms. The short recurrence and the minimization properties have been shown to hold for the conjugate gradient methods for matrices that are self-adjoint and positive definite w.r.t. the inner product used in the algorithm. The short recurrence holds for the biconjugate gradient-type Lanczos algorithms also for nonsymmetric matrices, but these algorithms can break down when the matrix is indefinite.
In this chapter, it will first be shown that such short recurrence relations for the conjugate gradient minimization type algorithms exist for a broader class of matrices, the H-normal class w.r.t. the initial vector. This extends the applicability of these methods. However, many matrices occurring in practice belong to still more general classes. On the other hand, the Lanczos-type algorithms do not have a minimization property as a rule and, as just said, can even break down.
We shall also analyze a general class of methods based on minimizing the least-squares norm of the residual (w.r.t. an inner product) and using a set of search directions (in general, orthogonal w.r.t. another inner product).
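One concrete instance of such a method is sketched below (a generalized conjugate residual (GCR)-type iteration, written for dense NumPy matrices; it is offered as an illustration of the residual-minimization principle, not as the specific algorithm analyzed in the chapter). The Euclidean norm of the residual is minimized, while the search directions are kept orthogonal w.r.t. the AᵀA inner product, i.e. the vectors A pᵢ are mutually orthogonal.

    import numpy as np

    def gcr(A, b, x0, maxit=50, tol=1e-10):
        """Generalized conjugate residual sketch: minimizes ||b - A x|| in the
        Euclidean norm over the search space; the vectors A p_i are kept
        mutually orthogonal (modified Gram-Schmidt)."""
        x = x0.astype(float)
        r = b - A @ x
        P, AP = [], []
        for _ in range(maxit):
            if np.linalg.norm(r) < tol:
                break
            p, Ap = r.copy(), A @ r
            for pi, Api in zip(P, AP):          # orthogonalize A p against previous A p_i
                beta = (Ap @ Api) / (Api @ Api)
                p -= beta * pi
                Ap -= beta * Api
            alpha = (r @ Ap) / (Ap @ Ap)        # minimizes ||r - alpha * A p||
            x += alpha * p
            r -= alpha * Ap
            P.append(p)
            AP.append(Ap)
        return x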