A quantum Monte Carlo method is simply a Monte Carlo method applied to a quantum problem. What distinguishes a quantum Monte Carlo method from a classical one is the initial effort necessary to represent the quantum problem in a form that is suitable for Monte Carlo simulation. It is in making this transformation that the quantum nature of the problem asserts itself not only through such obvious issues as the noncommutativity of the physical variables and the need to symmetrize or antisymmetrize the wave function, but also through less obvious issues such as the sign problem. Almost always, the transformation replaces the quantum degrees of freedom by classical ones, and it is to these classical degrees of freedom that the Monte Carlo method is actually applied. Succeeding chapters present and explain many of the quantum Monte Carlo methods being successfully used on a variety of quantum problems. In Chapters 1 and 2 we focus on discussing what the Monte Carlo method is and why it is useful.
The Monte Carlo method
The Monte Carlo method is not a specific technique but a general strategy for solving problems too complex to solve analytically or too intensive numerically to solve deterministically. Often a specific strategy incorporates several different Monte Carlo techniques. In what is likely the first journal article to use the phrase “Monte Carlo,” Metropolis and Ulam (1949) discuss this strategy. To paraphrase them,
The Monte Carlo method is an iterative stochastic procedure, consistent with a defining relation for some function, which allows an estimate of the function without completely determining it.
This is quite different from the colloquialism, “a method that uses random numbers.” Let us examine the definition piece by piece. A key point will emerge.
Ulam and Metropolis were presenting the motivation and a general description of a statistical approach to the study of differential and integro-differential equations. These equations were their “defining relation for some function.” The “function” was the solution of these equations. This function is of course unknown a priori. Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller (1953) a few years later would propose a statistical approach to the study of equilibrium statistical mechanics. The defining relation there was a thermodynamic average of a physical quantity over the Boltzmann distribution. The function was the physical quantity, and the unknown its average.
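Schematically, this defining relation is the canonical average
$$\langle A \rangle = \frac{\sum_{C} A(C)\, e^{-E(C)/k_B T}}{\sum_{C} e^{-E(C)/k_B T}},$$
where the sums run over all configurations C of the system and E(C) is the energy of configuration C. The Monte Carlo method estimates ⟨A⟩ by sampling configurations from the Boltzmann distribution, without ever determining the average completely, that is, without visiting every configuration.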
In this chapter, we discuss the fixed-node and constrained-path Monte Carlo methods for computing the ground state properties of systems of interacting electrons. These methods are arguably the two most powerful ones presently available for doing such calculations, but they are approximate. By sacrificing exactness, they avoid the exponential scaling of the Monte Carlo errors with system size that typically accompanies the simulation of systems of interacting electrons. This exponential scaling is called the Fermion sign problem. After a few general comments about the sign problem, we outline both methods, noting points of similarity and difference, plus points of strength and weakness. We also discuss the constrained-phase method, an extension of the constrained-path method, which controls the phase problem that develops when the ground state wave function cannot be real.
Sign problem
The “sign problem” refers to the exponential increase of the Monte Carlo errors with increasing system size or decreasing temperature (e.g., Loh et al., 1990, 2005; Gubernatis and Zhang, 1994) that often accompanies a Markov chain simulation whose limiting distribution is not everywhere positive. Such a case generally arises in simulations of Fermion and frustrated quantum-spin systems. The problem seems so inherent to Monte Carlo simulations of Fermion systems that to many the phrase “the sign problem” is almost synonymous with “the Fermion sign problem.”
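A standard way to quantify the difficulty is to absorb the sign into the estimator: when the sampling weight w(C) is not everywhere positive, one samples from |w(C)| and measures
$$\langle A \rangle = \frac{\langle A\, s\rangle_{|w|}}{\langle s \rangle_{|w|}}, \qquad s(C) = \operatorname{sgn} w(C), \qquad \langle s \rangle_{|w|} \sim e^{-\beta N \Delta f},$$
where the average sign in the denominator typically decays exponentially, with β the inverse temperature, N the system size, and Δf a difference of free-energy densities. The relative statistical error of the ratio therefore grows exponentially with βN.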
Explanations for the cause of the sign problem vary and are still debated. In this chapter, we choose to summarize two explanations that seem to connect the causes in ground state Fermion simulations in the continuum and on the lattice. The sign problem, of course, is not limited to ground state calculations or even to Fermion simulations. While the cause we discuss in Sections 11.2 and 11.3 focuses on the low-lying states of diffusion-like operators, several topological pictures have been proposed (Muramatsu et al., 1992; Samson, 1993; Gubernatis and Zhang, 1994; Samson, 1995). Some of these discussions are framed in the context of the zero- and finite-temperature determinant methods (Muramatsu et al., 1992; Gubernatis and Zhang, 1994). Others proceed more analytically from a Feynman path-integral point of view (Samson, 1993, 1995). Some apply to particles with statistics other than Fermions. The presentation given here is the one appropriate for the Monte Carlo methods discussed in this chapter.
The presence of dynamical information is a feature distinguishing a finite-temperature quantum Monte Carlo simulation from a classical one. We now discuss numerical methods for extracting this information that use techniques and concepts borrowed from an area of probability theory called Bayesian statistical inference. The use of these techniques and concepts provided a solution to the very difficult problem of analytically continuing imaginary-time Green's functions, estimated by a quantum Monte Carlo simulation, to the real-time axis. Baym and Mermin (1961) proved that a unique mapping between these functions exists. However, executing this mapping numerically, with a simulation's incomplete and noisy data, transforms the problem into one without a unique solution and thus into a problem of finding a “best” solution according to some reasonable criterion. Instead of executing the analytic continuation between imaginary- and real-time Green's functions, thereby obtaining real-time dynamics, we estimate the experimentally relevant spectral density function these Green's functions share. We present three “best” solutions and emphasize that making the simulation data consistent with the assumptions of the numerical approach is a key step toward finding any of these best solutions.
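For a fermionic single-particle Green's function, for example, the imaginary-time data and the spectral density are related by
$$G(\tau) = \int_{-\infty}^{\infty} d\omega\, \frac{e^{-\tau\omega}}{1 + e^{-\beta\omega}}\, A(\omega), \qquad 0 \le \tau < \beta.$$
The numerical task is to infer A(ω) from noisy, incomplete estimates of G(τ); because the kernel decays nearly exponentially in ω, this inversion is severely ill-posed.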
Preliminary comments
The title of this chapter, “Analytic Continuation,” is unusual in the sense that it describes the task we wish to accomplish instead of the method we use to accomplish it. If we used the name of the method, the title would be something like “Bayesian Statistical Inference Using an Entropic Prior.” A shorter title would be “The Maximum Entropy Method.” We hope by the end of the chapter the reader will agree that using the short title is perhaps too glib and the longer one has meaningful content.
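In outline, the method selects the spectral density A(ω) that maximizes
$$Q[A] = \alpha S[A] - \tfrac{1}{2}\chi^2[A], \qquad S[A] = \int d\omega \left[ A(\omega) - m(\omega) - A(\omega)\ln\frac{A(\omega)}{m(\omega)} \right],$$
where χ² measures the misfit between A and the simulation data, m(ω) is a default model expressing prior knowledge, and the parameter α controls the competition between the entropic prior and the data.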
The methods to sample from a discrete probability for n events, presented in Algorithms 1 and 2, become inefficient when n is large. To devise an efficient algorithm for large n, we must distinguish the case where the list of probabilities is constantly changing from the case where it remains unchanged. In the latter case, there are several ways to boost efficiency by performing a modestly expensive operation once and then using more efficient operations in subsequent samplings. One such approach is to sort the list of probabilities, an operation generally requiring O(n log n) operations, and then use a bisection method, requiring O(log n) operations, to select the events from the list. What we really want, however, is a method requiring only O(1) operations. Walker's alias algorithm (Walker, 1977) has this property.
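For reference before turning to the alias method, the following Python sketch (our illustration; the names are not from the algorithms above) shows the O(log n) approach for a fixed list of weights: an O(n) setup builds a cumulative table, and each draw then bisects it.

```python
import bisect
import random

# One-time O(n) setup: cumulative table of the (unnormalized) weights.
weights = [6, 1, 3, 2, 8]
cumulative = []
running_total = 0.0
for w in weights:
    running_total += w
    cumulative.append(running_total)

def sample_event():
    """Draw an event index in O(log n) by bisecting the cumulative table."""
    r = random.random() * running_total   # uniform on [0, total)
    return bisect.bisect_right(cumulative, r)
```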
Suppose we need to repeatedly choose one of five colors randomly – red, blue, yellow, pink, and green – with weights 6, 1, 3, 2, and 8 (Fig. A.1). We first generate five sticks with lengths 6, 1, 3, 2, and 8 and paint each stick with the color that it represents. Next, we define a linked list of the sticks that are longer than the average (which is 4), and another for the sticks that are shorter than the average. In the present case, the long-stick list is (“red” → “green”), and the short-stick list is (“blue” → “yellow” → “pink”). We pick the first item from each list and cut the longer (red) stick into two pieces in such a way that if we join one of the two to the shorter (blue) stick, we obtain an average-length stick. As a result, we are left with a red stick of length 3 and a joint stick of length 4. Since the red stick is now shorter than the average length, we remove it from the long-stick list and append it to the short-stick list. On the other hand, the blue stick has become part of an average-length stick, so we remove it from the short-stick list. Then, we pick the first item from each list and repeat the same operations again and again. When finished, we have five sticks of average length, some with a single color and others with two colors.
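One way to realize this stick-cutting construction in Python is sketched below (the function names are ours, not Walker's). The setup costs O(n); each draw then picks a slot uniformly and chooses between that slot's at most two colors, an O(1) operation.

```python
import random

def build_alias_table(weights):
    """O(n) setup: rescale the sticks so the average length is 1, then pair
    each short stick with a piece cut from a long one."""
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]   # average stick length is now 1
    prob = [1.0] * n    # length of the slot's own color
    alias = [0] * n     # the second color that tops up the slot
    short = [i for i, s in enumerate(scaled) if s < 1.0]
    long_ = [i for i, s in enumerate(scaled) if s >= 1.0]
    while short and long_:
        s, l = short.pop(), long_.pop()
        prob[s] = scaled[s]            # slot s keeps its own piece ...
        alias[s] = l                   # ... plus a piece of the long stick l
        scaled[l] -= 1.0 - scaled[s]   # cut that piece off stick l
        (short if scaled[l] < 1.0 else long_).append(l)
    return prob, alias                 # leftover slots keep prob = 1.0

def sample(prob, alias):
    """O(1) draw: pick a slot uniformly, then one of its two colors."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]

# The five-color example: red, blue, yellow, pink, green with weights 6,1,3,2,8.
colors = ["red", "blue", "yellow", "pink", "green"]
prob, alias = build_alias_table([6, 1, 3, 2, 8])
print(colors[sample(prob, alias)])
```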
Our discussion of Markov chains, with the exception of mentioning the Metropolis and heat-bath algorithms, has so far been very general with little contact with issues and opportunities related to specific applications. In this chapter, we recall that our target is many-body problems defined on a lattice and introduce several frameworks exploiting what is special about Markov processes for these types of problems. We consider here classical many-body problems, using the Ising model as the representative. Our discussion will be extended to various quantum many-body problems and algorithms in subsequent chapters.
Many-body phase space
The numerical difficulty of studying many-body problems on a lattice arises from the fact that their phase space Ω is a direct product of many phase spaces of local degrees of freedom. Generally, a local phase space is associated with each lattice site. If n is the size of this phase space and N is the number of lattice sites, then the number of states |Ω| available to the whole system is n^N. In other words, the number of states in the phase space grows exponentially fast with the physical size of the system. For example, in the Ising model, the Ising spin s_i on each site can take one of the two values ±1, and hence the number of states in the total phase space is 2^N.
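A few lines of Python (our illustration) make this scaling concrete for the Ising case:

```python
from itertools import product

# Enumerate all 2**N Ising configurations of a small system; each site
# carries a spin s_i that is either +1 or -1.
N = 4
states = list(product([+1, -1], repeat=N))
print(len(states))   # 2**4 = 16 configurations

# The count doubles with every added site: at N = 64 it is already
# 2**64, about 1.8e19, far beyond any exhaustive enumeration.
```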
In Chapter 1, we noted that this exponential scaling thwarts deterministic solutions and is a reason why we use the Monte Carlo method. The exponentially large number of states implies that the enumeration of all states, which requires a computational effort proportional to the size of the phase space, is not an option for a problem with a large number of sites. As discussed in Section 2.7, the Monte Carlo method generally samples from a compressed phase space, avoiding the need to solve the entire problem. However, even with this advantage, can we reduce the computational effort to a manageable level?
An obvious requirement is that we can equilibrate the simulation in a reasonable amount of computer time. We know that the Markov process converges to a stationary state sampling some distribution (Section 2.4), and for detailed balance algorithms, Rosenbluth's theorem (Section 2.6) guarantees monotonic convergence.
Recent trends in computer hardware have made modern computers parallel computers; that is, the power of today's machines comes from the number of processors they contain. In order to use these powerful machines, we must split the algorithmic tasks into pieces and assign them to a large number of processors. In many cases, this requires nontrivial modifications of the algorithm. In this chapter, we discuss several basic concepts about parallelizing an algorithm and illustrate them in the context of loop/cluster identification.
Parallel architectures
The key issue in parallel computation is the distribution of computer memory: how much memory is available, and at what access speed. Memories are organized in a hierarchical structure: L1 cache, L2 cache, main memory, and so on, all with different access speeds. The access speed also depends on the physical distance between the computing unit and the memory block.
Discussing how to fine-tune computer programs, taking all machine details into account, is clearly beyond the scope of this book. Therefore, in the following, we focus our discussion of parallel computers on two common types of architectures (Fig. 13.1): shared memory and distributed memory. In either case, we assume that the parallel computer has N_p processors.
In most parallel computers available today, each local memory block is directly accessible by only a small number of processors in the local processor block that is physically closest to it. To access a remote block of memory, a processor must communicate with another processor that has direct access to that block. In some sense, the shared-memory architecture is a model for a local processor block. Alternatively, it can be regarded as a model of the whole computer system in which the communication cost is negligible. In this model, we do not have to account for the process of communicating between the processors. We simply assume that they are all reading and writing to the same block of memory, and each processor immediately knows when something is written to memory by another. The distributed-memory architecture, on the other hand, is a model in which every processor monopolizes access to the block of memory it possesses, and for a processor to obtain information written in another processor's memory, the owner of that information must explicitly send it the contents of this memory.
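As a concrete illustration of the distributed-memory model, the sketch below uses mpi4py (our choice of library, not one prescribed by the text): rank 0 owns some data, and rank 1 can obtain it only through an explicit message.

```python
# Run with, e.g.:  mpiexec -n 2 python send_recv_demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # This data lives in rank 0's local memory; no other process can read
    # it directly, so the owner must send it explicitly.
    data = {"spins": [1, -1, 1, 1]}
    comm.send(data, dest=1, tag=0)
elif rank == 1:
    data = comm.recv(source=0, tag=0)
    print("rank 1 received:", data)
```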