Impact Statement
For the past several decades, it has been known that quantum mechanics gives rise to a computational paradigm that offers spectacular speedups for certain tasks relative to the best possible conventional, “classical” algorithms. In recent years, quantum computation has become a physical reality, and simultaneously there has been an explosion in the volume of data that must be processed by computers. Thus the tantalizing possibility of quantum computation aiding the processing of large classical datasets has gained widespread interest. This article takes a broad look at the prominent proposals for how quantum computation may impact data-centric applications. The pros and cons of “quantum machine learning” algorithms are discussed, and quantum Monte Carlo integration is identified as a potential source of (relatively) near-term quantum advantage. Finally, some speculative predictions are given for when such quantum advantage may materialize.
1. Quantum Computing: A Very Brief Introduction
The conception of quantum computing is usually attributed to Richard Feynman, who in 1981 speculated that simulating the behavior of a quantum mechanical system would require a computer that was itself somehow quantum mechanical in nature (Feynman, 1982; Preskill, 2021). Manin (1980) and Benioff (1980) also espoused similar ideas at around the same time. It was David Deutsch who in 1985 then laid the groundwork for quantum computing as we now know it, by formalizing a quantum mechanical model of computation, and posing well-defined mathematical problems where quantum computing offers a clear computational advantage (Deutsch, 1985). This in turn spawned a great profusion of activity in the embryonic field of quantum computing in the late 1980s and early 1990s, leading to what remain to this day two of the crowning achievements of the field: in 1994, Peter Shor proposed a quantum algorithm for factoring in polynomial time (Shor, 1994) and in 1996, Lov Grover proposed an algorithm to search an unstructured database in time proportional to the square root of the database size (Grover, 1996).
Unstructured search (in this context) is the problem where we have some $ N={2}^n $ elements, indexed by $ {\left\{0,1\right\}}^n $ , to search through, and a “function” $ f $ such that $ f(x)=1 $ for exactly one $ x\in {\left\{0,1\right\}}^n $ and $ f(x)=0 $ otherwise. “Unstructured” means there is no algorithmic shortcut ( $ f $ is a function in the technical sense only, which does not imply that it can be represented as some simple algebraic expression), and hence classically the best (only) strategy is exhaustive search, which requires $ f(x) $ to be evaluated for all $ N $ elements at worst, and $ N/2 $ elements on average. Quantumly, we can prepare a superposition of all possible $ n $-bit strings, and hence “query” $ f $ for all possible $ x $ in a single step; however, this does not imply that quantum unstructured search completes in $ \mathcal{O}(1) $ operations. In fact, as the answer is encoded in a quantum state, it turns out that at least of order $ \sqrt{N} $ operations are needed to extract it, a lower bound that Grover’s search algorithm achieves with $ \mathcal{O}\left(\sqrt{N}\right) $ operations. This improvement from $ \mathcal{O}(N) $ classical operations to $ \mathcal{O}\left(\sqrt{N}\right) $ quantum operations is commonly referred to as a “quadratic advantage.”
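To make the square-root scaling concrete, the following is a minimal classical state-vector simulation of Grover's search, offered as an illustrative sketch rather than as code from any cited work. The function name `grover_search`, the problem size, and the marked index are assumptions chosen for the example; the simulation itself costs time linear in $ N $ per iteration, so it illustrates the iteration count only, not a speedup.

```python
import math
import numpy as np

def grover_search(n_qubits, marked):
    """Simulate Grover's search classically.

    Returns (number of Grover iterations, probability of measuring `marked`).
    """
    N = 2 ** n_qubits
    # Start in the uniform superposition over all N basis states.
    state = np.full(N, 1.0 / math.sqrt(N))
    # Roughly (pi / 4) * sqrt(N) iterations maximise the marked amplitude.
    iterations = math.floor(math.pi / 4 * math.sqrt(N))
    for _ in range(iterations):
        state[marked] *= -1.0               # oracle: flip the sign where f(x) = 1
        state = 2.0 * state.mean() - state  # diffusion: inversion about the mean
    return iterations, float(state[marked] ** 2)

# Hypothetical instance: N = 2^10 = 1024 elements, marked element at index 123.
iterations, p_success = grover_search(n_qubits=10, marked=123)
print(iterations, p_success)
```

For this instance the loop runs 25 times, compared with the roughly 512 evaluations of $ f $ that exhaustive classical search needs on average.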
While the quadratic advantage is extremely valuable, the fact that quantum computing enables the simultaneous querying of $ f $ for an exponential number of $ x $ (that is, we say the problem size is “ $ n $ ” and we query $ f $ for all $ N={2}^n $ possible $ n $-bit strings in superposition) dangles the tantalizing possibility of exponential computational advantages. To see such advantages, we must move on from unstructured search to problems with some specific structure that can be attacked by quantum, but not classical, algorithms. The manner in which this structure is attackable by the quantum algorithm can be a little hard to grasp, but essentially amounts to the fact that the answer we are searching for is in some sense determined by all $ {2}^n $ queries, but in such a way that a quantum mechanical “interference” step (for which there is no analog in classical computation) can efficiently extract the solution. This is indeed the case for Shor’s factoring algorithm, where the Quantum Fourier Transform (QFT) performs this interference step (in fact many of the most prominent proposals for superpolynomial quantum advantage use the QFT). Classically the best factoring algorithm is the number field sieve (Lenstra et al., 1993), which has complexity $ \exp \left(\Theta \left({n}^{1/3}{\log}^{2/3}n\right)\right) $ , whereas Shor’s algorithm requires only $ \mathcal{O}\left({n}^2\log n\log \log n\right) $ operations, where the problem size, $ n $ , is the number of bits required to express the number being factored.
For the past 25 years, Shor’s and Grover’s algorithms have been the mighty pillars upon which many other proposals for quantum algorithms have been built, and the computational complexity thereof continues to provide some insight into the sorts of advantage we should expect from quantum algorithms: if the algorithm is tackling a task with little structure, then we expect a quadratic (or other polynomial) advantage; whereas if there is a structure that can be exploited by a quantum interference step (such as the QFT) then we can get a superpolynomial speedup. (Scott Aaronson recently posted a very nice and concise article about the role of structure in quantum speedups [Aaronson, 2022].)
Furthermore, an important point to note is that all of the “canonical” quantum algorithms presume an abstract model of quantum computation, which is innately noiseless (quantum noise occurs when the environment randomly perturbs the quantum state such that it departs from that predicted by the abstract model of quantum computation). It was therefore a substantial and highly important breakthrough when it was shown that real, noisy quantum hardware can, in principle, efficiently simulate the noiseless model of quantum computation, owing to the celebrated threshold theorem (Shor, 1996; Knill et al., 1998; Kitaev, 2003; Aharonov and Ben-Or, 2008). However, in practice, this still requires a quantum error-correction overhead that takes the noiseless model out of reach of near-term quantum hardware. In the past several years, significant attention has been given to the question of whether useful quantum advantage can be obtained by computing with noisy qubits, that is, without quantum error correction. For this setting, John Preskill coined the term “NISQ” (noisy intermediate-scale quantum [computer]) (Preskill, 2018), and it has become commonplace to speak of the “NISQ-era” (computing with noisy qubits) which will eventually give way to the “full-scale era” (when quantum error correction will mean that we can essentially treat the qubits as noiseless), although I shall later argue that this is something of a false dichotomy.
In general, both “NISQ” and “full-scale” quantum algorithms are usually formulated using the quantum circuit model,^{1} which is briefly introduced in Figure 1. It is common to use the circuit depth (the number of layers of operations) as a proxy for computational complexity, and in the case of NISQ algorithms, the circuit depth dictates the number of operations that must be performed with the state remaining coherent, that is, before the noise becomes too great and the information contained within the state is lost. So it follows that, when designing quantum algorithms with resource constraints in mind, it is important to keep the circuit depth as low as possible, in order to achieve the computation within the physical qubit coherence time (NISQ) or with as little error correction as possible (full-scale).
Figure 2 summarizes some of the most important breakthroughs in quantum computing, and for further information, the reader is directed to Nielsen and Chuang (2010), which remains the authoritative textbook on the subject. The Quantum Algorithm Zoo (Jordan, 2011) also provides a catalog of many suggested quantum algorithms, although the total number of algorithms can be somewhat misleading: many of the listed algorithms amount to different instances and applications of the same essential quantum speedup. Additionally, quantumalgorithms.org brings together many important quantum algorithms for data analysis (Luongo, 2022).
2. How Will Quantum Computing Help Me With All My Data?
We can see, even from the concise introduction above, that quantum computation, as it is conventionally broached, is very much bound up with the theory of computational complexity. However, when I speak to computational researchers from outside of quantum computing, invariably what they say to me is not, for example, “I am struggling with this computationally hard problem,” but rather they ask “how can quantum computing help me with all of my data?” For we are living through an era of unprecedented data generation, and this poses problems at every stage of the computational workflow. The most urgent question that researchers are asking of nascent computational technologies is how they can remedy these emerging and growing problems.
This is the challenge taken up by Aram Harrow in “Small quantum computers and large classical datasets” (Harrow, 2020), which proposes using the quantum computer to do computationally intensive model searches when substantial data reduction is possible on the “large classical dataset.” However, the question of what quantum computing can do to deal with “big data” more generally is tricky: loading data onto the quantum computer is well known to be a hard problem, not only with the small-scale quantum hardware that is available at present, but as a fundamental problem in principle, and one which, if we are not careful, could easily nullify the quantum advantage. This has in turn brought about something of a divide between “pessimists” who believe that the data-loading problem is fundamentally an insurmountable obstacle, and “optimists” who focus on the unquestionable computational benefits once the data is loaded, and assume that some solution will emerge to the data-loading problem itself.
The purpose of this article is to provide one answer to the motivating question of what quantum computing can do to help with the massive proliferation of data: an answer that is neither unduly pessimistic nor unduly optimistic, but rather is realistic, and that illuminates a plausible path ahead for the eventual integration of quantum computing into data-centric applications.
3. Quantum Computing and Machine Learning: A Match Made in Heaven?
In recent years, there has been an explosion of papers on “Quantum Machine Learning” (QML), and a cynic would say that this amounts to little more than a case of buzzword fusion to unlock funding sources and generate hype. But I am not a cynic—for one thing, some of the most respected researchers in quantum computing are working on QML—and there are (at least) two very good reasons to believe that quantum computing may ultimately offer significant computational advantages for machine learning tasks. These two reasons in turn inform two complementary approaches to QML.
One approach stems from the functional similarity between artificial neural networks (ANNs) and parameterized quantum circuits (PQCs) (Benedetti et al., 2019b), as shown in Figure 3. In particular, by virtue of the fact that we must always measure the quantum state to extract some information (and noting that measurement triggers a probabilistic “collapse” of the quantum superposition into one of some ensemble of possible states), a PQC is innately something we sample from, and is therefore, in a sense, analogous to an ANN trained as a generative model (Lloyd and Weedbrook, 2018; Benedetti et al., 2019a; Zoufal et al., 2019; Chang et al., 2021; Zoufal, 2021). (There are myriad proposals to use PQCs in place of ANNs for other learning tasks [Dong et al., 2008; Havlíček et al., 2019; Jerbi et al., 2021], but considering generative models suffices to illustrate my point here.) The original hope of quantum advantage in generative modeling stemmed from the fact that there is strong theoretical evidence for the existence of probability distributions from which samples can be prepared quantumly in polynomial time, but would require exponential time classically, for example, probability distributions sampled by IQP circuits (Bremner et al., 2010).
Indeed, most proposals and demonstrations of Quantum Supremacy are sampling experiments (Harrow and Montanaro, 2017; Arute et al., 2019). The ramifications for generative modeling are that, should the target distribution be some such “classically intractable” distribution to sample, then we would need an infeasibly large ANN to train a generative model thereof, but only a relatively small PQC.
However, in practice, this is perhaps a slightly oversimplistic outlook, because the datasets of interest in engineering and other typical applications will themselves have been generated by some “classical” process, and so are unlikely to have probability distributions that we expect to be hard to classically sample from. (For instance, we do not, in general, expect classical random processes such as financial time series to exhibit the sort of correlations seen in the measurement statistics of highly entangled quantum circuits.) To put it another way, even though PQCs have greater expressivity, it is not clear that this can be harnessed for any useful application. Compounding this apparently fundamental obstacle is the fact that PQCs are incredibly hard to train (Bittel and Kliesch, 2021), and the cost function landscape is overwhelmingly dominated by large, flat regions termed barren plateaus (McClean et al., 2018; Arrasmith et al., 2021; Cerezo et al., 2021; Thanasilp et al., 2021; Wang et al., 2021). Nevertheless, in spite of these apparent problems, there is some evidence that QML based on PQC training will yield useful quantum advantage in classical data science applications (Coyle et al., 2020; Hubregtsen et al., 2020; Shalaginov and Dubrovsky, 2022). Indeed, in spite of the question marks hanging over PQCs as ML models in terms of their trainability and expressivity, there remains hope that such models may still have greater power in terms of generalization capability (Schreiber et al., 2022).
So we turn to the second approach to QML, which builds on the ability, in principle, of quantum computers to perform certain linear algebra computations (exponentially) faster than the best classical counterpart, and in particular, suggests that this feature can be used to enhance certain machine learning and data science tasks. One recent paper has suggested that finding Betti numbers, a task in topological data analysis, may be feasible on NISQ machines (Akhalwaya et al., 2022), although the question of whether the Betti numbers for which the computation can be exponentially sped up are practically relevant has been raised (McArdle et al., 2022). Other than this, it is worth noting that, while the training of PQCs as ML models is championed by its proponents as a naturally NISQ application, the quantum enhancement of linear algebra computations is expected to require full-scale (fault-tolerant) quantum computers.
The most famous quantum algorithm for linear algebra is Harrow, Hassidim, and Lloyd’s algorithm for solving a linear system (ubiquitously known as “HHL”) (Harrow et al., 2009). Specifically, consider the system of linear equations:
$$ A\mathbf{x}=\mathbf{b}, \tag{1} $$
which is solved by inverting $ A $ and then premultiplying $ \mathbf{b} $ by $ {A}^{-1} $ to find $ \mathbf{x} $ . Classically this computation takes time that is worse than linear in the size of $ A $ (even if $ A $ is sparse), whereas in certain circumstances HHL runs in time that is only polylogarithmic in the size of $ A $ , thus giving an exponential improvement over the best classical algorithms. HHL leverages the fact that an $ n $-qubit quantum circuit is nothing more than a $ {2}^n\times {2}^n $ unitary matrix and thus, in a sense, a quantum computer is simply a machine that performs exponentially big matrix multiplications. When the matrix $ A $ is sparse and well-conditioned (the ratio of its largest to its smallest eigenvalues is not too big) then it is possible to construct the matrix operation $ {A}^{-1} $ as a quantum circuit. Moreover, this $ n $-qubit circuit is only polynomially deep (in $ n $ ) and hence the entire algorithm runs in time that is polylogarithmic in the size of the matrix, $ A $ (a full complexity analysis also accounts for the fact that each attempt at inversion only succeeds with a certain probability; however, the overall polylogarithmic complexity continues to hold, even when this is included).
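The quantities that govern whether HHL applies can be inspected classically on a toy instance. The sketch below is purely a classical illustration and does not implement HHL; the tridiagonal matrix, the random right-hand side, and all variable names are assumptions made for the example. It builds a sparse, well-conditioned $ {2}^n\times {2}^n $ system, computes the condition number, solves the system with NumPy, and normalises the vectors in the way a quantum state encoding would require.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3            # number of qubits (illustrative)
N = 2 ** n       # dimension of the linear system

# Hypothetical sparse, well-conditioned Hermitian matrix:
# tridiagonal and diagonally dominant, so its eigenvalues lie between 1 and 3.
A = 2.0 * np.eye(N) + 0.5 * (np.eye(N, k=1) + np.eye(N, k=-1))
b = rng.normal(size=N)

# Condition number: ratio of largest to smallest eigenvalue magnitude.
eigenvalues = np.linalg.eigvalsh(A)
kappa = float(np.abs(eigenvalues).max() / np.abs(eigenvalues).min())

# Classical solution of A x = b.
x = np.linalg.solve(A, b)

# A quantum algorithm would take b as a normalised state and return x
# only as a normalised state, not as an explicit list of N numbers.
b_state = b / np.linalg.norm(b)
x_state = x / np.linalg.norm(x)

print(kappa)
```

The final normalisation step is a reminder that the output of such an algorithm is a quantum state proportional to $ \mathbf{x} $ , from which only limited information can be read out per run.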
HHL does, however, suffer from a number of caveats, one of which is the model for access to the data: it is assumed that the quantum computer has access to $ \mathbf{b} $ as some quantum state $ \mid b\rangle $ , the preparation of which is not counted in the algorithm’s complexity. Indeed, one can immediately see that complexity at least linear in the size of $ \mathbf{x} $ (and hence the size of $ A $ ) would be incurred even to read $ \mathbf{x} $ , and so an overall polylogarithmic complexity can only be possible if the appropriate quantum state is pre-prepared. The question of the need for a reasonable data access model is one of the problems raised by Scott Aaronson when discussing the potential for quantum advantage in machine learning applications (Aaronson, 2015).
This issue was clarified and generalized by Ewin Tang, who showed that all proposed QML algorithms of this second approach can be dequantized if a classical algorithm is given commensurate data access (Tang, 2019; Chia et al., 2020a,b; Gilyén et al., 2022; Tang, 2021). “Dequantized” means that there is no exponential quantum advantage, although there could still be a practically useful polynomial advantage. That is, with Tang’s results, the quantum and classical algorithms both run in time polynomial in the logarithm of the size of the linear algebra objects in question; however, that polynomial may be of much higher degree for the classical algorithm, and thus the quantum algorithms may still provide a practically beneficial speedup. This was indeed the case for Tang’s original dequantization breakthrough (Tang, 2019), where she proposed a “quantum-inspired” classical version of the quantum recommendation system algorithm of Kerenidis and Prakash (2017).
Finally, it is also pertinent to note that HHL itself has only been dequantized for low-rank matrix inversion (Chia et al., 2018, 2020b): there is still an exponential quantum advantage when the matrix to be inverted is full (or close to full) rank. Indeed, fast matrix inversion is still seen as being a potential “killer application” of quantum computing, and is a fertile area of research in, for example, partial differential equation (PDE) solving, when a finite-difference approach can be used to turn the PDE into a system of linear equations (Berry et al., 2017; Childs and Liu, 2020; Lloyd et al., 2020; Childs et al., 2021; Liu et al., 2021).
4. Quantum Data: The Holy Grail?
We have seen that the drawbacks of QML do not entirely diminish its potential practical utility. However, what is in some ways even more notable is that, until now, we have solely been talking about classical data, so we may ask: “what about quantum data?” That is, what if we have some quantum sensing or metrology process delivering quantum states directly as training data?
Taking the two approaches to QML in turn, when manipulating quantum rather than classical data, it is certainly more reasonable to expect that there may be some fundamental reason why a QML model may be required. In particular, even if the quantum data is immediately measured to give a classical sample, in general, such a sample may exhibit “nonclassical” correlations that cannot be reproduced by any reasonable-sized classical algorithm (for instance, as already noted, the correlations present in measurements of highly entangled IQP circuits are believed to need exponentially large classical circuits to reproduce). Moreover, it has been shown that, in certain instances, barren plateaus are not present in generative modeling of quantum data (in this case, when the quantum state itself, not a measurement thereof, is delivered as training data to the model) (Kiani et al., 2022; Kieferova et al., 2021).
Turning to the second approach to QML, to provide commensurate data access when comparing classical and quantum algorithms, Tang’s dequantization results endow the classical algorithms with sample and query access to the data. Suppose we have some $ N $-element vector $ x $ : “query access” means the value $ {x}_i $ can be extracted (for any $ i $ ), and “sample access” means we sample a number, $ i $ , between 0 and $ N-1 $ with probability $ {x}_i/{\sum}_j{x}_j $ . If the quantum state is prepared from classical data then (as Tang asserts) it is reasonable to assume that sample and query access could be attained in about the same number of operations. If, however, the data is presented as a quantum state, then only sample access is available to a classical algorithm (sample access is obtained simply by measuring the quantum state in question). This in turn implies that, when the input is quantum data, the dequantization results no longer necessarily hold, and the possibility of exponential quantum advantage is upheld.
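The two access models can be sketched as a small data structure. The class below is a hypothetical illustration (the name `SampleQueryVector` and its methods are not taken from the dequantization literature) and follows the definition given in the text, with index $ i $ drawn with probability $ {x}_i/{\sum}_j{x}_j $ , so it assumes non-negative entries. Note that the dequantization literature usually states the sampling probability in terms of squared magnitudes relative to the squared norm; adapting the sketch to that convention only changes the line that computes `probs`.

```python
import numpy as np

class SampleQueryVector:
    """Toy classical vector offering query access and sample access."""

    def __init__(self, x, seed=0):
        self.x = np.asarray(x, dtype=float)
        if np.any(self.x < 0) or self.x.sum() == 0:
            raise ValueError("entries must be non-negative and not all zero")
        # Sampling distribution as defined in the text: x_i / sum_j x_j.
        self.probs = self.x / self.x.sum()
        self.rng = np.random.default_rng(seed)

    def query(self, i):
        """Query access: return the entry x_i."""
        return float(self.x[i])

    def sample(self):
        """Sample access: return an index i in 0..N-1 drawn from probs."""
        return int(self.rng.choice(len(self.x), p=self.probs))

vec = SampleQueryVector([1.0, 2.0, 3.0, 4.0])
print(vec.query(2), vec.sample())
```

When the data arrives as a quantum state, a classical algorithm can reproduce only the `sample` method (by measuring the state), which is the asymmetry the paragraph above describes.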
5. Monte Carlo or Bust?
Responding by basically saying “soon there may be even more data which is quantum in nature and thus intrinsically needs QML” is only really half an answer to the motivating question of what quantum computing can do to help with processing vast datasets. For the implicit emphasis in the question was on the data we already have, and expect in the immediate future. To answer this, it is helpful to step back and ask: what is it we want from these large datasets? Invariably, the aim will be to extract some quantities pertaining to the dataset as a whole and, moreover (even if it is not immediately thought of in these terms), such quantities will usually amount to some sort of expectation of the distribution that the data has been sampled from (or simple combinations of expectation values). For instance, obviously recognizable quantities such as the mean and higher moments are expectation values, but other quantities, such as various measures of risk, will also be found by computing an appropriate expectation. Additionally, quantities that are usually thought of not as expectations but rather as probabilities, such as the probability of rain in a weather forecast, will actually be found by numerically integrating over a number of marginal parameters.^{2}
Such a desideratum coincides with one of the (still relatively few) fundamental computational tasks that we know admits a provable quantum advantage, namely quantum Monte Carlo integration (QMCI) (Montanaro, 2015). Furthermore, significant progress has been made to allow such an advantage to be realized with minimal quantum resources.
Monte Carlo integration (MCI) is the process of numerically estimating some expectation value
$$ \mathbb{E}\left(f(x)\right)=\int f(x)\,p(x)\,\mathrm{d}x, \tag{2} $$
which cannot be evaluated analytically, but where the probability distribution, $ p(x) $ , can be sampled from (and $ f(.) $ is some function). Notably, on any digital computer (classical or quantum) the integral will actually be a sum, owing to the necessary quantization and truncation of the support of $ p(x) $ , thus:
$$ \mathbb{E}\left(f(x)\right)=\sum_x f(x)\,p(x)\approx \frac{1}{q}\sum_{i=1}^q f\left({X}_i\right), \tag{3} $$
where $ {X}_i\sim p(x) $ are $ q $ i.i.d. samples. The approximate equality represents the process of MCI, and in particular, the mean squared error (MSE) is $ \mathcal{O}\left({q}^{-1}\right) $ . When performing high-dimensional integrals numerically, MCI is the most efficient method. (Note that quasi-Monte Carlo (Morokoff and Caflisch, 1995) and other non-i.i.d. classical methods have better convergence in $ q $ , but suffer the curse of dimensionality, meaning that the complexity grows exponentially in the number of dimensions, and hence are inefficient for high-dimensional integrals.)
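The $ \mathcal{O}\left({q}^{-1}\right) $ behavior of the MSE can be checked empirically. The sketch below is an illustrative example under assumed choices: $ p(x) $ is a standard normal distribution and $ f(x)={x}^2 $ , so the true expectation is 1; the helper names `mc_estimate` and `empirical_mse` are hypothetical. Increasing the number of samples by a factor of 100 should reduce the empirical MSE by roughly the same factor.

```python
import numpy as np

def mc_estimate(f, sampler, q, rng):
    """Classical MCI: average f over q i.i.d. samples from p(x)."""
    return float(np.mean(f(sampler(rng, q))))

def empirical_mse(f, sampler, true_value, q, trials, rng):
    """Mean squared error of the q-sample estimator over repeated trials."""
    estimates = np.array([mc_estimate(f, sampler, q, rng) for _ in range(trials)])
    return float(np.mean((estimates - true_value) ** 2))

rng = np.random.default_rng(1)
f = lambda x: x ** 2                          # assumed integrand
sampler = lambda rng, q: rng.normal(size=q)   # assumed p(x): standard normal

mse_small = empirical_mse(f, sampler, 1.0, q=100, trials=2000, rng=rng)
mse_large = empirical_mse(f, sampler, 1.0, q=10_000, trials=2000, rng=rng)
ratio = mse_small / mse_large  # expected to be close to 10_000 / 100 = 100

print(mse_small, mse_large, ratio)
```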
If we break down (classical) MCI, we can see that it amounts to a very simple three-step process: sample from $ p(x) $ , apply the function $ f(.) $ , and then average over many such samples with the function applied. In QMCI, there is an analogous three-step process: first, we take as an input a state preparation circuit, $ P $ , which prepares a quantum state $ \mid p\rangle $ that samples from $ p(x) $ when measured in the computational basis
$$ \mid p\rangle =\sum_{x\in {\left\{0,1\right\}}^n}\sqrt{p(x)}\mid x\rangle , \tag{4} $$
where, for simplicity, we assume that $ p(x) $ is supported over some $ N={2}^n $ points.
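Second, a circuit, denoted $ R $ , is applied to $ \mid p\rangle $ with one further qubit appended such that the following state is prepared:
$$ R\mid p\rangle \mid 0\rangle =\sum_{x\in {\left\{0,1\right\}}^n}\sqrt{p(x)}\mid x\rangle \left(\sqrt{1-f(x)}\mid 0\rangle +\sqrt{f(x)}\mid 1\rangle \right). \tag{5} $$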
The circuit $ R $ thus encodes the function applied, $ f(.) $ , and in particular has the property that, when measured, the appended qubit has a probability of being one equal to
$$ \sum_{x\in {\left\{0,1\right\}}^n}p(x)f(x), \tag{6} $$
according to the Born rule, which tells us to square and sum all terms in the sum where the final qubit is in state $ \mid 1\rangle $ . This is exactly the value we are trying to estimate with MCI, and it turns out that the (quantum) algorithm quantum amplitude estimation (QAE) can estimate this with MSE $ \mathcal{O}\left({q}^{-2}\right) $ , where $ q $ is now the number of uses of the circuit $ P $ (Brassard et al., 2002). Accepting (for now) that this quantity “ $ q $ ” corresponds to that in classical MCI, we can see that this represents a quadratic advantage in convergence: for a certain desired MSE, only about the square root as many samples are required quantumly as would be required classically.
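The encoding can be checked numerically on a small example. The sketch below is a classical state-vector illustration under stated assumptions: it assumes the common encoding in which $ R $ rotates the appended qubit to $ \sqrt{1-f(x)}\mid 0\rangle +\sqrt{f(x)}\mid 1\rangle $ for each basis state $ \mid x\rangle $ (which requires $ f $ to take values in $ [0,1] $ ), and the particular $ p(x) $ and $ f(x) $ are hypothetical choices. It does not perform amplitude estimation; it only confirms that the Born-rule probability of measuring one equals the expectation.

```python
import numpy as np

n = 4
N = 2 ** n
x = np.arange(N)

# Hypothetical discretised distribution p(x): a truncated Gaussian on N points.
p = np.exp(-0.5 * ((x - 7.5) / 3.0) ** 2)
p /= p.sum()

# Hypothetical function f with values in [0, 1].
f = x / (N - 1)

# State prepared by P: amplitudes sqrt(p(x)) on the basis states |x>.
amp_p = np.sqrt(p)

# State after applying R with one appended qubit, stored with shape (N, 2):
# column 0 holds amplitudes with the appended qubit in |0>, column 1 in |1>.
state = np.stack([amp_p * np.sqrt(1.0 - f), amp_p * np.sqrt(f)], axis=1)

# Born rule: square and sum all amplitudes where the appended qubit is |1>.
prob_one = float(np.sum(state[:, 1] ** 2))
expectation = float(np.sum(p * f))

print(prob_one, expectation)
```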
However, QAE was not originally seen as an ideal candidate as a source of near-term quantum advantage, as it uses quantum phase estimation (Kitaev, 1996), an algorithm that is expected to require full-scale quantum computers. That all changed with the advent of “Amplitude estimation without phase estimation” (Suzuki et al., 2020), which showed how to obtain the full quadratic quantum advantage, but using a number of shallow-depth circuits and classical post-processing to estimate the expectation value. A number of other proposals have since followed in the same vein (Aaronson and Rall, 2020; Giurgica-Tiron et al., 2022; Nakaji, 2020; Grinko et al., 2021).
Two more, complementary, breakthroughs have further fueled the hope that QMCI can be a source of near-term quantum advantage:
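The idea of replacing phase estimation with shallow circuits plus classical post-processing can be sketched as a simulation. The example below is a simplified illustration in the spirit of maximum-likelihood amplitude estimation, not a reproduction of any cited implementation. It assumes the standard fact that, writing the target probability as $ a={\sin}^2\theta $ , a circuit with $ m $ amplitude-amplification steps yields outcome one with probability $ {\sin}^2\left(\left(2m+1\right)\theta \right) $ ; the measurement outcomes are simulated classically, and the function names, schedule of $ m $ values, shot count, and grid search are all assumptions made for the example.

```python
import numpy as np

def simulate_counts(a, powers, shots, rng):
    """Simulate the number of 'one' outcomes for each amplification power m."""
    theta = np.arcsin(np.sqrt(a))
    probs = np.sin((2 * np.asarray(powers) + 1) * theta) ** 2
    return rng.binomial(shots, probs)

def mle_amplitude(counts, powers, shots, grid_size=200_001):
    """Classical post-processing: grid-search maximum likelihood estimate of a."""
    thetas = np.linspace(0.0, np.pi / 2, grid_size)
    log_lik = np.zeros_like(thetas)
    for h, m in zip(counts, powers):
        p = np.sin((2 * m + 1) * thetas) ** 2
        p = np.clip(p, 1e-12, 1.0 - 1e-12)   # avoid log(0)
        log_lik += h * np.log(p) + (shots - h) * np.log1p(-p)
    best_theta = thetas[np.argmax(log_lik)]
    return float(np.sin(best_theta) ** 2)

rng = np.random.default_rng(7)
true_a = 0.3                          # hypothetical value of the expectation
powers = [0, 1, 2, 4, 8, 16, 32]      # exponentially increasing schedule
shots = 100                           # measurements per circuit

counts = simulate_counts(true_a, powers, shots, rng)
estimate = mle_amplitude(counts, powers, shots)

print(estimate)
```

Each simulated circuit is run independently, and only the classical likelihood combines their results, which is what keeps the individual circuit depths shallow.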

1. Noise-aware QAE (Herbert et al., 2021) takes advantage of the fact that QAE circuits have a very specific structure to handle device noise as if it were estimation uncertainty. This suggests that significantly less error correction may be needed to achieve a useful advantage in QMCI, compared to other calculations of comparable size.

2. “Quantum Monte Carlo integration: the full advantage in minimal circuit depth” (Herbert, 2022) shows how to decompose the Monte Carlo integral as a Fourier series such that the circuit $ R $ , which may in general constitute an unreasonably large contribution to the total circuit depth, can be replaced by minimally deep circuits of rotation gates (this procedure is hereafter referred to as “Fourier QMCI”).
In particular, the second of these informs us of the sorts of applications that are likely to see a (relatively) early quantum advantage. For instance, Fourier QMCI is especially advantageous for numerical integrals that can be decomposed as a product of some $ p(x) $ , for which a suitable encoding can be prepared by a relatively shallow state preparation circuit (i.e., as described in equation 4); and some $ f(x) $ which can be extended as a piecewise periodic function whose Fourier series can be calculated and satisfies certain smoothness conditions. Areas in which computationally intensive numerical integrations are commonplace include computational fluid dynamics and high-energy physics; indeed, in the latter, QMCI solutions have begun to be explored (Agliardi et al., 2022).
For general numerical integrals, the randomness in MCI is a device to enable efficient numerical integration; however, for most data-centric MCIs, we do expect that $ p(x) $ will have a more literal role as the probability distribution from which the data has been sampled. This in turn raises the question of how to construct the state preparation circuit, $ P $ .
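The Fourier decomposition rests on linearity of expectation: if $ f $ is written as a Fourier series, then the expectation of $ f $ is a weighted sum of expectations of sines and cosines, each of which is a simple quantity to estimate. The sketch below is a classical illustration of that decomposition only, under assumed choices that are not taken from the cited paper: $ f(x)=\mid x\mid $ on $ \left[-L,L\right] $ , whose periodic extension has the known cosine series $ L/2-\left(4L/{\pi}^2\right){\sum}_{k\;\mathrm{odd}}\cos \left( k\pi x/L\right)/{k}^2 $ , and a discretised Gaussian $ p(x) $ . In a quantum setting, each trigonometric expectation in the loop is the quantity that would be estimated by amplitude estimation.

```python
import numpy as np

L = 1.0
x = np.linspace(-L, L, 257)

# Hypothetical discretised distribution p(x) supported on [-L, L].
p = np.exp(-0.5 * (x / 0.3) ** 2)
p /= p.sum()

# Direct evaluation of the expectation of f(x) = |x|.
direct = float(np.sum(p * np.abs(x)))

# Same expectation via the truncated Fourier (cosine) series of |x|.
K = 199                      # highest harmonic retained
fourier = L / 2
for k in range(1, K + 1, 2):  # only odd harmonics are non-zero
    # Expectation of a single cosine: the simple quantity to be estimated.
    cos_expectation = float(np.sum(p * np.cos(k * np.pi * x / L)))
    fourier -= 4 * L / (np.pi ** 2 * k ** 2) * cos_expectation

print(direct, fourier)
```

The coefficients decay as $ 1/{k}^2 $ here because the periodic extension is continuous, which is one concrete sense in which smoothness of $ f $ controls how many harmonics are needed.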
From a theoretical point of view, it is always possible to construct a suitable $ P $ from the corresponding classical sampling process (Herbert, 2021) (this resolves the earlier question of why the number of classical samples can be compared to the number of quantum uses of the state preparation circuit, that is, the two uses of “ $ q $ ” that were treated as equivalent), and such a result may well ultimately find practical application. For example, ANNs trained as generative models are instances of classical sampling processes, and so such generative models can be converted into suitable circuits, $ P $ . However, in the near-term such quantum circuits are likely to be infeasibly deep, and so instead we should focus on applications that leverage the quantum advantage in a more direct manner. In particular, the quantum advantage is manifested in a quadratic reduction in the number of samples required to attain a certain required accuracy, and so applications where a very large number of samples are required provide a good starting point, especially when those samples are from (relatively) simple stochastic processes.
With regard to data-centric engineering and science, one helpful way to think about which applications QMCI will impact in the near term is in terms of the distinction between parametric and nonparametric models. We have already established that we must operate on some model for the generation of the observed data: in cases where the best (classical) approach is to use the dataset to fit the parameters of some parameterized family of distributions, we expect the corresponding circuit $ P $ to be relatively easy to construct. Conversely, if the model is nonparametric (or even something like a deep neural network that some may regard as parametric, just with an enormous number of parameters that do not correspond to the statistics of the observed data in the natural and straightforward way that those of traditional parametric models do), then there is a real risk that the circuit $ P $ will be hard to construct in the near term.
Turning now to some specific examples, one promising area concerns time-series data, where some large dataset is used to fit the parameters of some model such as a hidden Markov model, autoregressive model, or autoregressive moving average model (amongst many others). Notably, using an abundance of historical data to tune the parameters of time-series models is commonplace throughout applications of MCI in financial and actuarial engineering. Indeed, the vast majority of the early QMCI literature has focused on financial applications (Rebentrost and Lloyd, 2018; Rebentrost et al., 2018; Egger et al., 2020, 2021; Orús et al., 2019; Woerner and Egger, 2019; Bouland et al., 2020; Kaneko et al., 2020; Stamatopoulos et al., 2020, 2022; An et al., 2021; Chakrabarti et al., 2021; Herman et al., 2022).
That is not to say, however, that QMCI will ultimately only find application in financial engineering. For instance, the “function applied” in financial applications of QMCI usually corresponds to some sort of thresholded average: most obviously, when calculating the expected return on a European option, the function applied is essentially a ReLU function. Functions of this type (i.e., piecewise linear) are likely to resemble those suitable for calculating notions of “cost” in, for example, supply-chain and logistics-optimization applications (Ozkan and Kilic, 2019). Moreover, Monte Carlo methods are widely used in virtually every area of data-centric engineering and science, from medical imaging (Chen et al., 2002) to chemical, biochemical, and environmental engineering (Sin and Espuña, 2020) to energy modeling (Dhaundiyal et al., 2019) and handling big data in general (Ji and Li, 2016). Indeed, rather than exhaustively cataloging every conceivable application of QMCI to data-centric engineering and science, a better approach is to set out the general framework, as is shown in Figure 4, such that expert readers can see how the quantum advantage may be realized in their respective domains.
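The ReLU-type “function applied” mentioned above can be made concrete with a small classical MCI sketch in Python. It estimates the expected payoff $ \mathrm{E}[\max(S_T - K, 0)] $ of a European call, assuming (for illustration only) a lognormal terminal price with zero drift and no discounting; the parameter values and function names are hypothetical, and a closed-form value for the same assumed model is included purely as a sanity check.

```python
import math
import random


def relu(x):
    """Piecewise-linear threshold function applied to each sample."""
    return max(x, 0.0)


def expected_call_payoff(s0, strike, sigma, maturity, n_samples, rng):
    """Classical MCI estimate of E[relu(S_T - K)], S_T lognormal with zero drift."""
    vol = sigma * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(-0.5 * vol ** 2 + vol * z)
        total += relu(s_t - strike)
    return total / n_samples


def closed_form_payoff(s0, strike, sigma, maturity):
    """Closed-form value of the same expectation, used only as a check."""
    def normal_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    vol = sigma * math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + 0.5 * vol ** 2) / vol
    d2 = d1 - vol
    return s0 * normal_cdf(d1) - strike * normal_cdf(d2)


rng = random.Random(0)
print(expected_call_payoff(100.0, 100.0, 0.2, 1.0, 50_000, rng))
print(closed_form_payoff(100.0, 100.0, 0.2, 1.0))
```

Swapping `relu` for another piecewise-linear function (for example, a capped or tiered cost) leaves the rest of the estimator unchanged, which is the sense in which the financial examples resemble cost calculations in other domains.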
At first sight, the scheme laid out in Figure 4 may appear to sidestep the central question of how quantum computing will help with large datasets, as the data handling itself represents a classical preprocessing step. However, this is simply a reflection of the reality that data loading is generally hard and that it is prudent to focus on tasks where there is an unequivocal quantum advantage (i.e., in the convergence of the estimate as a function of the number of samples). Moreover, many data-centric applications (e.g., those in finance) are indeed of the form where the data loading can be achieved by fitting the parameters of some statistical model, but thereafter the statistical estimation is the bottleneck; and it is exactly those applications that quantum computing can incontrovertibly enhance.
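The two-stage pattern described here (fit a parametric model to the data classically, then treat the statistical estimation as the expensive step) can be sketched end to end in Python. The example below is a classical stand-in only: it assumes an AR(1) time-series model, uses synthetic data in place of a real dataset, and estimates a thresholded average by ordinary sampling where a QMCI routine would eventually be substituted. All names and parameter values are hypothetical.

```python
import random


def simulate_ar1(phi, noise_std, n_steps, rng, x0=0.0):
    """Generate a path of x_t = phi * x_{t-1} + Gaussian noise."""
    path, x = [], x0
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, noise_std)
        path.append(x)
    return path


def fit_ar1(series):
    """Least-squares fit of the AR(1) parameters (the classical preprocessing)."""
    prev, curr = series[:-1], series[1:]
    phi = sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)
    residuals = [c - phi * p for p, c in zip(prev, curr)]
    noise_std = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    return phi, noise_std


def estimate_excess(phi, noise_std, x0, horizon, threshold, n_paths, rng):
    """MCI estimate of E[max(x_horizon - threshold, 0)] under the fitted model."""
    total = 0.0
    for _ in range(n_paths):
        x_end = simulate_ar1(phi, noise_std, horizon, rng, x0)[-1]
        total += max(x_end - threshold, 0.0)
    return total / n_paths


rng = random.Random(0)
history = simulate_ar1(0.7, 1.0, 5_000, rng)   # synthetic stand-in for real data
phi_hat, std_hat = fit_ar1(history)            # stage 1: fit the parameters
print(phi_hat, std_hat)
# Stage 2: the sampling-heavy estimation that a quantum routine would target.
print(estimate_excess(phi_hat, std_hat, history[-1], 10, 1.0, 20_000, rng))
```

Note that the dataset is touched only in `fit_ar1`; everything downstream depends on two fitted numbers, which is why the data handling can remain classical while the estimation step is the part worth accelerating.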
6. Outlook and Speculation
In this article, I have deliberately homed in on QMCI as a likely source of near-term quantum advantage in data science applications. This is a personal view, and many others still focus on the possibility that there will be a genuinely useful “NISQ advantage” (e.g., using PQCs as ML models). However, as explicitly noted in the introduction, I see the established division into the NISQ and full-scale eras of quantum computing as overly simplistic. Instead, I prefer to think about how we can use resource-constrained quantum hardware that is neither strictly NISQ (there may be enough qubits for some mild error correction) nor full-scale (in the sense that full-scale algorithms may be typified by an attitude of being unconcerned with resource demands that appear only as negligible contributions to asymptotic complexity), and hence bridges the established ideas of NISQ and full-scale quantum computing. Certainly, it is true that the first applications of quantum computing with a provable advantage will be those where resource management has been given special attention. For reasons that have recurred throughout this article, I am of the view that QMCI shows great promise to be such an application.
In light of this, I feel it is incumbent upon me to offer some predictions about when this will come to fruition. As any responsible quantum computing researcher will attest, such predictions are hard to make at present, but now that the leading quantum hardware manufacturers are beginning to commit to roadmaps with quantified targets, it is at least possible for theorists to make loose predictions contingent on said roadmaps being met. The first comment to make is that both IBM and Google have committed to reaching 1 million qubits by the end of the decade (Google, 2021; IBM, 2021), and such a figure would certainly be enough for useful quantum advantage in many applications including QMCI.
To make a more precise prediction with regard to QMCI, we benefit from the fact that a leading research group has published a paper setting out suitable benchmarks to facilitate resource estimation for when there will be a useful quantum advantage in QMCI (Chakrabarti et al., 2021). The specific focus of the paper in question is on QMCI applications in finance, but the broad principle is likely to extend to other applications. We have estimated that our Fourier QMCI algorithm (Herbert, 2022) reduces resource requirements (number of quantum operations) for the benchmarks set out by at least 30%, and by over 90% in some cases; and if the circuits were reconstructed in a slightly different way, we have estimated that the total number of physical superconducting qubits required would be in the 1,000s to 10,000s. The exact value within this range depends on whether qubit qualities improve enough for low-overhead error-correction codes (Tomita and Svore, 2014) to be practical and, in particular, on whether the overhead can be reduced by exploiting asymmetries in the noise (Ataides et al., 2021; Higgott et al., 2022), both of which remain very active research topics.
This resource estimation places a useful advantage in QMCI within the 5-year horizon for the leading superconducting roadmaps, and a similar timescale is likely for trapped-ion devices (see, for example, Quantinuum, 2022; note that trapped-ion quantum computers typically have many fewer, but higher-quality, qubits, and tend to have less specific roadmaps). This is also consistent with other predictions, for example, that of QC Ware and Goldman Sachs (Goldman Sachs and QC Ware, 2021).
Rather than leaving the prediction at that, it is worth “playing devil’s advocate” and exploring whether this is unreasonably hubristic, particularly in light of a widely circulated paper suggesting that near-term quantum advantage would be hard to obtain for algorithms exhibiting only a quadratic speedup (Babbush et al., 2021). There are three central claims in Babbush et al. (2021), which provide a useful framework to scrutinize the legitimacy of my prediction:

1. Quadratic-advantage quantum algorithms are dominated by circuits of Toffoli gates, which are extremely expensive to implement using error-corrected quantum computation. This is certainly true for unoptimized algorithms; however, Fourier QMCI (Herbert, 2022) moves exactly those Toffoli-heavy circuits to classical postprocessing, while upholding the full quantum advantage.

2. Error-correction overheads are, in any case, expensive. Again, this is true, which is why bespoke approaches to error correction that exploit the specific algorithm structure and handle a certain amount of the device noise at the application level (as does noise-aware QAE [Herbert et al., 2021]) are crucial to achieve near-term quantum advantage.

3. Algorithms for which there is a quadratic quantum advantage can typically be massively parallelized when performed classically, meaning that a useful quantum advantage only occurs at much larger problem sizes, once the parallelism has been accounted for. This is again true in the case of MCI, but leaves out one very significant detail, namely that QMCI can itself be massively parallelized.
The third item reveals an important subtlety: in the quantum computing sector, we obsess about the “route to scale” in terms of adding ever more qubits to the same chip, but once quantum computers reach moderate scale, it will be just as important to scale up the number of quantum computers available. Just as classical HPC can accelerate classical data-centric applications by running calculations on different cores in parallel, so running quantum circuits in parallel will be crucial for an early advantage in data-centric applications. (To be clear, here the parallelization is classical: there is no need for entangled connections between the different cores. The subject of distributed quantum computing, where there are entangled connections between the different quantum cores, is nevertheless a fascinating topic in its own right; see, e.g., Cirac et al., 1999; Cuomo et al., 2020; Meter et al., 2007.) Indeed, for our above resource estimates, we have only quantified when quantum hardware will be capable of running the requisite quantum circuits; in order for this to translate into a practical benefit, sufficient quantum cores must be available to the user.
So where does that leave us? The existence of data-science-relevant quantum algorithms, such as QMCI, that exhibit a provable quantum advantage, coupled with the fact that the leading players are now beginning to scale up quantum hardware, provides great cause for optimism that quantum computing will impact on data-centric engineering and science applications in the near to medium term. However, in quantum computing we have learned that optimism must always go hand in hand with caution: there are serious engineering challenges at every layer, from the design of the quantum computer itself, to the control software, to the optimization of the algorithms that run the desired applications. In particular, we know that preparing quantum states that encode the relevant model or distribution is usually the bottleneck in QMCI; cracking this data-loading problem is the key to unleashing the power of quantum computation onto myriad applications within finance, supply chain and logistics, medical imaging, and energy modeling.
Acknowledgments
My thanks to Ewin Tang and Seth Lloyd for kindly answering my questions regarding quantum linear algebra and the dequantization thereof—of course, any remaining errors or misconceptions are my own. I thank Alexandre Krajenbrink, Sam Duffield, and Konstantinos Meichanetzidis for reviewing and providing valuable suggestions. I also thank the reviewers at DCE, whose suggestions helped to further improve the work.
Competing Interests
The author declares no competing interests exist.
Data Availability Statement
Data availability is not applicable to this article as no new data were created or analyzed in this study.
Author Contributions
S.H. contributed to the conceptualization, formal analysis, investigation, writing—original draft, and review and editing.
Ethics Statement
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.
Funding Statement
This work received no specific grant from any funding agency, commercial or not-for-profit sectors.