Machine Learning for Asset Managers

Published online by Cambridge University Press: 04 April 2020

Marcos M. López de Prado
Cornell University, New York


Successful investment strategies are specific implementations of general theories. An investment strategy that lacks a theoretical justification is likely to be false. Hence, an asset manager should concentrate her efforts on developing a theory rather than on backtesting potential trading rules. The purpose of this Element is to introduce machine learning (ML) tools that can help asset managers discover economic and financial theories. ML is not a black box, and it does not necessarily overfit. ML tools complement rather than replace the classical statistical methods. Some of ML's strengths include (1) a focus on out-of-sample predictability over variance adjudication; (2) the use of computational methods to avoid relying on (potentially unrealistic) assumptions; (3) the ability to “learn” complex specifications, including nonlinear, hierarchical, and noncontinuous interaction effects in a high-dimensional space; and (4) the ability to disentangle the variable search from the specification search, robust to multicollinearity and other substitution effects.
Online ISBN: 9781108883658
Publisher: Cambridge University Press
Print publication: 30 April 2020
© True Positive Technologies, LP 2020

