Machine Learning for Asset Managers

Marcos M. López de Prado

doi:10.1017/9781108883658

Series: Elements in Quantitative Finance

Published online by Cambridge University Press: 04 April 2020

Affiliation: Cornell University, New York

Summary

Successful investment strategies are specific implementations of general theories. An investment strategy that lacks a theoretical justification is likely to be false. Hence, an asset manager should concentrate her efforts on developing a theory rather than on backtesting potential trading rules. The purpose of this Element is to introduce machine learning (ML) tools that can help asset managers discover economic and financial theories. ML is not a black box, and it does not necessarily overfit. ML tools complement rather than replace the classical statistical methods. Some of ML's strengths include (1) a focus on out-of-sample predictability over variance adjudication; (2) the use of computational methods to avoid relying on (potentially unrealistic) assumptions; (3) the ability to “learn” complex specifications, including nonlinear, hierarchical, and noncontinuous interaction effects in a high-dimensional space; and (4) the ability to disentangle the variable search from the specification search, robust to multicollinearity and other substitution effects.

Keywords

machine learning, unsupervised learning, supervised learning, clustering, classification, labeling, portfolio construction

Type: Element
DOI: https://doi.org/10.1017/9781108883658

Online ISBN: 9781108883658

Publisher: Cambridge University Press

Print publication: 30 April 2020
