
References

Published online by Cambridge University Press:  30 January 2026

Mohamed El-Geish
Affiliation:
Monta AI
Shabaz Patel
Affiliation:
Best Buy
Anand Sampat
Affiliation:
Overline AI
Hira Dangol
Affiliation:
Analog Devices

Information

Type: Chapter
In: Shipping Machine Learning Systems: A Practical Guide to Building, Deploying, and Scaling in Production, pp. 411–426
Publisher: Cambridge University Press
Print publication year: 2026

References

Abdulkader, Ahmad, and Mahmoud, Mohamed G. 2021. Ensemble modeling of automatic speech recognition output. U.S. Patent 11,094,326, filed August 6, 2018, and issued August 17, 2021.
Abraham, Lior, Allen, John, Barykin, Oleksandr, et al. 2013. Scuba: Diving into data at Facebook. Proc. VLDB Endow., 6(11), 1057–1067. https://doi.org/10.14778/2536222.2536231
Ackerman, Ian, and Kataria, Saurabh. 2021 (June). Homepage feed multi-task learning using TensorFlow. https://engineering.linkedin.com/blog/2021/homepage-feed-multi-task-learning-using-tensorflow. Accessed: May 16, 2022.
Agarwal, Deepak. 2018 (Oct). An introduction to AI at LinkedIn. https://engineering.linkedin.com/blog/2018/10/an-introduction-to-ai-at-linkedin. Accessed: December 31, 2021.
Agarwal, Deepak, Chen, Bee-Chung, Gupta, Rupesh, et al. 2014. Activity ranking in LinkedIn feed. Pages 1603–1612 of: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’14. https://doi.org/10.1145/2623330.2623362
Agresti, Alan. 2012. Categorical Data Analysis. John Wiley & Sons.
Akyürek, Ekin, Schuurmans, Dale, Andreas, Jacob, Ma, Tengyu, and Zhou, Denny. 2022. What learning algorithm is in-context learning? Investigations with linear models. https://doi.org/10.48550/arXiv.2211.15661.
Alpert, Ben. 2019. Deep Learning for Distracted Driving Detection. www.nauto.com/blog/nauto-engineering-deep-learning-for-distracted-driver-monitoring
Anderson, Philip W. 1972. More is different: Broken symmetry and the nature of the hierarchical structure of science. Science, 177(4047), 393–396. https://doi.org/10.1126/science.177.4047.393
Anthropic. 2024. Anthropic – Building effective Agents. www.anthropic.com/research/building-effective-agents. Accessed: January 18, 2025.
Ariely, D. 2010. Predictably Irrational, Revised and Expanded Edition: The Hidden Forces That Shape Our Decisions. Business & Economics. HarperCollins.
Asar, Özgür, Ilk, Ozlem, and Dag, Osman. 2017. Estimating Box-Cox power transformation parameter via goodness-of-fit tests. Communications in Statistics-Simulation and Computation, 46(1), 91–105. https://doi.org/10.1080/03610918.2014.957839
Barmer, Hollen, Dzombak, Rachel, Gaston, Matthew, Heim, Eric, Palat, Vijaykumar, Redner, Frank, Smith, Tanisha, and VanHoudnos, Nathan. 2021. Robust and Secure AI. https://kilthub.cmu.edu/articles/report/Robust_and_Secure_AI/16560252?file=30632691.
Beaulieu, A. 2020. Learning SQL: Generate, Manipulate, and Retrieve Data. O’Reilly Media.
Beck, K., Andres, C., and Gamma, E. 2004. Extreme Programming Explained: Embrace Change. XP series. Addison-Wesley.
Beggan, James K. 1992. On the social nature of nonsocial perception: The mere ownership effect. Journal of Personality and Social Psychology, 62(2), 229. https://doi.org/10.1037/0022-3514.62.2.229
Bengio, Yoshua, Courville, Aaron, and Vincent, Pascal. 2013. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798–1828. https://doi.org/10.1109/TPAMI.2013.50
Bergstra, James, and Bengio, Yoshua. 2012. Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(2).
Berlin, L. 2006. The Man Behind the Microchip: Robert Noyce and the Invention of Silicon Valley. Oxford University Press.
Bhajaria, Nishant. 2022. Data Privacy: A Runbook for Engineers. Manning Publications.
Borsos, Zalán, Marinier, Raphaël, Vincent, Damien, et al. 2023. AudioLM: A language modeling approach to audio generation. https://arxiv.org/abs/2209.03143. https://doi.org/10.1109/TASLP.2023.3288409
Box, George EP, and Cox, David R. 1964. An analysis of transformations. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211–243. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x
Boźić, Matej, and Horvat, Marko. 2024. A survey of deep learning audio generation methods. https://arxiv.org/abs/2406.00146
Brown, Tom, Mann, Benjamin, Ryder, Nick, et al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901.
Brümmer, Niko, Swart, Albert, and van Leeuwen, David. 2014. A comparison of linear and non-linear calibrations for speaker recognition. https://doi.org/10.48550/arXiv.1402.2447.
Brümmer, Niko, Ferrer, Luciana, and Swart, Albert. 2021. Out of a hundred trials, how many errors does your speaker verifier make? https://doi.org/10.48550/arXiv.2104.00732.
Burgert, Ryan, Ranasinghe, Kanchana, Li, Xiang, and Ryoo, Michael S. 2023. Peekaboo: Text to image diffusion models are zero-shot segmentors. https://arxiv.org/abs/2211.13224
Burks, L, and Gupta, Abhineet. 2020 (10). Performance metrics to evaluate probabilistic models for structural damage during seismic events. In: 17th World Conference on Earthquake Engineering. www.researchgate.net/publication/346028665_Performance_Metrics_to_Evaluate_Probabilistic_Models_for_Structural_Damage_During_Seismic-Events
Cemri, Mert, Pan, Melissa Z., Yang, Shuyi, et al. 2025. Why do multi-agent LLM systems fail? https://arxiv.org/abs/2503.13657
Chandola, Varun, Banerjee, Arindam, and Kumar, Vipin. 2009. Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3), 1–58. https://doi.org/10.1145/1541880.1541882
Chapelle, Olivier, Schölkopf, Bernhard, and Zien, Alexander. 2010. Semi-Supervised Learning. MIT Press.
Chawla, Nitesh V, Bowyer, Kevin W, Hall, Lawrence O, and Kegelmeyer, W Philip. 2002. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357. https://doi.org/10.1613/jair.953
Chawla, Nitesh V, Lazarevic, Aleksandar, Hall, Lawrence O, and Bowyer, Kevin W. 2003. SMOTEBoost: Improving prediction of the minority class in boosting. Pages 107–119 of: Knowledge Discovery in Databases: PKDD 2003: 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, Cavtat-Dubrovnik, Croatia, September 22–26, 2003. Proceedings 7. Springer.
Chen, Daniel L, Moskowitz, Tobias J, and Shue, Kelly. 2016. Decision making under the gambler’s fallacy: Evidence from asylum judges, loan officers, and baseball umpires. The Quarterly Journal of Economics, 131(3), 1181–1242. https://doi.org/10.1093/qje/qjw017
Chen, Dongdong, Liao, Jing, Yuan, Lu, Yu, Nenghai, and Hua, Gang. 2017. Coherent online video style transfer. https://arxiv.org/abs/1703.09211. https://doi.org/10.1109/ICCV.2017.126
Chio, C., and Freeman, D. 2018. Machine Learning and Security: Protecting Systems with Data and Algorithms. O’Reilly Media.
Clemm, Josh. 2015 (Jul). A brief history of scaling LinkedIn. www.linkedin.com/blog/engineering/architecture/brief-history-scaling-linkedin. Accessed: January 31, 2024.
Cloud, Google. 2014. Google Cloud trace: Distributed tracing system. https://cloud.google.com/trace/docs/overview. Latest update: January 24, 2025. Accessed: March 27, 2025.
Contributors, Kubeflow. 2018. Kubeflow: Machine learning toolkit for Kubernetes. https://github.com/kubeflow/kubeflow. Version 1.9. Accessed: March 27, 2025.
Contributors, Torch. 2022. Reproducibility. https://pytorch.org/docs/stable/notes/randomness.html.
Conway, Melvin E. 1968. How do committees invent? Datamation, 14(4), 28–31.
D’Amour, Alexander, Heller, Katherine, Moldovan, Dan, et al. 2022. Underspecification presents challenges for credibility in modern machine learning. The Journal of Machine Learning Research, 23(1), 10237–10297.
DeGroot, M.H. 1969. Optimal Statistical Decisions. McGraw-Hill Series in Probability and Statistics. McGraw-Hill.
Demerlé, Nils, Esling, Philippe, Doras, Guillaume, and Genova, David. 2024. Combining audio control and style transfer using latent diffusion. https://arxiv.org/abs/2408.00196
Deutsch, D. 2011. The Beginning of Infinity: Explanations That Transform the World. Penguin Publishing Group.
Developers, Feast. 2019. Feast: Open-source feature store for machine learning. https://github.com/feast-dev/feast. Version 0.45.0. Accessed: March 27, 2025.
Dinesh, Amara, Karthika, Kumar R, and Parameswaran, Latha. 2018. Novel deep learning model for traffic sign detection using capsule networks. https://doi.org/10.48550/arXiv.1805.04424
Domingos, Pedro. 2012. A few useful things to know about machine learning. Communications of the ACM, 55(10), 78–87. https://doi.org/10.1145/2347736.2347755
Doran, George T, et al. 1981. There’s a SMART way to write management’s goals and objectives. Management Review, 70(11), 35–36.
Dosovitskiy, Alexey, Beyer, Lucas, Kolesnikov, Alexander, et al. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. https://doi.org/10.48550/arXiv.2010.11929.
Ebbinghaus, H. 1913. Memory: A Contribution to Experimental Psychology. Educational reprints. Teachers College, Columbia University. https://doi.org/10.1037/10011-000
Edalati, Ali, Tahaei, Marzieh, Kobyzev, Ivan, et al. 2022. KronA: Parameter efficient tuning with Kronecker adapter. https://doi.org/10.48550/arXiv.2212.10650
Elkan, Charles. 2001. The foundations of cost-sensitive learning. Pages 973–978 of: International Joint Conference on Artificial Intelligence, Vol. 17. Lawrence Erlbaum Associates Ltd.
Elsken, Thomas, Metzen, Jan Hendrik, and Hutter, Frank. 2019. Neural architecture search: A survey. The Journal of Machine Learning Research, 20(1), 1997–2017.
Esteva, Andre, Kuprel, Brett, Novoa, Roberto A, et al. 2017. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118. https://doi.org/10.1038/nature21056
Eugene Jones, C., and Buchmann, Stephen L. 1974. Ultraviolet floral patterns as functional orientation cues in hymenopterous pollination systems. Animal Behaviour, 22(2), 481–485. https://doi.org/10.1016/S0003-3472(74)80047-3
Face, Hugging. 2022. Huggingface inference endpoints. https://huggingface.co/docs/inference-endpoints/index.
Falcon, William, and The PyTorch Lightning team. 2019 (Mar.). PyTorch Lightning. https://github.com/Lightning-AI/pytorch-lightning
Feathr. 2022. Feathr. https://github.com/feathr-ai/feathr. Accessed: January 31, 2024.
Feynman, Richard P. 1974. Cargo cult science. Caltech’s 1974 Commencement Address. https://calteches.library.caltech.edu/51/2/CargoCult.pdf
Feynman, R.P., Leighton, R.B., and Sands, M.L. 1970. The Feynman Lectures on Physics. Addison-Wesley.
Fisher, R.A. 1925. Statistical Methods for Research Workers. Oliver and Boyd.
Fu, Daniel, Chen, Mayee, Sala, Frederic, et al. 2020. Fast and three-rious: Speeding up weak supervision with triplet methods. Pages 3280–3291 of: Proceedings of the 37th International Conference on Machine Learning.
Ganchev, Kuzman, and Dredze, Mark. 2008. Small statistical models by random feature mixing. Pages 19–20 of: Proceedings of the ACL-08: HLT Workshop on Mobile Language Processing.
Gandikota, Rohit, Materzynska, Joanna, Fiotto-Kaufman, Jaden, and Bau, David. 2023. Erasing concepts from diffusion models. https://arxiv.org/abs/2303.07345. https://doi.org/10.1109/ICCV51070.2023.00230
Garcia-Molina, Hector, Ullman, Jeffrey D., and Widom, Jennifer. 2008. Database Systems: The Complete Book. 2nd ed. Prentice Hall Press.
Gatys, Leon A., Ecker, Alexander S., and Bethge, Matthias. 2015. A neural algorithm of artistic style. https://arxiv.org/abs/1508.06576
Gertner, J. 2012. The Idea Factory: Bell Labs and the Great Age of American Innovation. Penguin Publishing Group.
Gholami, Amir, Kim, Sehoon, Dong, Zhen, et al. 2021. A survey of quantization methods for efficient neural network inference. https://arxiv.org/abs/2103.13630
Ginsparg, Paul. 1991. arXiv.org. https://arxiv.org. Accessed: August 15, 2021.
Golovin, Daniel, Solnik, Benjamin, Moitra, Subhodeep, et al. 2017. Google Vizier: A service for black-box optimization. Pages 1487–1495 of: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13–17, 2017.
Grattafiori, Aaron, Dubey, Abhimanyu, Jauhri, Abhinav, et al. 2024. The Llama 3 Herd of Models. https://arxiv.org/abs/2407.21783
Graves, Alex, Fernández, Santiago, Gomez, Faustino, and Schmidhuber, Jürgen. 2006. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. Pages 369–376 of: Proceedings of the 23rd International Conference on Machine Learning. https://doi.org/10.1145/1143844.1143891
Gray, D., Brown, S., and Macanufo, J. 2010. Gamestorming: A Playbook for Innovators, Rulebreakers, and Changemakers. O’Reilly Media.
Grimstad, Stein, and Jørgensen, Magne. 2007. Inconsistency of expert judgment-based estimates of software development effort. Journal of Systems and Software, 80(11), 1770–1777. https://doi.org/10.1016/j.jss.2007.03.001
gRPC Authors. What is gRPC? Introduction to gRPC. https://grpc.io/docs/what-is-grpc/introduction/. Accessed: March 25, 2025.
Gu, Albert, and Dao, Tri. 2024. Mamba: Linear-time sequence modeling with selective state spaces. https://arxiv.org/abs/2312.00752
Gullapally, Sai Chowdary, Zhang, Yibo, Mittal, Nitin Kumar, et al. 2023. Synthetic DOmain-Targeted Augmentation (S-DOTA) improves model generalization in digital pathology. https://doi.org/10.48550/arXiv.2305.02401.
Gulshan, Varun, Peng, Lily, Coram, Marc, et al. 2016. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. https://pubmed.ncbi.nlm.nih.gov/27898976/. https://doi.org/10.1001/jama.2016.17216
Guyon, Isabelle, and Elisseeff, André. 2003. An introduction to variable and feature selection. Journal of Machine Learning Research, 3 (Mar), 1157–1182.
He, Xinran, Pan, Junfeng, Jin, Ou, et al. 2014. Practical lessons from predicting clicks on ads at Facebook. Pages 1–9 of: Proceedings of the Association for Computing Machinery (ACM). International Workshop on Data Mining for Online Advertising. https://doi.org/10.1145/2648584.2648589
He, Xuehai, Li, Chunyuan, Zhang, Pengchuan, Yang, Jianwei, and Wang, Xin Eric. 2023. Parameter-efficient model adaptation for vision transformers. https://arxiv.org/abs/2203.16329
Herzog, Stefan M, and Hertwig, Ralph. 2009. The wisdom of many in one mind: Improving individual judgments with dialectical bootstrapping. Psychological Science, 20(2), 231–237. https://doi.org/10.1111/j.1467-9280.2009.02271.x
Heusel, Martin, Ramsauer, Hubert, Unterthiner, Thomas, Nessler, Bernhard, and Hochreiter, Sepp. 2018. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. https://arxiv.org/abs/1706.08500
Hochreiter, Sepp, and Schmidhuber, Jürgen. 1997. Long short-term memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Hoffman, R., and Casnocha, B. 2012. The Start-Up of You. Crown Business.
Hoffmann, Jordan, Borgeaud, Sebastian, Mensch, Arthur, et al. 2022. Training compute-optimal large language models. https://doi.org/10.48550/arXiv.2203.15556
Hohmann, L. 2006. Innovation Games: Creating Breakthrough Products Through Collaborative Play. Pearson Education.
Hooker, Sara. 2020. The hardware lottery. https://doi.org/10.48550/arXiv.2009.06489.
Hornik, Kurt, Stinchcombe, Maxwell, and White, Halbert. 1989. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366. https://doi.org/10.1016/0893-6080(89)90020-8
Howard, RA. 1980. On making life and death decisions, societal risk assessment. In: Societal Risk Assessment: How Safe is Safe Enough? Plenum Publishing Corp.
Hu, Edward J., Shen, Yelong, Wallis, Phillip, et al. 2021. LoRA: Low-Rank Adaptation of Large Language Models. https://arxiv.org/abs/2106.09685
Huang, Hong, Man, Junfeng, Li, Luyao, and Zeng, Rongke. 2024. Musical timbre style transfer with diffusion model. PeerJ Computer Science, 10 (July).
Huang, Zeyi, Wang, Haohan, Xing, Eric P, and Huang, Dong. 2020. Self-challenging improves cross-domain generalization. Pages 124–140 of: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16. Springer.
Hunt, A., and Thomas, D. 1999. The Pragmatic Programmer: From Journeyman to Master. Addison-Wesley Professional.
Hutter, Frank, Kotthoff, Lars, and Vanschoren, Joaquin. 2019. Automated Machine Learning: Methods, Systems, Challenges. Springer Nature. https://doi.org/10.1007/978-3-030-05318-5
Jaderberg, Max, Dalibard, Valentin, Osindero, Simon, et al. 2017. Population based training of neural networks. https://doi.org/10.48550/arXiv.1711.09846.
James, G., Witten, D., Hastie, T., Tibshirani, R., and Taylor, J. 2023. An Introduction to Statistical Learning: With Applications in Python. Springer Texts in Statistics. Springer International Publishing. https://doi.org/10.1007/978-3-031-38747-0
Jarmul, Katharine. 2023. Practical Data Privacy: Enhancing Privacy and Security in Data. O’Reilly Media.
Jeong, Yoonjae, and Myaeng, Sung-Hyon. 2013. Feature selection using a semantic hierarchy for event recognition and type classification. Pages 136–144 of: Proceedings of the 6th International Joint Conference on Natural Language Processing. Nagoya, Japan: Asian Federation of Natural Language Processing.
Jiang, J., Cui, B., and Zhang, C. 2023. Distributed Machine Learning and Gradient Optimization. Big Data Management. Springer Nature.
Johnson, Justin, Alahi, Alexandre, and Fei-Fei, Li. 2016. Perceptual losses for real-time style transfer and super-resolution. https://arxiv.org/abs/1603.08155. https://doi.org/10.1007/978-3-319-46475-6_43
Johnson, Steven. 2011. Where Good Ideas Come From: The Natural History of Innovation. Penguin.
Jurafsky, D., Martin, J.H., Norvig, P., and Russell, S. 2014. Speech and Language Processing. Pearson Education.
Juran, Joseph M. 1950. Pareto, Lorenz, Cournot, Bernoulli, Juran and Others. Industrial Quality Control, 25. https://ia804509.us.archive.org/8/items/in.ernet.dli.2015.140155/2015.140155.Quality-Control-Handbook_text.pdf
Jurka, Tim, Ghosh, Souvik, and Davies, Pete. 2018 (Mar). A look behind the AI that powers LinkedIn's feed: Sifting through billions of conversations to create personalized news feeds for hundreds of millions of members. https://engineering.linkedin.com/blog/2018/03/a-look-behind-the-ai-that-powers-linkedins-feed--sifting-through. Accessed: December 31, 2021.
Kahneman, D. 2011. Thinking, Fast and Slow. Farrar, Straus and Giroux.
Kahneman, D., Sibony, O., and Sunstein, C.R. 2021. Noise: A Flaw in Human Judgment. Little, Brown.
Kahneman, Daniel, Knetsch, Jack L, and Thaler, Richard H. 1991. Anomalies: The endowment effect, loss aversion, and status quo bias. Journal of Economic Perspectives, 5(1), 193–206. https://doi.org/10.1257/jep.5.1.193
Kang, Daniel, Arechiga, Nikos, Pillai, Sudeep, Bailis, Peter, and Zaharia, Matei. 2022. Finding label and model errors in perception data with learned observation assertions. https://arxiv.org/abs/2201.05797. https://doi.org/10.1145/3514221.3517907
Kaplan, Jared, McCandlish, Sam, Henighan, Tom, et al. 2020. Scaling laws for neural language models. https://arxiv.org/abs/2001.08361
Kapoor, Sayash, and Narayanan, Arvind. 2023. Leakage and the reproducibility crisis in machine-learning-based science. Patterns, Elsevier. https://pubmed.ncbi.nlm.nih.gov/37720327/
Karpathy, Andrej. 2016. Arxiv sanity preserver. www.arxiv-sanity.com. Accessed: August 15, 2021.
Karpathy, Andrej. 2017. Software 2.0. https://karpathy.medium.com/software-2-0-a64152b37c35. Accessed: November 1, 2022.
Karpathy, Andrej. 2019 (April). The recipe. https://karpathy.github.io/2019/04/25/recipe/. Accessed: August 11, 2024.
Kaufman, Shachar, Rosset, Saharon, Perlich, Claudia, and Stitelman, Ori. 2012. Leakage in data mining: Formulation, detection, and avoidance. ACM Transactions on Knowledge Discovery from Data (TKDD), 6(4), 1–21. https://doi.org/10.1145/2382577.2382579
Keeney, R.L., and Raiffa, H. 1993. Decisions with Multiple Objectives: Preferences and Value Trade-offs. Cambridge University Press. https://doi.org/10.1017/CBO9781139174084
Kendall, Maurice G. 1938. A new measure of rank correlation. Biometrika, 30(1–2), 81–93. https://doi.org/10.1093/biomet/30.1-2.81
Khanna, Sahil. 2022 (Jun). Griffin: How Instacart's ML platform tripled ML applications in a year. www.instacart.com/company/how-its-made/griffin-how-instacarts-ml-platform-tripled-ml-applications-in-a-year/. Accessed: January 28, 2023.
Kim, Kyungmi, and Johnson, Marcia K. 2014. Extended self: Spontaneous activation of medial prefrontal cortex by objects that are ‘mine’. Social Cognitive and Affective Neuroscience, 9(7), 1006–1012. https://doi.org/10.1093/scan/nst082
Kim, Sehoon, Hooper, Coleman, Wattanawong, Thanakul, et al. 2023. Full stack optimization of transformer inference: A survey. https://arxiv.org/abs/2302.14017
Kimball, R., and Caserta, J. 2011. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. Wiley.
Kirillov, Alexander, Mintun, Eric, Ravi, Nikhila, et al. 2023. Segment anything. https://arxiv.org/abs/2304.02643
Klaise, Janis, Looveren, Arnaud Van, Cox, Clive, Vacanti, Giovanni, and Coca, Alexandru. 2020. Monitoring and explainability of models in production. https://arxiv.org/abs/2007.06299
Klarman, Herbert E, and Rosenthal, Gerald D. 1968. Cost effectiveness analysis applied to the treatment of chronic renal disease. Medical Care, 6(1), 48–54. https://doi.org/10.1097/00005650-196801000-00005
Kohavi, R., Tang, D., and Xu, Y. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press. https://doi.org/10.1017/9781108653985
Kohtala, Sampsa, and Steinert, Martin. 2021. Leveraging synthetic data from CAD models for training object detection models – A VR industry application case. Procedia CIRP, 100, 714–719. 31st CIRP Design Conference 2021 (CIRP Design 2021). https://doi.org/10.1016/j.procir.2021.05.092
Krizhevsky, Alex, Sutskever, Ilya, and Hinton, Geoffrey E. 2012. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
Kuhn, M., and Johnson, K. 2019. Feature Engineering and Selection: A Practical Approach for Predictive Models. Chapman & Hall/CRC Data Science Series. CRC Press. https://doi.org/10.1201/9781315108230
Kurtic, Eldar, Frantar, Elias, and Alistarh, Dan. 2023. ZipLM: Inference-aware structured pruning of language models. https://arxiv.org/abs/2302.04089
Kwon, Woosuk, Li, Zhuohan, Zhuang, Siyuan, et al. 2023. Efficient memory management for large language model serving with PagedAttention. https://doi.org/10.48550/arXiv.2309.06180.
Lebanon, G., and El-Geish, M. 2018. Computing with Data: An Introduction to the Data Industry. Springer International Publishing. https://doi.org/10.1007/978-3-319-98149-9
LeCun, Yann, Bengio, Yoshua, and Hinton, Geoffrey. 2015. Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Lee, Dongwook, Kim, Junyoung, Moon, Won-Jin, and Ye, Jong Chul. 2019. CollaGAN: Collaborative GAN for missing image data imputation. https://arxiv.org/abs/1901.09764. https://doi.org/10.1109/CVPR.2019.00259
Lee, Mina, Srivastava, Megha, Hardy, Amelia, et al. 2022. Evaluating human-language model interaction. https://doi.org/10.48550/arXiv.2212.09746
Lehmann, E.L., and Romano, Joseph P. 2022. Testing Statistical Hypotheses. Springer International Publishing. https://doi.org/10.1007/978-3-030-70578-7
Lester, Brian, Al-Rfou, Rami, and Constant, Noah. 2021. The power of scale for parameter-efficient prompt tuning. https://arxiv.org/abs/2104.08691. https://doi.org/10.18653/v1/2021.emnlp-main.243
Lester, Brian, and Constant, Noah. 2022. Guiding frozen language models with learned soft prompts. https://research.google/blog/guiding-frozen-language-models-with-learned-soft-prompts/. Accessed: August 2, 2025.
Lewis, David D, and Catlett, Jason. 1994. Heterogeneous uncertainty sampling for supervised learning. Pages 148–156 of: ICML’94: Proceedings of the Eleventh International Conference on Machine Learning. Elsevier. www.cs.cornell.edu/courses/cs6740/2010fa/papers/lewis-catlett-uncertainty-sampling.pdf. Accessed: August 2, 2025.
Li, J., Li, D., Xiong, C., and Hoi, S. C. H. 2022. BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In: Proceedings of the 39th International Conference on Machine Learning.
Li, Liam, Jamieson, Kevin, Rostamizadeh, Afshin, et al. 2020a. A system for massively parallel hyperparameter tuning. Proceedings of Machine Learning and Systems, 2, 230–246.
Li, Shen, Zhao, Yanli, Varma, Rohan, et al. 2020b. PyTorch distributed: Experiences on accelerating data parallel training. https://arxiv.org/abs/2006.15704
Li, Yijun, Fang, Chen, Yang, Jimei, et al. 2017. Universal style transfer via feature transforms. https://arxiv.org/abs/1705.08086
Liang, Percy, Bommasani, Rishi, Lee, Tony, et al. 2022. Holistic evaluation of language models. https://doi.org/10.48550/arXiv.2211.09110
Liaw, Richard, Liang, Eric, Nishihara, Robert, et al. 2018. Tune: A research platform for distributed model selection and training. https://doi.org/10.48550/arXiv.1807.05118
Likert, Rensis. 1932. A technique for the measurement of attitudes. Archives of Psychology. https://psycnet.apa.org/record/1933-01885-001
Liu, Haohe, Yuan, Yi, Liu, Xubo, et al. 2024a. AudioLDM 2: Learning holistic audio generation with self-supervised pretraining. https://arxiv.org/abs/2308.05734. https://doi.org/10.1109/TASLP.2024.3399607
Liu, Haokun, Tam, Derek, Muqeeth, Mohammed, et al. 2022a. Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning. https://arxiv.org/abs/2205.05638
Liu, Huan, and Motoda, Hiroshi. 2012. Feature Selection for Knowledge Discovery and Data Mining. Springer Science & Business Media.
Liu, Shilong, Zeng, Zhaoyang, Ren, Tianhe, et al. 2024b. Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection. https://arxiv.org/abs/2303.05499. https://doi.org/10.1007/978-3-031-72970-6_3
Liu, Xiao, Ji, Kaixuan, Fu, Yicheng, et al. 2022b. P-Tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks. https://arxiv.org/abs/2110.07602. https://doi.org/10.18653/v1/2022.acl-short.8
Lojek, B. 2007. History of Semiconductor Engineering. Springer.
Lombardo, M.M., and Eichinger, R.W. 2010. The Career Architect Development Planner: A Systematic Approach to Development Including 103 Research-based and Experience-tested Development Plans and Coaching Tips. Korn/Ferry International.
Lundberg, Scott M, and Lee, Su-In. 2017. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30.
Madhu, Aswathy, and Suresh, K. 2022. EnvGAN: A GAN-based augmentation to improve environmental sound classification. Artificial Intelligence Review, 55(8), 6301–6320. https://doi.org/10.1007/s10462-022-10153-0
Mamooler, Sepideh, Lebret, Rémi, Massonnet, Stephane, and Aberer, Karl. 2022. An efficient active learning pipeline for legal text classification. NLLP 2022 - Natural Legal Language Processing Workshop 2022, Proceedings of the Workshop, 11, 345–358. https://doi.org/10.18653/v1/2022.nllp-1.32
Mann, Henry B., and Whitney, Donald R. 1947. On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18, 50–60. https://doi.org/10.1214/aoms/1177730491
Mavridis, Themis, Hausl, Soraya, Mende, Andrew, and Pagano, Roberto. 2020. Beyond algorithms: Ranking at scale at Booking.com. In: ComplexRec-ImpactRS@RecSys.
McBride, Sean. 2018 (Oct). RICE: Simple prioritization for product managers. www.intercom.com/blog/rice-simple-prioritization-for-product-managers. Accessed: November 24, 2022.
Mell, Stephen, Brown, Olivia, Goodwin, Justin, and Son, Sung-Hyun. 2020. Safe predictors for enforcing input-output specifications. https://doi.org/10.48550/arXiv.2001.11062
Mikolov, Tomas, Chen, Kai, Corrado, Greg, and Dean, Jeffrey. 2013. Efficient estimation of word representations in vector space. https://doi.org/10.48550/arXiv.1301.3781
Miller, C. 2022. Chip War: The Fight for the World's Most Critical Technology. Scribner.
Mitchell, Margaret, Wu, Simone, Zaldivar, Andrew, et al. 2019. Model cards for model reporting. In: Proceedings of the Conference on Fairness, Accountability, and Transparency. FAT* '19.
Mohandas, Goku. 2023. Testing Machine Learning Systems – Made With ML. https://madewithml.com/courses/mlops/testing/
Moody, John. 1988. Fast learning in multi-resolution hierarchies. Advances in Neural Information Processing Systems, 1.
Mucsányi, Bálint, Kirchhof, Michael, Nguyen, Elisa, Rubinstein, Alexander, and Oh, Seong Joon. 2023. Trustworthy machine learning. https://doi.org/10.48550/arXiv.2310.08215
Munro, Rob. 2021. Human-in-the-Loop Machine Learning. Manning Publications.
Norton, Michael I, Mochon, Daniel, and Ariely, Dan. 2012. The IKEA effect: When labor leads to love. Journal of Consumer Psychology, 22(3), 453460.10.1016/j.jcps.2011.08.002CrossRefGoogle Scholar
NVIDIA. 2025. Serving large language models: Run:AI benchmarking study. Accessed: March 23, 2025.
Odena, Augustus. 2016. Semi-supervised learning with generative adversarial networks. https://arxiv.org/abs/1406.2661
O'Gorman, Timothy J, Ross, John M, Taber, Allen H, et al. 1996. Field testing for cosmic ray soft errors in semiconductor memories. IBM Journal of Research and Development, 40(1), 41–50. https://doi.org/10.1147/rd.401.0041
OpenAI, Achiam, Josh, Adler, Steven, et al. 2024. GPT-4 Technical Report. https://arxiv.org/abs/2303.08774
Ouyang, Long, Wu, Jeff, Jiang, Xu, et al. 2022. Training language models to follow instructions with human feedback. https://doi.org/10.48550/arXiv.2203.02155
Pallier, Gerry, Wilkinson, Rebecca, Danthiir, Vanessa, et al. 2002. The role of individual differences in the accuracy of confidence judgments. The Journal of General Psychology, 129(3), 257–299. https://doi.org/10.1080/00221300209602099
Papineni, Kishore, Roukos, Salim, Ward, Todd, and Zhu, Wei-Jing. 2002. Bleu: A method for automatic evaluation of machine translation. Pages 311–318 of: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. ACL.
Pearson, K., and Galton Laboratory for National Eugenics. 1895. Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society. Royal Society.
Pearson, Karl. 1900. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 50(302), 157–175. https://doi.org/10.1080/14786440009463897
Pennington, Jeffrey, Socher, Richard, and Manning, Christopher D. 2014. GloVe: Global vectors for word representation. Pages 1532–1543 of: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.3115/v1/D14-1162
Piezunka, Henning, and Dahlander, Linus. 2015. Distant search, narrow attention: How crowding alters organizations' filtering of suggestions in crowdsourcing. Academy of Management Journal, 58(3), 856–880. https://doi.org/10.5465/amj.2012.0458
Polyzotis, Neoklis, Roy, Sudip, Whang, Steven Euijong, and Zinkevich, Martin. 2018. Data lifecycle challenges in production machine learning: A survey. ACM SIGMOD Record, 47(2), 17–28. https://doi.org/10.1145/3299887.3299891
Pouyanfar, Samira, Tao, Yudong, Mohan, Anup, et al. 2018. Dynamic sampling in convolutional neural networks for imbalanced data classification. Pages 112–117 of: 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE.
Povey, Daniel, Ghoshal, Arnab, Boulianne, Gilles, et al. 2011. The Kaldi Speech Recognition Toolkit. In: IEEE 2011 Workshop on Automatic Speech Recognition and Understanding. IEEE Signal Processing Society. IEEE Catalog No.: CFP11SRW-USB.
Preston-Werner, Tom. 2012. Semantic versioning. https://semver.org/. Accessed: August 15, 2021.
Proust, M., and Scott-Moncrieff, C.K. 1929. The captive. A la recherche du temps perdu, pt. 2. A. & C. Boni.
Raad, Ragheb, Ray, Deep, Varghese, Bino, Hwang, Darryl, Gill, Inderbir, Duddalwar, Vinay, and Oberai, Assad A. 2023. Conditional generative learning for medical image imputation. bioRxiv. https://doi.org/10.1038/s41598-023-50566-7
Rabanser, S., Günnemann, S., and Lipton, Z. C. 2019. Failing loudly: An empirical study of methods for detecting dataset shift. Advances in Neural Information Processing Systems, 32.
Radford, Alec, Wu, Jeff, Child, Rewon, et al. 2019a. Language models are unsupervised multitask learners. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf Accessed: August 24, 2025.
Radford, Alec, Wu, Jeffrey, Child, Rewon, et al. 2019b. Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
Radford, Alec, Kim, Jong Wook, Hallacy, Chris, et al. 2021. Learning transferable visual models from natural language supervision. http://proceedings.mlr.press/v139/radford21a/radford21a.pdf
Rahman, Mezbahur. 1999. Estimating the Box-Cox transformation via Shapiro–Wilk W statistic. Communications in Statistics-Simulation and Computation, 28(1), 223–241. https://doi.org/10.1080/03610919908813545
Rajbhandari, Samyam, Rasley, Jeff, Ruwase, Olatunji, and He, Yuxiong. 2020. ZeRO: Memory optimizations toward training trillion parameter models. Pages 1–16 of: SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE.
Ratner, Alexander, Sa, Christopher De, Wu, Sen, Selsam, Daniel, and Ré, Christopher. 2016. Data programming: Creating large training sets, quickly. Advances in Neural Information Processing Systems, 5, 3574–3582.
Ratner, Alexander, Bach, Stephen H., Ehrenberg, Henry, et al. 2017. Snorkel: Rapid training data creation with weak supervision. Proceedings of the VLDB Endowment, 11(11), 269–282. https://doi.org/10.14778/3157794.3157797
Rebuffi, Sylvestre-Alvise, Gowal, Sven, Calian, Dan Andrei, et al. 2021. Data augmentation can improve robustness. Advances in Neural Information Processing Systems, 34(12), 29935–29948.
Recht, Benjamin, Roelofs, Rebecca, Schmidt, Ludwig, and Shankar, Vaishaal. 2019. Do ImageNet classifiers generalize to ImageNet? Pages 5389–5400 of: Proceedings of the 36th International Conference on Machine Learning.
Redmon, Joseph, Divvala, Santosh Kumar, Girshick, Ross B., and Farhadi, Ali. 2015. You only look once: Unified, real-time object detection. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 779–788.
Ribeiro, Flávio, Florêncio, Dinei, Zhang, Cha, and Seltzer, Michael. 2011. CrowdMOS: An approach for crowdsourcing mean opinion score studies. Pages 2416–2419 of: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. 2016. “Why should I trust you?” Explaining the predictions of any classifier. Pages 1135–1144 of: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. https://doi.org/10.1145/2939672.2939778
Ribeiro, Marco Tulio, Wu, Tongshuang, Guestrin, Carlos, and Singh, Sameer. 2020. Beyond Accuracy: Behavioral testing of NLP models with checklist. In: Association for Computational Linguistics (ACL). https://aclanthology.org/2020.acl-main.442/. https://doi.org/10.18653/v1/2020.acl-main.442
Ruder, Manuel, Dosovitskiy, Alexey, and Brox, Thomas. 2016. Artistic style transfer for videos. Pages 26–36 of: Pattern Recognition. 38th German Conference, GCPR 2016, Hannover, Germany, September 12-15, 2016, Proceedings. Springer International Publishing.
Russakovsky, Olga, Deng, Jia, Su, Hao, et al. 2015. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115, 211–252. https://doi.org/10.1007/s11263-015-0816-y
Saeed, Aaqib, Grangier, David, and Zeghidour, Neil. 2020. Contrastive learning of general-purpose audio representations. https://arxiv.org/abs/2010.10915. https://doi.org/10.1109/ICASSP39728.2021.9413528
Sambasivan, Nithya, Kapania, Shivani, Highfill, Hannah, et al. 2021a. Everyone wants to do the model work, not the data work: Data cascades in high-stakes AI. Pages 1–15 of: Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery.
Sanseviero, O., Cuenca, P., Passos, A., and Whitaker, J. 2024. Hands-On Generative AI with Transformers and Diffusion Models. O’Reilly Media, Inc.
Sayin, Burcu, Krivosheev, Evgeny, Yang, Jie, Passerini, Andrea, and Casati, Fabio. 2021. A review and experimental analysis of active learning over crowdsourced data. Artificial Intelligence Review, 54(10), 5283–5305. https://doi.org/10.1007/s10462-021-10021-3
Shankar, Shreya, Garcia, Rolando, Hellerstein, Joseph M, and Parameswaran, Aditya G. 2022. Operationalizing machine learning: An interview study. https://doi.org/10.48550/arXiv.2209.09125
Sharot, Tali, Kanai, Ryota, Marston, David, et al. 2012. Selectively altering belief formation in the human brain. Proceedings of the National Academy of Sciences, 109(42), 17058–17062. https://doi.org/10.1073/pnas.1205828109
Shazeer, Noam, Cheng, Youlong, Parmar, Niki, et al. 2018. Mesh-TensorFlow: Deep learning for supercomputers. Advances in Neural Information Processing Systems, 31.
Shenk, Kimberly. 2017. Measuring data science business value. https://blog.dominodatalab.com/measuring-data-science-business-value. Accessed: September 12, 2021.
Shoeybi, Mohammad, Patwary, Mostofa, Puri, Raul, et al. 2019. Megatron-LM: Training multi-billion parameter language models using model parallelism. https://doi.org/10.48550/arXiv.1909.08053
Shu, Guanghua, and Khanna, Sahil. 2022 (Sep). Lessons learned: The journey to realtime machine learning at Instacart. www.instacart.com/company/how-its-made/lessons-learned-the-journey-to-real-time-machine-learning-at-instacart/. Accessed: December 22, 2024.
Silver, David, Huang, Aja, Maddison, Chris J., et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(1), 484–489. https://doi.org/10.1038/nature16961
Simon, Herbert A. 1956. Rational choice and the structure of the environment. Psychological Review, 63(2), 129. https://doi.org/10.1037/h0042769
Simpson, Edward H. 1951. The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society: Series B (Methodological), 13(2), 238–241. https://doi.org/10.1111/j.2517-6161.1951.tb00088.x
Snoek, Jasper, Larochelle, Hugo, and Adams, Ryan P. 2012. Practical Bayesian optimization of machine learning algorithms. Advances in Neural Information Processing Systems, 25.
Song, Xingyou, Perel, Sagi, Lee, Chansoo, Kochanski, Greg, and Golovin, Daniel. 2022. Open source Vizier: Distributed infrastructure and API for reliable and flexible black-box optimization. In: Proceedings of the First International Conference on Automated Machine Learning, PMLR 188:8/1-17.
Song, Xingyou, Zhang, Qiuyi, Lee, Chansoo, et al. 2024. The Vizier Gaussian process bandit algorithm. Google DeepMind Technical Report. https://arxiv.org/abs/2408.11527
Spearman, C. 1904. The proof and measurement of association between two things. The American Journal of Psychology, 15(1), 72–101. https://doi.org/10.2307/1412159
Srinivas, Niranjan, Krause, Andreas, Kakade, Sham M, and Seeger, Matthias. 2009. Gaussian process optimization in the bandit setting: No regret and experimental design. https://doi.org/10.48550/arXiv.0912.3995
Srivastava, Nitish, Hinton, Geoffrey, Krizhevsky, Alex, Sutskever, Ilya, and Salakhutdinov, Ruslan. 2014. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929–1958.
Stein, David. 2022 (Apr). Open sourcing Feathr – LinkedIn’s feature store for productive machine learning. www.linkedin.com/blog/engineering/open-source/open-sourcing-feathr-linkedin-s-feature-store-for-productive-m. Accessed: January 31, 2024.
Stevens, Stanley Smith. 1946. On the theory of scales of measurement. Science, 103(2684), 677–680. https://doi.org/10.1126/science.103.2684.677
Student. 1908. The probable error of a mean. Biometrika, 6(1), 1–25. https://doi.org/10.2307/2331554
Sun, Chen, Shrivastava, Abhinav, Singh, Saurabh, and Gupta, Abhinav. 2017. Revisiting unreasonable effectiveness of data in deep learning era. Pages 843–852 of: Proceedings of the IEEE International Conference on Computer Vision. Institute of Electrical and Electronics Engineers (IEEE).
Sutton, Rich. 2019. The Bitter Lesson. Incomplete Ideas, March. www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf. Accessed: December 18, 2024.
Szegedy, Christian, Zaremba, Wojciech, Sutskever, Ilya, et al. 2013. Intriguing properties of neural networks. https://doi.org/10.48550/arXiv.1312.6199
Tang, Y. 2024. Distributed Machine Learning Patterns. Manning Publications.
Tetko, Igor V., Karpov, Pavel, Bruno, Eric, Kimber, Talia B., and Godin, Guillaume. 2019. Augmentation is what you need! Pages 831–835 of: Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. https://doi.org/10.1007/978-3-030-30493-5_79
Torra, V. 2017. Data privacy: Foundations, new developments and the big data challenge. Studies in big data. Springer International Publishing. https://doi.org/10.1007/978-3-319-57358-8
Trabucco, Brandon, Doherty, Kyle, Gurinas, Max A, and Salakhutdinov, Ruslan. 2024. Effective data augmentation with diffusion models. In: The 12th International Conference on Learning Representations.
Tsymbal, Alexey. 2004. The problem of concept drift: Definitions and related work. Computer Science Department, Trinity College Dublin, 106(2), 58.
Tu, Huy, and Menzies, Tim. 2023. Less, but stronger: On the value of strong heuristics in semi-supervised learning for software analytics. https://arxiv.org/abs/2302.01997
Tukey, J. 2019. Exploratory Data Analysis. Pearson Modern Classics for Advanced Mathematics. Pearson.
Tversky, Amos, and Kahneman, Daniel. 1974. Judgment under uncertainty: Heuristics and biases: Biases in judgments reveal some heuristics of thinking under uncertainty. Science, 185(4157), 1124–1131. https://doi.org/10.1126/science.185.4157.1124
Tversky, Amos, and Kahneman, Daniel. 1981. The framing of decisions and the psychology of choice. Science, 211(4481), 453–458. https://doi.org/10.1126/science.7455683
TwitchTV. 2023. Twirp. https://github.com/twitchtv/twirp Accessed: January 28, 2023.
Utley, J., Klebahn, P., and Kelley, D. 2022. Ideaflow: The only business metric that matters. Penguin Publishing Group.
Varuna Jayasiri, Nipun Wijerathne. 2020. LabML: A library to organize machine learning experiments. https://nn.labml.ai/. Accessed: February 4, 2023.
Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, et al. 2017. Attention is all you need. Advances in Neural Information Processing Systems, 30.
Viikki, Olli, and Laurila, Kari. 1998. Cepstral domain segmental feature vector normalization for noise robust speech recognition. Speech Communication, 25(1–3), 133–147. https://doi.org/10.1016/S0167-6393(98)00033-8
Vul, Edward, and Pashler, Harold. 2008. Measuring the crowd within: Probabilistic representations within individuals. Psychological Science, 19(7), 645–647. https://doi.org/10.1111/j.1467-9280.2008.02136.x
Walton, M. 1986. The Deming Management Method. Penguin Group (USA) Incorporated. Page 96.
Wan, Lulu, Papageorgiou, George, Seddon, Michael, and Bernardoni, Mirko. 2019. Long-length legal document classification. https://doi.org/10.48550/arXiv.1912.06905
Wang, Xiaofang, Kondratyuk, Dan, Christiansen, Eric, et al. 2021. Wisdom of committees: An overlooked approach to faster and more accurate models. In: International Conference on Learning Representations. https://arxiv.org/abs/2012.01988
Wang, Xiaosong, Peng, Yifan, Lu, Le, et al. 2017. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, January 3462–3471.
Wei, Jason, Bosma, Maarten, Zhao, Vincent Y, et al. 2021. Finetuned language models are zero-shot learners. https://doi.org/10.48550/arXiv.2109.01652
Wei, Jason, Tay, Yi, Bommasani, Rishi, et al. 2022. Emergent abilities of large language models. https://doi.org/10.48550/arXiv.2206.07682
Weinberger, Kilian, Dasgupta, Anirban, Langford, John, Smola, Alex, and Attenberg, Josh. 2009. Feature hashing for large scale multitask learning. Pages 1113–1120 of: Proceedings of the 26th Annual International Conference on Machine Learning. https://doi.org/10.1145/1553374.1553516
Weng, Lilian. 2022. Learning with not enough data Part 2: Active learning. lilianweng.github.io, Feb. Accessed: May 20, 2023.
Weng, Lilian. 2023a (October). Adversarial attacks on LLMs. https://arxiv.org/abs/2410.19160
Weng, Lilian. 2023b (June). LLM powered autonomous Agents. https://lilianweng.github.io/posts/2023-06-23-agent/
Werner de Vargas, Vitor, Schneider Aranda, Jorge Arthur, dos Santos Costa, Ricardo, da Silva Pereira, Paulo Ricardo, and Victória Barbosa, Jorge Luis. 2023. Imbalanced data preprocessing techniques for machine learning: A systematic mapping study. Knowledge and Information Systems, 65(1), 31. https://doi.org/10.1007/s10115-022-01772-8
Wexler, James, Pushkarna, Mahima, Bolukbasi, Tolga, et al. (eds). 2019. The what-if tool: interactive probing of machine learning models. https://arxiv.org/abs/1907.04135. https://doi.org/10.1109/TVCG.2019.2934619
Widmer, Gerhard, and Kubat, Miroslav. 1996. Learning in the presence of concept drift and hidden contexts. Machine Learning, 23, 69–101. https://doi.org/10.1023/A:1018046501280
Wu, Haoning, Zhang, Zicheng, Zhang, Weixia, et al. 2023. Q-Align: Teaching LMMs for visual scoring via discrete text-defined levels. https://doi.org/10.48550/arXiv.2312.17090
Wu, Xing, Chen, Cheng, Zhong, Mingyu, Wang, Jianjia, and Shi, Jun. 2021. COVIDAL: The diagnosis of COVID-19 with deep active learning. Medical Image Analysis, 68(2), 101913. https://doi.org/10.1016/j.media.2020.101913
Xie, Qizhe, Dai, Zihang, Hovy, Eduard, Luong, Minh-Thang, and Le, Quoc V. 2020. Unsupervised data augmentation for consistency training. Advances in Neural Information Processing Systems, 33, 6256–6268.
Xu, Jiazheng, Liu, Xiao, Wu, Yuchen, Tong, Yuxuan, et al. 2023. ImageReward: Learning and evaluating human preferences for text-to-image generation. https://arxiv.org/abs/2304.05977
Xu, Yuanzhong, Lee, HyoukJoong, Chen, Dehao, et al. 2020. Automatic cross-replica sharding of weight update in data-parallel training. https://doi.org/10.48550/arXiv.2004.13336
Yan, Jinyun, Tiwana, Birjodh, Ghosh, Souvik, Liu, Haishan, and Chatterjee, Shaunak. 2019. Measuring long-term impact of ads on LinkedIn feed. https://arxiv.org/abs/1902.03098
Yang, Ling, Zhang, Zhilong, Song, Yang, et al. 2024. Diffusion models: A comprehensive survey of methods and applications. https://arxiv.org/abs/2209.00796
Yang, Yuedong, Chiang, Hung-Yueh, Li, Guihong, Marculescu, Diana, and Marculescu, Radu. 2023. Efficient Low-rank Backpropagation for Vision Transformer Adaptation. https://arxiv.org/abs/2309.15275
Yeo, In-Kwon, and Johnson, Richard A. 2000. A new family of power transformations to improve normality or symmetry. Biometrika, 87(4), 954–959. https://doi.org/10.1093/biomet/87.4.954
Yuksel, Kamer Ali, Ferreira, Thiago, Gunduz, Ahmet, Al-Badrashiny, Mohamed, and Javadi, Golara. 2023. A reference-less quality metric for automatic speech recognition via contrastive-learning of a multi-language model with self-supervision. Pages 1–5 of: 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE.
Zaharia, Matei, Davidson, Aaron, et al. 2018. MLflow: Open source platform for the machine learning lifecycle. https://github.com/mlflow/mlflow. Version 2.21.2, Accessed: March 27, 2025.
Zhang, Manlin, Wu, Jie, Ren, Yuxi, et al. 2023. DiffusionEngine: Diffusion model is scalable data engine for object detection. https://arxiv.org/abs/2309.03893. https://doi.org/10.2139/ssrn.4866102
Zhang, Shuai, Yao, Lina, Sun, Aixin, and Tay, Yi. 2019. Deep learning based recommender system: A survey and new perspectives. ACM Computing Surveys, 52(1), 1–38. https://doi.org/10.1145/3158369
Zheng, A., and Casari, A. 2018. Feature Engineering for Machine Learning: Principles and Techniques for Data Scientists. O’Reilly Media.
Zhong, Zihan, Tang, Zhiqiang, He, Tong, Fang, Haoyang, and Yuan, Chun. 2024. Convolution Meets LoRA: Parameter efficient finetuning for segment anything model. In: The 12th International Conference on Learning Representations.
Zhu, Jun-Yan, Park, Taesung, Isola, Phillip, and Efros, Alexei A. 2020. Unpaired image-to-image translation using cycle-consistent adversarial networks. https://arxiv.org/abs/1703.10593