
1 - What brought us here?

from Part 1 - Fundamentals

Published online by Cambridge University Press:  05 March 2014

Henrique C. M. Andrade (J. P. Morgan)
Buğra Gedik (Bilkent University, Ankara)
Deepak S. Turaga (IBM Thomas J. Watson Research Center, New York)

Summary

Overview

The world has become information-driven, with many facets of business and government being fully automated and their systems being instrumented and interconnected. On the one hand, private and public organizations have been investing heavily in deploying sensors and in the infrastructure to collect readings from these sensors on a continuous basis. On the other hand, the need to monitor and act on information from sensors in the field, in order to drive rapid decisions, to adjust production processes and logistics choices, and, ultimately, to better monitor and manage physical systems, is now fundamental to many organizations.

The emergence of stream processing was driven by increasingly stringent data management, processing, and analysis needs from business and scientific applications, coupled with the confluence of two major technological and scientific shifts: first, the advances in software and hardware technologies for database, data management, and distributed systems, and, second, the advances in supporting techniques in signal processing, statistics, data mining, and optimization theory.

In Section 1.2, we will look more deeply into the data processing requirements that led to the design of stream processing systems and applications. In Section 1.3, we will trace the roots of the theoretical and engineering underpinnings that enabled these applications, as well as the middleware supporting them. While providing this historical perspective, we will illustrate how stream processing uses and extends these fundamental building blocks.

Type: Chapter
Information: Fundamentals of Stream Processing: Application Design, Systems, and Analytics, pp. 3–32
Publisher: Cambridge University Press
Print publication year: 2014


