This chapter explores the fundamentals of data in data science, covering data types (structured vs. unstructured), collection sources (open data, social media APIs, multimodal data, synthetic data), and storage formats (CSV, TSV, XML, RSS, JSON). It emphasizes the critical importance of data pre-processing, including data cleaning (handling missing values, smoothing noisy data, data munging), integration, transformation, reduction, and discretization. Through hands-on examples, the chapter demonstrates how to systematically prepare "dirty" real-world data for analysis by addressing inconsistencies, outliers, and missing information. The chapter highlights that data preparation is often half the battle in data science, requiring both technical skills and careful attention to data quality and bias.
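As a flavor of what such pre-processing walkthroughs involve, the minimal Python/pandas sketch below imputes missing numeric values and normalizes an inconsistent categorical column; the column names, values, and imputation choices are purely illustrative and not taken from the chapter.

```python
import numpy as np
import pandas as pd

# Hypothetical "dirty" data: column names and values are illustrative only.
df = pd.DataFrame({
    "age":    [23, np.nan, 31, 45, np.nan],
    "income": [42000, 55000, np.nan, 61000, 58000],
    "city":   ["Seattle", "seattle", "Boston", None, "Boston"],
})

# Handle missing numeric values by imputing the column mean or median.
df["age"] = df["age"].fillna(df["age"].mean())
df["income"] = df["income"].fillna(df["income"].median())

# Clean inconsistent categorical values and drop rows still missing a city.
df["city"] = df["city"].str.title()
df = df.dropna(subset=["city"])

print(df)
```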
The chapter introduces key codesign principles across multiple layers of the design stack, highlighting the need for cross-layer optimizations. Mitigation of various non-idealities stemming from emerging devices, such as device-to-device variations, cycle-to-cycle variations, conductance drift, and stuck-at faults, through algorithm–hardware codesign is discussed. Further, inspiration from the brain’s self-repair mechanism is utilized to design neuromorphic systems capable of autonomous self-repair. Finally, an end-to-end codesign approach is outlined by exploring synergies of event-driven hardware and algorithms with event-driven sensors, thereby leveraging the maximal benefits of brain-inspired computing.
The dynamics of information diffusion on social media platforms vary significantly between individual communities and the broader population. This study explores and compares the differences between community-based interventions and population-wide approaches in adjusting the spread of information. We first examine the temporal dynamics of social media groups, assessing their behavior through metrics such as time-dependent posts and retweets. Using functional data analysis, we investigate Twitter activities related to incidents such as the Skripal/Novichok case. We present three ways to quantify disparities between communities and uncover the strategies used by each group to promote specific narratives. We then compare the impact of targeted, community-based interventions with that of broader, population-wide responses in shaping the diffusion of information. Through this analysis, we identify key differences in how communities engage with and amplify information, revealing distinct patterns in the diffusion process. Our findings provide a comparative framework for understanding the relative consequences of different intervention strategies, offering insights into how targeted and broad approaches influence public discourse across social media platforms.
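To illustrate the kind of time-dependent activity curves such an analysis starts from, the short Python sketch below bins synthetic post timestamps into hourly counts and smooths them into a curve-like representation suitable for functional data analysis; the data and parameters are invented for illustration and are not the study's.

```python
import numpy as np
import pandas as pd

# Hypothetical post timestamps for one community (synthetic, illustrative only).
rng = np.random.default_rng(0)
timestamps = pd.to_datetime("2018-03-04") + pd.to_timedelta(
    rng.exponential(scale=3.0, size=500).cumsum(), unit="h"
)
activity = pd.Series(1, index=timestamps)

# Bin posts into hourly counts, then smooth with a rolling mean to obtain a
# functional (curve-like) view of community activity over time.
hourly = activity.resample("1h").sum()
smoothed = hourly.rolling(window=6, center=True, min_periods=1).mean()
print(smoothed.head())
```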
This research examines occupants’ ability to detect lighting differences as a function of proximity to the illuminated area. By understanding how proximity influences light difference detection, energy-saving lighting design techniques can be developed that do not negatively impact the appearance of architectural interiors. The experiment examined vertical surface illumination, hypothesizing that vertical illuminance difference detection thresholds would increase with greater spatial separation from the observer, regardless of gaze conditions. Illuminance was selected as the assessment metric, aligning with commonly used standards such as the European Standard and the Australian/New Zealand Standard. Eighty participants viewed a 10.0 m × 2.4 m vertical wall divided into five sections, using a five-alternative forced-choice method to identify the dimmer section. Eight experimental conditions manipulated participant position and gaze, with each subject completing 10 trials for 20 lighting conditions. Participants’ ability to detect lighting differences was very poor at the end portions of the wall, regardless of position or gaze. Results suggest that vertical illuminance in temporarily unoccupied areas can be reduced by at least 10% without affecting perceived illumination quality, and that greater reductions of 25% can be achieved in room corners. These findings provide a foundation for future research into illuminance optimization across all surfaces within architectural spaces.
This chapter provides a selection of problems relevant to neuromorphic computing, a field that intersects materials science, electrical engineering, computer science, neural networks, and device design for realizing AI in hardware and algorithms. The problems underscore the interdisciplinary nature of neuromorphic computing.
This introductory chapter defines data science as a field focused on collecting, storing, and processing data to derive meaningful insights for decision-making. It explores data science applications across diverse sectors including finance, healthcare, politics, public policy, urban planning, education, and libraries. The chapter examines how data science relates to statistics, computer science, engineering, business analytics, and information science, while introducing computational thinking as a fundamental skill. It discusses the explosive growth of data (the 3Vs: velocity, volume, variety) and essential skills for data scientists, including statistical knowledge, programming abilities, and data literacy. The chapter concludes by addressing critical ethical concerns around privacy, bias, and fairness in data science practice.
As data are becoming increasingly important resources for municipal administrations in the context of urban development, formalization of urban data governance (DG) is considered a prerequisite for systematic municipal data practice for the common good. Unlike for larger cities, it is unclear how common such formalized DG is in rural districts and small towns. We therefore mapped the status quo in small municipalities in Germany as a case exemplifying the broader phenomenon. We systematically searched online for policy documents on DG in all metropolitan regions, all rural districts, and a quota sample of nearly a sixth of all German small towns. We then performed content analysis of the identified documents along predefined categories of urban development. Results show that hardly any small towns have relevant policy documents. Rural districts are somewhat more active in formally defining DG. The identified policy documents tend to address mostly economic activities, social infrastructure, and demography, whereas housing and urban design and public space are among the least mentioned categories of urban development.
This chapter focuses on data collection methods, analysis approaches, and evaluation techniques in data science. It covers various data collection methods including surveys (with different question types like multiple-choice, Likert scales, and open-ended questions), interviews, focus groups, diary studies, and user studies in lab and field settings.
The chapter distinguishes between quantitative methods (using numerical measurements and statistical analysis) and qualitative methods (observing behaviors, attitudes, and opinions through techniques like grounded theory and constant comparison). It also discusses mixed-method approaches that combine both methodologies.
For evaluation, the chapter explains model comparison metrics including precision, recall, F-measure, ROC curves, AIC, and BIC. It covers validation techniques like training-testing splits, A/B testing, and cross-validation methods. The chapter emphasizes that data science involves pre-data collection planning and post-analysis evaluation, not just data processing.
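As a concrete reminder of how the basic comparison metrics are computed, the short Python sketch below derives precision, recall, and the F-measure from a hypothetical set of binary predictions; the labels are illustrative only and not from the chapter.

```python
# Hypothetical binary predictions vs. ground truth (illustrative labels only).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)   # fraction of predicted positives that are correct
recall = tp / (tp + fn)      # fraction of actual positives that are found
f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"precision={precision:.2f} recall={recall:.2f} F={f_measure:.2f}")
```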
The chapter discusses concepts in plasticity that go beyond memory. Several examples are presented, starting with the complexity of dendritic structure in biological neurons, the nonlinear summation of synaptic signals by neurons, and the vast range of plasticity that has been discovered in biological brain circuits. Learning and memory are commonly attributed to synapses; however, non-synaptic changes are also important to consider for neuromorphic hardware and algorithms. The distinction between bioinspired and bio-realistic designs of hardware for AI is discussed. While synaptic connections can undergo both functional and structural plasticity, emulating such concepts in neuromorphic computing will require adaptive algorithms and semiconductors that can be dynamically reprogrammed. The need for close collaboration between the neuroscience and neuromorphic engineering communities is highlighted. Methods to implement lifelong learning in algorithms and hardware are described, along with gaps in the field, directions for future research and development, and the prospects for energy-efficient neuromorphic computing built on disruptive brain-inspired algorithms and emerging semiconductors.
The chapter begins with the physics and mathematical description of the nonlinear dynamics seen in biological neurons and their adaptation into neuromorphic hardware. Various abstractions of the Hodgkin–Huxley model of the squid neuron have been studied in neuromorphic computing. Filamentary threshold switches that can act as neurons are discussed. The combination of ionic and electronic relaxation pathways offers unique abilities to design low-power artificial neurons. Ferroelectric, insulator–metal transition, 2D material, and organic semiconductor-based neurons are discussed, wherein modulation of long-range transport and/or bound-charge displacement is utilized for neuron function. Besides electron transport, light and spin state can also be effectively utilized to create photonic and spintronic neurons, respectively. The chapter should provide the reader with comprehensive insight into the design of artificial neurons that can generate action potentials, spanning various classes of inorganic and organic semiconductors and different stimuli for input and readout of signals, such as voltage, light, spin current, and ionic currents.
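For orientation, one of the simplest abstractions in this family is the leaky integrate-and-fire neuron. The Python sketch below shows its basic threshold-and-reset dynamics with illustrative parameters; it is an assumption-laden toy model, not a description of any specific device in the chapter.

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) neuron: a simple abstraction of
# Hodgkin-Huxley-style spiking dynamics (illustrative parameters only).
dt = 1e-4          # time step (s)
tau = 20e-3        # membrane time constant (s)
v_rest = 0.0       # resting potential (arbitrary units)
v_thresh = 1.0     # firing threshold
v_reset = 0.0      # reset potential after a spike

v = v_rest
spikes = []
time = np.arange(0.0, 0.2, dt)
input_current = 1.2 * np.ones_like(time)   # constant supra-threshold drive

for t, i_in in zip(time, input_current):
    # Leaky integration toward the input-driven steady state.
    v += dt / tau * (-(v - v_rest) + i_in)
    if v >= v_thresh:          # threshold crossing -> emit a spike
        spikes.append(t)
        v = v_reset            # reset membrane potential

print(f"{len(spikes)} spikes in {time[-1]:.2f} s")
```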
This chapter introduces cloud computing platforms essential for modern data science work. It covers three major cloud services: Google Cloud Platform (GCP), Microsoft Azure, and Amazon Web Services (AWS). Students learn to create virtual machines, configure storage, and access cloud resources through SSH connections. The chapter demonstrates hands-on Python development using browser-based IDEs like Google Colab, Azure Machine Learning notebooks, and AWS Cloud9. Key topics include setting up accounts, managing costs through free tiers, and leveraging cloud resources for data science projects. The chapter also covers Hadoop for big data processing and discusses platform migration strategies. Practical exercises guide students through currency conversion programs, interactive calculations, and Olympic year predictions, emphasizing that cloud computing skills are now essential for data science professionals due to scalable processing power and storage capabilities.
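To give a sense of the scale of these exercises, a currency-conversion program amounts to little more than the Python sketch below; the currency pair and fixed exchange rate are placeholders rather than values from the chapter.

```python
# Minimal sketch of a currency-conversion exercise.
# The exchange rate and currency pair are hypothetical, not from the chapter.
USD_PER_EUR = 1.10   # assumed fixed exchange rate

def eur_to_usd(amount_eur: float) -> float:
    """Convert an amount in euros to US dollars at the fixed rate."""
    return amount_eur * USD_PER_EUR

if __name__ == "__main__":
    amount = float(input("Amount in EUR: "))
    print(f"{amount:.2f} EUR = {eur_to_usd(amount):.2f} USD")
```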
This chapter introduces machine learning as a subset of artificial intelligence that enables computers to learn from data and make predictions without explicit programming. It defines machine learning through Tom Mitchell’s formal framework and explores real-world applications like self-driving cars, optical character recognition, and recommendation systems. The chapter focuses on regression as a fundamental machine learning technique, covering both linear modeling approaches and gradient descent algorithms for parameter optimization. Through hands-on examples using R, students learn to implement linear regression and gradient descent from scratch, understanding how models minimize error functions to find optimal parameters. The chapter emphasizes practical application over theoretical derivations.
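To make the gradient-descent idea concrete, here is a minimal from-scratch sketch of simple linear regression; it is written in Python for illustration (the chapter itself works in R), and the synthetic data and learning rate are assumptions.

```python
import numpy as np

# Toy data for y ≈ w * x + b (synthetic, illustrative only).
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=100)

w, b = 0.0, 0.0        # initial parameters
lr = 0.01              # learning rate
for _ in range(2000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"fitted w={w:.2f}, b={b:.2f}")   # should approach w≈3, b≈2
```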
This chapter introduces cloud computing platforms essential for modern data science work. It covers the three major providers: Google Cloud Platform (GCP), Microsoft Azure, and Amazon Web Services (AWS).
Key topics include setting up virtual machines, configuring SSH access, and running RStudio Server in browser-based environments on each platform. The chapter demonstrates how to migrate data science workflows from local machines to cloud infrastructure, providing scalable computing resources and storage.
Practical examples show installing R and RStudio on cloud VMs, accessing them through web browsers, and managing costs. The chapter emphasizes that cloud computing skills are now essential for data science practitioners, offering dynamic scaling, redundancy, and pay-as-you-use pricing models for computational resources.