Computer Networks: An Algorithmic Approach is designed for undergraduate and early postgraduate students in computer science and electronics/telecommunications. It goes beyond explaining what protocols do by focusing on how they work through an algorithm-centric approach. Core topics such as routing, switching, congestion control, and network security are presented using clear, step-by-step methods that support problem-solving, design, analysis, and implementation. The book also covers modern developments including software-defined networking (SDN), cloud and edge networking, IoT, and 5G, along with dedicated sections on AI for computer networks and blockchain networking.
Teaching fundamental design concepts and the challenges of emerging technology, this textbook prepares students for a career designing the computer systems of the future. Self-contained yet concise, the material can be taught in a single semester, making it perfect for use in senior undergraduate and graduate computer architecture courses. This edition has a more streamlined structure, with the reliability and other technology background sections now included in the appendix. New material includes a chapter on GPUs, providing a comprehensive overview of their microarchitectures; sections focusing on new memory technologies and memory interfaces, which are key to unlocking the potential of parallel computing systems; and deeper coverage of memory hierarchies, including DRAM architectures, compression in memory hierarchies, and up-to-date coverage of prefetching. Practical examples demonstrate concrete applications of definitions, while the simple models and codes used throughout ensure the material is accessible to a broad range of computer engineering/science students.
Important concepts from the diverse fields of physics, mathematics, engineering and computer science coalesce in this foundational text on the cutting-edge field of quantum information. Designed for undergraduate and graduate students with any STEM background, and written by a highly experienced author team, this textbook draws on quantum mechanics, number theory, computer science technologies, and more, to delve deeply into learning about qubits, the building blocks of quantum information, and how they are used in quantum computing and quantum algorithms. The pedagogical structure of the chapters features exercises after each section as well as focus boxes, giving students the benefit of additional background and applications without losing sight of the big picture. Recommended further reading and answers to select exercises further support learning. Written in approachable and conversational prose, this text offers a comprehensive treatment of the exciting field of quantum information while remaining accessible to students and researchers within all STEM disciplines.
This chapter reviews techniques to address the processor-memory speed gap. We start with concepts behind modern memory hierarchies: the principle of locality of accesses, coherence in the memory hierarchy, and cache and memory inclusion. We then review the architecture of main memory systems, including the architecture of DRAM devices and DRAM systems. This is followed by concepts of cache hierarchies, including cache mappings and access, replacement and write policies, and classification of cache misses. We cover techniques needed to cope with processors exploiting high degrees of instruction-level parallelism, including lockup-free caches, cache prefetching, and preloading. The chapter reviews data compression in the memory hierarchy to allow for higher memory capacity and effective bandwidth. Finally, the chapter covers hardware support for virtual memory, page tables and translation lookaside buffers, and virtual address caches.
This chapter is devoted to design principles of multiprocessor systems, focusing on two architectural styles: shared-memory and message-passing. Both styles use multiple processors to achieve a linear speedup of computational power with the number of processors but differ in the method of data exchange. Processors in shared-memory multiprocessors share the same address space and can exchange data through shared-memory locations by regular load and store instructions. This chapter reviews the programming model abstractions for shared-memory and message-passing multiprocessors, then the semantics of message-passing primitives, the protocols needed, and architectural support to accelerate message processing. It covers support of a shared-memory model abstraction by reviewing the concept of cache coherence, the design space of snoopy-cache coherence protocols, classification of communication events, and translation-lookaside buffer consistency strategies. Scalable models of shared memory are treated, with an emphasis on the design of cache coherence solutions that can be applied at a large scale as well as the software techniques to deal with page mappings to exploit locality.
For the past 30 years we have lived through the information revolution, powered by the explosive growth of semiconductor integration and the internet. The exponential performance improvement of semiconductor devices was predicted by Moore’s law as early as the 1960s. Moore’s law predicts that the computing power of microprocessors will double every 18-24 months at constant cost so that their cost-effectiveness (the ratio between performance and cost) will grow at an exponential rate. It has been observed that the computing power of entire systems also grows at the same pace. This law has endured the test of time and remains valid today. This law will be tested repeatedly, both now and in the future, as many people today see strong evidence that the "end of the ride" is near, mostly because the miniaturization of CMOS technology is rapidly reaching its limit. This chapter reviews technology trends underpinning the evolution of computer systems. It also introduces metrics for performance comparison of computer systems and fundamental laws that drive the field of computer systems such as Amdahl’s law.
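The doubling rule and Amdahl's law mentioned above can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the book: the function names `growth_factor` and `amdahl_speedup` are hypothetical, the doubling model is an idealization of the 18-24 month figure quoted in the abstract, and the Amdahl formula is the standard textbook form, speedup = 1 / ((1 - f) + f / n), where f is the fraction of work that can be sped up and n is the speedup applied to that fraction.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Idealized growth in computing power after `years`,
    assuming it doubles once every `doubling_months` months."""
    return 2.0 ** (12.0 * years / doubling_months)


def amdahl_speedup(enhanced_fraction: float, enhancement: float) -> float:
    """Standard form of Amdahl's law: overall speedup when a fraction
    of the execution time is sped up by a factor `enhancement`."""
    if not 0.0 <= enhanced_fraction <= 1.0:
        raise ValueError("enhanced_fraction must be in [0, 1]")
    if enhancement <= 0.0:
        raise ValueError("enhancement must be positive")
    return 1.0 / ((1.0 - enhanced_fraction) + enhanced_fraction / enhancement)


if __name__ == "__main__":
    # Ten years of growth at the two ends of the quoted 18-24 month range.
    for months in (18.0, 24.0):
        print(f"doubling every {months:.0f} months: x{growth_factor(10, months):.1f}")
    # 90% of the work sped up 10x still leaves the remaining 10% as the bottleneck.
    print(f"Amdahl speedup: {amdahl_speedup(0.9, 10.0):.2f}")
```

Under these assumptions, the gap between an 18-month and a 24-month doubling period compounds to roughly a factor of three over a decade, which is why the exact period matters when extrapolating.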
This chapter is dedicated to the correct and reliable communication of values in shared-memory multiprocessors. Correctness properties of the memory system of shared-memory multiprocessors include coherence, the memory consistency model, and the reliable execution of synchronization primitives. Since CMPs are designed as shared-memory multi-core systems, this chapter targets correctness issues not only in symmetric multiprocessors (SMPs) or large-scale cache coherent distributed shared-memory systems, but also in CMPs with core multi-threading. The chapter reviews the hardware components of a shared-memory architecture and why memory correctness properties are so hard to enforce in modern shared-memory multiprocessor systems. We then treat various levels of coherence and the difference between plain memory coherence and store atomicity. We introduce memory models and sequential consistency, the most fundamental memory model, and show how sequential consistency is enforced by store synchronization. Finally, we review thread synchronization, ISA-level synchronization primitives, and two classes of relaxed memory models: those based on hardware efficiency and those relying on synchronization.
The chapter also covers compiler-centric approaches to build computers known as VLIW computers. Apart from reviewing the design principles of VLIW pipelines, we also review compiler techniques to uncover instruction-level parallelism, including loop unrolling, software pipelining, and trace scheduling. Finally, this chapter covers vector machines.
The instruction set is the interface between the hardware and the software and must be followed meticulously when designing a computer. This chapter starts by introducing the instruction set of a computer. A basic instruction set is used throughout the book. This instruction set is broadly inspired by the MIPS instruction set, a rather simple instruction set which is representative of many instruction sets such as ARM and RISC-V. We then review how one can support a representative instruction set with the concept of static pipelining. We start by reviewing a simple 5-stage pipeline and all issues involved in avoiding hazards. This simple pipeline is gradually augmented to allow for higher instruction execution rates, including out-of-order instruction completion, superpipelining, and superscalar designs.
Given the widening gaps between processor speed, main memory (DRAM) speed, and secondary memory (disk) speed, it has become more and more difficult in recent years to feed data and instructions at the speed required by the processor while providing the ever-expanding memory space expected by modern applications.
In prior chapters we discussed how Dennard’s scaling combined with Moore’s law has resulted in a continuous increase in single-threaded performance, through innovations to exploit instruction-level parallelism (ILP). Designs such as out-of-order (OoO) execution and speculation have been used to exploit the scaling properties of transistors. Recently, Dennard’s voltage scaling has hit its limits, with the supply voltage reduction coming to a near halt. Thus, power density grows as more transistors are integrated into a unit area. In fact, Moore’s law scaling seems to keep its momentum, leading to billions of transistors being integrated into chips. Overall, it is fair to say that the density of transistors has been scaling faster than power density. Recognizing this concern, the chip industry has shifted (at least partially) emphasis toward multi- and even many-core chip multiprocessors (CMPs). While scaling frequency has a cubic relationship to power consumption, scaling the cores has a linear relationship to the power. Graphics processing units (GPUs) have emerged as a promising many-core architecture for power-efficient throughput computing. With thousands of simple in-order cores that can run thousands of threads in parallel, GPUs derive several tera-flops of peak performance, primarily through thread-level parallelism (TLP).
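The cubic-versus-linear power relationship stated above can be illustrated with a small model. This is a sketch under explicit assumptions, not the book's own derivation: it uses the common dynamic power approximation P = C · V² · f, assumes the supply voltage must rise in proportion to frequency (which gives P ∝ f³), and assumes each added core contributes the same power as the first. The function names are hypothetical.

```python
def power_after_frequency_scaling(base_power: float, freq_ratio: float) -> float:
    """Dynamic power after scaling clock frequency by `freq_ratio`.

    Assumes P = C * V^2 * f with voltage scaled proportionally to
    frequency, so power grows with the cube of the frequency ratio.
    """
    return base_power * freq_ratio ** 3


def power_after_core_scaling(base_power: float, core_ratio: float) -> float:
    """Power after scaling the number of identical cores by `core_ratio`.

    Assumes every core draws the same power at a fixed frequency and
    voltage, so power grows linearly with the core count.
    """
    return base_power * core_ratio


if __name__ == "__main__":
    base = 10.0  # watts for one core at the baseline frequency (illustrative value)
    # Two routes to (ideally) twice the throughput:
    print("2x frequency:", power_after_frequency_scaling(base, 2.0), "W")
    print("2x cores:    ", power_after_core_scaling(base, 2.0), "W")
```

In this idealized model, doubling the frequency costs eight times the power while doubling the core count costs only twice the power, provided the workload has enough thread-level parallelism to keep the extra cores busy.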