Teaching fundamental design concepts and the challenges of emerging technology, this textbook prepares students for a career designing the computer systems of the future. Self-contained yet concise, the material can be taught in a single semester, making it perfect for use in senior undergraduate and graduate computer architecture courses. This edition has a more streamlined structure, with the reliability and other technology background sections now included in the appendix. New material includes a chapter on GPUs, providing a comprehensive overview of their microarchitectures; sections focusing on new memory technologies and memory interfaces, which are key to unlocking the potential of parallel computing systems; and deeper coverage of memory hierarchies, including DRAM architectures, compression in memory hierarchies, and up-to-date coverage of prefetching. Practical examples demonstrate concrete applications of definitions, while the simple models and codes used throughout ensure the material is accessible to a broad range of computer engineering/science students.
This chapter reviews techniques to address the processor-memory speed gap. We start with the concepts behind modern memory hierarchies: the principle of locality of accesses, coherence in the memory hierarchy, and cache and memory inclusion. We then review the architecture of main memory systems, including the architecture of DRAM devices and DRAM systems. This is followed by the concepts behind cache hierarchies, including cache mapping and access, replacement and write policies, and the classification of cache misses. We cover techniques needed to cope with processors exploiting high degrees of instruction-level parallelism, including lockup-free caches, cache prefetching, and preloading. The chapter reviews data compression in the memory hierarchy, which allows for higher memory capacity and effective bandwidth. Finally, the chapter covers hardware support for virtual memory, page tables and translation lookaside buffers, and virtual address caches.
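As a concrete illustration of cache mapping, the sketch below shows how a direct-mapped cache splits an address into tag, index, and offset fields. The cache parameters (32 KiB capacity, 64-byte lines) are hypothetical and chosen only for the example; this is not code from the chapter.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical direct-mapped cache: 32 KiB capacity, 64-byte lines. */
    #define LINE_SIZE 64                          /* bytes per cache line */
    #define NUM_LINES (32 * 1024 / LINE_SIZE)     /* 512 lines            */

    int main(void) {
        uint64_t addr   = 0x7ffe12345678ULL;      /* example address       */
        uint64_t offset = addr % LINE_SIZE;               /* byte in line  */
        uint64_t index  = (addr / LINE_SIZE) % NUM_LINES; /* selects line  */
        uint64_t tag    = addr / LINE_SIZE / NUM_LINES;   /* disambiguates */
        printf("tag=%llx index=%llu offset=%llu\n",
               (unsigned long long)tag, (unsigned long long)index,
               (unsigned long long)offset);
        return 0;
    }

On a lookup, the index selects a line and the stored tag is compared with the address tag; a mismatch is a cache miss.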
This chapter is devoted to the design principles of multiprocessor systems, focusing on two architectural styles: shared-memory and message-passing. Both styles use multiple processors to achieve a linear speedup of computational power with the number of processors, but differ in the method of data exchange. Processors in shared-memory multiprocessors share the same address space and can exchange data through shared-memory locations using regular load and store instructions. This chapter reviews the programming model abstractions for shared-memory and message-passing multiprocessors, then the semantics of message-passing primitives, the protocols needed, and the architectural support to accelerate message processing. It covers support for the shared-memory model abstraction by reviewing the concept of cache coherence, the design space of snoopy-cache coherence protocols, the classification of communication events, and translation-lookaside buffer consistency strategies. Scalable models of shared memory are also treated, with an emphasis on cache coherence solutions that can be applied at large scale, as well as software techniques that deal with page mappings to exploit locality.
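To make the contrast concrete, here is a minimal shared-memory exchange in C11 with pthreads (an illustrative sketch, not code from the chapter). The producer communicates a value to the consumer with plain stores and loads plus a synchronization flag; in a message-passing system, the same exchange would instead use explicit send/receive primitives such as MPI_Send and MPI_Recv.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static int shared_data;          /* exchanged through a memory location */
    static atomic_int ready;         /* synchronization flag                */

    static void *producer(void *arg) {
        (void)arg;
        shared_data = 42;                          /* plain store */
        atomic_store_explicit(&ready, 1, memory_order_release);
        return NULL;
    }

    static void *consumer(void *arg) {
        (void)arg;
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                                      /* spin until flag is set */
        printf("received %d\n", shared_data);      /* plain load */
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }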
For the past 30 years we have lived through the information revolution, powered by the explosive growth of semiconductor integration and the internet. The exponential performance improvement of semiconductor devices was predicted by Moore’s law as early as the 1960s. Moore’s law predicts that the computing power of microprocessors will double every 18-24 months at constant cost, so that their cost-effectiveness (the ratio between performance and cost) grows at an exponential rate. It has been observed that the computing power of entire systems also grows at the same pace. The law has endured the test of time and remains valid today, but it will be tested repeatedly, both now and in the future, as many people see strong evidence that the "end of the ride" is near, mostly because the miniaturization of CMOS technology is rapidly reaching its limits. This chapter reviews the technology trends underpinning the evolution of computer systems. It also introduces metrics for the performance comparison of computer systems and fundamental laws that drive the field, such as Amdahl’s law.
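For reference, Amdahl's law bounds the overall speedup when only a fraction f of the execution time benefits from an enhancement that speeds that fraction up by a factor s:

    \[ \text{Speedup} = \frac{1}{(1 - f) + f/s} \]

For example, enhancing 90% of the execution time (f = 0.9) by a factor s = 10 yields an overall speedup of 1/(0.1 + 0.09) ≈ 5.3, far below 10, because the unenhanced fraction quickly dominates.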
This chapter is dedicated to the correct and reliable communication of values in shared-memory multiprocessors. Correctness properties of the memory system of shared-memory multiprocessors include coherence, the memory consistency model, and the reliable execution of synchronization primitives. Since CMPs are designed as shared-memory multi-core systems, this chapter targets correctness issues not only in symmetric multiprocessors (SMPs) and large-scale cache-coherent distributed shared-memory systems, but also in CMPs with core multi-threading. The chapter reviews the hardware components of a shared-memory architecture and why memory correctness properties are so hard to enforce in modern shared-memory multiprocessor systems. We then treat various levels of coherence and the difference between plain memory coherence and store atomicity. We introduce memory models, starting with sequential consistency, the most fundamental memory model, and show how sequential consistency can be enforced by store synchronization. Finally, we review thread synchronization and ISA-level synchronization primitives, relaxed memory models motivated by hardware efficiency, and relaxed memory models relying on synchronization.
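The classic two-thread example below illustrates what is at stake (a sketch, not code from the chapter). Under sequential consistency, the outcome r1 == 0 and r2 == 0 is impossible, because some store must precede both loads in any interleaving; under relaxed models that allow a store to be reordered past a later load, both threads can read 0 unless a fence is inserted between the store and the load.

    int x = 0, y = 0;            /* shared locations */

    void thread1(void) {         /* runs on core 1 */
        x = 1;                   /* store */
        int r1 = y;              /* load  */
        (void)r1;
    }

    void thread2(void) {         /* runs on core 2 */
        y = 1;                   /* store */
        int r2 = x;              /* load  */
        (void)r2;
    }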
This chapter also covers a compiler-centric approach to building computers, known as VLIW computers. Apart from reviewing the design principles of VLIW pipelines, we review compiler techniques to uncover instruction-level parallelism, including loop unrolling, software pipelining, and trace scheduling. Finally, the chapter covers vector machines.
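As a small illustration of loop unrolling (a sketch, not code from the chapter), the unrolled version of the array sum below keeps four independent accumulators, exposing more instruction-level parallelism for a VLIW scheduler, or any wide-issue machine, to pack.

    /* Rolled loop: one addition per iteration, one serial dependence chain. */
    float sum_rolled(const float *a, int n) {
        float s = 0.0f;
        for (int i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* Unrolled by 4: four independent dependence chains per iteration. */
    float sum_unrolled4(const float *a, int n) {
        float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
        int i;
        for (i = 0; i + 4 <= n; i += 4) {
            s0 += a[i];
            s1 += a[i + 1];
            s2 += a[i + 2];
            s3 += a[i + 3];
        }
        for (; i < n; i++)            /* cleanup for leftover elements */
            s0 += a[i];
        return (s0 + s1) + (s2 + s3);
    }

Note that reassociating the floating-point sum changes rounding slightly, which is why compilers perform this transformation automatically only under relaxed floating-point settings.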
The instruction set is the interface between the hardware and the software and must be followed meticulously when designing a computer. This chapter starts by introducing the instruction set of a computer. A basic instruction set is used throughout the book; it is broadly inspired by the MIPS instruction set, a rather simple ISA that is representative of many others, such as ARM and RISC-V. We then review how one can support a representative instruction set with the concept of static pipelining. We start by reviewing a simple five-stage pipeline and all the issues involved in avoiding hazards. This simple pipeline is gradually augmented to allow for higher instruction execution rates, including out-of-order instruction completion, superpipelining, and superscalar designs.
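One of the hazards such a pipeline must handle is the load-use (read-after-write) hazard, sketched below in C with MIPS-like instructions in comments (an illustration under assumed five-stage timing, not an example from the chapter).

    /* Load-use hazard: in a classic five-stage pipeline the load's result
       is available only after the MEM stage, so the dependent add cannot
       read it in its EX stage without a one-cycle stall, even with
       forwarding. */
    int load_use(const int *p, int b) {
        int a = *p;       /* lw  a, 0(p)   -- value ready after MEM     */
        return a + b;     /* add r, a, b   -- needs a in EX: one bubble */
    }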
Given the widening gaps between processor speed, main memory (DRAM) speed, and secondary memory (disk) speed, it has become more and more difficult in recent years to feed data and instructions at the speed required by the processor while providing the ever-expanding memory space expected by modern applications.
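A standard way to quantify the cost of this gap is the average memory access time seen by the processor:

    \[ \text{AMAT} = t_{\text{hit}} + \text{miss rate} \times \text{miss penalty} \]

With illustrative numbers (a 1-cycle hit time, a 2% miss rate, and a 200-cycle miss penalty), AMAT = 1 + 0.02 × 200 = 5 cycles: even with 98% of accesses hitting, the average access is five times slower than a hit.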
In prior chapters we discussed how Dennard scaling combined with Moore’s law has resulted in a continuous increase in single-threaded performance through innovations that exploit instruction-level parallelism (ILP). Designs such as out-of-order (OoO) execution and speculation have been used to exploit the scaling properties of transistors. Recently, however, Dennard’s voltage scaling has hit its limits, with supply voltage reduction coming to a near halt. Thus, power density grows as more transistors are integrated into a unit area. Meanwhile, Moore’s law seems to keep its momentum, leading to billions of transistors being integrated into chips. Overall, it is fair to say that the density of transistors has been scaling faster than the power density. Recognizing this concern, the chip industry has shifted emphasis (at least partially) toward multi- and even many-core chip multiprocessors (CMPs). While scaling frequency has a cubic relationship to power consumption, scaling the number of cores has a linear relationship to power. Graphics processing units (GPUs) have emerged as a promising many-core architecture for power-efficient throughput computing. With thousands of simple in-order cores that can run thousands of threads in parallel, GPUs deliver several teraflops of peak performance, primarily through thread-level parallelism (TLP).
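The frequency and core-count relationships mentioned above follow from the standard dynamic power model (a simplified model that ignores leakage):

    \[ P_{\text{dynamic}} = \alpha C V^2 f \]

where α is the activity factor, C the switched capacitance, V the supply voltage, and f the clock frequency. Since higher frequency requires a roughly proportional supply voltage, V ∝ f gives P ∝ f³, the cubic cost of frequency scaling. Replicating cores at a fixed frequency and voltage instead multiplies αC, so power grows only linearly with core count.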