Today’s information technology is becoming ever more complex, distributed and pervasive. Problematizing what we observe as Information Systems (IS) researchers is therefore becoming ever more difficult. This chapter offers a new perspective for qualitative empirical research in the IS field. It asks how we can study dynamically changing, evolving, spatially and temporally distributed phenomena that evade our accustomed concepts and assumptions about the locus of agency. Put differently: how can we formally approach phenomena that evade our concept of ‘identity’?
Using the mathematical-logical framework of the Laws of Form, formulated in 1969 by George Spencer-Brown, the chapter introduces the notion of distinction to capture the manifestation of concepts. It provides a short overview of the framework and illustrates how it can be applied to sample concepts drawn from IS sociomateriality research.
The chapter advances qualitative methodology by suggesting a formal notation for communication analysis that reflects technologies’ complex nature. Applying the framework not only alters the epistemological boundaries for how to experience and study the ‘digital’, but also helps to build a bridge between deep technological insights, our immediate, unbiased and mundane experience of technologies, and how we speak about them.
This chapter is devoted to studying, in more depth, the set of integers Z, its structure, and properties. The integers play a fundamental role in many areas of mathematics, science, and beyond. The integers are closely related to the set of natural numbers and thus are often used in problems involving counting, sequences, and structures with finitely many elements (such as finite fields).
This chapter tackles the simple problem of intersecting two (sorted) lists of increasing integers, which constitutes the backbone of every query resolver in databases and (Web) search engines. In dealing with this problem, the chapter describes several approaches of increasing sophistication and elegance, which eventually turn out to be efficient/optimal in terms of time and I/O complexities. A final solution will deploy a proper compression of the input integers and a two-level scheme aimed at reducing the final space occupancy and working efficiently over hierarchical memories.
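As a concrete illustration of the baseline approach, here is a minimal sketch (in Python, with illustrative names not taken from the chapter) of the merge-style scan that intersects two sorted integer lists in linear time; the chapter's more sophisticated solutions replace this scan with search-based and compressed schemes.

```python
# Minimal sketch of the baseline merge-style intersection of two sorted
# integer lists; function and variable names are illustrative only.
def intersect_sorted(a, b):
    """Return the sorted intersection of two increasing integer lists."""
    i, j, out = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

print(intersect_sorted([1, 3, 5, 7, 9], [3, 4, 5, 9, 12]))  # [3, 5, 9]
```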
This chapter describes a data compression technique devised by Mike Burrows and David Wheeler in 1994 at DEC Systems Research Center. This technique is known as the Burrows–Wheeler Transform (or BWT) and offers a revolutionary alternative to dictionary-based and statistical compressors. It is the algorithmic core of a new class of data compressors (such as bzip2), as well as of new powerful compressed indexes (such as the FM-index). The chapter describes the algorithmic details of the BWT and of two other simple compressors, Move-to-Front and Run-Length Encoding, whose combination constitutes the design core of bzip-based compressors. This description is accompanied by the theoretical analysis of the impact of BWT on data compression, in terms of the k-th order empirical entropy of the input data, and by a sketch of the main algorithmic issues that underlie the design of the first provably compressed suffix array to date, namely the FM-index. Given the technicalities involved in the description of the BWT and the FM-index, this chapter offers several running examples and illustrative figures which should ease their understanding.
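To fix ideas, the following is a toy sketch of the forward BWT followed by Move-to-Front, assuming a quadratic-time construction that sorts all rotations of the input and a ‘$’ end-of-string sentinel; real implementations derive the BWT from a suffix array, and the names and conventions here are illustrative, not the chapter's.

```python
# Toy sketch of the BWT pipeline: quadratic-time BWT via sorting all
# rotations, then Move-to-Front. Production compressors build the BWT
# from a suffix array instead; the '$' sentinel and function names are
# illustrative choices, not the chapter's notation.
def bwt(text):
    s = text + "$"                       # sentinel assumed smaller than any symbol
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

def move_to_front(s, alphabet):
    table = list(alphabet)
    out = []
    for c in s:
        r = table.index(c)
        out.append(r)
        table.pop(r)
        table.insert(0, c)
    return out

transformed = bwt("abracadabra")
print(transformed)                       # equal symbols tend to cluster together
print(move_to_front(transformed, sorted(set(transformed))))  # many small ranks
```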
This chapter deals with a classic topic in data compression and information theory, namely the design of compressors based on the statistics of the symbols present in the text to be compressed. This topic is addressed by means of an algorithmic approach that gives much attention to the time efficiency and algorithmic properties of the discussed statistical coders, while also evaluating their space performance in terms of the empirical entropy of the input text. The chapter deals in detail with the classic Huffman coding and arithmetic coding, and also discusses their engineered versions, known as canonical Huffman coding and range coding. Its final part is dedicated to describing and commenting on the prediction by partial matching (PPM) coder, whose algorithmic structure is at the core of some of the best statistical coders to date.
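For illustration, a minimal sketch of classic Huffman code construction from symbol frequencies follows, built on a binary heap; the function name and the tie-breaking counter are assumptions made for the example, not the chapter's notation.

```python
import heapq
from collections import Counter

# Minimal sketch of classic Huffman code construction driven by symbol
# frequencies. The tie-breaking counter keeps heap entries comparable;
# names are illustrative, not the chapter's.
def huffman_codes(text):
    freq = Counter(text)
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                      # degenerate single-symbol input
        _, _, codes = heap[0]
        return {sym: "0" for sym in codes}
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)     # two least-frequent groups
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

codes = huffman_codes("abracadabra")
print(codes)   # more frequent symbols (e.g. 'a') receive shorter codewords
```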
This chapter revisits the classic sorting problem within the context of big inputs, where "Atomic" in the title refers to the fact that items occupy few memory words and are managed in their entirety by executing only comparisons. It discusses two classic sorting paradigms: the merge-based paradigm, which underlies the design of MergeSort, and the distribution-based paradigm, which underlies the design of QuickSort. It shows how to adapt them to work in a hierarchical memory setting, analyzes their I/O complexity, and finally proposes some useful algorithmic tools that allow us to speed up their execution in practice, such as the Snow-Plow technique and data compression. It also proves that these adaptations are I/O optimal in the two-level memory model by providing a sophisticated, yet very informative, lower bound. These results allow us to relate the sorting problem to the so-called permuting problem, typically neglected when dealing with sorting in the RAM model, and then to argue an interesting I/O-complexity equivalence between the two problems, which provides a mathematical ground for the ubiquitous use of sorters when designing I/O-efficient solutions for big data problems.
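As a small in-memory stand-in for the merge-based paradigm, the following sketch cuts the input into runs that "fit in memory", sorts each run, and then k-way merges the runs with a heap; the run size and names are illustrative, and a real external-memory MergeSort would stream the runs to and from disk.

```python
import heapq

# Small in-memory sketch of the merge-based paradigm: build sorted runs,
# then k-way merge them with a heap. Run size and names are illustrative;
# an external MergeSort would write runs to disk and merge them I/O-efficiently.
def merge_sort_runs(items, run_size=4):
    runs = [sorted(items[i:i + run_size]) for i in range(0, len(items), run_size)]
    return list(heapq.merge(*runs))

data = [9, 1, 7, 3, 8, 2, 6, 5, 4, 0]
print(merge_sort_runs(data))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```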