The internet, social media, smartphones and encryption have radically changed the way goods are bought, sold and advertised. In doing so, they have opened up new markets and increased economic opportunities, enabling advertising to reach a global and more targeted audience. Individuals can start an online business or take on a second job more easily than ever before. Technology has disrupted many areas of the economy. Digital currencies, such as Bitcoin, have facilitated encrypted online transactions without the need for processing by a third party such as a bank. Blockchain technology has enabled contracts to be executed online instead of requiring a hard copy to be signed by the parties as a means of verifying an agreement. Online businesses have emerged to disrupt traditional models in a range of sectors, including transportation, where ride-sharing apps have challenged taxis; accommodation, in the form of apps used to book private homes or rooms for short stays; and online clothing retailing, where sellers with minimal rent and staffing costs can offer substantially reduced prices. Developments such as these raise a number of legal and regulatory issues; for example, digital currencies may increase opportunities for fraud and create challenges for law enforcement agencies investigating online distributors of illegal drugs. In considering the issues associated with law, technology and commerce, this chapter begins with a discussion of digital currencies, and then proceeds to examine online markets and services, electronic contracting, and the changes new technology has brought to professional services and other businesses. Emerging issues such as anti-competitive practices are also discussed.
Intellectual property involves the legal protection of inventions and other creative products. Its main categories are patents, copyright and trade marks, with related forms of protection also covering designs, circuit layouts, plant breeders’ rights, domain names and trade secrets. Some intellectual property rights attach automatically to a novel invention or creation, while others require registration in a publicly administered system, depending on the jurisdiction. Protection is typically limited to a specified time, with extension possible in some systems. By way of a limited monopoly, creators are offered an incentive to make potentially beneficial advances available to the public, rather than keeping them secret or confining them to private use.
The rapid growth of information and communications technology over the past two decades, including email, the internet, smartphones, social media, messaging applications and global positioning systems, has enhanced our ability to obtain and share information about the world. However, in the rush to secure the latest mass-produced device, many people give little consideration to the security of the data they produce and the consequent impact on individual privacy. That data is of great value to the corporate sector, where it informs marketing strategies, and to governments seeking to understand the behaviour of their citizens. As was noted in Chapter 1, developments in the past decade, such as the Snowden revelations and the activities of the former political strategy firm Cambridge Analytica, have increased awareness of the implications of inadequate privacy protections. However, for many, the convenience of new technologies may still take precedence.
We study the compression of sources at nodes in a network for function computation at receiver(s). The rate region of this problem has previously been characterized only under restrictive assumptions. We present results that significantly relax these assumptions. For a one-stage tree network, we characterize a rate region by a necessary and sufficient condition for any achievable coloring-based coding scheme, the coloring connectivity condition. We propose a modularized coding scheme based on graph colorings that performs arbitrarily closely to derived rate lower bounds. For a general tree network, we provide a rate lower bound based on graph entropies and show that it is tight for independent sources. We show that, in a general tree network with independent sources, achieving the rate lower bound requires intermediate nodes to perform computations; however, for a family of functions and random variables, which we call chain-rule proper sets, it suffices to have no computation at intermediate nodes to perform arbitrarily closely to the rate lower bound. We consider practicalities of coloring-based coding schemes and propose an efficient algorithm to compute a minimum-entropy coloring of a characteristic graph.
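Minimum-entropy coloring is NP-hard in general, so practical schemes rely on heuristics. As an illustration only (the greedy rule and all names below are ours, not the chapter’s algorithm), the following Python sketch colors a characteristic graph greedily: adjacent vertices, which represent source symbols the receiver must distinguish, receive different colors, and each vertex joins the heaviest compatible color class so that the entropy of the induced color distribution stays low.

    import math

    def greedy_min_entropy_coloring(vertices, edges, prob):
        """Greedy heuristic for a low-entropy coloring of a characteristic graph.

        vertices: iterable of hashable source symbols
        edges:    pairs (u, v) of symbols that must receive distinct colors
        prob:     dict mapping each symbol to its probability
        Returns a dict symbol -> color index.
        """
        adj = {v: set() for v in vertices}
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)

        color_of = {}
        classes = []  # each entry: [set of members, total probability mass]
        # Visit heavy symbols first; packing mass into few classes lowers entropy.
        for v in sorted(adj, key=lambda s: -prob[s]):
            # Among classes containing no neighbor of v, pick the heaviest.
            feasible = [c for c in classes if adj[v].isdisjoint(c[0])]
            if feasible:
                target = max(feasible, key=lambda c: c[1])
                target[0].add(v)
                target[1] += prob[v]
                color_of[v] = classes.index(target)
            else:
                classes.append([{v}, prob[v]])
                color_of[v] = len(classes) - 1
        return color_of

    def coloring_entropy(color_of, prob):
        """Entropy (in bits) of the color distribution induced by the coloring."""
        mass = {}
        for v, c in color_of.items():
            mass[c] = mass.get(c, 0.0) + prob[v]
        return -sum(p * math.log2(p) for p in mass.values() if p > 0)

On a toy instance, greedy_min_entropy_coloring(['a', 'b', 'c'], [('a', 'b')], {'a': 0.5, 'b': 0.3, 'c': 0.2}) merges the non-adjacent symbols 'a' and 'c' into one color class, giving a color entropy of about 0.88 bits instead of the log2(3) ≈ 1.58 bits of a trivial distinct coloring.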
Clustering is a general term for techniques that, given a set of objects, aim to group together those that are closer to one another than to the rest, according to a chosen notion of closeness. It is an unsupervised-learning problem, since the objects are not externally labeled by category. Much effort has been expended on finding natural mathematical definitions of closeness and then developing and evaluating algorithms in these terms. Many have argued that there is no domain-independent mathematical notion of similarity, and that similarity is instead context-dependent; categories are perhaps natural in the sense that people can evaluate them when they see them. Some have dismissed unsupervised learning in favor of supervised learning, arguing that the former is not a powerful natural phenomenon. Yet most learning is unsupervised: we largely learn how to think through categories by observing the world in its unlabeled state. Drawing on universal information theory, we ask whether there are universal approaches to unsupervised clustering. In particular, we consider instances wherein the ground-truth clusters are defined by the unknown statistics governing the data to be clustered.
Information theory plays an indispensable role in the development of algorithm-independent impossibility results, both for communication problems and for seemingly distinct areas such as statistics and machine learning. While numerous information-theoretic tools have been proposed for this purpose, the oldest one remains arguably the most versatile and widespread: Fano’s inequality. In this chapter, we provide a survey of Fano’s inequality and its variants in the context of statistical estimation, adopting a versatile framework that covers a wide range of specific problems. We present a variety of key tools and techniques used for establishing impossibility results via this approach, and provide representative examples covering group testing, graphical model selection, sparse linear regression, density estimation, and convex optimization.
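For orientation, the workhorse form of Fano’s inequality used in such impossibility arguments is the following standard statement (not specific to this chapter). If $V$ is uniform on a finite set of size $M$, and an estimator $\hat{V}$ is formed from an observation $Y$ so that $V \to Y \to \hat{V}$ is a Markov chain, then

$$ \mathbb{P}(\hat{V} \neq V) \;\ge\; 1 - \frac{I(V;Y) + \log 2}{\log M}. $$

Minimax lower bounds for estimation then follow by reduction to testing: one packs $M$ well-separated instances into the problem class, so that any accurate estimator would identify $V$, and bounds $I(V;Y)$ from above.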
This chapter introduces basic ideas of information-theoretic models for distributed statistical inference problems with compressed data, and discusses current and future research directions and challenges in applying these models to various statistical learning problems. In these applications, data are distributed across multiple terminals, which can communicate with one another via limited-capacity channels. Instead of first recovering the data at a centralized location and then performing inference, this chapter describes schemes that perform statistical inference without recovering the underlying data. Information-theoretic tools are used to characterize the fundamental limits of classical statistical inference problems operating on compressed data directly. Distributed statistical learning problems are introduced first; models and results on distributed inference are then discussed; finally, new directions that generalize and improve on the basic scenarios are described.
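A classical benchmark for this setting, stated here for orientation (this is the standard Ahlswede–Csiszár result on testing against independence, not necessarily the chapter’s formulation): if one terminal observes $X^n$ and compresses it at rate $R$, and a decision center with side information $Y^n$ must test $H_0\colon P_{XY}$ against $H_1\colon P_X P_Y$, then the optimal type-II error exponent under a fixed type-I error constraint is

$$ \theta(R) \;=\; \max_{P_{U|X}\,:\;\, I(U;X) \le R} I(U;Y), $$

so the communication constraint enters exactly as a rate-limited auxiliary description $U$ of the data, rather than as a requirement to reconstruct $X^n$ itself.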
The growth of cybercrime, social media misuse and online intellectual property infringement discussed in the preceding three chapters presents new and significant challenges for regulating these aspects of technology law. Conducting investigations, determining jurisdictional scope and prosecuting offences all require a degree of adaptation, as compared to traditional areas of the law. The contrast between investigating a robbery from a bricks-and-mortar store on the one hand and the hacking of an e-commerce company’s trade secrets on the other is vast.
We began with the observation that technology law, as defined by this text, is now an important field in its own right. Its importance will continue to grow as the progress of technology and its application in society continue to create gaps in the legal framework that require regulation. No-one can predict exactly what new technologies are coming, what their implications will be, or what laws will be needed; but by studying theoretical approaches to ethics and regulation, the legal problems that have arisen to date, and the responses to them, we are better prepared to deal with future challenges. The COVID-19 pandemic of 2020 provides a stark reminder that unanticipated events can change the societal landscape in a matter of weeks and provide compelling reasons for technologies to be quickly applied in new ways. This concluding chapter considers the future directions of technology law by reflecting on technology and society, noting the areas of law and regulation that have been covered, and drawing together the themes that have arisen.
Machine-learning algorithms can be viewed as stochastic transformations that map training data to hypotheses. Following Bousquet and Elisseeff, we say such an algorithm is stable if its output does not depend too much on any individual training example. Since stability is closely connected to generalization capabilities of learning algorithms, it is of interest to obtain sharp quantitative estimates on the generalization bias of machine-learning algorithms in terms of their stability properties. We describe several information-theoretic measures of algorithmic stability and illustrate their use for upper-bounding the generalization bias of learning algorithms. Specifically, we relate the expected generalization error of a learning algorithm to several information-theoretic quantities that capture the statistical dependence between the training data and the hypothesis. These include mutual information and erasure mutual information, and their counterparts induced by the total variation distance. We illustrate the general theory through examples, including the Gibbs algorithm and differentially private algorithms, and discuss strategies for controlling the generalization error.
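A representative bound of this kind, stated here for concreteness (it is the well-known Xu–Raginsky mutual-information bound, typical of the results described): if the loss $\ell(w, Z)$ is $\sigma$-sub-Gaussian under the data distribution for every hypothesis $w$, and the algorithm maps the $n$-sample training set $S$ to the hypothesis $W$, then

$$ \bigl|\,\mathbb{E}[\mathrm{gen}(S, W)]\,\bigr| \;\le\; \sqrt{\frac{2\sigma^2\, I(S;W)}{n}}. $$

An algorithm that leaks little information about its training data (small $I(S;W)$, as with differentially private or suitably randomized procedures such as the Gibbs algorithm) therefore cannot overfit by much in expectation.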
A grand challenge in representation learning is the development of computational algorithms that learn the explanatory factors of variation behind high-dimensional data. Representation models (encoders) are often optimized for performance on training data, when the real objective is to generalize well to other (unseen) data. This chapter provides an overview of fundamental concepts in statistical learning theory and the information-bottleneck principle. This serves as a mathematical basis for the technical results, in which an upper bound on the generalization gap corresponding to the cross-entropy risk is given; this bound acts as a penalty term. When the penalty term, scaled by a suitable multiplier, is minimized jointly with the cross-entropy empirical risk, the problem is equivalent to optimizing the information-bottleneck objective with respect to the empirical data distribution. This result provides an interesting connection between mutual information and generalization, and helps to explain why noise injection during the training phase can improve the generalization ability of encoder models and enforce invariances in the resulting representations.
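The information-bottleneck objective referenced above is conventionally written as follows (the standard formulation, not specific to this chapter): given data $X$ with label $Y$, one seeks a stochastic encoder $P_{Z|X}$ solving

$$ \min_{P_{Z|X}} \; I(X;Z) \;-\; \beta\, I(Z;Y), $$

where $\beta > 0$ trades off compressing the input (small $I(X;Z)$, which also limits how much the representation can memorize about the training sample) against preserving label-relevant information $I(Z;Y)$.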
In recent decades, the regulation of technology and associated information has become an important and topical area of law, relevant to almost all aspects of society. Issues in technology law typically extend beyond specific jurisdictions and have state, national and international implications. Developments in one jurisdiction rapidly acquire international ramifications, owing to the connectedness facilitated by the internet and modern communications technology. The areas of the law that are evolving due to technological developments are diverse: a preliminary list would include finance law, criminal law, medical law, media law and privacy law. New technology creates challenges because, when it becomes available, regulatory gaps open up. For example, the emergence of cryptocurrencies such as Bitcoin has required public agencies to issue guidelines as to whether they constitute forms of currency, and whether dealings using them are subject to taxation laws.1 Another example is the legislation enacted in the early 2000s to regulate the use of DNA evidence in criminal investigations. When these laws were enacted, the use of commercial ancestry databases and other modern techniques of genetic analysis to identify suspects, as has occurred in some high-profile contemporary cases, was not envisaged.2