Making sense of paradata as information on practices and processes is a matter of both theory and practice. This chapter introduces a comprehensive theoretical reference model for paradata and discusses its practical implications. Paradata is approached as a category of things that can be appropriated as being informative about processes and practices. Working knowledge of practices and processes, and the practices and processes themselves, can create paradata through both embodiment and acts of inscription. Paradata turns back into working knowledge through appropriation; enactment turns it back into practices and processes. Paradata materialises as a processual, network-like meshwork in space-time. It is perpetually in the making and stabilised only momentarily, when it is taken into use.
Paradata is a concept that is very much in the making. Its significance is not given: it can matter in different ways depending on context and on how the notion itself is operationalised in use. Paradata complements earlier metainformation concepts for knowledge organisation in how it can help to systematise, and make visible, the complexity of data, practices and processes. As a mindset, paradata underlines the importance of being involved in both the theory and practice of how data is constantly being made and remade. There are, however, practical and ethical limits to what paradata can do, and to what it is desirable to do with it. Ultimately, mastering the use of paradata and making it matter is also a question of literacy, tightly interwoven in the intricate meshwork of the social reality of the domains where it is put to work.
While generative AI enables the creation of diverse content, including images, videos, text, and music, it also raises significant ethical and societal concerns, such as bias, transparency, accountability, and privacy. Therefore, it is crucial to ensure that AI systems are both trustworthy and fair, optimising their benefits while minimising potential harm. To explore the importance of fostering trustworthiness in the development of generative AI, this chapter delves into the ethical implications of AI-generated content, the challenges posed by bias and discrimination, and the importance of transparency and accountability in AI development. It proposes six guiding principles for creating ethical, safe, and trustworthy AI systems. Furthermore, legal perspectives are examined to highlight how regulations can shape responsible generative AI development. Ultimately, the chapter underscores the need for responsible innovation that balances technological advancement with societal values, preparing us to navigate future challenges in the evolving AI landscape.
The purpose of this chapter is to show how and where paradata emerges ‘in the wild’ of the many varieties of research documentation produced during scholarly work, and to demonstrate what this paradata might look like. The examination of paradata in research documentation is approached using perspectives of data ‘as practice’ and data ‘as thing’, emphasising simultaneously that paradata is malleable and will manifest differently across contexts of data production and use, but also that paradata is a tangible data phenomenon with identifiable characteristics. This chapter draws empirically from an interview study of archaeologists and archaeological research data professionals (N=31). Theoretical framing is provided by scholarship on data and documentation. The chapter reveals how paradata in research documentation emerges in different forms and with varying scope, comprehensiveness and degrees of formalisation. It also suggests that there are technical and epistemic usefulness thresholds relevant for identifying and using paradata. The technical usefulness threshold represents baseline possibilities of accessing and interacting with paradata in research documentation. The epistemic usefulness threshold underlines instead the degree of affinity between the intellectual horizons of paradata creation and paradata use, and several resources are identified that can help to strengthen this affinity.
Generative AI promises to have a significant impact on intellectual property law and practice in the United States. Already several disputes have arisen that are likely to break new ground in determining what IP protects and what actions infringe. Generative AI is also likely to have a significant impact on the practice of searching for prior art, creating new materials, and policing rights. This chapter surveys the emerging law of generative AI and IP in the United States, sticking as close as possible to near-term developments and controversies. All of the major IP areas are covered, at least briefly, including copyrights, patents, trademarks, trade secrets, and rights of publicity. For each of these areas, the chapter evaluates the protectability of AI-generated materials under current law, the potential liability of AI providers for their use of existing materials, and likely changes to the practice of creation and enforcement.
It is well-known that, to be properly valued, high-quality products must be distinguishable from poor-quality ones. When they are not, indistinguishability creates an asymmetry in information that, in turn, leads to a lemons problem, defined as the market erosion of high-quality products. Although the valuation of generative artificial intelligence (GenAI) systems’ outputs is still largely unknown, preliminary studies show that, all other things being equal, human-made works are evaluated at significantly higher values than machine-enabled ones. Given that these works are often indistinguishable, all the conditions for a lemons problem are present. Against that background, this chapter proposes a Darwinian reading to highlight how GenAI could potentially lead to “unnatural selection” in the art market—specifically, a competition between human-made and machine-enabled artworks that is not based on the merits but distorted by asymmetrical information. This chapter proposes solutions ranging from top-down rules of origin to bottom-up signalling. It is argued that both approaches can be employed in copyright law to identify where the human author has exercised the free and creative choices required to meet the criterion of originality, and thus copyrightability.
This chapter focuses on how Chinese and Japanese copyright law balance content owners’ desire for copyright protection with the national policy goal of enabling and promoting technological advancement, in particular in the area of AI-related progress. In discussing this emerging area of law, we focus mainly on the two most fundamental questions that the widespread adoption of generative AI poses to copyright regulators: (1) does the use and refinement of training data violate copyright law, and (2) who owns the copyright in content produced by or with the help of AI?
This chapter explores the intricate relationship between consumer protection and GenAI. Prominent tools like Bing Chat, ChatGPT4.0, Google’s Gemini (formerly known as Bard), OpenAI’s DALL·E, and Snapchat’s AI chatbot are widely recognized, and they dominate the generative AI landscape. However, numerous smaller, unbranded GenAI tools are embedded within major platforms, often going unrecognized by consumers as AI-driven technology. In particular, the focus of this chapter is the phenomenon of algorithmic consumers, whose interactions with digital tools, including GenAI, have become increasingly dynamic, engaging, and personalized. Indeed, the rise of algorithmic consumers marks a pivotal shift in consumer behaviour, which is now characterized by heightened levels of interactivity and customization.
This chapter introduces a selection of methods applicable for identifying and extracting paradata from existing datasets and data documentation, which can then be used to complement existing formal documentation of practices and processes. Data reuse, in its multiple forms, enables researchers to build upon the foundations laid by previous studies. Retrospective methods for eliciting paradata, including qualitative and quantitative backtracking and data forensics, provide means to gain insights into past research practices and processes for data-driven analysis. The methods discussed in this chapter enhance understanding of data-related practices and processes and support the reproducibility of findings by facilitating the replication and verification of results through data reuse. Key references and further reading are provided after each method description.
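As a loose illustration of the retrospective elicitation the abstract describes, the Python sketch below recovers minimal paradata (file size, last-modified timestamp) from filesystem metadata of an existing dataset. The function name and the returned fields are hypothetical illustrations, not drawn from the chapter itself.

```python
import datetime
import os
import tempfile

def file_paradata(path):
    """Recover minimal process information (paradata) from filesystem metadata."""
    st = os.stat(path)
    return {
        "path": os.path.basename(path),
        "size_bytes": st.st_size,
        # When the file was last touched hints at when processing occurred.
        "modified": datetime.datetime.fromtimestamp(
            st.st_mtime, datetime.timezone.utc
        ).isoformat(),
    }

# Demonstrate on a throwaway file standing in for a legacy dataset.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("id,value\n1,2\n")
    path = f.name

info = file_paradata(path)
os.unlink(path)
```

Filesystem timestamps are, of course, only one quantitative trace; the chapter's backtracking methods draw on many richer sources of documentation.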
Generative AI has catapulted into the legal debate through popular applications such as ChatGPT, Bard, and Dall-E. While the predominant focus has hitherto centred on issues of copyright infringement and regulatory strategies, particularly within the ambit of the AI Act, it is imperative to acknowledge that generative AI also engenders substantial tension with data protection laws. Generative AI puts a finger on the sore spot of the contentious relationship between data protection law and machine learning: an unresolved conflict between the protection of individuals, rooted in fundamental data protection rights, and the massive amounts of data required for machine learning, which renders data processing nearly universal. In the case of LLMs, which scrape nearly the whole internet, training inevitably relies on, and possibly even creates, personal data under the GDPR. This tension manifests across multiple dimensions, encompassing data subjects’ rights, the foundational principles of data protection, and the fundamental categories of data protection. Drawing on ongoing investigations by data protection authorities in Europe, this chapter undertakes a comprehensive analysis of the intricate interplay between generative AI and data protection within the European legal framework.
Research on paradata practices provides diverse insights for the management of paradata. This chapter draws on the existing body of research to inform paradata practices in repository settings, including research data archives, repositories and research information management contexts. Four categories of paradata needs (methods; scope; provenance; knowledge representation) are described, as well as two major categories of paradata relevant from a repository perspective: core paradata, i.e. information commonly perceived as being paradata, and potential paradata, i.e. information with the potential to function as paradata. Further, the chapter discusses three broad management approaches and a set of intermediary strategies: standardisation, embracing the messiness of paradata, and cultivating paradata literacy to manage different varieties of core and potential paradata.
Making sense of data, and making it useful and manageable, requires understanding not only of what the data is about but also of where it comes from and how it has been processed and used. An emerging interdisciplinary corpus of literature terms information about the practices and processes of data making, management and use as paradata. This introductory chapter to a first comprehensive overview of the concept and phenomenon of paradata from data management and knowledge organisation perspectives contextualises the notion and provides an overview of the volume, its aims and its starting points.
This chapter provides an outline analysis of the evolving governance framework for Artificial Intelligence (AI) in the island city-state of Singapore. In broad terms, Singapore’s signature approach to AI governance reflects its governance culture more broadly, which harnesses the productive energy of free-market capitalism contained within clear guardrails, as well as the dual nature (as a regulator and development authority) of Singapore’s lead public agency in AI policy formulation. Singapore’s approach is of interest to other jurisdictions in the region and around the world, and can already be observed to have influenced the Association of South East Asian Nations (ASEAN) Guide on AI Governance and Ethics, promulgated in early 2024.
This chapter explores the privacy challenges posed by generative AI and argues for a fundamental rethinking of privacy governance frameworks in response. It examines the technical characteristics and capabilities of generative AIs that amplify existing privacy risks and introduce new challenges, including nonconsensual data extraction, data leakage and re-identification, inferential profiling, synthetic media generation, and algorithmic bias. It surveys the current landscape of U.S. privacy law and its shortcomings in addressing these emergent issues, highlighting the limitations of a patchwork approach to privacy regulation, the overreliance on notice and choice, the barriers to transparency and accountability, and the inadequacy of individual rights and recourse. The chapter outlines critical elements of a new paradigm for generative AI privacy governance that recognizes collective and systemic privacy harms, institutes proactive measures, and imposes precautionary safeguards, emphasizing the need to recognize privacy as a public good and collective responsibility. The analysis concludes by discussing the political, legal, and cultural obstacles to regulatory reform in the United States, most notably the polarization that prevents the enactment of comprehensive federal privacy legislation, the strong commitment to free speech under the First Amendment, and the “permissionless” innovation approach that has historically characterized U.S. technology policy.
This chapter introduces methods for generating and documenting paradata before and during data creation practices and processes (i.e. prospective and in-situ approaches, respectively). It introduces formal metadata-based paradata documentation using standards and controlled vocabularies to contribute to paradata consistency and interoperability. Narrative descriptions and recordings are advantageous for providing contextual richness and detailed documentation of data generation processes. Logging methods, including log files and blockchain technology, allow for automatic paradata generation and for maintaining the integrity of the record. Data management plans and registered reports are examples of measures to prospectively generate potential paradata on forthcoming activities. Finally, facilitative workflow-based approaches are introduced for step-by-step modelling of practices and processes. Rather than suggesting that a single approach to generating and documenting paradata will suffice, we encourage users to consider a selective combination of approaches, facilitated by adequate institutional resources and technical and subject expertise, to enhance the understanding, transparency, reproducibility and credibility of paradata describing practices and processes.
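To make the logging idea concrete, here is a minimal Python sketch of automatic paradata generation during processing, with each entry chained to the previous one by a hash so tampering is detectable, echoing the integrity goal the abstract associates with log files and blockchain. The field names and the `log_paradata` helper are assumptions for illustration, not the chapter's own scheme.

```python
import datetime
import hashlib
import json

def log_paradata(log, actor, action, target, note=""):
    """Append a timestamped, hash-chained paradata entry for one processing step."""
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "target": target,
        "note": note,
    }
    # Chain each entry to its predecessor: altering an earlier record
    # invalidates every later hash.
    prev_hash = log[-1]["hash"] if log else ""
    payload = prev_hash + json.dumps(entry, sort_keys=True)
    entry["hash"] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    log.append(entry)
    return entry

# Record two hypothetical cleaning steps on a hypothetical dataset.
log = []
log_paradata(log, "j.smith", "cleaned", "survey_2023.csv", "dropped 12 incomplete rows")
log_paradata(log, "j.smith", "normalised", "survey_2023.csv", "z-scored numeric columns")
```

A real deployment would persist such a log alongside the dataset, so that later users can reconstruct who did what, when, and to which files.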
The chapter examines the legal regulation and governance of ‘generative AI,’ ‘foundation AI,’ ‘large language models’ (LLMs), and the ‘general-purpose’ AI models of the AI Act. Attention is drawn to two potential sorcerer’s apprentices, namely, in the spirit of J. W. Goethe’s poem, people unable to control a situation they have created. The focus is on developers and producers of such technologies, such as LLMs, which bring about risks of discrimination and information hazards, malicious uses and environmental harms; furthermore, the analysis dwells on the normative attempt of EU legislators to govern misuses and overuses of LLMs with the AI Act. Scholars, private companies, and organisations have stressed the limits of such normative attempts. In addition to issues of competitiveness and legal certainty, bureaucratic burdens and standards development, the threat is the over-frequent revision of the law to keep pace with advances in technology. The chapter illustrates this threat with reference to the development of the AI Act since its inception, and recommends ways in which the law can address the challenges of technological innovation without being continuously amended.
Generative AI offers a new lever for re-enchanting public administration, with the potential to contribute to a turning point in the project to ‘reinvent government’ through technology. Its deployment and use in public administration raise the question of its regulation. Adopting an empirical perspective, this chapter analyses how the United States of America and the European Union have regulated the deployment and use of this technology within their administrations. This transatlantic perspective is justified by the fact that these two entities have been very quick to regulate the deployment and use of this technology within their administrations. They are also considered to be emblematic actors in the regulation of AI. Finally, they share a common basis in terms of public law, namely their adherence to the rule of law. In this context, the chapter highlights four approaches to regulating the development and use of generative AI in public administration: command and control, the risk-based approach, the experimental approach, and the management-based approach. It also highlights the main legal issues raised by the use of such technology in public administration and the key administrative principles and values that need to be safeguarded.