A. Introduction
The use of Artificial Intelligence (AI) has caused a disruption in the legal sector, increasingly affecting courts. The discussion of AI potentially taking over the task of adjudication in the not-too-distant future has become somewhat of a “scholarship favorite” in the last few years.Footnote 1 If one listened to AI enthusiasts only, one might get the impression that the advent of fully-fledged “Robo-judges” was already imminent. However, this idea has at least as many opponents as proponents. In legal scholarship, in particular, the concept of an AI judge tends to be met primarily with skepticism. While the approaches may vary quite drastically from one another, the overall conclusion seems to be rather uniform: AI cannot replace the human judge for both technological and legal reasons.Footnote 2
So far, so good. Interestingly enough, though, many scholars do not leave it at that. Rather, they oftentimes extrapolate from negating the feasibility of a complete replacement of the human judge by AI to proclaiming that AI may “only” be used to support the human judge by assisting them in the course of the decision-making process.Footnote 3 This latter claim is mostly based on some version of the notorious “human in the loop” argument.
The mask of this cure-all remedy, however, quickly starts to crack if one zooms in on this framing of the judge as the human in the loop. The widespread underlying intuition that having AI tools assist a human judge is categorically less problematic than fully delegating judicial decision-making to AI turns out to have a rather shaky foundation. To challenge this assumption, it is essential to carve out the core arguments in favor of keeping a human in the loop and trace them back to their roots.Footnote 4 Doing so establishes the baseline for identifying the risks of opting for human in the loop constructs.Footnote 5 To illuminate the technological and legal challenges that “merely supporting” AI tools may pose for the judicial decision-making process specifically, it is not sufficient to stop at AI as an abstract concept. Rather, the focus must shift to concrete AI tools, categorized against the backdrop of the different stages and features of court proceedings.Footnote 6 These AI use cases may then serve as representative examples of graduated degrees of AI involvement in the judicial decision-making process, with each of them having a rather distinct potential to impact the judicial decision. The outcome of the analysis urges a rethinking of the current understanding of AI “co-working” with judges. It proposes a move towards a clearer division of labor, where AI takes on specific, well-defined tasks that do not intrinsically require human judicial expertise, rather than an intertwined assistance model. For those scenarios where human and AI contributions are deeply interwoven, a conscious decision whether AI or a human judge should perform the underlying tasks in the future is required.Footnote 7
B. AI Assistance Over AI Delegation: Roots and Core Arguments of the Human in the Loop-Intuition
As we all know by now, AI is everywhere and here to stay. The manner in which AI is incorporated into our lives remains, however, a moving target, and so does the scholarly debate analyzing it. For many years, the discussion has mostly been focused on whether and in which domains AI will replace humans. As is the case with most technological advancements, one of the first questions widely discussed with the rise of AI has been which jobs will no longer exist for humans. The legal profession, let alone the judiciary, was not, however, at the heart of this debate. Rather, the debate centered around whether we will still require human drivers, human pilots, human servers, or human nurses, to name a few.Footnote 8 Recently, the discussion has been shifting more and more towards the question of AI assisting humans. This may seem counter-intuitive at first glance, as one may consider moving from AI as a mere assisting tool with limited capabilities to AI as an alternative to a human to be the more obvious scenario in the course of technological evolution. This paradox can, however, be explained by taking into consideration which tasks are at the heart of the “AI assisting humans” versus the “AI replacing humans” debate. With the advances in AI, the areas in which AI is, in fact, capable of replacing the human, or has done so already, are increasingly not worth debating because AI outperforms the human more and more visibly without creating additional or greater risks in comparison to sticking with humans to perform the task in question. For example, it is no longer up for debate that AI greatly outperforms any human counterpart when it comes to collecting data about users’ online behavior and interpreting this data to recommend them a new book to read or a new video to watch.
Such use of AI nevertheless, of course, raises numerous questions, including whether we actually want AI to take on these tasks – for example, due to concerns about subliminally influencing the decision-making process of the users. Such objections are, however, not rooted in doubts that AI is able to perform the task in question. In fact, the opposite is the case: AI is so good at performing this task that humans may question whether the task itself requires reevaluation. With regard to tasks in areas in which humans being replaced by AI has been considered rather unlikely from the start – for example, due to the tasks’ reliance on core human skills or the lack of a reliable and objective way to measure the accuracy of the outcome produced by the AI tool – our perception has not substantially changed with technological advancements. On the contrary: The more we understand how AI functions, the less we think it is likely for it to fully replace humans with regard to these tasks. This, in turn, results in scholarship moving beyond AI versus human-scenarios and instead putting AI and human-scenarios, characterized by an AI assisting a human to fulfill tasks, increasingly at the heart of their assessment.Footnote 9
What characterizes these “AI assisting humans” scenarios? They concern areas in which AI has not proven its capabilities in a way allowing us to fully rely on it, but has proven enough of them to make it seem a plausible tool to support the human in charge. More specifically, the tasks in question usually have features which require abilities AI may have become surprisingly good at mimicking but, in fact, cannot tackle in the manner we, as humans, consider appropriate. Therefore, we rule out replacing the human with AI by fully delegating the corresponding tasks to it. However, due to the oftentimes astonishing mimicking abilities of AI, the temptation increasingly grows to make use of AI despite the dissonance between the abilities required to conquer these tasks, on the one hand, and how AI actually functions beyond merely mimicking human outcomes, on the other hand. The same effect can be observed when the involvement of AI is credited with certain advantages, such as lowering the cost of fulfilling a certain task or increasing the effectiveness of the underlying decision-making process, but equally attributed risks or drawbacks, such as discriminatory effects.
One of the most popular coping mechanisms – if not the most popular one – for AI lacking some abilities required to fulfill a certain task, or at least for reducing the risks and disadvantages resulting from using AI, has been constant reassurances towards those affected by it that AI, figuratively speaking, will not be let off the leash. Rather, so the argument goes, the human remains in charge.Footnote 10 Depending on the specific AI application and the tasks, the terms used may differ; all of them, however, collectively uphold the importance of a human monitoring AI when in use.Footnote 11
Judicial decision making is oftentimes referred to as a task for which keeping a human judge in the loop is of particular importance.Footnote 12 Nevertheless, the underlying reasoning of human oversight is, of course, neither unique nor limited to AI involvement in court proceedings. Rather, as elaborated, the intuition “AI assistance over AI delegation” spreads across various areas and disciplines in an equal manner. Examples are seemingly endless, with some of the most notorious ones being practicing medicine,Footnote 13 driving a car,Footnote 14 and using weapons.Footnote 15 In the course of this Article, I will draw on empirical evidence concerning human in the loop-scenarios not only in judicial decision making but rather also in some of these other areas just mentioned, provided that the reference points of the studies are comparable to the ones of court proceedings.
I. Technological Reasoning
So, what is the foundation of favoring having a human in the loop whenever AI involvement is discussed? What are the reasons given across the fields? Broadly speaking, one may identify three types of reasoning. The first one is what I call “technological reasoning”: Like any other technology, AI is more or less prone to glitches which could be exploited by ill-intended people. In addition, AI systems may be directly manipulated or hacked. These technological “mishaps” are, in turn, likely to result in wrong output, undeservedly favoring the ill-intended or harming others. Human oversight can, of course, never fully shield innocent bystanders from technological errors of an AI system. However, some scholars claim that if there is human oversight, manipulation, hacking, and glitch exploitation are at least not as dangerous as they would be in case of delegating the task fully to AI.Footnote 16
Of course, technology-based reasoning does not only apply to these worst case scenarios of ill-intended people actively trying to “break” or “trick” the AI system to act in their favor.Footnote 17 Rather, reservations towards fully delegating tasks to AI due to technological concerns oftentimes reach further, including also any scenarios in which AI may “merely” act unpredictably on its own from the perspective of a human observer.Footnote 18 In such cases, AI is not manipulated. In fact, it may actually perform as it was set out to do. Due to a lack of understanding of AI’s decision-making process, however, the humans involved do not feel confident to let AI impact the course of action in an unfiltered manner.Footnote 19 The human in the loop is meant to perform a type of “rationality check.”
II. Legal Reasoning
The second type of reasoning in favor of having a human in the loop when using AI is one founded in law, therefore legal reasoning. In many scenarios, scholarship frames the necessity of a human remaining in the loop primarily as a result of current liability regimes.Footnote 20 This does not come as a surprise, as AI systems are, at least for now, not considered legal subjects or actors who can be held directly liable for the damages they cause.Footnote 21 Therefore, a human is required to whom the AI system in question may be attributable.Footnote 22 Placing a human in the loop for liability reasons is not, however, primarily aimed at making sure that the person damaged by an AI system will, in fact, get compensated. Rather, it is a mechanism for those developing an AI system to protect themselves by, at least partly, shifting the burden of any mistakes their AI system may make to another human, namely the human in the loop responsible for overseeing the AI system in use.Footnote 23 The underlying reasoning corresponds with the overarching concern when it comes to fully automated decision-making: the awareness that the AI tool is not reliable enough to perform the task in question by itself, without any human oversight – in this case with the addition: at a tolerably low liability risk for the product developer.Footnote 24
When it comes to the state using AI in order to fulfill its sovereign tasks, the reasoning of keeping a human in the loop has some similarities with the one based on liability concerns. It is, however, oftentimes additionally embedded in more general concerns of accountability and democratic legitimacy:Footnote 25 Whenever the state uses AI in the course of fulfilling its sovereign tasks, keeping a state representative in the loop is not merely a mechanism to shield those developing the AI tool in use from liability. Rather, through the lens of the law, decisions of the state are unique insofar as who decides, and also who is the human in the loop in case of “mere” AI assistance, matters for them to qualify as sovereign actions. Generally speaking, in order for the decision to be sovereign, the state needs to be the one to make the decision, to apply the law, to enforce it. The state does so by appointing certain individuals out of the pool of “the people.” In the context of judicial decision-making, which is, of course, one of the core cases of the state using AI in order to fulfill its tasks, it is the individual serving as the judge. Because the idea of appointing an AI system as a judge is widely rejected on various grounds,Footnote 26 only human judges may decide “in the name of the state.” Having an AI system make judicial decisions without the appointed human judge staying in the loop would result in a decision not imputable to the state. The state would “outsource” some of its core tasks to an entity, in this case an AI system, which is not considered a representative of the state.Footnote 27 The outcome, therefore, would not qualify as a “judicial decision.”Footnote 28
With regard to judicial decision-making specifically, further legal reasons in favor of keeping a human in the loop concern the specific function a judge plays within the legal system and the core legal principles guiding judicial decision-making. These are, first and foremost, judicial independence, transparency of the judicial decision-making process, equality before the law, and the overall right to a fair trial.Footnote 29 This type of legal reasoning is, however, not only invoked when justifying the need for a human judge to remain in the loop. Rather, these legal principles determining the process of judicial decision-making are also the relevant benchmarks when it comes to evaluating the admissibility of specific AI tools used in the course of judicial decision-making altogether, thus also in an assisting capacity.
III. Psychological Reasoning
Thirdly, there seem to be, though not always made explicit, psychological reasons for favoring human-in-the-loop solutions over fully delegating a task to AI. The most obvious ones are based on both the person using AI and the person affected by the use of AI feeling more at ease with having human oversight. While this is, of course, not a universal truth and, on top of that, something that may change over the course of time, quite a few studies have been conducted confirming that humans do in fact prefer to keep a human in the loop whenever AI is involved. This seems to be the case not only with regard to decisions directly aimed at and affecting the individual in question but also when reflecting upon designing decision-making processes in the abstract. One reason for this intuition is that there is an “identifiable entity,” an aspect relevant not only through the lens of legal reasoning, but also in the psychological dimension.Footnote 30 Interestingly enough, it seems to be the case that whenever anything goes wrong, we as a society would by and large prefer to attribute the mistake to a specific human than to have “no one to blame.” It may even go so far as accepting a higher error rate with a human in the loop to shift the blame to, rather than a statistically lower error rate without a human in the loop.Footnote 31
There is, however, another more subtle psychological dimension which could explain why keeping a human in the loop in the context of AI has been a widely popular approach: The human in the loop-scenario could be coined as a case of “making the choice for the option in the middle.” The underlying reasoning is best known from customer behavior when choosing between three versions of the same product with varying levels of functions or sophistication, on the one hand, and prices on the other hand: Studies have shown that most people do not want to buy the cheapest version of the product as it entails having the one with the lowest level of sophistication. At the same time, they are also not willing to purchase the most advanced version of the product as it is the most expensive one, requiring them to spend the maximum amount of money for the product in question. Therefore, they settle on the one in the middle: The price they pay is not as high as the one of the most expensive version of the product. The level of sophistication of the product is also not as low as the one of the least expensive version. The one in the middle is perceived to be the best value, the best product one can get for the least amount of money, the happy medium, so to speak.Footnote 32
This psychological reasoning appears to fit the scenario discussed here as well: Sticking with tasks being solely performed by humans, having no AI involvement at all, can be understood as the least sophisticated yet safest option from the perspective of the individual confronted with the choice of using AI to fulfill a task or not. In contrast, fully replacing the human with AI, resulting in AI performing the task without any human involvement, may be perceived as the most sophisticated yet riskiest option in terms of the “price” one pays. It is often considered, figuratively speaking, as coming at too high a cost to delegate tasks to AI without any human oversight. When applying this logic, keeping the human in the loop while still having AI support them may very well be considered the “best of both worlds,” the middle ground.Footnote 33
C. The Dangers of Having a Human (Judge) in the Loop: Potential Conceptual Misunderstandings
Having outlined the foundations of favoring having a human in the loop whenever AI is involved as well as the main types of reasons given across the fields, I now shift the focus to the risks of opting for human in the loop constructs. The aim of this part of the Article is to carve out not only general concerns regarding human in the loop scenarios but also showcase why having AI supporting human judges can be particularly harmful.
I. The Good Outweighing the Bad and the Ugly?
When analyzing the scholarship on using AI in court proceedings, be it instead of a human judge or to assist them, it is rarely the case that the involvement of AI is considered to be fully positive without any reservations, risks, or drawbacks. Quite the opposite: Scholars have been pointing out various potentially negative implications of using AI in the course of judicial decision-making, flagging risks such as opaque decisions, a lack of accountability, a violation of judicial independence, and concerns of unfair and discriminatory treatment of those subject to the judicial decisions.Footnote 34 Alongside scholars flagging AI-related hazards, the world has witnessed quite a few incidents of “using AI in court gone wrong” which seemingly confirmed many of the concerns; in fact, so much so that these AI use cases have, in turn, been serving as notorious examples to showcase just how risky it is to use AI in the course of law enforcement. Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) is potentially the most famous of them.Footnote 35
As outlined above, though, limiting the involvement of AI to a “merely” assisting tool in support of the human—here, the human judge, who is still in charge of making the decision—is often framed to be “as good as it gets.”Footnote 36 Quite a few scholars claim that while involving AI in judicial decision-making goes hand in hand with potential risks and downsides, using—at least some types of—AI tools while human judges remain in the loop does overall more good than harm. All things considered, AI supporting human judges is therefore considered to be an improvement over the alternatives: sticking with human judges without any AI involvement in their decision-making process, on the one hand, or fully delegating judicial decision-making to AI, on the other hand.Footnote 37
Some scholars, in fact, seem to consider AI involvement in judicial decision-making as mainly positive, so long as the human judges are the ones making the final decision.Footnote 38 In support of this position, scholars point to weaknesses human judges notoriously display. First and foremost, human judges are susceptible to a variety of cognitive biases as well as personal prejudices. Humans—and consequently human judges—are also known to be inconsistent in their decision-making.Footnote 39 Against this backdrop, AI is often considered to be a tool which could potentially help to “de-bias” judges,Footnote 40 and increase their consistency.
II. AI Versus Human (Abilities): Not a Matter of Degree but of Kind
At first glance, approaching the question of AI-assistance in the course of judicial decision-making based on the “Do Benefits Outweigh Drawbacks?” criterion appears intuitive. However, upon closer examination, this framing raises fundamental problems that cast doubt on its plausibility.
A core flaw of this “Do Benefits Outweigh Drawbacks?” approach is that it downplays the fact that humans and AI operate on the basis of fundamentally different inner logics, which in turn directly affects how the quality of their output, as well as the path by which each arrives at it, is to be determined.Footnote 41 With AI’s mode of operation being fundamentally different from that of humans, it seems rather questionable from the outset to what extent one may speak of “better or worse” or “more or less.”
This fundamentally different mode of operation of humans and AI manifests itself in numerous ways. For the use of AI in the course of judicial decision-making, the differences in handling natural language are at the center. The law is, as is well known, expressed in natural language, with natural language serving as the medium, as is always the case when humans use it. AI is also capable of processing natural language. In the case of generative AI, such systems can even independently generate natural language. However, unlike humans, AI does not use natural language merely as the medium for expressing meaning or communicating information and content. Rather, for AI, natural language represents the end goal.
This distinction is a direct result of how AI processes natural language: Even in the case of Large Language Models (LLMs), their processing is a purely statistical analysis based solely on historical data and patterns recognized therein. With the release of newer models like Claude 3.7 Sonnet and DeepSeek R1, the reasoning process of LLMs has become more similar to how the human brain works. This is because these newer models integrate reasoning as a core capability within a single model, thereby using a hybrid reasoning approach which combines quick answers with deliberate, step-by-step analysis for complex problems, similar to how humans might approach different tasks. In contrast, the models prior to them used separate models, namely one for quick answers and another one for solving complex problems. However, these technological advances do not change the fact that the way these LLMs “think”—even when in so-called “extended thinking” mode—is fundamentally a different type of intelligence, as they still lack the human elements of consciousness, emotions, and embodied experience. They merely show their work steps, which enables the user to read how the model got to the answer it provided (“Chain-of-Thought”), and even this aspect is not fully reliable.Footnote 42 AI thus does not have an understanding of the subject matter expressed through natural language and, consequently, lacks language comprehension in the human sense.Footnote 43 With the rise of ChatGPT, Noam Chomsky, Ian Roberts, and Jeffrey Watumull poignantly emphasized this aspect by stressing that any equating of AI-based and human language processing is based on a “fundamentally flawed conception of language and knowledge.”Footnote 44 Though one might get a different impression given their human-like output, AI-based systems are incapable of distinguishing “the possible from the impossible” as they are detached from the real physical world.
They cannot, by nature, develop an understanding of the “physical and social situations” expressed through natural language.Footnote 45
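The purely statistical, pattern-based mode of text generation described above can be made tangible with a deliberately minimal sketch: a bigram model that predicts the next word solely from word-frequency counts in its training data. The tiny corpus and the function `predict_next` are invented for illustration and bear no resemblance to the scale or architecture of any production LLM; the point is only that such a model emits plausible continuations without any representation of what the words mean.

```python
from collections import Counter, defaultdict

# Toy "language model": predict the next word purely from
# co-occurrence statistics, with no notion of meaning.
corpus = (
    "the court dismissed the claim "
    "the court upheld the claim "
    "the court dismissed the appeal"
).split()

# Count how often each word follows each other word.
successors = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the statistically most frequent successor of `word`."""
    return successors[word].most_common(1)[0][0]

# "dismissed" is followed by "the" in every training example, so the
# model predicts it without any notion of what a dismissal is.
print(predict_next("dismissed"))  # -> the
```

Scaled up by many orders of magnitude and refined with learned weights rather than raw counts, this is still prediction over patterns in historical text, which is why fluent output alone does not evidence comprehension.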
This lack of genuine language comprehension is already problematic when using natural language in everyday conversational settings. In the context of law and its application, the negative implications of AI’s deficient language comprehension are particularly severe. Judicial legal application is notably not—merely—language processing.Footnote 46 Rather, it follows a specific methodology that has evolved within a legal community and is passed on to the next generation through legal education.Footnote 47 Moreover, the law and its application are inherently dynamic. Structurally, the process of judicial legal application, at least in the case of legal systems which follow the civil law tradition, exhibits a top-down approach: The judge derives from a general norm how the specific case should be adjudicated. In contrast, AI systems, at least in the case of Machine Learning applications such as LLMs, do not, by their operational nature, start out with a general rule but rather with numerous individual cases. Consequently, AI would require the capability of deriving the general rule bottom-up from all these individual cases. However, this in itself currently presents a nearly insurmountable challenge for AI.Footnote 48 To replicate the process of judicial legal application, AI would furthermore need to evaluate the specific case at hand based on this general rule derived. As of now, AI fails to do so as well. Instead, AI evaluates the case at hand simply based on patterns it has derived from numerous individual cases and transferred to the specific case, without following legal methodology.Footnote 49 While the output generated by AI may be identical to that of the judge, the path to this output could hardly be more different.
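The structural contrast just drawn between top-down subsumption under a general norm and bottom-up pattern extraction from individual cases can be sketched in miniature. The liability “norm,” the case features, and both functions below are invented toy examples; real adjudication and real machine learning models are vastly more complex. The sketch only illustrates the point made above: both paths can produce the identical output while following entirely different logics.

```python
# Top-down, as in civil law adjudication: a general norm is stated
# first, and the concrete case is subsumed under it.
def apply_norm(damage_caused: bool, acted_negligently: bool) -> str:
    # General rule: whoever negligently causes damage is liable.
    if damage_caused and acted_negligently:
        return "liable"
    return "not liable"

# Bottom-up, as in pattern-based machine learning: the outcome is
# inferred from the most similar previously decided cases, without
# the general rule ever being stated.
past_cases = [
    ({"damage_caused": True, "acted_negligently": True}, "liable"),
    ({"damage_caused": True, "acted_negligently": False}, "not liable"),
    ({"damage_caused": False, "acted_negligently": True}, "not liable"),
]

def predict_from_cases(new_case: dict) -> str:
    def similarity(case_features: dict) -> int:
        # Count how many features match the new case.
        return sum(case_features[k] == new_case[k] for k in new_case)
    _, outcome = max(past_cases, key=lambda c: similarity(c[0]))
    return outcome

case = {"damage_caused": True, "acted_negligently": True}
# Both paths yield the same output here ...
print(apply_norm(**case), predict_from_cases(case))  # -> liable liable
# ... yet only the first follows a stated general rule.
```

The bottom-up predictor never derives or applies the norm; it merely finds the nearest precedent pattern, which is precisely the methodological gap the text describes.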
Against this backdrop, it is apparent that judicial legal assessment is far from being a simple algorithmic procedure. Rather, it represents a complex, multi-faceted endeavor that resists comprehensive replication by AI systems. While narrow, standardized, legal procedures may be amenable to automation, the vast majority of judicial legal assessment involves a degree of nuance and complexity that current AI technology is ill-equipped to handle.
A similar picture can be painted with regard to judicial tasks aimed at establishing the facts of the case. AI may demonstrate potential in isolated aspects, particularly in specific tasks such as document authentication or organizational functions. However, fundamental limitations persist, especially in areas requiring deep language understanding, emotional intelligence, and holistic reasoning. Even advanced AI applications, such as deep learning-based language models, fall short of the judicial competencies required for comprehensive fact-finding. The inability to fully grasp communicated content or to interpret nonverbal cues precludes AI from wholly supplanting human judges in this domain.
Given that these limitations of AI are, as elaborated, the result of fundamental differences in cognitive architecture between human and machine intelligence, they are not merely quantitative, that is, a matter of processing power or data volume, but qualitative.Footnote 50 The fact that AI is “thinking” in ways that are fundamentally alien to human reasoning leads to what we might term insurmountable structural incompatibilities with the judicial process.Footnote 51
To conclude, the process of judicial decision-making is inextricably linked to law as a human construct, expressed through natural language and interpreted through culturally-informed hermeneutics. AI can, at best, imitate aspects of this process but falls short of replicating this level of contextual understanding and reflective, context-sensitive, adaptive reasoning. Even in cases where AI systems demonstrate high accuracy rates in predicting judicial outcomes, as seen in some assessments of “predictive justice” tools, this superficial success belies a deeper failure to grasp the underlying legal norms and methodological requirements that guide judicial legal assessment. The inability of AI to comprehend these elements, even at a rudimentary level, underscores the qualitative gap between statistical correlation and true legal reasoning.Footnote 52
III. Focus on the Output?
Some scholars are, however, not convinced that the fundamentally different functioning of AI and humans as such is reason enough to reject the “Do Benefits Outweigh Drawbacks?” criterion when approaching the question of AI-assistance in the course of judicial decision-making. After all, human judges are “black boxes” too. What happens inside a human judge, how they actually make decisions, and for which reasons are not accessible to anyone besides themselves.Footnote 53 Therefore, so the argument goes, the output is the only relevant criterion for evaluating whether AI shall assist a human judge; the path by which AI reaches an output shall, in contrast, be completely disregarded.Footnote 54
By solely focusing on the output and evaluating it from an external perspective, this approach reduces judicial reasoning to a simple performative test: what occurs within the decision-making system is irrelevant. Only the final product matters, a position that directly mirrors the conceptual framework of the Turing Test. In the judicial context, this approach would pose a singular evaluative question to determine whether an AI tool meets the threshold set by a human judge: Can the AI-generated output—more specifically, AI-generated judicial reasoning—be distinguished from a human-authored text? Focusing on the output seems to be the overarching approach of quite a few experiments centered on LLMs solving legal questions and comparing their output to that of judges or law students. Most of these studies do so while simultaneously acknowledging that judicial decision-making is more than its output.Footnote 55
This approach, however, fundamentally misapprehends the nature of judicial reasoning. Judicial decision-justification is not merely an exercise in generating text that superficially resembles legal reasoning. It is a dynamic process inherently linked to the decisional moment itself. A judicial explanation is not a post-hoc narrative imposed upon an arbitrary decision, but a critical component of the judicial process that reveals the internal logic of the decision.Footnote 56
Even from a legal realist perspective which acknowledges that judicial reasoning may not perfectly align with the idealized narrative of purely rational decision-making,Footnote 57 the justification process remains constitutive of judicial decision-making; even if it is only through ex post rationalization by the human judge. The requirement of reasoned explanation fundamentally constrains judicial discretion. A legal outcome that cannot be articulated through legally requisite forms of reasoning cannot be legitimately rendered.Footnote 58
The Turing Test, originally conceived as an “imitation game,” captures only the external perspective of reasoning. Judicial explanation, by contrast, demands an internal perspective that current AI systems categorically fail to reproduce. The machine can mimic, but it cannot truly reason.
IV. Human Judges as “Rubber Stampers”: The Inability of Humans to Properly Monitor AI and the Decline of Human Professionals
However, the mere fact that AI approaches legal reasoning in a fundamentally different way from human judges, and that this internal perspective is inherently linked to our current understanding of judicial decision-making, does not, as such, explain why having AI support human judges can be particularly harmful, even when compared to fully delegating judicial tasks to AI. In fact, quite the opposite seems intuitive: Because AI tools operate under a set of rules fundamentally different from those governing human decision-makers, having a human judge in the loop to ensure that the outcome provided by an AI system matches what the rules of judicial decision-making intend appears to be a promising approach. Why is it, then, that having a human in the loop seems not to have the desired effect of a monitoring mechanism but rather leads to an even more opaque, harmful outcome?
The root of the evil lies in the assumption “that human-machine systems represent the best of both worlds and don’t introduce new issues of their own.” Though intuitive, it is wrong and can become dangerous whenever adopted as the premise for a human in the loop construct.Footnote 59 Even if, based on the MABA–MABA (Men Are Better At–Machines Are Better At) approach, one correctly identifies weaknesses of an AI system and corresponding strengths of a human, it does not necessarily follow that these will even each other out when combined. Instead, the enterprise as a whole becomes even more likely to fail, with the dubious difference that the human in the loop is now available as a scapegoat.Footnote 60
1. Humans Falsely Project their Way of Seeing the World onto AI
The first facet of the explanation for why a hybrid system may foster the worst of both worlds rather than combining the best of both concerns how humans perceive AI tools and how they assess their output. Empirical studies show that humans are prone to project their way of seeing the world onto AI. This, in turn, makes it hard for humans to detect mistakes made by an AI system.Footnote 61 It becomes particularly difficult for humans to identify mistakes of AI tools when the AI takes over an aspect of a task that humans are not good at performing themselves.Footnote 62 Ironically, these are precisely the types of tasks often deemed most suitable for having an AI system support the less competent human.Footnote 63 The human placed in the loop turns into a “rubber-stamper” who does not genuinely oversee the decisions made by the AI system.Footnote 64
2. Degradation of Vigilance
The second negative consequence of having humans monitor AI-generated results concerns the ability of humans to be sensitive to unpredictable events and to detect them whenever they may occur over a period of time.Footnote 65 This ability is called “vigilance.”Footnote 66
Empirical evidence suggests that this ability to detect unpredictable events has the tendency to degrade over time particularly due to automation complacency.Footnote 67 As Clark put it: “The more powerful and capable an automation system appears, the greater the vigilance decrement per unit time.” Interestingly enough, “[t]his vigilance decrement effect of automation is most pronounced in environments where automation support is present for only a sub-set of tasks, that is, the subject must also perform other manual tasks in addition to ‘backing up’ or monitoring the automation.”Footnote 68 The reason is that “subjects developed selective ignorance of conflicting information, a bias towards trusting the automation system even in instances where other conflicting data was clearly visible, a ‘looking-but-not-seeing’ phenomenon.”Footnote 69
3. Degradation of the Skill(set)
This degradation in the sphere of the human placed in the loop to monitor an AI performing a task is not limited to the general ability to detect unpredictable events. Rather, the specific skill or skillset of the human is affected in a similar manner; a rather obvious result, considering that any human in the loop scenario is characterized by the human no longer using their skills to complete the task in question and instead merely observing an AI attempting to do so.Footnote 70 Empirical studies, once more, seem to confirm this consequence. While these studies mostly focus on “practical skills” like those required to fly a plane, drive a car, or diagnose a patient, there is no reason to assume that the skillset necessary to serve as a judge would not be equally affected when judges are reduced to monitoring AI in the course of judicial decision-making.Footnote 71
4. No Good, only the Bad and the Ugly? Supporting AI Gradually Replacing Human Judges
The result of using such supposedly merely supporting AI tools in the course of judicial decision-making is a gradual replacement of human judges under the guise of human oversight. The human judge may be perceived as the one making the decision. But: “Humans tend to become reliant on automated decision-making systems. They trust statistical data and begin to give up on their own independent judgment, and become blind to systems errors.”Footnote 72 This overreliance on technology turns human in the loop scenarios into de facto delegations of the decision to AI. The negative consequences of delegating the decision-making process as a whole to an AI system, which keeping the human judge in the loop was meant to avoid, therefore manifest all the same.
To make matters worse, there is, as elaborated above, plenty of reason to assume that such de facto delegations to AI are prone to cause even more harm than conscious delegations to AI with a corresponding legal basis. Keeping a human in the loop is oftentimes motivated by the need to create some sense of safety for those affected by the use of AI. In addition, the safety mechanisms for the AI tool itself are likely to be less rigid in the de facto scenario, as the human oversight is a construct created precisely to compensate for any shortcomings or unpredictable behaviors displayed by the AI tool. Because human oversight does not actually live up to any of these expectations, however, the sense of security derived from it is not merely false but opens the floodgates to using AI tools which would never pass as safe, accurate, or trustworthy enough to delegate decisions to.Footnote 73
D. AI Supporting Human Judges: Systematic Overview of AI Use Cases in Court Proceedings
So, what does all of this mean for AI potentially assisting a judge and thereby creating a scenario in which the judge becomes the human in the loop? Given the unique features of judicial decision-making and the various types of AI tools, it seems worthwhile to pause and assess these tools and their impact on judicial decision-making before recommending how to move forward.
When referring to AI as a supporting tool, assisting the human judge in the course of the decision-making process, the range of AI tools forming this category can hardly be overstated. AI may be used for as little as taking on minor organizational aspects of the judicial decision-making process. It may, however, equally be used for as much as preparing a full draft of the judicial decision, enabling the human judge to simply “sign on” to it to make it their decision.
The scholarship on using AI in support of human judges is similarly heterogeneous to the AI tools themselves, as most scholars simply “pick and choose” some use cases of how AI could assist in the course of judicial decision-making instead of providing a holistic analysis. In addition, the normative framework against which these AI use cases are measured varies. Some scholars consult rather vague legal principles, partly obscuring the assessment further by blurring the line between these legal principles and ethical guidelines.Footnote 74 Others, in contrast, conduct narrower analyses in light of concrete procedural legal norms, such as—parts of—fundamental procedural rights like the right to a fair trial, or specific provisions of ordinary statutory law.
In the following part of the Article, I attempt to categorize the variety of AI tools which may be used in different stages of court proceedings or which aim to recreate certain features of judges and their decision-making process. The purpose of this categorization is not to comprehensively list all AI tools potentially suitable for use in court proceedings; such an undertaking would be neither possible nor useful in light of the focus of this Article. Rather, the AI use cases chosen here shall be understood as representative examples of graduated degrees of AI involvement in the judicial decision-making process, with each of them having a rather distinct potential to affect judicial decision-making. Within the different degrees of AI involvement, the specific judicial task the AI aims to support is the determining factor for categorizing and analyzing AI use cases in court proceedings.
I. Pre-Trial and Post-Decision-Making
The first category of AI tools concerns aspects of the judicial decision-making process which are usually not considered core tasks of a judge, or even tasks to be personally carried out by a judge at all. They are, nevertheless, adjacent to judicial decision-making. This concerns tasks carried out in preparation for the actual trial and tasks required after the judge has made a decision. I refer to them as “pre-trial” and “post-decision-making” use cases of supporting AI tools. Although the spectrum of tools qualifying as AI is notoriously broad, some of these workflow-optimizing tools, applied at a stage of court proceedings before the judge in charge actually assesses the assigned case, may be mere digitization of analog tasks. As will be elaborated below, these types of assisting tools are the least problematic from the perspective of judicial decision-making. Therefore, the blurry line between mere digitization and the use of less sophisticated AI tools in the course of court proceedings is innocuous for the purposes of this Article.
With regard to the pre-trial stage, the main tasks involve receiving information from potential future parties, organizing it, and drawing certain conclusions from it, which are mostly of a formal nature at this stage of a court proceeding. AI-based tools of this kind are often described under the collective term “Case Management Systems.”Footnote 75
One specific way of using AI at this stage of court proceedings concerns how the individual seeking a judicial decision communicates with the court as an institution and, subsequently, the competent judge. With the rise of LLMs such as ChatGPT, AI-based chatbots are becoming an increasingly popular tool to provide information to prospective parties and communicate with them directly. The advantages of such chatbots are self-evident: they are not bound by the opening hours of courts, and they offer an overall low threshold for a first interaction with the judicial system, usually at no cost to the individual making use of them. In addition, if the chatbot is provided by the courts, the information fed into the system can be controlled more easily, resulting in generally more trustworthy and tailored outputs for the users, at least when compared to general search engines such as Google or non-specific LLM-based chatbots like ChatGPT.Footnote 76
When it comes to the post-decision-making phase of court proceedings, the ways of using AI to support the judge and courtroom staff greatly depend on how the legal system in question is designed. Broadly speaking, AI can assist in how a decision is communicated to those affected by it as well as to the legal community as a whole. AI could, for instance, format plain text provided by the judge and turn it into a judgment by inserting it into a template. AI may also adjust the elements of the template to fit the features of the decision in the specific case, such as the date of the decision, the mailing address of the parties, and the name or identification number of the competent judge. Furthermore, AI could once again be utilized in the form of an LLM-based chatbot, providing additional information to the parties upon request or answering their questions about the content of the decision in an easier-to-digest question-and-answer format.Footnote 77 Finally, in legal systems which provide public access to judicial decisions only after anonymizing them, AI tools offer a promising way to support this anonymization process.Footnote 78
II. Organizational Tools Supporting the Decision-Making Process as Such
Organizational tasks are not, however, restricted to the periods before and after the judge makes the decision; rather, they are features of the judicial decision-making process as a whole. Therefore, AI tools aimed at tackling these organizational aspects of judicial work may equally be incorporated into the process of judicial decision-making itself.
One may think of AI tools retrieving data from documents in the course of a court proceeding, for instance by using Optical Character Recognition, or extracting information from databases. Furthermore, AI tools could filter this information and organize it according to criteria defined by the deciding judge. In these cases, the AI tools merely provide information that already exists and is accessible to the human in charge of making the judicial decision; AI may, however, be able to collect it more efficiently and provide the relevant information in a more intuitive format for the judge.Footnote 79
A rather specific, yet highly relevant AI-based tool potentially assisting judges is speech-to-text software. As its name suggests, such AI tools convert spoken language into written text. The main field of application is, of course, the interaction between the judge and the parties in the course of oral proceedings. Nevertheless, such AI tools may also be used to transcribe any other spoken language relevant to the judge in the course of judicial decision-making. The main advantage of such an AI tool, especially when compared to transcription by a third person, is that anyone present in the courtroom can simultaneously review the transcript of the hearing and fix potential errors made by the AI system with the approval of the rest of the people involved. This makes the process of recording court hearings and turning them into a written protocol not only significantly faster but potentially also more accurate.Footnote 80
The common feature of all these AI-based support mechanisms is that they “merely” aim at simplifying the internal and external processes constituting judicial decision-making, thereby making court proceedings overall more efficient, which, in turn, benefits the judge, the parties, and the legal community as a whole. AI tools of the kind just outlined do so, however, without autonomously contributing to the court proceedings and their outcomes in any substantive manner. Therefore, they may, at most, indirectly impact how the judge decides a case.
III. Summarization of Information
The next degree of AI involvement in the judicial decision-making process is AI-based summarization of information which is either requested by or provided to the judge.Footnote 81 In the context of judicial decision-making, such AI-based summarization may assist the judge in the process of establishing the facts of the case as well as in legally assessing these facts. With regard to the process of establishing the facts, AI could, for example, summarize documents submitted by the parties as evidence or expert opinions, sparing the judge from reading them in full. With regard to the process of legally assessing the established facts, AI may be used to summarize case law or relevant scholarly contributions like law review articles.Footnote 82
By moving from AI-based organizational tools to AI-based summarization tools, we are entering an area of activity in which AI increasingly takes on tasks with a potentially substantive impact on the judicial decision. Contrary to AI-based organization of information, AI-based summarization is not merely a case of rearranging information in the way it is presented, such as the format of the output provided to the deciding judge. Rather, AI-based summarization processes information in an altering fashion. This is a necessary consequence of the fact that summaries, by definition, do not include everything in the original document or text but only an aggregated version of it.Footnote 83 Therefore, when it comes to AI-generated summarization, the AI system decides—at least to some degree—which parts of the information provided are of relevance for the judge and their decision-making process. The degree of information alteration by the AI tool is, nevertheless, comparatively low in the case of AI-based summarization. It is limited to selecting parts of the available information and potentially rephrasing them where required to provide a coherent summary. The summarizing AI tool does not, however, add any new information which the initial text or document did not contain.
IV. Detection of Certain Features or Conditions
This last aspect sets apart the types of AI assistance outlined so far from AI systems aimed at detecting certain features or conditions within data to provide new insights. In contrast to AI tools merely organizing or summarizing existing information, AI-based feature detection is not only targeted at making judicial tasks less time-consuming for the deciding judge. Instead, it is first and foremost a way of generating information previously unknown to the judge. AI assistance of this kind directs the focus of the human judge to aspects of data not easily detectable for humans without AI support.
In the course of establishing the facts, such AI tools may be used to detect whether a witness or a party is lying during their testimony. Humans are notoriously weak at determining whether someone else is telling the truth. With AI being able to pick up on involuntary external clues to internal emotions, such as micro-expressions occurring within a fraction of a second, it may have the potential to compensate for some of the deficits human judges display when it comes to emotion recognition.Footnote 84
AI tools aimed at detecting certain features or conditions within data to provide new insights may further be consulted to generate predictions in light of the file of the specific case.Footnote 85 Such predictions could play a similar role to the statements of expert witnesses. In both scenarios, the human judge requires additional non-legal knowledge they do not possess in order to establish the facts.Footnote 86 Alternatively, a human judge may consult an AI tool to receive a second opinion or to contrast it with the result of their own assessment.
With regard to legally assessing the established facts, AI tools may assist the judge in a feature-detection manner by identifying cases similar to the one at hand.Footnote 87 A structurally identical approach may be chosen to have AI assist the judge in tracking down pertinent case law or pieces of legal scholarship.
V. Answering Case-Specific Questions
These AI-based substantive assessments with regard to the specific case lay the groundwork for seamlessly moving to the next degree of AI involvement in the judicial decision-making process, namely AI tools assisting the judge by answering concrete questions arising from the case at hand.
When deciding a case, judges do not merely add one excerpt of case law or legal scholarship after the other. Rather, they are required to provide a legal assessment of the specific pending case. Therefore, judges may not only pose abstract legal questions to an AI tool but also tailor them to fit the characteristics of the specific case. Instead of requesting an overview of the current legal situation regarding, for example, strict liability, the judge deciding a case of strict product liability might ask the AI system: “Assuming the conditions x, y, and z, is producer A liable for the harm to person B which was caused when using product C?”
When asked such a specific, case-related question, the AI tool would conduct a search in a similar manner as outlined above. Instead of merely outputting the results of its search, though, it could additionally draft a response to the question on the basis of these results and communicate it to the human judge in natural language. LLMs are, of course, particularly fitting for this type of AI assistance.Footnote 88
VI. Making Drafts and Recommendations
The highest degree of involving AI in the course of judicial decision making without fully delegating the decision-making process to the AI tool, resulting in what is usually called a “fully automated decision,” is having an AI tool generate a draft of a decision.Footnote 89 The human in charge of making the judicial decision may use such a draft as a basis for their own decision. Alternatively, the judge may also fully adopt the draft and declare it as their own decision.
Instead of requesting only a single draft from the AI tool in use, the judge could also have it produce multiple drafts it considers suitable for deciding the case at hand, leaving it to the judge to choose the draft they deem best fitting in light of the specific circumstances of the case. The judge may also decide that none of the drafts provided by the AI system are, in fact, in line with their take on the case, and reject all of them. An AI tool providing multiple drafts for one case may further support the judge by recommending one draft out of the proposed ones as, for example, particularly unlikely to be overturned by an appeals court.Footnote 90 It also seems feasible that AI could add explanatory notes to each of the drafts, which are not part of the draft as such but make it potentially easier for the judge to choose one draft over the others.
The transition from AI “merely” answering questions posed by the judge in light of a specific case to AI drafting a decision which is then proposed to the deciding judge is, of course, once again fluid. An AI system may, for instance, merely assist the judge in drafting an opinion without providing all of its parts on its own; such a scenario could qualify both as answering a case-specific question and as providing a—part of a—draft of a judicial decision.Footnote 91 The same holds true for the line between such an AI-generated draft of a judicial decision on the one hand and an AI-generated judicial decision on the other.
E. Protecting the Good against the Bad and the Ugly? Concluding Thoughts
Having outlined different degrees of potential AI-involvement in the judicial decision-making processes and some specific use cases of AI tools exemplifying them, I conclude by turning to the implications of the findings for how to approach AI tools assisting a judge.
It seems neither required nor useful to collectively declare AI tools dreadful for the process of judicial decision-making. Instead, it is worth digging deeper and reevaluating what we actually mean when we speak of an AI tool supporting the human judge. Under which circumstances does the human judge remain in the loop in a manner that accounts for the concerns outlined in this Article?
I. Division of Labor Instead of “Co-working” with AI as Glimmers of Hope
The concept of the human in the loop scenario as it currently prevails in scholarship is, at least partly, misleading. Why? Usually, it also includes scenarios in which the focus lies on a broad task, some aspects or subtasks of which may be covered by AI. In the case of the judge, the task, understood in this broad sense, is judicial decision-making as a whole. Any involvement of AI in this process in an assisting manner would therefore qualify as “keeping the human judge in the loop.”Footnote 92
However, not all of these scenarios are necessarily cases in which the use of AI may be understood as “only” supporting the human judge by assisting them in the course of the decision-making process, as discussed in this Article. With regard to the concerns raised here, it makes a big difference whether a general, broad task like judicial decision-making is divided into various smaller tasks, some of which remain for the human judge to complete whilst others are delegated to AI, or whether a more specific task is completed by the human judge and an AI system co-working on it.
Only the latter is an actual human in the loop scenario as discussed here. The former, in contrast, is a case of merely assessing a bundle of smaller tasks, previously assigned as a whole to the human judge, in light of a new entity entering the scene as a potential agent for fulfilling these tasks. Against the backdrop of the abilities of humans on the one hand and the abilities of AI on the other, some tasks out of the aforementioned bundle may simply be reassigned, namely to AI instead of the human. Particularly fitting are, of course, those judicial tasks that do not intrinsically require a human judge to begin with, given the non-specific skill set necessary to perform them. This concerns especially the judicial tasks discussed as the first and second categories of AI tools in Section D, namely pre-trial and post-decision-making judicial tasks as well as organizational tasks throughout the court proceeding. It goes without saying that in order to reassign some of these tasks to AI, a thorough assessment of whether the specific AI tool in question is, in fact, able to perform a certain task in the manner required is always needed. However, once this is confirmed, there are no additional hazards resulting from a human and an AI system working side by side and dividing the labor of a broader task between them.
II. What about the Actual Human in the Loop-Scenarios?
Reevaluating AI-based tools used in the course of judicial decision-making through the lens of division of labor, rather than of a human and an AI co-working with the human overseeing the AI, answers some of the challenges outlined above. However, there are certain judicial tasks, as well as types of AI assistance, leading to an allocation of tasks between the human and the AI tool which can hardly be characterized as “division of labor.” This is because the contributions of the human judge on the one hand and those of the AI on the other are too deeply intertwined.
How should we approach these scenarios where human and AI activities are so intricately interwoven that individual contributions cannot be meaningfully disaggregated? One compelling suggestion in scholarship is a “shift from human oversight to institutional oversight for regulating government algorithms.” Such institutional oversight would require a “written justification of its decision to adopt an algorithm in high-stakes decisions,” including “evidence that any proposed forms of human oversight are supported by empirical evidence.” Additionally, these written justifications must be made publicly available in order to receive public review and approval.Footnote 93
This mechanism, proposed by Green, is certainly a conceptually promising way to challenge any type of “blanket rule” allowing state actors to make use of algorithms as long as humans remain in the loop, providing oversight. And yet, some scenarios will nevertheless remain in which this approach leads to odd and somewhat artificial attempts at justification in the hope of bypassing restrictions on incorporating AI into the judicial decision-making process. Therefore, while Green’s approach of requiring institutional oversight undoubtedly makes it much harder and less likely to sneak in the use of AI support through the backdoor, the backdoor nevertheless remains.
Against this backdrop, we should take these remaining human in the loop scenarios for what they are, even if this results in prohibiting AI in certain aspects of the judicial decision-making process altogether. We should acknowledge the fundamentally different approaches of humans on the one hand and AI tools on the other when it comes to certain aspects of judicial decision-making. And if we are not content with the outcome, we should invest our effort in proposing changes to how we conceptualize judges and their decision-making processes. We should not, however, risk blurring the lines in the hope of short-sighted relief in workload and a presumed increase in efficiency. Because once lines become blurry, we cannot expect anyone to see a boundary when we cross it, least of all a human—judge—placed “in the loop.”
Acknowledgements
The author declares none.
Competing Interests
The author declares none.
Funding Statement
No specific funding has been declared in relation to this Article.