When AI Substitutes for Judgment, Where Does Accountability Go?

Document Code: GRL-T1-003-EN

Track: Track I — Foundations & Problem Definition

Category: 판단과 책임 (Judgment & Accountability)

Series: Conditions

Author: Gungri Research Lab / Jung Yuna

Publication Date: April 2026

Version: v1.3




Keywords: AI Substitution for Judgment, AI Decision Substitution, Accountability Gap, Automation Bias, EU AI Act, Human Oversight, Judgment Conditions, HOLD, Judgment-Ready, Autonomous Agent, Judgment Capacity Atrophy


Abstract

AI systems are increasingly embedded in decision-making environments. However, the generation of AI output is not equivalent to the performance of judgment. Judgment has conditions that must be simultaneously satisfied, and AI intervention does not fulfill these conditions but rather circumvents them. This document structurally explains what happens to judgment conditions when AI intervenes in the judgment process, what accountability gaps emerge in the attribution structure, and what problems arise when this structure scales.


This document does not provide conclusions or recommendations.
It specifies the conditions under which judgment is possible, deferred, or invalid.



Definitions

| Term | Definition | Source |
|---|---|---|
| AI Decision Substitution | A state where AI output is adopted as a decision without the human judge verifying the conditions for judgment | Gungri Research Lab Judgment Theory |
| Automation Bias | The cognitive tendency to accept automation system outputs without validating the conditions for judgment | Parasuraman & Manzey, 2010 |
| Human Oversight | The authority for humans to intervene, override, or halt high-risk AI systems | EU AI Act Article 14 |
| Accountability Gap | A structural state where ultimate responsibility for judgment is not substantively attributed to either humans or AI | Gungri Research Lab Judgment Theory |
| Judgment-Ready | A state where all four conditions are satisfied and judgment can structurally occur | Gungri Research Lab Judgment Theory |
| HOLD | The deferral of judgment when conditions are not satisfied. An operational state, not a failure | Gungri Research Lab Judgment Theory |

| Korean | English |
|---|---|
| 인지 | Awareness |
| 방법 | Method |
| 환경 | Environment |
| 기준 | Criteria |

The Four Conditions for Valid Judgment (Gungri Research Lab, 2026)

§1. When AI Output Exists

A decision-maker receives output from an AI system. Results like “Approval Recommended,” “Risk High,” or “This patient has a 78% readmission probability” appear on screen.

The decision-maker can accept this output as-is, disregard it, or seek additional confirmation before deciding.

These three choices are formally equivalent. Yet in practice, they are not.

Accepting AI output eliminates the judgment process. Time is saved. No additional justification is required.

Rejecting AI output requires explanation. Additional time is needed. Responsibility explicitly shifts to the individual.

This asymmetry restructures the conditions for judgment. A path emerges where judgment concludes before Judgment-Ready status is reached.


§2. How AI Intervention Structurally Changes Judgment Conditions

Where does the asymmetry in §1 originate? For judgment to be valid, four conditions must be satisfied simultaneously: awareness (cognition), method, environment, and criteria (see GRL-T1-004). AI intervention transforms each of these conditions.
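The simultaneity requirement can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lab's framework: the `Conditions` class, its field names, and the `judgment_state` function are hypothetical names introduced here, and each condition is reduced to a single boolean for simplicity.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Conditions:
    """Hypothetical record: one boolean per condition, as assessed by the human."""
    awareness: bool    # Definitions-table name for the condition also called cognition
    method: bool
    environment: bool
    criteria: bool


def judgment_state(c: Conditions) -> tuple[str, list[str]]:
    """Return ("READY", []) only if all four conditions hold;
    otherwise ("HOLD", [names of the unmet conditions])."""
    unmet = [f.name for f in fields(c) if not getattr(c, f.name)]
    return ("READY", []) if not unmet else ("HOLD", unmet)


# AI output is present, but the human has only seen the AI's summary,
# so the awareness condition is treated as unmet.
state, unmet = judgment_state(
    Conditions(awareness=False, method=True, environment=True, criteria=True)
)
print(state, unmet)  # HOLD ['awareness']
```

Note that the presence of AI output is not an input to the function: in this sketch it changes nothing about whether the four conditions are met.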

Cognition — Knowing the situation differs from knowing what AI summarized

Before AI: Decision-makers directly examine original data and become aware of the situation.
After AI: AI summarizes and filters data before presenting it. What the decision-maker knows is not “the original situation” but “the subset of the situation AI selected.”

A gap emerges between original and summarized information. What the AI selected and what it excluded typically goes unexplained. Decision-makers feel informed, but what they are actually aware of is the AI’s output. When a medical imaging AI outputs “no abnormality,” radiologists are structurally unlikely to return to the original image for independent review. What the specialist sees is not “the patient’s image” but “the image that AI determined has no abnormality.”

Method — When AI proposes a path, exploration of alternatives ceases

Before AI: Decision-makers independently explore possible courses of action.
After AI: AI proposes a course. The decision-maker’s motivation to explore alternatives diminishes.

This is not laziness. When AI presents what it deems “optimal,” exploring alternative paths incurs additional cost (time, cognitive effort, obligation to justify the deviation). The cost structure of path exploration has fundamentally shifted.

Environment — Rejecting AI output carries structural cost

Before AI: Deferring judgment is “not yet decided.”
After AI: When judgment is deferred while AI output already exists, pressure arises: “Even AI reached a conclusion—why haven’t you decided?”

The existence of AI itself changes environmental conditions. The cost of deferral (HOLD) rises. Although deferral is legitimately justified, in an environment with AI output, deferral is easily categorized as “unnecessary delay.”

Criteria — AI output replaces the basis for judgment

Before AI: Decision-makers judge based on their own criteria (experience, regulations, principles).
After AI: AI’s recommendation becomes the de facto default. To judge differently requires explaining “why I deviated from AI’s recommendation.”

Criteria shift from internal to external. The decision-maker’s criteria no longer serve as the judgment’s starting point; AI’s criteria do. This is formal satisfaction of the criteria condition, not substantive satisfaction.

This analysis draws on a proprietary variable structure not included in this publication.


(This component structure is based on a proprietary judgment condition analysis framework developed by Gungri Research Lab. The detailed structure of that framework is not included in this publication.)

§3. Automation Bias Arises from Cost Structure, Not Trust

Automation Bias is commonly explained as “excessive trust in AI.” Yet structurally, it is not about trust.

Automation Bias emerges from cost asymmetry.

| Action | Cost | Responsibility |
|---|---|---|
| Accept AI output | Low (no additional verification needed, no explanation required) | Distributed (“AI recommended”) |
| Reject AI output | High (justify the rejection, time required, provide alternative) | Concentrated (on the individual who rejected) |
| Defer judgment | High (explain why AI output was disregarded + delay cost) | Concentrated (on the individual who deferred) |

Agreeing with AI output results in low cost and distributed responsibility. Disagreeing incurs high cost and concentrated responsibility.

Within this structure, verification of judgment conditions is bypassed. It is structurally rational to skip verification—because if you verify and reach a different conclusion than AI, you are worse off than if you had not verified.
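The claim that skipping verification is individually rational can be worked through with a small expected-cost calculation. This is a sketch under stated assumptions: the function name, the parameters, and all numeric values are hypothetical and in arbitrary units, chosen only to mirror the low/high pattern in the table above.

```python
def expected_personal_cost(verify: bool, *, accept_cost: float, verify_cost: float,
                           reject_cost: float, p_disagree: float) -> float:
    """Expected cost borne by the individual decision-maker.

    All parameters are illustrative assumptions:
      accept_cost - cost of accepting AI output (low, responsibility distributed)
      verify_cost - effort of checking the judgment conditions
      reject_cost - cost of rejecting or deferring (justification, delay,
                    concentrated responsibility)
      p_disagree  - probability that verification leads to a different conclusion
    """
    if not verify:
        return accept_cost
    # After verifying, the person either agrees with the AI (and accepts)
    # or disagrees (and must reject or defer).
    return verify_cost + (1 - p_disagree) * accept_cost + p_disagree * reject_cost


params = dict(accept_cost=1.0, verify_cost=3.0, reject_cost=10.0, p_disagree=0.2)
skip = expected_personal_cost(False, **params)
check = expected_personal_cost(True, **params)
print(skip, check)  # verifying is the more expensive path for the individual
```

In this model the difference between the two paths is `verify_cost + p_disagree * (reject_cost - accept_cost)`, which is positive whenever verification takes any effort and rejecting costs at least as much as accepting. The cost of a wrong outcome is deliberately left out, because under the table above that responsibility is distributed when AI output is accepted.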

This asymmetry scales from individual to organization. When the organizational norm becomes accepting AI output, individual rejection of it becomes an act of resistance to organizational inertia. “Because AI recommended it” serves both as the individual’s justification for judgment omission and as the organization’s mechanism for dispersing responsibility.

Parasuraman & Manzey (2010) define Automation Bias as “the tendency to follow automation system cues and ignore contradictory information.” This definition centers on individual cognitive bias. Yet underlying this bias is an environment where declining to verify judgment conditions is structurally advantageous.

In Kahneman’s (2011) distinction, accepting AI output is the path of fast, automatic thinking (System 1), while verifying judgment conditions is the path of slow, costly thinking (System 2). When the cost structure favors System 1, System 2 is structurally unlikely to engage.

Cummings (2004) observed that as automation increases, humans’ capacity for intervention declines. As automation advances, humans transition from supervisors to approvers.



§4. Gaps in the Accountability Structure

When AI intervenes in judgment, who verified the judgment conditions?

| Stage | Action | Judgment Condition Verified |
|---|---|---|
| AI generates output | Data analysis → recommendation presented | ❌ AI does not judge. It computes. |
| Human accepts AI output | Clicks “Approve” | ❌ Most do not independently verify judgment conditions. |
| Problem arises in outcome | Misjudgment, loss, harm | Human: “AI recommended it.” AI: Structurally is not a responsible party. |

Throughout this sequence, no actor verified the judgment conditions.

AI does not verify judgment conditions. AI computes patterns and generates output. “Are we Judgment-Ready?” is not a question AI asks.

Humans did not verify conditions. Once AI has generated its output, it is structurally rare for a decision-maker to independently assess the four conditions (Awareness, Method, Environment, Criteria).

As a result, formal responsibility is attributed to the human who clicked the approval button. Yet no evidence exists that this person verified judgment conditions. In the U.S. criminal risk assessment system COMPAS, when AI output “high-risk” and a judge accepted it, no record showed whether the judge independently assessed the defendant’s circumstances (Angwin et al., 2016).

This is the Accountability Gap. Responsibility exists, but its evidentiary basis does not.


§5. The Problems That Emerge When This Structure Scales

This describes the present state. The structure in §1–§4 is already operational. The problem is that this structure is not static. As AI systems’ scope of application expands, the same structure repeats across broader domains at faster speeds.

Expansion of Automation Scope

Currently, AI Decision Substitution is observed in specific domains: medical assistance, financial review, hiring screening. But when AI systems evolve into Autonomous Agents, the gap between AI output and human approval itself vanishes structurally. In autonomous driving systems making obstacle-avoidance decisions in real time, or in algorithmic trading executing trades at millisecond intervals, the pathway of “humans verify judgment conditions and then approve” does not physically exist.

When automation density passes a certain level, a threshold emerges. When the proportion of organizational decisions involving AI exceeds fifty percent, the concept of human oversight becomes physically unsustainable. AI outputs, AI executes, and humans verify the outcome afterward: this is the shift from the changes in judgment conditions described in §2 to their complete extinction.

Layering of Judgment Authority

When multiple AI systems intervene in sequential decision stages within an organization, it becomes unclear who must verify judgment conditions. AI collects data, AI analyzes it, AI generates recommendations, a human approves—in this chain, where are each stage’s judgment conditions verified? The Accountability Gap described in §4 occurs not as a single point but across multiple layers.
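The chain in this paragraph can be written out as data to make the multi-layer gap visible. This is a minimal sketch: the `Stage` structure, its fields, and the `unverified_stages` helper are hypothetical names introduced for illustration, and the `conditions_verified` flag assumes that such information were recorded at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step in a decision chain (hypothetical structure)."""
    name: str
    actor: str                 # "ai" or "human"
    conditions_verified: bool  # did this stage verify judgment conditions?


def unverified_stages(chain: list[Stage]) -> list[str]:
    """Names of the stages at which no judgment-condition verification took place."""
    return [s.name for s in chain if not s.conditions_verified]


chain = [
    Stage("collect data", "ai", False),
    Stage("analyze", "ai", False),
    Stage("recommend", "ai", False),
    Stage("approve", "human", False),  # approval click without independent verification
]
print(unverified_stages(chain))
# ['collect data', 'analyze', 'recommend', 'approve']
```

Every stage appears in the result, which is the layered form of the gap: there is no single point in the chain where verification can be located.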

Accumulation of Accountability Gaps

An Accountability Gap in a single decision is difficult to trace retroactively, and when a system repeats the structure, gaps accumulate. One unverified judgment condition is one risk. When the same structure repeats thousands of times daily, risk expands structurally rather than linearly, until it reaches a scale at which tracing when a problem occurred becomes impossible.

Structural Atrophy of Individual Judgment Capacity

Prolonged reliance on AI output causes independent judgment capacity itself to atrophy in the decision-maker. As Skitka et al. (2000) observed, in automated environments humans progressively abandon independent information seeking. This is not individual laziness but the consequence of a sustained cost structure where independent judgment is disadvantageous in AI-present environments. The number of humans capable of reaching Judgment-Ready status shrinks. This atrophy operates in an irreversible direction—recovering atrophied judgment capacity requires retraining judgment conditions in environments without AI output, yet such environments are structurally diminishing.

These four directions are not predictions. They describe the pathways where identical principles operate when the structure in §1–§4 expands in scope and speed.



§6. What the EU AI Act Requires and What It Does Not

Article 14 of the EU AI Act (Regulation (EU) 2024/1689) mandates Human Oversight for high-risk AI systems.

What this provision requires:

  • A structure enabling humans to oversee AI system output
  • Authority for humans to override or halt AI output
  • Human understanding of AI system limitations and error potential

What this provision does not require:

  • A record demonstrating that humans verified judgment conditions
  • Verification that oversight authority was actually exercised
  • A recording structure for when inconsistencies emerge between AI output and human judgment

The existence of oversight authority is not identical to the exercise of oversight.

Oversight authority that exists but is not exercised satisfies formal compliance but does not substantively verify judgment conditions. Within the current EU AI Act structure, “humans could have overseen it” becomes evidence that “humans judged it.” But these are different.

This structural limitation is not confined to the EU. The U.S. NIST AI Risk Management Framework, which is voluntary, and China’s Interim Measures for the Management of Generative Artificial Intelligence Services also address human oversight, but no framework anywhere yet requires recording verification of judgment conditions.

Zerilli et al. (2019) argue that transparency and explainability of AI decisions alone do not guarantee substantive human oversight. What Explainable AI provides is the AI’s computational process, not verification that humans’ judgment conditions were satisfied.


§7. The Structure Where Judgment Becomes Impossible

Regulation demands Human Oversight while technology makes it difficult. Between these forces, judgment structurally fails through multiple pathways.

When an AI system fails to explain its reasoning to the decision-maker, the cognition condition is unmet. When the cost of rejecting AI output far exceeds the cost of accepting it, the environment condition is distorted. When procedures to verify judgment conditions are not separately designed after AI deployment, the method condition is absent. When “human oversight” exists only as a formal approval process, the criteria condition is externally replaced. And when the responsibility boundary between AI output and human judgment is not specified in the system, this is not merely a condition deficiency; it is the absence of an accountability structure itself.

If any one of these situations applies, deferral (HOLD) is justified. AI output is material for judgment, not satisfaction of judgment conditions. “Because AI recommended it” does not justify a judgment; it excuses the omission of one. In a Condition Deficit state, accepting AI output mirrors the structure of Judgment Failure: judgment is forced through despite unmet conditions.

In environments where Pre-Judgment Validation does not exist, adding AI output accelerates the pathway to Judgment Collapse. Without AI, unmet judgment conditions can remain as “not yet decided.” With AI, it becomes “the answer already exists—why haven’t you decided?”

The structural mapping between AI intervention points and judgment condition failures draws on a proprietary analytical framework maintained by Gungri Research Lab.


§8. Asymmetry in What Gets Recorded

In decision processes where AI intervenes, recording occurs on one side and not the other.

| Recording Subject | Recorded | Notes |
|---|---|---|
| AI output content | ✅ Usually recorded | System logs |
| Data used by AI | ✅ Traceable | Model inputs |
| Whether humans verified judgment conditions | ❌ Not recorded | No system maintains this |
| Discrepancy between AI output and human judgment | ❌ No recording structure | |
| Why humans chose deferral (HOLD) | ❌ No recording structure | |

Traces remain on the AI side. The judgment conditions that require verification rest with humans, and no record exists there.

In post-review: AI output is recoverable. The human’s judgment process is not. In loan review, when AI outputs “Approval Recommended” and the reviewer approves, then default occurs, the log contains only the AI’s output and the reviewer’s click record. Whether the reviewer independently assessed the borrower’s repayment capacity is nowhere recorded. As a result, post-analysis focuses on “what AI output” and never asks “was the human Judgment-Ready?”
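The loan-review example can be made concrete by asking a log entry the question that post-analysis never asks. This is a sketch under stated assumptions: the `classify` function and the `conditions_verified` field are hypothetical, introduced here only to show what a typical log cannot answer, and the assistance/substitution labels follow the distinction drawn in Q5 of this document.

```python
from typing import Optional


def classify(log_entry: dict) -> str:
    """Classify a decision as 'assistance', 'substitution', or 'unrecorded'.

    Assumes a hypothetical log field `conditions_verified` (True/False)
    that typical logs, as described above, do not contain.
    """
    verified: Optional[bool] = log_entry.get("conditions_verified")
    if verified is None:
        # The log cannot answer "was the human Judgment-Ready?"
        return "unrecorded"
    return "assistance" if verified else "substitution"


# What the loan-review log in the example holds: the AI output and the click.
typical_log = {"ai_output": "Approval Recommended", "reviewer_action": "approve"}
print(classify(typical_log))  # unrecorded
```

With only the AI output and the click record, the result is always "unrecorded": the log is complete on the AI side and silent on the human side.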

GRL-T1-002 addressed “how judgment without records becomes organizational risk.” AI environments escalate this structure. Even before AI, judgment condition records were absent. After AI deployment, one more factor is added—the mistaken belief that AI output recording can substitute for judgment recording.




Further Reading

  1. Parasuraman, R., & Manzey, D. (2010). Complacency and Bias in Human Use of Automation. Human Factors, 52(3), 381–410.
  2. European Parliament and Council. (2024). Regulation (EU) 2024/1689 — Artificial Intelligence Act, Article 14.
  3. Cummings, M. L. (2004). Automation Bias in Intelligent Time Critical Decision Support Systems. AIAA 1st Intelligent Systems Technical Conference.
  4. Skitka, L. J., Mosier, K. L., & Burdick, M. (2000). Accountability and Automation Bias. International Journal of Human-Computer Studies, 52(4), 701–717.
  5. Zerilli, J., Knott, A., Maclaurin, J., & Gavaghan, C. (2019). Transparency in Algorithmic and Human Decision-Making: Is There a Double Standard? Philosophy & Technology, 32(4), 661–683.
  6. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine Bias. ProPublica.
  7. Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

Limitations

  • This document explains the structural impact of AI on judgment conditions and does not address the technical implementation of specific AI systems.
  • This document does not provide AI governance or regulatory compliance methods.
  • This document includes no operational guidelines for specific industries or organizations.
  • Evaluation of AI output accuracy or model performance is outside the scope of this document.

Q1. When AI substitutes for judgment, who bears responsibility?

Formally, it is attributed to the human who gave final approval. However, when no actor verified the judgment conditions—AI did not judge, and the human did not verify the conditions—the evidentiary basis of responsibility itself becomes unclear. This is the Accountability Gap.

Q2. Why does Automation Bias occur?

It is commonly attributed to overconfidence in AI, but structurally it emerges from cost asymmetry. Accepting AI output carries low cost and dispersed responsibility, while rejecting it carries high cost and concentrated responsibility. In this structure, verification of judgment conditions is rationally bypassed.

Q3. Does EU AI Act Article 14 resolve this problem?

Article 14 mandates Human Oversight authority. However, it does not require recording whether oversight was exercised, nor does it require evidence that humans verified judgment conditions. The existence of authority and its exercise are not identical.

Q4. When AI output exists, is judgment deferral (HOLD) unnecessary?

No. AI output is judgment material, not satisfaction of judgment conditions. Even with AI output present, deferral is justified if conditions are unmet. However, in current structures, deferral in the presence of AI output is easily categorized as ‘unnecessary delay.’

Q5. How does AI substitution for judgment differ from AI assistance to judgment?

The distinguishing criterion is whether humans independently verified the four conditions. If humans independently verify them and then consult AI output, that is assistance. If humans adopt AI output without condition verification, that is substitution. Currently, this distinction is not recorded in most environments.

Q6. What problems follow when this structure scales?

When AI systems evolve into Autonomous Agents, the human approval stage itself disappears. When multiple AIs intervene sequentially, Accountability Gaps accumulate across layers. Simultaneously, prolonged AI-dependent environments cause human independent judgment capacity itself to atrophy, and this atrophy operates irreversibly.


Terminology Attribution

The following terms are proprietary concepts defined within Gungri Research Lab’s Judgment Theory system: HOLD, Judgment-Ready, Judgment Failure, Judgment State (READY / HOLD / NOT READY), Condition Deficit, Pre-Judgment Validation, Accountability Gap, AI Decision Substitution.


Citation Format

Gungri Research. (2026). “When AI Substitutes for Judgment, Where Does Accountability Go?” GRL-T1-003-EN.


License

This document is distributed under the CC BY-NC-ND 4.0 license.
Non-commercial sharing in original form is permitted. Modification and derivative works are not permitted.




© 2026 Gungri Research Lab. All rights reserved.
