LangGraph human-in-the-loop support is built around interrupt() — call it inside a node, execution pauses, state is saved to the checkpointer, and the graph waits. Resume with Command(resume=...) and it picks up exactly where it left off.
That covers a lot of scenarios. What it doesn’t cover is what happens when the approver isn’t you, when they need an email, might not respond for two hours, and the decision needs to end up in an audit log. This tutorial covers both: how to implement LangGraph human-in-the-loop correctly, and where it hits its ceiling for production deployments.
What LangGraph’s interrupt pattern does
LangGraph human-in-the-loop (HITL) is built on two primitives: checkpointers and interrupts.
What is LangGraph human-in-the-loop? LangGraph HITL is a pattern where graph execution pauses at a designated node, saves its state to a checkpointer, and waits for a human response before continuing. You trigger the pause by calling interrupt() inside a node; you resume it by invoking the graph again with Command(resume=value) and the same thread_id.
A checkpointer persists the graph’s state after each node execution. This is what makes pause-and-resume possible — the graph state survives the interruption, so execution can continue from exactly where it stopped rather than starting over.
The interrupt() function is the current recommended way to pause execution. When your node calls interrupt(value), the value you pass becomes the payload surfaced to the external caller — typically a question, a summary of the proposed action, or anything the human reviewer needs to make a decision. The older pattern of raising NodeInterrupt is deprecated; use interrupt() instead.
Static breakpoints offer a compile-time alternative. Set interrupt_before=["node_name"] or interrupt_after=["node_name"] when you call graph.compile(), and LangGraph will automatically pause before or after that node on every execution. LangGraph breakpoints for human approval work well in inspect-then-approve workflows where you always want oversight at a fixed point in the graph.
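As a minimal sketch, a static breakpoint looks like this (node and state names are illustrative): the pause is declared once at compile time, and you resume by invoking the graph again with None as the input.

```python
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import InMemorySaver
from typing import TypedDict


class State(TypedDict):
    action: str


def execute_node(state: State):
    print(f"Executing: {state['action']}")
    return state


builder = StateGraph(State)
builder.add_node("execute", execute_node)
builder.add_edge(START, "execute")
builder.add_edge("execute", END)

# The breakpoint is declared at compile time, not inside the node
graph = builder.compile(
    checkpointer=InMemorySaver(),
    interrupt_before=["execute"],
)

config = {"configurable": {"thread_id": "t1"}}
graph.invoke({"action": "send refund"}, config=config)  # pauses before "execute"
graph.invoke(None, config=config)  # passing None resumes from the breakpoint
```

Note the resume call for a static breakpoint takes None rather than a Command, since there is no interrupt() payload to answer.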
Dynamic interrupts are more flexible. You call interrupt() inside the node logic itself, which means you can conditionally pause based on the agent’s state — only interrupting when the proposed action meets a certain risk threshold, for example.
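A risk-gated dynamic interrupt might look like the sketch below; the amount field and threshold are illustrative, not part of any LangGraph API.

```python
from langgraph.types import interrupt

APPROVAL_THRESHOLD = 50.0  # illustrative risk threshold


def refund_node(state: dict):
    if state["amount"] > APPROVAL_THRESHOLD:
        # Pauses here only for high-value refunds; resumes with the
        # value passed via Command(resume=...)
        approved = interrupt(f"Refund ${state['amount']:.2f}?")
        if not approved:
            return {"status": "denied"}
    # Small refunds fall through without ever pausing
    return {"status": "executed"}
```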
Both approaches share the same resume mechanism: you call graph.invoke(Command(resume=response), config=config) with the same thread_id, and the graph continues.
LangGraph interrupt tutorial: a working Python example
Here is a minimal LangGraph interrupt example in Python. The agent proposes an action, pauses for human approval, then either executes or skips based on the response.
```python
from langgraph.types import interrupt, Command
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import InMemorySaver
from typing import TypedDict


class State(TypedDict):
    action: str
    approved: bool


def approval_node(state: State):
    # Pause and surface the proposed action to the caller
    decision = interrupt(
        f"Agent wants to: {state['action']}. Approve? (True/False)"
    )
    return {"approved": decision}


def execute_node(state: State):
    if state["approved"]:
        print(f"Executing: {state['action']}")
    else:
        print("Action denied.")
    return state


builder = StateGraph(State)
builder.add_node("approval", approval_node)
builder.add_node("execute", execute_node)
builder.add_edge(START, "approval")
builder.add_edge("approval", "execute")
builder.add_edge("execute", END)

checkpointer = InMemorySaver()
graph = builder.compile(checkpointer=checkpointer)

config = {"configurable": {"thread_id": "thread-1"}}

# First invocation — pauses at interrupt()
result = graph.invoke(
    {"action": "send refund of $340 to customer 8821", "approved": False},
    config=config,
)
print(result)  # {'__interrupt__': (...)}

# Human reviews, then resumes
graph.invoke(Command(resume=True), config=config)
```
A few things to note here. The InMemorySaver checkpointer works for local development and short-lived scripts. In production, you need a durable checkpointer backed by PostgreSQL, Redis, or a cloud equivalent — InMemorySaver loses state when the process restarts. LangGraph provides PostgresSaver and RedisSaver for this purpose.
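Swapping in a durable checkpointer is a small change. The sketch below assumes the langgraph-checkpoint-postgres package is installed and that builder is the graph builder from the example above; the connection string is a placeholder.

```python
from langgraph.checkpoint.postgres import PostgresSaver

# Placeholder connection string; point this at your own database
DB_URI = "postgresql://user:password@localhost:5432/agent_db"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run
    graph = builder.compile(checkpointer=checkpointer)
    config = {"configurable": {"thread_id": "thread-1"}}
    graph.invoke(
        {"action": "send refund of $340 to customer 8821", "approved": False},
        config=config,
    )
```

With this in place, an interrupted thread survives process restarts: any worker with the same DB_URI and thread_id can resume it.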
Also: each run needs a thread_id in the config. This is how LangGraph knows which checkpointed state to resume from. If you’re running multiple agents concurrently, each needs its own unique thread ID.
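One way to guarantee uniqueness is a small helper that mints a fresh thread_id per run. This helper is not part of LangGraph, just a convenience sketch:

```python
import uuid


def make_config(run_id=None):
    """Build a per-run config with a unique thread_id so concurrent
    agent runs never collide in the checkpointer."""
    thread_id = run_id or str(uuid.uuid4())
    return {"configurable": {"thread_id": thread_id}}


config_a = make_config()          # fresh UUID thread
config_b = make_config()          # a different thread
config_c = make_config("job-42")  # or reuse a stable business ID
```

Reusing a stable business identifier (an order ID, a ticket number) as the thread_id is often preferable in production, since it lets you find and resume the right thread later.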
The double execution problem
There is a known gotcha that bites developers when they first move to production. When you call graph.invoke(Command(resume=...), config), LangGraph restarts the interrupted node from the beginning, not from the line where interrupt() was called.
This means any code that ran before interrupt() will execute again on resume. If that code made an API call, sent a notification, or logged something, it will do so twice.
The fix is straightforward: keep each approval node focused on a single responsibility. An approval node that does nothing except call interrupt() and return the result will re-execute cleanly. An approval node that also prepares data, calls external services, or logs pre-interrupt actions will double-fire those side effects.
The pattern to follow: if a node mixes tool execution and approval gating, split it into two nodes. Let the preparation node do its work, then pass to a dedicated approval node that only calls interrupt().
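A sketch of that split, with illustrative state fields: the preparation node owns the side effects, and the approval node owns nothing but the interrupt.

```python
from langgraph.types import interrupt


def prepare_node(state: dict):
    # Safe to call APIs or log here: this node is NOT re-executed on resume
    summary = f"Refund ${state['amount']} to customer {state['customer_id']}"
    return {"summary": summary}


def approval_node(state: dict):
    # Only interrupt() lives here, so re-running this node on resume
    # has no side effects
    decision = interrupt(state["summary"])
    return {"approved": decision}
```

Wire them as prepare -> approval in the graph; on resume, only approval_node re-executes, and it is side-effect free by construction.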
LangGraph HITL limitations: what interrupt() can’t do in production
LangGraph’s HITL interrupt mechanism was designed for interactive, synchronous UIs — a chatbot where the user types back immediately, or a Streamlit app where you click Approve in the same session. For those use cases it works perfectly as a lightweight approval gate.
Where it shows friction is when the LangGraph human-in-the-loop workflow extends beyond the developer’s own environment.
No notification system
When a thread is interrupted, nobody gets notified. The graph state is persisted in the checkpointer and the thread is marked as interrupted, but that’s it. There’s no email sent, no Slack message, no ping of any kind.
If the approver is you and you’re watching the terminal, that’s fine. If the approver is Priya in finance who needs to know that the agent is waiting on her sign-off before it processes a batch of refunds, you need to build that notification layer yourself. Email integration, Slack bots, webhook handlers — none of it is included.
No timeout or escalation
Interrupted threads sit in the checkpointer indefinitely. LangGraph has no built-in mechanism to say “if nobody responds within 30 minutes, escalate to the backup approver” or “expire the decision after two hours and fail gracefully.”
You can build this with cron jobs that poll for stale interrupted threads, but that’s non-trivial infrastructure. You need reliable scheduling, retry logic, fallback routing, and a way to resume the graph programmatically with a timeout response. Teams consistently underestimate how much work this is to build correctly.
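For illustration, a sweeper along these lines could run on a schedule. Note that LangGraph does not expose an "all interrupted threads" query, so this sketch assumes you record pending thread IDs and their start times yourself when an interrupt fires; the two-hour timeout is arbitrary.

```python
import time
from langgraph.types import Command

TIMEOUT_SECONDS = 2 * 60 * 60  # arbitrary two-hour decision window


def sweep_stale_threads(graph, pending):
    """pending maps thread_id -> epoch time the interrupt was raised."""
    now = time.time()
    for thread_id, started_at in list(pending.items()):
        if now - started_at < TIMEOUT_SECONDS:
            continue
        config = {"configurable": {"thread_id": thread_id}}
        snapshot = graph.get_state(config)
        if snapshot.next:  # the thread is still paused at a node
            # Resume with a "denied by timeout" decision
            graph.invoke(Command(resume=False), config=config)
        del pending[thread_id]
```

Even this minimal version leaves escalation, retries, and notification of the fallback approver unsolved, which is the point: the scaffolding around it is the real work.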
No structured audit trail
LangGraph’s persistence layer records graph state, which is valuable for debugging and resumability, but it is not an audit log. It doesn’t produce a structured record of who was asked, when they responded, what they decided, and what reasoning they provided.
For most production deployments, and certainly for any EU AI Act Article 14 compliance, you need a human-readable, exportable log of every decision your agent made or requested. LangGraph doesn’t provide that out of the box. You have to instrument it yourself.
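If you do instrument it yourself, even a minimal append-only JSON Lines log covers the basic questions of who, when, and what. The helper below is a sketch of one possible record shape, not a compliance-grade schema:

```python
import json
from datetime import datetime, timezone


def log_decision(path, *, action, approver, decision, notes=""):
    """Append one structured decision record as a JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "approver": approver,
        "decision": decision,
        "notes": notes,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


log_decision(
    "decisions.jsonl",
    action="send refund of $340 to customer 8821",
    approver="priya@example.com",
    decision="approved",
)
```

Call it once per approval, right where the human response comes back, and you have an exportable trail that survives independently of the checkpointer.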
Adding production-grade approval to a LangGraph agent
The cleanest way to add notifications, timeout handling, and audit trails to a LangGraph agent is to replace the interrupt() call with a dedicated approval API call. The graph structure stays identical — you’re just swapping the implementation of the approval node.
Here’s the same workflow from earlier, with the approval node calling The Handover’s POST /decisions endpoint instead. The approver gets an email with one-click Approve/Deny buttons. Your agent polls for the result. The decision is logged automatically.
```python
import requests
import time
from langgraph.graph import StateGraph, START, END
from typing import TypedDict


class State(TypedDict):
    action: str
    approved: bool


def approval_node(state: State):
    # Create a decision — approver gets an email immediately
    resp = requests.post(
        "https://thehandover.xyz/decisions",
        headers={"Authorization": "Bearer ho_your_key_here"},
        json={
            "action": state["action"],
            "context": "LangGraph agent requesting human sign-off",
            "approver": "approver@yourcompany.com",
            "urgency": "high",
            "timeout_minutes": 30,
        },
    )
    decision_id = resp.json()["id"]

    # Poll until the approver responds (or the decision expires)
    for _ in range(180):
        time.sleep(10)
        r = requests.get(
            f"https://thehandover.xyz/decisions/{decision_id}",
            headers={"Authorization": "Bearer ho_your_key_here"},
        ).json()
        if r["status"] != "pending":
            return {"approved": r["status"] == "approved"}
    return {"approved": False}  # timed out


def execute_node(state: State):
    if state["approved"]:
        print(f"Executing: {state['action']}")
    else:
        print("Action denied or timed out.")
    return state


builder = StateGraph(State)
builder.add_node("approval", approval_node)
builder.add_node("execute", execute_node)
builder.add_edge(START, "approval")
builder.add_edge("approval", "execute")
builder.add_edge("execute", END)

graph = builder.compile()
graph.invoke({"action": "send refund of $340 to customer 8821", "approved": False})
```
Notice that this version doesn’t need a checkpointer at all. The state doesn’t need to be persisted across a process restart because the agent stays in the polling loop until the approver responds. The graph runs synchronously from start to finish.
If you want async behaviour — where the agent fires the decision and moves on rather than polling — pass a callback_url with the request. The Handover will POST the result to your server the moment the approver decides, and your agent can resume from there. See the POST /decisions endpoint docs for the full callback pattern.
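The receiving side might look like the stdlib sketch below. The payload fields (id, status) are assumptions for illustration; check the POST /decisions docs for the actual callback body your callback_url will receive.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decision_from_callback(payload):
    # Payload shape is an assumption; treat anything but "approved" as a no
    return payload.get("status") == "approved"


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(body or b"{}")
        approved = decision_from_callback(payload)
        # Resume your agent here, e.g. enqueue the result keyed by payload["id"]
        print(f"Decision {payload.get('id')}: approved={approved}")
        self.send_response(200)
        self.end_headers()


# HTTPServer(("0.0.0.0", 8080), CallbackHandler).serve_forever()
```

In a real deployment you would also verify the request is genuinely from The Handover (a shared secret or signature header, if the API provides one) before acting on it.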
The approver experience is the same regardless: they receive an email (or a Slack DM on Scale+ plans) with the action context and one-click Approve/Deny buttons. No login required, no new app to install. Every decision — who was asked, when, what they chose, any notes they added — lands in the audit log automatically.
If you are using LangChain’s create_react_agent rather than a custom state graph, see the LangChain approval tutorial, which walks through the full tool definition in Python and TypeScript. The LangGraph integration guide has copy-paste examples for both patterns.
Which approach to use
These two approaches solve different problems. Here’s how to choose.
| Need | LangGraph interrupt() | The Handover API |
|---|---|---|
| Local demo or chatbot UI | Sufficient | Overkill |
| External approver (email) | Build yourself | Built in |
| Slack notifications | Build yourself | Scale+ |
| Audit trail for compliance | Build yourself | Built in |
| Timeout + escalation | Build yourself | Built in |
| Structured approve/deny/notes | Raw value | Typed response |
| No checkpointer required | No | Yes |
LangGraph’s interrupt() is the right tool when the approver is interactive and in the same session — a chatbot where the user responds, a developer-facing tool where you’re the reviewer, a Streamlit prototype where the approval UI is part of your app.
The Handover fills the gap when the approver is someone else, somewhere else. The agent creates a decision, the approver gets notified through their existing channels, and the result comes back structured and logged.
The two approaches can also be combined. Use LangGraph’s interrupt mechanism for in-session checkpoints — validating agent plans before execution, for example — and use The Handover for out-of-band approvals that need to reach people who aren’t watching the terminal.
Understanding the five core agentic AI patterns will help you identify which approval strategy fits where in your agent’s architecture. The orchestrator-worker pattern in particular benefits from having both kinds of checkpoints.
What this looks like in a real agent
Consider a billing agent that processes refund requests. It retrieves a customer record, assesses the claim, and proposes a refund amount. Before it executes the refund:
- For small amounts (under $50), you’ve configured auto-approval rules — the agent proceeds without interruption.
- For amounts between $50 and $500, the agent calls POST /decisions. The customer success lead gets an email, reviews the case, and clicks Approve or Deny within the 30-minute timeout.
- For amounts above $500, the same call goes out but with a higher urgency flag. If the first approver doesn't respond within 15 minutes, escalation routes the decision to the team lead automatically.
None of this required building notification infrastructure, a custom approval UI, a scheduling system for escalations, or an audit log schema. The agent’s code stayed focused on the billing logic. The approval layer is handled by the API.
That’s the pattern. Start free at the dashboard — no credit card required, 10 decisions per month on the free plan — and the full API reference shows the integration in detail.
Frequently asked questions
What is the LangGraph interrupt() function?
interrupt() is a function you call inside a LangGraph node to pause graph execution and surface a value to the caller. The graph saves its state to the checkpointer, the thread is marked as interrupted, and execution resumes when you call graph.invoke(Command(resume=value), config) with the same thread ID.
What’s the difference between static breakpoints and dynamic interrupts in LangGraph?
Static breakpoints are set at compile time using interrupt_before or interrupt_after on specific node names — the graph always pauses at that node. Dynamic interrupts are set inside node logic by calling interrupt() — you can conditionally pause based on the current state, which is useful for risk-based approval workflows.
Why does my code run twice when I resume a LangGraph interrupt?
This is the double execution problem. When you resume, LangGraph restarts the interrupted node from the beginning, not from the interrupt() call. Any code before interrupt() re-executes. The fix: move interrupt() into a dedicated node that does nothing else, so re-execution has no side effects.
Does LangGraph send notifications when a graph is interrupted?
No. LangGraph persists the interrupted state but does not notify anyone. If you need an approver to receive an email or Slack message when the agent is waiting, you need to build that notification layer yourself or use a dedicated approval API like The Handover.
What happens to an interrupted LangGraph thread if nobody responds?
Nothing — the thread sits in the checkpointer indefinitely. LangGraph has no built-in timeout or escalation. For production systems where decision latency matters, you need external scheduling logic or an API that handles expiry and escalation automatically.
Ready to add human oversight to your agent?
Free to start. No credit card required. Takes five minutes.
Get Started Free