Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.crewai.com/llms.txt

Use this file to discover all available pages before exploring further.

Checkpointing saves a snapshot of execution state during a run so a crew, flow, or agent can resume after a failure or be forked into an alternate branch.

Explanation

How checkpointing works: events, storage, and inheritance.

Tutorial

A 5-minute walkthrough: run, interrupt, resume.

How-to guides

Task-focused recipes for common workflows.

Reference

CheckpointConfig, events, providers, and CLI.

Explanation

What a checkpoint is

A checkpoint captures everything CrewAI needs to recreate a run mid-flight: the full state of the crew, flow, or agent — configuration, agent memory and knowledge sources, task progress, intermediate outputs, internal state and attributes — alongside the kickoff inputs, the event history up to that point, and a lineage ID that ties the checkpoint to the run it came from. Restoring rebuilds that state and continues. Completed tasks are skipped, memory and knowledge are rehydrated, and downstream work runs against the same outputs the original run produced. Forking does the same restore under a new lineage, so the new branch and the original run can write checkpoints side by side without overwriting each other.

When checkpoints are written

Checkpointing is event-driven. The runtime subscribes to events you select via on_events and writes a checkpoint each time one fires. The default task_completed produces one checkpoint per finished task — a sensible tradeoff between granularity and disk use. Higher-frequency events like llm_call_completed are available for fine-grained recovery but write far more files.

Storage

Two providers ship with CrewAI:
  • JsonProvider writes one file per checkpoint. Human-readable and easy to inspect.
  • SqliteProvider writes to a single SQLite database. Better for high-frequency checkpointing.
Both prune oldest checkpoints when max_checkpoints is set.
Auto-checkpoint writes (event-driven) are best-effort: a failed write is logged and the run continues. Manual state.checkpoint() and state.acheckpoint() calls re-raise on failure.

Inheritance model

Crew, Flow, and Agent all accept a checkpoint argument. Children inherit from their parent unless they set their own value or pass False to opt out. Enable checkpointing once on the crew and every agent participates, or selectively exclude one agent.

Tutorial: Resume a failing crew

This walkthrough takes ~5 minutes. You will run a two-task crew, kill it midway, and resume from the saved checkpoint.
1

Create the crew with checkpointing enabled

from crewai import Agent, Crew, Task

researcher = Agent(role="Researcher", goal="Research", backstory="Expert")
writer = Agent(role="Writer", goal="Write", backstory="Expert")

crew = Crew(
    agents=[researcher, writer],
    tasks=[
        Task(description="Research AI trends", agent=researcher, expected_output="bullets"),
        Task(description="Write a summary", agent=writer, expected_output="paragraph"),
    ],
    checkpoint=True,
)
2

Run it and interrupt after the first task

result = crew.kickoff()
Press Ctrl+C after the first task finishes. Look in ./.checkpoints/ — a file named <timestamp>_<uuid>.json is the checkpoint.
3

Resume from the checkpoint

from crewai import CheckpointConfig

result = crew.kickoff(
    from_checkpoint=CheckpointConfig(
        restore_from="./.checkpoints/<timestamp>_<uuid>.json",
    ),
)
The research task is skipped, the writer runs against the saved research output, and the crew finishes.

How-to guides

crew = Crew(agents=[...], tasks=[...], checkpoint=True)
Writes to ./.checkpoints/ on every task_completed.
from crewai import Crew, CheckpointConfig

crew = Crew(
    agents=[...],
    tasks=[...],
    checkpoint=CheckpointConfig(
        location="./my_checkpoints",
        on_events=["task_completed", "crew_kickoff_completed"],
        max_checkpoints=5,
    ),
)
from crewai import Crew, CheckpointConfig
from crewai.state import JsonProvider

crew = Crew(
    agents=[...],
    tasks=[...],
    checkpoint=CheckpointConfig(
        location="./my_checkpoints",
        provider=JsonProvider(),
        max_checkpoints=5,
    ),
)
SQLite enables WAL journal mode for concurrent reads. Prefer it for high-frequency checkpointing.
crew = Crew(
    agents=[
        Agent(role="Researcher", ...),
        Agent(role="Writer", ..., checkpoint=False),
    ],
    tasks=[...],
    checkpoint=True,
)
fork() restores a checkpoint under a fresh lineage so the new run does not collide with the original.
config = CheckpointConfig(restore_from="./my_checkpoints/<file>.json")
crew = Crew.fork(config, branch="experiment-a")
result = crew.kickoff(inputs={"strategy": "aggressive"})
The branch label is optional; one is generated if omitted.
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, write_task, review_task],
    checkpoint=CheckpointConfig(location="./crew_cp"),
)
Default trigger: task_completed.
Register a handler on any event and call state.checkpoint().
from __future__ import annotations

from typing import TYPE_CHECKING, Any

from crewai.events.event_bus import crewai_event_bus
from crewai.events.types.llm_events import LLMCallCompletedEvent

if TYPE_CHECKING:
    from crewai.state.runtime import RuntimeState


@crewai_event_bus.on(LLMCallCompletedEvent)
def on_llm_done(source: Any, event: LLMCallCompletedEvent, state: RuntimeState) -> None:
    path = state.checkpoint("./my_checkpoints")
    print(f"Saved checkpoint: {path}")
A state argument is supplied automatically when the handler takes three parameters. See Event Listeners for the full event catalog.
crewai checkpoint
crewai checkpoint --location ./my_checkpoints
crewai checkpoint --location ./.checkpoints.db
Checkpoint TUI tree view
The left panel groups checkpoints by branch; forks nest under their parent. Selecting a checkpoint opens the detail panel with metadata, entity state, and task progress. Resume continues the run; Fork starts a new branch.
Checkpoint detail overview tab
The detail panel exposes two editable areas:
  • Inputs — original kickoff inputs, pre-filled and editable.
    Editable kickoff inputs
  • Task outputs — outputs of completed tasks. Editing an output and hitting Fork invalidates downstream tasks so they re-run against the modified context.
    Editable task outputs
Fork confirmation panel
Useful for “what if” exploration: fork, tweak, observe.
crewai checkpoint list ./my_checkpoints
crewai checkpoint info ./my_checkpoints/<file>.json
crewai checkpoint info ./.checkpoints.db

Reference

CheckpointConfig

location
str
default:"\"./.checkpoints\""
Storage destination. A directory for JsonProvider, a database file path for SqliteProvider.
on_events
list[CheckpointEventType | Literal["*"]]
default:"[\"task_completed\"]"
Event types that trigger a checkpoint. CheckpointEventType is a Literal — your type checker will autocomplete and reject unsupported values. See event types for the full list.
provider
BaseProvider
default:"JsonProvider()"
Storage backend. Either JsonProvider or SqliteProvider.
max_checkpoints
int | None
default:"None"
Maximum checkpoints to retain. Oldest are pruned after each write.
restore_from
Path | str | None
default:"None"
Checkpoint to restore from when passed via from_checkpoint.

checkpoint field values

Accepted by Crew, Flow, and Agent.
None
default
Inherit from parent.
True
bool
Enable with defaults.
False
bool
Explicit opt-out. Stops inheritance.
CheckpointConfig(...)
CheckpointConfig
Custom configuration.

Event types

on_events accepts any combination of CheckpointEventType values. The default ["task_completed"] writes one checkpoint per finished task; ["*"] matches every event.
["*"] and high-frequency events like llm_call_completed write many checkpoints and can degrade performance. Pair them with max_checkpoints.

Storage providers

JsonProvider
provider
One file per checkpoint, named <timestamp>_<uuid>.json inside location.
SqliteProvider
provider
Single database file at location with WAL journaling.

CLI

CommandPurpose
crewai checkpointLaunch the TUI; auto-detect storage.
crewai checkpoint --location <path>Launch the TUI against a specific location.
crewai checkpoint list <path>List checkpoints.
crewai checkpoint info <path>Inspect a checkpoint file or the latest entry in a SQLite database.