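Records Module Detailed Design

This document details the code design of kgent/records/.

The Records module writes the process evidence of a run as machine-readable and human-readable artifacts. It is the foundation for auditable delivery, future scheduling, revision, and quality governance.

Goal Alignment

The Records module supports: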

Sandbox-First Work Execution
Tool-Mediated Action System
Quality and Governance System
Professional Workflow Orchestration
Learning and Capability Evolution

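It directly supports auditable delivery, professional reliability, context continuity, and future scheduling.

Module Scope

In scope:

Append-only writes to events.jsonl.
Event sequence management.
Writing and finalizing transcript.md.
Writing logs/tools.jsonl.
Writing logs/errors.jsonl.
Writing tool call records.
Writing error records.

Out of scope:

Deciding governance outcomes.
Executing tools.
Executing models.
Remote tracing backends.
Dashboards.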

File Structure

src/kgent/records/
  __init__.py
  context.py
  events.py
  transcript.py
  logs.py
  errors.py

Data Model

Event

Location: events.py

class Event:
    event_id: str
    sequence: int
    run_id: str
    session_id: str
    task_id: str
    type: str
    timestamp: str
    actor: str
    severity: Literal["debug", "info", "warning", "error"]
    summary: str
    data: dict
    correlation_id: str | None
    parent_event_id: str | None

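Requirements:

sequence increases monotonically within a single run.
sequence is the source of truth for ordering.
timestamp is used for wall-clock analysis.
data must not hold large content or sensitive values.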
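
The Event model above could be expressed as a frozen dataclass that serializes to a single JSONL line. This is a minimal sketch, not the project's implementation: the `to_json_line` method, the default values for `data`, `correlation_id`, and `parent_event_id`, and the severity check in `__post_init__` are all assumptions added for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

# Allowed severities, taken from the Literal in the Event definition.
SEVERITIES = ("debug", "info", "warning", "error")


@dataclass(frozen=True)
class Event:
    event_id: str
    sequence: int
    run_id: str
    session_id: str
    task_id: str
    type: str
    timestamp: str
    actor: str
    severity: str
    summary: str
    data: dict = field(default_factory=dict)  # assumption: defaults to empty
    correlation_id: Optional[str] = None
    parent_event_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: reject unknown severities early instead of at read time.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

    def to_json_line(self) -> str:
        """Hypothetical helper: one JSON object, no trailing newline."""
        return json.dumps(asdict(self), ensure_ascii=False)
```

Freezing the dataclass fits the append-only intent: once an event has been written, nothing in process should be able to change it.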

ToolCallRecord

Location: logs.py

class ToolCallRecord:
    call_id: str
    tool_name: str
    action: str
    started_at: str
    completed_at: str | None
    duration_ms: int | None
    status: Literal["started", "completed", "failed", "blocked"]
    args_summary: dict
    result_summary: str | None
    artifacts: list[str]
    error: ErrorInfo | None

ErrorInfo

Canonical location: kgent/errors.py

class ErrorInfo:
    code: str
    message: str
    category: Literal["config", "sandbox", "skill", "tool", "memory", "engine", "governance", "unknown"]
    retryable: bool
    details: dict | None

The ErrorLogger in records/errors.py uses this unified model and does not redefine the error structure.

RunRecords

Location: context.py

class RunRecords:
    events: EventRecorder
    transcript: TranscriptWriter
    tools: ToolLogger
    errors: ErrorLogger

Core Interfaces

EventRecorder

Location: events.py

class EventRecorder:
    def __init__(self, path: Path, session_id: str, task_id: str, run_id: str):
        ...

    def emit(
        self,
        event_type: str,
        summary: str,
        data: dict | None = None,
        *,
        actor: str,
        severity: str = "info",
        correlation_id: str | None = None,
        parent_event_id: str | None = None,
    ) -> Event:
        ...

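Behavior:

Generates event_id automatically.
Increments sequence automatically.
Fills in the IDs and timestamp automatically.
Appends to the JSONL file.
Returns the Event that was written.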
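
The behavior listed above can be sketched as follows. Several details are assumptions rather than part of the design: event_id is a random UUID hex string, timestamp is a UTC ISO 8601 string, sequence starts at 1 for a new recorder and is not resumed from an existing file, and `emit` returns a plain dict instead of an Event instance to keep the sketch self-contained.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


class EventRecorder:
    def __init__(self, path: Path, session_id: str, task_id: str, run_id: str):
        self._path = Path(path)
        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._ids = {"run_id": run_id, "session_id": session_id, "task_id": task_id}
        self._sequence = 0  # assumption: the first emitted event gets sequence 1

    def emit(
        self,
        event_type: str,
        summary: str,
        data: Optional[dict] = None,
        *,
        actor: str,
        severity: str = "info",
        correlation_id: Optional[str] = None,
        parent_event_id: Optional[str] = None,
    ) -> dict:
        self._sequence += 1
        event = {
            "event_id": uuid.uuid4().hex,
            "sequence": self._sequence,
            **self._ids,
            "type": event_type,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "severity": severity,
            "summary": summary,
            "data": data or {},
            "correlation_id": correlation_id,
            "parent_event_id": parent_event_id,
        }
        # Mode "a" only ever adds to the end of the file (append-only).
        with self._path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(event, ensure_ascii=False) + "\n")
        return event
```

Opening and closing the file on every call is slow but simple, and it means each event is handed to the operating system before `emit` returns.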

TranscriptWriter

Location: transcript.py

class TranscriptWriter:
    def __init__(self, path: Path):
        ...

    def append_section(self, title: str, content: str) -> None:
        ...

    def finalize(self, summary: dict) -> None:
        ...

Minimal transcript structure:

# Run Transcript

## Metadata
## Prompt
## Effective Role Summary
## Skills Used
## Tool Activity Summary
## Deliverables
## Errors and Warnings
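
One way to guarantee the minimal structure is to buffer sections in memory and render the whole file in `finalize`. The sketch below assumes exactly that; the design does not say what `finalize` does with `summary`, so rendering it as the Metadata section, filling empty sections with "(none)", and the `REQUIRED_SECTIONS` name are all hypothetical choices.

```python
from pathlib import Path

# Section titles taken from the minimal transcript structure above.
REQUIRED_SECTIONS = [
    "Metadata",
    "Prompt",
    "Effective Role Summary",
    "Skills Used",
    "Tool Activity Summary",
    "Deliverables",
    "Errors and Warnings",
]


class TranscriptWriter:
    def __init__(self, path: Path):
        self._path = Path(path)
        self._sections = {}  # title -> list of content chunks

    def append_section(self, title: str, content: str) -> None:
        self._sections.setdefault(title, []).append(content)

    def finalize(self, summary: dict) -> None:
        # Work on a copy so that calling finalize twice does not duplicate metadata.
        sections = {title: list(parts) for title, parts in self._sections.items()}
        metadata = "\n".join(f"- {key}: {value}" for key, value in summary.items())
        sections.setdefault("Metadata", []).insert(0, metadata)

        # Required sections first, then any extra sections in insertion order.
        titles = REQUIRED_SECTIONS + [t for t in sections if t not in REQUIRED_SECTIONS]
        lines = ["# Run Transcript", ""]
        for title in titles:
            parts = [part for part in sections.get(title, []) if part]
            lines += [f"## {title}", "", "\n\n".join(parts) or "(none)", ""]

        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._path.write_text("\n".join(lines), encoding="utf-8")
```

A run that crashes before `finalize` leaves no transcript under this approach; an implementation that needs partial transcripts would have to write incrementally instead.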

ToolLogger

Location: logs.py

class ToolLogger:
    def started(self, tool_name: str, action: str, args_summary: dict) -> str:
        ...

    def completed(self, call_id: str, result_summary: str, artifacts: list[str], duration_ms: int) -> None:
        ...

    def failed(self, call_id: str, error: ErrorInfo, duration_ms: int) -> None:
        ...

    def blocked(self, call_id: str, error: ErrorInfo) -> None:
        ...
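
A possible shape for ToolLogger is one JSON line per state transition, with `started` returning the call_id that the later calls refer to. The constructor taking a path, the UUID-based call_id, the in-memory table of open calls, and the local `ErrorInfo` stand-in are assumptions made so the sketch runs on its own.

```python
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class ErrorInfo:  # stand-in for the unified model in kgent/errors.py
    code: str
    message: str
    category: str
    retryable: bool
    details: Optional[dict] = None


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


class ToolLogger:
    def __init__(self, path: Path):  # assumption: the logger owns one JSONL file
        self._path = Path(path)
        self._path.parent.mkdir(parents=True, exist_ok=True)
        self._open_calls = {}  # call_id -> fields fixed at start time

    def _write(self, record: dict) -> None:
        with self._path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")

    def started(self, tool_name: str, action: str, args_summary: dict) -> str:
        call_id = uuid.uuid4().hex
        base = {"call_id": call_id, "tool_name": tool_name, "action": action,
                "started_at": _now(), "args_summary": args_summary}
        self._open_calls[call_id] = base
        self._write({**base, "status": "started", "completed_at": None,
                     "duration_ms": None, "result_summary": None,
                     "artifacts": [], "error": None})
        return call_id

    def _finish(self, call_id, status, *, result_summary=None, artifacts=None,
                error=None, duration_ms=None) -> None:
        base = self._open_calls.pop(call_id)  # KeyError if the call never started
        self._write({**base, "status": status, "completed_at": _now(),
                     "duration_ms": duration_ms, "result_summary": result_summary,
                     "artifacts": artifacts or [],
                     "error": asdict(error) if error else None})

    def completed(self, call_id: str, result_summary: str, artifacts: list, duration_ms: int) -> None:
        self._finish(call_id, "completed", result_summary=result_summary,
                     artifacts=artifacts, duration_ms=duration_ms)

    def failed(self, call_id: str, error: ErrorInfo, duration_ms: int) -> None:
        self._finish(call_id, "failed", error=error, duration_ms=duration_ms)

    def blocked(self, call_id: str, error: ErrorInfo) -> None:
        self._finish(call_id, "blocked", error=error)
```

Writing a separate line for the terminal state, rather than rewriting the started line, keeps tools.jsonl append-only like events.jsonl.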

ErrorLogger

Location: errors.py

class ErrorLogger:
    def write(self, error: ErrorInfo, *, context: dict | None = None) -> None:
        ...
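
ErrorLogger can follow the same append-only JSONL pattern. In this sketch the constructor argument, the added timestamp field, and the local `ErrorInfo` stand-in are assumptions; only the `write` signature comes from the interface above.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class ErrorInfo:  # stand-in for the unified model in kgent/errors.py
    code: str
    message: str
    category: str
    retryable: bool
    details: Optional[dict] = None


class ErrorLogger:
    def __init__(self, path: Path):  # assumption: one errors.jsonl per run
        self._path = Path(path)
        self._path.parent.mkdir(parents=True, exist_ok=True)

    def write(self, error: ErrorInfo, *, context: Optional[dict] = None) -> None:
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed field
            **asdict(error),
            "context": context or {},
        }
        with self._path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Because the record is built from `asdict(error)`, the log line carries exactly the fields of the unified ErrorInfo and cannot drift into a second error schema.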

File Output

runs/<run-id>/
  events.jsonl
  transcript.md
  logs/
    tools.jsonl
    errors.jsonl

Event Type Constants

These are best defined in events.py:

RUN_CREATED = "run.created"
RUN_STARTED = "run.started"
RUN_COMPLETED = "run.completed"
RUN_FAILED = "run.failed"
CONFIG_LOADED = "config.loaded"
PROMPT_RENDERED = "prompt.rendered"
SKILL_INDEXED = "skill.indexed"
SKILL_LOADED = "skill.loaded"
TOOL_STARTED = "tool.started"
TOOL_COMPLETED = "tool.completed"
TOOL_FAILED = "tool.failed"
FILE_REJECTED = "file.rejected"
DELIVERABLE_MISSING = "deliverable.missing"
ERROR = "error"

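Large Data and Sensitive Information Policy

The Records module is not responsible for a complete redaction system, but Phase 1 must follow these rules:

Do not write full file contents into event data.
Do not write secret values into events or logs.
Write only summaries of tool args and results.
Write large outputs as artifacts and reference their paths in the event.
Raw exception stacks are written to the details field of the errors log by default; whether they are exposed is configurable.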
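
The "summaries only" rule could be enforced by a small helper that callers run before passing args to a logger. Everything here is hypothetical: the `summarize_args` name, the key-name hints used to spot secrets, and the 200-character limit are illustrative choices, and key-name matching is a weak heuristic, not a redaction system.

```python
# Hypothetical policy knobs; the design does not fix these values.
SECRET_KEY_HINTS = ("secret", "token", "password", "api_key")
MAX_VALUE_CHARS = 200


def summarize_args(args: dict) -> dict:
    """Return a log-safe copy of tool args: secrets masked, long values cut."""
    summary = {}
    for key, value in args.items():
        if any(hint in key.lower() for hint in SECRET_KEY_HINTS):
            summary[key] = "[REDACTED]"
        elif value is None or isinstance(value, (bool, int, float)):
            summary[key] = value  # small scalars are safe to keep as-is
        else:
            text = value if isinstance(value, str) else repr(value)
            if len(text) > MAX_VALUE_CHARS:
                text = f"{text[:MAX_VALUE_CHARS]}... [{len(text)} chars total]"
            summary[key] = text
    return summary
```

For example, a write call with a 1000-character `content` argument would be logged with the first 200 characters and the total length, while the full content goes to an artifact that the event references by path.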

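Runtime Integration

The runtime creates the EventRecorder, TranscriptWriter, ToolLogger, and ErrorLogger, aggregates them into a RunRecords, and passes it to downstream modules through RunContext.records.

Downstream modules must not open events.jsonl and write to it directly; they should write through the recorder interface.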
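
The wiring described above might look like the factory below. The `create_run_records` function and the `_FileBackedWriter` stand-ins are hypothetical; the stand-ins only remember their target path so that the sketch stays focused on how the four writers map onto the runs/<run-id>/ layout.

```python
from dataclasses import dataclass
from pathlib import Path


class _FileBackedWriter:
    """Stand-in for the real writers; it only remembers its target path."""

    def __init__(self, path: Path):
        self.path = Path(path)


class EventRecorder(_FileBackedWriter): ...
class TranscriptWriter(_FileBackedWriter): ...
class ToolLogger(_FileBackedWriter): ...
class ErrorLogger(_FileBackedWriter): ...


@dataclass
class RunRecords:
    events: EventRecorder
    transcript: TranscriptWriter
    tools: ToolLogger
    errors: ErrorLogger


def create_run_records(runs_root: Path, run_id: str) -> RunRecords:
    """Hypothetical factory: lay out runs/<run-id>/ and build all writers once."""
    run_dir = Path(runs_root) / run_id
    (run_dir / "logs").mkdir(parents=True, exist_ok=True)
    return RunRecords(
        events=EventRecorder(run_dir / "events.jsonl"),
        transcript=TranscriptWriter(run_dir / "transcript.md"),
        tools=ToolLogger(run_dir / "logs" / "tools.jsonl"),
        errors=ErrorLogger(run_dir / "logs" / "errors.jsonl"),
    )
```

Building the writers in one place means the file paths are known only to the runtime, which makes it easier to hold downstream modules to the rule that they never open the files themselves.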

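Test Design

Unit tests:

EventRecorder.emit increments sequence automatically.
All required event fields are present.
Every JSONL line is valid JSON.
Append-only behavior is correct.
ToolLogger writes started/completed/failed/blocked.
ErrorLogger writes the unified ErrorInfo.
The minimal transcript structure exists.
RunRecords aggregates the four writers/loggers.
Oversized data is rejected or summarized.

Integration tests:

A minimal run produces events.jsonl, transcript.md, logs/tools.jsonl, and logs/errors.jsonl.
An engine failure produces run.failed and an error log entry.
A missing deliverable produces deliverable.missing.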
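
Several of the checks above (valid JSON per line, required fields, increasing sequence) can share one helper that both unit and integration tests call on a finished events.jsonl. The `validate_events_jsonl` name is hypothetical, and treating every Event field as a required key, including the nullable ones, is an assumption.

```python
import json
from pathlib import Path

# Assumption: every field of the Event model must appear as a key.
REQUIRED_FIELDS = (
    "event_id", "sequence", "run_id", "session_id", "task_id", "type",
    "timestamp", "actor", "severity", "summary", "data",
    "correlation_id", "parent_event_id",
)


def validate_events_jsonl(path: Path) -> list:
    """Return a list of human-readable problems; an empty list means the file passes."""
    problems = []
    last_sequence = None
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {line_no}: not valid JSON")
                continue
            if not isinstance(event, dict):
                problems.append(f"line {line_no}: not a JSON object")
                continue
            missing = [name for name in REQUIRED_FIELDS if name not in event]
            if missing:
                problems.append(f"line {line_no}: missing fields {missing}")
                continue
            if last_sequence is not None and event["sequence"] <= last_sequence:
                problems.append(f"line {line_no}: sequence {event['sequence']} does not increase")
            last_sequence = event["sequence"]
    return problems
```

A test can then assert `validate_events_jsonl(run_dir / "events.jsonl") == []` after a minimal run, and the returned messages point at the offending line when it fails.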

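Performance and Scheduling Considerations

JSONL append is the default persistence method in Phase 1.
sequence lets a future scheduler replay events in a stable order.
Event writes should stay a simple synchronous implementation; a buffered writer can replace it later.
A single event should not be too large.
correlation_id links tool calls, engine events, and file events.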
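
To illustrate how sequence and correlation_id work together, the sketch below orders already-parsed events by sequence and then groups the event types that share a correlation_id. The `group_by_correlation` helper is hypothetical and operates on plain dicts as they would come out of events.jsonl.

```python
from collections import defaultdict


def group_by_correlation(events):
    """Map each correlation_id to its event types, in sequence order."""
    groups = defaultdict(list)
    # Order by sequence, not timestamp: sequence is the ordering source of truth.
    for event in sorted(events, key=lambda e: e["sequence"]):
        correlation_id = event.get("correlation_id")
        if correlation_id is not None:
            groups[correlation_id].append(event["type"])
    return dict(groups)
```

With this, a reviewer can see that a given tool.started was followed by a file.rejected and then a tool.completed, even if unrelated events were interleaved between them in the file.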

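Acceptance Criteria

Event writes conform to the schema.
sequence increases monotonically.
Every tool call has a complete record.
Errors use the unified structure.
The transcript is human-readable.
The Records module does not decide business policy.
The Records module can be covered by unit tests.

Source project	`kunora-kgent`
Branch	`docs-publish`
Path	`technology/components/kunora-kgent/code-design/records-module.md`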
