DeliveryPilot Conduit

Repository Overview

Import the official sandbox to show repo summary, scripts, modules, and evidence-ready context.

package scripts

Package scripts appear after repository indexing.

Module Map

Module map appears after Official Conduit Sandbox import.

Search

After importing the repository, search for article / profile / comment / prisma to locate context quickly.

Delivery Pipeline (Stage 2: Contract → Plan → Location → Evidence)

Current Delivery

Start with a PM requirement

Enter a requirement or run the Official L1 Demo first. This panel shows the current stage, the next action, and the evidence status.

Start Requirement View Evidence

Stage not started

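After you run the Official L1 Demo or create a requirement, Progress / Completed / Blocked / Evidence appear here.

Natural-Language Requirement

No session created

After you create a requirement, this panel shows per-stage summaries of the PRD, Plan, Context, Patch, Test, and PR Evidence.

Clarifying Questions

After creating a requirement, click "Generate Clarifying Questions" to confirm that the PM input is sufficient to proceed to the PRD.

PRD Contract (editable)

title / userStory / acceptanceCriteria (one per line) / nonGoals (one per line) / testContract (one per line) / priority

Plan Breakdown (JSON, editable)

Module Location (Impact Heatmap)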

Target Files

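After confirming the plan, click "Module Location" to generate target files.

Test Files

Suggested test files appear after module location.

Do Not Touch

The doNotTouch boundary appears after module location.

Manual Boundary Adjustment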

Evidence Ledger

Workflow events appear after you run the Official L1 Demo or refresh the evidence ledger for the current requirement.

Observability (P1)

Observability

AI calls not loaded

After a refresh, this panel shows AI call count, fallback, latency, tokens, and cost; the raw logs are kept below.
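The summary described above can be sketched as a simple aggregation over call records. This is a minimal illustration only: `AiCallRecord`, `AiCallSummary`, `summarizeAiCalls`, and every field name are hypothetical and are not taken from the tool's actual schema.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface AiCallRecord {
  fallback: boolean; // true when the call fell back to a non-AI path
  latencyMs: number;
  tokens: number;
  cost: number; // currency unit is not specified by the source
}

interface AiCallSummary {
  calls: number;
  fallbacks: number;
  avgLatencyMs: number;
  totalTokens: number;
  totalCost: number;
}

// Aggregates raw call records into the figures the panel lists.
function summarizeAiCalls(records: AiCallRecord[]): AiCallSummary {
  const calls = records.length;
  const totalLatency = records.reduce((sum, r) => sum + r.latencyMs, 0);
  return {
    calls,
    fallbacks: records.filter((r) => r.fallback).length,
    avgLatencyMs: calls === 0 ? 0 : totalLatency / calls, // avoid dividing by zero
    totalTokens: records.reduce((sum, r) => sum + r.tokens, 0),
    totalCost: records.reduce((sum, r) => sum + r.cost, 0),
  };
}
```

An empty record list yields zeros rather than NaN, which matches a panel that has not loaded any calls yet.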

not started View Logs

—

Skill Marketplace (Stage 5)

Skill Marketplace

Skills not loaded

After you refresh the Skill list, the system shows reusable delivery skills and dry-run match results.

not started View Skill Evidence

Skills

Click "Refresh Skill List"

Dry Run

—

Context Strategy Lab (Stage 5)

Context Strategy Lab

Strategy comparison not run

After creating a requirement, compare minimal / balanced / defensive: review the recommended strategy first, then inspect the raw JSON.

not started View Context JSON

Click "Compare Strategies" to generate the minimal / balanced / defensive comparison

—

Delivery Memory Loop (Stage 5)

Delivery Memory Loop

Memory suggestion not generated

After delivery experience is written or recalled, this panel shows the reuse decision, confidence, and applicable scope.

not started View Memory JSON

Memory Library

—

Suggestion

—

Note: a suggestion contains suggestedSkill / contextStrategy / selectedFiles / excludedFiles / doNotTouch / testStrategy / confidence / reuseDecision, and is written to workflow events (memory_written / memory_suggested / memory_applied).
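The suggestion fields and event names listed above could be modelled roughly as follows. The field and event names come from the note; the field types, the `WorkflowEvent` wrapper, and the `recordMemoryEvent` helper are assumptions made for illustration.

```typescript
// Field names follow the note; the types are assumptions.
interface MemorySuggestion {
  suggestedSkill: string;
  contextStrategy: string; // assumed to be minimal / balanced / defensive
  selectedFiles: string[];
  excludedFiles: string[];
  doNotTouch: string[];
  testStrategy: string;
  confidence: number; // assumed to be a 0..1 score
  reuseDecision: string;
}

type MemoryEventType = "memory_written" | "memory_suggested" | "memory_applied";

// Hypothetical event envelope; the real workflow event shape is not shown.
interface WorkflowEvent {
  type: MemoryEventType;
  at: string; // ISO timestamp
  payload: MemorySuggestion;
}

// Appends one memory event to an in-memory log and returns it.
function recordMemoryEvent(
  log: WorkflowEvent[],
  type: MemoryEventType,
  payload: MemorySuggestion
): WorkflowEvent {
  const event: WorkflowEvent = { type, at: new Date().toISOString(), payload };
  log.push(event);
  return event;
}
```

Restricting `type` to the three named events means a typo in an event name is caught at compile time rather than surfacing later as a missing ledger entry.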

Requirement Benchmark Suite (Stage 5)

Benchmark Cases

Click Refresh Cases to view the L1 / L2 / L3 standard requirement set.

Scoreboard

Run All Quick to produce average score, by-level scores, and weak areas.

Selected Case Detail

Select a benchmark case to inspect expected capability profile and failures.

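Note: Benchmark v1 aims to evaluate Skill / Clarification / PRD / Plan / Context / Cross-stack Gate (coverImage) capabilities; it does not require every case to generate a patch. L3 cases must demonstrate "clarify first, do not patch directly".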
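One way to picture the scoreboard and the L3 rule is the sketch below. The `CaseResult` shape, the numeric scoring, and the reading of the rule (an L3 case is flagged when it produced a patch without clarifying first) are all assumptions, not the benchmark's actual logic.

```typescript
type Level = "L1" | "L2" | "L3";

// Hypothetical per-case result; field names are assumptions.
interface CaseResult {
  id: string;
  level: Level;
  score: number;
  clarifiedFirst: boolean;
  generatedPatch: boolean;
}

interface Scoreboard {
  average: number;
  byLevel: Record<Level, number>;
  l3Violations: string[]; // ids of L3 cases that patched without clarifying
}

// Assumed reading of "clarify first, do not patch directly".
function violatesL3Rule(r: CaseResult): boolean {
  return r.level === "L3" && r.generatedPatch && !r.clarifiedFirst;
}

function buildScoreboard(results: CaseResult[]): Scoreboard {
  const avg = (xs: number[]): number =>
    xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;
  const levels: Level[] = ["L1", "L2", "L3"];
  const byLevel = {} as Record<Level, number>;
  for (const level of levels) {
    byLevel[level] = avg(results.filter((r) => r.level === level).map((r) => r.score));
  }
  return {
    average: avg(results.map((r) => r.score)),
    byLevel,
    l3Violations: results.filter(violatesL3Rule).map((r) => r.id),
  };
}
```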

Demo Scenario Runner

Scenario List

Click Refresh Scenarios, then select the official L1, coverImage review, or memory reuse scenario.

Run Detail

After Run Selected, this panel shows the scenario run, its status, and evidence refs.

Step Timeline

After a scenario runs, each step is shown: requirement, context, patch, test, review.

Evidence Links

Scenario evidence links appear after a run.

Note: Demo Scenario Runner chains Skill, Context, Memory, Benchmark, Review, and Replay into regression/demo scenarios that run with one click, and outputs ordered steps and evidenceRefs.
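A scenario run that yields ordered steps and collected evidenceRefs might look like the sketch below. The step names come from the Step Timeline text; `StepHandler`, `runScenario`, the status values, and the stop-on-first-failure behaviour are assumptions.

```typescript
type StepName = "requirement" | "context" | "patch" | "test" | "review";

interface StepResult {
  step: StepName;
  status: "passed" | "failed";
  evidenceRefs: string[];
}

interface ScenarioRun {
  scenarioId: string;
  status: "passed" | "failed";
  steps: StepResult[]; // kept in execution order
  evidenceRefs: string[]; // all refs collected across steps
}

// Hypothetical handler: each step reports success and the evidence it produced.
type StepHandler = () => { ok: boolean; evidenceRefs: string[] };

function runScenario(
  scenarioId: string,
  handlers: Array<[StepName, StepHandler]>
): ScenarioRun {
  const steps: StepResult[] = [];
  for (const [step, handler] of handlers) {
    const { ok, evidenceRefs } = handler();
    steps.push({ step, status: ok ? "passed" : "failed", evidenceRefs });
    if (!ok) break; // assumed: later steps are skipped after a failure
  }
  const failed = steps.some((s) => s.status === "failed");
  return {
    scenarioId,
    status: failed ? "failed" : "passed",
    steps,
    evidenceRefs: steps.reduce<string[]>((all, s) => all.concat(s.evidenceRefs), []),
  };
}
```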

Human Review Workbench (P1)

Review Detail

Failure reasons, evidence, actions, decisions

Failure Reasons

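Failure reasons appear after you select a queue item.

Evidence Refs

Evidence refs appear after you select a queue item.

Recommended Actions

Recommended actions appear after you select a queue item.

Decision History

No human decisions yet

Decision (for selected item)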

Replay Panel

Request Only / Dry Run / Execute Downstream / Rollback

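Request Only records a manual replay request and does not change downstream artifacts.

Dry Run generates an Artifact Diffs preview and does not write an official gate.

Execute Downstream requires an approval code and executes the downstream replay.

Rollback rolls back an executed replay run.

Affected Stages

Affected stages appear after Request Replay.

Safety Checks

Safety checks appear before Dry Run or Execute.

Artifact Diffs

Artifact diffs appear after Dry Run.

Replay History

History appears after a replay is executed.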

Controlled Replay JSON

—

Selected Item Detail JSON

—

Note: Review Workbench is centered on the queue of Benchmark failures and weak areas, and supports approve/reject/request-replay/close. Controlled Replay supports request_only, dry_run, execute_downstream, and rollback; execute requires a human approval code.
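The four replay modes and the approval requirement can be sketched as a small guard. The mode names come from the note; `ReplayRequest`, `ReplayOutcome`, `evaluateReplay`, and the reason strings are hypothetical. Whether rollback also needs an approval code is not stated, so this sketch assumes it does not.

```typescript
type ReplayMode = "request_only" | "dry_run" | "execute_downstream" | "rollback";

interface ReplayRequest {
  mode: ReplayMode;
  approvalCode?: string; // only meaningful for execute_downstream
}

interface ReplayOutcome {
  accepted: boolean;
  mutatesDownstream: boolean;
  reason: string;
}

// Decides whether a replay request may proceed and whether it changes artifacts.
function evaluateReplay(req: ReplayRequest): ReplayOutcome {
  switch (req.mode) {
    case "request_only":
      return { accepted: true, mutatesDownstream: false, reason: "request recorded" };
    case "dry_run":
      return { accepted: true, mutatesDownstream: false, reason: "artifact diff preview only" };
    case "execute_downstream":
      if (!req.approvalCode) {
        return { accepted: false, mutatesDownstream: false, reason: "approval code required" };
      }
      return { accepted: true, mutatesDownstream: true, reason: "approved" };
    case "rollback":
      // Assumption: rollback reverts an executed run without a fresh approval.
      return { accepted: true, mutatesDownstream: true, reason: "rollback of executed run" };
  }
}
```

Because the switch covers every member of `ReplayMode`, adding a fifth mode later would produce a compile error here until it is handled.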

Evaluation Dashboard (P1)

Benchmark Scoreboard

Refresh Evaluation Dashboard to load benchmark average, cross-stack gate, replay, strict verification, and validation status.

Capability Metrics

Capability metrics load Skill Match, Context Strategy, Memory Reuse, and Review Queue signals.

AI Cost Summary

AI call count, estimated tokens, latency, and cost appear after refresh.

Capability Validation Suite

Real AI Capability Test

fallback

Failed Cases

Run Capability Suite to inspect failed cases. In public demo mode, real AI live calls remain CLI-only.

Real AI / Cost Status

AI_MODE and ARK_* are read only on the server. Secret values are never returned to the browser.

Note: Evaluation Dashboard aggregates Benchmark Scoreboard, Skill Accuracy, Context Strategy Distribution, Memory Reuse Decisions, Review Queue Open Items, Cross-stack Gate, Replay success rate, Strict verification status, Validation pass rate, and AI latency/usage/cost.

Governance Policy Engine (P1)

Governance Policy Engine

Policies not evaluated

After refreshing policies, you can view the scope, secret, cross-stack, and human approval gates; after Evaluate, a pass/block summary is shown.

not started View Policy JSON

Policies

—

Policy Input JSON

Decision Summary

—

Note: Policy Engine unifies scope guard, secret hygiene, cross-stack gate, and human approval into auditable policy decisions, and retains evidenceRefs.
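As a rough illustration of how four gates could produce uniform pass/block decisions, consider the sketch below. Only the four gate names and the pass/block outcome come from the text; the `PolicyInput` fields, the concrete rules, and the secret-detection pattern are assumptions and are far simpler than a real policy engine would need.

```typescript
type PolicyId = "scope_guard" | "secret_hygiene" | "cross_stack_gate" | "human_approval";

// Hypothetical input shape; every field is an assumption.
interface PolicyInput {
  changedFiles: string[];
  doNotTouch: string[];
  diffText: string;
  touchesMultipleStacks: boolean;
  crossStackReviewed: boolean;
  humanApproved: boolean;
  evidenceRefs: string[];
}

interface PolicyDecision {
  policy: PolicyId;
  result: "pass" | "block";
  evidenceRefs: string[];
}

function evaluatePolicies(input: PolicyInput): PolicyDecision[] {
  // Scope guard: no changed file may sit inside the doNotTouch boundary.
  const outOfScope = input.changedFiles.filter((f) => input.doNotTouch.indexOf(f) !== -1);
  // Secret hygiene: naive pattern check, for illustration only.
  const leaksSecret = /(SECRET|TOKEN|API_KEY)\s*[=:]/i.test(input.diffText);
  const checks: Array<[PolicyId, boolean]> = [
    ["scope_guard", outOfScope.length === 0],
    ["secret_hygiene", !leaksSecret],
    ["cross_stack_gate", !input.touchesMultipleStacks || input.crossStackReviewed],
    ["human_approval", input.humanApproved],
  ];
  return checks.map(
    ([policy, ok]): PolicyDecision => ({
      policy,
      result: ok ? "pass" : "block",
      evidenceRefs: input.evidenceRefs, // retained so each decision stays auditable
    })
  );
}
```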

Final Evidence Center (P1)

Coverage Matrix

Refresh Evidence Center to see Requirement / Clarification / Plan / Context / Patch / Test / PR / Review / Replay / Deployment coverage.

Verification Commands

Verification commands appear after evidence refresh.

Export Result

Export JSON + Markdown + HTML after the evidence center is refreshed.

Note: Final Evidence Center consolidates evidence from official L1, coverImage cross-stack, Memory, Benchmark, Review, Controlled Replay, Observability, and strict verification, and supports offline JSON/Markdown/HTML export.
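The coverage matrix and one of the export formats can be sketched as follows. The stage list is taken from the Coverage Matrix text; `EvidenceItem`, `coverageMatrix`, `toMarkdown`, and the table layout are hypothetical.

```typescript
const STAGES = [
  "Requirement", "Clarification", "Plan", "Context", "Patch",
  "Test", "PR", "Review", "Replay", "Deployment",
] as const;
type Stage = typeof STAGES[number];

// Hypothetical evidence item: a reference string tagged with its stage.
interface EvidenceItem {
  stage: Stage;
  ref: string;
}

// Groups evidence refs by stage; stages without evidence keep an empty list.
function coverageMatrix(items: EvidenceItem[]): Record<Stage, string[]> {
  const matrix = {} as Record<Stage, string[]>;
  for (const stage of STAGES) matrix[stage] = [];
  for (const item of items) matrix[item.stage].push(item.ref);
  return matrix;
}

// Renders the matrix as a Markdown table (assumed layout).
function toMarkdown(matrix: Record<Stage, string[]>): string {
  const rows = STAGES.map((stage) => {
    const refs = matrix[stage];
    return `| ${stage} | ${refs.length > 0 ? "covered" : "missing"} | ${refs.join(", ")} |`;
  });
  return ["| Stage | Status | Evidence |", "| --- | --- | --- |", ...rows].join("\n");
}
```

The JSON export would be `JSON.stringify(matrix)` on the same structure, so the formats cannot drift apart.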

Patch Theatre (Stage 3)

Patch Theatre

Patch not generated

Confirm the plan and module boundaries first, then generate the patch. The raw diff is kept below.

not started View Diff JSON

—

Unified Diff

—

Spec-to-Diff Trace Matrix

Shown after a patch is generated

DoNotTouch / Scope Guard

—

Test Arena (Stage 3)

Test Arena

Tests not run

After the patch is applied, generate a Test Plan, then run the allowlisted tests.

not started View Logs JSON

—

QA Report (Stage 3)

—

PR & CI Manager

PR Evidence

PR package not created

Create the PR Package after tests pass; the public demo does not create a real GitHub PR.

not started View PR JSON

PR Record

—

CI Status

—

Note: package mode generates branch / commitSha / patch.diff / pr-summary.md / validation.log / rollback-plan.md. Without a real GitHub integration, no PR URL is fabricated; GITHUB_PR_BLOCKED is recorded instead.
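The "never fabricate a PR URL" rule can be sketched as below. The artifact file names and the GITHUB_PR_BLOCKED code come from the note; `PrPackage`, `buildPrPackage`, and the optional `createPr` callback standing in for a GitHub integration are hypothetical.

```typescript
interface PrPackage {
  branch: string;
  commitSha: string;
  files: string[];
  prUrl: string | null; // stays null unless a real integration returned a URL
  blocker: "GITHUB_PR_BLOCKED" | null;
}

const PACKAGE_FILES = ["patch.diff", "pr-summary.md", "validation.log", "rollback-plan.md"];

// createPr is a hypothetical stand-in for a real GitHub integration.
function buildPrPackage(
  branch: string,
  commitSha: string,
  createPr?: (branch: string) => string
): PrPackage {
  if (!createPr) {
    // No integration available: record the blocker instead of inventing a URL.
    return { branch, commitSha, files: PACKAGE_FILES.slice(), prUrl: null, blocker: "GITHUB_PR_BLOCKED" };
  }
  return { branch, commitSha, files: PACKAGE_FILES.slice(), prUrl: createPr(branch), blocker: null };
}
```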

Preview Deploy (Stage 4)

Preview Deploy

Preview not deployed

Generate and confirm the Deployment Plan first, then run preview / smoke / release note.

not started View Deploy JSON

—

Deployment Plan

—

Deployment / Smoke

—

Native Conduit Preview Adapter

Native Preview

Native preview not checked

The public demo blocks native start by default; only the real dependency check and blocker reason are shown here.

not started View Native JSON

Native Preview Plan / Status

—

Dependency / Health Check

—

Note: contract preview remains the fast, stable path; native preview shows a URL only after a real health check passes, and otherwise records NATIVE_PREVIEW_BLOCKED rather than faking success.
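The gating rule in the note can be sketched as a function that exposes a URL only when every health check passed. The NATIVE_PREVIEW_BLOCKED code is from the note; `HealthCheck`, `NativePreviewResult`, `resolveNativePreview`, and the choice to treat "no checks ran" as blocked are assumptions.

```typescript
interface HealthCheck {
  name: string;
  healthy: boolean;
}

type NativePreviewResult =
  | { status: "ready"; url: string }
  | { status: "blocked"; code: "NATIVE_PREVIEW_BLOCKED"; reasons: string[] };

// Returns the URL only when at least one check ran and all checks are healthy.
function resolveNativePreview(url: string, checks: HealthCheck[]): NativePreviewResult {
  const failing = checks.filter((c) => !c.healthy).map((c) => c.name);
  if (checks.length === 0 || failing.length > 0) {
    return {
      status: "blocked",
      code: "NATIVE_PREVIEW_BLOCKED",
      reasons: failing.length > 0 ? failing : ["no health check ran"],
    };
  }
  return { status: "ready", url };
}
```

Modelling the result as a discriminated union means the blocked variant has no `url` field at all, so a caller cannot display a URL by accident.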

Knowledge Flywheel (Stage 4)

Knowledge Flywheel

Knowledge not written

After QA evidence is ready, knowledge is written so that later requirements can reuse the playbook.

not started View Knowledge JSON

—

Reuse Replay (Stage 4)

Reuse Replay

Reuse not evaluated

Enter a similar requirement and click Find Similar to verify whether historical knowledge can guide a new delivery.

not started View Reuse JSON

—

Evidence Hub (Stage 4)

Evidence Hub

Evidence hub not refreshed

After a refresh, this panel consolidates requirement, PRD, plan, module location, patch, test, deploy, and knowledge evidence.

not started View Hub JSON

—

Prompt in. Verified patch out.

Three L1 paths to prove speed, safety, and reuse.

Article reading stats

Popular Tags Top 5

Profile About Me Tab

L2 / L3 cases for cross-stack and reasoning quality.

Article Cover Image

Draft Workflow Clarifier