Your AI Worked Overtime. So Did You.

Moyu is an anti-over-engineering prompt strategy that keeps AI coding assistants from doing more than they should.

npx moyu@latest init
540 Experiments
3 LLM Models
—% LOC Reduction
—% Correctness

One Requirement, Forty-Three Lines

You asked for a simple function. AI gave you an industrial-grade solution.

Without Moyu (43 lines)
import math


def add(a, b):
    """
    Add two numbers together.

    Args:
        a (int | float): The first number.
        b (int | float): The second number.

    Returns:
        int | float: The sum of a and b.

    Raises:
        TypeError: If inputs are not numbers.
        OverflowError: If result exceeds limits.

    Examples:
        >>> add(1, 2)
        3
        >>> add(1.5, 2.5)
        4.0
    """
    if not isinstance(a, (int, float)):
        raise TypeError(
            f"Expected number for a, got {type(a).__name__}"
        )
    if not isinstance(b, (int, float)):
        raise TypeError(
            f"Expected number for b, got {type(b).__name__}"
        )
    try:
        result = a + b
    except OverflowError:
        raise OverflowError(
            "Result exceeds maximum float value"
        )
    if isinstance(result, float) and (
        math.isinf(result) or math.isnan(result)
    ):
        raise OverflowError(
            "Result is infinity or NaN"
        )
    return result
With Moyu (8 lines)
def add(a, b):
    """Add two numbers."""
    return a + b


def test_add():
    assert add(1, 2) == 3
    assert add(-1, 1) == 0

The Prompt Ecosystem Triangle

Three complementary strategies covering different AI coding failure modes.

Moyu
Keeps AI from overdoing it
PUA
Keeps AI from giving up
NoPUA
Keeps AI from giving up, with respect

"They keep AI from giving up. We keep AI from overdoing it."

Three Iron Rules

The core philosophy of Moyu, simple enough to fit on a sticky note.

01

Only Change What Was Asked

Do not proactively "improve", "optimize", or refactor unrelated code. Do what was asked, nothing more.

02

Use the Simplest Solution

If 3 lines work, don't write 30. Avoid unnecessary abstractions, design patterns, and type gymnastics.

03

When Unsure, Ask

Instead of guessing requirements and over-implementing, just ask. Communication beats assumption.
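The three rules are small enough to embed directly as a system prompt. Below is an illustrative sketch; the exact wording shipped by `npx moyu@latest init` is not reproduced on this page, so treat the phrasing as a placeholder.

```python
# Illustrative only: a Moyu-style system prompt assembled from the
# three rules above. The actual Moyu prompt text may differ.
MOYU_RULES = [
    "Only change what was asked. Do not refactor, 'improve', or "
    "optimize unrelated code.",
    "Use the simplest solution. If 3 lines work, do not write 30; "
    "avoid unnecessary abstractions, design patterns, and type gymnastics.",
    "When unsure, ask. Never guess requirements and over-implement.",
]


def moyu_system_prompt() -> str:
    """Render the rules as a numbered system prompt."""
    lines = [f"{i}. {rule}" for i, rule in enumerate(MOYU_RULES, start=1)]
    return "You are a coding assistant.\n" + "\n".join(lines)


print(moyu_system_prompt())
```

Keeping the rules as data rather than a hard-coded string makes it easy to produce the Lite/Standard/Strict variants described later by adding or tightening individual rules.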

Methodology

3 Models × 5 Conditions × 12 Scenarios × 3 Trials = 540 Experiments
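The full factorial design can be enumerated directly. The names below mirror the condition and scenario tables that follow:

```python
from itertools import product

models = ["Claude Sonnet 4", "GPT-4o", "Gemini 2.5 Pro"]
conditions = ["Control", "Baseline-Concise", "Moyu-Lite",
              "Moyu-Standard", "Moyu-Strict"]
scenarios = [f"S{i}" for i in range(1, 13)]
trials = range(1, 4)

# Every (model, condition, scenario, trial) combination is one run.
runs = list(product(models, conditions, scenarios, trials))
print(len(runs))  # 3 * 5 * 12 * 3 = 540
```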

Experimental Conditions

  Condition         Description
  Control           No system prompt
  Baseline-Concise  "Be concise"
  Moyu-Lite         Moyu Lite variant
  Moyu-Standard     Moyu Standard variant
  Moyu-Strict       Moyu Strict variant
Test Scenarios (S1-S12)

  ID   Type  Description
  S1   A     Fix complete_task null pointer bug
  S2   A     Add list_tasks_sorted function
  S3   A     Add status param to search
  S4   A     Add export_csv function
  S5   A     Add assignee filter to list_tasks
  S6   A     Add bulk_complete function
  S7   A     Fix delete_task return value
  S8   A     Add get_tasks_by_assignee
  S9   B     Refactor to context managers
  S10  B     Add docstrings
  S11  B     Write unit tests
  S12  C     Fix bug + new feature

A = over-engineering hotspots (should reduce); B = explicit user requests (should NOT suppress); C = mixed
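To make the Type A expectation concrete, here is a hypothetical solution for S6 (`bulk_complete`) written against an assumed dict-based task store. The benchmark's actual task-manager code is not shown on this page, so both the data shape and the function body are illustrative of the minimal output Moyu should elicit, not the real harness.

```python
# Hypothetical task store: {task_id: {"title": str, "done": bool}}.
# The benchmark's real data model is an assumption here.
def bulk_complete(tasks, task_ids):
    """Mark the given task ids as done; silently skip unknown ids."""
    for task_id in task_ids:
        if task_id in tasks:
            tasks[task_id]["done"] = True
    return tasks


tasks = {
    1: {"title": "write report", "done": False},
    2: {"title": "review PR", "done": False},
}
bulk_complete(tasks, [1, 3])  # id 3 does not exist and is skipped
```

Note what is absent: no custom exception hierarchy, no input-type validation, no logging. That is exactly the reduction the Type A scenarios are designed to measure.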

Models

  Model            Provider
  Claude Sonnet 4  Anthropic
  GPT-4o           OpenAI
  Gemini 2.5 Pro   Google
Metrics

  Metric       Description
  LOC          Lines of code (excluding blanks and comments)
  OE Score     Over-engineering score (0-10)
  Correctness  Functional correctness (test pass rate)
  OE Signals   Over-engineering signal decomposition
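The LOC metric is simple to compute. A sketch of one plausible counting convention, treating only full-line `#` comments as comments (the page does not specify how docstrings or trailing comments are handled, so this is an assumption):

```python
def count_loc(source: str) -> int:
    """Count non-blank, non-comment lines of Python source.

    Assumption: only full-line '#' comments are excluded; docstrings
    still count as code. The study's exact counting rules are not
    specified on this page.
    """
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


snippet = """
def add(a, b):
    # no validation needed
    return a + b
"""
print(count_loc(snippet))  # 2
```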
Statistical Methods

  • One-way ANOVA for comparing differences across conditions
  • Tukey HSD post-hoc tests for pairwise comparisons
  • Two-way ANOVA for model × condition interaction effects
  • Cohen's d for effect size calculations
  • Bootstrap confidence intervals (n=1000)
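Two of these statistics are easy to reproduce from scratch. A stdlib-only sketch of Cohen's d (pooled-SD form for equal group sizes) and a percentile bootstrap CI; the ANOVA and Tukey HSD steps would normally come from scipy/statsmodels and are omitted. The LOC samples are illustrative, not the study's data.

```python
import random
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d with pooled sample SD (equal-n form)."""
    pooled = ((stdev(a) ** 2 + stdev(b) ** 2) / 2) ** 0.5
    return (mean(a) - mean(b)) / pooled


def bootstrap_ci(data, stat=mean, n=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for `stat` over `data` (n resamples)."""
    rng = random.Random(seed)
    reps = sorted(stat(rng.choices(data, k=len(data))) for _ in range(n))
    lo = reps[int(alpha / 2 * n)]
    hi = reps[int((1 - alpha / 2) * n) - 1]
    return lo, hi


control_loc = [43, 38, 51, 40, 44]  # made-up example values
moyu_loc = [8, 9, 7, 10, 8]
print(round(cohens_d(control_loc, moyu_loc), 2))
print(bootstrap_ci(control_loc))
```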

Results

Moyu-Standard reduced LOC on Type A scenarios by ~—%, cut over-engineering signals by —%, and maintained —% functional correctness.

Lines of Code by Condition

OE Signal Decomposition

Correctness

Model × Condition Interaction

Ablation Study

"Be concise" vs the Moyu strategy / Lite vs Standard vs Strict / B-type task results

"Be Concise" vs Moyu

A simple "be concise" prompt does reduce code volume, but far less effectively than Moyu. The Baseline-Concise condition cut LOC by roughly 15-20%, while Moyu-Standard achieved a significantly larger reduction. More importantly, the "concise" prompt mainly trimmed comments and documentation rather than the over-engineering behavior itself.

Lite vs Standard vs Strict

Moyu-Lite provides basic over-engineering suppression; Standard strikes the best balance between LOC reduction and correctness; Strict compresses further but occasionally drops required functionality in complex (Type C) scenarios.

B-Type Task Results

Discussion & Conclusion

Key Findings

  • Unconstrained LLMs exhibit a systematic over-engineering tendency, most pronounced on simple tasks
  • Moyu-Standard achieved significant LOC reduction across all models and scenario types while maintaining or improving correctness
  • Simple "concise" prompts are insufficient to address over-engineering; a structured strategy is necessary
  • Models respond to Moyu with varying effectiveness, revealing inter-model differences in over-engineering behavior

Limitations

The study's scenarios lean toward isolated code-modification tasks; long-conversation scenarios in large projects require further investigation. Moreover, the definition of over-engineering is inherently subjective, and different teams and projects may apply different standards.

Conclusion

Moyu demonstrates that a simple, rule-based prompt strategy can effectively suppress over-engineering in AI coding assistants. It requires no model fine-tuning, depends on no specific toolchain, and integrates seamlessly into any development workflow. Less is more: sometimes the best code is the code that was never written.
