[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80917":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":11,"openIssues":13,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":12,"stars7d":12,"stars30d":12,"stars90d":12,"forks30d":12,"starsTrendScore":12,"compositeScore":12,"rankGlobal":9,"rankLanguage":9,"license":14,"archived":15,"fork":15,"defaultBranch":16,"hasWiki":17,"hasPages":15,"topics":18,"createdAt":9,"pushedAt":9,"updatedAt":19,"readmeContent":20,"aiSummary":21,"trendingCount":12,"starSnapshotCount":12,"syncStatus":22,"lastSyncTime":23,"discoverSource":24},80917,"answer-engineering","victorlavrenko\u002Fanswer-engineering","victorlavrenko","Local trajectory editing for protocol-constrained decision making in large language models, with a reference implementation and reproducible paper results.",null,"Python",33,0,35,"Other",false,"main",true,[],"2026-06-12 02:04:08","# Answer Engineering\n\nAnswer Engineering is a Python library for steering language model generation with explicit, local rules.\n\nIt is designed for situations where the *path* to an answer matters, not just the final output — for example when models must follow engineering practices, clinical protocols, safety procedures, or organizational standards.\n\nInstead of retraining the model or post-processing the output, Answer Engineering intervenes during generation and redirects specific steps in real time. 
The result is behavior that is inspectable, reproducible, and policy-constrained.\n\n---\n\n## Quickstart\n\nThe fastest way to understand the value of Answer Engineering is to run the notebook:\n\n- [Quickstart notebook](notebooks\u002Fquickstart.ipynb)\n\n### The problem\n\nLanguage models are very good at producing output — sometimes *too* good.\n\nFor example, in vibe-coding tasks, models often start implementing new code immediately because they are trained to generate solutions. Human engineers, however, frequently pause to check whether an existing library or component can be reused instead.\n\nThis mismatch leads to unnecessary complexity, duplicated logic, and code that violates team conventions.\n\n### What the notebook does\n\nThe notebook demonstrates how a single rule can redirect the generation path.\n\nThe rule says:\n\n> Replace **\"Implement\"** with **\"Consider reusing an existing\"** and continue from there\n\nThis intervention happens locally during generation — at the moment the target phrase appears — not after the answer is finished.\n\n### What you will see\n\nThe notebook runs the same prompt twice:\n\n1. Baseline generation\n   - The model immediately writes new code from scratch.\n\n2. Generation with Answer Engineering\n   - The rule intercepts the reasoning step.\n   - The model pauses to evaluate reuse options.\n   - The final answer uses an existing component instead of reimplementing one.\n\nThe saved output cell shows the divergence clearly: the baseline implements, while the guided run reuses.\n\nThis is the core idea of Answer Engineering:\n\n*Change the trajectory, and the outcome changes naturally.*\n\n---\n\n## Why this project exists\n\nAfter running the Quickstart, the key observation becomes hard to ignore: generation can be corrected locally, and the model will continue naturally from the corrected state.\n\nLarge language models do not maintain hidden commitments to earlier tokens. 
They simply continue from the visible text prefix. This means that when a trajectory step is edited — for example, redirecting \"Implement\" toward \"Consider reusing an existing component\" — the model proceeds as if that step had always been written that way.\n\nIn other words, trajectory correction is not a hack. It is a property of how autoregressive generation works.\n\nOnce this is understood, allowing generation to proceed without correction in protocol-sensitive settings starts to look like an unnecessary risk. If a step can be repaired immediately, the downstream reasoning — and the final outcome — can change in a predictable way.\n\nThis matters beyond individual steps.\n\nSmall trajectory corrections accumulate. A corrected assumption leads to a different branch of reasoning. A different branch of reasoning leads to a different decision. And a different decision often determines whether the system behaves safely, efficiently, or correctly.\n\nAnswer Engineering exists to make this capability explicit and reliable.\n\nIt provides a runtime layer that can:\n\n- detect when a trajectory enters a risky or non-compliant path\n- apply deterministic local edits at that moment\n- continue generation from the corrected state\n- record what changed and why\n\nThe result is not just cleaner intermediate steps, but more dependable final answers — because the reasoning path itself stayed within the required boundaries.\n\nThis repository includes both the runtime implementation and a reproducible evaluation pipeline that demonstrates this effect in a controlled benchmark.\n\nFor the full research description of the system, see:\n\n- [`docs\u002Fpaper\u002Flavrenko2026_answer_engineering.pdf`](docs\u002Fpaper\u002Flavrenko2026_answer_engineering.pdf)\n- [`docs\u002Fpaper\u002Fmain.tex`](docs\u002Fpaper\u002Fmain.tex)\n- [`docs\u002Fpaper\u002Fgenerated\u002Fpaper-metrics.tex`](docs\u002Fpaper\u002Fgenerated\u002Fpaper-metrics.tex)\n\n---\n\n## Repository 
structure\n\nThis repository contains two related layers.\n\n- [`answer_engineering`](src\u002Fanswer_engineering\u002F) — the runtime library and rule system.\n- [`ae_paper_reproduction`](src\u002Fae_paper_reproduction\u002F) — the notebook, telemetry, reporting, and paper-reproduction layer.\n\nFor the full documentation index, start with [`docs\u002FREADME.md`](docs\u002FREADME.md).\n\n### `answer_engineering`\n\nThe runtime library.\n\nIt provides:\n\n- rule parsing and compilation\n- deterministic trajectory intervention\n- observable runtime behavior\n- telemetry and inspection tools\n\nThis is the component you use to integrate Answer Engineering into applications.\n\n### `ae_paper_reproduction`\n\nThe research and evaluation layer.\n\nIt provides:\n\n- notebooks used in the paper\n- telemetry aggregation and reporting\n- reproducibility workflows\n- metric generation for the manuscript\n\nYou typically do not need this layer to use the runtime, but it is included so that all reported results can be reproduced exactly.\n\n---\n\n## What works today\n\nThe current implementation supports rule-guided generation through a narrow public runtime API:\n\n- [`GenerationRuntime`](src\u002Fanswer_engineering\u002Finference\u002Fanswering.py)\n- [`GenerationRequest`](src\u002Fanswer_engineering\u002Finference\u002Fcontracts.py)\n- [`GenerationPolicy`](src\u002Fanswer_engineering\u002Finference\u002Fcontracts.py)\n- [`GenerationResult`](src\u002Fanswer_engineering\u002Finference\u002Fcontracts.py)\n- [`CompiledRules`](src\u002Fanswer_engineering\u002Frules\u002F__init__.py)\n\nCurrent code-faithful documentation:\n\n- [Current capabilities](docs\u002Fcurrent\u002Fcapabilities.md)\n- [Current architecture](docs\u002Fcurrent\u002Farchitecture.md)\n- [Current runtime model](docs\u002Fcurrent\u002Fruntime-model.md)\n- [Current codebase reality](docs\u002Fcurrent\u002Fcodebase-reality.md)\n- [Telemetry schema](docs\u002Fcurrent\u002Ftelemetry-schema.md)\n\n## What this 
project is not\n\nThis repository is not currently a general-purpose agent framework, a production LLM serving platform, a broad prompt-engineering toolkit, or a generic safety moderation library.\n\nIts core concern is controlled generation under explicit local rules.\n\n## Core runtime model\n\nThe canonical public call is [`GenerationRuntime.generate(request, policy)`](src\u002Fanswer_engineering\u002Finference\u002Fanswering.py).\n\nThe current execution path is described in [Runtime model](docs\u002Fcurrent\u002Fruntime-model.md) and [Runtime entry points](docs\u002Fdev\u002Fruntime-entrypoints.md). At a high level:\n\n```text\nGenerationRuntime.generate(...)\nStreamSession.run()\nGreedyDecoder.decode()\nExecutionSession.apply_step(...)\nPlanRunner\nGenerationResult\n```\n\nWhen rules are present, the runtime monitors the evolving answer, evaluates compiled rule plans, applies deterministic text edits, records telemetry, and continues generation from the edited state. This is not just post-processing: intervention happens during generation.\n\n## Rule system\n\nRules are authored in a compact Markdown-based domain-specific language and compiled into executable plans. 
The exact syntax is documented in [Rule language reference](docs\u002Frules\u002Flanguage-reference.md), and practical authoring guidance is in [Writing rules](docs\u002Fusers\u002Fwriting-rules.md).\n\nRule families:\n\n- **`Replace`** — normalize protocol-critical terminology by replacing matched text with approved alternatives.\n- **`After`** — insert approved text after an anchor once the relevant concept has appeared.\n- **`Avoid`** — detect risky trajectories using prefix\u002Fpostfix guards and redirect generation through fallback or probed continuations.\n- **`Force`** — enforce a required statement within a scope.\n\nA minimal rule looks like this:\n\n```ae-rules\n## Replace (once): sensorineural hearing loss\nWith:\n- sudden sensorineural hearing loss\n```\n\nFor the full grammar, modifiers, guard operators, scope syntax, options, and template expansion rules, see [Rule language reference](docs\u002Frules\u002Flanguage-reference.md).\n\n## Minimal shape\n\n```python\nfrom answer_engineering import GenerationRuntime, GenerationRequest, GenerationPolicy\n\nruntime = GenerationRuntime(MODEL_ID)\nanswer = runtime.generate(\n    GenerationRequest(question=QUESTION),\n    GenerationPolicy(\n        rules=RULES,\n        system_prompt=SYSTEM_PROMPT,\n    ),\n)\n```\n\n## Minimal story\n\nLoad a model, ask a question, and apply a ruleset during generation.\n\nThe ruleset defines local trajectory edits that are enforced while the answer is being produced. The resulting output reflects those enforced constraints and can be inspected together with the associated runtime telemetry.\n\n## Installation\n\nFor local development:\n\n```bash\npython -m pip install -U pip\npython -m pip install -e \".[dev,hf]\"\n```\n\nThen validate the repository:\n\n```bash\n.\u002Fscripts\u002Fcheck\n```\n\nContribution and validation details are in [CONTRIBUTING.md](CONTRIBUTING.md). 
CI is defined in [`.github\u002Fworkflows\u002Fci.yml`](.github\u002Fworkflows\u002Fci.yml).\n\n## Reproducing the paper\n\nThe main reproduction entry point is:\n\n- [notebooks\u002Freproduce.ipynb](notebooks\u002Freproduce.ipynb)\n\nReproducibility documentation:\n\n- [Reproducibility guide](docs\u002Fcurrent\u002Freproducibility.md)\n- [Paper artifacts](docs\u002Fcurrent\u002Fpaper-artifacts.md)\n- [Generated paper metrics](docs\u002Fpaper\u002Fgenerated\u002Fpaper-metrics.tex)\n- [Paper source](docs\u002Fpaper\u002Fmain.tex)\n- [Rendered paper PDF](docs\u002Fpaper\u002Flavrenko2026_answer_engineering.pdf)\n\nThe reproduction layer emits structured artifacts such as evaluation summaries, telemetry summaries, paper tables, manifests, and generated TeX metrics. The current artifact flow is described in [Paper artifacts](docs\u002Fcurrent\u002Fpaper-artifacts.md).\n\n## Repository layout\n\n- [`src\u002Fanswer_engineering\u002F`](src\u002Fanswer_engineering\u002F) — runtime library, rules, inference, engine, config, telemetry, and infrastructure.\n- [`src\u002Fae_paper_reproduction\u002F`](src\u002Fae_paper_reproduction\u002F) — evaluation, reporting, telemetry export, and paper-reproduction workflows.\n- [`notebooks\u002F`](notebooks\u002F) — notebook entry points, including [quickstart](notebooks\u002Fquickstart.ipynb) and [reproduction](notebooks\u002Freproduce.ipynb).\n- [`docs\u002F`](docs\u002FREADME.md) — reader-facing documentation.\n- [`docs\u002Fcurrent\u002F`](docs\u002Fcurrent\u002F) — code-faithful current architecture and behavior.\n- [`docs\u002Fdev\u002F`](docs\u002Fdev\u002F) — developer entry points, extension seams, and golden snapshots.\n- [`docs\u002Frules\u002F`](docs\u002Frules\u002F) — rule language reference.\n- [`docs\u002Fusers\u002F`](docs\u002Fusers\u002F) — practical rule-authoring guidance.\n- [`docs\u002Fvision\u002F`](docs\u002Fvision\u002F) — long-term system and trajectory-control vision.\n- 
[`docs\u002Fgaps\u002F`](docs\u002Fgaps\u002F) — known gaps, roadmap, and convention-compliance work.\n- [`conventions\u002F`](conventions\u002F) — coding and architectural conventions.\n- [`tests\u002F`](tests\u002F) — tests, architecture checks, regression tests, and golden snapshots.\n\n## Documentation map\n\nStart here based on what you need:\n\n- New user: [Writing rules](docs\u002Fusers\u002Fwriting-rules.md)\n- Rule author: [Rule language reference](docs\u002Frules\u002Flanguage-reference.md)\n- Runtime integrator: [Runtime entry points](docs\u002Fdev\u002Fruntime-entrypoints.md)\n- Extension author: [Extension points](docs\u002Fdev\u002Fextension-points.md)\n- Reproduction user: [Reproducibility guide](docs\u002Fcurrent\u002Freproducibility.md)\n- Paper reader: [Rendered paper PDF](docs\u002Fpaper\u002Flavrenko2026_answer_engineering.pdf) and [paper source](docs\u002Fpaper\u002Fmain.tex)\n- Architecture reader: [Current architecture](docs\u002Fcurrent\u002Farchitecture.md), [Runtime model](docs\u002Fcurrent\u002Fruntime-model.md), and [Codebase reality](docs\u002Fcurrent\u002Fcodebase-reality.md)\n- Telemetry consumer: [Telemetry schema](docs\u002Fcurrent\u002Ftelemetry-schema.md)\n- Maintainer: [Golden snapshots](docs\u002Fdev\u002Fgolden-snapshots.md), [CONTRIBUTING.md](CONTRIBUTING.md), and [conventions](conventions\u002F)\n- Roadmap reader: [System vision](docs\u002Fvision\u002Fsystem-vision.md), [Trajectory control vision](docs\u002Fvision\u002Ftrajectory-control.md), and [Functionality roadmap](docs\u002Fgaps\u002Ffunctionality-roadmap.md)\n\n## Current status\n\nThis is an initial public implementation and research artifact. The core runtime is already meaningful, tested, and documented, but the repository is not architecturally finished.\n\nThe most accurate current-state summary is in [Current codebase reality](docs\u002Fcurrent\u002Fcodebase-reality.md). 
In short: the runtime package has a relatively narrow public API and stronger subsystem boundaries, while the reproduction and paper layer remains more shaped by active research workflows.\n\nBackward compatibility is not guaranteed.\n\n## Expected future development\n\nFuture work is expected in both architecture and capabilities.\n\nPlanned architectural directions include clearer runtime\u002Freproduction boundaries, stronger extension seams, improved ownership of scoring and candidate-selection components, and continued convergence between documentation, tests, and implementation. See [Current architecture](docs\u002Fcurrent\u002Farchitecture.md), [Codebase reality](docs\u002Fcurrent\u002Fcodebase-reality.md), and [Extension points](docs\u002Fdev\u002Fextension-points.md).\n\nPlanned capability directions include causal trajectory repair, alternative trajectory tracking, branch-aware scoring, uncertainty signaling, partial-history editing, and richer multi-rule protocol control. See [Trajectory control vision](docs\u002Fvision\u002Ftrajectory-control.md) and [Functionality roadmap](docs\u002Fgaps\u002Ffunctionality-roadmap.md).\n\nThe long-term goal is not merely “more rules”. The target is a runtime layer that can identify where a protocol violation appeared, what earlier commitment caused it, which repairs are valid, whether multiple trajectories remain plausible, and how uncertainty should be surfaced.\n\n## Development validation\n\nBefore committing changes, run:\n\n```bash\n.\u002Fscripts\u002Fcheck\n```\n\nThis repository uses formatting, linting, type checking, convention checks, tests, and package-build validation. 
Details are in [CONTRIBUTING.md](CONTRIBUTING.md), [Golden snapshots](docs\u002Fdev\u002Fgolden-snapshots.md), and the convention documents under [`conventions\u002F`](conventions\u002F).\n\n## Citation\n\nIf you use this repository as a research artifact, cite the paper:\n\n```text\nVictor Lavrenko and Anastasiia Molodnitskaia.\nAnswer Engineering: Local Trajectory Editing for Protocol-Constrained Decision Making in Large Language Models.\n2026.\n```\n\nPaper files:\n\n- [Rendered PDF](docs\u002Fpaper\u002Flavrenko2026_answer_engineering.pdf)\n- [TeX source](docs\u002Fpaper\u002Fmain.tex)\n- [Bibliography](docs\u002Fpaper\u002Freferences.bib)\n- [Generated metrics](docs\u002Fpaper\u002Fgenerated\u002Fpaper-metrics.tex)\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n","Answer Engineering is a Python library for steering language model generation with explicit local rules. Its core function is to intervene in and redirect specific steps in real time during generation, so that the output conforms to engineering practices, clinical protocols, safety procedures, or organizational standards without retraining the model or post-processing the output. This makes the generation process inspectable, reproducible, and policy-constrained. It suits scenarios that require strict adherence to predefined processes or standards, such as evaluating code reuse in software development or following clinical guidelines in medicine.",2,"2026-06-11 04:02:50","CREATED_QUERY"]