[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82915":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":13,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":14,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":26,"discoverSource":27},82915,"monogram","johnsoncodehk\u002Fmonogram","johnsoncodehk","Define syntax once, generate lexer, parser, TextMate, tree-sitter","",null,"TypeScript",93,3,2,0,15,38,50.11,false,"master",true,[],"2026-06-12 04:01:39","# Monogram\n\nWrite a language's grammar **once**, as an executable definition. Monogram runs it as a real parser, proves it against the language's official conformance suite, then **derives the syntax highlighters** — TextMate, tree-sitter, Monarch — from that same proven grammar. 
Highlighting correctness flows *down* from a parser-verified model instead of *up* from hand-tuned regex.\n\n> *mono + grammar — one grammar definition, many derived artifacts.*\n\n**Status** — an active research project; four languages on one shared, [language-agnostic](#a-language-agnostic-engine) engine, each [proven as a parser](#the-idea) before its highlighter is trusted:\n\n- **TypeScript** ([`typescript.ts`](typescript.ts)) — mature: 100% valid-code coverage, 97.8% bidirectional vs `tsc`.\n- **JavaScript** ([`javascript.ts`](javascript.ts)) — the standalone ECMAScript base TypeScript [builds on](#adding-a-language) (subset → superset); parses real-world JS, with less conformance-corpus depth than TS so far.\n- **HTML** ([`html.ts`](html.ts)) — the engine reaching *past token streams into markup*; ~95 lines, validated against [`parse5`](https:\u002F\u002Fgithub.com\u002Finikulin\u002Fparse5).\n- **Vue** ([`vue.ts`](vue.ts)) — a dialect of `html.ts`: SFC blocks that embed Monogram's own TS\u002FJS\u002FCSS, plus directives and `{{ }}` interpolation.\n\n## Quick start\n\nRequires Node 24+ (runs `.ts` directly — no build step, no `tsx`).\n\n```bash\nnpm install\nnode src\u002Fcli.ts typescript.ts        # regenerate every artifact from the grammar\n```\n\n```ts\nimport { createParser } from '.\u002Fsrc\u002Fgen-parser.ts';\nimport grammar from '.\u002Ftypescript.ts';\n\nconst { parse } = createParser(grammar);\nconst cst = parse('const x = f(a, b)');        \u002F\u002F → a concrete syntax tree\n```\n\n## The idea\n\nA TextMate grammar is a pile of regexes guessing at a language's structure. It's written by hand, independently of any parser, and perpetually wrong at the edges — VS Code's official TypeScript grammar carries [100+ open issues](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues) for exactly this reason. 
Everyone trying to fix it competes on the same losing axis: *who can hand-write better regexes.*\n\nTake `typeof x \u003C y`. A regex highlighter has to guess whether `\u003C` opens a generic argument list or is a less-than comparison — and it guesses wrong somewhere, forever. A **parser** doesn't guess; the grammar already decides. Monogram inverts the dependency:\n\n1. **Write the grammar, then prove it.** The grammar is executable — Monogram runs it as a recursive-descent + [Pratt](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FOperator-precedence_parser) (operator-precedence) parser over the TypeScript conformance suite, measured *bidirectionally*: it must **accept** every input `tsc` accepts **and reject** every input it rejects.\n\n2. **Derive the highlighters from that proven grammar**, never hand-write them. The TextMate, tree-sitter, and Monarch outputs are all generated from the one parser-validated definition, so their correctness is underwritten by the conformance run, not by regex tuning.\n\nThat single source reaches across grammars, too: an embedded snippet runs *another Monogram grammar* — a `\u003Cscript>` body is highlighted by Monogram's own JavaScript, so `\u003Cscript>const x = 1 \u003C 2\u003C\u002Fscript>` colours `\u003C` as a JS operator, the same ambiguity resolved *inside* the embed. Where VS Code's embeds fray — two independently-written grammars meeting with nothing checking the seam — Monogram owns both sides, so self-verifying that seam becomes possible (a design goal beyond today's standard `contentName` injection).\n\n## Comparison\n\nThe same question, every language at once: take the bugs reported against each *hand-written* official grammar and ask whether the *derived* grammar solves them. 
Which does **only** the official solve, which does **only** Monogram solve — and which do **both** still get wrong (the shared frontier neither reaches today)?\n\n\u003C!-- issues:start -->\n\u003C!-- generated by `npm run bench:issues` — do not edit by hand -->\n_Each hand-written **official** grammar vs Monogram's **derived** one, on the bugs filed against it: **TypeScript 26\u002F26** (official 8\u002F26) · **TSX 11\u002F11** (official 5\u002F11) · **HTML 20\u002F20** (official 13\u002F20) · **Vue 23\u002F23** (official 18\u002F23). Per-issue detail below — auto-generated by `npm run bench:issues`._\n\n#### TypeScript\n| issue | Monogram | official |\n|---|:--:|:--:|\n| [#1050](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1050) — typeof y \u003C string is a relational operator not generic (cascade victim intact) | ✓ | · |\n| [#978](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F978) — typeof x \u003C string then function (cascade victim intact) | ✓ | · |\n| [#859](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F859) — as cast inside \u003C > comparison | ✓ | · |\n| [#1020](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1020) — new Map\u003Cnumber, number>; (no parens) | ✓ | · |\n| [#855](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F855) — new Map\u003C\u002F* comment *\u002Fstring, IArgs>() | ✓ | · |\n| [#853](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F853) — throw \u002Ffoo\u002F is regex | ✓ | · |\n| [#804](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F804) — \u002F[a\\-b]\u002Fg char class recognized | ✓ | · |\n| [#869](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F869) — x in obj ? 
x : fallback ternary works | ✓ | · |\n| [#770](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F770) — function call parens are punctuation | ✓ | · |\n| [#1021](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1021) — regex with the v (unicode-sets) flag is recognized | ✓ | · |\n| [#1025](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1025) — for-of without surrounding space keeps `of` a loop keyword | ✓ | · |\n| [#815](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F815) — a class method named `new` is a method name, not the operator | ✓ | · |\n| [#992](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F992) — casting to a type named `type` does not break highlighting | ✓ | · |\n| [#994](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F994) — JSDoc `@template [Output=Value]` default — Monogram colors the param name, official misses it | ✓ | · |\n| [#891](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F891) — `from` as an ordinary variable is not a keyword | ✓ | · |\n| [#814](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F814) — `a instanceof B & c` keeps the operand a value, not a type | ✓ | · |\n| [#950](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F950) — default import named `type` — the binding is a variable, not the `type` keyword | ✓ | · |\n| [#1058](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1058) — `import defer` should scope `defer` as a keyword | ✓ | · |\n\n\u003Cdetails>\u003Csummary>… and 8 more both grammars already handle (✓ \u002F ✓)\u003C\u002Fsummary>\n\n| issue | Monogram | official |\n|---|:--:|:--:|\n| 
[#1063](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1063) — \u002F\\cJ\u002F control char escape | ✓ | ✓ |\n| [#736](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F736) — obj.example() method gets entity.name.function | ✓ | ✓ |\n| [#788](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F788) — optional chaining ?. is the optional accessor | ✓ | ✓ |\n| [#881](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F881) — `override` modifier on a method is storage.modifier | ✓ | ✓ |\n| [#1066](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1066) — triple-slash reference directive is a comment | ✓ | ✓ |\n| [#1027](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1027) — nested generic `>>` closes two type-arg lists, not a shift | ✓ | ✓ |\n| [#956](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F956) — `as const satisfies Foo` colors the satisfies keyword and the type | ✓ | ✓ |\n| [#907](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F907) — `typeof x extends string ? 
1 : 2` conditional-type ternary | ✓ | ✓ |\n\n\u003C\u002Fdetails>\n\n#### TSX\n| issue | Monogram | official |\n|---|:--:|:--:|\n| [#967](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F967) — generic arrow with a default type in `.tsx` | ✓ | · |\n| [#979](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F979) — `const` modifier on a type parameter in `.tsx` | ✓ | · |\n| [#1042](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1042)\u002F[#990](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F990) — default generic arrow function in `.tsx` | ✓ | · |\n| [#627](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F627) — member-expression JSX tag name | ✓ | · |\n| [#1033](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F1033) — generic arrow with a default + destructured param in `.tsx` | ✓ | · |\n| [#825](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F825) — `\u003C` and tag name on separate lines | ✓ | · |\n\n\u003Cdetails>\u003Csummary>… and 5 more both grammars already handle (✓ \u002F ✓)\u003C\u002Fsummary>\n\n| issue | Monogram | official |\n|---|:--:|:--:|\n| [#794](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F794) — non-null `!` then `\u002F` (division) in a JSX-attribute object | ✓ | ✓ |\n| [#585](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F585) — `\u002F\u002F` line comment inside a JSX open tag | ✓ | ✓ |\n| [#754](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F754) — JSX element right after a `\u002F**\u002F` block comment | ✓ | ✓ |\n| [#667](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F667) — arrow function + ternary 
inside a JSX attribute | ✓ | ✓ |\n| [#624](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTypeScript-TmLanguage\u002Fissues\u002F624) — JSX element in an array after a template-literal attribute | ✓ | ✓ |\n\n\u003C\u002Fdetails>\n\n#### HTML\n| issue | Monogram | official |\n|---|:--:|:--:|\n| [tmbundle#118](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F118) — trailing `\u002F` in an unquoted URL value | ✓ | · |\n| [tmbundle#108](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F108) — nested `\u003Csvg>` is a valid tag, not flagged invalid | ✓ | · |\n| [tmbundle#113](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F113) — `\u002F\u002F` in an `onclick=` JS string read as a comment | ✓ | · |\n| [tmbundle#104](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F104) — mixed-case `onChange=` event handler still reads as JS | ✓ | · |\n| [tmbundle#88](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F88) — inline `style=` value embeds CSS | ✓ | · |\n| [tmbundle#65](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F65) — `\u003C` of `\u003C\u002Fscript>` is HTML punctuation, not `source.js` | ✓ | · |\n| [tmbundle#74](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F74) — `\u003C` of `\u003C\u002Fstyle>` is HTML punctuation, not `source.css` | ✓ | · |\n\n\u003Cdetails>\u003Csummary>… and 13 more both grammars already handle (✓ \u002F ✓)\u003C\u002Fsummary>\n\n| issue | Monogram | official |\n|---|:--:|:--:|\n| [tmbundle#124](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F124) — slash in unquoted value `foo\u002F` | ✓ | ✓ |\n| [vscode#140360](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fvscode\u002Fissues\u002F140360) — `\u002F` inside an unquoted value (path) | ✓ | ✓ |\n| 
[tmbundle#84](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F84) — tag name a prefix of a sibling (`\u003Ci>`\u002F`\u003Cinput>`) | ✓ | ✓ |\n| [tmbundle#117](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F117) — SVG camelCase tag name | ✓ | ✓ |\n| [tmbundle#122](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F122) — `\u003C` inside a quoted attr value | ✓ | ✓ |\n| [vscode#130284](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fvscode\u002Fissues\u002F130284) — `>` inside a quoted attr value does not close the tag early | ✓ | ✓ |\n| [tmbundle#97](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F97) — whitespace (incl. a line feed) before `>` in a raw-text end tag | ✓ | ✓ |\n| [tmbundle#81](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F81) — character entity `&amp;` in text | ✓ | ✓ |\n| [tmbundle#102](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F102) — `\u003Cstyle>` element CSS is tokenized, not a flat blob | ✓ | ✓ |\n| [tmbundle#50](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F50) — `onclick=` event-handler value is colored as JS | ✓ | ✓ |\n| [tmbundle#85](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F85) — `\u002F\u002F\u003C\u002Fscript>` on its own line still closes the script | ✓ | ✓ |\n| [tmbundle#51](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F51) — self-closing `\u002F` is tag punctuation | ✓ | ✓ |\n| [tmbundle#82](https:\u002F\u002Fgithub.com\u002Ftextmate\u002Fhtml.tmbundle\u002Fissues\u002F82) — a `\u002F>`-style `\u003Cscript src=… \u002F>` does NOT self-close — its body is the script content | ✓ | ✓ |\n\n\u003C\u002Fdetails>\n\n#### Vue\n| issue | Monogram | official |\n|---|:--:|:--:|\n| 
[#6007](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F6007)\u002F[#2096](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F2096)\u002F[#520](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F520) — `as` type assertion in directive value | ✓ | · |\n| [#2060](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F2060)-inline — `` const a = 1;\u003C\u002Fscript> `` (content on the close line) embeds + clean close | ✓ | · |\n| [#2060](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F2060)-inline-adjacent — an unterminated union before a same-line `` \u003C\u002Fscript> ``, then a second `\u003Cscript setup>` block | ✓ | · |\n| [#5660](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5660) — `as const` cast in a v-for value | ✓ | · |\n| [#4716](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F4716)\u002F[#5571](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5571) — `as` cast followed by another attribute | ✓ | · |\n\n\u003Cdetails>\u003Csummary>… and 18 more both grammars already handle (✓ \u002F ✓)\u003C\u002Fsummary>\n\n| issue | Monogram | official |\n|---|:--:|:--:|\n| [#3400](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F3400) — `instanceof` in {{ }} | ✓ | ✓ |\n| [#5370](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5370) — `typeof x !==` in v-if | ✓ | ✓ |\n| [#5118](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5118) — `?.` \u002F `??` in {{ }} | ✓ | ✓ |\n| [#1675](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F1675) — arrow `=>` in {{ }} | ✓ | ✓ |\n| 
[#6039](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F6039)\u002F[#4741](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F4741) — `\u003C` operator in {{ }} (not a tag!) | ✓ | ✓ |\n| [#5722](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5722) — negated ternary + quotes in {{ }} | ✓ | ✓ |\n| [#5538](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5538)\u002F[#2060](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F2060) — trailing `export type` before `` \u003C\u002Fscript> `` | ✓ | ✓ |\n| [#3999](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F3999) — a force-wrapped multi-line `\u003Cscript lang=\"ts\">` start tag keeps the body as the `ts` family (no .ts→.js flip) | ✓ | ✓ |\n| [#4769](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F4769) — tag name starting with `template` | ✓ | ✓ |\n| [#5701](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F5701) — `{{` inside a `\u003Cscript>` string | ✓ | ✓ |\n| [#6070](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F6070) — capitalized component then a `\u003Cstyle>` block | ✓ | ✓ |\n| [#4291](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F4291) — `\u003Cscript lang=\"tsx\">` body embeds the DECLARED `source.tsx` (not a source.js fallback) | ✓ | ✓ |\n| [#4291](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F4291)-jsx — `\u003Cscript lang=\"jsx\">` body embeds the DECLARED `source.js.jsx` | ✓ | ✓ |\n| generic=\"T\" — `generic=\"T extends U\">` type-param list embeds as TS | ✓ | ✓ |\n| [#4410](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F4410) — dynamic directive argument `:[attr]` | ✓ | ✓ |\n| 
[#3727](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F3727) — `.prop` modifier shorthand | ✓ | ✓ |\n| [#2666](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F2666) — dynamic slot name from a template literal | ✓ | ✓ |\n| [#2560](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F2560)\u002F[#1290](https:\u002F\u002Fgithub.com\u002Fvuejs\u002Flanguage-tools\u002Fissues\u002F1290) — `type` as a v-for loop variable | ✓ | ✓ |\n\n\u003C\u002Fdetails>\n\u003C!-- issues:end -->\n\n\u003Csub>A sampled ledger of real tracker issues, not an exhaustive audit. Run `npm run bench:issues` to regenerate (needs the official grammars: VS Code's installed TS\u002FJS\u002FHTML, and the Vue fixtures — see [`test\u002Fvue-bench.ts`](test\u002Fvue-bench.ts)). Sources: [`test\u002Fissue-cases.ts`](test\u002Fissue-cases.ts), [`test\u002Fhtml-issue-cases.ts`](test\u002Fhtml-issue-cases.ts), [`test\u002Fvue-issue-cases.ts`](test\u002Fvue-issue-cases.ts).\u003C\u002Fsub>\n\n### The ceiling — and the bar for claiming it\n\nDeriving from a proven parser wins the disambiguation that is *TextMate-expressible but infeasible to hand-write* — regex-vs-division, generic-vs-comparison, whitespace-fragile multiline generics — the **only-Monogram** column. The **both-miss** cases are ones neither grammar gets *today* — not, by default, ones TextMate *can't*.\n\n\"TextMate can't express X\" is not a guess or an assertion; it is a claim to be **proven from the model**. TextMate is a line-oriented matcher whose only cross-line memory is a finite stack of scope contexts, so a proof exhibits an X whose correct highlighting provably needs memory that model lacks — unbounded lookback to a token that is not an enclosing context. 
A failed *attempt* to derive a pattern is not such a proof: a cleverer pattern may exist, and most \"impossible for TextMate\" folklore is exactly this error — the multiline \u002F nested-generic cases turn out TM-expressible once a parser supplies the pattern, which is why the derived grammar gets them right. Where a construct provably exceeds the model, Monogram's **tree-sitter** target — a real parser over the whole tree — resolves it.\n\n## What you get\n\nFrom one grammar definition (a small TypeScript combinator API), five outputs are **fully functional**:\n\n- **A lexer** — tokenizes source straight from the grammar's token definitions; usable on its own (`createLexer(grammar).tokenize`).\n- **A CST parser** — recursive descent + Pratt precedence on top of the lexer, producing a **CST** (concrete syntax tree): every token is a node, including punctuation and keywords — roughly 2× an AST's nodes, by design, which is exactly what the highlighter and lossless source reconstruction need.\n- **A TextMate grammar** — a `.tmLanguage.json` for VS Code \u002F Sublime syntax highlighting, derived from the same rules, including derived **JSDoc-body** and **regex-internal** sub-grammars. (TextMate *scopes* are the dot-separated labels — `entity.name.function`, `keyword.control` — that a theme maps to colors.)\n- **A VS Code language configuration** — `language-configuration.json` (comments, bracket pairs, auto-close\u002Fsurround, folding) derived from the same tokens.\n- **CST node types** — a TypeScript discriminated union (keyed by rule) for typed tree consumers.\n\nAnd — from the same grammar — generators for the rest of the ecosystem, at varying maturity:\n\n- **tree-sitter** — `grammar.js` + a **structural** `queries\u002Fhighlights.scm` + an external scanner for context-sensitive lexing. 
tree-sitter's GLR absorbs the grammar and compiles to wasm; the derived query scores **95.9%** token-family accuracy against a neutral `tsc` oracle — above the official tree-sitter's **92.7%** — and is CI-gated by `npm run gate:treesitter`.\n- **Monarch** — a Monaco (web) tokenizer (functional, bounded by JS-regex limits).\n\n## The grammar is the source of truth\n\nA grammar is a TypeScript module: tokens, operator precedence, and rules built from small combinators. A self-contained mini-example:\n\n```ts\nimport { token, rule, defineGrammar, left, op, sep } from '.\u002Fsrc\u002Fapi.ts';\n\nconst Ident  = token(\u002F[a-zA-Z_$][a-zA-Z0-9_$]*\u002F, { identifier: true });\nconst Number = token(\u002F[0-9]+(\\.[0-9]+)?\u002F);\n\nconst Expr = rule($ => [\n  Ident,\n  Number,\n  [$, op, $],                    \u002F\u002F binary operators (precedence declared below)\n  [$, '(', sep(Expr, ','), ')'], \u002F\u002F call:    foo(a, b)\n  [$, '.', Ident],               \u002F\u002F member:  obj.name\n]);\n\nexport default defineGrammar({\n  name: 'mini',\n  tokens: { Ident, Number },\n  prec: [ left('+', '-'), left('*', '\u002F') ],\n  rules: { Expr },\n  entry: Expr,\n});\n```\n\nThe parser uses these rules to build a CST. The highlighter reads the same rule **shapes** and infers most scopes structurally — with no per-rule annotation:\n\n- `foo(x)` → `foo` is `entity.name.function` (from the `$ '(' …` call form)\n- `obj.name` → `name` is `entity.other.property` (from the `$ '.' Ident` form)\n- `'class' Ident` → `Ident` is `entity.name.type` (from declaration structure)\n- `Expr '\u003C' Type '>' '('` → a generic call, not a comparison (from rule structure)\n\nFlat, irreducible facts — which keywords are control flow, which punctuation is an operator — are declared once in a small `scopes` map (≈50 lines for TypeScript) rather than inferred. Structure is derived; vocabulary is declared.\n\n## A language-agnostic engine\n\nNothing in the engine knows about TypeScript. 
Everything language-specific lives in the grammar — keywords, which token is the identifier, template-literal delimiters, the regex-vs-division lexer ambiguity — all *declared per token*:\n\n```ts\nconst Template = token(\u002F`…`\u002F, { template: { open: '`', interpOpen: '${', interpClose: '}' } });\nconst Regex    = token(\u002F\\\u002F…\\\u002F\u002F, {\n  regex: true,\n  regexContext: {\n    divisionAfterTypes: ['Ident', 'Number', 'String', 'Template'],\n    divisionAfterTexts: [')', ']', 'this', 'true', \u002F* … *\u002F],\n    regexAfterTexts:    ['return', 'typeof', 'instanceof', \u002F* … *\u002F],\n  },\n});\n```\n\n[`test\u002Fagnostic.ts`](test\u002Fagnostic.ts) proves it directly — the same engine parses a toy grammar whose identifier token is `Word`, with no templates or regex. The deeper proof is [`html.ts`](html.ts): markup shares *nothing* with TypeScript's token stream, yet the same engine handles it (and Vue layers SFC blocks + `{{ }}` interpolation on top).\n\n## Adding a language\n\nA new language is **one grammar file** on the unchanged engine:\n\n1. **Write the grammar** with the combinator API ([`src\u002Fapi.ts`](src\u002Fapi.ts)) — tokens, operator precedence, rules. Everything language-specific lives here.\n2. **Prove it as a parser** against the language's own official test suite, measured **bidirectionally** (accept what the reference accepts, reject what it rejects).\n3. **Drop in the official TextMate grammar** as the baseline, so highlighter coverage is measured against what you're replacing, not asserted.\n\nThe lexer, CST types, and all three highlighters fall out of step 1; a *dialect* (`.tsx`\u002F`.jsx` via [`jsx.ts`](jsx.ts), or Vue on [`html.ts`](html.ts)) reuses a base grammar's rules by name in a few lines. 
The conformance\u002Fhighlighter harnesses are currently TypeScript-specific (they call `tsc` and read VS Code's grammar) — point them at your own reference compiler.\n\n## Known differences from the official highlighter\n\nA handful of token patterns are scoped differently from VS Code's official TypeScript grammar — all intentional, and in some Monogram is arguably *more* correct (these are *deliberate divergences*, distinct from the bug-class fixes the [ledger](#comparison) measures):\n\n| Token | Monogram | Official | Why we keep ours |\n|---|---|---|---|\n| `console` in `console.log` | `support.variable` | `variable.other.object` | We highlight built-in globals (`console`, `window`, …) distinctly — a deliberate, common choice. |\n| `transform` (a function parameter) | `variable.parameter` | `entity.name.function` | It **is** a parameter. Official's heuristic mis-reads `name: (…) => T` as a function definition; we're more correct. |\n| `error` (the method in `console.error(…)`) | `entity.name.function` | `variable.other.readwrite` | We scope a called method as a function name — arguably more informative. |\n\n> Built-in class names in **type** position (e.g. `Error` in `extends Error`) correctly emit `entity.name.type`, matching official; in **value** position (`new Error()`) they remain `support.class`, also matching official.\n\nMatching the official grammar *exactly* would, in cases like `transform`, make the output worse. 
The metric counts these as differences, not defects.\n\n## Architecture\n\n```\ntypescript.ts                one grammar (TypeScript combinator API)\n        │\n        ├─ src\u002Fgen-lexer.ts  ───────▶ lexer → tokens        (standalone: createLexer)\n        │        ▲ composed by\n        ├─ src\u002Fgen-parser.ts ───────▶ CST parser   (recursive descent + Pratt + packrat memoization;\n        │                             run against the conformance suite = the grammar's proof)\n        │\n        ├─ src\u002Fgen-tm.ts ───────────▶ typescript.tmLanguage.json            (TextMate highlighter)\n        ├─ src\u002Fgen-vscode-config.ts ▶ typescript.language-configuration.json (editor behavior)\n        ├─ src\u002Fgen-treesitter.ts ───▶ tree-sitter\u002F  (grammar.js + highlights.scm + scanner.c)\n        ├─ src\u002Fgen-monarch.ts ──────▶ typescript.monarch.json\n        └─ src\u002Fgen-ast-types.ts ────▶ typescript.cst-types.ts\n\nshared  src\u002Fgrammar-utils.ts          structural helpers used across stages\n        src\u002Fapi.ts, types.ts          the grammar's combinator + type surface\n```\n\nEvery target is produced by the *same* structural scope-inference, retargeted per format — lexer, parser, and generators are generic runtimes; all language specifics live in the grammar.\n\n## Prior art\n\n| Tool | Parser | Highlighting | Single source |\n|------|:---:|:---:|:---:|\n| TextMate grammars | — | manual regex | — |\n| tree-sitter | yes | queries (written separately) | — |\n| ANTLR | yes | — | — |\n| Langium | yes | Monarch (separate config) | — |\n| ungrammar | AST types | — | — |\n| **Monogram** | **CST, conformance-proven** | **derived from the parser grammar** | **yes** |\n\nEvery tool here has a real parser; none *derives the highlighter from the parser's own grammar as a single source* — the one thing Monogram is for.\n","Monogram 
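is a project for defining a language's grammar and automatically generating a lexer, a parser, and several syntax highlighters from it. Its core capability is that a single definition yields highlighters such as TextMate and tree-sitter, and because those highlighters are generated directly from a verified grammar model, the accuracy of the highlighting rules is ensured. It is written in TypeScript and uses recursive-descent and Pratt parsing to process grammar definitions. The project is particularly suited to quickly building high-quality syntax highlighting for a new programming language or a variant of an existing one, and also to improving highlighting accuracy for existing languages. Monogram has so far been applied to grammar definition and highlighter generation for TypeScript, JavaScript, HTML, and Vue.","2026-06-11 04:09:38","CREATED_QUERY"]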