One .parse file. Define the lexer, write the grammar, configure semantics, emit C++23, and the bootstrap compiler gives you back a fully working native compiler. Nothing else required.
ParseLang is a meta-language: a language for defining languages. Write the description once. Get a complete compiler back. No runtime, no framework, no Delphi code written by you.
Write a single .parse file. Declare every token your lexer needs, every grammar rule your parser needs, every semantic check your analyser needs, and every C++23 construct your code generator needs, all in ParseLang's own scripting language.
    language MyLang;

    keywords casesensitive
      'var'    -> 'keyword.var';
      'func'   -> 'keyword.func';
      'return' -> 'keyword.return';
      'if'     -> 'keyword.if';
      'while'  -> 'keyword.while';
      'end'    -> 'keyword.end';
    end

    operators
      ':=' -> 'op.assign';
      '+'  -> 'op.plus';
      '('  -> 'delim.lparen';
      ')'  -> 'delim.rparen';
    end

    typemap
      'type.int'    -> 'int64_t';
      'type.string' -> 'std::string';
    end
Phase 1: the PLC bootstrap reads your .parse file and configures a live TParse instance, wiring every lexer token, Pratt grammar handler, semantic rule, and emitter. Phase 2: that configured instance compiles your source file, validates it semantically, and emits C++23 via a fluent IR builder.
    registerLiterals;

    binaryop 'op.plus' power 20 op '+';
    binaryop 'op.star' power 30 op '*';

    prefix 'identifier' as 'expr.ident'
      parse
        result := createNode();
        setAttr(result, 'ident.name', current().text);
        consume();
      end
    end

    statement 'keyword.var' as 'stmt.var_decl'
      parse
        result := createNode();
        consume();
        setAttr(result, 'decl.name', current().text);
        consume();
        expect('delim.colon');
        setAttr(result, 'decl.type_text', current().text);
        consume();
        expect('op.assign');
        addChild(result, parseExpr(0));
        expect('delim.semi');
      end
    end
Your emit rules drive a fluent C++23 IR builder that produces a .h and .cpp. ParseLang calls Zig/Clang to compile them into a native exe, lib, or dll. Win64 or Linux64, debug or release. Windows version info and icons configurable from emit rules.
    emit 'program.root'
      setPlatform('win64');
      setBuildMode('exe');
      setOptimize('release-fast');
      include('cstdint', target.header);
      include('string', target.header);
      func('main', 'int');
      emitChildren(node);
      returnVal('0');
      endFunc();
    end

    emit 'stmt.var_decl'
      declVar(
        getAttr(node, 'decl.name'),
        typeToIR(getAttr(node, 'sem.type')),
        exprToString(getChild(node, 0)));
    end

    emit 'stmt.if'
      ifStmt(exprToString(getChild(node, 0)));
      emitBlock(getChild(node, 1));
      endIf();
    end
Phase 1 reads your language description and configures the compiler in memory. Phase 2 uses that compiler to process source files and produce native binaries via Zig.
One file replaces every compiler component: lexer config, Pratt grammar rules, semantic handlers, C++23 emitters, pipeline config (platform, build mode, optimize), Windows version info, and icon embedding. No Delphi code written by you.
Top-down operator precedence ready to use. prefix, infix left/right, statement. The binaryop shorthand registers parse + emit in one line.
Built-in nested scope trees. declare, lookup, lookupLocal. Type-text-to-kind resolution. Structured errors with source location and error codes.
Structured builder API. Functions, control flow, variable declarations, includes. exprToString() converts expression subtrees. Dual-file: source or header per call.
Variables, if, while, for, repeat, string concatenation. Typed helper functions defined once, called from any rule block.
Compiles to exe, lib, or dll. Win64 and Linux64. debug, release-safe, release-fast, release-small. Windows version info and icons embeddable from emit rules.
The entire compiler: tokeniser, Pratt parser, semantic analyser, code generator, configured in one file using ParseLang's scripting language.
Register every token the language needs. Keywords with case sensitivity control, multi-character operators (longest-match), string styles with independent escape config, comment styles, type keywords, and a typemap that resolves them to C++ type names.
Key points: typemap resolves kind strings to C++ type names; structural sets the terminator and block-close tokens.

    language MyLang;

    keywords casesensitive
      'var'    -> 'keyword.var';
      'func'   -> 'keyword.func';
      'return' -> 'keyword.return';
      'if'     -> 'keyword.if';
      'else'   -> 'keyword.else';
      'while'  -> 'keyword.while';
      'end'    -> 'keyword.end';
      'true'   -> 'keyword.true';
      'false'  -> 'keyword.false';
    end

    operators
      ':=' -> 'op.assign';
      '<=' -> 'op.lte';
      '<>' -> 'op.neq';
      '+'  -> 'op.plus';
      '-'  -> 'op.minus';
      '*'  -> 'op.star';
      '('  -> 'delim.lparen';
      ')'  -> 'delim.rparen';
      ','  -> 'delim.comma';
      ':'  -> 'delim.colon';
      ';'  -> 'delim.semi';
    end

    strings
      '"' '"' -> 'literal.string' escape true;
    end

    comments
      line '--';
    end

    structural
      terminator 'delim.semi';
      blockclose 'keyword.end';
    end

    types
      'int'    -> 'type.int';
      'string' -> 'type.string';
    end

    typemap
      'type.int'    -> 'int64_t';
      'type.string' -> 'std::string';
    end
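Longest-match handling of multi-character operators can be illustrated outside ParseLang. The Python sketch below is not ParseLang's implementation; the function names `match_operator` and `scan_operators` are hypothetical, and the table simply mirrors the operators block above. It shows why `:=` must be tried before `:`.

```python
# Hypothetical operator table mirroring the operators block above.
OPERATORS = {
    ":=": "op.assign",
    "<=": "op.lte",
    "<>": "op.neq",
    "+": "op.plus",
    "-": "op.minus",
    "*": "op.star",
    "(": "delim.lparen",
    ")": "delim.rparen",
    ",": "delim.comma",
    ":": "delim.colon",
    ";": "delim.semi",
}

# Try longer lexemes first so ':=' wins over ':' at the same position.
_BY_LENGTH = sorted(OPERATORS, key=len, reverse=True)


def match_operator(text, pos):
    """Return (kind, lexeme) for the longest operator starting at pos, or None."""
    for op in _BY_LENGTH:
        if text.startswith(op, pos):
            return OPERATORS[op], op
    return None


def scan_operators(text):
    """Tokenise a string that contains only operators and whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        hit = match_operator(text, pos)
        if hit is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(hit)
        pos += len(hit[1])
    return tokens
```

Under these assumptions, `scan_operators(":= :")` yields an `op.assign` token followed by a `delim.colon` token rather than two colons and a stray `=`.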
Grammar rules are written in ParseLang's scripting language. prefix fires at expression start, infix left/right handles binary positions, statement fires at the top level. Binding powers control precedence.
Key points: registerLiterals wires all literal prefixes in one call; the binaryop shorthand registers parse and emit together; result is the node to return, and left is the infix left operand.

    registerLiterals;

    binaryop 'op.plus'  power 20 op '+';
    binaryop 'op.minus' power 20 op '-';
    binaryop 'op.star'  power 30 op '*';

    prefix 'identifier' as 'expr.ident'
      parse
        result := createNode();
        setAttr(result, 'ident.name', current().text);
        consume();
      end
    end

    infix left 'delim.lparen' power 80 as 'expr.call'
      parse
        result := createNode();
        setAttr(result, 'call.name', getAttr(left, 'ident.name'));
        consume();
        if not check('delim.rparen') then
          addChild(result, parseExpr(0));
          while match('delim.comma') do
            addChild(result, parseExpr(0));
          end
        end
        expect('delim.rparen');
      end
    end

    statement 'keyword.var' as 'stmt.var_decl'
      parse
        result := createNode();
        consume();
        setAttr(result, 'decl.name', current().text);
        consume();
        expect('delim.colon');
        setAttr(result, 'decl.type_text', current().text);
        consume();
        expect('op.assign');
        addChild(result, parseExpr(0));
        expect('delim.semi');
      end
    end
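To see how binding powers decide precedence, here is a minimal Pratt parser sketch in Python. It is a generic illustration of the technique, not ParseLang's internals: the `Parser` class, the `parse` helper, and the rule "continue only while the operator's power exceeds the minimum" are assumptions chosen to give left-associative operators. The powers 20 and 30 are taken from the binaryop lines above.

```python
# Binding powers copied from the binaryop declarations above.
BINARY_POWER = {"+": 20, "-": 20, "*": 30}


class Parser:
    """Tiny Pratt parser over a list of string tokens (hypothetical helper)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def current(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def consume(self):
        tok = self.current()
        self.pos += 1
        return tok

    def parse_expr(self, min_power):
        # Prefix position: an atom or a parenthesised sub-expression.
        tok = self.consume()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            left = self.parse_expr(0)
            if self.consume() != ")":
                raise SyntaxError("expected ')'")
        elif tok in BINARY_POWER or tok == ")":
            raise SyntaxError(f"unexpected token {tok!r}")
        else:
            left = tok

        # Infix position: keep folding while the next operator binds tighter.
        while True:
            op = self.current()
            power = BINARY_POWER.get(op)
            if power is None or power <= min_power:
                break
            self.consume()
            right = self.parse_expr(power)  # same power => left-associative
            left = (op, left, right)
        return left


def parse(tokens):
    """Parse a full token list and reject trailing input."""
    parser = Parser(tokens)
    tree = parser.parse_expr(0)
    if parser.current() is not None:
        raise SyntaxError(f"unexpected token {parser.current()!r}")
    return tree
```

With these powers, `parse(["1", "+", "2", "*", "3"])` groups the multiplication first because 30 is greater than 20, while `1 - 2 - 3` groups to the left.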
Semantic rules fire during the analysis pass. Nodes without a handler are walked transparently. Register only what you need; children are visited automatically.
Key points: pushScope / popScope open and close a nested scope; declare(name, node) returns false on duplicates; lookup walks the full chain, while lookupLocal stays in the current scope; typeTextToKind() resolves type text via the lexer types.

    semantic 'program.root'
      pushScope('global', node);
      visitChildren(node);
      popScope(node);
    end

    semantic 'stmt.func_decl'
      ok := declare(getAttr(node, 'func.name'), node);
      if not ok then
        error(node, 'ML001', 'Duplicate function: ' + getAttr(node, 'func.name'));
      end
      pushScope(getAttr(node, 'func.name'), node);
      visitChildren(node);
      popScope(node);
    end

    semantic 'stmt.var_decl'
      ok := declare(getAttr(node, 'decl.name'), node);
      if not ok then
        error(node, 'ML002', 'Duplicate variable: ' + getAttr(node, 'decl.name'));
      end
      setAttr(node, 'sem.type', typeTextToKind(getAttr(node, 'decl.type_text')));
      visitChildren(node);
    end

    semantic 'expr.ident'
      sym := lookup(getAttr(node, 'ident.name'));
      if sym = nil then
        error(node, 'ML003', 'Undeclared: ' + getAttr(node, 'ident.name'));
      end
    end
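The behaviour described above (a scope chain, plus transparent walking of nodes that have no handler) can be modelled in a few lines of Python. This is a sketch under assumptions, not ParseLang's analyser: the `Analyser` class, the dictionary-based node shape, and the choice that `declare` only rejects duplicates within the current scope are all hypothetical. The error codes are borrowed from the rules above.

```python
class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}


class Analyser:
    """Hypothetical model of a handler-driven semantic pass."""

    def __init__(self):
        self.scope = None
        self.handlers = {}  # node kind -> handler(analyser, node)
        self.errors = []    # (code, message) pairs

    def push_scope(self, name):
        self.scope = Scope(name, self.scope)

    def pop_scope(self):
        self.scope = self.scope.parent

    def declare(self, name, node):
        # Assumption: only a duplicate in the *current* scope is rejected.
        if name in self.scope.symbols:
            return False
        self.scope.symbols[name] = node
        return True

    def lookup_local(self, name):
        return self.scope.symbols.get(name)

    def lookup(self, name):
        scope = self.scope
        while scope is not None:  # walk outwards through parent scopes
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None

    def visit(self, node):
        handler = self.handlers.get(node["kind"])
        if handler is not None:
            handler(self, node)
        else:
            self.visit_children(node)  # no handler: walk transparently

    def visit_children(self, node):
        for child in node.get("children", []):
            self.visit(child)


def on_root(a, node):
    a.push_scope("global")
    a.visit_children(node)
    a.pop_scope()


def on_var_decl(a, node):
    if not a.declare(node["name"], node):
        a.errors.append(("ML002", "Duplicate variable: " + node["name"]))
    a.visit_children(node)


def on_ident(a, node):
    if a.lookup(node["name"]) is None:
        a.errors.append(("ML003", "Undeclared: " + node["name"]))


def analyse(tree):
    a = Analyser()
    a.handlers = {
        "program.root": on_root,
        "stmt.var_decl": on_var_decl,
        "expr.ident": on_ident,
    }
    a.visit(tree)
    return a.errors
```

In this model a node kind such as `stmt.block` needs no handler at all: its children are still visited, so declarations and identifier checks inside it fire as usual.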
Emit rules fire during code generation. Statement nodes call IR builder procedures. Expression nodes set the implicit result to their C++ string. Targets the source or header file per call.
Key points: builder procedures include func, ifStmt, whileStmt, declVar, and returnVal; each call targets target.source or target.header; exprToString(node) converts expression trees to C++; typeToIR() resolves kind strings via the typemap.

    emit 'program.root'
      setPlatform('win64');
      setBuildMode('exe');
      setOptimize('debug');
      include('cstdint', target.header);
      include('string', target.header);
      func('main', 'int');
      emitChildren(node);
      returnVal('0');
      endFunc();
    end

    emit 'stmt.var_decl'
      declVar(
        getAttr(node, 'decl.name'),
        typeToIR(getAttr(node, 'sem.type')),
        exprToString(getChild(node, 0)));
    end

    emit 'stmt.if'
      ifStmt(exprToString(getChild(node, 0)));
      emitBlock(getChild(node, 1));
      if childCount(node) > 2 then
        elseStmt();
        emitBlock(getChild(node, 2));
      end
      endIf();
    end

    emit 'expr.call'
      args := '';
      i := 0;
      while i < childCount(node) do
        if i > 0 then
          args := args + ', ';
        end
        args := args + exprToString(getChild(node, i));
        i := i + 1;
      end
      result := getAttr(node, 'call.name') + '(' + args + ')';
    end
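As a rough mental model of a fluent builder that writes to two files, the Python sketch below collects header and source lines separately and resolves kind strings through a typemap. Everything here is an assumption made for illustration: the `IRBuilder` class, its method names, and the exact C++ text it produces are hypothetical and are not ParseLang's actual output format.

```python
# Typemap entries copied from the lexer section above.
TYPEMAP = {"type.int": "int64_t", "type.string": "std::string"}


def type_to_ir(kind):
    """Resolve a kind string to a C++ type name (raises KeyError if unmapped)."""
    return TYPEMAP[kind]


class IRBuilder:
    """Hypothetical fluent builder: every method returns self for chaining."""

    def __init__(self):
        self.header = []  # lines destined for the .h file
        self.source = []  # lines destined for the .cpp file
        self.indent = 0

    def include(self, name, target="source"):
        lines = self.header if target == "header" else self.source
        lines.append(f"#include <{name}>")
        return self

    def _emit(self, text):
        self.source.append("    " * self.indent + text)
        return self

    def func(self, name, return_type):
        self._emit(f"{return_type} {name}() {{")
        self.indent += 1
        return self

    def decl_var(self, name, type_name, init):
        return self._emit(f"{type_name} {name} = {init};")

    def return_val(self, expr):
        return self._emit(f"return {expr};")

    def end_func(self):
        self.indent -= 1
        return self._emit("}")


def build_example():
    """Mirror the shape of the 'program.root' and 'stmt.var_decl' rules above."""
    b = IRBuilder()
    b.include("cstdint", "header").include("string", "header")
    (b.func("main", "int")
      .decl_var("x", type_to_ir("type.int"), "1 + 2")
      .return_val("0")
      .end_func())
    return b
```

The design point is that statement-level calls append lines to a chosen target, while expressions are plain strings passed in as arguments, which matches the split between builder procedures and exprToString() described above.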
Grab the latest release from GitHub. Includes the PLC bootstrap compiler and the full Zig toolchain. No separate downloads needed.
Start with a language declaration. Add keywords, operators, strings, comments, types, and typemap blocks.
Call registerLiterals, then write binaryop, prefix, infix, and statement rules.
Write an emit block for each AST node kind. Add semantic rules for scope and type analysis where needed.
Execute PLC mylang.parse myprogram.ml. ParseLang bootstraps your compiler, emits C++23, and Zig links a native binary.
    language Hello;

    -- Lexer ─────────────────────────
    keywords casesensitive
      'print' -> 'keyword.print';
    end

    operators
      '(' -> 'delim.lparen';
      ')' -> 'delim.rparen';
    end

    strings
      '"' '"' -> 'literal.string' escape true;
    end

    comments
      line '--';
    end

    -- Grammar ─────────────────────
    registerLiterals;

    statement 'keyword.print' as 'stmt.print'
      parse
        result := createNode();
        consume();
        expect('delim.lparen');
        addChild(result, parseExpr(0));
        expect('delim.rparen');
      end
    end

    -- Emit ───────────────────────────
    emit 'program.root'
      setPlatform('win64');
      setBuildMode('exe');
      include('cstdio', target.header);
      include('string', target.header);
      func('main', 'int');
      emitChildren(node);
      returnVal('0');
      endFunc();
    end

    emit 'stmt.print'
      val := exprToString(getChild(node, 0));
      stmt('printf("%s\n", ' + val + '.c_str())');
    end
The release includes the PLC bootstrap compiler, all supporting binaries, and the full Zig toolchain, everything bundled. Unzip and start writing .parse files immediately.
| Requirement | Minimum | Notes |
|---|---|---|
| Host OS | Windows 10/11 x64 | Supported |
| Delphi (source builds) | Delphi 12 Athens | Required only to build ParseLang from source |
| Linux target | WSL2 + Ubuntu | Install with wsl --install -d Ubuntu; ParseLang locates it automatically |