nxt.gxbnf

Lexer/Parser Generator for ANTLR (G, G2, G4) and (E)BNF grammars.

Members

Aliases

Imports alias Imports = DynamicArray!(Import, Mallocator, uint): Undocumented in source.
Input alias Input = string: Undocumented in source.
NodeArray alias NodeArray = DynamicArray!(Node, Mallocator, uint): Undocumented in source.
Output alias Output = DynamicArray!char: Undocumented in source.
PatternArray alias PatternArray = DynamicArray!(Pattern, Mallocator, uint): Undocumented in source.
Rules alias Rules = DynamicArray!(Rule, Mallocator, uint): Undocumented in source.
RulesByName alias RulesByName = Rule[Input]: Undocumented in source.
SymbolRefs alias SymbolRefs = DynamicArray!(SymbolRef, Mallocator, uint): Undocumented in source.

Classes

Action class Action: Undocumented in source.
ActionSymbol class ActionSymbol: Undocumented in source.
AltCharLiteral class AltCharLiteral: Undocumented in source.
AltM class AltM: Undocumented in source.
AnyClass class AnyClass: Undocumented in source.
AttributeSymbol class AttributeSymbol: Undocumented in source.
BinaryOpPattern class BinaryOpPattern: Binary pattern combinator.
BlockComment class BlockComment: Undocumented in source.
Channels class Channels: Undocumented in source.
CharAltM class CharAltM: Undocumented in source.
Class class Class: Undocumented in source.
DotDotSentinel class DotDotSentinel: Undocumented in source.
FragmentRule class FragmentRule: A reusable part of a lexer rule that doesn't match (a token) on its own.
Grammar class Grammar: Grammar named name.
GreedyCount class GreedyCount: Match count number of instances of type sub.
GreedyOneOrMore class GreedyOneOrMore: Match (greedily) one or more instances of type sub.
GreedyZeroOrMore class GreedyZeroOrMore: Match (greedily) zero or more instances of type sub.
GreedyZeroOrOne class GreedyZeroOrOne: Match (greedily) zero or one instances of type sub.
GxFileParser class GxFileParser: Gx file parser.
GxParserByStatement class GxParserByStatement: Gx parser with range interface over all statements.
Header class Header: Undocumented in source.
Import class Import: Import of modules.
LeftParenSentinel class LeftParenSentinel: Undocumented in source.
LexerGrammar class LexerGrammar: Lexer grammar named name.
LineComment class LineComment: Undocumented in source.
Mode class Mode: Undocumented in source.
NaryOpPattern class NaryOpPattern: N-ary expression.
NonGreedyOneOrMore class NonGreedyOneOrMore: Match (non-greedily) one or more instances of type sub.
NonGreedyZeroOrMore class NonGreedyZeroOrMore: Match (non-greedily) zero or more instances of type sub.
NonGreedyZeroOrOne class NonGreedyZeroOrOne: Match (non-greedily) zero or one instances of type sub.
NotPattern class NotPattern: Don't match an instance of type sub.
Options class Options: Undocumented in source.
OtherSymbol class OtherSymbol: Undocumented in source.
ParserGrammar class ParserGrammar: Parser grammar named name.
Pattern class Pattern: Undocumented in source.
PipeSentinel class PipeSentinel: Undocumented in source.
Range class Range: Match value range between limits[0] and limits[1].
RewriteSyntacticPredicate class RewriteSyntacticPredicate: Undocumented in source.
Rule class Rule: Rule.
ScopeAction class ScopeAction: Undocumented in source.
ScopeSymbol class ScopeSymbol: Undocumented in source.
ScopeSymbolAction class ScopeSymbolAction: Undocumented in source.
SeqM class SeqM: Sequence.
StrLiteral class StrLiteral: Undocumented in source.
SymbolRef class SymbolRef: Undocumented in source.
TerminatedUnaryOpPattern class TerminatedUnaryOpPattern: Undocumented in source.
TildeSentinel class TildeSentinel: Undocumented in source.
TokenNode class TokenNode: Undocumented in source.
Tokens class Tokens: Undocumented in source.
UnaryOpPattern class UnaryOpPattern: Unary match combinator.

Enums

Layout enum Layout: Format when printing AST (nodes).
NODE enum NODE: Node.
TOK enum TOK: Token kind. TODO: make this a string type like with std.experimental.lexer

Functions

buildSourceFiles string buildSourceFiles(string[] parserPaths, string[] parserModules, bool linkFlag): Build the D source files parserPaths.
createMainFile SourceFile createMainFile(string path, string[] parserPaths, string[] parserModules): Undocumented in source. Be warned that the author may not have intended to support it.
dcharCountSpanOf DcharCountSpan dcharCountSpanOf(Pattern[] subs): Undocumented in source. Be warned that the author may not have intended to support it.
doTree void doTree(BuildCtx bcx): Undocumented in source. Be warned that the author may not have intended to support it.
equalsAll bool equalsAll(Node[] a, Node[] b): Undocumented in source. Be warned that the author may not have intended to support it.
flattenSubs PatternArray flattenSubs(PatternArray subs): Undocumented in source. Be warned that the author may not have intended to support it.
iput void iput(Output sink, uint indentDepth, T x): Put x indented at indentDepth.
lexAllInDirTree void lexAllInDirTree(BuildCtx bcx): Undocumented in source. Be warned that the author may not have intended to support it.
makeAltA Pattern makeAltA(Token head, PatternArray subs, bool rewriteFlag): Undocumented in source. Be warned that the author may not have intended to support it.
makeAltM Pattern makeAltM(Token head, Pattern[] subs, bool rewriteFlag): Undocumented in source. Be warned that the author may not have intended to support it.
makeAltN Pattern makeAltN(Token head, Pattern[n] subs, bool rewriteFlag): Undocumented in source. Be warned that the author may not have intended to support it.
makeLiteral TokenNode makeLiteral(Token head): Undocumented in source. Be warned that the author may not have intended to support it.
makeSeq Pattern makeSeq(PatternArray subs, GxLexer lexer, bool rewriteFlag): Undocumented in source. Be warned that the author may not have intended to support it.
makeSeq Pattern makeSeq(Pattern[] subs, GxLexer lexer, bool rewriteFlag): Undocumented in source. Be warned that the author may not have intended to support it.
makeSeq Pattern makeSeq(Node[] subs, GxLexer lexer, bool rewriteFlag): Undocumented in source. Be warned that the author may not have intended to support it.
needsWrapping bool needsWrapping(Node[] subs): Undocumented in source. Be warned that the author may not have intended to support it.
parseAllInDirTree void parseAllInDirTree(BuildCtx bcx): Undocumented in source. Be warned that the author may not have intended to support it.
parseCharAltM Pattern parseCharAltM(CharAltM alt, GxLexer lexer): Undocumented in source. Be warned that the author may not have intended to support it.
putCharLiteral void putCharLiteral(Output sink, Input inp): Undocumented in source. Be warned that the author may not have intended to support it.
putStringLiteralBackQuoted void putStringLiteralBackQuoted(Output sink, Input inp): Undocumented in source. Be warned that the author may not have intended to support it.
putStringLiteralDoubleQuoted void putStringLiteralDoubleQuoted(Output sink, Input inp): Undocumented in source. Be warned that the author may not have intended to support it.
showNIndents void showNIndents(Output sink, uint indentDepth): Undocumented in source. Be warned that the author may not have intended to support it.
showNSpaces void showNSpaces(uint indentDepth): Undocumented in source. Be warned that the author may not have intended to support it.
showNSpaces void showNSpaces(Output sink, uint n): Undocumented in source. Be warned that the author may not have intended to support it.
toPathModuleName string toPathModuleName(string path)
tryRelativePath string tryRelativePath(string rootDirPath, string path): Undocumented in source. Be warned that the author may not have intended to support it.

Manifest constants

indentStep enum indentStep;: Indentation size in number of spaces.
matcherFunctionNamePrefix enum matcherFunctionNamePrefix;: Undocumented in source.
showProgressFlag enum showProgressFlag;: Undocumented in source.
useStaticTempArrays enum useStaticTempArrays;: Use fixed-size (statically allocated) sequence and alternative buffers.

Static functions

isSymbolStart bool isSymbolStart(dchar ch): Undocumented in source. Be warned that the author may not have intended to support it.

Static variables

mainName auto mainName;: Undocumented in source.
mainSource auto mainSource;: Undocumented in source.
parserSourceBegin auto parserSourceBegin;: Undocumented in source.
parserSourceEnd auto parserSourceEnd;: Undocumented in source.

Structs

BuildCtx struct BuildCtx: Build Context
DcharCountSpan struct DcharCountSpan: Lower and upper limit of dchar count.
ExecutableFile struct ExecutableFile: Undocumented in source.
Format struct Format: Undocumented in source.
GxFileReader struct GxFileReader: Undocumented in source.
GxLexer struct GxLexer: Gx lexer for all versions of ANTLR grammars (.g, .g2, .g4).
ObjectFile struct ObjectFile: Undocumented in source.
SourceFile struct SourceFile: Undocumented in source.
Token struct Token: Gx rule.

- Add Syntax Tree Nodes as structs with members being sub-nodes. Composition over inheritance. If we use structs instead of classes, more languages, such as Vox, can be supported in the code generation phase. Optionally use extern(C++) classes. Sub-node pointers should be defined as unique pointers with deterministic destruction.

- Should be allowed instead of warning: grammars-v4/lua/Lua.g4(329,5): Warning: missing left-hand side, token (leftParen) at offset 5967

- Parallelize grammar parsing and generation of parser files using https://dlang.org/phobos/std_parallelism.html#.parallel. After that, compilation of parser files should be grouped into CPU-count number of groups.
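A minimal sketch of that plan, assuming each grammar can be processed independently. `processGrammar`, `processAll` and `groupForCompilation` are hypothetical names introduced for illustration and are not part of nxt.gxbnf; only `parallel`, `totalCPUs` and `chunks` come from Phobos.

```d
import std.parallelism : parallel, totalCPUs;
import std.range : chunks;

/// Hypothetical stand-in for parsing one grammar and emitting its parser file.
size_t processGrammar(string path)
{
    return path.length; // placeholder work
}

/// Process all grammars in parallel; each result is stored at its own index,
/// so the loop bodies do not share mutable state.
size_t[] processAll(string[] paths)
{
    auto results = new size_t[paths.length];
    foreach (i, path; parallel(paths))
        results[i] = processGrammar(path);
    return results;
}

/// Split the generated parser files into at most `groupCount` groups.
string[][] groupForCompilation(string[] files, size_t groupCount)
{
    string[][] groups;
    if (files.length == 0 || groupCount == 0)
        return groups;
    const size = (files.length + groupCount - 1) / groupCount; // ceiling division
    foreach (chunk; files.chunks(size))
        groups ~= chunk;
    return groups;
}

void main()
{
    import std.stdio : writeln;

    auto paths = ["Lua.g4", "Python3.g4", "C.g4"];
    writeln(processAll(paths));
    writeln(groupForCompilation(paths, totalCPUs));
}
```

Each group could then be handed to one compiler invocation, so at most `totalCPUs` compilations run at once.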

- Use: https://forum.dlang.org/post/zcvjwdetohmklaxriswk@forum.dlang.org

- Rewrite (X+)? as X* in ANTLR grammars and commit to grammars-v4. See https://stackoverflow.com/questions/64706408/rewriting-x-as-x-in-antlr-grammars
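The rewrite itself can be sketched on a toy pattern tree. The classes `Pat`, `Sym`, `Opt`, `Plus` and `Star` below are hypothetical simplifications made up for this example; they are not the module's `GreedyZeroOrOne`, `GreedyOneOrMore` and `GreedyZeroOrMore` classes, whose constructors are not shown on this page.

```d
/// Toy pattern tree (hypothetical, for illustration only).
class Pat {}

final class Sym : Pat
{
    string name;
    this(string name) { this.name = name; }
}

final class Opt : Pat   // sub?
{
    Pat sub;
    this(Pat sub) { this.sub = sub; }
}

final class Plus : Pat  // sub+
{
    Pat sub;
    this(Pat sub) { this.sub = sub; }
}

final class Star : Pat  // sub*
{
    Pat sub;
    this(Pat sub) { this.sub = sub; }
}

/// Rewrite every (X+)? in the tree as X*, bottom-up.
Pat simplify(Pat p)
{
    if (auto o = cast(Opt) p)
    {
        auto inner = simplify(o.sub);
        if (auto pl = cast(Plus) inner)
            return new Star(pl.sub); // (X+)? and X* match the same inputs
        return new Opt(inner);
    }
    if (auto pl = cast(Plus) p)
        return new Plus(simplify(pl.sub));
    if (auto st = cast(Star) p)
        return new Star(simplify(st.sub));
    return p; // leaf symbol
}

void main()
{
    auto rewritten = simplify(new Opt(new Plus(new Sym("X"))));
    assert(cast(Star) rewritten !is null);
}
```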

- Add errors for missing symbols during code generation

- Literal indexing: add a map from string literal to fixed-length (typically lexer) rule, and warn about string literals, such as str(...), that are equal to tokens such as ELLIPSIS in Python3.g4.

- Make Rule.root be of type Matcher and make dcharCountSpan and toMatchInSource members of Matcher. Remove Symbol.toMatchInSource.

- Support tokens { INDENT_WS, DEDENT_WS, LINE_BREAK_WS } to get Python3.g4 with TOK.whitespaceIndent, whitespaceDedent, whitespaceLineBreak useWhitespaceClassesFlag. See: https://stackoverflow.com/questions/8642154/antlr-what-is-simpliest-way-to-realize-python-like-indent-depending-grammar

- Unicode regular expressions. Use https://www.regular-expressions.info/unicode.html and https://forum.dlang.org/post/rsmlqfwowpnggwyuibok@forum.dlang.org

- Use to detect conflicting rules with import and tokenVocab

- Use a region allocator on top of the GC to pre-allocate the nodes. Either copied from std.allocator or Vox. Maybe one region for each file. Calculate the region size from lexer statistics (number of operators, symbols and literals).

- The implementation of not(...) needs to be adjusted. Is it often used in conjunction with altN?

- Handle all TODOs in makeRule

- Move parserSourceBegin to gxbnf_rdbase.d

- Use TOK.tokenSpecOptions in parsing. Ignored for now.

- Essentially, Packrat parsing just means caching whether sub-expressions match at the current position in the string when they are tested -- this means that if the current attempt to fit the string into an expression fails then attempts to fit other possible expressions can benefit from the known pass/fail of subexpressions at the points in the string where they have already been tested.
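As a rough illustration of that caching, the sketch below memoizes one rule by start offset. The grammar (`S <- As 'b' / As 'c'`, `As <- 'a' As / 'a'`), the `Packrat` struct and its members are all hypothetical and only demonstrate the idea; they are not taken from the generated parsers.

```d
/// Outcome of trying a rule at an offset: success flag and end offset.
struct MatchResult
{
    bool ok;
    size_t end;
}

/// Hypothetical packrat matcher for:
///   S  <- As 'b' / As 'c'
///   As <- 'a' As / 'a'
struct Packrat
{
    string input;
    MatchResult[size_t] memoAs; // cache for rule As, keyed by start offset
    size_t evalCount;           // number of uncached evaluations of As

    MatchResult matchAs(size_t off)
    {
        if (auto hit = off in memoAs)
            return *hit; // known pass/fail at this offset
        evalCount++;
        MatchResult r; // defaults to failure
        if (off < input.length && input[off] == 'a')
        {
            auto rest = matchAs(off + 1);
            r = rest.ok ? rest : MatchResult(true, off + 1);
        }
        memoAs[off] = r;
        return r;
    }

    MatchResult matchS(size_t off)
    {
        foreach (term; "bc")
        {
            auto run = matchAs(off); // the second alternative hits the cache
            if (run.ok && run.end < input.length && input[run.end] == term)
                return MatchResult(true, run.end + 1);
        }
        return MatchResult(false, off);
    }
}

void main()
{
    auto p = Packrat("aaac");
    assert(p.matchS(0).ok);
    assert(p.evalCount == 4); // offsets 0..3 evaluated once each, not twice
}
```

When the first alternative of `S` fails on the final `'c'`, the second alternative reuses the cached result of `As` at offset 0 instead of rescanning the run of `'a'` characters.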

- Deal with differences between import and tokenVocab. See: https://stackoverflow.com/questions/28829049/antlr4-any-difference-between-import-and-tokenvocab

- Add Rule in generated code that defines opApply for matching that overrides

- Detect indirect mutual left-recursion by checking whether Rule.lastOffset (in generated code) is the same as the current parser offset. The simple way in generated parsers: detect when a rule is entered again without an offset change. This requires storing the last offset for each non-literal rule:

    /** Last offset during parsing.
     *
     * Used to detect infinite recursion, size_t.max indicates no last offset
     * yet defined for this rule.
     */
    size_t lastOffset = size_t.max;
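A minimal sketch of the offset check on a deliberately left-recursive toy rule. The `Parser` struct, its `expr` method and the `leftRecursionSeen` flag are hypothetical. Saving and restoring `lastOffset` on rule exit is an assumption added here so that a legitimate retry of the same rule after backtracking is not reported.

```d
/// Hypothetical hand-written parser for: expr <- expr '+' 'n' / 'n'
struct Parser
{
    string input;
    size_t lastOffset = size_t.max; // size_t.max: expr is not currently active
    bool leftRecursionSeen;

    bool expr(ref size_t off)
    {
        if (lastOffset == off)
        {
            leftRecursionSeen = true; // re-entered without consuming input
            return false;             // fail this path instead of recursing forever
        }
        const saved = lastOffset;
        lastOffset = off;
        scope (exit) lastOffset = saved; // restore when the rule returns

        const start = off;
        // First alternative: expr '+' 'n'
        if (expr(off) && off + 1 < input.length
            && input[off] == '+' && input[off + 1] == 'n')
        {
            off += 2;
            return true;
        }
        off = start; // backtrack
        // Second alternative: 'n'
        if (off < input.length && input[off] == 'n')
        {
            ++off;
            return true;
        }
        return false;
    }
}

void main()
{
    auto p = Parser("n");
    size_t off = 0;
    assert(p.expr(off) && off == 1);
    assert(p.leftRecursionSeen);
}
```

This only detects the recursion and cuts it off. It does not make the left-recursive alternative match, so `n+n` would still be parsed as a single `n` here.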

- Warn about options{greedy=false;}: and advise replacing it with non-greedy variants. Warn about options{greedy=true;}: being deprecated.

- Display column range for tokens in messages. Use head.input.length. Requires updating FlyCheck. See: -fdiagnostics-print-source-range-info at https://clang.llvm.org/docs/UsersManual.html. See: https://clang.llvm.org/diagnostics.html. Use GNU-style formatting such as: fix-it:"test.c":{45:3-45:21}:"gtk_widget_show_all".
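Producing a line in the shape of the fix-it example quoted above could look like the sketch below. `SourceRange` and `fixItMessage` are hypothetical names, and how the line and column values would be derived from a token (for instance from head.input.length) is left open.

```d
import std.format : format;

/// Hypothetical 1-based line/column range of a token.
struct SourceRange
{
    uint beginLine, beginColumn, endLine, endColumn;
}

/// Format a fix-it line in the same shape as the example quoted above.
string fixItMessage(string path, SourceRange r, string replacement)
{
    return format(`fix-it:"%s":{%d:%d-%d:%d}:"%s"`,
                  path, r.beginLine, r.beginColumn, r.endLine, r.endColumn,
                  replacement);
}

void main()
{
    import std.stdio : writeln;

    writeln(fixItMessage("test.c", SourceRange(45, 3, 45, 21), "gtk_widget_show_all"));
}
```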

- Use: nxt.git to scan parsing examples in grammars-v4

- If performance is needed: avoid casts and instead compare against head.tok for isA!NodeType; use RuleAltN(uint n) in makeAlt; use SeqN(uint n) in makeSeq.

- Support reading parsers from Grammar Zoo.