Domain Specific Language
The Universal Context-Free Solver provides an intuitive Kotlin-based inner DSL for defining and working with context-free grammars. It enables developers to describe grammars using natural Kotlin syntax while benefiting from:
- development tools and syntax analysis from scratch;
- simple (for now) compile-time checks for rules.
Grammar
Each grammar is defined as a Kotlin class extending the base Grammar type and contains information about non-terminals, terminals, and production rules.
For example, the following grammar defines a simple language that recognizes a single word "a":
The start non-terminal can be set using the setStart(nt) method or directly during initialization with Nt(...).asStart().
Currently, only one start non-terminal is supported.
Non-terminals
- Designed as field of
Grammarfor easy monitoring and reuse. - Declared using Kotlin property delegates, e.g.,
val S by Nt(). - Initialization:
-
- in corresponds delegate constructor
val AB by Nt(A or B);
- in corresponds delegate constructor
-
- inside the
initblock using the/=operator, e.g.,AB /= A or B.
- inside the
-
- Each non-terminal must be initialized exactly once.
Terminals
Termis a generic class that can store terminals of any type. For example:- Terminals are compared based on their content value.
- Initialization:
-
- as fields of grammar class, can be reused
val a = Temr("A");
- as fields of grammar class, can be reused
-
- directly in production
BA \= Term("B") * Term("A").
- directly in production
- Strings can be handled without wrapping. For instance,
digit \= "1" or "0".
Operations
| Operation | EBNF | DSL |
|---|---|---|
| Production | A ::= B | val A by Nt(B) //in grammar class body A \= B //in init block |
| Concatenation | A B | A * B |
| Alternatve | A | B | A or B |
| Kleene Star | \(A^*\) | many(A) |
| Kleene Star+ | \(A^+\) | some(A) |
| Optional | A | \(\varepsilon\) A? |
opt(A) |
Epsilon (\(\varepsilon\)) - constant terminal with behavior corresponding to the empty string.
Examples
Dyck language
These grammars define the Dyck language — all correctly nested parentheses, brackets, and braces.
EBNF
DSL
class DyckGrammar : Grammar() {
val S by Nt().asStart()
val Round by Nt("(" * S * ")")
val Quadrat by Nt("[" * S * "]")
val Curly by Nt("{" * S * "}")
init {
//recursive initialization can be only in `init` block
S /= S * (Round or Quadrat or Curly) or Epsilon
}
}
A* language
EBNF
DSL
Grammar for this language can be described in various ways.
or