When constructing formal grammars, developers frequently encounter repetitive syntactic patterns. A typical scenario involves parsing delimiter-separated sequences, such as comma-separated lists of identifiers or expressions. Manually expanding these rules results in verbose definitions that are difficult to maintain and prone to copy-paste errors. LALRPOP mitigates this issue with a macro system: a rule can take other rules as parameters, so one definition replaces many near-identical ones.
Intrinsic Repetition Operators
The framework includes several built-in operators to handle common structural patterns: ?, *, +, and grouping parentheses. Appending ? to a rule transforms it into an optional match, yielding an Option<T> where T represents the rule's underlying type. The * and + operators generate zero-or-more and one-or-more repetitions, respectively, both producing a Vec<T>. Parenthetical grouping allows temporary composition of tokens without instantiating a permanent named nonterminal. Value projection is controlled via angle brackets, which instruct the parser exactly which sub-expressions should be captured and forwarded to the action code.
Constructing Generic List Handlers
By combining these primitives, developers can create parameterized macros for delimiter-separated sequences. The following example demonstrates a generic list parser that gracefully handles an optional trailing delimiter:
pub ExpressionList = SequenceOf<Statement>;
SequenceOf<T>: Vec<T> = {
    <mut accumulated:(<T> ",")*> <trailing:T?> => match trailing {
        Some(final_item) => {
            accumulated.push(final_item);
            accumulated
        },
        None => accumulated,
    }
};
The macro accepts a parameter T, which may be any terminal or nonterminal, and returns a Vec of whatever type T produces. The pattern (<T> ",")* repeatedly matches the target element followed by a delimiter. Because only T is wrapped in angle brackets, the delimiter is dropped and the repetition yields a Vec<T>. The mut binding on accumulated lets the action code push onto that vector directly. The optional T? after the repetition captures a final item that is not followed by a comma, so both "a, b" and "a, b," are accepted. The embedded Rust action appends this last item when it is present and returns the complete collection.
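The action code is ordinary Rust, so its logic can be examined outside the grammar. The following is a minimal sketch, not part of the grammar itself: finish_sequence is a hypothetical helper name, its first argument stands for the vector produced by the repetition, and its second for the optional trailing item. It uses Vec::extend, which should behave the same as the match above because Option<T> iterates over zero or one element.

```rust
/// Plain-Rust model of the `SequenceOf<T>` action code (hypothetical helper,
/// not generated by LALRPOP).
/// `accumulated` stands for the items matched by `(<T> ",")*`;
/// `trailing` stands for the optional final item matched by `T?`.
fn finish_sequence<T>(mut accumulated: Vec<T>, trailing: Option<T>) -> Vec<T> {
    // `Option<T>` implements `IntoIterator`, yielding zero or one element,
    // so `extend` pushes the trailing item only when it is `Some`.
    accumulated.extend(trailing);
    accumulated
}
```

For an input such as "a, b, c" the repetition would contribute a and b and the trailing slot c. For "a, b," the trailing slot is None and the vector is returned unchanged.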
Recursive Precedence Levels
Another prevalent use case for macros involves stratified operator precedence. Instead of manually writing out identical recursive rules for each priority tier, a single generic definition can be instantiated multiple times:
PrecedenceChain<Operator, Inner>: Box<AstNode> = {
    <left:PrecedenceChain<Operator, Inner>> <op:Operator> <right:Inner> => Box::new(AstNode::BinaryOp(<>)),
    Inner
};
HighPriority = PrecedenceChain<AdditiveOp, MultiplicativeExpr>;
MultiplicativeExpr = PrecedenceChain<MultiplicativeOp, AtomicLiteral>;
AdditiveOp: OperatorKind = {
    "+" => OperatorKind::Plus,
    "-" => OperatorKind::Minus,
};
MultiplicativeOp: OperatorKind = {
    "*" => OperatorKind::Multiply,
    "/" => OperatorKind::Divide,
};
This recursive template accepts two parameters: the operator rule to apply at the current level, and the inner rule representing higher-precedence constructs. The first alternative is left-recursive: it chains the current level's operators with the inner rule, which makes those operators left-associative, so 20 - 8 - 2 groups as (20 - 8) - 2. The second alternative simply delegates to the inner rule when no operator of the current level is present, terminating the chain at the appropriate precedence boundary.
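The tree shape produced by the left-recursive alternative can be reproduced in plain Rust. This is a sketch under stated assumptions: the text only shows AstNode::BinaryOp and the four OperatorKind variants, so the Number leaf variant and the helpers chain_left and num are hypothetical names introduced for illustration. The fold mirrors the order in which one precedence level would be reduced: each new operator wraps the tree built so far as its left operand.

```rust
// Hypothetical AST types. Only `BinaryOp` and the `OperatorKind` variants
// appear in the grammar; `Number` is an assumed leaf variant.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum OperatorKind {
    Plus,
    Minus,
    Multiply,
    Divide,
}

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum AstNode {
    Number(i32),
    BinaryOp(Box<AstNode>, OperatorKind, Box<AstNode>),
}

/// Mimics the reductions of a single `PrecedenceChain` level: start from the
/// first `Inner` value, then for every following `Operator Inner` pair make
/// the tree built so far the left operand of a new `BinaryOp`.
fn chain_left(
    first: Box<AstNode>,
    rest: Vec<(OperatorKind, Box<AstNode>)>,
) -> Box<AstNode> {
    rest.into_iter().fold(first, |left, (op, right)| {
        Box::new(AstNode::BinaryOp(left, op, right))
    })
}

/// Convenience constructor for a boxed numeric leaf (hypothetical helper).
fn num(n: i32) -> Box<AstNode> {
    Box::new(AstNode::Number(n))
}
```

Feeding it 20 followed by the pairs (Minus, 8) and (Minus, 2) yields a BinaryOp whose left child is the BinaryOp for 20 - 8. A right-recursive rule would nest the other way and compute 20 - (8 - 2) instead.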
Parser Verification
After defining these macro expansions, standard integration tests can validate the generated parser against expected inputs:
use lalrpop_util::lalrpop_mod;
lalrpop_mod!(pub grammar_v5);

#[test]
fn validate_expression_sequences() {
    let empty = grammar_v5::ExpressionListParser::new()
        .parse("")
        .unwrap();
    assert_eq!(format!("{:?}", empty), "[]");

    let single = grammar_v5::ExpressionListParser::new()
        .parse("12 * 4")
        .unwrap();
    assert_eq!(format!("{:?}", single), "[(12 * 4)]");

    let multi_trailing = grammar_v5::ExpressionListParser::new()
        .parse("12 * 4, 3 + 9,")
        .unwrap();
    assert_eq!(format!("{:?}", multi_trailing), "[(12 * 4), (3 + 9)]");

    let complex = grammar_v5::ExpressionListParser::new()
        .parse("5, 10 / 2, 20 - 8")
        .unwrap();
    assert_eq!(format!("{:?}", complex), "[5, (10 / 2), (20 - 8)]");
}
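The assertions compare Debug output against infix strings such as (12 * 4). A derived Debug implementation would print variant names instead, so the AST presumably implements Debug by hand. The definition of AstNode is not shown in the text, so the following is only a sketch of one way to get that format: the Number variant and both impl blocks are assumptions.

```rust
use std::fmt;

// Hypothetical AST types. Only `BinaryOp` and the `OperatorKind` variants
// appear in the grammar; `Number` is an assumed leaf variant.
#[allow(dead_code)]
enum OperatorKind {
    Plus,
    Minus,
    Multiply,
    Divide,
}

#[allow(dead_code)]
enum AstNode {
    Number(i32),
    BinaryOp(Box<AstNode>, OperatorKind, Box<AstNode>),
}

// Print each operator as its source symbol rather than its variant name.
impl fmt::Debug for OperatorKind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let symbol = match self {
            OperatorKind::Plus => "+",
            OperatorKind::Minus => "-",
            OperatorKind::Multiply => "*",
            OperatorKind::Divide => "/",
        };
        f.write_str(symbol)
    }
}

// Print leaves as bare numbers and binary nodes as parenthesized infix text.
impl fmt::Debug for AstNode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AstNode::Number(n) => write!(f, "{}", n),
            AstNode::BinaryOp(left, op, right) => {
                write!(f, "({:?} {:?} {:?})", left, op, right)
            }
        }
    }
}
```

Because Vec and Box forward Debug to their contents, formatting a Vec<Box<AstNode>> with {:?} then produces the bracketed, comma-separated form that the assertions expect.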