The replacement engine isn't as elegant as it could be, which limits its capabilities.
The lexer should persist through the replacement. A state machine decides whether tokens are fed through (with recursive replacement optionally preceding substitution) or re-lexed.
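A rough sketch of that dispatch (all names here are hypothetical, not the engine's actual interface):

#include <iostream>
#include <string>

// Hypothetical sketch of the state machine, not the real engine.
struct Token { std::string spelling; };

enum class Mode { feed_through, relex };

Token replace(Token t) { return t; }                   // stand-in for recursive replacement
void emit(const Token &t) { std::cout << t.spelling; } // substitute into the output
void relex(const std::string &s) { std::cout << s; }   // characters re-enter the live lexer

// Tokens adjacent to ## are re-lexed as characters; everything else is
// replaced recursively and then substituted as-is.
void step(Mode mode, Token tok) {
    switch (mode) {
    case Mode::feed_through: emit(replace(tok)); break;
    case Mode::relex:        relex(tok.spelling); break;
    }
}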
This would allow n-way catenation; for example, this should work:
#define M(STR,SUF) R ## "abc(" ## # STR ## ")abc" ## _ ## SUF
M(,yo) => R"abc(""\"")abc"_yo (and lots of potential diagnostics)
It may be difficult to implement the requisite state queries in the tokenizer to verify that the state is valid after each successive operation. A better approach would be to use two simultaneous lexers, with one performing minimal operations such as (#STR) or (_##SUF) for the sole sake of verification. This could be switched off for a speedup.
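As a sketch (names invented), the verifying lexer sees only the one operation in isolation, so its state is trivial to check:

#include <stdexcept>
#include <string>

struct Lexer {
    void feed(const std::string &chars) { /* lex the characters */ }
    bool in_valid_state() const { return true; } // stand-in for the real check
};

bool verify_replacement = true; // the switch: turn off for a speedup

// The live lexer persists through the whole replacement; the throwaway lexer
// lexes just this one paste by itself, purely for verification.
void paste(Lexer &live, const std::string &lhs, const std::string &rhs) {
    live.feed(lhs);
    live.feed(rhs);
    if (verify_replacement) {
        Lexer check;
        check.feed(lhs);
        check.feed(rhs);
        if (!check.in_valid_state())
            throw std::runtime_error("## did not form a valid token");
    }
}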
The lexer(s) don't need to exist if an operator isn't being processed, so this could be a place to try an unrestricted union.
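A minimal sketch of that storage, assuming C++11 unrestricted unions and hypothetical names:

#include <new>
#include <string>

struct Lexer {
    std::string buffer; // non-trivial member, so the union must be unrestricted
};

// The lexer's storage lives in the union, but the object is only constructed
// while a # or ## operator is actually being processed.
struct OperatorContext {
    bool active = false;
    union { Lexer lexer; };

    OperatorContext() {}               // lexer deliberately left unconstructed
    ~OperatorContext() { end(); }

    void begin() { new (&lexer) Lexer{}; active = true; }
    void end()   { if (active) { lexer.~Lexer(); active = false; } }
};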
The form of input handled by each iteration would be simplified from [x] [##] [#] [y] to (#|x [##]).
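Read (#|x [##]) as: each iteration consumes either a stringize operator or a single token with an optional trailing paste. A toy loop over stand-in string tokens:

#include <deque>
#include <iostream>
#include <string>

// Toy illustration only; real tokens are not strings.
std::deque<std::string> input = { "R", "##", "\"abc(\"", "##", "#", "STR" };

int main() {
    bool stringize_next = false;
    while (!input.empty()) {
        std::string t = input.front(); input.pop_front();
        if (t == "#") {            // the "#" alternative
            stringize_next = true;
            continue;
        }
        bool paste = !input.empty() && input.front() == "##"; // optional [##]
        if (paste) input.pop_front();
        std::cout << (stringize_next ? "stringize " : "token ") << t
                  << (paste ? ", paste follows" : "") << "\n";
        stringize_next = false;
    }
}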
Status: Accepted
Labels:
Type-Enhancement
Priority-Medium
Component-Macros