Introduction

Atlepage makes writing a lexer straightforward. You need to create three types: a Token class, a token type enum (TokenKind in this example), and the lexer itself.

Details

Here we create a simple RPN (reverse Polish notation) calculator.

Your TokenType Enum

The token type enum is just the list of possible token types.

enum TokenKind
{
    NUMBER,
    OP,
}

Your Token Class

For the vast majority of projects, your token class will simply inherit from the Atlepage.Token class.

class Token : Atlepage.Token<Token>
{
}

Atlepage.Token is a generic class that takes the inheriting class as its first type parameter.

Your Tokenizer

The lexer defines a series of rules. There are two types of token rules: class tokens, which are attributes applied to the handler class, and method tokens, which are attributes applied to handler methods.

using Atlepage.Lexical;
[Token(@"( |\t|\n)+")]
[Token(@"\d+", Name = "NUMBER")]
[Token(@"\+|-|\*|/", Name = "OP")]
class LexerHandler {}

or, using method tokens:

using Atlepage.Lexical;
class LexerHandler
{
    // Whitespace: returning null causes the match to be ignored.
    [Token(@"( |\t|\n)+")]
    public Token t_ignored(System.Text.RegularExpressions.Group g, Token t)
    {
        return null;
    }

    // Method name t_NUMBER ties this rule to the NUMBER token type.
    [Token(@"\d+")]
    public Token t_NUMBER(System.Text.RegularExpressions.Group g, Token t)
    {
        return t;
    }

    // Method name t_OP ties this rule to the OP token type.
    [Token(@"\+|-|\*|/")]
    public Token t_OP(System.Text.RegularExpressions.Group g, Token t)
    {
        return t;
    }
}

Token type names are matched either by prefixing the type name with t_ to form the method name (t_NUMBER for NUMBER) or through the optional Name parameter of the Token attribute. If a name does not match a token type, the lexer factory raises an exception. If a class token has no name, or a method token returns null, the match is ignored.

Using your Lexer

All of this is combined using the Atlepage.Lexical.Lexer class.

LexerHandler handler = new LexerHandler();
Lexer<Token> lexer = LexerFactory<Token>.CreateLexer(
    typeof(LexerHandler),
    new GenericEnum<TokenKind>(),
    handler);
lexer.Begin(" 5 5 + 6 * ");
Token t = lexer.Next();
while (t != null)
{
    Console.WriteLine(t);
    t = lexer.Next();
}

The type passed as the first argument is used to generate the regular expressions and the lexer. The GenericEnum class takes your token type enum as its type parameter and maps the enum values to integers. If all of your token rules are class tokens, you can pass null for the handler; otherwise it must be an instance of the type you passed as the first argument.
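
To make the rule set concrete, here is a minimal standalone sketch of what the three token rules amount to, followed by a small stack-based evaluator for the resulting token stream. It deliberately does not use Atlepage: it relies only on System.Text.RegularExpressions, and the names RpnSketch, Tokenize, and Evaluate are hypothetical helpers invented for this illustration. It is not how Atlepage is implemented internally, only an approximation of the behavior described above (whitespace ignored, NUMBER and OP tokens emitted in order).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Stand-in for the TokenKind enum from this page.
enum TokenKind { NUMBER, OP }

static class RpnSketch
{
    // The three patterns from the token rules, combined into one alternation
    // with named groups. \G anchors each match at the current offset, and the
    // "ws" group plays the role of the unnamed (ignored) whitespace rule.
    static readonly Regex Rules = new Regex(
        @"\G(?:(?<ws>( |\t|\n)+)|(?<NUMBER>\d+)|(?<OP>\+|-|\*|/))");

    // Hypothetical helper: yields (kind, text) pairs and skips whitespace.
    public static IEnumerable<KeyValuePair<TokenKind, string>> Tokenize(string input)
    {
        int pos = 0;
        while (pos < input.Length)
        {
            Match m = Rules.Match(input, pos);
            if (!m.Success)
                throw new FormatException("Unexpected character at offset " + pos);
            pos += m.Length;
            if (m.Groups["NUMBER"].Success)
                yield return new KeyValuePair<TokenKind, string>(TokenKind.NUMBER, m.Value);
            else if (m.Groups["OP"].Success)
                yield return new KeyValuePair<TokenKind, string>(TokenKind.OP, m.Value);
            // A whitespace match yields nothing, like an ignored token.
        }
    }

    // Hypothetical helper: evaluates the token stream as an RPN expression.
    public static long Evaluate(string input)
    {
        var stack = new Stack<long>();
        foreach (var tok in Tokenize(input))
        {
            if (tok.Key == TokenKind.NUMBER)
            {
                stack.Push(long.Parse(tok.Value));
                continue;
            }
            if (stack.Count < 2)
                throw new FormatException("Operator '" + tok.Value + "' needs two operands");
            long right = stack.Pop();
            long left = stack.Pop();
            switch (tok.Value)
            {
                case "+": stack.Push(left + right); break;
                case "-": stack.Push(left - right); break;
                case "*": stack.Push(left * right); break;
                case "/": stack.Push(left / right); break;
            }
        }
        if (stack.Count != 1)
            throw new FormatException("Expression does not reduce to a single value");
        return stack.Pop();
    }

    static void Main()
    {
        // Same input as the lexer example: (5 + 5) * 6
        Console.WriteLine(Evaluate(" 5 5 + 6 * ")); // prints 60
    }
}
```

In a real project the Tokenize step would be replaced by the Atlepage lexer loop shown above, with the evaluator consuming the Token objects returned by lexer.Next(). How a token exposes its type and matched text is not covered on this page, so that part is left out of the sketch.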