A simple & fast library to create parsers using the parser-combinators approach. Please provide your feedback or suggestions here - https://docs.google.com/forms/d/e/1FAIpQLScIt6dT7t1pxTjeUO5iHYagSPs-f6VPuWNtnCsLl32qYJf5Ig/viewform?usp=sf_link
$ dotnet add package AlphaX.Parserz
A strong & fast .NET Standard parser combinator library for creating simple or complex parsers. This library is being actively developed.
GitHub Repo : https://github.com/kartikdeepsagar/AlphaX.Parserz
In this library, a parser is represented by the following IParser interface:
public interface IParser
{
    IParserState Run(string input);
    IParserState Parse(IParserState inputState);
}
Run - Takes a string input and tries to parse it according to the logic implemented by the parser. (Internally calls the Parse method.)
Parse - Parse method takes an input state and returns an output (success/failure) state.
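To make the relationship between the two methods concrete, here is a minimal, self-contained sketch. It does not use the library: SketchState, SketchParser and SketchDigitParser are hypothetical stand-ins, and both the "start at index 0" step and the "pass an incoming failure through unchanged" step are assumptions about how such a design typically works, not a description of the library's internals.

```csharp
using System;

// Hypothetical stand-in for a parser state; NOT the library's IParserState.
public sealed class SketchState
{
    public SketchState(string actualInput, int index, object value = null, string error = null)
    {
        ActualInput = actualInput;
        Index = index;
        Value = value;
        Error = error;
    }

    public string ActualInput { get; }
    public int Index { get; }
    public object Value { get; }
    public string Error { get; }
    public bool IsError => Error != null;

    // Assumption: the remaining input is everything from Index onward.
    public string Input => ActualInput.Substring(Index);
}

// Hypothetical stand-in for a parser base class; NOT the library's Parser.
public abstract class SketchParser
{
    // Run wraps the raw string in an initial state and delegates to Parse.
    public SketchState Run(string input) =>
        Parse(new SketchState(input ?? string.Empty, 0));

    // Parse passes an incoming failure through unchanged (an assumption),
    // otherwise it hands the state to the concrete parser.
    public SketchState Parse(SketchState inputState) =>
        inputState.IsError ? inputState : ParseInput(inputState);

    protected abstract SketchState ParseInput(SketchState inputState);
}

public sealed class SketchDigitParser : SketchParser
{
    protected override SketchState ParseInput(SketchState state)
    {
        var rest = state.Input;
        if (rest.Length > 0 && rest[0] >= '0' && rest[0] <= '9')
        {
            // Success: keep the value and advance the index by one character.
            return new SketchState(state.ActualInput, state.Index + 1, (byte)(rest[0] - '0'));
        }

        // Failure: the index stays where the problem was found.
        return new SketchState(state.ActualInput, state.Index, null,
            $"Position ({state.Index}): expected '0-9'");
    }
}
```

Calling new SketchDigitParser().Run("7a") in this sketch yields a success state with Index 1 and remaining Input "a"; running it on "a" yields a failure state at Index 0.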
The input/output parser state is represented by the following IParserState interface:
public interface IParserState : ICloneable<IParserState>
{
    int Index { get; set; }
    string ActualInput { get; set; }
    string Input { get; }
    bool IsError { get; }
    IParserResult Result { get; set; }
    IParserError Error { get; set; }
}
Index - Index in the input from which parsing will start.
ActualInput - The original input passed to the initial parser.
Input - Input for the next parser.
IsError - Indicates whether the parser state is a failure state.
Result - The result of the state (IParserResult).
Error - The error of the state (IParserError).
IParserResult/IParserResult<T> interfaces
public interface IParserResult
{
    object Value { get; }
}
public interface IParserResult<T> : IParserResult
{
    new T Value { get; }
}
Value - Result value.
IParserError interface
public interface IParserError
{
    int Index { get; }
    string Message { get; }
}
Index - Index in the input where the error occurred.
Message - Error message with failure information.
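Since an error carries both an index and a message, the two can be combined into a readable report. The following is a small sketch of that idea; ErrorReportSketch and Describe are hypothetical names introduced here for illustration and are not part of the library. It assumes a single-line input.

```csharp
using System;

// Hypothetical helper for illustration; not part of the library.
public static class ErrorReportSketch
{
    // Builds a three-line report: the message, the original input,
    // and a caret placed under the character at the error index.
    public static string Describe(string actualInput, int errorIndex, string message)
    {
        // Clamp the index so an out-of-range value cannot throw.
        var column = Math.Max(0, Math.Min(errorIndex, actualInput.Length));

        return message + Environment.NewLine
             + actualInput + Environment.NewLine
             + new string(' ', column) + "^";
    }
}
```

With a failed state from the library, the three arguments would presumably come from state.ActualInput, state.Error.Index and state.Error.Message, as declared in the interfaces above.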
Create a DigitParser class by inheriting from the AlphaX.Parserz.Parser<T> class and overriding its ParseInput method as follows:
public class DigitParser : Parser<ByteResult>
{
    protected override IParserState ParseInput(IParserState inputState)
    {
        var targetString = inputState.Input;

        // Nothing left to parse: report an error at the current index.
        if (string.IsNullOrEmpty(targetString))
            return ParserStates.Error(inputState, new ParserError(inputState.Index,
                string.Format(ParserMessages.UnexpectedInputError, inputState.Index, ParserMessages.Digits, targetString)));

        var character = targetString[0];

        // Accept ASCII '0'-'9' only. char.IsDigit also matches other Unicode digits,
        // for which (character - '0') would not fit in a byte.
        if (character >= '0' && character <= '9')
        {
            // Success: store the digit and advance the index by one character.
            return ParserStates.Result(inputState, new ByteResult(Convert.ToByte(character - '0')), inputState.Index + 1);
        }

        return ParserStates.Error(inputState, new ParserError(inputState.Index,
            string.Format(ParserMessages.UnexpectedInputError, inputState.Index, ParserMessages.Digits, targetString)));
    }
}
It's that simple! :-)
This library provides some built-in parsers to make your work easier. You can combine these built-in parsers into more complex parsers, or create your own parsers from scratch.
public static class Parser
{
    public static IParser<ByteResult> Digit { get; }
    public static IParser<DoubleResult> Decimal { get; }
    public static IParser LetterOrDigit { get; }
    public static IParser<BooleanResult> Boolean { get; }
    ...
    static Parser()
    {
        Digit = new DigitParser();
        ...
    }
}
Let's look at some examples to get a head start.
// Parse a single digit
var digitState = Parser.Digit.Run("1");
// Parse between one and three digits
int minimumCount = 1;
int maximumCount = 3;
var threeDigitParser = Parser.Digit.Many(minimumCount, maximumCount);
var threeDigitState = threeDigitParser.Run("874");
The code above uses the Many extension method. It simply returns a new ManyParser, which applies the given parser to the input repeatedly, between the specified minimum and maximum number of times.
public static IParser<ArrayResult> Many(this IParser parser, int minCount = 0, int maxCount = -1)
{
    return new ManyParser(parser, minCount, maxCount);
}
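The repetition idea behind Many can be sketched without the library. The block below is only an illustration under stated assumptions: ManySketch and its delegate P are hypothetical, a parser is modelled as a plain function, maxCount = -1 is taken to mean "no upper bound", and a failed repetition rewinds to the starting index. The real ManyParser may differ on any of these points.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, library-independent sketch of a "many" combinator.
public static class ManySketch
{
    // A parser is modelled as a function: (input, index) -> (success, value, next index).
    public delegate (bool Ok, object Value, int Next) P(string input, int index);

    // Parses one ASCII digit.
    public static readonly P Digit = (input, i) =>
        i < input.Length && input[i] >= '0' && input[i] <= '9'
            ? (true, (object)(byte)(input[i] - '0'), i + 1)
            : (false, (object)null, i);

    // Applies 'parser' repeatedly, collecting values until it fails or maxCount is reached.
    // Assumption: maxCount < 0 means unbounded.
    public static P Many(P parser, int minCount = 0, int maxCount = -1) => (input, start) =>
    {
        var values = new List<object>();
        var index = start;

        while (maxCount < 0 || values.Count < maxCount)
        {
            var (ok, value, next) = parser(input, index);

            // Stop on failure, or when nothing was consumed (avoids an endless loop).
            if (!ok || next == index) break;

            values.Add(value);
            index = next;
        }

        // Too few matches: fail and rewind to where we started.
        return values.Count >= minCount
            ? (true, (object)values, index)
            : (false, (object)null, start);
    };
}
```

In this sketch, ManySketch.Many(ManySketch.Digit, 1, 3) applied to "8745" at index 0 consumes three digits and stops at index 3, while the same parser applied to "abc" fails because the minimum of one match is not met.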
Similarly, you can combine small parsers to make a more complex parser. For example, you can create a basic email (gmail/microsoft) parser as follows:
var atParser = AlphaX.Parserz.Parser.String("@");
var dotParser = AlphaX.Parserz.Parser.String(".");
var comParser = AlphaX.Parserz.Parser.String("com");
var gmailParser = AlphaX.Parserz.Parser.String("gmail");
var microsoftParser = AlphaX.Parserz.Parser.String("microsoft");

// username parser to parse names starting with letters and then containing letters/digits
var userNameParser = AlphaX.Parserz.Parser.Letter.Many()
    .AndThen(Parser.LetterOrDigit.Many())
    .MapResult(x => x.ToStringResult()); // converting to string result

// domain parser, for example: @gmail.com
var domainParser = atParser
    .AndThen(gmailParser.Or(microsoftParser))
    .AndThen(dotParser)
    .AndThen(comParser)
    .MapResult(x => x.ToStringResult());

var emailParser = userNameParser.AndThen(domainParser)
    .MapResult(x => new EmailResult(new Email()
    {
        UserName = (string)x.Value[0].Value,
        Domain = (string)x.Value[1].Value
    }));
And the EmailResult class is defined as follows:
public class Email
{
    public string UserName { get; set; }
    public string Domain { get; set; }
}

public class EmailResult : ParserResult<Email>
{
    // specifies the type of result
    public static ParserResultType EmailResultType = new ParserResultType("email");

    public EmailResult(Email email) : base(email, EmailResultType)
    {
    }
}
And this is how we can use the parser:
var result = emailParser.Run("testuser@gmail.com");
var email = result.Result as EmailResult;
Console.WriteLine(JsonConvert.SerializeObject(email.Value)); // {"UserName":"testuser","Domain":"@gmail.com"}
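For readers new to combinators, the sequencing ("and then") and choice ("or") used in the email example can be sketched in a few lines without the library. Everything below is hypothetical and simplified: CombinatorSketch, P and Str are names made up for this sketch, parsers are plain functions, and sequencing concatenates strings instead of producing the array results the library uses.

```csharp
using System;

// Hypothetical, library-independent sketch of sequence and choice combinators.
public static class CombinatorSketch
{
    // A parser is modelled as a function: (input, index) -> (success, matched text, next index).
    public delegate (bool Ok, string Value, int Next) P(string input, int index);

    // Matches an exact literal at the current index.
    public static P Str(string literal) => (input, i) =>
        i <= input.Length && string.CompareOrdinal(input, i, literal, 0, literal.Length) == 0
            ? (true, literal, i + literal.Length)
            : (false, (string)null, i);

    // Sequence: run 'first', then 'second' from where 'first' stopped.
    // If either fails, the whole sequence fails at the original index.
    public static P AndThen(this P first, P second) => (input, i) =>
    {
        var a = first(input, i);
        if (!a.Ok) return (false, (string)null, i);

        var b = second(input, a.Next);
        return b.Ok
            ? (true, a.Value + b.Value, b.Next)
            : (false, (string)null, i);
    };

    // Choice: try 'first'; if it fails, try 'second' from the same index.
    public static P Or(this P first, P second) => (input, i) =>
    {
        var a = first(input, i);
        return a.Ok ? a : second(input, i);
    };

    // Mirrors the shape of the domain parser above: "@" + ("gmail" | "microsoft") + "." + "com".
    public static readonly P Domain =
        Str("@")
            .AndThen(Str("gmail").Or(Str("microsoft")))
            .AndThen(Str("."))
            .AndThen(Str("com"));
}
```

In this sketch, CombinatorSketch.Domain("user@gmail.com", 4) succeeds with the value "@gmail.com", whereas an input such as "@yahoo.com" fails because neither branch of the choice matches.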
This library also allows you to trace parser steps using AlphaX.Parserz.Tracing.ParserTracer.
To use parser tracing, set the Enabled property to true (it is false by default).
All traces can be retrieved using the GetTrace method as follows:
ParserTracer.Enabled = true;
var result = emailParser.Run("emailparser1@alphax.com");
Console.WriteLine(string.Join(Environment.NewLine, ParserTracer.GetTrace()));
ParserTracer.Reset();
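This is what the trace looks like (note that it was produced by an email parser whose domain parser matches "alphax" rather than "gmail" or "microsoft"):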
LetterParser > Parsing 'emailparser1@alphax.com' at index '0'
LetterParser > Parsed & left with 'mailparser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '1'
LetterParser > Parsed & left with 'ailparser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '2'
LetterParser > Parsed & left with 'ilparser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '3'
LetterParser > Parsed & left with 'lparser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '4'
LetterParser > Parsed & left with 'parser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '5'
LetterParser > Parsed & left with 'arser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '6'
LetterParser > Parsed & left with 'rser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '7'
LetterParser > Parsed & left with 'ser1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '8'
LetterParser > Parsed & left with 'er1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '9'
LetterParser > Parsed & left with 'r1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '10'
LetterParser > Parsed & left with '1@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '11'
LetterParser > Parse Failed. Position (11): Unexpected input. Expected 'a-z/A-Z' but got '1'
DigitParser > Parsing 'emailparser1@alphax.com' at index '11'
DigitParser > Parsed & left with '@alphax.com'
LetterParser > Parsing 'emailparser1@alphax.com' at index '12'
LetterParser > Parse Failed. Position (12): Unexpected input. Expected 'a-z/A-Z' but got '@'
DigitParser > Parsing 'emailparser1@alphax.com' at index '12'
DigitParser > Parse Failed. Position (12): Unexpected input. Expected '0-9' but got '@alphax.com'
StringParser("@") > Parsing 'emailparser1@alphax.com' at index '12'
StringParser("@") > Parsed & left with 'alphax.com'
StringParser("alphax") > Parsing 'emailparser1@alphax.com' at index '13'
StringParser("alphax") > Parsed & left with '.com'
StringParser(".") > Parsing 'emailparser1@alphax.com' at index '19'
StringParser(".") > Parsed & left with 'com'
StringParser("com") > Parsing 'emailparser1@alphax.com' at index '20'
StringParser("com") > Parsed & left with ''
Note: ParserTracer is a singleton, so it is shared by all parsers. Always clear the previous trace with the Reset method before running another parser.
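One way to make the reset hard to forget is to wrap each traced run in a helper that clears the shared trace before and after. The sketch below illustrates the pattern with a hypothetical stand-in tracer (SketchTracer) so that it is self-contained; TraceHelper and Capture are likewise names invented for this example, not library members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a shared, singleton-style tracer; NOT the library's ParserTracer.
public static class SketchTracer
{
    private static readonly List<string> Entries = new List<string>();

    public static bool Enabled { get; set; }

    public static void Write(string message)
    {
        if (Enabled) Entries.Add(message);
    }

    // Returns a copy, so a later Reset does not affect the caller's snapshot.
    public static IReadOnlyList<string> GetTrace() => Entries.ToArray();

    public static void Reset() => Entries.Clear();
}

public static class TraceHelper
{
    // Runs one action with tracing enabled and returns only that run's trace.
    public static IReadOnlyList<string> Capture(Action parse)
    {
        SketchTracer.Reset();                 // drop anything left over from earlier runs
        var previous = SketchTracer.Enabled;
        SketchTracer.Enabled = true;
        try
        {
            parse();
            return SketchTracer.GetTrace();
        }
        finally
        {
            // Runs even if 'parse' throws, so the shared state is always restored.
            SketchTracer.Enabled = previous;
            SketchTracer.Reset();
        }
    }
}
```

With the library, the same shape would presumably use ParserTracer.Enabled, ParserTracer.GetTrace() and ParserTracer.Reset() as shown earlier, with a call such as emailParser.Run(...) inside the action.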
Stay tuned for future updates. That's all for now. Thank you!
Feedback is very much appreciated: https://forms.gle/SUqd5Ewqep62mP428