Readme

ClosedParser

ClosedParser is a project that parses formulas according to the OOXML grammar and creates an abstract syntax tree that can be evaluated later.

The official source for the grammar is MS-XLSX, chapter 2.2.2 Formulas. The provided grammar is not usable for parser generators: it is full of ambiguities, and its rules do not take operator precedence into account.

How to use

Implement the IAstFactory interface.
Call one of the parsing methods:
- FormulaParser<TScalarValue, TNode>.CellFormulaA1("Sum(A1, 2)", astFactory)
- FormulaParser<TScalarValue, TNode>.CellFormulaR1C1("Sum(R1C1, 2)", astFactory)

Visualizer

A visualizer that displays the AST in a browser is available at https://parser.closedxml.io

Goals

Performance - ClosedXML needs to parse formulas really fast, so the parser should limit allocations and similar overhead.
Evaluation oriented - The parser should concentrate on creating an abstract syntax tree, not a concrete syntax tree. The goal is evaluation of formulas, not transformation.
Multi-use - Formulas are mostly used in cells, but there are other places with different grammar rules (e.g. sparklines, data validation).
Multi notation (A1 or R1C1) - The parser should be able to parse both A1 and R1C1 formulas. For example, SUM(R5) returns the sum of cell R5 in A1 notation, but the sum of all cells in row 5 in R1C1 notation.
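
The A1 versus R1C1 ambiguity described in the last goal can be sketched in a few lines of C#. This is only an illustration under simplifying assumptions: NotationDemo and Describe are hypothetical names that are not part of ClosedParser, and the regular expressions cover only plain A1 cell references and absolute R1C1 row references.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, NOT part of ClosedParser. It only shows why the same
// token needs a notation-specific interpretation.
static class NotationDemo
{
    // A1 style: one to three column letters followed by a row number, e.g. "R5".
    static readonly Regex A1Cell = new Regex(@"^([A-Za-z]{1,3})([1-9][0-9]*)$");

    // R1C1 style, absolute row only: "R" followed by a row number, e.g. "R5".
    static readonly Regex R1C1Row = new Regex(@"^[Rr]([1-9][0-9]*)$");

    // Returns a human-readable description of the token in the chosen notation.
    public static string Describe(string token, bool r1c1)
    {
        if (r1c1)
        {
            Match row = R1C1Row.Match(token);
            return row.Success
                ? $"whole row {row.Groups[1].Value}"
                : "not a row reference";
        }

        Match cell = A1Cell.Match(token);
        return cell.Success
            ? $"cell in column {cell.Groups[1].Value.ToUpperInvariant()}, row {cell.Groups[2].Value}"
            : "not a cell reference";
    }

    static void Main()
    {
        Console.WriteLine(Describe("R5", r1c1: false)); // cell in column R, row 5
        Console.WriteLine(Describe("R5", r1c1: true));  // whole row 5
    }
}
```

The token text is identical in both calls; only the notation flag changes the result, which is why the notation has to be chosen before parsing rather than inferred from the formula text.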

The project uses an ANTLR4 grammar file as the source of truth and a lexer. The ANTLR parser itself is not used directly, but serves as the basis of a recursive descent parser (ANTLR takes 8 seconds vs. 700 ms for the recursive descent parser when parsing the Enron dataset).

ANTLR4 is one of the few maintained parser generators with a C# target.

ClosedXML/ClosedXML.Parser v1.3.0
