Updated Apr 21, 2009 by braden.mcdaniel
Labels: Phase-Deploy, Featured
About uri_grammar

Introduction

I wrote uri_grammar both to get acquainted with Spirit and to parse URIs in OpenVRML. uri_grammar is fairly small and parses something quite common, so it should also serve as a useful demonstration of some of what Spirit can do.

A tour of uri_grammar's design

Spirit is very flexible. A side effect is that the options for designing a grammar can be overwhelming to someone just getting acquainted with it. This tour of the design decisions behind uri_grammar is meant to help.

Factoring the grammar

Spirit grammars are arguably most attractive aesthetically when they look most like BNF. That typically means putting every rule in a single monolithic grammar capsule. There are, however, performance and usability reasons that make this an unattractive design strategy in general.

In uri_grammar's case, factoring the grammar into separately usable parts was desirable so that users could parse either all kinds of URIs or exclusively absolute URIs. As such, we find the main grammar capsule, uri_grammar, using another grammar capsule for parsing absolute URIs, absolute_uri_grammar:

template <typename Actions = null_actions>
struct uri_grammar : public boost::spirit::grammar<uri_grammar<Actions> > {

    template <typename ScannerT>
    struct definition {
        typedef boost::spirit::rule<ScannerT> rule_type;

        rule_type uri_reference;
        absolute_uri_grammar<Actions> absolute_uri;
        rule_type relative_uri;
        ⋮
        explicit definition(const uri_grammar & self):
            ⋮
        {
            ⋮
            uri_reference
                =   !(absolute_uri | relative_uri) >> !('#' >> fragment)
                ;
            ⋮
        }
        ⋮
    };
    ⋮
};

When everything is in a monolithic grammar capsule, rules you define are readily reusable anywhere within the grammar capsule. But once you start breaking things up, you find that parts of your grammar that need to be used in two or more of your new “subgrammars” also need to be factored out such that they can be used independently. This requirement yields the additional grammar capsules uri_abs_path_grammar and uri_authority_grammar, each of which gets used both in uri_grammar and absolute_uri_grammar.

Semantic actions

Semantic actions can participate in the parsing process or they can simply transmit parsed data to the rest of your application. Embedding the latter class of semantic actions in the grammar couples the grammar to the application; it is no longer a piece of generally reusable code. So the grammar needs a way for the user to supply these actions.

The grammar capsules uri_grammar, absolute_uri_grammar, uri_abs_path_grammar, and uri_authority_grammar aren't actually classes; rather, they are class templates. The template parameter is called Actions.
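
The decoupling can be illustrated without Spirit. The following is a minimal sketch in plain C++, not uri_grammar's code: toy_splitter is a hypothetical stand-in for a grammar capsule that only splits a string at the first ':', and scheme_collector is a hypothetical application-side class. The point is only that the "parser" never mentions the application; it calls whatever the Actions type provides.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a grammar capsule (not part of uri_grammar).
// It knows how to find the pieces; what happens to them is up to Actions.
template <typename Actions>
class toy_splitter {
    Actions actions_;
public:
    explicit toy_splitter(const Actions & actions = Actions()):
        actions_(actions)
    {}

    // Splits "scheme:rest" and reports both pieces; false if ':' is missing.
    bool parse(const std::string & uri) const
    {
        const std::string::size_type colon = uri.find(':');
        if (colon == std::string::npos) { return false; }
        actions_.scheme(uri.begin(), uri.begin() + colon);
        actions_.scheme_specific_part(uri.begin() + colon + 1, uri.end());
        return true;
    }
};

// Hypothetical application-side Actions: collects every scheme it is given
// and ignores the scheme-specific part.
struct scheme_collector {
    struct append {
        std::vector<std::string> * out = nullptr;  // null means "discard"

        template <typename Iterator>
        void operator()(const Iterator & first, const Iterator & last) const
        {
            if (out) { out->push_back(std::string(first, last)); }
        }
    };
    struct ignore {
        template <typename Iterator>
        void operator()(const Iterator &, const Iterator &) const
        {}
    };

    append scheme;
    ignore scheme_specific_part;
};

void toy_splitter_demo()
{
    std::vector<std::string> seen;
    scheme_collector actions;
    actions.scheme.out = &seen;

    const toy_splitter<scheme_collector> splitter(actions);
    const bool ok1 = splitter.parse("http://example.org/");
    const bool ok2 = splitter.parse("mailto:someone@example.org");
    const bool ok3 = splitter.parse("no-scheme-here");

    assert(ok1 && ok2 && !ok3);
    assert(seen.size() == 2 && seen[0] == "http" && seen[1] == "mailto");
}
```

Swapping scheme_collector for a different class changes what the application receives without touching toy_splitter, which is the property the Actions parameter is meant to give the real grammar capsules.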

The Actions concept

Actions is a concept (in the same sense that BidirectionalIterator and Sequence are concepts in the C++ standard library; see “Concepts and Modeling” in the Introduction to the Standard Template Library documentation). You model this concept with a class that you provide as the template argument to uri_grammar (or absolute_uri_grammar).

Refinement of

Assignable, DefaultConstructible

Valid expressions

For an object a that is an instance of a type that models Actions, the following operations must be supported:

Name | Expression | Type requirements | Return type
scheme action | a.scheme(first, last) | first and last are InputIterators | void
scheme-specific part action | a.scheme_specific_part(first, last) | first and last are InputIterators | void
user info action | a.userinfo(first, last) | first and last are InputIterators | void
host action | a.host(first, last) | first and last are InputIterators | void
port action | a.port(first, last) | first and last are InputIterators | void
authority action | a.authority(first, last) | first and last are InputIterators | void
path action | a.path(first, last) | first and last are InputIterators | void
query action | a.query(first, last) | first and last are InputIterators | void
fragment action | a.fragment(first, last) | first and last are InputIterators | void
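
As an illustration of what a model might look like, here is a minimal sketch of a class that supports the nine expressions and copies each matched range into a string. recording_actions, record, and uri_parts are hypothetical names, not part of uri_grammar; the member names follow those used by null_actions. The sketch assumes the grammar may copy the Actions object, so the results live in a separate uri_parts owned by the application, and a default-constructed recording_actions simply discards everything. The iterator ranges in the demo are picked by hand; in real use the grammar would supply them.

```cpp
#include <cassert>
#include <string>

// Where the parsed pieces end up; owned by the application.
struct uri_parts {
    std::string scheme, scheme_specific_part, userinfo, host, port,
        authority, path, query, fragment;
};

// Hypothetical model of Actions: each member is a function object that
// copies the matched range [first, last) into one field of a uri_parts.
class recording_actions {
public:
    class record {
        std::string * target_;  // null when default-constructed
    public:
        explicit record(std::string * target = nullptr): target_(target) {}

        template <typename Iterator>
        void operator()(const Iterator & first, const Iterator & last) const
        {
            if (target_) { target_->assign(first, last); }
        }
    };

    record scheme, scheme_specific_part, userinfo, host, port,
        authority, path, query, fragment;

    // DefaultConstructible: every action discards its input.
    recording_actions() = default;

    explicit recording_actions(uri_parts & parts):
        scheme(&parts.scheme),
        scheme_specific_part(&parts.scheme_specific_part),
        userinfo(&parts.userinfo),
        host(&parts.host),
        port(&parts.port),
        authority(&parts.authority),
        path(&parts.path),
        query(&parts.query),
        fragment(&parts.fragment)
    {}
};

void recording_actions_demo()
{
    uri_parts parts;
    const recording_actions a(parts);

    // Hand-picked ranges standing in for what a grammar would report.
    const std::string text = "http://www.example.org:8080/index.html";
    a.scheme(text.begin(), text.begin() + 4);
    a.host(text.begin() + 7, text.begin() + 22);
    a.port(text.begin() + 23, text.begin() + 27);
    a.path(text.begin() + 27, text.end());

    assert(parts.scheme == "http");
    assert(parts.host == "www.example.org");
    assert(parts.port == "8080");
    assert(parts.path == "/index.html");
}
```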

null_actions

null_actions is the simplest possible model of Actions:

class null_actions {
public:
    struct null_action {
        template <typename Iterator>
        void operator()(const Iterator &, const Iterator &) const
        {}
    };

    null_action scheme, scheme_specific_part, userinfo, host, port,
        authority, path, query, fragment;
};

null_actions doesn't actually do anything; it is provided as a convenience so that users don't have to write one of these every time they just want to instantiate a uri_grammar without any semantic actions (typically for testing).
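
null_actions can also serve as a starting point when only one or two components matter. The following is a sketch of one possible approach, not something uri_grammar prescribes: the hypothetical host_only_actions derives from null_actions and declares its own host member, which hides the inherited no-op of the same name. null_actions is repeated from the listing above so that the example compiles on its own; store_host is likewise a hypothetical name.

```cpp
#include <cassert>
#include <string>

// null_actions as listed above, repeated so this sketch is self-contained.
class null_actions {
public:
    struct null_action {
        template <typename Iterator>
        void operator()(const Iterator &, const Iterator &) const
        {}
    };

    null_action scheme, scheme_specific_part, userinfo, host, port,
        authority, path, query, fragment;
};

// Hypothetical: keep the eight inherited no-ops and replace only host.
struct host_only_actions : null_actions {
    struct store_host {
        std::string * out = nullptr;  // null means "discard the match"

        template <typename Iterator>
        void operator()(const Iterator & first, const Iterator & last) const
        {
            if (out) { out->assign(first, last); }
        }
    };

    store_host host;  // hides null_actions::host
};

void host_only_demo()
{
    std::string host;
    host_only_actions actions;
    actions.host.out = &host;

    // Hand-picked ranges standing in for what a grammar would report.
    const std::string text = "www.example.org:8080";
    actions.host(text.begin(), text.begin() + 15);  // recorded
    actions.port(text.begin() + 16, text.end());    // inherited no-op

    assert(host == "www.example.org");
}
```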

