A Scala library for language processing.
See also Parsing and Parser combinators for general information about defining and using parser combinators.
Most parsers for use with Kiama will operate on character input. For the simplest usage, extend the following class:
org.bitbucket.inkytonik.kiama.parsing.Parsers
The Parsers class gives you backtracking parsers that
memoise their results, so they allow unlimited lookahead
while keeping running time linear in the input size. They
also allow left-recursive productions to be encoded directly.
Parsers uses Vector values to record repeated constructs;
use ListParsers if you prefer to use List values for this
purpose.
The Parsers class uses a Positions data structure to keep
track of the input positions of parsed values. You should
provide one when you extend the class.
class SyntaxAnalysis (positions : Positions) extends Parsers(positions) {
... parser definitions ...
}
Thus, the usual approach is for the code that creates the parser to supply the positions argument to SyntaxAnalysis, as follows.
import org.bitbucket.inkytonik.kiama.util.Positions
val positions = new Positions
val parsers = new SyntaxAnalysis (positions)
A parser processes sources of type org.bitbucket.inkytonik.kiama.util.Source.
The most common type of source is a FileSource that obtains its input from a file.
import org.bitbucket.inkytonik.kiama.util.FileSource
val source = FileSource ("file.txt")
The class StringSource is also available so that a string can be used instead of a file.
Parser input is then a source combined with an offset that records the current parsing position.
case class Input (source : Source, offset : Int)
A parser that returns a value of type T on success is of type Parser[T]
which extends the function type Input => ParseResult[T].
ParseResult has sub-classes to represent the possible outcomes of
a parse.
case class Success[T] (result : T, next : Input) extends ParseResult[T]
abstract class NoSuccess (message : String, next : Input) extends ParseResult[Nothing]
case class Failure (message : String, next : Input) extends NoSuccess (message, next)
case class Error (message : String, next : Input) extends NoSuccess (message, next)
In each case, the next parameter represents the input remaining after
the parse. Success.result is the value produced by the successful
parse, whereas the message field holds the message resulting from a failed
parse. A Failure represents a failure that can backtrack to retry
earlier parses, whereas an Error represents a final error.
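The difference between Failure and Error can be sketched with a small standalone model. This is an illustration only, not Kiama's source: Input is simplified to a string plus an offset, and the helpers lit, commit and alt are hypothetical names introduced for the example.

```scala
// Standalone model of the result classes described above; not Kiama's code.
object ResultModel {
  // Simplification: the source is a plain String rather than a Kiama Source.
  case class Input(source: String, offset: Int)

  sealed abstract class ParseResult[+T]
  case class Success[+T](result: T, next: Input) extends ParseResult[T]
  sealed abstract class NoSuccess(val message: String, val next: Input)
    extends ParseResult[Nothing]
  case class Failure(override val message: String, override val next: Input)
    extends NoSuccess(message, next)
  case class Error(override val message: String, override val next: Input)
    extends NoSuccess(message, next)

  type Parser[T] = Input => ParseResult[T]

  // Hypothetical helper: succeed if the input continues with the string s.
  def lit(s: String): Parser[String] = in =>
    if (in.source.startsWith(s, in.offset))
      Success(s, Input(in.source, in.offset + s.length))
    else
      Failure(s"'$s' expected", in)

  // Hypothetical helper: turn a Failure of p into a final Error.
  def commit[T](p: Parser[T]): Parser[T] = in =>
    p(in) match {
      case Failure(msg, next) => Error(msg, next)
      case other              => other
    }

  // Alternation: only a Failure lets q retry from the same input.
  def alt[T](p: Parser[T], q: Parser[T]): Parser[T] = in =>
    p(in) match {
      case _: Failure => q(in)
      case other      => other
    }
}
```

In this model, alt (lit ("a"), lit ("b")) applied to the input "b" backtracks and succeeds with the second alternative, whereas alt (commit (lit ("a")), lit ("b")) stops with the Error from the first alternative.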
Since a parser is just a function, it can be applied directly to an
input to get a result: p (in) for a parser p and input in.
Invocation of a parser is also encapsulated by the parse function
provided by Parsers, which begins parsing at the start of a source:
def parse[T] (p : Parser[T], source : Source) : ParseResult[T]
It is often useful to make sure that a parser consumes all of its
input. The phrase combinator returns a parser that recognises what
its argument parser does, but only succeeds if there is no input
remaining.
def phrase[T] (p : Parser[T]) : Parser[T]
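The effect of phrase can be illustrated with a simplified standalone model in which a parser is a plain function from input to result. This is a sketch under stated assumptions, not Kiama's implementation: the input is a plain string, whitespace handling is ignored, and the digits parser and the error message text are made up for the example.

```scala
// Standalone model of direct parser application and of phrase; not Kiama's code.
object PhraseModel {
  case class Input(source: String, offset: Int) {
    // True when no characters remain after the offset.
    def atEnd: Boolean = offset >= source.length
  }

  sealed abstract class ParseResult[+T]
  case class Success[+T](result: T, next: Input) extends ParseResult[T]
  case class Failure(message: String, next: Input) extends ParseResult[Nothing]

  type Parser[T] = Input => ParseResult[T]

  // Hypothetical parser: one or more decimal digits.
  val digits: Parser[String] = in => {
    val matched = in.source.drop(in.offset).takeWhile(_.isDigit)
    if (matched.nonEmpty)
      Success(matched, Input(in.source, in.offset + matched.length))
    else
      Failure("digits expected", in)
  }

  // Behaves like p, except that a success which leaves input unread
  // is turned into a failure at the point where p stopped.
  def phrase[T](p: Parser[T]): Parser[T] = in =>
    p(in) match {
      case Success(_, next) if !next.atEnd =>
        Failure("end of input expected", next)
      case other => other
    }
}
```

Here digits (Input ("12ab", 0)) succeeds and leaves "ab" unread, while phrase (digits) applied to the same input fails because input remains.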
phrase is used by parseAll, which has the same
signature as parse but requires that the entire input be consumed
by a successful parse. Thus, a typical invocation of a parser p on a source s is:
parseAll (p, s) match {
case Success (e, _) =>
println ("successful parse: " + e)
case f =>
println (f)
}
In the Success case e is the value created by the parse.
Note that the printable representation of a Failure includes the
location of the failure, so it is almost always better to print the
entire Failure than just its message.
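Putting the pieces above together, a complete program might look like the following sketch. It assumes the Kiama library is on the classpath, that Success is imported from the parsing package, and that the regex and rep1 combinators and the default whitespace skipping behave as in the Parser combinators documentation. The grammar (whitespace-separated integers) and the names number, numbers and Main are hypothetical.

```scala
import org.bitbucket.inkytonik.kiama.parsing.{Parsers, Success}
import org.bitbucket.inkytonik.kiama.util.{Positions, StringSource}

// Hypothetical grammar: one or more integers separated by whitespace.
class SyntaxAnalysis(positions: Positions) extends Parsers(positions) {
  // A run of digits converted to an Int.
  lazy val number: Parser[Int] = regex("[0-9]+".r) ^^ (_.toInt)
  // Parsers records repetitions as Vector values.
  lazy val numbers: Parser[Vector[Int]] = rep1(number)
}

object Main {
  def main(args: Array[String]): Unit = {
    val positions = new Positions
    val parsers = new SyntaxAnalysis(positions)
    // A StringSource stands in for a file here.
    val source = StringSource("1 2 3")
    parsers.parseAll(parsers.numbers, source) match {
      case Success(ns, _) =>
        println("successful parse: " + ns)
      case f =>
        // Printing the whole result keeps the failure location.
        println(f)
    }
  }
}
```

Replacing StringSource with FileSource ("file.txt") would read the same input from a file instead.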