[218] in linux-announce channel archive


Marpa -- The Hacker's Parser

daemon@ATHENA.MIT.EDU (Lars Wirzenius)
Mon Feb 20 09:22:12 1995

Date: Mon, 20 Feb 1995 14:03:46 +0200
From: Lars Wirzenius <wirzeniu@cc.helsinki.fi>
To: linux-activists@niksula.hut.fi, linux-announce@vger.rutgers.edu

X-Mn-Key: announce

From: Jeffrey Kegler <jeffrey@best.com>
Subject: Marpa -- The Hacker's Parser
Keywords:       parser, Earley's algorithm
Newsgroups: comp.os.linux.announce
Organization: ?
Approved: linux-announce@tc.cornell.edu (Lars Wirzenius)
Followup-to: comp.os.linux.development.apps

Marpa is TCL 7.3 extended with an enhanced version of Earley's
algorithm.  It handles context-free grammars and languages, including
ambiguous ones.  It was developed and tested on Linux.  Its README
follows:

language:	Marpa
package:	Marpa is TCL 7.3 extended with an enhanced Earley's Algorithm
version:	Alpha 2.6
keywords:	parser, Earley's algorithm
parts:		parser-generator, examples, document
author:		Jeffrey Kegler <jeffrey@best.com>
location:	via anonymous ftp at
		best.com:/pub/jeffrey/marpa.2.6.tar.z
legalities:     freely redistributable without warranty under the GNU
                General Public License, Version 2

Description:  Marpa is TCL 7.3 extended with a parser for ambiguous
context-free grammars, based on Earley's algorithm.  It is hacker
friendly, with a variety of handy features.  It is intended for
implementing parsers that use the same crude but effective approaches to
parsing that humans use, whether those humans are reading natural
language or computer code.  TCL code is attached to every production,
explicitly or by default, and this code is used to evaluate the result
of the parse.  The grammar is specified directly in BNF, extended with
Kleene star sequences, and the parse logic follows directly from the
BNF.  Marpa handles ambiguous grammars and ambiguous tokens (tokens
which were not positively identified by the lexer), and it allows the
programmer to change the start symbol.  For best results, the input
should be divided into "chunks" of tokens.  No limit on chunk size is
enforced, but parsing slows down once a chunk is larger than 500
tokens.  There is no fixed distinction between terminals and
non-terminals: a symbol can both match the input AND appear on the left
hand side of a production.  Multiple Marpa grammars are allowed in a
single TCL interpreter, and multiple Marpa-extended TCL interpreters can
run in a single program.
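
For readers who have not met Earley's algorithm, here is a minimal
recognizer sketch.  It is written in Python rather than TCL, it is not
Marpa's code or interface, and it omits Marpa's enhancements; the names
(Item, recognize) and the grammar layout are made up for illustration.
It assumes a grammar with no empty right-hand sides.  It shows two of
the points above: an ambiguous grammar is accepted without any special
handling, and a symbol may both match the input and have productions of
its own.

```python
from collections import namedtuple

# An Earley item: production `lhs -> rhs`, a dot position inside rhs,
# and the index of the Earley set where the item was first predicted.
Item = namedtuple("Item", "lhs rhs dot origin")


def recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps a symbol to a list of right-hand sides, each a tuple
    of symbols.  Assumption: no right-hand side is empty.
    """
    sets = [[] for _ in range(len(tokens) + 1)]

    def add(index, item):
        if item not in sets[index]:
            sets[index].append(item)

    for rhs in grammar.get(start, []):
        add(0, Item(start, rhs, 0, 0))

    for i in range(len(tokens) + 1):
        j = 0
        while j < len(sets[i]):  # the set may grow while we walk it
            item = sets[i][j]
            j += 1
            if item.dot == len(item.rhs):
                # Completion: advance every item that was waiting for lhs.
                for waiting in list(sets[item.origin]):
                    if (waiting.dot < len(waiting.rhs)
                            and waiting.rhs[waiting.dot] == item.lhs):
                        add(i, waiting._replace(dot=waiting.dot + 1))
                continue
            symbol = item.rhs[item.dot]
            # Prediction: the symbol has productions of its own.
            for rhs in grammar.get(symbol, []):
                add(i, Item(symbol, rhs, 0, i))
            # Scanning: the same symbol may also match the input directly.
            if i < len(tokens) and tokens[i] == symbol:
                add(i + 1, item._replace(dot=item.dot + 1))

    return any(
        item.lhs == start and item.dot == len(item.rhs) and item.origin == 0
        for item in sets[-1]
    )


# Ambiguous grammar: "n + n + n" can be grouped in two ways.
AMBIGUOUS = {"E": [("E", "+", "E"), ("n",)]}

# "a" appears on a left hand side and can also match the input itself.
MIXED = {"S": [("a", "b")], "a": [("x",)]}
```

With these definitions, recognize(AMBIGUOUS, "E", ["n", "+", "n", "+", "n"])
accepts the input, and recognize(MIXED, "S", tokens) accepts both
["a", "b"] and ["x", "b"].  This is only a recognizer: it answers yes or
no and does not evaluate code attached to productions the way Marpa
does.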

Speed is reasonable if not blinding, and Marpa is in use in some
applications.  Marpa grew out of the Milarepa prototype, which
implemented a different general parsing algorithm in Perl.

requires:	TCL 7.3, GNU C compiler, GNU Make
updated:	1995/02/19

I hope it proves useful!

Jeffrey Kegler
Algorists, Inc.
743 East El Camino Real, #338
Sunnyvale CA 94087
jeffrey@best.com

--
Send submissions for comp.os.linux.announce to: linux-announce@news.ornl.gov
PLEASE remember Keywords: and a short description of the software.

