[7720] in www-talk@info.cern.ch
Re: HTML parser in Yacc form???
daemon@ATHENA.MIT.EDU (Gavin Nicol)
Fri Mar 24 13:18:22 1995
Date: Fri, 24 Mar 1995 12:43:07 +0500
Errors-To: procmaster@www19.w3.org
Reply-To: gtn@ebt.com
From: Gavin Nicol <gtn@ebt.com>
To: Multiple recipients of list <www-talk@www10.w3.org>
>Parsing SGML with a top down recursive descent parser based on an FSR is
>by far the simplest approach to implement and also produces correct code.
>Why would anyone want to use an inappropriate tool which does the job less
>well and is more difficult to use?
True enough. I was just pointing out that it's not impossible, and in
particular, recommending the TEI subset.
>Yacc is OK if you actually have an LR(1) grammar. But it's best to
>steer well clear of it otherwise. In addition, error handling was
>never really thought out properly for yacc. I've never seen anyone
>successfully use the error productions without coming a cropper.
Quite! yacc error productions fall into the "black art" category at
the very least.
>I think the problem lies in comp sci classes being taught that bottom
>up parsing is `better' and the students not asking why. Goldfarb
>would not know an LR(1) grammar if one bit him on the nose. If he had,
>SGML might not fall into the "much wailing and gnashing of teeth"
>category which it does.
Well, the other thing is that many people perceive writing a recursive
descent parser to be harder than writing a YACC grammar
description. I'm not sure this is true. I'll be perfectly honest and
say that over the last 10 years, fewer than a third of the parsers I've
written used YACC (though flex is a godsend, at least for
prototyping). I saw one interesting parser concept which used "event
handlers" to create a very loosely coupled FSM. Quite interesting, and
very fast too.
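To make the "is it really harder?" question concrete, here is a minimal
sketch of a hand-written recursive descent parser in Python. The grammar
is a toy assumed purely for illustration (strictly nested tags and text;
no attributes, entities, or SGML tag omission), and every name in it
(tokenize, Parser, parse) is hypothetical rather than taken from any
parser mentioned in this thread.

```python
import re

# Toy grammar (an assumption for illustration, far simpler than SGML):
#   content := (TEXT | element)*
#   element := '<' NAME '>' content '</' NAME '>'
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>|([^<]+)")


def tokenize(src):
    """Yield ('open', name), ('close', name) or ('text', data) tuples."""
    pos = 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.group(3) is not None:
            yield ("text", m.group(3))
        else:
            yield ("close" if m.group(1) else "open", m.group(2))


class Parser:
    def __init__(self, src):
        self.tokens = list(tokenize(src))
        self.i = 0

    def peek(self):
        # One token of lookahead; a synthetic 'eof' marks the end of input.
        if self.i < len(self.tokens):
            return self.tokens[self.i]
        return ("eof", None)

    def advance(self):
        tok = self.peek()
        self.i += 1
        return tok

    def parse_content(self):
        # content := (TEXT | element)*
        nodes = []
        while True:
            kind, value = self.peek()
            if kind == "text":
                self.advance()
                nodes.append(value)
            elif kind == "open":
                nodes.append(self.parse_element())
            else:
                # 'close' or 'eof' ends this content run; the caller checks it.
                return nodes

    def parse_element(self):
        # element := '<' NAME '>' content '</' NAME '>'
        _, name = self.advance()
        children = self.parse_content()
        kind, value = self.advance()
        if kind != "close" or value != name:
            # The error surfaces exactly where the expectation failed.
            raise SyntaxError(f"expected </{name}>, got {kind} {value!r}")
        return (name, children)


def parse(src):
    parser = Parser(src)
    tree = parser.parse_content()
    kind, value = parser.peek()
    if kind != "eof":
        raise SyntaxError(f"unexpected {kind} tag {value!r} at top level")
    return tree
```

Calling parse("<p>Hi <b>there</b></p>") gives a nested tuple tree, and
a mismatched close tag raises SyntaxError at the point of the mismatch.
Each grammar rule maps onto one method, which is most of the argument
for doing it by hand.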
>PS: I have discovered that the correct pronunciation of "ASN.1" is
>"assassin 1".
Or perhaps asinine ;-)