[2200] in java-interest

Re: regular expressions in Java

daemon@ATHENA.MIT.EDU (Glen C. Perkins)
Tue Sep 26 21:50:14 1995

Date: Tue, 26 Sep 1995 16:43:54 -0700
To: java-interest@java.sun.com
From: Glen.Perkins@NativeGuide.com (Glen C. Perkins)


>From: tmb@best.com
>Date: Sun, 24 Sep 1995 23:56:40 -0800
>Subject: regular expressions and dbm for Java?
>
>Languages like Perl and Python that are currently popular for doing
>a lot of WWW related programming have extensive support for
>regular expressions and dbm files.
>
>I'm sort of missing similar support in Java.  Is there a standard
>API and implementation in the works?
>
>Thomas.
>

Several people have pointed out how valuable Unicode-savvy regular
expressions would be to Java, but the Java team never responds to this
question. I wish they would.

I would love to be able to create a class in a minute or two that could
parse some particular protocol. I can write a parser in Perl in a minute or
two usually, but Perl doesn't scale up to big projects as well as Java
will, IMHO. I want to create Java classes as quickly and reliably as I can
now create little Perl parser scripts, and have them serve as interpreters
between my projects and the outside world.

One language isn't going to be the best at everything, of course, but Java
has a real shot at becoming a protocol handler extraordinaire. There are
already some goodies in Java that will be extremely helpful: tokenizer,
enumerator, vector, hashtables, to name a few. These seem to be at least as
good as Perl's split() and associative arrays. It seems as though a regular
expression class would fit right in with the stream filters.
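
To make the comparison with split() and associative arrays concrete, here is
a minimal sketch that uses a tokenizer and a hashtable to pull "Name: value"
fields out of a block of header text. The class name HeaderSketch and the
method parse are made up for illustration, and the generic type parameters
assume a later Java release than the one discussed in this message.

```java
import java.util.Hashtable;
import java.util.StringTokenizer;

// Hypothetical helper for illustration; not part of any Java library.
class HeaderSketch {
    // Splits "Name: value" lines into a table, roughly what Perl's split()
    // feeding an associative array would do.
    static Hashtable<String, String> parse(String text) {
        Hashtable<String, String> fields = new Hashtable<String, String>();
        StringTokenizer lines = new StringTokenizer(text, "\r\n");
        while (lines.hasMoreTokens()) {
            String line = lines.nextToken();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // no colon, so not a header line; skip it
            }
            // Split at the FIRST colon only, so values may contain colons.
            fields.put(line.substring(0, colon).trim(),
                       line.substring(colon + 1).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        Hashtable<String, String> h = parse(
            "To: java-interest@java.sun.com\n"
            + "Subject: regular expressions in Java\n");
        System.out.println(h.get("Subject"));
    }
}
```

This works, but every new format means another hand-written loop with its
own indexOf and substring bookkeeping.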

Of course, you can always write a parser by brute force, but then you could
do object oriented programming in assembly language, too. Yecch! C
programmers tend to use Perl, not C, for CGI because Perl has done so much
of the work for you already. A regular expression lets you specify patterns
at a higher level of abstraction. Using a pre-built, debugged and optimized
parser engine (a regular expression compiler) makes code faster to create
and more robust, which are two stated goals of Java. Perl has this advantage
over C. Java could have this advantage over C++.
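
As a rough illustration of what "a higher level of abstraction" buys you,
the sketch below picks an HH:MM:SS time out of a header line with a single
pattern. It assumes the java.util.regex package, which was only added in a
much later Java release (1.4) and did not exist when this message was
written; the class name TimeSketch and the method secondsOfDay are
hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; assumes java.util.regex (Java 1.4 or later).
class TimeSketch {
    // The whole "protocol" is stated in one line: three two-digit fields
    // separated by colons, each captured as a group.
    static final Pattern TIME = Pattern.compile("(\\d{2}):(\\d{2}):(\\d{2})");

    // Returns seconds since midnight for the first time found in the line,
    // or -1 if the line contains no HH:MM:SS time.
    static int secondsOfDay(String line) {
        Matcher m = TIME.matcher(line);
        if (!m.find()) {
            return -1;
        }
        return Integer.parseInt(m.group(1)) * 3600
             + Integer.parseInt(m.group(2)) * 60
             + Integer.parseInt(m.group(3));
    }

    public static void main(String[] args) {
        System.out.println(
            secondsOfDay("Date: Tue, 26 Sep 1995 16:43:54 -0700"));
    }
}
```

The pattern string carries the description of the format, and the engine
does the scanning, so changing the format means editing one line rather
than rewriting a loop.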

Also, a language that targets the net as its platform as much as Java does
should learn from the success of Perl. Before the web took off, they
couldn't give away Perl books, and now they're flying off the shelves,
according to the book buyer at Stacey's in Palo Alto. So MUCH of what you
do on the net is interpreting data in gazillions of different
formats/protocols. Perl
lets you glance over a data file (or a frozen data stream), notice the
patterns you need to pick out, and have a working script to handle the
protocol running in minutes. If only we could do this with Java, I'd say
goodbye to Perl forever....

A Unicode-savvy regular expression class would be a big job, though. If
it's such an important part that you really want to get it right, why not
say so, Java team? "Not now, but we'd like to do it later" would be a
sensible (and encouraging) answer. If that's the case, why not get
suggestions and/or help from the Unicode consortium?

Or maybe the answer is, "Not now, not ever, and this is why...." There
could be a lot of very good reasons. You haven't hesitated to share your
reasons for not including various other language features.

What have you decided regarding support for regular expressions?

__Glen__


-
Note to Sun employees: this is an EXTERNAL mailing list!
Info: send 'help' to java-interest-request@java.sun.com