[23606] in Perl-Users-Digest


Perl-Users Digest, Issue: 5813 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Nov 16 18:10:45 2003

Date: Sun, 16 Nov 2003 15:10:10 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 16 Nov 2003     Volume: 10 Number: 5813

Today's topics:
        Learn Regex or Perl Frist? <nihilistcoder@yahoo.com>
    Re: Learn Regex or Perl Frist? <usenet@morrow.me.uk>
    Re: Learn Regex or Perl Frist? <jwillmore@remove.adelphia.net>
    Re: Learn Regex or Perl Frist? <nihilistcoder@yahoo.com>
    Re: Learn Regex or Perl Frist? <nihilistcoder@yahoo.com>
    Re: Learn Regex or Perl Frist? <jurgenex@hotmail.com>
    Re: Learn Regex or Perl Frist? <usenet@morrow.me.uk>
    Re: Script to convert RTF to PDF <nakroshis@NOICKYSPAMsmart.net>
        Whitespace removal in html generated by cgi <nospam@bigpond.com>
    Re: Whitespace removal in html generated by cgi <usenet@morrow.me.uk>
    Re: Whitespace removal in html generated by cgi <nospam@bigpond.com>
    Re: Whitespace removal in html generated by cgi <REMOVEsdnCAPS@comcast.net>
    Re: Whitespace removal in html generated by cgi <pinyaj@rpi.edu>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sun, 16 Nov 2003 09:26:11 -0800
From: "Don Kim" <nihilistcoder@yahoo.com>
Subject: Learn Regex or Perl Frist?
Message-Id: <TqOtb.5027$n23.4135@okepread02>

I'm looking to learn Perl.  Coming from a C++ background, I'm not finding
it very hard to learn, except for getting used to deciphering terse
regular expressions.

Curious, is it better to learn regex first, from say a book like "Mastering
Regular Expressions" by Friedl, then learn Perl?

Thx.

Don




------------------------------

Date: Sun, 16 Nov 2003 17:32:12 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Learn Regex or Perl Frist?
Message-Id: <bp8cas$hf7$1@wisteria.csv.warwick.ac.uk>


"Don Kim" <nihilistcoder@yahoo.com> wrote:
> I'm looking to learn perl.  Comming from a c++ background, I'm not finding
> it very hard to learn, except for getting used to deciphering terse regex
> expressions.
> 
> Curious, is it better to learn regex first, from say a book like "Mastering
> Regular Expressions" by Friedl, then learn perl?

My recommendation would be: first get the hang of the language,
treating a regex as a black box. When you find that you need to
understand or construct one yourself, read perldoc perlretut or
chapter 5 of the Camel ('Programming Perl', L. Wall, T. Christiansen &
J. Orwant; O'Reilly).

Ben

-- 
don't get my sympathy hanging out the 15th floor. you've changed the locks 3
times, he still comes reeling though the door, and soon he'll get to you, teach
you how to get to purest hell. you do it to yourself and that's what really
hurts is you do it to yourself just you, you and noone else *  ben@morrow.me.uk


------------------------------

Date: Sun, 16 Nov 2003 18:54:34 GMT
From: James Willmore <jwillmore@remove.adelphia.net>
Subject: Re: Learn Regex or Perl Frist?
Message-Id: <20031116135434.0db71dcf.jwillmore@remove.adelphia.net>

On Sun, 16 Nov 2003 09:26:11 -0800
"Don Kim" <nihilistcoder@yahoo.com> wrote:
> I'm looking to learn perl.  Comming from a c++ background, I'm not
> finding it very hard to learn, except for getting used to
> deciphering terse regex expressions.
> 
> Curious, is it better to learn regex first, from say a book like
> "Mastering Regular Expressions" by Friedl, then learn perl?

If you want to practice with regular expressions in a C++ environment,
then might I suggest pcre++ (http://www.daemon.de/en/software/pcre/). 
It's Perl Compatible Regular Expressions in C++.  This way, you can
learn Perl Compatible Regular Expressions in C++ :-)

If you want to learn Perl, start using the 'perldoc' utility first
(type 'perldoc perldoc' and 'perldoc perl' at the command line to
start).  Then, if you want a more extensive look at the language,
you'll at least know what to look for in a book you purchase.  Why
throw down $30+ if you don't have to :-)

Being an owner of the book mentioned I can say this - unless you're
going to be doing what the author has done with regular expressions,
save your money.  If you *must* have a book, get the pocket book by
the same title ($12(?) versus $30(?) for the full book).

That's my $0.02

-- 
Jim

Copyright notice: all code written by the author in this post is
 released under the GPL. http://www.gnu.org/licenses/gpl.txt 
for more information.

a fortune quote ...
Nobody said computers were going to be polite. 



------------------------------

Date: Sun, 16 Nov 2003 11:26:32 -0800
From: "Don Kim" <nihilistcoder@yahoo.com>
Subject: Re: Learn Regex or Perl Frist?
Message-Id: <IbQtb.5427$n23.93@okepread02>

"Ben Morrow" <usenet@morrow.me.uk> wrote in message
news:bp8cas$hf7$1@wisteria.csv.warwick.ac.uk...
> My recommendation would be: first get the hang of the language,
> treating a regex as a black box.

Hmm.  Does that mean there are functions or modules in perl that encapsulate
regex calls?

For example, it is encouraged and considered good practice to use standard
c++ libraries for things like arrays and linked lists.  For example, you
would use #include<list> in the std namespace to implement a linked list for
your application, rather than rolling one on your own.  It is in a sense, a
"black box" for commonly used algorithms and DS for c++.

Does this same analogy hold for Perl with respect to regex?

Thanks for your time.

Don






------------------------------

Date: Sun, 16 Nov 2003 11:30:58 -0800
From: "Don Kim" <nihilistcoder@yahoo.com>
Subject: Re: Learn Regex or Perl Frist?
Message-Id: <TfQtb.5441$n23.4501@okepread02>

"James Willmore" <jwillmore@remove.adelphia.net> wrote in message
news:20031116135434.0db71dcf.jwillmore@remove.adelphia.net...
> If you want to practice with regular expressions in a C++ environment,
> then might I suggest pcre++ (http://www.daemon.de/en/software/pcre/).
> It's Perl Compatible Regular Expressions in C++.  This way, you can
> learn Perl Compatible Regular Expressions in C++ :-)

Thanks!  I also looked at the regex library in Boost and one called "Greta"
from a guy at Microsoft.  How does pcre++ compare to those?  Assuming you've
looked at them.

> Being an owner of the book mentioned I can say this - unless you're
> going to be doing what the author has done with regular expressions,
> save your money.  If you *must* have a book, get the pocket book by
> the same title ($12(?) versus $30(?) for the full book).

Actually, I already purchased the book, but I got it used for $12 so it was
worth it.  I hear you though, I'm sick of buying books only to find they
weren't useful to me. :-(

Don




------------------------------

Date: Sun, 16 Nov 2003 19:40:54 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Learn Regex or Perl Frist?
Message-Id: <apQtb.61479$p9.40542@nwrddc02.gnilink.net>

Don Kim wrote:
> I'm looking to learn perl.  Comming from a c++ background, I'm not
> finding it very hard to learn, except for getting used to deciphering
> terse regex expressions.
>
> Curious, is it better to learn regex first, from say a book like
> "Mastering Regular Expressions" by Friedl, then learn perl?

The one has little to do with the other. Perl also offers nice arithmetic
functions.

Depending on what you want to do with Perl you may need arithmetic functions
or you may need REs.
The main difference is that you learned about arithmetic in school, but you
didn't learn about REs in school. Therefore plus and minus seem to be more
natural to you.

Make up your mind what kind of programs you want to write. If all you ever
do is numerical calculations then you don't need any REs. If you are going
to write a lot of text processing programs then sooner or later you will
want to learn about REs and you automatically will.

jue





------------------------------

Date: Sun, 16 Nov 2003 21:10:57 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Learn Regex or Perl Frist?
Message-Id: <bp8p51$mm2$1@wisteria.csv.warwick.ac.uk>


"Don Kim" <nihilistcoder@yahoo.com> wrote:
> "Ben Morrow" <usenet@morrow.me.uk> wrote in message
> news:bp8cas$hf7$1@wisteria.csv.warwick.ac.uk...
> > My recommendation would be: first get the hang of the language,
> > treating a regex as a black box.
> 
> Hmm.  Does that mean there are functions or modules in perl that encapsulate
> regex calls?
> 
> For example, it is encouraged and considered good practice to use standard
> c++ libraries for things like arrays and linked lists.  For example, you
> would use #include<list> in the std namespace to implement a linked list for
> your application, rather than rolling one on your own.  It is in a sense, a
> "black box" for commonly used algorithms and DS for c++.
> 
> Does this same analogy hold for Perl with respect to regex?

Hmmm.... in a way. Perl is a higher-level language than C++, and so
there are more things in the core language than in C++. So, Perl has a
basic data type of 'array', which covers pretty much all uses of
arrays/vectors/lists/queues/stacks in C++.
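
A small sketch of that point, with made-up variable names: one Perl
array standing in for a stack, a queue and a list. It only uses the
built-in push, pop, shift, unshift and splice.

```perl
use strict;
use warnings;

my @items = (1, 2, 3);

# Stack: push and pop work on the end of the array.
push @items, 4;            # (1, 2, 3, 4)
my $top = pop @items;      # $top is 4, array is (1, 2, 3)

# Queue: shift takes from the front, unshift puts back on the front.
my $first = shift @items;  # $first is 1, array is (2, 3)
unshift @items, 0;         # (0, 2, 3)

# List: splice inserts in the middle without copying by hand.
splice @items, 1, 0, 1;    # insert 1 at index 1 -> (0, 1, 2, 3)

print "@items\n";
```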

Regexen are part of the language in the same way. The basic unit in
Perl is the 'scalar', which is a polymorphic type: a little like a C
union, except that it is converted between the different
representations as necessary. So if you treat a scalar like a string,
it'll be a string; if you treat it like a number it'll be a number;
etc.
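
For illustration only (the variable names are invented), the same
scalar used first as a number and then as a string:

```perl
use strict;
use warnings;

my $value = "42";           # starts life as a string

my $sum    = $value + 8;    # treated as a number: 50
my $joined = $value . "8";  # treated as a string: "428"

print "$sum $joined\n";
```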

One of the things it can be is a regex, and there is a special
operator '=~' in Perl which means 'match this regex against this
string'. This is something that will perhaps take a little getting
used to coming from C++: Perl is a very operator-rich language, and a
lot of the time it makes more sense to consider your own functions as
operators rather than function calls.
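
A minimal sketch of the above, assuming nothing beyond core Perl; the
pattern and the sample strings are made up for the example. qr// puts a
compiled regex into a scalar, and =~ applies it to a string.

```perl
use strict;
use warnings;

# qr// compiles a regex and stores it in an ordinary scalar.
my $word_then_number = qr/(\w+)\s+(\d+)/;

my $line = "issue 5813";

# =~ binds the string on the left to the regex on the right.
my ($word, $number);
if ($line =~ $word_then_number) {
    ($word, $number) = ($1, $2);
    print "word=$word number=$number\n";
}

# The same operator drives substitution: copy the string, then edit the copy.
(my $squeezed = "a   b    c") =~ s/\s+/ /g;
print "$squeezed\n";
```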

Ben

-- 
perl -e'print map {/.(.)/s} sort unpack "a2"x26, pack "N"x13,
qw/1632265075 1651865445 1685354798 1696626283 1752131169 1769237618
1801808488 1830841936 1886550130 1914728293 1936225377 1969451372
2047502190/'                                                 # ben@morrow.me.uk


------------------------------

Date: Sun, 16 Nov 2003 16:02:30 -0000
From: Rick Nakroshis <nakroshis@NOICKYSPAMsmart.net>
Subject: Re: Script to convert RTF to PDF
Message-Id: <Xns94357049E5EF6ricknak@216.168.3.44>

[posted and mailed]

justin@cutroni.com (Justin Cutroni) wrote in 
news:b75bacde.0311120719.321c5b0a@posting.google.com:

> Does anyone know if there is a Perl script out there (free or for
> sale) that will convert an RTF file to a PDF file?  I'm on a Win2k
> machine using ActivePerl.
> 
> I know there is one for EPS to PDF but have no luck finding the RTF to
> PDF converter.

Have you looked at the PDF::API2 module?  Documentation is the pits, but I 
have created PDFs from plain text and TIFF images.  I don't have a copy of 
it installed on this machine to check it right now.

Rick


------------------------------

Date: Sun, 16 Nov 2003 23:22:37 +1000
From: Gregory Toomey <nospam@bigpond.com>
Subject: Whitespace removal in html generated by cgi
Message-Id: <1933712.m1tGeoNVPB@gregs-web-hosting-and-pickle-farming>

A few weeks ago a question was asked in this group about removing
whitespace from HTML, in particular from HTML generated by CGI.
Here's a simple technique I developed for Linux:


1. A sample CGI. Bash uses the <<'delimiter' construct to pass the
input verbatim to Perl. The output of the CGI is piped to delspace.pl,
our whitespace munger.
 
#!/bin/bash
/usr/bin/perl <<'EOFPERL'  | ./delspace.pl
#your cgi goes here
use strict;
$|++;
print "Content-type:text/html\n\n";
print "  <h1>  This     is  a   test </h1> \n";
print " some more    text\n";

EOFPERL
 

2. Now here's delspace.pl, the whitespace remover. It may be a little
buggy, but it seems to work for my simple HTML.

#!/usr/bin/perl
use strict;
use warnings;

my $count = 0;
while (<>) {
        # remove leading whitespace
        s/^\s+//;

        # remove trailing whitespace (this also eats the newline)
        s/\s+$//;

        # change internal whitespace to single space
        s/\s+/ /g;

        # remove simple one line comments
        s/<!--.*?-->//;

        # another simple whitespace removal
        s/> </></g;

        # newlines are not needed
        # except for Content-type: text/html\n\n
        # which occurs at the start
        print;
        print "\n" if $count++ < 4;
}



gtoomey


------------------------------

Date: Sun, 16 Nov 2003 15:57:50 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Whitespace removal in html generated by cgi
Message-Id: <bp86pu$f1r$1@wisteria.csv.warwick.ac.uk>

[please limit your line lengths to 72 characters]
[please make sure your blank lines are *actually* blank]

Gregory Toomey <nospam@bigpond.com> wrote:
> A few weeks ago a question was asked in this group about removing
> whitespace from html, in particular from html generated by cgi.
> Here's a simple technique I developed for Linux:
>
> 1. A sample cgi. Bash uses the <<'delimiter' conststuct to pass the
> input verbatim to Perl. The output of the cgi is piped to
> delspace.pl. our whitespace munger.
>  
> #!/bin/bash

There is absolutely no need to use bash. If nothing better, use the
techniques described in perldoc perlipc "Safe Pipe Opens". Better, use
a tied filehandle or a PerlIO layer on STDOUT. Or simply generate the
thing without superfluous whitespace in the first place.
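
A rough sketch of the tied-filehandle variant only, assuming core Perl.
The SqueezeSpace package name and its behaviour are invented for this
example; it collapses runs of spaces and tabs and leaves newlines
alone, and it is not a full HTML-aware filter.

```perl
use strict;
use warnings;

# Hypothetical filter class: squeezes runs of spaces and tabs in
# everything printed through the tied handle, then forwards the
# result to a real handle.
package SqueezeSpace;

sub TIEHANDLE {
    my ($class, $out) = @_;
    return bless { out => $out }, $class;
}

sub PRINT {
    my ($self, @text) = @_;
    my $data = join '', @text;
    $data =~ s/[ \t]+/ /g;      # collapse spaces and tabs only
    return print { $self->{out} } $data;
}

sub PRINTF {
    my ($self, $format, @args) = @_;
    return $self->PRINT(sprintf $format, @args);
}

package main;

# Keep a copy of the real STDOUT, then tie STDOUT to the filter.
open my $real_stdout, '>&', \*STDOUT or die "cannot dup STDOUT: $!";
tie *STDOUT, 'SqueezeSpace', $real_stdout;

# The rest of the CGI prints as usual and never sees the filter.
print "  <h1>  This     is  a   test </h1> \n";

untie *STDOUT;
```

The existing script needs only the tie line near the top; every later
print goes through PRINT without further changes.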

<snip>
> 2. Now here's delspace.pl, the whitespace remover. It may be a
> little buggy, but it seems to work for my simple html.
>
> #!/usr/bin/perl
> my $count=0;
> while(<>){
>         # remove trailing whitespace
>         s/^\s+//;
>
>         # remove leading whitespace
>         s/\s+$//;
>
>         # change internal whitespace to single space
>         s/\s+/ /g;
>
>         # remove simple one line comments
>         s/<!--.*?-->//;
>
>         # another simple whitespace removal
>         s/> </></g;

You realise this changes the presentation of the HTML?

>         #newlines are not needed
>         #except  for Content-type-text/html\n\n
>         # which occurs at the start
>         print;
>         print "\n" if $count++<4;

Why 4?

> }

'A little buggy'? The whole idea's fundamentally flawed: you need to
start by separating the HTTP from the HTML from the data, which means
using an HTML parsing module. For instance, what about this:

<link
  rel=stylesheet
  type="text/css"
  href="..."/>

Or this:

Status: 302 Found
Location: ...
Content-encoding: ...
Content-type: text/html
Content-length: ...

<html>...

Or this:

<pre>
  #!/usr/bin/perl

  use warnings;
  use strict;

  print "Hello world\n";
</pre>
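
As a sketch of the first step only (separating the HTTP headers from
the body), assuming the CGI output is already in one string. The
split_cgi_output helper is made up for this example. It does not
handle <pre> blocks or tags spread over several lines; those still
need a real HTML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: split raw CGI output into the header block and
# the body at the first blank line, so a filter can leave headers alone.
sub split_cgi_output {
    my ($raw) = @_;
    my ($headers, $body) = split /\r?\n\r?\n/, $raw, 2;
    return ($headers, defined $body ? $body : '');
}

my $raw = join "\n",
    'Status: 302 Found',
    'Content-type: text/html',
    '',
    '<html>  <body>   hi  </body></html>';

my ($headers, $body) = split_cgi_output($raw);

# Only the body is touched; the number of header lines no longer matters.
$body =~ s/[ \t]+/ /g;

print "$headers\n\n$body\n";
```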

Ben

-- 
I've seen things you people wouldn't believe: attack ships on fire off the
shoulder of Orion; I've watched C-beams glitter in the darkness near the
Tannhauser Gate. All these moments will be lost, in time, like tears in rain.
Time to die.  |-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-|  ben@morrow.me.uk


------------------------------

Date: Mon, 17 Nov 2003 06:55:35 +1000
From: Gregory Toomey <nospam@bigpond.com>
Subject: Re: Whitespace removal in html generated by cgi
Message-Id: <2244697.xVYichmzJo@gregs-web-hosting-and-pickle-farming>

It was a dark and stormy night, and Ben Morrow managed to scribble:

> [please limit your line lengths to 72 characters]
> [please make sure your blank lines are *actually* blank]
> 
> Gregory Toomey <nospam@bigpond.com> wrote:
>> A few weeks ago a question was asked in this group about removing
>> whitespace from html, in particular from html generated by cgi.
>> Here's a simple technique I developed for Linux:
>>
>> 1. A sample cgi. Bash uses the <<'delimiter' conststuct to pass the
>> input verbatim to Perl. The output of the cgi is piped to
>> delspace.pl. our whitespace munger.
>>  
>> #!/bin/bash
> 
> There is absolutely no need to use bash. If nothing better, use the
> techniques described in perldoc perlipc "Safe Pipe Opens". Better, use
> a tied filehandle or a PerlIO layer on STDOUT. Or simply generate the
> thing without superflous whitespace in the first place.
> 

The technique I described allows you to take an existing CGI and change
2 lines at the top and one at the bottom. What you described will work,
but it's more complicated.



> <snip>
>> 2. Now here's delspace.pl, the whitespace remover. It may be a
>> little buggy, but it seems to work for my simple html.
>>
>> #!/usr/bin/perl
>> my $count=0;
>> while(<>){
>>         # remove trailing whitespace
>>         s/^\s+//;
>>
>>         # remove leading whitespace
>>         s/\s+$//;
>>
>>         # change internal whitespace to single space
>>         s/\s+/ /g;
>>
>>         # remove simple one line comments
>>         s/<!--.*?-->//;
>>
>>         # another simple whitespace removal
>>         s/> </></g;
> 
> You realise this changes the presentation of the HTML?
> 
>>         #newlines are not needed
>>         #except  for Content-type-text/html\n\n
>>         # which occurs at the start
>>         print;
>>         print "\n" if $count++<4;
> 
> Why 4?
> 
>> }
> 
> 'A little buggy'? The whole idea's fundamentally flawed: you need to
> start by separating the HTTP from the HTML from the data, which means
> using an HTML parsing module. For instance, what about this:
> 

It worked with all the CGIs I've created.
It's just a simple pragmatic way to solve a real world problem.


gtoomey


------------------------------

Date: Sun, 16 Nov 2003 16:02:33 -0600
From: "Eric J. Roode" <REMOVEsdnCAPS@comcast.net>
Subject: Re: Whitespace removal in html generated by cgi
Message-Id: <Xns9435AD76F1841sdn.comcast@216.196.97.136>

-----BEGIN xxx SIGNED MESSAGE-----
Hash: SHA1

Gregory Toomey <nospam@bigpond.com> wrote in
news:1933712.m1tGeoNVPB@gregs-web-hosting-and-pickle-farming: 

> A few weeks ago a question was asked in this group about removing
> whitespace from html, in particular from html generated by cgi. Here's
> a simple technique I developed for Linux: 

What is the goal of this?  Reducing the amount of data that is 
transmitted to the client browser?  If so, you would probably be better 
off compressing the output with gzip -- all major browsers support gzip 
compressed data.
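
A minimal sketch of compressing a page in memory, assuming a Perl that
ships IO::Compress::Gzip (it is a core module in current Perls; it is
an assumption that it is available on the server in question). The
sample HTML is invented.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

my $html = "<html>\n" . ("  <p>  some   text  </p>\n" x 200) . "</html>\n";

# Compress the body in memory; gzip() takes an input and an output,
# either of which may be a reference to a scalar buffer.
my $compressed;
gzip(\$html => \$compressed) or die "gzip failed: $GzipError";

# A CGI would also send "Content-Encoding: gzip", and only when the
# client's Accept-Encoding header allows it.
printf "%d bytes before, %d bytes after\n", length $html, length $compressed;
```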

[...]
>         #newlines are not needed
>         #except  for Content-type-text/html\n\n
>         # which occurs at the start
>         print;
>         print "\n" if $count++<4;

Newlines are needed in <pre>...</pre> sections, and sometimes in 
<textarea>...</textarea> sections.

- -- 
Eric
$_ = reverse sort $ /. r , qw p ekca lre uJ reh
ts p , map $ _. $ " , qw e p h tona e and print

-----BEGIN xxx SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use <http://www.pgp.com>

iQA/AwUBP7f0GWPeouIeTNHoEQKoQACg4qJhX/JKb6y7ZCOK9eiMVqXih9EAn2px
YT5a72WavpE6GErYnLOzUQ+d
=zRRz
-----END PGP SIGNATURE-----


------------------------------

Date: Sun, 16 Nov 2003 17:13:24 -0500
From: Jeff 'japhy' Pinyan <pinyaj@rpi.edu>
Subject: Re: Whitespace removal in html generated by cgi
Message-Id: <Pine.SGI.3.96.1031116171158.181912A-100000@vcmr-64.server.rpi.edu>

On Sun, 16 Nov 2003, Eric J. Roode wrote:

>>         #newlines are not needed
>>         #except  for Content-type-text/html\n\n
>>         # which occurs at the start
>>         print;
>>         print "\n" if $count++<4;
>
>Newlines are needed in <pre>...</pre> sections, and sometimes in 
><textarea>...</textarea> sections.

Not to mention that, although most HTML renders multiple whitespace as a
SINGLE space, a SINGLE newline IS needed, because the browser will render
it as a space.  That is, "foo\nbar" is rendered as "foo bar", while a
string like "foo  \n  bar" is also just rendered as "foo bar".
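
To make the difference concrete, a short sketch with an invented
sample string: deleting whitespace joins the words, while collapsing
each run to one space keeps them apart.

```perl
use strict;
use warnings;

my $text = "foo  \n  bar";

# Deleting the whitespace outright joins the two words.
(my $deleted = $text) =~ s/\s+//g;      # "foobar"

# Replacing each run with one space keeps the word boundary.
(my $collapsed = $text) =~ s/\s+/ /g;   # "foo bar"

print "$deleted\n$collapsed\n";
```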

-- 
Jeff Pinyan            RPI Acacia Brother #734            2003 Rush Chairman
"And I vos head of Gestapo for ten     | Michael Palin (as Heinrich Bimmler)
 years.  Ah!  Five years!  Nein!  No!  | in: The North Minehead Bye-Election
 Oh.  Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 5813
***************************************

