[6519] in Perl-Users-Digest


Perl-Users Digest, Issue: 144 Volume: 8

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Mar 19 11:11:09 1997

Date: Wed, 19 Mar 97 08:00:25 -0800
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 19 Mar 1997     Volume: 8 Number: 144

Today's topics:
     Re: /RE/ for Dummies - Match all caps (Robert Schuldenfrei)
     Re: /RE/ for Dummies - Match all caps (Honza Pazdziora)
     A question regarding s/(w+)/\L\u$1/g (Jim Showalter)
     Adding DB_File to my installation <mconley@pms110.pms.ford.com>
     ANNOUNCE: libnet-1.05  PATCH: 01 <gbarr@ti.com>
     Re: Broken Pipes (Charles DeRykus)
     Re: Broken Pipes <rootbeer@teleport.com>
     Re: Can we create an Perl executable (Tad McClellan)
     Re: Executable? (Steven Cotton)
     Re: Executable? <rootbeer@teleport.com>
     Re: flock() - Just the basics <rootbeer@teleport.com>
     Re: Help with Pattern Matching <gbrown@cae.ca>
     Re: How To Pattern Match To The First Occurance Of A Ch <gwhassan@prodigy.net>
     Re: How To Pattern Match To The First Occurance Of A Ch <tchrist@mox.perl.com>
     Re: ISBN/Checkdigit calculator (Donald H. Locker)
     Re: Lost backslash (Honza Pazdziora)
     Re: Newbie Question on accessing elements of Strings <agent.email@NetTown.com>
     Re: Parsing html tag attributes & values in Perl <rodos@haywood.org>
     Patch for docs Re: Lost backslash <rootbeer@teleport.com>
     perl-binaries for aix1.3 <jonnyb@omni.uio.no>
     perl-script doesn't run matgorl@chem.leidenuniv.nl
     reading in directory & filenames from a log file <rod@neep.demon.co.uk>
     Redefining character set matched by \w ? <dehon_olivier@jpmorgan.com>
     replacing et al <schajer@dircon.co.uk>
     Re: SORTING ... one more Q (Andrew M. Langmead)
     Re: text to HTML <rootbeer@teleport.com>
     Wanted: name parser (Jon Barry)
     Wrapper for constructing a privileged socket (Corina Scheiter)
     Digest Administrivia (Last modified: 8 Mar 97) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 19 Mar 1997 13:09:55 GMT
From: sailboat@tiac.net (Robert Schuldenfrei)
Subject: Re: /RE/ for Dummies - Match all caps
Message-Id: <5gooi5$hhp@news-central.tiac.net>

Tom Christiansen <tchrist@mox.perl.com> wrote:

> [courtesy cc of this posting sent to cited author via email]

>In comp.lang.perl.misc, 
>    sailboat@tiac.net (Robert Schuldenfrei) writes:
>:I have been having a problem trying to match lines in which everything is
>:capitalized.  We have some manuals that use all caps and I would like to pretty
>:them up with just initial caps.
>:
>:Thus I would like the following line:
>:
>:   THIS IS ALL CAPS
>:
>:To be converted into:
>:
>:   This Is All Caps
>:
>:My first probe at this was:
>:
>:if(/[A-Z]+/) {
>:	tr/[A-Z]/[a-z]/;
>:	}
>:
>:But that translated every line of text, not just the lines of all caps.  I am
>:obviously missing something basic in RE, but I do not know what it is.  Please
>:enlighten me.  TIA  Bob

>    s/(\w+)/\L\u$1/g

>--tom
>-- 
>	Tom Christiansen	tchrist@jhereg.perl.com
>    "It's easier to make up sayings people like to hear than sayings they
>    like to heed."
>    	--Larry Wall
Thank you Tom.  It is an honor to get help from a SUPERSTAR.  BTW, this is part
of a conversion process which takes FrameMaker 1.0 mml files and converts them
to html.  While my needs are quite modest, I have a whole shelf of manuals to
convert.  The thought of doing it by hand leaves me cold.  Is there a collection
of perl scripts or /RE/ put together somewhere to be of aid in conversion from
one markup language to another?  Yes, I guess the definition of awk and perl is
such a collection, but a book, website, or other repository would be a big help
in my study.

Thanks again.  Bob

Robert Schuldenfrei
S. I. Inc.
32 Ridley Road
Dedham, MA  02026
Voice: (617) 329-4828
FAX:   (617) 329-1696
E-Mail: bob@s-i-inc.com
WWW:    http://www.tiac.net/users/tangaroa/index.html



------------------------------

Date: Wed, 19 Mar 1997 14:32:54 GMT
From: adelton@fi.muni.cz (Honza Pazdziora)
Subject: Re: /RE/ for Dummies - Match all caps
Message-Id: <adelton.858781974@aisa.fi.muni.cz>

sailboat@tiac.net (Robert Schuldenfrei) writes:

> I have been having a problem trying to match lines in which everything is
> capitalized.  We have some manuals that use all caps and I would like to pretty
> them up with just initial caps.
> 
> Thus I would like the following line:
> 
>    THIS IS ALL CAPS
> 
> To be converted into:
> 
>    This Is All Caps
> 
> My first probe at this was:
> 
> if(/[A-Z]+/) {
> 	tr/[A-Z]/[a-z]/;
> 	}

You might try

if(/^[A-Z]+$/) {
	tr/[A-Z]/[a-z]/;
	}

What you did is test for any A-Z characters. But you say you want the
line with A-Z characters only. Then you must say that those should go
from the start of the line up to the end. However in the line THIS IS
ALL CAPS there are not only CAPS, but spaces as well. So you may want
to make the test to be if (/^[A-Z ]+$/) or better yet /^[A-Z\W]+$/ to
match lines like THIS IS ALL CAPS. Also, read at least the perlre and
perlop man pages to learn more about matches and regexps.

As for the tr line, it will convert ALL CAPS into lowercase, not just
letters inside words. So you have to change that as well, and I would
suggest writing your script this way:

if (/^[A-Z\W]+$/) {
	s/(\w)(\w+)/$1\L$2/g;
	}

Hope this helps.
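
Putting the pieces of this thread together, here is a minimal sketch. It is
not code from either poster: the sub name title_case_if_all_caps is made up
for illustration, and "all caps" is assumed to mean "has at least one A-Z
and no a-z", so digits, spaces and punctuation are allowed.

```perl
use strict;
use warnings;

# Hypothetical helper: the name and the "all caps" test are assumptions.
sub title_case_if_all_caps {
    my ($line) = @_;

    # Treat a line as all caps when it has an uppercase letter
    # and no lowercase letter at all.
    if ($line =~ /[A-Z]/ && $line !~ /[a-z]/) {
        # \L lowercases the whole word, \u capitalizes its first character.
        $line =~ s/(\w+)/\u\L$1/g;
    }
    return $line;
}

print title_case_if_all_caps("THIS IS ALL CAPS"), "\n";     # This Is All Caps
print title_case_if_all_caps("This line is mixed"), "\n";   # left unchanged
```

Testing for "no lowercase letter" rather than listing every allowed
character keeps the condition short, at the cost of also converting lines
such as "ABC 123".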

--
------------------------------------------------------------------------
 Honza Pazdziora | adelton@fi.muni.cz | http://www.fi.muni.cz/~adelton/
                   I can take or leave it if I please
------------------------------------------------------------------------


------------------------------

Date: Wed, 19 Mar 1997 09:42:41 -0500 (EST)
From: gamma@mintaka.iern.disa.mil (Jim Showalter)
Subject: A question regarding s/(w+)/\L\u$1/g
Message-Id: <Pine.GSO.3.95.970319092651.23014A-100000@mintaka.iern.disa.mil>


Last night I played around with this and came up with an answer similar
to yours. However, I reversed the \L\u construct to \u\L.  I would
have thought the characters should first be forced to lower case and then
the first character changed to upper case.  To my surprise, the answer is
the same and \L does not undo \u. I checked page 40 of Camel 2nd Ed but
it didn't say.  Could you please explain why? 

Thanks,
Jim
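
A small sketch of the behavior described above, for anyone who wants to
try it. The three sub names are made up for illustration. As far as I know,
Perl treats a \L immediately followed by \u as if it had been written
\u\L, which would explain why both orders give the same result; treat that
explanation as an assumption, the observable behavior is what the code
checks.

```perl
use strict;
use warnings;

# Hypothetical helpers: each capitalizes one word in a different way.
sub cap_L_then_u { my ($word) = @_; return "\L\u$word"; }
sub cap_u_then_L { my ($word) = @_; return "\u\L$word"; }

# The same effect spelled out with functions: lowercase everything,
# then uppercase the first character.
sub cap_functions { my ($word) = @_; return ucfirst(lc($word)); }

for my $word ("THIS", "caps", "mIxEd") {
    print join(" ", cap_L_then_u($word), cap_u_then_L($word),
                    cap_functions($word)), "\n";
}
```

The ucfirst(lc(...)) form is the easiest to read when the order of the
escapes is a source of doubt.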


On 19 Mar 1997, Tom Christiansen wrote:  (Edited for shortness)

>  [courtesy cc of this posting sent to cited author via email]
> 
> In comp.lang.perl.misc, 
>     sailboat@tiac.net (Robert Schuldenfrei) writes:
>
> :Thus I would like the following line:
> :
> :   THIS IS ALL CAPS
> :
> :To be converted into:
> :
> :   This Is All Caps
> :
>     s/(\w+)/\L\u$1/g
> 
> --tom
> -- 
> 	Tom Christiansen	tchrist@jhereg.perl.com




------------------------------

Date: Wed, 19 Mar 1997 09:31:46 -0500
From: Mike Conley <mconley@pms110.pms.ford.com>
Subject: Adding DB_File to my installation
Message-Id: <332FF8D1.676@pms110.pms.ford.com>

When we built perl 5.003, we did not have the Berkeley libdb.a 
available. Can I build only the DB_File module without rebuilding
all of perl? If so, how?

-- 
Mike Conley
mconley@ford.com


------------------------------

Date: 19 Mar 1997 14:26:12 GMT
From: Graham Barr <gbarr@ti.com>
Subject: ANNOUNCE: libnet-1.05  PATCH: 01
Message-Id: <5got24$fh$1@nadine.teleport.com>

I have just uploaded to CPAN a patch to libnet-1.05 which fixes a few minor
problems that have been reported.

These fixes are

	Net::SMTP
        - Changed the default HELO argument from hostdomain() to hostfqdn()
 
        Net::Cmd, Net::Domain
        - Incremented VERSION numbers
 
        Net::NNTP
        - corrected @_ check for newnews
 
        Net::Domain
        - Added use of inet_domain element from NetConfig
 
        Net::FTP
        - Updated documentation for get for WHERE

	Net::Cmd 
        - Added ->debug as conditional to EOF warning

libnet is a collection of perl modules which encapsulate the usage
of various protocols used in the internet community. These include

  Net::FTP     (RFC959)
  Net::SMTP    (RFC821)
  Net::Netrc
  Net::Cmd
  Net::Domain
  Net::Telnet  (RFC854)
  Net::Time    (RFC867 & RFC868)
  Net::NNTP    (RFC977)
  Net::POP3    (RFC1939)
  Net::SNPP    (RFC1861)
  Net::PH
  Net::Config

To install libnet you ***MUST*** have the following modules installed

  Data::Dumper
  IO::Socket

It should be available on mirror sites soon from

    http://www.perl.com/CPAN/authors/Graham_Barr/libnet-1.05_01.pat.gz

Comments are always very welcome.

Copyright 1996 Graham Barr. All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.

Share and Enjoy!
Graham <gbarr@ti.com>
-- 
Graham Barr                                               <gbarr@ti.com>
Logic merely enables one to be wrong with authority.
	--Doctor Who





------------------------------

Date: Tue, 18 Mar 1997 23:50:06 GMT
From: ced@bcstec.ca.boeing.com (Charles DeRykus)
Subject: Re: Broken Pipes
Message-Id: <E79JJJ.GKu@bcstec.ca.boeing.com>

In article <5gmj7mINNeu3@python.cis.ohio-state.edu>,
tracy allen brown <brownt@python.cis.ohio-state.edu> wrote:
 >   I was hoping someone could explain the possible reasons why the open() 
 >function, when used to fork off a process, could produce a broken pipe error
 >message.
>	Tracy Brown		brownt@cis.ohio-state.edu

such problems usually occur after the fork off...  could you
post some code?

--
Charles DeRykus
ced@carios2.ca.boeing.com


------------------------------

Date: Wed, 19 Mar 1997 07:37:15 -0800
From: Tom Phoenix <rootbeer@teleport.com>
To: tracy allen brown <brownt@python.cis.ohio-state.edu>
Subject: Re: Broken Pipes
Message-Id: <Pine.GSO.3.96.970319073442.24834J-100000@kelly.teleport.com>

On 18 Mar 1997, tracy allen brown wrote:

>    I was hoping someone could explain the possible reasons why the
> open() function, when used to fork off a process, could produce a broken
> pipe error message. 

Open doesn't do that, but writing to a closed pipe can. If either process
writes to a pipe which isn't open, it gets a SIGPIPE. (For example, this
would happen if the open failed and you didn't check for that before
printing to the pipe.) Hope this helps! 
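
A minimal sketch of the two checks mentioned above, assuming a Unix-like
system. The sub name feed_command and its return strings are made up for
illustration. It checks the result of open, and it ignores SIGPIPE so that
a write to a pipe whose reader has gone away shows up as a failed print or
close instead of killing the script.

```perl
use strict;
use warnings;

# Hypothetical helper: sends $count lines to a command through a pipe
# and reports what happened as a string.
sub feed_command {
    my ($count, @command) = @_;

    # With SIGPIPE ignored, writing to a closed pipe fails with an
    # error (EPIPE) instead of terminating this process.
    local $SIG{PIPE} = 'IGNORE';

    # Always check the open before printing to the handle.
    open(my $pipe, '|-', @command)
        or return "open failed: $!";

    for my $n (1 .. $count) {
        print {$pipe} "line $n\n"
            or return "write failed: $!";
    }

    # close() waits for the command; it is false on a write error
    # or when the command exits with a non-zero status.
    close($pipe)
        or return "close failed: " . ($! ? $! : "exit status $?");

    return "ok";
}

# A reader that consumes all of its input.
print feed_command(10, $^X, '-e', '1 while <STDIN>'), "\n";

# A reader that exits at once; enough data is sent to overflow the pipe.
print feed_command(200_000, $^X, '-e', 'exit 0'), "\n";
```

The first call should report ok. The second should report a failure of
some kind, since the reader is gone before the data has been delivered.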

-- Tom Phoenix        http://www.teleport.com/~rootbeer/
rootbeer@teleport.com   PGP  Skribu al mi per Esperanto!
Randal Schwartz Case:     http://www.lightlink.com/fors/



------------------------------

Date: Wed, 19 Mar 1997 07:29:01 -0600
From: tadmc@flash.net (Tad McClellan)
Subject: Re: Can we create an Perl executable
Message-Id: <tmpog5.bm.ln@localhost>

akench@cvimail.cv.com wrote:
: Hi Perl Gurus,

: Can we compile and build a PERL script
: to get an excutable? I am using PERL scripts
: as call backs from the GUI. Since some of the
: scripts are big the responce time is not good enough.
: Hoping that running an executable will
: help, cause it'll save the compilation time.
: Also pl. tell me about any limitation
: for the executables so created.


From the new Perl FAQ, part 3:

=head2 How can I compile my Perl program into byte-code or C?

Malcolm Beattie has written a multifunction backend compiler,
available from CPAN, that can do both these things.  It is as of
Feb-1997 in late alpha release, which means it's fun to play with if
you're a programmer but not really for people looking for turn-key
solutions.

I<Please> understand that merely compiling into C does not in and of
itself guarantee that your code will run very much faster.  That's
because except for lucky cases where a lot of native type inferencing
is possible, the normal Perl run time system is still present and thus
will still take just as long to run and be just as big.  Most programs
save little more than compilation time, leaving execution no more than
10-30% faster.  A few rare programs actually benefit significantly
(like several times faster), but this takes some tweaking of your
code.

Malcolm will be in charge of the 5.005 release of Perl itself
to try to unify and merge his compiler and multithreading work into
the main release.

You'll probably be astonished to learn that the current version of the
compiler generates a compiled form of your script whose executable is
just as big as the original perl executable, and then some.  That's
because as currently written, all programs are prepared for a full
eval() statement.  You can tremendously reduce this cost by building a
shared libperl.so library and linking against that.  See the
F<INSTALL> podfile in the perl source distribution for details.  If
you link your main perl binary with this, it will make it minuscule.
For example, on one author's system, /usr/bin/perl is only 11k in
size!



--
    Tad McClellan                          SGML Consulting
    Tag And Document Consulting            Perl programming
    tadmc@flash.net


------------------------------

Date: 19 Mar 1997 14:10:45 GMT
From: cottons@bre.co.uk (Steven Cotton)
Subject: Re: Executable?
Message-Id: <5gos55$q7i$1@sys10.cambridge.uk.psi.net>


>I put a perl script on my isp's cgi server last week. I telnet to the
>server, type perl mcr.cgi in the directory where the script is and
>everything works fine. The tags print to screen and such. I used chmod 777
>mcr.cgi to make it executable. Doesn't that mean I can tpye mcr.cgi
>(without running perl; ie. perl mcr.cgi)? Anyway, when I type mcr.cgi Unix
>gives me a "bad command" message.

Perhaps "current directory" isn't part of the path. Tried ./mcr.cgi? I assume
you have a line resembling:

#!/usr/bin/perl

at the start of the file too? Good luck.


Steve Cotton.


------------------------------

Date: Wed, 19 Mar 1997 07:43:27 -0800
From: Tom Phoenix <rootbeer@teleport.com>
To: Mike Smith <masmith@telusplanet.net>
Subject: Re: Executable?
Message-Id: <Pine.GSO.3.96.970319073742.24834K-100000@kelly.teleport.com>

On 19 Mar 1997, Mike Smith wrote:

> I put a perl script on my isp's cgi server last week. I telnet to the
> server, type perl mcr.cgi in the directory where the script is and
> everything works fine. The tags print to screen and such. I used chmod
> 777 mcr.cgi to make it executable.

Ooooh, don't do that. CGI scripts already have enough security worries
without making them world writable. (Do you really want any user on your
system to be able to edit your script to print "Satan rules!" atop each
HTML page? :-)  You probably want 755, although you could probably get by
with fewer perms than that. And your webmaster or ISP should be scolded
for not making sure you knew that before you put any scripts on the
server.
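
For anyone who wants to check this from Perl rather than from the shell,
here is a rough sketch, assuming Unix-style permission bits. The sub names
is_world_writable and make_cgi_executable are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if "other" users may write to $path.
sub is_world_writable {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    # Element 2 of stat() is the mode; 0002 is the world-write bit.
    return ($st[2] & 0002) ? 1 : 0;
}

# Hypothetical helper: owner gets read/write/execute,
# everyone else gets read/execute only (mode 755).
sub make_cgi_executable {
    my ($path) = @_;
    chmod(0755, $path) or die "Cannot chmod $path: $!";
    return 1;
}
```

A typical use would be to call is_world_writable on each script in the
cgi directory and call make_cgi_executable on any that report true.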

> Doesn't that mean I can tpye mcr.cgi (without running perl; ie. perl
> mcr.cgi)? Anyway, when I type mcr.cgi Unix gives me a "bad command" 
> message.

Maybe your #! line is defective, or your system doesn't support it. Or
maybe your script isn't in the current PATH, which may not include '.' . 
There may be other reasons as well. 

> The other thing is when I click the form using the script I get a server
> error that states there is an internal configuration problem. I called
> my ISP and they said my cgi password may (they are not sure??????) be
> crackable so that may be the problem.

Your ISP may not be especially clueful. :-)

When you're having trouble with a CGI form in Perl, you should first look
at the please-don't-be-offended-by-the-name Idiot's Guide to solving such
problems. It's available on the perl.com web pages. Hope this helps!

   http://www.perl.com/perl/
   http://www.perl.com/perl/faq/
   http://www.perl.com/perl/faq/idiots-guide.html

-- Tom Phoenix        http://www.teleport.com/~rootbeer/
rootbeer@teleport.com   PGP  Skribu al mi per Esperanto!
Randal Schwartz Case:     http://www.lightlink.com/fors/



------------------------------

Date: Wed, 19 Mar 1997 07:29:55 -0800
From: Tom Phoenix <rootbeer@teleport.com>
To: Robert Schuldenfrei <sailboat@tiac.net>
Subject: Re: flock() - Just the basics
Message-Id: <Pine.GSO.3.96.970319072933.24834H-100000@kelly.teleport.com>

On Tue, 18 Mar 1997, Robert Schuldenfrei wrote:

> Subject: flock() - Just the basics

I think you could use the methods in Randal's fourth Web Techniques
column, which explains how to use flock() to avoid problems when multiple
processes need to modify one file. Hope this helps! 

   http://www.stonehenge.com/merlyn/WebTechniques/
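
For the bare basics, here is a minimal sketch. It is not the code from the
column above, only the usual pattern as I understand it, and the sub name
append_locked is made up for illustration. Note that flock is advisory: it
only helps if every process that touches the file asks for the lock too.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Hypothetical helper: appends one line to $path under an exclusive lock.
sub append_locked {
    my ($path, $line) = @_;

    open(my $fh, '>>', $path) or die "Cannot open $path: $!";

    # Wait until no other cooperating process holds the lock.
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";

    # Someone may have appended while we waited, so go to the end again.
    seek($fh, 0, SEEK_END) or die "Cannot seek in $path: $!";

    print {$fh} $line, "\n" or die "Cannot write to $path: $!";

    # close() flushes the buffer and releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return 1;
}
```

Letting close() release the lock, rather than unlocking first, avoids the
window in which buffered data is written after the lock is gone.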

-- Tom Phoenix        http://www.teleport.com/~rootbeer/
rootbeer@teleport.com   PGP  Skribu al mi per Esperanto!
Randal Schwartz Case:     http://www.lightlink.com/fors/



------------------------------

Date: Wed, 19 Mar 1997 09:14:55 -0500
From: Gil Brown <gbrown@cae.ca>
To: mwick@toy.mem.ti.com
Subject: Re: Help with Pattern Matching
Message-Id: <332FF4DF.41C6@cae.ca>

M. Wick wrote:
> 
> I've tried grep, index, and the matching sequence and nothing seems to
> be working.  I'm opening a file, reading in a line, and trying to
> match a pattern from within the line.  I have found that I can match
> the line if the parameter starts at the beginning on the line.  I want
> to find the pattern in the middle of the line, and nothing seems to be
> working correctly.
> 
> while( <FILE> ){
>   if ( (index($_, $pattern)) >= 0 ) {
> 
>   if (grep($pattern, $_) != 0) {  (even though grep works on an array,
>                                    I still tried it)
> 
>   if (/$pattern/) {               (matching on the $_ from the file)
>        do this stuff if match;
>   }
> }
> 
> I've tried all 3 possibilities.
> 
> Here are the lines that I'm looking at:
> 
> NAME                                 <-- Works fine here
> <LI><A HREF="http://..../">NAME</A>  <-- Doesn't match this line
> 
> I'm trying to find NAME.  Any suggestions?
> Thanks in advance.

Hello;

I do not know if this is what you mean but the following code is
normally what I use to match any patterns I want anywhere in a line.

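if (open(MYFILE2,"$TMP/test1.$PID"))
	{
		# Open the output file once, not once per matching line.
		open (MYFILE3,">>$TMP/test2.$PID")
			or die "Cannot append to $TMP/test2.$PID: $!";
		while (defined($line=<MYFILE2>))
		{
			# Keep only lines that do not mention the netscape cache.
			if ($line !~ /\.netscape\/cache/)
			{
 				print MYFILE3 ("/$line");
			}
		}
		close (MYFILE3);
		close (MYFILE2);
	}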

In this case (a backup script I wrote) I am looking for any netscape
cache files to be sure I am not backing them up.
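
One possible cause of the symptom in the quoted question, offered only as
a guess since the code that sets $pattern is not shown: if $pattern was
read from a file or from STDIN it still ends in a newline, so it can only
match where NAME is followed by the end of the line. A minimal sketch,
with made-up sub names:

```perl
use strict;
use warnings;

# Hypothetical helper: literal substring test after removing a
# trailing newline from the pattern.
sub line_contains {
    my ($line, $pattern) = @_;
    chomp $pattern;
    return index($line, $pattern) >= 0 ? 1 : 0;
}

# Hypothetical helper: the same test as a regex match. \Q...\E quotes any
# characters in the pattern that are special in a regex.
sub line_matches_literal {
    my ($line, $pattern) = @_;
    chomp $pattern;
    return $line =~ /\Q$pattern\E/ ? 1 : 0;
}

my $line = qq{<LI><A HREF="http://example.invalid/">NAME</A>\n};
print line_contains($line, "NAME\n"), "\n";          # 1
print line_matches_literal($line, "NAME\n"), "\n";   # 1
```

The URL in the sample line is a placeholder, not the one from the
original post.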

Hope this helps.


Regards.


------------------------------

Date: Wed, 19 Mar 1997 06:11:53 +0000
From: Greg Hassan <gwhassan@prodigy.net>
Subject: Re: How To Pattern Match To The First Occurance Of A Character
Message-Id: <332F83A9.660B948B@prodigy.net>

Rhadji P wrote:
> 
> I must be missing something. It can't be that difficult.
> 
> What I'm trying to do is pattern match to the first occurance of a
> character.
> 
> For example, given the following string...
> 
> <TEXTAREA NAME="area 1" ROWS="10" COLS="10"></TEXTAREA>
> 
> I want $1 to equal  NAME="area 1" ROWS="10" COLS="10"
> 
> but the pattern /<TEXTAREA(.*)>/
> 
> gobbles up the string to the second ">" causing
> 
> $1 to equal NAME="area 1" ROWS="10" COLS="10"></TEXTAREA
> 
> How do I pattern/<TEXTAREA( match to the first occurance of > ?
> 
> Thanks
> 
> Ron
> dharma@msys.net

you need to do something more descriptive such as:

/<TEXTAREA([^>]+)>/
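
To see the difference on the string from the question, here is a short
sketch. The sub names are made up for illustration, and a \s+ has been
added before the capture so that $1 does not begin with a space.

```perl
use strict;
use warnings;

# Greedy: .* runs to the end of the string and backs up to the LAST >.
sub attrs_greedy {
    my ($s) = @_;
    return $s =~ /<TEXTAREA\s+(.*)>/ ? $1 : undef;
}

# Negated class: [^>]+ cannot cross a >, so it stops at the FIRST one.
sub attrs_negated {
    my ($s) = @_;
    return $s =~ /<TEXTAREA\s+([^>]+)>/ ? $1 : undef;
}

# Non-greedy: .*? takes as little as possible before the next >.
sub attrs_minimal {
    my ($s) = @_;
    return $s =~ /<TEXTAREA\s+(.*?)>/ ? $1 : undef;
}

my $html = '<TEXTAREA NAME="area 1" ROWS="10" COLS="10"></TEXTAREA>';
print attrs_greedy($html),  "\n";  # NAME="area 1" ROWS="10" COLS="10"></TEXTAREA
print attrs_negated($html), "\n";  # NAME="area 1" ROWS="10" COLS="10"
print attrs_minimal($html), "\n";  # NAME="area 1" ROWS="10" COLS="10"
```

None of the three copes with a > inside a quoted attribute value.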


------------------------------

Date: 19 Mar 1997 15:15:28 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: How To Pattern Match To The First Occurance Of A Character
Message-Id: <5govug$ad8$1@csnews.cs.colorado.edu>

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc, 
    dharma@msys.net (Rhadji P) writes:
:I must be missing something. It can't be that difficult.
:
:What I'm trying to do is pattern match to the first occurance of a
:character.
:
:For example, given the following string...
:<TEXTAREA NAME="area 1" ROWS="10" COLS="10"></TEXTAREA>
:I want $1 to equal  NAME="area 1" ROWS="10" COLS="10"
:but the pattern /<TEXTAREA(.*)>/
:gobbles up the string to the second ">" causing
:$1 to equal NAME="area 1" ROWS="10" COLS="10"></TEXTAREA
:How do I pattern match to the first occurance of > ?

    occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce
    occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce
    occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce
    occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce
    occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce occuRREnce

There, now that I've got that out of my system, here's something
from perlfaq6:

  What does it mean that regexps are greedy?  How can I get around it?

    Most people mean that greedy regexps match as much as they
    can. Technically speaking, it's actually the quantifiers (`?',
    `*', `+', `{}') that are greedy rather than the whole pattern; Perl
    prefers local greed and immediate gratification to overall greed. To
    get non-greedy versions of the same quantifiers, use (`??', `*?',
    `+?', `{}?').

    An example:

            $s1 = $s2 = "I am very very cold";
            $s1 =~ s/ve.*y //;      # I am cold
            $s2 =~ s/ve.*?y //;     # I am very cold

    Notice how the second substitution stopped matching as soon as it
    encountered "y ". The `*?' quantifier effectively tells the regular
    expression engine to find a match as quickly as possible and pass
    control on to whatever is next in line, like you would if you were
    playing hot potato.

And here's something from perlfaq9:

  How do I remove HTML from a string?

    The most correct way (albeit not the fastest) is to use HTML::Parse
    from CPAN (part of the libwww-perl distribution, which is a must-have
    module for all web hackers).

    Many folks attempt a simple-minded regular expression approach,
    like `s/<.*?>//g', but that fails in many cases because the tags may
    continue over line breaks, they may contain quoted angle-brackets, or
    HTML comment may be present.  Plus folks forget to convert entities,
    like `&lt;' for example.

    Here's one "simple-minded" approach, that works for most
    files:

        #!/usr/bin/perl -p0777
        s/<(?:[^>'"]*|(['"]).*?\1)*>//gs

    If you want a more complete solution,
    see the 3-stage striphtml program in
    http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz
    .

  How do I extract URLs?

    A quick but imperfect approach is

        #!/usr/bin/perl -n00
        # qxurl - tchrist@perl.com
        print "$2\n" while m{
            < \s*
              A \s+ HREF \s* = \s* (["']) (.*?) \1
            \s* >
        }gsix;

    This version does not adjust relative URLs, understand alternate
    bases, deal with HTML comments, or accept URLs themselves
    as arguments. It also runs about 100x faster than a more
    "complete" solution using the LWP suite of modules, such as the
    http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz
    program.

So, you need to study things like [^>]+ or .*? and
the /s flag and multiline input.  Here are my general tips
about this matter:

    0) Using regular expressions for generalized parsing is quite hard.

    1) You can look at some approaches at
    http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/html-hacking.html
    some of which take this regex thing way to the limit.

    2) You could (and probably should) also get jfriedl's great Mastering
    Regular Expressions book.

    3) But what I think you should really do is get the great LWP module
    from http://cgi.perl.com/cgi-bin/cpan_mod?module=LWP because it
    has facilities to do just what you're talking about -- that is,
    parsing HTML.


--tom

PS: Here's my standard treatise for remembering how to spell
    occurrence:

    Subject: The Hardest Word to Spell in the English Language

    Ranking right up there with "separate", the word "occur" and its
    compounds are amongst the most commonly misspelled words in the
    whole English language.  You seem to have gotten one of these "occur"
    words wrong in the cited text.  I think no one ever teaches people
    not just HOW to spell the "occur" words but, more importantly, WHY
    they are spelled that way.  So please permit me, if you would, to show
    you WHY it's spelled as it is, and then maybe you'll remember better
    next time.

    First of all, "occur" in English derives from a Latin verb (occurrere)
    whose conjugation happens to be something other than the 1st (-are
    verbs), and 1st conjugation verbs are the *ONLY* Latin ones that go
    to -ance and -able compounds in English.  (Non-Latin ones, like those
    of German derivation, also go to the -a- forms.)  Thus, it must form
    compounds with -ence and -ible.  If you can't remember any Latin,
    it's simple to think of what it would be in any modern Latin dialect,
    like Spanish, French, Italian, or Portuguese.  In all cases, you
    have a verb ending in /[ei]re?$/, (in fact, /^occ?o?urr?[ei]re?$/
    describes all the forms), which clues you in that it's not first
    conjugation at all.  That means -ance is dead wrong.

    Secondly, English verbs stressed on the last syllable using "short"
    vowels in English (like infer, beget, forget, and incur) always
    double their final consonant when adding suffixes, thus providing
    us with infeRRing and infeRRIBLE (FN#1), begeTTing and begeTTer, forgeTTing
    and forgeTTABLE, and incuRRed and incuRRing. (FN#2) 

    Thus it must be occuRRed, occuRRing, and occuRRENCE.

    I truly hope this answers the WHY, rather than the traditional
    boring and painful spelling flames which never do more than chastise
    the poster.  I've tried to tell you not just HOW, but also WHY.

    I hope it's some help.

    FN#1: Yes, I know "inferable" is gaining popularity over "inferrible",
    but if we go down that route, we'll end up with "occurance", which is
    not supportable.

    FN#2: If it isn't stressed at the end, spelling varies, like
    formated/formatted, travelled/traveled, and programmed/programed.
    My own choice is to double it in all or nearly all cases.


-- 
	Tom Christiansen	tchrist@jhereg.perl.com
    There is, however, a strange, musty smell in the air that reminds me of
    something...hmm...yes...I've got it...there's a VMS nearby, or I'm a Blit.
        --Larry Wall in Configure from the perl distribution


------------------------------

Date: Fri, 14 Mar 1997 14:44:52 GMT
From: dhl@mrdog.msl.com (Donald H. Locker)
Subject: Re: ISBN/Checkdigit calculator
Message-Id: <E71FMt.KJ@mrdog.msl.com>

[posted and emailed]

How about: (caution: untested, untried, undebugged)

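sub isbn_check_digit {
    my $isbn = shift;    # the digits to be weighted, as one string
    my $indx = 10;       # weight of the first digit, counting down
    my ($sum, $digit, $lastdigit) = (0);

    # Weights and remainder follow the description quoted below.
    for $digit (split (//, $isbn)) {
        warn "too many digits" if $indx == 0;
        $sum += $indx * $digit;
        $indx -= 1;
    }
    $lastdigit = $sum % 11;
    $lastdigit = 'X' if $lastdigit == 10;
    return $lastdigit;
}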

In article <3325FD5B.7546@sybex.com>, Matt Riggsby  <mriggsby@sybex.com> wrote:
>I need a perl script that will calculate the final digit of an ISBN and
>would prefer not to reinvent the wheel if at all possible.  For those of
>you unfamiliar with the process, a book's ISBN is an eleven digit
>number, the first ten of which identify the publisher and sequence in
>publishing.  A check-digit is calculated by calculating a sum with the
>first ten digits (10 * the first digit + 9 * the second and so on), then
>dividing the sum by 11 and taking the remainder (the digit is "X" if the
>remainder is 10).  Has anyone written a perl routine for this?  I
>browsed through CPAN, but nothing definitively relevant popped up at
>me.
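
A note on the rule itself, with a sketch. As far as I know, the ten-digit
ISBN scheme differs a little from the description quoted above: there are
nine digits before the check digit, they are weighted 10 down to 2, and
the check digit is whatever makes the weighted total divisible by 11, with
X standing for 10. The sub name isbn10_check_digit is made up for
illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the first nine digits of an ISBN-10
# (hyphens and spaces allowed) and returns the check digit.
sub isbn10_check_digit {
    my ($isbn) = @_;

    # Keep only the digits.
    (my $digits = $isbn) =~ s/[^0-9]//g;
    die "expected 9 digits, got " . length($digits) . "\n"
        unless length($digits) == 9;

    # Weighted sum: 10 * first digit + 9 * second ... + 2 * ninth.
    my ($sum, $weight) = (0, 10);
    for my $d (split //, $digits) {
        $sum += $weight * $d;
        $weight--;
    }

    # The check digit brings the total up to a multiple of 11.
    my $check = (11 - $sum % 11) % 11;
    return $check == 10 ? 'X' : $check;
}

print isbn10_check_digit("1-56592-149"), "\n";   # 6
print isbn10_check_digit("0-8044-2957"), "\n";   # X
```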

-- 
Donald.
These opinions were formulated by a trained professional.
              DO NOT TRY THIS AT HOME!
      At the time, the tone will be ... BEEP!


------------------------------

Date: Wed, 19 Mar 1997 14:16:39 GMT
From: adelton@fi.muni.cz (Honza Pazdziora)
Subject: Re: Lost backslash
Message-Id: <adelton.858780999@aisa.fi.muni.cz>

Thomas Buehner <buehner@pfaffenhofen.netsurf.de> writes:

> When I run this script:
> 
>  $test = '\.\\test';
>  print $test;
>  
> what gets printed is: \.\test
> 
> I am confused. What I expected to get was: \.\\test, maybe .\test, if my 
> understanding of Perl had been wrong, but \.\test?

You can do $test = '\.\\te\'st'; to get ' into the string. Well then
you need a way to get \ into the string, and the way is double
backslash.

So \\ -> \, but you do not need to write two just to get one, in fact
print '\.\test'; will print \.\test just OK.

Read the manual page for more info.
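
A short sketch of the rule, for checking it yourself. The sub name
single_quote_samples is made up for illustration. Inside single quotes
only \\ and \' are special; any other backslash stays as it is.

```perl
use strict;
use warnings;

# Hypothetical helper: returns three single-quoted strings for comparison.
sub single_quote_samples {
    return (
        '\.\\test',      # \\ collapses to one backslash
        '\.\test',       # a lone backslash is kept as written
        '\.\\te\'st',    # \' puts a quote into the string
    );
}

my ($doubled, $single, $quoted) = single_quote_samples();
print "$doubled\n";                                # \.\test
print "$single\n";                                 # \.\test
print "$quoted\n";                                 # \.\te'st
print length($doubled), "\n";                      # 7
print $doubled eq $single ? "same\n" : "different\n";   # same
```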

--
------------------------------------------------------------------------
 Honza Pazdziora | adelton@fi.muni.cz | http://www.fi.muni.cz/~adelton/
                   I can take or leave it if I please
------------------------------------------------------------------------


------------------------------

Date: Wed, 19 Mar 1997 08:16:14 +0000
From: Opera Ghost <agent.email@NetTown.com>
To: Mark Glover <mglover@rnib.org.uk>
Subject: Re: Newbie Question on accessing elements of Strings
Message-Id: <332FA0CE.17F30865@NetTown.com>

Mark Glover wrote:
> 
> I am sure there is a very simple and obvious answer to this question,
> but I can't find the answer in any of the books I have!
> 
> I am reading lines of text into a string from a file using the READ
> function call and now I want to cycle through the string character by
> character but because PERL strings are not arrays I cannot see how to
> do this; in 'C' I would simply do:-
> 
>         for (n=0; n<strlen(str); ++n) {
>                 do something useful with str[n];
>         }
> 
> Any help would be very much appreciated.
> 
> Mark

$str = 'hello mark';

while ($str =~ /./g) {
  print "$&\n";
}
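
Two other ways to do it, as a minimal sketch with made-up sub names. One
difference worth knowing: /./g skips newline characters, while split and
substr visit every character.

```perl
use strict;
use warnings;

# Hypothetical helper: split on the empty pattern, one element per character.
sub chars_by_split {
    my ($str) = @_;
    return split //, $str;
}

# Hypothetical helper: closest to the C loop, indexing with substr.
sub chars_by_substr {
    my ($str) = @_;
    my @chars;
    for my $n (0 .. length($str) - 1) {
        push @chars, substr($str, $n, 1);
    }
    return @chars;
}

print join("|", chars_by_split("hello")), "\n";    # h|e|l|l|o
print join("|", chars_by_substr("hello")), "\n";   # h|e|l|l|o
```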

this is running at 
http://nettown.com/site_perl/doc/eg/news.string.cgi
this source is at
http://nettown.com/site_perl/doc/eg/news.string.cgi.source
--
<i>Opera Ghost
<mailto:agent.email@NetTown.com>
[http://NetTown.com/site_perl/]
have a good day!


------------------------------

Date: Thu, 20 Mar 1997 01:02:56 +1000
From: Rodos <rodos@haywood.org>
To: Dan Sumption <dan@gulch.demon.co.uk>
Subject: Re: Parsing html tag attributes & values in Perl
Message-Id: <33300020.4445@haywood.org>

Dan Sumption wrote:
> My problem is that that the program separates out the tag by
> looking for the < and > characters. This can lead to errors (and
> a potentially never-ending loop) when one of these characters
> appears as part of a value. For example:
> <img src="arrow.gif" alt="Look Here >">
> Should be reduced to:
> src="arrow.gif" alt="Look Here >"
> before parsing, but my code instead reduces it to:
> src="arrow.gif" alt="Look Here ">
> 
> I can write various 'fixes' to avoid some of these potential
> problems, but they all seem very long-winded and ugly looking.
> Surely there must be 'more than one way to do it'?

> Enough already! Here's the code:
> (call using, for e.g., %pairs=&tag_attribs('<META NAME="Author"
> CONTENT="Dan Sumption">');
> 
> sub tag_attribs
> {
>         local ($_, *pairs) = shift;
>         s/^\s*<\s*\w+\s*([^>]*)>.*/$1/;

Dan, I have worked out what's going wrong. 

In the first regex (above) the ([^>]*) will stop at the FIRST > and not
gobble up characters until the last >. That's why you end up with  
src="arrow.gif" alt="Look Here ">
rather than
src="arrow.gif" alt="Look Here >"
as you want.

Now I played with this trying to use a $ at the end to get the regex to
gobble as much as I could but it got awfully complicated and I could not
get it to work. Then I replaced it with two regexes.

s/^[^<]*<(.*)/$1/;
s/(.*)>\s*$/$1/;

Which drop everything before the first < and then everything after the
last >. After this (once the tag name is removed) the value left is
src="arrow.gif" alt="Look Here >"
and it works no matter how many < or > are in quoted text.

Here is my test program.

Rodos

P.S. You did a great job on the parsing of the attribute value pairs!
P.P.S. I added comments (as I can never remember what my code does a
week later) so it's a little longer.

# parsetag.pl

$check = ' < img src=blue.gif alt="<blue dot>"> ';  &dump;
$check = ' < img src=blue.gif alt="blue dot>"> ';   &dump;
$check = q| < img src='blue.gif' alt="blue dot"> |; &dump;
$check = q| <table border cellpadding=5> |;         &dump;
$check = q| <table border cellpadding=5 > |;        &dump;
$check = q| <td width=50% valign="center"> |;       &dump;

exit;

sub dump {
    ($tag, %pairs) = &parseTag($check);
    print "\n\n$check\n$tag : ";
    foreach $key (keys %pairs) {
        print "'$key = $pairs{$key}'\t";
    }
}

sub parseTag {

    use strict;
    local ($_) = shift;
    my ($attribute, $value, $separator, $tag, %pairs) = '';

    # Remove everything up to and including the first <
    s/^[^<]*<(.*)/$1/;

    # Remove everything after and including the outer >
    s/(.*)>\s*$/$1/;

    # Take the first word, it's the tag
    s/^\s*((\w|-)+)\s*//;
    $tag = $1;

    # Whilst we still have attribute pairs read them
    while ($_) {

        # Get the next word delimited by a space; stop if there is none,
        # otherwise the loop would never end
        s/^\s*((\w|-)+)\s*// or last;
        $attribute = $1;

        # Is there a value for the attribute? Remove the =
        if (s/^\s*=\s*//) {

            # Is the value surrounded by a delimiter?
            if (s/^\s*("|')//) {

                # There was a delimiter; grab everything up to the
                # matching closing delimiter
                $separator = $1;
                s/^([^$separator]*)$separator\s*//;
                $value = $1;
            }
            else {

                # No delimiter on the value, so grab everything up to
                # the next gap
                s/^(\S*)\s*//;
                $value = $1;
            }
        }
        else {
            # There was no =, so it was a bare attribute, such as
            # "border"
            $value = '';
        }
        $pairs{$attribute} = $value;
    }           
    return ($tag, %pairs);
}


------------------------------

Date: Wed, 19 Mar 1997 07:28:57 -0800
From: Tom Phoenix <rootbeer@teleport.com>
To: Dave@Thomases.com, Chip Salzenberg <chip@atlantic.net>
Subject: Patch for docs Re: Lost backslash
Message-Id: <Pine.GSO.3.96.970319071438.24834G-100000@kelly.teleport.com>

On 19 Mar 1997, Dave Thomas wrote:

> From 'perldoc perlop':
> 
>        'STRING'
>                       A single-quoted, literal string.  Backslashes are
>                       ignored, unless followed by the delimiter or
>                       another backslash, in which case the delimiter or
>                       backslash is interpolated.

Hmmm... Your answer to the original question was correct, but that quote
from the docs doesn't seem right. In '\n', the backslash is not ignored. 
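
A small sketch of the behaviour in question, assuming nothing beyond
core Perl; the variable names are only for illustration. It prints the
length of each string so the difference is visible.

```perl
use strict;
use warnings;

my $single  = '\n';    # backslash is kept: two characters, \ and n
my $double  = "\n";    # escape is interpreted: one newline character
my $escaped = '\\';    # backslash before a backslash: one backslash
my $quote   = '\'';    # backslash before the delimiter: one quote

printf "%d %d %d %d\n",
    length($single), length($double), length($escaped), length($quote);
# prints: 2 1 1 1
```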

This patch may help. Then again, it may not. :-)

--- perl5.003_93/pod/perlop.pod.orig	Wed Mar 19 08:19:41 1997
+++ perl5.003_93/pod/perlop.pod	Wed Mar 19 08:23:52 1997
@@ -767,12 +767,13 @@
 
 =item C<'STRING'>
 
-A single-quoted, literal string.  Backslashes are ignored, unless
-followed by the delimiter or another backslash, in which case the
-delimiter or backslash is interpolated.
+A single-quoted, literal string. A backslash represents a backslash
+unless followed by the delimiter or another backslash, in which case
+the delimiter or backslash is interpolated.
 
     $foo = q!I said, "You said, 'She said it.'"!;
     $bar = q('This is it.');
+    $baz = '\n';		# a two-character string
 
 =item qq/STRING/
 
@@ -783,6 +784,7 @@
     $_ .= qq
      (*** The previous line contains the naughty word "$1".\n)
 		if /(tcl|rexx|python)/;      # :-)
+    $baz = "\n";		# a one-character string
 
 =item qx/STRING/
 


-- Tom Phoenix        http://www.teleport.com/~rootbeer/
rootbeer@teleport.com   PGP  Skribu al mi per Esperanto!
Randal Schwartz Case:     http://www.lightlink.com/fors/



------------------------------

Date: 19 Mar 1997 14:59:43 +0100
From: Jonny Birkelund <jonnyb@omni.uio.no>
Subject: perl-binaries for aix1.3
Message-Id: <pd8sp1s6kz4.fsf@omni.uio.no>

Hi,
I'm looking for binaries for aix1.3 (that is aixps2 version close to
stoneage) 

Any hints would be appreciated.

/Jonny
Univ. of Oslo


------------------------------

Date: Wed, 19 Mar 1997 14:55:06 +0100
From: matgorl@chem.leidenuniv.nl
Subject: perl-script doesn't run
Message-Id: <332FF03A.4EDD@chem.leidenuniv.nl>

I have written a simple Perl script and put it on my server in
my-home-directory/public_html/cgi-bin/
When I run it from the command line during a telnet session, it works
perfectly and outputs the text "hallo" (see the script below).
When I run it through Netscape (using a form and POST), I get an error
501.
What am I doing wrong? The location of perl on the server is right. The
script itself works, but it doesn't seem to be interpreted as a script
when I refer to it by an HTML request. The permissions are rwxrwxrwx, so
it is accessible by any means. And when I call the file with the GET
method, it is just printed on the screen as if it were a normal HTML
document.
Do I have to place the script in a specific directory on the server? Or
do I have to change a file on the server to indicate that a file with
a cgi extension has to be interpreted instead of echoed to the caller?
Or is there some kind of environment variable that has to be set?

#!/usr/local/bin/perl 

print <<EOF ;
hallo
EOF

exit ;


Thanks,


------------------------------

Date: Wed, 19 Mar 1997 13:26:55 +0000
From: Rod Neep <rod@neep.demon.co.uk>
Subject: reading in directory & filenames from a log file
Message-Id: <BdTL+JAfm+LzEwNF@neep.demon.co.uk>

The following snippet from a script reads a log file with lines in the
following format:

(all on one line)
xyz.demon.co.uk - - [02/Mar/1997:12:19:21 +0000] "GET
/fweb/fddc/tourist/index.htm HTTP/1.0" 200 3895

there are several sections to this line, each separated by a space.

The following script works fine to get the 7th section, and if it
contains the string "index" then places the full path and filename into
$docList (later I get it to count the number of each docList, and then
print the results to another file)

This works fine BUT what I want to be able to do is extract ONLY those
which contain "index" in a *certain group of directories*

for example 

/fweb/fddc/index.htm
/fweb/fddc/tourist/index.htm
/fweb/fddc/etc/index.htm

and ignore all others such as /fweb/dean ... etc

How can I change the following (working) section of the script to do
what I need?

Your help would be much appreciated
Thanks

Rod

====================== section of script =========================
open(LOGFILE) or die("Could not open log file.");

#iterate over each line of the logfile
foreach (<LOGFILE>) {
    # parse the entry to extract all the items but only keep item 7
    $fileSpec = (parseLogEntry())[7];
    # put the filename into pattern memory
    $fileSpec =~ m!.+/(.+)!;

    # store the filename into $fileName
    $fileName = $1;
    # some requests don't specify a filename, just a directory.
    # so test to see that the filename is defined
    if (defined($fileName)) {

        # specify the string to be searched for in the filename in item 7
        # for example in /fweb/fddc/tourist/index.htm
        #                                   V here
        $docList{$fileSpec}++ if $fileName =~ m/^index/i;
    }
}
close(LOGFILE);

====================== end of section of script ===================
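
One possible way to restrict the count to a group of directories is to
test the full path against a directory prefix as well as the "index"
filename. The following is only a sketch: parseLogEntry() is not shown
in the post, so the hypothetical helper request_path() stands in for
it, and the sample log lines and the prefix /fweb/fddc/ are assumptions
taken from the examples above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for parseLogEntry(): return the request path
# from one log line, or undef if the line has no quoted request.
sub request_path {
    my ($line) = @_;
    return $line =~ m{"[A-Z]+\s+(\S+)} ? $1 : undef;
}

# Count index pages at or below the given directory prefix only.
sub count_index_hits {
    my ($prefix, @lines) = @_;
    my %docList;
    for my $line (@lines) {
        my $path = request_path($line);
        next unless defined $path;
        # \Q...\E protects any regex metacharacters in the prefix;
        # (?:[^/]+/)* allows any number of subdirectories below it.
        $docList{$path}++
            if $path =~ m{^\Q$prefix\E(?:[^/]+/)*index[^/]*$}i;
    }
    return %docList;
}

my @sample = (
    'xyz.demon.co.uk - - [02/Mar/1997:12:19:21 +0000] "GET /fweb/fddc/tourist/index.htm HTTP/1.0" 200 3895',
    'xyz.demon.co.uk - - [02/Mar/1997:12:19:25 +0000] "GET /fweb/fddc/index.htm HTTP/1.0" 200 1200',
    'xyz.demon.co.uk - - [02/Mar/1997:12:19:30 +0000] "GET /fweb/dean/index.htm HTTP/1.0" 200 1400',
    'xyz.demon.co.uk - - [02/Mar/1997:12:19:41 +0000] "GET /fweb/fddc/etc/page.htm HTTP/1.0" 200 900',
);

my %hits = count_index_hits('/fweb/fddc/', @sample);
print "$_ $hits{$_}\n" for sort keys %hits;
```

In the original loop the same idea would amount to adding a second
condition on $fileSpec next to the existing test on $fileName.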



-- 
Rod Neep
Cinderford, Gloucestershire, England
E-mail    : rod@neep.demon.co.uk


------------------------------

Date: 19 Mar 1997 14:56:27 +0000
From: Olivier Dehon <dehon_olivier@jpmorgan.com>
Subject: Redefining character set matched by \w ?
Message-Id: <njzpvwwszfo.fsf@jpmorgan.com>


Hi all,

I was wondering if it was possible in Perl to redefine the character
set that is matched by \w. It is with emacs.
The default set (I think) is [0-9a-zA-Z_] for Perl.
Under emacs, the default set is [0-9a-zA-Z], but I can change it so
that it works just like Perl.

I couldn't find this feature in perlre. Maybe it's just not possible
and I will have to use an explicit character set in my RE.

I also understand that this feature would make Perl code less
readable, but well, it could be useful sometimes.

Thanks in advance for pointers.

Regards,
Olivier Dehon
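
The explicit character set mentioned above can at least be kept in one
place and reused, which recovers some of the convenience of \w. This is
a sketch only: the variable name $word is made up, and qr// needs a
Perl newer than the one current at the time of this digest (a plain
string holding the class would serve the same purpose there).

```perl
use strict;
use warnings;

# Hypothetical "word" class: letters and digits, but no underscore.
my $word = qr/[0-9A-Za-z]/;

# Use it wherever \w would otherwise appear.
my @words = 'foo_bar baz42' =~ /($word+)/g;
print join(',', @words), "\n";
# prints: foo,bar,baz42
```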


------------------------------

Date: Wed, 19 Mar 1997 16:19:21 +0000
From: Alex Schajer <schajer@dircon.co.uk>
Subject: replacing et al
Message-Id: <33301209.167E@dircon.co.uk>

Hi,

I'm having a spot of bother, hope you can help:

How do I replace everything after a certain point?
E.g. /this/that/then/this/replace.me

How do I keep the this and that parts and cut replace.me?

I know it must be something flash with s///g or tr.

Thanks

Alex
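
A minimal sketch of one way to do this with s///, assuming the part to
cut is everything after the last slash; the variable names and the
replacement text new.name are made up for illustration.

```perl
use strict;
use warnings;

my $path = '/this/that/then/this/replace.me';

# Cut everything after the last slash.
(my $dir = $path) =~ s{[^/]*$}{};
print "$dir\n";    # prints: /this/that/then/this/

# Or replace it with something else.
(my $new = $path) =~ s{[^/]*$}{new.name};
print "$new\n";    # prints: /this/that/then/this/new.name
```

The pattern [^/]*$ can only match the run of non-slash characters at
the very end of the string, so no /g is needed.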


------------------------------

Date: Wed, 19 Mar 1997 14:06:44 GMT
From: aml@world.std.com (Andrew M. Langmead)
Subject: Re: SORTING ... one more Q
Message-Id: <E7An78.893@world.std.com>

friedman@medusa.acs.uci.edu (Eric D. Friedman) writes:

>Is the Schwartzian Transform documented in some obvious place that
>I've over looked? I've seen a number of allusions to it in recent
>days, but have yet to catch a glimpse of the thing itself.

One place is the FAQ.

>=head2 How do I sort an array by (anything)?
[stuff deleted in the hope that people read the stuff after the FAQ quote]
>If you have a complicated function needed to pull out the part you
>want to sort on, then don't do it inside the sort function.  Pull it
>out first, because the sort BLOCK can be called many times for the
>same element.  Here's an example of how to pull out the first word
>after the first number on each item, and then sort those words
>case-insensitively.
>
>    @idx = ();
>    for (@data) {
>        ($item) = /\d+\s*(\S+)/;
>        push @idx, uc($item);
>    }
>    @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ];
>
>Which could also be written this way, using a trick
>that's come to be known as the Schwartzian Transform:
>
>    @sorted = map  { $_->[0] }
>              sort { $a->[1] cmp $b->[1] }
>              map  { [ $_, uc((/\d+\s*(\S+)/)[0]) ] } @data;
>
[more stuff deleted. Useful stuff, get the FAQ and read it.]
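
For readers who want to try it, here is the transform from the FAQ
quote as a small runnable sketch. The contents of @data are invented
sample values; the three steps are the FAQ's own.

```perl
use strict;
use warnings;

my @data = ('3 pear', '10 Apple', '7 banana');

my @sorted =
    map  { $_->[0] }                            # 3. unwrap the original item
    sort { $a->[1] cmp $b->[1] }                # 2. sort on the cached key
    map  { [ $_, uc((/\d+\s*(\S+)/)[0]) ] }     # 1. pair each item with its key
    @data;

print "$_\n" for @sorted;
# prints:
# 10 Apple
# 7 banana
# 3 pear
```

The key (the upper-cased first word after the number) is computed once
per item in the first map, instead of on every comparison inside sort.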

Also, the document "Far More Than Everything you Ever Wanted to Know
About Sorting." <http://www.perl.com/CPAN/doc/FMTEYEWTK/sorting.html>
shows the code, but it was long before the term "Schwartzian Transform"
was coined.

Finally, putting "Schwartzian Transform" into a Usenet search engine
like DejaNews <http://www.dejanews.com> or AltaVista
<http://www.altavista.digital.com> shouldn't come up with too many
false hits. One interesting Usenet post is Message-Id:
<4l36f0$ghg@csnews.cs.colorado.edu>  where Tom Christiansen benchmarks
various snippets of sorting code. (It is also the first Usenet article
to use the term "Schwartzian Transform".)

-- 
Andrew Langmead


------------------------------

Date: Wed, 19 Mar 1997 07:33:54 -0800
From: Tom Phoenix <rootbeer@teleport.com>
To: Lewis Taylor <lewis@nexusint.com>
Subject: Re: text to HTML
Message-Id: <Pine.GSO.3.96.970319073258.24834I-100000@kelly.teleport.com>

On Tue, 18 Mar 1997, Lewis Taylor wrote:

> I am currently writing a primitive text to HTML routine. 

Is there any reason you can't use code from CPAN? Hope this helps!

    http://www.perl.org/CPAN/
    http://www.perl.com/CPAN/

-- Tom Phoenix        http://www.teleport.com/~rootbeer/
rootbeer@teleport.com   PGP  Skribu al mi per Esperanto!
Randal Schwartz Case:     http://www.lightlink.com/fors/



------------------------------

Date: 19 Mar 1997 15:39:20 GMT
From: barry@mayo.edu (Jon Barry)
Subject: Wanted: name parser
Message-Id: <5gp1b9$i7u$1@tribune.mayo.edu>

Can someone give me a reference to a name parser? I'm looking for a
function that will split up a name given in one string into its component
parts such as lastname, firstname, title(s), surname(s)...etc. This should
work for names that might come from all over the world. I don't want to
re-invent something I'm sure has been done before.

Many thanks, Jon




           Jon Barry
     barry.jon@mayo.edu
Mayo Clinic Cancer Center Statistics



------------------------------

Date: 19 Mar 1997 13:29:33 GMT
From: scheiter@informatik.tu-muenchen.de (Corina Scheiter)
Subject: Wrapper for constructing a privileged socket
Message-Id: <5gopnt$q9a@sunsystem5.informatik.tu-muenchen.de>
Keywords: Wrapper, Socket, Server


I want to write a wrapper, which builds a socket (in future on a privileged port)
and then starts a server. The server should get the opened socket from the
wrapper to communicate on it.
But the problem is:
When the wrapper starts the server with exec (I also tried system), the server
always gets a closed socket.
Here is the code:

The wrapper:
#!/usr/local/dist/bin/perl -w

require 5.002;
#use strict;
use Socket;
use Carp;

#initialization 
my ($rsockname,$rec, $proto, $pid);
local ($port, $socket); 
my ($debug);
$debug=1;

$port = shift || 2000;
$proto = getprotobyname('tcp');
socket(Server, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
setsockopt(Server, SOL_SOCKET, SO_KEEPALIVE, pack("l", 1)) || 
die "setsockopt $!";
bind(Server, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
if ($debug) { print ("Socket bound, calling lcnd\n"); }
$socket=\*Server;
#print ("$socket\n");
#print ("$$socket\n");

@list = ("lcnd1", "$port","\*Server");
exec (@list);

The server:
#!/usr/local/dist/bin/perl -w

require 5.002;
#use strict;
use Socket;
use Carp;

#initialisation
my ($ok, $rsockname,$rec, $socket); 
$ok = 200;
my ($debug);
$debug=1;
$port=$ARGV[0];
print ("Server $ARGV[1]\n");
$socket=$ARGV[1];

print ("port=$port, socket=$socket\n");
sub logmsg { print STDOUT "$0 $$: @_ at ", scalar localtime, "\n" }

listen($socket,SOMAXCONN) || die "listen: $!";
 ...


the error message:
listen() on closed fd at ./lcnd1 line 20.
listen: Bad file number at ./lcnd1 line 20.


Thanks

Corina
scheiter@informatik.tu-muenchen.de
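
A possible direction, offered as a sketch under assumptions rather than
a tested fix for lcnd1: the string "*Server" names a filehandle that
exists only in the wrapper, so the new program cannot use it. What does
survive an exec is the numeric file descriptor, provided its
close-on-exec flag is cleared (Perl sets that flag on descriptors above
$^F, which is 2 by default). The wrapper can pass fileno() as an
argument, and the server can re-open the descriptor by number. To stay
self-contained, the sketch runs both halves in one process and binds to
an arbitrary free port on the loopback address; the exec line is shown
only as a comment.

```perl
use strict;
use warnings;
use Socket;
use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);

# --- wrapper side ---
socket(my $server, PF_INET, SOCK_STREAM, getprotobyname('tcp'))
    or die "socket: $!";
# Port 0 asks the system for any free port (demo only).
bind($server, sockaddr_in(0, INADDR_LOOPBACK)) or die "bind: $!";

# Clear close-on-exec so the descriptor would stay open across exec().
my $flags = fcntl($server, F_GETFD, 0);
defined $flags or die "fcntl F_GETFD: $!";
fcntl($server, F_SETFD, ($flags + 0) & ~FD_CLOEXEC)
    or die "fcntl F_SETFD: $!";

my $fd = fileno($server);
# A real wrapper would now hand over the number, for example:
#   exec('lcnd1', $port, $fd);

# --- server side (would read the number from $ARGV[1]) ---
# "+<&=" attaches a new filehandle to an existing descriptor number.
open(my $sock, '+<&=', $fd) or die "cannot reopen fd $fd: $!";
listen($sock, SOMAXCONN) or die "listen: $!";

my ($port) = sockaddr_in(getsockname($sock));
print "listening on fd $fd, port $port\n";
```

Raising $^F before calling socket() in the wrapper is another way to
keep the descriptor open across exec.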



------------------------------

Date: 8 Mar 97 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 8 Mar 97)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V8 Issue 144
*************************************
