[23877] in Perl-Users-Digest


Perl-Users Digest, Issue: 6080 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Feb 4 18:05:46 2004

Date: Wed, 4 Feb 2004 15:05:09 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 4 Feb 2004     Volume: 10 Number: 6080

Today's topics:
    Re: Clarifications <Joe.Smith@inwap.com>
    Re: Clarifications <bik.mido@tiscalinet.it>
        Copy Constructor Craziness (Unknown Poster)
        Counting words (Fran)
    Re: Counting words (Walter Roberson)
    Re: Counting words <noreply@gunnar.cc>
        Difficulty cleaning oddly encoded whitespace (from MS H (David R. Throop)
        Docs comprehensibility [was: Perl For... ] <noreply@gunnar.cc>
    Re: group but do not capture (naren)
    Re: how to find the last "new line" in string <jgibson@mail.arc.nasa.gov>
        Looking for a FAQ article on autoposting in PERL <cob@hotmail.com>
    Re: Looking for a FAQ article on autoposting in PERL <dwall@fastmail.fm>
    Re: Need help reading a perl regexp - someone clue me? <jill_krugman@yahoo.com>
    Re: Need help reading a perl regexp - someone clue me? <usenet@morrow.me.uk>
    Re: Need help reading a perl regexp - someone clue me? <usenet@morrow.me.uk>
    Re: Need help reading a perl regexp - someone clue me? <dakidd@sonic.net>
    Re: newbie help <gnari@simnet.is>
        Perl data types <no@spam.uk>
    Re: Perl data types <emschwar@pobox.com>
    Re: Perl For Amateur Computer Programmers <Joe.Smith@inwap.com>
    Re: Perl For Amateur Computer Programmers <edgrsprj@ix.netcom.com>
    Re: Perl For Amateur Computer Programmers <uri@stemsystems.com>
    Re: Perl For Amateur Computer Programmers (G Klinedinst)
    Re: RegExp for matching word "karl@aol.com" or word "pa <jill_krugman@yahoo.com>
    Re: RegExp for matching word "karl@aol.com" or word "pa <noreply@gunnar.cc>
    Re: Simple syntax question <usenet@morrow.me.uk>
    Re: Simple syntax question (Walter Roberson)
    Re: Simple syntax question <tore@aursand.no>
    Re: Simple syntax question <ittyspam@yahoo.com>
    Re: Site Visitor's eMail Address <jwillmore@remove.adelphia.net>
    Re: Site Visitor's eMail Address <lh+canned_pork@tn.no>
    Re: Site Visitor's eMail Address <tore@aursand.no>
        Soap-Lite XML parameter error, doc must have top level  (doug)
    Re: Turn $5 into $15,000 or more!!! Here's how.... (David R. Throop)
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 04 Feb 2004 21:38:52 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: Clarifications
Message-Id: <MDdUb.173729$Rc4.1316159@attbi_s54>

edgrsprj wrote:

> "Brad Baxter" <bmb@ginger.libs.uga.edu> wrote in message
> news:Pine.A41.4.58.0402021311110.21724@ginger.libs.uga.edu...
> 
>>On Sun, 1 Feb 2004, edgrsprj wrote:
> 
> 
>>Not if you give them "codes" that are syntax errors:
>>
>>    open FILENAME &#8220;&gt; c:/textfile.txt&#8221;;
>>
> 
> 
> I have a third browser that I am going to have to try in order to examine
> the statements on that page. Your post makes it appear that for some reason
> they are not displaying correctly with certain browsers.  Perhaps they will
> need to be stored there with a preformatted text format.

It's not a problem with any browser.  It's your source code.

1: open FILENAME “> c:\textfile.txt”;  # “smart quotes”
2: open FILENAME "> c:/textfile.txt";  # "double quotes" forward slash

Any code you post needs to be formatted such that the quote marks
are vertical, not slanted left and right like line 1 above.
	-Joe
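
For reference, a minimal sketch of a form of the statement that compiles, assuming the goal is simply to open a file for writing. The scratch directory, the lexical filehandles and the file contents are choices made for this illustration and are not from the original page. Note that open also expects a comma after the filehandle.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical location: a scratch directory instead of c:/textfile.txt.
my $dir  = tempdir(CLEANUP => 1);
my $path = File::Spec->catfile($dir, 'textfile.txt');

# Plain ASCII quotes, a comma after the filehandle, and an error check.
open(my $out, '>', $path) or die "Cannot open $path for writing: $!";
print {$out} "hello\n";
close($out) or die "Cannot close $path: $!";

# Read the line back to confirm the write worked.
open(my $in, '<', $path) or die "Cannot open $path for reading: $!";
my $line = <$in>;
close($in);
```

With a bareword filehandle, the older spelling would be open(FILENAME, "> c:/textfile.txt"), again with vertical quotes and a comma.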


------------------------------

Date: Thu, 05 Feb 2004 23:22:55 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: Clarifications
Message-Id: <tnc520ta6d6t4jafe4g28at6uedmk8hqvl@4ax.com>

On Wed, 04 Feb 2004 01:00:15 GMT, "John W. Kennedy"
<jwkenne@attglobal.net> wrote:

>> However you may find it interesting to know what my own dictionary
>> says about these two terms:
>> 
>>   SIGLA sf. [prob. dal lat. singula littera] abbreviatura di una o
>>   piu' parole per lo piu' rappresentata dalle iniziali di esse.
>>   
>>   ACRONIMO sm. nome formato con le lettere iniziali di altre parole.
>>                ^^^^

>Interesting.  In the 40 or so years since my dictionary was published, 
>"sigla" seems to have shifted its meaning a little; I have only 

Well, my own dictionary is just as old, so the claim has no
foundation...

>"initials, monogram; abbreviation".  Of course, there are far, far more 

Well, in practice *I* (not you) have *only* "initials" and
"abbreviation". But indeed "sigla" is also used for monograms.

To cut a long story short, the point is that both officially and to
ordinary people a "thing" like "USA", "CCCP" (i.e. "SSSR", btw!), etc.
is a sigla. That's what you'd call it in Italy in any case. Most
people simply don't know what an acronimo is.

As a side note, "sigla" has recently acquired another meaning: it's
the opening (or closing) part of a movie or a TV show, generally the
part with the music. I don't even know the English term for
it..."opening titles" maybe?

>acronyms now than there were then, thanks to computers.  (But none as 
>famous as "Vittorio Emmanuele, Re D'Italia".)

Huh?!? I just don't know it, but in any case you mean *Emanuele*,
don't you? Oh, come on, what's its meaning?

>I notice that the two entries you quote above label the words "sf" and 
>"sm".  That suggests to me that your dictionary's editors call a noun a 
>"sostantivo", not a "nome", and that "nome" in the definition of 
>"acronimo" simply has its ordinary meaning of "name".

You're perfectly right about this point! I just didn't translate "sf"
and "sm" because... well, I was just too lazy to do it!! But hey:
we're on a Perl ng, after all!
;-)

>By the way, on the derivation of "sigla", I confess that I had guessed 
>it to derive from Lat. "sigillum", which appears in English as "sigil". 
>  But I am very ignorant in these matters.

I would have never known nor guessed, had I not checked the dictionary
because of this discussion...


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: 4 Feb 2004 14:49:53 -0800
From: use63net@yahoo.com (Unknown Poster)
Subject: Copy Constructor Craziness
Message-Id: <c62e93ec.0402041449.4e127f49@posting.google.com>

The behavior I'm seeing in Perl 5.6 contradicts what I understand
about copy constructors - specifically, when they are autogenerated.

# $f is a reference to an object
my $g = $f;
print "\$g = $g, ";
++$g;
print "after ++, \$g = $g, \$f = $f\n";  # The value of $g changes, but the
                                         # value of $f does not!

There is no overloading of "=" - no explicit copy constructor in the
class. There is also no overloading of "++" in the class, but it is
apparently autogenerated from the overloading of "+".

This appears to conflict with information in "The Copy Constructor"
section of Programming Perl, 3rd edition.
Without a copy constructor, it states that the following would happen:

"$copy =  $original;  # copies only the reference
 ++$copy;              # changes underlying shared reference"

"If the copy constructor is required during the execution of some
mutator, but a handler for = was not specified, it can be autogenerated
as a string copy provided the object is a plain scalar and not
something fancier."

There are two separate scalar values in every object of the class,
so I don't see why the copy constructor is apparently being
autogenerated in this case.
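
A possible explanation, offered as a sketch and not as a diagnosis of the poster's actual class: when "++" is autogenerated from "+", Perl evaluates it much like $g = $g + 1. The "+" handler returns a new object, which is then assigned to $g, so no copy constructor is involved at all. The Counter class below is hypothetical and exists only to reproduce the behaviour.

```perl
package Counter;
use strict;
use warnings;

# Only "+" and stringification are overloaded; there is no "=" and no "++".
use overload
    '+'  => \&add,
    '""' => sub { $_[0]{value} };

sub new {
    my ($class, $value) = @_;
    # Two scalar fields per object, as in the question above.
    return bless { value => $value, label => 'demo' }, $class;
}

sub add {
    my ($self, $other) = @_;
    my $n = ref($other) ? $other->{value} : $other;
    return Counter->new($self->{value} + $n);   # always a fresh object
}

package main;

my $f = Counter->new(5);
my $g = $f;      # copies only the reference
++$g;            # built from "+": $g now refers to a new object
print "g = $g, f = $f\n";
```

The copy constructor matters once the class supplies a real mutator such as "++" or "+=", because only then is the shared object modified in place.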


------------------------------

Date: 4 Feb 2004 14:04:29 -0800
From: fmeizoso@yahoo.es (Fran)
Subject: Counting words
Message-Id: <2ec7a60.0402041404.43df29f3@posting.google.com>

Hi,

Can anybody please tell me how to count words? If I need to know if an
input has n words, how do I do it? Thank you


------------------------------

Date: 4 Feb 2004 22:11:12 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Counting words
Message-Id: <bvrqm0$od7$1@canopus.cc.umanitoba.ca>

In article <2ec7a60.0402041404.43df29f3@posting.google.com>,
Fran <fmeizoso@yahoo.es> wrote:
:Can anybody please tell me how to count words? If I need to know if an
:input has n words, how do I do it? Thank you

What's a "word" for this purpose? How many words on the line

123.45e+73,nifty is it ? "Hello, ... . I'm fine" she said.


-- 
   Look out, there are llamas!


------------------------------

Date: Wed, 04 Feb 2004 23:20:59 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: Counting words
Message-Id: <bvrr5c$105pp0$1@ID-184292.news.uni-berlin.de>

Fran wrote:
> Can anybody please tell me how to count words? If I need to know if
> an input has n words, how do I do it?

That's a FAQ; see perlfaq4:

"How can I count the number of occurrences of a substring within a
string?"

> Thank you

You are welcome.

But please note that you are supposed to check the FAQ before posting
a question here.
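
As a rough sketch of the technique that FAQ entry describes, the count can be taken by matching globally in list context. What counts as a "word" is an assumption here: the two helper subs below (names invented for this example) show two common definitions, and they give different answers on the tricky line quoted earlier in the thread.

```perl
use strict;
use warnings;

# Count whitespace-delimited tokens.
sub count_tokens {
    my ($text) = @_;
    my $count = () = $text =~ /\S+/g;
    return $count;
}

# Count runs of "word" characters (letters, digits, underscore).
sub count_word_runs {
    my ($text) = @_;
    my $count = () = $text =~ /\w+/g;
    return $count;
}

my $line = q{123.45e+73,nifty is it ? "Hello, ... . I'm fine" she said.};
printf "%d tokens, %d word runs\n", count_tokens($line), count_word_runs($line);
```

To test whether an input has exactly n words, compare the returned count with n.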

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl



------------------------------

Date: 4 Feb 2004 16:17:17 -0600
From: throop@cs.utexas.edu (David R. Throop)
Subject: Difficulty cleaning oddly encoded whitespace (from MS HTML)
Message-Id: <bvrr1d$ohh$1@yojo.cs.utexas.edu>

I'm perplexed.  I'm writing a PERL script that reads a single large
many-sectioned HTML document, breaks it into smaller files and
extracts some information for another text-manipulation tool to read.
The first HTML file comes from saving a 150+ page MS-Word file as HTML.

I'm having fits with some nonstandard whitespace in the HTML file.  It
appears like a long whitespace and acts as a single character, but it
doesn't patternmatch a \s.  When I view it in Emacs, it appears as
    %/1\200\216iso8859-15^B\201 \201 \201 

where \200 \216 ^B and \201 are all single characters.  But text
containing the odd whitespace fails to patternmatch those characters.
I Googled on iso8859 and found enough to get some idea that I'm
dealing with some specially encoded character, but everything I found
assumed I already knew about the encoding.

All I want to do is to turn this oddspace into regular whitespace.
Anybody?
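
One way to approach this, sketched under a stated assumption: Word-generated HTML often contains non-breaking spaces, either as the byte 0xA0 (ISO-8859-1/15) or as the &nbsp; entity. Whether that is the character in this file is a guess, so the first helper below just makes unusual bytes visible, and the second replaces the two assumed forms. Both sub names are invented for this example.

```perl
use strict;
use warnings;

# Make non-printable and non-ASCII bytes visible as \xNN escapes.
sub show_odd_bytes {
    my ($text) = @_;
    (my $visible = $text) =~ s/([^\x20-\x7E\t\n])/sprintf('\\x%02X', ord $1)/ge;
    return $visible;
}

# Assumption: the odd space is byte 0xA0 or the &nbsp; entity.
sub normalize_space {
    my ($text) = @_;
    $text =~ s/&nbsp;|\xA0/ /g;
    return $text;
}

my $sample = "Section\xA01.2&nbsp;Overview";
print show_odd_bytes($sample), "\n";
print normalize_space($sample), "\n";
```

If show_odd_bytes reports something other than \xA0, the character class in normalize_space would need to be adjusted to match.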

Thanks

David Throop



------------------------------

Date: Wed, 04 Feb 2004 23:37:09 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Docs comprehensibility [was: Perl For... ]
Message-Id: <bvrs3s$103j0s$1@ID-184292.news.uni-berlin.de>

G Klinedinst wrote:
> Iain Chalmers <bigiain@mightymedia.com.au> wrote in message news:
>> "Perl makes a lousy first programming language, that's because it's
>>  designed to be the *last* programming language you ever need to
>> learn."
> 
> Iain, I am one of the few who agrees with you I guess. The Perl
> docs often have terrible form.

I agree as well. The documentation occasionally introduces the nature
of functions etc. very poorly, as the examples posted by Iain and
Greg illustrate.

Perl is my first (and so far only) programming language, and with
better documentation it would certainly have taken less time to learn
some things.

This isn't the first time this issue has been brought up. One reason
why nothing has changed may be that the most skilled and dedicated
Perl programmers don't see the shortcomings, since those functions
and other features have long been second nature to them.

Maybe a beginner should be engaged to proofread and suggest changes
to the docs? ;-)

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl



------------------------------

Date: 4 Feb 2004 14:52:55 -0800
From: naren_tech@yahoo.com (naren)
Subject: Re: group but do not capture
Message-Id: <c3260cc0.0402041452.16d5faf9@posting.google.com>

Hi,

Thank you very much!!
I understand that we can get this in $1 and $2,
but the challenge I face is to get it in one step.
Basically I feed this regex to a configuration file,
which uses it to parse the line; it can only take $1,
it can't append $1 and $2.
That is why I considered using (?:\|), group but do
not capture. I haven't understood how this works.

But thanks for your feedback, 

Naren.

"David K. Wall" <dwall@fastmail.fm> wrote in message news:<Xns948499A38D6AAdkwwashere@216.168.3.30>...
> naren <naren_tech@yahoo.com> wrote:
> 
> > I need some help with a regular expression parsing, 
> > 
> > I have to group a string but want to exclude some characters from the
> > group, for example, I have a string :
> > 
> >>gnl|genbank|2398 this is a test gene
> > 
> > would like to get genbank2398
> > 
> > I have tried following reg ex, but it doesn't work, can any body
> > help??
> > 
> > m/\|(\w+(?:\|)\d+)/
> > 
> > (?:\|), group but do not capture | , is not working, I am getting
> > genbank|2398
> 
> Actually, it is working, or $2 would be set to '|'.  
> 
> You could capture only the parts you want and then concatenate them:
> 
> my $string = 'gnl|genbank|2398 this is a test gene';
> my $result;
> if ($string =~ /\w+\|(\w+)\|(\d+)/) {
>     $result = $1 . $2;
> }
> 
> 
> or you could grab everything including the unwanted | and then remove it:
> 
> my $string = 'gnl|genbank|2398 this is a test gene';
> my $result;
> if ($string =~ /^\w+\|(\w+\|\d+)/) {
>     ($result = $1) =~ s/\|//;
> }
> 
> Or you could split() the string on the |s and then modify the pieces.  
> Whatever is most convenient....
> 
> (and if I were Someone Who Must Not Be Named I'd write it using index() and 
> substr(), but that's far too painful....)
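
A small sketch of why the non-capturing group did not drop the bar. (?:...) only prevents a group from getting its own capture variable; text it matches inside an enclosing capture group is still part of that capture. A single capture is always one contiguous piece of the string, so removing a character from the middle takes either two captures or a second step. The variable names are chosen for this example.

```perl
use strict;
use warnings;

my $string = 'gnl|genbank|2398 this is a test gene';

# The "|" matched by (?:\|) lies inside the outer parentheses,
# so it still ends up in the captured text.
my ($with_bar) = $string =~ /\|(\w+(?:\|)\d+)/;

# Capture the two wanted pieces separately and join them in one statement.
my $joined = join '', $string =~ /\|(\w+)\|(\d+)/;

print "$with_bar\n$joined\n";
```

If the consuming tool can really only read $1, the joining has to happen in whatever step runs after the match.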


------------------------------

Date: Wed, 04 Feb 2004 13:21:36 -0800
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: how to find the last "new line" in string
Message-Id: <040220041321365138%jgibson@mail.arc.nasa.gov>

In article <d40b8853.0402040959.6d1cfba4@posting.google.com>, hshen
<hshen_1998@yahoo.com> wrote:

> Hi,
> If a string contains a few lines, (separated by '\n'), how can I make
> a  regular expression to replace this '\n' by a space?
> Thanks!

s/\n/ /g;
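
A quick check of that substitution on a sample string, together with two variations. The sample text and variable names are made up for this illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\nthird line\n";

(my $spaced = $text) =~ s/\n/ /g;      # every newline becomes a space
(my $via_tr = $text) =~ tr/\n/ /;      # same effect using transliteration

# To keep the final newline, replace only newlines followed by more text.
(my $joined = $text) =~ s/\n(?=.)/ /gs;

print $joined;
```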


------------------------------

Date: 4 Feb 2004 14:01:06 -0600
From: "cob" <cob@hotmail.com>
Subject: Looking for a FAQ article on autoposting in PERL
Message-Id: <40214ef9$0$70230$45beb828@newscene.com>

I maintain a FAQ for the misc.immigration.misc newsgroup.  I have been
looking for a short perl snippet to automate FAQ announcements that would be
posted to the misc.immigration.misc newsgroup.

Also a perl snippet that could be run once a week to post an article showing
where the FAQ is.

I have a rudimentary command of perl, sufficient to re-engineer an existing
snippet if it uses something like sendmail, but I'm in no way a heavyweight.
There's scripts out there, mail2news, for example.  But they are WAY over my
head.  Something dirt simple is needed...

Is it possible?  I appreciate any pointers in the right direction!





------------------------------

Date: Wed, 04 Feb 2004 21:07:00 -0000
From: "David K. Wall" <dwall@fastmail.fm>
Subject: Re: Looking for a FAQ article on autoposting in PERL
Message-Id: <Xns9485A3F2EA3E5dkwwashere@216.168.3.30>

cob <cob@hotmail.com> wrote:

> I maintain a FAQ for the misc.immigration.misc newsgroup.  I have been
> looking for a short perl snippet to automate FAQ announcements that
> would be posted to the misc.immigration.misc newsgroup.
> 
> Also a perl snippet that could be run once a week to post an article
> showing where the FAQ is.
> 
> I have a rudimentary command of perl, sufficient to re-engineer an
> existing snippet if it uses something like sendmail, but I'm in no way a
> heavyweight. There's scripts out there, mail2news, for example.  But
> they are WAY over my head.  Something dirt simple is needed...
> 
> Is it possible?  I appreciate any pointers in the right direction!

You could read the docs for Net::NNTP and write your own program, or search 
the web for other people's code using the same module, which you can then 
customize to your heart's content.

Why not try writing your own? Graham Barr (the author of Net::NNTP) has 
already done most of the work for you. If you have problems you can post 
your code here and almost certainly someone will help you correct it.
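
As a starting point, here is a minimal sketch using Net::NNTP. The server name, the From address, the subject and the FAQ URL are placeholders invented for this example. The script only attempts to post when a news server is named on the command line.

```perl
use strict;
use warnings;
use Net::NNTP;

# Build the article as a reference to an array of lines:
# headers, a blank line, then the body.
sub build_article {
    my (%arg) = @_;
    return [
        "From: $arg{from}\n",
        "Newsgroups: $arg{group}\n",
        "Subject: $arg{subject}\n",
        "\n",
        map { $_ . "\n" } split(/\n/, $arg{body}),
    ];
}

# Connect, post, and disconnect; die with the server's reply on failure.
sub post_article {
    my ($host, $article) = @_;
    my $nntp = Net::NNTP->new($host) or die "Cannot connect to $host\n";
    $nntp->post($article) or die 'Post failed: ' . $nntp->message;
    $nntp->quit;
    return 1;
}

my $article = build_article(
    from    => 'FAQ Maintainer <faq@example.invalid>',   # placeholder
    group   => 'misc.immigration.misc',
    subject => 'Weekly pointer to the misc.immigration.misc FAQ',
    body    => "The FAQ lives at:\nhttp://www.example.invalid/faq.html",
);

# Usage (hypothetical server): perl postfaq.pl news.example.invalid
post_article($ARGV[0], $article) if @ARGV;
```

Running it once a week could then be left to a scheduler such as cron. Many servers also require authentication, which this sketch does not attempt.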

-- 
David Wall


------------------------------

Date: Wed, 4 Feb 2004 19:28:31 +0000 (UTC)
From: J Krugman <jill_krugman@yahoo.com>
Subject: Re: Need help reading a perl regexp - someone clue me?
Message-Id: <bvrh4v$kss$1@reader2.panix.com>

In <A0aUb.12559$XF6.244308@typhoon.sonic.net> Don Bruder <dakidd@sonic.net> writes:

>I've got a "canned" regexp I'm trying to analyze that I can't quite 
>follow due to one of the constructs used in it. Can anyone 
>translate/verify my translation for me?

>Here's the segment that's throwing me (It's a very small sub-section of 
>a rather large and complex regexp - We're talking something on the order 
>of 300+ characters worth of "rather large and complex")

>[a-zA-Z]{2}[.,\;:?%!&+^~`'\$*=\#|013467\(\)\[\]\{\}<>"][a-zA-Z]{2}

>Now, if I'm reading rightly, and I'm not totally hopeless as far as my 
>understanding of perl regexps goes, this should be looking to match "any 
>two letters followed by pretty much any punctuation mark (including 
>parens, braces, and brackets of all flavors, but (seemingly) excluding 
>the "bar" (AKA "OR") character) or any of the digits 0, 1, 3, 4, 6, or 
>7, followed by any two letters.

Why exclude "|"?  It's right there in the character class, and
there's no ^ at the beginning of that class, so that regexp is
*supposed* to match "AB|CD".

Most of those backslashes are superfluous, BTW.  You only need the
ones before $ and ].




------------------------------

Date: Wed, 4 Feb 2004 19:29:17 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Need help reading a perl regexp - someone clue me?
Message-Id: <bvrh6d$21n$1@wisteria.csv.warwick.ac.uk>


Don Bruder <dakidd@sonic.net> wrote:
> 
> I've got a "canned" regexp I'm trying to analyze that I can't quite 
> follow due to one of the constructs used in it. Can anyone 
> translate/verify my translation for me?
> 
> Here's the segment that's throwing me (It's a very small sub-section of 
> a rather large and complex regexp - We're talking something on the order 
> of 300+ characters worth of "rather large and complex")
> 
> [a-zA-Z]{2}[.,\;:?%!&+^~`'\$*=\#|013467\(\)\[\]\{\}<>"][a-zA-Z]{2}

Good God who wrote that?!  
None of those backslashes are necessary except for the one before ].
Those [a-zA-Z] should almost certainly be [[:alpha:]].

I would strongly recommend breaking the regex up into bits as you
understand it. Assign each 'chunk' to a variable with qr//, and use /x
on the bits so you can separate things out decently. For instance,
that bit you have there can be written:

my $code   = qw/[[:alpha:]]{2}/;
my $symbol = qr/[.,;:...<>"]/;

/$code $symbol $code/x;

(I'm making the entirely unjustified assumption that the two-letter
sequences are some sort of code, to illustrate that you want to give
the pieces names which reflect their function, rather than merely what
they match). See how much more readable that is?

> Should I be ignoring any usual "special meaning" of the 'bar' character 
> when it appears as part of a square-bracketed set,

Yes, you should. Read perldoc perlre again. Nothing is significant in
a [] class except ] (except at the start), ^ (if at the start), -
(except at either end), and \.
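
To see the rule in action, here is a sketch that rebuilds the segment from named pieces and tries a few strings. The character class is the one from the original segment with the unneeded backslashes removed; the backslash before $ is kept so Perl does not try to interpolate a variable, and the ones before [ and ] are kept for clarity. The test strings are invented.

```perl
use strict;
use warnings;

my $code   = qr/[a-zA-Z]{2}/;
# Inside the class, "|" is an ordinary character, not alternation.
my $symbol = qr/[.,;:?%!&+^~`'\$*=#|013467()\[\]{}<>"]/;

my $segment = qr/$code $symbol $code/x;

for my $candidate ('AB|CD', 'AB.CD', 'AB1CD', 'AB2CD') {
    printf "%-6s %s\n", $candidate,
        $candidate =~ $segment ? 'matches' : 'no match';
}
```

The first three strings match and the last does not, because 2 is not among the digits listed in the class.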

Ben

-- 
Musica Dei donum optimi, trahit homines, trahit deos.    |
Musica truces mollit animos, tristesque mentes erigit.   |   ben@morrow.me.uk
Musica vel ipsas arbores et horridas movet feras.        |


------------------------------

Date: Wed, 4 Feb 2004 19:33:57 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Need help reading a perl regexp - someone clue me?
Message-Id: <bvrhf5$264$1@wisteria.csv.warwick.ac.uk>


Ben Morrow <usenet@morrow.me.uk> wrote:
> Don Bruder <dakidd@sonic.net> wrote:
>
> > [a-zA-Z]{2}[.,\;:?%!&+^~`'\$*=\#|013467\(\)\[\]\{\}<>"][a-zA-Z]{2}
> 
> Good God who wrote that?!  
> None of those backslashes are necessary except for the one before
> ]...

 ...and the one before $.

> my $code   = qw/[[:alpha:]]{2}/;
                ^ r
Apologies.

Ben

-- 
Heracles: Vulture! Here's a titbit for you / A few dried molecules of the gall
   From the liver of a friend of yours. / Excuse the arrow but I have no spoon.
(Ted Hughes,        [ Heracles shoots Vulture with arrow. Vulture bursts into ]
 /Alcestis/)        [ flame, and falls out of sight. ]         ben@morrow.me.uk


------------------------------

Date: Wed, 04 Feb 2004 20:59:28 GMT
From: Don Bruder <dakidd@sonic.net>
Subject: Re: Need help reading a perl regexp - someone clue me?
Message-Id: <Q2dUb.12613$XF6.244777@typhoon.sonic.net>

In article <bvrh6d$21n$1@wisteria.csv.warwick.ac.uk>,
 Ben Morrow <usenet@morrow.me.uk> wrote:

> Don Bruder <dakidd@sonic.net> wrote:
> > 
> > I've got a "canned" regexp I'm trying to analyze that I can't quite 
> > follow due to one of the constructs used in it. Can anyone 
> > translate/verify my translation for me?
> > 
> > Here's the segment that's throwing me (It's a very small sub-section of 
> > a rather large and complex regexp - We're talking something on the order 
> > of 300+ characters worth of "rather large and complex")
> > 
> > [a-zA-Z]{2}[.,\;:?%!&+^~`'\$*=\#|013467\(\)\[\]\{\}<>"][a-zA-Z]{2}
> 
> Good God who wrote that?!

Dunno. 'Tweren't me. I'm just trying to understand it.
  
> None of those backslashes are necessary except for the one before ].
> Those [a-zA-Z] should almost certainly be [[:alpha:]].

Tell the person who wrote it, not me! :) Extra backslashes aren't giving 
me the problem, though. (FWIW, I'm not a Perl programmer, and the regexp 
in question is part of another package, not (to my knowledge) Perl, but 
questions regarding the syntax of such regexps are referred to the Perl 
Regexp documentation - Either the package is written in Perl, and I'm 
only seeing a piece of one of the plugins (very likely) or they coded it 
to the Perl regexp standard since it was easier than "scratch-building" 
their own regexp package)
 
> I would strongly recommend breaking the regex up into bits as you
> understand it.

Been doing pretty much that as I walked through it.

> Assign each 'chunk' to a variable with qr//, and use /x
> on the bits so you can separate things out decently. For instance,
> that bit you have there can be written:
> 
> my $code   = qw/[[:alpha:]]{2}/;
> my $symbol = qr/[.,;:...<>"]/;
> 
> /$code $symbol $code/x;
> 
> (I'm making the entirely unjustified assumption that the two-letter
> sequences are some sort of code, to illustrate that you want to give
> the pieces names which reflect their function, rather than merely what
> they match). See how much more readable that is?

Agreed on the readability. But since I'm not interested in trying to 
"tweak" it or anything like that - only UNDERSTAND it - I'll be leaving 
it "as-is".
 
 
> > Should I be ignoring any usual "special meaning" of the 'bar' character 
> > when it appears as part of a square-bracketed set,
> 
> Yes, you should. 

Bingo. That's the answer I needed, and cleared a big part of the "fog" I 
was stumbling around in. Now to figure out why only "013467" in the list 
of digits... I could easily understand ALL digits, but having only that 
particular sub-set of digits just doesn't seem to make any sense, either 
on the surface, or in the context of what I know it's *SUPPOSED* to be 
doing.

In case anybody's interested, here's the full regexp that I'm trying to 
understand: 

(Beware of line-wrap - there are no literal space/carriage 
return/linefeed characters in the string other than the regulation CR/LF 
pair at the very end, following the "/i")

/\s(?!(?:fn|re):|(?:cc|to)=|(?:ma|qu|un)[`'"]|(?:dr|m[rst]|li|st|td)\.)[a
-zA-Z]{2}[.,\;:?%!&+^~`'\$*=\#|013467\(\)\[\]\{\}<>"][a-zA-Z]{2}(?<!\.(?:
(?-i:[A-Z][a-z]{1})|a[eiu]|b[ebmrsz]|c[afhnrx]|d[bek]|es|f[ir]|g[uz]|h[kn
rtu]|i[elnqrst]|j[mops]|k[prwy]|m[ckx]|n[loz]|p[lmrty]|ru|s[eghm]|t[cnv]|
u[ksu]|v[gi])|:no|['`"](?:ed|ll|[rv]e))(?:[,'\?!]|\.?\s)/i

Its "advertised purpose" is to go through a block of text looking for a 
string consisting of 
"<space><alpha-char><alpha-char><period><alpha-char><alpha-char><space>",
 with no interest in whether the two letters on either side of the 
period are upper or lower case.

It appears (from my analysis - which may be in error) that several 
two-character top level internet domain names (.us, .uk, .se, .cn, .br, 
 .ru, and quite a few others), a small handful of common filename 
extensions (.db, .gz, .js, etc), and a few other two-letter combinations 
(dr., mr./ms., etc) are special-cased to exclude them from causing a 
match. It *DOES* work as advertised, so that's not at issue. I'm not
trying to debug it, tweak it, or otherwise mess with it. I just wanted
to know how/why it was doing what it did before I changed the behavior
that happens if/when it finds a match (and potentially broke something
due to not understanding exactly what was being matched).

-- 
Don Bruder -  dakidd@sonic.net <--- Preferred Email - SpamAssassinated.
Hate SPAM? See <http://www.spamassassin.org> for some seriously great info.
I will choose a path that's clear: I will choose Free Will! - N. Peart
Fly trap info pages: <http://www.sonic.net/~dakidd/Horses/FlyTrap/index.html>


------------------------------

Date: Wed, 4 Feb 2004 18:16:01 -0000
From: "gnari" <gnari@simnet.is>
Subject: Re: newbie help
Message-Id: <bvrcrc$a4u$1@news.simnet.is>

"Ram" <spam@spammer.com> wrote in message
news:bvr8mb$jaj$1@grandcanyon.binc.net...

[note: if you do not top-post then it is more likely we want to help.
it is annoying when you put your follow-up at the top of your message,
quoting the message you are replying to under that (in this case in whole)]


> This string does not match if <ordsts>  and </ordsts> has child tags
> spread across multiple lines.
> ...

> "Gunnar Hjalmarsson" <noreply@gunnar.cc> wrote in message
> news:bvp3d5$ujeo2$1@ID-184292.news.uni-berlin.de...
> >
> > Assuming the data is in $_:

key sentence, perhaps?

> >
> >      my ($lastmatch) = /.*(<ordsts>.*<\/ordsts>).*/s;

are you matching one line at a time?
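
The difference can be sketched as follows. The sample XML is invented; the pattern is the one quoted above. Applied line by line it finds nothing, because no single line contains both the opening and the closing tag. Applied to the whole input it returns the last element.

```perl
use strict;
use warnings;

my $data = join '',
    "<orders>\n",
    "<ordsts>\n", "  <code>1</code>\n", "</ordsts>\n",
    "<ordsts>\n", "  <code>2</code>\n", "</ordsts>\n",
    "</orders>\n";

my $pattern = qr/.*(<ordsts>.*<\/ordsts>).*/s;

# One line at a time: count the lines on which the pattern matches.
my $line_hits = grep { $_ =~ $pattern } split /^/, $data;

# Whole input at once: /s lets "." cross newlines, and the greedy
# leading ".*" makes the capture start at the last <ordsts>.
my ($lastmatch) = $data =~ $pattern;

print "line-by-line hits: $line_hits\n";
print "$lastmatch\n";
```

When reading from a file, setting $/ to undef (for example with local $/;) before the read makes Perl return the whole file as one string.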

gnari





------------------------------

Date: Wed, 4 Feb 2004 21:39:09 -0000
From: "David Holmes" <no@spam.uk>
Subject: Perl data types
Message-Id: <YDdUb.1022$%i5.660@news-binary.blueyonder.co.uk>

I have a few questions regarding perl composite data types:

firstly product types, as I see it the only way of doing this is to use
Class::Struct. This will allow the use of multiple variables to be
identified as a single value.

secondly, does this make a hash a product type, because the keys are
referenced by scalars, and these point to variables, which i guess could be
anything but allows a simple structure.

thirdly, does perl support sum types? As far as I can tell, since perl is
not a typed language as such it is impossible to tell the difference between
the types, so, for example, a case statement is impossible. This could obviously
be overcome to an extent by using a regex to tell whether a scalar is a
number or a string, but apart from that it is not really possible.

Regards

Dave




------------------------------

Date: Wed, 04 Feb 2004 15:00:03 -0700
From: Eric Schwartz <emschwar@pobox.com>
Subject: Re: Perl data types
Message-Id: <etobroebj4s.fsf@fc.hp.com>

"David Holmes" <no@spam.uk> writes:
> firstly product types, as I see it the only way of doing this is to use
> Class::Struct. This will allow the use of multiple variables to be
> identified as a single value.

What is a "product type"?  I've not heard this terminology before.

> secondly, does this make a hash a product type, because the keys are
> referenced by scalars, and these point to variables, which i guess could be
> anything but allows a simple structure.

Not knowing what a product type is, I can't help.

> thirdly, does perl support sum types

What is a 'sum type'?

> as far as I can tell, since perl is not a typed language as such it
> is impossible to tell the difference between the types, for example a
> case statement is impossible.

Case statements are possible, they're just not built into the
language.  See perlfaq7, "How do I create a switch or case statement?".

And perl is typed, albeit loosely.

> This could obviously be overcome using a regex to an extent to
> tell whether a scalar is a number or a string, but apart from that
> it is not really possible.

You don't (generally) care if it's a number or a string.  If you use
it like a string, it's a string.  If you use it like a number, it's a
number.

$foo = '4';

$bar = $foo + 3; # $bar is now 7

print "bar is [$bar]" if $bar eq '7';
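
When the distinction does matter, a sketch like the following can tell the cases apart without a hand-written regex. It uses ref to recognise references and looks_like_number from Scalar::Util for plain scalars. The describe sub and its return strings are invented for this example, and this is only one rough way to emulate a tagged union in Perl.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Dispatch on what kind of value was passed in.
sub describe {
    my ($value) = @_;
    my $kind = ref $value;
    return 'array of ' . scalar(@$value) . ' items'
        if $kind eq 'ARRAY';
    return 'hash with keys ' . join(',', sort keys %$value)
        if $kind eq 'HASH';
    return 'number' if looks_like_number($value);
    return 'string';
}

print describe($_), "\n" for '4', 'four', [1, 2, 3], { b => 1, a => 2 };
```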

-=Eric
-- 
Come to think of it, there are already a million monkeys on a million
typewriters, and Usenet is NOTHING like Shakespeare.
		-- Blair Houghton.


------------------------------

Date: Wed, 04 Feb 2004 21:30:42 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: Perl For Amateur Computer Programmers
Message-Id: <6wdUb.217831$I06.2383314@attbi_s01>

edgrsprj wrote:

> I have made a few corrections to that Web page.  Checks for others will have
> to wait until I can get some more time free.

You haven't fixed the error in 'print localtime'.  The year is three
digits, not two, and the month is a number from 0 to 11, not 01 to 12.
	-Joe


------------------------------

Date: Wed, 04 Feb 2004 21:42:33 GMT
From: "edgrsprj" <edgrsprj@ix.netcom.com>
Subject: Re: Perl For Amateur Computer Programmers
Message-Id: <dHdUb.10666$GO6.3979@newsread3.news.atl.earthlink.net>

"Uri Guttman" <uri@stemsystems.com> wrote in message
news:x7ptcu4qud.fsf@mail.sysarch.com...
> >>>>> "e" == edgrsprj  <edgrsprj@ix.netcom.com> writes:
> newbie. just point your earthquake people to those resources and drop
> your silly site already.

Something will likely happen sooner or later.




------------------------------

Date: Wed, 04 Feb 2004 21:45:36 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: Perl For Amateur Computer Programmers
Message-Id: <x7y8ri34e7.fsf@mail.sysarch.com>

>>>>> "JS" == Joe Smith <Joe.Smith@inwap.com> writes:

  JS> edgrsprj wrote:
  >> I have made a few corrections to that Web page.  Checks for others will have
  >> to wait until I can get some more time free.

  JS> You haven't fixed the error in 'print localtime'.  The year is three
  JS> digits, not two, and the month is a number from 0 to 11, not 01 to 12.

the year is not 3 digits nor 2 digits. it is the year - 1900.
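
A short sketch of the two adjustments being discussed: add 1900 to the year field and 1 to the month field. The iso_date helper is invented for this example. gmtime(0) is used for the first call because its result does not depend on the time zone.

```perl
use strict;
use warnings;

# Turn the list returned by localtime or gmtime into YYYY-MM-DD.
sub iso_date {
    my @t = @_;
    my ($mday, $mon, $year) = @t[3, 4, 5];
    # $year counts from 1900 and $mon runs from 0 to 11.
    return sprintf '%04d-%02d-%02d', $year + 1900, $mon + 1, $mday;
}

print iso_date(gmtime(0)), "\n";   # the start of the epoch
print iso_date(localtime), "\n";   # today's date in the local time zone
```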

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: 4 Feb 2004 14:03:11 -0800
From: g_klinedinst@hotmail.com (G Klinedinst)
Subject: Re: Perl For Amateur Computer Programmers
Message-Id: <168f035a.0402041403.7effb0bc@posting.google.com>

Iain Chalmers <bigiain@mightymedia.com.au> wrote in message news:

> "Perl makes a lousy first programming language, that's because it's 
> designed to be the *last* programming language you ever need to learn."

<snip>

Iain, I am one of the few who agrees with you I guess. The Perl docs
often have terrible form.

Exhibit A: "sprintf" which you used:

"Returns a string formatted by the usual printf conventions of the C
library function sprintf. See sprintf(3) or printf(3) on your system
for an explanation of the general principles.

Perl does its own sprintf formatting--it emulates the C function
sprintf, but it doesn't use it (except for floating-point numbers, and
even then only the standard modifiers are allowed). As a result, any
non-standard extensions in your local sprintf are not available from
Perl.

Perl's sprintf permits the following universally-known conversions:"

Three paragraphs go by and I still don't know WTF this thing does,
except that it somehow returns a string. All technical writing needs
to start with a summary sentence covering what we are going to talk
about and why. Ben made the point that the stuff is explained further
down in detail and this is certainly true. However, even as an
experienced programmer I want to be able to read in the first sentence
what this function is used for. There is no excuse for not having a
summary sentence at the very beginning, such as "This function takes a
format string and a list of scalars. Its return value is a string
where the parts of the format string such as the %d are replaced by
arguments from the list of scalars." Ok, I know that's a crappy
sentence but at least it tells the reader what is happening and what
to expect. The official version has the person guessing and looking up
other docs rather than telling them right off the bat if that is the
function they are looking for.
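
For illustration only, a small sketch of what such a summary sentence
describes. The format string and values are made up for this example:

```perl
use strict;
use warnings;

# sprintf takes a format string plus a list of values and returns a new
# string; each % conversion is replaced by the next value in the list.
my $summary = sprintf '%s has %d items costing %.2f', 'cart', 3, 19.5;

print "$summary\n";
```

Here %s takes the string, %d the integer, and %.2f the number rounded
to two decimal places. Nothing is printed until the result is passed
to print.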

Exhibit B: "exit"

"Evaluates EXPR and exits immediately with that value. Example:"

It uses the same word to define itself. Does anyone else have a
problem with this? What if you found this in the dictionary:

Exit: v. To exit something.

Also it exits what? Exits a loop? Exits the program? Exits a block?
Exits what? How about "Evaluates EXPR and terminates the process
immediately returning EXPR's value to the calling process." At least
this way I know from the first sentence what this is used for.
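
A sketch of the "returns the value to the calling process" part,
assuming a Unix-like or Windows system where the running perl can start
a second copy of itself. The one-line child program is invented for the
example:

```perl
use strict;
use warnings;

# Run a child perl ($^X is the path of the current interpreter) whose
# whole program is "exit 3".
system($^X, '-e', 'exit 3');

# The parent finds the child's exit value in the high byte of $?.
my $child_exit = $? >> 8;
print "child exited with status $child_exit\n";
```

The child process ends at the exit call; the parent carries on and can
inspect the value afterwards.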

These are examples only, written to show what type of information
should be in the first sentence of every page of documentation. I am
not claiming that these are technically correct. All I ask for is a
one-sentence overview of what arguments it takes, what it does with
those arguments and what it returns. That way if I am searching
through functions trying to see if it is the one I want to use I don't
need to read 2 pages of docs to learn that it won't do what I want.

-Greg


------------------------------

Date: Wed, 4 Feb 2004 19:34:08 +0000 (UTC)
From: J Krugman <jill_krugman@yahoo.com>
Subject: Re: RegExp for matching word "karl@aol.com" or word "paul@hotmail.com" ?
Message-Id: <bvrhfg$kss$2@reader2.panix.com>

In <pan.2004.02.04.16.39.16.196934@lnubb.qr> Robert Meyer <rtbvfg99@lnubb.qr> writes:

>(/karl\@aol\.com/|/paul\@hotmail\.com/)

That looks like a bitwise OR of two scalars...  I suppose it would
work, but it's not terribly good Perl.  How about

/karl\@aol\.com|paul\@hotmail\.com/

?
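
A short sketch of the single-pattern form in use. The sample lines are
invented for the example:

```perl
use strict;
use warnings;

# One pattern with alternation: matches if either address appears.
my $wanted = qr/karl\@aol\.com|paul\@hotmail\.com/;

my @lines = (
    'From: karl@aol.com',
    'From: paul@hotmail.com',
    'From: anne@example.org',
);

for my $line (@lines) {
    printf "%-24s %s\n", $line, $line =~ $wanted ? 'match' : 'no match';
}
```

The @ and . are escaped so that Perl neither interpolates an array nor
treats the dot as "any character".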




------------------------------

Date: Wed, 04 Feb 2004 20:46:50 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: RegExp for matching word "karl@aol.com" or word "paul@hotmail.com" ?
Message-Id: <bvri4h$vr1ln$1@ID-184292.news.uni-berlin.de>

J Krugman wrote:
> Robert Meyer writes:
>> 
>> (/karl\@aol\.com/|/paul\@hotmail\.com/)
> 
> That looks like a bitwise OR of two scalars...  I suppose it would 
> work, but it's not terribly good Perl.  How about
> 
> /karl\@aol\.com|paul\@hotmail\.com/
> 
> ?

Yeah, but since the email addresses are plain strings, it would
probably be even better Perl to make use of the index() function
instead of a regular expression.
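
A sketch of the index() approach, assuming a hypothetical helper named
mentions_address. index() returns -1 when the substring is absent, so
the test is ">= 0":

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text contains either literal address.
sub mentions_address {
    my ($text) = @_;
    for my $addr ('karl@aol.com', 'paul@hotmail.com') {
        return 1 if index($text, $addr) >= 0;
    }
    return 0;
}

print mentions_address('mail from paul@hotmail.com today') ? "yes\n" : "no\n";
```

No escaping is needed because the addresses are compared as plain
strings, not as patterns.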

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl



------------------------------

Date: Wed, 4 Feb 2004 19:10:27 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Simple syntax question
Message-Id: <bvrg33$rn$2@wisteria.csv.warwick.ac.uk>


Web Surfer <raisin@delete-this-trash.mts.net> wrote:
> %checktype is a "hash"
> 
> # unlike a "normal" array whose index value needs to be an integer 
> value, the "index" for a hash can be just about any kind of scalar 
> value.

No. The key ('index') for a hash is a string.

Ben

-- 
For the last month, a large number of PSNs in the Arpa[Inter-]net have been
reporting symptoms of congestion ... These reports have been accompanied by an
increasing number of user complaints ... As of June,... the Arpanet contained
47 nodes and 63 links. [ftp://rtfm.mit.edu/pub/arpaprob.txt] * ben@morrow.me.uk


------------------------------

Date: 4 Feb 2004 19:17:47 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Simple syntax question
Message-Id: <bvrggr$k1d$1@canopus.cc.umanitoba.ca>

In article <bvrg33$rn$2@wisteria.csv.warwick.ac.uk>,
Ben Morrow  <usenet@morrow.me.uk> wrote:

:Web Surfer <raisin@delete-this-trash.mts.net> wrote:
:> # unlike a "normal" array whose index value needs to be an integer 
:> value, the "index" for a hash can be just about any kind of scalar 
:> value.

:No. The key ('index') for a hash is a string.

True, but other kinds of scalars will be stringified and -that-
used as the key.

Stringification can have some surprising results. For example,
I discovered just a couple of days ago that every time you
stringify a Thread::Queue then you get a different value, even
if the queue has not changed in the meantime.
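
A small sketch of stringified keys using only core Perl. It does not
use Thread::Queue; the hash and its values are invented for the
example:

```perl
use strict;
use warnings;

my %h;
$h{1.0}   = 'number 1.0';   # the number stringifies to "1"
$h{'1.0'} = 'string 1.0';   # a different key: "1.0"

my $aref = [1, 2, 3];
$h{$aref} = 'array ref';    # key is text like "ARRAY(0x...)", not the ref

for my $key (sort keys %h) {
    printf "%-20s => %s (key is %s)\n", $key, $h{$key},
        ref($key) ? 'a reference' : 'a plain string';
}
```

Every key comes back as a plain string, so the array cannot be
recovered from the "ARRAY(0x...)" key.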
-- 
   So you found your solution
   What will be your last contribution?
   -- Supertramp (Fool's Overture)


------------------------------

Date: Wed, 04 Feb 2004 20:33:30 +0100
From: Tore Aursand <tore@aursand.no>
Subject: Re: Simple syntax question
Message-Id: <pan.2004.02.04.17.21.23.846456@aursand.no>

On Wed, 04 Feb 2004 08:46:27 -0800, Trimbitas Sorin wrote:
> I have a simple syntax question :
> What does the following line mean:
> 1: %checkType; ?? I know that @test="" is an array and $test="" is a
> simple variable.

perldoc perldata


-- 
Tore Aursand <tore@aursand.no>
"Life is pleasant. Death is peaceful. It's the transition that's
 troublesome." -- Isaac Asimov


------------------------------

Date: Wed, 4 Feb 2004 14:39:02 -0500
From: Paul Lalli <ittyspam@yahoo.com>
Subject: Re: Simple syntax question
Message-Id: <20040204143432.E483@dishwasher.cs.rpi.edu>

On Wed, 4 Feb 2004, Walter Roberson wrote:

> In article <bvrg33$rn$2@wisteria.csv.warwick.ac.uk>,
> Ben Morrow  <usenet@morrow.me.uk> wrote:
>
> :Web Surfer <raisin@delete-this-trash.mts.net> wrote:
> :> # unlike a "normal" array whose index value needs to be an integer
> :> value, the "index" for a hash can be just about any kind of scalar
> :> value.
>
> :No. The key ('index') for a hash is a string.
>
> True, but other kinds of scalars will be stringified and -that-
> used as the key.

Yes, just as any non-integers used as the index to an array will be
"integer-fied" and -that- used as the index.
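
A matching sketch for arrays, with an invented list of names.
Fractional indexes are truncated to integers before the lookup:

```perl
use strict;
use warnings;

my @names = qw(zero one two three);

# The fractional part of the index is dropped, so 1.7 selects element 1
# and 2.2 selects element 2.
my $first  = $names[1.7];
my $second = $names[2.2];

print "$first $second\n";
```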

Paul Lalli


------------------------------

Date: Wed, 04 Feb 2004 14:15:33 -0500
From: James Willmore <jwillmore@remove.adelphia.net>
Subject: Re: Site Visitor's eMail Address
Message-Id: <pan.2004.02.04.19.15.31.904277@remove.adelphia.net>

On Wed, 04 Feb 2004 07:54:03 +0000, snoopy wrote:

> I teach Mathematics using Internet to secondary school students.  I have
> many students to whom I send emails on Monday, Wednesday, and Friday.
> 
> Each email I send to my students have a link to the problem sets for
> that day, and when my students receive that email, they are supposed to
> click on the link, and do the problems.
> 
> Now I have many students, and need to follow up who is visiting the site
> and doing the problems.  At this point, I see how many visitors incur
> for each site through Log file, but there is no way for me to know which
> students are visiting the site and which students are not doing the
> problems at all.
> 
> Would it be possible for me to get email addresses of those who click on
> the link in the email I send them??  Can anybody help me realize this
> using Perl??
> 
> Any help will be deeply appreciated.

<ot>
This can be done with JavaScript.

Of course, if the person viewing the page has JavaScript turned off, then
you don't get notified.  And, of course, there's a way to ensure that the
person viewing the page has to have JavaScript turned on. In any case, it
can and has been done.
</ot>

<ot> (notice a theme here :-) )
You could require a sign-in process to ensure that the person viewing any
links from that point forward will send an email when viewing a link.
Depending on *how* you do it (with hidden fields, which can be altered,
-or- cookies -or- some combination of these two items and maybe a few
other "tricks" thrown in), it could still be defeated. 
</ot>

This question is best posted to another group.

-- 
Jim

Copyright notice: all code written by the author in this post is
 released under the GPL. http://www.gnu.org/licenses/gpl.txt 
for more information.

a fortune quote ...
It's the thought, if any, that counts! 




------------------------------

Date: Wed, 04 Feb 2004 20:50:20 +0100
From: Lars Haugseth <lh+canned_pork@tn.no>
Subject: Re: Site Visitor's eMail Address
Message-Id: <m3hdy6fwub.fsf@gollum.polygnosis.com>


* Scott Bryce <sbryce@scottbryce.com> wrote:
|
| Steve wrote:
|
|> You could send a link like this
|> http://www.mysite.com/stuff/grab.pl?name=johnsmith
|
| John could re-mung this URL, so you wouldn't know it was John.

That's easily fixed. Instead of using the name or email address
in the link you send out, create an MD5 digest of the email
address plus some random value, then store the mapping on the
server. Add the MD5 checksum as a CGI parameter in the link, and
when the user visits, you can compare the MD5 checksum against
the list of mappings you've stored on the server. If the checksum
doesn't fit any of the users, you know it's been modified and can
produce an appropriate response.
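
A minimal sketch of that scheme, using the core Digest::MD5 module.
The helper names, the in-memory %email_for hash (a stand-in for
whatever storage the server uses), the "id" parameter name, and the
sample address are all assumptions made for the example:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# token => email address; a real site would keep this in a file or database.
my %email_for;

# Hypothetical helper: create and remember a token for one recipient.
sub make_token {
    my ($email) = @_;
    # rand() is good enough for a sketch but is not a strong random source.
    my $token = md5_hex($email . rand() . time);
    $email_for{$token} = $email;
    return $token;
}

# Hypothetical helper: returns the address, or undef for an unknown token.
sub lookup_token {
    my ($token) = @_;
    return $email_for{$token};
}

my $token = make_token('john@example.com');
print "http://www.mysite.com/stuff/grab.pl?id=$token\n";
```

When a request arrives, the script would pass the "id" parameter to
lookup_token and treat an undef result as a modified link.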

Of course you can't know if it was really John who visited, or
just someone who's gotten the link forwarded by John.

-- 
Lars Haugseth


------------------------------

Date: Wed, 04 Feb 2004 21:13:37 +0100
From: Tore Aursand <tore@aursand.no>
Subject: Re: Site Visitor's eMail Address
Message-Id: <pan.2004.02.04.19.36.14.146633@aursand.no>

On Wed, 04 Feb 2004 18:10:14 +0000, Chris wrote:
>>> Would it be possible for me to get email addresses of those who click
>>> on the link in the email I send them?

>> Yes.  What have you tried so far?  What didn't work?

> No.  He's talking about sniffing the address, [...]

No, he's not.  He is talking about sending an email to _known_ email
addresses, together with a link, and how to find out if the recipient ever
clicks on that link.


-- 
Tore Aursand <tore@aursand.no>
"The purpose of all war is ultimately peace." -- Saint Augustine


------------------------------

Date: 4 Feb 2004 14:55:54 -0800
From: doug_joel@bctel.ca (doug)
Subject: Soap-Lite XML parameter error, doc must have top level element
Message-Id: <ce6c4b44.0402041455.9f25b6a@posting.google.com>

Hi, I'm using SOAP::Lite as a client to talk to a .Net server.  It
works great using one parameter.  When I add the second parameter, an
XML string, it fails.  When I load the string that it generated into a
browser it also fails.  It doesn't like the &lt; symbols.

I've read that SOAP::Lite defaults to SOAP encoding and .Net requires
literal encoding, but don't know how to change it on the client side.

Here is the code I'm using with the envelope following.

Thanks for any help
Doug

use SOAP::Lite;

my $prod = '<?xml version = "1.0" encoding="UTF-8" standalone="yes"?>' .
           '<VFPData><row cupc="06365200060"/><row cups="06365200080"/>' .
           '<row cupc="06365200001"/></VFPData>';

print SOAP::Lite
  -> service('http://192.168.101.2/ip2.wsdl')
  -> GetProducts("123", $prod), "\n";

sub SOAP::Transport::HTTP::Client::get_basic_credentials {
    return 'storetest' => 'mm0822tt#';
}

my @params = ( SOAP::Data->name(x1 => "123"),
               SOAP::Data->name(x2 => $prod) );


<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope 
  xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"  
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"  
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:xsd="http://www.w3.org/1999/XMLSchema"  
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
    <SOAP-ENV:Body>
      <GetProducts xmlns="">
        <parameters>123</parameters>
        <c-gensym4 xsi:type="xsd:string">
        &lt;?xml version = "1.0" encoding="UTF-8" standalone="yes"?>
        &lt;VFPData>
          &lt;row cupc="06365200060"/>
          &lt;row cups="06365200080"/>
          &lt;row cupc="06365200001"/>
        &lt;/VFPData></c-gensym4>
    </GetProducts>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>


------------------------------

Date: 4 Feb 2004 16:20:00 -0600
From: throop@cs.utexas.edu (David R. Throop)
Subject: Re: Turn $5 into $15,000 or more!!! Here's how....
Message-Id: <bvrr6g$oi9$1@yojo.cs.utexas.edu>

s/\$5/\$15,000/g

:-)


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6080
***************************************
