[9201] in Perl-Users-Digest


Perl-Users Digest, Issue: 2821 Volume: 8

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jun 5 18:07:32 1998

Date: Fri, 5 Jun 98 15:01:34 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Fri, 5 Jun 1998     Volume: 8 Number: 2821

Today's topics:
    Re: Spider programms in PERL (Steve Linberg)
        Split string into newline-terminated lines (Mark-Jason Dominus)
    Re: Split string into newline-terminated lines <tchrist@mox.perl.com>
    Re: TC Loses it! ;) [Was: Re: Why is there no "in" oper <zenin@bawdycaste.org>
        Uppercasing user input under CGI <jefkatz@rci.rutgers.edu>
    Re: Uppercasing user input under CGI <upsetter@ziplink.net>
    Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo (Kevin Reid)
    Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo <zenin@bawdycaste.org>
    Re: Use of HTML, POD, etc in Usenet (was: Re: map in vo (Jonathan Stowe)
    Re: Why is there no "in" operator in Perl? <tchrist@mox.perl.com>
        Digest Administrivia (Last modified: 8 Mar 97) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 05 Jun 1998 16:50:57 -0400
From: linberg@literacy.upenn.edu (Steve Linberg)
Subject: Re: Spider programms in PERL
Message-Id: <linberg-0506981650570001@projdirc.literacy.upenn.edu>

In article <357801F8.616D613@matrox.com>, Ala Qumsieh
<aqumsieh@matrox.com> wrote:

> Tom Christiansen wrote:
> 
> > Your best hope is that someone will be happy to let you hire them as a
> > consultant at about $150/hour or better to spend the severely nontrivial
> > amount of time it would take to teach you all this, since your current
> > posting on spidering and your other one about autoposting to USENET do
> > not exactly inspire us to complete confidence in your ability to grasp
> > what we mean when we politely but succinctly suggest that you consult
> > the libnet and the LWP module suites on CPAN--which is likely the most
> > information you're liable to get, and in fact, just have.
> >
> > --tom
> > --
> >     "It's okay to be wrong temporarily." --Larry Wall
> 
>  Boy .. that is the longest sentence I've ever read!

"Rob McKenna was a miserable bastard and he knew it because he'd had a
lot of people point it out to him over the years and he saw no reason to
disagree with them, other than the obvious one, which was that he liked
disagreeing with people, particularly people he disliked, which included,
at the last count, everybody."

- Douglas Adams, "So Long, and Thanks for All the Fish"

Not quite as long as Tom's gem above, but still quite good.
_____________________________________________________________________
Steve Linberg                       National Center on Adult Literacy
Systems Programmer &c.                     University of Pennsylvania
linberg@literacy.upenn.edu              http://www.literacyonline.org


------------------------------

Date: 5 Jun 1998 17:09:37 -0400
From: mjd@op.net (Mark-Jason Dominus)
Subject: Split string into newline-terminated lines
Message-Id: <6l9mqh$s5d$1@monet.op.net>
Keywords: habitat mountainous onrush Paz


Suppose I've done

	{ local $/ = undef;
          $file = <FILE>;
	}

Now I change my mind.  I wish I had just done

	@lines = <FILE>;

instead.

How can I turn $file into @lines?

The obvious thing to try is this:

	@lines = split /\n/, $file;

But this is wrong, because the lines you get are no longer
\n-terminated, and because there's no way at all to tell if the last
line was \n-terminated or not.

I suppose that one possibility is:

	$nls = ($file =~ tr/\n//);  # count newlines
	@lines = split /\n/, $file, $nls;

which should retain a trailing null field if the last line *was*
\n-terminated; then you can go and put the \n's back:

	for (@lines[0..$#lines-1]) {
	  $_ .= "\n";
	}

That seems sort of ridiculous---too much work.

I thought I had an inspiration:

	@lines = split /(?=\n)/, $file;

but it's all wrong; the newlines end up at the beginnings of the
strings, instead of at the ends.

I ended up doing

	@lines = ($file =~ /(.*\n?)/g);

This works pretty well, but it seems bizarre and hard to understand.

I feel like there's something obvious that I'm missing.

What could it be?


------------------------------

Date: 5 Jun 1998 21:26:51 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Split string into newline-terminated lines
Message-Id: <6l9nqr$slj$1@csnews.cs.colorado.edu>
Keywords: habitat mountainous onrush Paz

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc, mjd@op.net (Mark-Jason Dominus) writes:
:Suppose I've done
:	{ local $/ = undef;
:          $file = <FILE>;
:	}
:Now I change my mind.  I wish I had just done
:	@lines = <FILE>;
:instead.
:
:How can I turn $file into @lines?
:The obvious thing to try is this:
:	@lines = split /\n/, $file;

Funny, to me the obvious thing :-) was:

    @lines = split /^/, $file;

I really don't know why the /m modifier isn't required; adding it makes
no difference.
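
For what it's worth, perlfunc documents this behaviour: split treats a
pattern of /^/ as if it were /^/m, since it would be of little use
otherwise. Here is a small sketch comparing it with the match-based
approach from the original question. The sample string and the variable
names are invented for illustration.

```perl
use strict;
use warnings;

# Sample data, invented for this illustration: the last line has no newline.
my $file = "alpha\nbeta\ngamma";

# split special-cases /^/ to mean /^/m, so the string is broken before each
# line start and every piece keeps its own trailing newline.
my @by_split = split /^/, $file;

# The match-based version from the question. In list context the final,
# empty match at the end of the string also becomes an element.
my @by_match = ($file =~ /(.*\n?)/g);

print scalar(@by_split), " pieces from split\n";
print scalar(@by_match), " pieces from the match\n";
```

The split version should give the same list that reading the filehandle
in list context would have given, while the match version carries one
extra empty string at the end that would need to be popped.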

--tom
-- 
    tmps_base = tmps_max;                /* protect our mortal string */
        --Larry Wall in stab.c from the perl source code


------------------------------

Date: 5 Jun 1998 20:26:35 GMT
From: Zenin <zenin@bawdycaste.org>
Subject: Re: TC Loses it! ;) [Was: Re: Why is there no "in" operator in Perl?]
Message-Id: <897078895.708861@thrush.omix.com>

Peter A Fein <p-fein@uchicago.edu> wrote:
: Tom Christiansen <tchrist@mox.perl.com> writes:
: > Essentially, the word "in" should set off a Pavlovian response
: > saying 
: > 
: >     +-------------------------------------------------------------+
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     | HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH HASH |
: >     +-------------------------------------------------------------+
: Umm, whoa. ;)

	That's a lot of hash just lying around.  Tom, you should really
	smoke some of that stuff before it goes stale. :-)
-- 
-Zenin
 zenin@archive.rhps.org


------------------------------

Date: Fri, 05 Jun 1998 17:10:00 -0400
From: "Jeffrey P. Katz" <jefkatz@rci.rutgers.edu>
Subject: Uppercasing user input under CGI
Message-Id: <35785EA8.D68FBDA8@rci.rutgers.edu>

<HTML>
Hello.

<P>I'm a new Perl person, and have written a module that performs a string
<BR>search on an external file. I have 2 questions:

<P>Question 1:&nbsp; Perl5 CGI gives this error message on compilation,
and
<BR>won't return to Unix without a &lt;CTRL>C:

<P>(offline mode: enter name=value pairs on standard input)

<P>What does this mean? And what should I do about it?

<P>Question 2: I'd like to take the variable

<P>$query->textfield('textword')

<P>and use _tr_ to uppercase it. In a test non-CGI Perl program, it works
<BR>OK. When I try in a CGI environment, however, I get an error message
<BR>about not being able to modify this value in a subroutine.

<P>How does one uppercase user-input in a CGI environment?

<P>Thanks for any help!
</HTML>



------------------------------

Date: 5 Jun 1998 21:11:46 GMT
From: Scratchie <upsetter@ziplink.net>
Subject: Re: Uppercasing user input under CGI
Message-Id: <6l9mui$1na@fridge.shore.net>

Jeffrey P. Katz <jefkatz@rci.rutgers.edu> wrote:

: <P>(offline mode: enter name=value pairs on standard input)

: <P>What does this mean? And what should I do about it?

This is probably not obvious to a non-Unix person. You need to hit CTRL-D
to indicate the end of your input (what would normally be passed to your
CGI script as a query string or standard input) and then your program will
execute.

PS: You really shouldn't post HTML to newsgroups. It's hard to read.
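
On Question 2, which the reply above does not cover: tr/// modifies its
target in place, so it needs a variable to work on, not the return value
of a method call. That is probably what the error message is complaining
about. A likely fix, assuming the goal is to uppercase what the user
submitted, is to copy the value into a scalar first. In CGI.pm the
submitted value comes from param(); textfield() generates the HTML for
the input field. The sketch below uses a stand-in sub, fetch_field, so
that it runs without a web server or CGI.pm; that name and the sample
data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $query->param('textword'); it returns what the
# user typed into the form field.
my %submitted = (textword => 'perl users digest');
sub fetch_field { my ($name) = @_; return $submitted{$name}; }

# Copy the value into a variable, then uppercase the copy.
my $word = fetch_field('textword');
(my $upper_tr = $word) =~ tr/a-z/A-Z/;   # tr/// needs an lvalue to modify
my $upper_uc = uc $word;                 # uc returns a new string instead

print "$upper_tr\n$upper_uc\n";
```

Either form leaves the original value untouched, which is handy if the
unmodified input is needed later.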

--Art

--------------------------------------------------------------------------
                    National Ska & Reggae Calendar
            http://www.ziplink.net/~upsetter/ska/calendar.html
--------------------------------------------------------------------------


------------------------------

Date: Fri, 5 Jun 1998 16:16:06 -0400
From: kpreid@ibm.net (Kevin Reid)
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <1da4csr.17043m61si30cgN@slip166-72-108-250.ny.us.ibm.net>

Sam Trenholme <set@netcom.com> wrote:

> You know, I had this particular flame war on comp.os.linux.misc some
> months ago, and I concluded the only way Usenet will expand into a
> decent markup language is if there is a way to embed basic HTML in the
> header of the message, with a simple one or two-line script that would say
> things like "Start boldface at the 124th character in the posting", and so
> on. 

I had exactly the same idea.
 
> It would be akin to an X-Face header, maybe a X-Markup header or some such.

How about a format like this:

START-END:FMT,START-END:FMT, ...

where START and END specify a character range, and FMT is a number or
word specifying the format to apply.
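
A sketch of how a newsreader might parse such a header value, purely to
illustrate the proposed format. Nothing here is a standard: the sample
value, the sub name parse_markup, and the decision to skip malformed
entries are all assumptions.

```perl
use strict;
use warnings;

# Parse "START-END:FMT,START-END:FMT,..." into a list of hash references.
# Malformed entries are skipped rather than treated as fatal.
sub parse_markup {
    my ($value) = @_;
    my @ranges;
    for my $item (split /\s*,\s*/, $value) {
        next unless $item =~ /^(\d+)-(\d+):(\w+)$/;
        next if $2 < $1;                       # END before START
        push @ranges, { start => $1, end => $2, fmt => $3 };
    }
    return @ranges;
}

my @ranges = parse_markup('124-130:bold, 200-215:italic, bogus');
printf "%d-%d as %s\n", @{$_}{qw(start end fmt)} for @ranges;
```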

-- 
  Kevin Reid.      |         Macintosh.
   "I'm me."       |      Think different.


------------------------------

Date: 5 Jun 1998 20:12:54 GMT
From: Zenin <zenin@bawdycaste.org>
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <897078074.880742@thrush.omix.com>

Chris Nandor <pudge@pobox.com> wrote:
	>snip<
: It "must" be declared as such?

	Yes.

: So if I am quoting perlfaq, I need what, a separate MIME attachment?

	No.  It can be in the message header, without needing MIME
	delimiters.  Read the RFC.

: So if someone asks what is a JAPH, and I respond:
: =head2 What is a JAPH?
: These are the "just another perl hacker" signatures that some people
: sign their postings with.  About 100 of the earlier ones are
: available from http://www.perl.com/CPAN/misc/japh .
: ... now I have violated the RFC?

	Yep.  Run it through pod2text, or just cut and paste from an xterm
	when using perldoc.  Think about it; If it was from a "standard"
	man page and you posted the raw man page data, everyone would
	freak and rightfully so.  Yes, pod is milder, but that's not the
	point.  It's *not* meant as the final form to be read by humans.
	It's meant to be filtered into another format first, then read.
	Thus we have pod2text, pod2man, et al.  How many text2pod converters
	are around?

-- 
-Zenin
 zenin@archive.rhps.org


------------------------------

Date: Fri, 05 Jun 1998 21:32:58 GMT
From: Gellyfish@btinternet.com (Jonathan Stowe)
Subject: Re: Use of HTML, POD, etc in Usenet (was: Re: map in void context regarded as evil - suggestion)
Message-Id: <35786103.1887442@news.btinternet.com>

On 5 Jun 1998 15:53:40 GMT, Tom Grydeland wrote :

>
>Oh, please.  Cool down a bit will you?
>

I think so.

On 2 Jun 1998 12:56:47 GMT, Tom Grydeland wrote :

>
>Many OPs in Perl know what context they're called in and modify their
>behaviour to fit the requirement.  For instance, C<scalar keys %hash>

Then I said:

>just a thought, but shouldn't POD markup be deprecated in Usenet postings just as HTML is?

And I *was kidding*, guys and gals.

/J\
Jonathan Stowe
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>



------------------------------

Date: 5 Jun 1998 19:47:07 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Why is there no "in" operator in Perl?
Message-Id: <6l9hvr$ke7$2@csnews.cs.colorado.edu>

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc, 
    scott@softbase.com writes:
:I think set operations should be added to Perl for two reasons:

We already have them.  They're called hashes.

Or, if you prefer, you can use the bit vector sets I posted
recently.
    sub Set::contains {
        my($vector, $element) = @_;
        my $bitno = Set::elt2bit($element);
        return vec($vector, $bitno, 1);
    }

    sub Set::add {
        my(undef, @elements) = @_;      # $_[0] aliases the caller's vector
        for my $element ( @elements ) {
            my $bitno = Set::elt2bit($element);
            vec($_[0], $bitno, 1) = 1;  # set the bit in place
        }
    }

    sub Set::delete {
        my(undef, @elements) = @_;      # $_[0] aliases the caller's vector
        for my $element ( @elements ) {
            my $bitno = Set::elt2bit($element);
            vec($_[0], $bitno, 1) = 0;  # clear the bit in place
        }
    }

    sub Set::clear   { $_[0] = '' if @_ }

    # True when no bit is set.  This avoids /$/, which would also match
    # before a trailing "\n" byte in the vector.
    sub Set::isempty { $_[0] !~ /[^\0]/ }

    # Map each element name to a bit number, assigned on first use.
    $Set::BITS_SEEN = 0;
    sub Set::elt2bit {
        my $name = shift;
        unless (defined $Set::Mapping{$name}) {
            $Set::Mapping{$name} = $Set::BITS_SEEN++;
        }
        $Set::Mapping{$name};
    }

Here was a followup.  I'm the ">" part.  Note the & and | operators
there.  We already have these, you see.

>:To me, that code looks like it's using a hash to store the positions of
>:the bits; if you're going to use a hash, why not just store the
>:true/false values in the hash?
>
>You don't want a full hash per set.  And you want
>this kind of thing to be blazingly fast.  Which it is.
>
>    $intersect = $set1 & $set2;
>    $union     = $set1 | $set2;
>
>:The advantage to this code (that it stores the data compactly) is offset
>:by the need to store the name->position table; so you only get space
>:savings if you are going to store a large number of sets with the same
>:element names.
>
>That's true.  But the space savings is incredible.  You really won't
>believe it.
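
A self-contained sketch of the idea in the exchange above: when both
operands are strings, Perl's & and | work bitwise on the bytes, so sets
stored as vec() bit strings can be intersected and united in a single
operation. The helper names (bit_of, make_set, members) and the sample
elements are hypothetical, chosen only for this illustration.

```perl
use strict;
use warnings;

# Shared name-to-bit-position table, assigned on first use.
my %bit_of;
my $next_bit = 0;
sub bit_of {
    my ($name) = @_;
    $bit_of{$name} = $next_bit++ unless exists $bit_of{$name};
    return $bit_of{$name};
}

# Build a bit-string set from a list of element names.
sub make_set {
    my $vector = '';
    vec($vector, bit_of($_), 1) = 1 for @_;
    return $vector;
}

# List the names whose bit is set, sorted for stable output.
sub members {
    my ($vector) = @_;
    my @names = sort grep { vec($vector, $bit_of{$_}, 1) } keys %bit_of;
    return @names;
}

my $set1 = make_set(qw(apple banana cherry));
my $set2 = make_set(qw(banana cherry date));

my $intersect = $set1 & $set2;   # string operands, so bitwise on bytes
my $union     = $set1 | $set2;

print "intersect: @{[ members($intersect) ]}\n";
print "union:     @{[ members($union) ]}\n";
```

The trade-off raised in the quoted text still applies: the name table is
shared, so the compact storage only pays off when many sets draw on the
same element names.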

:1. To see the punctuation characters that Larry Wall would invent for
:the operations (What could you use? <-=->, <-!->, <-U->, <-O->?)

We nearly never use punctuational operators on aggregates.

When this issue has come up before, such as trying to make
something like this

    @a = @b x @c;

run the cross product or matrix multiply, Larry has said nope.

From my Beginning Perl course exercises, and soon to
find their way into the Perl Cookbook, we have these:

--tom

The solutions below need the following initializations:

    @a = (1, 3, 5, 6, 7, 8);
    @b = (2, 3, 5, 7, 9);

    @union = @isect = @diff = ();
    %union = %isect = ();
    %count = ();

Simple Solution for Union and Intersection

    foreach $e (@a) { $union{$e} = 1 }

    foreach $e (@b) {
        if ( $union{$e} ) { $isect{$e} = 1 }
        $union{$e} = 1;
    }
    @union = keys %union;
    @isect = keys %isect;

More Idiomatic Version

    foreach $e (@a, @b) { $union{$e}++ && $isect{$e}++ }

    @union = keys %union;
    @isect = keys %isect;

Union, Intersection, and Symmetric Difference

    foreach $e (@a, @b) { $count{$e}++ }

    foreach $e (keys %count) {
        push(@union, $e);
        if ($count{$e} == 2) {
            push @isect, $e;
        } else {
            push @diff,  $e;
        }
    }

Indirect Solution

    @isect = @diff = @union = ();

    foreach $e (@a, @b) { $count{$e}++ }

    foreach $e (keys %count) {
        push(@union, $e);
        push @{ $count{$e} == 2 ? \@isect : \@diff }, $e;
    }
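
One caveat with the counting versions above: a count of 2 only means
"present in both lists" if neither list contains duplicates. Here is a
hedged sketch of the same technique wrapped in a sub that de-duplicates
each list first. The sub name set_ops and the numeric sort are choices
made for this illustration, not part of the original solutions.

```perl
use strict;
use warnings;

# Return references to the union, intersection, and symmetric difference
# of two lists.  Each list is de-duplicated first so that a count of 2
# really does mean "present in both".
sub set_ops {
    my ($list_a, $list_b) = @_;
    my %count;
    for my $list ($list_a, $list_b) {
        my %seen;
        $count{$_}++ for grep { !$seen{$_}++ } @$list;
    }
    my (@union, @isect, @diff);
    for my $e (sort { $a <=> $b } keys %count) {   # assumes numeric elements
        push @union, $e;
        push @{ $count{$e} == 2 ? \@isect : \@diff }, $e;
    }
    return (\@union, \@isect, \@diff);
}

my ($union, $isect, $diff) = set_ops([1, 3, 5, 6, 7, 8], [2, 3, 5, 7, 9]);
print "union: @$union\nisect: @$isect\ndiff:  @$diff\n";
```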
-- 
    "Since nobody ever compared Hitler to Hitler, being compared with Hitler
    immediately disqualifies you for Hitlerhood."
    	--Larry Wall


------------------------------

Date: 8 Mar 97 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 8 Mar 97)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V8 Issue 2821
**************************************
