[15571] in Perl-Users-Digest


Perl-Users Digest, Issue: 2984 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon May 8 21:10:50 2000

Date: Mon, 8 May 2000 18:10:21 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <957834620-v9-i2984@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Mon, 8 May 2000     Volume: 9 Number: 2984

Today's topics:
    Re: Perl with VRML <reida@nwu.edu>
        perlex install error -51 embern@my-deja.com
    Re: Please check my 'random' code <andrew.mcguire@walgreens.com>
    Re: Please check my 'random' code <lr@hpl.hp.com>
        Please help cannot display REMOTE_HOST variable joydip_chaklader@my-deja.com
    Re: Printing Arrays <nospam@devnull.com>
    Re: Printing Arrays (Tad McClellan)
    Re: Proper use of resources (was Re: more regexp madnes <andrew.mcguire@walgreens.com>
    Re: Proper use of resources (was Re: more regexp madnes <nospam@devnull.com>
    Re: Proper use of resources (was Re: more regexp madnes (Randal L. Schwartz)
    Re: Proper use of resources (was Re: more regexp madnes <nospam@devnull.com>
    Re: Proper use of resources (was Re: more regexp madnes <nospam@devnull.com>
    Re: Proper use of resources (was Re: more regexp madnes <nospam@devnull.com>
    Re: Reading MS-Word Doc via PERL (John McNamara)
    Re: Reading MS-Word Doc via PERL <phill@modulus.com.au>
    Re: search and replace meta tags in perl - newbie quest <lr@hpl.hp.com>
    Re: search and replace meta tags in perl - newbie quest (Tad McClellan)
    Re: Sort/Remove Duplicates from Database (Tad McClellan)
    Re: splitting into a hash <makarand_kulkarni@My-Deja.com>
    Re: splitting into a hash <phill@modulus.com.au>
    Re: splitting into a hash <lr@hpl.hp.com>
    Re: trouble with a conversion (Bart Lateur)
        using Perl's RE to do basic manipulation of a flex file <ronnie@catlover.com>
    Re: Video comes in bursts <rootbeer@redcat.com>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 08 May 2000 16:51:43 -0500
From: Andrew Reid <reida@nwu.edu>
Subject: Re: Perl with VRML
Message-Id: <391736EF.764E7E8E@nwu.edu>

Bob4dummys wrote:
> 
> I recently got a book on VRML, and in it is a PERL script to make a VRML file.
> When opened the file should produce a VRML file (world).
> This is the code, please tell me how I can make it open as a VRML file, not
> just as a bunch of text.  (Please reply via email)

  Content-type header looks OK, that's all the Perl can
do to get it right, I think.  Other than that, it's up to 
the server and the browser to figure it out.  (It's possible
that "x-world/x-vrml" has been deprecated in favor of
"world/vrml", but I'm not sure.)

  Both the Perl and VRML in your post are pretty old-fashioned.
Even if you can work the kinks out of the (apparent)
server-side issue, I think you're really rolling the dice
on this.

  You might want to check out comp.infosystems.www.authoring.cgi
for server content-type issues, and comp.lang.vrml for hints 
about more recent VRML authoring tools, specifications, and
viewers.
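
  On the Content-type point, here is a minimal sketch of what the
script side amounts to. It is not the code from the book: the sub
name vrml_response and the one-sphere world are made up for
illustration, and the MIME type is a parameter so that either
spelling can be tried against a given server and browser.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete CGI response for a tiny VRML 1.0 world.
# The MIME type defaults to the older x-world/x-vrml spelling.
sub vrml_response {
    my ($mime) = @_;
    $mime = 'x-world/x-vrml' unless defined $mime;

    # The first line of a VRML 1.0 file must be this header comment.
    my $world = "#VRML V1.0 ascii\nSeparator { Sphere { radius 1 } }\n";

    # A blank line separates the CGI header from the body.
    return "Content-type: $mime\n\n" . $world;
}

# Only emit the response when actually running under a web server.
print vrml_response() if $ENV{GATEWAY_INTERFACE};
```

  If the browser still shows plain text with a header like this,
the problem is on the server or viewer side rather than in the Perl.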

				-- A. Reid
				   reida@nwu.edu


------------------------------

Date: Mon, 08 May 2000 22:38:52 GMT
From: embern@my-deja.com
Subject: perlex install error -51
Message-Id: <8f7flq$9bj$1@nnrp1.deja.com>

We are trying to install PerlEx and we keep getting Error -51 when it
tries to install the files.

Here's the breakdown of our system:
Windows NT, IIS4, Option Pack 4, Service Pack 6
We downloaded PLXi116e.exe from ActiveState last week(and again when it
didn't work).
We stopped all services and closed all applications before attempting to
install.
It tries to install in the Perl directory and we let it.  We don't do
anything special.
We have 2 NT IIS servers and we got the same error on both machines.

We are running ActiveState
"perl, version 5.005_03 built for MSWin32-x86-object
(with 1 registered patch, see perl -V for more detail)"
 ...
"Binary build 522 provided by ActiveState Tool Corp..."
"Built 09:52:28 Nov 2 1999"

On my personal machine, I have PWS and it seemed to work fine.

Has anyone else rec'd this error?  Do you have any suggestions on how
to fix it?

Thank you!
Ember


Sent via Deja.com http://www.deja.com/
Before you buy.


------------------------------

Date: Mon, 08 May 2000 17:14:56 -0500
From: "Andrew N. McGuire" <andrew.mcguire@walgreens.com>
Subject: Re: Please check my 'random' code
Message-Id: <39173C60.EA6A3736@walgreens.com>

ra jones wrote:
> 
> I'm fairly new to Perl, but learning fast (I hope). Could some-one check
> my code please. I need to generate a random return from two choices for
> a medical experiment where one of two treatments are given, allocated at
> random. The code I have used is as below:

Are you running under 'use strict' and '-w'? If not, it is a good
idea.

> @treat = ('Test','Control');
> srand(time ^ $$);

Do a 'perldoc -f srand', you could pick a better random seed here.

> $N = @treat;

You could also be more explicit:

$N = scalar @treat;

but that is a matter of preference.

> $N = int(rand($N));
> open(FILE,">>$log"); # path to $log specified elsewhere

It is good practice to use diagnostics, such as:

    open FILE, ">>$log" or die "Can't append to $log: $!\n";

There is more info on this in 'perldoc -f open', and
'perldoc perlopentut'.

[ snip ]

 
> --

That should be '-- ', or two dashes followed by a space.

Cheers,

anm
-- 
Andrew N. McGuire
andrew.mcguire@walgreens.com


------------------------------

Date: Mon, 8 May 2000 16:25:03 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: Please check my 'random' code
Message-Id: <MPG.1380ecacb5afd22d98aa27@nntp.hpl.hp.com>

In article <39173C60.EA6A3736@walgreens.com> on Mon, 08 May 2000 
17:14:56 -0500, Andrew N. McGuire <andrew.mcguire@walgreens.com> says...
> ra jones wrote:
> > 
> > I'm fairly new to Perl, but learning fast (I hope). Could some-one check
> > my code please. I need to generate a random return from two choices for
> > a medical experiment where one of two treatments are given, allocated at
> > random. The code I have used is as below:

 ...

> > srand(time ^ $$);
> 
> Do a 'perldoc -f srand', you could pick a better random seed here.

It will tell you not to call srand:

  In fact, it's usually not necessary to call srand at all, because if
  it is not called explicitly, it is called implicitly at the first use 
  of the rand operator. However, this was not the case in versions of
  Perl before 5.004, so if your script will run under older Perl
  versions, it should call srand.

To simulate a coin toss, the rand() function is quite adequate, and your 
approach is OK.  It can be compressed thus:

    (qw(Test Control))[rand 2]
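
A minimal sketch of the same coin toss wrapped in a sub, assuming the
two arm names from the original post (the sub name allocate_treatment
is made up for illustration). It relies on the implicit srand
described above, so it assumes Perl 5.004 or later:

```perl
use strict;
use warnings;

my @treat = qw(Test Control);

# Pick one arm at random. rand @treat yields a float in [0, 2),
# and the array subscript truncates it to 0 or 1.
sub allocate_treatment {
    return $treat[ rand @treat ];
}

print allocate_treatment(), "\n";
```

Using rand @treat rather than rand 2 keeps the sub correct if a third
arm is ever added to the list.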

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Tue, 09 May 2000 00:19:29 GMT
From: joydip_chaklader@my-deja.com
To: jchaklader@hotmail.com
Subject: Please help cannot display REMOTE_HOST variable
Message-Id: <8f7li2$fu0$1@nnrp1.deja.com>

I have developed a script for my hit counter. The environment variables
are all displayed properly except the remote host.

http://www.hitostat.com/cgi-bin/hit.pl

the programme is in this url

http://www.hitostat.com/hit.html

I think the server may not be configured to set the REMOTE_HOST variable
(it's a real drain on system resources to do a reverse DNS lookup for
every request). For now I am trying to use the REMOTE_ADDR variable
instead: my script checks REMOTE_HOST and, if it's empty, uses
REMOTE_ADDR.
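
A minimal sketch of that fallback, with an optional reverse lookup
done in the script itself. The sub name visitor_name and its
$do_lookup flag are made up for illustration; inet_aton comes from
the standard Socket module and gethostbyaddr is a Perl builtin. The
lookup is an assumption about what would help here, and it carries
the same per-request DNS cost mentioned above:

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Hypothetical helper: prefer REMOTE_HOST, then (optionally) a reverse
# DNS lookup of REMOTE_ADDR, then the raw address itself.
sub visitor_name {
    my ($env, $do_lookup) = @_;

    my $host = $env->{REMOTE_HOST};
    return $host if defined $host && length $host;

    my $addr = $env->{REMOTE_ADDR};
    return 'unknown' unless defined $addr && length $addr;

    if ($do_lookup) {
        # inet_aton packs the dotted quad; gethostbyaddr may return undef.
        my $packed = inet_aton($addr);
        my $name   = defined $packed ? gethostbyaddr($packed, AF_INET) : undef;
        return $name if defined $name && length $name;
    }
    return $addr;
}

print visitor_name(\%ENV), "\n";
```

Calling visitor_name(\%ENV, 1) turns the lookup on; addresses with no
reverse DNS entry still come back as the bare IP.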

But then again, my programme will be a hit counter for other sites,
giving them site statistics, so it will be very poor if I cannot
provide the domain names of their visitors.

I found a note somewhere in the Apache documentation saying that
mod_perl.c can be modified to display environment variables that were
not being displayed, but I don't know exactly how to do that.

Can anybody give me some idea ?

Thanks in advance




------------------------------

Date: 8 May 2000 23:12:51 GMT
From: The WebDragon <nospam@devnull.com>
Subject: Re: Printing Arrays
Message-Id: <8f7hlj$gac$2@216.155.32.145>

In article <fDFR4.38997$x4.1283757@newsread1.prod.itd.earthlink.net>, 
"Jason Malone" <jamalone@earthlink.net> wrote:

<snip copious included "so and so said:" headers>
 | > >> I am trying to print individual elements of an array
 | > >> using a loop structure as follows:
 | > >>
 | > >> for ($i = $num_modules; $i >= 1; $i--) {
 | > >>
 | > >> print $block[$i],"\n";
 | > >>
 | > >> }
 | > >>
 | > > You could also try
 | >
 | > > foreach (@block) { print $_; }
 | >
 | > Now, why over-elaborate things? :-)
 | >
 | >   print "$_\n" for @block;
 | >
 | > hth
 | > t
 | >
 | 
 | Just a style issue :)
 | Jason

heh. 

saving this in my growing library of 'neat ways to do things'

That's a nice trick :)

-- 
send mail to mactech (at) webdragon (dot) net instead of the above address. 
this is to prevent spamming. e-mail reply-to's have been altered 
to prevent scan software from extracting my address for the purpose 
of spamming me, which I hate with a passion bordering on obsession.  


------------------------------

Date: Mon, 8 May 2000 19:13:39 -0400
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: Printing Arrays
Message-Id: <slrn8heih3.dsm.tadmc@magna.metronet.com>

On Mon, 08 May 2000 20:32:44 GMT, Jason Malone <jamalone@earthlink.net> wrote:
>
>"Eric Hueckel" <eric.hueckel@vitesse.com> wrote in message
>news:3916DA0B.547CBF4F@vitesse.com...
>> To All,
>>
>> I am trying to print individual elements of
>> an array using a loop structure as follows:
>>
>> for ($i = $num_modules; $i >= 1; $i--) {
                              ^^^^

So you want to ignore the first element then? (at index zero)

I'll assume that was a mistake.


>>    print $block[$i],"\n";
>>
>> }
>>
>You could also try
>
>foreach (@block) {
>    print $_;
>}

But that loops from lowest index to highest index, and adds
no newline.

The OP's code goes in the opposite direction and adds a newline.


So:

   foreach ( reverse @block) {
       print "$_\n";
   }


or:

   foreach ( reverse 0..$#block) {
       print "$block[$_]\n";
   }


or:
   { local $, = "\n";
     print reverse @block;
   }
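
One difference worth noting: $, is only placed between the items, so
the last variant prints no newline after the final element, while the
two loops do. A minimal sketch that builds the output as a string
with a newline after every element (the sub name reversed_lines is
made up for illustration):

```perl
use strict;
use warnings;

# Return the elements in reverse order, each followed by a newline.
sub reversed_lines {
    my @items = @_;
    return join '', map { "$_\n" } reverse @items;
}

my @block = qw(alpha beta gamma);
print reversed_lines(@block);
```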


-- 
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: Mon, 08 May 2000 18:27:40 -0500
From: "Andrew N. McGuire" <andrew.mcguire@walgreens.com>
Subject: Re: Proper use of resources (was Re: more regexp madness extracting data    from files.)
Message-Id: <39174D6C.AC56D368@walgreens.com>

The WebDragon wrote:

[ snip ]

> the line
>       @dataList .= "$1 \n";
> 
> generates an error of
> 
> # Can't modify private array in concatenation, near ""$1 \n";"
> File '[snip]:ratings:fileGrab.pl'; Line 21
> 
> which perldiag.pod describes as
> 
>     Can't modify %s in %s
> 
> (F) You aren't allowed to assign to the item indicated, or otherwise try
> to change it, such as with an auto-increment.
> 
> and the question is : why not?

I think the below is what you want.. I rewrote it,
cleaning up some things, maybe not everything...

#!/usr/bin/perl -w

use strict;
use diagnostics -verbose;
use File::Spec;

my $inputDir = File::Spec->catfile( File::Spec->curdir(),
                                    'input_files', '');

opendir DIR, $inputDir or die "Can't opendir $inputDir: $!";
my @filesList = readdir DIR;
closedir DIR;

my @dataList;
for (@filesList) {
    my $grabFile = $inputDir . $_;
    open GRAB, "<$grabFile" or die "Can't open file $grabFile: $!\n";
    while(<GRAB>) {
        next unless /^\Qmaps[i++] = new Map(\E([^)]+)/;
        push @dataList, "$1 \n"; # This is what you want, I think.
    }
    close GRAB;
}

open OUT, ">MyCompleteList.txt" or die "Can't open output file: $!\n";

for (@dataList) { print OUT "$_\n" }

close OUT;
__END__

I hope that the few corrections I made illustrate what you
want, as well as some things that could be done more easily.
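
One thing I did not clean up: readdir also returns '.' and '..' (and
any subdirectories), and the loop above tries to open those too. A
minimal sketch of filtering them out first, assuming only plain files
are wanted (the sub name plain_files is made up for illustration):

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the names of the plain files in $dir,
# skipping '.', '..' and any subdirectories.
sub plain_files {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Can't opendir $dir: $!";
    my @names = grep { -f File::Spec->catfile($dir, $_) } readdir $dh;
    closedir $dh;
    return sort @names;
}

print "$_\n" for plain_files(File::Spec->curdir());
```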

Cheers,

anm
-- 
Andrew N. McGuire
andrew.mcguire@walgreens.com


------------------------------

Date: 9 May 2000 00:16:05 GMT
From: The WebDragon <nospam@devnull.com>
Subject: Re: Proper use of resources (was Re: more regexp madness extracting data    from files.)
Message-Id: <8f7lc5$2kh$1@216.155.32.145>

In article <39174D6C.AC56D368@walgreens.com>, "Andrew N. McGuire" 
<andrew.mcguire@walgreens.com> wrote:

 | I think the below is what you want.. I rewrote it,
 | cleaning up some things, maybe not everything...
[snip] 
 | I hope that the few corrections I made illustrate what you
 | want, as well as some things that could be done more easily.

yes, and I updated my code with them.. along with a few other changes I 
made in the interim.. this now produces the results I want. 

I changed     for (@dataList) { print OUT "$_\n" }

to     for (@dataList) { print OUT "$_" }

since the \n 's were already in there from the 
  push @dataList, "$1 \n" ;
line, and this lets me more easily break up sections visually, and also 
detect for whitespace/newlines surrounding my begin and end blocks to 
break up each section, when I go to output them back again. 

I posted some other questions, one of which you already answered with 
your for (@dataList) { print OUT "$_" } snippet. care to take a whack at 
the remaining one? :)

-- 
send mail to mactech (at) webdragon (dot) net instead of the above address. 
this is to prevent spamming. e-mail reply-to's have been altered 
to prevent scan software from extracting my address for the purpose 
of spamming me, which I hate with a passion bordering on obsession.  


------------------------------

Date: 08 May 2000 17:26:40 -0700
From: merlyn@stonehenge.com (Randal L. Schwartz)
Subject: Re: Proper use of resources (was Re: more regexp madness extracting data    from files.)
Message-Id: <m1wvl4muv3.fsf@halfdome.holdit.com>

>>>>> "The" == The WebDragon <nospam@devnull.com> writes:

The>     for (@dataList) { print OUT "$_" }

print OUT @dataList;
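
A minimal sketch showing that the two forms write the same bytes. It
uses in-memory filehandles instead of OUT (an assumption that needs
Perl 5.8 or later) and assumes $, and $\ are left at their defaults:

```perl
use strict;
use warnings;

my @dataList = ("one \n", "two \n");

# Write element by element, as in the quoted loop.
my $loop_out = '';
open my $loop_fh, '>', \$loop_out or die "Can't open in-memory handle: $!";
print {$loop_fh} "$_" for @dataList;
close $loop_fh;

# Hand the whole list to a single print.
my $list_out = '';
open my $list_fh, '>', \$list_out or die "Can't open in-memory handle: $!";
print {$list_fh} @dataList;
close $list_fh;

print $loop_out eq $list_out ? "identical\n" : "different\n";
```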

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: 8 May 2000 23:02:30 GMT
From: The WebDragon <nospam@devnull.com>
Subject: Re: Proper use of resources (was Re: more regexp madness extracting data from files.)
Message-Id: <8f7h26$gac$0@216.155.32.145>

In article <x7zoq0vhpy.fsf@home.sysarch.com>, Uri Guttman 
<uri@sysarch.com> wrote:

 | >>>>> "TW" == The WebDragon <nospam@devnull.com> writes:
 | 
 |   TW>       @dataList .= "$1 \n";
 | 
 |   TW> # Can't modify private array in concatenation, near ""$1 \n";"
 |   TW> File '[snip]:ratings:fileGrab.pl'; Line 21
 | 
 |   TW> (F) You aren't allowed to assign to the item indicated, or 
 |   otherwise try 
 |   TW> to change it, such as with an auto-increment. 
 | 
 |   TW> and the question is : why not?
 | 
 | simple. you are applying the scalar assignment operator .= to an
 | array. it logically expands to:
 | 
 | 	@dataList = @dataList . "$1 \n" ;
 | 
 | which make no sense as you are assigning a single element to the list
 | which is the concatenation of the number of elements in the list and
 | some string. so you are accessing its count and changing the count in
 | one expression so perl says "don't do that!!"
 | 
 | you probably meant to use push:
 | 
 | 	push @dataList, "$1 \n" ;
 | 
 | 
 | also the rest of the code has some areas which can be improved
 | stylistically.


this much I already know, which is why I posted the original question 
WITHOUT code, as it's a fairly simple process, but I wanted to see how a 
"seasoned pro" would write it. 

I'll try your suggestion.. thanks! :)
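
For reference, a minimal sketch of the difference between the two
operations, using made-up sample values. It avoids .= on the array
itself, since that is the compile-time error quoted above:

```perl
use strict;
use warnings;

my @dataList = ('first', 'second');

# An array used in string concatenation is evaluated in scalar context,
# so it contributes its element count, not its contents.
my $count_string = @dataList . " items";

# push is the list operation that appends an element.
push @dataList, "third \n";

print "$count_string\n";
print scalar(@dataList), " elements after push\n";
```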

-- 
send mail to mactech (at) webdragon (dot) net instead of the above address. 
this is to prevent spamming. e-mail reply-to's have been altered 
to prevent scan software from extracting my address for the purpose 
of spamming me, which I hate with a passion bordering on obsession.  


------------------------------

Date: 8 May 2000 23:05:16 GMT
From: The WebDragon <nospam@devnull.com>
Subject: Re: Proper use of resources (was Re: more regexp madness extracting data from files.)
Message-Id: <8f7h7c$gac$1@216.155.32.145>

In article <m1bt2gog0v.fsf@halfdome.holdit.com>, merlyn@stonehenge.com 
(Randal L. Schwartz) wrote:

 | >>>>> "Randal" == Randal L Schwartz <merlyn@stonehenge.com> writes:
 | 
 | Randal> Post with a valid email address (spamblocked if you must).  If
 | Randal> you had done that, I wouldn't need to needle you in public -
 | Randal> this could have been a private message.
 | 
 | I apologize for this part of the message.  Your spamblocked address
 | is in your .sig block.  Silly me for not looking there.  Sorry.

does this mean the newbie gets to thwap the seasoned veteran with the 
clue stick, or is the self-flagellation implied in the context? :) 

(I may be a newbie to PERL, but not to USENET :D I was on USENET even 
before Mosaic 1.0 became a reality. Hell, for that matter, does anyone 
remember FidoNet? ;o)

-- 
send mail to mactech (at) webdragon (dot) net instead of the above address. 
this is to prevent spamming. e-mail reply-to's have been altered 
to prevent scan software from extracting my address for the purpose 
of spamming me, which I hate with a passion bordering on obsession.  


------------------------------

Date: 9 May 2000 00:06:23 GMT
From: The WebDragon <nospam@devnull.com>
Subject: Re: Proper use of resources (was Re: more regexp madness extracting data from files.)
Message-Id: <8f7kpv$2kh$0@216.155.32.145>

In article <8f7h26$gac$0@216.155.32.145>, The WebDragon 
<nospam@devnull.com> wrote:

 | In article <x7zoq0vhpy.fsf@home.sysarch.com>, Uri Guttman 
 | <uri@sysarch.com> wrote:
 | 
 |  | >>>>> "TW" == The WebDragon <nospam@devnull.com> writes:
 |  | 
 |  |   TW>       @dataList .= "$1 \n";
 |  | 
 |  |   TW> # Can't modify private array in concatenation, near ""$1 \n";"
 |  |   TW> File '[snip]:ratings:fileGrab.pl'; Line 21
 |  | 
 |  |   TW> (F) You aren't allowed to assign to the item indicated, or 
 |  |   otherwise try 
 |  |   TW> to change it, such as with an auto-increment. 
 |  | 
 |  |   TW> and the question is : why not?
 |  | 
 |  | simple. you are applying the scalar assignment operator .= to an
 |  | array. it logically expands to:
 |  | 
 |  | 	@dataList = @dataList . "$1 \n" ;
 |  | 
 |  | which make no sense as you are assigning a single element to the list
 |  | which is the concatenation of the number of elements in the list and
 |  | some string. so you are accessing its count and changing the count in
 |  | one expression so perl says "don't do that!!"
 |  | 
 |  | you probably meant to use push:
 |  | 
 |  | 	push @dataList, "$1 \n" ;
 |  | 
 |  | 
 |  | also the rest of the code has some areas which can be improved
 |  | stylistically.
 | 
 | 
 | this much I already know, which is why I posted the original question 
 | WITHOUT code, as it's a fairly simple process, but I wanted to see how a 
 | "seasoned pro" would write it. 
 | 
 | I'll try your suggestion.. thanks! :)

okiedokie, the basic "shell" of the completed script is here. 

-=-begin-=-
#!perl
use strict;
use diagnostics -verbose;
use File::Spec;

my $inputDir = File::Spec->catfile( File::Spec->curdir(), 'input_files', 
'');
my(@filesList, $grabFile, @dataList);

opendir(DIR, $inputDir) || die "can't opendir $inputDir: $!";
    @filesList = readdir(DIR);
closedir DIR;

foreach my $flist (@filesList) {
    $grabFile = $inputDir . $flist;
    open(GRAB, "<$grabFile") or die("Cannot open file $grabFile: $!");
#   print 'Successful open of '. $grabFile ."\n"; # debugging
    push @dataList, "--BEGIN $flist --\n\n";
    while(<GRAB>) {
        next unless /^\Qmaps[i++] = new Map(\E([^)]+)/;
        push @dataList, "$1 \n" ;
#       print $1 . "\n"; # debugging
    }
    push @dataList, "\n--END--\n\n";
    close (GRAB);
};

open(OUT, ">maps_list.txt") or die('Cannot open output file: ' . $!);
    foreach my $text (@dataList) {
        print OUT $text;
    };
close (OUT);
-=-end-=-
Question #1: 

now from some other post I just saw on here and if I understood this 
correctly, I can replace 
    foreach my $text (@dataList) {
        print OUT $text;
    };
with
    print "$_\n" for @dataList;

correct?

Question #2 

on contemplation, I'd like to add to this script the ability to grab 
these six .html files containing the data I wish to extract via http, 
and save them locally, overwriting the previous ones, before running 
this script on them and generating my output file of the data. (this way 
I can daily keep my local output file up to date.)

Which modules do you recommend that would best suit this sort of thing? 
I'm guessing one of the Net:: or HTTP:: modules? Which would you use? and 
why?

-- 
send mail to mactech (at) webdragon (dot) net instead of the above address. 
this is to prevent spamming. e-mail reply-to's have been altered 
to prevent scan software from extracting my address for the purpose 
of spamming me, which I hate with a passion bordering on obsession.  


------------------------------

Date: Mon, 08 May 2000 22:24:38 GMT
From: writeexcel@eircom.net (John McNamara)
Subject: Re: Reading MS-Word Doc via PERL
Message-Id: <39173e92.9464675@news1.eircom.net>

On Mon, 08 May 2000 20:06:01 GMT, Karim Wall
<mirak63@yahoo.com> wrote:

>How would one read directly from an
>MS-WORD document using PERL.
>
>I have about 300 .doc files to scan through and
>I don't relish opening each of them within WORD
>and doing a SAVE AS Text.

Try the Win32::OLE module and office automation. This requires a
Windows platform and an installed copy of Word. Have a look at:
http://www.activestate.com/ActivePerl/docs/faq/Windows/ActivePerl-Winfaq12.html
http://www.activestate.com/ActivePerl/docs/site/lib/Win32/OLE.html

On Linux/Unix try OLE::Storage, aka LAOLA. This is a Perl interface to
OLE file formats. This includes "lhalw - Laola Have A Look at Word".
Try:
http://user.cs.tu-berlin.de/~schwartz/pmh/

In general, a lot of repetitive Word tasks are more easily
accomplished using VBA and Word itself.

John McNamara
-- 





------------------------------

Date: Tue, 09 May 2000 09:44:52 +1000
From: Peter Hill <phill@modulus.com.au>
Subject: Re: Reading MS-Word Doc via PERL
Message-Id: <39175174.3C9B@modulus.com.au>

Karim Wall wrote:
> 
> Hello,
> 
> Sorry for the stupid question...
> How would one read directly from an
> MS-WORD document using PERL.
> 
> I have about 300 .doc files to scan through and
> I don't relish opening each of them within WORD
> and doing a SAVE AS Text.
> 
> Is there an EZ way?
> 
> Thanks,
This (read directly from an MS-WORD document using PERL.) is possible
with OLE, but the EZ (TM?) way IMHO would be to write a macro in Word's
VB implementation to read the docs and save them as text. 

-- 
Peter Hill,
Modulus Pty. Ltd.,
http://www.modulus.com.au/


------------------------------

Date: Mon, 8 May 2000 16:05:53 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: search and replace meta tags in perl - newbie question
Message-Id: <MPG.1380e82c23d48cbb98aa25@nntp.hpl.hp.com>

In article <slrn8he8vl.dia.tadmc@magna.metronet.com> on Mon, 8 May 2000 
16:30:45 -0400, Tad McClellan <tadmc@metronet.com> says...

 ...

> On Mon, 08 May 2000 18:46:43 GMT, Jason Malone <jamalone@earthlink.net> wrote:
> >"Hilary Cotter" <hpcotter@my-deja.com> wrote in message
> >news:8f6qg2$g48$1@nnrp1.deja.com...

 ...

> >Try this
> >
> >s/<HEAD>/<HEAD<meta name\=keywords content\=\"dejanews usenet newsgroup
> >articles search query discussion\">/g

 ...

> You only need to quote characters that have a meta (special)
> meaning.
> 
> quotes and equal signs are not meta in regexs, so escaping
> them accomplishes nothing (except make your code harder to
> read, which is bad).

That isn't even a regex; it is the substitution string of a substitution 
operator.  Even fewer characters have metasemantics.

As I replied to the original poster, $, @, \, plus the regex delimiter.
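
A minimal sketch of that rule, using a made-up input string and a
shortened keyword list. The quotes and equals signs go in unescaped;
the second substitution shows the one case in this example that does
need a backslash:

```perl
use strict;
use warnings;

my $html = '<HEAD><TITLE>Example</TITLE></HEAD>';

# Quotes and equals signs need no backslash on the replacement side;
# only $, @, \ and the delimiter (here /) would.
(my $tagged = $html) =~ s/<HEAD>/<HEAD><meta name=keywords content="dejanews usenet">/;

# A literal @ does need escaping, otherwise Perl tries to interpolate an array.
(my $mailed = 'contact: ADDRESS') =~ s/ADDRESS/someone\@example.com/;

print "$tagged\n$mailed\n";
```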

But that wasn't enough information to satisfy him.  :-(

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Mon, 8 May 2000 19:01:41 -0400
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: search and replace meta tags in perl - newbie question
Message-Id: <slrn8hehql.dsm.tadmc@magna.metronet.com>


[ Please put your comments *following* the quoted text that
  you are commenting on.

  Please do NOT quote text that you are not going to comment on.

  Please do not quote .sigs

  Please visit     news.announce.newusers

  Jeopardectomy performed.
]


On Mon, 08 May 2000 20:38:35 GMT, Jason Malone <jamalone@earthlink.net> wrote:
>"Larry Rosler" <lr@hpl.hp.com> wrote in message
>news:MPG.1380b6983609443298aa22@nntp.hpl.hp.com...
>> In article <nYDR4.38784$x4.1273346@newsread1.prod.itd.earthlink.net> on
>> Mon, 08 May 2000 18:46:43 GMT, Jason Malone <jamalone@earthlink.net>
>> says...
>> > Try this
>> >
>> > s/<HEAD>/<HEAD<meta name\=keywords content\=\"dejanews usenet newsgroup
>>                 ^
>>                 >
>> > articles search query discussion\">/g
>> >
>> > It is important to escape certain characters like quotes and equal signs.
>>
>> Wrong on each of those.  The only characters that must be escaped in the
>> substitution string are $, @, \, and the regex delimiter.
>>
>> And the /g flag isn't likely to do much.  How many '<HEAD>' tags do you
>> think there will be?


>OK Larry,
>
>Why don't you post a solution instead of just a slap on the wrist.


Maybe because the OP's code *works* just fine?

So there is no need for a "solution" because there is no "problem" :-)

There is something the OP hasn't told us...

He gave us one line of code. There is nothing wrong with that line.
The problem must be elsewhere.


-- 
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: Mon, 8 May 2000 19:25:29 -0400
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: Sort/Remove Duplicates from Database
Message-Id: <slrn8hej79.dsm.tadmc@magna.metronet.com>

On Mon, 08 May 2000 19:49:32 GMT, jzoetewey@my-deja.com <jzoetewey@my-deja.com> wrote:

>I'm currently trying to create a program that will allow me to sort and
>remove duplicates from information derived from a database.
        ^^^^^^^^^

So you have already searched the Perl FAQs for some
appropriate keywords then?

Looks like you missed one  :-)


   perldoc -q duplicate

   "How can I remove duplicate elements from a list or array?"


>The information will be in fixed width columns (first name is at 10-19,
                            ^^^^^^^^^^^^^^^^^^^

A Perlified brain starts shouting "unpack!" when exposed
to those keywords.


   perldoc -f unpack


>last name starts at 20-39, company at...).
>
>I'm assuming that I'll use substr to read the field I want to sort by
                        ^^^^^^^^^^

You can use substr(), but unpack will be easier to understand
and take a whole lot less typing. (You need 10 substr()s for 10
fields. You need 1 unpack() for 10, or indeed any number of, fields.)
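
A minimal sketch of the unpack approach. The record is made up, and
the template assumes the quoted column numbers are 1-based (so nine
characters come before the first name) and that the company simply
runs to the end of the line:

```perl
use strict;
use warnings;

# Made-up record: 9 characters of other data, then a 10-wide first name
# and a 20-wide last name, then the company.
my $record = sprintf '%-9s%-10s%-20s%s', '000000042', 'Jane', 'Doe', 'Acme';

# x9 skips nine bytes; A10 and A20 read space-padded fields and
# strip the trailing padding; A* takes whatever is left.
my ($first, $last, $company) = unpack 'x9 A10 A20 A*', $record;

print "$last, $first ($company)\n";
```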


>(for example: address) and the record number into some kind of array.
>I'll then sort by the field (address, name, whatever...).  After things
>are sorted, I'll use the order the record number ends up in to determine
 ^^^^^^^^^^


   perldoc -q sort

   perldoc -f sort

for more insights into how to sort stuff in Perl.


> what order the records should be in in the finished file.
>
>My questions go like this:
>
>1. Is that a good strategy for the program?


Not if you believe the Perl FAQs.

I generally believe those  :-)


>2. What kind of array should I be putting the field and record number into?
> Associative? A multi-dimensional array?
  ^^^^^^^^^^^

They are not called "associative arrays" anymore.

They are called "hashes".

Yes, use a hash.
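
A minimal sketch of how a hash covers both jobs, using made-up
records keyed by record number. The %seen idiom is the one the FAQ
entry describes; treating "same address" as the definition of a
duplicate is only an assumption for the example:

```perl
use strict;
use warnings;

# Made-up sample data: record number => address field.
my %address_of = (
    1 => '9 Elm Street',
    2 => '1 Oak Avenue',
    3 => '9 Elm Street',   # duplicate of record 1
);

# Keep only the first record number seen for each address.
my %seen;
my @unique = grep { !$seen{ $address_of{$_} }++ }
             sort { $a <=> $b } keys %address_of;

# Order the surviving record numbers by their address.
my @ordered = sort { $address_of{$a} cmp $address_of{$b} } @unique;

print "$_: $address_of{$_}\n" for @ordered;
```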


>3. Would it be better to bring the whole database into an array at once
>rather than just two fields?
>
>Feel free to include any tips that you think might not be obvious to a
>beginner...


The Very First Thing To Know is how to make use of the ~1200 "pages"
of documentation that got installed on your hard disk along with
the perl interpreter.


   perldoc perldoc

will get you started.


Good luck!


-- 
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: Mon, 08 May 2000 15:49:47 -0700
From: Makarand Kulkarni <makarand_kulkarni@My-Deja.com>
Subject: Re: splitting into a hash
Message-Id: <3917448B.85249C62@My-Deja.com>



rbbdsb wrote:

> I've been trying to accomplish this with split, but am having trouble
> figuring out how to pull the data as a field/value pair.  The number of
> pairs varies with every line, so I can't do things like

you can use the result of split () and assign it to a hash.

see example

use Data::Dumper ;
my %MyHash ;
while (<DATA>)
        {
        # complains if number of items found is odd, also assumes that
        # your items on each line are separated with a space
        my %hash = split ( ' ',$_);
        $MyHash {$.}=\%hash; # $. holds the current line number.
        }
print Dumper \%MyHash ;
exit;

__DATA__
field value field1 value1 field2 value2 field3 value3
field4 value4 field5 value5 field6 value6 field7 value7

results :
$VAR1 = {
          1 => {
                 'field' => 'value',
                 'field1' => 'value1',
                 'field2' => 'value2',
                 'field3' => 'value3'
               },
          2 => {
                 'field4' => 'value4',
                 'field5' => 'value5',
                 'field6' => 'value6',
                 'field7' => 'value7'
               }
        };




------------------------------

Date: Tue, 09 May 2000 09:02:23 +1000
From: Peter Hill <phill@modulus.com.au>
Subject: Re: splitting into a hash
Message-Id: <3917477F.2EE1@modulus.com.au>

rbbdsb wrote:
> 
> Hi,
>   I've got a long variable length string that I want to read as data pairs
> and  stuff into a hash.  The pattern would be something like:
> 
> Data
> field value field1 value1 field2 value2 field3 value3 ...
> 
> Code
> split();  # or some variation there of
> $StringCount++;
> $MyHash{$StringCount}{$field} = value;
> foreach my $value( keys %{$MyHash{StringCount}{$field}} )
> 
>   print "$field, $value, $StringCount, $MyHash{$StringCount}{$field}\n";
> }
> 
> I've been trying to accomplish this with split, but am having trouble
> figuring out how to pull the data as a field/value pair.  The number of
> pairs varies with every line, so I can't do things like
> 
>  my ($field1, $value1, $field2, $value2) = split;
> 
> Anyone have any thoughts?
> 
> TIA,
> Russ

For the moment, you could use:

#! /bin/perl -w
use strict;
my $str = 'field value field1 value1 field2 value2 field3 value3';
my @pairs = split(/ /,$str);
my %theHash;

while (@pairs){
	my $field  = shift(@pairs);
	my $value = shift(@pairs);
	$theHash{$field} = $value;
}
print "$_ $theHash{$_}\n" for sort keys %theHash;
__END__

but it will be interesting to see some of the more efficient approaches
which will no doubt be posted.
-- 
Peter Hill,
Modulus Pty. Ltd.,
http://www.modulus.com.au/


------------------------------

Date: Mon, 8 May 2000 16:17:43 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: splitting into a hash
Message-Id: <MPG.1380eaef46c6d4f198aa26@nntp.hpl.hp.com>

In article <plGR4.39085$x4.1288569@newsread1.prod.itd.earthlink.net> on 
Mon, 08 May 2000 21:29:57 GMT, rbbdsb <rbbdsb@earthlink.net> says...
>   I've got a long variable length string that I want to read as data pairs
> and  stuff into a hash.  The pattern would be something like:
> 
> Data
> field value field1 value1 field2 value2 field3 value3 ...
> 
> Code
> split();  # or some variation there of
> $StringCount++;
> $MyHash{$StringCount}{$field} = value;
> foreach my $value( keys %{$MyHash{StringCount}{$field}} )
> 
>   print "$field, $value, $StringCount, $MyHash{$StringCount}{$field}\n";
> }

Are you looking for a one-dimensional or two-dimensional data structure?  
For a one-dimensional structure, if the string is in $_:

  %MyHash = split;

Simple enough?

> I've been trying to accomplish this with split, but am having trouble
> figuring out how to pull the data as a field/value pair.  The number of
> pairs varies with every line, so I can't do things like
> 
>  my ($field1, $value1, $field2, $value2) = split;

The code I suggested above wipes out the previous value of %MyHash, if 
any.  If you are adding values within a loop, it has to be more 
elaborate than that.

  $MyHash{$1} = $2 while /(\S+)\s+(\S+)/g;
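
For the two-dimensional case, one possible sketch follows. It assumes one record per input line and a counter as the outer key, as in the original pseudocode; the sample lines in @lines are invented for illustration.

```perl
use strict;
use warnings;

# Invented sample data: one record per line, a varying number of pairs.
my @lines = (
    'field value field1 value1 field2 value2 field3 value3',
    'field4 value4 field5 value5',
);

my %MyHash;
my $StringCount = 0;
for (@lines) {
    $StringCount++;
    # split returns field, value, field, value, ... which is exactly
    # the list an anonymous hash constructor expects.
    $MyHash{$StringCount} = { split ' ', $_ };
}

for my $count (sort { $a <=> $b } keys %MyHash) {
    for my $field (sort keys %{ $MyHash{$count} }) {
        print "$count: $field = $MyHash{$count}{$field}\n";
    }
}
```

A line with an odd number of words would leave the last field with an undefined value (and a warning under -w), so real data may need a check first.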

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Mon, 08 May 2000 22:03:51 GMT
From: bart.lateur@skynet.be (Bart Lateur)
Subject: Re: trouble with a conversion
Message-Id: <391a3600.4766501@news.skynet.be>

troylachinski@my-deja.com wrote:

>I am converting a comma delimited file to a fixed length file.  At the
>end of each record I need to add a carriage return (HEX 0D = "\x0d"?)
>and then a line feed (HEX 0A = "\x0a"?).  When I add these the record
>gets an extra byte.

It's the LF ("\x0a") getting converted into CR+LF ("\x0d\x0a"). In
short: as long as you're doing this on a platform that uses CR+LF as
default, adding "\n" (= "\x0a") is enough.

If you DON'T want this effect, apply binmode() on your output
filehandle.

Try this:

    open OUT,">test.txt";
    print OUT "\x0a";
    close OUT;
    open(IN,"test.txt");
    binmode IN;
    my $fl = read IN, $_, -s IN;
    print "Size: $fl bytes\n";
    ($\,$,) = ("\n"," ");
    print 'Bytes:', map { sprintf '%02X', $_ } unpack 'C*',$_;

I get:

	Size: 2 bytes
	Bytes: 0D 0A


And finally, to remove any remaining doubt:

	$_ = "\n";
	($\,$,) = ("\n"," ");
	print length, ord;

-->
	1 10
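
Applied to the original conversion, the binmode() route might look like the sketch below. The file name and the record layout are made up for illustration; only the binmode() call and the explicit "\x0d\x0a" come from the discussion above.

```perl
use strict;
use warnings;

my $file   = 'fixed.txt';                          # hypothetical output file
my $record = sprintf '%-10s%-5s', 'Smith', '42';   # invented 15-byte record

open my $out, '>', $file or die "Cannot open $file: $!";
binmode $out;                      # suppress the LF -> CR+LF translation
print $out $record, "\x0d\x0a";    # write CR and LF explicitly
close $out or die "Cannot close $file: $!";

my $size = -s $file;
print "Size: $size bytes\n";       # Size: 17 bytes
```

With binmode() in place the file should come out at 15 + 2 bytes on any platform, with no extra byte.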

-- 
	Bart.


------------------------------

Date: Mon, 08 May 2000 22:26:12 GMT
From: Ron Grabowski <ronnie@catlover.com>
Subject: using Perl's RE to do basic manipulation of a flex file
Message-Id: <39173F35.C3B1C119@catlover.com>

I am working with flex/Bison and I wish to do some rough debugging on my
flex file. Basically I want to change all of the 'return TOKEN' entries
to 'printf TOKEN'. Here is a simplified layout of my flex file:

0|[1-9][0-9]*	DIGIT
blah|test	ANOTHER_TOKEN

%%

[A-Za-z]{DIGIT}	       { some_function(); return ALPHA_WITH_DIGIT_TOKEN ; }
{ANOTHER_TOKEN}[0-9]   { blah(); return TOKEN2 ; }

I want to change 

{ some_function(); return ALPHA_WITH_DIGIT_TOKEN ; }

into 

{ printf("[ALPHA_WITH_DIGIT_TOKEN] %s", yytext); }

Whenever I try and do something basic like 

$flex_file =~ s#{.*return\s+(\w+).*}#{ printf("[$1] %s", yytext); }#gs;

My RE does not come out as expected because the optional {DIGIT} (or
any other series of characters that looks like {xxx}, for that matter)
throws things off. I have experimented with using non-greedy matches
like:

$flex_file =~ s#{.*?return\s+(\w+).*?}#{ printf("[$1] %s", yytext); }#gs;

but that produces undesirable output too. I do not want to simply change
the 'return' into a 'printf', I want to replace all of the content
between the curly braces. The information between the brackets that
contain the 'return' does not have to be on one-line:

{ANOTHER_TOKEN}[0-9]   { 
                          blah(); 
                          return TOKEN2 ; 
                       }
is valid too.

My test program:

use strict;

my $flex_file;

# normally we would be slurping the file
# from disk
{ local $/ = undef; $flex_file = <DATA>; }

$flex_file =~ s#{.*return\s+(\w+).*}#{ printf("[$1] %s", yytext); }#gs;
print $flex_file;

__DATA__
0|[1-9][0-9]*	DIGIT
blah|test	ANOTHER_TOKEN

%%

[A-Za-z]{DIGIT}	       { some_function(); return ALPHA_WITH_DIGIT_TOKEN ; }
{ANOTHER_TOKEN}[0-9]   { blah(); return TOKEN2 ; }

which produces the undesirable output of:

0|[1-9][0-9]*   DIGIT
blah|test       ANOTHER_TOKEN

%%

[A-Za-z]{ printf("[TOKEN2] %s", yytext); }

- Ron
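
One approach that may work for the question above, offered as a sketch rather than a definitive fix: replace .* with [^{}]*, so that a match can never run across a brace. This assumes that action blocks contain no nested braces and that no {NAME} reference contains the word "return". The braces in the pattern are escaped because newer perls warn about or reject some unescaped literal { characters.

```perl
use strict;
use warnings;

# Rules section only, with one single-line and one multi-line action.
my $flex_file = <<'END';
[A-Za-z]{DIGIT}        { some_function(); return ALPHA_WITH_DIGIT_TOKEN ; }
{ANOTHER_TOKEN}[0-9]   {
                          blah();
                          return TOKEN2 ;
                       }
END

# [^{}]* cannot cross a brace, so {DIGIT} and {ANOTHER_TOKEN} close
# before any "return" is seen and are left untouched. The class also
# matches newlines, so the /s modifier is not needed.
$flex_file =~ s#\{[^{}]*\breturn\s+(\w+)[^{}]*\}#{ printf("[$1] %s", yytext); }#g;

print $flex_file;
```

Actions that contain nested braces (an if block, say) would need a real brace-matching parser instead of a single substitution.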


------------------------------

Date: Mon, 8 May 2000 17:07:21 -0700
From: Tom Phoenix <rootbeer@redcat.com>
Subject: Re: Video comes in bursts
Message-Id: <Pine.GSO.4.10.10005081700490.3921-100000@user2.teleport.com>

On Mon, 8 May 2000 jlucande@my-deja.com wrote:

> The problem is that the video comes in bursts of few kilobytes each.

Sounds like buffering. (The fact that the data is video doesn't matter;
it's all just bits to perl!)

> $q = new CGI;

> $|=1;  # Forces the buffer to be flushed.
>  LOOP: while (read SOCK, $data, 8) {
>      $q->print($data) or last LOOP;
>  }

I must be missing something. Are you using a print method of the CGI
module? Maybe you meant to simply print() here. In fact, if this works at
all, it's probably because of a bug in the CGI module. 

Also, although that loop label doesn't hurt anything, it's not needed.
Labels are quite rare in Perl programs.
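
A minimal sketch of the same loop with a plain print() and no label follows. The in-memory filehandle is only a stand-in for SOCK so that the example runs on its own, and the 8-byte read size is kept from the quoted code.

```perl
use strict;
use warnings;

# Stand-in for the socket (an assumption for illustration): an
# in-memory filehandle holding 20 bytes of dummy data.
my $payload = 'x' x 20;
open my $sock, '<', \$payload or die "Cannot open in-memory handle: $!";

binmode STDOUT;    # the data is binary, not text
$| = 1;            # autoflush the currently selected handle (STDOUT)

my $total = 0;
while (my $n = read $sock, my $data, 8) {
    print $data or last;    # a bare "last" leaves the innermost loop
    $total += $n;
}
print "\n";
```

Note that $| only affects the handle that is selected when it is set, so it belongs before the loop and after any select().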

Cheers!

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/





------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 2984
**************************************
