[19342] in Perl-Users-Digest
Perl-Users Digest, Issue: 1537 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Aug 16 03:05:39 2001
Date: Thu, 16 Aug 2001 00:05:14 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <997945513-v10-i1537@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Thu, 16 Aug 2001 Volume: 10 Number: 1537
Today's topics:
Re: "shifting" values of the @_ array?? <uri@sysarch.com>
Re: "shifting" values of the @_ array?? (Tad McClellan)
Re: "shifting" values of the @_ array?? <miscellaneousemail@yahoo.com>
Re: Avoiding symbolic references. <goldbb2@earthlink.net>
Re: checking for non-letters (Ken)
download.cgi, MS Ex & PDF <engineering@pretran.co.nz>
FAQ: How can I read in a file by paragraphs? <faq@denver.pm.org>
Re: For each Object in ObjContainer ... How to translat (derek chen)
Re: For each Object in ObjContainer ... How to translat <Tassilo.Parseval@post.rwth-aachen.de>
How can I get rid of ISO codes? <bigbanana@mailandnews.com>
Re: Ignoring The First Line (Paul Lew)
Re: Ignoring The First Line (Tad McClellan)
Re: Math::Matica, Perl Modules in general (Adamone11)
Re: Math::Matica, Perl Modules in general <randy@theory.uwinnipeg.ca>
Re: Need Perl module or regexp to slurp specific XML re <mel2000@hotmaildot.com>
Re: Need Perl module or regexp to slurp specific XML re <mel2000@hotmaildot.com>
Re: Need Perl module or regexp to slurp specific XML re <goldbb2@earthlink.net>
Newbie Activeperl Win2k <gwhs@bronxpages.com>
Re: Newbie Activeperl Win2k <pschnell@touchpowder.com>
Newbie Perl Installation - "%1 is not a valid Windows N <ash@turnernewmedia.com.au>
Re: perldoc is like Greek to a beginner?? (Eric Bohlman)
Problem with two versions of Perl installed <edmcheng@hotmail.com>
Re: traversing sub-directories <goldbb2@earthlink.net>
Re: tutorial for perl/ldap/web <EvR@compuserve.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Thu, 16 Aug 2001 03:53:19 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: "shifting" values of the @_ array??
Message-Id: <x7itfok6ds.fsf@home.sysarch.com>
>>>>> "CCG" == Carlos C Gonzalez <miscellaneousemail@yahoo.com> writes:
CCG> The following methodX subs all output the same thing. Is there any
CCG> reason to use one method as opposed to another other than personal
CCG> preference or that one method is used more often than another just
CCG> because?
CCG> use CGI::Carp qw(carpout fatalsToBrowser);
why include that when it is not a cgi program?
CCG> use diagnostics;
CCG> use strict;
CCG> my @a = qw(Tom Mary Jack);
CCG> method1(@a);
CCG> method2(@a);
CCG> method3(@a);
those are not method calls, they are plain sub calls. methods can only
be called via objects or class names and with a proper perl method call.
<@_ styles reordered>
CCG> my ($a) = @_;
this is the most common way of handling @_. if you have multiple args it
works nicely and a trailing array arg can slurp in the rest of @_.
CCG> my $a = shift;
this is common but i tend to use it more and more rarely. when i really
want to remove the element from @_ i will use this.
CCG> my $a = $_[0];
this has no advantages at all. it is effectively the same as the first
one (for 1 arg only) as it copies the first argument but it doesn't look
as nice. the primary use of $_[0] and friends is to modify the original
argument as $_[0] is an alias to it. but then you don't assign the arg
to another var but use $_[0] directly. it also is slightly faster as you
eliminate the extra copy but that is not worth the obfuscation except in
a few cases.
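The three styles discussed above can be compared side by side. This is only a minimal sketch: the sub names list_assign, shift_first and upcase_in_place are made up for illustration and do not come from the original post.

```perl
use strict;
use warnings;

# Copy the arguments in one list assignment; a trailing array
# slurps whatever is left in @_.
sub list_assign {
    my ($first, @rest) = @_;
    return "$first + " . scalar(@rest) . " more";
}

# shift removes the first element, so the remainder stays in @_.
sub shift_first {
    my $first = shift;
    return "$first, then " . join(' ', @_);
}

# $_[0] is an alias to the caller's variable, so assigning to it
# modifies the original argument.
sub upcase_in_place {
    $_[0] = uc $_[0];
    return;
}

my @names = qw(Tom Mary Jack);
print list_assign(@names), "\n";   # Tom + 2 more
print shift_first(@names), "\n";   # Tom, then Mary Jack
upcase_in_place($names[0]);
print "$names[0]\n";               # TOM
```

Note that the variables are not called $a or $b, which leaves those names free for sort().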
uri
--
Uri Guttman --------- uri@sysarch.com ---------- http://www.sysarch.com
SYStems ARCHitecture and Stem Development ------ http://www.stemsystems.com
Search or Offer Perl Jobs -------------------------- http://jobs.perl.org
------------------------------
Date: Wed, 15 Aug 2001 23:23:40 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: "shifting" values of the @_ array??
Message-Id: <slrn9nmf5r.3ap.tadmc@tadmc26.august.net>
Carlos C. Gonzalez <miscellaneousemail@yahoo.com> wrote:
>
>I have been studying and playing with the shift operator and was
>wondering...
>
>The following methodX subs all output the same thing. Is there any
>reason to use one method as opposed to another other than personal
>preference or that one method is used more often than another just
>because?
>#!/usr/bin/perl -w
>
>use CGI::Carp qw(carpout fatalsToBrowser);
You don't really "play around" in the CGI environment, do you?
It's much easier to play at a command line (less "other things"
to get in the way too).
>sub method1
>{
> my $a = shift;
I use method1 only when I'm going to do something else with
the rest of @_.
(Or sometimes if it is a "transform" function that will *always*
take just one argument (eg. $str = my_quotemeta($str)).)
>sub method2
>{
> my $a = $_[0];
I never use method2.
(but I don't do JAPHs)
>sub method3
>{
> my ($a) = @_;
I use method3 the rest of the time. I can easily add another
argument when the need arises.
>Thanks for any insight on this.
And I never declare variables named $a or $b, I leave those
for sort()ing.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Thu, 16 Aug 2001 05:22:49 GMT
From: Carlos C. Gonzalez <miscellaneousemail@yahoo.com>
Subject: Re: "shifting" values of the @_ array??
Message-Id: <MPG.15e5065a54e3fa16989761@news.edmonton.telusplanet.net>
In article <slrn9nmf5r.3ap.tadmc@tadmc26.august.net>, Tad McClellan at
tadmc@augustmail.com says...
> >use CGI::Carp qw(carpout fatalsToBrowser);
>
>
> You don't really "play around" in the CGI environment, do you?
>
> It's much easier to play at a command line (less "other things"
> to get in the way too).
Oops! I neglected to take that line out. When I create a command line
perl script I just hit a key to generate a simple template. Got so used
to doing that once I created the mini-template that I neglected to
realize that I didn't even need the CGI::Carp line.
Thanks for your input on the rest Tad.
---
Carlos
www.internetsuccess.ca
*NOTE*: Internet Success is NOT yet fully operational so although you are
welcomed to visit and take a look, trying to subscribe will only be a
frustration for you as your data will not be saved at this time.
------------------------------
Date: Wed, 15 Aug 2001 23:44:19 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Avoiding symbolic references.
Message-Id: <3B7B4193.FA0FDC3C@earthlink.net>
Benjamin Goldberg wrote:
>
> I have in a module a for loop which creates a number of subroutines
> from a closure.
[snip]
> while( my ($name,$args) = each %whatever ) {
> $Package::{$name} = sub { .... };
> }
>
> Which *should* do what I want, but doesn't seem to work.
>
> Calls to those functions result in errors like "Undefined subroutine
> &Package::whatever called at blah.pl line XX"
>
> Anyone want to tell me what's going wrong?
Aha! I found it. I was expecting $Package::{$name} to automatically be
a typeglob. It's not... However, if I predeclare the subs, using "use
subs" it works. The new code:
use subs ();
my %blat = ( name => [qw(x y z)], name2 => [qw(q u u x)] );
while( my ($name,$args) = each %blat ) {
subs->import($name);
$Package::{$name} = sub {...};
}
Oh, and the reason for doing it this way is because I want it to be "use
strict" clean, without needing a "no strict 'refs'" for the created
methods.
Urghk... I just looked at subs.pm, and I see that it works using
symbolic references. Is there any way to cause a glob to magically
appear in $Package::{$name} *without* using symbolic references?
Hmm
local *x = $Package::{$foo} || do {\local *x};
*x = sub { ... };
$Package::{$foo} = *x;
Bleh. This is ugly. I guess that the best way to make it so sub{} is
compiled with strict refs in place, and add it to the symbol table, is
to do just that [in that order, as two separate steps]...
my $sub = sub { ... };
{ no strict 'refs'; *{$name} = $sub }
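A fuller sketch of that two-step approach follows. It is not the poster's actual module: the sub bodies and the fully qualified glob name "Package::$name" are assumptions made so the example runs on its own.

```perl
use strict;
use warnings;

my %blat = ( name => [qw(x y z)], name2 => [qw(q u u x)] );

while ( my ($name, $args) = each %blat ) {
    # Step 1: the closure is compiled with all strictures in effect.
    my $sub = sub { return join ',', @$args, @_ };

    # Step 2: only the glob assignment runs without strict refs,
    # and only for the rest of this block.
    no strict 'refs';
    *{"Package::$name"} = $sub;
}

print Package::name(1), "\n";    # x,y,z,1
print Package::name2(), "\n";    # q,u,u,x
```

The calls use parentheses because the subs do not exist until run time.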
--
I'm not a programmer but I play one on TV...
------------------------------
Date: 15 Aug 2001 18:24:03 -0700
From: kenphilbrick@mindspring.com (Ken)
Subject: Re: checking for non-letters
Message-Id: <f8b445c0.0108151724.147dfe24@posting.google.com>
kenphilbrick@mindspring.com (Ken) wrote in message news:<f8b445c0.0108151224.d36ed7b@posting.google.com>...
> How can I check a character to see if it is a letter (a-z or A-Z).
> I've tried:
>
> if ($char =~ /a-zA-Z/) {
> print "it's a letter\n";
> }
>
> where $char holds a single character. But that didn't work. It came
> up false every time, even when it really was a letter. This didn't
> seem to work either:
>
> if ($char !~ /a-zA-Z/) {
> print "it's not a letter\n";
> }
>
> Anyone know what I'm doing wrong?
>
> Thanks.
Thanks, everyone. I guess I'll go read perlre 5 times, now that
that's worked out :-)
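For readers who missed the earlier replies: /a-zA-Z/ matches the literal seven-character string "a-zA-Z", so a single character never matches it. A range only works inside a bracketed character class. A minimal sketch, with the test characters chosen arbitrarily:

```perl
use strict;
use warnings;

# Returns true when $char is exactly one ASCII letter.
sub is_letter {
    my ($char) = @_;
    return $char =~ /^[a-zA-Z]$/ ? 1 : 0;
}

for my $char ('a', 'Z', '7', '-') {
    if (is_letter($char)) {
        print "'$char' is a letter\n";
    }
    else {
        print "'$char' is not a letter\n";
    }
}
```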
--
Ken
------------------------------
Date: Thu, 16 Aug 2001 16:52:50 +0100
From: steve edmonds <engineering@pretran.co.nz>
Subject: download.cgi, MS Ex & PDF
Message-Id: <3B7BEC52.DA7203F2@pretran.co.nz>
Hi,
I am trying to use a download perl script to force pdf's to the browser
so as to control distribution on apache servers.
Works fine any which way with netscape but all woes with MS Explorer.
I get acrobat opening but nothing happens or I get an error with page
(exp, not server) depending on explorer version.
Can someone suggest a fix please.
I have tried redirecting with
print "Location: $fileid\n\n";
and pushing with
if ($DownloadURL=~ /.pdf$/i) {
$file =~ s/^\w+(?=.)/$dlfile/ ;
print("Content-type: application/pdf\n");
print("Content-Disposition: atachment; filename=$file\n");
print("Content-Length: $size\n");
print("Content-Description: Larry\'s File Downloader\n\n\n");
$size = -s $DownloadFile;
select STDOUT;
$| = 1;
open(fin, $DownloadFile) or die "$0: Can't open file: $!\n";
$blksize = (stat fin)[11] or 16384;
while ($len = sysread fin, $buf, $blksize) {
if (!defined $len) {
next if $! =~ /^Interrupted/;
die "$0: System read error: $!\n";
}
$offset = 0;
while ($len) {
$written = syswrite STDOUT, $buf, $len, $offset;
if (!defined ($written)) {
die "$0: System write error: $!\n";
}
$len -= $written;
$offset += $written;
}
}
close (fin);
and with the above with
print("Content-type: application/force-download\n");
help much appreciated
steve
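Three things stand out in the code as posted: Content-Length is printed before $size is assigned, "atachment" is misspelled, and the third "\n" after the last header puts a stray byte in front of the PDF data. The sketch below avoids those three points. It is a hedged illustration only: send_pdf is a hypothetical helper, the quoting of the filename is an assumption, and nothing here is claimed to cure the Explorer behavior, which the poster says varies by version.

```perl
use strict;
use warnings;

# Hypothetical helper: write CGI headers and the file contents to $out.
sub send_pdf {
    my ($path, $name, $out) = @_;

    # Determine the size before any header is printed.
    my $size = -s $path;
    defined $size or die "$0: Can't stat $path: $!\n";

    open my $in, '<', $path or die "$0: Can't open file: $!\n";
    binmode $in;     # PDFs are binary data
    binmode $out;

    # Exactly one blank line separates the headers from the body.
    print {$out} "Content-Type: application/pdf\n",
                 "Content-Disposition: attachment; filename=\"$name\"\n",
                 "Content-Length: $size\n\n";

    my $buf;
    while (read $in, $buf, 16384) {
        print {$out} $buf;
    }
    close $in;
    return $size;
}

# Typical use from a CGI script (path and name are placeholders):
# send_pdf('/path/to/file.pdf', 'file.pdf', \*STDOUT);
```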
------------------------------
Date: Thu, 16 Aug 2001 06:17:02 GMT
From: PerlFAQ Server <faq@denver.pm.org>
Subject: FAQ: How can I read in a file by paragraphs?
Message-Id: <yzJe7.143$V3.170992640@news.frii.net>
This message is one of several periodic postings to comp.lang.perl.misc
intended to make it easier for perl programmers to find answers to
common questions. The core of this message represents an excerpt
from the documentation provided with every Standard Distribution of
Perl.
+
How can I read in a file by paragraphs?
Use the "$/" variable (see the perlvar manpage for details). You can
either set it to "" to eliminate empty paragraphs ("abc\n\n\n\ndef",
for instance, gets treated as two paragraphs and not three), or "\n\n"
to accept empty paragraphs.
Note that a blank line must have no blanks in it. Thus "fred\n
\nstuff\n\n" is one paragraph, but "fred\n\nstuff\n\n" is two.
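The difference between the two settings can be checked with a short script. This is a sketch only: read_records is a made-up helper, and it reads from an in-memory filehandle (which needs Perl 5.8 or later) so that no real file is required.

```perl
use strict;
use warnings;

# Hypothetical helper: read $text record by record with $/ set to $sep.
sub read_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory handle: $!";
    local $/ = $sep;
    my @records = <$fh>;
    close $fh;
    return @records;
}

my $text = "abc\n\n\n\ndef\n";

my @paragraphs = read_records($text, "");      # paragraph mode
my @strict     = read_records($text, "\n\n");  # empty paragraphs kept

print scalar(@paragraphs), "\n";   # 2
print scalar(@strict), "\n";       # 3
```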
-
Documents such as this have been called "Answers to Frequently
Asked Questions" or FAQ for short. They represent an important
part of the Usenet tradition. They serve to reduce the volume of
redundant traffic on a news group by providing quality answers to
questions that keep coming up.
If you are somehow irritated by seeing these postings you are free
to ignore them or add the sender to your killfile. If you find
errors or other problems with these postings please send corrections
or comments to the posting email address or to the maintainers as
directed in the perlfaq manual page.
Answers to questions about LOTS of stuff, mostly not related to
Perl, can be found by pointing your news client to
news:news.answers
or to the many thousands of other useful Usenet news groups.
Note that the FAQ text posted by this server may have been modified
from that distributed in the stable Perl release. It may have been
edited to reflect the additions, changes and corrections provided
by respondents, reviewers, and critics to previous postings of
these FAQ. Complete text of these FAQ are available on request.
The perlfaq manual page contains the following copyright notice.
AUTHOR AND COPYRIGHT
Copyright (c) 1997-1999 Tom Christiansen and Nathan
Torkington. All rights reserved.
This posting is provided in the hope that it will be useful but
does not represent a commitment or contract of any kind on the part
of the contributors, authors or their agents.
05.25
--
This space intentionally left blank
------------------------------
Date: 15 Aug 2001 20:04:19 -0700
From: u8526505@ms27.hinet.net (derek chen)
Subject: Re: For each Object in ObjContainer ... How to translate to perl
Message-Id: <85789064.0108151904.499c27da@posting.google.com>
"Peter S?aard" <peter.sogaard@tjgroup.com> wrote in message news:<3b7a69b3$0$328$edfadb0f@dspool01.news.tele.dk>...
> > use WIN32::OLE;
> > $p=Win32::OLE->GetObject("IIS://tw-derek/W3svc/1/root/test");
> > ?????
>
> well, if that object is a container, you should store it in list context,
> using @, not $.
> @p=Win32::OLE->GetObject("IIS://tw-derek/W3svc/1/root/test");
>
> foreach $obj ( @p ){
> print $obj->name;
> }
>
> I have not tested this...
It can't work. There's only one element in the list, ITSELF.
------------------------------
Date: Thu, 16 Aug 2001 08:56:56 +0200
From: Tassilo von Parseval <Tassilo.Parseval@post.rwth-aachen.de>
Subject: Re: For each Object in ObjContainer ... How to translate to perl
Message-Id: <3B7B6EB8.4020104@post.rwth-aachen.de>
derek chen wrote:
> I'm using a VB script with the following statement
>
> set websvc = GetObject("IIS://tw-derek/W3svc/1/root/test")
> For Each OBJ in websvc
> wscript.echo obj.name
> Next
>
> websvc is an object container and I can use "For Each" statement to enumerate
> all the objects it contains. I'm really confused about how I can use perl
> Win32::OLE module to accomplish the same task.
>
> use WIN32::OLE;
> $p=Win32::OLE->GetObject("IIS://tw-derek/W3svc/1/root/test");
> ?????
>
Well, have you read the docs of Win32::OLE? It should have given you
some hints. Notably, that the Perl version looks surprisingly similar to
the same thing in VB script:
use strict;
use Win32::OLE qw(in);
my $object = Win32::OLE->GetObject("IIS://tw-derek/W3svc/1/root/test");
for my $item (in $object) { $item->do_something }
__END__
Due to my lacking a Win-system, this is untested. See the relevant
documentation for yourself. However, functions such as 'in' and 'valof'
are not exported by Win32::OLE by default, so you have to do that
manually as seen in the above use-statement.
Tassilo
--
$a=[(74,116)];$b=[($a->[1]-1,$a->[1]++,0x20)];$c=[(97,110)];$d=[($c->
[1]+1,$b->[1],"her")];for(@{[$a,$b,$c,$d]}){for(@{$_}){$_=~/\d+/?print
(chr($_)):print;}}$c=sub{$l=shift;[(0x20+$l-1,0x50,0x65,0x73-0x01,108
),(0x20,0x68,0x61,)]};print(map{chr($_)}@{($c->(1))});$h={a=>33*3,b=>
10**2+7,c=>"1"."0"."1",d=>0162};@h=sort(keys(%$h));for(@h){print(chr(
ord(chr($h->{$_}))))};
------------------------------
Date: Thu, 16 Aug 2001 03:23:47 +0100
From: "Big Banana" <bigbanana@mailandnews.com>
Subject: How can I get rid of ISO codes?
Message-Id: <9lf7el$n6a$1@news8.svr.pol.co.uk>
I've got a Flash file, which sends input to a Perl script.
The Perl script takes this input and mails it to an address.
BUT... I seem to have a problem whereby the data seems to contain ISO
codes...
Someone told me that there are some common function libraries that I can
use...
I'm not sure where to start...?
I'd really appreciate if someone could give me a few pointers... and perhaps
some sample code.
Thanks.
BB
------------------------------
Date: Thu, 16 Aug 2001 02:47:53 GMT
From: plew@csus.edu (Paul Lew)
Subject: Re: Ignoring The First Line
Message-Id: <slrn9nmd50.h4.plew@crane.li-po.edu>
In article <87ae11km6t.fsf@abra.ru>, Ilya Martynov wrote:
> PL> I've got the "reverse" problem in that my 1st line is ignored!!
> PL> As a beginner, I did a "quick & dirty" to remove the "\000" that
> PL> was passed into my resolv.conf; the "\000" is in ascii readable/printable
> PL> mode. I have since did a shell script that just executes
> PL> "/usr/bin/perl -pe ......etc" which works but still wonder why my perl
> PL> script loosed the 1st line. The code is as follows:
> PL> =================================================================
>
> PL> #!/usr/bin/perl
> PL> #
> PL> #
> PL> # to remove the '\000' created by /sbin/dhclient-script
> PL> #
> PL> #
> PL> print "running /usr/local/sbin/resolvchg................"
> PL> #
> PL> open (OLDFILE, "/etc/resolv.conf");
> PL> open (NEWFILE, ">/etc/resolv.conf.new");
> PL> while (<OLDFILE>) {
> PL> s/'\000'//;
>
> Should not it be s/\000//;?
Found that doesn't work as Perl thinks that the '\000' is an octal char,
just like the shell (I'm running linux); so I do need the quotes or
use an escape like 's/\\000//'.
>
> PL> print NEWFILE <OLDFILE>;
> ^^^^^^^^^
>
> It should be
>
> print NEWFILE $_;
Will try it....THANKS.
>
> PL> }
> PL> close OLDFILE;
> PL> close NEWFILE;
> PL> rename ('/etc/resolv.conf.new', '/etc/resolv.conf');
> PL> #
>
> You script will pass only even lines
>
> while(<OLDFILE>) reads odd line
> print NEWFILE <OLDFILE> reads even line and prints its
This is very interesting!!! My old books (Programming Perl, 1st ed,
Interactive Perl learning (not worth it) and Sam's Learning Perl in
21 days) have not mentioned this; perhaps I was looking at the
forest and not the trees but......
>
> BTW instead of your script you can use this one-liner:
>
> perl -i -pe 's/\000//' /etc/resolv.conf
That is what I have but it is 's/\\000//' instead, else sh or bash thinks
that the '\000' is an octal char, which it isn't. The reason for the
script is that a script has to be passed to the "script option"
to be executed, and my workaround so far is to have another shell script
executed that would run the "program file" of the perl command.
>
> See 'perldoc perlrun' for description of -i, -p and -e switches.
>
------------------------------
Date: Wed, 15 Aug 2001 23:37:26 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Ignoring The First Line
Message-Id: <slrn9nmfvm.3ap.tadmc@tadmc26.august.net>
Paul Lew <plew@csus.edu> wrote:
>In article <87ae11km6t.fsf@abra.ru>, Ilya Martynov wrote:
>> PL> print NEWFILE <OLDFILE>;
>>
>> It should be
>>
>> print NEWFILE $_;
That part is correct.
>> You script will pass only even lines
No it won't. (assuming "it" is the below)
>> while(<OLDFILE>) reads odd line
Reads the first line.
>> print NEWFILE <OLDFILE> reads even line and prints its
Reads *all of the rest of the lines* (and outputs them to NEWFILE).
OLDFILE is at EOF after the above statement. The loop must execute
only one time.
print() puts list context on its arguments.
perldoc -f print
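The behavior described above can be reproduced without touching /etc/resolv.conf. This is a sketch with made-up four-line data and in-memory filehandles (Perl 5.8 or later) standing in for OLDFILE and NEWFILE.

```perl
use strict;
use warnings;

my $data = "line1\nline2\nline3\nline4\n";
my $out  = '';

open my $old, '<', \$data or die "Can't open input: $!";
open my $new, '>', \$out  or die "Can't open output: $!";

my $passes = 0;
while (<$old>) {             # scalar context: reads one line into $_
    $passes++;
    print {$new} <$old>;     # list context: reads ALL remaining lines
}
close $new;
close $old;

print "loop ran $passes time(s)\n";   # loop ran 1 time(s)
print $out;                           # line2 .. line4; line1 is lost
```

Printing $_ instead of <$old> inside the loop copies every line, as suggested earlier in the thread.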
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 15 Aug 2001 20:01:21 -0700
From: adamone11@yahoo.com (Adamone11)
Subject: Re: Math::Matica, Perl Modules in general
Message-Id: <2c1772aa.0108151901.26c39cf0@posting.google.com>
I've checked the other repositories, and no one has a Math::ematica
ppm. I've got bcc++, and I could get djgpp if I needed, what are the
(fewest possible) steps I'd need to take to get this compiled (is
there a very detailed explanation somewhere)? Anyone out there happen
to have a binary for it? I'd be more than willing to work up an AOL
instant messenger connection for it. The integral is possible
(Math::ematica or not, I need this project done by Friday) but the
module would make it more robust and more accurate, I think.
Thanks
> Math::ematica requires a C compiler to build, so you'd
> either have to get one, or else find a ppm package
> (ActiveState, at http://www.activestate.com/ppmpackages/, doesn't
> appear to have it).
> If the integrals aren't too hairy, perhaps Math::Integral::Romberg
> would be able to handle them.
(sorry, google groups doesn't show the reply, and my usenet free
server won't let me post)
------------------------------
Date: Wed, 15 Aug 2001 23:35:17 -0500
From: "Randy Kobes" <randy@theory.uwinnipeg.ca>
Subject: Re: Math::Matica, Perl Modules in general
Message-Id: <9lfj3f$t01$1@canopus.cc.umanitoba.ca>
"Adamone11" <adamone11@yahoo.com> wrote in message
news:2c1772aa.0108151901.26c39cf0@posting.google.com...
> I've checked the other repositories, and no one has a Math::ematica
> ppm. I've got bcc++, and I could get djgpp if I needed, what are the
> (fewest possible) steps I'd need to take to get this compiled (is
> there a very detailed explanation somewhere)?
[ ... ]
Unfortunately, if you're using ActivePerl, you'll probably
need M$'s Visual C++ compiler, for binary compatibility.
The alternative is to compile your own Perl with the
compiler you have, and then build Math::ematica.
best regards,
randy kobes
------------------------------
Date: Wed, 15 Aug 2001 20:04:58 -0700
From: "M.L." <mel2000@hotmaildot.com>
Subject: Re: Need Perl module or regexp to slurp specific XML records
Message-Id: <9lfdc8$94gd8$1@ID-19545.news.dfncis.de>
> > I couldn't find any info in XML::Records indicating that it can extract
> > records based on some value of a child element. All the examples showed
> > record extractions based on an element name only, not a condition of that
> > element's value. Please correct me if I'm wrong.
>
> If you're using the latest version of XML::Records (the one based on
> XML::TokeParser), you can call begin_saving() at the beginning of a
> record, use get_tag() and get_text() to determine if there's a match, and
> then call restore_saved() and get_record().
>
I recently downloaded version 0.01 dated Fri Jan 26 18:03:59 2001
(according to the Changes file) and it either didn't include those methods
or did not describe them. Thanks for the update. While this looks interesting,
I'll use it as a backup if the regexp method (recommended by other posters)
is not suitable.
Thanks,
M.L.
------------------------------
Date: Wed, 15 Aug 2001 21:04:11 -0700
From: "M.L." <mel2000@hotmaildot.com>
Subject: Re: Need Perl module or regexp to slurp specific XML records
Message-Id: <9lfgq3$93m7r$1@ID-19545.news.dfncis.de>
First, a wholehearted thanks to everyone who responded. All advice was
appreciated. Resetting the record separator is an ideal slurping solution
for my case.
RECAP: Given the following fixed XML records format:
<record id="012-8">
<name>John Doe</name>
<address>123 Main St.</address>
<city>Lake Elsinore</city>
<state>CA</state>
<phone>(123) 456-7890</phone>
<zip>12345</zip>
<email>jd@domain.com</email>
</record>
objective: slurp into hash only records where, for example, <state> = NY
$records{$id}{'id'} = ...
$records{$id}{'name'} = ...
$records{$id}{'address'} = ...
$records{$id}{'city'} = ...
$records{$id}{'state'} = NY
$records{$id}{'phone'} = ...
...
Taking all advice into consideration, as well as advice from Google search
on Perl extraction techniques, could I do something like this?
# *****************************************
where: $user_parameter = 'NY';
local ($/) = '</record>';
local @ARGV = 'filename.xml';
foreach (grep(/<state>\s*$user_parameter\s*<\/state>/i, <>))
{
$id = m/\bid\s*=\s*"(.*?)"/i; # should have mentioned id not just numeric
$records{$id}{'id'} = $id;
$records{$id}{'state'} = $user_parameter;
$records{$id}{'name'} = m/<name>\s*(.*?)\s*<\/name>/i;
$records{$id}{'address'} = m/<address>\s*(.*?)\s*<\/address>/i;
...
$states{$user_parameter}[$i++] = $id; # need to store ids for each state
}
# *****************************************
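One detail in the snippet above is worth checking before relying on it: a match in scalar context, as in $id = m/.../, returns a true or false value rather than the captured text. A list assignment retrieves the capture. The sketch below shows the same record-separator idea with list-context captures. The two-record sample data, the second id "034-X" and the in-memory filehandle are assumptions made for illustration, and the approach remains as fragile as any regex-based handling of XML.

```perl
use strict;
use warnings;

my $xml = <<'END_XML';
<record id="012-8">
<name>John Doe</name>
<state>CA</state>
</record>
<record id="034-X">
<name>Jane Roe</name>
<state>NY</state>
</record>
END_XML

my $user_parameter = 'NY';
my (%records, %states);

open my $fh, '<', \$xml or die "Can't open in-memory handle: $!";
{
    local $/ = '</record>';    # read one <record>...</record> at a time
    while (my $rec = <$fh>) {
        next unless $rec =~ m{<state>\s*\Q$user_parameter\E\s*</state>}i;

        # List assignment puts the capture, not a true/false flag, in $id.
        my ($id) = $rec =~ m/\bid\s*=\s*"(.*?)"/i or next;

        $records{$id}{id} = $id;
        for my $field (qw(name address city state phone zip email)) {
            my ($value) = $rec =~ m{<$field>\s*(.*?)\s*</$field>}is;
            $records{$id}{$field} = $value if defined $value;
        }
        push @{ $states{$user_parameter} }, $id;   # ids seen per state
    }
}
close $fh;

for my $id (sort keys %records) {
    print "$id: $records{$id}{name} ($records{$id}{state})\n";
}
```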
Thanks again to all,
M.L.
> <SNIP>
>
> > Research and read about the $/ default record separator
> > along with the powerful and efficient local () function.
>
> Interesting approach that is worthy of consideration. Too bad you
> didn't test your code (or at least didnt test anything but the
> simplest of cases), otherwise you would notice that it doesnt work.
> Hmmm. Come to think of it arent you the same Lizard that endlessly
> berates people for not testing their code? Eat your own dog food
> lizard.
>
> Also a further point along these lines... You didn't read the spec
> either. He said if the STATE was some value, not the PHONE number.
> Please before you post your rantings at least _READ_ the OP!
>
> > You Perl 5 Cargo Cultists dictate local () is worthless
> > and should be removed from perl core? This is a classic
> > example of Perl 5 Cargo Cultists not having an ability
> > to think for themselves. You truly are Borg.
>
> Nobody was arguing with you yet you seem fit to bitch at them anyway.
> Poor OP asks a question and you snakebite the guy.
>
> Now on to the code. While the approach is interesting (as I said
> before) the code doesnt work (as I also said before.) Replace your
> data block with the code following my signature, execute your code and
> you get the output specified. Notice that 'Record id="ID 0128: '
> Ooops, Lizard, did you make a fool out of yourself again???
>
> And now in the spirit of not abusing anyone whose code I cant fix, I
> present the following analysis of your code: (unlike you I can PROVE
> that your code is bad, AND I can fix it. Well, insofar as such an
> erroneous approach can be fixed.)
> > #!perl
> >
> > print "Content-type: text/plain\n\n";
>
> Useless. Waste of keystrokes and disk space.
>
> >
> > $user_parameter = "ou812";
> >
> > &Localize;
>
> And for what reason are you including the & symbol? Its not wrong but
> ultimately it IS a waste of time etc...
>
> >
> > sub Localize
>
> I suppose you have created this sub in the spirit that other subs
> might be added. Fair enough. But in this case it is a further waste
> of disk space.
>
> > {
> > local ($/) = "<record id=\"";
>
> Interesting approach. Too bad it doesnt work. Replace with:
>
> local ($/) = "</record>";
>
> >
> > while (<DATA>)
> > {
> > if (/$user_parameter/i)
>
> This is plain and simple bad logic. For instance what happens if the
> OP has some moron (a lizard perhaps) that changed their NAME to
> 'ou812'. You would still process/output the record, wouldn't you?
> Dohh!! Come to think of it, you arent even checking a given field,
> you are checking the whole record. Wow think of what would happen if
> SQL did that!
>
> So to fit with the OP's spec it would have to be:
>
> if (/<state>\s*$user_parameter\s*<\/state>/i)
>
> I put in the \s* because I believe that it is worthwhile to employ
> defensive measures in my code. (Which incidentally means that I would
> _never_ use this (meaning the overall program) approach in anything
> but the most mundane of circumstances. Just consider the number of
> minor changes in the data set/spec that would completely screw your
> code.) Really there should be further measures taken, but my point is
> made.
>
> > {
> > s/(\d+)">/ID $1:/;
>
> Obviously this line has to be removed (See above). Replace it with:
>
> s/<record id="(\d+)">/ID $1:/;
>
> > s/<\/.*>(\n)/$1/g;
>
> This one doesn't do the job either! Also it is inefficient to capture
> constant data. So replace it with
>
> s/<\/.*>\n?/\n/g;
>
> > tr/>/:/;
> > s/:/: /g;
>
> Oh wow! Talk about redundant code! (and buggy, what happens if the
> content of the fields contains a '>') These two get merged into
> something marginally less wasteful (but no less buggy, sigh, parsing
> XML with regexes is just not smart):
>
> s/>/: /g;
>
> > s/<([a-z])/\u$1/g;
> > s/\n+$//;
> > print;
> > }
> > }
> > }
>
> And now the code should work as advertised. Oh yes, one little piece
> of advice lizard darling, but if you make the changes that I have
> indicated here you might also want to change $user_parameter to
> something that resembles a two digit state code, otherwise you might
> worry your little brain about why nothing is being output.(besides
> your stupid content line)
>
<snip>
------------------------------
Date: Thu, 16 Aug 2001 03:04:20 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Need Perl module or regexp to slurp specific XML records
Message-Id: <3B7B7074.B3A1EA8A@earthlink.net>
M.L. wrote:
>
> First, a wholehearted thanks to everyone who responded. All advice was
> appreciated. Resetting the record separator is an ideal slurping
> solution for my case.
>
> RECAP: Given the following fixed XML records format:
>
> <record id="012-8">
> <name>John Doe</name>
> <address>123 Main St.</address>
> <city>Lake Elsinore</city>
> <state>CA</state>
> <phone>(123) 456-7890</phone>
> <zip>12345</zip>
> <email>jd@domain.com</email>
> </record>
>
> objective: slurp into hash only records where, for example,
> <state> = NY
>
> $records{$id}{'id'} = ...
> $records{$id}{'name'} = ...
> $records{$id}{'address'} = ...
> $records{$id}{'city'} = ...
> $records{$id}{'state'} = NY
> $records{$id}{'phone'} = ...
> ...
> Taking all advice into consideration, as well as advice from Google
> search on Perl extraction techniques, could I do something like this?
>
[snip]
No, you shouldn't do it something like that. It depends strongly on the
structure of your data, and if the structure of your XML changes, then
you have to rewrite your parser almost entirely.
Here's a parser I wrote which should be able to parse any XML data. It
assumes that the contents of a tag will only be strings, or nested tags
[not both]. This assumption is true for most real world XML [except
when considering html as xml, which usually does have mixed tags and
text].
#!/usr/local/bin/perl -w
use Text::Balanced qw(extract_tagged);
use Tie::IxHash;
use strict;
sub parse_xml_params {
local ($_) = shift;
m[<(\w+)\s*]g or die "Missing start tag\n";
my $tagname = $1;
tie my(%params), "Tie::IxHash";
# this regex here isn't exactly the best, but it works
# for the data given, if not the general case.
while( m[\G(\w+)\s*(?:=\s*"([^"]*)")?\s*]g ) {
$params{$1} = $2;
}
( $tagname, \%params );
}
sub parse_xml_tree {
local ($_) = @_;
my @results;
while( (undef,$_,my($prefix,$open,$contents,$close))
= extract_tagged ) {
push @results, [parse_xml_params($open),
parse_xml_tree($contents)];
}
@results ? \@results : $_[0];
}
=pod
This produces the data structure:
[ [ "xml-documentroot-tag", { key => value, key => value }, [
[ "record", {id=>"012-8"}, [
["name", {}, "John Doe"],
["address", {}, "123 Main St."],
["city", {}, "Lake Elsinore"],
["state", {}, "CA"],
["phone", {}, "(123) 456-7890"],
["zip", {}, "12345"],
["email", {}, "jd@domain.com"],
] ],
[ "record", {id=>...}, [
....
] ],
] ] ]
Remember, proper XML docs have exactly one root, and may have many
levels of nested tags within.
=cut
# this is the part which depends on the shape of <record> tags.
my @roots = @{parse_xml_tree( do { local $/; <> } )};
die "Only one document root allowed" if @roots > 1;
my $root = $roots[0];
print "The name of the root is $root->[0]\n";
my $records = $root->[2];
# turn our parsed tree into nested hashes.
tie my (%records), "Tie::IxHash";
foreach my $r ( @$records ) {
my ($name,$params,$contents) = @$r;
die "Expected only <record> things"
unless $name eq "record";
defined( my $id = $params->{id} )
or die "Missing id";
tie my (%rec), "Tie::IxHash";
foreach my $rr ( @$contents ) {
my ($n, $p, $c) = @$rr;
die "Expected a string" if ref $c;
$rec{$n} = $c;
}
next if( $rec{state} ne "NY" );
$records{$id} = \%rec;
}
# modify %records.
# print them to stdout:
print qq[<$root->[0]>\n];
while( my ($id, $rec) = each %records ) {
print qq[\t<record id="$id">\n];
while( my($field, $val) = each %$rec ) {
print qq[\t\t<$field>$val</$field>\n];
}
print qq[\t</record>\n];
}
print qq[</$root->[0]>\n];
The reason I tie all my hashes to Tie::IxHash is so that when retrieving
the data via each(), they will be in the same order they were put in.
You can omit this if you want, but your output data may now be in
arbitrary order, unless you do a sort or somesuch.
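If Tie::IxHash isn't installed, one alternative is to keep a plain hash plus a separate array that remembers the order keys were first added. This is only a sketch of that idea, not part of the script above; %rec, @order, add_field and fields_in_order are names made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers: a plain hash plus a parallel array of keys
# in first-insertion order (a stand-in for Tie::IxHash).
my (%rec, @order);

sub add_field {
    my ($name, $value) = @_;
    push @order, $name unless exists $rec{$name};
    $rec{$name} = $value;
}

# Return [name, value] pairs in the order the names were first added.
sub fields_in_order {
    return map { [ $_, $rec{$_} ] } @order;
}

add_field( name  => "John Doe" );
add_field( city  => "Lake Elsinore" );
add_field( state => "CA" );

print "<$_->[0]>$_->[1]</$_->[0]>\n" for fields_in_order();
```

The cost is that deletions have to be done in both places, which is the bookkeeping Tie::IxHash hides.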
--
I'm not a programmer but I play one on TV...
------------------------------
Date: Thu, 16 Aug 2001 04:40:43 GMT
From: "Wayne Lippold" <gwhs@bronxpages.com>
Subject: Newbie Activeperl Win2k
Message-Id: <f9Ie7.14648$Kf4.3391312@news02.optonline.net>
I just installed activeperl on my Win2k Pro box, and my computer has no
problem recognizing the .pl extension, but it doesn't recognize the .cgi
extension.
I hope somebody can help me with this.
Thanks in advance
Wayne Lippold
gwhs@bronxpages.com
------------------------------
Date: Thu, 16 Aug 2001 07:51:08 +0100
From: "Paul Schnell" <pschnell@touchpowder.com>
Subject: Re: Newbie Activeperl Win2k
Message-Id: <K%Je7.13862$LN3.3501470@monolith.news.easynet.net>
"Wayne Lippold" <gwhs@bronxpages.com> wrote in message
news:f9Ie7.14648$Kf4.3391312@news02.optonline.net...
> I just installed activeperl on my Win2k Pro box, and my computer has no
> problem recognizing the .pl extension, but it doesn't recognize the .cgi
> extension.
>
What server are you running?
If Apache, add the following to httpd.conf:
AddHandler cgi-script .cgi
------------------------------
Date: Thu, 16 Aug 2001 16:26:26 +1000
From: Ash Turner <ash@turnernewmedia.com.au>
Subject: Newbie Perl Installation - "%1 is not a valid Windows NT Application"
Message-Id: <B7A1A4B2.6B5%ash@turnernewmedia.com.au>
Hi
I've just installed Perl on my NT Server. I know it's working. When I try
to access it through my web scripts, though, I get the error "%1 is not a valid
Windows NT Application".
I think it's the pathname, which I've set to "c:/perl/bin/" in my Perl
scripts; that is the pathname on the NT box.
Any help greatly appreciated.
Ash Turner
------------------------------
Date: 16 Aug 2001 01:55:33 GMT
From: ebohlman@omsdev.com (Eric Bohlman)
Subject: Re: perldoc is like Greek to a beginner??
Message-Id: <9lf96l$jk4$1@bob.news.rcn.net>
Bernie Cosell <bernie@fantasyfarm.com> wrote:
> Actually, I think that very very few programming languages manage to be
> context-free. That's just too restrictive for normal/effective use, and so
> while most languages use CF expressions and many CF programming constructs,
> there'll almost always inevitably be *some* little corner of the language
> that ends up having to be context sensitive.
Yep, the classic illustration is that a CFG can't express the constraint
that a function needs to be called with the same number of parameters it
was declared with. In practice, what's done is to write (and parse
according to) a CFG that defines a superset of the actual language, and
then tweak the parse tree afterwards.
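A toy illustration of that two-step approach, assuming a made-up mini-language with lines such as "def f(a, b)" and "call f(1, 2)". The regex plays the role of the over-permissive grammar (it accepts any number of arguments), and the arity check is a separate pass afterwards. Everything here, including the name check_arity, is hypothetical.

```perl
use strict;
use warnings;

# Made-up mini-language, for illustration only:
#   def NAME(param, ...)   declares a function
#   call NAME(arg, ...)    calls one
# Pass 1 parses a superset (any argument count is accepted);
# pass 2 checks each call against the declared arity.
sub check_arity {
    my ($source) = @_;
    my (%arity, @calls, @errors);

    for my $line ( split /\n/, $source ) {
        next unless $line =~ /^\s*(def|call)\s+(\w+)\s*\(([^)]*)\)\s*$/;
        my ($kind, $name, $list) = ($1, $2, $3);
        my $count = () = $list =~ /[^\s,]+/g;    # number of items in the list
        if ( $kind eq 'def' ) { $arity{$name} = $count }
        else                  { push @calls, [ $name, $count ] }
    }

    for my $call (@calls) {
        my ($name, $count) = @$call;
        if ( !exists $arity{$name} ) {
            push @errors, "$name: not declared";
        }
        elsif ( $arity{$name} != $count ) {
            push @errors, "$name: expected $arity{$name} args, got $count";
        }
    }
    return @errors;
}

print "$_\n" for check_arity("def f(a, b)\ncall f(1)\ncall f(1, 2)\n");
```

The second pass needs a symbol table (%arity), which is exactly the kind of context a context-free grammar cannot carry.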
------------------------------
Date: Wed, 15 Aug 2001 23:24:31 -0400
From: "Ed Cheng" <edmcheng@hotmail.com>
Subject: Problem with two versions of Perl installed
Message-Id: <hYGe7.28789$KG2.2869633@news20.bellglobal.com>
Hi,
I am using Red Hat 7.1, and one day I connected to CPAN and installed
Bundle::CPAN, CGI, DBI and other software. Installing Bundle::CPAN upgraded
my Perl from Red Hat's Perl 5.6.0 to Perl 5.6.1.
I tried rpm -e, but it complained that a lot of software depends on
Perl. How do I safely remove Perl 5.6.0 and all references to it, and make
all software point to Perl 5.6.1?
Thanks,
Edmond
------------------------------
Date: Wed, 15 Aug 2001 23:23:32 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: traversing sub-directories
Message-Id: <3B7B3CB4.E2234376@earthlink.net>
Chuck Goldstein wrote:
>
> Am new to Perl. An old (more ways than one) awk-user. I
> would like to traverse subdirectories to any depth looking
> for files matching 1) a name pattern (e.g. html) and 2) a
> pattern in text ... and then operate on said files. Have a
> book on Perl but can't locate a paradigm for such. The
> first part of my need is to replicate 'find' within Perl.
> Of course, I could use 'find' with a -exec ... but in back
> of my memory someone once told me that Perl traversed
> directory structures very well. Is this true ... or is my
> memory bad?
As others have said, you can, if you want, use find2perl and File::Find,
but in some circumstances something like the following may be preferred:
local $/ = "\0";
open( my $find, "-|", find => $dir, -name => "*.html", -print0 )
or die ...;
while( my $found = <$find> ) {
chomp $found; # a bare chomp would act on $_, not $found
# process $found
}
close( $find ) or die ...;
Note that setting $/ to \0 and using -print0 instead of -print go
together. You can leave $/ as the normal \n and use -print, but beware
that it's possible for filenames to have newlines in them, and if there
are any such files, your script will break.
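The effect of $/ on NUL-terminated records can be tried without running find at all. In this minimal sketch an in-memory filehandle (needs Perl 5.8 or later) stands in for find's -print0 output; read_nul_records is a name made up for the example.

```perl
use strict;
use warnings;

# Split NUL-terminated records, as "find ... -print0" would emit them.
# An in-memory filehandle is used so the sketch needs no external find.
sub read_nul_records {
    my ($data) = @_;
    open( my $fh, "<", \$data ) or die "Cannot open in-memory handle: $!";
    local $/ = "\0";               # record separator is NUL, not newline
    my @names;
    while ( defined( my $found = <$fh> ) ) {
        chomp $found;              # removes the trailing "\0" because of $/
        push @names, $found;
    }
    close $fh;
    return @names;
}

# The second name contains a newline and still arrives as one record.
my @names = read_nul_records("a.html\0odd\nname.html\0");
print scalar(@names), " names\n";
```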
Using File::Find has the advantage of being done in-perl, so no fork/
exec is done, but the disadvantage of not being able to easily take
advantage of system-specific find features [some things in GNU find, for
example]. Plus, find2perl makes ugly scripts :P
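For comparison, a hand-written File::Find version of the same search might look roughly like this. It is a sketch, not find2perl output; find_html is a made-up name, and the directories to search are assumed to come from the command line.

```perl
use strict;
use warnings;
use File::Find;

# Collect every plain file under $dir whose name ends in .html.
sub find_html {
    my ($dir) = @_;
    my @found;
    find( sub {
        # Inside the callback, $_ is the bare file name and
        # $File::Find::name is the path including the directory.
        push @found, $File::Find::name if -f $_ && /\.html\z/;
    }, $dir );
    @found = sort @found;
    return @found;
}

# Search each directory named on the command line, if any.
print "$_\n" for map { find_html($_) } @ARGV;
```

Matching on file contents as well would mean opening each file inside the callback, which is where doing the work in Perl starts to pay off over find -exec.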
--
I'm not a programmer but I play one on TV...
------------------------------
Date: Wed, 15 Aug 2001 19:49:36 -0600
From: "Richard A. Evans" <EvR@compuserve.com>
Subject: Re: tutorial for perl/ldap/web
Message-Id: <9lf8ov$u7$1@suaar1ac.prod.compuserve.com>
> Could anyone kindly suggest a good tutorial on creating the
> server-side perl code and web interface to hook into an ldap server?
I used Perl for System Administration (O'Reilly) and had success doing
what I needed.
Regards,
Rick Evans
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 1537
***************************************