[23413] in Perl-Users-Digest
Perl-Users Digest, Issue: 5631 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Oct 8 00:06:08 2003
Date: Tue, 7 Oct 2003 21:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 7 Oct 2003 Volume: 10 Number: 5631
Today's topics:
Re: //= operator alternative (Roy Johnson)
Re: [OT] please ignore <kkeller-usenet@wombat.san-francisco.ca.us>
computer name and domain, change, win2k (marge dahl)
converting file to excel problem <spedwards@qwest.net>
Re: converting file to excel problem <trammell+usenet@hypersloth.invalid>
Re: GD::Graph: "mixed" graph doesn't recognize "area" g <mgjv@tradingpost.com.au>
Re: GDBM_File problems [solved] <mhunter@uclink.berkeley.edu>
Re: NEWBIE! Please help! <noone@nowhere.com>
Re: NEWBIE! Please help! <uri@stemsystems.com>
Re: Opinions on "new SomeObject" vs. "SomeObject->new() <ict@eh.org>
Re: pattern matching <invalid-email@rochester.rr.com>
Re: Reading huge *.txt files? <syscjm@gwu.edu>
Re: Reading huge *.txt files? <syscjm@gwu.edu>
Segmentation Fault - core dumped. Do I have latest ver (Glen Hendry)
Re: Teach me how to fish, regexp <mgjv@tradingpost.com.au>
Re: Teach me how to fish, regexp (Bryan Castillo)
trying to understand a hash (John)
Re: trying to understand a hash <nospam_for_jkeen@concentric.net>
Re: trying to understand a hash <zoooz@gmx.de>
Re: trying to understand a hash <ddunham@redwood.taos.com>
Re: trying to understand a hash <mbudash@sonic.net>
Re: Virus, CPU killer, Memory Eater on Sourceforge.net? (James Willmore)
Re: <bwalton@rochester.rr.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 7 Oct 2003 19:55:13 -0700
From: rjohnson@shell.com (Roy Johnson)
Subject: Re: //= operator alternative
Message-Id: <3ee08638.0310071855.4cc99fb0@posting.google.com>
Having thought about it some more, I got a much better, more
comprehensive idea.
Change the behavior of defined() thus: it provides a context in which
only undefined values are false. Boolean operators return undef rather
than empty string.
Then this would work exactly like you'd expect:
if (defined($a or $b)) { ...
and you could do
defined($v ||= a() || b());
exists() would be exactly like defined(), except that when checking a
hash value, only non-existent keys return false.
It's straightforward, and it addresses a lot of things that people
grouse about. For larger areas, there could be pragmas "use defined"
and "use exists". I don't know how useful/popular those would really
be.
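For comparison, here is a small sketch of how the defined-or operators named in the subject line behave in Perl 5.10 and later, where `//` and `//=` test definedness rather than truth. The helper names `first_true` and `first_defined` are made up for illustration and are not part of the proposal above.

```perl
use strict;
use warnings;

# Hypothetical helpers contrasting truth-based and definedness-based defaults.
sub first_true    { my ($x, $y) = @_; return $x || $y; }   # skips 0, '' and undef
sub first_defined { my ($x, $y) = @_; return $x // $y; }   # skips only undef

my $v = 0;
$v //= 7;    # $v is defined (even though false), so it keeps its value

my $w;
$w //= 7;    # $w was undef, so it is assigned

print first_true(0, 5), " ", first_defined(0, 5), " $v $w\n";
```

The difference only shows up for defined-but-false values such as 0 and the empty string; for undef both forms fall through to the right-hand side.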
In a hurry tonight, but wanted to get this out there. Please think
about it a little and comment.
------------------------------
Date: Tue, 7 Oct 2003 16:03:28 -0700
From: Keith Keller <kkeller-usenet@wombat.san-francisco.ca.us>
Subject: Re: [OT] please ignore
Message-Id: <0ogvlb.t3v.ln@goaway.wombat.san-francisco.ca.us>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
NotDashEscaped: You need GnuPG to verify this message
On 2003-10-07, Tad McClellan <tadmc@augustmail.com> wrote:
> Swen Killer <swen_killer@yahoo.co.uk> wrote:
>
>> Subject: [OT] please ignore
>
> Please do not post articles that are to be ignored.
I think he meant that the author should be ignored.
--keith
--
kkeller-usenet@wombat.san-francisco.ca.us
(try just my userid to email me)
AOLSFAQ=http://wombat.san-francisco.ca.us/cgi-bin/fom
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iEYEARECAAYFAj+DRj4ACgkQhVcNCxZ5ID+F4QCeJ+w9vtxpw4y+kAzku7xoT6rD
BdsAn0JGuZNrZSJeqzsSgYJA98wg9MrV
=Vcjs
-----END PGP SIGNATURE-----
------------------------------
Date: 7 Oct 2003 19:36:29 -0700
From: mardahl2000@yahoo.com (marge dahl)
Subject: computer name and domain, change, win2k
Message-Id: <92bd2c30.0310071836.19bd7ac1@posting.google.com>
Sitting at the w2k workstation. Have a perl script that does
many things - one thing I can't figure out though is how to
rename the computer and domain.
------------------------------
Date: Tue, 7 Oct 2003 21:07:22 -0600
From: "Shawn" <spedwards@qwest.net>
Subject: converting file to excel problem
Message-Id: <KaLgb.792$XM3.41280@news.uswest.net>
Hi,
We are using the below script to convert a file into excel format. The
problem is that my file contains ssn in which they can start with zero.
Well, when it gets converted to excel it drops the leading zero. I need
that leading zero and am not sure how to modify this script to keep the
zero.
Any assistance would be greatly appreciated!
Shawn
--
#!/opt/bin/perl5.6 -w
###############################################################################
# Example of how to use the WriteExcel module
# Program to convert a text [delim] separated value file into an Excel file.
# Usage: txt2xls.pl file.txt newfile.xls
use Getopt::Long;
use Spreadsheet::WriteExcel::Big;
GetOptions ("d=s" => \$delim);
$delim = "|" if !defined($delim);
$delim =~ s/\|/\\|/g;
# Check for valid number of arguments
if (($#ARGV < 1) || ($#ARGV > 2)) {
    die("Usage: txt2xls file.txt newfile.xls\n");
};
# Open the Comma Separated Variable file
open (TXTFILE, $ARGV[0]) or die "$ARGV[0]: $!";
# Create a new Excel workbook
my $workbook = Spreadsheet::WriteExcel::Big->new($ARGV[1]);
my $worksheet = $workbook->add_worksheet();
# Row and column are zero indexed
my $row = 0;
while (<TXTFILE>) {
    chomp;
    @cols = split(/\s*${delim}\s*/,$_);
    $col = 0;
    foreach my $token (@cols) {
        $worksheet->write($row, $col, $token);
        $col++;
    }
    $row++;
}
------------------------------
Date: Wed, 8 Oct 2003 03:21:11 +0000 (UTC)
From: "John J. Trammell" <trammell+usenet@hypersloth.invalid>
Subject: Re: converting file to excel problem
Message-Id: <slrnbo70l7.kag.trammell+usenet@hypersloth.el-swifto.com.invalid>
On Tue, 7 Oct 2003 21:07:22 -0600, Shawn <spedwards@qwest.net> wrote:
> We are using the below script to convert a file into excel format. The
> problem is that my file contains ssn in which they can start with zero.
> Well, when it gets converted to excel it drops the leading zero. I need
> that leading zero and am not sure how to modify this script to keep the
> zero.
Do something like:
my $starts_with_zero = "=(\"01234\")";
or use qq for neatness:
my $starts_with_zero = qq[=("01234")];
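A minimal sketch of how that suggestion might be applied per token in the loop from the original script. The helper name `excel_text_token` and the "leading zero followed by digits" test are assumptions made for illustration; adjust the pattern to whatever your SSN column actually looks like.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap all-digit tokens that start with 0 in a
# string formula so the spreadsheet keeps the leading zero.
sub excel_text_token {
    my ($token) = @_;
    return $token =~ /\A0\d+\z/ ? qq[=("$token")] : $token;
}

print excel_text_token('012345678'), "\n";   # =("012345678")
print excel_text_token('123456789'), "\n";   # 123456789
```

In the script this would be called as `$worksheet->write($row, $col, excel_text_token($token));`. Spreadsheet::WriteExcel also documents a `write_string()` method that stores a cell as text, which may be a cleaner route if a formula in every such cell is unwelcome.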
------------------------------
Date: 08 Oct 2003 02:14:47 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: GD::Graph: "mixed" graph doesn't recognize "area" graph type
Message-Id: <slrnbo6sop.pv1.mgjv@verbruggen.comdyn.com.au>
On 7 Oct 2003 10:50:51 -0700,
Emilio Mayorga <e.mayorga@co.snohomish.wa.us> wrote:
> e.mayorga@co.snohomish.wa.us (Emilio Mayorga) wrote in message news:<faa70e85.0310061616.437e484a@posting.google.com>...
>
>
>> > I've had a bit of a look at it, but I can't really see what's wrong with
>> > it (if anything at all). Would it be possible for you to just fill the
>> > $sensorgraph{$sensorid} hashes with some decent values, and then run the
>> > GD::Graph part separately? if you then still see problems, please email
>> > the result to me, because that probably means there is some bug or
>> > oddity that I need to have a look at.
>>
>> Thanks for looking into it. I'll try to do that test, but a deadline
>> is creeping up on me, so it may be a while. I'll definitely try to
>> follow up if the problem persists. I expect to make heavy use of
>> GD::Graph for several projects in the future.
>
> I've found the problem. The error message comes up when there are
> undef values in the area graph. I was able to reproduce it in
> sample61.pl by just setting one of the area graph values to undef.
Thanks for that. I'll look into that to see why I decided to make it
behave in that way.
> I suppose there is logic to why an area graph would not work when
> there is an undef, but the error message is misleading.
There is probably a reason, yes, but I agree that the error message
needs some work.
> On a more general note, it seems restrictive to have to associate all
> graphs with exactly the same set of x-values, especially for a numeric
> x-axis (continuous values).
I agree. Fixing that is non-trivial however, and has been on my todo
list for quite a while. GD::Graph started its life intended for
something much simpler than what it has evolved into and bad design
decisions were made early on.
Martien
--
|
Martien Verbruggen | Freudian slip: when you say one thing but
Trading Post Australia | mean your mother.
|
------------------------------
Date: Tue, 7 Oct 2003 23:54:02 +0000 (UTC)
From: Mike Hunter <mhunter@uclink.berkeley.edu>
Subject: Re: GDBM_File problems [solved]
Message-Id: <slrnbo6kd5.1j3.mhunter@celeste.net.berkeley.edu>
On Tue, 7 Oct 2003 17:43:52 +0000 (UTC), Mike Hunter wrote:
> Can somebody point me to the perl tgz file for this? I am getting:
>
> "Can't locate loadable object for module GDBM_File in @INC (@INC contains..."
This turned out to be a FreeBSD problem with /usr/ports/lang/perl5.8.
Build with "WITH_GDBM" set.
Mike
------------------------------
Date: Tue, 07 Oct 2003 23:03:37 GMT
From: "gibbering poster" <noone@nowhere.com>
Subject: Re: NEWBIE! Please help!
Message-Id: <dDHgb.8613$Vo6.1035@newssvr29.news.prodigy.com>
"BDK" <bdknoll@runbox.com> wrote in message
news:dffcb909.0310070808.82e2f3a@posting.google.com...
> I am trying to take a peice of a word and replace it with output of
> the chr function. So the word is ki45. I want to replace the 45 with
> what "print (chr(36));" returns.
>
> Here is what I have tried:
>
> $word = ki45;
> $word =~ s/(45)/(print (chr(chr36))/;
> print ($word);
> print "\n";
>
> but that returns:
> ki(print (chr(chr36))
>
Try using the e mod which executes the right-hand side of a
substitution:
untested:
$_=ki45;
s/\d+/chr$&/e and print
A couple notes:
1) Best to keep scalar strings in single quotes
2) If you're not going to be matching character 45 every time, you need
a more dynamic approach to your regex... possibly:
s/ki(\d+)/"ki".chr$1/e
To replace all number sequences in a string with their corresponding
character value:
$word =~ s/\d+/chr $&/ge;
------------------------------
Date: Tue, 07 Oct 2003 23:09:14 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: NEWBIE! Please help!
Message-Id: <x7vfr0ejc5.fsf@mail.sysarch.com>
>>>>> "gp" == gibbering poster <noone@nowhere.com> writes:
gp> untested:
gp> $_=ki45;
why assign to $_? just bind to the var.
gp> s/\d+/chr$&/e and print
don't use $&. grab the matched string and use $1. using $& anywhere will
slow down all the s/// operations in your entire program.
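A short sketch of both points together, binding the substitution to the variable and using a capture instead of $&. The sub name `digits_to_chars` is made up for the example.

```perl
use strict;
use warnings;

# Replace every run of digits with the character that has that code.
sub digits_to_chars {
    my ($word) = @_;
    $word =~ s/(\d+)/chr $1/ge;   # /e evaluates the replacement as Perl code
    return $word;
}

print digits_to_chars('ki45'), "\n";   # chr(45) is '-', so this prints ki-
```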
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
------------------------------
Date: 8 Oct 2003 11:29:22 +1000
From: Iain Truskett <ict@eh.org>
Subject: Re: Opinions on "new SomeObject" vs. "SomeObject->new()"
Message-Id: <slrnbo6qi2.of8.ict@dellah.org>
* Quantum Mechanic <quantum_mechanic_1964@yahoo.com>:
[...]
> Not to put too fine a point on it, but what is acceptable syntax for
> creating a new instance of a class of a given instance, when that
> class isn't known at the time of coding?
[...]
> If I don't know the class ahead of time (determined at runtime, for
> instance), I'm looking at one of these:
> $new_instance = new $old_instance;
> $new_instance = $old_instance->new();
> $new_instance = (ref $old_instance)->new();
Or even:
my @classes = qw( Class Fnord Fnerk Xyzzy );
my $class = $classes[ rand @classes ];
my $new_instance = $class->new();
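A self-contained sketch of the two runtime cases discussed here: taking the class from an existing object, and taking it from a string. The toy constructors and the helper `new_like` are hypothetical, written only so the example runs on its own.

```perl
use strict;
use warnings;

# Two toy classes with plain class-method constructors (illustrative only).
{ package Fnord; sub new { my ($class) = @_; return bless {}, $class; } }
{ package Fnerk; sub new { my ($class) = @_; return bless {}, $class; } }

# Build a fresh object of the same class as an existing one.
sub new_like {
    my ($old_instance) = @_;
    my $class = ref $old_instance;   # class name of the existing object
    return $class->new();
}

my $old_instance = Fnord->new();
my $new_instance = new_like($old_instance);

my $class_name = 'Fnerk';            # class chosen at runtime
my $from_name  = $class_name->new();

print ref($new_instance), " ", ref($from_name), "\n";   # Fnord Fnerk
```

Going through `ref` does not require the constructor to cope with being called on an object, whereas `$old_instance->new()` does.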
cheers,
--
Iain.
------------------------------
Date: Wed, 08 Oct 2003 02:45:23 GMT
From: Bob Walton <invalid-email@rochester.rr.com>
Subject: Re: pattern matching
Message-Id: <3F837912.5060402@rochester.rr.com>
Lex wrote:
> "Bob Walton" <invalid-email@rochester.rr.com> wrote in message
> news:3F823B3F.6070602@rochester.rr.com...
>
...
> What it does: when showing the field $rec{'Text'}from one of the records of
> the database it checks if words, synonyms or plurals form all records in the
> database are used in this field and if so create a link to those words.
> Anyway, I don't need telling you people do I? :) But just to be clear about
> what I'm doning.
>
> Now I wanted something more again (always the same...), when showing records
> from another database I wanted it to happen as well. Got that working (the
> links), wasn't very hard. However, I now wanted a popup screen as well
> already showing the meaning of the word (taken from the meaning field). Now,
> here I run into troubles and I think it's because it's still working and
> trying to do things with that meaning field as well. If it would produce
> just the text it works. I thought it might be ' or " as you suggested so
> made sure that I got rid of them. However, that didn't do the trick and the
> problem was worse than just that.
>
> So I think $meaning{$id} in this case (underneath) has more luggage than
> that what I am looking for. In this case I do not want links in this field
> now, I know I want a lot and am still not capable of producing it, quite
> frustrating, but hey, this is my way of learning I guess. I'll copy the code
> underneath that I tried but that was giving me more than I wanted:
>
> open (DB, "<$db_file_name_abc") or &cgierr("error in search. unable to open
> database: $db_file_name_abc.\nReason: $!");
> if ($db_use_flock) { flock(DB, 1); }
>
> my %xref;
> my %meaning;
> while ( <DB> ) {
> chomp;
> my($id, $word, $plural, $synonym, $cat, $meaning) = split /\|/;
> @xref{($word, $plural, $synonym, $meaning)} = ($id)x4;
> $meaning{$id}=$meaning;
> }
>
> foreach my $id(keys %meaning){
> foreach my $word ( $rec{'Text'} =~ /\S+/g ) {
>
> if ($xref{$word}) {
>
> my $newword = "<a
> href=\"$db_dir_url/db.cgi?db=abc&uid=$db_uid&ID=$xref{$word}&mh=1&ww=1&view_
> records=1\" class=\"abclink\"
> ONMOUSEOVER=\"popup('$meaning{$id}','#ffffcc')\";
> ONMOUSEOUT=\"kill()\">$word</a>";
>
> $rec{'Text'} =~ s|\b$word\b|$newword|gs;
> }
> }
> }
>
> close DB;
>
> As soon as I insert $meaning{$id} it all goes wrong, all of it, even $word
> isn't what it's supposed to be anymore. As well after cutting out all html
> and ' and " etc. from $meaning{$id}.
If you are running this as a CGI script, run it for debugging purposes
at the command prompt and *use the Perl debugger*. With it you can step
through your program and observe the values of variables as you go. In
the excerpt above, for example, you will note that $rec{Text} (why are
you using a hash to hold just one scalar, anyway??) never gets set to
anything because the "$word" foreach loop loops over words in
$rec{Text}, but $rec{Text} starts out empty, so there are never any
words to start with (if you would pay attention to advice and use
strict; and use warnings; you would have known that right away). Thus,
that foreach body is never executed, so $rec{Text} never gets any
content. Proper indentation of the loops would help understanding of
the code, too.
You appear to be just trying random things rather than taking a
systematic approach to your programming problem. You should sit back
and develop an overview of what you want to do, and then outline the
small steps needed to accomplish that. Then tackle each of those steps,
using the debugger to assure that each statement is accomplishing its
purpose and that you actually have the data you expect in each variable.
When you get to the end of that, you will have working code.
HTH.
...
> Lex
--
Bob Walton
Email: http://bwalton.com/cgi-bin/emailbob.pl
------------------------------
Date: Tue, 07 Oct 2003 18:29:04 -0400
From: Chris Mattern <syscjm@gwu.edu>
Subject: Re: Reading huge *.txt files?
Message-Id: <3F833E30.8050400@gwu.edu>
Math55 wrote:
> hi, is there a possibility to read large (>1mb) *.txt files in a fast
> way? everytime i do that, my program freezes or takes very long to
> finish. anyone a idea?
>
Specifics. Specifics are good. Specifics want to be your friend.
How fast is "in a fast way"? How long is "very long"? And most of
all--what "that" are you doing?
Chris Mattern
------------------------------
Date: Tue, 07 Oct 2003 18:33:26 -0400
From: Chris Mattern <syscjm@gwu.edu>
Subject: Re: Reading huge *.txt files?
Message-Id: <3F833F36.6030105@gwu.edu>
Peter Hickman wrote:
> Tulan W. Hu wrote:
>
>> "Math55" <magelord@t-online.de> wrote in message ...
>>
>>> hi, is there a possibility to read large (>1mb) *.txt files in a fast
>>> way? everytime i do that, my program freezes or takes very long to
>>> finish. anyone a idea?
>>>
>>> THANKS:-)
>>
>>
>>
>> Upgrade your perl to 5.8.1 and use Tie::File.
>>
>>
>
> To be honest this is not good advice. His code is grossly inefficient,
> most of the improvements will come from a better design than using a
> module to implement a bad design.
>
Actually, we don't know if it's good advice or not. The OP hasn't
given us any code to look at. The code we've seen in this thread
came from another newbie who chimed in about having the same problem,
except he actually provided people with code to fix.
Chris Mattern
------------------------------
Date: 7 Oct 2003 19:08:15 -0700
From: ghendry@iprimus.com.au (Glen Hendry)
Subject: Segmentation Fault - core dumped. Do I have latest version ?
Message-Id: <e6f27561.0310071808.13ee465f@posting.google.com>
Hi all,
I am getting repeated seg faults and dumped cores on Solaris (version
details below).
The problem is difficult to trace with debug print statements and
often disappears completely when print statements (which resolve
variables) are added. The problem also stops happening when we run in
debugging mode. It is very frustrating.
We do use the 'use blah' statement and some have said in recent posts
that these can cause the problem.
Any advice would be greatly appreciated. (My current solution is to
run the program in production with -d switch ;-)
Thanks
Glen
OS Version ...
> uname -a
SunOS capdev01 5.6 Generic_105181-15 sun4u sparc SUNW,Ultra-4
Perl Version ...
> perl -V
Summary of my perl5 (revision 5.0 version 8 subversion 0)
configuration:
Platform:
osname=solaris, osvers=2.6, archname=sun4-solaris
uname='sunos 5.6 generic_105181-26 sun4u sparc sunw,ultra-1 '
config_args='-Dcc=gcc -B/usr/ccs/bin/'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc -B/usr/ccs/bin/', ccflags ='-fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O',
cppflags='-fno-strict-aliasing'
ccversion='', gccversion='3.1', gccosandvers='solaris2.6'
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='gcc -B/usr/ccs/bin/', ldflags =' -L/usr/local/lib '
libpth=/usr/local/lib /usr/lib /usr/ccs/lib
libs=-lsocket -lnsl -lgdbm -ldl -lm -lc
perllibs=-lsocket -lnsl -ldl -lm -lc
libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: USE_LARGE_FILES
Built under solaris
Compiled at Jul 22 2002 05:26:53
%ENV:
PERLLIB="/devhome/com/dev/ghendry/tools:/devhome/com/tools"
@INC:
/devhome/com/dev/ghendry/tools
/devhome/com/tools
/usr/local/lib/perl5/5.8.0/sun4-solaris
/usr/local/lib/perl5/5.8.0
/usr/local/lib/perl5/site_perl/5.8.0/sun4-solaris
/usr/local/lib/perl5/site_perl/5.8.0
/usr/local/lib/perl5/site_perl/5.005
/usr/local/lib/perl5/site_perl
.
GDB output ...
> gdb /usr/local/bin/perl core
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.6"...
warning: core file may not match specified executable file.
Core was generated by `/usr/local/perl/bin/perl -w ./WU_619_SF_CNVEXT_INV_BAL.pl -D /unload/ISAS/ghend'.
Program terminated with signal 11, Segmentation Fault.
#0 0xef60800c in ?? ()
(gdb) where
#0 0xef60800c in ?? ()
#1 0xef5b88f4 in ?? ()
#2 0x73e28 in Perl_av_store ()
#3 0x704fc in Perl_hv_fetch_ent ()
#4 0x73a6c in Perl_av_extend ()
#5 0x69934 in Perl_mini_mktime ()
#6 0xb1150 in Perl_pp_connect ()
#7 0x26e1c in S_parse_body ()
#8 0x24558 in frame_dummy ()
(gdb)
------------------------------
Date: 08 Oct 2003 02:12:15 GMT
From: Martien Verbruggen <mgjv@tradingpost.com.au>
Subject: Re: Teach me how to fish, regexp
Message-Id: <slrnbo6sk1.pv1.mgjv@verbruggen.comdyn.com.au>
[rewrapped long lines]
On Tue, 07 Oct 2003 19:12:29 GMT,
Henry <henryn@zzzspacebbs.com> wrote:
> Martien Vebruggen:
>
> Thank you for your response to my post:
>
> in article slrnbo4p6t.pv1.mgjv@verbruggen.comdyn.com.au, Martien Verbruggen
> at mgjv@tradingpost.com.au wrote on 10/7/03 12:01 AM:
>
>> On Tue, 07 Oct 2003 04:34:05 GMT, Henry <henryn@zzzspacebbs.com>
>> wrote: Folks:
>>> Seems the best way to deal with this is to slurp, and use "split"
>>> with the appropriate regexp. Wrinkle: I need to retain the
>>> section numbers in the return strings.
>>>
>> I would probably set the input record separator ($/, see perlvar)
>> to "", which will treat two or more consecutive newlines as the
>> record separator. Then each record starts with the number you're
>> interested in.
> Right, that's what I finally did, in effect. (I did something
> similar at the "split".) But this isn't very robust, I think: it
> depends on some typist somewhere _always_ following the rules.
>
> I think you are saying that slurp mode may not be the best choice.
>
> As far as your setting
>
> $/ = "";
>
> This is not exactly intuitive from the point of view of a newcomer.
> Sorry, could you help me understand (or give me a blind rule of
> thumb) how what looks like setting a variable to an empty string
> implies "two or more successive newlines"?
The perlvar documentation explains what $/ (the input record
separator) does, and that it has a "special" setting of the empty
string, which makes it read "paragraphs", i.e. blocks of text
separated by two or more newlines.
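A minimal sketch of paragraph mode, reading from an in-memory string so it runs without a data file. The sub name `read_paragraphs` and the sample text are invented for the example.

```perl
use strict;
use warnings;

# Read a string one "paragraph" at a time by setting $/ to the empty string.
sub read_paragraphs {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = "";                 # paragraph mode
    my @paragraphs;
    while (my $paragraph = <$fh>) {
        chomp $paragraph;          # in paragraph mode this strips all trailing newlines
        push @paragraphs, $paragraph;
    }
    close $fh;
    return @paragraphs;
}

my @paragraphs = read_paragraphs(
    "12034.  First para\ncontinues here\n\n\n\n12034.1.  Second para\n"
);
print scalar(@paragraphs), " paragraphs\n";   # 2 paragraphs
```

Using `local` keeps the changed separator confined to the sub, so other reads in the program still see the default of "\n".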
> Thanks for taking all the trouble to explain the components in detail:
>> /
>> ^ # from the beginning of the record
>
> Right.
>
>> ( # start capture
>
> Capture? I guess you mean the mysterious "save the stuff you match"
> mechanisms I've found in some perl references. The explanations
> I've found are very short and not very useful. Also: I find it
> hard to discriminate between parens used for operation grouping and
> this use.
Yes. Capturing parentheses "save" whatever is matched between them,
and return it as a result of the operation, as well as in the named
variables $1, $2, etc.. At the same time they group multiple
characters together to form a single subpattern.
There is more information about this in the perlre documentation, as
well as in the perlop documentation under the entry for
"m/PATTERN/cgimosx".
>> (?: # start grouping, but no capturing
>
> Sorry, could you speak more fully about this? Again, I haven't
> found a good reference for this stuff.
If you only want to group some stuff together in a subpattern, but you
don't want that match of that subpattern returned as one of the digit
variables, or in the return list, you use (?:PATTERN). Again, see the
perlre documentation for a full explanation.
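To make the difference concrete, here is a sketch with one capturing group wrapped around a non-capturing one. The pattern is only modelled on the pieces quoted in this thread; it is not the exact regex from the earlier post, and `parse_record` is a made-up name.

```perl
use strict;
use warnings;

# Illustrative pattern only, assembled from the fragments discussed above.
my $section_re = qr/
    ^                        # from the beginning of the record
    ( \d{1,6} (?:\.\d+)? )   # capture 1: section number; sub-number grouped, not captured
    \. \ \                   # literal dot followed by two spaces
    (.*)                     # capture 2: the rest of the record
/xs;

sub parse_record {
    my ($record) = @_;
    # In list context the match returns the captured substrings.
    my ($number, $body) = $record =~ $section_re;
    return ($number, $body);
}

my ($number, $body) = parse_record("12034.1.  Blah, blah.");
print "number=$number body=$body\n";   # number=12034.1 body=Blah, blah.
```

Only two values come back even though there are three sets of parentheses, because the `(?:...)` group does not count as a capture.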
>> .\ \ # literal . followed by two spaces
>
> Sorry, I don't get that. Could you explain more fully? I think that I
> understand that a period, unescaped, matches any character, so I would
> expect that you'd have to escape before the period to match a literal
> period/decimal point.
You're right. my mistake in transcribing the regular expression. there
should be a backslash in front of the dot.
>> (.*) # capture the rest of the record
>
> I think I understand that
>
> .*
>
> means "any character, repeated 0 or more times", but I don't get how the
> parens lead to capture (and not operation grouping, as above) and eventual
> appearance of the captured data somewhere.
It does both. They group, and as a side effect, the matched subpattern
gets captured and returned (in this case as the second element of the
returned list, as well as in $2).
>> The first capturing set of parentheses returns the paragraph
>> number, including the sub-number, if present, and the second
>> capturing parentheses set returns the "Blah, blah.." bit up to the
>> end of the record.
>
> Right, as I said above, I can't figure out how this aspect works.
> This may seem obvious to you but looks like a hidden (or magical)
> side-effect to me.
The fact that those grouped subpattern matches get returned (and saved
in $1, $2...) is more an effect of the m// operator (documented in
perlop) than of regular expressions themselves. However, they do get
captured in regular expressions, and you can refer back to them (with
\1, \2...) inside of the same regular expression.
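A small sketch of a backreference inside the pattern itself, using the familiar doubled-word check. The sub name `first_doubled_word` is invented for the example.

```perl
use strict;
use warnings;

# Return the first word that is immediately repeated, or undef if none.
sub first_doubled_word {
    my ($text) = @_;
    # \1 must match exactly what the first capture group matched.
    return $text =~ /\b(\w+)\s+\1\b/ ? $1 : undef;
}

print first_doubled_word('this is is a test'), "\n";   # is
```

Inside the regex the group is referred to as \1; once the match has succeeded, the same text is available to the surrounding code as $1.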
>> Also see the perlvar and perlre documentation for more information.
>
> My desk and my screen are littered with various references. Thanks for
> pointing out these man "subreferences" -- I had not noticed them
man perl gives a rather complete list of all the various other manual
pages that are available.
>> If two newlines is not a record splitter, and you _have_ to use a
>> minimum of three, this won't work.
>
> Sorry, could you speak more fully about this? Is there a
> restriction I'm not seeing?
If, for example, your text is formatted like:
12345 Some text for paragraph 1

Some more text that belongs in paragraph two



12345.1 This is the second paragraph
Then setting $/ to "" would read the second part of the first
paragraph as a separate read, since it has two newlines between the
first and second bit. if there is text in your documents that is like
that, you can't use the first bunch of code (with $/ set to ""), but
you have to use the second bunch of code (with $/ set to "\n\n\n" or
possibly even "\n\n\n\n") and do a bit more work in removing trailing
and leading newlines.
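The two settings can be compared on the same input with a sketch like the following. The sub name `read_sections` and the sample text are assumptions for illustration; the cleanup step is the "bit more work" mentioned above.

```perl
use strict;
use warnings;

# Read records from a string using the given input record separator,
# then trim leading and trailing newlines from each record.
sub read_sections {
    my ($text, $separator) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = $separator;
    my @sections;
    while (my $record = <$fh>) {
        $record =~ s/\A\n+//;
        $record =~ s/\n+\z//;
        push @sections, $record if length $record;
    }
    close $fh;
    return @sections;
}

my $text = "12345.  Some text for paragraph one\n\n"
         . "Some more text for paragraph one\n\n\n\n"
         . "12345.1.  This is the second paragraph\n";

my @by_paragraph = read_sections($text, "");           # splits at the single blank line too
my @by_blank_run = read_sections($text, "\n\n\n\n");   # splits only at three blank lines

print scalar(@by_paragraph), " ", scalar(@by_blank_run), "\n";   # 3 2
```

With a fixed string separator, any newlines beyond the four it consumes end up at the start of the next record, which is why the leading trim is there.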
>> #!/usr/local/bin/perl use warnings; use strict;
That's not what I posted. The newlines are important.
There are also a perlrequick and a perlretut manual page, which are
more gentle introductions to regular expressions than the perlre
reference documentation. You should probably have a bit of a read of
those.
Furthermore: Don't worry too much that some of this stuff looks
magical. It is. Perl is full of things that you just have to learn
about by immersion, and by repeated visits to the same documentation.
it can take a while before some of this stuff becomes automatic.
Martien
--
|
Martien Verbruggen | Unix is user friendly. It's just selective
Trading Post Australia | about its friends.
|
------------------------------
Date: 7 Oct 2003 20:15:35 -0700
From: rook_5150@yahoo.com (Bryan Castillo)
Subject: Re: Teach me how to fish, regexp
Message-Id: <1bff1830.0310071915.68b1952f@posting.google.com>
Henry <henryn@zzzspacebbs.com> wrote in message
<snip>
> I've got a bunch of fixed-format text files (< 100k bytes each) to sniff.
>
> Each file is divided into paragraphs. Each para is preceded by at least
> three blank lines, and is introduced by a section number of 1 to 6 digits
> followed by a period and two spaces, OR, 1 to 6 digits followed by a period
> and at least one digit, followed by a period and two spaces, e.g.
>
> ------------------------------------------------------
> .....
> <empty>
> <empty>
> <empty>
> 12034. Blah, blah, blah, blah. Blah. Blah Blah Blah. Blah. Blah, blah,
> blah, blah. Blah. Blah Blah Blah. Blah. Blah, blah, blah, blah. Blah.
> Blah Blah Blah. Blah. Blah, blah, blah, blah. Blah. Blah Blah Blah.
> Blah. ...
> ------------------------------------------------------
>
> Or, the second format:
>
> ------------------------------------------------------
> .....
> <empty>
> <empty>
> <empty>
> 12034.1. Blah, blah, blah, blah. Blah. Blah Blah Blah. Blah. Blah, blah,
> blah, blah. Blah. Blah Blah Blah. Blah. Blah, blah, blah, blah. Blah.
> Blah Blah Blah. Blah. Blah, blah, blah, blah. Blah. Blah Blah Blah. ...
> ....
> ------------------------------------------------------
>
> Yes, if you are wondering, these are legal blah-blah-blahs.
>
> Seems the best way to deal with this is to slurp, and use "split" with the
> appropriate regexp. Wrinkle: I need to retain the section numbers in the
> return strings.
Here is a way to slurp and split with a regex, for what you described.
Three things you might want to look at:
1. How the zero-width look-ahead assertion (?=) doesn't
consume the section number
2. How the (?:) grouping doesn't capture the value
(you could have used regular capturing grouping
but there isn't any point in capturing in the split)
3. How the qr operator is used. It isn't necessary, but
I thought it made the code more readable.
You should also see that this split might leave an empty
value in the first element of the array. You will have to check for it.
use strict;
use warnings;
use IO::File;
sub readfile {
    my $in = IO::File->new($_[0], "r") or die "Cannot open $_[0]: $!";
    my $text = '';
    $text .= $_ while (<$in>);
    return $text;
}
my $text = readfile('file.txt');
# compile re here for readability
my $re = qr/
    [\r\n]{3,}      # match 3 or more newline characters
    (?=             # zero width look ahead doesn't consume section
        \d{1,6}\.   # match 1 to 6 digits of section and a dot
        (?:\d+\.)?  # match optional digits and dot (non-capturing)
        \s{2}       # match 2 spaces
    )
/x;
my @t = split $re, $text;
for (my $i=0; $i<=$#t; $i++) {
    print "Para [$i]\n", $t[$i], "\n", "-"x60,"\n";
}
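To see a look-ahead based split keep the section numbers, here is a sketch that runs on an in-memory string rather than a file. It uses a positive look-ahead `(?=...)`; the sub name `split_sections` and the sample text are made up for the example.

```perl
use strict;
use warnings;

# Split on runs of 3+ newline characters, but only where a section
# number follows; the look-ahead leaves that number in the next piece.
sub split_sections {
    my ($text) = @_;
    my $separator = qr/
        [\r\n]{3,}        # three or more newline characters
        (?=               # followed by (but not consuming)
            \d{1,6} \.    #   1 to 6 digits and a dot
            (?: \d+ \. )? #   optional sub-number and dot
            \ {2}         #   two spaces
        )
    /x;
    return split $separator, $text;
}

my @sections = split_sections(
    "12034.  First section.\nMore of it.\n"
    . "\n\n\n"
    . "12034.1.  Second section.\n"
    . "\n\n\n"
    . "7.  Third section.\n"
);
print scalar(@sections), " sections\n";   # 3 sections
```

Each element still begins with its own section number, and the last one keeps its single trailing newline because that is too short a run to count as a separator.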
<snip>
> Could some wizard teach me to fish: Please don't give me a solution, merely
> tell me where I'm going wrong and put me back on the right path.
>
Sorry, but it's easier to give a solution and to ask you to
read it and research it to figure out how it works.
> Or should I go back to my awk hack that works and which I actually
> understand?
>
> Thanks,
>
> Henry
>
------------------------------
Date: 7 Oct 2003 16:16:34 -0700
From: jguad98@hotmail.com (John)
Subject: trying to understand a hash
Message-Id: <a964da31.0310071516.5059d0d7@posting.google.com>
I'm taking apart somebody else's perl script in order to (a) learn and
(b) make something for my own purposes, and have come across a hash
that is not written in the manner to which I've become accustomed ...
the author is not within hollering distance, so I thought I'd try this
list for assistance.
Here's the hash as the author has created it:
$hashname{$key}{$other} = value;
where "$hashname" was initialized with "my %hashname", "$key" is a
scalar derived from input, and "$other" is another scalar also derived
from input.
It looks to me that "other" is outside of the key and obviously isn't
the value either. The script I found it in works as the author
intended, so the structure is obviously legal, but I don't understand
it. I was under the impression that a hash in scalar context should
look like:
$hashname{$key} = value;
or
%hashname(key => value);
or
%hashname("key","value");
but not
%hashname{$key}{$other} = value;
I thought maybe it was intended to be some kind of index notation, but
the actual value of that var is a string and not a digit, and if it
was supposed to be a reference of some sort to make or indicate the
key is another array (hash of hashes?), then I would expect the var
"$other" to be inside the key's curly braces, not outside in their own
braces (i.e. "$hashname{$key{$other}} = value;" as opposed to
"$hashname{$key}{$other} = value;).
Can somebody enlighten me on what that {$other} is all about? How
does it work? Why does it work?
best regards,
John
------------------------------
Date: 07 Oct 2003 23:27:57 GMT
From: "James E Keenan" <nospam_for_jkeen@concentric.net>
Subject: Re: trying to understand a hash
Message-Id: <blvi5t$beo@dispatch.concentric.net>
"John" <jguad98@hotmail.com> wrote in message
news:a964da31.0310071516.5059d0d7@posting.google.com...
> I'm taking apart somebody else's perl script in order to (a) learn and
> (b) make something for my own purposes,
Worthy objectives.
[snip]
> Here's the hash as the author has created it:
>
> $hashname{$key}{$other} = value;
>
> where "$hashname" was initialized with "my %hashname", "$key" is a
> scalar derived from input, and "$other" is another scalar also derived
> from input.
>
> It looks to me that "other" is outside of the key and obviously isn't
> the value either. The script I found it in works as the author
> intended, so the structure is obviously legal, but I don't understand
> it. I was under the impression that a hash in scalar context should
> look like:
>
> $hashname{$key} = value;
> or
> %hashname(key => value);
> or
> %hashname("key","value");
> but not
> %hashname{$key}{$other} = value;
>
It's a hash of hashes, a multi-dimensional data structure. The value
associated with $hashname{$key} is a *reference* to another hash. In that
inner hash, a value is being assigned. Example:
use strict;
use warnings;
use Data::Dumper;
my (%hashname, $key, $other);
$key = 'alpha';
$other = 'beta';
$hashname{$key}{$other} = 'gamma';
print Dumper(\%hashname);
See: perldoc perlref
jimk
------------------------------
Date: Wed, 08 Oct 2003 01:33:35 +0200
From: Amir Kadic <zoooz@gmx.de>
Subject: Re: trying to understand a hash
Message-Id: <blvisk$h0v63$1@ID-142982.news.uni-berlin.de>
John wrote:
> $hashname{$key}{$other} = value;
...which is the same as
$hashname{$key}->{$other}= value;
...because 'you can omit the arrow if and only if
it occurs between braces (or brackets)'.
I don't know where I read this, but hope it's correct.
Maybe `man perlreftut`...
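A minimal sketch of that rule, for illustration only (the variable
values are made up here, not taken from John's script). Both spellings
address the same element of the inner hash:

```perl
use strict;
use warnings;

my %hashname;
my ($key, $other) = ('alpha', 'beta');   # hypothetical values

# Arrow written out explicitly between the two subscripts.
$hashname{$key}->{$other} = 'gamma';

# Arrow omitted between the two subscripts: same element.
my $same = $hashname{$key}{$other};

print "explicit arrow: $hashname{$key}->{$other}\n";
print "implicit arrow: $same\n";
```

The arrow may only be dropped between subscripts; a reference held in a
plain scalar, such as $hashref->{$other}, still needs it.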
Amir
------------------------------
Date: Tue, 07 Oct 2003 23:45:34 GMT
From: Darren Dunham <ddunham@redwood.taos.com>
Subject: Re: trying to understand a hash
Message-Id: <yeIgb.8628$eC6.8076@newssvr29.news.prodigy.com>
John <jguad98@hotmail.com> wrote:
> I'm taking apart somebody else's perl script in order to (a) learn and
> (b) make something for my own purposes, and have come across a hash
> that is not written in the manner to which I've become accustomed ...
> the author is not within hollering distance, so I thought I'd try this
> list for assistance.
> Here's the hash as the author has created it:
> $hashname{$key}{$other} = value;
> where "$hashname" was initialized with "my %hashname", "$key" is a
> scalar derived from input, and "$other" is another scalar also derived
> from input.
> It looks to me that "other" is outside of the key and obviously isn't
> the value either. The script I found it in works as the author
> intended, so the structure is obviously legal, but I don't understand
> it. I was under the impression that a hash in scalar context should
> look like:
> $hashname{$key} = value;
> or
> %hashname(key => value);
> or
> %hashname("key","value");
> but not
> %hashname{$key}{$other} = value;
This gets into references. You should go over the perlref and
perlreftut documents in perldoc if
you haven't already. The actual line is
$hashname{$key}{$other} = value;
and may also be written the following ways..
$hashname{$key}->{$other} = value;
$hashref = $hashname{$key};
${$hashref}{$other} = value;
%hashname has values in it that are not simple scalars, but are
references to more hashes.
%hashname = ( key => { other => value } );
> I thought maybe it was intended to be some kind of index notation, but
> the actual value of that var is a string and not a digit, and if it
> was supposed to be a reference of some sort to make or indicate the
> key is another array (hash of hashes?), then I would expect the var
> "$other" to be inside the key's curly braces, not outside in their own
> braces (i.e. "$hashname{$key{$other}} = value;" as opposed to
> "$hashname{$key}{$other} = value;).
Indeed "hash of hashes" is exactly how such a structure is named.
It's the way they're parsed. The above would require the presence of a
hash called %key.
Think of it like a multidimensional array.
$array[4][3] is also valid, but it's an array of arrays.
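As a rough sketch of both points (all names and values below are
hypothetical, chosen only for illustration): the first part builds an
array of arrays, and the second shows that the nested-brace form looks
up a separate hash %key first and then uses the result as a plain key.

```perl
use strict;
use warnings;

# Array of arrays: each used element of @array is a reference to another array.
my @array;
$array[4][3] = 'x';                 # the inner array ref is created on demand
my $row_type = ref $array[4];       # 'ARRAY'

# Nested-brace form: the inner subscript is evaluated first, against %key.
my %key      = (beta  => 'alpha');          # hypothetical separate hash
my %hashname = (alpha => 'flat value');     # ordinary one-level hash
my $other    = 'beta';
my $flat     = $hashname{ $key{$other} };   # same as $hashname{'alpha'}

print "row is a $row_type reference\n";
print "flat lookup: $flat\n";
```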
> Can somebody enlighten me on what that {$other} is all about? How
> does it work? Why does it work?
Start here.
perldoc perldsc
perldoc perlreftut
perldoc perlref
Come back if you have more questions... :-)
--
Darren Dunham ddunham@taos.com
Unix System Administrator Taos - The SysAdmin Company
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
------------------------------
Date: Tue, 07 Oct 2003 23:49:16 GMT
From: Michael Budash <mbudash@sonic.net>
Subject: Re: trying to understand a hash
Message-Id: <mbudash-1A1309.16491607102003@typhoon.sonic.net>
In article <a964da31.0310071516.5059d0d7@posting.google.com>,
jguad98@hotmail.com (John) wrote:
> I'm taking apart somebody else's perl script in order to (a) learn and
> (b) make something for my own purposes, and have come across a hash
> that is not written in the manner to which I've become accustomed ...
> the author is not within hollering distance, so I thought I'd try this
> list for assistance.
>
> Here's the hash as the author has created it:
>
> $hashname{$key}{$other} = value;
>
> where "$hashname" was initialized with "my %hashname", "$key" is a
> scalar derived from input, and "$other" is another scalar also derived
> from input.
>
> It looks to me that "other" is outside of the key and obviously isn't
> the value either. The script I found it in works as the author
> intended, so the structure is obviously legal, but I don't understand
> it. I was under the impression that a hash in scalar context should
> look like:
>
> $hashname{$key} = value;
> or
> %hashname(key => value);
> or
> %hashname("key","value");
> but not
> %hashname{$key}{$other} = value;
>
> I thought maybe it was intended to be some kind of index notation, but
> the actual value of that var is a string and not a digit, and if it
> was supposed to be a reference of some sort to make or indicate the
> key is another array (hash of hashes?), then I would expect the var
> "$other" to be inside the key's curly braces, not outside in their own
> braces (i.e. "$hashname{$key{$other}} = value;" as opposed to
> "$hashname{$key}{$other} = value;).
>
> Can somebody enlighten me on what that {$other} is all about? How
> does it work? Why does it work?
>
> best regards,
>
> John
you were sooooo close. $hashname{$key}{$other} does indeed indicate a
"hash of hashes", or more specifically, a "hash of hash references",
since a hash value can only contain scalars.
so, since $hashname{$key} contains a hash ref, you can refer to one of
that hashref's keys in either of two ways:
$hashname{$key}->{$hashrefkey}
or
$hashname{$key}{$hashrefkey}
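a small sketch of that, with made-up data (the keys and values are
hypothetical): ref() confirms the outer value is a hash reference, and
dereferencing it with %{ ... } lets you walk the inner keys.

```perl
use strict;
use warnings;

my %hashname;
$hashname{alpha}{beta}  = 'gamma';      # hypothetical data
$hashname{alpha}{delta} = 'epsilon';

# The outer value is a scalar holding a reference to the inner hash.
my $inner_type = ref $hashname{alpha};               # 'HASH'
my @inner_keys = sort keys %{ $hashname{alpha} };    # dereference to list keys

for my $key (sort keys %hashname) {
    for my $hashrefkey (sort keys %{ $hashname{$key} }) {
        print "$key / $hashrefkey = $hashname{$key}{$hashrefkey}\n";
    }
}
```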
hth-
--
Michael Budash
------------------------------
Date: 7 Oct 2003 17:28:42 -0700
From: jwillmore@cyberia.com (James Willmore)
Subject: Re: Virus, CPU killer, Memory Eater on Sourceforge.net?
Message-Id: <e0160815.0310071628.5f00818b@posting.google.com>
"Public Interest" <test@test.com> wrote in message news:<Ygngb.171373$3o3.12569897@bgtnsc05-news.ops.worldnet.att.net>...
> I am not sure it is my problem or a general problem. Last week, there was a
> hack on changing your DNS record on your PC via IE. The hackers use banners
> on forturncity, a free web hosting company to do that. It seems to me
> somebody hacked to sourceforge now. I went to
> http://sourceforge.net/projects/ayttm/ or
> http://sourceforge.net/projects/miranda-icq/
<snip>
And this has what to do with Perl? Oh, I get it, you wanted to make
everyone aware of, what, a new crack? Thanks, but I get my security
advisories from reliable sources, thank you very much :-)
Now that you mention it, did you notify SourceForge _before_ posting
whatever it was you posted? If not, this was a _very_ bad move on
your part. Shame on you.
And did you, maybe, scan your system to see if you contracted the
latest "Microsoft bug of the week"? Maybe the site wasn't cracked,
but your PC.
Jim
------------------------------
Date: Sat, 19 Jul 2003 01:59:56 GMT
From: Bob Walton <bwalton@rochester.rr.com>
Subject: Re:
Message-Id: <3F18A600.3040306@rochester.rr.com>
Ron wrote:
> Tried this code get a server 500 error.
>
> Anyone know what's wrong with it?
>
> if $DayName eq "Select a Day" or $RouteName eq "Select A Route") {
(---^
> dienice("Please use the back button on your browser to fill out the Day
> & Route fields.");
> }
...
> Ron
...
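A sketch of the quoted condition with the missing opening parenthesis
restored. The dienice() below is only a hypothetical stand-in, since
Ron's real subroutine is not shown, and the two variable values are
sample data:

```perl
use strict;
use warnings;

# Hypothetical stand-in for Ron's dienice(); the real one is not shown.
sub dienice {
    my ($msg) = @_;
    return "Error: $msg";
}

my $DayName   = "Select a Day";   # sample values for illustration
my $RouteName = "Route 7";

my $result = '';
# Note the opening parenthesis after "if", which the quoted line lacks.
if ($DayName eq "Select a Day" or $RouteName eq "Select A Route") {
    $result = dienice("Please use the back button on your browser to fill out the Day & Route fields.");
}
print "$result\n";
```

Running the script with perl -c from a shell will usually report this
kind of syntax error before the web server turns it into a 500.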
--
Bob Walton
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 5631
***************************************