[19608] in Perl-Users-Digest
Perl-Users Digest, Issue: 1803 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Sep 24 09:10:25 2001
Date: Mon, 24 Sep 2001 06:10:12 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <1001337012-v10-i1803@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Mon, 24 Sep 2001 Volume: 10 Number: 1803
Today's topics:
Segmentation fault <ant@tardis.ed.ac.uk>
Re: Segmentation fault <Thomas@Baetzler.de>
Re: Simple Hash and Random Problem <Graham.T.Wood@oracle.com>
Re: Simple Random Problem <Graham.T.Wood@oracle.com>
Slicing emptiness (Anno Siegel)
Re: Slicing emptiness <tinamue@zedat.fu-berlin.de>
Statistics for comp.lang.perl.misc <gbacon@cs.uah.edu>
Re: String comparison (Samppa)
threads <buchi.martin@web.de>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 24 Sep 2001 13:18:09 +0100
From: Anthony Clifford <ant@tardis.ed.ac.uk>
Subject: Segmentation fault
Message-Id: <Pine.GS4.4.33.0109241317370.13693-100000@omega.tardis.ed.ac.uk>
Hi,
I am parsing large html files (~10-50k) using s/// quite heavily.
The html files are database generated, so carriage returns are not
generally used much, leaving me with some very long lines to parse.
One I am looking at, at the moment, is 28000 characters. It is this line
that when parsed causes a segmentation fault. I've tried some ugly
workarounds (using Unix fold), but I still get the segmentation fault.
Actually, I have just tried without the =~ and got no seg fault, which led me
to try just a plain
$Info = m/match/;
$Info = $';
instead of
$Info =~ s/(\n|.)+match//;
This reduces the number of seg faults I am getting, but I still get one if
there are over 35000 chars on a line!
Can anyone help, or let me know exactly what the limitations are?
cheers
ant
------------------------------
Date: Mon, 24 Sep 2001 14:08:46 +0200
From: Thomas Bätzler <Thomas@Baetzler.de>
Subject: Re: Segmentation fault
Message-Id: <eg8uqtsbm183c7gvadusuk17d2pd4jhtnv@4ax.com>
On Mon, 24 Sep 2001, Anthony Clifford <ant@tardis.ed.ac.uk> wrote:
>I am parsing large html files (~10-50k) using s/// quite heavily.
>The html files are database generated, so carriage returns are not
>generally used much, leaving me with some very long lines to parse.
>
>One I am looking at, at the moment, is 28000 characters. It is this line
>that when parsed causes a segmentation fault. I've tried some ugly work
>abouts (using unix fold), but i still get the segmentation fault.
What do "perl -v" and "uname -a" say?
Could you post a bit more extensive code sample so that we could try
and reproduce the problem?
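For readers who hit the same crash: a pattern like (\n|.)+ can make older regex engines recurse once per character, which is a plausible cause of the fault on very long lines, though the thread does not confirm it. The following is only a sketch of one way to avoid the regex altogether, assuming the goal is to drop everything up to and including the last occurrence of a fixed string. The helper name strip_through_last is hypothetical and not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove everything up to
# and including the LAST occurrence of $marker, which is what the greedy
# s/(\n|.)+match// does, but using rindex/substr instead of the regex engine.
sub strip_through_last {
    my ($text, $marker) = @_;
    my $pos = rindex($text, $marker);
    # The regex needs at least one character before the marker, so a marker
    # that is missing or only at position 0 leaves the text unchanged.
    return $text if $pos < 1;
    return substr($text, $pos + length $marker);
}

# A 100,000-character "line" with the marker near the end.
my $Info = ('x' x 100_000) . 'match' . 'tail';
$Info = strip_through_last($Info, 'match');
print "$Info\n";
```

Because no backtracking is involved, the length of the line should not matter here.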
HTH,
--
use strict;my($i,$t,@r)=(0,'5 -.@BHJPT4acd6e2hk2lmn2o4r2s3tuz',map{ord}
split//,unpack('u*','L#`T&)QD5#0`#!!`#%1D)#08`#P05!!(3``$$"``#"0L&``('.
'"`P<!`````0$`'));$t=~s/(\d)(.)/$2x$1/eg;map{$t.=substr$t,$i,1,''while
$_--;$i++}@r;print"$t\n";# Thomas@Baetzler.de - http://baetzler.de/perl
------------------------------
Date: Mon, 24 Sep 2001 12:33:36 +0100
From: Graham Wood <Graham.T.Wood@oracle.com>
Subject: Re: Simple Hash and Random Problem
Message-Id: <3BAF1A10.B027E926@oracle.com>
BUCK NAKED1 wrote:
> Why does the *value* of "3" never print in this script I wrote?
>
> #!/usr/bin/perl -wd
> use CGI::Carp qw(fatalsToBrowser);
> use strict;
>
> my %text = (1, "<font color='blue' size='1'>Random Text 1</font>",
> 2, "<font color='red' size='2'>Random Text 2</font>",
> 3, "<font color='006600' size='4'><strong>Rand 3</strong></font>");
>
> my $rand = int(rand(values(%text)));
Because int removes the fractional part of the real number returned by
rand, which is itself at least 0 and strictly less than the value of the
expression values(%text) (3 in scalar context). You will never get a value
of 3 that way, only 0, 1 or 2. int doesn't round, it just truncates. Add 0.5
then call int if you want to round it. Better still, just add 1 to $rand
when accessing the text hash; then you will avoid getting 0 too.
>
> my $random_text = $text{$rand};
# change this line to $random_text = $text{$rand + 1};
>
> print "Content-type: text/html\n\n";
> print <<EOL;
> <HTML>
> <HEAD>
> <TITLE>Random Text Assoc Array</TITLE>
> </HEAD>
> <BODY>
> $random_text<BR>
> </BODY>
> </HTML>
> EOL
>
> the Best to All!
> --Dennis
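The truncation point above can be sketched as follows. This is a minimal illustration, assuming the hash keys run from 1 to the number of entries as in the quoted script. The helper name random_text and the shortened values are made up for the example.

```perl
use strict;
use warnings;

# Shortened stand-in for the hash in the quoted script (keys 1..3).
my %text = (1 => 'Random Text 1', 2 => 'Random Text 2', 3 => 'Rand 3');

# int() truncates toward zero; it never rounds up.
my $truncated = int(2.999);          # 2
my $rounded   = int(2.999 + 0.5);    # 3

# Hypothetical helper: rand($count) is >= 0 and < $count, so int() yields
# 0 .. $count - 1; adding 1 shifts that onto the keys 1 .. $count.
sub random_text {
    my $count = keys %text;          # number of pairs in scalar context
    return $text{ int(rand($count)) + 1 };
}

print "$truncated $rounded\n";
print random_text(), "\n";
```

Rounding with + 0.5 would still produce 0 now and then, which is why shifting by 1 is the simpler fix for this hash.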
------------------------------
Date: Mon, 24 Sep 2001 13:22:51 +0100
From: Graham Wood <Graham.T.Wood@oracle.com>
Subject: Re: Simple Random Problem
Message-Id: <3BAF259B.CDBBB02E@oracle.com>
BUCK NAKED1 wrote:
> Thanks. I know I didn't need a hash for that; but I'd never written an
> associative array, and needed the practice. I had written it previously
> as an array(below); and it worked great. I wonder why the array script
> below didn't need a +1 added to int? or maybe it did???
Your array indices are 0 through 4, and @text returns 5 in scalar context.
int(rand(@text)) picks 0, 1, 2, 3 or 4, which are exactly the valid indices,
so no +1 is needed.
Graham Wood
>
>
> #!/usr/bin/perl -wT
> use CGI::Carp qw(fatalsToBrowser);
> use strict;
> print "Content-type: text/html\n\n";
> print "<HTML><HEAD><TITLE>Random Text Script</TITLE>
> </HEAD><BODY>";
> my @text = (
> "<font color=blue size=1>Random Text 1</font>",
> "<font color= red size=2>Random Text 2</font>",
> "<font color=brown size=4><strong>Random Text 3</strong></font>", "<font
> color= black size=6><em>Random 4</em></font>",
> "<font color=green size=2>Random Text 5</font>");
> my $rand = int(rand(@text));
> my $random_text = $text[$rand];
> print "$random_text<BR>";
> print "</BODY></HTML>";
>
> Best to all!
> --Dennis
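A quick way to check the index range yourself is to tally what int(rand(@text)) produces. This is only a sketch with placeholder strings standing in for the five entries of the quoted script.

```perl
use strict;
use warnings;

# Five placeholder entries, like the array in the quoted script.
my @text = map { "Random Text $_" } 1 .. 5;

# Tally which indices int(rand(@text)) produces over many draws.
my %seen;
for (1 .. 10_000) {
    my $index = int(rand(@text));    # @text in scalar context is 5
    $seen{$index}++;
}

my @indices = sort { $a <=> $b } keys %seen;
print "indices drawn: @indices\n";
print "last valid index: $#text\n";
```

Every index drawn should fall between 0 and $#text, so $text[$index] is always defined.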
------------------------------
Date: 24 Sep 2001 10:19:46 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Slicing emptiness
Message-Id: <9on1c2$j9n$1@mamenchi.zrz.TU-Berlin.DE>
Slicing an empty array, as in
# a)
my @x;
my @y = @x[ 1, 2, 3];
print scalar @y, "\n";
yields a list of so many undef's (it prints 3). Doing the same with an
empty list
# b)
@y = ()[ 1, 2, 3];
print scalar @y, "\n";
leaves @y empty (prints 0).
When the list contains at least one element, the behavior changes
# c)
@y = ( 'wawa')[ 1, 2, 3];
print scalar @y, "\n";
prints 3 again.
I think the behavior in b) is a bug.
Anno
------------------------------
Date: 24 Sep 2001 10:46:00 GMT
From: Tina Mueller <tinamue@zedat.fu-berlin.de>
Subject: Re: Slicing emptiness
Message-Id: <9on2t8$duhac$1@fu-berlin.de>
Anno Siegel <anno4000@lublin.zrz.tu-berlin.de> wrote:
i am getting different results from yours with all versions below:
- v5.6.0 built for i586-linux
- v5.7.2 built for sun4-solaris-thread-multi-64int,
- 5.005_03 built for sun4-solaris
> Slicing an empry array, as in
> # a)
> my @x;
> my @y = @x[ 1, 2, 3];
> print scalar @y, "\n";
> yields a list of so many undef's (it prints 3).
same for me.
> Doing the same with an
> empty list
> # b)
> @y = ()[ 1, 2, 3];
> print scalar @y, "\n";
> leaves @y empty (prints 0).
same for me.
> When the list contains at least one element, the behavior changes
> # c)
> @y = ( 'wawa')[ 1, 2, 3];
> print scalar @y, "\n";
> prints 3 again.
prints 0
but:
# d)
@y = ("wawa")[0, 1, 2, 3];
print scalar @y, "\n";
prints 4!
> I think the behavior in b) is a bug.
which version are you using?
regards,
tina
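The four cases from this thread can be put side by side in one script. This is a sketch for comparison only. Cases a), b) and d) should give 3, 0 and 4 on any perl. Case c) appears to depend on the version: as far as I know, list slices whose indices are all out of range returned an empty list before perl 5.22 and return undefs from 5.22 on, so no output is claimed for it here.

```perl
use strict;
use warnings;

my @x;                                    # empty array

my @from_array = @x[1, 2, 3];             # a) array slice: one undef per index
my @from_empty = ()[1, 2, 3];             # b) slice of an empty list: stays empty
my @out_range  = ('wawa')[1, 2, 3];       # c) all indices out of range
my @in_range   = ('wawa')[0, 1, 2, 3];    # d) index 0 is in range

printf "a) %d\n", scalar @from_array;
printf "b) %d\n", scalar @from_empty;
printf "c) %d (perl %vd)\n", scalar @out_range, $^V;
printf "d) %d\n", scalar @in_range;
```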
--
http://www.tinita.de \ enter__| |__the___ _ _ ___
tina's moviedatabase \ / _` / _ \/ _ \ '_(_-< of
search & add comments \ \ _,_\ __/\ __/_| /__/ perception
------------------------------
Date: Mon, 24 Sep 2001 12:46:43 -0000
From: Greg Bacon <gbacon@cs.uah.edu>
Subject: Statistics for comp.lang.perl.misc
Message-Id: <tquapjjjgg5vcc@corp.supernews.com>
Following is a summary of articles spanning a 7 day period,
beginning at 17 Sep 2001 13:50:16 GMT and ending at
24 Sep 2001 13:28:58 GMT.
Notes
=====
- A line in the body of a post is considered to be original if it
does *not* match the regular expression /^\s{0,3}(?:>|:|\S+>|\+\+)/.
- All text after the last cut line (/^-- $/) in the body is
considered to be the author's signature.
- The scanner prefers the Reply-To: header over the From: header
in determining the "real" email address and name.
- Original Content Rating (OCR) is the ratio of the original content
volume to the total body volume.
- Find the News-Scan distribution on the CPAN!
<URL:http://www.perl.com/CPAN/modules/by-module/News/>
- Please send all comments to Greg Bacon <gbacon@cs.uah.edu>.
- Copyright (c) 2001 Greg Bacon.
Verbatim copying and redistribution is permitted without royalty;
alteration is not permitted. Redistribution and/or use for any
commercial purpose is prohibited.
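To make the notes above concrete, here is a rough sketch of how an Original Content Rating could be computed for a single message body using the two patterns quoted in the notes. It is not the News-Scan code. The helper name original_content_rating is hypothetical. Counting volume as line length plus one newline byte, and counting the signature as body but not as original, are assumptions of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the News-Scan implementation.
# Assumptions: volume is line length plus one newline byte; the signature
# counts toward the body volume but not toward original content.
sub original_content_rating {
    my ($body) = @_;
    my @lines = split /\n/, $body;

    my $total = 0;
    $total += length($_) + 1 for @lines;
    return 0 unless $total;

    # The signature starts at the last cut line ("-- ").
    my $sig_start = @lines;
    for my $i (reverse 0 .. $#lines) {
        if ($lines[$i] =~ /^-- $/) { $sig_start = $i; last }
    }

    my $original = 0;
    for my $line (@lines[0 .. $sig_start - 1]) {
        # Quoted lines match the pattern from the notes and are skipped.
        next if $line =~ /^\s{0,3}(?:>|:|\S+>|\+\+)/;
        $original += length($line) + 1;
    }
    return $original / $total;
}

printf "%.3f\n", original_content_rating("> quoted\nmine\n");
```

The same ratio applied to the totals in this report (965.1 kb original over 1304.3 kb of bodies) gives the 0.740 shown under Totals.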
Excluded Posters
================
perlfaq-suggestions\@(?:.*\.)?perl\.com
faq\@(?:.*\.)?denver\.pm\.org
Totals
======
Posters: 322
Articles: 983 (423 with cutlined signatures)
Threads: 255
Volume generated: 2155.8 kb
    - headers:    795.5 kb (15,947 lines)
    - bodies:     1304.3 kb (35,965 lines)
    - original:   965.1 kb (26,451 lines)
    - signatures: 55.0 kb (1,343 lines)
Original Content Rating: 0.740
Averages
========
Posts per poster: 3.1
    median: 1.0 post
    mode:   1 post - 165 posters
    s:      4.7 posts
Posts per thread: 3.9
    median: 3 posts
    mode:   1 post - 64 threads
    s:      3.6 posts
Message size: 2245.7 bytes
    - header:    828.7 bytes (16.2 lines)
    - body:      1358.7 bytes (36.6 lines)
    - original:  1005.3 bytes (26.9 lines)
    - signature: 57.3 bytes (1.4 lines)
Top 10 Posters by Number of Posts
=================================
(kb) (kb) (kb) (kb)
Posts Volume ( hdr/ body/ orig) Address
----- -------------------------- -------
56 84.2 ( 49.3/ 34.5/ 21.1) Bart Lateur <bart.lateur@skynet.be>
27 68.8 ( 23.6/ 40.0/ 23.9) mgjv@tradingpost.com.au
20 33.5 ( 12.5/ 15.4/ 7.2) Randal L. Schwartz <merlyn@stonehenge.com>
19 33.3 ( 15.0/ 18.3/ 6.9) Anno Siegel <anno4000@lublin.zrz.tu-berlin.de>
18 34.7 ( 15.7/ 14.0/ 8.7) Thomas Bätzler <Thomas@Baetzler.de>
16 25.1 ( 12.4/ 11.5/ 7.2) nobull@mail.com
15 34.7 ( 11.8/ 22.1/ 11.3) Benjamin Goldberg <goldbb2@earthlink.net>
15 30.9 ( 14.3/ 16.5/ 8.5) Bob Walton <bwalton@rochester.rr.com>
14 433.1 ( 10.6/421.5/412.6) Stan Brown <stanb@panix.com>
14 25.3 ( 14.0/ 11.3/ 8.3) Dave Tweed <dtweed@acm.org>
These posters accounted for 21.8% of all articles.
Top 10 Posters by Volume
========================
(kb) (kb) (kb) (kb)
Volume ( hdr/ body/ orig) Posts Address
-------------------------- ----- -------
433.1 ( 10.6/421.5/412.6) 14 Stan Brown <stanb@panix.com>
84.2 ( 49.3/ 34.5/ 21.1) 56 Bart Lateur <bart.lateur@skynet.be>
68.8 ( 23.6/ 40.0/ 23.9) 27 mgjv@tradingpost.com.au
39.2 ( 12.2/ 25.6/ 21.4) 11 tadmc@augustmail.com
34.7 ( 15.7/ 14.0/ 8.7) 18 Thomas Bätzler <Thomas@Baetzler.de>
34.7 ( 11.8/ 22.1/ 11.3) 15 Benjamin Goldberg <goldbb2@earthlink.net>
33.5 ( 12.5/ 15.4/ 7.2) 20 Randal L. Schwartz <merlyn@stonehenge.com>
33.3 ( 15.0/ 18.3/ 6.9) 19 Anno Siegel <anno4000@lublin.zrz.tu-berlin.de>
30.9 ( 14.3/ 16.5/ 8.5) 15 Bob Walton <bwalton@rochester.rr.com>
28.4 ( 8.7/ 18.4/ 8.7) 11 timmy@cpan.org
These posters accounted for 38.1% of the total volume.
Top 10 Posters by OCR (minimum of five posts)
==============================================
(kb) (kb)
OCR orig / body Posts Address
----- -------------- ----- -------
0.979 (412.6 /421.5) 14 Stan Brown <stanb@panix.com>
0.847 ( 5.6 / 6.6) 5 Rory <rory@campbell-lange.net>
0.837 ( 21.4 / 25.6) 11 tadmc@augustmail.com
0.816 ( 8.8 / 10.8) 10 Mark Jason Dominus <mjd@plover.com>
0.755 ( 6.1 / 8.1) 5 peter pilsl <pilsl_@goldfisch.at>
0.740 ( 8.3 / 11.3) 14 Dave Tweed <dtweed@acm.org>
0.723 ( 5.3 / 7.3) 9 "Philip 'Yes, that's my address' Newton" <nospam.newton@gmx.li>
0.718 ( 4.1 / 5.8) 11 * Tong * <sun_tong@users.sourceforge.net>
0.704 ( 4.3 / 6.0) 8 "Matt Garrish" <matthew.garrish@sympatico.ca>
0.695 ( 3.1 / 4.5) 6 Jeff Zucker <jeff@vpservices.com>
Bottom 10 Posters by OCR (minimum of five posts)
=================================================
(kb) (kb)
OCR orig / body Posts Address
----- -------------- ----- -------
0.442 ( 1.5 / 3.4) 7 "Tintin" <tintin@snowy.calculus>
0.440 ( 2.3 / 5.1) 6 "B. Caligari" <bcaligari@fireforged.com>
0.416 ( 2.8 / 6.6) 9 Chris Fedde <cfedde@fedde.littleton.co.us>
0.410 ( 1.3 / 3.2) 11 Laocoon <Laocoon@eudoramail.com>
0.409 ( 2.9 / 7.0) 11 "Rob - Rock13.com" <rob_13@excite.com>
0.384 ( 2.6 / 6.7) 6 "GuaRDiaN" <guardian@chello.be>
0.383 ( 2.4 / 6.3) 8 Uri Guttman <uri@sysarch.com>
0.376 ( 6.9 / 18.3) 19 Anno Siegel <anno4000@lublin.zrz.tu-berlin.de>
0.366 ( 1.7 / 4.5) 5 news@tinita.de
0.323 ( 4.3 / 13.4) 12 Joe Chung <m_010@yahoo.com>
51 posters (15%) had at least five posts.
Top 10 Threads by Number of Posts
=================================
Posts Subject
----- -------
21 Best way to hide the perl source code..
17 Schwartzian Transform problem
15 split SuchKindOfWord into individual words
12 What does this do ? "select( (select($writer), $|=1)[0] );" ?
12 extra newline in s command
12 Perl English only??????
12 Hash problem...
11 example for rewinddir
11 win32 stat in directory with 4682 files
11 Objects and Sockets
These threads accounted for 13.6% of all articles.
Top 10 Threads by Volume
========================
(kb) (kb) (kb) (kb)
Volume ( hdr/ body/ orig) Posts Subject
-------------------------- ----- -------
413.5 ( 10.7/401.2/395.9) 12 What does this do ? "select( (select($writer), $|=1)[0] );" ?
32.8 ( 17.0/ 14.9/ 9.8) 21 Best way to hide the perl source code..
32.7 ( 10.1/ 22.6/ 10.1) 11 win32 stat in directory with 4682 files
28.5 ( 16.3/ 11.4/ 6.4) 17 Schwartzian Transform problem
26.8 ( 9.9/ 16.6/ 6.9) 12 extra newline in s command
24.8 ( 6.8/ 17.3/ 7.6) 9 perl module
23.6 ( 10.5/ 12.0/ 4.7) 12 Hash problem...
21.1 ( 4.5/ 15.8/ 10.7) 6 Perhaps some help for a newbie - uninitialized value...
20.7 ( 12.3/ 7.1/ 3.4) 15 split SuchKindOfWord into individual words
19.8 ( 8.9/ 10.5/ 6.1) 11 example for rewinddir
These threads accounted for 29.9% of the total volume.
Top 10 Threads by OCR (minimum of five posts)
==============================================
(kb) (kb)
OCR orig / body Posts Subject
----- -------------- ----- -------
0.987 (395.9/ 401.2) 12 What does this do ? "select( (select($writer), $|=1)[0] );" ?
0.905 ( 8.5/ 9.4) 5 Unusual error message...
0.789 ( 3.7/ 4.6) 5 Use symbol table like 'real' hash
0.775 ( 2.1/ 2.7) 6 Perl Compiler
0.764 ( 6.2/ 8.1) 8 copy STDERR to file
0.753 ( 3.8/ 5.1) 5 File::Find::name problem...
0.744 ( 4.6/ 6.2) 12 Perl English only??????
0.738 ( 3.1/ 4.2) 5 faster execution
0.720 ( 3.6/ 5.0) 5 HELP! Having Trouble concatenating or joining and making or building a url with ActivePerl and the '.' operator.
0.719 ( 3.8/ 5.2) 5 HTML-parsing prob - need regexpression help
Bottom 10 Threads by OCR (minimum of five posts)
=================================================
(kb) (kb)
OCR orig / body Posts Subject
----- -------------- ----- -------
0.449 ( 10.1 / 22.6) 11 win32 stat in directory with 4682 files
0.445 ( 2.5 / 5.5) 7 Help with reading web page from socket.
0.441 ( 7.6 / 17.3) 9 perl module
0.436 ( 1.8 / 4.2) 5 Matching Strings Help Needed
0.417 ( 6.9 / 16.6) 12 extra newline in s command
0.416 ( 2.5 / 6.1) 7 Persistence in Perl
0.414 ( 2.4 / 5.8) 9 write to a file handle
0.389 ( 4.7 / 12.0) 12 Hash problem...
0.369 ( 2.8 / 7.5) 10 Creating a file
0.297 ( 3.5 / 11.9) 7 Newbie problem
80 threads (31%) had at least five posts.
Top 10 Targets for Crossposts
=============================
Articles Newsgroup
-------- ---------
25 alt.perl
13 comp.lang.perl
6 comp.lang.perl.modules
3 comp.mail.mutt
2 comp.lang.perl.moderated
2 alt.perl.sockets
2 comp.unix.questions
2 comp.unix.programmer
1 comp.infosystems.search
1 alt.comp.perlcgi.freelance
Top 10 Crossposters
===================
Articles Address
-------- -------
6 "Matt Garrish" <matthew.garrish@sympatico.ca>
4 Randal L. Schwartz <merlyn@stonehenge.com>
4 "Jay Flaherty" <fty@mediapulse.com>
4 John Hoarty <jhoarty@quickestore.com>
3 Bart Lateur <bart.lateur@skynet.be>
3 Kelly and Sandy <junk@almide.demon.co.uk>
3 Michael Carman <mjcarman@home.com>
2 Bikesh Patel <bikesh@my-deja.com>
2 Andrew Cady <please@no.spam>
2 peter <peter_icaza@REMOVE2REPLYuhc.com>
------------------------------
Date: 24 Sep 2001 05:40:53 -0700
From: sami@xenetic.fi (Samppa)
Subject: Re: String comparison
Message-Id: <30586a1d.0109240440.218d7cdc@posting.google.com>
> On 21 Sep 2001, sami@xenetic.fi wrote:
>
> > #Step 1
> > if ($line =~ /$string1/) {
I had to keep in mind that the string in the scalar $string1 is
treated as a regular expression by Perl.
The index function and the \Q escape both work fine.
Thanks to everybody.
Sami
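For anyone landing on this thread later, the difference can be sketched like this. The sample strings are made up for illustration. The point is that an interpolated string is read as a pattern unless it is quoted with \Q, while index compares literally.

```perl
use strict;
use warnings;

# Made-up sample data: the search string contains regex metacharacters.
my $line    = 'total: 3.50 (approx)';
my $string1 = '3.50 (approx)';

# Interpolated as-is, the parentheses become a capture group, so the
# literal "(" in $line is never matched.
my $as_regex  = $line =~ /$string1/          ? 1 : 0;   # 0

# \Q ... \E escapes the metacharacters, giving a literal match.
my $quoted    = $line =~ /\Q$string1\E/      ? 1 : 0;   # 1

# index() does a plain substring search with no regex involved.
my $via_index = index($line, $string1) >= 0  ? 1 : 0;   # 1

print "regex: $as_regex, quoted: $quoted, index: $via_index\n";
```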
------------------------------
Date: 24 Sep 2001 13:06:31 +0200
From: Buchleitner Martin <buchi.martin@web.de>
Subject: threads
Message-Id: <3baf13ba@netnews.web.de>
Hi !
I am reading data from a web server,
but I also want to be able to send data.
The reading process is a never-ending process,
so I have to use threads.
I have a sub called readData( $readingURL ) and a
sub called sendData( $sendingURL, $message ).
Now I want to use threads to handle my problem.
The user should be able to send data to the server
while the program is still reading from the server.
How may I do this?
I read perldoc perlthrtut, but I am not sure
what I should do now.
martin
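One possible shape for this, as a sketch only: it assumes a perl built with thread support and the threads and Thread::Queue modules that ship with Perl 5.8 and later, which is not the older Thread API a 2001 installation would likely have had. The bodies of readData and sendData are stubs, the example.invalid URLs are placeholders, and the queue is an addition of this sketch rather than something from the post.

```perl
use strict;
use warnings;
use threads;          # assumes a perl built with thread support (5.8+)
use Thread::Queue;

# Queue carrying data from the reader thread back to the main thread.
my $incoming = Thread::Queue->new;

# Stub for the poster's reader. The real loop would never end; this one
# produces three items and then a sentinel so the example terminates.
sub readData {
    my ($readingURL) = @_;
    $incoming->enqueue("chunk $_ from $readingURL") for 1 .. 3;
    $incoming->enqueue(undef);       # sentinel: no more data
}

# Stub for the poster's sender; the real one would contact the server.
sub sendData {
    my ($sendingURL, $message) = @_;
    return "sent '$message' to $sendingURL";
}

# The reader runs in its own thread ...
my $reader = threads->create(\&readData, 'http://example.invalid/read');

# ... so the main thread stays free to send while reading continues.
my $status = sendData('http://example.invalid/send', 'hello');

my @received;
while (defined(my $item = $incoming->dequeue)) {
    push @received, $item;
}
$reader->join;

print "$status\n";
print "$_\n" for @received;
```

Only the reader needs its own thread in this layout. Sending happens in the main thread, which avoids sharing any state beyond the queue.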
--
__________________________________________________________
News suchen, lesen, schreiben mit http://newsgroups.web.de
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 1803
***************************************