[28236] in Perl-Users-Digest


Perl-Users Digest, Issue: 9600 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Aug 14 03:05:55 2006

Date: Mon, 14 Aug 2006 00:05:07 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 14 Aug 2006     Volume: 10 Number: 9600

Today's topics:
    Re:     LWP::UserAgent question--MULTIPOSTED <jurgenex@hotmail.com>
    Re:     LWP::UserAgent question--MULTIPOSTED <john@castleamber.com>
    Re:     LWP::UserAgent question--MULTIPOSTED <rvtol+news@isolution.nl>
    Re: Beginner: read and print same file <someone@example.com>
    Re: creating a datastructure from lists <someone@example.com>
    Re: is it possible to efficiently read a large file? xhoster@gmail.com
    Re: is it possible to efficiently read a large file? <benmorrow@tiscali.co.uk>
    Re: is it possible to efficiently read a large file? <someone@example.com>
    Re: LWP::UserAgent question--MULTIPOSTED <john@castleamber.com>
    Re: LWP::UserAgent question--MULTIPOSTED <tadmc@augustmail.com>
    Re: LWP::UserAgent question--MULTIPOSTED <john@castleamber.com>
    Re: LWP::UserAgent question--MULTIPOSTED <DJStunks@gmail.com>
    Re: LWP::UserAgent question--MULTIPOSTED <john@castleamber.com>
    Re: LWP::UserAgent question--MULTIPOSTED usenet@DavidFilmer.com
    Re: LWP::UserAgent question--MULTIPOSTED <someone@example.com>
    Re: LWP::UserAgent question <john@castleamber.com>
        new CPAN modules on Mon Aug 14 2006 (Randal Schwartz)
    Re: PerlDoc used in CPAN?--MULTIPOSTED usenet@DavidFilmer.com
    Re: Proposal: extending perldoc -f <simon@unisolve.com.au>
    Re: RegEx question: Exclude characters from group <xicheng@gmail.com>
    Re: system command won't let go <tadmc@augustmail.com>
    Re: The assignment of command output to an array hangs. <kaleem177@gmail.com>
        Use of named groups in VC++ or C++ code shilpi.rustagi@beesys.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 14 Aug 2006 01:30:24 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re:     LWP::UserAgent question--MULTIPOSTED
Message-Id: <QmQDg.69175$MW.54452@trnddc04>

usenet.cop@3955291010.com wrote:
> "a" <a@mail.com> wrote:
>>> [ snip and ignore MULTIPOSTED message ]

I think this multiposting bot is a great idea.
However, I would suggest toning it down a bit: make the text
informative instead of accusatory, remove those heavy borders, etc.
A 3 to 4 line message with a link to more information is really all that is
needed. No need for KB after KB of heavy shots.

jue 




------------------------------

Date: 14 Aug 2006 02:56:31 GMT
From: John Bokma <john@castleamber.com>
Subject: Re:     LWP::UserAgent question--MULTIPOSTED
Message-Id: <Xns981EDF32AAD93castleamber@130.133.1.4>

"Jürgen Exner" <jurgenex@hotmail.com> wrote:

> usenet.cop@3955291010.com wrote:
>> "a" <a@mail.com> wrote:
>>>> [ snip and ignore MULTIPOSTED message ]
> 
> I think this multiposting bot is a great idea.

Me no, and I have reported it for what I see it is right now: Usenet 
abuse.

-- 
John Bokma          Freelance software developer
                                &
                    Experienced Perl programmer: http://castleamber.com/


------------------------------

Date: Mon, 14 Aug 2006 05:14:26 +0200
From: "Dr.Ruud" <rvtol+news@isolution.nl>
Subject: Re:     LWP::UserAgent question--MULTIPOSTED
Message-Id: <ebp0s2.168.1@news.isolution.nl>

usenet.cop@3955291010.com schreef:
> "a" <a@mail.com> wrote:

>> [ snip and ignore MULTIPOSTED message ]
>
> **********************************************************************
> **********   PLEASE  DO  NOT  RESPOND  TO  THIS  THREAD    ***********
> **********************************************************************
>
> This message has been multiposted as indicated by these message IDs:
>    <JoPDg.386093$Mn5.194189@pd7tw3no>
>    <GbPDg.382930$iF6.158066@pd7tw2no>


  s/</<news:/g

and you should put them both in the References: header field (without
the "news:" of course).
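
A small sketch of what that substitution does to the two quoted
Message-IDs, assuming they are held in a plain string. The variable names
are illustrative and not taken from the bot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The two Message-IDs quoted above; single quotes keep $ and @ literal.
my $ids = join "\n",
    '<JoPDg.386093$Mn5.194189@pd7tw3no>',
    '<GbPDg.382930$iF6.158066@pd7tw2no>';

# Copy the string, then prefix every "<" with the news: scheme so that
# a reader which understands news: URLs can treat the IDs as links.
( my $linked = $ids ) =~ s/</<news:/g;

print "$linked\n";
```

For the References: header the unmodified $ids form would be used, as
noted above.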

-- 
Affijn, Ruud

"Gewoon is een tijger."




------------------------------

Date: Mon, 14 Aug 2006 02:35:06 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Beginner: read and print same file
Message-Id: <ujRDg.2266$Ch.1703@clgrps13>

Marek Stepanek wrote:
> 
> I have to bother you once again. I have a script with __DATA__ which will be
> transformed later into a LaTeX Table. Just now I am stuck to insert these
> DATA after a keyword "% begin", but I am not finding the solution.
> 
> 
> My LaTeX-Table "calc_hours_table02.tex" looks like follows:
> 
> \begin{longtable}[c]{| c | c | c | c | c | c | c | c |}
> 
>   \hline
> %  \caption{table title}\\
>   \multicolumn{8}{| c |}{\textbf{TITLE}} \\
>   \hline
>    & & & \multicolumn{5}{ c |}{\textbf{title}} \\
>   \hline
>   \textbf{Datum} & \textbf{Umsatz 7\%} & \textbf{Umsatz 16\%} &
> \textbf{Beginn} & \textbf{Pause} & \textbf{Ende} & \textbf{Stunden} &
> \textbf{Km-Gesamt} \\
>   \hline
> % begin <--- and here my script should insert the transformed data ...
> 
>   \hline
>   \hline
> 
> \end
> 
> 
> 
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> ####global variables########
> my (@lines);
> my ($date);
> ####END global variables####
> 
> 
> #### Files #################
> my $out_file = "calc_hours_table02.tex";
> open OUT, "$out_file" or die "Error! $!\n;";
> ####END Files ##############
> 
> 
> ####Read in and work########
> while (<DATA>)
>   {
>     chomp;
>       next if (/^$/);
> 
>       if (/^(\w+, [\d\.]+)/)
>         {
>           $date = $1;
>         }
>         
>     if (/^TOTAL/)
>       {
>         s/TOTAL/"$date"/e;
>         push (@lines, $_);
>       }
>   }
> ####END Read in and work####
> 
> ####Output##################
> while (<OUT>)
>   {
>     if (/^(% begin)$/)
>       {
>          s/(% begin)/"$1\n" . join ("\n",@lines)/e; ###And here my problem!
> #        print OUT "$1\n";
> #        print OUT join ("\n",@lines);
> #        print OUT "\n";
>       }
>   }
> ####END Output##############
> 
> ######Data to read in#######
> 
> __DATA__
> 
> Son, 16.07.2006    37086.40    15445.00    808    19.50    3156.30
> Mon, 17.07.2006    37667.00    15769.20    817    19.50    3621.00
> TOTAL    580.60    324.20    9    0.00    464.70
>                    
> Mon, 17.07.2006    37667.00    15769.20    817    19.50    3621.00
> Die, 18.07.2006    37936.50    15929.60    823    19.50    3857.80
> TOTAL    269.50    160.40    6    0.00    236.80

[snip]

After reading the posts on this thread it looks like you want something like:


#!/usr/bin/perl
use strict;
use warnings;

#### Files #################
my $out_file = 'calc_hours_table02.tex';
####END Files ##############

####Read in and work########
my ( $date, $lines );
while ( <DATA> ) {
    /^(\w+, +[\d.]+)/ and $date = $1;
    s/^TOTAL// and $lines .= "$date$_";
    }
####END Read in and work####


####Output##################
( $^I, @ARGV ) = ( '', $out_file );
while ( <> ) {
    s/^(% begin\n)/$1$lines/;
    print;
    }
####END Output##############

######Data to read in#######

__DATA__

Son, 16.07.2006    37086.40    15445.00    808    19.50    3156.30
Mon, 17.07.2006    37667.00    15769.20    817    19.50    3621.00
TOTAL    580.60    324.20    9    0.00    464.70

Mon, 17.07.2006    37667.00    15769.20    817    19.50    3621.00
Die, 18.07.2006    37936.50    15929.60    823    19.50    3857.80
TOTAL    269.50    160.40    6    0.00    236.80

[snip]
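
A self-contained sketch of the in-place edit used in the Output section,
run against a throwaway temporary file instead of calc_hours_table02.tex.
The file contents and the rows are made up for illustration. It assumes a
Unix-like system, where an empty $^I (no backup copy) is accepted; other
platforms may need a backup suffix such as '.bak'.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the .tex file: a temp file with a marker line.
my ( $tmp_fh, $tex_file ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "\\hline\n% begin\n\\hline\n";
close $tmp_fh or die "Cannot close '$tex_file': $!";

# Rows to insert after the marker (made-up sample data).
my $rows = "row one\nrow two\n";

{
    # Setting $^I switches <> to in-place editing: while the file named
    # in @ARGV is being read, print writes back into that same file.
    local ( $^I, @ARGV ) = ( '', $tex_file );
    while ( <> ) {
        s/^(% begin\n)/$1$rows/;
        print;
    }
}

# Read the edited file back to show the result.
open my $in, '<', $tex_file or die "Cannot read '$tex_file': $!";
my $result = do { local $/; <$in> };
close $in;
print $result;
```

Localising $^I and @ARGV in a block keeps the in-place mode from leaking
into any later use of <> in the same program.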



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Mon, 14 Aug 2006 02:42:31 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: creating a datastructure from lists
Message-Id: <rqRDg.4090$Nz6.3988@edtnps82>

sal.x.lopez@gmail.com wrote:
> I need to convert the following lists:
> 
> house,doors,knobs,style
> house,doors,knobs,color
> house,windows,length
> house,windows,width
> 
> into a datastructure like this;
> 
> $ds = {
>     'house' => {
>         'doors' => {
>             'knobs' => {
>                'style' => "",
>                'color' => ""
>              }
>         },
>          'windows' => {
>              'length' => "",
>              'width' => ""
>          }
>     }
> };
> 
> Does this require the use of recursive calls? Thanks in advance.

See this thread posted a few days ago:

http://groups.google.com/group/comp.lang.perl.misc/browse_frm/thread/6d2424960f2b038e/c6d0263039cab4bf?#c6d0263039cab4bf
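
For reference, a minimal sketch that builds such a structure with a plain
loop and no recursion. It is not taken from the linked thread; it assumes
each input line is a comma-separated path whose last element becomes a
leaf holding an empty string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my @lists = (
    'house,doors,knobs,style',
    'house,doors,knobs,color',
    'house,windows,length',
    'house,windows,width',
);

my $ds = {};
for my $list (@lists) {
    my @keys = split /,/, $list;
    my $leaf = pop @keys;

    # Walk down from the root, creating each level on first use.
    my $node = $ds;
    $node = $node->{$_} ||= {} for @keys;

    $node->{$leaf} = '';
}

$Data::Dumper::Sortkeys = 1;
print Dumper($ds);
```

The walk keeps a reference to the current level in $node, so each path
only touches the hashes along its own branch.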



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 14 Aug 2006 01:52:59 GMT
From: xhoster@gmail.com
Subject: Re: is it possible to efficiently read a large file?
Message-Id: <20060813220313.088$Xw@newsreader.com>

Mark Seger <Mark.Seger@hp.com> wrote:
> I realized I didn't answer all your questions.  Sorry about that.  See
> below:
>
> xhoster@gmail.com wrote:
> > Mark Seger <Mark.Seger@hp.com> wrote:
> >
> >>I'm trying to read a 3GB file efficiently.  If I do with a benchmarking
> >>tool, I use about 6-8% of the cpu and can read it in about 44 seconds -
> >>obviously the time if very closely tied to the type of disk, but I'm
> >>including that for reference.
> >
> >
> > What kind of benchmarking tool is it?  For benchmarking raw disks, or
> > the OS FS, or what?  It may be using methods that are simply
> > unavailable to a general purpose language like perl.
>
> I do believe DT (which I mentioned in an earlier note) uses very basic,
> sequential reads (there are certainly switches to do async and other
> switches as well, but I just use the basics).
 ...
> to create a 10G file.  As I said before, pick a size at least as large
> as your current RAM to assume it doesn't get cached.

But I thought the point was to investigate Perl, not my hard drive.
What difference does it make to Perl if it is cached or not?  (My previous
run was using a sparse file, most of the pages were all nulls but that
shouldn't make any difference to the buffering--the data is still real data
as far as that goes.)

> Now read it back
> with the identical command, replacing the 'of' with 'if' (output to
> input).  Here's an example on my machine, remember I only have 3 GB RAM:
>
> [root@cag-dl380-01 mjs]# ./dt of=/mnt/scratch/test limit=3g bs=1m
> disable=compare,verify dispose=keep

OK, I used dt to make a 4G file (I have 2 G ram) and then used it again to
read it like above, it took 1:24 to read 4 Gigs.  (I straced it, and it
seemed to use only ordinary read commands, same as Perl does.)

I ran your Perl sysread code, it took 84 seconds, or 1:24, to read the same
4G file.

This is perl, v5.8.8 built for i686-linux-thread-multi

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: Sun, 13 Aug 2006 23:59:16 +0100
From: Ben Morrow <benmorrow@tiscali.co.uk>
Subject: Re: is it possible to efficiently read a large file?
Message-Id: <4fn4r3-s1m.ln1@osiris.mauzo.dyndns.org>


Quoth Mark Seger <Mark.Seger@hp.com>:
> Ben Morrow wrote:
> 
> > Quoth Mark Seger <Mark.Seger@hp.com>:
> > 
> >>John W. Krahn wrote:
> >>
> >>
> >>>Are you using open() or sysopen() to open the file?  sysread() "bypasses
> >>>buffered IO" but your $reclen may be too large (or too small) for efficient
> >>>IO.  Your example appears to use $main::buffer which means that the same
> >>>variable is used for each read however I don't know whether Perl reallocates
> >>>memory for each read.  You could use something like strace(1) to determine
> >>>exactly what system calls the program is making.
> >>
> >>I'm using open(), but I'll give sysopen() a whirl in the morning.  I 
> >>also like the idea about strace.  My fear is the data is being read into 
> >>one buffer and storage is getting allocated for $buffer on each call and 
> >>then moved to it.  The challenge is, is there a way to read directly 
> >>into the $buffer.  Maybe strace() will provide some clues...
> > 
> > open/sysopen should make no difference. To preallocate a buffer, create
> > a long string and overwrite bits of it with substr or directly with
> > sysread. You have to do your own buffer manglement as in C, of course,
> > but that's how you get efficiency.
> 
> I didn't think it would make a difference but I'm desperate and willing 
> to try anything.  8-(
> 
> I had created a long string and passed it to sysread but it didn't seem 
> to make any difference, and besides, on subsequent reads it would 
> already be allocated to the proper length by the previous reads.  Or am 
> I missing something?

read and sysread only ever make strings longer. By using the fourth
parameter OFFSET, and keeping track of where you are in the string, you
can write into a precreated string just as you would a buffer in C.
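
A minimal sketch of the OFFSET parameter, assuming a small temporary file
as input (the file and its contents are made up). Per perldoc -f sysread,
the scalar is grown or shrunk so that the last byte read is its last byte,
so the sketch passes the current length as OFFSET to append each chunk.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical input: a ten-byte temp file.
my ( $tmp_fh, $filename ) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
print {$tmp_fh} 'abcdefghij';
close $tmp_fh or die "can't close '$filename': $!";

open my $fh, '<', $filename or die "can't read '$filename': $!";
binmode $fh;

my $buffer = '';
my $reclen = 4;    # tiny on purpose, to force several reads
while (1) {
    # The fourth argument is where in $buffer the new data is placed.
    my $bytes = sysread( $fh, $buffer, $reclen, length $buffer );
    defined $bytes or die "read error on '$filename': $!";
    last if $bytes == 0;    # end of file
}
close $fh;

print length($buffer), " bytes: $buffer\n";
```

Without the OFFSET argument each call would replace the contents of
$buffer with the latest chunk.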

> What I don't see is how to pass the address of my buffer to sysread as 
> it wants a scalar and so won't that always force it to be created/malloc'd?

sysread effectively 'takes the address' of the scalar you pass in: that
is, it writes directly into the scalar as given, and only allocates
memory if it isn't long enough.

> Here's exactly what I'm doing, noting that I'm counting the bytes read 
> just to make sure I'm reading no more than I should be.  Assuming I 
> understand what you were saying, I believe I am preallocating the buffer.
> 
> #!/usr/bin/perl -w

Instead of -w you want

    use warnings;

Also you want

    use strict;

and you need to declare your variables with my.

> 
> $reclen=1024*128;

This is still a very small record size. The overhead of Perl ops is much
higher than C ops, so each spin round the loop will cost you much more
in Perl than in C. This makes it more important to minimize the number
of times you need to loop by reading as much as you reasonably can at a
time.
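
A rough sketch of the effect of the record size on the number of loop
iterations, assuming a 1 MiB temporary file as a stand-in for the real
data. It counts sysread calls and does not measure time; the sizes tried
are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical test file: 1 MiB of zero bytes.
my $size = 1024 * 1024;
my ( $tmp_fh, $filename ) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
print {$tmp_fh} "\0" x $size;
close $tmp_fh or die "can't close '$filename': $!";

# Read the whole file with the given record size; return the byte
# count and the number of sysread calls that returned data.
sub read_with_reclen {
    my ($reclen) = @_;
    open my $fh, '<', $filename or die "can't read '$filename': $!";
    binmode $fh;
    my ( $buffer, $total, $calls ) = ( '', 0, 0 );
    while (1) {
        my $bytes = sysread( $fh, $buffer, $reclen );
        defined $bytes or die "read error on '$filename': $!";
        last if $bytes == 0;
        $total += $bytes;
        $calls++;
    }
    close $fh;
    return ( $total, $calls );
}

my %bytes_for;
for my $reclen ( 4 * 1024, 128 * 1024, 1024 * 1024 ) {
    my ( $total, $calls ) = read_with_reclen($reclen);
    $bytes_for{$reclen} = $total;
    printf "reclen %8d: %d bytes in %d reads\n", $reclen, $total, $calls;
}
```

The same number of bytes is read each time; only the number of trips
round the loop changes.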

> $buffer=' 'x$reclen;
> $filename='/mnt/scratch/test';
> open FILE, "<$filename" or die;

Use lexical filehandles.
Use 3-arg open.
Give a meaningful error message, even for tiny programs.

    open my $FILE, '<', $filename or die "can't read '$filename': $!";

> 
> $total=0;
> $start=time;

Note that it is often easier to use the Benchmark module for timing
things.
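
A short sketch of timing a block with the Benchmark module. The summing
loop is a made-up stand-in for the read loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark;    # exports timediff and timestr by default

my $t0 = Benchmark->new;

# Stand-in workload (hypothetical); the read loop would go here.
my $sum = 0;
$sum += $_ for 1 .. 100_000;

my $t1 = Benchmark->new;
my $td = timediff( $t1, $t0 );
print "the loop took: ", timestr($td), "\n";
```

timestr reports wallclock seconds together with user and system CPU
time, which helps when separating disk waits from CPU use.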

> while ($bytes=sysread(FILE, $buffer, $reclen))

This will overwrite the contents of $buffer, starting from the beginning
each time. You want something more like

    while ($bytes = sysread($FILE, $buffer, $reclen, $total)) {

, although you want to bear in mind what I said above about reading as
much as you can at a time.

Ben

-- 
   Razors pain you / Rivers are damp
   Acids stain you / And drugs cause cramp.                    [Dorothy Parker]
Guns aren't lawful / Nooses give
  Gas smells awful / You might as well live.            benmorrow@tiscali.co.uk


------------------------------

Date: Mon, 14 Aug 2006 05:33:26 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: is it possible to efficiently read a large file?
Message-Id: <GWTDg.5312$365.4164@edtnps89>

Mark Seger wrote:
> 
> I didn't think it would make a difference but I'm desperate and willing
> to try anything.  8-(
> 
> I had created a long string and passed it to sysread but it didn't seem
> to make any difference, and besides, on subsequent reads it would
> already be allocated to the proper length by the previous reads.  Or am
> I missing something?
> 
> Just to back up a step or two, I wrote a short C program that mallocs a
> buffer and calls fread with the address of the buffer and it runs as
> efficiently and at the same speed as my benchmark tool (which I've
> included a pointer to in a previous response - a very cool tool if you
> haven't tried it yet).
> 
> What I don't see is how to pass the address of my buffer to sysread as
> it wants a scalar and so won't that always force it to be created/malloc'd?
> 
> Here's exactly what I'm doing, noting that I'm counting the bytes read
> just to make sure I'm reading no more than I should be.  Assuming I
> understand what you were saying, I believe I am preallocating the buffer.
> 
> #!/usr/bin/perl -w
> 
> $reclen=1024*128;
> $buffer=' 'x$reclen;
> $filename='/mnt/scratch/test';
> open FILE, "<$filename" or die;
> 
> $total=0;
> $start=time;
> while ($bytes=sysread(FILE, $buffer, $reclen))
> {
>     $total+=$bytes;
> }
> 
> $duration=time-$start;
> printf "Filesize: %5dM  Recsize:%5dK  %5.1fSecs  %6dKB/sec\n",
>       $total/(1024*1024), $reclen/1024, $duration, $total/$duration/1024;

I tried writing a C program and a Perl program that did (basically) the same
thing and I got these results:

$ gcc -o seger-test seger-test.c
$ time ./seger-test
Filesize:  1043M  Recsize:  128K     45Secs   23749KB/sec

real    0m45.024s
user    0m0.016s
sys     0m7.408s
$ time ./seger-test.pl
Filesize:  1043M  Recsize:  128K     45Secs   23749KB/sec

real    0m45.606s
user    0m0.171s
sys     0m7.462s


And it doesn't look like C and Perl differ very much in performance.

Memory size:

$ free -b
             total       used       free     shared    buffers     cached
Mem:     462196736  456695808    5500928          0    2555904  334778368
-/+ buffers/cache:  119361536  342835200
Swap:            0          0          0

File size:

$ ls -l SUSE-10.0-LiveDVD.iso
-rw-r--r--  1 john users 1094363136 2005-10-25 05:08
download/SUSE-10.0-LiveDVD.iso
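
As a cross-check, the figures in the result lines can be recomputed from
the file size and the 45 second duration. This small sketch uses integer
division, matching the "use integer" in the Perl test program; it gives
1043 for the size in MB and 23749 for KB/sec, as reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use integer;    # integer division, as in the test program

my $total    = 1094363136;    # bytes, from the ls -l output
my $duration = 45;            # seconds, from the result line

my $mb         = $total / ( 1024 * 1024 );
my $kb_per_sec = $total / $duration / 1024;

printf "Filesize: %5uM  %6uKB/sec\n", $mb, $kb_per_sec;
```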

The C program:

$ cat ./seger-test.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <error.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <time.h>


int main ( void )
{
        ssize_t reclen = 1024 * 128;
        char filename[] = "SUSE-10.0-LiveDVD.iso";
        int fd;

        if ( ( fd = open( filename, O_RDONLY ) ) == -1 )
        {
                perror( "Cannot open file" );
                return EXIT_FAILURE;
        }

        time_t start = time( NULL );
        char *buffer = malloc( reclen );
        ssize_t total = 0;
        ssize_t bytes;
        while ( bytes = read( fd, buffer, reclen ) )
        {
                if ( bytes == -1 )
                {
                        perror( "Cannot read from file" );
                        return EXIT_FAILURE;
                }
                total += bytes;
        }

        free( buffer );
        time_t duration = time( NULL ) - start;

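        /* ssize_t and time_t are not unsigned int, so cast to match %lu */
        printf( "Filesize: %5luM  Recsize:%5luK  %5luSecs  %6luKB/sec\n",
                (unsigned long) ( total / ( 1024 * 1024 ) ),
                (unsigned long) ( reclen / 1024 ),
                (unsigned long) duration,
                (unsigned long) ( total / duration / 1024 ) );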

        return EXIT_SUCCESS;
}


The Perl program:

$ cat ./seger-test.pl
#!/usr/bin/perl
use warnings;
use strict;
use bytes;
use integer;
use Fcntl;


my $reclen = 1024 * 128;
my $filename = 'SUSE-10.0-LiveDVD.iso';


sysopen my $fd, $filename, O_RDONLY or die "Cannot open '$filename' $!";


my $start = time;
my $buffer;
my $total = 0;

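while ( 1 ) {

        my $bytes = sysread $fd, $buffer, $reclen;

        # sysread returns undef on error and 0 at end of file, so the
        # error check must come before the end-of-file test
        defined $bytes or die "Cannot read from '$filename' $!";
        $bytes or last;

        $total += $bytes;
        }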

my $duration = time() - $start;

printf( "Filesize: %5uM  Recsize:%5uK  %5uSecs  %6uKB/sec\n",
        $total / ( 1024 * 1024 ), $reclen / 1024, $duration,
        $total / $duration / 1024 );

__END__



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 14 Aug 2006 01:23:23 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <Xns981ECF6847673castleamber@130.133.1.4>

usenet.cop@3955291010.com wrote:

Last warning: next time I report this annoying piece of garbage as 
Usenet abuse. I am sure that running bots, especially the piece of 
crap you are using, are a ToS violation of Giganews.

You can't fix an issue by causing a bigger one.


-- 
John Bokma          Freelance software developer
                                &
                    Experienced Perl programmer: http://castleamber.com/


------------------------------

Date: Sun, 13 Aug 2006 20:51:47 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <slrnedvlpj.sv5.tadmc@magna.augustmail.com>

usenet.cop@3955291010.com <usenet.cop@3955291010.com> wrote:
> "a" <a@mail.com> wrote:
>>> [ snip and ignore MULTIPOSTED message ]


> Questions or comments are welcome  #


Please undeploy it forthwith.


> # Q-Why am I doing this? A--For a better usenet. 


It isn't working. This "cure" is worse than the disease.


> # Some folks try to
> #   discourage job postings; some discourage off-topic posts. I try  #
                                                                ^
                                                                ^
> #   to discourage multiposts 


No _you_ don't.

Your _machine_ does.

Not the same thing.


> #   If you don't wish to be bothered with these auto-generated       #
> #   responses, please killfile the scanner.                          #


If you don't stop this right quick, expect more than the scanner
to be killfiled...


> #   But I choose to run this  #
> #   scanner anonymously 


Bad choice.

It hurts the credibility of your message so much as to make the 
postings worthless.



( I will note however that multiposting was the primary reason that
  I stopped participating in the beginners mailing list, way back
  when.
)

-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 14 Aug 2006 02:57:49 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <Xns981EDF6AFB111castleamber@130.133.1.4>

Tad McClellan <tadmc@augustmail.com> wrote:

> If you don't stop this right quick, expect more than the scanner
> to be killfiled...

Too late :-)

-- 
John Bokma          Freelance software developer
                                &
                    Experienced Perl programmer: http://castleamber.com/


------------------------------

Date: 13 Aug 2006 21:19:52 -0700
From: "DJ Stunks" <DJStunks@gmail.com>
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <1155529192.079099.267640@b28g2000cwb.googlegroups.com>

John Bokma wrote:
> "Jürgen Exner" <jurgenex@hotmail.com> wrote:
>
> > usenet.cop@3955291010.com wrote:
> >> "a" <a@mail.com> wrote:
> >>>> [ snip and ignore MULTIPOSTED message ]
> >
> > I think this multiposting bot is a great idea.
>
> Me no, and I have reported it for what I see it is right now: Usenet
> abuse.

What happened to your "last warning", Bokma?  Itchy reporting finger?
Perhaps David can hook you up with his source for his bot and you could
modify it to create your own bot reporting bot.  The downside, however,
is that, while more efficient, this approach would almost certainly
take away from what appears to be a very enjoyable pastime for you.

- Jake "Free Xah" Peavy



------------------------------

Date: 14 Aug 2006 05:57:43 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <Xns981F9C70B2A8castleamber@130.133.1.4>

"DJ Stunks" <DJStunks@gmail.com> wrote:

> John Bokma wrote:
>> "Jürgen Exner" <jurgenex@hotmail.com> wrote:
>>
>> > usenet.cop@3955291010.com wrote:
>> >> "a" <a@mail.com> wrote:
>> >>>> [ snip and ignore MULTIPOSTED message ]
>> >
>> > I think this multiposting bot is a great idea.
>>
>> Me no, and I have reported it for what I see it is right now: Usenet
>> abuse.
> 
> What happened to your "last warning", Bokma?  Itchy reporting finger?

I decided to let the bot owner's Usenet provider decide ;-)

> - Jake "Free Xah" Peavy

Ha ha ha. If you're real (which I doubt), give Xah some money for his 
process against DreamHost. It will increase the joy tenfold :-D.

-- 
John Bokma          Freelance software developer
                                &
                    Experienced Perl programmer: http://castleamber.com/


------------------------------

Date: 13 Aug 2006 23:36:40 -0700
From: usenet@DavidFilmer.com
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <1155537400.709794.125610@m79g2000cwm.googlegroups.com>

John Bokma wrote:
> Last warning: next time I report this annoying piece of garbage as
> Usenet abuse.

> I am sure that running bots, especially the piece of
> crap you are using, are a ToS violation of Giganews.

I do not believe it is a ToS violation; GigaNews' terms may be found
here:
   http://www.giganews.com/legal/aup.html

> You can't fix an issue by causing a bigger one.

Perhaps you will be kind enough to explain your objections. I have, on
numerous occasions (along with many others) flagged multiposted
messages (manually).  I don't recall that anyone has ever been critical
of this practice, and such notifications are common courtesy in
professional newsgroups.

Now I write a bot to do this and several people complain (but don't
really say why). I thought folks like us write Perl scripts to automate
repetitive manual tasks.

Is it the fact that it's a bot that bothers you?  That would seem an
odd objection for a programmer.  Or do you not like the content of the
auto-messages?  I'm not an English major - I would be happy to consider
content edits. But I don't want to simply furnish a link - I've seen
lots of links given to lots of OP's (Posting Guidelines, etc) which
seem to be ignored.  Perhaps a direct reply such as this would be more
effective.

Or do you think my code is crap?  I would be happy to post it for peer
review.

Regarding the length of the message: There is a lot of "introductory"
info there which was designed to be temporary (as the message says). I
have removed this text, and the message is now down to a much more
reasonable size (about 60 lines).

If you don't like seeing the messages, you may killfile the scanner
(just as you may killfile the Faq-O-Matic bot or the Posting Guidelines
bot).  If you think I'm a jerk for doing this then you may killfile me
as well. But, first, I hope you would at least give me a chance to
understand and address any concerns you have.

-- 
David Filmer (http://DavidFilmer.com)



------------------------------

Date: Mon, 14 Aug 2006 06:52:02 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: LWP::UserAgent question--MULTIPOSTED
Message-Id: <m4VDg.7727$tP4.2377@clgrps12>

John Bokma wrote:
> usenet.cop@3955291010.com wrote:
> 
> Last warning: next time I report this annoying piece of garbage as 
> Usenet abuse. I am sure that running bots, especially the piece of 
> crap you are using, are a ToS violation of Giganews.
> 
> You can't fix an issue by causing a bigger one.

Tell that to Bush.   ;-)


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 14 Aug 2006 01:29:02 GMT
From: John Bokma <john@castleamber.com>
Subject: Re: LWP::UserAgent question
Message-Id: <Xns981ED05D63E9Fcastleamber@130.133.1.4>

"a" <a@mail.com> wrote:

> Hi
> I would like to use LWP::UserAgent to login the web site and process
> the web content. Then I should use $ua -> credentials($netloc, $realm,
> $uname, $pass)
> What is $netloc, $realm, $uname, and $pass?
> Can someone post an example to demonstrate?

First, don't multipost.

Second, if it's basic authentication, you can use the example given under
"ACCESS TO PROTECTED DOCUMENTS" in lwpcook.


An example of the use of credentials is under "HTTP Authentication" in lwptut.

"

$browser->credentials(
    'servername:portnumber',
    'realm-name',
   'username' => 'password'
  );


"

Note that the realm-name is displayed when you get the pop up window that 
allows you to log in.


(type on the command line: perldoc lwptut )
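
A minimal sketch of how the four arguments fit together, assuming HTTP
basic authentication. Every value below is a placeholder and not a real
site: $netloc is the server's host:port, $realm is the realm name the
server announces (the one shown in the log-in pop-up), and $uname and
$pass are the log-in details.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# All four values are placeholders (assumptions), not a real site.
my $netloc = 'www.example.com:80';    # host:port of the server
my $realm  = 'Members Only';          # realm name the server announces
my $uname  = 'alice';
my $pass   = 's3cret';

my $ua = LWP::UserAgent->new;
$ua->credentials( $netloc, $realm, $uname, $pass );

# Only contact a server when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my $response = $ua->get($url);
    print $response->status_line, "\n";
}
```

The agent only sends the user name and password when a server at that
host:port asks for that realm.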

-- 
John Bokma          Freelance software developer
                                &
                    Experienced Perl programmer: http://castleamber.com/


------------------------------

Date: Mon, 14 Aug 2006 04:42:24 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Mon Aug 14 2006
Message-Id: <J3z12o.8oM@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

Archive-Extract-0.12
http://search.cpan.org/~kane/Archive-Extract-0.12/
A generic archive extracting mechanism
----
CPANPLUS-Dist-Deb-0.05
http://search.cpan.org/~kane/CPANPLUS-Dist-Deb-0.05/
----
Catalyst-Plugin-PageCache-0.13
http://search.cpan.org/~mramberg/Catalyst-Plugin-PageCache-0.13/
Cache the output of entire pages
----
Catalyst-View-ClearSilver-0.01
http://search.cpan.org/~jiro/Catalyst-View-ClearSilver-0.01/
ClearSilver View Class
----
Class-Constant-0.04
http://search.cpan.org/~robn/Class-Constant-0.04/
Build constant classes
----
Date-Biorhythm-2.0
http://search.cpan.org/~beppu/Date-Biorhythm-2.0/
a biorhythm calculator
----
Date-Biorhythm-2.1
http://search.cpan.org/~beppu/Date-Biorhythm-2.1/
a biorhythm calculator
----
Debug-1.00
http://search.cpan.org/~atg/Debug-1.00/
----
Frivolity-0.6.5
http://search.cpan.org/~jmac/Frivolity-0.6.5/
A Perl implementation of the Volity game platform
----
Geo-Coordinates-RDNAP-0.03
http://search.cpan.org/~pijll/Geo-Coordinates-RDNAP-0.03/
convert to/from Dutch RDNAP coordinate system
----
Graph-Easy-0.46
http://search.cpan.org/~tels/Graph-Easy-0.46/
Render graphs as ASCII, HTML, SVG or via Graphviz
----
Graph-Easy-As_svg-0.19
http://search.cpan.org/~tels/Graph-Easy-As_svg-0.19/
Output a Graph::Easy as Scalable Vector Graphics (SVG)
----
Graph-Easy-Manual-0.33
http://search.cpan.org/~tels/Graph-Easy-Manual-0.33/
HTML manual for Graph::Easy
----
HTML-Perlinfo-1.40
http://search.cpan.org/~accardo/HTML-Perlinfo-1.40/
Display a lot of Perl information in HTML format
----
HTML-Perlinfo-1.41
http://search.cpan.org/~accardo/HTML-Perlinfo-1.41/
Display a lot of Perl information in HTML format
----
Image-MetaData-GQview-1.5
http://search.cpan.org/~kethgen/Image-MetaData-GQview-1.5/
Perl extension for GQview image metadata
----
Image-MetaData-GQview-1.6
http://search.cpan.org/~kethgen/Image-MetaData-GQview-1.6/
Perl extension for GQview image metadata
----
Image-MetaData-GQview-1.7
http://search.cpan.org/~kethgen/Image-MetaData-GQview-1.7/
Perl extension for GQview image metadata
----
MDV-Repsys-0.08
http://search.cpan.org/~nanardon/MDV-Repsys-0.08/
----
MIME-Types-1.17
http://search.cpan.org/~markov/MIME-Types-1.17/
Definition of MIME types
----
Math-Units-PhysicalValue-0.67
http://search.cpan.org/~jettero/Math-Units-PhysicalValue-0.67/
An object oriented interface for handling values with units.
----
MediaWiki-1.04
http://search.cpan.org/~spectrum/MediaWiki-1.04/
OOP MediaWiki engine client
----
Module-Load-Conditional-0.12
http://search.cpan.org/~kane/Module-Load-Conditional-0.12/
Looking up module information / loading at runtime
----
Net-SMS-TWSMS-0.1
http://search.cpan.org/~snowfly/Net-SMS-TWSMS-0.1/
Send SMS messages via the www.twsms.com service.
----
Parse-Win32Registry-0.23
http://search.cpan.org/~jmacfarla/Parse-Win32Registry-0.23/
Parse Windows Registry Files
----
SAP-Rfc-1.46
http://search.cpan.org/~piers/SAP-Rfc-1.46/
SAP RFC - RFC Function calls against an SAP R/3 System
----
Sledge-Template-ClearSilver-0.01
http://search.cpan.org/~jiro/Sledge-Template-ClearSilver-0.01/
ClearSilver template system for Sledge
----
Test-Chimps-0.05
http://search.cpan.org/~zev/Test-Chimps-0.05/
Collaborative Heterogeneous Infinite Monkey Perfectionification Service
----
Test-Chimps-0.06
http://search.cpan.org/~zev/Test-Chimps-0.06/
Collaborative Heterogeneous Infinite Monkey Perfectionification Service
----
Test-Chimps-Anna-0.04
http://search.cpan.org/~zev/Test-Chimps-Anna-0.04/
An IRQ bot that announces test failures (and unexpected passes)
----
Test-Chimps-Client-0.05
http://search.cpan.org/~zev/Test-Chimps-Client-0.05/
Send smoke test results to a server
----
Win32-SDDL-0.03
http://search.cpan.org/~tojo/Win32-SDDL-0.03/
SDDL parsing module for Windows
----
Win32-SDDL-0.04
http://search.cpan.org/~tojo/Win32-SDDL-0.04/
SDDL parsing module for Windows
----
XML-Atom-Stream-0.09
http://search.cpan.org/~miyagawa/XML-Atom-Stream-0.09/
A client interface for AtomStream
----
XML-TreePP-0.18
http://search.cpan.org/~kawasaki/XML-TreePP-0.18/
Pure Perl implementation for parsing/writing xml files


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


------------------------------

Date: 13 Aug 2006 23:59:14 -0700
From: usenet@DavidFilmer.com
Subject: Re: PerlDoc used in CPAN?--MULTIPOSTED
Message-Id: <1155538754.546622.112860@i3g2000cwc.googlegroups.com>

Paul Lalli wrote:
> Is this supposed to be irony?  The message IDs are identical. The OP DID
> cross post, but this idiotic bot MULTIPOSTED

Yeah, I tested the ever living heck out of this thing (see
alt.test.test, alt.test.test2, and alt.test.testing).  But, during
testing, I commented out this line (as a test case):

         next MSGNUM if @threads == 1;  #ignore plain crossposts

And I forgot to un-comment it.  I feel totally stupid.

> And why the hell didn't it detect its own multiposting?

Because it ignores any message with a "References" header.

DF> > If they get mad at me, their anger may
DF> > #   spill over into future postings that I participate in.

> And how does not posting your address here prevent that, since, as you
> said, all the "regulars" already know who you are?

An OP who gets flagged in a future post can do a historical search on
the message (or search in the testing newsgroups) and determine my
identity (and that's fine).  But I doubt many OPs would think to do
that.



------------------------------

Date: Mon, 14 Aug 2006 13:07:59 +1000
From: Simon Taylor <simon@unisolve.com.au>
Subject: Re: Proposal: extending perldoc -f
Message-Id: <ebopjr$2bmk$1@otis.netspace.net.au>

Hello all,

usenet@DavidFilmer.com wrote:

> IMHO, EACH AND EVERY Perl keyword should resolve by "perldoc -f" even
> if it simply points to a more relevant perldoc (as you, Michelle,
> wisely imply).


I think this is a very intuitive idea for the user who is new to Perl.

I floated a similar idea recently on the perl-documentation list for 
qw(), et al.

    http://www.mail-archive.com/perl-documentation@perl.org/msg00734.html

As a result Flavio Poletti produced a useful patch which, along
with my proposal, was rejected.

Note also that Ivan Tubert-Brohman's proposed -k (keyword) flag is
probably the best answer to this problem, at least in the medium term.

The online demo of his POD Indexing project handles Michele's 'while' 
example well:

     http://pod-indexing.annocpan.org/perldoc-k.cgi?keyword=while

Regards,

Simon Taylor

-- 
http://www.perlmeme.org


------------------------------

Date: 13 Aug 2006 18:06:38 -0700
From: "Xicheng Jia" <xicheng@gmail.com>
Subject: Re: RegEx question: Exclude characters from group
Message-Id: <1155517598.316489.108410@p79g2000cwp.googlegroups.com>

Axel Dahmen wrote:
> Hi,
>
> I'm searching for a regular expression to find every punctuation character
> (UNICODE \p{P}) except for the quote character. I don't know how to do this.
> Is there a NOT operator available?

Just negate the negation of \p{P} plus quotation marks

[^\P{P}'"]

(untested)
Xicheng
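
The double negation above can be checked with a short script. This is
only a sketch: the sample string and the variable names are made up for
illustration, and it excludes just the two ASCII quote characters, so
typographic quotes (if any) would still match.

```perl
use strict;
use warnings;

# Hypothetical sample text containing quotes and other punctuation.
my $text = qq{Hello, "world"! It's (almost) done...};

# [^\P{P}'"] matches a character that is NOT (non-punctuation, ', or "),
# i.e. any punctuation character other than the two ASCII quotes.
my @punct = $text =~ /[^\P{P}'"]/g;

my $found = join '', @punct;
print "$found\n";    # prints ,!()...
```

The same idea extends to other exclusions: add any further characters
to leave out inside the brackets after \P{P}.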



------------------------------

Date: Sun, 13 Aug 2006 20:36:57 -0500
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: system command won't let go
Message-Id: <slrnedvktp.sv5.tadmc@magna.augustmail.com>

rallabs@adelphia.net <rallabs@adelphia.net> wrote:
> Brian McCauley wrote:
>> rallabs@adelphia.net wrote:
>>
>> > I am having some difficulties with the 'system' command.
>>
>> > system "~/runsgood.exe<point.$newID";


>> If you are asking how you can redirect the output of the ~/runsgood.exe


> I don't want to redirect the output of runsgood.  


Yes you do!


> I don't understand it.


You can get really strange behavior when you don't output any headers.

Fix that first. Then fix the other problems.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 13 Aug 2006 23:49:23 -0700
From: "kaleem" <kaleem177@gmail.com>
Subject: Re: The assignment of command output to an array hangs.
Message-Id: <1155538163.481450.200730@75g2000cwc.googlegroups.com>

Hey Brian,

You seem to be the sensible (and technically adept) one here. :-) The
other two people who replied to my post seemed to be full of hubris and
technically incompetent (trying to hide their incompetence by asking me
silly questions).

Well, as I wrote, this problem is reproducible only in this particular
case, I mean only in this particular Perl script of mine. In this
script, other code runs before the piece of code that hangs. If I
create another Perl script containing just the problem code (the piece
that hangs), it works fine! So it seems it might be a KornShell problem.
Well, I'll try to look more into this and will get back. Thanks for
responding, man.

Regards,
Kaleem.


Brian McCauley wrote:
> kaleem wrote:
>
> > I've found that after the KornShell script completes,
> > it becomes defunct (zombie). It means that its parent hasn't waited for
> > it.
>
> Right. IIRC readpipe() (aka qx aka `` ) will first wait for an EOF
> condition on the FIFO and then wait() for the exit status of the
> subprocess.
>
> The symptoms you describe indicate that most likely your subprocess has
> performed a fork-off-and-die without remembering to close (or redirect)
> STDOUT in the child.
>
> I can trivially reproduce the symptoms you are experiencing thus
>
> $ perl -e'`sleep 30 &`'
>
> The /bin/sleep process is still holding the FIFO open so the /bin/sh
> becomes a zombie for 30 seconds.
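
The effect Brian describes can be timed with a small script. This is a
rough sketch, assuming a Unix-like system with /bin/sh and a sleep
command; the helper name time_backticks and the 2-second delay are
invented for the illustration. The first call should block until the
background sleep exits, because that process still holds the pipe's
write end; the second should return almost at once, because the
background process has its STDOUT redirected away from the pipe.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: run a shell command in backticks and return
# how many seconds passed before the backticks returned.
sub time_backticks {
    my ($cmd) = @_;
    my $start = time;
    my $output = `$cmd`;    # reads until EOF on the pipe, then waits
    return time - $start;
}

# Background child inherits the pipe, so EOF arrives only when it exits.
my $held = time_backticks('sleep 2 &');

# Background child writes to /dev/null instead, so EOF arrives at once.
my $released = time_backticks('sleep 2 >/dev/null &');

printf "pipe held open: %.1fs, STDOUT redirected: %.1fs\n",
    $held, $released;
```

If the real script shows the same pattern, redirecting (or closing)
STDOUT in whatever the KornShell script leaves running in the
background would be the first thing to try.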



------------------------------

Date: 13 Aug 2006 22:48:52 -0700
From: shilpi.rustagi@beesys.com
Subject: Use of named groups in VC++ or C++ code
Message-Id: <1155534532.437987.222440@m79g2000cwm.googlegroups.com>

Hi all,
Can anyone please tell me how I should use named groups in a regular
expression in C++ or VC++?



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 9600
***************************************
