[30657] in Perl-Users-Digest
Perl-Users Digest, Issue: 1902 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Oct 6 14:09:48 2008
Date: Mon, 6 Oct 2008 11:09:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Mon, 6 Oct 2008 Volume: 11 Number: 1902
Today's topics:
<Help> How to use routines from another perl script baiyanhuang@gmail.com
Re: <Help> How to use routines from another perl script <jurgenex@hotmail.com>
Re: <Help> How to use routines from another perl script <darkon.tdo@gmail.com>
Re: Data File <jurgenex@hotmail.com>
How it works?(about while loop and regex as condition) <havel.zhang@gmail.com>
Re: How it works?(about while loop and regex as conditi <jurgenex@hotmail.com>
Re: How it works?(about while loop and regex as conditi sln@netherlands.com
How to escape single quotes inside fields but not the o <tank209209@yahoo.com>
Re: How to escape single quotes inside fields but not t <RedGrittyBrick@spamweary.invalid>
mod_perl <jcarlock@127.0.0.1>
Re: mod_perl <jcarlock@127.0.0.1>
Re: Problems flushing my buffer! (perl) <hjp-usenet2@hjp.at>
Real Estate | Property Listing | Apartment, Homes for S <serafinjr.gutierrez@gmail.com>
Re: sysread <dontmewithme@got.it>
Re: sysread <willem@stack.nl>
Re: sysread <dontmewithme@got.it>
Re: sysread <dontmewithme@got.it>
Re: sysread xhoster@gmail.com
Re: sysread <RedGrittyBrick@spamweary.invalid>
Re: trouble parsing "kind of" comma-delimited text <cartercc@gmail.com>
Re: trouble parsing "kind of" comma-delimited text <jurgenex@hotmail.com>
Re: What's the difference between *ERROR and \*ERROR in <cdalten@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Mon, 6 Oct 2008 04:12:40 -0700 (PDT)
From: baiyanhuang@gmail.com
Subject: <Help> How to use routines from another perl script
Message-Id: <1d62463e-5d54-4c3a-9631-4b3ca8a6e815@a3g2000prm.googlegroups.com>
Hi, All,
I am a Perl novice. I want to reuse a routine written in one
Perl script in all my other Perl scripts, as C and C++ do, but I
don't know how to "include" another Perl script in the current
script to use its routines. Could anyone give me some tips?
Thanks so much.
Baiyan
------------------------------
Date: Mon, 06 Oct 2008 06:25:00 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: <Help> How to use routines from another perl script
Message-Id: <l84ke4pto0ujna70mhvdvkugdfipfcboq6@4ax.com>
baiyanhuang@gmail.com wrote:
>I am just a novice to perl, I want to reuse a routine wrote in one
>perl script to all other perl scripts, just like c, c++ do, but I
>don't know how to "include" another perl script into current perl
>script to utilize the routines. would anyone give some tips on it?
Typically you would create a module and import its functions into your
main program using 'use'.
There is also 'do' which is kind of the poor man's 'use'.
jue
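A minimal sketch of the module approach, assuming a hypothetical module named MyUtils with a single routine, greet (neither name comes from the thread). Normally the package would sit in its own file, MyUtils.pm, somewhere on @INC and end with a true value such as "1;", and the script would simply say "use MyUtils qw(greet);". Here both parts share one file so the example runs as is.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# --- would normally live in MyUtils.pm (hypothetical name) ---
package MyUtils;
use Exporter 'import';          # borrow Exporter's import() method
our @EXPORT_OK = qw(greet);     # routines a caller may ask for

sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

# --- the calling script ---
package main;

# With a separate MyUtils.pm this line would be: use MyUtils qw(greet);
MyUtils->import('greet');

my $message = greet('Baiyan');
print "$message\n";
```

Listing the routine in @EXPORT_OK rather than @EXPORT means callers have to request it by name, which keeps their namespace free of surprises.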
------------------------------
Date: Mon, 06 Oct 2008 10:14:41 -0500
From: darkon <darkon.tdo@gmail.com>
Subject: Re: <Help> How to use routines from another perl script
Message-Id: <Xns9B2F7263E87F8dkwwashere@216.168.3.30>
Jürgen Exner <jurgenex@hotmail.com> wrote:
> baiyanhuang@gmail.com wrote:
>>I am just a novice to perl, I want to reuse a routine wrote in
>>one perl script to all other perl scripts, just like c, c++ do,
>>but I don't know how to "include" another perl script into
>>current perl script to utilize the routines. would anyone give
>>some tips on it?
>
> Typically you would create a module and import its functions
> into your main program using 'use'.
>
> There is also 'do' which is kind of the poor man's 'use'.
And I suppose we might as well add the following; a novice might
not (yet) be accustomed to searching the voluminous FAQ files.
perldoc -q module
Found in C:\Perl\lib\pod\perlfaq7.pod
How do I create a module?
(contributed by brian d foy)
perlmod, perlmodlib, perlmodstyle explain modules in all the gory
details. perlnewmod gives a brief overview of the process along
with a couple of suggestions about style.
------------------------------
Date: Mon, 06 Oct 2008 09:52:48 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Data File
Message-Id: <94gke4t8njhit9fl5vuknendh66ud0aip2@4ax.com>
[Forwarded from CLP.Modules]
"friend.05@gmail.com" <hirenshah.05@gmail.com> wrote:
>I have a large file in following format:
>
>ID | Time | IP | Code
>
>I want to cluster the file by IP address and count total number of ID
>for each IP address.
split() each line at '|' into its components.
Use the IP address as the key in a hash and increment a counter in that
hash to count the total for that IP address.
To create the clusters there are basically two approaches,
depending on how large the file is, how much RAM you have, and how many
IP addresses there are.
Efficient, but needs lots of RAM: in addition to the counter, also store
each line in the hash under that key, in an array. Then, after reading the
whole file once, just walk through the hash key by key and print
the array for each IP.
Slow, but needs virtually no RAM: after reading the whole file once and
creating all the hash keys, walk through the keys and, for each key,
read the whole file again and print the lines that have the same IP
as the current key.
jue
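A small sketch of the in-memory variant described above, assuming made-up sample records and that fields are separated by '|' with optional surrounding spaces. A real script would read the lines from the file instead of from a list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample records in "ID | Time | IP | Code" form (made-up data).
my @lines = (
    '1 | 10:00 | 10.0.0.1 | 200',
    '2 | 10:01 | 10.0.0.2 | 404',
    '3 | 10:02 | 10.0.0.1 | 200',
);

my (%count, %rows);
for my $line (@lines) {
    # '|' is a regex metacharacter, so it has to be escaped in the pattern.
    my ($id, $time, $ip, $code) = split /\s*\|\s*/, $line;
    $count{$ip}++;                  # number of IDs seen for this IP
    push @{ $rows{$ip} }, $line;    # the cluster itself (costs RAM)
}

for my $ip (sort keys %count) {
    print "$ip: $count{$ip} record(s)\n";
    print "    $_\n" for @{ $rows{$ip} };
}
```

Dropping the %rows hash gives the counting-only version, which needs memory proportional to the number of distinct IP addresses rather than to the file size.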
memory
>
>I am new to perl.
>
>Thanks.
------------------------------
Date: Mon, 6 Oct 2008 02:41:30 -0700 (PDT)
From: "havel.zhang" <havel.zhang@gmail.com>
Subject: How it works?(about while loop and regex as condition)
Message-Id: <56de7c65-e827-4e4a-91ce-d03396945c6c@v15g2000hsa.googlegroups.com>
dear perl-gurus,
I don't understand how this code works. Could you please give me a
further explanation?
the program is very simple:
+++++++++++++program++++++++++++++++++++++
open (O,"<z.html");
@l = <O>;
close(O);
foreach(@l){
if ($_ =~ /<a\b([^>]+)(.*?)<\/a>/ig){
$html=$_;
while($html =~ m{a\b([^>]+)(.*?)</a>}ig){
my $Guts = $1;
my $Link = $2;
print "$Guts\n$Link\n";
}
}
};
++++++++z.html content+++++++++++++++++++++
the z.html 's content is:
<A HREF="http://10.123.111.11">link1</A><A HREF="text.txt">text.txt</
A><A HREF=
"fes.iso">fes.iso</A>
+++++++and output is:++++++++++++++++++++++++++++
HREF="http://10.123.111.11"
>link1
HREF="text.txt"
>text.txt
HREF="fes.iso"
>fes.iso
++++++++end+++++++++++++++++++++++++++++++++
I want to use this program to pick out hrefs and labels like
"link1", "text.txt", "fes.iso".
This program works well, but I can't understand the while loop with
the regex:
"$html =~ m{a\b([^>]+)(.*?)</a>}ig"
^^^^^^^^^^^^^^^^^^^^^^^
It works fine, and it's amazing :) Every time, it picks out the pattern "<a
href=...></a>" and gets the right result. But HOW does it work? I thought it
would always pick out the first match.
Can any perl guru give me answer?
Thank you :)
Havel
------------------------------
Date: Mon, 06 Oct 2008 06:34:22 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: How it works?(about while loop and regex as condition)
Message-Id: <vg4ke4l2s5bgt5u2lhom29spqkk9uld4ev@4ax.com>
"havel.zhang" <havel.zhang@gmail.com> wrote:
[...]
>This program works well, but i can't understand the while loop with
>regex:
> "$html =~ m{a\b([^>]+)(.*?)</a>}ig"
> ^^^^^^^^^^^^^^^^^^^^^^^
>it's works fine, and so amazing:) everytime, it's pick out patten "<a
>href=...></a>" and get right result. But HOW does it work? I think it
>will always pick out the first matched patten.
>
>Can any perl guru give me answer?
The documentation can. See 'perldoc perlop', section 'Quote and
quote-like operators', the two paragraphs beginning with
"The "/g" modifier specifies global pattern matching--that is, ..."
However, it is not surprising that you didn't find it. The whole perlop
man page is about 2000 lines long. That is way too long and complex. It
is almost impossible to find anything there or to point people to a
specific part of it. Is someone already working on breaking it down into
more manageable chunks?
jue
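A short sketch of the behaviour that perlop passage describes, using a made-up two-link string. In scalar context a /g match remembers where it stopped (readable through pos()), so each evaluation of the while condition finds the next match rather than the first one again.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $html = '<A HREF="a.txt">one</A><A HREF="b.txt">two</A>';

my (@labels, @offsets);
while ($html =~ m{<a\b([^>]+)>(.*?)</a>}ig) {
    push @labels,  $2;
    push @offsets, pos($html);   # where the next search will begin
    print "label=$2 attributes=$1 next search starts at offset $offsets[-1]\n";
}

# A failed /g match resets the position, so the loop ends cleanly.
print "after the loop pos() is ",
    (defined pos($html) ? pos($html) : 'undef'), "\n";
```

Without the /g modifier the condition would match the first link on every pass and the loop would never terminate.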
------------------------------
Date: Mon, 06 Oct 2008 17:41:40 GMT
From: sln@netherlands.com
Subject: Re: How it works?(about while loop and regex as condition)
Message-Id: <j6eke452u4hclj71ukk29qj0e7l9ttss7o@4ax.com>
On Mon, 6 Oct 2008 02:41:30 -0700 (PDT), "havel.zhang" <havel.zhang@gmail.com> wrote:
>dear perl-gurus,
>i don't understand how this function works. can you please give me
>further
>explanation:
>
>the program is very simple:
>+++++++++++++program++++++++++++++++++++++
>open (O,"<z.html");
>@l = <O>;
>close(O);
>
>foreach(@l){
> if ($_ =~ /<a\b([^>]+)(.*?)<\/a>/ig){
^^ might need a while here
> $html=$_;
> while($html =~ m{a\b([^>]+)(.*?)</a>}ig){
does the same thing as above, could even add the '<'
m{<a\b([^>]+)(.*?)</a>}ig
the if ($_ =~ /.. is not needed
> my $Guts = $1;
> my $Link = $2;
> print "$Guts\n$Link\n";
> }
> }
>};
>++++++++z.html content+++++++++++++++++++++
>the z.html 's content is:
> <A HREF="http://10.123.111.11">link1</A><A HREF="text.txt">text.txt</
>A><A HREF=
>"fes.iso">fes.iso</A>
>+++++++and output is:++++++++++++++++++++++++++++
> HREF="http://10.123.111.11"
>>link1
> HREF="text.txt"
>>text.txt
> HREF="fes.iso"
>>fes.iso
>++++++++end+++++++++++++++++++++++++++++++++
>
>I want to using this program pick out hrefs and lables like
>"link1","text.txt","fes.iso".
>This program works well, but i can't understand the while loop with
>regex:
> "$html =~ m{a\b([^>]+)(.*?)</a>}ig"
> ^^^^^^^^^^^^^^^^^^^^^^^
With the 'g' modifier, each pass through the while condition resumes matching
where the previous match ended, until no match is left in the string.
The problem is that the first 'if' regex will only match the first occurrence.
It does the same as the inner match, except only once. Why do you need the outer 'if'
then?
>it's works fine, and so amazing:) everytime, it's pick out patten "<a
>href=...></a>" and get right result. But HOW does it work? I think it
>will always pick out the first matched patten.
>
>Can any perl guru give me answer?
>
>Thank you :)
>
>Havel
>
use strict;
use warnings;
my $str = '<A HREF="http://10.123.111.11">link1</A><A HREF="text.txt">text.txt</A><A HREF="fes.iso">fes.iso</A>';
print "Output from 'if \$str':\n---------------\n";
if ($str =~ /(<a\b([^>]+)(.*?)<\/a>)/ig)
{
print "found: '$1'\n\n";
my $html = $1;
while ($html =~ m{a\b([^>]+)(.*?)</a>}ig)
{
my $Guts = $1;
my $Link = $2;
print "$Guts\n$Link\n";
}
}
pos ($str) = 0;
print "\n\nOutput from 'while \$str':\n---------------\n";
while ($str =~ /(<a\b([^>]+)(.*?)<\/a>)/ig)
{
print "found: '$1'\n\n";
my $html = $1;
while ($html =~ m{a\b([^>]+)(.*?)</a>}ig)
{
my $Guts = $1;
my $Link = $2;
print "$Guts\n$Link\n";
}
}
pos ($str) = 0;
print "\n\nOutput from just 'while \$html':\n---------------\n";
while ($str =~ m{<a\s*([^>]+)(.*?)</a\s*>}ig)
{
my $Guts = $1;
my $Link = $2;
print "$Guts\n$Link\n";
}
__END__
Output from 'if $str':
---------------
found: '<A HREF="http://10.123.111.11">link1</A>'
HREF="http://10.123.111.11"
>link1
Output from 'while $str':
---------------
found: '<A HREF="http://10.123.111.11">link1</A>'
HREF="http://10.123.111.11"
>link1
found: '<A HREF="text.txt">text.txt</A>'
HREF="text.txt"
>text.txt
found: '<A HREF="fes.iso">fes.iso</A>'
HREF="fes.iso"
>fes.iso
Output from just 'while $html':
---------------
HREF="http://10.123.111.11"
>link1
HREF="text.txt"
>text.txt
HREF="fes.iso"
>fes.iso
In general it doesn't work fine. You can run into problems if the phrase you're
looking for spans lines. Another problem is that your regex does not account for
legal whitespace.
The better regex would be: "while ( m{<a\s*([^>]+)(.*?)</a\s*>}ig ) {}"
It's always good to have delimiters surrounding what you are trying to match.
In your case, '<a ...></a>', the 'a' tag pair is the delimiter.
This will grab inner non-'a' tags; nested 'a' tags, however, will not work.
Because of nesting, html/xml can't be parsed this way, by seeking the end delimiter.
But in your case it should be ok.
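To illustrate the spans-lines problem, here is a sketch with a made-up link whose label contains a newline. It assumes the whole document has been read into one string (reading line by line, as the original program does, can never match across lines). By default '.' does not match a newline; the /s modifier lifts that restriction.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up input: the link text is split across two lines.
my $html = qq{<a href="x.txt">first\nsecond</a>};

my $without_s = qr{<a\s+([^>]+)>(.*?)</a\s*>}i;    # '.' stops at "\n"
my $with_s    = qr{<a\s+([^>]+)>(.*?)</a\s*>}is;   # '.' matches "\n" too

my $plain_matches = $html =~ $without_s ? 1 : 0;
my ($attrs, $label) = $html =~ $with_s;

print "without /s: ", ($plain_matches ? "match" : "no match"), "\n";
print "with /s: attributes=$attrs label=$label\n";
```

This only addresses line breaks; it does nothing for nested tags, which still call for a real parser.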
In general, should you need to do specific parsing, you should get a parser that
captures groups of phrases, from which you can parse with reliability.
==================================================
use strict;
use warnings;
use RXParse; # VERSION 2
my $p = new RXParse();
$p->setMode( 'html' => 1, 'resume_onerror'=> 1 );
my %oldh = $p->setHandlers('start' => \&starth, 'end' => \&endh);
sub starth
{
my ($obj, $el, $term, @attr) = @_;
my $buffer = lc($el);
$obj->CaptureOn( $buffer ) if ($buffer eq 'a');
}
sub endh
{
my ($obj, $el, $term) = @_;
my $buffer = lc($el);
$obj->CaptureOff( $buffer, 1 ) if ($buffer eq 'a');
}
open my $fh, '<', 'c:\temp\z.html' or die "can't open z.html: $!";
$p->parse($fh);
close $fh;
# get and parse capture buffer 'a'
# ....
# display 'a'
$p->DumpCaptureBuffs();
__END__
BUFFER: a
=====================================
index seqence
----- --------
[0] 1 <A HREF="http://10.123.111.11">link1</A>
[1] 2 <A HREF="text.txt">text.txt</A>
[2] 3 <A HREF="fes.iso">fes.iso</A>
------------------------------
Date: Mon, 6 Oct 2008 08:16:38 -0700 (PDT)
From: "Henry J." <tank209209@yahoo.com>
Subject: How to escape single quotes inside fields but not the ones around fields?
Message-Id: <ebdd498d-8722-4b85-9a76-45a891c2dc0f@l64g2000hse.googlegroups.com>
Need to escape single quotes ( i.e., ' -> '' ) in a data file before
sending to DB as part of insert SQLs.
Example 1):
it's mine, it's yours, 12, 42, 2008/10/06 => it''s mine,
it''s yours, 12, 42, 2008/10/06
Example 2):
'it's mine', 'it's yours', 12, 42, '2008/10/06' => 'it''s
mine', 'it''s yours', 12, 42, '2008/10/06'
The tricky part is that the data file may or may not have the string
fields wrapped in single quotes. In Example 1), it is not and single
quotes around fields will be added by another script before sending to
DB, in Example 2), its fields are already enclosed in single quotes
and will be sent to DB as is.
Does anybody have handy perl one-liner or script that tackles this?
Thanks!
------------------------------
Date: Mon, 06 Oct 2008 16:41:40 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: How to escape single quotes inside fields but not the ones around fields?
Message-Id: <48ea31b6$0$3166$db0fefd9@news.zen.co.uk>
Henry J. wrote:
> Need to escape single quotes ( i.e., ' -> '' ) in a data file before
> sending to DB as part of insert SQLs.
>
> Example 1):
>
> it's mine, it's yours, 12, 42, 2008/10/06 => it''s mine,
> it''s yours, 12, 42, 2008/10/06
>
> Example 2):
>
> 'it's mine', 'it's yours', 12, 42, '2008/10/06' => 'it''s
> mine', 'it''s yours', 12, 42, '2008/10/06'
>
> The tricky part is that the data file may or may not have the string
> fields wrapped in single quotes. In Example 1), it is not and single
> quotes around fields will be added by another script before sending to
> DB, in Example 2), its fields are already enclosed in single quotes
> and will be sent to DB as is.
>
> Does anybody have handy perl one-liner or script that tackles this?
> Thanks!
>
$dbh->prepare("SELECT foo, bar FROM table WHERE baz=?")->execute($baz);
--
RGB
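Where placeholders are not an option, for instance because the file has to be rewritten before another tool loads it, the escaping asked for can be sketched field by field. This is only a sketch under strong assumptions: fields are separated by a comma plus optional whitespace, no field contains a comma, and the input has not been escaped already. The function name escape_line is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Double the single quotes inside each field, leaving a pair of
# enclosing quotes (if present) untouched.
sub escape_line {
    my ($line) = @_;
    my @fields = split /,\s*/, $line;
    for my $field (@fields) {
        if ($field =~ /\A'(.*)'\z/s) {
            # Field is already wrapped: escape only what is inside.
            (my $inner = $1) =~ s/'/''/g;
            $field = "'$inner'";
        }
        else {
            $field =~ s/'/''/g;
        }
    }
    return join ', ', @fields;
}

my $bare   = escape_line(q{it's mine, it's yours, 12, 42, 2008/10/06});
my $quoted = escape_line(q{'it's mine', 'it's yours', 12, 42, '2008/10/06'});
print "$bare\n$quoted\n";
```

A bare field that happens to begin and end with an apostrophe is indistinguishable from a wrapped one, which is one more reason to prefer placeholders whenever the data goes through DBI.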
------------------------------
Date: Mon, 6 Oct 2008 12:01:00 -0400
From: "Jim Carlock" <jcarlock@127.0.0.1>
Subject: mod_perl
Message-Id: <48ea3641$0$6193$2318a52a@unlimited.newshosting.com>
Anyone here know if there's a version of mod_perl available for
Apache 2.2.9?
Maybe I'm doing something wrong.
I've downloaded the following,
http://perl.apache.org/docs/2.0/os/win32/mpinstall
I've then done the,
perl.exe mpinstall
The script prompts, suggesting version 1 two or three times. I
decline all those suggestions. Then it suggests/prompts version
2 which by default it accepts, and so I accept. After awhile it
offers the message:
ppm install failed: The PPD does not provide code to install for
this platform system C:\Perl\bin\ppm install mod_perl-2.0.ppd
failed: 256 at mpinstall line 242.
And it bails out of the attempted install. At one point in the
past I did get a mod_perl.so built but I do not recall
where I obtained the script to do the build. The above link I
did obtain from the links below (I followed the first to the
second to the final installation script above):
http://perl.apache.org/docs/
http://perl.apache.org/docs/2.0/os/win32/install.html
Also I used wget.exe to download the mpinstall and wget.exe did
NOT retain the modification date, so it appears that the link,
mpinstall, above comes from a dynamically generated page, rather
than residing on the web as a static page with a fixed mod-date.
If anyone can correct that particular problem, it might help.
Perhaps I should use the downloaded files that are in the
following link?
http://perl.apache.org/dist/mod_perl-2.0-current.tar.gz
The README file inside the archive there suggests versioning:
Apache:
Dynamic mod_perl (DSO): Apache 2.0.47 - 2.2.8.
Static mod_perl: Apache 2.0.51 - 2.2.8.
I noticed that Apache 2.2.9 was not listed. So I downloaded the
mpinstall link to see what date it held and either it gets
updated constantly or the mod-date ends up as the current date
because of dynamic script generation.
Anyone have any other suggestions?
--
Jim Carlock
You Have More Than Five Senses
http://www.associatedcontent.com/article/381163/more_than_five_senses.html
------------------------------
Date: Mon, 6 Oct 2008 13:38:59 -0400
From: "Jim Carlock" <jcarlock@127.0.0.1>
Subject: Re: mod_perl
Message-Id: <48ea4d3c$0$6192$2318a52a@unlimited.newshosting.com>
"Jim Carlock" wrote...
: Anyone here know if there's a version of mod_perl available for
: Apache 2.2.9?
:
: Maybe I'm doing something wrong.
:
: I've downlaoded the following,
:
: http://perl.apache.org/docs/2.0/os/win32/mpinstall
:
: I've then done the,
:
: perl.exe mpinstall
:
: The script prompts, suggesting version 1 two or three times. I
: decline all those suggestions. Then it suggests/prompts version
: 2 which by default it accepts, and so I accept. After awhile it
: offers the message:
The first method above results in utter failure. It perhaps needs
to get fixed for Perl 5.10.xx (and maybe others).
: ppm install failed: The PPD does not provide code to install for
: this platform system C:\Perl\bin\ppm install mod_perl-2.0.ppd
: failed: 256 at mpinstall line 242.
:
: And it bails out of the attempted install. At one point in the
: past I did get a mod_perl.so built but I do not what recall
: where I obtained the script to do the build. The above link I
: did obtain from the links below (I followed the first to the
: second to the final installation script above):
I missed out on the fact that Perl 5.8.xx differs from 5.10.xx and
that two different builds existed. I suggest employing a selectbox
which requests the Perl version, another selectbox which requests
the Apache version, combined into one form with a button which
employs some javascript to jump to the appropriate local link "#".
I suggest a complete rewrite of the mod_perl pages linked below.
: http://perl.apache.org/docs/
: http://perl.apache.org/docs/2.0/os/win32/install.html
: Also I used wget.exe to download the mpinstall and wget.exe did
: NOT retain the modification date, so it appears that the link,
: mpinstall, above comes from a dynamically generated page, rather
: than residing on the web as a static page with a fixed mod-date.
: If anyone can correct that particular problem, it might help.
Again, I recommend a complete rewrite of the mod_perl pages.
: Perhaps I should use the downloaded files that are in the
: following link?
:
: http://perl.apache.org/dist/mod_perl-2.0-current.tar.gz
:
: The README file inside the archive there suggests versioning:
:
: Apache:
: Dynamic mod_perl (DSO): Apache 2.0.47 - 2.2.8.
: Static mod_perl: Apache 2.0.51 - 2.2.8.
I gave up on the link above.
: I noticed that Apache 2.2.9 was not listed. So I downloaded the
: mpinstall link to see what date it held and either it gets up-
: dated constantly or it the mod-date ends up as the current date
: because of dynamic script generation.
:
: Anyone have any other suggestions?
Again, my own suggestion involves completely rewriting the links
above.
mod_perl finally compiled ok when I realized that there are four
different compilations listed:
1) mod_perl for Apache 1, Perl 5.8x.
2) mod_perl for Apache 1, Perl 5.10x.
3) mod_perl for Apache 2+, Perl 5.8x.
4) mod_perl for Apache 2+, Perl 5.10x.
It installed and it seems ok.
I need to start testing it, and I'd like to know if a series of
tests already exists. Thanks to anyone that can help.
--
Jim Carlock
You Have More Than Five Senses
http://www.associatedcontent.com/article/381163/more_than_five_senses.html
------------------------------
Date: Mon, 6 Oct 2008 19:36:10 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Problems flushing my buffer! (perl)
Message-Id: <slrngekj4b.etl.hjp-usenet2@hrunkner.hjp.at>
On 2008-10-05 09:21, Eric Pozharski <whynot@pozharski.name> wrote:
> Ilya Zakharevich <nospam-abuse@ilyaz.org> wrote:
>> <slrngefera.etl.hjp-usenet2@hrunkner.hjp.at>:
>>> Right. Although that's extremely rare these days. All the major
>>> graphical HTML rendering engines (IE, Gecko, KHTML, ...) are able to
>>> render a page incrementally. However, many text based browsers (e.g.,
>>> lynx, w3m) aren't - if Nigel is using one of them that might be the
>>> problem.
>> lynx is incremental as well...
Not the version I have here (2.8.7dev9) which appears to be almost the
newest developer version (there's a 2.8.7dev10 on lynx.isc.org). So I
wouldn't expect the stable release (2.8.6) to be incremental, either.
> elinks too...
Yup. That's the exception I am aware of.
hp
------------------------------
Date: Mon, 6 Oct 2008 05:54:54 -0700 (PDT)
From: "serafinjr.gutierrez@gmail.com" <serafinjr.gutierrez@gmail.com>
Subject: Real Estate | Property Listing | Apartment, Homes for Sale and for Rent
Message-Id: <b21e7c8a-6cbd-4fc1-ad7d-1de3a42cf8f0@o40g2000prn.googlegroups.com>
Real Estate Ozfreeonline, your one stop free property listing of
commercial real estates, homes for sale, apartment and house for rent
in Australia. Get listed and sell house fast or search your dream
apartment and house for rent.visit us: http://realestate.ozfreeonline.com
------------------------------
Date: Mon, 06 Oct 2008 09:27:23 +0200
From: Larry <dontmewithme@got.it>
Subject: Re: sysread
Message-Id: <dontmewithme-063EC3.09272306102008@news.tin.it>
In article <20081005192912.762$Lz@newsreader.com>, xhoster@gmail.com
wrote:
> It might be possible, but extremely unlikely, that this could return early
> after reading less than 4 bytes, even in the absence of an error condition.
> If I had to use sysread, I'd probably take that chance, myself. But why
> not just use read? That will restart as needed until it gets 4 bytes,
> unless there are errors.
this is something I forgot to tell you about:
"sysread bypasses buffered IO, so mixing this with other kinds of reads,
print, write, seek, tell, or eof can cause confusion because the perlio
or stdio layers usually buffers data."
what is buffered IO ?? how size is this buffer ??
May I just use read() for getting important data (like the header) then
use sysread to get the raw data??
thanks
------------------------------
Date: Mon, 6 Oct 2008 09:06:27 +0000 (UTC)
From: Willem <willem@stack.nl>
Subject: Re: sysread
Message-Id: <slrngejl8j.1gh8.willem@snail.stack.nl>
Larry wrote:
) this is something I forgot to tell you about:
)
) "sysread bypasses buffered IO, so mixing this with other kinds of reads,
) print, write, seek, tell, or eof can cause confusion because the perlio
) or stdio layers usually buffers data."
)
) what is buffered IO ?? how size is this buffer ??
)
) May I just use read() for getting important data (like the header) then
) use sysread to get the raw data??
Why do you not want to use read() to get the raw data ?
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
------------------------------
Date: Mon, 06 Oct 2008 12:13:19 +0200
From: Larry <dontmewithme@got.it>
Subject: Re: sysread
Message-Id: <dontmewithme-711845.12131906102008@news.tin.it>
In article <dontmewithme-063EC3.09272306102008@news.tin.it>,
Larry <dontmewithme@got.it> wrote:
> "sysread bypasses buffered IO, so mixing this with other kinds of reads,
> print, write, seek, tell, or eof can cause confusion because the perlio
> or stdio layers usually buffers data."
>
> what is buffered IO ?? how size is this buffer ??
Ok, I have found this:
use IO::Handle '_IOLBF';
$io->setvbuf($buffer_var, _IOLBF, 1024);
(still I don't know whether I need to set _IOLBF or _IOFBF)
Will it set input buffer also? I would like to keep it as small as
possible.
And yes, I'll be reading from STDIN by using read, not sysread:
(hopefully the following will work, untested yet)
use IO::Handle;
$io = new IO::Handle;
$io->setvbuf($buffer_var, _IOLBF, 1024);
if ( $io->fdopen(fileno(STDIN),"r") )
{
$io->read($buf, 4);
$io->read($header, unpack("N", $buf));
while( $io->read($raw, 1024) )
{
... raw data
}
}
------------------------------
Date: Mon, 06 Oct 2008 13:46:09 +0200
From: Larry <dontmewithme@got.it>
Subject: Re: sysread
Message-Id: <dontmewithme-ABD3FC.13460906102008@news.tin.it>
In article <dontmewithme-711845.12131906102008@news.tin.it>,
Larry <dontmewithme@got.it> wrote:
> And yes, I'll be reading from STDIN by using read, not sysread:
> (hopefully the following will work, untested yet)
>
I tried the following and it worked great (although I don't know how
much '_IONBF' affected the script)
Do you think I should read the header in small chunks or I can read it
all on the fly with just one read() call ??
If I set the internal buffer to be as small as possible, will I be able
to read the header all at once with just one read() call ??
Still, I don't know what size the internal buffer is...
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle '_IONBF';
my $io = new IO::Handle;
my $buf;
my $len;
my $header;
if( $ENV{"REQUEST_METHOD"} eq 'GET' || $ENV{"REQUEST_METHOD"} eq 'HEAD' )
{
close(STDIN);
print "Content-type: text/plain\n\n";
print "Hello World!\n";
exit;
}
if( $ENV{"REQUEST_METHOD"} ne 'POST' )
{
close(STDIN);
print "Content-type: text/plain\n\n";
print "What are you trying to do?\n";
exit;
}
open my $fh1, ">", "rawdata.txt" or die "$!\n";
binmode $fh1;
if ( $io->fdopen(fileno(STDIN),"r") )
{
$io->read($len, 4);
$io->read($header, unpack("N", $len));
{open my $fh2, ">", "header.txt" or die "$!\n";
binmode $fh2;
print $fh2 $header;
close $fh2;}
while( $io->read($buf, 1024) )
{
print $fh1 $buf;
}
$io->close;
}
close $fh1;
print "Content-type: text/plain\n\n";
------------------------------
Date: 06 Oct 2008 15:47:16 GMT
From: xhoster@gmail.com
Subject: Re: sysread
Message-Id: <20081006114717.810$9F@newsreader.com>
Larry <dontmewithme@got.it> wrote:
> In article <20081005192912.762$Lz@newsreader.com>, xhoster@gmail.com
> wrote:
>
> > It might be possible, but extremely unlikely, that this could return
> > early after reading less than 4 bytes, even in the absence of an error
> > condition. If I had to use sysread, I'd probably take that chance,
> > myself. But why not just use read? That will restart as needed until
> > it gets 4 bytes, unless there are errors.
>
> this is something I forgot to tell you about:
>
> "sysread bypasses buffered IO, so mixing this with other kinds of reads,
> print, write, seek, tell, or eof can cause confusion because the perlio
> or stdio layers usually buffers data."
>
> what is buffered IO ??
IO that is buffered by perl or by C on behalf of perl.
> how size is this buffer ??
That depends on your system. Why do you care? The point of using
high-level languages is that you usually don't need to worry about such
things.
> May I just use read() for getting important data (like the header) then
> use sysread to get the raw data??
That would be mixing sysread with "other kinds of reads", which as you
just quoted, is a bad idea.
Why are you bound and determined to shoot yourself in the foot?
Xho
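A sketch of sticking to buffered read() for the whole stream, under the framing Larry describes in this thread (a 4-byte big-endian length, then a header of that length, then raw data). The helper name read_exact and the sample payload are made up, and an in-memory filehandle stands in for STDIN so the example runs anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up payload: 4-byte length, header of that length, then raw data.
my $payload = pack('N', 5) . 'hello' . 'raw bytes follow';
open my $in, '<', \$payload or die "cannot open in-memory handle: $!";
binmode $in;

# Read exactly $n bytes with buffered read(), or die trying.
sub read_exact {
    my ($fh, $n) = @_;
    my $buf = '';
    my $got = read($fh, $buf, $n);
    die "read failed: $!" unless defined $got;
    die "short read: wanted $n bytes, got $got" unless $got == $n;
    return $buf;
}

my $len    = unpack 'N', read_exact($in, 4);
my $header = read_exact($in, $len);

my $raw = '';
while (read($in, my $chunk, 1024)) {   # same handle, same buffered layer
    $raw .= $chunk;
}
close $in;

print "header=$header raw=$raw\n";
```

Because every call goes through the same buffered layer, the size of the internal buffer never shows up in the logic.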
--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
------------------------------
Date: Mon, 06 Oct 2008 17:01:52 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: sysread
Message-Id: <48ea3672$0$3383$fa0fcedb@news.zen.co.uk>
Larry wrote:
> I have just read up on sysread and I was struck by the following:
...
> I'm writing up a script to get binary data from <STDIN> (the script is
> run on a normal web server and the data is sent to it by using http's
> POST method)
...
> any help will be apreciated,
If I was having trouble reinventing a wheel, I'd consider using the free
wheel.
According to the docs, CGI.pm handles binary uploads of this type
without requiring you to use sysread in your code.
Just my ¤0.02 worth.
--
RGB
------------------------------
Date: Mon, 6 Oct 2008 04:52:25 -0700 (PDT)
From: cartercc <cartercc@gmail.com>
Subject: Re: trouble parsing "kind of" comma-delimited text
Message-Id: <1ca5f72e-d2a3-4722-8a17-050dc120a759@e2g2000hsh.googlegroups.com>
On Oct 3, 1:37 pm, nun <j...@yahoo.com> wrote:
> I need to process a text file of product data that's supplied to me by a
> vendor. This data is "kind of" comma-delimited.... some of the rows
> contain commas within the "description" field, and in these cases (and
> only in these cases) that field's data is enclosed by double quotes.
> Here is some sample data:
>
> SKU,DESCRIPTION,PRICE
> 12345,CABLE,21.25
> 56789,"CONNECTOR, LARGE",13.50
If you want, you can open the file in Excel and save as a file
delimited in whatever you want.
I often get Excel files, and my approach is to preprocess the file to
do two things: (1) replace all commas with the pipe symbol (|) unless
the comma is between the double quotes, and (2) remove the double
quotes.
CC
------------------------------
Date: Mon, 06 Oct 2008 06:41:41 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: trouble parsing "kind of" comma-delimited text
Message-Id: <q05ke4tufk2rgpdg6kkn2u8bt5vujslnfv@4ax.com>
cartercc <cartercc@gmail.com> wrote:
>On Oct 3, 1:37 pm, nun <j...@yahoo.com> wrote:
>> I need to process a text file of product data that's supplied to me by a
>> vendor. This data is "kind of" comma-delimited.... some of the rows
>> contain commas within the "description" field, and in these cases (and
>> only in these cases) that field's data is enclosed by double quotes.
>> Here is some sample data:
>>
>> SKU,DESCRIPTION,PRICE
>> 12345,CABLE,21.25
>> 56789,"CONNECTOR, LARGE",13.50
>
>If you want, you can open the file in Excel and save as a file
>delimited in whatever you want.
>
>I often get Excel files, and my approach is to preprocess the file to
>do two things: (1) replace all commas with the pipe symbol (|) unless
>the comma is between the double quotes, and (2) remove the double
>quotes.
Which only replaces Scylla with Charybdis, because now your target file
will have double quotes around every value that contains a pipe symbol.
Why not just use a standard CSV parser to read the data? Simple,
straightforward, no fuss, no problems, no pre-processing, and most
importantly, correct.
jue
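As a rough illustration of letting a library handle the quoting, the sketch below uses parse_line from Text::ParseWords, chosen here only because it ships with Perl. It is a stand-in, not a full CSV parser: it also treats apostrophes as quote characters, so data containing them would need a dedicated CSV module from CPAN such as Text::CSV.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# The sample rows from the original question.
my @lines = (
    'SKU,DESCRIPTION,PRICE',
    '12345,CABLE,21.25',
    '56789,"CONNECTOR, LARGE",13.50',
);

my @rows;
for my $line (@lines) {
    # Second argument 0: strip the enclosing quotes from each field.
    my @fields = parse_line(',', 0, $line);
    push @rows, \@fields;
    print join(' | ', @fields), "\n";
}
```

The quoted comma stays inside the description field, so every row comes back with exactly three fields and no pre-processing step is needed.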
------------------------------
Date: Mon, 6 Oct 2008 05:52:08 -0700 (PDT)
From: grocery_stocker <cdalten@gmail.com>
Subject: Re: What's the difference between *ERROR and \*ERROR in the following code
Message-Id: <d08af330-71ff-4998-b35b-0f397fc5a4d6@r15g2000prd.googlegroups.com>
On Oct 5, 5:22 pm, grocery_stocker <cdal...@gmail.com> wrote:
> When I take a glob reference to ERROR....
>
> m-net% more mv.pl
> #!/usr/bin/perl -w
>
> use IPC::Open3;
>
> local (*READ, *WRITE, *ERROR);
>
> $pid = open3(\*READ, \*WRITE, \*ERROR, 'mv abc /efg');
>
> waitpid($pid, 0);
> if($?) {
>     warn "exit code = ", $?>>8, "\n";
>     my $result = <ERROR>;
>     print $result;
>
> }
>
> I get....
> m-net% ./mv.pl
> exit code = 1
> mv: rename abc to /efg: No such file or directory
>
> But when I DON'T take a glob reference to ERROR, I get the same thing.
> m-net% more mv2.pl
> #!/usr/bin/perl -w
>
> use IPC::Open3;
>
> local (*READ, *WRITE, *ERROR);
>
> $pid = open3(*READ, *WRITE, *ERROR, 'mv abc /efg');
>
> waitpid($pid, 0);
> if($?) {
>     warn "exit code = ", $?>>8, "\n";
>     my $result = <ERROR>;
>     print $result;
>
> }
>
> m-net% ./mv2.pl
> exit code = 1
> mv: rename abc to /efg: No such file or directory
> m-net%
>
> Why is this?
Never mind. When I use strict refs in the latter, I get an error.
Presumably because the latter could possibly be used as a symbolic (string) reference.
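One way to sidestep the bareword question under "use strict" is to pass real handle references. The sketch below is an assumption-laden variant of the quoted script: it replaces 'mv abc /efg' with a child Perl process that writes to STDERR and exits with status 1 so that it behaves the same everywhere, and it creates the error handle with gensym from the core Symbol module, since open3 does not create that handle by itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';

my $err = gensym;    # a fresh anonymous glob for the child's STDERR

# $^X is the running perl; the child prints to STDERR and exits with 1.
my $pid = open3(my $to_child, my $from_child, $err,
                $^X, '-e', 'print STDERR "oops\n"; exit 1');
close $to_child;

my @stderr = <$err>;       # drain STDERR before reaping the child
waitpid($pid, 0);
my $exit = $? >> 8;

warn "exit code = $exit\n" if $exit;
print @stderr;
```

Reading the error handle before calling waitpid avoids a stall when the child writes more than a pipe buffer can hold.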
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1902
***************************************