
Perl-Users Digest, Issue: 1850 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Sep 12 00:09:46 2008

Date: Thu, 11 Sep 2008 21:09:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 11 Sep 2008     Volume: 11 Number: 1850

Today's topics:
    Re: binmode blues <whynot@pozharski.name>
    Re: Extract Numeric values from string <v3gupta@gmail.com>
    Re: I'm struggling with an EZ way to do this regex <nospam-abuse@ilyaz.org>
    Re: I'm struggling with an EZ way to do this regex <ben@morrow.me.uk>
    Re: laziest / fastest way to match last characters of a <jurgenex@hotmail.com>
    Re: laziest / fastest way to match last characters of a <blabla@dungeon.de>
    Re: laziest / fastest way to match last characters of a <ben@morrow.me.uk>
    Re: OK to delete hash pairs while iterating through it? <whynot@pozharski.name>
    Re: OK to delete hash pairs while iterating through it? xhoster@gmail.com
    Re: OK to delete hash pairs while iterating through it? xhoster@gmail.com
        removingCR/LF from unix and windows and mixed files <news1234@free.fr>
    Re: removingCR/LF from unix and windows and mixed files <news1234@free.fr>
    Re: removingCR/LF from unix and windows and mixed files <jimsgibson@gmail.com>
    Re: removingCR/LF from unix and windows and mixed files <tadmc@seesig.invalid>
    Re: simple perl script for file uploads ? <jack_posemsky@yahoo.com>
    Re: standard modules? <joost@zeekat.nl>
        Thoughts on speeding up PDF::API2 <bill@ts1000.us>
    Re: Thoughts on speeding up PDF::API2 <ben@morrow.me.uk>
    Re: Thoughts on speeding up PDF::API2 xhoster@gmail.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 11 Sep 2008 23:33:10 +0300
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: binmode blues
Message-Id: <6tbop5x693.ln2@carpet.zombinet>

jidanni@jidanni.org wrote:
> What is wrong with
> $ cat normalize
> #!/usr/bin/perl
> binmode STDIN, ":utf8"; binmode STDOUT, ":utf8";
> use Unicode::Normalize q(decompose);
> local $/; $_=<>; print decompose($_);

> which is causing these to differ?:
> $ normalize   4.htm |head -c 11|od -x|head -n 1
> 0000000 2020 cc61 c282 c280 20a2 0061
> $ normalize < 4.htm |head -c 11|od -x|head -n 1
> 0000000 2020 80e2 20a2 b0e5 e88e 00a6

> I.e., why must I remember to use a "<" to get proper results?

23:03:38 38 [0:0]$ perl -wle '
$x = <>; system qq|ls -l /proc/$$/fd|' /etc/passwd
Name "main::x" used only once: possible typo at -e line 1.
total 0
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 0 -> /dev/pts/1
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 1 -> /dev/pts/1
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 2 -> /dev/pts/1
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 3 -> /etc/passwd
                                               ^^^^^^^^^^^^^^^^
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 4 -> pipe:[4517507]
l-wx------ 1 whynot whynot 64 2008-09-11 23:31 5 -> pipe:[4517507]
23:31:10 39 [0:0]$ perl -wle '
$x = <>; system qq|ls -l /proc/$$/fd|' </etc/passwd
Name "main::x" used only once: possible typo at -e line 1.
total 0
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 0 -> /etc/passwd
                                               ^^^^^^^^^^^^^^^^
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 1 -> /dev/pts/1
lrwx------ 1 whynot whynot 64 2008-09-11 23:31 2 -> /dev/pts/1
lr-x------ 1 whynot whynot 64 2008-09-11 23:31 3 -> pipe:[4517541]


-- 
Torvalds' goal for Linux is very simple: World Domination


------------------------------

Date: Thu, 11 Sep 2008 21:05:29 -0700 (PDT)
From: Vishal G <v3gupta@gmail.com>
Subject: Re: Extract Numeric values from string
Message-Id: <a3c46c44-7394-4879-878b-fbebc146ea97@b2g2000prf.googlegroups.com>

On Sep 12, 3:44 am, xhos...@gmail.com wrote:
> Jürgen Exner <jurge...@hotmail.com> wrote:
> >VishalG<v3gu...@gmail.com> wrote:
> > >I have a string which contains numbers...
> >
> > >$str = "30 574 454 67 59 298928 74 4875 8 934"; # in actual string
> > >there are 112 million values
> >
> > Wow! A single string of maybe half a gigabyte length? That sounds like
> > an awfully poor datastructure.
>
> Yes.  But Perl is often used as a glue language.  As such, it often has
> to deal with poor datastructures.  If the other programs could be easily
> changed to do the right thing in the first place, we wouldn't need the
> glue.
>
> ...
>
> > I would put that data into a more suitable data structure.
> > Maybe write the string to a file and then read it back into an array
> > using the space character as the line separator?
>
> That would use at least half as much memory as splitting, and so would
> probably be memory prohibitive.
>
> > Or loop through the string character by character and note all positions
> > of space characters in an array. Then you can use substr() to extract
> > the desired substring directly.
>
> If this only has to be done once per execution, then I would just leave it
> in the original structure and step through it with /(\d+)/g.  If I was going
> to do several extractions, I would convert the string so that each element
> is fixed size (either by padding the numbers with 0 to the max length, or
> by using pack with the appropriate template) then use substr to get the
> desired chunk.
>
> while ($str=~/(\d+)/g) {$y.=pack "i", $1};
>
> Xho
>
> --
> -------------------- http://NewsReader.Com/ --------------------
> The costs of publication of this article were defrayed in part by the
> payment of page charges. This article must therefore be hereby marked
> advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
> this fact.

Thanx a lot for all these insightful ideas.

> Wow! A single string of maybe half a gigabyte length? That sounds
> like an awfully poor data structure.

Actually, I am changing Perl scripts written by someone else, and
changing the data structure is not an option because other modules
depend on it.

It's an ACE (assembly) file which contains the DNA and a quality value
for each base. So if the DNA is 220 million bases long, we end up with
one string containing 220 million numeric values, which is cumbersome
to manage when you have to add and extract information from it.

The information is in the file as I said earlier and is read into this
data structure. I am trying to split the assembly into parts of
variable length. That's why I am trying to split the string, but if I
use the split function to get the 1 million records, it uses 3.0 GB of
memory, which is ridiculous.


------------------------------

Date: Thu, 11 Sep 2008 20:17:42 +0000 (UTC)
From:  Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: I'm struggling with an EZ way to do this regex
Message-Id: <gabud6$1vi6$1@agate.berkeley.edu>

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Ben Morrow 
<ben@morrow.me.uk>], who wrote in article <8qlnp5-1d9.ln1@osiris.mauzo.dyndns.org>:
> >    '44,33,4.44.64.10,32,25,88,20,6,55'
> > 
> > and I want a regex that replaces any number in the string with say 'XX',

I do not know what a "number" is.  I assume you mean "a sequence of digits".

> Something like
> 
>     s/ (?! $a ) \d+ /XX/gx

  s/ \b (?! $a \b ) \d+ /XX/gx

Hope this helps,
Ilya


------------------------------

Date: Fri, 12 Sep 2008 01:35:20 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: I'm struggling with an EZ way to do this regex
Message-Id: <83qop5-7ku1.ln1@osiris.mauzo.dyndns.org>


Quoth Ilya Zakharevich <nospam-abuse@ilyaz.org>:
> [A complimentary Cc of this posting was NOT [per weedlist] sent to
> Ben Morrow 
> <ben@morrow.me.uk>], who wrote in article
> <8qlnp5-1d9.ln1@osiris.mauzo.dyndns.org>:
> > >    '44,33,4.44.64.10,32,25,88,20,6,55'
> > > 
> > > and I want a regex that replaces any number in the string with say 'XX',
> 
> I do not know what is a "number".  I assume you mean "a sequence of digits".
> 
> > Something like
> > 
> >     s/ (?! $a ) \d+ /XX/gx
> 
>   s/ \b (?! $a \b ) \d+ /XX/gx

Duh! I was thinking I needed a \d\D boundary, but of course for the
string given a \w\W boundary works just as well.

Thanks

Ben

-- 
"If a book is worth reading when you are six,                * ben@morrow.me.uk
it is worth reading when you are sixty."  [C.S.Lewis]


------------------------------

Date: Thu, 11 Sep 2008 11:51:49 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: laziest / fastest way to match last characters of a string
Message-Id: <40qic4th444blvjo5fugho1pckhnl7tdlr@4ax.com>

hofer <blabla@dungeon.de> wrote:
>$text = "Today is a nice day";
>$end = "day";
>print "text ends with $end" if $text =~ /$end$/;
>
>Would the regular expression be efficient for long strings?
>
>The alternative is a little more awkward to type
>
>print "text ends with $end" if substr($text,-length($end)) eq $end;  # I
>didn't try this line, but it should work I think

These two versions do very different things. If you need REs, then the
second version won't do you any good. 
If you want textual comparison without RE-behaviour then the first
version is wrong unless you have a very limited set of possible data.

Use the one that matches your needs. Usually correct is more important
than fast.

jue


------------------------------

Date: Thu, 11 Sep 2008 13:45:56 -0700 (PDT)
From: hofer <blabla@dungeon.de>
Subject: Re: laziest / fastest way to match last characters of a string
Message-Id: <4cf26e9f-4df6-4e0c-b594-96eda2f078bc@79g2000hsk.googlegroups.com>

On Sep 11, 8:51 pm, Jürgen Exner <jurge...@hotmail.com> wrote:

> >print "text ends with $end" if $text =~ /$end$/;
>
> >print "text ends with $end" if substr($text,-length($end)) eq $end;  # I
>
> These two versions do very different things. If you need REs, then the
> second version won't do you any good.
> If you want textual comparison without RE-behaviour then the first
> version is wrong unless you have a very limited set of possible data.
>
> Use the one that matches your needs. Usually correct is more important
> than fast.
>
Hi Juergen,

In fact I don't need REs, and the strings to match at the end won't
contain backslashes, dots or other characters that could be taken as an
RE.

So in my special case both are interchangeable.

For me the RE is visually more intuitive than the substr with the
-length(), plus there is the fact that the string to be searched would
have to be entered twice if it were a constant and not a variable.

I just wondered if Perl has a built-in string_ends_with() function, or
whether REs would be much slower.

As Ben pointed out, the first thing the RE search does is check at the
end of the string, so I guess I'll stick with REs.


bye



------------------------------

Date: Fri, 12 Sep 2008 01:40:35 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: laziest / fastest way to match last characters of a string
Message-Id: <3dqop5-7ku1.ln1@osiris.mauzo.dyndns.org>


Quoth hofer <blabla@dungeon.de>:
> On Sep 11, 8:51 pm, Jürgen Exner <jurge...@hotmail.com> wrote:
> 
> > >print "text ends with $end" if $text =~ /$end$/;
> >
> > >print "text ends with $end"  substr($text,-length($end)) eq $end;  # I
> >
> > These two versions do very different things. If you need REs, then the
> > second version won't do you any good.
> > If you want textual comparison without RE-behaviour then the first
> > version is wrong unless you have a very limited set of possible data.
> >
> > Use the one that matches your needs. Usually correct is more important
> > than fast.
> >
> Hi Juergen,
> 
> In fact I don't need REs and the finishing strings won't contain
> backslashes, dots or other characters, that could be taken as RE.
> 
> So in my special case both are interchangable.

Be aware that /$/ has rather odd semantics: it will match before a
newline at the end of the string, in a somewhat misguided attempt to
handle reading from a filehandle without chomping. If this is an issue
(if your string might contain newlines, and you *don't* want to match
them like this), use /\z/ instead.

Also, when interpolating a variable that's meant to be taken literally,
it's always worth quoting it like this:

    /\Q$end\E$/

just in case.

> For me the RE is visualy more intuitive than the substr with the -
> length() and the fact, that the string to be searched has
> to be entered twice if it were a constant and not a variable

The second is a nonissue. Allowing you to type things only once is what
variables are *for* :).

> I just wondered if perl has a built-in string_ends_with() function or
> whether REs would be much slower.

Well, yes; it's called a regex.

Ben

-- 
           All persons, living or dead, are entirely coincidental.
ben@morrow.me.uk                                                  Kurt Vonnegut


------------------------------

Date: Thu, 11 Sep 2008 23:58:54 +0300
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: OK to delete hash pairs while iterating through it?
Message-Id: <eddop5xo2n.ln2@carpet.zombinet>

xhoster@gmail.com wrote:
*SKIP*
> Making it safe to delete the one just returned required a special case
> in the hash code guts, while deleting anything else is "naturally"
> safe.  Adding, on the other hand, is something best avoided.

(Just opening one more DGBI war)

First I picked that:

perl -wle '
%x = qw(a b c d e f);
$z = 0;
while($y = each %x) {
print "$y => $x{$y}";
$x{$z++} = $z++ if $z < 10; }
'
e => f
1 => 0
c => d
a => b
3 => 2
7 => 6
9 => 8
5 => 4

I don't know what's going on here.  The second, I believe, should be
more clear:

perl -wle '
%x = qw(a b c d e f);
%y = qw(z y x w v u);
while($z = each %x) {
print "$z => $x{$z}";
$z = each %y;
defined $z or next;
$x{$z} = $y{$z}; }
'
e => f
c => d
a => b
x => w
v => u
z => y

What surprises me most, is key-messing in first case, which doesn't show
in second.

-- 
Torvalds' goal for Linux is very simple: World Domination


------------------------------

Date: 12 Sep 2008 02:39:31 GMT
From: xhoster@gmail.com
Subject: Re: OK to delete hash pairs while iterating through it?
Message-Id: <20080911223935.296$F7@newsreader.com>

Eric Pozharski <whynot@pozharski.name> wrote:
> xhoster@gmail.com wrote:
> *SKIP*
> > Making it safe to delete the one just returned required a special case
> > in the hash code guts, while deleting anything else is "naturally"
> > safe.  Adding, on the other hand, is something best avoided.
>
> (Just opening one more DGBI war)

What's a DGBI war?

>
> First I picked that:
>
> perl -wle '
> %x = qw(a b c d e f);
> $z = 0;
> while($y = each %x) {
> print "$y => $x{$y}";
> $x{$z++} = $z++ if $z < 10; }
> '
> e => f
> 1 => 0
> c => d
> a => b
> 3 => 2
> 7 => 6
> 9 => 8
> 5 => 4
>
> I don't know what's going on here.

I don't know what part you don't know.  Could you be more specific?


> The second, I believe, should be
> more clear:

And I don't know *what* you think will be made clear.
>
> perl -wle '
> %x = qw(a b c d e f);
> %y = qw(z y x w v u);
> while($z = each %x) {
> print "$z => $x{$z}";
> $z = each %y;
> defined $z or next;
> $x{$z} = $y{$z}; }

Here, the value of $x{$z} is changed, but no *keys* are added or deleted
from %x while it is being iterated over by each.


> '
> e => f
> c => d
> a => b
> x => w
> v => u
> z => y
>
> What surprises me most, is key-messing in first case, which doesn't show
> in second.

Which key do you consider to be missing?  Adding to a hash can give
"weird" results, but your first example doesn't illustrate that in any way.

Are you confusing yourself with the increment operator and order of
execution?  Sorry, I just don't see what you are seeing in that data.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: 12 Sep 2008 02:51:59 GMT
From: xhoster@gmail.com
Subject: Re: OK to delete hash pairs while iterating through it?
Message-Id: <20080911225202.898$i6@newsreader.com>

xhoster@gmail.com wrote:
> Eric Pozharski <whynot@pozharski.name> wrote:
> > perl -wle '
> > %x = qw(a b c d e f);
> > %y = qw(z y x w v u);
> > while($z = each %x) {
> > print "$z => $x{$z}";
> > $z = each %y;
> > defined $z or next;
> > $x{$z} = $y{$z}; }
>
> Here, the value of $x{$z} is changed, but no *keys* are added or deleted
> from %x while it is being iterated over by each.

No, I misinterpreted that.  Yes, keys are being added to %x.


>
> > '
> > e => f
> > c => d
> > a => b
> > x => w
> > v => u
> > z => y

But what conclusion are you drawing from this?

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Thu, 11 Sep 2008 23:19:15 +0200
From: nntpman68 <news1234@free.fr>
Subject: removingCR/LF from unix and windows and mixed files
Message-Id: <48c98b50$0$13865$426a74cc@news.free.fr>

Hi,


I have files which I'd like to slurp into an array
(one file per array).
However, I'd like to get rid of the end-of-line characters.
As the files were created by Windows users or Linux users, the lines
will end with either \n or with \r\n.
Additionally, Linux files might have been modified by Windows users
(or vice versa), and not all editors are smart enough to adapt to the
file's mode, so some files might have mixed line endings.

I came up with

@a = <$filehandle>;
foreach $l (@a) { $l =~ s/(\r\n|\r)$//; }

or with

@a = grep { s/(\r\n|\r)$// } <$filehandle>;

or with

@a = ();
while (<$fh>) { s/(\r\n|\r)$//; push @a, $_; }

In the above examples I could also replace the substitution with tr/\r\n//d


Is there already something like a strip_eof() function, or should I
stick with one of the above?



N





------------------------------

Date: Thu, 11 Sep 2008 23:27:23 +0200
From: nntpman68 <news1234@free.fr>
Subject: Re: removingCR/LF from unix and windows and mixed files
Message-Id: <48c98d38$0$8682$426a74cc@news.free.fr>

Oops, minor typo.

The last sentence should have been:
"Is there already something like a strip_eol() function"
and not "strip_eof()".

nntpman68 wrote:
> Hi,
> 
> 
> I'm having files, which I'd like to slurp into an array
> (one file per array)
> However I'd like to get rid of the end of line characters of the files.
> As files were created by windows users or linux users the lines will end
> with either \n or  with \r\n.
> Additionally linux files might have been modified by windows users
> (or vice versa) and not all editors are smart enough to adapt to the 
> files mode. so some files might have mixed line endings.
> 
> I came up with
> 
> @a = <$filehandle>
> foreach $l (@a) { $l =~ s/(\r\n|\r)$//; }
> 
> or with
> 
> @a = grep { s/(\r\n|\r)$// } <$filehandle>
> 
> or with
> 
> @a=()
> while(<$fh){ s/(\r\n|\r)$// ; push(@a); }
> 
> In above examples I could also replace the substitute with  tr/\r\n//d
> 
> 
> Is there already something like a strip_eof() function, or should I 
> stick with one of the above?
> 
> 
> 
> N
> 
> 
> 


------------------------------

Date: Thu, 11 Sep 2008 16:56:01 -0700
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: removingCR/LF from unix and windows and mixed files
Message-Id: <110920081656013176%jimsgibson@gmail.com>

In article <48c98b50$0$13865$426a74cc@news.free.fr>, nntpman68
<news1234@free.fr> wrote:

> Hi,
> 
> 
> I'm having files, which I'd like to slurp into an array
> (one file per array)
> However I'd like to get rid of the end of line characters of the files.
> As files were created by windows users or linux users the lines will end
> with either \n or  with \r\n.
> Additionally linux files might have been modified by windows users
> (or vice versa) and not all editors are smart enough to adapt to the 
> files mode. so some files might have mixed line endings.
> 
> I came up with
> 
> @a = <$filehandle>
> foreach $l (@a) { $l =~ s/(\r\n|\r)$//; }

What about lines with just "\n" in them? This is a little shorter and
uses a character class instead of grouping and alternation:

  s/[\r\n]+$// for @a;

> 
> or with
> 
> @a = grep { s/(\r\n|\r)$// } <$filehandle>

You will drop lines that don't have "\r" in them. You should probably
use map instead of grep here.

> 
> or with
> 
> @a=()
> while(<$fh){ s/(\r\n|\r)$// ; push(@a); }
> 
> In above examples I could also replace the substitute with  tr/\r\n//d

That would be a good idea.

> 
> 
> Is there already something like a strip_eof() function, or should I 
> stick with one of the above?

No. Perl's built-in functions are described in 'perldoc perlfunc' and
by 'perldoc -f xxx'. If you don't find something there, then you can
start looking at CPAN (<http://search.cpan.org>).

-- 
Jim Gibson


------------------------------

Date: Thu, 11 Sep 2008 18:28:12 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: removingCR/LF from unix and windows and mixed files
Message-Id: <slrngcjacc.88g.tadmc@tadmc30.sbcglobal.net>

nntpman68 <news1234@free.fr> wrote:

>
> I'm having files, which I'd like to slurp into an array
> (one file per array)
> However I'd like to get rid of the end of line characters of the files.
> As files were created by windows users or linux users the lines will end
> with either \n or  with \r\n.
> Additionally linux files might have been modified by windows users
> (or vice versa) and not all editors are smart enough to adapt to the 
> files mode. so some files might have mixed line endings.
>
> I came up with
>
> @a = <$filehandle>
> foreach $l (@a) { $l =~ s/(\r\n|\r)$//; }


   foreach $l (@a) { $l =~ s/\r?\n//; }


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Thu, 11 Sep 2008 20:55:31 -0700 (PDT)
From: Jack <jack_posemsky@yahoo.com>
Subject: Re: simple perl script for file uploads ?
Message-Id: <6588a485-c9a5-442a-ba3f-79424e3d3dea@s20g2000prd.googlegroups.com>

On Sep 11, 10:44 am, Jack <jack_posem...@yahoo.com> wrote:
> On Sep 11, 9:01 am, xhos...@gmail.com wrote:
>
> > Jack <jack_posem...@yahoo.com> wrote:
> > > Hi I am looking for a NON form based simple script that handles a file
> > > upload on the server side.
>
> > So of the N possible user interfaces, where N is a very large number,
> > perhaps infinite, you are willing to work with any of the N-1 of them?
>
> > I think a better approach is to define what you want, not what you don't
> > want.
>
> > Xho
>
> I want to add the ability to post a file from a browser to a server.
> An example is craigslist, where the user selects an image file and
> uploads it to their servers. That's it, it's very simple, and preferably
> not via a form post, since I am already using that with my Ajax code
> that populates my HTML + Perl logic into a form, and it's advisable to
> post to only one form, not more than one.
>
> Thank you,
>
> Jack

Hi, I answered Xho's question.  Can anyone help here with real example
Perl code (NOT CGI) that works for browser-to-server file uploads?

thank you,

Jack


------------------------------

Date: Thu, 11 Sep 2008 21:14:55 +0200
From: Joost Diepenmaat <joost@zeekat.nl>
Subject: Re: standard modules?
Message-Id: <87d4japk9c.fsf@zeekat.nl>

bugbear <bugbear@trim_papermule.co.uk_trim> writes:

> Is there a definitive list of the modules
> that are part of perl, and ALWAYS installed,
> as opposed to modules that are "very handy"
> and OFTEN installed?

What the others said, but remember that some Linux distributions, at
least, split the perl core into several packages, so even if a module
is in the core, it's still not guaranteed to be installed when perl
itself is available.

-- 
Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/


------------------------------

Date: Thu, 11 Sep 2008 17:11:26 -0700 (PDT)
From: Bill H <bill@ts1000.us>
Subject: Thoughts on speeding up PDF::API2
Message-Id: <fb578a8e-7277-4047-9e0d-ca74b0ad9295@r66g2000hsg.googlegroups.com>

In a recent post I asked about speeding up a Perl script that uses
PDF::API2. I did some profiling of the code and saw that the vast
majority of the time (about 90%) is spent going through all
the .pm's in the PDF::API2 library. Once it gets past all of the
initialization, my code that uses the API goes very fast, creating a
20+ page PDF document with separate image thumbnail files of each page
(via ImageMagick) in less than 2 seconds.

In a meeting we were having tonight we were tossing around the idea of
having the program go through its initial setup and then "pause" to
wait for a signal to create a PDF file, then create the PDF and images,
and then go back to the pause. Basically running all the time as a
service. Anyone see any reason why this would be a bad idea?

We further started wondering: instead of pausing, running on a signal,
and then going back to pausing for the next signal to make a PDF,
would it be possible to fork off a child at that point and have the
child create the PDF / images and exit, while the parent stayed at the
pause position waiting for another signal to fork off a child? If we
forked off a child, would it start from the beginning of the script or
would it start at the same place (probably the next line) in the Perl
script it was forked off from?

Any thoughts?

Bill H


------------------------------

Date: Fri, 12 Sep 2008 01:48:26 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Thoughts on speeding up PDF::API2
Message-Id: <qrqop5-7ku1.ln1@osiris.mauzo.dyndns.org>


Quoth Bill H <bill@ts1000.us>:
> In a recent post I asked about speeding up a perl script that uses
> PDF::API2. I did some profiling of the code and see that the vast
> majority of the time (about 90%) is used in going through all
> the .pm's in the PDF::API2 library. Once it gets past all of the
> initialization, my code that uses the api goes very fast, creating a
> 20+ pdf document with seperate image thumbnail files of each page (via
> imagemagik) in less than 2 seconds.
> 
> In a meeting we were having tonight we was tossing around the idea of
> having the program go through its initial setup and then "pause" to
> wait for a signal to create a pdf file, then create the pdf, images
> and then go back to the pause. Basically running all the time as a
> service. Anyone see any reason why this would be a bad idea?

No, it's a very good idea. This is exactly what systems like mod_perl
and FastCGI do to speed things up. You do have to be careful to clear
everything out between one run and the next...

> We further started wondering, instead of pausing, then running on a
> signal and then going back to pause for next signal to make a pdf,
> would it be possible to fork off a child at that point and have the
> child create the pdf / images and end, while the parent stayed at the
> pause position waiting for another signal to fork off a child.

 ...which is something fork allows you to avoid :). fork does have some
overhead, which is why programs like Apache go to some trouble to avoid
forking a new process as each request comes in, but since your previous
model was a whole new perl process for each run this probably isn't
significant.

If anyone suggests using threads from perl on a system that has a real
fork, laugh :).

> If we forked off a child, would it start from the begining of the
> script or would it start at the same place (probably next line) in the
> perl script it was forked off of?

perldoc -f fork
man 2 fork

Basically, both old and new processes will return from the fork call,
the only difference between them at that point being what is returned.

Ben

-- 
Every twenty-four hours about 34k children die from the effects of poverty.
Meanwhile, the latest estimate is that 2800 people died on 9/11, so it's like
that image, that ghastly, grey-billowing, double-barrelled fall, repeated
twelve times every day. Full of children. [Iain Banks]         ben@morrow.me.uk


------------------------------

Date: 12 Sep 2008 03:27:09 GMT
From: xhoster@gmail.com
Subject: Re: Thoughts on speeding up PDF::API2
Message-Id: <20080911232712.793$7o@newsreader.com>

Bill H <bill@ts1000.us> wrote:
> In a recent post I asked about speeding up a perl script that uses
> PDF::API2. I did some profiling of the code and see that the vast
> majority of the time (about 90%) is used in going through all
> the .pm's in the PDF::API2 library. Once it gets past all of the
> initialization, my code that uses the api goes very fast, creating a
> 20+ pdf document with seperate image thumbnail files of each page (via
> imagemagik) in less than 2 seconds.

If 10% of the time is spent doing something that takes 2 seconds,
then 100% of the time is 20 seconds, and the module loading must be taking
almost 18 seconds.  That is outrageous on any modestly recent
computer.  On my machine, loading PDF::API2 takes ~0.5 seconds.

One possible problem is if the PDF::API2 install location shows up late in
@INC, and the stuff earlier in @INC is on slow network drives.  For each of
the files it opens as part of loading PDF::API2, it has to "stat" its way
through the entire @INC list before finally finding it.


> In a meeting we were having tonight we was tossing around the idea of
> having the program go through its initial setup and then "pause" to
> wait for a signal to create a pdf file, then create the pdf, images
> and then go back to the pause. Basically running all the time as a
> service. Anyone see any reason why this would be a bad idea?

Nope.  Sounds like a good idea.  Working out the "signal" could be tricky.

> We further started wondering, instead of pausing, then running on a
> signal and then going back to pause for next signal to make a pdf,
> would it be possible to fork off a child at that point and have the
> child create the pdf / images and end, while the parent stayed at the
> pause position waiting for another signal to fork off a child.

Yes, you can do that, but it probably wouldn't be worthwhile.  Since the
make a pdf part is fast, what is the point of parallelizing it?  It would
add complexity for probably little to no benefit.


> If we
> forked off a child, would it start from the begining of the script or
> would it start at the same place (probably next line) in the perl
> script it was forked off of?

The new process and the old process start/continue at the same place.  It
isn't the next line, it is the "returning" of the fork:

$x = fork();

The fork itself only happens in the parent, but the assignment to $x
happens in both the parent and the child.

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 1850
***************************************

