[31609] in Perl-Users-Digest

Perl-Users Digest, Issue: 2868 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Mar 11 14:09:28 2010

Date: Thu, 11 Mar 2010 11:09:13 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 11 Mar 2010     Volume: 11 Number: 2868

Today's topics:
    Re: Help on String to array ! <someone@example.com>
    Re: Help on String to array ! sln@netherlands.com
    Re: Help on String to array ! <uri@StemSystems.com>
    Re: Help on String to array ! <uri@StemSystems.com>
    Re: to RG - Lisp lunacy and Perl psychosis <tadmc@seesig.invalid>
    Re: to RG - Lisp lunacy and Perl psychosis <alessiostalla@gmail.com>
        trying to extract blob as file... is corrupted <tch@nospam.wpkg.org>
    Re: trying to extract blob as file... is corrupted <ben@morrow.me.uk>
        Web Based Perl Courses <usenet05@drabble.me.uk>
    Re: Web Based Perl Courses <uri@StemSystems.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 11 Mar 2010 09:01:10 -0800
From: "John W. Krahn" <someone@example.com>
Subject: Re: Help on String to array !
Message-Id: <ql9mn.23497$wr5.2631@newsfe22.iad>

Uri Guttman wrote:
> 
> sub regex {
> 	my @arr = $hex =~ /(..{2})/g ;
> #	print "@arr\n"
> }

Shouldn't that be:

  	my @arr = $hex =~ /../g ;

Or:

  	my @arr = $hex =~ /.{2}/g ;

You are capturing *three* characters instead of two.
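
A quick check on a short string shows the difference. This is only a
sketch; the sample value of $hex is made up for illustration and is not
from the original benchmark.

```perl
use strict;
use warnings;

# Hypothetical sample input: 12 hex digits.
my $hex = 'a0b0c1d2e3f4';

# /(..{2})/ means "one char, then two more", so each match is 3 chars.
my @three = $hex =~ /(..{2})/g;

# /../ and /.{2}/ both consume exactly 2 chars per match.
my @two_a = $hex =~ /../g;
my @two_b = $hex =~ /.{2}/g;

print "three: @three\n";    # three: a0b 0c1 d2e 3f4
print "two:   @two_a\n";    # two:   a0 b0 c1 d2 e3 f4
```

With no capture group, a /g match in list context returns every whole
match, so the parentheses are not needed either.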



John
-- 
The programmer is fighting against the two most
destructive forces in the universe: entropy and
human stupidity.               -- Damian Conway


------------------------------

Date: Thu, 11 Mar 2010 09:57:43 -0800
From: sln@netherlands.com
Subject: Re: Help on String to array !
Message-Id: <l8bip51bq9eokoj52kpsl7hr1qlqe94gmk@4ax.com>

On Thu, 11 Mar 2010 04:43:45 -0800 (PST), jis <jismagic@gmail.com> wrote:

>On Mar 11, 11:15 am, "Uri Guttman" <u...@StemSystems.com> wrote:
>> >>>>> "j" == jis  <jisma...@gmail.com> writes:
>>
>Uri,
>
>I have used the script you have posted  with only change in input file
>i get the following results.
>  (warning: too few iterations for a reliable count)
>            (warning: too few iterations for a reliable count)
>            (warning: too few iterations for a reliable count)
>          s/iter unpacking     regex substring
>unpacking   9.06        --      -27%      -34%
>regex       6.59       37%        --       -9%
>substring   6.01       51%       10%        --
>
>Unpacking still remains the longest to finish.
>
>I use Windows XP professional with a 2Gb RAM. I also have got a 45GB
>free space in my C drive.
>
>DO you see something else different?
>
>thanks,
>jis

You have Windows!
Try this test below. It uses timethis() for $count iterations.
You don't want a partial-iteration result from a small time interval.

After you run the code as written, run it again with your file
information plugged in, and change $count to 3 iterations.
Go for a coffee break. Post back.

My results:

 Unpacking: 12.7929 wallclock secs ( 9.94 usr +  2.84 sys = 12.78 CPU) @  0.08/s (n=1)
     Regex: 29.6103 wallclock secs (29.53 usr +  0.08 sys = 29.61 CPU) @  0.03/s (n=1)
 Substring: 2.85185 wallclock secs ( 2.81 usr +  0.03 sys =  2.84 CPU) @  0.35/s (n=1)

-sln

-----------------
use strict;
use warnings;

use Benchmark qw(:all :hireswallclock) ;

#---- Uncomment, plug in filename ---------
# use File::Slurp ;
# my $file_name = '/boot/vmlinuz-2.6.28-15-generic' ;
# my $data = read_file( $file_name, binary => 1 ) ;
# #$data = "\x00\x10" ;
# my $hex = unpack 'H*', $data;
#------------------------------------------

my $count = 1; # increase count to 3 after first testing 1

#---- Comment out $hex -------------------
my $hex = 'a0b0c1d2e3f411aabbcc' x 200_000;  # about 4 MB
#-----------------------------------------

timethis ($count, \&unpacking, "Unpacking");
timethis ($count, \&regex, "Regex");
timethis ($count, \&substring, "Substring");

sub unpacking {
	my @arr = unpack( '(A2)*' , $hex) ;
#	print "@arr\n"
}

sub regex {
	my @arr = $hex =~ /.{2}/g ;  # regex modified
#	print "@arr\n"
}

sub substring {
	my ($val, $offs, @arr) = ('',0);
	while ($val=substr( $hex, $offs, 2)) {
		push @arr, $val;
		$offs+=2;
	}
#	print "@arr\n"
}
__END__



------------------------------

Date: Thu, 11 Mar 2010 13:27:28 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Help on String to array !
Message-Id: <873a06ww6n.fsf@quad.sysarch.com>

>>>>> "j" == jis  <jismagic@gmail.com> writes:

  j> On Mar 11, 11:15 am, "Uri Guttman" <u...@StemSystems.com> wrote:
  >> >>>>> "j" == jis  <jisma...@gmail.com> writes:
  >> 
  >>   j> if i uncommment  regex protion and comment unpack it would take
  >>   j> 1minute 25 sec
  >> 
  >>   j> print "bye";
  >>   j> print $arr[2];    This would take only 9 seconds.
  >> 
  >>   j> I have used a stopwatch to calculate time.
  >> 
  >> as i said, that is a silly way to time programs. and there is no way it
  >> would take minutes to do this unless you are on a severely slow cpu or
  >> you are low on ram and are disk thrashing. here is my benchmarked
  >> version which shows that unpacking (fixed to use A and not C) is the
  >> fastest and regex (also fixed to do the simplest but correct thing which
  >> is grab 2 chars) ties your code.
  >> 
  >> uncomment out those commented lines to see that this does the same and
  >> correct thing in all cases.
  >> 
  >> here is the timing result run for 10 seconds each:
  >> 
  >>           s/iter     regex substring unpacking
  >> regex       2.11        --       -0%      -25%
  >> substring   2.11        0%        --      -25%
  >> unpacking   1.58       33%       33%        --
  >> 
  >> uri
  >> 
  >> use strict;
  >> use warnings;
  >> 
  >> use File::Slurp ;
  >> use Benchmark qw(:all) ;
  >> 
  >> my $duration = shift || -2 ;
  >> 
  >> my $file_name = '/boot/vmlinuz-2.6.28-15-generic' ;
  >> 
  >> my $data = read_file( $file_name, binary => 1 ) ;
  >> 
  >> #$data = "\x00\x10" ;
  >> 
  >> my $hex = unpack 'H*', $data;
  >> 
  >> # unpacking() ;
  >> # regex() ;
  >> # substring() ;
  >> # exit ;
  >> 
  >> cmpthese( $duration, {
  >> 
  >>         unpacking       => \&unpacking,
  >>         regex           => \&regex,
  >>         substring       => \&substring,
  >> 
  >> } ) ;
  >> 
  >> sub unpacking {
  >>         my @arr = unpack( '(A2)*' , $hex) ;
  >> #       print "@arr\n"
  >> 
  >> }
  >> 
  >> sub regex {
  >>         my @arr = $hex =~ /(..{2})/g ;
  >> #       print "@arr\n"
  >> 
  >> }
  >> 
  >> sub substring {
  >> 
  >>         my ($val, $offs, @arr) = ('',0);
  >>         while ($val=substr( $hex, $offs, 2)){
  >>                 push @arr, $val;
  >>                 $offs+=2;
  >>         }
  >> 
  >> #       print "@arr\n"
  >> 
  >> }
  >> 
  >> --
  >> Uri Guttman  ------  u...@stemsystems.com  --------  http://www.sysarch.com--
  >> -----  Perl Code Review , Architecture, Development, Training, Support ------
  >> ---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com---------

  j> Uri,

  j> I have used the script you have posted  with only change in input file
  j> i get the following results.
  j>   (warning: too few iterations for a reliable count)
  j>             (warning: too few iterations for a reliable count)
  j>             (warning: too few iterations for a reliable count)
  j>           s/iter unpacking     regex substring
  j> unpacking   9.06        --      -27%      -34%
  j> regex       6.59       37%        --       -9%
  j> substring   6.01       51%       10%        --

  j> Unpacking still remains the longest to finish.

  j> I use Windows XP professional with a 2Gb RAM. I also have got a 45GB
  j> free space in my C drive.

  j> DO you see something else different?

i don't have 45GB files nor do i intend to do that. you are disk
thrashing which is the cause of your slowdowns. you are not properly
testing the perl code as your OS I/O is the limiting factor here. learn
how to understand benchmarks better. your test is not legitimate in
comparing the algorithms as the disk I/O dominates.

try it with smaller files that will fit in your ram. not more than .5 gb
given your system. and with files that large, i would do the conversion
in large chunks in a loop to mitigate the i/o and then see which does
better.
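
a minimal sketch of that chunked loop, using only core perl. the sub
name hex_pairs_in_chunks, its arguments, and the callback are
assumptions made for illustration and are not from the benchmark
script; the tiny temp file just stands in for a real large input.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read $file in $chunk_size-byte pieces and hand each piece's
# two-character hex pairs to $callback, so only one chunk is held
# in memory at a time. (Illustrative helper, not from the thread.)
sub hex_pairs_in_chunks {
    my ( $file, $chunk_size, $callback ) = @_;

    open my $fh, '<:raw', $file or die "Cannot open $file: $!";

    while (1) {
        my $got = read $fh, my $buf, $chunk_size;
        die "Read error on $file: $!" unless defined $got;
        last if $got == 0;    # end of file

        # One byte always becomes exactly two hex digits, so a pair
        # never straddles a chunk boundary.
        my $hex = unpack 'H*', $buf;
        $callback->( unpack '(A2)*', $hex );
    }

    close $fh or die "Cannot close $file: $!";
    return;
}

# Small self-check on a throwaway 5-byte file, read 2 bytes at a time.
my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
binmode $tmp_fh;
print {$tmp_fh} "\x00\x10\xa0\xff\x7f";
close $tmp_fh or die "Cannot close $tmp_name: $!";

my @pairs;
hex_pairs_in_chunks( $tmp_name, 2, sub { push @pairs, @_ } );
print "@pairs\n";    # 00 10 a0 ff 7f
```

for a real run the chunk size would be a few megabytes, and the callback
would process or count the pairs rather than accumulate them all, which
would defeat the point of chunking.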

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Thu, 11 Mar 2010 13:30:20 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Help on String to array !
Message-Id: <87y6hyvhhf.fsf@quad.sysarch.com>

>>>>> "JWK" == John W Krahn <someone@example.com> writes:

  JWK> Uri Guttman wrote:
  >> 
  >> sub regex {
  >> my @arr = $hex =~ /(..{2})/g ;
  >> #	print "@arr\n"
  >> }

  JWK> Shouldn't that be:

  JWK>  	my @arr = $hex =~ /../g ;

  JWK> Or:

  JWK>  	my @arr = $hex =~ /.{2}/g ;

  JWK> You are capturing *three* characters instead of two.

true. i did my output test and must have optimized this without running
the tests again. anyhow, this whole thing is moot. the OP never said he
had a 25GB file on a 2gb system. slurping in the whole file and then
processing it is disk bound and the 2 char algorithm is irrelevant. i am
out of this thread. the OP doesn't seem to get the concept of
benchmarking or optimizing. let him stick to his substr and stopwatch.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Thu, 11 Mar 2010 08:17:51 -0600
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: to RG - Lisp lunacy and Perl psychosis
Message-Id: <slrnhphul6.d9n.tadmc@tadbox.sbcglobal.net>

["Followup-To:" header set to comp.lang.perl.misc.]

Ron Garret <rNOSPAMon@flownet.com> wrote:
> In article 
><5a67c5c1-22a1-458e-8c5c-21b97d95bb4a@z11g2000yqz.googlegroups.com>,
>  ccc31807 <cartercc@gmail.com> wrote:
>
>> WRT Perl, I agree that it's an ugly, convoluted, write once read never
>> language, but it's popular, and it's popular for a reason.
>
> Sarah Palin is popular too.


But she's not a "fat, ugly, old bitch".  :-)


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.


------------------------------

Date: Thu, 11 Mar 2010 08:45:46 -0800 (PST)
From: Alessio Stalla <alessiostalla@gmail.com>
Subject: Re: to RG - Lisp lunacy and Perl psychosis
Message-Id: <ce8d1ce6-36c0-4410-b0cb-9fa8aa5a03d4@f8g2000yqn.googlegroups.com>

On Mar 11, 1:17 pm, Tim X <t...@nospam.dev.null> wrote:
> Alessio Stalla <alessiosta...@gmail.com> writes:
> > On Mar 11, 8:32 am, Tim X <t...@nospam.dev.null> wrote:
>
> >> P.S. If anyone knows of a CL library that will allow me to interact with
> >> an Oracle database AND allow me to call stored procedures, passing data
> >> in both directions and access ref cursors, *PLEASE* let me know - I
> >> would still prefer to use CL for these jobs over perl.
>
> > If it's a viable option for you, you could use ABCL on the JVM and use
> > Oracle's JDBC driver.
>
> Yes, I was thinking about that as a possible option. Don't know anything
> about ABCL and was hoping to stick with a familiar ANSI implementation
> such as SBCL or even CLISP. Don't know anything about the
> interface/mapping between ABCL and java libs and wasn't sure if I was
> going that route whether I would be better off just jumping into clojure
> as I've been thinking about finding some project to try it out.

The Java access API is derived from ACL's jlinker
(http://www.franz.com/support/documentation/current/doc/jlinker.htm).
The notable differences are:

- ACL's API assumes the JVM is a different process, while ABCL runs
in-process, so it lacks all the functions for managing the connection
with the JVM; and

- ABCL's API has the option to be a bit less verbose, letting ABCL
choose automatically the method to call from the argument types, at
the cost of a runtime performance hit. For example: (jcall (jmethod
"java.lang.Comparable" "compareTo" "java.lang.Object") instance obj)
vs (jcall "compareTo" instance obj). There's also the JSS library
maintained by Alan Ruttenberg which is even less verbose, has more
features and is more efficient (wrt ABCL's abbreviated API).

> Is there much that would take getting accustomed to in ABCL that may
> feel 'foreign' and how ANSI compliant is it?

It is mostly ANSI compliant; it lacks the long form of
define-method-combination and fails 30-something tests from the GCL
ANSI test suite (IIRC 21k+ tests).
Compared to other, more mature implementations with a larger user
base, it lacks a number of things that are usually taken for granted;
for example, the MOP is incomplete, the debugger could be improved in
many ways (e.g. it doesn't show local variable information), the
compiler doesn't use type information very much, etc.
Oh, and it's the only CL implementation I know of where = fixnums are
not always EQ, due to boxing imposed by the JVM.

hth,
Alessio


------------------------------

Date: Thu, 11 Mar 2010 17:06:13 +0100
From: Tomasz Chmielewski <tch@nospam.wpkg.org>
Subject: trying to extract blob as file... is corrupted
Message-Id: <7vsinmFt1kU1@mid.uni-berlin.de>

I have a database which has a lot of files saved as blobs (some "fancy" CMS system).

I would like to save them as files.

I saved them as files using such a query (for each blob):

        my $sql = $db->prepare("SELECT blob_data FROM tx_drblob_content WHERE uid = (?)");
        $sql->execute($uid); # $uid is ID of the blob in the database
        my $blob = $sql->fetchrow_array;
        open BLOBFILE, ">$datadir/$uid" or die "Cannot open $!";
        print BLOBFILE $blob;
        close BLOBFILE;

Unfortunately, the files (PDF, ZIP etc.) are corrupted.

I "uploaded" a text file to the database using system's web interface, then fetched it with the above perl code.

Here are some example differences (- denotes original file; + denotes the file fetched with perl):

-# From ``Assigned Numbers'':
+# From ``Assigned Numbers\'\':


So we can see that the file has \ appended in front of each '.

Which could be because the CMS system stores the files as such, or perhaps I should fetch/save the files differently?

Does anyone have some obvious thoughts on why I see \ appended before certain characters?



-- 
Tomasz Chmielewski
http://wpkg.org


------------------------------

Date: Thu, 11 Mar 2010 17:42:46 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: trying to extract blob as file... is corrupted
Message-Id: <mlmn67-g3l1.ln1@osiris.mauzo.dyndns.org>


Quoth Tomasz Chmielewski <tch@nospam.wpkg.org>:
> I have a database which has a lot of files saved as blobs (some "fancy"
> CMS system).
> 
> I would like to save them as files.
> 
> I saved them as files using such a query (for each blob):
> 
>         my $sql = $db->prepare("SELECT blob_data FROM tx_drblob_content
> WHERE uid = (?)");
>         $sql->execute($uid); # $uid is ID of the blob in the database
>         my $blob = $sql->fetchrow_array;
>         open BLOBFILE, ">$datadir/$uid" or die "Cannot open $!";

Use 3-arg open. Use lexical filehandles. Since you're printing binary
data, you should use 'binmode' or the :raw layer.

    open my $BLOBFILE, ">:raw", "$datadir/$uid" or die "...";

>         print BLOBFILE $blob;
>         close BLOBFILE;

If this is important work you should check the return value of close.
(It's not useful to check the return value of print: checking close is
both necessary and sufficient.)
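
Put together, the write side might look something like this. It is only
a sketch: the sub name write_blob, the temp directory, and the
stand-in $blob value are assumptions for illustration; in the real
script $blob would come from the fetchrow_array call.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write $data to $path byte-for-byte. (Illustrative helper name.)
sub write_blob {
    my ( $path, $data ) = @_;

    # 3-arg open, lexical filehandle, :raw layer for binary data.
    open my $out, '>:raw', $path or die "Cannot open $path: $!";
    print {$out} $data;

    # Buffered write errors (a full disk, say) show up at close.
    close $out or die "Cannot close $path: $!";
    return;
}

# Stand-in for the blob fetched from the database: 8 bytes that
# include CR, LF, NUL and a high byte.
my $dir  = tempdir( CLEANUP => 1 );
my $blob = "PK\x03\x04\r\n\x00\xff";

write_blob( "$dir/42", $blob );

my $size = -s "$dir/42";
print "wrote $size bytes\n";    # wrote 8 bytes
```

Without the :raw layer, a Windows perl would turn the LF into CRLF on
output and the file would come out one byte longer.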

> Unfortunately, the files (PDF, ZIP etc.) are corrupted.
> 
> I "uploaded" a text file to the database using system's web interface,
> then fetched it with the above perl code.
> 
> Here are some example differences (- denotes original file; + denotes
> the file fetched with perl):
> 
> -# From ``Assigned Numbers'':
> +# From ``Assigned Numbers\'\':
> 
> So we can see that the file has \ appended in front of each '.
> 
> Which could be because the CMS system stores the files as such, or
> perhaps I should fetch/save the files differently?

Which database are you using? Which DBD? What type is the 'blob_data'
field?

Can you get the field out using the DB's own command-line tool (psql, or
equivalent for other databases) to compare?

> Does anyone have some obvious thoughts on why I see \ appended before
> certain characters?

Well, it looks to me as though the data has been SQL-quoted, since ' is
a special character in SQL but ` isn't (depending on the dialect, of
course). However, without knowing where the quoting is happening (in the
database, in the client library, in the DBD) it's hard to say.

Ben



------------------------------

Date: Thu, 11 Mar 2010 14:29:39 GMT
From: Graham Drabble <usenet05@drabble.me.uk>
Subject: Web Based Perl Courses
Message-Id: <Xns9D389371BE66Agrahamdrabblelineone@drabble.me.uk>

Hi,

I've been asked to find some basic / intermediate Perl courses for a 
colleague. We've already got Learning Perl and Programming Perl on the 
bookshelf but I've been asked to get a Web course that they can follow.

I know that many people here have a very low opinion of many of them 
and having reviewed a couple I agree. What's the consensus on the best 
(least worst!) ones?

-- 
Graham Drabble
http://www.drabble.me.uk/


------------------------------

Date: Thu, 11 Mar 2010 13:39:09 -0500
From: "Uri Guttman" <uri@StemSystems.com>
Subject: Re: Web Based Perl Courses
Message-Id: <87pr3avh2q.fsf@quad.sysarch.com>

>>>>> "GD" == Graham Drabble <usenet05@drabble.me.uk> writes:

  GD> I've been asked to find some basic / intermediate Perl courses for a 
  GD> colleague. We've already got Learning Perl and Programming Perl on the 
  GD> bookshelf but I've been asked to get a Web course that they can follow.

  GD> I know that many people here have a very low opinion of many of
  GD> them and having reviewed a couple I agree. What's the consensus on
  GD> the best (least worst!) ones?

almost all the web tutorials suck donkey eggs. 

but the o'reilly school of technology (yes, the publisher) has just
released their perl level 1 class and are planning something like 5-6
levels. these come with continuing education credits from a real
university as well. i recommend them as i know the author well (peter
scott) and he is very good at perl training. there are current discounts
on the course so tell your colleague about them.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests. 

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2868
***************************************

