[32446] in Perl-Users-Digest
Perl-Users Digest, Issue: 3713 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Jun 13 03:09:22 2012
Date: Wed, 13 Jun 2012 00:09:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 13 Jun 2012 Volume: 11 Number: 3713
Today's topics:
3-arg open - was Re: Very Sluggish Code <dave@invalid.invalid>
Re: 3-arg open - was Re: Very Sluggish Code <ben@morrow.me.uk>
Re: 3-arg open - was Re: Very Sluggish Code <rweikusat@mssgmbh.com>
Re: A remark <bugbear@trim_papermule.co.uk_trim>
an effective script for grabbing and putting images fro <cal@example.invalid>
Re: an effective script for grabbing and putting images <ben@morrow.me.uk>
Re: an effective script for grabbing and putting images <m@rtij.nl.invlalid>
Re: an effective script for grabbing and putting images <cal@example.invalid>
Re: Odd behaviour on Mac OS X Lion <vilain@NOspamcop.net>
Re: Odd behaviour on Mac OS X Lion <ben@morrow.me.uk>
Re: Odd behaviour on Mac OS X Lion <trudge@gmail.com>
Re: Odd behaviour on Mac OS X Lion <jimsgibson@gmail.com>
Re: Odd behaviour on Mac OS X Lion <vilain@NOspamcop.net>
Re: Very Sluggish Code <hjp-usenet2@hjp.at>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 12 Jun 2012 11:01:38 +0000 (UTC)
From: "Dave Saville" <dave@invalid.invalid>
Subject: 3-arg open - was Re: Very Sluggish Code
Message-Id: <fV45K0OBJxbE-pn2-sP4Gv1O9Rhlv@localhost>
On Mon, 11 Jun 2012 21:40:55 UTC, Ben Morrow <ben@morrow.me.uk> wrote:
<snip>
> Use a variable for the filehandle, as above. Also use 3-arg open, and
> check the return value.
Hi Ben, that's the second time I have seen you advocate 3-arg open. I
think I now understand using a variable for the filehandle but I fail
to see the difference between "<foo" and "<", "foo".
<snip>
--
Regards
Dave Saville
------------------------------
Date: Tue, 12 Jun 2012 13:53:41 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: 3-arg open - was Re: Very Sluggish Code
Message-Id: <lnqja9-hfd2.ln1@anubis.morrow.me.uk>
Quoth "Dave Saville" <dave@invalid.invalid>:
> On Mon, 11 Jun 2012 21:40:55 UTC, Ben Morrow <ben@morrow.me.uk> wrote:
>
> > Use a variable for the filehandle, as above. Also use 3-arg open, and
> > check the return value.
>
> Hi Ben, that's the second time I have seen you advocate 3-arg open. I
> think I now understand using a variable for the filehandle but I fail
> to see the difference between "<foo" and "<", "foo".
In that case there is none. In general, though, 3-arg is safer; suppose
you write
open my $FOO, "<$foo" or die ...;
and $foo happens to contain " foo". In this case 2-arg open will swallow
the space and open the wrong file.
Ben
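A minimal sketch of the case described above, assuming a scratch
directory that holds both a file named "foo" and one named " foo" (the
file names and contents are made up for the demonstration). The 2-arg
call should end up reading "foo", while the 3-arg call reads the file
whose name really starts with a space:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a scratch directory so the demo files do not clash with real ones.
my $start = getcwd();
my $dir   = tempdir( CLEANUP => 1 );
chdir $dir or die "chdir $dir: $!";

# Create two distinct files: "foo" and " foo" (note the leading space).
for my $name ( 'foo', ' foo' ) {
    open my $out, '>', $name or die "create '$name': $!";
    print {$out} "contents of '$name'\n";
    close $out or die "close '$name': $!";
}

my $foo = ' foo';    # the name we actually want to read

# 2-arg form: mode and name share one string, so the leading space is
# stripped and the file "foo" is opened instead.
open my $two, "<$foo" or die "2-arg open: $!";
my $via_two = <$two>;
close $two;

# 3-arg form: the name is taken literally.
open my $three, '<', $foo or die "3-arg open '$foo': $!";
my $via_three = <$three>;
close $three;

print "2-arg read: $via_two";
print "3-arg read: $via_three";

chdir $start or die "chdir $start: $!";    # step out so cleanup can succeed
```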
------------------------------
Date: Tue, 12 Jun 2012 16:16:23 +0100
From: Rainer Weikusat <rweikusat@mssgmbh.com>
Subject: Re: 3-arg open - was Re: Very Sluggish Code
Message-Id: <87r4tkblx4.fsf@sapphire.mobileactivedefense.com>
"Dave Saville" <dave@invalid.invalid> writes:
> On Mon, 11 Jun 2012 21:40:55 UTC, Ben Morrow <ben@morrow.me.uk> wrote:
[...]
>> Use a variable for the filehandle, as above. Also use 3-arg open, and
>> check the return value.
>
> Hi Ben, that's the second time I have seen you advocate 3-arg open. I
> think I now understand using a variable for the filehandle but I fail
> to see the difference between "<foo" and "<", "foo".
Logically, the open mode and the pathname to open are two different
things and putting them into the same string argument which either the
compiler or the runtime environment then need to take apart again by
parsing it wasn't a good idea: The 'open mode' characters Perl uses
are usually perfectly valid although practically somewhat 'rare' filename
characters as well.
Somewhat contrived example:
----------------------------
sub open_for_reading($)
{
    my $fh;
    open($fh, '<'.$_[0]) // die("open: $_[0]: $!");
    return $fh;
}
my $fh0;
$fh0 = open_for_reading('&STDIN');
-----------------------------
This won't try to open the file named &STDIN for reading but will dup
the STDIN file handle instead. Given a situation like this one, it is
also somewhat stupid to first execute code to concatenate name and
mode and then execute code to take them apart again.
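
For comparison, a sketch of the same helper written with 3-arg open.
The prototype is dropped and the scratch directory is only there to
make sure no file called &STDIN happens to exist; both are choices made
for this illustration, not part of the example above. With the mode
passed separately, '&STDIN' is treated as an ordinary file name, so the
call is expected to fail with "No such file or directory" rather than
dup STDIN:

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Same helper, but mode and path are passed as separate arguments.
sub open_for_reading {
    my ($path) = @_;
    open( my $fh, '<', $path ) or die "open: $path: $!\n";
    return $fh;
}

# Run in an empty scratch directory so no file named "&STDIN" exists.
my $start = getcwd();
my $dir   = tempdir( CLEANUP => 1 );
chdir $dir or die "chdir $dir: $!";

my $fh  = eval { open_for_reading('&STDIN') };
my $err = $@;
print defined $fh
    ? "opened a file literally named &STDIN\n"
    : "no dup happened: $err";

chdir $start or die "chdir $start: $!";
```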
------------------------------
Date: Tue, 12 Jun 2012 10:51:18 +0100
From: bugbear <bugbear@trim_papermule.co.uk_trim>
Subject: Re: A remark
Message-Id: <rpSdnc4wjPSLjErSnZ2dnUVZ7rednZ2d@brightview.co.uk>
Rainer Weikusat wrote:
> Removing possibly useful documentation texts because they contain
> 'politically unwelcome information' in order to help with forcing
> people to download $random_non_perl_oo_crap from CPAN even for really
> basic problems is a tacit admission that said $random_non_perl_oo_crap
> failed to supplant the actually very nice Perl OO system based on its
> technical merits.
>
Go on - gimme some context, please.
BugBear
------------------------------
Date: Tue, 12 Jun 2012 16:15:20 -0600
From: Cal Dershowitz <cal@example.invalid>
Subject: an effective script for grabbing and putting images from or to a website
Message-Id: <He-dnSy8J5jkIkrSnZ2dnUVZ_rWdnZ2d@supernews.com>
I'm trying to figure out what the better methods are for dealing with
images that go from server to user and then to a server somewhere else
again. I've kludged together this script for this purpose, and I think
it can use some beautifying:
$ perl lh2.pl
img: WWW::Mechanize::Image=HASH(0xa7bbc5c)
https://sites.google.com/site/lutherhavennm/_/rsrc/1255408739014/mission/Attachment3.jpg?height=278&width=420
ext: jpg?height=278&width=420
ext: jpg
img: WWW::Mechanize::Image=HASH(0xa7c5784)
https://sites.google.com/site/lutherhavennm/_/rsrc/1255386973661/mission/Picture1.jpg?height=279&width=420
ext: jpg?height=279&width=420
ext: jpg
img: WWW::Mechanize::Image=HASH(0xa7c58ec)
https://sites.google.com/site/lutherhavennm/_/rsrc/1255408642180/mission/Attachment10.jpg?height=280&width=420
ext: jpg?height=280&width=420
ext: jpg
img: WWW::Mechanize::Image=HASH(0xa7c54dc)
https://sites.google.com/site/lutherhavennm/_/rsrc/1255387202014/mission/Looking%20up%20at%20the%20Bldg.JPG?height=315&width=420
ext: JPG?height=315&width=420
ext: JPG
downloaded 4 images from https://sites.google.com/site/lutherhavennm/mission
to folder site_20
$ cat lh2.pl
#!/usr/bin/perl -w
use strict;
use feature ':5.10';
use WWW::Mechanize;
use LWP::Simple;
use Errno qw[ EEXIST ];
# get information about images
my $domain = 'https://sites.google.com/site/lutherhavennm/mission';
my $m = WWW::Mechanize->new();
$m->get($domain);
my @list = $m->images();
# create new folder and download images to it.
my $counter = 0;
my $dir = &mk_new_dir;
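for my $img (@list) {
    print "img: $img\n";
    my $url = $img->url_abs();
    print "$url \n";
    my $ext = ($url =~ m/([^.]+)$/)[0];
    print "ext: $ext\n";
    $ext =~ s/\?.+//;
    print "ext: $ext\n";
    $counter++;
    my $filename = $dir . "/image_" . $counter. '.' . $ext;
    # getstore returns the HTTP status code, which is always true, so
    # test it with is_success (exported by LWP::Simple) rather than "or".
    my $rc = getstore( $url, $filename );
    is_success($rc) or die "Can't download '$url': HTTP status $rc\n";
}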
# output
print "downloaded ", $counter, " images from ", $domain, "\n";
print "to folder ", $dir, "\n";
sub mk_new_dir {
    my $counter2 = 1;
    while (1) {
        my $word = "site";
        my $name = $word . '_' . $counter2++;
        if ( mkdir $name, 0755 ) {
            return $name;    # success, return new dir name
        }
        else {
            next if $!{EEXIST};    # mkdir failed because file exists
            die sprintf "(%d) %s", $!, $!;    # other failure; bail out!
        }
    }
}
$
Is this what jpg's look like on the internet, with the question mark
after what is the traditional extension? If so, then I think I want to
make a more-sophisticated capture.
Is what I do with regex better done with a module? (which one?) a split?
Thanks for your comment,
--
Cal
------------------------------
Date: Wed, 13 Jun 2012 00:05:45 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: an effective script for grabbing and putting images from or to a website
Message-Id: <9juka9-ihg2.ln1@anubis.morrow.me.uk>
Quoth Cal Dershowitz <cal@example.invalid>:
>
> #!/usr/bin/perl -w
> use strict;
> use feature ':5.10';
> use WWW::Mechanize;
> use LWP::Simple;
> use Errno qw[ EEXIST ];
>
> # get information about images
> my $domain = 'https://sites.google.com/site/lutherhavennm/mission';
> my $m = WWW::Mechanize->new();
> $m->get($domain);
> my @list = $m->images();
>
> # create new folder and download images to it.
> my $counter = 0;
> my $dir = &mk_new_dir;
Don't call subs with '&' unless you know what it does.
my $dir = mk_new_dir();
> for my $img (@list) {
> print "img: $img\n";
> my $url = $img->url_abs();
> print "$url \n";
>
> my $ext = ($url =~ m/([^.]+)$/)[0];
> print "ext: $ext\n";
> $ext =~ s/\?.+//;
> print "ext: $ext\n";
You would be better off doing this splitting with the URI module.
Bizarrely the ->URI method of WWW::Mechanize::Image returns a URI::URL
object (which is an obsolete compatibility class), so you would be best
off building a URI yourself:
use URI;
my $url = URI->new($img->url, $img->base);
my $file = ($url->path_segments)[-1];
my ($ext) = $file =~ /([^.]*)$/;
You should be aware that, although many sites put conventional
extensions on their URLs, there is no particular reason why they should.
Unless you know the sites you are using do (and will continue to do so),
you should rather look at the Content-Type header returned and map that
to an appropriate extension. You can do the mapping with a simple hash,
or with the MIME::Types module, and you will need to switch from
LWP::Simple to the full LWP::UserAgent to get hold of the Content-Type
response header.
Other than that, I don't see anything important wrong with the code.
Ben
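A rough sketch of the Content-Type approach described above, using a
simple hash for the mapping. The table entries, the 'bin' fallback, and
the names ext_for_content_type and fetch_image are assumptions made for
illustration; fetch_image needs LWP::UserAgent installed and is defined
here but not called:

```perl
use strict;
use warnings;

# Hypothetical, deliberately small media-type table; MIME::Types could
# replace it for broader coverage.
my %ext_for = (
    'image/jpeg' => 'jpg',
    'image/png'  => 'png',
    'image/gif'  => 'gif',
);

# Map a Content-Type header value to a file extension.
sub ext_for_content_type {
    my ($header) = @_;
    return 'bin' unless defined $header;
    # Keep only the media type, dropping parameters such as "; charset=...".
    my ($type) = $header =~ m{^\s*([^;\s]+)};
    return 'bin' unless defined $type;
    return $ext_for{ lc $type } // 'bin';
}

# Download $url and save it as "$basename.<ext>", choosing the extension
# from the response header rather than from the URL.
sub fetch_image {
    my ( $url, $basename ) = @_;
    require LWP::UserAgent;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($url);
    die "Can't download '$url': ", $res->status_line, "\n"
        unless $res->is_success;
    my $ext  = ext_for_content_type( scalar $res->header('Content-Type') );
    my $file = "$basename.$ext";
    open my $out, '>:raw', $file or die "open '$file': $!";
    print {$out} $res->content;
    close $out or die "close '$file': $!";
    return $file;
}

print ext_for_content_type('image/jpeg'), "\n";
print ext_for_content_type('IMAGE/PNG; charset=binary'), "\n";
```

In the posted script, the getstore line would then become something
like fetch_image( $url, "$dir/image_$counter" ).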
------------------------------
Date: Wed, 13 Jun 2012 08:43:55 +0200
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: an effective script for grabbing and putting images from or to a website
Message-Id: <bepla9-gpg.ln1@news.rtij.nl>
On Tue, 12 Jun 2012 17:52:43 -0600, Cal Dershowitz wrote:
> On 06/12/2012 05:05 PM, Ben Morrow wrote:
>> You should be aware that, although many sites put conventional
>> extensions on their URLs, there is no particular reason why they
>> should.
>> Unless you know the sites you are using do (and will continue to do
>> so),
>> you should rather look at the Content-Type header returned and map that
>> to an appropriate extension. You can do the mapping with a simple hash,
>> or with the MIME::Types module, and you will need to switch from
>> LWP::Simple to the full LWP::UserAgent to get hold of the Content-Type
>> response header.
>
> ok. So if I'm writing a perl script to grab an image, I do not make
> decisions about it based first on the extension, but what the html says
> about the image, right?
Nitpick, what (the) HTTP (Content-Type header) says about the image.
M4
------------------------------
Date: Tue, 12 Jun 2012 17:52:43 -0600
From: Cal Dershowitz <cal@example.invalid>
Subject: Re: an effective script for grabbing and putting images from or to a website
Message-Id: <5vSdnbJJepjWS0rSnZ2dnUVZ_gqdnZ2d@supernews.com>
On 06/12/2012 05:05 PM, Ben Morrow wrote:
>
> Quoth Cal Dershowitz<cal@example.invalid>:
>> my $dir =&mk_new_dir;
>
> Don't call subs with '&' unless you know what it does.
I'll change that. If I'm correct, it used to be necessary but now is
discouraged if not necessary.
>
> my $dir = mk_new_dir();
>
>> for my $img (@list) {
>> print "img: $img\n";
>> my $url = $img->url_abs();
>> print "$url \n";
>>
>> my $ext = ($url =~ m/([^.]+)$/)[0];
>> print "ext: $ext\n";
>> $ext =~ s/\?.+//;
>> print "ext: $ext\n";
>
> You would be better off doing this splitting with the URI module.
> Bizarrely the ->URI method of WWW::Mechanize::Image returns a URI::URL
> object (which is an obsolete compatibility class), so you would be best
> off building a URI yourself:
Ben, it was bizarre from my point of view. I don't necessarily want to
dwell on time that I threw down a well, but you can't find an url_abs
method on either of the modules I used. Let me state it differently. I
couldn't find anything.
>
> use URI;
>
> my $url = URI->new($img->url, $img->base);
> my $file = ($url->path_segments)[-1];
> my ($ext) = $file =~ /([^.]*)$/;
I'll take a look at the documentation on URI soon.
>
> You should be aware that, although many sites put conventional
> extensions on their URLs, there is no particular reason why they should.
> Unless you know the sites you are using do (and will continue to do so),
> you should rather look at the Content-Type header returned and map that
> to an appropriate extension. You can do the mapping with a simple hash,
> or with the MIME::Types module, and you will need to switch from
> LWP::Simple to the full LWP::UserAgent to get hold of the Content-Type
> response header.
ok. So if I'm writing a perl script to grab an image, I do not make
decisions about it based first on the extension, but what the html says
about the image, right?
>
> Other than that, I don't see anything important wrong with the code.
>
$ perltidy lh3.pl
$ perl lh3.pl
url is
https://sites.google.com/site/lutherhavennm/_/rsrc/1255408739014/mission/Attachment3.jpg?height=278&width=420
file is Attachment3.jpg
ext is jpg
url is
https://sites.google.com/site/lutherhavennm/_/rsrc/1255386973661/mission/Picture1.jpg?height=279&width=420
file is Picture1.jpg
ext is jpg
url is
https://sites.google.com/site/lutherhavennm/_/rsrc/1255408642180/mission/Attachment10.jpg?height=280&width=420
file is Attachment10.jpg
ext is jpg
url is
https://sites.google.com/site/lutherhavennm/_/rsrc/1255387202014/mission/Looking%20up%20at%20the%20Bldg.JPG?height=315&width=420
file is Looking up at the Bldg.JPG
ext is JPG
downloaded 4 images from https://sites.google.com/site/lutherhavennm/mission
to folder site_21
$ cat lh3.pl
#!/usr/bin/perl -w
use strict;
use feature ':5.10';
use WWW::Mechanize;
use LWP::Simple;
use URI;
use Errno qw[ EEXIST ];
# get information about images
my $domain = 'https://sites.google.com/site/lutherhavennm/mission';
my $m = WWW::Mechanize->new();
$m->get($domain);
my @list = $m->images();
# create new folder and download images to it.
my $counter = 0;
my $dir = &mk_new_dir;
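for my $img (@list) {
    my $url = URI->new($img->url, $img->base);
    print "url is $url \n";
    my $file = ($url->path_segments)[-1];
    print "file is $file \n";
    my ($ext) = $file =~ /([^.]*)$/;
    print "ext is $ext \n";
    $counter++;
    my $filename = $dir . "/image_" . $counter. '.' . $ext;
    # getstore returns the HTTP status code, which is always true, so
    # test it with is_success (exported by LWP::Simple) rather than "or".
    my $rc = getstore( $url, $filename );
    is_success($rc) or die "Can't download '$url': HTTP status $rc\n";
}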
# output
print "downloaded ", $counter, " images from ", $domain, "\n";
print "to folder ", $dir, "\n";
sub mk_new_dir {
    my $counter2 = 1;
    while (1) {
        my $word = "site";
        my $name = $word . '_' . $counter2++;
        if ( mkdir $name, 0755 ) {
            return $name;    # success, return new dir name
        }
        else {
            next if $!{EEXIST};    # mkdir failed because file exists
            die sprintf "(%d) %s", $!, $!;    # other failure; bail out!
        }
    }
}
--
Cal
------------------------------
Date: Mon, 11 Jun 2012 19:56:18 -0700
From: Michael Vilain <vilain@NOspamcop.net>
Subject: Re: Odd behaviour on Mac OS X Lion
Message-Id: <vilain-EA794E.19561811062012@news.individual.net>
In article
<fe917735-cc0a-4559-95a3-9b714fbdc6df@w24g2000vby.googlegroups.com>,
Trudge <trudge@gmail.com> wrote:
> On Jun 11, 7:30 pm, Trudge <tru...@gmail.com> wrote:
> > On Jun 11, 5:29 pm, Ben Morrow <b...@morrow.me.uk> wrote:
> >
> > > Quoth Trudge <tru...@gmail.com>:
> >
> > > > I've been having some odd behaviour lately on our work computer running
> > > > the installed Apple Perl. I have the following few lines of code:
> >
> > > > #!/usr/bin/perl
> > > > BEGIN
> > > > {
> > > > open (STDERR,">>$0-err.txt");
> > > > print STDERR "\n",scalar localtime,"\n";
> > > > }
> > > <snip>
> >
> > > > However, there is no output to the Terminal window, and no entry is made
> > > > in the log file. There should at least be a date/time stamp.
> >
> > > > If I remove the 'shebang' line, the script outputs as expected, and an
> > > > entry is made in the log file indicating the timestamp.
> >
> > > Are you running it with
> >
> > > perl script
> >
> > > or with
> >
> > > ./script
> >
> > > ? If the latter, does it make a difference if you invoke perl directly?
> >
> > > I ask because I wonder if something is inserting a UTF-8 BOM (or some
> > > other such invisible character) at the beginning of the file, which is
> > > causing the kernel not to recognise the #!. Of course, I would then
> > > expect you to get some sort of error message...
> >
> > > Otherwise, start with something completely minimal, like
> >
> > > #!/usr/bin/perl
> > > 1;
> >
> > > and add pieces of the original script until it stops working. If you get
> > > as far as replacing the whole script, compare the two files with 'cmp'
> > > to see if they are actually identical, and if they aren't hex-dump them
> > > both to see what the difference is.
> >
> > > Ben
> >
> > I'm running 'perl -f test.pl', but I did try your suggestion of
> > 'perl ./test.pl' with identical results.
> >
> > I'll now try your 2nd one and build it a line at a time.
> >
> > btw, I'm using TextWrangler with 'Invisibles' turned on but don't see
> > anything strange.
> > --
> > Amer Neely
>
> OK, this is very strange. According to TextWrangler this is my code
> for perl_test.pl:
>
> #!/usr/bin/perl
> BEGIN
> {
> open (STDERR,">>$0-err.txt");
> print STDERR "\n",scalar localtime,"\n";
> }
>
> $|=1;
> use strict;
> use warnings;
>
> my $Source="/Users/prepress/Desktop/northbay_three";
> my $Destination="/Users/prepress/Desktop/northbay_two";
> my $ext="pdf";
> my $f;
> my @pdfs=();
>
> print "Now runnng $0.\n";
> print "\$Source: $Source\n";
> print "\$Destination: $Destination\n";
> print "\$ext: $ext\n";
>
>
> *But* according to cat it is really:
> Prepress-Mac-Pro:scripts prepress$ cat perl_test.pl
> Prepress-Mac-Pro:scripts prepress$ n";op/northbay_two";
>
> And then cmp tells me:
> Prepress-Mac-Pro:scripts prepress$ cmp perl_test.pl perl_test.1.pl
> perl_test.pl perl_test.1.pl differ: char 2, line 1
>
> Ben, it looks like you may have hit on the culprit. Where do I go from
> here?
> --
> Amer Neely
On BBEdit, it will show the line endings. They should be Unix-style
endings. Just a SWAG, but is your file CRLF (Windows) line endings?
Try creating the file in vi. That will only create UNIX style line
endings.
--
DeeDee, don't press that button! DeeDee! NO! Dee...
[I filter all Goggle Groups posts, so any reply may be automatically ignored]
------------------------------
Date: Tue, 12 Jun 2012 11:07:07 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: Odd behaviour on Mac OS X Lion
Message-Id: <bvgja9-7kc2.ln1@anubis.morrow.me.uk>
Quoth Trudge <trudge@gmail.com>:
> On Jun 11, 7:30 pm, Trudge <tru...@gmail.com> wrote:
> >
> > I'm running 'perl -f test.pl', but I did try your suggestion of
> > 'perl ./test.pl' with identical results.
Do you know what -f does?
> > I'll now try your 2nd one and build it a line at a time.
> >
> > btw, I'm using TextWrangler with 'Invisibles' turned on but don't see
> > anything strange.
>
> OK, this is very strange. According to TextWrangler this is my code
> for perl_test.pl:
>
> #!/usr/bin/perl
> BEGIN
<snip>
>
> *But* according to cat it is really:
> Prepress-Mac-Pro:scripts prepress$ cat perl_test.pl
> Prepress-Mac-Pro:scripts prepress$ n";op/northbay_two";
cat is a very bad tool for examining files. At least use less; something
like od would be better.
It looks to me as though this file has Apple line-endings (\015) rather
than Unix (\012). This means that as far as perl is concerned your file
is one very long line, which happens to all be a comment. If you remove
the #!, it is no longer all a comment, so it works properly.
> And then cmp tells me:
> Prepress-Mac-Pro:scripts prepress$ cmp perl_test.pl perl_test.1.pl
> perl_test.pl perl_test.1.pl differ: char 2, line 1
That's interesting: I would have expected something like 'char 16'. Is
there some other difference? Look at the file with od.
> Ben, it looks like you may have hit on the culprit. Where do I go from
> here?
Throw away TextWrangler and get a real text editor.
Ben
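A small sketch of checking for this from Perl itself, as an alternative
to reading od output. The helper names and the sample string are made
up for the demonstration; a real check would read the script in binary
mode instead:

```perl
use strict;
use warnings;

# Count each line-ending style in a string of raw bytes.
sub line_ending_counts {
    my ($bytes) = @_;
    my $crlf = () = $bytes =~ /\015\012/g;          # Windows: CR LF
    my $cr   = () = $bytes =~ /\015(?!\012)/g;      # old Mac: bare CR
    my $lf   = () = $bytes =~ /(?<!\015)\012/g;     # Unix: bare LF
    return ( crlf => $crlf, cr => $cr, lf => $lf );
}

# Convert CR and CR LF endings to Unix LF.
sub to_unix {
    my ($bytes) = @_;
    $bytes =~ s/\015\012?/\012/g;
    return $bytes;
}

# Stand-in for a script saved with old Mac line endings.
my $sample = "#!/usr/bin/perl\015print 1;\015";
my %before = line_ending_counts($sample);
my %after  = line_ending_counts( to_unix($sample) );

printf "before: cr=%d lf=%d crlf=%d\n", @before{qw(cr lf crlf)};
printf "after:  cr=%d lf=%d crlf=%d\n", @after{qw(cr lf crlf)};
printf "first CR is character %d\n", index( $sample, "\015" ) + 1;
```

The last line reports the position of the first line ending counting
from 1, which is where a byte-wise comparison against a Unix copy of
the same script would be expected to differ first.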
------------------------------
Date: Tue, 12 Jun 2012 10:06:34 -0700 (PDT)
From: Trudge <trudge@gmail.com>
Subject: Re: Odd behaviour on Mac OS X Lion
Message-Id: <c59d4017-271f-48b7-99d2-800dfb20f8fa@googlegroups.com>
On Tuesday, June 12, 2012 6:07:07 AM UTC-4, Ben Morrow wrote:
> Quoth Trudge
> > On Jun 11, 7:30 pm, Trudge <tru...@gmail.com> wrote:
> > >
> > > I'm running 'perl -f test.pl', but I did try your suggestion of
> > > 'perl ./test.pl' with identical results.
>
> Do you know what -f does?
>
> > > I'll now try your 2nd one and build it a line at a time.
> > >
> > > btw, I'm using TextWrangler with 'Invisibles' turned on but don't see
> > > anything strange.
> >
> > OK, this is very strange. According to TextWrangler this is my code
> > for perl_test.pl:
> >
> > #!/usr/bin/perl
> > BEGIN
> <snip>
> >
> > *But* according to cat it is really:
> > Prepress-Mac-Pro:scripts prepress$ cat perl_test.pl
> > Prepress-Mac-Pro:scripts prepress$ n";op/northbay_two";
>
> cat is a very bad tool for examining files. At least use less; something
> like od would be better.
>
> It looks to me as though this file has Apple line-endings (\015) rather
> than Unix (\012). This means that as far as perl is concerned your file
> is one very long line, which happens to all be a comment. If you remove
> the #!, it is no longer all a comment, so it works properly.
>
> > And then cmp tells me:
> > Prepress-Mac-Pro:scripts prepress$ cmp perl_test.pl perl_test.1.pl
> > perl_test.pl perl_test.1.pl differ: char 2, line 1
>
> That's interesting: I would have expected something like 'char 16'. Is
> there some other difference? Look at the file with od.
>
> > Ben, it looks like you may have hit on the culprit. Where do I go from
> > here?
>
> Throw away TextWrangler and get a real text editor.
>
> Ben
Heh. Why did I know you would say that :)
OK, based on yours and Michael's *suggestion* I'll switch to another
editor.
btw, TextWrangler was set to Mac line endings. Right on Ben.
Thank you all who responded to this thread. Unless something else weird
happens, I would consider this post SOLVED.
--
Amer Neely
------------------------------
Date: Tue, 12 Jun 2012 13:00:22 -0700
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: Odd behaviour on Mac OS X Lion
Message-Id: <120620121300222790%jimsgibson@gmail.com>
In article <bvgja9-7kc2.ln1@anubis.morrow.me.uk>, Ben Morrow
<ben@morrow.me.uk> wrote:
> Quoth Trudge <trudge@gmail.com>:
> > On Jun 11, 7:30 pm, Trudge <tru...@gmail.com> wrote:
> > >
> > Ben, it looks like you may have hit on the culprit. Where do I go from
> > here?
>
> Throw away TextWrangler and get a real text editor.
TextWrangler /is/ a real editor, one of the best on the Mac platform.
In this case, the person using TextWrangler can use the 'File > Hex
Dump Front Document' menu option to see what characters are in his
file. If line endings are a problem, he can use the 'Edit > Document
Options' menu selection to change the line endings for the whole file.
If some other set of characters is the problem, he can use the 'Text >
Zap Gremlins' option to modify or delete non-ASCII, control, or null
characters.
--
Jim Gibson
------------------------------
Date: Tue, 12 Jun 2012 23:11:16 -0700
From: Michael Vilain <vilain@NOspamcop.net>
Subject: Re: Odd behaviour on Mac OS X Lion
Message-Id: <vilain-BE6243.23111512062012@news.individual.net>
In article <120620121300222790%jimsgibson@gmail.com>,
Jim Gibson <jimsgibson@gmail.com> wrote:
> In article <bvgja9-7kc2.ln1@anubis.morrow.me.uk>, Ben Morrow
> <ben@morrow.me.uk> wrote:
>
> > Quoth Trudge <trudge@gmail.com>:
> > > On Jun 11, 7:30 pm, Trudge <tru...@gmail.com> wrote:
> > > >
>
> > > Ben, it looks like you may have hit on the culprit. Where do I go from
> > > here?
> >
> > Throw away TextWrangler and get a real text editor.
>
> TextWrangler /is/ a real editor, one of the best on the Mac platform.
>
> In this case, the person using TextWrangler can use the 'File > Hex
> Dump Front Document' menu option to see what characters are in his
> file. If line endings are a problem, he can use the 'Edit > Document
> Options' menu selection to change the line endings for the whole file.
> If some other set of characters is the problem, he can use the 'Text >
> Zap Gremlins' option to modify or delete non-ASCII, control, or null
> characters.
BBEdit, TextWrangler's big brother, allows you to change the file's line
endings between UNIX (\n), Mac (\r), and Windows (\r\n) by selecting the
type at the bottom of the file's window.
I just downloaded TextWrangler 4.01 and it has the same feature. Unless
you like TextWrangler more than MacVim or Emacs, it's time to RTFM and
find out the features of the tool you're using.
--
DeeDee, don't press that button! DeeDee! NO! Dee...
[I filter all Goggle Groups posts, so any reply may be automatically ignored]
------------------------------
Date: Tue, 12 Jun 2012 23:01:12 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: Re: Very Sluggish Code
Message-Id: <slrnjtfbgp.84v.hjp-usenet2@hrunkner.hjp.at>
On 2012-06-11 21:40, Ben Morrow <ben@morrow.me.uk> wrote:
> Quoth GlenM <glenmillard@gmail.com>:
>> I have a Perl script, which I will post the contents below. It seems to
>> get 'stuck' and I have to do an actually kill of the process.
>
> Is the process using a huge amount of memory at that point? Is the
> system swapping?
Good questions.
And questions like these (also: Is the code CPU-bound or disk-bound) can
usually be answered without looking at the code, and can guide you in
the right direction.
But when optimizing some specific code, the most important question is:
Where is it spending most of its time?
You can use a profiler for this (Devel::NYTProf is quite good), but for
a first estimate a few well-placed print statements logging the elapsed
time since program start are often sufficient (and they distort
run-times much less than a profiler).
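A minimal sketch of such timing statements, assuming Time::HiRes for
sub-second resolution. The mark() helper and the summing loop are
placeholders for illustration; in a real script the calls would sit
around the XML parsing and the insert loop:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my $t0 = time();    # taken as early in the program as practical

# Log a label with the seconds elapsed since $t0; returns the elapsed time.
sub mark {
    my ($label) = @_;
    my $elapsed = time() - $t0;
    printf STDERR "[%9.3f s] %s\n", $elapsed, $label;
    return $elapsed;
}

mark('start');
my $sum = 0;
$sum += $_ for 1 .. 200_000;    # stand-in for the real work
mark('after loop');
```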
>> my $nodelist = $xp->find ("//row");
>> foreach my $row ($nodelist->get_nodelist ())
>> {
>> $dbh->do (
>> "INSERT IGNORE INTO rc_city_town (state_prov, city_town,
>> did_number) VALUES (?,?,?)",
>> undef,
>> $row->find ("state")->string_value (),
>> $row->find ("ratecenter")->string_value (),
>> $row->find ("number")->string_value (),
>> );
>
> This will in principle go faster if you use ->prepare and ->execute
> rather than ->do. I don't actually know whether DBD::mysql supports
> server-side prepared statements, so it's possible this will make no
> difference in practice.
I don't know about mysql (recent versions do support server-side
prepared statements, and DBD::mysql supports them, but I haven't run any
benchmarks), but IME (mostly with Oracle) using prepare/execute is
noticeably but not spectacularly faster. This is not surprising:
Preparing an insert is a simple and fast operation (unless you insert
into a view joining a dozen tables), so there is not much time to be
saved. Mostly you save the round-trip time to the database server.
Using array inserts OTOH can result in a massive speedup on databases
which support it. I can't find my old benchmark results (for Oracle) at
the moment, but I think I've seen a speedup of about 2 orders of
magnitude for some workloads. Unfortunately, using multi-row inserts on
MySQL ("insert into ... values (...), (...), (...) ...") didn't
gain much last time I tried it.
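
A sketch of the prepare/execute variant discussed above, reusing the
table and column names from the quoted code. The insert_rows name and
the shape of $rows are assumptions; $dbh is taken to be an already
connected DBI handle with RaiseError set, so failures die instead of
being checked here:

```perl
use strict;
use warnings;

# Prepare the statement once, then execute it for every row.
# $dbh:  a connected DBI database handle (assumed, not created here).
# $rows: array ref of [ state, ratecenter, number ] triples.
sub insert_rows {
    my ( $dbh, $rows ) = @_;
    my $sth = $dbh->prepare(
        'INSERT IGNORE INTO rc_city_town (state_prov, city_town, did_number) '
      . 'VALUES (?,?,?)'
    );
    my $count = 0;
    for my $row (@$rows) {
        $sth->execute(@$row);
        $count++;
    }
    return $count;
}
```

Inside the original loop, each row's three string_value() results would
be collected into such a triple before calling insert_rows.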
hp
--
_ | Peter J. Holzer | Deprecating human carelessness and
|_|_) | Sysadmin WSR | ignorance has no successful track record.
| | | hjp@hjp.at |
__/ | http://www.hjp.at/ | -- Bill Code on asrg@irtf.org
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 3713
***************************************