[23701] in Perl-Users-Digest


Perl-Users Digest, Issue: 5907 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Dec 7 14:05:41 2003

Date: Sun, 7 Dec 2003 11:05:10 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 7 Dec 2003     Volume: 10 Number: 5907

Today's topics:
    Re: ! <geoff.cox@blueyonder.co.uk>
        can some one please explain this regex?! <geoff.cox@blueyonder.co.uk>
        Do we just use different file handles when we want to h <someone@somewhere.nb.ca>
    Re: Do we just use different file handles when we want  <jundy@jundy.com>
    Re: Do we just use different file handles when we want  <jwillmore@remove.adelphia.net>
    Re: How to open a file from the end and read the last 1 <jwillmore@remove.adelphia.net>
    Re: hwo to match more than 1 line? <noreply@gunnar.cc>
        newbie's question on the text file processing? <bacchantecn@yahoo.com.cn>
    Re: newbie's question on the text file processing? <asu1@c-o-r-n-e-l-l.edu>
    Re: newbie's question on the text file processing? <ww3140@_yah-oo.com>
    Re: newbie's question on the text file processing? (Tad McClellan)
    Re: Overloading <usenet@morrow.me.uk>
        Perlcc and converting scripts to bytecode <wjbell@belletc.net>
    Re: Perlcc and converting scripts to bytecode <usenet@morrow.me.uk>
    Re: Processing array elements without iterative loop (John Markos O'Neill)
    Re: read file with while and then scan lines into array (Martin Foster)
    Re: Why can't I parse google search results? <asu1@c-o-r-n-e-l-l.edu>
    Re: Why can't I parse google search results? <flavell@ph.gla.ac.uk>
    Re: Why can't I parse google search results? <ww3140@_yah-oo.com>
    Re: Why can't I parse google search results? <asu1@c-o-r-n-e-l-l.edu>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sun, 07 Dec 2003 16:14:03 GMT
From: Geoff Cox <geoff.cox@blueyonder.co.uk>
Subject: Re: !
Message-Id: <lbk6tvcl62s5p10bl5qur3mt13k4ermjbo@4ax.com>

On Sun, 07 Dec 2003 20:33:27 +0900, ko <kuujinbo@hotmail.com> wrote:

>Geoff Cox wrote:
>> On Sun, 7 Dec 2003 23:10:48 +1300, "Tintin" <me@privacy.net> wrote:
>> 
>> 
>>>"Geoff Cox" <geoff.cox@blueyonder.co.uk> wrote in message
>>>news:1es5tv8ppfs34pua3uesreni0hlkjvsc7d@4ax.com...
>
>[snip]
>
>>>>Ideas please?!
>>>
>>>You've discovered that regexes aren't very robust/easy/flexible when it
>>>comes to parsing HTML.  Use one of the HTML parsers on CPAN.
>> 
>> There seem to be a large number of them! any recommendation?!
>
>HTML::Parser. If you're only interested in extracting text, here's an 
>example to get you started:
>
>http://search.cpan.org/src/GAAS/HTML-Parser-3.34/eg/htext
>
>There are other example scripts in the parent directory.
>
>HTH - keith

Keith - thanks for the link...

Cheers

Geoff



------------------------------

Date: Sun, 07 Dec 2003 18:02:07 GMT
From: Geoff Cox <geoff.cox@blueyonder.co.uk>
Subject: can some one please explain this regex?!
Message-Id: <0iq6tv0f7u1efrv1ge01eog6e89kpc4okj@4ax.com>

Hello,

this comes from my posting re how to match more than 1 line (from
Gunnar) but would appreciate any one just explaining what is matched
as the code does not work for me. If I could learn from this I could
probably sort it out for myself ..

Thanks

Geoff

     if ( $line =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)
                    .+?
                    Address.+?<TD[^>]+>([^<]+)
                   /isx ) {


------------------------------

Date: Sun, 07 Dec 2003 17:15:32 GMT
From: "Guy" <someone@somewhere.nb.ca>
Subject: Do we just use different file handles when we want to have multiple files open at the same time...?
Message-Id: <UeJAb.6677$IF6.316336@ursa-nb00s0.nbnet.nb.ca>

To have many files open at the same time, do I just use a different file
handle like this:

open(FILEA,"$stopfil");
open(FILEB,"$startfil");

# some extra code goes here

close(FILEA);
close(FILEB);

Or are there other things that have to be done also.

Guy Doucet




------------------------------

Date: Sun, 07 Dec 2003 18:56:32 GMT
From: Erik Tank <jundy@jundy.com>
Subject: Re: Do we just use different file handles when we want to have multiple files open at the same time...?
Message-Id: <99c6e8e730b528b308c3a162e59891c8@news.teranews.com>

That is pretty much all you have to do to open multiple files at the
same time.

On Sun, 07 Dec 2003 17:15:32 GMT, "Guy" <someone@somewhere.nb.ca>
wrote:

>To have many files open at the same time, do I just use a different file
>handle like this:
>
>open(FILEA,"$stopfil");
>open(FILEB,"$startfil");
>
># some extra code goes here
>
>close(FILEA);
>close(FILEB);
>
>Or are there other things that have to be done also.
>
>Guy Doucet
>



------------------------------

Date: Sun, 07 Dec 2003 17:18:45 GMT
From: James Willmore <jwillmore@remove.adelphia.net>
Subject: Re: Do we just use different file handles when we want to have multiple files open at the same time...?
Message-Id: <20031207121845.1c9a4183.jwillmore@remove.adelphia.net>

On Sun, 07 Dec 2003 17:15:32 GMT
"Guy" <someone@somewhere.nb.ca> wrote:

> To have many files open at the same time, do I just use a different
> file handle like this:
> 
> open(FILEA,"$stopfil");
> open(FILEB,"$startfil");
> 
> # some extra code goes here
> 
> close(FILEA);
> close(FILEB);
> 
> Or are there other things that have to be done also.

That will work.  However, I'd check to see if you actually opened the
file.

open(FILEA,"$stopfil") or die "Open failed for $stopfil: $!\n";
open(FILEB,"$startfil") or die "Open failed for $startfil: $!\n";
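
For what it's worth, here is a small sketch of the same idea using lexical filehandles and three-argument open (both available since Perl 5.6). The sub name copy_lines and the temporary files are made up for illustration; they only stand in for $stopfil and $startfil.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy lines from one file to another while both handles are open at once.
# Returns the number of lines copied.
sub copy_lines {
    my ($from, $to) = @_;
    open(my $in,  '<', $from) or die "Open failed for $from: $!\n";
    open(my $out, '>', $to)   or die "Open failed for $to: $!\n";
    my $count = 0;
    while (my $line = <$in>) {
        print {$out} $line;
        $count++;
    }
    close($in);
    close($out) or die "Close failed for $to: $!\n";
    return $count;
}

# Demo with two temporary files (hypothetical stand-ins for the real names).
my ($src_fh, $src_name) = tempfile(UNLINK => 1);
print {$src_fh} "one\ntwo\nthree\n";
close($src_fh);
my (undef, $dst_name) = tempfile(UNLINK => 1);

print copy_lines($src_name, $dst_name), " lines copied\n";
```

Each handle lives in its own variable, so you can have as many open at once as the operating system allows.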

HTH

-- 
Jim

Copyright notice: all code written by the author in this post is
 released under the GPL. http://www.gnu.org/licenses/gpl.txt 
for more information.

a fortune quote ...
"Do not meddle in the affairs of wizards, for you are crunchy and
good with ketchup." 


------------------------------

Date: Sun, 07 Dec 2003 17:01:19 GMT
From: James Willmore <jwillmore@remove.adelphia.net>
Subject: Re: How to open a file from the end and read the last 100 lines
Message-Id: <20031207120119.7fcafc22.jwillmore@remove.adelphia.net>

On Sun, 07 Dec 2003 03:37:47 GMT
"Mihai N." <nmihai_year_2000@yahoo.com> wrote:
> Uri Guttman <uri@stemsystems.com> wrote in
> news:x7brqlsao7.fsf@mail.sysarch.com: 
> >>>>>> "AS" == Anno Siegel <anno4000@lublin.zrz.tu-berlin.de>
> >writes:
> >   AS> Sara <genericax@hotmail.com> wrote in comp.lang.perl.misc:
> >  >> anno4000@lublin.zrz.tu-berlin.de (Anno Siegel) wrote in
> >  >message> news:<bqs8md$hfo$1@mamenchi.zrz.TU-Berlin.DE>...
> >  >> > > 
> >  >> > >    die "I hate MONDAYS!\n" unless open F, 'log';
> >  >> > >    my @l = <F>;
> >  >> > >    close F;
> >  >> > >    @l = splice @l, @l-100;
> >  >> > > 
> >  >> > > and botta bing you have your last 100 lines! 
> >  >> > 
> > >  >> > ...except when the file has fewer than 100 lines, in which
> > >  >> > case a fatal run-time error results.  Uri's File::ReadBackwards
> > >  >> > deals correctly with that case.
> > >  >> > 
> > >  >> > This is a good demonstration why "rolling your own" is a bad
> > >  >> > idea, even if the problem looks trivial.
> >  >> 
> >  >> Oh yes a small mod for that trivial case:
> >  >> 
> >  >> splice @l, @l-100 if @l > 100;
> > 
> > >   AS> Don't shrug off trivial bugs.  In a production program, this
> > >   AS> kind of bug can go unnoticed for a long time.  Then the program
> > >   AS> fails for no good reason, perhaps because it's called a few
> > >   AS> times a day instead of just once.  Not good.  CPAN modules
> > >   AS> don't *have* trivial bugs like that.
> > 
> > and my module is much faster than her code as well. slurping in a
> > whole file to get last 100 lines is a waste of ram and cpu. and if
> > the file is a large log, forget it. sara will just have to learn
> > that rolling your own all the time is fruitless. the ultimate
> > result is write your own in c because perl is just a large c based
> > application. 
> > 
> > uri
> > 
> 
> what about this:
> while( <> ) {
>     	push @lines, $_;
>     	shift @lines if $#lines > 100;
> }
> 
> I agree CPAN is great.
> But you have to put hings in balance.
> 
> When the time + effort to search for what you need,
> evaluate the 20 possible modules, select one or two, compare,
> understand how they work, I would rather write my own two lines.
> 
> I would not do this for complex stuff, like parsing XML/HTML,
> sending emails, DB interogations, etc.
> Where is the limit for "complex" for each one, it is for each
> one to decide. If I am not able to write a line of perl to
> do my stuff, chances are I will not be able to use properly a
> CPAN module.
> CPAN is not going to think for you.

If you *must* "roll your own", you might try this.  However, read the
notes at the end of the code before using.

--untested--
#!/usr/bin/perl -w

#use strict pragma
use strict;

#get the filename of the file to process - 
#die if no filename is provided
my $in_file = shift 
	or die "No filename provided\n";
#open the file to process -
#die if it can't be opened
open(IN, $in_file) 
	or die "Can't open $in_file: $!\n";
#read the file - in reverse - into an array
my @reverse = reverse <IN>;
#close the file
close IN;

#declare a counter
my $count = 0;
#declare an array to hold the lines we want
my @lines;
#while there is still an array 
#containing the original file (in reverse) ...
while(@reverse){
	#increment the counter
	$count++;
	#push onto the array for the lines we want the
	#value of the current line - shift it from the
	#original file (in reverse) array
	push @lines, shift @reverse;
	#break the loop if we reach the amount of lines
	#from the end of the file (ie last 100 lines
	#of the original file)
	last if $count == 100;
}

#print the final results (ie last 100 lines of the file)
print reverse @lines;
--untested--

I tested (yes, I *always* put untested for the code, because IMHO it's
never tested enough) this with two files: one with 5 lines, and my
messages file (which has well over 100 lines).  It worked for both.
However, this method has some issues.  First, if the file being read
exceeds the system memory, the script will most likely crash.  It also
uses open versus sysopen.  And there's no file locking.  And you
will most likely miss log messages if you use this against an active
log file.  There are probably other issues with this method, but if
you want to use it instead of using a tested method, be my guest :-)

This was "quick and dirty" - so the quality is most likely below
standards.
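
If memory is the main worry, a sliding window avoids holding the whole file at once. This is only a sketch using core Perl; the sub name last_lines and the in-memory test data are made up for illustration. It still reads the file from the start, so I would expect it to be slower on a large log than reading backwards from the end the way File::ReadBackwards does.

```perl
use strict;
use warnings;

# Return the last $n lines read from filehandle $fh,
# holding at most $n lines in memory at any time.
sub last_lines {
    my ($fh, $n) = @_;
    my @window;
    while (my $line = <$fh>) {
        push @window, $line;
        shift @window if @window > $n;   # drop the oldest line
    }
    return @window;
}

# Demo on an in-memory file of 250 numbered lines.
my $data = join '', map { "line $_\n" } 1 .. 250;
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!\n";
my @tail = last_lines($fh, 100);
close($fh);
print scalar(@tail), " lines, starting with $tail[0]";
```

A file with fewer than 100 lines simply comes back whole, so no special case is needed for short files.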

HTH

-- 
Jim

Copyright notice: all code written by the author in this post is
 released under the GPL. http://www.gnu.org/licenses/gpl.txt 
for more information.

a fortune quote ...
"We have reason to believe that man first walked upright to free 
his hands for masturbation."   -- Lily Tomlin 


------------------------------

Date: Sun, 07 Dec 2003 19:07:26 +0100
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: hwo to match more than 1 line?
Message-Id: <bqvqgr$27gvlg$1@ID-184292.news.uni-berlin.de>

Geoff Cox wrote:
> Gunnar Hjalmarsson wrote:
>> 
>>    if ( $line =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)
>>                   .+?
>>                   Address.+?<TD[^>]+>([^<]+)
>>                  /isx ) {
>>        print "Name: $1\nAddress: $2\n";
>>    }
> 
> the above is not working for me at the moment - if you have the
> time (and patience!) it would really help me if you could "talk" me
> through it ...

I'd prefer not to. Besides the character classes, which we now have
explained, and a couple of modifiers, whose meaning you can read about
in 'perldoc perlre', it doesn't include anything that was not included
in the regex you posted yourself.

I suggest that you post a minimal but complete program that others can
run and that illustrates that the above regex fails in extracting the
name and address.
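
A minimal but complete test program might look like the sketch below. The HTML fragment is made up, since the real page has not been posted, and the sub name extract_name_address is only there for illustration.

```perl
use strict;
use warnings;

# Extract the head teacher's name and the address from an HTML fragment.
# Returns an empty list when the pattern does not match.
sub extract_name_address {
    my ($html) = @_;
    if ( $html =~ /Head\s+Teacher.+?<TD[^>]+>([^<]+)
                   .+?
                   Address.+?<TD[^>]+>([^<]+)
                  /isx ) {
        return ($1, $2);
    }
    return;
}

# Made-up sample data; the real page layout is an assumption.
my $html = <<'END_HTML';
<TR><TD class="label">Head Teacher</TD>
<TD class="value">Mrs A. Example</TD></TR>
<TR><TD class="label">Address</TD>
<TD class="value">1 School Lane</TD></TR>
END_HTML

my ($name, $address) = extract_name_address($html);
print "Name: $name\nAddress: $address\n";
```

Two things a program like this makes easy to check. The pattern can only span lines if the variable holds the whole block of text, not one line at a time from a while loop. And <TD[^>]+> needs at least one character between "<TD" and ">", so a bare <TD> tag will not match.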

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl



------------------------------

Date: Mon, 8 Dec 2003 00:05:51 +0800
From: "Jim" <bacchantecn@yahoo.com.cn>
Subject: newbie's question on the text file processing?
Message-Id: <bqvj4l$d2h$1@mail.cn99.com>

Hello,

I am learning Perl and have run into a problem. I would like to process
a text file and calculate the word frequencies in it. The analysis is
case-insensitive, and all punctuation marks other than hyphens, apostrophes,
and plus and minus signs are replaced by spaces. As I am a newbie, I have
no idea how to write a complex regular expression to extract the correct
words one by one from the file.  Can anyone help me finish the script?





------------------------------

Date: 7 Dec 2003 16:41:32 GMT
From: "A. Sinan Unur" <asu1@c-o-r-n-e-l-l.edu>
Subject: Re: newbie's question on the text file processing?
Message-Id: <Xns944A76F17ED9Easu1cornelledu@132.236.56.8>

"Jim" <bacchantecn@yahoo.com.cn> wrote in
news:bqvj4l$d2h$1@mail.cn99.com: 

> Hello,
> 
> I am learning Perl and I have come across something. I would like to
> process the text file and calculate the word frequency in it. All
> analysis is case insensitive and all punctuation marks other than
> hyphens, apostrophe and plus and minus signs were substituted by the
> space.As I am a new bie, I have no idea of  how to write a complex
> regular expression to extract the correct word one by one  from the
> file.  

This smells of homework or some other blatant attempt to make others do 
your work for you.

> Can anyone help me finish the script? 

Show us what you have done so far and ask specific questions.

-- 
A. Sinan Unur
asu1@c-o-r-n-e-l-l.edu
Remove dashes for address
Spam bait: mailto:uce@ftc.gov


------------------------------

Date: Sun, 07 Dec 2003 17:11:45 GMT
From: ww <ww3140@_yah-oo.com>
Subject: Re: newbie's question on the text file processing?
Message-Id: <pkn6tvknntbfqoclmh51ges9ev9dpopiq0@4ax.com>

hint: what does open() do?
hint: what does join(split()) do?
hint: what does grep() return?
hint: I don't know how to solve your problem.

-w w



On Mon, 8 Dec 2003 00:05:51 +0800, "Jim" <bacchantecn@yahoo.com.cn>
wrote:

>Hello,
>
>I am learning Perl and I have come across something. I would like to process
>the text file and calculate the word frequency in it. All analysis is case
>insensitive and all punctuation marks other than hyphens, apostrophe and
>plus and minus signs were substituted by the space.As I am a new bie, I have
>no idea of  how to write a complex regular expression to extract the correct
>word one by one  from the file.  Can anyone help me finish the script?
>
>



------------------------------

Date: Sun, 7 Dec 2003 11:52:05 -0600
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: newbie's question on the text file processing?
Message-Id: <slrnbt6q65.krj.tadmc@magna.augustmail.com>

Jim <bacchantecn@yahoo.com.cn> wrote:

> I would like to process
> the text file and calculate the word frequency in it.

   my %words;
   while ( <> ) {
      $words{$1}++ while /(\w+)/g;
   }
   printf "%9d   %s\n", $words{$_}, $_ for sort keys %words;
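
The \w+ above does not keep hyphens, apostrophes or plus signs inside words. A variation along the lines of the original request might look like this sketch. It assumes plain ASCII text, and the sub name word_frequency and the sample sentence are made up for illustration.

```perl
use strict;
use warnings;

# Count word frequencies in the lines read from filehandle $fh.
# Case is folded; hyphens, apostrophes and plus signs stay inside words,
# and every other non-alphanumeric character acts as a separator.
sub word_frequency {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        $line = lc $line;
        $line =~ tr/a-z0-9'+-/ /c;       # anything else becomes a space
        $count{$_}++ for split ' ', $line;
    }
    return \%count;
}

# Demo on a made-up sentence held in an in-memory file.
my $text = "The cat's well-known trick: the CAT sat. Rated A+ (the best)!\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!\n";
my $freq = word_frequency($fh);
close($fh);
printf "%9d   %s\n", $freq->{$_}, $_ for sort keys %$freq;
```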


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Sun, 7 Dec 2003 17:38:43 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Overloading
Message-Id: <bqvoj3$3ih$2@wisteria.csv.warwick.ac.uk>

"Jim Keenan" <no_spam_for_jkeen@verizon.net> wrote:
> In attempting to reproduce your problem, I rearranged the code so as to put
> package I at the top of the file,

Why?

> then explicitly called package main.

package is lexical, so the braces make sure we go back to main::.

>  I also threw in some newlines for readability.

 ...which is why I used '#!/usr/bin/perl -l', which puts in all those
newlines and several more.

> The result: package main ran without warnings.

 ...because you didn't turn them on early enough. Well done.

Ben

-- 
   If you put all the prophets,   |   You'd have so much more reason
   Mystics and saints             |   Than ever was born
   In one room together,          |   Out of all of the conflicts of time.
ben@morrow.me.uk |----------------+---------------| The Levellers, 'Believers'


------------------------------

Date: Sun, 07 Dec 2003 16:10:32 GMT
From: Warren Bell <wjbell@belletc.net>
Subject: Perlcc and converting scripts to bytecode
Message-Id: <YhIAb.33982$cK2.1699@newssvr29.news.prodigy.com>

I'm running Perl 5.8.2 on linux.  I've heard a little about perlcc so I 
decided to try it on one of my scripts (perlcc -o index -B index.cgi) 
and I have a few questions:

Will the script in bytecode run faster?

Can I distribute the script in bytecode and will it work on most 
linux/unix systems with perl?

Is it easy for someone to turn that bytecode back into my original source?


------------------------------

Date: Sun, 7 Dec 2003 17:34:20 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Perlcc and converting scripts to bytecode
Message-Id: <bqvoas$3ih$1@wisteria.csv.warwick.ac.uk>

Warren Bell <wjbell@belletc.net> wrote:
> I'm running Perl 5.8.2 on linux.  I've heard a little about perlcc so I 
> desided to try it on one of my scripts (perlcc -o index -B index.cgi) 
> and I have a few questions:

perlcc is considered experimental, and should not be used for
production code.

> Will the script in bytecode run faster?

No. Startup will (may) be faster, due to not having to compile the
program initially.

> Can I distribute the script in bytecode and will it work on most 
> linux/unix systems with perl?

Yes, if it has 'use ByteLoader;' at the top.

> Is it easy for someone to turn that bytecode back into my original
> source?

Yes. See B::Deparse.
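
As a rough illustration of what B::Deparse does, the sketch below regenerates source text for a sub that is already compiled in the running program. The sub add_tax is made up for the example. Comments and original formatting do not survive, but the logic and the lexical variable names do.

```perl
use strict;
use warnings;
use B::Deparse;

# A small sub whose source we pretend not to have.
sub add_tax {
    my ($price, $rate) = @_;
    return $price * (1 + $rate);
}

# Regenerate Perl source for the compiled sub from its op tree.
my $deparser = B::Deparse->new;
my $source   = $deparser->coderef2text(\&add_tax);
print "sub add_tax $source\n";
```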

Ben

-- 
Every twenty-four hours about 34k children die from the effects of poverty.
Meanwhile, the latest estimate is that 2800 people died on 9/11, so it's like
that image, that ghastly, grey-billowing, double-barrelled fall, repeated
twelve times every day. Full of children. [Iain Banks]         ben@morrow.me.uk


------------------------------

Date: 7 Dec 2003 09:46:57 -0800
From: john@nhoj.com (John Markos O'Neill)
Subject: Re: Processing array elements without iterative loop
Message-Id: <8179ccbd.0312070946.584ad5e8@posting.google.com>

Hi all, Uri Guttman wrote,

> why do you need that temp var? and what if the element was 0 - that
> would fail. 
> 
>   JMO>     print("\$even_array_ele:  $even_array_ele\n");
>   JMO> }

I agree:  the temporary variable is completely unnecessary (in fact,
it's causing a problem) and I have eliminated it.

> others have shown you clean short ways. i just want to know why yours
> was much longer than it needed to be.

I simply didn't imagine a cleaner solution.  Thanks Jürgen Exner and
A. Sinan Unur for your more elegant ones.

John Markos O'Neill


------------------------------

Date: 7 Dec 2003 08:47:35 -0800
From: mdfoster44@netscape.net (Martin Foster)
Subject: Re: read file with while and then scan lines into array
Message-Id: <6a20f90a.0312070847.55fba893@posting.google.com>

Here's the data
 .....skipping top part of file
loop_
_iza_sc_CoordinationSequence
1 4 9 17 28 42 60 82 111 149 191 229 262 297 336 384
1 4 10 19 30 44 63 89 121 155 188 221 258 302 355 415
1 4 9 18 32 49 68 89 114 144 179 221 267 314 364 417

loop_
_iza_sc_VertexSymbols
4.6.4.6.4.6
4.4.6.6.6.8_{3}
4.4.4.6.8.12
 ......skipping bottom part of file.

I want to scan the number sequences after
_iza_sc_CoordinationSequence
into an array and then insert them into MySQL.
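
Roughly, what I am after could be sketched like this. The sub name read_coordination_sequences is made up, and the demo reads the sample above from an in-memory file instead of INFILE. It collects one array reference per row and carries on scanning after the blank line that ends the block.

```perl
use strict;
use warnings;

# Collect the rows that follow the _iza_sc_CoordinationSequence tag.
# A blank line ends the block; scanning then continues for other data.
# Returns a list of array references, one per row of numbers.
sub read_coordination_sequences {
    my ($fh) = @_;
    my @sequences;
    my $in_block = 0;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^_iza_sc_CoordinationSequence\b/) {
            $in_block = 1;
        }
        elsif ($in_block) {
            if ($line =~ /\S/) {
                push @sequences, [ split ' ', $line ];
            }
            else {
                $in_block = 0;           # blank line: block is finished
            }
        }
    }
    return @sequences;
}

# Demo using the sample data from this post.
my $sample = <<'END_DATA';
loop_
_iza_sc_CoordinationSequence
1 4 9 17 28 42 60 82 111 149 191 229 262 297 336 384
1 4 10 19 30 44 63 89 121 155 188 221 258 302 355 415
1 4 9 18 32 49 68 89 114 144 179 221 267 314 364 417

loop_
_iza_sc_VertexSymbols
4.6.4.6.4.6
END_DATA

open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!\n";
my @sequences = read_coordination_sequences($fh);
close($fh);
print scalar(@sequences), " sequences, first one: @{ $sequences[0] }\n";
```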



"Jim Keenan" <no_spam_for_jkeen@verizon.net> wrote in message news:<aSxAb.1503$Ji.1154@nwrdny02.gnilink.net>...
> "Martin Foster" <mdfoster44@netscape.net> wrote in message
> news:6a20f90a.0312051108.2559ac0f@posting.google.com...
> > I'm scanning text files into a database.
> >
> > My perl script looks like this:
> >
> 
> You've written your post in such a confusing manner that it is difficult to
> figure out what your problem is.

I was being a little too brief.

> 
> > # start loop of file to scan for data
> >   while (defined ($_2 = <INFILE>)){
> >     # Find  cell data
> >     if ($_2 =~ m/_cell_length_a\s+(-?([0-9]+(\.[0-9]*)?|\.[0-9]+))/){
> >       $cell[0] = $1;
> 
> In the code presented, you don't assign to any element of @cell other than
> $cell[0].  So why use an array at all?
> 
I do have other data lines I scan in, but yes I could just reuse the
same variable.

> >       print "Found cell parameter a= ", $cell[0], "  ";
> >       print "For str_id number ", $au_id, "\n";
> 
> Where did $au_id come from?
> 
$au_id is the auto-increment value from MySQL; I get it earlier in my
code.

> >       # Insert data
> >       $stmt1 = "UPDATE bgb_data SET latpar_a = ? WHERE str_id = ?";
> >       $sth = $dbh->prepare($stmt1);
> >       $sth->execute($cell[0], $au_id);
> >     }
> >
> >    # get sequences
> >     if ($_2 =~ m/_Sequence/){
> >       # start loop to scan in sequences
> 
> This loop is incomplete.  Was what you really intended something like this?
> 
I've got several if statements... I can do several ifs and then the
last one is else if, right? or is if...else if...elseif....else if
etc.?
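
For reference, Perl spells the keyword elsif (not "else if" or "elseif"), and a chain is if ... elsif ... elsif ... else, where only the first true condition has its branch run. A small sketch, with a made-up sub name classify_line and made-up return strings:

```perl
use strict;
use warnings;

# Classify one input line with a single if/elsif/else chain.
# Only the first condition that is true has its branch executed.
sub classify_line {
    my ($line) = @_;
    if ($line =~ m/_cell_length_a\s+(-?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+))/) {
        return "cell_a=$1";
    }
    elsif ($line =~ m/_Sequence/) {
        return 'sequence_tag';
    }
    else {
        return 'other';
    }
}

print classify_line($_), "\n"
    for '_cell_length_a   13.675', '_iza_sc_CoordinationSequence', 'loop_';
```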

>      if  ($_2 =~ m/_cell_length_a\s+(-?([0-9]+(\.[0-9]*)?|\.[0-9]+))/){
>         # process
>     } elsif () {
>      # process
>     } ($_2 =~ m/_Sequence/)
>  
> >
> > So now I've found a tag and the next few lines are number sequences
> > which
> > I want in an array.
> >
> > I want to scan in those lines into until a blank line appears and then
> > continue scanning for further data, in the while loop.
> >
> Does that mean that when you are processing a file line-by-line and
> encounter a blank line, you wish to start a new array to hold the sequence
> numbers?
> 
Yes almost.
> Can you provide some sample data we could test this with?
> 
Please see above.
> Jim Keenan

Thanks for your help.

Kind regards,
Martin Foster.


------------------------------

Date: 7 Dec 2003 16:24:35 GMT
From: "A. Sinan Unur" <asu1@c-o-r-n-e-l-l.edu>
Subject: Re: Why can't I parse google search results?
Message-Id: <Xns944A7411BFE15asu1cornelledu@132.236.56.8>

utsuxs@hotmail.com (bob) wrote in 
news:51c3a5d3.0312070801.5093c8cf@posting.google.com:

> Can't fetch HTML from http://www.google.com/search?q=smeghead at
> parsing.pl line 13.

The error is on line 13.

> I obviously missing something but I don't know what it is.  

How can we know if we don't see the code?

> Help would be greatly appreaciated.

Help others help you!

> Thank you.

You are welcome.

-- 
A. Sinan Unur
asu1@c-o-r-n-e-l-l.edu
Remove dashes for address
Spam bait: mailto:uce@ftc.gov


------------------------------

Date: Sun, 7 Dec 2003 16:53:17 +0000
From: "Alan J. Flavell" <flavell@ph.gla.ac.uk>
Subject: Re: Why can't I parse google search results?
Message-Id: <Pine.LNX.4.53.0312071644500.18755@ppepc56.ph.gla.ac.uk>

On Sun, 7 Dec 2003, A. Sinan Unur wrote:

> utsuxs@hotmail.com (bob) wrote in
> news:51c3a5d3.0312070801.5093c8cf@posting.google.com:
>
> > Can't fetch HTML from http://www.google.com/search?q=smeghead at
> > parsing.pl line 13.
>
> The error is on line 13.

Joking apart - some of the hardest-to-diagnose errors are those
where the error report is pointing somewhere else than the line which
is _really_ in error, due to some kind of knock-on effect.

> How can we know if we don't see the code?

Let's not tempt the newbie to shovel their entire 600-line script onto
Usenet, though.

We _do_ need to see the code in some kind of appropriate context,
sure.  The advice in the group's posting guidelines (as posted
regularly by Tad) would stand the questioner in good stead, if they
would only read it and at least give an impression that they're
following its advice.

Hint:  the above line doesn't appear to be an error message coming
from Perl itself.  Ergo, it's probably an error from some code written
in Perl.  Look more closely at that code - work out whether it can
provide some additional diagnostics, and, if it can, then work out why
they aren't being displayed by the calling program.  (the variable $!
may be of interest, for example).


------------------------------

Date: Sun, 07 Dec 2003 17:07:51 GMT
From: ww <ww3140@_yah-oo.com>
Subject: Re: Why can't I parse google search results?
Message-Id: <s5n6tvc0uklq228r9mft2iu1usk7rcb83c@4ax.com>


~cough~ http://www.google.com/apis/

Check out my post here: http://perlmonks.com/index.pl?node_id=182706

Using the google api will be the easiest way IMHO.

or maybe google is expecting form data via the POST method, rather
than GET as you are trying to use.

or perhaps this will help:
http://search.cpan.org/~petdance/WWW-Mechanize-0.70/lib/WWW/Mechanize.pm

or maybe any of these modules:
http://search.cpan.org/search?query=google&mode=module


good luck.

-w w

On 7 Dec 2003 08:01:16 -0800, utsuxs@hotmail.com (bob) wrote:

>I'm trying to extract data from the results page of search engines
>with these two
>modules  use LWP::Simple and HTML::Parse, and the get command.
>
>I can extract from yahoo and altavista but google is not cooperating.
>
>I get this error message
>
>Can't fetch HTML from http://www.google.com/search?q=smeghead at
>parsing.pl line 13.
>
>
>
>I obviously missing something but I don't know what it is.  Help would
>be greatly appreaciated.  Thank you.



------------------------------

Date: 7 Dec 2003 17:40:55 GMT
From: "A. Sinan Unur" <asu1@c-o-r-n-e-l-l.edu>
Subject: Re: Why can't I parse google search results?
Message-Id: <Xns944A8102C6225asu1cornelledu@132.236.56.8>

"Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote in 
news:Pine.LNX.4.53.0312071644500.18755@ppepc56.ph.gla.ac.uk:

> On Sun, 7 Dec 2003, A. Sinan Unur wrote:
> 
>> utsuxs@hotmail.com (bob) wrote in
>> news:51c3a5d3.0312070801.5093c8cf@posting.google.com:
>>
>> > Can't fetch HTML from http://www.google.com/search?q=smeghead at
>> > parsing.pl line 13.
>>
>> The error is on line 13.
> 
> Joking apart - some of the hardest-to-diagnose errors are those
> where the error report is pointing somewhere else than the line which
> is _really_ in error, due to some kind of knock-on effect.

That is true.

>> How can we know if we don't see the code?
> 
> Let's not tempt the newbie to shovel their entire 600-line script onto
> Usenet, though.

Which is why I asked for the code, but you are absolutely right, I should 
have pointed the OP either to the posting guidelines or explained how to 
post source code.
 
> We _do_ need to see the code in some kind of appropriate context,
> sure.  The advice in the group's posting guidelines (as posted
> regularly by Tad) would stand the questioner in good stead, if they
> would only read it and at least give an impression that they're
> following its advice.

So, a plea to the OP: Please read the guidelines before posting source 
code:

http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

Doing so and following the recommendations therein will ensure you can 
get the best help possible.

Sinan
-- 
A. Sinan Unur
asu1@c-o-r-n-e-l-l.edu
Remove dashes for address
Spam bait: mailto:uce@ftc.gov


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 5907
***************************************
