[24241] in Perl-Users-Digest


Perl-Users Digest, Issue: 6432 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Apr 20 14:10:49 2004

Date: Tue, 20 Apr 2004 11:10:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 20 Apr 2004     Volume: 10 Number: 6432

Today's topics:
    Re: slurp not working? ideas please! (Anno Siegel)
    Re: slurp not working? ideas please! <geoffacox@dontspamblueyonder.co.uk>
    Re: slurp not working? ideas please! <uri@stemsystems.com>
    Re: slurp not working? ideas please! <xxala_qumsiehxx@xxyahooxx.com>
    Re: slurp not working? ideas please! <Joe.Smith@inwap.com>
    Re: slurp not working? ideas please! <geoffacox@dontspamblueyonder.co.uk>
    Re: slurp not working? ideas please! <geoffacox@dontspamblueyonder.co.uk>
    Re: slurp not working? ideas please! <geoffacox@dontspamblueyonder.co.uk>
    Re: slurp not working? ideas please! <geoffacox@dontspamblueyonder.co.uk>
    Re: slurp not working? ideas please! <tassilo.parseval@rwth-aachen.de>
    Re: slurp not working? ideas please! <glex_nospam@qwest.invalid>
    Re: slurp not working? ideas please! <geoffacox@dontspamblueyonder.co.uk>
    Re: slurp not working? ideas please! <glex_nospam@qwest.invalid>
    Re: Writing fast(er) performing parsers in Perl <clint@0lsen.net>
    Re: Writing fast(er) performing parsers in Perl (Walter Roberson)
        XML::Xerces questions <apollock11@hotmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 20 Apr 2004 13:06:34 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: slurp not working? ideas please!
Message-Id: <c6378q$m3p$4@mamenchi.zrz.TU-Berlin.DE>

Geoff Cox  <geoffacox@dontspamblueyonder.co.uk> wrote in comp.lang.perl.misc:
> On 20 Apr 2004 11:24:01 GMT, anno4000@lublin.zrz.tu-berlin.de (Anno
> Siegel) wrote:
> 
> >> thanks for your reply ...  but which part of the OOP code, if any,
> >> would go in the File::Find sub? 
> >
> >I have no idea.  Your code is too much of a mess to repair, so that
> >would amount to writing the program for you.  This is something we
> >rarely do.
> 
> Anno,
> 
> hey! don't mind my feelings will you!! 
> 
> I don't expect you to rewrite my code - just to point out, if you can,

I can't.  That's the whole point!

> why I am getting the particular error message, "use of uninitialized
> value in pattern match at line (below)".
> 
>     if ($next3 =~ /\$i\<(\d+);/) {
> 
> I am quite prepared to admit that the code is not very well written,
> but apart from this particular problem it does work. I have left out
> large parts of the code which do work ...

Then show your code as it is now.  A sub that defines (non-anonymous)
subs in its body is so much off kilter, it's impossible to guess what
it should and shouldn't do.

Anno


------------------------------

Date: Tue, 20 Apr 2004 13:25:46 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: slurp not working? ideas please!
Message-Id: <hs8a80d5j5suhef7g5ces3joauql9kshjq@4ax.com>

On Tue, 20 Apr 2004 08:19:32 -0500, Tad McClellan
<tadmc@augustmail.com> wrote:

>You should always, yes *always*, check the return value from open().
>
>This has been pointed out to you before.
>
>Do you actually read the followups to your posts?

Tad,

I am not quite so bad as my post indicated! I do check for opened
files etc but have been moving around so much trying to find out what
is wrong that this got missed out. The code below works for the file
indicated except that for some reason the slurp does not work 

ie in the sub classroomnotes part ...

my $line = <INNN>;

while (<INNN>){
    last if /$pattern/;
              }
my ($curr, $next1, $next2, $next3) = <INNN>;
    close (INNN);

    if ($next3 =~ /\$i\<(\d+);/) {

etc - the last line giving the warning. Can you see why?!

Thanks

Geoff



package MyParser;
use base qw(HTML::Parser);

my $in_heading;
my $p;

my $name = "as-left.htm";
open (OUT, ">>d:/a-keep9/short-nondb/short/members2/$name") ||
die "cannot open >>d:/a-keep9/short-nondb/short/members2/$name \n";

print OUT ("<html><head><title>test</title></head><body> \n");
print OUT ("<table width='100%' border='1'> \n");

sub start {

        my ($self, $tagname, $attr, undef, $origtext) = @_;

	if ($tagname eq 'h2') {
	    $in_heading = 1;
	    return;
                              }

        if ($tagname eq 'p') {
            $p = 1;
	    return;
                             }
       
         if ($tagname eq 'option') {

           choice($attr->{ value });
                                    }
 
        }

        sub end         {
        my ($self, $tagname, $origtext) = @_;
	if ($tagname eq 'h2') {
	    $in_heading = 0;
	    return;
                              }


         if ($tagname eq 'p') {
            $p = 0;
	    return;
                              }
                        }
    
    sub text       {
        my ($self, $origtext) = @_;
        print OUT ("<h2>$origtext</h2> \n") if $in_heading;
        print OUT ("<p>$origtext</p> \n") if $p;

                   }

sub choice {
my ($path) = @_;
 
if ($path =~ /docs\/aslevel\/classroom-notes/) {
  intro($path);
  classroomnotes($path);
                                               } 

           }

sub intro {

my ($pathhere) = @_;
   open (INN, "d:/a-keep9/short-nondb/db/total-160404.txt") ||
   die "cannot open d:/a-keep9/short-nondb/db/total-160404.txt \n";
my $lineintro;

       while (defined ($lineintro = <INN>)) {
              if ($lineintro =~ /$pathhere','(.*?)'\)\;/) {
               print OUT ("<tr><td>$1 <p> </td>\n");
                                                          }
                                            }
          }



sub classroomnotes {

my ($pattern) = @_;

# print ("\$pattern has value $pattern \n");

open (INNN, "d:/a-keep9/short-nondb/allphp/allphp2.php") ||
die "cannot open d:/a-keep9/short-nondb/allphp/allphp2.php \n";

my $line = <INNN>;

while (<INNN>){
    last if /$pattern/;
              }
my ($curr, $next1, $next2, $next3) = <INNN>;
    close (INNN);

    if ($next3 =~ /\$i\<(\d+);/) {
    my $nn = $1;
    print OUT ("<td valign='top'> \n");
    for ($c=1;$c<$nn;$c++) {
    print OUT ('<a href="'. $pattern . "-doc" . $c . ".zip" . '">' .
"Document$c" . "</a><br>" . "\n");
                           }     
    print OUT ("</td></tr>\n");
                                 }
                          }


package main;
open (IN, "d:/a-keep9/short-nondb/oldshort2/$name") ||
die "cannot open package main d:/a-keep9/short-nondb/oldshort2/$name
\n";
undef $/;
my $html = <IN>;
my $parser = MyParser->new;
$parser->parse($html);

print OUT ("</tr></table> \n");
print OUT ("</body></html> \n");






------------------------------

Date: Tue, 20 Apr 2004 13:35:53 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: slurp not working? ideas please!
Message-Id: <x7k70apy9z.fsf@mail.sysarch.com>

>>>>> "GC" == Geoff Cox <geoffacox@dontspamblueyonder.co.uk> writes:

  GC> I am not quite so bad as my post indicated! I do check for opened
  GC> files etc but have been moving around so much trying to find out
  GC> what is wrong that this got missed out. The code below works for
  GC> the file indicated except that for some reason the slurp does not
  GC> work

your code does indicate your badness.

and this code does not slurp. slurping means reading the entire file in
one operation, but you read it in a loop and can even stop reading before
the end of the file.
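
for comparison, a minimal sketch of what slurping usually looks like. the
in-memory filehandle and the sample text are assumptions made so that the
snippet runs on its own; they are not taken from the posted script.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a real file.
my $data = "line one\nline two\nline three\n";

# Open a read handle on the in-memory string.
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

# Slurp: with $/ undefined, a single read returns everything up to end of file.
# local restores the previous value of $/ when the do block ends.
my $content = do { local $/; <$fh> };
close($fh);

print length($content), " characters read in one operation\n";
```

keeping the change to $/ inside a do block means later reads elsewhere in
the program still work line by line.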

  GC> ie in the sub classroomnotes part ...

  GC> my $line = <INNN>;

what happens to this line?

  GC> while (<INNN>){
  GC>     last if /$pattern/;
  GC>               }
  GC> my ($curr, $next1, $next2, $next3) = <INNN>;

that will read to the end of the file and not just the next four lines.
the first four lines read go into your variables and the rest are thrown away.

if you have fewer than 4 lines left, then some of those vars will be undef.
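
a small sketch of that behaviour, assuming made-up data in which only two
lines follow the matching line (the data and the MATCH pattern are
hypothetical, not from the original file):

```perl
use strict;
use warnings;

# Hypothetical data: the pattern line is followed by only two more lines.
my $data = "header\nMATCH here\nafter 1\nafter 2\n";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

# Scalar-context reads: stop at the first line that matches.
while (my $line = <$fh>) {
    last if $line =~ /MATCH/;
}

# List-context read: consumes ALL remaining lines, then assigns the first four.
my ($curr, $next1, $next2, $next3) = <$fh>;
close($fh);

for my $pair (['curr', $curr], ['next1', $next1], ['next2', $next2], ['next3', $next3]) {
    my ($name, $value) = @$pair;
    print "$name is ", (defined $value ? "defined" : "undef"), "\n";
}
```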

  GC>     if ($next3 =~ /\$i\<(\d+);/) {

  GC> etc - the last line giving the warning. Can you see why?!

no, i can't see why because i can't see your data nor your intent. i can
say you have an undef value there.

and learn to properly indent your code. it is unreadable and i won't
apologise if your feelings are hurt. you post code here, you get
feedback here. programming is not for the overly sensitive.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: Tue, 20 Apr 2004 13:37:56 GMT
From: Ala Qumsieh <xxala_qumsiehxx@xxyahooxx.com>
Subject: Re: slurp not working? ideas please!
Message-Id: <UI9hc.53252$I33.46366@newssvr25.news.prodigy.com>

Geoff Cox wrote:
> 
> my $line = <INNN>;
> 
> while (<INNN>){
>     last if /$pattern/;
>               }
> my ($curr, $next1, $next2, $next3) = <INNN>;
>     close (INNN);
> 
>     if ($next3 =~ /\$i\<(\d+);/) {
> 
> etc - the last line giving the warning. Can you see why?!

Because INNN doesn't have enough lines to be read?

--Ala



------------------------------

Date: Tue, 20 Apr 2004 13:58:28 GMT
From: Joe Smith <Joe.Smith@inwap.com>
Subject: Re: slurp not working? ideas please!
Message-Id: <80ahc.4900$GR.622274@attbi_s01>

Geoff Cox wrote:

> is wrong that this got missed out....The code below works for the file
> indicated except that for some reason the slurp does not work 
> 
> ie in the sub classroomnotes part ...
> 
> my $line = <INNN>;
> while (<INNN>){
>     last if /$pattern/;
>               }
> my ($curr, $next1, $next2, $next3) = <INNN>;
>     close (INNN);
>     if ($next3 =~ /\$i\<(\d+);/) {
> 
> etc - the last line giving the warning. Can you see why?!

Have you tried using print() statements for debugging?

   my $pattern_line = "NOTHING MATCHED!!\n";
   my $line = <INNN>;
   while (<INNN>) {
     $pattern_line = $_;
     last if /$pattern/;
   }
   my ($curr, $next1, $next2, $next3) = <INNN>;
   print "First line: $line";
   print "Line with pattern or last line of file: $pattern_line";
   # These may be undef if fewer than four lines remained after the loop.
   print "  curr=$curr  next1=$next1  next2=$next2  next3=$next3";
   # eof(INNN) tests this handle; a bare eof() would test the ARGV files instead.
   print "  eof(INNN)=", eof(INNN) ? "true\n" : "false\n";



------------------------------

Date: Tue, 20 Apr 2004 15:11:55 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: slurp not working? ideas please!
Message-Id: <96fa801ru1jr3at8f3bek7jj2j3bli0am6@4ax.com>

On 20 Apr 2004 13:06:34 GMT, anno4000@lublin.zrz.tu-berlin.de (Anno
Siegel) wrote:

>Then show your code as it is now.  A sub that defines (non-anonymous)
>subs in its body is so much off kilter, it's impossible to guess what
>it should and shouldn't do.


Anno

the code as of now follows - I am confused re how the OOP fits in with
the File::Find ... but as I am using html files in the 1 folder have
removed the File::Find part .. but still get the warning re
uninitialized value in pattern match for the

   if ($next3 =~ /\$i\<(\d+);/) {

in sub classroomnotes

any ideas why?

Cheers

Geoff

package MyParser;
use base qw(HTML::Parser);

my $in_heading;
my $p;

my $name = "as-left.htm";
open (OUT, ">>d:/a-keep9/short-nondb/short/members2/$name") ||
die "cannot open >>d:/a-keep9/short-nondb/short/members2/$name \n";

print OUT ("<html><head><title>test</title></head><body> \n");
print OUT ("<table width='100%' border='1'> \n");

sub start {

        my ($self, $tagname, $attr, undef, $origtext) = @_;

	if ($tagname eq 'h2') {
	    $in_heading = 1;
	    return;
                              }

        if ($tagname eq 'p') {
            $p = 1;
	    return;
                             }
       
         if ($tagname eq 'option') {

           choice($attr->{ value });
                                    }
 
        }

        sub end         {
        my ($self, $tagname, $origtext) = @_;
	if ($tagname eq 'h2') {
	    $in_heading = 0;
	    return;
                              }


         if ($tagname eq 'p') {
            $p = 0;
	    return;
                              }
                        }
    
    sub text       {
        my ($self, $origtext) = @_;
        print OUT ("<h2>$origtext</h2> \n") if $in_heading;
        print OUT ("<p>$origtext</p> \n") if $p;

                   }

sub choice {
my ($path) = @_;
 
if ($path =~ /docs\/aslevel\/classroom-notes/) {
  intro($path);
  classroomnotes($path);
                                               } 

           }

sub intro {

my ($pathhere) = @_;
   open (INN, "d:/a-keep9/short-nondb/db/total-160404.txt") ||
   die "cannot open d:/a-keep9/short-nondb/db/total-160404.txt \n";
my $lineintro;

       while (defined ($lineintro = <INN>)) {
              if ($lineintro =~ /$pathhere','(.*?)'\)\;/) {
               print OUT ("<tr><td>$1 <p> </td>\n");
                                                          }
                                            }
          }



sub classroomnotes {

my ($pattern) = @_;

# print ("\$pattern has value $pattern \n");

open (INNN, "d:/a-keep9/short-nondb/allphp/allphp2.php") ||
die "cannot open d:/a-keep9/short-nondb/allphp/allphp2.php \n";

my $line = <INNN>;

while (<INNN>){
    last if /$pattern/;
              }
my ($curr, $next1, $next2, $next3) = <INNN>;
    close (INNN);

    if ($next3 =~ /\$i\<(\d+);/) {
    my $nn = $1;
    print OUT ("<td valign='top'> \n");
    for ($c=1;$c<$nn;$c++) {
    print OUT ('<a href="'. $pattern . "-doc" . $c . ".zip" . '">' .
"Document$c" . "</a><br>" . "\n");
                           }     
    print OUT ("</td></tr>\n");
                                 }
                          }


package main;
open (IN, "d:/a-keep9/short-nondb/oldshort2/$name") ||
die "cannot open package main d:/a-keep9/short-nondb/oldshort2/$name
\n";
undef $/;
my $html = <IN>;
my $parser = MyParser->new;
$parser->parse($html);

print OUT ("</tr></table> \n");
print OUT ("</body></html> \n");




------------------------------

Date: Tue, 20 Apr 2004 15:29:52 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: slurp not working? ideas please!
Message-Id: <mifa80122umuj95cgddc1986n82hdglle7@4ax.com>

On Tue, 20 Apr 2004 13:35:53 GMT, Uri Guttman <uri@stemsystems.com>
wrote:

Uri

>  GC> my $line = <INNN>;
>
>what happens to this line?

my mistake - the above line should not be there.

The following code does work when in a separate script in which I
provide the path for the sub classroomnotes but is not working in the
script in the second section below

Geoff

--------------------------------------------------------

while (<INNN>)         {
    last if /$pattern/;
                                 }
my ($curr, $next1, $next2, $next3) = <INNN>;
    print ("$curr - $next1 - $next2 - $next3 \n");
    close (INNN);

    if ($next3 =~ /\$i\<(\d+);/)      {
    my $nn = $1;
    print OUT ("<td valign='top'> \n");
         for ($c=1;$c<$nn;$c++)               {
             print OUT ('<a href="'. $pattern . "-doc" . $c . ".zip" .
'">' . "Document$c" . "</a><br>" . "\n");
                                                             }     
    print OUT ("</td></tr>\n");
                                                            }

---------------------------------------------------------------------

package MyParser;
use base qw(HTML::Parser);

my $in_heading;
my $p;

my $name = "as-left.htm";
open (OUT, ">>d:/a-keep9/short-nondb/short/members2/$name") ||
die "cannot open >>d:/a-keep9/short-nondb/short/members2/$name \n";

print OUT ("<html><head><title>test</title></head><body> \n");
print OUT ("<table width='100%' border='1'> \n");

sub start {

        my ($self, $tagname, $attr, undef, $origtext) = @_;

	if ($tagname eq 'h2') {
	    $in_heading = 1;
	    return;
                              }

        if ($tagname eq 'p') {
            $p = 1;
	    return;
                             }
       
         if ($tagname eq 'option') {

           choice($attr->{ value });
                                    }
 
        }

        sub end         {
        my ($self, $tagname, $origtext) = @_;
	if ($tagname eq 'h2') {
	    $in_heading = 0;
	    return;
                              }


         if ($tagname eq 'p') {
            $p = 0;
	    return;
                              }
                        }
    
    sub text       {
        my ($self, $origtext) = @_;
        print OUT ("<h2>$origtext</h2> \n") if $in_heading;
        print OUT ("<p>$origtext</p> \n") if $p;

                   }

sub choice {
my ($path) = @_;
 
if ($path =~ /docs\/aslevel\/classroom-notes/) {
  intro($path);
  classroomnotes($path);
                                               } 

           }

sub intro {

my ($pathhere) = @_;
   open (INN, "d:/a-keep9/short-nondb/db/total-160404.txt") ||
   die "cannot open d:/a-keep9/short-nondb/db/total-160404.txt \n";
my $lineintro;

       while (defined ($lineintro = <INN>)) {
              if ($lineintro =~ /$pathhere','(.*?)'\)\;/) {
               print OUT ("<tr><td>$1 <p> </td>\n");
                                                          }
                                            }
          }



sub classroomnotes {

my ($pattern) = @_;

# print ("\$pattern has value $pattern \n");

open (INNN, "d:/a-keep9/short-nondb/allphp/allphp2.php") ||
die "cannot open d:/a-keep9/short-nondb/allphp/allphp2.php \n";

while (<INNN>){
    last if /$pattern/;
              }
my ($curr, $next1, $next2, $next3) = <INNN>;
    close (INNN);

    if ($next3 =~ /\$i\<(\d+);/) {
    my $nn = $1;
    print OUT ("<td valign='top'> \n");
    for ($c=1;$c<$nn;$c++) {
    print OUT ('<a href="'. $pattern . "-doc" . $c . ".zip" . '">' .
"Document$c" . "</a><br>" . "\n");
                           }     
    print OUT ("</td></tr>\n");
                                 }
                          }


package main;
open (IN, "d:/a-keep9/short-nondb/oldshort2/$name") ||
die "cannot open package main d:/a-keep9/short-nondb/oldshort2/$name
\n";
undef $/;
my $html = <IN>;
my $parser = MyParser->new;
$parser->parse($html);

print OUT ("</tr></table> \n");
print OUT ("</body></html> \n");



------------------------------

Date: Tue, 20 Apr 2004 15:32:01 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: slurp not working? ideas please!
Message-Id: <agga80lu2gpsqeoaiqj8krp1nvf61qm84j@4ax.com>

On Tue, 20 Apr 2004 13:58:28 GMT, Joe Smith <Joe.Smith@inwap.com>
wrote:

>> etc - the last line giving the warning. Can you see why?!
>
>Have you tried using print() statements for debugging?
>
>   my $pattern_line = "NOTHING MATCHED!!\n";
>   my $line = <INNN>;
>   while (<INNN>) {
>     $pattern_line = $_;
>     last if /$pattern/;
>   }
>   my ($curr, $next1, $next2, $next3) = <INNN>;
>   print "First line: $line";
>   print "Line with pattern or last line of file: $pattern_line";
>   print "  curr=$curr  next1=$next1  next2=$next2";
>   print "  eof(INNN)=", eof(INNN) ? "true\n" : "false\n";


Joe

Yes, have tried some of the above but not the eof one !

Thanks

Geoff


------------------------------

Date: Tue, 20 Apr 2004 15:55:48 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: slurp not working? ideas please!
Message-Id: <6lha80ppb6jp5jmo5abjf1iuj19uo0qu6n@4ax.com>

On Tue, 20 Apr 2004 15:32:01 GMT, Geoff Cox
<geoffacox@dontspamblueyonder.co.uk> wrote:

the following code works when on its own but when in the longer script
as a sub it does not!?

Geoff

my $pattern = "docs/aslevel/classroom-notes/finance/finance";

open (INNN, "d:/a-keep9/short-nondb/allphp/allphp2.php");
open (OUT, ">>d:/a-keep9/short-nondb/short/members2/test.htm");

while (<INNN>)          {
    last if /$pattern/;
                                  }
my ($curr, $next1, $next2, $next3) = <INNN>;
    close (INNN);

    if ($next3 =~ /\$i\<(\d+);/)         {
    my $nn = $1;
    print ("\$nn = $nn \n");
    print OUT ("<td valign='top'> \n");
          for ($c=1;$c<$nn;$c++)  {
          print OUT ('<a href="'. $pattern . "-doc" . $c . ".zip" .
'">' . "Document$c" . "</a><br>" . "\n");
                                                 }     
    print OUT ("</td></tr>\n");
                                                       }

----------------------------------------------------------------------------------

in the following context it does not work ...

package MyParser;
use base qw(HTML::Parser);

my $in_heading;
my $p;

my $name = "as-left.htm";
open (OUT, ">>d:/a-keep9/short-nondb/short/members2/$name") ||
die "cannot open >>d:/a-keep9/short-nondb/short/members2/$name \n";

print OUT ("<html><head><title>test</title></head><body> \n");
print OUT ("<table width='100%' border='1'> \n");

sub start {

        my ($self, $tagname, $attr, undef, $origtext) = @_;

	if ($tagname eq 'h2') {
	    $in_heading = 1;
	    return;
                              }

        if ($tagname eq 'p') {
            $p = 1;
	    return;
                             }
       
         if ($tagname eq 'option') {

           choice($attr->{ value });
                                    }
 
        }

        sub end         {
        my ($self, $tagname, $origtext) = @_;
	if ($tagname eq 'h2') {
	    $in_heading = 0;
	    return;
                              }


         if ($tagname eq 'p') {
            $p = 0;
	    return;
                              }
                        }
    
    sub text       {
        my ($self, $origtext) = @_;
        print OUT ("<h2>$origtext</h2> \n") if $in_heading;
        print OUT ("<p>$origtext</p> \n") if $p;

                   }

sub choice {
my ($path) = @_;
 
if ($path =~ /docs\/aslevel\/classroom-notes/) {
  intro($path);
  classroomnotes($path);
                                               } 

           }

sub intro {

my ($pathhere) = @_;
   open (INN, "d:/a-keep9/short-nondb/db/total-160404.txt") ||
   die "cannot open d:/a-keep9/short-nondb/db/total-160404.txt \n";
my $lineintro;

       while (defined ($lineintro = <INN>)) {
              if ($lineintro =~ /$pathhere','(.*?)'\)\;/) {
               print OUT ("<tr><td>$1 <p> </td>\n");
                                                          }
                                            }
          }

sub classroomnotes {

my ($pattern) = @_;

open (INNN, "d:/a-keep9/short-nondb/allphp/allphp2.php") ||
die "cannot open d:/a-keep9/short-nondb/allphp/allphp2.php \n";

while (<INNN>)            {
    last if /$pattern/;
                                    }
my ($curr, $next1, $next2, $next3) = <INNN>;
    close (INNN);

           if ($next3 =~ /\$i\<(\d+);/)                  {
           my $nn = $1;
          print OUT ("<td valign='top'> \n");
                 for ($c=1;$c<$nn;$c++)             {
                 print OUT ('<a href="'. $pattern . "-doc" . $c .
".zip" . '">' . "Document$c" . "</a><br>" . "\n");
                                                                   }
                 print OUT ("</td></tr>\n");

}
                               }

package main;
open (IN, "d:/a-keep9/short-nondb/oldshort2/$name") ||
die "cannot open package main d:/a-keep9/short-nondb/oldshort2/$name
\n";
undef $/;
my $html = <IN>;
my $parser = MyParser->new;
$parser->parse($html);

print OUT ("</tr></table> \n");
print OUT ("</body></html> \n");





------------------------------

Date: 20 Apr 2004 16:14:45 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: slurp not working? ideas please!
Message-Id: <c63i9l$7kubu$1@ID-231055.news.uni-berlin.de>

Also sprach Anno Siegel:

> Geoff Cox  <geoffacox@dontspamblueyonder.co.uk> wrote in comp.lang.perl.misc:

>> I am quite prepared to admit that the code is not very well written,
>> but apart from this particular problem it does work. I have left out
>> large parts of the code which do work ...
> 
> Then show your code as it is now.  A sub that defines (non-anonymous)
> subs in its body is so much off kilter, it's impossible to guess what
> it should and shouldn't do.

Actually, the code doesn't define functions inside others. The indenting
merely suggests it does. :-) The code is probably a bit better than it
looks on first sight (after all, it was in major parts written by me in
a previous thread;-).

To the OP: Please fix the indenting first (just as Uri has told you). As
it currently is, it is deliberately misleading its readers. Maybe this
will already help you to spot the problem yourself.

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Tue, 20 Apr 2004 11:22:36 -0500
From: "J. Gleixner" <glex_nospam@qwest.invalid>
Subject: Re: slurp not working? ideas please!
Message-Id: <o7chc.19$C73.12031@news.uswest.net>

Geoff Cox wrote:
> On Tue, 20 Apr 2004 15:32:01 GMT, Geoff Cox
> <geoffacox@dontspamblueyonder.co.uk> wrote:
> 
> the following code works when on its own but when in the longer script
> as a sub it does not!?


If you really want others to help, then please do what many have 
suggested.. FORMAT YOUR FLIPPIN CODE! For guidelines see:

perldoc perlstyle

Print the value of $next3 before the following line

if ($next3 =~ /\$i\<(\d+);/)

so you/we can see its value.  Obviously it's not what you think it is.
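
A plain print of an undefined value shows nothing useful, so here is a
minimal sketch of a debugging helper that makes undef and the empty string
visible. The helper name show() is hypothetical, and $next3 is deliberately
left unset to stand in for the variable from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: render a value so that undef and '' can be told apart.
sub show {
    my ($value) = @_;
    return defined $value ? "'$value'" : 'undef';
}

my $next3;    # stands in for the variable from the thread; deliberately unset
print "next3 = ", show($next3), "\n";
```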


------------------------------

Date: Tue, 20 Apr 2004 17:15:20 GMT
From: Geoff Cox <geoffacox@dontspamblueyonder.co.uk>
Subject: Re: slurp not working? ideas please!
Message-Id: <vgma80h999voc5jllp3oih9359f85h0850@4ax.com>

On Tue, 20 Apr 2004 11:22:36 -0500, "J. Gleixner"
<glex_nospam@qwest.invalid> wrote:

>If you really want others to help, then please do what many have 
>suggested.. FORMAT YOUR FLIPPIN CODE! For guidelines see:
>
>perldoc perlstyle

will check this..
>
>Print the value of $next3 before the following line
>
>if ($next3 =~ /\$i\<(\d+);/)
>
>so you/we can see its value.  Obviously it's not what you think it is.

I have done that and $next3 is empty ... but am not clear where to go
from there!?

Cheers

Geoff



------------------------------

Date: Tue, 20 Apr 2004 12:47:58 -0500
From: "J. Gleixner" <glex_nospam@qwest.invalid>
Subject: Re: slurp not working? ideas please!
Message-Id: <indhc.38$xo5.32897@news.uswest.net>

Geoff Cox wrote:
> On Tue, 20 Apr 2004 11:22:36 -0500, "J. Gleixner"
> <glex_nospam@qwest.invalid> wrote:
> 
> 
>>If you really want others to help, then please do what many have 
>>suggested.. FORMAT YOUR FLIPPIN CODE! For guidelines see:
>>
>>perldoc perlstyle
> 
> 
> will check this..

Could also use "perltidy" which does a pretty good job.

> 
>>Print the value of $next3 before the following line
>>
>>if ($next3 =~ /\$i\<(\d+);/)
>>
>>so you/we can see its value.  Obviously it's not what you think it is.
> 
> 
> I have done that and $next3 is empty ... but am not clear where to go
> from there!?

$next3 is 'undef'ined.

What are the 4 lines of allphp2.php following the line with the 
$pattern?  Compare them to the values of $curr, $next1, $next2, and 
$next3.  Do they match?

Possibly $pattern isn't found and the "last" is never performed. You
probably want a flag there:

my $found;
while (<INNN>) {
	if (/$pattern/) { $found=1; last; }
}

return if !$found;  # Or print some error message..
my ( $curr, $next1, $next2, $next3 ) = <INNN>;

Or possibly you're off by one line.

If $pattern is found, then the second <INNN> starts with the line after 
the line with the match.  Meaning, $curr would be the line after the 
match.  Guessing by the names of your variables, maybe you want:

my ($next1, $next2, $next3 ) = <INNN>;

Also, you don't need the ()'s for print.
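
One further possibility, offered as a sketch rather than a diagnosis: the
full script posted earlier in the thread runs "undef $/;" in package main
before calling $parser->parse($html), and that setting is global. If it is
still in effect when classroomnotes() runs, the while loop would read the
whole file as a single record and leave nothing for the list read, which
would also explain why the same code works in a separate script. The
in-memory data and the helper name lines_after() below are assumptions for
illustration only.

```perl
use strict;
use warnings;

# Hypothetical data standing in for allphp2.php.
my $data = "first\nPATTERN line\nnext 1\nnext 2\nnext 3\n";

# Return up to $count lines that follow the first line matching $pattern.
sub lines_after {
    my ($pattern, $count) = @_;
    local $/ = "\n";    # read line by line, whatever the caller did to $/
    open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";
    my $found;
    while (my $line = <$fh>) {
        if ($line =~ /$pattern/) { $found = 1; last; }
    }
    my @after;
    if ($found) {
        # Scalar-context reads: take only as many lines as were asked for.
        while (@after < $count) {
            my $line = <$fh>;
            last unless defined $line;
            push @after, $line;
        }
    }
    close($fh);
    return @after;
}

undef $/;    # mimic the slurp setting made in package main
my @lines = lines_after(qr/PATTERN/, 3);
print scalar(@lines), " lines found after the match\n";
```

Localizing $/ inside the sub keeps it independent of whatever the caller
set, and the caller can check how many lines came back before using them.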


------------------------------

Date: Tue, 20 Apr 2004 16:15:20 GMT
From: Clint Olsen <clint@0lsen.net>
Subject: Re: Writing fast(er) performing parsers in Perl
Message-Id: <slrnc8aj4o.rg7.clint@poly.0lsen.net>

On 2004-04-19, Walter Roberson <roberson@ibd.nrc-cnrc.gc.ca> wrote:
>
> In my parsing programs, I fairly consistently find that 70% or more of
> the runtime is being spent simply split()'ing the lines into fields
> (using the default whitespace splitting.)
>
> I haven't tried Parse::Yapp, but I did find that performance improved a
> fair bit when I "guess" that a key field will start at the same string
> offset as it did on the previous line, and use substr() to probe for it
> there, and only doing the more general split() if that probe failed [or
> if the context of the keyword is such that I need the full split to
> understand the line anyhow.]

It's actually more efficient to treat the file as a stream of tokens and
handle newline just like any other token.  This is more effective when
dealing with languages like C or Verilog which are free-form and whitespace
isn't necessarily required to separate tokens.  It's also helpful when
doing bookkeeping like line/column tracking for tokens within the file.
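
A minimal sketch of that approach, assuming a toy token set (identifiers,
numbers, single-character punctuation). The tokenize() name and the token
type names are hypothetical; the point is only that newline is emitted as
a token and drives the line counter.

```perl
use strict;
use warnings;

# Split $text into tokens, treating newline as a token of its own.
# Returns a list of [type, text, line] array references.
sub tokenize {
    my ($text) = @_;
    my @tokens;
    my $line = 1;
    pos($text) = 0;
    while (pos($text) < length($text)) {
        if ($text =~ /\G\n/gc) {
            push @tokens, ['NEWLINE', "\n", $line];
            $line++;
        }
        elsif ($text =~ /\G[ \t\r]+/gc) {
            # other whitespace separates tokens but is not kept
        }
        elsif ($text =~ /\G([A-Za-z_]\w*)/gc) {
            push @tokens, ['IDENT', $1, $line];
        }
        elsif ($text =~ /\G(\d+)/gc) {
            push @tokens, ['NUMBER', $1, $line];
        }
        elsif ($text =~ /\G(.)/gcs) {
            push @tokens, ['PUNCT', $1, $line];
        }
    }
    return @tokens;
}

my @tokens = tokenize("a=1;\nb = a+2;\n");
for my $token (@tokens) {
    my ($type, $text, $line) = @$token;
    $text = '\n' if $type eq 'NEWLINE';    # show the newline visibly
    printf "%-8s %-4s line %d\n", $type, $text, $line;
}
```

The /gc modifiers with \G keep the match position in the string itself, so
no substrings are copied while scanning.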

-Clint


------------------------------

Date: 20 Apr 2004 17:14:46 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Writing fast(er) performing parsers in Perl
Message-Id: <c63lq6$95l$1@canopus.cc.umanitoba.ca>

In article <slrnc8aj4o.rg7.clint@poly.0lsen.net>,
Clint Olsen  <clint@0lsen.net> wrote:
:It's actually more efficient to treat the file as a stream of tokens and
:handle newline just like any other token.

Not if your grammar happens to be line-oriented (as has been the case
for the parsing I've been doing.) In your approach, you have the overhead
of tokenizing -everything-, but in line-oriented grammars there may
be portions of the line that can be ignored (at least until the
context calls upon them.)

For example, a lot of my parsing these days is on firewall logs.
Each line starts with a fixed-width datestamp (put in by the logging host),
then the name of the logging device, then a second fixed-width datestamp
(put in by the firewall), followed by a fixed-width message-type code.
For my purposes, well over half of the message-type codes are
indicative of lines that can be ignored, so by just extracting one
substring and doing a hash lookup, half the time I can determine that
I don't need to tokenize anything else on the line.
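
A rough sketch of the idea, under assumptions of my own choosing: the field
widths, the offset of the message-type code, the sample log lines and the
codes in %ignore are all hypothetical and would have to be adjusted to the
real log format.

```perl
use strict;
use warnings;

# Assumed layout (hypothetical): 15-char datestamp, space, 4-char device name,
# space, 15-char datestamp, space, then a 6-char message-type code at offset 37.
my $CODE_OFFSET = 37;
my $CODE_LENGTH = 6;

# Hypothetical message-type codes whose lines can be ignored.
my %ignore = map { $_ => 1 } qw(302013 302014 305011);

# Return the fields of a line worth parsing, or an empty list if it is skipped.
sub parse_line {
    my ($line) = @_;
    return if length($line) < $CODE_OFFSET + $CODE_LENGTH;
    my $code = substr($line, $CODE_OFFSET, $CODE_LENGTH);
    return if $ignore{$code};     # one substr and one hash lookup, no split
    return split ' ', $line;      # full split only when the line matters
}

my @log = (
    "Apr 20 14:10:49 fw01 Apr 20 14:10:47 302013 Built outbound TCP connection",
    "Apr 20 14:10:50 fw01 Apr 20 14:10:48 106023 Deny tcp src outside",
);
for my $line (@log) {
    my @fields = parse_line($line);
    my $status = @fields ? "parsed " . scalar(@fields) . " fields" : "skipped";
    print "$status\n";
}
```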
-- 
   Most Windows users will run any old attachment you send them, so if
   you want to implicate someone you can just send them a Trojan
   -- Adam Langley


------------------------------

Date: Tue, 20 Apr 2004 10:21:58 -0700
From: Arvin Portlock <apollock11@hotmail.com>
Subject: XML::Xerces questions
Message-Id: <c63m7q$kt9$1@agate.berkeley.edu>

I'm using the XML::Xerces module to validate batches of
XML documents against a schema. The module is still under
development so there is little documentation that I can
find, but I'm still finding it incredibly useful. I have
4 questions that would enhance my Xerces experience greatly.

1. The way to get validation errors seems incredibly odd
to me:

eval {$parser->parse ($file)};
print $@;

Is this the only way to get at error messages? Via $@?
Does this wrapper provide a more direct method? Does this
seem odd to anybody else in the perl community or is
it just me?

2. Is there any way to use local copies of the schemas
rather than have Xerces fetch them from the web? In my
XML documents the referenced schemas have the form:

xsi:schemaLocation="http://www.loc.gov/standards/mets/mets.xsd"

I.e., they are all URLs. I think this is why Xerces is so
slow. As I'd like to use this module to validate batches
of thousands of documents, it would be nice if Xerces didn't
have to go out and fetch the schemas for every single document.

3. Xerces stops validating after the first error encountered.
Is there any way to get it to report all the errors in the
documents? I understand what the standard says about parsers
and errors, but every other validator I know about has an option
to continue validation after an error. Is there a similar option
for Xerces?

4. Lastly, $@ reports errors in this form:

ERROR:
FILE:    D:\sgml\mets\tei2mets/test.mets.xml
LINE:    34
COLUMN:  27
MESSAGE: Unknown element 'mods:namePPart'
  at validate.pl line 13

So I need to parse out the various pieces using regular expressions
to compose messages in the form I want. So I guess this is a repeat
of the first question: is there a way to get direct access to the
pieces of the error message?
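
For reference, the regular-expression approach I mean looks roughly like
the sketch below. It assumes $@ stringifies to text in exactly the form
shown above; the sample text is copied from that report, and the
parse_report() name is just a placeholder.

```perl
use strict;
use warnings;

# Sample text in the format of the report shown above.
my $error = <<'END';
ERROR:
FILE:    D:\sgml\mets\tei2mets/test.mets.xml
LINE:    34
COLUMN:  27
MESSAGE: Unknown element 'mods:namePPart'
  at validate.pl line 13
END

# Pull the "KEY: value" lines out of the report into a hash.
sub parse_report {
    my ($text) = @_;
    my %field;
    while ($text =~ /^(FILE|LINE|COLUMN|MESSAGE):[ \t]*(.*)$/mg) {
        $field{ lc $1 } = $2;
    }
    return %field;
}

my %err = parse_report($error);
print "$err{file}:$err{line}:$err{column}: $err{message}\n";
```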

I'd prefer to not change or extend Xerces.pm itself.

Thanks!



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6432
***************************************
