[30620] in Perl-Users-Digest


Perl-Users Digest, Issue: 1865 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Sep 20 03:09:46 2008

Date: Sat, 20 Sep 2008 00:09:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 20 Sep 2008     Volume: 11 Number: 1865

Today's topics:
    Re: Easiest way to do this? <RedGrittyBrick@spamweary.invalid>
    Re: IPC:Shareable <clauskick@hotmail.com>
    Re: IPC:Shareable xhoster@gmail.com
        new CPAN modules on Sat Sep 20 2008 (Randal Schwartz)
    Re: Problem using Data::Translate to convert hex to dec <ced@blv-sam-01.ca.boeing.com>
    Re: Regular express for <p>, <ul> and <ol> tags sln@netherlands.com
    Re: Regular express for <p>, <ul> and <ol> tags sln@netherlands.com
    Re: Regular express for <p>, <ul> and <ol> tags sln@netherlands.com
    Re: Regular express for <p>, <ul> and <ol> tags <jurgenex@hotmail.com>
    Re: Regular express for <p>, <ul> and <ol> tags <tadmc@seesig.invalid>
    Re: Regular express for <p>, <ul> and <ol> tags sln@netherlands.com
    Re: Regular express for <p>, <ul> and <ol> tags <jurgenex@hotmail.com>
    Re: Regular express for <p>, <ul> and <ol> tags sln@netherlands.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Fri, 19 Sep 2008 19:47:42 +0100
From: RedGrittyBrick <RedGrittyBrick@spamweary.invalid>
Subject: Re: Easiest way to do this?
Message-Id: <gb0s4i$7of$1@registered.motzarella.org>


void.no.spam.com@gmail.com wrote:
> On Sep 19, 9:15 am, hymie_@_lactose.homelinux.net (hymie!) wrote:
>> In our last episode, the evil Dr. Lacto had captured our hero,
>>   <void.no.spam....@gmail.com>, who said:
>>
>>> I need to be able to take any dollar amount, such as 25103.34, and
>>> multiply it by 0.0000056.  
>>> Then I need to take the result and truncate
>>> everything after the fifth decimal place, so if the result is
>>> 0.140578704, then I will have 0.14057.
>> This seems like a useless step considering
> 
> Thanks for the answer.  That step actually isn't useless, because if
> the multiplication results in something like 0.140000001, and then you
> don't truncate, the final result will be 0.15 when it should be 0.14.
> 

I don't understand what you mean, because ...

 >perl -e "print 25000.00002 * 0.0000056"
0.140000000112

 >perl -e "printf '%.2f', 25000.00002 * 0.0000056"
0.14

 From perldoc -f sprintf:
     # Round number to 3 digits after decimal point
     $rounded = sprintf("%.3f", $number);

So no truncation there.
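
To make the difference between rounding and truncating concrete, here is a
minimal sketch. The helper name truncate_to is hypothetical (it comes neither
from perldoc nor from this thread), and it assumes non-negative amounts;
binary floating point can still surprise when a value sits exactly on a
decimal boundary.

```perl
use strict;
use warnings;

# Hypothetical helper: drop (not round) everything after $places decimals.
# Assumes $number >= 0; int() truncates toward zero.
sub truncate_to {
    my ($number, $places) = @_;
    my $scale = 10 ** $places;
    return int($number * $scale) / $scale;
}

my $result = 25103.34 * 0.0000056;    # about 0.140578704

printf "rounded:   %.5f\n", $result;                  # sprintf rounds
printf "truncated: %.5f\n", truncate_to($result, 5);  # int() truncates
```

With the figure from the original question, the first line ends in 58 and
the second in 57, which is the distinction being discussed.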


-- 
RGB


------------------------------

Date: Fri, 19 Sep 2008 11:19:00 -0700 (PDT)
From: Snorik <clauskick@hotmail.com>
Subject: Re: IPC:Shareable
Message-Id: <0eca352c-b423-47bb-8e29-8893f404cb1a@a1g2000hsb.googlegroups.com>

On 19 Sep., 18:21, Ben Morrow <b...@morrow.me.uk> wrote:

> > So Storable persists (and of course serializes) any data structure; that
> > means I can store the hash to disk (or memory, hopefully memory?).
>
> Yes. You use store/retrieve to save to and load from disk; you use
> freeze/thaw to save to and load from memory.

Ok, thanks for that, I will read the documentation and actually try to
understand it.

> > How can I retrieve this in the calling script, as this sub is going to
> > live in a module itself? I must admit, this is my first attempt at IPC
> > myself.
>
> If you store it with 'freeze', you get it out again with 'thaw'.

Yes, I have understood that, but if I freeze a hash in one script, how
can I thaw it in the other script? I do not have the reference?
I tried to use a tied variable for that, figuring that this should
work this time, but this failed unfortunately.



------------------------------

Date: 19 Sep 2008 18:32:51 GMT
From: xhoster@gmail.com
Subject: Re: IPC:Shareable
Message-Id: <20080919143253.017$d4@newsreader.com>

Snorik <clauskick@hotmail.com> wrote:
> On 19 Sep., 18:21, Ben Morrow <b...@morrow.me.uk> wrote:
>
> >
> > If you store it with 'freeze', you get it out again with 'thaw'.
>
> Yes, I have understood that, but if I freeze a hash in one script, how
> can I thaw it in the other script?

When you freeze, you get serialized data, which is just a string.  You
pass that string to the other script using shared memory (or pipes).

> I do not have the reference?

That is what thaw does.  It makes a reference again out of the serialized
data.  Obviously it isn't the same reference, but a reference to a deep
copy of the referenced data.
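
A minimal sketch of that round trip, kept inside one process for brevity.
The hash contents are made up for illustration; in the two-script case the
frozen string is what would travel through the pipe or shared memory segment.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Illustrative data only.
my %config = (name => 'demo', ports => [80, 443]);

# freeze() takes a reference and returns a plain (binary) string.
my $frozen = freeze(\%config);

# ... here $frozen would be handed to the other script ...

# thaw() rebuilds the structure and returns a new reference.
my $copy = thaw($frozen);

print "same data\n"            if $copy->{ports}[1] == 443;
print "different reference\n"  if $copy != \%config;

# Changing the copy leaves the original untouched, because it is a deep copy.
push @{ $copy->{ports} }, 8080;
print "original ports: ", scalar(@{ $config{ports} }), "\n";
print "copied ports:   ", scalar(@{ $copy->{ports} }), "\n";
```

Since the frozen string is binary, a real pipe or file handle carrying it
would normally need binmode and some way to tell the reader how many bytes
to expect.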

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.


------------------------------

Date: Sat, 20 Sep 2008 04:42:23 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Sat Sep 20 2008
Message-Id: <K7H92n.15HJ@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

Audio-Ecasound-Multitrack-0.96
http://search.cpan.org/~ganglion/Audio-Ecasound-Multitrack-0.96/
Perl libraries for multitrack audio processing using the Ecasound signal-processing engine. 
----
CPAN-WWW-Testers-0.34
http://search.cpan.org/~barbie/CPAN-WWW-Testers-0.34/
Present CPAN Testers data 
----
CPAN-WWW-Testers-Generator-0.29
http://search.cpan.org/~barbie/CPAN-WWW-Testers-Generator-0.29/
Download and summarize CPAN Testers data 
----
Class-Accessor-Fast-XS-0.01
http://search.cpan.org/~ruz/Class-Accessor-Fast-XS-0.01/
XS replacement for Class::Accessor::Fast 
----
DBD-Pg-2.10.6
http://search.cpan.org/~turnstep/DBD-Pg-2.10.6/
PostgreSQL database driver for the DBI module 
----
DBIx-SchemaChecksum-0.21
http://search.cpan.org/~domm/DBIx-SchemaChecksum-0.21/
Generate and compare checksums of database schematas 
----
Data-InputMonster-0.005
http://search.cpan.org/~rjbs/Data-InputMonster-0.005/
consume data from multiple sources, best first; om nom nom! 
----
Data-InputMonster-0.006
http://search.cpan.org/~rjbs/Data-InputMonster-0.006/
consume data from multiple sources, best first; om nom nom! 
----
Data-InputMonster-Util-Catalyst-0.002
http://search.cpan.org/~rjbs/Data-InputMonster-Util-Catalyst-0.002/
InputMonster sources for common Catalyst sources 
----
DateTimeX-Easy-0.083_1
http://search.cpan.org/~rkrimen/DateTimeX-Easy-0.083_1/
Parse a date/time string using the best method available 
----
Devel-Declare-0.002000
http://search.cpan.org/~mstrout/Devel-Declare-0.002000/
----
Fey-0.13
http://search.cpan.org/~drolsky/Fey-0.13/
Better SQL Generation Through Perl 
----
Fey-0.14
http://search.cpan.org/~drolsky/Fey-0.14/
Better SQL Generation Through Perl 
----
Foorum-0.2.4
http://search.cpan.org/~fayland/Foorum-0.2.4/
Foorum is a forum script built in Catalyst. 
----
Games-Risk-1.0.0
http://search.cpan.org/~jquelin/Games-Risk-1.0.0/
classical 'risk' board game 
----
Games-Sudoku-SudokuTk-0.11
http://search.cpan.org/~cguine/Games-Sudoku-SudokuTk-0.11/
Sudoku Game 
----
Gtk2-Ex-Xor-3
http://search.cpan.org/~kryde/Gtk2-Ex-Xor-3/
shared support for drawing with XOR 
----
IO-Socket-SSL-1.16
http://search.cpan.org/~sullr/IO-Socket-SSL-1.16/
Nearly transparent SSL encapsulation for IO::Socket::INET. 
----
IO-Socket-SSL-1.16_1
http://search.cpan.org/~sullr/IO-Socket-SSL-1.16_1/
Nearly transparent SSL encapsulation for IO::Socket::INET. 
----
Kephra-0.3.10.12
http://search.cpan.org/~lichtkind/Kephra-0.3.10.12/
crossplatform, GUI-Texteditor along perllike Paradigms 
----
Locale-Currency-Format-1.26
http://search.cpan.org/~tnguyen/Locale-Currency-Format-1.26/
Perl functions for formatting monetary values 
----
Mail-DWIM-0.03
http://search.cpan.org/~mschilli/Mail-DWIM-0.03/
Do-What-I-Mean Mailer 
----
Module-Build-0.2808_05
http://search.cpan.org/~ewilhelm/Module-Build-0.2808_05/
Build and install Perl modules 
----
Module-Starter-Plugin-CGIApp-0.05
http://search.cpan.org/~jaldhar/Module-Starter-Plugin-CGIApp-0.05/
template based module starter for CGI apps. 
----
Muldis-D-0.48.0
http://search.cpan.org/~duncand/Muldis-D-0.48.0/
Formal spec of Muldis D relational DBMS lang 
----
Muldis-Rosetta-0.12.0
http://search.cpan.org/~duncand/Muldis-Rosetta-0.12.0/
Full-featured truly relational DBMS in Perl 
----
ORLite-0.12
http://search.cpan.org/~adamk/ORLite-0.12/
Extremely light weight SQLite-specific ORM 
----
ORLite-0.13
http://search.cpan.org/~adamk/ORLite-0.13/
Extremely light weight SQLite-specific ORM 
----
ORLite-Mirror-0.07
http://search.cpan.org/~adamk/ORLite-Mirror-0.07/
Extend ORLite to support remote SQLite databases 
----
POE-Component-SmokeBox-Uploads-NNTP-0.08
http://search.cpan.org/~bingos/POE-Component-SmokeBox-Uploads-NNTP-0.08/
Obtain uploaded CPAN modules via NNTP. 
----
POE-Component-WakeOnLAN-1.00
http://search.cpan.org/~bingos/POE-Component-WakeOnLAN-1.00/
A POE Component to send packets to power on computers. 
----
Parse-CPAN-Distributions-0.05
http://search.cpan.org/~barbie/Parse-CPAN-Distributions-0.05/
Provides an index for current CPAN distributions 
----
Perlanet-0.02
http://search.cpan.org/~davecross/Perlanet-0.02/
A program for creating web pages that aggregate web feeds (both RSS and Atom). 
----
Perlanet-0.03
http://search.cpan.org/~davecross/Perlanet-0.03/
A program for creating web pages that aggregate web feeds (both RSS and Atom). 
----
Perlanet-0.04
http://search.cpan.org/~davecross/Perlanet-0.04/
A program for creating web pages that aggregate web feeds (both RSS and Atom). 
----
Pod-From-GoogleWiki-0.01
http://search.cpan.org/~fayland/Pod-From-GoogleWiki-0.01/
convert from Google Code wiki markup to POD 
----
Pod-From-GoogleWiki-0.02
http://search.cpan.org/~fayland/Pod-From-GoogleWiki-0.02/
convert from Google Code wiki markup to POD 
----
Pod-From-GoogleWiki-0.03
http://search.cpan.org/~fayland/Pod-From-GoogleWiki-0.03/
convert from Google Code wiki markup to POD 
----
RPC-XML-0.62
http://search.cpan.org/~rjray/RPC-XML-0.62/
A set of classes for core data, message and XML handling 
----
RPC-XML-0.63
http://search.cpan.org/~rjray/RPC-XML-0.63/
A set of classes for core data, message and XML handling 
----
Rose-DBx-Object-Cached-CHI-0.03
http://search.cpan.org/~kmcgrath/Rose-DBx-Object-Cached-CHI-0.03/
Rose::DB::Object Cache using the CHI interface 
----
SVK-v2.2.0
http://search.cpan.org/~clkao/SVK-v2.2.0/
A Distributed Version Control System 
----
Task-POE-IRC-1.10
http://search.cpan.org/~bingos/Task-POE-IRC-1.10/
Task to install all POE related IRC modules. 
----
Template-Teeny-0.00_001
http://search.cpan.org/~konobi/Template-Teeny-0.00_001/
Teeny-weeny templating system 
----
Test-Aggregate-0.34_05
http://search.cpan.org/~ovid/Test-Aggregate-0.34_05/
Aggregate *.t tests to make them run faster. 
----
Test-Dir-1.006
http://search.cpan.org/~mthurn/Test-Dir-1.006/
test directory attributes 
----
Test-Weaken-0.100000
http://search.cpan.org/~jkegl/Test-Weaken-0.100000/
Test that freed references are, indeed, freed 
----
Tk-StyleDialog-0.02
http://search.cpan.org/~kirsle/Tk-StyleDialog-0.02/
Stylish dialog boxes with custom icons. 
----
Tk-StyleDialog-0.03
http://search.cpan.org/~kirsle/Tk-StyleDialog-0.03/
Stylish dialog boxes with custom icons. 
----
Tk-Wizard-2.141
http://search.cpan.org/~lgoddard/Tk-Wizard-2.141/
GUI for step-by-step interactive logical process 
----
Verilog-Perl-3.042
http://search.cpan.org/~wsnyder/Verilog-Perl-3.042/
----
Video-Dumper-QuickTime-1.0003
http://search.cpan.org/~grandpa/Video-Dumper-QuickTime-1.0003/
Dump QuickTime movie file structure 
----
YAML-YuyuPress-0.07_2
http://search.cpan.org/~jmerelo/YAML-YuyuPress-0.07_2/
Tool for making presentations out of YAML files. 
----
YUM-RepoQuery-0.0.4
http://search.cpan.org/~rsrchboy/YUM-RepoQuery-0.0.4/
Query a YUM repository for package information 
----
ZConf-0.4.0
http://search.cpan.org/~vvelox/ZConf-0.4.0/
A configuration system allowing for either file or LDAP backed storage. 


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Fri, 19 Sep 2008 21:42:03 -0700 (PDT)
From: "C.DeRykus" <ced@blv-sam-01.ca.boeing.com>
Subject: Re: Problem using Data::Translate to convert hex to decimal
Message-Id: <94a7fa63-4210-47eb-a2a1-9a23643cb0e4@o40g2000prn.googlegroups.com>

On Sep 18, 11:51 am, ajcrm125 <ajcrm...@gmail.com> wrote:
> On Sep 18, 10:58 am, Ben Morrow <b...@morrow.me.uk> wrote:
>
>
>
> > Quoth ajcrm125 <ajcrm...@gmail.com>:
>
> > > For whatever reason... the h2d is not converting some characters.
> > > Here's an example trying to convert "09":
>
> > > ===================================================
> > > use lib "/afs/btv.ibm.com/u/adamc/usr/lib/perl5/site_perl/5.8.5";
> > > use Data::Translate;
> > > $data = new Translate;
>
> > > ($status,$result) = $data->h2d("09");
> > > print "result is $result\n";
> > > ===================================================
>
> > Why not just use hex()?
>
> > >     ~% perl -le'print hex("0f")'
> >     15
>
> > Ben
>
> > --
> > Heracles: Vulture! Here's a titbit for you / A few dried molecules of the gall
> >    From the liver of a friend of yours. / Excuse the arrow but I have no spoon.
> > (Ted Hughes,        [ Heracles shoots Vulture with arrow. Vulture bursts into ]
> >  'Alcestis')        [ flame, and falls out of sight. ]         b...@morrow.me.uk
>
> Never even knew that function existed. Thanks!
>
> Wonder why the h2d function is not working as expected though.
> ??

Hm, here's Data::Translate::h2d:

sub h2d {
  shift;
  local (@hex)=@_;my $i;
  for ($i=0;$i<=$#hex;$i++) {
    $hex[$i]=ord(unpack("A",
        pack("H*", $hex[$i])));
  }
  return 1,@hex;
}

Perhaps the author was intending something like:

  $hex[$i] = ord(
    pack("H*", sprintf "%02s", $hex[$i]));


But, as mentioned, that could be replaced with just:


  $hex[$i] = hex $hex[$i];

which is much better.
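
As a runnable sketch of that suggestion: the sub name hex_list_to_dec is
hypothetical and is not part of Data::Translate. The last two statements
replay the module's pack/unpack route for "09"; the comment about why it
misbehaves is an assumption based on the documented behaviour of unpack's
"A" format, not something confirmed by the module's author.

```perl
use strict;
use warnings;

# Hypothetical stand-in for h2d: convert each hex string with the
# built-in hex() function.
sub hex_list_to_dec {
    return map { hex($_) } @_;
}

my @decimal = hex_list_to_dec('09', '0f', 'ff');
print "@decimal\n";

# The module's approach, for comparison.  pack("H*", "09") yields a tab
# character, and unpack's "A" format is documented to strip trailing
# whitespace, which may be why "09" did not convert as expected.
my $via_pack = ord(unpack('A', pack('H*', '09')));
print "via pack/unpack: $via_pack\n";
```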


--
Charles DeRykus


------------------------------

Date: Sat, 20 Sep 2008 00:12:44 GMT
From: sln@netherlands.com
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <gte8d41rld588j0j1a4knpg92lbqnrp71v@4ax.com>

On Mon, 25 Aug 2008 16:47:42 GMT, Jürgen Exner <jurgenex@hotmail.com> wrote:

>Shahid <mirzashahidmahmood@gmail.com> wrote:
>>I am parsing an .HTML file that contains following example code:
>[snip]
>>I am trying to parse all the <p>, <ol> and <ul> tags but couldn't
>>succeed yet.
>>I am trying following Regular Expression(RE):
>>"(<[pP][^>]*>(.*)</[pP]>)|(<[oO][lL][^>]+>(.*)</[oO][lL]>)|(<[uU][lL]
>>[^>]+>(.*)</[uU][lL]>)"
>
>HTML is not a regular language!

Not only not a real language, Html is not a regular expression.

> While the extensions to Perl's RE
>language might be powerful enough to cover HTML

That power is not needed, nor ever was

>, no sane person would
>even try to do so.

I guess I'm not sane then

>If you want to parse HTML then use a proper HTML
>parser. There are several on CPAN.
>
But do they use Perl independent regular expressions, or a system dependent
C library?

>>I am using preg_match_all(). 
>
>Undefined subroutine &main::preg_match_all called at [...]
>
>>Remember I am working in PHP.
>
>You must have walked into the wrong room. This here is about Perl. Of
>course, I suppose PHP's REs are not even as powerful as Perl's, so
>probably trying to parse HTML using PHP's REs is even worse than using
>Perl's REs.

I agree, well don't know actually what php's rx engine can do. I'm sure it
can, in fact, I'm positive it can parse not only html, but any markup that exists.
Because, fact is, it's very very simple. 

>
>>If any one can help me, I will be very grateful to him/her. I need its
>>solution urgent.
>
>Use a parser, dude.
>
>jue

Dude,

Parsing Markup is considered to be the easiest thing in the world.
It was the first design goal. Without the ability to peel off the first layer,
markup, it is not even possible to get to the sub-layers.

This is what the OP was asking. Not if it was too hard.

Releasing RxParse 2.0 in a day or two. Not just a parser anymore.

sln


------------------------------

Date: Sat, 20 Sep 2008 00:13:59 GMT
From: sln@netherlands.com
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <q0g8d49ec26m6ko6g9t8jch3ff42qohodr@4ax.com>

On Mon, 25 Aug 2008 10:22:02 -0400, Sherm Pendley <spamtrap@dot-app.org> wrote:

>Shahid <mirzashahidmahmood@gmail.com> writes:
>
>> I am using preg_match_all(). Remember I am working in PHP.
>
>Try comp.lang.php - we speak Perl here.
>
>sherm--

I believe it was a regular expression question though, wasn't it?

sln



------------------------------

Date: Sat, 20 Sep 2008 00:24:40 GMT
From: sln@netherlands.com
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <53g8d4lctq949t42ur03eaogvcsmugclf4@4ax.com>

On Mon, 25 Aug 2008 15:45:42 +0200, Peter Makholm <peter@makholm.net> wrote:

>Shahid <mirzashahidmahmood@gmail.com> writes:
>
>> I am trying to parse all the <p>, <ol> and <ul> tags but couldn't
>> succeed yet.
>> I am trying following Regular Expression(RE):
>> "(<[pP][^>]*>(.*)</[pP]>)|(<[oO][lL][^>]+>(.*)</[oO][lL]>)|(<[uU][lL]
>> [^>]+>(.*)</[uU][lL]>)"
>
>Regular expressions are in general not the right tool to handle XML and
>other XML-like data formats. You should use a module that parses the
>HTML correctly instead. HTML::TreeBuilder is one possibility.
>
This is hilarious, since the W3C uses regular expression notation to detail the formal
specifications for HTML and XML.

>> I am using preg_match_all(). Remember I am working in PHP.
>
>Then you shouldn't use a Perl group for your question. But even PHP
>should have better tools to parse HTML than regular
>expressions.

Wrong! There is no better markup syntax parser than regular expressions, none!

> Asking in a PHP forum should tell you which tools this
>is.
>
>//Makholm
>
>
Demigod!

Here is part of the RxParse 2.0 engine code, which is being released in a day or two. It parses anything.
Parsing is the easy part; adding tools is something else. I'm just adjusting parameters now.

%Dflth = (
   'hparsestart' => \&dflt_parsestart,
   'hparseend'   => \&dflt_parseend,
   'hstart' => \&dflt_start,
   'hend'   => \&dflt_end,
   'hchar'  => \&dflt_char,
   'hcdata' => \&dflt_cdata,
   'hcomment' => \&dflt_comment,
   'hattlist' => \&dflt_attlist,
   'hentity'  => \&dflt_entity,
   'hdoctype' => \&dflt_doctype,
   'helement' => \&dflt_element,
   'hxmldecl' => \&dflt_xmldecl,
   'hproc'    => \&dflt_proc,
   'herror'   => \&dflt_error,
   'hcopy'    => \&dflt_copy,
  );
  @UC_Nstart = (
    "\\x{C0}-\\x{D6}",
    "\\x{D8}-\\x{F6}",
    "\\x{F8}-\\x{2FF}",
    "\\x{370}-\\x{37D}",
    "\\x{37F}-\\x{1FFF}",
    "\\x{200C}-\\x{200D}",
    "\\x{2070}-\\x{218F}",
    "\\x{2C00}-\\x{2FEF}",
    "\\x{3001}-\\x{D7FF}",
    "\\x{F900}-\\x{FDCF}",
    "\\x{FDF0}-\\x{FFFD}",
    "\\x{10000}-\\x{EFFFF}",
  ); 
  @UC_Nchar = (
    "\\x{B7}",
    "\\x{0300}-\\x{036F}",
    "\\x{203F}-\\x{2040}",
  );
  $Nstrt = "[A-Za-z_:".join ('',@UC_Nstart)."]";
  $Nchar = "[-\\w:\\.".join ('',@UC_Nchar).join ('',@UC_Nstart)."]";
  $Name  = "(?:$Nstrt$Nchar*?)";
  #die "$Name\n";


  ## v2 parse regex:
  ## -------------------------------------------------
  $RxParseXP1 =
qr/(?:<(?:(?:(\/*)($Name)\s*(\/*))|(?:($Name+)(\s+(?:(?:(?:".*?")|(?:'.*?'))|(?:[^>]*?))+)\s*(\/?))|(?:\?(.*?)\?)|(?:!(?:(?:DOCTYPE(.*?))|(?:\[CDATA\[(.*?)\]\])|(?:--(.*?)--)|(?:ATTLIST(.*?))|(?:ENTITY(.*?))|(?:ELEMENT(.*?)))))>)|(.+?)/s;
  #                (  <(  (  1   12     2   3   3)|(  4      45   (  (  (       )|(       ))|(        )) 5   6   6)|(    7   7  )|(  !(  (         8   8)|(           9   9    )|(    0   0  )|(         1   1)|(        2   2)|(         3   3))))>)|4   4

  $RxAttr = qr/^\s+(?:(?:($Name)\s*=\s*("|'|))|($Name+))/;

  $RxAttr_DL1 = qr/^(?:([^'&<]*?)|([^'<]*?))'/;
  $RxAttr_DL2 = qr/^(?:([^"&<]*?)|([^"<]*?))"/;
  $RxAttr_DL3 = qr/^([^"'=<\s]+)/;
  $RxAttr_RM = qr/[^\s\n]+/;

  $RxPi = qr/^($Name)\s+(.*?)$/s;








------------------------------

Date: Fri, 19 Sep 2008 19:09:51 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <5al8d4pl5l3kl5rgb8ld6jl4lgr3f6okea@4ax.com>

sln@netherlands.com wrote:
>>HTML is not a regular language!
>
>Not only not a real language, Html is not a regular expression.

I have absolutely no idea what you mean by this. Not only is "real"
not well defined, but why would anybody even think about a language
("HyperText Markup _LANGUAGE_") being an expression?

Do you even know what a regular language is, what properties are
associated with being a regular language, and what properties are
associated with _NOT_ being a regular language?

>> While the extensions to Perl's RE
>>language might be powerful enough to cover HTML
>
>That power is not needed, nor ever was

Oh, that answers that question. Obviously you are unaware that only
regular languages can be parsed by (ordinary) regular expressions which
have the same expressiveness as regular grammars and finite automatons.

To parse context-free languages you need at least a
non-deterministic pushdown automaton, which in turn cannot be described
using regular expressions.

If you don't believe me then please re-read your books about Theory of
Programming Languages, chapter The Chomsky Hierarchy.

Now, Perl's REs are far more powerful than ordinary regular expressions,
so they might be powerful enough to parse context-sensitive languages.
But it's still a stupid thing to do. A simple parser is far easier to
write and to maintain than a gigantic mess of REs.
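
A small sketch of why nesting is the sticking point. The sample markup and
the helper name outer_ul are made up for illustration, and the counter below
is only a toy: it tracks one tag name and knows nothing about comments,
CDATA, or attribute values containing ">". It is not a substitute for a real
HTML parser from CPAN.

```perl
use strict;
use warnings;

my $html = '<ul><li>a<ul><li>b</li></ul></li></ul><p>tail</p>';

# A single non-greedy pattern stops at the first closing tag,
# so the outer list is cut short.
my ($naive) = $html =~ m{(<ul>.*?</ul>)}s;

# Hypothetical helper: scan the <ul> tags one by one and keep a depth
# counter, which is the "memory" a plain regular expression lacks.
sub outer_ul {
    my ($text) = @_;
    my ($depth, $start) = (0, undef);
    while ($text =~ m{<(/?)ul\b[^>]*>}gi) {
        if ($1 eq '') {                             # opening tag
            $start = $-[0] if $depth++ == 0;
        }
        elsif (defined $start && --$depth == 0) {   # matching close
            return substr($text, $start, $+[0] - $start);
        }
    }
    return;    # no balanced <ul> ... </ul> found
}

print "naive:   $naive\n";
print "counted: ", outer_ul($html), "\n";
```

The naive match ends at the inner closing tag, while the counting version
returns the whole outer list. The regex is still doing the tokenizing here;
it is the explicit depth counter around it that handles the nesting.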

>>, no sane person would
>>even try to do so.
>
>I guess I'm not sane then

That's your call, not mine.

>>If you want to parse HTML then use a proper HTML
>>parser. There are several on CPAN.
>>
>But do they use Perl independent regular expressions, or a system dependent
>C library?

What does it matter? They parse HTML and thus solve the task at hand.
Correctly!

>I agree, well don't know actually what php's rx engine can do. I'm sure it
>can, in fact, I'm positive it can parse not only html, but any markup that exists.
>Because, fact is, it's very very simple. 

Oh, then by all means, please publish your findings. Contradicting
Chomsky is worth at least a Ph.D.

jue


------------------------------

Date: Fri, 19 Sep 2008 21:36:35 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <slrngd8odj.s3r.tadmc@tadmc30.sbcglobal.net>

sln@netherlands.com <sln@netherlands.com> wrote:
> On Mon, 25 Aug 2008 16:47:42 GMT, Jürgen Exner <jurgenex@hotmail.com> wrote:
>
>>Shahid <mirzashahidmahmood@gmail.com> wrote:
>>>I am parsing an .HTML file


>>HTML is not a regular language!
>
> Not only not a real language, 


He did not say real language.

He said regular language.

    http://en.wikipedia.org/wiki/Regular_language


>> While the extensions to Perl's RE
>>language might be powerful enough to cover HTML
>
> That power is not needed,


But it is. It has been mathematically proven to be needed.

If a parser can accept a context-free language (eg. HTML) then
the parser is not "regular".


>>, no sane person would
>>even try to do so.
>
> I guess I'm not sane then


You just don't know enough about language theory to be taken seriously.

Should you choose to remedy that, you could start with:

    http://en.wikipedia.org/wiki/Chomsky_hierarchy#The_hierarchy

If you should really want to sling the lingo believably, then
continue with the "Dragon book".


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sat, 20 Sep 2008 04:11:08 GMT
From: sln@netherlands.com
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <8rt8d4ll5ajbpa8o4fjqvnnavp32e0pr2l@4ax.com>

On Fri, 19 Sep 2008 19:09:51 -0700, Jürgen Exner <jurgenex@hotmail.com> wrote:

>sln@netherlands.com wrote:
>>>HTML is not a regular language!
>>
>>Not only not a real language, Html is not a regular expression.
>
>I have absolutely no idea what you mean by this. Not only is "real"
>not well defined, but why would anybody even think about a language
>("HyperText Markup _LANGUAGE_") being an expression?
>
>Do you even know what a regular language is, what properties are
>associated with being a regular language, and what properties are
>associated with _NOT_ being a regular language?
>
>>> While the extensions to Perl's RE
>>>language might be powerful enough to cover HTML
>>
>>That power is not needed, nor ever was
>
>Oh, that answers that question. Obviously you are unaware that only
>regular languages can be parsed by (ordinary) regular expressions which
>have the same expressiveness as regular grammars and finite automatons.
>
>To parse context-free languages you need at least a
>non-deterministic pushdown automaton, which in turn cannot be described
>using regular expressions.
>
>If you don't believe me then please re-read your books about Theory of
>Programming Languages, chapter The Chomsky Hierarchy.
>
>Now, Perl's REs are far more powerful than ordinary regular expressions,
>so they might be powerful enough to parse context-sensitive languages.
>But it's still a stupid thing to do. A simple parser is far easier to
>write and to maintain than a gigantic mess of REs.
>
>>>, no sane person would
>>>even try to do so.
>>
>>I guess I'm not sane then
>
>That's your call, not mine.
>
>>>If you want to parse HTML then use a proper HTML
>>>parser. There are several on CPAN.
>>>
>>But do they use Perl independent regular expressions, or a system dependent
>>C library?
>
>What does it matter? They parse HTML and thus solve the task at hand.
>Correctly!
>
>>I agree, well don't know actually what php's rx engine can do. I'm sure it
>>can, in fact, I'm positive it can parse not only html, but any markup that exists.
>>Because, fact is, it's very very simple. 
>
>Oh, then by all means, please publish your findings. Contradicting
>Chomsky is worth at least a Ph.D.
>
>jue

Yeah, right, whatever you say, jue.
See the Tad McClellan reply for some really waste-my-time stuff I don't want
to repost for you!

sln



------------------------------

Date: Fri, 19 Sep 2008 22:31:51 -0700
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <pk29d4pc4o7l6dd19u4m77155cvfao7fk7@4ax.com>

sln@netherlands.com wrote:
>See the Tad McClellan reply for some really waste-my-time stuff I don't want
>to repost for you!

Indeed, Tad summed it up much more concisely than I did.

jue


------------------------------

Date: Sat, 20 Sep 2008 05:44:27 GMT
From: sln@netherlands.com
Subject: Re: Regular express for <p>, <ul> and <ol> tags
Message-Id: <8c39d41ddnej3u505kvqkhlb8io5bp3juf@4ax.com>

On Fri, 19 Sep 2008 22:31:51 -0700, Jürgen Exner <jurgenex@hotmail.com> wrote:

>sln@netherlands.com wrote:
>>See the Tad McClellan reply for some really waste-my-time stuff I don't want
>>to repost for you!
>
>Indeed, Tad summed it up much more concisely than I did.
>
>jue
Unfortunately, you're both in the same boat then.
No offence.

sln



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 1865
***************************************

