[9604] in Perl-Users-Digest


Perl-Users Digest, Issue: 3198 Volume: 8

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Jul 19 13:07:28 1998

Date: Sun, 19 Jul 98 10:00:28 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 19 Jul 1998     Volume: 8 Number: 3198

Today's topics:
    Re: /^[a-z0-9]/ <jdf@pobox.com>
        Alphabetical order [CGI-Unix] <431854@cienz.unizar.es>
    Re: Alphabetical order [CGI-Unix] (Larry Rosler)
    Re: Alphabetical order [CGI-Unix] <rootbeer@teleport.com>
        ansi color in Format <starrow@hotmail.com>
    Re: ansi color in Format <rra@stanford.edu>
        atomic <maen@g-ol.com>
    Re: Counting Number of Strings <rootbeer@teleport.com>
    Re: expressions <jdf@pobox.com>
    Re: File Sizes (Andrew M. Langmead)
    Re: grep for lists <rootbeer@teleport.com>
    Re: Help Help : Registry settings for IIS and perl <rmorin@kbcafe.com>
    Re: How do I clear an array? (Craig Berry)
        How would I associate the alphabet with numbers? <poohba@io.com>
        increment operator question <xuchu@iscs.nus.edu.sg>
    Re: increment operator question <uri@sysarch.com>
    Re: increment operator question <jdf@pobox.com>
    Re: newbie date format (Andrew M. Langmead)
    Re: NEWBIE Question on Text replacement (Craig Berry)
    Re: OOP : Object Oriented PROBLEMS! <*clinton@consol.co.uk>
        Opening documents in the web [CGI-Unix] <431854@cienz.unizar.es>
    Re: Opening documents in the web [CGI-Unix] <rootbeer@teleport.com>
    Re: Perl for kids <tchrist@mox.perl.com>
    Re: Regular expressions and HTML <dgris@rand.dimensional.com>
    Re: Scrpt that will dump unwamted users <rootbeer@teleport.com>
    Re: Scrpt that will dump unwamted users <joe@rhein.to>
        semafors <maen@g-ol.com>
    Re: semafors <rootbeer@teleport.com>
    Re: Sybperl (w/ CT-lib) mpeppler@mbay.net
        Special: Digest Administrivia (Last modified: 12 Mar 98 (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 19 Jul 1998 11:35:34 -0500
From: Jonathan Feinberg <jdf@pobox.com>
To: abigail@fnx.com
Subject: Re: /^[a-z0-9]/
Message-Id: <7m19er3d.fsf@mailhost.panix.com>

abigail@fnx.com (Abigail) writes:

>    /[^a-z0-9]/
> 
> But that disallows A-Z and accented letters as well. Perhaps you want /\W/.

Huh?

   print join "\n", grep /^[^a-z0-9]/, qw(apple Brown betty Delight);

Does that not print "Brown\nDelight" on your box?

-- 
Jonathan Feinberg   jdf@pobox.com   Sunny Brooklyn, NY
http://pobox.com/~jdf/


------------------------------

Date: Sun, 19 Jul 1998 12:32:59 +0200
From: miedo <431854@cienz.unizar.es>
Subject: Alphabetical order [CGI-Unix]
Message-Id: <35B1CB5A.4A1A9046@cienz.unizar.es>

How can I put a series of strings in a file in alphabetical order?
Thanks.





------------------------------

Date: Sun, 19 Jul 1998 09:15:04 -0700
From: lr@hpl.hp.com (Larry Rosler)
Subject: Re: Alphabetical order [CGI-Unix]
Message-Id: <MPG.101bb98293faee2098975d@nntp.hpl.hp.com>

[This followup was posted to comp.lang.perl.misc and a copy was sent to 
the cited author.]

In article <35B1CB5A.4A1A9046@cienz.unizar.es> on Sun, 19 Jul 1998 
12:32:59 +0200, miedo <431854@cienz.unizar.es> says...
> How can I put a series of strings in a file in alphabetical order?

This is a Frequently Asked Question.  The answer is in perlfaq4: "How do 
I sort an array by (anything)?"

-- 
Larry Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Sun, 19 Jul 1998 15:54:52 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Alphabetical order [CGI-Unix]
Message-Id: <Pine.GSO.3.96.980719085420.22120D-100000@user2.teleport.com>

On Sun, 19 Jul 1998, miedo wrote:

> How can I put a series of strings in a file in alphabetical order?

You probably want the sort function. It's documented in perlfunc. Hope
this helps! 
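
A minimal sketch of what that could look like, assuming the strings are
already in memory and should be written one per line. The names
sorted_strings and write_sorted, and the $path argument, are made up for
illustration; only sort, lc and cmp come from Perl itself.

```perl
use strict;
use warnings;

# Return the strings in case-insensitive alphabetical order. Ties fall
# back to a plain string comparison so the result is deterministic.
sub sorted_strings {
    my @strings = @_;
    my @sorted = sort { lc($a) cmp lc($b) or $a cmp $b } @strings;
    return @sorted;
}

# Write the sorted strings to a file, one per line.
# $path is a hypothetical file name chosen by the caller.
sub write_sorted {
    my ($path, @strings) = @_;
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} "$_\n" for sorted_strings(@strings);
    close($out) or die "Cannot close $path: $!";
    return;
}
```

Dropping the lc() calls gives a plain ASCII-order sort, in which all
uppercase letters come before lowercase ones.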

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



------------------------------

Date: 19 Jul 1998 14:11:54 GMT
From: Tony Starrow <starrow@hotmail.com>
Subject: ansi color in Format
Message-Id: <6osura$r847@id4.nus.edu.sg>


I'm trying to put ANSI colors in my report format, but it doesn't seem to
work at all. This is what I did:

$color = "\e[33;1m"; # print yellow color

 ...

write;

 ...

format FORMAT =
@<<<<<<< ...
$color, ...
 .


The output I got is:

[33;1m ....

It seems to strip the escape character.


Can anyone help me solve this problem?

-----

Tony


------------------------------

Date: 19 Jul 1998 07:24:11 -0700
From: Russ Allbery <rra@stanford.edu>
Subject: Re: ansi color in Format
Message-Id: <m3vhot2a2c.fsf@windlord.Stanford.EDU>

Tony Starrow <starrow@hotmail.com> writes:

> I'm trying to put ANSI colors in my report format, but it doesn't seem to
> work at all. This is what I did:

> $color = "\e[33;1m"; # print yellow color

[...]

> The output I got is:

> [33;1m ....

> It seems to strip the escape character.

Yup.  Known bug in format; I've reported it at least twice.  I'm not sure
if it's fixed in the current 5.005 beta or not; I haven't had a chance to
compile it and try.
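
One possible workaround, sketched under the assumption that the field can
be padded with sprintf instead of a format picture line: pad the text
first, then wrap the padded field in the escape sequences, so nothing has
a chance to strip them. The helper name colored_field and the sample
message are invented for this example.

```perl
use strict;
use warnings;

my $yellow = "\e[33;1m";   # ESC [ 33 ; 1 m : bold yellow
my $reset  = "\e[0m";      # back to the terminal's default attributes

# Left-justify $text in a field of $width columns, then add the colour
# codes around the padded field so they do not count towards the width.
sub colored_field {
    my ($text, $width, $color) = @_;
    my $padded = sprintf '%-*s', $width, $text;
    return $color . $padded . $reset;
}

my $report_line = colored_field('WARNING', 10, $yellow) . ' disk nearly full';
print "$report_line\n";
```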

-- 
#!/usr/bin/perl -- Russ Allbery, Just Another Perl Hacker
$^=q;@!>~|{>krw>yn{u<$$<[~||<Juukn{=,<S~|}<Jwx}qn{<Yn{u<Qjltn{ > 0gFzD gD,
 00Fz, 0,,( 0hF 0g)F/=, 0> "L$/GEIFewe{,$/ 0C$~> "@=,m,|,(e 0.), 01,pnn,y{
rw} >;,$0=q,$,,($_=$^)=~y,$/ C-~><@=\n\r,-~$:-u/ #y,d,s,(\$.),$1,gee,print


------------------------------

Date: Sun, 19 Jul 1998 17:44:32 +0300
From: Maen Suleiman <maen@g-ol.com>
Subject: atomic
Message-Id: <35B2064F.4452DC9@g-ol.com>

Hi again,
I would like to make the process of reading and then writing to a file
an atomic operation.
How can it be done in Perl?
Thanks in advance.

--
Sincerely Yours
 ________________________________________
/                                        \
*  Maen Suleiman                         *
*  NTS - New Technology Solutions LTD    *
*  Galilee On Line -ISP                  *
*  Tel:972-6-6456996, Fax:972-6-6550793  *
*  Email: maen@g-ol.com                  *
*  http://www.g-ol.com                   *
\________________________________________/




------------------------------

Date: Sun, 19 Jul 1998 14:21:18 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Counting Number of Strings
Message-Id: <Pine.GSO.3.96.980719071946.19380A-100000@user2.teleport.com>

On Thu, 16 Jul 1998, Dave wrote:

> I'm parsing the options selected from a dropdown list (multiple options)
> and inserting them into columns in a database.  How can I count the
> number of options selected? 

Are they simply stored in an array?

    $number = @array;

If that's not it, you may want to use grep in a scalar context. Hope this
helps!
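
Both suggestions in a short sketch. The contents of @selected are invented
sample data standing in for the parsed form options.

```perl
use strict;
use warnings;

# Hypothetical values parsed from the form; empty strings stand for
# options that were not selected.
my @selected = ('red', '', 'blue', 'green', '');

# An array in scalar context yields its number of elements.
my $total = @selected;

# grep in scalar context yields the number of elements that matched.
my $chosen = grep { $_ ne '' } @selected;

print "$chosen of $total options selected\n";
```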

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



------------------------------

Date: 19 Jul 1998 11:26:48 -0500
From: Jonathan Feinberg <jdf@pobox.com>
To: George Cushing <gcushing@exchange.nih.gov>
Subject: Re: expressions
Message-Id: <af65erhz.fsf@mailhost.panix.com>

George Cushing <gcushing@exchange.nih.gov> writes:

> ($annr, $rfc931, $authuser, $timestamp, $request, $status, $bytes) =
>     /^(\S+) (\S+) (\S+) \[(.+)\] \"(.+)\" (\S+) (\S+)\s/;

I'd suggest using \s+ rather than a single literal space character.

   /^(\S+)\s+(\S+)\s+   etc.

> \"(.+)\" should have scaned for the 2th quotes but it went to the
> last quote of the line.

".+" means match as many characters as possible, which will take it to
the end of the string, and then it will *backtrack* until it's able to
match the rest of the expression. You'll have better luck with

  \"([^"]+)\"

See perlre and Friedl's _Mastering Regular Expressions_.
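
To see the difference, here is a small sketch. The sample text and the log
line are invented, and the log pattern is only one way of applying the two
suggestions above (\s+ between fields, negated character classes inside
the delimiters).

```perl
use strict;
use warnings;

my $text = 'say "hello" and "goodbye" now';

# Greedy .+ runs to the last quote that still lets the match succeed.
my ($greedy) = $text =~ /"(.+)"/;

# [^"]+ cannot cross a quote, so it stops at the second one.
my ($first) = $text =~ /"([^"]+)"/;

# The same idea applied to a made-up log line.
my $line = 'gw.example.com - alice [19/Jul/1998:10:00:28 -0700] '
         . '"GET /index.html HTTP/1.0" 200 2326 '
         . '"http://example.com/start" "Mozilla/4.0"';

my ($host, $rfc931, $authuser, $timestamp, $request, $status, $bytes) =
    $line =~ /^(\S+)\s+(\S+)\s+(\S+)\s+\[([^\]]+)\]\s+"([^"]+)"\s+(\S+)\s+(\S+)/;

print "greedy:  $greedy\n";
print "first:   $first\n";
print "request: $request ($status, $bytes bytes)\n";
```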

-- 
Jonathan Feinberg   jdf@pobox.com   Sunny Brooklyn, NY
http://pobox.com/~jdf/


------------------------------

Date: Sun, 19 Jul 1998 15:31:57 GMT
From: aml@world.std.com (Andrew M. Langmead)
Subject: Re: File Sizes
Message-Id: <EwCLt9.BGz@world.std.com>

"Martin" <minich@globalnet.co.uk> writes:

>Does anyone know how to get the size of a file given its file
>handle without having to read all of the file into an array and
>get its length?

The standard C functions fseek() and ftell() (on which the perl
functions seek() and tell() are based) are not guaranteed to express
the file position as a number of bytes from the start of the file. But
on most systems, that is precisely what it is.

The avoidance of defining what the values mean is probably due to the
way that certain C library implementations remove certain characters
from the input stream when reading, on systems whose underlying
concept of a text file uses a multicharacter sequence as the end of
line marker. Finding out that ftell() returns a value of "n" at EOF does
not mean that you can read "n" characters from the beginning. On these
types of systems ftell() will usually include the "stripped"
characters and so will still refer to the file size.

If you want to make use of this almost universal implementation quirk,
you can use tell() to find out the current location of the file
pointer, seek() to the end, tell() again to find out how far the end
of the file is from the beginning, and then seek() back to the
original location within the file.

If your system's C library documentation defines fseek() and ftell()
in terms of bytes, and not of some opaque value, it is guaranteed to
work. If not, the best I can say is that I have never seen a system
where it doesn't.
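
The tell/seek/tell/seek sequence described above might be sketched like
this. The function name size_via_seek is made up, and the code assumes a
seekable filehandle (a plain file, not a pipe or terminal).

```perl
use strict;
use warnings;
use Fcntl qw(:seek);    # SEEK_SET, SEEK_END

# Return the size of the file behind $fh and leave the file position
# where it was before the call.
sub size_via_seek {
    my ($fh) = @_;
    my $here = tell($fh);                            # remember position
    seek($fh, 0, SEEK_END)   or die "seek to end failed: $!";
    my $size = tell($fh);                            # offset of the end
    seek($fh, $here, SEEK_SET) or die "seek back failed: $!";
    return $size;
}
```

Perl's -s file test also accepts a filehandle, so "-s $fh" is a shorter
way to ask for the size in bytes when that is all you need.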

-- 
Andrew Langmead


------------------------------

Date: Sun, 19 Jul 1998 14:27:31 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: grep for lists
Message-Id: <Pine.GSO.3.96.980719072522.19380D-100000@user2.teleport.com>

On Sun, 19 Jul 1998, David Turner wrote:

> What I want to do is take two lists A and B and produce a third list C
> which contains all the elements in A that are NOT in B. 

I think the FAQ has something about this in section four.

> open (FILEA, "<./fileA");
> open (FILEB, "<./fileB");

Even when your script is "just an example" (and perhaps especially in that
case!) you should _always_ check the return value after opening a file.
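
A minimal sketch of both points, assuming each file holds one list element
per line. The function names read_lines and difference are invented here;
the hash lookup is one common way to do it, not necessarily the FAQ's
wording.

```perl
use strict;
use warnings;

# Read a file into a list of lines, checking that the open succeeded.
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    chomp(my @lines = <$fh>);
    close($fh);
    return @lines;
}

# Return the elements of @$list_a that do not appear in @$list_b,
# keeping the original order of @$list_a.
sub difference {
    my ($list_a, $list_b) = @_;
    my %in_b;
    @in_b{@$list_b} = ();                 # keys only, values unused
    my @only_a = grep { !exists $in_b{$_} } @$list_a;
    return @only_a;
}
```

With the two files from the question, the call could look like
difference([read_lines('./fileA')], [read_lines('./fileB')]).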

Hope this helps!

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



------------------------------

Date: Sun, 19 Jul 1998 11:50:35 -0400
From: Randy Charles Morin <rmorin@kbcafe.com>
To: Kris Van Gompel <krisvg@glo.be>
Subject: Re: Help Help : Registry settings for IIS and perl
Message-Id: <35B215CB.E778A2F2@kbcafe.com>

Try...
http://tips.kbcafe.com/tips/kb.cgi?tips=090
--
Randy Charles Morin - mailto:rmorin@kbcafe.com
The Programmer's Knowledge Base - http://tips.kbcafe.com/tips/


Kris Van Gompel wrote:

> hi,
>
> I need to install perl for windows NT and everything went fine, but when
> I try to execute a .bat file or an .perl file, I got the error :
>
> HTTP/1.0 501 Not Supported
>
> I know this is something I have to change in the registry file, but I
> don't know how , especially the data I have to give.
>
> Please help, it is urgent ...
>
> Thanks in advance







------------------------------

Date: 19 Jul 1998 16:33:18 GMT
From: cberry@cinenet.net (Craig Berry)
Subject: Re: How do I clear an array?
Message-Id: <6ot74f$85r$1@marina.cinenet.net>

Ronald J Kimball (rjk@coos.dartmouth.edu) wrote:
: Craig Berry <cberry@cinenet.net> wrote:
: 
: > Not true (to my way of thinking).  'undef @array' says "I'm entirely done
: > with @array; nuke it," while '@array = ()' says "@array is still around,
: > it just doesn't contain anything right now."  That's a big (semantic)
: > distinction.  And if I mean "it's empty now" but use undef to get there, I
: > get "use of undefined value" warnings (quite properly).
: 
: I wasn't able to reproduce this behavior.  Could you post a bit of code
: that shows how @array = () gives different warnings than undef @array?

Yeah, looks like you're right -- I was extrapolating (falsely, it turns
out) from the warning given for use of an undef scalar.  O Perl Gurus, how
come 

  print $foo;

for undefined $foo gives a 'use of uninitialized value' warning, but

  print @foo;

for undefined @foo doesn't?  

---------------------------------------------------------------------
   |   Craig Berry - cberry@cinenet.net
 --*--    Home Page: http://www.cinenet.net/users/cberry/home.html
   |      Member of The HTML Writers Guild: http://www.hwg.org/   
       "Every man and every woman is a star."


------------------------------

Date: Sun, 19 Jul 1998 11:47:20 -0500
From: Chocolate <poohba@io.com>
Subject: How would I associate the alphabet with numbers?
Message-Id: <Pine.BSF.3.96.980719113942.19213A-100000@dillinger.io.com>

I would like to make a=100000 and b=200000 and c=300000...  How would I do
this?
what I want is

if a=100000 then a00001 = 100001;

so if I put in a number of a54856 it should say


  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

			_/_/_/		     _/     _/
		       _/  _/	            _/     _/		
  Web Page Designs    _/_/_/ _/_/_/ _/_/_/ _/_/_/ _/_/_/ _/_/_/ 
    Small Programs   _/     _/   / _/  _/ _/  _/ _/  _/ _/  _/  poohba@io.com
www.io.com/~poohba  _/	   _/_/_/ _/_/_/ _/  _/ _/_/_/ _/_/\_ 	(919)506-5883

  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/



------------------------------

Date: 19 Jul 1998 15:50:49 GMT
From: wings <xuchu@iscs.nus.edu.sg>
Subject: increment operator question
Message-Id: <6ot4kp$rcr6@id4.nus.edu.sg>

Hi, folks:

When I am reading PP(P79), it gives an example
on autoincrement operator ++:

	print ++($foo='zz'); #prints 'aaa'
	
I tried it and it's true, but I can't understand
why two 'z's will become three 'a's. Where's the
3rd 'a' from? Is it like mathematics, which has a
'borrow' effect?

any help will be appreciated.

-- 
wings
------
World is a book, those dont travel read only one page.

Email: xwings@usa.net, xuchu@iscs.nus.edu.sg
ICQ UIN: 1440319
http://gump.iscs.nus.edu.sg


------------------------------

Date: 19 Jul 1998 12:20:39 -0400
From: Uri Guttman <uri@sysarch.com>
To: wings <xuchu@iscs.nus.edu.sg>
Subject: Re: increment operator question
Message-Id: <x7n2a5n76w.fsf@sysarch.com>

>>>>> "w" == wings  <xuchu@iscs.nus.edu.sg> writes:

  w> Hi, folks: When I am reading PP(P79), it gives an example on
  w> autoincrement operator ++:

  w> 	print ++($foo='zz'); #prints 'aaa'
	
  w> I tried it and it's true. but i cant understand why two 'z's will
  w> become three 'a's. Where's the 3rd 'a' from? it's like mathematics
  w> that has 'borrow' effect?

s/borrow/carry/;

yes.

just rtfm again and again until you grok it :-)

it is in the paragraph just before the example you quote.

uri

-- 
Uri Guttman  -----------------  SYStems ARCHitecture and Software Engineering
Perl Hacker for Hire  ----------------------  Perl, Internet, UNIX Consulting
uri@sysarch.com  ------------------------------------  http://www.sysarch.com
The Best Search Engine on the Net -------------  http://www.northernlight.com


------------------------------

Date: 19 Jul 1998 12:47:07 -0500
From: Jonathan Feinberg <jdf@pobox.com>
To: wings <xuchu@iscs.nus.edu.sg>
Subject: Re: increment operator question
Message-Id: <d8b14tt0.fsf@mailhost.panix.com>

wings <xuchu@iscs.nus.edu.sg> writes:

> When I am reading PP(P79), it gives an example
> on autoincrement operator ++:
> 
> 	print ++($foo='zz'); #prints 'aaa'
> 	
> I tried it and it's true. but i cant understand
> why two 'z's will become three 'a's. Where's the
> 3rd 'a' from? it's like mathematics that has
> 'borrow' effect?

Did you miss the sentence right above that example (which, by the way,
is also in the excellent free documentation that accompanies perl):

   If, however, the variable has been used in only string contexts since
   it was set, and has a value that is not null and matches the pattern
   /^[a-zA-Z]*[0-9]*$/, the increment is done as a string, preserving
   each character within its range, with carry:

The phrase "with carry" addresses your question, I believe.  For a
detailed look at what this means, you should consult the sources.
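
A few more starting values make the carry visible. This is just an
illustrative loop; the variable names are arbitrary.

```perl
use strict;
use warnings;

my %after;
for my $start (qw(aa Az zz a9 Zz)) {
    my $value = $start;    # a fresh copy that has only held a string
    $value++;              # string increment, with carry
    $after{$start} = $value;
}

# 'zz' has nowhere left to carry into, so a new leading 'a' appears,
# just as 99 + 1 needs a third digit.
print "$_ -> $after{$_}\n" for sort keys %after;
```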

-- 
Jonathan Feinberg   jdf@pobox.com   Sunny Brooklyn, NY
http://pobox.com/~jdf/


------------------------------

Date: Sun, 19 Jul 1998 15:05:27 GMT
From: aml@world.std.com (Andrew M. Langmead)
Subject: Re: newbie date format
Message-Id: <EwCKL3.CH1@world.std.com>

hex@voicenet.com (Matt Knecht) writes:
>For larger cases, yes.  But, this confuses me.  Surely sprintf is a
>reasonably complex procedure.  I don't see (Other than using benchmarks)
>how it could *possibly* be faster than a few conditionals.  It just
>doesn't make sense to me.

>Then again, Perl is not C.

Think of it this way, each perl opcode (which will correspond to an
operator or built in function) is written in C and compiled to machine
instructions, so it should be reasonably fast at doing its job. When
you duplicate its functionality in perl, it is spending more time in
the interpreter, which is slow.  The larger the chunk of work which
you can pass off to the machine code, and the less work you give the
interpreter to do, the faster the code will be.
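
If you would rather measure than guess, the standard Benchmark module can
compare the two approaches. The sketch below assumes the task is
zero-padding a number to two digits, which may not be exactly what the
original code did; the function names are invented, and the timings will
differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# One built-in call does all the work.
sub pad_with_sprintf {
    my ($n) = @_;
    return sprintf '%02d', $n;
}

# The same result spelled out with a conditional in Perl code.
sub pad_with_conditional {
    my ($n) = @_;
    return $n < 10 ? "0$n" : "$n";
}

# Run each variant for at least one CPU second and print a comparison.
sub compare_speed {
    cmpthese(-1, {
        'sprintf'     => sub { pad_with_sprintf($_)     for 0 .. 59 },
        'conditional' => sub { pad_with_conditional($_) for 0 .. 59 },
    });
    return;
}
```

Calling compare_speed() prints a small table of rates for the two
variants.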

-- 
Andrew Langmead


------------------------------

Date: 19 Jul 1998 16:50:20 GMT
From: cberry@cinenet.net (Craig Berry)
Subject: Re: NEWBIE Question on Text replacement
Message-Id: <6ot84c$85r$2@marina.cinenet.net>

Randy Owens (randyo@mindspring.com) wrote:
: I am trying to figure out how to search an ascii output file, locate
: all occurences of CNTL L (^L), which is being used as a form feed, and
: replace them with a Carriage return (^M) and a line feed (^J).

Do yourself a favor and get _Learning Perl_, or at least read perlre.
This kind of stuff is bread-and-butter Perl; you won't get anywhere until
doing substitutions like this requires no thought at all.

Just to get you started, though, you want s/\cL/\cM\cJ/g .  Applying it to
the entire file is left as an exercise; there are many ways to do it.
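
Here is the substitution applied to a string, as a small sketch with
made-up sample text.

```perl
use strict;
use warnings;

# Three "pages" separated by form feeds (Control-L).
my $page = "page one\cLpage two\cLpage three";

# Copy the string, then replace every form feed with CR LF in the copy.
(my $fixed = $page) =~ s/\cL/\cM\cJ/g;

print $fixed, "\n";
```

One of the many ways to run it over a whole file is from the command line:
perl -pi.bak -e 's/\cL/\cM\cJ/g' report.txt
where report.txt is a placeholder name and the original is kept as
report.txt.bak.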

: I may need to advance about 50 bytes down the line before doing the
: replacement.  

Sorry, that's not clear enough for me to offer advice.

---------------------------------------------------------------------
   |   Craig Berry - cberry@cinenet.net
 --*--    Home Page: http://www.cinenet.net/users/cberry/home.html
   |      Member of The HTML Writers Guild: http://www.hwg.org/   
       "Every man and every woman is a star."


------------------------------

Date: Sun, 19 Jul 1998 16:18:17 +0100
From: "Clinton Gormley" <*clinton@consol.co.uk>
Subject: Re: OOP : Object Oriented PROBLEMS!
Message-Id: <6ot32i$l0u$1@taliesin.netcom.net.uk>

Thanks Zenin

That's a Win 32 interface to the Berkeley DB, but do you know if a binary for
the database itself exists (and where I can get hold of it)?

I've just discovered SDBM's 1024 byte limit, which is a bit of a bugger, so
I'm going to have to make the change anyway.  Bit difficult with Visual C
though!

Thanks

Clint





------------------------------

Date: Sun, 19 Jul 1998 12:56:29 +0200
From: miedo <431854@cienz.unizar.es>
Subject: Opening documents in the web [CGI-Unix]
Message-Id: <35B1D0DC.3418763A@cienz.unizar.es>

I have seen that with search engines like AltaVista or Lycos you enter
your domain, and they open your page and extract the title, comment,
keywords... How can I open a file that is on another server?

Thanks




------------------------------

Date: Sun, 19 Jul 1998 15:55:53 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Opening documents in the web [CGI-Unix]
Message-Id: <Pine.GSO.3.96.980719085502.22120E-100000@user2.teleport.com>

On Sun, 19 Jul 1998, miedo wrote:

> How can I open a file that is on another server?

Get LWP from CPAN. Hope this helps!
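
A rough sketch of how that might go, assuming LWP is installed and that a
simple pattern match is good enough to pull out the title. The function
names fetch_page and extract_title are invented; LWP::Simple's get() is
the only LWP call used. For anything beyond a quick look, a real HTML
parser from CPAN is the safer choice.

```perl
use strict;
use warnings;

# Fetch a page over HTTP. LWP::Simple is loaded only when this function
# is called, so the rest of the file works without LWP installed.
sub fetch_page {
    my ($url) = @_;
    require LWP::Simple;
    my $html = LWP::Simple::get($url);    # undef on failure
    die "Could not fetch $url\n" unless defined $html;
    return $html;
}

# Return the text of the first <title> element, or undef if there is
# none. Whitespace is collapsed and trimmed.
sub extract_title {
    my ($html) = @_;
    return unless $html =~ m{<title[^>]*>(.*?)</title>}is;
    my $title = $1;
    $title =~ s/\s+/ /g;
    $title =~ s/^ | $//g;
    return $title;
}
```

Typical use would be extract_title(fetch_page($url)) for whatever URL the
visitor typed in.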

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



------------------------------

Date: 19 Jul 1998 16:38:18 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: Perl for kids
Message-Id: <6ot7dq$8om$1@csnews.cs.colorado.edu>

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc, jhayward@students.uiuc.edu (jonathan seth hayward)
writes some very interesting questions.

If you want something rather simplistic, ``Learning Perl'' might work.
But if you're more into `real programmer stuff', you might look at ``Perl:
The Programmer's Companion'' by Nigel Chapman.  Other books can be found
listed at:

    http://www.perl.com/perl/critiques/

As for good preparation, I'll grant you the mathematics (puzzle books
are good), but I'm not sure what learning physics avails the nascent
programmer.  Whatever your answer, I'll bet that you could make the
same argument for learning a strongly inflected non-English language
or three, and of course good old formal music composition and harmony
are also useful.  Or devise semi-complex games involving mathematical
formulae in their underpinnings.  Or sketch out the architecture and
connectivity of large, hypothetical buildings--or traffic systems.

You're right that instilling the algorithmic, problem-solving mentality
early is key here, and that low-level B&D languages just get in the way of
that.  There'll be pain enough in the future.  Start with something fun.

--tom
-- 
MAGIC*  xmg_magic;  /* linked list of magicalness */
    --Larry Wall, from sv.h in the v5.0 perl distribution


------------------------------

Date: Sun, 19 Jul 1998 16:47:45 GMT
From: Daniel Grisinger <dgris@rand.dimensional.com>
Subject: Re: Regular expressions and HTML
Message-Id: <6ot7ac$huq$1@rand.dimensional.com>

[posted to comp.lang.perl.misc and mailed to the cited author]

In article <6orfcc$pno@news.service.uci.edu>
ehood@medusa.acs.uci.edu (Earl Hood) wrote:

>Since matching start tags is a subset of matching a complete element,
>they are not independent.  Hence, if you cannot match all legal start
>tags with a single regex, then you cannot match a single element.

Thanks for your reply.  I had already determined that the recursion
problem renders matching complete elements impossible (even if
one can ignore the possibility of optional end tags).  What I
am really trying to figure out is if some of the new regular
expression features in 5.005 allow the recursion limit to be
avoided in some contexts.

>Yes.  An element can contain embedded elements, making things much
>more complex.  Example:
>
>    <p align="center">This is a para with an image
>    <img src='foo.gif' alt="Stupid </p> ==>'alt'<== text.">
>    <em>Some emphasized text</em>
>    <!-- And a comment with <p>Tags that should be treated
>    aspart of the comment </p> -->
>    </p>
>
>In simple terms, because single and double quotes can alternate as
>delimiters for attribute values, and attribute values can contain
>less-than and greater-than characters and quote characters, a single
>regex will not work to match all legal start tags.  Elements are
>worse since they include start tags and other constructs.

It is worth noting that in the above example each individual
tag can be successfully extracted using a single regular 
expression[1].

>Note, "matching" an element requires more than just checking for
>the right delimiters.  Since end tags are optional, it requires
>knowledge of the possible content models allowed to accurately
>determine the ending (and possibly the beginning) of some elements.

Yes, I was completely ignoring semantic issues, trying instead
for a simple way to remove all HTML tags from a set of documents.

>If you want the gory details, check the various standards.

I have, and have come to the conclusion that even with Ilya's 
enhancements to perl's regular expression semantics the
above is probably still impossible (I have a Loony Idea[tm] that
it may be possible by playing with tied variables and 
C<use re qw/eval/;>, but haven't yet managed to put something
together that works).

The problem comes with HTML that is valid according to the
current 4.0 DTD, even though no browser that I have access
to renders it properly.  Snippets such as-

   <p <em> <strong>> This is some text </strong </em </p>>>

Lynx, Navigator, and Arena all fail to properly render this,
but according to the 4.0 document type definition this is
valid HTML.  

Thanks for your help.

Daniel


[1]- This expression will correctly extract the html elements
in the above example (assuming that the entire string (in this
case, Earl's article) has been placed in $c)-

$comment = q/(?:<!\s*(?:--(?:(?:[^-])|(?:-(?!-)))*--\s*)+>)/;
$html    = q/(?:<(?:(?:[^><"'])+|(?:"[^"]+?")|(?:'[^']+?')|(?:$comment))+?>)/;
$html    =~ s/(\$comment)/$1/ee;

@a = $c =~ m/((?:(?:(?=<!)$comment)|(?:(?=<)$html)))/gs;

Note that all parens are made non-capturing in the sub-expressions, this
is to allow proper assignment of each matched tag to its appropriate
place in the array.  When fed Earl's original article as input the
following strings are assigned to each element of @a-

0- <ehood@hydra.acs.uci.edu>

1- <6or7j3$gkv$1@rand.dimensional.com>

2- <dgris@rand.dimensional.com>

3- <img src= "rt-arrow.gif" alt= "==>" >

4- <p>

5- </p>

6- <p align="center">

7- <img src='foo.gif' alt="Stupid </p> ==>'alt'<== text.">

8- <em>

9- </em>

10- <!-- And a comment with <p>Tags that should be treated
      aspart of the comment </p> -->

11- </p>

Now if (?{}) interpolated, I think that this would be
possible in the general case (although we are no longer dealing
with expressions that are strictly regular :-).
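
For readers who want something smaller to experiment with, here is a much
simplified, quote-aware tag pattern. It is not the expression above: it
ignores comments and the nested-tag cases entirely and is only a sketch.
The names $tag, $sample and @tags are arbitrary, and the sample string is
adapted from Earl's example.

```perl
use strict;
use warnings;

# A tag is '<', then any mix of ordinary characters, double-quoted
# strings and single-quoted strings, then '>'. Quoted strings may
# contain '<', '>' and the other kind of quote.
my $tag = qr/<(?:[^>"']|"[^"]*"|'[^']*')*>/;

my $sample = q{<p align="center">text <img src='foo.gif' alt="Stupid </p> ==>'alt'<== text."> <em>x</em></p>};

my @tags = $sample =~ /($tag)/g;

print "$_\n" for @tags;
```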

-- 
Daniel Grisinger           dgris@perrin.dimensional.com
"No kings, no presidents, just a rough consensus and
running code."
                           Dave Clark


------------------------------

Date: Sun, 19 Jul 1998 14:25:18 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Scrpt that will dump unwamted users
Message-Id: <Pine.GSO.3.96.980719072334.19380C-100000@user2.teleport.com>

On Sun, 19 Jul 1998, Jim wrote:

> Can I find a script that will filter out and bounce unwanted guest from
> my site either by visitor or entire ISP's?

If you're wishing merely to _find_ (as opposed to write) programs,
this newsgroup may not be the best resource for you. There are many
freeware and shareware archives which you can find by searching Yahoo
or a similar service. Hope this helps!

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



------------------------------

Date: Sun, 19 Jul 1998 16:32:08 +0200
From: "joe" <joe@rhein.to>
Subject: Re: Scrpt that will dump unwamted users
Message-Id: <6ot06g$2u$1@usenet41.supernews.com>

some of these archives are:
http://www.worldwidemart.com/scripts/
http://www.freecode.com






------------------------------

Date: Sun, 19 Jul 1998 17:13:37 +0300
From: Maen Suleiman <maen@g-ol.com>
Subject: semafors
Message-Id: <35B1FF11.EA4DABDB@g-ol.com>

Hi All ,
I am writing a CGI script in Perl that updates some tables, and it is
a kind of online operation. I want to prevent two users from writing to
the same table at the same time. Is there any way to implement
semaphores in Perl, like Wait and Signal in C?
Thanks in advance for everything :)


--
Sincerely Yours
 ________________________________________
/                                        \
*  Maen Suleiman                         *
*  NTS - New Technology Solutions LTD    *
*  Galilee On Line -ISP                  *
*  Tel:972-6-6456996, Fax:972-6-6550793  *
*  Email: maen@g-ol.com                  *
*  http://www.g-ol.com                   *
\________________________________________/




------------------------------

Date: Sun, 19 Jul 1998 15:57:13 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: semafors
Message-Id: <Pine.GSO.3.96.980719085603.22120F-100000@user2.teleport.com>

On Sun, 19 Jul 1998, Maen Suleiman wrote:

> I am writing a CGI script in Perl that updates some tables, and it is
> a kind of online operation. I want to prevent two users from writing to
> the same table at the same time. Is there any way to implement
> semaphores in Perl, like Wait and Signal in C?

You may need to use the methods in perlipc. But I think you could use the
methods in Randal's fourth Web Techniques column, which explains how to
use flock() to avoid problems when multiple processes need to modify one
file. Hope this helps!

   http://www.stonehenge.com/merlyn/WebTechniques/
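
A minimal sketch of the flock() approach, assuming every process that
touches the file goes through the same function. This is not the code from
the column; the function name bump_counter and the one-number-per-file
layout are invented for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);   # O_RDWR, O_CREAT, LOCK_EX, SEEK_SET

# Read a number from $path, add one and write it back, holding an
# exclusive lock for the whole read-modify-write cycle.
sub bump_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT)
        or die "Cannot open $path: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $path: $!";   # waits its turn

    my $line  = <$fh>;
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $count++;

    seek($fh, 0, SEEK_SET) or die "Cannot rewind $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} "$count\n";
    close($fh) or die "Cannot close $path: $!";   # also releases the lock
    return $count;
}
```

flock() locks are advisory, so this only protects against other programs
that also ask for the lock before touching the file.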

-- 
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



------------------------------

Date: Sun, 19 Jul 1998 15:31:33 GMT
From: mpeppler@mbay.net
Subject: Re: Sybperl (w/ CT-lib)
Message-Id: <6ot3gl$7c0$1@nnrp1.dejanews.com>

In article <6ockd0$ac@jusdnews.fir.fbc.com>,
  mmavani@fir.fbc.com (Manish M Mavani) wrote:
> Hi friends,
>
> We have been using Sybperl (Perl 4 + DB lib), as DB lib is fading out and
> not going to be supported in future - we would like to create a ~new~
> Sybperl ( Perl 4 + CT Lib).
>
> In brief, changing dbopen calls to ct_open. It turns out this is not
> very simple thing to do.
>
> Sybperl with CT lib is directly available - but it uses Perl 5. We would
> like to avoid migrating to Perl 5, as millions of lines of code is
> affected.

I would *strongly* advise you to migrate to perl 5.

In any case, migrating from DBlibrary to CTlibrary will entail a serious
re-write, as the underlying logic of CTlib is quite different from DBlib
(as you've no doubt realized).

Michael

-----== Posted via Deja News, The Leader in Internet Discussion ==-----
http://www.dejanews.com/rg_mkgrp.xp   Create Your Own Free Member Forum


------------------------------

Date: 12 Jul 98 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Special: Digest Administrivia (Last modified: 12 Mar 98)
Message-Id: <null>


Administrivia:

Special notice: in a few days, the new group comp.lang.perl.moderated
should be formed. I would rather not support two different groups, and I
know of no other plans to create a digested moderated group. This leaves
me with two options: 1) keep on with this group 2) change to the
moderated one.

If you have opinions on this, send them to
perl-users-request@ruby.oce.orst.edu. 


The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V8 Issue 3198
**************************************
