[19359] in Perl-Users-Digest

Perl-Users Digest, Issue: 1554 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Aug 17 18:10:37 2001

Date: Fri, 17 Aug 2001 15:10:16 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <998086216-v10-i1554@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Fri, 17 Aug 2001     Volume: 10 Number: 1554

Today's topics:
    Re: regexp question with no answer yet (Anno Siegel)
    Re: regexp question with no answer yet (Tad McClellan)
    Re: regexp question with no answer yet <lbrtchx@hotmail.com>
    Re: regexp question with no answer yet <lbrtchx@hotmail.com>
    Re: regexp question with no answer yet (Tad McClellan)
    Re: Searching Text File <johndporter@yahoo.com>
        Simple Win32 Perl/DOS Directory Listing / Ghost Check <godzilla@stomp.stomp.tokyo>
    Re: Sorting Hash of Arrays (Drew Myers)
    Re: Using manpages?? <miscellaneousemail@yahoo.com>
    Re: Using manpages?? (Tad McClellan)
        weird perl behaviour (Hernan)
    Re: weird perl behaviour <uri@sysarch.com>
    Re: Will Perl report on variables no longer used?? <tsee@gmx.net>
    Re: XML Encoding (John J. Trammell)
    Re: XML Encoding (Malcolm Dew-Jones)
    Re: XML Encoding <jurgenex@hotmail.com>
    Re: XML Encoding (Tad McClellan)
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 17 Aug 2001 18:22:52 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: regexp question with no answer yet
Message-Id: <9ljnds$kqd$1@mamenchi.zrz.TU-Berlin.DE>

According to Albretch <lbrtchx@hotmail.com>:
> > If you *have* a maximal length, the solution is trivial:  Generate
> > all strings of that or smaller length and apply the regex.  Print
> > if match.  Now, to do so efficiently...
> 
>  _ We both know this is the "monkey way" and an incredible and unrealistic
> waste. I wasn't asking for that.
> 
>  As a matter of fact, I found the reverse of what I am looking for
> 
> http://www.netch.se/~hakank/makeregex/
> 
>  So my need is not such an "odd" one.
 
That doesn't follow.

>  So, again given a regexp and the possible maximal length of the result
> strings, how could you expand all the possible "plain (no metacharacters)"
> Strings?

I'll assume a "real" regex (i.e. no backreferences, no lookaround and
such tomfoolery).

Parse the regex (Parse::RecDescent should do that easily), and re-write
it in terms of literal characters, | (alternation), (, ), and *. (Character
classes and {n,m} can be re-written that way.)

Next, assume you have the solution s1 (a set of strings) for a regex re1,
and a solution s2 for a regex re2.  Work out what the solutions for
re1|re2, (re1)(re2) and re1* are in terms of s1 and s2.  You'll learn
quite a bit about Cartesian products of string sets along the way.  Solve
the problem recursively.

Implementation left as an exercise :)

Anno
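
The recursion described above can be sketched as follows. This is only
an illustration, not a posted solution: it assumes the regex has already
been parsed into nested array refs, and the node names (lit, alt, cat,
star) and the subs expand() and product() are made up for the example.
The maximal length is what keeps the set for re1* finite.

```perl
use strict;
use warnings;

# Hypothetical AST node shapes (not from the thread):
#   [ lit  => 'a' ]          a literal character
#   [ alt  => $re1, $re2 ]   re1|re2
#   [ cat  => $re1, $re2 ]   (re1)(re2)
#   [ star => $re1 ]         re1*

# Return a hash ref whose keys are all strings of length <= $max
# that the regex AST $re matches in full.
sub expand {
    my ($re, $max) = @_;
    my ($op, @arg) = @$re;

    if ($op eq 'lit') {
        my %set;
        $set{ $arg[0] } = 1 if length($arg[0]) <= $max;
        return \%set;
    }
    if ($op eq 'alt') {                 # union of s1 and s2
        my %set = ( %{ expand($arg[0], $max) }, %{ expand($arg[1], $max) } );
        return \%set;
    }
    if ($op eq 'cat') {                 # length-limited Cartesian product
        return product(expand($arg[0], $max), expand($arg[1], $max), $max);
    }
    if ($op eq 'star') {                # '' plus repeated products with s1
        my $s1  = expand($arg[0], $max);
        my %all = ( '' => 1 );
        while (1) {
            my %next = ( %all, %{ product(\%all, $s1, $max) } );
            last if keys(%next) == keys(%all);   # nothing new: fixed point
            %all = %next;
        }
        return \%all;
    }
    die "unknown node type '$op'";
}

# All concatenations x.y with x from $s1 and y from $s2, no longer than $max.
sub product {
    my ($s1, $s2, $max) = @_;
    my %out;
    for my $x (keys %$s1) {
        for my $y (keys %$s2) {
            $out{"$x$y"} = 1 if length("$x$y") <= $max;
        }
    }
    return \%out;
}

# (a|b)c* with maximal length 3
my $re = [ cat => [ alt  => [ lit => 'a' ], [ lit => 'b' ] ],
                  [ star => [ lit => 'c' ] ] ];
print "$_\n" for sort keys %{ expand($re, 3) };
```

For (a|b)c* and a maximal length of 3 this prints a, ac, acc, b, bc and
bcc, one per line. Using hash keys as the set representation removes
duplicates for free.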


------------------------------

Date: Fri, 17 Aug 2001 13:53:36 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: regexp question with no answer yet
Message-Id: <slrn9nqmh0.b9t.tadmc@tadmc26.august.net>

Albretch <lbrtchx@hotmail.com> wrote:
>> If you *have* a maximal length, the solution is trivial:  Generate
>> all strings of that or smaller length and apply the regex.  Print
>> if match.  Now, to do so efficiently...
>
> _ We both know this is the "monkey way" and an incredible and unrealistic
>waste. I wasn't asking for that.
>
> As a matter of fact, I found the reverse of what I am looking for
>
>http://www.netch.se/~hakank/makeregex/
>
> So my need is not such an "odd" one.
>
> So, again given a regexp and the possible maximal length of the result
>strings, how could you expand all the possible "plain (no metacharacters)"
>Strings?


Regular Expressions are for _recognizing_ strings in the grammar.

Regular Grammars are for _generating_ strings in the grammar.

First step would be to convert the regex to a Regular Grammar.
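
Once a grammar is in hand, generating from it is mechanical. A minimal
sketch, assuming a hand-written right-linear grammar for the language
of (a|b)c*; the grammar, its data layout and the sub name generate()
are illustrative and not taken from any module. It also assumes every
production that continues to another nonterminal consumes at least one
character, otherwise the loop would not terminate.

```perl
use strict;
use warnings;

# Hypothetical right-linear grammar for the language of (a|b)c*:
#   S -> a T | b T
#   T -> c T | (empty)
# Each production is [ terminal text, next nonterminal or undef ].
my %grammar = (
    S => [ [ 'a', 'T' ], [ 'b', 'T'   ] ],
    T => [ [ 'c', 'T' ], [ '',  undef ] ],
);

# Derive every string of at most $max characters, starting from $start.
sub generate {
    my ($grammar, $start, $max) = @_;
    my @queue = ( [ '', $start ] );   # [ text so far, pending nonterminal ]
    my %found;
    while ( my $item = shift @queue ) {
        my ($text, $nt) = @$item;
        for my $prod ( @{ $grammar->{$nt} } ) {
            my ($terminal, $next) = @$prod;
            my $new = $text . $terminal;
            next if length($new) > $max;          # prune over-long derivations
            if ( defined $next ) { push @queue, [ $new, $next ] }
            else                 { $found{$new} = 1 }
        }
    }
    return sort keys %found;
}

print "$_\n" for generate(\%grammar, 'S', 3);
```

With a maximal length of 3 this prints a, ac, acc, b, bc and bcc.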


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Fri, 17 Aug 2001 16:13:03 -0400
From: "Albretch" <lbrtchx@hotmail.com>
Subject: Re: regexp question with no answer yet
Message-Id: <998080523.720309@zver>

 Yeah! My question reworded.

 Again, how do you do that, or in your words, "how do you convert the regex
to a Regular Grammar"?

 Can you point me to references on this?


"Tad McClellan" <tadmc@augustmail.com> wrote in message
news:slrn9nqmh0.b9t.tadmc@tadmc26.august.net...
> Albretch <lbrtchx@hotmail.com> wrote:
> >> If you *have* a maximal length, the solution is trivial:  Generate
> >> all strings of that or smaller length and apply the regex.  Print
> >> if match.  Now, to do so efficiently...
> >
> > _ We both know this is the "monkey way" and an incredible and unrealistic
> >waste. I wasn't asking for that.
> >
> > As a matter of fact, I found the reverse of what I am looking for
> >
> >http://www.netch.se/~hakank/makeregex/
> >
> > So my need is not such an "odd" one.
> >
> > So, again given a regexp and the possible maximal length of the result
> >strings, how could you expand all the possible "plain (no metacharacters)"
> >Strings?
>
>
> Regular Expressions are for _recognizing_ strings in the grammar.
>
> Regular Grammars are for _generating_ strings in the grammar.
>
> First step would be to convert the regex to a Regular Grammar.
>
>
> --
>     Tad McClellan                          SGML consulting
>     tadmc@augustmail.com                   Perl programming
>     Fort Worth, Texas




------------------------------

Date: Fri, 17 Aug 2001 16:20:53 -0400
From: "Albretch" <lbrtchx@hotmail.com>
Subject: Re: regexp question with no answer yet
Message-Id: <998080524.199263@zver>

 ... Cartesian products of string sets  ...

 Thanks! This is along my line of thought, but I thought it would simply be
a method call, something like (Java syntax)

 String[] azPlnExps = getPlainExpressions(azRegExp);

 I am amazed that apparently nobody has stumbled on this problem before,
since to me it should be a natural issue for regexps.

 By the way, I studied Physics/Math at the TU Dresden

"Anno Siegel" <anno4000@lublin.zrz.tu-berlin.de> wrote in message
news:9ljnds$kqd$1@mamenchi.zrz.TU-Berlin.DE...
> According to Albretch <lbrtchx@hotmail.com>:
> > > If you *have* a maximal length, the solution is trivial:  Generate
> > > all strings of that or smaller length and apply the regex.  Print
> > > if match.  Now, to do so efficiently...
> >
> >  _ We both know this is the "monkey way" and an incredible and unrealistic
> > waste. I wasn't asking for that.
> >
> >  As a matter of fact, I found the reverse of what I am looking for
> >
> > http://www.netch.se/~hakank/makeregex/
> >
> >  So my need is not such an "odd" one.
>
> That doesn't follow.
>
> >  So, again given a regexp and the possible maximal length of the result
> > strings, how could you expand all the possible "plain (no metacharacters)"
> > Strings?
>
> I'll assume a "real" regex (i.e. no backreferences, no lookaround and
> such tomfoolery).
>
> Parse the regex (Parse::RecDescent should do that easily), and re-write
> it in terms of literal characters, | (alternation), (, ), and *. (Character
> classes and {n,m} can be re-written that way.)
>
> Next, assume you have the solution s1 (a set of strings) for a regex re1,
> and a solution s2 for a regex re2.  Work out what the solutions for
> re1|re2, (re1)(re2) and re1* are in terms of s1 and s2.  You'll learn
> quite a bit about Cartesian products of string sets along the way.  Solve
> the problem recursively.
>
> Implementation left as an exercise :)
>
> Anno






------------------------------

Date: Fri, 17 Aug 2001 17:04:53 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: regexp question with no answer yet
Message-Id: <slrn9nr1nl.bpv.tadmc@tadmc26.august.net>


[ Please put your comments *following* the quoted text that
  you are commenting on.

  Please do not quote an entire article.

  Please do not quote .sigs.

  Thank you.
]


[ text rearranged ]


Albretch <lbrtchx@hotmail.com> wrote:
>"Tad McClellan" <tadmc@augustmail.com> wrote in message
>news:slrn9nqmh0.b9t.tadmc@tadmc26.august.net...
>> Albretch <lbrtchx@hotmail.com> wrote:
>> >> If you *have* a maximal length, the solution is trivial:  Generate
>> >> all strings of that or smaller length and apply the regex.  Print
>> >> if match.  Now, to do so efficiently...

>> > So my need is not such an "odd" one.


Then maybe a search engine might find something? ...


What do you plan to use this for?

I can't think of any compelling reasons for wanting to do what
you want to do. Looks like you have such reasons. Care to share them?


>> > So, again given a regexp and the possible maximal length of the result
>> >strings, how could you expand all the possible "plain (no metacharacters)"
>> >Strings?
>>
>> Regular Expressions are for _recognizing_ strings in the grammar.
>>
>> Regular Grammars are for _generating_ strings in the grammar.
>>
>> First step would be to convert the regex to a Regular Grammar.

> Yeah! My question reworded.
>
> Again, how do you do that, or in your words, "how do you convert the regex
>to a Regular Grammar"?


Well you can't for arbitrary Perl regular expressions, as they
are no longer regular. You will need to restrict the regex
features that can be handled.


> Can you point me to references on this?


No, but I can type "convert regular expression" into the little box 
at google.com.   :-)

It found (word-wrapped for posting):

   "Constructing an Equivalent Regular Grammar from a Regular Expression"

   http://www-verimag.imag.fr/~pace/Research/Software/Relic/
          Transformations/RE/toRG.html

and

   http://www.csd.uwo.ca/research/grail/.man/

   retofl: convert regular expression to finite language 
     (if the language of the expression is finite) 



You are likely to get more help in a newsgroup about grammars 
and such, perhaps:

   comp.compilers
   comp.compilers.tools


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Fri, 17 Aug 2001 21:37:06 GMT
From: John Porter <johndporter@yahoo.com>
Subject: Re: Searching Text File
Message-Id: <3B7D8DEF.8A09768F@yahoo.com>

tazjrg2 wrote:
> 
> The problem is, what this returns to me is "Microsoft Windows NT
> Server 4.0 " - IT DOES NOT READ the next lines of the file.
> 
> Does anyone know how I could modify this so that it would take all of
> the information, "Microsoft Windows NT Server 4.0  (build 1381)
> Service Pack 6"??

I think you're going to a lot of unnecessary trouble.  Try this:

while (<DATA>) {
  chomp;
  if ( /Operating System/ ) {
    my $s = substr $_, 41;
    $_ = <DATA>; chomp;
    $s .= substr $_, 41;
    # (etc.)
    print "$s\n";
  }
  elsif ( /EISA .* OS/ ) {
    my $s = substr $_, 41;
    print "$s\n";
  }
}

__DATA__
Operating System ....................(+) Microsoft Windows NT Server 4.0
                                         (build 1381)  Service Pack 6
                                     (-) Microsoft Windows NT Server 4.0
                                         (build 1381)  Service Pack 1
 EISA Configured Primary OS ............ Windows NT 4.0

__END__

The interesting thing to remember, I think, is that when you use
while(<>)
to read lines from a file, you can read additional lines inside the
loop, if you want, as I have done above.


-- 
John Porter


------------------------------

Date: Fri, 17 Aug 2001 14:30:23 -0700
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Simple Win32 Perl/DOS Directory Listing / Ghost Check
Message-Id: <3B7D8CEF.6AA0F5A5@stomp.stomp.tokyo>


Below my signature you will discover a simple script
which allows a quick glance at directories listed in
an array. This is quite handy for ghosting scripts,
that is, utilities to ghost two drives. This script
targets Win32 systems only: Win9x and WinMe.

By design, it does not search recursively. This
script is intended for quick glances and directory
comparisons. It will provide a lot of information
beyond directories and files.

If you want a short printout, for a quick comparison
of total file count and total byte count in a ghost
script, change my subroutine, Listings, to this:

sub Listings
 {
  @Listings = `dir`;
  print "$Listings[3]\n";
  print "$Listings[$#Listings - 1]\n\n";
 }


Use of a browser makes for easy scrolling and searching.


Godzilla! 
--

#!perl

print "Content-type: text/plain\n\n";

@Directories = qw (c:/apache/users/test d:/mail);

for (@Directories)
 {
  if (-d "$_")
   {
    print "*****\n\n";
    chdir ($_);
    &Listings;
   }
  else
   { print "*****\nBad Directory Reference: $_\n*****\n\n"; }
 }

sub Listings
 {
  @Listings = `dir`;
  print "$Listings[1]$Listings[2]$Listings[3]\n";
  print "File/Dir Name    ByteSize   Date - Time     DOS Name\n";
  for ($iterate = 4; $iterate <= $#Listings - 2; $iterate++)
   { print $Listings[$iterate]; }
  print "\nTotals:\n";
  print "$Listings[$#Listings - 1]$Listings[$#Listings]\n\n";
 }


LONG LISTING:
_____________

*****

 Volume in drive C is DRIVE1     
 Volume Serial Number is 07CF-0713
 Directory of C:\APACHE\USERS\TEST

File/Dir Name    ByteSize   Date - Time     DOS Name

 .              <DIR>        01-30-01  1:35p .
 ..             <DIR>        01-30-01  1:35p ..
TEST1    PL            658  08-17-01  2:21p test1.pl
TIME-IT  PL          1,503  08-14-01  3:25p Time-it.pl
TEST3    PL            797  06-30-01 10:09a test3.pl
TEST4    PL            189  07-03-01  8:16p test4.pl
SEARCH   PL          3,230  05-23-01  7:58p Search.pl
TEST2    PL            190  08-05-01  5:33p test2.pl
DIRSORT  PL          1,040  05-02-01  8:48a dirsort.pl
TEST5    PL            200  05-16-01  6:20p test5.pl
SEARCH   1           3,579  05-19-01  6:35p search.1
TEST1    TXT             2  08-15-01  9:10a test1.txt
TEST1PL  BAK           483  08-17-01  2:09p test1.pl.bak
BORNWILD MID       227,752  06-21-00  8:01p BORNWILD.MID
SEARCH   2           3,235  05-20-01 10:02a search.2
FINDDIR  PL            854  05-20-01 12:49p finddir.pl
SEARCH   CGI         3,127  06-03-01 10:40a Search.cgi
TEST2    TXT            71  07-17-01  1:54p test2.txt
TEST     TXT            28  08-04-01 10:33p test.txt
TEST     PL            682  09-11-00  8:33a TEST.PL
REWRITE  PL            633  08-12-01  7:27p rewrite.pl

Totals:
        19 file(s)        248,253 bytes
         2 dir(s)        8,886.20 MB free


*****

 Volume in drive D is DRIVE 2    
 Volume Serial Number is 15F0-3855
 Directory of D:\MAIL

File/Dir Name    ByteSize   Date - Time     DOS Name

 .              <DIR>        08-17-01 11:33a .
 ..             <DIR>        08-17-01 11:33a ..
DRAFTS                   0  01-17-01  7:16p DRAFTS
DRAFTS   SNM        16,384  02-28-01  2:00p DRAFTS.SNM
TEMPLA~1             2,566  02-03-00  5:44p TEMPLA~1
TEMPLA~1 SNM        32,768  02-28-01  2:00p TEMPLA~1.SNM
TRASH               25,120  03-10-01  9:30a TRASH
TRASH    SNM        98,304  04-30-01  6:58p TRASH.SNM
SENT             1,575,649  08-16-01  8:27p SENT
SENT     SNM    11,337,728  08-16-01  8:27p SENT.SNM
UNSENT~1                 0  01-17-01  7:14p UNSENT~1
UNSENT~1 SNM        16,384  02-28-01  2:00p UNSENT~1.SNM
INBOX            3,791,467  08-16-01  2:19p INBOX
INBOX    SNM       705,696  08-16-01  8:38p INBOX.SNM
POPSTATE DAT            73  08-16-01  2:19p POPSTATE.DAT
ADDRES~1 SNM        16,384  07-24-99 12:34a ADDRES~1.SNM
OLDMAIL          1,095,644  03-10-01  9:30a OLDMAIL
OLDMAIL  SNM     1,163,264  08-05-01 12:38p OLDMAIL.SNM
JOKES               61,047  03-13-00  8:17a JOKES
JOKES    SNM        16,384  02-28-01  2:00p JOKES.SNM
CORVETTE           103,100  02-07-01  3:32p CORVETTE
CORVETTE SNM       344,064  02-28-01  2:00p CORVETTE.SNM
RCMP               289,068  06-16-00  9:53a RCMP
RCMP     SNM       524,288  08-05-01 12:59p RCMP.SNM

Totals:
        22 file(s)     21,215,382 bytes
         2 dir(s)       16,849.05 MB free


SHORT LISTING:
______________

(Use Shorter Sub-Routine)


*****

 Directory of C:\APACHE\USERS\TEST

        19 file(s)        247,932 bytes


*****

 Directory of D:\MAIL

        22 file(s)     21,215,382 bytes


------------------------------

Date: 17 Aug 2001 11:53:38 -0700
From: bh_ent@hotmail.com (Drew Myers)
Subject: Re: Sorting Hash of Arrays
Message-Id: <d1b6a249.0108171053.38c01b2f@posting.google.com>

helgi@NOSPAMdecode.is (Helgi Briem) wrote in message 

> Don't listen to a word that Kira, a.k.a. Godzilla says.  
> Listen to Tad, Uri, Randal, Ilya, Brent, Ren and other
> skilled and helpful people instead.  Kira is neither
> skilled nor helpful.  She is here only to piss people off 
> and make them angry (this is also called trolling). 
> This is her only interest in Perl of which she knows only 
> the raw basics (and badly at that).  Ignore her.

Thanks for everyone's help.  This has been *quite* a learning
experience, Perl-related and otherwise.

Thanks to all.
Drew


------------------------------

Date: Fri, 17 Aug 2001 18:08:01 GMT
From: Carlos C. Gonzalez <miscellaneousemail@yahoo.com>
Subject: Re: Using manpages??
Message-Id: <MPG.15e70b18da6dd3ff98977a@news.edmonton.telusplanet.net>

In article <slrn9nqfpr.avo.tadmc@tadmc26.august.net>, Tad McClellan at 
tadmc@augustmail.com says...

> A few (3, I think) years ago there was a survey that showed
> 2/3 of Perl programmers charging in the $40-90/hour range.

Veerrryyyy interesting (green dollar signs appearing in eyes).  Since I 
live in Canada and given that dollar for dollar the Canadian dollar is 
50% less than the U.S. dollar there might be some good will promoting, 
cross cultural enhancing, cross border NAFTA type of....profitable 
opportunities (that's the word I was looking for) here. =:).

---
Carlos 
www.internetsuccess.ca
*NOTE*: Internet Success is NOT yet fully operational so although you are 
welcomed to visit and take a look, trying to subscribe will only be a 
frustration for you as your data will not be saved at this time.


------------------------------

Date: Fri, 17 Aug 2001 13:49:16 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Using manpages??
Message-Id: <slrn9nqm8s.b9t.tadmc@tadmc26.august.net>

Carlos C. Gonzalez <miscellaneousemail@yahoo.com> wrote:
>In article <slrn9nqfpr.avo.tadmc@tadmc26.august.net>, Tad McClellan at 
>tadmc@augustmail.com says...
>
>> A few (3, I think) years ago there was a survey that showed
>> 2/3 of Perl programmers charging in the $40-90/hour range.
>
>Veerrryyyy interesting (green dollar signs appearing in eyes).  


Don't get too excited. The above are hourly rates for contractors.

The usual rule of thumb is to multiply the hourly rate by 1000.
(you can bill 1000 hours/year as a contractor, whereas an "employee"
 would bill about 1800 hours/year. Contractors have lots of
 unbillable tasks they must perform (eg. marketing).
)


So 2/3 of Perl contractors make the equivalent of a salary in
the $40,000-90,000 range. 

Still not too shabby.

-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 17 Aug 2001 14:41:31 -0700
From: hgonzalez@sinectis.com.ar (Hernan)
Subject: weird perl behaviour
Message-Id: <74c2c400.0108171341.55adec2b@posting.google.com>

I'm not a perl newbie, but this 
behaviour puzzles me:
If I test an entry for a hash-of-hashes,
from an empty hash, an entry is created... (!)
Why ? Is this correct ?

Hernan Gonzalez

#################################
use strict;
use vars qw/%C/;

%C=();            # empty hash
PrintC();         # print it... it's empty, ok

# test for a nonexistent element (and do nothing)
if(defined $C{'y'}->{'x'})  { } 

PrintC();  # print the hash.... it's not empty now!
 
exit 0;

# print the contents of the %C hash 
sub PrintC
    {
    print "Hash C:\n";
    foreach my $key (keys %C)
        {
        print " $key -> $C{$key}\n";
        }
    }

####################################


------------------------------

Date: Fri, 17 Aug 2001 21:52:40 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: weird perl behaviour
Message-Id: <x71ymajqvp.fsf@home.sysarch.com>

>>>>> "H" == Hernan  <hgonzalez@sinectis.com.ar> writes:

  H> If I test an entry for a hash-of-hashes,
  H> from an empty hash, an entry is created... (!)
  H> Why ? Is this correct ?

yes it is. it is called autovivification.

  H> use vars qw/%C/;

  H> %C=();            # empty hash

it was empty to start with.

  H> PrintC();         # print it... it's empty, ok

don't use upper or mixed case for perl subs and var names. lower case is
the common style.


  H> # test for a nonexistent element (and do nothing)
  H> if(defined $C{'y'}->{'x'})  { } 

look into exists as well. but it would autoviv too in this case.

  H> PrintC();  # print the hash.... it's not empty now!

see my tutorial on this at: 

http://tlc.perlarchive.com/articles/perl/ug0002.shtml

uri
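
A short demonstration of the effect and of one common way around it,
offered as a sketch rather than as the content of the tutorial linked
above. The idea is to test the outer level first, so the inner lookup
never runs when $C{y} is missing.

```perl
use strict;
use warnings;

my %C;

# Looking inside $C{y} creates $C{y} as a side effect (autovivification).
print "before: ", scalar(keys %C), " key(s)\n";
if ( exists $C{y}{x} ) { }
print "after unguarded test: ", scalar(keys %C), " key(s)\n";

# Guarded test: 'and' short-circuits, so $C{y}{x} is only examined
# when $C{y} is already there, and nothing gets created.
%C = ();
if ( exists $C{y} and exists $C{y}{x} ) { }
print "after guarded test: ", scalar(keys %C), " key(s)\n";
```

The three lines report 0, 1 and 0 keys respectively.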

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture and Stem Development ------ http://www.stemsystems.com
Search or Offer Perl Jobs  --------------------------  http://jobs.perl.org


------------------------------

Date: Fri, 17 Aug 2001 21:24:19 +0200
From: "Steffen Müller" <tsee@gmx.net>
Subject: Re: Will Perl report on variables no longer used??
Message-Id: <9ljqqj$uvp$00$1@news.t-online.com>

"Philip Newton" <pne-news-20010817@newton.digitalspace.net> schrieb im
Newsbeitrag news:dp8pntsdrqc2if661gkmqgpfmvhadfjg09@4ax.com...

> I think he doesn't want to create them in the first place if he's never
> going to use them. That's sort of like saying "when you get spam, you
> can press the delete key to get rid of it", but I'd prefer not getting
> any spam that I'd have to delete.

My bad. Sorry about my misunderstanding.

> > I'm not quite sure if $foo = undef; actually *deletes* the var from
> > namespace.
>
> I am. It doesn't. Nor does "undef $foo;" do so. (Which one you use makes
> a difference with arrays and hashes.)

That I knew ;)

Steffen
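
For reference, the array difference mentioned in the quoted text can be
seen with a few lines. This is a small illustration with made-up
variable names; neither form removes the variable itself.

```perl
use strict;
use warnings;

my @a = (1, 2, 3);
my @b = (1, 2, 3);

@a = undef;     # list assignment: @a now holds a single undef element
undef @b;       # @b is now empty

print scalar(@a), "\n";   # prints 1
print scalar(@b), "\n";   # prints 0
```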




------------------------------

Date: 17 Aug 2001 18:20:57 GMT
From: trammell@haqq.hypersloth.invalid (John J. Trammell)
Subject: Re: XML Encoding
Message-Id: <slrn9nr7ac.553.trammell@haqq.hypersloth.net>

On 17 Aug 2001 08:55:58 -0700, Wes Spears <jspears@weston.com> wrote:
> I realize this must be a basic and stupid question.  I have looked all
> over the place and either can not see the answer to the question or
> can not find it.  I have a set of text that has less than, greater
> than and other characters in it.  I want to use a perl library, and I
> beleive there is one, to do what I would call encoding to be &lt, &gt,
> etc.
> 
> Where I am stuck is even getting started on this.  What is this kind
> of transformation called?

It's called a "one-liner":

[ ~ ] perl -pe 's/</&lt;/g; s/>/&gt;/g'
less than: <
less than: &lt;
greater than: >
greater than: &gt;
[ ~ ]

-- 
Aren't you, at this point, cutting down a California Redwood using a
banana *and* a particle accelerator?
                                         - Bernard El-Hagin, in CLPM


------------------------------

Date: 17 Aug 2001 11:49:06 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: XML Encoding
Message-Id: <3b7d6722@news.victoria.tc.ca>

John J. Trammell (trammell@haqq.hypersloth.invalid) wrote:
: On 17 Aug 2001 08:55:58 -0700, Wes Spears <jspears@weston.com> wrote:
: > I realize this must be a basic and stupid question.  I have looked all
: > over the place and either can not see the answer to the question or
: > can not find it.  I have a set of text that has less than, greater
: > than and other characters in it.  I want to use a perl library, and I
: > beleive there is one, to do what I would call encoding to be &lt, &gt,
: > etc.
: > 
: > Where I am stuck is even getting started on this.  What is this kind
: > of transformation called?

: It's called a "one-liner":

: [ ~ ] perl -pe 's/</&lt;/g; s/>/&gt;/g'
: less than: <
: less than: &lt;
: greater than: >
: greater than: &gt;
: [ ~ ]


However this leaves out the "etc." part.



------------------------------

Date: Fri, 17 Aug 2001 12:36:32 -0700
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: XML Encoding
Message-Id: <3b7d724f@news.microsoft.com>

"Malcolm Dew-Jones" <yf110@vtn1.victoria.tc.ca> wrote in message
news:3b7d6722@news.victoria.tc.ca...
> John J. Trammell (trammell@haqq.hypersloth.invalid) wrote:
> : On 17 Aug 2001 08:55:58 -0700, Wes Spears <jspears@weston.com> wrote:
[...]
> : >I have a set of text that has less than, greater
> : > than and other characters in it.  I want to use a perl library, and I
> : > beleive there is one, to do what I would call encoding to be &lt, &gt,
> : > etc.
> : [ ~ ] perl -pe 's/</&lt;/g; s/>/&gt;/g'
> : less than: <
> : less than: &lt;
> : greater than: >
> : greater than: &gt;
> : [ ~ ]
> However this leaves out the "etc." part.

Well, the "etc." part actually is very(!) short. The only characters which
are not allowed in XML and must be written as entities are
- the less sign: <
- the ampersand: &
- the character combination: ]]>

You got the less sign, you got the combination (via the greater sign), now
it should be trivial to add the ampersand sign, too.

jue




------------------------------

Date: Fri, 17 Aug 2001 15:18:03 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: XML Encoding
Message-Id: <slrn9nqrfb.bgj.tadmc@tadmc26.august.net>

Jürgen Exner <jurgenex@hotmail.com> wrote:

>Well, the "etc." part actually is very(!) short. The only characters which
>are not allowed in XML and must be written as entities are
>- the less sign: <
>- the ampersand: &
>- the character combination: ]]>
>
>You got the less sign, you got the combination (via the greater sign), now
>it should be trivial to add the ampersand sign, too.


Not quite so trivial.

*All* < should be escaped.

At least > in ]]> should be escaped.

*Some* ampersands should be escaped.


Don't want to be changing "&lt;" into "&amp;lt;"  :-)
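
One way to sketch this, assuming the input may already contain entity
or character references that must be left alone. The sub name
escape_xml() is made up for the example, and the pattern for entity
names is a simplified, ASCII-oriented approximation of XML names.

```perl
use strict;
use warnings;

# Escape text for use as XML character data, leaving existing entity
# and character references (e.g. &lt; or &#169;) untouched.
sub escape_xml {
    my ($text) = @_;
    # Escape an ampersand only when it does not start a reference.
    $text =~ s/&(?!(?:[A-Za-z_][\w.-]*|#\d+|#x[0-9A-Fa-f]+);)/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;     # escapes every >, which also covers ]]>
    return $text;
}

print escape_xml('if (a < b && c > d) print "&lt;";'), "\n";
```

The ampersand substitution has to run first; if < were replaced first,
the & in the freshly inserted &lt; would be protected only because of
the lookahead, and the order would be harder to reason about.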


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 1554
***************************************

