[19359] in Perl-Users-Digest
Perl-Users Digest, Issue: 1554 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Aug 17 18:10:37 2001
Date: Fri, 17 Aug 2001 15:10:16 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <998086216-v10-i1554@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Fri, 17 Aug 2001 Volume: 10 Number: 1554
Today's topics:
Re: regexp question with no answer yet (Anno Siegel)
Re: regexp question with no answer yet (Tad McClellan)
Re: regexp question with no answer yet <lbrtchx@hotmail.com>
Re: regexp question with no answer yet <lbrtchx@hotmail.com>
Re: regexp question with no answer yet (Tad McClellan)
Re: Searching Text File <johndporter@yahoo.com>
Simple Win32 Perl/DOS Directory Listing / Ghost Check <godzilla@stomp.stomp.tokyo>
Re: Sorting Hash of Arrays (Drew Myers)
Re: Using manpages?? <miscellaneousemail@yahoo.com>
Re: Using manpages?? (Tad McClellan)
weird perl behaviour (Hernan)
Re: weird perl behaviour <uri@sysarch.com>
Re: Will Perl report on variables no longer used?? <tsee@gmx.net>
Re: XML Encoding (John J. Trammell)
Re: XML Encoding (Malcolm Dew-Jones)
Re: XML Encoding <jurgenex@hotmail.com>
Re: XML Encoding (Tad McClellan)
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 17 Aug 2001 18:22:52 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: regexp question with no answer yet
Message-Id: <9ljnds$kqd$1@mamenchi.zrz.TU-Berlin.DE>
According to Albretch <lbrtchx@hotmail.com>:
> > If you *have* a maximal length, the solution is trivial: Generate
> > all strings of that or smaller length and apply the regex. Print
> > if match. Now, to do so efficiently...
>
> _ We both know this is the "monkey way" and an incredible and unrealistic
> waste. I wasn't asking for that.
>
> Matter of factly, I found the reverse of what I am looking for
>
> http://www.netch.se/~hakank/makeregex/
>
> So my need is not such an "odd" one.
That doesn't follow.
> So, again given a regexp and the possible maximal length of the result
> strings, how could you expand all the possible "plain (no metacharacters)"
> Strings?
I'll assume a "real" regex (i.e. no backreferences, no lookaround and
such tomfoolery).
Parse the regex (Parse::RecDescent should do that easily), and re-write
it in terms of literal characters, | (alternation), (, ), and *. (Character
classes and {n,m} can be re-written that way.)
Next, assume you have the solution s1 (a set of strings) for a regex re1,
and a solution s2 for a regex re2. Work out what the solutions for
re1|re2, (re1)(re2) and re1* are in terms of s1 and s2. You'll learn
quite a bit about Cartesian products of string sets along the way. Solve
the problem recursively.
Implementation left as an exercise :)
Anno
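A minimal sketch of the recursion Anno describes, assuming the regex has already been parsed and rewritten into a small tree of literals, alternation, concatenation and star. The tree encoding and the names expand() and product() are hypothetical and not from the thread; the parsing step (e.g. with Parse::RecDescent) is not shown.

```perl
use strict;
use warnings;

# Hypothetical tree encoding (an assumption, not from the thread):
#   ['lit',  'a']        a literal character
#   ['alt',  $r1, $r2]   r1|r2
#   ['cat',  $r1, $r2]   (r1)(r2)
#   ['star', $r]         r*
# expand($node, $max) returns a hash ref whose keys are all strings of
# length <= $max that the node matches.
sub expand {
    my ($node, $max) = @_;
    my ($op, @kids) = @$node;

    if ($op eq 'lit') {
        my %set;
        $set{ $kids[0] } = 1 if length($kids[0]) <= $max;
        return \%set;
    }
    if ($op eq 'alt') {                 # union of the two solution sets
        my %set = (%{ expand($kids[0], $max) }, %{ expand($kids[1], $max) });
        return \%set;
    }
    if ($op eq 'cat') {                 # length-bounded Cartesian product
        return product(expand($kids[0], $max), expand($kids[1], $max), $max);
    }
    if ($op eq 'star') {                # '', s, ss, ... until nothing new fits
        my $s   = expand($kids[0], $max);
        my %all = ('' => 1);
        my %new = ('' => 1);
        while (%new) {
            my $next = product(\%new, $s, $max);
            %new = map { $_ => 1 } grep { !$all{$_} } keys %$next;
            $all{$_} = 1 for keys %new;
        }
        return \%all;
    }
    die "unknown node type '$op'";
}

# All concatenations xy with x from $s1 and y from $s2 that fit in $max.
sub product {
    my ($s1, $s2, $max) = @_;
    my %out;
    for my $x (keys %$s1) {
        for my $y (keys %$s2) {
            $out{"$x$y"} = 1 if length($x) + length($y) <= $max;
        }
    }
    return \%out;
}

# (a|b)c* with a maximal length of 3
my $tree = ['cat', ['alt', ['lit', 'a'], ['lit', 'b']], ['star', ['lit', 'c']]];
print join(' ', sort keys %{ expand($tree, 3) }), "\n";
```

The length bound is what keeps the star case finite: each pass only keeps strings not seen before, and there are finitely many strings of length <= $max.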
------------------------------
Date: Fri, 17 Aug 2001 13:53:36 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: regexp question with no answer yet
Message-Id: <slrn9nqmh0.b9t.tadmc@tadmc26.august.net>
Albretch <lbrtchx@hotmail.com> wrote:
>> If you *have* a maximal length, the solution is trivial: Generate
>> all strings of that or smaller length and apply the regex. Print
>> if match. Now, to do so efficiently...
>
> _ We both know this is the "monkey way" and an incredible and unrealistic
>waste. I wasn't asking for that.
>
> Matter of factly, I found the reverse of what I am looking for
>
>http://www.netch.se/~hakank/makeregex/
>
> So my need is not such an "odd" one.
>
> So, again given a regexp and the possible maximal length of the result
>strings, how could you expand all the possible "plain (no metacharacters)"
>Strings?
Regular Expressions are for _recognizing_ strings in the grammar.
Regular Grammars are for _generating_ strings in the grammar.
First step would be to convert the regex to a Regular Grammar.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
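To illustrate the "generating" side Tad mentions, here is a rough sketch that enumerates strings from a right-linear (regular) grammar up to a maximal length. The grammar shown is written by hand for the regex (a|b)c*; the data layout and the name generate() are assumptions for illustration, and the regex-to-grammar conversion itself is not covered.

```perl
use strict;
use warnings;

# Hypothetical right-linear grammar for (a|b)c*:
#   S -> a T | b T
#   T -> c T | (empty)
# Each production is [ terminal string, next nonterminal or undef ].
my %grammar = (
    S => [ [ 'a', 'T' ], [ 'b', 'T' ] ],
    T => [ [ 'c', 'T' ], [ '',  undef ] ],
);

# Breadth-first derivation; returns every string of length <= $max.
sub generate {
    my ($g, $start, $max) = @_;
    my (%found, %seen);
    my @queue = ([ '', $start ]);
    while (my $item = shift @queue) {
        my ($prefix, $nt) = @$item;
        for my $prod (@{ $g->{$nt} }) {
            my ($term, $next) = @$prod;
            my $str = $prefix . $term;
            next if length($str) > $max;          # prune over-long derivations
            if (!defined $next) {                 # derivation is complete
                $found{$str} = 1;
                next;
            }
            # Queue each (nonterminal, prefix) pair only once, so that
            # productions with an empty terminal cannot loop forever.
            push @queue, [ $str, $next ] unless $seen{"$next\0$str"}++;
        }
    }
    my @out = sort keys %found;
    return @out;
}

print join(' ', generate(\%grammar, 'S', 3)), "\n";
```

Because every derivation is cut off as soon as it exceeds the maximal length, only candidate strings that can still fit are ever built, in contrast to generating all strings and filtering them with the regex.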
------------------------------
Date: Fri, 17 Aug 2001 16:13:03 -0400
From: "Albretch" <lbrtchx@hotmail.com>
Subject: Re: regexp question with no answer yet
Message-Id: <998080523.720309@zver>
Yeah! My question reworded.
Again how do you do that, for you "how do you convert the regex to a
Regular Grammar"?
Can you lead me to pointers regarding this?
"Tad McClellan" <tadmc@augustmail.com> wrote in message
news:slrn9nqmh0.b9t.tadmc@tadmc26.august.net...
> Albretch <lbrtchx@hotmail.com> wrote:
> >> If you *have* a maximal length, the solution is trivial: Generate
> >> all strings of that or smaller length and apply the regex. Print
> >> if match. Now, to do so efficiently...
> >
> > _ We both know this is the "monkey way" and an incredible and unrealistic
> >waste. I wasn't asking for that.
> >
> > Matter of factly, I found the reverse of what I am looking for
> >
> >http://www.netch.se/~hakank/makeregex/
> >
> > So my need is not such an "odd" one.
> >
> > So, again given a regexp and the possible maximal length of the result
> >strings, how could you expand all the possible "plain (no metacharacters)"
> >Strings?
>
>
> Regular Expressions are for _recognizing_ strings in the grammar.
>
> Regular Grammars are for _generating_ strings in the grammar.
>
> First step would be to convert the regex to a Regular Grammar.
>
>
> --
> Tad McClellan SGML consulting
> tadmc@augustmail.com Perl programming
> Fort Worth, Texas
------------------------------
Date: Fri, 17 Aug 2001 16:20:53 -0400
From: "Albretch" <lbrtchx@hotmail.com>
Subject: Re: regexp question with no answer yet
Message-Id: <998080524.199263@zver>
... Cartesian products of string sets ...
Thanks! This is within my line of thought, but I thought that was simply a
method call something like (Java syntax)
String[] azPlnExps = getPlainExpressions(azRegExp);
I am amazed that apparently nobody stumbled on this problem before, since
to me it should be a natural issue for regexps.
By the way, I studied Physics/Math at the TU Dresden.
"Anno Siegel" <anno4000@lublin.zrz.tu-berlin.de> wrote in message
news:9ljnds$kqd$1@mamenchi.zrz.TU-Berlin.DE...
> According to Albretch <lbrtchx@hotmail.com>:
> > > If you *have* a maximal length, the solution is trivial: Generate
> > > all strings of that or smaller length and apply the regex. Print
> > > if match. Now, to do so efficiently...
> >
> > > _ We both know this is the "monkey way" and an incredible and unrealistic
> > > waste. I wasn't asking for that.
> >
> > Matter of factly, I found the reverse of what I am looking for
> >
> > http://www.netch.se/~hakank/makeregex/
> >
> > So my need is not such an "odd" one.
>
> That doesn't follow.
>
> > > So, again given a regexp and the possible maximal length of the result
> > > strings, how could you expand all the possible "plain (no metacharacters)"
> > > Strings?
>
> I'll assume a "real" regex (i.e. no backreferences, no lookaround and
> such tomfoolery).
>
> Parse the regex (Parse::RecDescent should do that easily), and re-write
> it in terms of literal characters, | (alternation), (, ), and *. (Character
> classes and {n,m} can be re-written that way.)
>
> Next, assume you have the solution s1 (a set of strings) for a regex re1,
> and a solution s2 for a regex re2. Work out what the solutions for
> re1|re2, (re1)(re2) and re1* are in terms of s1 and s2. You'll learn
> quite a bit about Cartesian products of string sets along the way. Solve
> the problem recursively.
>
> Implementation left as an exercise :)
>
> Anno
------------------------------
Date: Fri, 17 Aug 2001 17:04:53 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: regexp question with no answer yet
Message-Id: <slrn9nr1nl.bpv.tadmc@tadmc26.august.net>
[ Please put your comments *following* the quoted text that
you are commenting on.
Please do not quote an entire article.
Please do not quote .sigs.
Thank you.
]
[ text rearranged ]
Albretch <lbrtchx@hotmail.com> wrote:
>"Tad McClellan" <tadmc@augustmail.com> wrote in message
>news:slrn9nqmh0.b9t.tadmc@tadmc26.august.net...
>> Albretch <lbrtchx@hotmail.com> wrote:
>> >> If you *have* a maximal length, the solution is trivial: Generate
>> >> all strings of that or smaller length and apply the regex. Print
>> >> if match. Now, to do so efficiently...
>> > So my need is not such an "odd" one.
Then maybe a search engine might find something? ...
What do you plan to use this for?
I can't think of any compelling reasons for wanting to do what
you want to do. Looks like you have such reasons. Care to share them?
>> > So, again given a regexp and the possible maximal length of the result
>> >strings, how could you expand all the possible "plain (no metacharacters)"
>> >Strings?
>>
>> Regular Expressions are for _recognizing_ strings in the grammar.
>>
>> Regular Grammars are for _generating_ strings in the grammar.
>>
>> First step would be to convert the regex to a Regular Grammar.
> Yeah! My question reworded.
>
> Again how do you do that, for you "how do you convert the regex to a
>Regular Grammar"?
Well you can't for arbitrary Perl regular expressions, as they
are no longer regular. You will need to restrict the regex
features that can be handled.
> Can you lead me to pointers regarding this?
No, but I can type "convert regular expression" into the little box
at google.com. :-)
It found (word-wrapped for posting):
"Constructing an Equivalent Regular Grammar from a Regular Expression"
http://www-verimag.imag.fr/~pace/Research/Software/Relic/
Transformations/RE/toRG.html
and
http://www.csd.uwo.ca/research/grail/.man/
retofl: convert regular expression to finite language
(if the language of the expression is finite)
You are likely to get more help in a newsgroup about grammars
and such, perhaps:
comp.compilers
comp.compilers.tools
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Fri, 17 Aug 2001 21:37:06 GMT
From: John Porter <johndporter@yahoo.com>
Subject: Re: Searching Text File
Message-Id: <3B7D8DEF.8A09768F@yahoo.com>
tazjrg2 wrote:
>
> The problem is, what this returns to me is "Microsoft Windows NT
> Server 4.0 " - IT DOES NOT READ the next lines of the file.
>
> Does anyone know how I could modify this so that it would take all of
> the information, "Microsoft Windows NT Server 4.0 (build 1381)
> Service Pack 6"??
I think you're going to a lot of unnecessary trouble. Try this:
while (<DATA>) {
    chomp;
    if ( /Operating System/ ) {
        my $s = substr $_, 41;
        $_ = <DATA>; chomp;
        $s .= substr $_, 41;
        # (etc.)
        print "$s\n";
    }
    elsif ( /EISA .* OS/ ) {
        my $s = substr $_, 41;
        print "$s\n";
    }
}
__DATA__
Operating System ....................(+) Microsoft Windows NT Server 4.0
(build 1381) Service Pack 6
(-) Microsoft Windows NT Server 4.0
(build 1381) Service Pack 1
EISA Configured Primary OS ............ Windows NT 4.0
__END__
The interesting thing to remember, I think, is that when you use
while(<>)
to read lines from a file, you can read additional lines inside the
loop, if you want, as I have done above.
--
John Porter
------------------------------
Date: Fri, 17 Aug 2001 14:30:23 -0700
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Simple Win32 Perl/DOS Directory Listing / Ghost Check
Message-Id: <3B7D8CEF.6AA0F5A5@stomp.stomp.tokyo>
Below my signature you will discover a simple script
which allows a quick glance at directories listed in
an array. This is quite handy for ghosting scripts;
utilities to ghost two drives. This script targets
Win32 systems only; Win9.x and Win.me systems.
It does not recursively search, as intended. This
script is intended for quick glances and directory
comparisons. It will provide a lot of information
beyond directories and files.
If you want a short print, want a quick comparison
of total file count and total byte count for a ghost
script, change my sub-routine, Listings , to this:
sub Listings
 {
  @Listings = `dir`;
  print "$Listings[3]\n";
  print "$Listings[$#Listings - 1]\n\n";
 }
Use of a browser makes for easy scrolling and searching.
Godzilla!
--
#!perl

print "Content-type: text/plain\n\n";

@Directories = qw (c:/apache/users/test d:/mail);

for (@Directories)
 {
  if (-d "$_")
   {
    print "*****\n\n";
    chdir ($_);
    &Listings;
   }
  else
   { print "*****\nBad Directory Reference: $_\n*****\n\n"; }
 }

sub Listings
 {
  # Capture the output of the DOS "dir" command, one line per element.
  @Listings = `dir`;
  # Lines 1-3: volume label, serial number, directory name.
  print "$Listings[1]$Listings[2]$Listings[3]\n";
  print "File/Dir Name ByteSize Date - Time DOS Name\n";
  # Skip the header lines and the two summary lines at the end.
  for ($iterate = 4; $iterate <= $#Listings - 2; $iterate++)
   { print $Listings[$iterate]; }
  print "\nTotals:\n";
  # Last two lines: file count/bytes, then directory count/free space.
  print "$Listings[$#Listings - 1]$Listings[$#Listings]\n\n";
 }
LONG LISTING:
_____________
*****
Volume in drive C is DRIVE1
Volume Serial Number is 07CF-0713
Directory of C:\APACHE\USERS\TEST
File/Dir Name ByteSize Date - Time DOS Name
. <DIR> 01-30-01 1:35p .
.. <DIR> 01-30-01 1:35p ..
TEST1 PL 658 08-17-01 2:21p test1.pl
TIME-IT PL 1,503 08-14-01 3:25p Time-it.pl
TEST3 PL 797 06-30-01 10:09a test3.pl
TEST4 PL 189 07-03-01 8:16p test4.pl
SEARCH PL 3,230 05-23-01 7:58p Search.pl
TEST2 PL 190 08-05-01 5:33p test2.pl
DIRSORT PL 1,040 05-02-01 8:48a dirsort.pl
TEST5 PL 200 05-16-01 6:20p test5.pl
SEARCH 1 3,579 05-19-01 6:35p search.1
TEST1 TXT 2 08-15-01 9:10a test1.txt
TEST1PL BAK 483 08-17-01 2:09p test1.pl.bak
BORNWILD MID 227,752 06-21-00 8:01p BORNWILD.MID
SEARCH 2 3,235 05-20-01 10:02a search.2
FINDDIR PL 854 05-20-01 12:49p finddir.pl
SEARCH CGI 3,127 06-03-01 10:40a Search.cgi
TEST2 TXT 71 07-17-01 1:54p test2.txt
TEST TXT 28 08-04-01 10:33p test.txt
TEST PL 682 09-11-00 8:33a TEST.PL
REWRITE PL 633 08-12-01 7:27p rewrite.pl
Totals:
19 file(s) 248,253 bytes
2 dir(s) 8,886.20 MB free
*****
Volume in drive D is DRIVE 2
Volume Serial Number is 15F0-3855
Directory of D:\MAIL
File/Dir Name ByteSize Date - Time DOS Name
. <DIR> 08-17-01 11:33a .
.. <DIR> 08-17-01 11:33a ..
DRAFTS 0 01-17-01 7:16p DRAFTS
DRAFTS SNM 16,384 02-28-01 2:00p DRAFTS.SNM
TEMPLA~1 2,566 02-03-00 5:44p TEMPLA~1
TEMPLA~1 SNM 32,768 02-28-01 2:00p TEMPLA~1.SNM
TRASH 25,120 03-10-01 9:30a TRASH
TRASH SNM 98,304 04-30-01 6:58p TRASH.SNM
SENT 1,575,649 08-16-01 8:27p SENT
SENT SNM 11,337,728 08-16-01 8:27p SENT.SNM
UNSENT~1 0 01-17-01 7:14p UNSENT~1
UNSENT~1 SNM 16,384 02-28-01 2:00p UNSENT~1.SNM
INBOX 3,791,467 08-16-01 2:19p INBOX
INBOX SNM 705,696 08-16-01 8:38p INBOX.SNM
POPSTATE DAT 73 08-16-01 2:19p POPSTATE.DAT
ADDRES~1 SNM 16,384 07-24-99 12:34a ADDRES~1.SNM
OLDMAIL 1,095,644 03-10-01 9:30a OLDMAIL
OLDMAIL SNM 1,163,264 08-05-01 12:38p OLDMAIL.SNM
JOKES 61,047 03-13-00 8:17a JOKES
JOKES SNM 16,384 02-28-01 2:00p JOKES.SNM
CORVETTE 103,100 02-07-01 3:32p CORVETTE
CORVETTE SNM 344,064 02-28-01 2:00p CORVETTE.SNM
RCMP 289,068 06-16-00 9:53a RCMP
RCMP SNM 524,288 08-05-01 12:59p RCMP.SNM
Totals:
22 file(s) 21,215,382 bytes
2 dir(s) 16,849.05 MB free
SHORT LISTING:
______________
(Use Shorter Sub-Routine)
*****
Directory of C:\APACHE\USERS\TEST
19 file(s) 247,932 bytes
*****
Directory of D:\MAIL
22 file(s) 21,215,382 bytes
------------------------------
Date: 17 Aug 2001 11:53:38 -0700
From: bh_ent@hotmail.com (Drew Myers)
Subject: Re: Sorting Hash of Arrays
Message-Id: <d1b6a249.0108171053.38c01b2f@posting.google.com>
helgi@NOSPAMdecode.is (Helgi Briem) wrote in message
> Don't listen to a word that Kira, a.k.a. Godzilla says.
> Listen to Tad, Uri, Randal, Ilya, Brent, Ren and other
> skilled and helpful people instead. Kira is neither
> skilled nor helpful. She is here only to piss people off
> and make them angry (this is also called trolling).
> This is her only interest in Perl of which she knows only
> the raw basics (and badly at that). Ignore her.
Thanks for everyone's help. This has been *quite* a learning
experience, Perl-related and otherwise.
Thanks to all.
Drew
------------------------------
Date: Fri, 17 Aug 2001 18:08:01 GMT
From: Carlos C. Gonzalez <miscellaneousemail@yahoo.com>
Subject: Re: Using manpages??
Message-Id: <MPG.15e70b18da6dd3ff98977a@news.edmonton.telusplanet.net>
In article <slrn9nqfpr.avo.tadmc@tadmc26.august.net>, Tad McClellan at
tadmc@augustmail.com says...
> A few (3, I think) years ago there was a survey that showed
> 2/3 of Perl programmers charging in the $40-90/hour range.
Veerrryyyy interesting (green dollar signs appearing in eyes). Since I
live in Canada and given that dollar for dollar the Canadian dollar is
50% less than the U.S. dollar there might be some good will promoting,
cross cultural enhancing, cross border NAFTA type of....profitable
opportunities (that's the word I was looking for) here. =:).
---
Carlos
www.internetsuccess.ca
*NOTE*: Internet Success is NOT yet fully operational so although you are
welcomed to visit and take a look, trying to subscribe will only be a
frustration for you as your data will not be saved at this time.
------------------------------
Date: Fri, 17 Aug 2001 13:49:16 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Using manpages??
Message-Id: <slrn9nqm8s.b9t.tadmc@tadmc26.august.net>
Carlos C. Gonzalez <miscellaneousemail@yahoo.com> wrote:
>In article <slrn9nqfpr.avo.tadmc@tadmc26.august.net>, Tad McClellan at
>tadmc@augustmail.com says...
>
>> A few (3, I think) years ago there was a survey that showed
>> 2/3 of Perl programmers charging in the $40-90/hour range.
>
>Veerrryyyy interesting (green dollar signs appearing in eyes).
Don't get too excited. The above are hourly rates for contractors.
The usual rule of thumb is to multiply the hourly rate by 1000.
(you can bill 1000 hours/year as a contractor, whereas an "employee"
would bill about 1800 hours/year. Contractors have lots of
unbillable tasks they must perform (eg. marketing).
)
So 2/3 of Perl contractors make the equivalent of a salary in
the $40,000-90,000 range.
Still not too shabby.
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 17 Aug 2001 14:41:31 -0700
From: hgonzalez@sinectis.com.ar (Hernan)
Subject: weird perl behaviour
Message-Id: <74c2c400.0108171341.55adec2b@posting.google.com>
I'm not a perl newbie, but this
behaviour puzzles me:
If I test an entry for a hash-of-hashes,
from an empty hash, an entry is created... (!)
Why ? Is this correct ?
Hernan Gonzalez
#################################
use strict;
use vars qw/%C/;
%C=(); # empty hash
PrintC(); # print it... it's empty, ok
# test for an inexistent element (and do nothing)
if(defined $C{'y'}->{'x'}) { }
PrintC(); # print the hash.... it's not empty now!
exit 0;
# print the contents of the %C hash
sub PrintC
{
    print "Hash C:\n";
    foreach my $key (keys %C)
    {
        print " $key -> $C{$key}\n";
    }
}
####################################
------------------------------
Date: Fri, 17 Aug 2001 21:52:40 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: weird perl behaviour
Message-Id: <x71ymajqvp.fsf@home.sysarch.com>
>>>>> "H" == Hernan <hgonzalez@sinectis.com.ar> writes:
H> If I test an entry for a hash-of-hashes,
H> from an empty hash, an entry is created... (!)
H> Why ? Is this correct ?
yes it is. it is called autovivification.
H> use vars qw/%C/;
H> %C=(); # empty hash
it was empty to start with.
H> PrintC(); # print it... it's empty, ok
don't use upper or mixed case for perl subs and var names. lower case is
the common style.
H> # test for an inexistent element (and do nothing)
H> if(defined $C{'y'}->{'x'}) { }
look into exists as well. but it would autoviv too in this case.
H> PrintC(); # print the hash.... it's not empty now!
see my tutorial on this at:
http://tlc.perlarchive.com/articles/perl/ug0002.shtml
uri
--
Uri Guttman --------- uri@sysarch.com ---------- http://www.sysarch.com
SYStems ARCHitecture and Stem Development ------ http://www.stemsystems.com
Search or Offer Perl Jobs -------------------------- http://jobs.perl.org
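A short sketch of the effect Uri names, and of one common way to sidestep it by testing the outer key before touching the inner one. The hash name %c and the variable names are illustrative only.

```perl
use strict;
use warnings;

my %c;

# Looking up the inner key makes Perl create $c{y} as an empty hash ref
# (autovivification), even though nothing is assigned.
my $found = defined $c{y}{x};
my @after_naive = sort keys %c;

%c = ();

# Testing the outer key first short-circuits before the inner lookup,
# so nothing is created.
$found = exists $c{y} && defined $c{y}{x};
my @after_guarded = sort keys %c;

print "naive test left keys:   (@after_naive)\n";
print "guarded test left keys: (@after_guarded)\n";
```

The same guard applies when exists is used for the inner test, since exists $c{y}{x} has to fetch $c{y} first and would autovivify it as well.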
------------------------------
Date: Fri, 17 Aug 2001 21:24:19 +0200
From: "Steffen Müller" <tsee@gmx.net>
Subject: Re: Will Perl report on variables no longer used??
Message-Id: <9ljqqj$uvp$00$1@news.t-online.com>
"Philip Newton" <pne-news-20010817@newton.digitalspace.net> schrieb im
Newsbeitrag news:dp8pntsdrqc2if661gkmqgpfmvhadfjg09@4ax.com...
> I think he doesn't want to create them in the first place if he's never
> going to use them. That's sort of like saying "when you get spam, you
> can press the delete key to get rid of it", but I'd prefer not getting
> any spam that I'd have to delete.
My bad. Sorry about my misunderstanding.
> > I'm not quite sure if $foo = undef; actually *deletes* the var from
> > namespace.
>
> I am. It doesn't. Nor does "undef $foo;" do so. (Which one you use makes
> a difference with arrays and hashes.)
That I knew ;)
Steffen
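Philip's two points can be checked with a few lines. This is only a sketch: it assumes $foo is a package variable declared with "our" (the thread does not show its declaration), and the array names are made up.

```perl
use strict;
use warnings;

our $foo = 42;
$foo = undef;                                # the value is gone ...
my $still_in_symtab = exists $main::{foo};   # ... but the name is still there

my @a = (1, 2, 3);
@a = undef;              # list assignment: @a now holds ONE element, undef
my $n_assigned = @a;

my @b = (1, 2, 3);
undef @b;                # empties the array
my $n_undef = @b;

print "symbol table entry kept: ", ($still_in_symtab ? "yes" : "no"), "\n";
print "elements after '\@a = undef': $n_assigned\n";
print "elements after 'undef \@b':   $n_undef\n";
```

For a scalar the two forms leave the variable in the same state; for arrays and hashes only "undef @b" (or assigning the empty list) actually empties the container.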
------------------------------
Date: 17 Aug 2001 18:20:57 GMT
From: trammell@haqq.hypersloth.invalid (John J. Trammell)
Subject: Re: XML Encoding
Message-Id: <slrn9nr7ac.553.trammell@haqq.hypersloth.net>
On 17 Aug 2001 08:55:58 -0700, Wes Spears <jspears@weston.com> wrote:
> I realize this must be a basic and stupid question. I have looked all
> over the place and either can not see the answer to the question or
> can not find it. I have a set of text that has less than, greater
> than and other characters in it. I want to use a perl library, and I
> believe there is one, to do what I would call encoding to be &lt;, &gt;,
> etc.
>
> Where I am stuck is even getting started on this. What is this kind
> of transformation called?
It's called a "one-liner":
[ ~ ] perl -pe 's/</&lt;/g; s/>/&gt;/g'
less than: <
less than: &lt;
greater than: >
greater than: &gt;
[ ~ ]
--
Aren't you, at this point, cutting down a California Redwood using a
banana *and* a particle accelerator?
- Bernard El-Hagin, in CLPM
------------------------------
Date: 17 Aug 2001 11:49:06 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: XML Encoding
Message-Id: <3b7d6722@news.victoria.tc.ca>
John J. Trammell (trammell@haqq.hypersloth.invalid) wrote:
: On 17 Aug 2001 08:55:58 -0700, Wes Spears <jspears@weston.com> wrote:
: > I realize this must be a basic and stupid question. I have looked all
: > over the place and either can not see the answer to the question or
: > can not find it. I have a set of text that has less than, greater
: > than and other characters in it. I want to use a perl library, and I
: > believe there is one, to do what I would call encoding to be &lt;, &gt;,
: > etc.
: >
: > Where I am stuck is even getting started on this. What is this kind
: > of transformation called?
: It's called a "one-liner":
: [ ~ ] perl -pe 's/</&lt;/g; s/>/&gt;/g'
: less than: <
: less than: &lt;
: greater than: >
: greater than: &gt;
: [ ~ ]
However this leaves out the "etc." part.
------------------------------
Date: Fri, 17 Aug 2001 12:36:32 -0700
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: XML Encoding
Message-Id: <3b7d724f@news.microsoft.com>
"Malcolm Dew-Jones" <yf110@vtn1.victoria.tc.ca> wrote in message
news:3b7d6722@news.victoria.tc.ca...
> John J. Trammell (trammell@haqq.hypersloth.invalid) wrote:
> : On 17 Aug 2001 08:55:58 -0700, Wes Spears <jspears@weston.com> wrote:
[...]
> : >I have a set of text that has less than, greater
> : > than and other characters in it. I want to use a perl library, and I
> : > believe there is one, to do what I would call encoding to be &lt;, &gt;,
> : > etc.
> : [ ~ ] perl -pe 's/</&lt;/g; s/>/&gt;/g'
> : less than: <
> : less than: &lt;
> : greater than: >
> : greater than: &gt;
> : [ ~ ]
> However this leaves out the "etc." part.
Well, the "etc." part actually is very(!) short. The only characters which
are not allowed in XML and must be written as entities are
- the less sign: <
- the ampersand: &
- the character combination: ]]>
You got the less sign, you got the combination (via the greater sign), now
it should be trivial to add the ampersand sign, too.
jue
------------------------------
Date: Fri, 17 Aug 2001 15:18:03 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: XML Encoding
Message-Id: <slrn9nqrfb.bgj.tadmc@tadmc26.august.net>
Jürgen Exner <jurgenex@hotmail.com> wrote:
>Well, the "etc." part actually is very(!) short. The only characters which
>are not allowed in XML and must be written as entities are
>- the less sign: <
>- the ampersand: &
>- the character combination: ]]>
>
>You got the less sign, you got the combination (via the greater sign), now
>it should be trivial to add the ampersand sign, too.
Not quite so trivial.
*All* < should be escaped.
At least > in ]]> should be escaped.
*Some* ampersands should be escaped.
Don't want to be changing "&lt;" into "&amp;lt;" :-)
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
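One way to handle the three cases Tad lists, as a hedged sketch. The function name xml_escape() is made up, and the rule "leave an ampersand alone if it already starts something that looks like an entity or character reference" is an assumption about what the input means; text that contains a literal "&lt;" meant to be shown as-is would need the plain "escape every ampersand" rule instead.

```perl
use strict;
use warnings;

sub xml_escape {
    my ($text) = @_;
    # Ampersands first, so the entities added below are not escaped again.
    # An ampersand is left alone when it already starts a named or numeric
    # reference such as &lt; or &#38; or &#x26;.
    $text =~ s/&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;
    $text =~ s/</&lt;/g;     # every < is escaped
    $text =~ s/>/&gt;/g;     # every > is escaped, which covers ]]>
    return $text;
}

print xml_escape('a < b && c > d, already &lt; escaped'), "\n";
```

Escaping every > is more than XML strictly requires, but it is harmless and avoids having to look for the ]]> sequence specifically.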
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 1554
***************************************