[30886] in Perl-Users-Digest


Perl-Users Digest, Issue: 2131 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jan 15 18:09:48 2009

Date: Thu, 15 Jan 2009 15:09:11 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Thu, 15 Jan 2009     Volume: 11 Number: 2131

Today's topics:
    Re: Arrays instead of files into hashes sln@netherlands.com
    Re: Circular lists <jurgenex@hotmail.com>
    Re: Circular lists sln@netherlands.com
    Re: handling hypens(-) in word boundary matching sln@netherlands.com
        Match CASE/END SQL Construct <jc_va@hotmail.com>
    Re: Match CASE/END SQL Construct sln@netherlands.com
    Re: Match CASE/END SQL Construct <glex_no-spam@qwest-spam-no.invalid>
    Re: Match CASE/END SQL Construct sln@netherlands.com
    Re: opening a file <cwilbur@chromatico.net>
    Re: processing text <uri@stemsystems.com>
    Re: processing text <tim@burlyhost.com>
    Re: processing text <syscjm@sumire.gwu.edu>
    Re: processing text <uri@stemsystems.com>
    Re: processing text sln@netherlands.com
    Re: processing text sln@netherlands.com
    Re: The single biggest STUPIDITY in Perl ... <jimsgibson@gmail.com>
    Re: The single biggest STUPIDITY in Perl ... <uri@stemsystems.com>
        What do you need to have to be considered a Master at P sln@netherlands.com
        XML::Parser help (hymie!)
    Re: XML::Parser help <jimsgibson@gmail.com>
    Re: XML::Parser help sln@netherlands.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 15 Jan 2009 22:11:21 GMT
From: sln@netherlands.com
Subject: Re: Arrays instead of files into hashes
Message-Id: <ggcvm4ljk8rk8lquj3gp2o4lm4bg2omb3q@4ax.com>

On Thu, 15 Jan 2009 01:03:49 -0800 (PST), Francois Massion <massion@gmx.de> wrote:

>On 13 Jan., 19:09, Francois Massion <mass...@gmx.de> wrote:
>> On 13 Jan., 01:41, Tim Greer <t...@burlyhost.com> wrote:
>>
>>
>>
>>
>>
>> > s...@netherlands.com wrote:
>> > > On Mon, 12 Jan 2009 17:55:36 -0600, Tad J McClellan
>> > > <ta...@seesig.invalid> wrote:
>>
>> > >>Francois Massion <mass...@gmx.de> wrote:
>>
>> > >>> I would like to use 2 arrays, says @array1 and
>> > >>> @array2 instead of files a.txt and b.txt.
>>
>> > >>> while (<WORDLIST2>) {
>> > >>>     $list2{$_}=1; #or any other value
>> > >>> }
>>
>> > >>    my @b_txt = <WORDLIST2>;
>> > > Isn't slurp a bit rich?
>> > > [snip]
>>
>> > > sln
>>
>> > Probably, but that's what the OP asked for.  I can't imagine why, but
>> > perhaps they just need to elaborate on their reasons to get a good
>> > answer.  Personally, I didn't answer them, because I had to ask what
>> > they were wanting to do, since it didn't seem clear to me (or didn't
>> > seem to have a purpose and would just waste good processing).
>> > --
>> > Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
>> > Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
>> > and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
>> > Industry's most experienced staff! -- Web Hosting With Muscle!
>>
>>
>> Thanks to all for your contributions. I'll have to try them out. The
>> background for my asking (as a non-pro) is the following. I am doing
>> terminology extraction for linguistic purposes. Thus I take a text,
>> split it up in words or expressions and perform various "refining"
>> operations in order to get only the clean interesting terms as they
>> appear in a dictionary. Each operation is currently a small amateurish
>> little script and the output is an array which I can display in a file
>> or on screen.
>>
>> Now I want to automate all these single steps into one operation which
>> means that instead of reading word lists from text files I would like
>> to use the arrays generated by the previous step. This is the reason
>> for the question above. I have 2 different files as the result of 2
>> previous steps and the difference are the words which are interesting
>> for my terminology work. I hope this helps.
>
>Thanks to all. Basically the solution to my problem seems to be:
>
>my %in_array2 = map { $_ => 1 } @array2;
>my @array3 = grep { !$in_array2{$_} } @array1;
[snip]

There are many solutions. The one you picked will take the longest.
I just don't like this. I don't like grep to begin with: there is no functional
way to terminate it early, let alone the hash lookup and map overhead.

If speed isn't a concern, then any solution will do.
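
For reference, the hash-lookup difference quoted above can be wrapped in a
small sub. This is only a minimal sketch: the name list_difference and the
sample words are made up for illustration, and the hash slice is swapped in
for map as one possible way to trim overhead, not something proposed in the
thread.

```perl
use strict;
use warnings;

# Return the elements of @$first that do not occur in @$second.
# list_difference is a hypothetical helper name.
sub list_difference {
    my ($first, $second) = @_;
    my %in_second;
    @in_second{@$second} = ();    # hash slice: create the keys, values stay undef
    return grep { !exists $in_second{$_} } @$first;
}

my @array1 = qw(term noise word the);    # hypothetical sample data
my @array2 = qw(the noise);
my @array3 = list_difference(\@array1, \@array2);
print "@array3\n";
```

With the sample data this prints "term word". Whether it is measurably
faster than the map version depends on the size of the lists, so it is worth
benchmarking before choosing.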

sln


------------------------------

Date: Thu, 15 Jan 2009 12:10:35 -0800
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Circular lists
Message-Id: <fv4vm4ps9b0c0052vf3j2sd832ktkauofc@4ax.com>

gamo <gamo@telecable.es> wrote:
>The thing could change radically if there is a method to canonicalize all
>the rotations of a list in a compact string. Did you say that is possible?

I don't think you have to. What you are looking for are equivalence
classes. For each class you need one representative element or normal
form. Then, if and only if your permute algorithm generates this very
representative, then you found a new equivalence class and you can print
that representative immediately even without storing it.
If a permutation returned by the permute algorithm is not a
representative/not in normal form then you can discard that permutation
right away.

How you define this normal form is up to you. You need to find a
definition that does not require generating all the other elements of
the equivalence class, i.e. all the other rotations, because that would
be expensive.

One idea: use alphabetical sorting and define the normal form to be the
smallest alphabetical list. 
That means, if the list doesn't start with the smallest element then it
can't be in normal form. This will eliminate the vast majority of
candidates right away.
If a candidate does start with the smallest element and there is no
other element with the same small value, then you got a normal form and
the representative of a new equivalence class.
Only if the same smallest value appears multiple times in the list, then
you need to dig deeper and spend some time evaluating, if your candidate
is the smallest alphanumerical element. But this shouldn't be too
frequent, so the performance hit should be bearable.
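
The idea above can be sketched as follows. This is an illustration under
stated assumptions, not a tuned implementation: is_normal_form is a made-up
name, elements are compared as strings, and no element may contain a NUL
character because NUL is used as the join separator.

```perl
use strict;
use warnings;

# True if @list is the alphabetically smallest of its own rotations.
# Only rotations that start at another copy of the smallest element are
# built, since no other rotation can sort lower.
sub is_normal_form {
    my @list = @_;
    return 1 if @list < 2;
    my ($smallest) = sort @list;
    return 0 if $list[0] ne $smallest;    # cheap rejection for most candidates
    my $self_key = join "\0", @list;      # assumes elements contain no NUL
    for my $i (1 .. $#list) {
        next if $list[$i] ne $smallest;
        my $rotated = join "\0", @list[$i .. $#list], @list[0 .. $i - 1];
        return 0 if $rotated lt $self_key;
    }
    return 1;
}

for my $candidate ([qw(a b c)], [qw(b c a)], [qw(a b a c)], [qw(a c a b)]) {
    printf "%s => %s\n", "@$candidate",
        is_normal_form(@$candidate) ? 'normal form' : 'discard';
}
```

A permutation generator would call this on each candidate and print only
those that pass, so nothing has to be stored.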

>I don't think so. If we have numbers instead of letters, I can't think of
>a method that could describe the list of circular numbers in less space
>than it has.

The trick is not to store all permutations or even all normal forms but
to develop a normal form criterion that allows you to determine if a
permutation is in normal form _without_ looking at previously generated
permutations and for performance reasons without looking at other
rotations, i.e. at other elements of the same equivalence class.

jue


------------------------------

Date: Thu, 15 Jan 2009 21:03:43 GMT
From: sln@netherlands.com
Subject: Re: Circular lists
Message-Id: <bv8vm4p912sjo7qrul8ukuuhghn56en5u9@4ax.com>

On Thu, 15 Jan 2009 11:05:41 -0800, Jürgen Exner <jurgenex@hotmail.com> wrote:

>cartercc <cartercc@gmail.com> wrote:
>>On Jan 9, 5:45 am, gamo <g...@telecable.es> wrote:
>>> I want to learn an efficient way to handle circular lists.
>>
>>If you want a circular list, you use a counter and modulus it by the
>>number of items in the list.
>>
>>For example, if you have a ten element list in an array, and your
>>counter is modulus ten, 9 gets $array[9], but 10 gets (10 modulus 10 =
>>0) $array[0].
>
>You are falling for the same red herring that has plagued me for the
>past days. This is not about circular lists, quite the opposite
>actually.
>
>From my understanding now (thanks again to xhoster for the explanation)
>the OP is looking for all equivalence classes of all permutations of a
>given list, where two (permutated) lists belong to the same equivalence
>class if and only if those two lists can be transformed into each other
>by rotating/circling their elements.
>
>jue

You guys make me laff. You should watch that new show on the science channel
"Million 2 One". They had a good one there, how many people in a room does it
take to statistically almost guarantee 2 people will have the same birth date
(day-month) out of 365 days/year?

Answer: 26

So funny!
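
A figure like this is easy to check. The sketch below assumes 365 equally
likely birthdays and ignores leap years; shared_birthday_probability is a
made-up name. Under those assumptions the usual textbook result is that 23
people already give a better-than-even chance of a shared birthday.

```perl
use strict;
use warnings;

# Probability that at least two of $people share a birthday,
# assuming 365 equally likely days and no leap years.
sub shared_birthday_probability {
    my ($people) = @_;
    my $all_distinct = 1;
    $all_distinct *= (365 - $_) / 365 for 0 .. $people - 1;
    return 1 - $all_distinct;
}

printf "%2d people: %.3f\n", $_, shared_birthday_probability($_)
    for 23, 26, 57;
```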

sln



------------------------------

Date: Thu, 15 Jan 2009 19:55:47 GMT
From: sln@netherlands.com
Subject: Re: handling hypens(-) in word boundary matching
Message-Id: <545vm4h8mc857fjctcdvbl7q89lites8vm@4ax.com>

On Mon, 12 Jan 2009 04:50:15 -0800 (PST), arun <urgearun@gmail.com> wrote:

>Hi Perl gurus,
>
>I am trying to replace the substring "feat-ha" with 1 in a string, but
>it's failing because of the hyphen (-). Please
>help me handle it. I have pasted my code snippet below.
>
>
>#!c:\Perl\site\bin
>
>use strict;
>
>my $string = qq((feat-bgp-mpls-vpn  AND feat-ha-sso  AND span));
>
>my $pattern = qq(feat-ha);
>
>$string=~s/\b$pattern\b/1/g;
>
>print $string;
>
>output : (feat-bgp-mpls-vpn  AND 1-sso AND span)
>
>
>In the above program I am trying to replace the word "feat-ha" with 1
>if it matches the exact word in the string "(feat-bgp-mpls-vpn  AND
>feat-ha-sso  AND span)", but it's replacing feat-ha-sso with 1-sso, which
>shouldn't happen. Please help me.
>
>Thanks in Advance..
>
>Regards
>-Arun
>
From perlretut:
An anchor useful in basic regexps is the word anchor  \b.
This matches a boundary between a word character and a non-word character \w\W or \W\w

Where:
\d is a digit and represents [0-9] 

\s is a whitespace character and represents [\ \t\r\n\f] 

\w is a word character (alphanumeric or _) and represents [0-9a-zA-Z_] 

\D is a negated \d; it represents any character but a digit [^0-9] 

\S is a negated \s; it represents any non-whitespace character [^\s] 

\W is a negated \w; it represents any non-word character [^\w] 

The period '.' matches any character but ``\n'' 

------------------------------------------------

So feat-ha-sso matches because between 'a' and '-' there is a word boundary going from \w to \W.

Visually, it appears you want this ->  s/(\s)$pattern(\s)/${1}1$2/g
if, say, you don't want to use extended pattern constructs.

Possibly ->  s/\b$pattern([^\w-]|$)/1$1/g

But doing it this way limits the variability of $pattern, since the first character
in the pattern must be a word character.

Still, it's better to be specific. Using just \b, a word boundary, as bookend
delimiters might yield more than you want (or less than you want). It all depends upon
what data you expect to be fed to it.

Oh, I guess you could force a prevailing rule of \b boundaries, test the pattern, create
pre/post delimiter character classes, then put it all together in a regular expression.

Generally, \b is not something reliable, given the variety of possibilities it could
match in complex source text.
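
One way to be specific is to say exactly which neighbours are forbidden,
using lookaround assertions instead of \b. This is a minimal sketch:
replace_token is a made-up name, and treating word characters and hyphens as
the only "glue" characters is an assumption about the data. \Q...\E keeps any
regex metacharacters in $pattern literal.

```perl
use strict;
use warnings;

# Replace $pattern only when it is not glued to a neighbouring word
# character or hyphen, so "feat-ha" does not match inside "feat-ha-sso".
sub replace_token {
    my ($string, $pattern, $replacement) = @_;
    $string =~ s/(?<![\w-])\Q$pattern\E(?![\w-])/$replacement/g;
    return $string;
}

my $string = '(feat-bgp-mpls-vpn  AND feat-ha-sso  AND feat-ha AND span)';
print replace_token($string, 'feat-ha', 1), "\n";
```

Here only the standalone feat-ha becomes 1; feat-ha-sso is left alone.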

sln


------------------------------

Date: Thu, 15 Jan 2009 15:05:29 -0500
From: "Perry Aynum" <jc_va@hotmail.com>
Subject: Match CASE/END SQL Construct
Message-Id: <9GMbl.19524$Nq5.11481@newsfe24.iad>

I am working on a SQL parser.  I have a routine that recursively removes 
enclosing parentheses and it works fine.  Below is the regex that I use.

However, I want to use the same routine, but instead of looking for 
enclosing parens, I want to look for a string enclosed by CASE and END.  Can 
someone help me translate the regex below so that it will match a CASE/END 
construct?

Thanks very much.

Parens
----------
(?:\s+)?\([^\(\)]*\)



This is what I've managed so far with the CASE/END

(?:\s+)?case(?!case|end)\s+end 




------------------------------

Date: Thu, 15 Jan 2009 20:47:20 GMT
From: sln@netherlands.com
Subject: Re: Match CASE/END SQL Construct
Message-Id: <b58vm49j511v91kqp42v6hhbpoi8sql2su@4ax.com>

On Thu, 15 Jan 2009 15:05:29 -0500, "Perry Aynum" <jc_va@hotmail.com> wrote:

>I am working on a SQL parser.  I have a routine that recursively removes 
>enclosing parentheses and it works fine.  Below is the regex that I use.
>
>However, I want to use the same routine, but instead of looking for 
>enclosing parens, I want to look for a string enclosed by CASE and END.  Can 
>someone help me translate the regex below so that it will match a CASE/END 
>construct?
>
>Thanks very much.
>
>Parens
>----------
>(?:\s+)?\([^\(\)]*\)
>
>
>
>This is what I've managed so far with the CASE/END
>
>(?:\s+)?case(?!case|end)\s+end 
>
It's probably not this simple.

sln

-------------------------

use strict;
use warnings;

my $txt = "(this (is a) test)";

while ($txt =~ s/\(([^()]*?)\)/$1/) {};

print $txt,"\n";

$txt = "case this case is a end test end";

while ($txt =~ s/case\s+(.*?)\s+end/$1/) {};

print $txt,"\n";

__END__

this is a test
this is a test
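
A closer analogue of [^()]* for keywords is a pattern that refuses to step
over another CASE or END, so the innermost pair is always removed first.
This is only a sketch: strip_case_end is a made-up name, and it assumes the
keywords never occur inside string literals or comments, which real SQL does
not guarantee.

```perl
use strict;
use warnings;

# Remove CASE ... END wrappers from the inside out.
# (?:(?!\b(?:case|end)\b).)* matches any text that contains neither
# keyword, which plays the role [^()]* plays for parentheses.
sub strip_case_end {
    my ($sql) = @_;
    1 while $sql =~ s/\bcase\b\s*((?:(?!\b(?:case|end)\b).)*?)\s*\bend\b/$1/is;
    return $sql;
}

print strip_case_end("case this case is a end test end"), "\n";
```

For anything beyond simple input, a real parser such as the modules
suggested elsewhere in this thread is the safer route.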



------------------------------

Date: Thu, 15 Jan 2009 16:28:17 -0600
From: "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid>
Subject: Re: Match CASE/END SQL Construct
Message-Id: <496fb882$0$89392$815e3792@news.qwest.net>

Perry Aynum wrote:
> I am working on a SQL parser.  I have a routine that recursively removes 
> enclosing parentheses and it works fine.  Below is the regex that I use.
> 
> However, I want to use the same routine, but instead of looking for 
> enclosing parens, I want to look for a string enclosed by CASE and END.  Can 
> someone help me translate the regex below so that it will match a CASE/END 
> construct?
> 
> Thanks very much.
> 
> Parens
> ----------
> (?:\s+)?\([^\(\)]*\)

See also: Text::Balanced and/or Parse::RecDescent


> This is what I've managed so far with the CASE/END
> 
> (?:\s+)?case(?!case|end)\s+end 
> 
> 

perldoc -q "How can I pull out lines between two patterns that are 
themselves on different lines"

Maybe:

perldoc -q "How do I find matching/nesting anything"


------------------------------

Date: Thu, 15 Jan 2009 22:48:53 GMT
From: sln@netherlands.com
Subject: Re: Match CASE/END SQL Construct
Message-Id: <h9fvm41quplt7di6avspjt2adp8qh72j5h@4ax.com>

On Thu, 15 Jan 2009 16:28:17 -0600, "J. Gleixner" <glex_no-spam@qwest-spam-no.invalid> wrote:

>Perry Aynum wrote:
>> I am working on a SQL parser.  I have a routine that recursively removes 
>> enclosing parentheses and it works fine.  Below is the regex that I use.
>> 
>> However, I want to use the same routine, but instead of looking for 
>> enclosing parens, I want to look for a string enclosed by CASE and END.  Can 
>> someone help me translate the regex below so that it will match a CASE/END 
>> construct?
>> 
>> Thanks very much.
>> 
>> Parens
>> ----------
>> (?:\s+)?\([^\(\)]*\)
>
>See also: Text::Balanced and/or Parse::RecDescent
>
>
>> This is what I've managed so far with the CASE/END
>> 
>> (?:\s+)?case(?!case|end)\s+end 
>> 
>> 
>
>perldoc -q "How can I pull out lines between two patterns that are 
>themselves on different lines"
>
>Maybe:
>
>perldoc -q "How do I find matching/nesting anything"

Why would one need a module for something so apparently simple?

sln


------------------------------

Date: Thu, 15 Jan 2009 16:06:56 -0500
From: Charlton Wilbur <cwilbur@chromatico.net>
Subject: Re: opening a file
Message-Id: <86d4eo2s9b.fsf@mithril.chromatico.net>

>>>>> "cc" == cartercc  <cartercc@gmail.com> writes:

    >> How long do you think a doctor would last if he did everything
    >> the patient wanted, regardless of whether it was in the patient's
    >> best interest?

    cc> Not the same thing. In fact, doctors do work impossible
    cc> hours. If the doctor said, 'I'm not going to take care of this
    cc> patient because it will cause me personal inconvenience,' how
    cc> long do you think he would last? Besides, there was no question
    cc> (in my case) of inconsistency between what was best for the
    cc> enterprise and best for the manager -- coding up the job had
    cc> absolute priority, regardless of my convenience.

Where did you throw personal convenience into the mix?

You're doing unprofessional work because you're acceding to your
manager's demands to do things that are quick and dirty.  You're
rationalizing it as a first effort that merely validates the spec, but
what it boils down to is this:  you, as a *professional*, by claiming
that term, are responsible for the quality of the code you produce.  
This means pushing back when the manager tells you to do something
inappropriate.

Didn't you start a thread some time back about crisis mode programming?
Do you really not see a connection between your willingness to work on
deathmarch and panic-driven schedules without pushing back, and the
frequency with which you have to deal with crises?  

    cc> And I suppose you have never faced a circumstance that required
    cc> your services after hours or during scheduled off time? Do you
    cc> think it proper to refuse to work simply because the off-hours
    cc> duties were directly caused by your manager's procrastination?

I am a firm believer in the maxim "A failure to prepare or plan on your
part does not constitute an emergency on mine."

I have had managers who were walking repositories of self-created crises
and emergencies.  I pushed back hard against the self-created deadlines,
and when I found managers who were impossible to train, I got out of
those jobs as quickly as I could.

I advise you to do likewise.

Charlton



-- 
Charlton Wilbur
cwilbur@chromatico.net


------------------------------

Date: Thu, 15 Jan 2009 14:19:33 -0500
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: processing text
Message-Id: <x7vdsg5qd6.fsf@mail.sysarch.com>

>>>>> "G" == George  <george@example.invalid> writes:

  G> I thought I was past trouble working with simple file manipulations, but I
  G> seem to be stumped here again:


  G> #!/usr/bin/perl
  G> use strict;
  G> use warnings;

  G>    my $filename = 'larry1.txt';
  G>    my $outfile = 'processed1.txt'

missing ;

that causes that and the next line to be a single long statement which
is parsed weirdly and spits out lots of errors.

  G>    open my $fh, '<', $filename or die "cannot open $filename: $!";
  G>    open my $gh, '>', $outfile or die "cannot open $filename: $!";
  G>    while (<$fh>) {
  G>        s/%%/%\n/;
  G>        print $gh, $_;

the file handle arg to print doesn't get a comma afterwards

  G>    }
  G>    close($fh)
  G>    close($gh)

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Thu, 15 Jan 2009 11:41:49 -0800
From: Tim Greer <tim@burlyhost.com>
Subject: Re: processing text
Message-Id: <1kMbl.19523$Nq5.7908@newsfe24.iad>

George wrote:

> 
> I thought I was past trouble working with simple file manipulations,
> but I seem to be stumped here again:
> 
> 
> #!/usr/bin/perl
> use strict;
> use warnings;
> 
>    my $filename = 'larry1.txt';
>    my $outfile = 'processed1.txt'

You forgot a semicolon there. = 'processed1.txt';

>    open my $fh, '<', $filename or die "cannot open $filename: $!";
>    open my $gh, '>', $outfile or die "cannot open $filename: $!";
>    while (<$fh>) {
>        s/%%/%\n/;
>        print $gh, $_;

print $gh $_; # no comma.

>    }
>    close($fh)

You forgot ;

>    close($gh)
> 

You don't need a semi-colon on the last close statement, since that's
the end of the script, but you should probably add one anyway,
especially if code could follow.

You also might want to add a warning if the filehandle fails to close,
if it's anything important.
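
Putting the corrections together, the script might look like the sketch
below. Wrapping the work in a sub named split_percent and taking the file
names from @ARGV are assumptions made for illustration; the original used
fixed names.

```perl
use strict;
use warnings;

# Copy $in to $out, replacing the first "%%" on each line with "%\n".
# split_percent is a hypothetical name for this sketch.
sub split_percent {
    my ($in, $out) = @_;
    open my $fh, '<', $in  or die "cannot open $in: $!";
    open my $gh, '>', $out or die "cannot open $out: $!";
    while (my $line = <$fh>) {
        $line =~ s/%%/%\n/;
        print {$gh} $line;    # no comma after the filehandle
    }
    close $fh or warn "cannot close $in: $!";
    close $gh or die  "cannot close $out: $!";    # a failed close can mean lost output
    return;
}

# Example invocation: perl script.pl larry1.txt processed1.txt
split_percent(@ARGV) if @ARGV == 2;
```
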
-- 
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting.  24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!


------------------------------

Date: Thu, 15 Jan 2009 14:06:15 -0600
From: Chris Mattern <syscjm@sumire.gwu.edu>
Subject: Re: processing text
Message-Id: <slrngmv5pm.qc3.syscjm@sumire.gwu.edu>

On 2009-01-15, Uri Guttman <uri@stemsystems.com> wrote:
>>>>>> "G" == George  <george@example.invalid> writes:
>
>  G> I thought I was past trouble working with simple file manipulations, but I
>  G> seem to be stumped here again:
>
>
>  G> #!/usr/bin/perl
>  G> use strict;
>  G> use warnings;
>
>  G>    my $filename = 'larry1.txt';
>  G>    my $outfile = 'processed1.txt'
>
> missing ;
>
> that causes that and the next line to be a single long statement which
> is parsed weirdly and spits out lots of errors.
>
>  G>    open my $fh, '<', $filename or die "cannot open $filename: $!";
>  G>    open my $gh, '>', $outfile or die "cannot open $filename: $!";
>  G>    while (<$fh>) {
>  G>        s/%%/%\n/;
>  G>        print $gh, $_;
>
> the file handle arg to print doesn't get a comma afterwards
>
>  G>    }
>  G>    close($fh)
>  G>    close($gh)

More missing semicolons here that gave him the errors on the close 
statements.
>
> uri
>


-- 
             Christopher Mattern

NOTICE
Thank you for noticing this new notice
Your noticing it has been noted
And will be reported to the authorities


------------------------------

Date: Thu, 15 Jan 2009 15:16:02 -0500
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: processing text
Message-Id: <x78wpc5nr1.fsf@mail.sysarch.com>

>>>>> "CM" == Chris Mattern <syscjm@sumire.gwu.edu> writes:

  CM> On 2009-01-15, Uri Guttman <uri@stemsystems.com> wrote:
  >>>>>>> "G" == George  <george@example.invalid> writes:
  >> 
  G> open my $fh, '<', $filename or die "cannot open $filename: $!";
  G> open my $gh, '>', $outfile or die "cannot open $filename: $!";
  G> while (<$fh>) {
  G> s/%%/%\n/;
  G> print $gh, $_;
  >> 
  >> the file handle arg to print doesn't get a comma afterwards
  >> 
  G> }
  G> close($fh)
  G> close($gh)

  CM> More missing semicolons here that gave him the errors on the close 
  CM> statements.

and no one mentioned this is a trivial one liner (untested):

perl -pe 's/%%/%\n/' <infile >outfile

i don't know the file format so maybe a /g is needed there too.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Thu, 15 Jan 2009 22:13:53 GMT
From: sln@netherlands.com
Subject: Re: processing text
Message-Id: <n7dvm417m24jmfu0dvm0tc5as6ji6nca7a@4ax.com>

On Thu, 15 Jan 2009 15:16:02 -0500, Uri Guttman <uri@stemsystems.com> wrote:

>>>>>> "CM" == Chris Mattern <syscjm@sumire.gwu.edu> writes:
>
>  CM> On 2009-01-15, Uri Guttman <uri@stemsystems.com> wrote:
>  >>>>>>> "G" == George  <george@example.invalid> writes:
>  >> 
>  G> open my $fh, '<', $filename or die "cannot open $filename: $!";
>  G> open my $gh, '>', $outfile or die "cannot open $filename: $!";
>  G> while (<$fh>) {
>  G> s/%%/%\n/;
>  G> print $gh, $_;
>  >> 
>  >> the file handle arg to print doesn't get a comma afterwards
>  >> 
>  G> }
>  G> close($fh)
>  G> close($gh)
>
>  CM> More missing semicolons here that gave him the errors on the close 
>  CM> statements.
>
>and no one mentioned this is a trivial one liner (untested):
>
>perl -pe 's/%%/%\n/' <infile >outfile
>
>i don't know the file format so maybe a /g is needed there too.
>
>uri
It's all that typing in one-liners, over and over and over again.

sln


------------------------------

Date: Thu, 15 Jan 2009 22:57:55 GMT
From: sln@netherlands.com
Subject: Re: processing text
Message-Id: <mmfvm4t4943kn1tiobhpo0ols7qm397eub@4ax.com>

On Thu, 15 Jan 2009 22:13:53 GMT, sln@netherlands.com wrote:

>On Thu, 15 Jan 2009 15:16:02 -0500, Uri Guttman <uri@stemsystems.com> wrote:
>
>>>>>>> "CM" == Chris Mattern <syscjm@sumire.gwu.edu> writes:
>>
>>  CM> On 2009-01-15, Uri Guttman <uri@stemsystems.com> wrote:
>>  >>>>>>> "G" == George  <george@example.invalid> writes:
>>  >> 
>>  G> open my $fh, '<', $filename or die "cannot open $filename: $!";
>>  G> open my $gh, '>', $outfile or die "cannot open $filename: $!";
>>  G> while (<$fh>) {
>>  G> s/%%/%\n/;
>>  G> print $gh, $_;
>>  >> 
>>  >> the file handle arg to print doesn't get a comma afterwards
>>  >> 
>>  G> }
>>  G> close($fh)
>>  G> close($gh)
>>
>>  CM> More missing semicolons here that gave him the errors on the close 
>>  CM> statements.
>>
>>and no one mentioned this is a trivial one liner (untested):
>>
>>perl -pe 's/%%/%\n/' <infile >outfile
>>
>>i don't know the file format so maybe a /g is needed there too.
>>
>>uri
>It's all that typing in one-liners, over and over and over again.
>
>sln

Why don't you make a batch file, then you only need to type the batch file
name and parameters over and over and over and over and over again.
Still more lines. Another line for that line? another line, why not...

sln


------------------------------

Date: Thu, 15 Jan 2009 12:13:32 -0800
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: The single biggest STUPIDITY in Perl ...
Message-Id: <150120091213321081%jimsgibson@gmail.com>

In article <x73afk76fe.fsf@mail.sysarch.com>, Uri Guttman
<uri@stemsystems.com> wrote:

> >>>>> "BC" == Bernie Cosell <bernie@fantasyfarm.com> writes:
> 
>   BC> Uri Guttman <uri@stemsystems.com> wrote:
>   BC> } but the number of differences are more common:
>   BC> } 
>   BC> }   arrays        lists
>   BC> }   ------        -----


>   BC> You left out:
>   BC>     evaluates to # of elements    evaluates to last element
>   BC>      in scalar context              in scalar context
> 
> incorrect. there is no such thing as a list in scalar context. it is
> just a series of comma ops in that situation, no list is ever created.

Here is something to ponder:

#!/usr/bin/perl

use strict;
use warnings;

sub returns_a_list
{
  return ('a', 'b', 'c');
}

sub returns_an_array
{
  my @r = ('d', 'e', 'f');
  return @r;
}

my @a = returns_a_list();
print "a=@a\n";
my $x = returns_a_list();
print "x=$x\n";
my @b = returns_an_array();
print "b=@b\n";
my $y = returns_an_array();
print "y=$y\n";

__OUTPUT__

a=a b c
x=c
b=d e f
y=3

Note the difference between the values assigned to x and y!

-- 
Jim Gibson


------------------------------

Date: Thu, 15 Jan 2009 15:25:41 -0500
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: The single biggest STUPIDITY in Perl ...
Message-Id: <x7zlhs48qi.fsf@mail.sysarch.com>

>>>>> "JG" == Jim Gibson <jimsgibson@gmail.com> writes:

  JG> In article <x73afk76fe.fsf@mail.sysarch.com>, Uri Guttman
  JG> <uri@stemsystems.com> wrote:

  >> >>>>> "BC" == Bernie Cosell <bernie@fantasyfarm.com> writes:

  BC> You left out:
  BC> evaluates to # of elements    evaluates to last element
  BC> in scalar context              in scalar context
  >> 
  >> incorrect. there is no such thing as a list in scalar context. it is
  >> just a series of comma ops in that situation, no list is ever created.

  JG> Here is something to ponder:

  JG> sub returns_a_list
  JG> {
  JG>   return ('a', 'b', 'c');
  JG> }

  JG> sub returns_an_array
  JG> {
  JG>   my @r = ('d', 'e', 'f');
  JG>   return @r;
  JG> }

  JG> my @a = returns_a_list();
  JG> print "a=@a\n";
  JG> my $x = returns_a_list();
  JG> print "x=$x\n";
  JG> my @b = returns_an_array();
  JG> print "b=@b\n";
  JG> my $y = returns_an_array();
  JG> print "y=$y\n";

  JG> a=a b c
  JG> x=c
  JG> b=d e f
  JG> y=3

  JG> Note the difference between the values assigned to x and y!

sure. and that defends my point. the context of the call determines what
the sub returns. the context says list or scalar and that propagates to
the return code. no different than saying @x = (1, 2, 3 ) vs $x =
(1,2,3). this is documented behavior.
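
the call context can be made visible with the wantarray builtin. a small
sketch follows; the sub and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Report the context this sub was called in, using the core wantarray builtin.
sub context_of_call {
    return wantarray ? 'list' : defined(wantarray) ? 'scalar' : 'void';
}

my @in_list   = context_of_call();    # list context
my $in_scalar = context_of_call();    # scalar context

# In scalar context the comma operator yields its last operand,
# while an array yields its element count.
my @array      = ('d', 'e', 'f');
my $last_value = sub { return ('a', 'b', 'c') }->();
my $count      = @array;

print "$in_list[0] $in_scalar $last_value $count\n";
```

this prints "list scalar c 3", matching the x and y values above.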

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  --------  http://www.sysarch.com --
-----  Perl Code Review , Architecture, Development, Training, Support ------
--------- Free Perl Training --- http://perlhunter.com/college.html ---------
---------  Gourmet Hot Cocoa Mix  ----  http://bestfriendscocoa.com ---------


------------------------------

Date: Thu, 15 Jan 2009 23:06:56 GMT
From: sln@netherlands.com
Subject: What do you need to have to be considered a Master at Perl?
Message-Id: <s4gvm49sckabo1nt3a6rpmsluqo00kna75@4ax.com>

I'm just going to jump-start the topic with these:

- Technically expert at Regular Expressions
- Analytical and creative with Regular Expressions
- Thoroughly adept at map/split/grep, know limitations
- Arrays and Hashes, including slices and references

Add some more...

sln


------------------------------

Date: Thu, 15 Jan 2009 19:23:38 GMT
From: hymie@lactose.homelinux.net (hymie!)
Subject: XML::Parser help
Message-Id: <_2Mbl.90214$zJ2.84288@newsfe23.iad>

Greetings.

I have a perl script that uses XML::Parser.  I don't understand fully
how it works.  Alas, that doesn't stop me from having to maintain it.

$self is a large reference that carries around a whole bunch of stuff --
data, functions, etc.

I see where my script sets up some handlers including
        End     => sub {$self->endElement(@_)},

I see where the endElement function has some arguments
    my ($expat, $el) = @_;

I see where we are checking $el for the name of the node we are processing
and taking entries out of that node.
    if($el eq "NewsLines") {
        my $v = delete $self->item->{HeadLine};
        $self->doc->{"HEADLINE"} = $v if defined($v);

The problem I'm having is that knowing I'm in a node named "NewsLines" is
not sufficient.  This horrible XML file is a bastardization of NewsML.
As such, NewsLines could appear in any of these places:

/NewsML/NewsItem/NewsComponent/NewsLines
/NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines
/NewsML/NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsLines

I need to ensure that I am looking at either
/NewsML/NewsItem/NewsComponent/NewsLines
or
/NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines
but **never**
/NewsML/NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsLines

and my knowledge of XML and/or XML::Parser is not that strong.

Can somebody give me a push?

Thanks.

--hymie!    http://lactose.homelinux.net/~hymie    hymie@lactose.homelinux.net
------------------------ Without caffeine for 807 days ------------------------


------------------------------

Date: Thu, 15 Jan 2009 12:28:27 -0800
From: Jim Gibson <jimsgibson@gmail.com>
Subject: Re: XML::Parser help
Message-Id: <150120091228274807%jimsgibson@gmail.com>

In article <_2Mbl.90214$zJ2.84288@newsfe23.iad>, hymie!
<hymie@lactose.homelinux.net> wrote:

> Greetings.
> 
> I have a perl script that uses XML::Parser.  I don't understand fully
> how it works.  Alas, that doesn't stop me from having to maintain it.
> 
> $self is a large reference that carries around a whole bunch of stuff --
> data, functions, etc.
> 
> I see where my script sets up some handlers including
>         End     => sub {$self->endElement(@_)},
> 
> I see where the endElement function has some arguments
>     my ($expat, $el) = @_;
> 
> I see where we are checking $el for the name of the node we are processing
> and taking entries out of that node.
>     if($el eq "NewsLines") {
>         my $v = delete $self->item->{HeadLine};
>         $self->doc->{"HEADLINE"} = $v if defined($v);
> 
> The problem I'm having is that knowing I'm in a node named "NewsLines" is
> not sufficient.  This horrible XML file is a bastardization of NewsML.
> As such, NewsLines could appear in any of these places:
> 
> /NewsML/NewsItem/NewsComponent/NewsLines
> /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines
> /NewsML/NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsLines
> 
> I need to ensure that I am looking at either
> /NewsML/NewsItem/NewsComponent/NewsLines
> or
> /NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines
> but **never**
> /NewsML/NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsLines
> 
> and my knowledge of XML and/or XML::Parser is not that strong.
> 
> Can somebody give me a push?

Does your script declare a "Start" handler as well as "End"? You will
need one to keep track of all of the tags that are above "NewsLines" in
your XML hierarchy. Try printing the element name in your Start and End
handlers and see what you get. For a nicely formatted version, declare
a global level counter, initialize it to zero, increment it in Start,
decrement it in End, and precede the element with that many spaces
(e.g. print ' ' x $level, "$el\n").
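
The tracing idea above might be sketched roughly as follows. This is only a minimal illustration, assuming XML::Parser is installed; the sample document, the @lines array, and the one-space-per-level indent are made up for the example and are not taken from the original script.

```perl
use strict;
use warnings;
use XML::Parser;

my $level = 0;   # current nesting depth (hypothetical global counter)
my @lines;       # trace output, one entry per start or end tag

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $el) = @_;
            push @lines, (' ' x $level) . "<$el>";
            $level++;                 # children are one level deeper
        },
        End => sub {
            my ($expat, $el) = @_;
            $level--;                 # back to the depth of the start tag
            push @lines, (' ' x $level) . "</$el>";
        },
    },
);

# Made-up sample input, just deep enough to show the indentation.
$parser->parse('<NewsML><NewsItem><NewsComponent><NewsLines/>'
             . '</NewsComponent></NewsItem></NewsML>');

print "$_\n" for @lines;
```

Running something like this against the real file should show at a glance how deeply each NewsLines is nested.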

You can set up another global counter (or an element of $self) that
counts how many levels of "NewsComponent" you are currently in and only
process the NewsLines tag if the count is less than 3.
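
A rough sketch of that counter, under the assumption that XML::Parser is available. The names $nc_depth and @accepted and the sample XML are invented for illustration; in the real script the counter would more likely live in $self and the accepted branch would run the existing HeadLine code.

```perl
use strict;
use warnings;
use XML::Parser;

my $nc_depth = 0;   # number of NewsComponent elements currently open
my @accepted;       # NewsComponent depth of each NewsLines we would process

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $el) = @_;
            $nc_depth++ if $el eq 'NewsComponent';
        },
        End => sub {
            my ($expat, $el) = @_;
            $nc_depth-- if $el eq 'NewsComponent';
            # Only NewsLines under one or two NewsComponent levels qualify.
            push @accepted, $nc_depth
                if $el eq 'NewsLines' && $nc_depth < 3;
        },
    },
);

# Made-up sample with NewsLines at all three nesting depths.
my $xml = <<'XML';
<NewsML><NewsItem>
 <NewsComponent>
  <NewsLines/>
  <NewsComponent>
   <NewsLines/>
   <NewsComponent><NewsLines/></NewsComponent>
  </NewsComponent>
 </NewsComponent>
</NewsItem></NewsML>
XML

$parser->parse($xml);
print "accepted NewsLines at NewsComponent depth: @accepted\n";
```

If I read the XML::Parser::Expat documentation correctly, $expat->within_element('NewsComponent') reports the same count without a hand-kept counter, but check that against your installed version before relying on it.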

-- 
Jim Gibson


------------------------------

Date: Thu, 15 Jan 2009 21:31:08 GMT
From: sln@netherlands.com
Subject: Re: XML::Parser help
Message-Id: <ls9vm4lkurr9lflgtkb5l8ci19numaotj9@4ax.com>

On Thu, 15 Jan 2009 19:23:38 GMT, hymie@lactose.homelinux.net (hymie!) wrote:

>Greetings.
>
>I have a perl script that uses XML::Parser.  I don't understand fully
>how it works.  Alas, that doesn't stop me from having to maintain it.
>
>$self is a large reference that carries around a whole bunch of stuff --
>data, functions, etc.
>
>I see where my script sets up some handlers including
>        End     => sub {$self->endElement(@_)},
>
>I see where the endElement function has some arguments
>    my ($expat, $el) = @_;
>
>I see where we are checking $el for the name of the node we are processing
>and taking entries out of that node.
>    if($el eq "NewsLines") {
>        my $v = delete $self->item->{HeadLine};
>        $self->doc->{"HEADLINE"} = $v if defined($v);
>
>The problem I'm having is that knowing I'm in a node named "NewsLines" is
>not sufficient.  This horrible XML file is a bastardization of NewsML.
>As such, NewsLines could appear in any of these places:
>
>/NewsML/NewsItem/NewsComponent/NewsLines
>/NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines
>/NewsML/NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsLines
>
>I need to ensure that I am looking at either
>/NewsML/NewsItem/NewsComponent/NewsLines
>or
>/NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines
>but **never**
>/NewsML/NewsItem/NewsComponent/NewsComponent/NewsComponent/NewsLines
>
>and my knowledge of XML and/or XML::Parser is not that strong.
>
>Can somebody give me a push?
>
>Thanks.
>
>--hymie!    http://lactose.homelinux.net/~hymie    hymie@lactose.homelinux.net
>------------------------ Without caffeine for 807 days ------------------------

In XML terms, the paths you have laid out

/NewsML/NewsItem/ .... / ....

only describe a relationship if the tags are nested, mamma tag, baby tag...

Theoretically, you could push each tag onto a stack at its start and pop
it at its end, starting from the NewsML start tag and stopping at the
NewsML end tag.

Then at each new start/end you have the current stack, where you can

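# 'WhatImLookingFor' is a placeholder for the full path you want,
# e.g. 'NewsML/NewsItem/NewsComponent/NewsLines'
if (join ('/', @stack) eq 'WhatImLookingFor')
{
	# you are where you want to be
	$process_flag = 'Time to process sub nodes';
	$stop_processing_stack = 1; # until we get closure on this node of course
}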
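
To make the stack idea concrete, here is one way it could look. It is only a sketch, assuming XML::Parser is installed; @stack, %wanted, @matched and the sample XML are names and data invented for the example. The two accepted paths are the ones given in the original question.

```perl
use strict;
use warnings;
use XML::Parser;

my @stack;      # names of the currently open elements, outermost first
my @matched;    # full paths of the NewsLines elements we would process

# The two locations the original question wants to accept.
my %wanted = map { $_ => 1 } (
    '/NewsML/NewsItem/NewsComponent/NewsLines',
    '/NewsML/NewsItem/NewsComponent/NewsComponent/NewsLines',
);

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $el) = @_;
            push @stack, $el;
        },
        End => sub {
            my ($expat, $el) = @_;
            # The closing element is still on top of our own stack here.
            my $path = '/' . join('/', @stack);
            push @matched, $path if $wanted{$path};
            pop @stack;
        },
    },
);

# Made-up sample with NewsLines at all three nesting depths.
my $xml = <<'XML';
<NewsML><NewsItem>
 <NewsComponent>
  <NewsLines/>
  <NewsComponent>
   <NewsLines/>
   <NewsComponent><NewsLines/></NewsComponent>
  </NewsComponent>
 </NewsComponent>
</NewsItem></NewsML>
XML

$parser->parse($xml);
print "$_\n" for @matched;
```

Joining with '/' rather than the empty string keeps element names from running together, so the comparison stays unambiguous.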

I have a program that captures xml from handlers and/or turns off capture.
This allows sub node processing later, in order to extract the data when parsing is done.
It amazes me how much bloated code this saves the user.

sln





------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2131
***************************************
