[16688] in Perl-Users-Digest


Perl-Users Digest, Issue: 4100 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Aug 22 21:10:46 2000

Date: Tue, 22 Aug 2000 18:10:30 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <966993028-v9-i4100@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Tue, 22 Aug 2000     Volume: 9 Number: 4100

Today's topics:
    Re: Rationale Behind 'Use of Uninitialized Value' Warni (Craig Berry)
    Re: Rationale Behind 'Use of Uninitialized Value' Warni <ren.maddox@tivoli.com>
    Re: Rationale Behind 'Use of Uninitialized Value' Warni <grichard@uci.edu>
    Re: Rationale Behind 'Use of Uninitialized Value' Warni (Mike Stok)
    Re: Rationale Behind 'Use of Uninitialized Value' Warni <tina@streetmail.com>
    Re: regexing html-like tags <blair@geo-NOSPAM-soft.org>
    Re: regexing html-like tags <callgirl@la.znet.com>
    Re: regexing html-like tags <lr@hpl.hp.com>
    Re: regexing html-like tags <elijah@workspot.net>
        Sitescooper in Mandrake jpai@rocketmail.com
    Re: Sorting by a subfield (WAS: Re: This is my last que <lr@hpl.hp.com>
    Re: Sorting by a subfield (WAS: Re: This is my last que <lr@hpl.hp.com>
    Re: Sorting by a subfield (WAS: Re: This is my last que pape_98@my-deja.com
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 22 Aug 2000 22:33:19 GMT
From: cberry@cinenet.net (Craig Berry)
Subject: Re: Rationale Behind 'Use of Uninitialized Value' Warning
Message-Id: <sq5vtf3vt9159@corp.supernews.com>

Gabe (grichard@uci.edu) wrote:
: I have not been using -w because I don't like getting all the "Use of
: Uninitialized Value" warnings. What's the value of the warning? Why is it
: better to say "my $foo = '';" as opposed to "my $foo;"? The latter is
: quicker to type, and doesn't seem to increase my probability of making
: programming errors. So what gives?

This is a stylistic and debugging question.  I have found that by
enforcing the discipline of insisting on explicit value assignments, I
both write more readable code and catch more bugs, more easily.  YMMV.

-- 
   |   Craig Berry - http://www.cinenet.net/~cberry/
 --*--  "Every force evolves a form."
   |              - Shriekback


------------------------------

Date: 22 Aug 2000 17:04:16 -0500
From: Ren Maddox <ren.maddox@tivoli.com>
Subject: Re: Rationale Behind 'Use of Uninitialized Value' Warning
Message-Id: <m3itst7y1r.fsf@dhcp11-177.support.tivoli.com>

"Gabe" <grichard@uci.edu> writes:

> I have not been using -w because I don't like getting all the "Use of
> Uninitialized Value" warnings. What's the value of the warning? Why is it
> better to say "my $foo = '';" as opposed to "my $foo;"? The latter is
> quicker to type, and doesn't seem to increase my probability of making
> programming errors. So what gives?

I have certainly had that warning highlight bugs in my code before.

My take on this is that it is not intended to lead you to use "my $foo
= '';" instead of "my $foo;".  Rather, the intent is to alert you when
you have assumed that a variable has a value, but it doesn't.  So, if
you want a particular variable initialized to a null string, do so.
However, if you are expecting that variable to contain a derived
value, then don't initialize it.  Then the warning will let you know
that there is some problem in the flow of logic that allows the
variable to not have a value.

I realize that this is not very clear, and I've tried to create an
example twice without success.  Here is one more try:

#!/usr/bin/perl -w

use strict;

my $x;

if ( 0 ) {
    $x = 3;
} elsif ( undef ) {
    $x = 4;
} else {
    # oops, forgot to set $x
}

print "The result is $x\n";



In this case, the warning lets me know that there is a code path that
allows $x not to be set, which I do not want.  Too bad there isn't a
way to have it check all code paths heuristically or something....

-- 
Ren Maddox
ren@tivoli.com


------------------------------

Date: Tue, 22 Aug 2000 16:22:05 -0700
From: "Gabe" <grichard@uci.edu>
Subject: Re: Rationale Behind 'Use of Uninitialized Value' Warning
Message-Id: <8nv28p$b2c$1@news.service.uci.edu>

Craig Berry <cberry@cinenet.net> wrote in message
news:sq5vtf3vt9159@corp.supernews.com...

> This is a stylistic and debugging question.  I have found that by
> enforcing the discipline of insisting on explicit value assignments, I
> both write more readable code and catch more bugs, more easily.  YMMV.

Hmmm. OK, how do you catch more bugs? Here's the situation. I want to assign
the variable in a loop, but I want its scope to be out of the loop so I can
do something with it so my code is like:

my $foo;

while (condition) {
    if (condition) {$foo = 'foo'};
}

#do something with $foo...

Now why would it be better to say

my $foo = '';

instead?

That is the question.

Gabe





------------------------------

Date: Tue, 22 Aug 2000 23:37:49 GMT
From: mike@stok.co.uk (Mike Stok)
Subject: Re: Rationale Behind 'Use of Uninitialized Value' Warning
Message-Id: <h9Eo5.14698$K5.224511@typhoon.austin.rr.com>

In article <8nurbe$80e$1@news.service.uci.edu>, Gabe <grichard@uci.edu> wrote:
>I have not been using -w because I don't like getting all the "Use of
>Uninitialized Value" warnings. What's the value of the warning? Why is it
>better to say "my $foo = '';" as opposed to "my $foo;"? The latter is
>quicker to type, and doesn't seem to increase my probability of making
>programming errors. So what gives?

There are some places where the use of an uninitialised variable *doesn't*
trigger a warning, and these are relatively common places where the default
empty string or 0 numeric value is a reasonable starting place.  These places
include a counter being incremented by ++, a buffer being accumulated using
.=, and logical tests (where the default value is false).

This program works OK:

#!/usr/local/bin/perl -w

use strict;

my ($text, $lines);

while (<STDIN>) {
    $text .= $_;
    $lines++;
}

if ($lines) {
    print "You typed $lines line(s) containing '$text'\n";
}

Once the variables have been used then perl knows how to treat them (as numbers
or strings).

If you comment out the if ($lines) { line and its associated } and feed the 
program no input then it gripes:

[mike@ratdog tmp]$ perl try.pl </dev/null
Use of uninitialized value in concatenation (.) at try.pl line 13.
Use of uninitialized value in concatenation (.) at try.pl line 13.
You typed  line(s) containing ''

On a human level the warning makes sense: should $lines be treated as an empty
string or as the value 0 in an interpolated string?

In general I like sticking with use strict; and using -w during development 
(as I never know what new warnings are going to appear when a new version of
perl comes along).

It's a reasonable discipline, and if there are places where you are
deliberately manipulating undefined values in ways which provoke warnings then 

  {
    local $^W; # implicitly undef, but you can say $^W = 0 if you like

    # "dangerous code"
  }

will flag your code (and you can refine which warnings are turned off in recent
perls - check perlvar and warnings man pages).  This serves to alert the
maintainer that they should pay attention to the code there.
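
As a rough sketch of the finer-grained control mentioned above, assuming perl
5.6 or later (where the lexical warnings pragma is available), one category
can be switched off for one block only.  The helper name join_fields is made
up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;    # lexical warnings; assumes perl 5.6 or later

# Hypothetical helper: join fields, treating undefined ones as empty.
sub join_fields {
    my @fields = @_;
    # Silence only the 'uninitialized' category, and only inside this sub.
    no warnings 'uninitialized';
    return join ',', @fields;
}

print join_fields('a', undef, 'c'), "\n";
```

Every other warning category stays active inside the block, and the
'uninitialized' warning itself is still reported everywhere outside it.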

In general the things perl warns about are things which can get you into
trouble later, so it's up to you.

Hope this helps,

Mike
-- 
mike@stok.co.uk                    |           The "`Stok' disclaimers" apply.
http://www.stok.co.uk/~mike/       |
GPG PGP Key 1024D/059913DA         | Fingerprint      0570 71CD 6790 7C28 3D60
stok@colltech.com (CT - work)      |                  75D2 9EC4 C1C0 0599 13DA


------------------------------

Date: 23 Aug 2000 00:39:09 GMT
From: Tina Mueller <tina@streetmail.com>
Subject: Re: Rationale Behind 'Use of Uninitialized Value' Warning
Message-Id: <8nv6fc$9nhu1$2@ID-24002.news.cis.dfn.de>

hi,
Gabe <grichard@uci.edu> wrote:
> Craig Berry <cberry@cinenet.net> wrote in message
> news:sq5vtf3vt9159@corp.supernews.com...

>> This is a stylistic and debugging question.  I have found that by
>> enforcing the discipline of insisting on explicit value assignments, I
>> both write more readable code and catch more bugs, more easily.  YMMV.

> Hmmm. OK, how do you catch more bugs? Here's the situation. I want to assign
> the variable in a loop, but I want its scope to be out of the loop so I can
> do something with it so my code is like:

> my $foo;
> while (condition) {
>     if (condition) {$foo = 'foo'};
> }
> #do something with $foo...

> Now why would it be better to say
> my $foo = '';
> instead?

i wouldn't do that.
it depends on what you want to do with $foo.
why not test it with defined()?
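
a minimal sketch of that idea, with a made-up helper name (first_match) and
made-up input: the variable is left undefined on purpose, and the caller
checks it with defined() instead of pre-loading an empty string.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: return the first line matching $pattern, or undef.
sub first_match {
    my ($pattern, @lines) = @_;
    my $found;                      # deliberately left undefined
    foreach my $line (@lines) {
        if ($line =~ $pattern) {
            $found = $line;
            last;
        }
    }
    return $found;
}

my $foo = first_match(qr/foo/, 'bar', 'baz');
if (defined $foo) {
    print "Found: $foo\n";
} else {
    print "Nothing matched\n";
}
```

this way undef means "never assigned", which stays distinguishable from a
legitimately empty string.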

tina

-- 
http://tinita.de    \  enter__| |__the___ _ _ ___
tina's moviedatabase \     / _` / _ \/ _ \ '_(_-< of
search & add comments \    \__,_\___/\___/_| /__/ perception
please don't email unless offtopic or followup is set. thanx


------------------------------

Date: Tue, 22 Aug 2000 22:23:15 GMT
From: "Blair Heuer" <blair@geo-NOSPAM-soft.org>
Subject: Re: regexing html-like tags
Message-Id: <n3Do5.730$yH2.37445@newsread2.prod.itd.earthlink.net>

> > Sheesh, it's either one end of the spectrum or the other on this
> > newsgroup: "not enough information", "too little information."
>
> I did not say "not enough...." nor did I say "too little...."
>
> I said you have not explained what you want, clearly and concisely.
> Pay attention and give some thought to what you want to write
> well before wetting the paper with quill ink. You still have
> not explained clearly and concisely what you want. Bonehead
> English would be an appropriate course of study for you.
> This will supplement your graduate status from The Sears,
> Roebuck & Company Academy Of Language Arts.



> > That skeleton in no way does anything what I need.
>
> My skeleton script does exactly what you want per
> your stated parameters, at least those parameters
> which are comprehensible. Appears you lack enough
> experience in Perl programming to realize my small
> script does not only what you want, but more.

Oh, I am sorry, is this the English perl newsgroup, because what I wrote was
clear enough for most people to understand. Your skeleton script found the
complete tags. Well, I could already do that. But it was "unclear", yet you
say that "I am trying to write a regular expression that is able to parse a
variable for "tags" and return the tag attributes and values." is
"comprehensible [and, only this.]"  Okay, so it says I was trying to write a
way to get the tag attributes and values, which is not even touched upon at
all by the skeleton script. Though I do thank you for trying.

> It also appears you expect to be served all you want
> on a well polished silver platter. You will be
> quite lucky to have me serve you with a paper plate.
> You will be even more lucky if I don't severely beat
> you with my kitchen spatula.

You say this, and yet you scream at Larry Rosler for not custom tailoring
his code exactly to my needs. I took his code on a "paper plate" and was
able to use it. I just wanted a basic way to parse the tags, I did not say
"Create a template processing script for me and have it on my desk by 8 AM!"

> > I will shorten my query with the hopes of making it
> > easier to understand
>
> It is my long experience as a woman, shorter is
> quite often not better.

Funny. So cliche.

> (if you say too little information.... I will kill you :) ).
>
> My boyfriend, an amateur boxer and a university professor,
> has been training me in boxing, in martial arts and in
> mind munching for well over a decade now. You may find
> killing me, to be quite challenging, especially with my
> having survived the mean streets of East L.A. during my
> teenage and young adult years.
>
> You will not survive my mind munching, Chollo Loco.

I could just play cheap. There are more ways to kill someone rather than to
box/fight them one-on-one.

> > Say I have:
>
> Your unconscious use of idioms in writing beguiles you everytime.
> Concealment of writing style is best left to well educated experts
> such as myself.

Can you see me right now? I am laughing.

> >     $template = "[out name=bob age="44" comment="This is my comment!"]; #
>
> This snippet is a joke, right? Ha. Ha. Funny. * rolls eyes *

No, just random.

> > What should I do to put the name/value pairs of that tag (name: bob,
> > age: 44, comment: This is my comment! ) into a hash.
>
> You do mean an associative array, correct?
> Hash is what I serve coyotes around here.

You say potato, I say hash.

> Do you see,
>
> "Idjit Slave Girl"
>
> tattooed on my forehead? Figure out how to do this yourself.

I did not force you to reply. That would be a good motto for a place where
people go to help. "Figure out how to do this yourself." Genius.

> While you are figuring this out, silently and repeatedly chant,
> "Kira is a superior programmer than myself, as always and as always be."

Without a doubt you are probably better than me at the current time. I have
only about a year and a half experience in perl and learn more and more as
time goes by. But I will never chant that. ;)

-Blair




------------------------------

Date: Tue, 22 Aug 2000 15:33:38 -0700
From: "Godzilla!" <callgirl@la.znet.com>
Subject: Re: regexing html-like tags
Message-Id: <39A2FFC2.44823440@la.znet.com>

Blair Heuer wrote:

(snipped incomprehensible senseless whatevers)


Frank, do you actually believe I would lend 
credibility to this smelly mule manure of yours?
Shirley, you don't believe this.

Oh Kira! Such a great double ended pun!

However, this oxymoronic self-contradicting humor
you sling like sweat with each spin of your head,
certainly has some laughable credibility.

Oh my Kira! Such a collection of stunning double
whammy puns! Your hindsight is too much!

Boy howdy, is it ever.


Godzilla!


------------------------------

Date: Tue, 22 Aug 2000 16:07:49 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: regexing html-like tags
Message-Id: <MPG.140ca7a028eeec8f98acc4@nntp.hpl.hp.com>

In article <n3Do5.730$yH2.37445@newsread2.prod.itd.earthlink.net> on 
Tue, 22 Aug 2000 22:23:15 GMT, Blair Heuer <blair@geo-NOSPAM-soft.org> 
says...

[Godzilla! attribution incorrectly omitted.  Prefix is '> > '.]

 ...

> ... Though I do thank you for trying.

Why?  Do you perceive a net benefit?  I don't, and I doubt anyone else 
does.

 ...

> > > What should I do to put the name/value pairs of that tag (name: bob,
> > > age: 44, comment: This is my comment! ) into a hash.
> >
> > You do mean an associative array, correct?
> > Hash is what I serve coyotes around here.
> 
> You say potato, I say hash.

'Associative array' is what this structure is called in awk, inherited 
by Perl.  In Perl 5, it was decided[1] that seven syllables was too many 
for such an important concept, and the functionally-descriptive name was 
replaced by a one-syllable name more descriptive of the implementation.  
Too bad, IMO, but that's the way it is.

[1] Note careful use of the passive voice to avoid imputing blame to any 
individual or group. :-]

The criticism of 'hash' demonstrates the degree to which Godzilla!'s 
head (or whatever) is stuck in Perl 4.
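
For the question quoted above (getting the name/value pairs of one tag into
a hash), here is a hedged sketch rather than anything posted in this thread.
The helper name tag_attributes is hypothetical, and it assumes values are
either bare words or double-quoted strings with no escaped quotes inside:

```perl
use strict;
use warnings;

# Hypothetical helper: parse name=value pairs out of one bracketed tag.
# Assumes values are bare words or double-quoted strings (no escaped quotes).
sub tag_attributes {
    my ($tag) = @_;
    my %attr;
    while ($tag =~ /(\w+)=(?:"([^"]*)"|([^\s\]]+))/g) {
        # $2 is set for a quoted value, $3 for a bare one.
        $attr{$1} = defined $2 ? $2 : $3;
    }
    return %attr;
}

my %attr = tag_attributes('[out name=bob age="44" comment="This is my comment!"]');
print "$_: $attr{$_}\n" for sort keys %attr;
```

A regex like this is only adequate for simple, well-formed tags; anything
with nesting or escaping would call for a real parser.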

 ...

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: 23 Aug 2000 00:26:11 GMT
From: Eli the Bearded <elijah@workspot.net>
Subject: Re: regexing html-like tags
Message-Id: <eli$0008222020@qz.little-neck.ny.us>

In comp.lang.perl.misc, Blair Heuer <blair@geo-NOSPAM-soft.org> wrote:
> > Below is some code from it you can use or just study.
> Thanks. Those regexes will probably come in handy.

There were some bugs in the code I posted. I was editing one of the
blocks for size, and screwed it up.

This:
      while (length($line)) {
        if ($line =~ s/(.+?)<\?//) {
          print OUT $once.$1 unless $depends;
          $once = '';
        }

Should be:
      while (length($line)) {
        if ($line =~ s/(.+?)<\?/<?/) {
          print OUT $once.$1 unless $depends;
          $once = '';
        }

And this:
      # } end processing a command
    } else {
      print OUT $once.$line unless $depends;
    }
  } # while IN

Should be:
      # } end processing a command
    } else {
      print OUT $indent.$line unless $depends;
    }
  } # while IN

> Also, thanks for being 100 times more polite than Godzilla. :)

De nada.

Elijah
------
you're lucky I have /r(eg(ular)?.?)?exp?/ in my subject auto-selection


------------------------------

Date: Tue, 22 Aug 2000 22:37:52 GMT
From: jpai@rocketmail.com
Subject: Sitescooper in Mandrake
Message-Id: <8nuvbu$jtp$1@nnrp1.deja.com>

Hi all,

	I am trying to install Sitescooper (3.0) in a Mandrake distribution
(7.0). Everything went well. When I run it, it complains with the
following message:


Reading configuration from "/root/.sitescooper/sitescooper.cf".
Can't locate LWP/UserAgent.pm in @INC (@INC contains:
/usr/lib/perl5/5.00503/i386-linux /usr/lib/perl5/5.00503
/usr/lib/perl5/site_perl/5.005/i386-linux /usr/lib/perl5/site_perl/5.005
 . /usr/bin/lib /usr/bin/site_perl /usr/bin/../share/sitescooper/lib
/usr/bin/../share/sitescooper/site_perl) at Scoop.pm line 701.


I know next to nothing about perl. I would appreciate it if somebody could
give me a hint on how to fix this and make it work.


Thanks,



Sent via Deja.com http://www.deja.com/
Before you buy.


------------------------------

Date: Tue, 22 Aug 2000 15:30:38 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: Sorting by a subfield (WAS: Re: This is my last question, I swear!!!!!!!!!!)
Message-Id: <MPG.140c9eed6e37eb8598acc2@nntp.hpl.hp.com>

[Please use the style of most newsgroups, and quote selectively *before* 
your comments or questions.  I have rearranged and clipped this for 
logical flow.]

In article <8nur1v$euv$1@nnrp1.deja.com> on Tue, 22 Aug 2000 21:24:32 
GMT, pape_98@my-deja.com <pape_98@my-deja.com> says...
> In article <MPG.140b3086cf77ae4d98acaf@nntp.hpl.hp.com>,
>   Larry Rosler <lr@hpl.hp.com> wrote:

 ...

> > > NIH,10B-410,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > NIH,6B-4,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > Suburban,6C-258,52 51,5256,15 13,152
> > > Suburban,ACardiology,52 51,5256,15 13,152
> > > NIH,9B-4,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > NIH,15-410,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > NIH,60B-410,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > NIH,B1D-416,52,135 6,1513,52 hi,
> > > NIH,B1D-43,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > Suburban,10C-58,52 51,5256,15 13,152
> > > Suburban,1B-29,52 51,5256,15 13,152
> > > NIH,B1D-403,01 36,13 5 26,15 43,1 5 2 5 2 4
> > > NIH,B1D-410,52 51 36,135 256,15413,1512
> > > Suburban,6B-281,52 51,5256,15 13,152
> > > Suburban,Office,52 51,5256,15 13,152

 ...

> I find it difficult to understand what the programmer is saying in
> these lines:
> 
> > @sorted = map "${\ (unpack 'na*', $_)[-1] }" => sort map pack
> > ('na*', /B1D-(\d+),/?$1:0, $_ ) => @unsorted;
> 
> The only thing I can decipher from this is that I each time I sort, the
> expression B1D- is being removed. The reason why I can't do much with
> the rest is that I don't quite know what it means.
> 
> Would someone like to explain it to me.

OK, I'll be happy to.

Working from back to front, the regex extracts the string of digits, if 
any, following the first occurrence of 'B1D-' in each line of input 
data; if there is no such string, it uses 0.  The pack() function then 
converts that number into a two-byte binary number and prepends it to 
the original string.  The sort() then sorts the strings 
lexicographically.  The unpack() then recovers the original string.  (As I 
said in another post, I would simply have used substr($_, 2).)

> And this doesn't allow to perform a "correct" numerical sort does it??

Yes, it does, because the two-byte binary-encoded number has the high-
order byte first, followed by the low-order byte.  So lexicographic 
sorting gives the same result as numeric sorting.  This is one of the 
main points of the Guttman-Rosler Transform -- default lexicographical 
sorting is the fastest (though in Perl 5.6.0, default numeric sorting is 
fast also).
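
To illustrate the byte-order point, here is a small sketch with made-up
sample numbers.  It sorts the same values once with big-endian ('n') keys
and once with little-endian ('v') keys, using the default string sort both
times:

```perl
use strict;
use warnings;

# Sample values chosen for illustration; all fit in 16 bits.
my @numbers = (416, 43, 403, 410, 0, 258);

# Big-endian ('n') keys: plain string sort agrees with numeric sort.
my @by_n = map { unpack 'n', $_ } sort map { pack 'n', $_ } @numbers;

# Little-endian ('v') keys: plain string sort compares the low byte first.
my @by_v = map { unpack 'v', $_ } sort map { pack 'v', $_ } @numbers;

print "n: @by_n\n";
print "v: @by_v\n";
```

With 'n' the result is in numeric order; with 'v' the value 258 (low byte
0x02) sorts ahead of 43 (low byte 0x2B).  Note that 'n' only holds values
from 0 to 65535.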

But the problem you added later is that each string no longer contains 
'B1D-'.  You have to adapt the regex to accommodate the new data.

One possibility is simply /-(\d+),/, which works for all but two of the 
records; those two get a key of 0 and so end up at the front of the 
sorted list, ordered lexicographically among themselves.

Here it is, somewhat reformulated and reformatted.  This may make the 
generation and prepending and later stripping of the two-byte sortkey 
clearer.

    @sorted = map substr($_, 2) => sort
        map pack(n => /-(\d+),/ ? $1 : 0) . $_ => @unsorted;

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Tue, 22 Aug 2000 15:44:49 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: Sorting by a subfield (WAS: Re: This is my last question, I swear!!!!!!!!!!)
Message-Id: <MPG.140ca2388f19f56e98acc3@nntp.hpl.hp.com>

In article <8nusrh$dc0$5@slb3.atl.mindspring.net> on 22 Aug 2000 
21:54:57 GMT, Eric Bohlman <ebohlman@netcom.com> says...
> pape_98@my-deja.com wrote:
> : I find it difficult to understand what the programmer is saying in
> : these lines:
> : 
> : > @sorted = map "${\ (unpack 'na*', $_)[-1] }" => sort map pack
> : > ('na*', /B1D-(\d+),/?$1:0, $_ ) => @unsorted;

 ...

> The sort sorts that list.  Sorting numbers in packed binary form, even 
> with Perl's default lexicographic comparison routine, will put them in 
> proper numeric order.

Note that one must use big-endian (or 'n'etwork) byte order for the 
binary number.

> The first map (which is the final one to execute) simply strips the packed
> binary number off of each element of the sorted list, giving a list of the
> original scalars in sorted order.  It's written in what's IMHO an
> excessively tricky manner (using a reference trick to interpolate a
> function call into a quoted string rather than using a block), but this
> may have been done to speed things up. 

No,  it is slower than the simple substr, which is blindingly fast.  The 
use of the inverse transform to the pack() was intended to be pedagogic.  
The interpolation is just fatuous.

This was all worked through in the previous manifestation of this 
thread.

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Tue, 22 Aug 2000 23:27:40 GMT
From: pape_98@my-deja.com
Subject: Re: Sorting by a subfield (WAS: Re: This is my last question, I swear!!!!!!!!!!)
Message-Id: <8nv28o$nao$1@nnrp1.deja.com>

Thank you guys, this is all very helpful.
Now my problem is that I might end up having to sort on more than one
field. By this I mean that I would have to do something like this:

for ($i=0; $i <= $#lines; $i++) {
  @names = split(/,/,$lines[$i]);
}

Once the lines have been broken down (into 6 sections), how (if I
can) do I go about sorting on both fields: the first alphabetically and
the second by number, in increasing order?

I'm sorry if I seem to be dragging this discussion into eternity, but
this is all new to me; and not as easy to soak in.
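
One possible shape for such a two-field sort, offered as a sketch only.  It
uses a Schwartzian Transform (array refs plus a comparison block) instead of
the packed-key approach discussed earlier in the thread, takes four of the
sample records as input, and assumes a missing number should count as 0:

```perl
use strict;
use warnings;

my @lines = (
    'NIH,B1D-416,52,135 6,1513,52 hi,',
    'Suburban,6C-258,52 51,5256,15 13,152',
    'NIH,6B-4,01 36,13 5 26,15 43,1 5 2 5 2 4',
    'Suburban,Office,52 51,5256,15 13,152',
);

# Build [name, number, original line] once per record, sort, then unwrap.
my @sorted =
    map  { $_->[2] }
    sort { $a->[0] cmp $b->[0] or $a->[1] <=> $b->[1] }
    map  {
        my ($name, $room) = split /,/;
        my $number = $room =~ /-(\d+)$/ ? $1 : 0;   # assumed: 0 when no number
        [ $name, $number, $_ ];
    } @lines;

print "$_\n" for @sorted;
```

The comparison block falls through to the numeric test only when the first
fields are equal, so the records come out grouped by name and then in
increasing number within each group.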

PS: Larry, couldn't do newsgroup style because I wasn't responding to
your points; But you've been very helpful.

Thanks again,




Sent via Deja.com http://www.deja.com/
Before you buy.


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 4100
**************************************

