[19519] in Perl-Users-Digest


Perl-Users Digest, Issue: 1714 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Sep 8 06:05:30 2001

Date: Sat, 8 Sep 2001 03:05:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <999943505-v10-i1714@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Sat, 8 Sep 2001     Volume: 10 Number: 1714

Today's topics:
    Re: ActivePerl says Out of Memory, but I'm not out of m <mdemello@blue.owlnet.rice.edu>
    Re: array questions <goldbb2@earthlink.net>
    Re: Confused (again) over complex data structures. (Tassilo v. Parseval)
    Re: getting ip <goldbb2@earthlink.net>
        Message Catalog in Perl <sudhir@newmail.net>
    Re: object & fork <Steffen.Bachmann@t-online.de>
    Re: Perl/CGI problem <godzilla@stomp.stomp.tokyo>
    Re: Script Error? <goldbb2@earthlink.net>
        sudhir: array references. <sudhir@newmail.net>
        why "\W" does work in the split ??? <shijialeee@yahoo.com>
    Re: why "\W" does work in the split ??? <davidhilseenews@yahoo.com>
    Re: why "\W" does work in the split ??? <davidhilseenews@yahoo.com>
    Re: why "\W" does work in the split ??? <shijialeee@yahoo.com>
    Re: why "\W" does work in the split ??? <davidhilseenews@yahoo.com>
    Re: why "\W" does work in the split ??? <krahnj@acm.org>
    Re: why "\W" does work in the split ??? <shijialeee@yahoo.com>
    Re: why "\W" does work in the split ??? <davidhilseenews@yahoo.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 8 Sep 2001 08:49:41 GMT
From: Martin Julian DeMello <mdemello@blue.owlnet.rice.edu>
Subject: Re: ActivePerl says Out of Memory, but I'm not out of memory?  Is this  a  limitation in the Perl build I have?
Message-Id: <9ncm35$mkv$1@joe.rice.edu>

Jonadab the Unsightly One <jonadab@bright.net> wrote:
: another one eventually.  He's convinced that it
: *can't* be possible to solve every position, and
: further he's convinced that he's going to find
: the unsolveable one.  And no argument from math 
: is going to convince him of anything, one way or
: another.  

And, indeed, he's right.

 From the FAQ at http://home.earthlink.net/~fomalhaut/fcfaq.html
  * Can they all be solved?

  All of the Microsoft deals except for number 11982 are solvable.
  
and for some theory, see  

  http://www.cs.ruu.nl/~hansb/d.freecell/node2.html#SECTION0002

-- 
Martin DeMello


------------------------------

Date: Sat, 08 Sep 2001 04:40:06 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: array questions
Message-Id: <3B99D966.6BC53BA@earthlink.net>

Dave Tweed wrote:
> 
> "Randal L. Schwartz" wrote:
> > Well, "deeper" copy, but not technically a "deep" copy. For the
> > difference, see my article on deep copies ...
> 
> Yes, I made a copy of whatever scalars were in @child. If those were
> themselves references, I didn't do anything about that.
> 
> I figured I explained it at an appropriate level of detail for arrays
> of arrays, and if anyone had more complex structures, they would pick
> up on the pattern.
> 
> In my experience, the required level of copying depth is going to be
> highly dependent on the specific application. Consider:
> 
>    @a = ('scalar1', 'scalar2');
>    @aa = (\@a, \@a);
>    @bb = &deep_copy (\@aa);
> 
> @bb ends up with a very different structure than the one in @aa.

If it gives a different data structure than the original, then, IMHO,
it's wrong.  Here's a better [but untested] version:

sub deep_copy {
    my $this = shift;
    if (not ref $this or UNIVERSAL::isa($this, "CODE")) {
        return $this;
    };
    my $addr = addr($this);
    my $cache = shift || {};
    return $cache->{$addr} if exists $cache->{$addr};
    if (UNIVERSAL::isa($this,"SCALAR") or
        UNIVERSAL::isa($this, "REF")) {
        # make a copy of the scalar and return a reference to that
        my $temp = $cache->{$addr} = do { my $x; \$x };
        if( tied $$this ) {
            tie $$temp, __PACKAGE__,
                deep_copy(tied $$this, $cache);
        } else {
            if( ref($this) eq "Regexp" ) {
                $$temp = qr/$this/;
            } else {
                $$temp = deep_copy($$this, $cache);
            }
        }
        bless $temp, ref $this;
    } elsif (UNIVERSAL::isa($this, "ARRAY")) {
        my $temp = $cache->{$addr} = [];
        if( tied @$this ) {
            tie @$temp, __PACKAGE__,
                deep_copy(tied @$this, $cache);
        } else { foreach(@$this) {
            if( tied $_ ) {
                # make use of $#array+1 == @array
                tie $$temp[@$temp], __PACKAGE__,
                    deep_copy(tied $_, $cache);
            } elsif( UNIVERSAL::isa(\$_, "GLOB") ) {
                require Symbol;
                push @$temp, *{Symbol::gensym()};
                deep_copy(\$_, $cache, \$$temp[-1]);
            } else {
                push @$temp, deep_copy($_, $cache);
            }
        } }
        bless $temp, ref $this;
    } elsif (UNIVERSAL::isa($this, "HASH")) {
        my $temp = $cache->{$addr} = {};
        if( tied %$this ) {
            tie %$temp, __PACKAGE__,
                deep_copy(tied %$this, $cache);
        } else { foreach( keys %$this ) {
            if( tied $$this{$_} ) {
                tie $$temp{$_}, __PACKAGE__,
                    deep_copy(tied $$this{$_}, $cache);
            } elsif( UNIVERSAL::isa(\$$this{$_}, "GLOB") ) {
                require Symbol;
                $$temp{$_} = *{Symbol::gensym()};
                deep_copy(\$$this{$_}, $cache, \$$temp{$_});
            } else {
                $$temp{$_} = deep_copy($$this{$_}, $cache);
            }
        } }
        bless $temp, ref $this;
    } elsif (UNIVERSAL::isa($this, "GLOB")) {
        require Symbol;
        my $temp = $cache->{$addr} = shift || Symbol::gensym();
        for(qw(SCALAR ARRAY HASH CODE IO)) {
            *$temp = deep_copy(*$this{$_}, $cache)
                if *$this{$_};
        }
        tie *$temp, __PACKAGE__, deep_copy(tied *$this, $cache);
        bless $temp, ref $this;
    } elsif (UNIVERSAL::isa($this, "IO")) {
        local (*FOO, *BAR) = $this;
        open(BAR, "+<&FOO") or do {
            require Carp;
            Carp::carp "Couldn't dup $this: $!";
        };
        $cache->{$addr} = bless *BAR{IO}, ref $this;
    } else {
        require Carp;
        $addr =~ s/\(0x[0-9a-fA-F]*\)\z//;
        Carp::croak "Unrecognized type: $addr";
    }
}

It's possible that I've made typos, logic bugs, etc.  In particular,
I'm not sure I have all the logic behind the use of globs right.

The reason for this is that you can store a GLOB object [both a real one
and a reference to one] inside a scalar.  Eg:
	$x = *foo;
	print *foo, "\n"; # *main::foo
	print $x, "\n"; # *main::foo
	print \$x, "\n"; # GLOB(0x80e4a5c)
	print \*$x, "\n"; # GLOB(0x80e4a5c)
	print \*foo, "\n"; # GLOB(0x80e4a50)

This is an especially annoying problem when you have an array or hash
whose members are typeglobs [not references to ones, actual ones].

Also, I'm not entirely sure about the handling of Regexp objects.  They
think they're SCALARs, but they're C structs.  So you can't deref, copy,
and bless, like you can with normal scalars.
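
As a much smaller illustration of the caching idea used above, here is
a sketch that handles only plain arrays and hashes.  It is not the
function from this post: the name simple_deep_copy is made up for the
example, refaddr and reftype from Scalar::Util stand in for the
undefined addr() helper, and tied data, globs, IO handles and blessed
objects are deliberately ignored.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Copy nested arrays and hashes, remembering every reference already
# copied so that shared (or cyclic) substructures stay shared.
sub simple_deep_copy {
    my ($this, $cache) = @_;
    $cache ||= {};
    return $this unless ref $this;

    my $addr = refaddr($this);
    return $cache->{$addr} if exists $cache->{$addr};

    my $type = reftype($this);
    my $copy;
    if ($type eq 'ARRAY') {
        # register the copy before recursing, so cycles terminate
        $copy = $cache->{$addr} = [];
        push @$copy, simple_deep_copy($_, $cache) for @$this;
    } elsif ($type eq 'HASH') {
        $copy = $cache->{$addr} = {};
        $copy->{$_} = simple_deep_copy($this->{$_}, $cache)
            for keys %$this;
    } else {
        # assumption for this sketch: other reference types are
        # returned as-is (shared), not copied
        return $this;
    }
    return $copy;
}

my @a  = ('scalar1', 'scalar2');
my @aa = (\@a, \@a);
my $bb = simple_deep_copy(\@aa);

print "both elements still point at one array\n" if $bb->[0] == $bb->[1];
print "and that array is a new one\n"            if $bb->[0] != \@a;
```

With the cache, the copy of @aa again holds two references to a single
array, which is the property the quoted example was about.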

-- 
"I think not," said Descartes, and promptly disappeared.


------------------------------

Date: 8 Sep 2001 07:45:10 GMT
From: Tassilo.Parseval@post.rwth-aachen.de (Tassilo v. Parseval)
Subject: Re: Confused (again) over complex data structures.
Message-Id: <9ncia6$ban$1@nets3.rz.RWTH-Aachen.DE>

On 7 Sep 2001 19:11:23 GMT, Abigail wrote:
> Tassilo v. Parseval (Tassilo.Parseval@post.rwth-aachen.de) wrote on
> == Take it or leave it, I do use the longer notation here as well. It costs
> == me two more characters each time but that's a fair trade-off. The '->'
> == is quite intuitive in my eyes: $ref->[0] := give me the first element of
> == the array that is referenced by $ref.
> 
> Well, *that* arrow is mandatory.

Sure, it was just meant to illustrate how I once learned references: a
sort of 'donkey's bridge' (a mnemonic), as Germans say.
$ref->[0]->{key} triggers the same associations for me, even though the
second arrow is not mandatory there.

> == But you saw what happened to the OP's author. He thought he could use
> == the "shortcut" and hence screwed up precedence.
> == And this comparison to parenthesizing arithmetic expressions isn't quite
> == fair since children learn the arithmetic precedence at the age of ten or
> == at latest eleven.
> 
> Really? Do they learn the relative precedence of =, ?:, <<, ., and &?

No. In complicated if-conditions, I often use brackets to group the
logical operators. Well, you could say that I am just too lazy to learn
the exact precedence of them....and with this, you'd probably be right.

> People *do* keep making mistakes with them, and often their problems 
> end up in this group. But would that be a reason to suggest to always
> fully parenthesize one's expressions? I think not. Then why should we with
> @{} (and ${}, %{} and &{}).
> 
> ==                   This is not the case with Perl and its stupendously
> == expressive syntax. In 99% of all cases, you could make any Perl program
> == a one-liner, yet you don't do it. Why not? ;-)
> 
> 
> Because it would make POD and comments harder.

Not to mention here-docs. Comments and PODs are often put into the
source to make it more legible to other people. And in some situations,
using otherwise superfluous brackets and arrows and other operators
increases readability as well.

Tassilo
-- 
$a=[(74,116)];$b=[($a->[1]-1,$a->[1]++,0x20)];$c=[(97,110)];$d=[($c->
[1]+1,$b->[1],"her")];for(@{[$a,$b,$c,$d]}){for(@{$_}){$_=~/\d+/?print
(chr($_)):print;}}$c=sub{$l=shift;[(0x20+$l-1,0x50,0x65,0x73-0x01,108
),(0x20,0x68,0x61,)]};print(map{chr($_)}@{($c->(1))});$h={a=>33*3,b=>
10**2+7,c=>"1"."0"."1",d=>0162};@h=sort(keys(%$h));for(@h){print(chr(
ord(chr($h->{$_}))))};



------------------------------

Date: Sat, 08 Sep 2001 00:06:52 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: getting ip
Message-Id: <3B99995C.50C974DC@earthlink.net>

Matthew Frick wrote:
> 
> I have a webpage calling a script on another server and I need the
> script to be able to get the ip of the webpage not of the person
> navigating. Does anyone know how I would go about this??

Require that the <form> which refers to the script contain a field with
the host in it, eg: <input type=hidden name=host value=this.host.com>

Then have the script first check $ENV{HTTP_REFERER}, and if it's
there, parse the url, and if it's not there, use the 'host' parameter
from the query.
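
A rough sketch of that fallback logic follows.  The function name
calling_host and the sample values are invented for the example, and
the pattern is a simplification that does not handle URLs containing
user:password@ parts.

```perl
use strict;
use warnings;

# Return the host of the referring page, falling back to the hidden
# form field when no usable Referer header was sent.
sub calling_host {
    my ($referer, $host_param) = @_;
    if (defined $referer
        and $referer =~ m{^[a-z][a-z0-9+.-]*://([^/:?#]+)}i) {
        return lc $1;
    }
    return $host_param;
}

# In a CGI script the arguments would come from $ENV{HTTP_REFERER}
# and the form's 'host' parameter; literal values are used here.
print calling_host('http://this.host.com/form.html', 'fallback.host.com'), "\n";
print calling_host(undef, 'fallback.host.com'), "\n";
```

Both values are supplied by the client, so neither should be trusted
for anything security-related.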

-- 
"I think not," said Descartes, and promptly disappeared.


------------------------------

Date: Sat, 08 Sep 2001 08:20:47 GMT
From: Sudhir Krishnan <sudhir@newmail.net>
Subject: Message Catalog in Perl
Message-Id: <3B99D5AD.850F5368@newmail.net>

Hi,
 Is there a way I can use message catalogs in perl?
 Here's the extract from the manpage for catgets.
  

##################################
NAME
       catgets - get message from a message catalog
 
SYNOPSIS
       #include <nl_types.h>
 
       char  *catgets(nl_catd  catalog,  int set_number,
                      int message_number, const char *message);

##################################

Sudhir


------------------------------

Date: Sat, 08 Sep 2001 13:08:47 +0200
From: Steffen Bachmann <Steffen.Bachmann@t-online.de>
Subject: Re: object & fork
Message-Id: <3B99FC3F.B3F92DC0@t-online.de>

Mark Jason Dominus wrote:
> 
> You misunderstand fork().  fork() makes a new process which is a
> *copy* of the old process.  No memory is shared between the two processes.
> 
> Consider this:
> 
>         my $VAR = 0;
>         my $pid = fork;
>         if ($pid == 0) {
>           # child
> 
>           $VAR = 1;
>           exit;
> 
>         } else {
>           # parent
> 
>           while ($VAR == 0) {
>             sleep 1;
>           }
>           print "Child finished.\n";
>           exit;
> 
>         }
> 
> Try this.  It never finishes.  The child sets $VAR to 1, but that is
> the child's private copy of $VAR.  The parent has a different $VAR,
> which is always 0.
> 
> Your program has the same problem. After fork(), there are *two*
> $self->{result} variables, one in the parent and one in the child.
> The child sets its $self->{result} and then exits, and its memory is
> destroyed.  The parent's $self->{result} is always '?'.
> 
> To communicate between two processes after fork(), you need to use an
> interprocess communication mechanism.  For example, you may use a
> 'shared memory segment', which is managed by the shmget, shmctl,
> shmread, and shmwrite functions, or you can get the IPC::Shareable
> module from CPAN, which provides a simpler interface.
> 
> Another thing you can do is have the two processes communicate with a
> file.  Have the child process deposit its result into a file; the
> parent can see when the file appears, and pick up the result.
> 
> >As you can see, the assignment of the process-id works (in myProcess.pm
> >start-method), also the assignment of the '?' as result is working ( in
> >myProcess.pm new-method) but the assignment of the command output
> >doesn't work at all (in myProcess.pm start-method). I also checked the
> >hash reference ($self) in all methods, and it's really the same.
> 
> It's not the same.  Two data objects in different processes are *never*
> the same.
> 
> --
> @P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
> @p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
> ($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
> close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print

Thanks for the help. I'm using open() to establish the interprocess
communication, and it's working fine:

package myProcess;

sub new{
        my $class = shift;
        my $self = {};
        $self->{'result'} = '?';
        bless($self, $class);
        };

sub start{
        my ($self,$cmd) = @_;
        $SIG{'CHLD'} = sub{wait;};
        $self->{'pid'} = open("PIPE_$self","$cmd |") || die "Can't fork: $!\n";
        return($self->{'pid'});
        };

sub result{
        my $self = shift;
        my $pipe ='PIPE_'.$self;
        my $r = '';
        while(<$pipe>){$r = $r.$_;};
        $self->{'result'} = $r;
        return($self->{'result'});
        };

sub cleanup{
        my $self = shift;
        my $r = $self->{'pid'};
        close("PIPE_$self");
        kill('SIGTERM',$self->{'pid'});
        delete $self->{'pid'};
        return($r);
        };

1;
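
For comparison, the same idea can be written with an explicit fork()
and pipe() when the child runs Perl code rather than an external
command.  This is only a sketch assuming a Unix-like system; the
function name run_in_child is made up for the example.

```perl
use strict;
use warnings;

# Run a code block in a forked child and read its result back in the
# parent through a pipe.
sub run_in_child {
    my ($code) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # child: send the result down the pipe, then exit
        close $reader;
        print {$writer} $code->();
        close $writer;
        exit 0;
    }

    # parent: read until the child closes its end of the pipe
    close $writer;
    my $result = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);
    return $result;
}

my $result = run_in_child(sub { "child result\n" });
print "parent received: $result";
```

The parent's variables are never modified by the child; the only thing
that crosses the process boundary is the data written to the pipe.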


------------------------------

Date: Sat, 08 Sep 2001 00:23:20 -0700
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: Perl/CGI problem
Message-Id: <3B99C768.56CC483D@stomp.stomp.tokyo>

Caroline Jacinta Tsay wrote:

 
> If I'm running a CGI script written in Perl after a user submits some
> info. on a web page, why can't it run any other Perl modules such as
> Win32::NetAdmin??   Right now, my script can run on the command line fine,
> but if I call it when a user submits info. on a page, it won't run
> correctly.


Could be worse. Your script could run correctly but walk with a limp.


Godzilla!
--
00:02:04 09/08/2001 - RESTRICTED FILE REDIRECT:
   - DNS: dsl61-254.dsl.voyageur.ca - IPA: 207.61.116.254
   - System: 
   - Redirect URL: /default.ida


------------------------------

Date: Sat, 08 Sep 2001 00:42:45 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Script Error?
Message-Id: <3B99A1C5.21A2331@earthlink.net>

WHW wrote:
> 
> Hi,
> 
> Can you see an error in the code below?
> 
> [CODE]

#!/usr/local/bin/perl -w
use strict;

> # Use the DBI module
> use DBI qw(:sql_types);
> 
> # Declare local variables
> 
> my ($databaseName, $databaseUser, $databasePw, $dbh);
> my ($stmt, sth, @newRow);

This should be $sth, not sth.

> my ($domain, $password, $name, $street, $town, $county, $postcode,
> $country, $email, $phone, $fax, $nameserver1, $nameserver2, $ip1,
> $ip2);

You should declare your variables as you need them, not all at the top
of your script.  Also... wouldn't all these things be better off in an
array or a hash?  Then you could simply declare %record or @record, and
assign to the appropriate places.

> 
> #Line 50 is here
> # Set the parameter values for the connection
> $databaseName = "DBI:mysql:database_name";
> $databaseUser = "database_username";
> $databasePw = "database_password";
> 
> # Connect to the database
> # Note this connection can be used to
> # execute more than one statement
> # on any number of tables in the database
> 
> $dbh = DBI->connect($databaseName, $databaseUser,
> $databasePw) || die "Connect failed: $DBI::errstr\n";

Unless you *need* to use those variables later, you can just put them
directly in as arguments to connect.  Also, as I said, declare your
variables right when you first need them:

my $dbh = DBI->connect( "DBI:mysql:database_name",
	"database_username", "database_password",
	{ RaiseError => 1 } );

I added the RaiseError because nowhere in your script do I see you doing
anything other than dying when you detect an error, so you might as
well have DBI do it for you, and thus make your code cleaner looking.

> INSERT INTO phpSP_users (user, password, userlevel, name, street,
> town, county, postcode, country, email, phone, fax, nameserver1,
> nameserver2, ip1, ip2);
> VALUES ('$domain', '$password', '1', '$name', '$street', '$town',
> '$county', '$postcode', '$country', '$email', '$phone', '$fax',
> '$nameserver1', '$nameserver2', '$ip1', '$ip2');

The above is a syntax error...

> # Prepare and execute the SQL query
> $sth = $$dbh->prepare($$stmt)
> || die "prepare: $$stmt: $DBI::errstr";
> $sth->execute || die "execute: $$stmt: $DBI::errstr";

If you only use the prepared statement once, and never fetch from it,
then you can replace it with $dbh->do(statement)

$dbh->do("
	INSERT INTO phpSP_users (
		user, password, userlevel, name,
		street, town, county, postcode, country,
		email, phone, fax,
		nameserver1, nameserver2, ip1, ip2)
	VALUES (" . join(", ", map $dbh->quote($_),
		$domain, $password, 1, $name,
		$street, $town, $county, $postcode, $country,
		$email, $phone, $fax,
		$nameserver1, $nameserver2, $ip1, $ip2) .
	")" );

Considering that you have a large number of variable names, and you
probably only deal with them as a group, not separately, you might be
better off putting all those bits of data into a single data structure
[a hash], and not dealing with all those separate variables.

$record{domain} = ...
$record{password} = ...
$record{userlevel} = 1;
 ....

$dbh->do( "INSERT INTO phpSP_users (" .
		join(", ", keys %record) . ") VALUES (" .
		join(", ", map $dbh->quote($_), values %record) .
	")");

This would allow you to replace those 15 different variables with one
single variable.  Also, by moving the data from the prepare to the
execute, you can move the prepare out of whatever loop it's in.
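
One way to move the data from the prepare to the execute is to use "?"
placeholders.  The sketch below only builds the statement and the list
of bind values, so it runs without a database; build_insert is a name
invented for the example, and the sample record is made up.

```perl
use strict;
use warnings;

# Build an INSERT statement with one "?" placeholder per column.
# Keys are sorted so that the column list and the bind values line up
# in a predictable order.
sub build_insert {
    my ($table, $record) = @_;
    my @cols = sort keys %$record;
    my $sql  = sprintf "INSERT INTO %s (%s) VALUES (%s)",
        $table, join(", ", @cols), join(", ", ("?") x @cols);
    return ($sql, @{$record}{@cols});
}

my %record = (user => 'example.com', password => 'secret', userlevel => 1);
my ($sql, @bind) = build_insert('phpSP_users', \%record);
print "$sql\n";

# With a connected handle this would be run as:
#   $dbh->do($sql, undef, @bind);
# or, inside a loop, prepared once and executed many times:
#   my $sth = $dbh->prepare($sql);
#   $sth->execute(@bind);
```

With placeholders the driver takes care of quoting, so no call to
quote() is needed.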

[snip]
> Can't declare constant item in "my" at home/path/cgi-bin/ukcoorder.cgi
> line 47, near ");"

This indicates the error is on line 47.  Why didn't you just look on
line 47?

-- 
"I think not," said Descartes, and promptly disappeared.


------------------------------

Date: Sat, 08 Sep 2001 07:25:35 GMT
From: Sudhir Krishnan <sudhir@newmail.net>
Subject: sudhir: array references.
Message-Id: <3B99C8BE.874ABA40@newmail.net>

Hi!,

I have a triple dimensional array a[i][j][k]

It's actually a reference to a reference to an array; that's how
it works internally, right?

Now if I want to pass it as an argument to a function, how should the
prototype be defined and how should the value be returned?

If it was just a normal (unidimensional) array then I would do the
following:

sub func1
{
  local (@a) = @_;
  local (@tmp);
  push (@tmp, @a);
  .
  .
  .
  <some processing on tmp>
  .
  .
  .
  return @tmp;
}

sub main 
{
  @a=();
  @b=();
  @b = func1 (@a);
}


------------------------------

Date: Sat, 08 Sep 2001 05:03:17 GMT
From: James Lee <shijialeee@yahoo.com>
Subject: why "\W" does work in the split ???
Message-Id: <pEhm7.124558$n75.29506528@news4.rdc1.on.home.com>

hi, all

i have a problem when counting words in a file. situation is i am counting 
a line which starts as "/usr/bin/perl" (no " in reality and no leading 
space) . my code always count it as 4 words instead 3.. could someone point 
out the reason. Thank you very much. my code is ...


while (<>) {
  @words = split(/\W+/);
  $count{$ARGV}+= @words;
  print "@words\n";
}
foreach $file (keys %count) {
    print "$file has $count{$file} words \n";
}


Q.


------------------------------

Date: Sat, 08 Sep 2001 05:06:08 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: why "\W" does work in the split ???
Message-Id: <4Hhm7.78624$hT4.20071329@news1.rdc1.md.home.com>


"James Lee" <shijialeee@yahoo.com> wrote in message
news:pEhm7.124558$n75.29506528@news4.rdc1.on.home.com...
> hi, all
>
> i have a problem when counting words in a file. situation is i am counting
> a line which starts as "/usr/bin/perl" (no " in reality and no leading
> space) . my code always count it as 4 words instead 3.. could someone point
> out the reason. Thank you very much. my code is ...
>
>
> while (<>) {
>   @words = split(/\W+/);
>   $count{$ARGV}+= @words;
>   print "@words\n";
> }
> foreach $file (keys %count) {
>     print "$file has $count{$file} words \n";
> }
>
>
> Q.

The real easy way to answer your own questions is to print out the results
that you get.

$ perl -e '$_="/usr/bin/perl"; print "\"$_\"\n" for split /\W/
""
"usr"
"bin"
"perl"

David Hilsee




------------------------------

Date: Sat, 08 Sep 2001 05:09:19 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: why "\W" does work in the split ???
Message-Id: <3Khm7.78625$hT4.20074766@news1.rdc1.md.home.com>


> $ perl -e '$_="/usr/bin/perl"; print "\"$_\"\n" for split /\W/

Of course, somehow the last single quote to end that statement was lost in
the paste.  Tack a ' on the end.

David Hilsee




------------------------------

Date: Sat, 08 Sep 2001 05:35:08 GMT
From: James Lee <shijialeee@yahoo.com>
Subject: Re: why "\W" does work in the split ???
Message-Id: <g6im7.124758$n75.29554650@news4.rdc1.on.home.com>

sorry . i am new to perl.. confused with the command line..
could you explain it  in the coding way ?? 
what should i do to make it work?? 

thanks for your reply

Q.

David Hilsee wrote:

> The real easy way to answer your own questions is to print out the results
> that you get.
> 
> $ perl -e '$_="/usr/bin/perl"; print "\"$_\"\n" for split /\W/
> ""
> "usr"
> "bin"
> "perl"
> 
> David Hilsee
> 
> 
> 



------------------------------

Date: Sat, 08 Sep 2001 05:47:12 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: why "\W" does work in the split ???
Message-Id: <Ahim7.78632$hT4.20113944@news1.rdc1.md.home.com>


"James Lee" <shijialeee@yahoo.com> wrote in message
news:g6im7.124758$n75.29554650@news4.rdc1.on.home.com...
> sorry . i am new to perl.. confused with the command line..
> could you explain it  in the coding way ??
> what should i do to make it work??
>
> thanks for your reply
>
> Q.
>

Well, the reason that you are getting 4 words is because the array contains
4 elements (listed after the command line, in quotes).

Doing split /\W+/ on "/usr/bin/perl" gives:

""
"usr"
"bin"
"perl"

Note that the first element is empty.  To get rid of it, you can do shift
@words.  You could also just add 1 less than the number of elements.  If
there are other lines that could be affected by changing your approach, you
may need a little more than that.

This happens because you are getting everything on the left and right of
each series of non-word characters.  The first character is non-word ("/"),
so you get "" (nothing) and "usr", since that is what is on the left and the
right.
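
In script form, the behaviour described above looks roughly like this.
It is a small sketch rather than a drop-in fix; the grep is just one
way of discarding the empty leading field.

```perl
use strict;
use warnings;

my $line = "/usr/bin/perl";

# split keeps the empty field in front of the leading "/"
my @fields = split /\W+/, $line;

# one way to drop it: keep only the non-empty fields
my @words = grep { length } @fields;

printf "%d fields, %d words\n", scalar @fields, scalar @words;
# prints "4 fields, 3 words"
```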

Hope this explanation helped.

David Hilsee




------------------------------

Date: Sat, 08 Sep 2001 05:58:59 GMT
From: "John W. Krahn" <krahnj@acm.org>
Subject: Re: why "\W" does work in the split ???
Message-Id: <3B99B420.3B5FD573@acm.org>

James Lee wrote:
> 
> hi, all
> 
> i have a problem when counting words in a file. situation is i am counting
> a line which starts as "/usr/bin/perl" (no " in reality and no leading
> space) . my code always count it as 4 words instead 3.. could someone point
> out the reason. Thank you very much. my code is ...

This is happening because you are using split, use a regex instead.


#!/usr/bin/perl -w
use strict;

my %count;

> while (<>) {
>   @words = split(/\W+/);

  my @words = /(\w+)/g;


>   $count{$ARGV}+= @words;
>   print "@words\n";
> }
> foreach $file (keys %count) {
>     print "$file has $count{$file} words \n";
> }
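
Put together as a self-contained sketch, the matching approach might
look like the following.  The helper name count_words and the sample
lines are invented for the example; a real script would read lines
with while (<>) and key the counts on $ARGV as in the original.

```perl
use strict;
use warnings;

# Count words by matching runs of word characters instead of
# splitting on the characters between them.
sub count_words {
    my ($line) = @_;
    my @words = $line =~ /(\w+)/g;
    return scalar @words;
}

my %count;
for my $line ("/usr/bin/perl\n", "/usr/bin/perl/\n") {
    $count{sample} += count_words($line);
}
print "sample has $count{sample} words\n";
# prints "sample has 6 words"
```

Because only the matched words are returned, leading or trailing
separators never produce empty entries.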



John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Sat, 08 Sep 2001 06:36:54 GMT
From: James lee <shijialeee@yahoo.com>
Subject: Re: why "\W" does work in the split ???
Message-Id: <a0jm7.125099$n75.29634675@news4.rdc1.on.home.com>


> This happens because you are getting everything on the left and right of
> each series of non-word characters.  The first character is non-word
> ("/"), so you get "" (nothing) and "usr", since that is what is on the
> left and the right.

so what if i split /usr/bin/perl/  .. as my understanding now , the last 
slash in the split has "perl" on its left , and "" on it's right ?  then 
there should be one more words "" total would be 5 ?   
as i read about the detail of split function , it says 

>Splits a string into an array of strings, and returns it. By default, 
>empty leading fields are preserved, and empty trailing ones are deleted.
is this what you mean ? so it wont count the last "" ? 

i can think of a way to get rid of the "" is to make the count decrease by 
1 each time counting a line.  but  i want to use reg to work it through.. 
like substitute the "" by // ( or is it the same ? ) ...

sorry if my words sound confusing .. i am really about to bed now ... @-)  


Q.


------------------------------

Date: Sat, 08 Sep 2001 06:54:49 GMT
From: "David Hilsee" <davidhilseenews@yahoo.com>
Subject: Re: why "\W" does work in the split ???
Message-Id: <Zgjm7.78805$hT4.20173316@news1.rdc1.md.home.com>


"James lee" <shijialeee@yahoo.com> wrote in message
news:a0jm7.125099$n75.29634675@news4.rdc1.on.home.com...
>
> > This happens because you are getting everything on the left and right of
> > each series of non-word characters.  The first character is non-word
> > ("/"), so you get "" (nothing) and "usr", since that is what is on the
> > left and the right.
>
> so what if i split /usr/bin/perl/  .. as my understanding now , the last
> slash in the split has "perl" on its left , and "" on it's right ?  then
> there should be one more words "" total would be 5 ?
> as i read about the detail of split function , it says
>
> >Splits a string into an array of strings, and returns it. By default,
> >empty leading fields are preserved, and empty trailing ones are deleted.
> is this what you mean ? so it wont count the last "" ?

Yes, this is what I meant.  Unfortunately, it is not what I said, since I
didn't mention the exception.  I should have just said to perldoc -f split,
but I, like yourself, am tired.  If nothing else, I hope that this
discussion has led to a better understanding of split.
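
A short sketch of the exception for trailing fields: by default split
drops empty trailing fields, and a negative limit as the third argument
keeps them.

```perl
use strict;
use warnings;

my $path = "/usr/bin/perl/";

my @default = split /\W+/, $path;       # trailing empty field dropped
my @all     = split /\W+/, $path, -1;   # negative limit keeps it

print scalar(@default), " fields by default\n";          # 4
print scalar(@all), " fields with a limit of -1\n";      # 5
```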

Anyways, I think John's approach is what you're looking for.

David Hilsee




------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 1714
***************************************

