[24913] in Perl-Users-Digest


Perl-Users Digest, Issue: 7163 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Sep 21 21:43:36 2004

Date: Tue, 21 Sep 2004 18:42:03 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 21 Sep 2004     Volume: 10 Number: 7163

Today's topics:
        space deliminated to comma delinated with varried and n <laura.hradowy@NOSPAM.mts.caaaaa>
    Re: space deliminated to comma delinated with varried a <jurgenex@hotmail.com>
    Re: space deliminated to comma delinated with varried a <tore@aursand.no>
    Re: space deliminated to comma delinated with varried a <laura.hradowy@NOSPAM.mts.caaaaa>
    Re: space deliminated to comma delinated with varried a <laura.hradowy@NOSPAM.mts.caaaaa>
    Re: space deliminated to comma delinated with varried a <tore@aursand.no>
    Re: space deliminated to comma delinated with varried a <tore@aursand.no>
    Re: space deliminated to comma delinated with varried a (Anno Siegel)
    Re: space deliminated to comma delinated with varried a (Larry Felton Johnson)
    Re: space deliminated to comma delinated with varried a <shawn.corey@sympatico.ca>
    Re: space deliminated to comma delinated with varried a <scobloke2@infotop.co.uk>
    Re: space deliminated to comma delinated with varried a <noreply@gunnar.cc>
    Re: space deliminated to comma delinated with varried a <scobloke2@infotop.co.uk>
        Spell check French in perl <daveandniki@ntlworld.com>
        split problem <gabriel.larkin@gmail.com>
    Re: split problem <mritty@gmail.com>
    Re: split problem <thundergnat@hotmail.com>
    Re: split problem <someone@example.com>
    Re: split problem <gabriel.larkin_at@at_gmailDOT.com>
    Re: split problem <gabriel.larkin_at@at_gmailDOT.com>
    Re: split problem <someone@example.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 20 Sep 2004 14:13:07 -0500
From: "LHradowy" <laura.hradowy@NOSPAM.mts.caaaaa>
Subject: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <WYF3d.1170$IO.7153@news1.mts.net>

I have a file that looks like this...
       1555002                         00 0 04 27              TELN NOT BILL
       3555007                         00 0 06 00              CUSTOMER HAS > 1
       5555410                         00 0 12 10              CUSTOMER HAS > 1
       6755012                         00 0 12 06              CUSTOMER HAS > 1

Notice the white space at the beginning of each line; I DON'T WANT IT THERE.
Notice the white space inside the 2nd and 3rd columns; I NEED IT THERE...

I need to create a Perl script that takes this file and makes it look like
this:
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

This output needs to be written to a file.
I have no idea how to start. If I split on a space " ", it will split the
third and fourth columns up. The fourth column can basically be left alone.

Thanks for the help.






------------------------------

Date: Mon, 20 Sep 2004 19:29:52 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <QcG3d.4905$Ii2.2541@trnddc09>

LHradowy wrote:
> I have file that looks like this...
>       1555002                         00 0 04 27              TELN NOT BILL
>       3555007                         00 0 06 00              CUSTOMER HAS > 1
>       5555410                         00 0 12 10              CUSTOMER HAS > 1
>       6755012                         00 0 12 06              CUSTOMER HAS > 1
>
> Notice the white spaces at beginning of the line, I DONT WANT THEM
> THERE

Please see the thread "Replacing spaces" that was discussed here over the
weekend.

> Notice the white spaces in the 2nd and 3rd columns, I NEED THEM
> THERE...

The solutions posted in the thread mentioned above will leave those alone.


> I need to created a perl script that takes this file

perldoc -f open
perldoc perlop (and check for <>)

> and makes it look like this
> 1555002,00 0 04 27,TELN NOT BILL
> 3555007,00 0 06 00,CUSTOMER HAS > 1
> 5555410,00 0 12 10,CUSTOMER HAS > 1
> 6755012,00 0 12 06,CUSTOMER HAS > 1
>
> This output needs to be written to a file.

perldoc -f open
perldoc -f print

> I have no idea how to start, if I split on a space " " the it will
> spit the third an fourth column up. The fourth column can basically
> be left alone.

So, what is the distinguishing difference between the separator for the 
items in the third column on the one hand and the separator between the 
third column and the fourth column on the other hand?

jue 




------------------------------

Date: Tue, 21 Sep 2004 00:19:57 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <pan.2004.09.20.22.19.56.876106@aursand.no>

On Mon, 20 Sep 2004 14:13:07 -0500, LHradowy wrote:
> I have file that looks like this...
>        1555002                         00 0 04 27              TELN NOT BILL
>        3555007                         00 0 06 00              CUSTOMER HAS > 1
>        5555410                         00 0 12 10              CUSTOMER HAS > 1
>        6755012                         00 0 12 06              CUSTOMER HAS > 1
> 
> Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> 
> I need to created a perl script that takes this file and makes it look like
> this
> 1555002,00 0 04 27,TELN NOT BILL
> 3555007,00 0 06 00,CUSTOMER HAS > 1
> 5555410,00 0 12 10,CUSTOMER HAS > 1
> 6755012,00 0 12 06,CUSTOMER HAS > 1

If we skip everything that has got to do with the file(s), here's a
suggestion (untested);

  while ( <DATA> ) {
      chomp;    # Get rid of line breaks
      s,^\s+,,; # Remove leading spaces
      my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
      print join( ',', @cols ) . "\n";
  }


-- 
Tore Aursand <tore@aursand.no>
"Daring ideas are like chessmen moved forward; they may be beaten, but
 they may start a winning game." (Johann Wolfgang von Goethe)


------------------------------

Date: Mon, 20 Sep 2004 22:57:40 -0500
From: "LHradowy" <laura.hradowy@NOSPAM.mts.caaaaa>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <HEN3d.1268$IO.9485@news1.mts.net>


"Tore Aursand" <tore@aursand.no> wrote in message
news:pan.2004.09.20.22.19.56.876106@aursand.no...
> On Mon, 20 Sep 2004 14:13:07 -0500, LHradowy wrote:
> > I have file that looks like this...
> >        1555002                         00 0 04 27              TELN NOT BILL
> >        3555007                         00 0 06 00              CUSTOMER HAS > 1
> >        5555410                         00 0 12 10              CUSTOMER HAS > 1
> >        6755012                         00 0 12 06              CUSTOMER HAS > 1
> >
> > Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> > Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> >
> > I need to created a perl script that takes this file and makes it look like
> > this
> > 1555002,00 0 04 27,TELN NOT BILL
> > 3555007,00 0 06 00,CUSTOMER HAS > 1
> > 5555410,00 0 12 10,CUSTOMER HAS > 1
> > 6755012,00 0 12 06,CUSTOMER HAS > 1
>
> If we skip everything that has got to do with the file(s), here's a
> suggestion (untested);
>
>   while ( <DATA> ) {
>       chomp;    # Get rid of line breaks
>       s,^\s+,,; # Remove leading spaces
>       my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
>       print join( ',', @cols ) . "\n";
>   }


Ahhh, I think I am forgetting something, THIS is exactly what I want!
But I am getting an error when I run it, and my skills at perl are weak.
#!/opt/perl/bin/perl

use strict;
use warnings;


while (<>) {
chomp;  # Remove the trailing newline
s,^\s+,,; #Remove leading spaces
my @cols=split(/\s+{2,}/,$_); #Split on two (or more) spaces
print join (',',@cols)."\n";
}

user@server$ ./test.pl file
Nested quantifiers in regex; marked by <-- HERE in m/\s+{ <-- HERE 2,}/ at
 ./test.pl line 10.




------------------------------

Date: Mon, 20 Sep 2004 23:51:51 -0500
From: "LHradowy" <laura.hradowy@NOSPAM.mts.caaaaa>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <urO3d.1384$IO.9843@news1.mts.net>


"Ian Wilson" <scobloke2@infotop.co.uk> wrote in message
news:cinis9$fvn$1@sparta.btinternet.com...
> LHradowy wrote:
> > I have file that looks like this...
> >        1555002                         00 0 04 27              TELN NOT BILL
> >        3555007                         00 0 06 00              CUSTOMER HAS > 1
> >        5555410                         00 0 12 10              CUSTOMER HAS > 1
> >        6755012                         00 0 12 06              CUSTOMER HAS > 1
> >
> > Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> > Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> >
> > I need to created a perl script that takes this file and makes it look like
> > this
> > 1555002,00 0 04 27,TELN NOT BILL
> > 3555007,00 0 06 00,CUSTOMER HAS > 1
> > 5555410,00 0 12 10,CUSTOMER HAS > 1
> > 6755012,00 0 12 06,CUSTOMER HAS > 1
> >
> > This output needs to be written to a file.
> > I have no idea how to start, if I split on a space " " the it will spit the
> > third an fourth column up. The fourth column can basically be left alone.
> >
> > Thanks for the help.
> >
> >
>
> If the data always has multiple spaces (ASCII 32) between fields, I'd
> try stripping the leading spaces and then converting >1 consecutive
> spaces to commas:
>
> perl -e -p 's/^ +//; s/  +/,/g' oldfile > newfile
>
> But I expect Shawn's substr solution to be more robust. Using unpack may
> be another useful approach.

I like this, but I get nothing back in the new file. And I have no tabs;
they are all spaces.




------------------------------

Date: Tue, 21 Sep 2004 07:07:51 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <pan.2004.09.21.05.07.51.512086@aursand.no>

On Tue, 21 Sep 2004 00:40:17 +0200, Gunnar Hjalmarsson wrote:
>> If we skip everything that has got to do with the file(s), here's a
>> suggestion (untested);
>> 
>>   while ( <DATA> ) {
>>       chomp;    # Get rid of line breaks
>>       s,^\s+,,; # Remove leading spaces
>>       my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
> -----------------------------^^^^^
> 
> Maybe you should have tested it... ;-)

You are so right, Gunnar, and I'm terribly sorry.  The correct split()
should - of course - look like this:

  my @cols = split( /\s{2,}/, $_ );

Still untested, though. :)
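
For reference, the corrected pattern can be exercised in a small
self-contained form. This is only a sketch: the sub name to_csv and the
inline sample rows are assumptions made for illustration, and it relies on
every field separator being at least two whitespace characters wide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert one line to CSV. Fields are assumed to be separated by two or
# more whitespace characters; single spaces stay inside a field.
sub to_csv {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^\s+//;    # drop leading whitespace
    $line =~ s/\s+$//;    # drop trailing whitespace
    return join ',', split /\s{2,}/, $line;
}

# Sample rows copied from the original post.
my @sample = (
    "       1555002                         00 0 04 27              TELN NOT BILL\n",
    "       3555007                         00 0 06 00              CUSTOMER HAS > 1\n",
);

print to_csv($_), "\n" for @sample;
```

To process a real file, the same sub could be applied to each line read
from <> and the result printed to a filehandle opened for writing.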


-- 
Tore Aursand <tore@aursand.no>
"I know not with what weapons World War 3 will be fought, but World War
 4 will be fought with sticks and stones." (Albert Einstein)


------------------------------

Date: Tue, 21 Sep 2004 07:08:52 +0200
From: Tore Aursand <tore@aursand.no>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <pan.2004.09.21.05.08.37.747479@aursand.no>

On Mon, 20 Sep 2004 22:57:40 -0500, LHradowy wrote:
> my @cols=split(/\s+{2,}/,$_); #Split on two (or more) spaces

My fault.  Don't split on '\s+{2,}', but on '\s{2,}';

  my @cols = split( /\s{2,}/, $_ );


-- 
Tore Aursand <tore@aursand.no>
"I know not with what weapons World War 3 will be fought, but World War
 4 will be fought with sticks and stones." (Albert Einstein)


------------------------------

Date: 21 Sep 2004 10:46:08 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <cip0pg$11j$2@mamenchi.zrz.TU-Berlin.DE>

LHradowy <laura.hradowy@NOSPAM.mts.caaaaa> wrote in comp.lang.perl.misc:
> I have file that looks like this...
>        1555002                         00 0 04 27              TELN NOT BILL
>        3555007                         00 0 06 00              CUSTOMER HAS > 1
>        5555410                         00 0 12 10              CUSTOMER HAS > 1
>        6755012                         00 0 12 06              CUSTOMER HAS > 1
> 
> Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> 
> I need to created a perl script that takes this file and makes it look like
> this
> 1555002,00 0 04 27,TELN NOT BILL
> 3555007,00 0 06 00,CUSTOMER HAS > 1
> 5555410,00 0 12 10,CUSTOMER HAS > 1
> 6755012,00 0 12 06,CUSTOMER HAS > 1
> 
> This output needs to be written to a file.
> I have no idea how to start, if I split on a space " " the it will spit the
> third an fourth column up. The fourth column can basically be left alone.

    while ( <DATA> ) {
        my @l = split;
        print join( ',', $l[ 0], "@l[ 1 .. 4]", "@l[ 5 .. $#l]"), "\n";
    }

Anno


------------------------------

Date: 21 Sep 2004 13:24:28 -0700
From: larryj@gsu.edu (Larry Felton Johnson)
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <4ae7bf57.0409211224.14e8090b@posting.google.com>

"LHradowy" <laura.hradowy@NOSPAM.mts.caaaaa> wrote in message news:<WYF3d.1170$IO.7153@news1.mts.net>...
> I have file that looks like this...
>        1555002                         00 0 04 27              TELN NOT BILL
>        3555007                         00 0 06 00              CUSTOMER HAS > 1
>        5555410                         00 0 12 10              CUSTOMER HAS > 1
>        6755012                         00 0 12 06              CUSTOMER HAS > 1
> 
> Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> 
> I need to created a perl script that takes this file and makes it look like
> this
> 1555002,00 0 04 27,TELN NOT BILL
> 3555007,00 0 06 00,CUSTOMER HAS > 1
> 5555410,00 0 12 10,CUSTOMER HAS > 1
> 6755012,00 0 12 06,CUSTOMER HAS > 1
> 
> This output needs to be written to a file.
> I have no idea how to start, if I split on a space " " the it will spit the
> third an fourth column up. The fourth column can basically be left alone.
> 
> Thanks for the help.

I get the idea I may be oversimplifying or misunderstanding some part
of this question, but if there is a uniform number of columns, and of
components within the columns, a simple regex should do it, and it's a
matter of just reconstructing it with a print statement with the
spacing you want.

perl -pi.bak -e 's/^\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(.*)/$1,$2 $3 $4 $5,$6/g' spaces

In my first pass the long and ugly one-liner above did it for me when I
cut and pasted your file snippet into a file called spaces.  This
edited in place and copied the old file to spaces.bak.
If there's a need to write it to a file of another name, the same regex
could be wrapped in a script opening the infile for reading and the
outfile for writing.
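
A minimal sketch of that wrapping, under a few assumptions: the sub names
convert_line and convert_file are hypothetical, and the pattern is the one
from the one-liner, so it still expects the first column to be followed by
exactly four single-word components.

```perl
use strict;
use warnings;

# Apply the substitution to one line. A line that does not match the
# expected six-part layout is returned unchanged.
sub convert_line {
    my ($line) = @_;
    $line =~ s/^\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)\s+(.*)/$1,$2 $3 $4 $5,$6/;
    return $line;
}

# Read every line of $in_path, convert it, and write it to $out_path.
sub convert_file {
    my ( $in_path, $out_path ) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    print {$out} convert_line($_) while <$in>;
    close $out or die "Cannot close $out_path: $!";
    close $in;
}

# Demonstration on one row from the original post.
print convert_line(
    "       1555002                         00 0 04 27              TELN NOT BILL\n"
);
```

Calling convert_file('spaces', 'spaces.csv') would then leave the input
file untouched, which the -i.bak one-liner does not.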

How about it?  Am I misunderstanding something here?


------------------------------

Date: Mon, 20 Sep 2004 16:32:54 -0400
From: Shawn Corey <shawn.corey@sympatico.ca>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <l7H3d.8352$bL1.394336@news20.bellglobal.com>

Hi,

If the data is in fixed columns, you can use substr.

perldoc -f substr
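
As a rough illustration of the substr approach: the column offsets and
widths below are assumptions chosen for the demo, not measurements of the
real file, and the names @layout and fixed_to_csv are hypothetical. The
sample row is built with sprintf so that it matches the assumed layout.

```perl
use strict;
use warnings;

# Hypothetical layout: [offset, width] of each column. Measure these
# against the real data before relying on them.
my @layout = ( [ 7, 7 ], [ 39, 10 ], [ 63, 30 ] );

sub fixed_to_csv {
    my ($line) = @_;
    chomp $line;
    my @fields;
    for my $col (@layout) {
        my ( $offset, $width ) = @$col;
        # A short line has no data at this offset; use an empty field
        # instead of calling substr outside the string.
        my $field = $offset < length($line) ? substr( $line, $offset, $width ) : '';
        $field =~ s/^\s+|\s+$//g;    # trim the padding inside the column
        push @fields, $field;
    }
    return join ',', @fields;
}

# Build a sample row that matches the assumed layout.
my $row = sprintf "%-7s%-32s%-24s%s\n", '', '1555002', '00 0 04 27', 'TELN NOT BILL';
print fixed_to_csv($row), "\n";
```

Unlike splitting on runs of spaces, this keeps working when a column is
empty, at the cost of breaking if the column positions ever shift.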

	--- Shawn

LHradowy wrote:
> I have file that looks like this...
>        1555002                         00 0 04 27              TELN NOT BILL
>        3555007                         00 0 06 00              CUSTOMER HAS > 1
>        5555410                         00 0 12 10              CUSTOMER HAS > 1
>        6755012                         00 0 12 06              CUSTOMER HAS > 1
> 
> Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> 
> I need to created a perl script that takes this file and makes it look like
> this
> 1555002,00 0 04 27,TELN NOT BILL
> 3555007,00 0 06 00,CUSTOMER HAS > 1
> 5555410,00 0 12 10,CUSTOMER HAS > 1
> 6755012,00 0 12 06,CUSTOMER HAS > 1
> 
> This output needs to be written to a file.
> I have no idea how to start, if I split on a space " " the it will spit the
> third an fourth column up. The fourth column can basically be left alone.
> 
> Thanks for the help.
> 
> 
> 
> 



------------------------------

Date: Mon, 20 Sep 2004 21:42:33 +0000 (UTC)
From: Ian Wilson <scobloke2@infotop.co.uk>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <cinis9$fvn$1@sparta.btinternet.com>

LHradowy wrote:
> I have file that looks like this...
>        1555002                         00 0 04 27              TELN NOT BILL
>        3555007                         00 0 06 00              CUSTOMER HAS > 1
>        5555410                         00 0 12 10              CUSTOMER HAS > 1
>        6755012                         00 0 12 06              CUSTOMER HAS > 1
> 
> Notice the white spaces at beginning of the line, I DONT WANT THEM THERE
> Notice the white spaces in the 2nd and 3rd columns, I NEED THEM THERE...
> 
> I need to created a perl script that takes this file and makes it look like
> this
> 1555002,00 0 04 27,TELN NOT BILL
> 3555007,00 0 06 00,CUSTOMER HAS > 1
> 5555410,00 0 12 10,CUSTOMER HAS > 1
> 6755012,00 0 12 06,CUSTOMER HAS > 1
> 
> This output needs to be written to a file.
> I have no idea how to start, if I split on a space " " the it will spit the
> third an fourth column up. The fourth column can basically be left alone.
> 
> Thanks for the help.
> 
> 

If the data always has multiple spaces (ASCII 32) between fields, I'd 
try stripping the leading spaces and then converting >1 consecutive 
spaces to commas:

perl -e -p 's/^ +//; s/  +/,/g' oldfile > newfile

But I expect Shawn's substr solution to be more robust. Using unpack may 
be another useful approach.


------------------------------

Date: Tue, 21 Sep 2004 00:40:17 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <2r94hhF17ms7hU1@uni-berlin.de>

Tore Aursand wrote:
> If we skip everything that has got to do with the file(s), here's a
> suggestion (untested);
> 
>   while ( <DATA> ) {
>       chomp;    # Get rid of line breaks
>       s,^\s+,,; # Remove leading spaces
>       my @cols = split( /\s+{2,}/, $_ ); # Split on two (or more) spaces
-----------------------------^^^^^

Maybe you should have tested it... ;-)

-- 
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


------------------------------

Date: Tue, 21 Sep 2004 21:04:18 +0000 (UTC)
From: Ian Wilson <scobloke2@infotop.co.uk>
Subject: Re: space deliminated to comma delinated with varried and need spaces between some columns
Message-Id: <ciq50i$30l$1@sparta.btinternet.com>

LHradowy wrote:

> "Ian Wilson" <scobloke2@infotop.co.uk> wrote in message 
> news:cinis9$fvn$1@sparta.btinternet.com...
> 
>> 
>> If the data always has multiple spaces (ASCII 32) between fields,
>> I'd try stripping the leading spaces and then converting >1
>> consecutive spaces to commas:
>> 
>> perl -e -p 's/^ +//; s/  +/,/g' oldfile > newfile
>> 
>> But I expect Shawn's substr solution to be more robust. Using
>> unpack may be another useful approach.
> 
> 
> I like this but I get nothing back in the new file. And I have no
> tabs they are all spaces.
> 
> 

C:\> type oldname.txt
        1555002                         00 0 04 27              TELN NOT BILL
        3555007                         00 0 06 00              CUSTOMER HAS > 1
        5555410                         00 0 12 10              CUSTOMER HAS > 1
        6755012                         00 0 12 06              CUSTOMER HAS > 1

C:\> perl -p -e "s/^ +//; s/  +/,/g" oldname.txt
1555002,00 0 04 27,TELN NOT BILL
3555007,00 0 06 00,CUSTOMER HAS > 1
5555410,00 0 12 10,CUSTOMER HAS > 1
6755012,00 0 12 06,CUSTOMER HAS > 1

I recall some versions of Perl on some versions of Windows have problems 
with redirecting STDOUT to a file from a command prompt / DOS window. 
Maybe you have one of those combinations?
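
The order of the switches is also worth checking. In the earlier
"perl -e -p '...'" form, -e takes the very next argument as the program
text, so -p is consumed as code rather than as a switch and the
substitution string is left over as an ordinary argument. That
combination prints nothing, which may be what produced the empty newfile.
The "-p -e" form used in the session above avoids this. What it does to
each line can be sketched as follows; the sub name spaces_to_commas is an
assumption made for illustration.

```perl
use strict;
use warnings;

# Per-line equivalent of: perl -p -e 's/^ +//; s/  +/,/g'
sub spaces_to_commas {
    my ($line) = @_;
    $line =~ s/^ +//;      # strip leading spaces
    $line =~ s/  +/,/g;    # each run of two or more spaces becomes one comma
    return $line;
}

print spaces_to_commas(
    "        5555410                         00 0 12 10              CUSTOMER HAS > 1\n"
);
```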


------------------------------

Date: Sun, 19 Sep 2004 13:39:38 GMT
From: "Dave" <daveandniki@ntlworld.com>
Subject: Spell check French in perl
Message-Id: <u_f3d.316$Y_6.299@newsfe6-gui.ntli.net>

I would like to spell check strings for valid (or otherwise) French words.
The text is in an XML file with other (non-French) entries and I want my
'validation' program (in perl) to check spelling as it checks for other
errors in the input. I can easily pull the French out into strings and split
these into single words as needed before passing to a processor. Ideally the
result would be one of:
Yes, this is a valid French word.
No, but it might be one of (list) misspelled.
No, no idea.
(i.e. a return code and a reference to an array of strings as a return.)

I have looked at CPAN but can't see anything. Is there a module or should I
be calling 'aspell'?

I'm hoping to find something that will work on XP and Linux.

Any ideas appreciated




------------------------------

Date: 20 Sep 2004 06:30:23 -0700
From: "gabkin" <gabriel.larkin@gmail.com>
Subject: split problem
Message-Id: <1095687023.800104.248240@k26g2000oda.googlegroups.com>

I am having a problem with the split function.
Here is the sub that it is used in, it should illustrate what I'm
doing, criticism is welcomed...


<PERL SUB>
sub parseLine()
{
#this parses a line which will be in a similar format to this
#"0010230"	"Book of the Dead"	"Yendor books"
#(tab delimited, escaped by quotes)
#it will take as an argument the column headers and the string to parse
#it will return a hash,using the columnheader as the key
#and the column data as the element
my $ParseMe = $_[0];
my @ColumnHeaders = $_[1..@_];
my %returnData;
chop($ParseMe);
my @Columns = split(/\t/,$ParseMe);
#my $size=@Columns;print("Size = ",$size,"\n");
for(my $i=0;$i<@Columns;$i++) {
$Columns[$i] =~ s/\"//g; # remove extraneous quotes
#print($ColumnHeaders[$i],"\t",$Columns[$i],"\n");
$returnData{$ColumnHeaders[$i]} = $Columns[$i];
}
return %returnData;
}
</PERL SUB>
(Sorry about the awful two-space indentation, but google seems to strip
out tabs)

A problem has arisen in that, in one example, the last four columns are
blank (i.e. null): they're there, there's just nothing in them. For these
last four, the split function seems to discard them. I checked this
with the aid of the commented-out lines.

Is there a way to force split to not lose the blank columns?

Or would I have to 're-invent' the split algorithm so as to keep them?
Any help would be greatly appreciated...



------------------------------

Date: Mon, 20 Sep 2004 14:50:00 GMT
From: "Paul Lalli" <mritty@gmail.com>
Subject: Re: split problem
Message-Id: <s6C3d.4743$QB1.4489@trndny02>

"gabkin" <gabriel.larkin@gmail.com> wrote in message
news:1095687023.800104.248240@k26g2000oda.googlegroups.com...
> I am having a problem with the split function.

Did you consider reading the documentation for the function you're
having problems with?

<snipped a bunch of poorly formatted code>

> A problem has arisen in that in one example, the last four columns are
> blank (i.e. null) they're there, theres just nothing in them. For
these
> last four, the split function seems to discard them. I checked this
> with the aid of the commented out lines.
>
> Is there a way to force split to not lose the blank columns?

perldoc -f split
4th paragraph.

Paul Lalli




------------------------------

Date: Mon, 20 Sep 2004 16:51:51 -0400
From: thundergnat <thundergnat@hotmail.com>
Subject: Re: split problem
Message-Id: <414f42e6$0$2651$61fed72c@news.rcn.com>

gabkin wrote:
> I am having a problem with the split function.
> Here is the sub that it is used in, it should illustrate what I'm
> doing, criticism is welcomed...
> 
[snip]
> A problem has arisen in that in one example, the last four columns are
> blank (i.e. null) they're there, theres just nothing in them. For these
> last four, the split function seems to discard them. I checked this
> with the aid of the commented out lines.
> 
> Is there a way to force split to not lose the blank columns?
> 
> Or would I have to 're-invent' the split algorithm so as to keep them?
> Any help would be greatly appreciated...
> 

Did you read the docs for split? (Really. Not being sarcastic.)

Seems like you are looking for the LIMIT argument to split.

Since you know how many columns you are looking for, specify that.


------------------------------

Date: Mon, 20 Sep 2004 23:09:32 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: split problem
Message-Id: <MqJ3d.96204$XP3.78812@edtnps84>

gabkin wrote:
> I am having a problem with the split function.
> Here is the sub that it is used in, it should illustrate what I'm
> doing, criticism is welcomed...
          ^^^^^^^^^^^^^^^^^^^^^
Ok, you asked for it.  :-)


> <PERL SUB>
> sub parseLine()
> {
> #this parses a line which will be in a similar format to this
> #"0010230"	"Book of the Dead"	"Yendor books"
> #(tab delimited, escaped by quotes)
> #it will take as an argument the column headers and the string to
> parse
> #it will return a hash,using the columnheader as the key
> #and the column data as the element
> my $ParseMe = $_[0];
> my @ColumnHeaders = $_[1..@_];
                       ^^^^^^^^^
That is wrong.  The '$' at the beginning denotes a scalar value so you are 
assigning a single value from the @_ array to the @ColumnHeaders array.  And 
even if you had used a proper array slice, you are accessing an extra element 
at the end of the array that does not exist.

$ perl -le'@x="a".."f"; print @x . "  @x"; @y = @x[1..@x]; print @y . "  @y"'
6  a b c d e f
6  b c d e f

The correct syntax is:

my @ColumnHeaders = @_[ 1 .. $#_ ];

However the usual way to do that is:

my ( $ParseMe, @ColumnHeaders ) = @_;

Or if you really want to do it on two lines:

my $ParseMe = shift;
my @ColumnHeaders = @_;


> my %returnData;
> chop($ParseMe);

chop() isn't really used very much anymore.  You should use chomp() unless you 
have a valid reason not to.


> my @Columns = split(/\t/,$ParseMe);

As others have pointed out, use the third argument to split().

my @Columns = split /\t/,$ParseMe, -1;
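
A small demonstration of the difference, using a made-up line in the
poster's format with four empty trailing columns (the variable names are
illustrative only):

```perl
use strict;
use warnings;

# Two filled columns followed by four empty ones, tab-delimited.
my $line = qq{"0010230"\t"Book of the Dead"\t\t\t\t};

my @default = split /\t/, $line;        # trailing empty fields are dropped
my @keep    = split /\t/, $line, -1;    # a negative limit keeps them

print scalar(@default), "\n";    # prints 2
print scalar(@keep),    "\n";    # prints 6
```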


> #my $size=@Columns;print("Size = ",$size,"\n");
> for(my $i=0;$i<@Columns;$i++) {

That is usually written as:

for my $i ( 0 .. $#Columns ) {


> $Columns[$i] =~ s/\"//g; # remove extraneous quotes

Double quote characters don't have to be escaped in regular expressions.


> #print($ColumnHeaders[$i],"\t",$Columns[$i],"\n");
> $returnData{$ColumnHeaders[$i]} = $Columns[$i];
> }
> return %returnData;
> }
> </PERL SUB>

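Taken together, the suggestions above might be assembled as in the sketch
below. It is one possible arrangement rather than the poster's final code:
the name parse_line is hypothetical, the loop runs over the headers
instead of the columns so that a short line still yields every key, and
the empty prototype of the original "sub parseLine()" is dropped because
an empty prototype declares a sub that takes no arguments.

```perl
use strict;
use warnings;

# Parse one tab-delimited, quote-wrapped line into a hash keyed by header.
sub parse_line {
    my ( $line, @headers ) = @_;
    chomp $line;
    my @columns = split /\t/, $line, -1;    # -1 keeps trailing empty columns
    my %row;
    for my $i ( 0 .. $#headers ) {
        # A column missing from a short line becomes an empty string.
        my $value = defined( $columns[$i] ) ? $columns[$i] : '';
        $value =~ s/"//g;                   # remove the surrounding quotes
        $row{ $headers[$i] } = $value;
    }
    return %row;
}

my %book = parse_line( qq{"0010230"\t"Book of the Dead"\t\t\n},
    qw(id title publisher notes) );
print "$_=$book{$_}\n" for sort keys %book;
```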


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: Tue, 21 Sep 2004 10:19:31 +0200
From: Gabkin <gabriel.larkin_at@at_gmailDOT.com>
Subject: Re: split problem
Message-Id: <2ra6glF182n42U1@uni-berlin.de>

John W. Krahn wrote:
> gabkin wrote:
> 
>> I am having a problem with the split function.
>> Here is the sub that it is used in, it should illustrate what I'm
>> doing, criticism is welcomed...
> 
>          ^^^^^^^^^^^^^^^^^^^^^
> Ok, you asked for it.  :-)

I welcome criticism because I know I am new to perl and am probably 
carrying over mistakes from other languages (Java,VB,COBOL) into my perl 
writing.

>> my @ColumnHeaders = $_[1..@_];
> 
>                       ^^^^^^^^^
> That is wrong.  The '$' at the beginning denotes a scalar value so you 
> are assigning a single value from the @_ array to the @ColumnHeaders 
> array.  And even if you had used a proper array slice, you are accessing 
> an extra element at the end of the array that does not exist.
> 
> $ perl -le'@x="a".."f"; print @x . "  @x"; @y = @x[1..@x]; print @y . "  
> @y"'
> 6  a b c d e f
> 6  b c d e f
> 
> The correct syntax is:
> 
> my @ColumnHeaders = @_[ 1 .. $#_ ];
> 
> However the usual way to do that is:
> 
> my ( $ParseMe, @ColumnHeaders ) = @_;
> 
> Or if you really want to do it on two lines:
> 
> my $ParseMe = shift;
> my @ColumnHeaders = @_;

Uh, Thanks. I'm still trying to understand all of this but I have 
implemented the much cleaner single-line assignment. I have actually 
seen this before and thus should know about it. Thanks!

I still have a lot of trouble with all of the 'magic' variables (like 
$#) and 'shift', it may be because I have never used C...

>> my %returnData;
>> chop($ParseMe);
>  
> chop() isn't really used very much anymore.  You should use chomp() 
> unless you have a valid reason not to.

It's more a case of I started using chop from the start and it works, so 
I haven't changed it, I will try to use 'chomp' over 'chop' though.

> 
>> my @Columns = split(/\t/,$ParseMe);
> 
> As others have pointed out, use the third argument to split().

Yes, I have found that out.
I now use this instead..
my @Columns = split(/\t/,$ParseMe,@ColumnHeaders);

>> for(my $i=0;$i<@Columns;$i++) {
> 
> That is usually written as:
> 
> for my $i ( 0 .. $#Columns ) {

I have never seen that before, it looks quite handy.
I am not too familiar with the $# usage yet, so I will go look it up now.

>> $Columns[$i] =~ s/\"//g; # remove extraneous quotes
> 
> Double quote characters don't have to be escaped in regular expressions.

I tend to err on the side of caution with regexes, due to their 
inconsistent handling between perl,sed,grep and vi (and probably others 
too).
Thanks though, duly noted.

> John

Thanks for these little tips!

I would love to get my entire program inspected and criticized like 
this, but I feel I might be amiss to post the entire thing (1252 lines 
in the main program, and 113 lines in the 'data verification' program), 
because I know of at least one major algorithm that I did wrong, I used 
a hash where I should have used a string.


------------------------------

Date: Tue, 21 Sep 2004 10:23:13 +0200
From: Gabkin <gabriel.larkin_at@at_gmailDOT.com>
Subject: Re: split problem
Message-Id: <2ra6nmF1832k9U1@uni-berlin.de>

thundergnat wrote:

> gabkin wrote:
>> Is there a way to force split to not lose the blank columns?
>>
>> Or would I have to 're-invent' the split algorithm so as to keep them?
>> Any help would be greatly appreciated...
>>
> 
> Did you read the docs for split? (Really. Not being sarcastic.)
> 
> Seems like you are looking for the Limit option on split.
> 
> Since you know how many cloumns you are looking for, specify that.

You are quite right, I did not read the help for split before posting this!

I apologize, since it has answered my question perfectly...


------------------------------

Date: Tue, 21 Sep 2004 11:03:11 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: split problem
Message-Id: <PTT3d.102030$XP3.3252@edtnps84>

Gabkin wrote:
> John W. Krahn wrote:
>>
>> The correct syntax is:
>>
>> my @ColumnHeaders = @_[ 1 .. $#_ ];
>>
>> However the usual way to do that is:
>>
>> my ( $ParseMe, @ColumnHeaders ) = @_;
>>
>> Or if you really want to do it on two lines:
>>
>> my $ParseMe = shift;
>> my @ColumnHeaders = @_;
> 
> Uh, Thanks. I'm still trying to understand all of this but I have 
> implemented the much cleaner single-line assignment. I have actually 
> seen this before and thus should know about it. Thanks!
> 
> I still have a lot of trouble with all of the 'magic' variables (like 
> $#) and 'shift', it may be because I have never used C...

Don't confuse the magic variable $# (which is deprecated)

perldoc perlvar

with the index of the last element in an array

perldoc perldata
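
A short illustration of the array form, with made-up data: $#columns is
the index of the last element of @columns, which is one less than the
element count.

```perl
use strict;
use warnings;

my @columns = qw(id title publisher);

my $count      = @columns;     # array in scalar context: number of elements
my $last_index = $#columns;    # index of the last element

print "$count $last_index $columns[$last_index]\n";    # prints "3 2 publisher"

# Slice of everything after the first element, as in @_[ 1 .. $#_ ].
my @rest = @columns[ 1 .. $#columns ];
print "@rest\n";                                       # prints "title publisher"
```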


>>> my %returnData;
>>> chop($ParseMe);
>>  
>> chop() isn't really used very much anymore.  You should use chomp() 
>> unless you have a valid reason not to.
> 
> It's more a case of I started using chop from the start and it works, so 
> I haven't changed it, I will try to use 'chomp' over 'chop' though.

chop() will always remove and return the last character in a string while 
chomp() will remove the value of $/ if it is at the end of the string.

perldoc -f chop
perldoc -f chomp
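
The practical difference shows up on a line that has no trailing newline,
such as the last line of some files. A small sketch with made-up strings,
assuming $/ still has its default value of "\n":

```perl
use strict;
use warnings;

my $terminated   = "Yendor books\n";    # line as usually read from a file
my $unterminated = "Yendor books";      # e.g. a last line without a newline

chomp( my $chomped_a = $terminated );      # newline removed
chomp( my $chomped_b = $unterminated );    # unchanged: nothing to remove
chop( my $chopped_a = $terminated );       # newline removed
chop( my $chopped_b = $unterminated );     # last letter removed

print "$chomped_a|$chomped_b|$chopped_a|$chopped_b\n";
# prints "Yendor books|Yendor books|Yendor books|Yendor book"
```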


John
-- 
use Perl;
program
fulfillment


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 7163
***************************************

