[30420] in Perl-Users-Digest
Perl-Users Digest, Issue: 1663 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Fri Jun 20 14:14:33 2008
Date: Fri, 20 Jun 2008 11:14:24 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Fri, 20 Jun 2008 Volume: 11 Number: 1663
Today's topics:
Re: listing a directory by size xhoster@gmail.com
Re: listing a directory by size <ben@morrow.me.uk>
Re: listing a directory by size xhoster@gmail.com
Re: listing a directory by size <jurgenex@hotmail.com>
Re: listing a directory by size <someone@example.com>
Re: Perl Script <tzz@lifelogs.com>
Re: Print Spanish characters in Perl? <smallpond@juno.com>
Re: Print Spanish characters in Perl? <jurgenex@hotmail.com>
Re: Print Spanish characters in Perl? <tzz@lifelogs.com>
Substituting in a group <samikr@gmail.com>
Re: Substituting in a group <pgovern@u.washington.edu>
Re: Substituting in a group <mritty@gmail.com>
Re: Substituting in a group <daveb@addr.invalid>
Re: Substituting in a group <samikr@gmail.com>
Re: Substituting in a group <someone@example.com>
Re: Substituting in a group <willem@stack.nl>
Re: Substituting in a group <willem@stack.nl>
Re: Substituting in a group <willem@stack.nl>
Why is ftp.perl.org down all the time? <ignoramus7021@NOSPAM.7021.invalid>
Re: Why is ftp.perl.org down all the time? <brian.d.foy@gmail.com>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 20 Jun 2008 15:13:38 GMT
From: xhoster@gmail.com
Subject: Re: listing a directory by size
Message-Id: <20080620111339.564$GX@newsreader.com>
April <xiaoxia2005a@yahoo.com> wrote:
> For the following program, found somethings seem not seen before, one
> is the input <*>, everything and anything?
Yes. There is nothing special about it, it is just the "file glob" version
of the diamond operator, and just happens to have an argument of '*', which
does indeed mean all non-hidden files (in the current directory).
> Another is the usage
> $i{$f} or $i{$b}, etc., not sure what that means?
%i is a hash. $i{$f} is a hash lookup.
> foreach $f (<*>) { $i{$f} = -S $f };
This is storing each file in the hash %i, with the hash key being the
file name and hash value being the file size.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
------------------------------
Date: Fri, 20 Jun 2008 16:32:32 +0100
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: listing a directory by size
Message-Id: <g5vsi5-uh4.ln1@osiris.mauzo.dyndns.org>
Quoth xhoster@gmail.com:
> April <xiaoxia2005a@yahoo.com> wrote:
> > For the following program, found somethings seem not seen before, one
> > is the input <*>, everything and anything?
>
> Yes. There is nothing special about it, it is just the "file glob" version
> of the diamond operator, and just happens to have an argument of '*', which
> does indeed mean all non-hidden files (in the current directory).
So, not everything and anything. Specifically, it omits all files with
names beginning with '.', even on OSs where that is not a convention for
'hidden file'. It ignores e.g. DOS 'hidden file' attributes.
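A minimal sketch of that behaviour, assuming a scratch directory with made-up file names (visible.txt, notes, .hidden) so the result is predictable. The pattern '.* *' is one common way to pick up the dot files as well; the grep drops the '.' and '..' entries that '.*' can also match.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Build a throwaway directory; the file names are illustrative only.
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
for my $name ('visible.txt', 'notes', '.hidden') {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh or die "Cannot close $name: $!";
}
chdir $dir or die "Cannot chdir to $dir: $!";

# Same as <*>: names beginning with '.' are skipped.
my @plain = sort glob('*');

# Two patterns in one call: dot files plus everything else, minus . and ..
my @all = sort grep { $_ ne '.' && $_ ne '..' } glob('.* *');

print "glob('*'):    @plain\n";
print "glob('.* *'): @all\n";

# Leave the scratch directory so it can be cleaned up at exit.
chdir $start or die "Cannot chdir back to $start: $!";
```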
Ben
--
Many users now operate their own computers day in and day out on various
applications without ever writing a program. Indeed, many of these users
cannot write new programs for their machines...
-- F.P. Brooks, 'No Silver Bullet', 1987 [ben@morrow.me.uk]
------------------------------
Date: 20 Jun 2008 16:02:48 GMT
From: xhoster@gmail.com
Subject: Re: listing a directory by size
Message-Id: <20080620120250.466$gw@newsreader.com>
Ben Morrow <ben@morrow.me.uk> wrote:
> Quoth xhoster@gmail.com:
> > April <xiaoxia2005a@yahoo.com> wrote:
> > > For the following program, found somethings seem not seen before, one
> > > is the input <*>, everything and anything?
> >
> > Yes. There is nothing special about it, it is just the "file glob"
> > version of the diamond operator, and just happens to have an argument
> > of '*', which does indeed mean all non-hidden files (in the current
> > directory).
>
> So, not everything and anything. Specifically, it omits all files with
> names beginning with '.', even on OSs where that is not a convention for
> 'hidden file'. It ignores e.g. DOS 'hidden file' attributes.
I did not know that. I knew it used the Unix interpretation of "*.*"
rather than DOS's, but I didn't know it also used Unix's method of hiddenness.
Maybe 'Why doesn't glob("*.*") get all the files?' should be changed
to make that clearer. I don't know exactly how, maybe from:
You'll need "glob("*")" to get all (non-hidden) files
to
You'll need "glob("*")" to get all (non-dot) files
That doesn't sound all that clear either.
Maybe adding ", including the Unix notion of filenames starting with a
dot being hidden." after:
Because even on non-Unix ports, Perl's glob function follows standard Unix
globbing semantics
Anyway, I found it misleading as-is because I assumed the semantics being
discussed were only those concerning *.*, not also those concerning .*
Xho
------------------------------
Date: Fri, 20 Jun 2008 16:20:35 GMT
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: listing a directory by size
Message-Id: <c1mn541dqhhufpst0gq17qh9ljgnqs0gap@4ax.com>
xhoster@gmail.com wrote:
>Ben Morrow <ben@morrow.me.uk> wrote:
>> So, not everything and anything. Specifically, it omits all files with
>> names beginning with '.', even on OSs where that is not a convention for
>> 'hidden file'. It ignores e.g. DOS 'hidden file' attributes.
>
>I did not know that. I knew it used the Unix interpretation of "*.*"
>rather than DOS's, but I didn't know it also used Unix's method of hiddenness.
It's documented but well hidden in perldoc -f glob:
glob Returns the value of EXPR with filename expansions such as
the standard Unix shell /bin/csh would do. [...]
I guess you just have to know what /bin/csh would do.
>Anyway, I found it misleading as-is because I assumed the semantics being
>discussed were only those concerning *.*, not also those concerning .*
Another good point.
jue
------------------------------
Date: Fri, 20 Jun 2008 17:37:21 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: listing a directory by size
Message-Id: <lVR6k.300$2G6.45@edtnps83>
xhoster@gmail.com wrote:
> April <xiaoxia2005a@yahoo.com> wrote:
>> For the following program, found somethings seem not seen before, one
>> is the input <*>, everything and anything?
>
> Yes. There is nothing special about it, it is just the "file glob" version
> of the diamond operator, and just happens to have an argument of '*', which
> does indeed mean all non-hidden files (in the current directory).
>
>> Another is the usage
>> $i{$f} or $i{$b}, etc., not sure what that means?
>
> %i is a hash. $i{$f} is a hash lookup.
>
>> foreach $f (<*>) { $i{$f} = -S $f };
>
> This is storing each file in the hash %i, with the hash key being the
> file name and hash value being the file size.
Actually the value is true or false depending on whether $f is a socket
or not.
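For comparison, a sketch of what the program presumably intended, using lowercase -s (file size in bytes) instead of -S (socket test). The scratch directory, the file names and the %size hash are assumptions made for the example, not taken from the original program.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Create three files of known, different sizes in a throwaway directory.
my $start   = getcwd();
my $dir     = tempdir(CLEANUP => 1);
my %content = (small => 'a', medium => 'a' x 10, large => 'a' x 100);
for my $name (keys %content) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    print {$fh} $content{$name};
    close $fh or die "Cannot close $name: $!";
}
chdir $dir or die "Cannot chdir to $dir: $!";

# Key: file name. Value: size in bytes.
# -s is false for an empty file, hence the "|| 0".
my %size;
for my $file (glob '*') {
    $size{$file} = (-s $file) || 0;
}

# List the directory by size, smallest first.
my @by_size = sort { $size{$a} <=> $size{$b} } keys %size;
printf "%6d  %s\n", $size{$_}, $_ for @by_size;

chdir $start or die "Cannot chdir back to $start: $!";
```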
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
------------------------------
Date: Fri, 20 Jun 2008 11:34:01 -0500
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: Perl Script
Message-Id: <867ickt69y.fsf@lifelogs.com>
On Thu, 19 Jun 2008 23:43:12 +0200 Martijn Lievaart <m@rtij.nl.invlalid> wrote:
ML> On Wed, 18 Jun 2008 08:28:49 -0700, cartercc wrote:
>> On Jun 18, 10:47 am, Andrew DeFaria <And...@DeFaria.com> wrote:
>>> That depends on what kind of encryption you use...
>>
>> This particular database is Postgres. It uses a one way hash. We also
>> receive social security numbers, which we are required to read into the
>> database but not required to read out, so we hash these.
>>
>> Key management is one headache that I don't have. ;-)
ML> But key collisions may be.
ML> I agree with your reasoning, except it may introduce another risk. If the
ML> SSN is just used for a check, so you check it against a known record,
ML> OK, no problem. But if you use it as a search key, there may be a hash
ML> collision.
SHA-1 is 160 bits vs. 128 for MD5, so using SHA-1 would make collisions
even less likely. I'm 99% sure there are no MD5 collisions either for USA-style
SSNs, which are currently 9 decimal digits and thus will fit in 30 bits.
Even at 10 digits (which may happen some day), it's just 34 bits.
On the other hand, this means that without some kind of global or local
salt, every SSN can easily be obtained from the hashed value if one only
precomputes the table of SSN to MD5 or SHA-1 hashes.
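A rough sketch of both points, using the core Digest::SHA module. The helper bits_for_digits, the placeholder SSN and the salt string are all made up for illustration. Prepending a salt is shown only to demonstrate that it changes the digest; it is not meant as a vetted storage scheme.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use POSIX qw(ceil);

# Bits needed to number every possible N-digit decimal value: ceil(N * log2(10)).
sub bits_for_digits {
    my ($digits) = @_;
    return ceil($digits * log(10) / log(2));
}
printf "%2d digits -> %d bits\n", $_, bits_for_digits($_) for 9, 10;

my $ssn  = '000000000';        # placeholder, not a real SSN
my $salt = 'per-site-secret';  # hypothetical salt; a real one should be random and private

# Unsalted: anyone can precompute this digest for all 10**9 possible inputs.
my $plain = sha1_hex($ssn);

# Salted: a precomputed table is useless without the salt.
my $salted = sha1_hex($salt . $ssn);

print "unsalted: $plain\n";
print "salted:   $salted\n";
```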
Ted
------------------------------
Date: Fri, 20 Jun 2008 09:11:48 -0400
From: smallpond <smallpond@juno.com>
Subject: Re: Print Spanish characters in Perl?
Message-Id: <45f70$485bac9b$6718@news.teranews.com>
DanB wrote:
> This probably has a simple answer, but it isn't in my Perl books and I
> have been trying to google it up using a dialup line locked down by the
> telephone company to 20k bps. (And they don't offer what they call
> broadband. i.e 56k)
>
> Anyway, after a couple of hours of surfing through a very small part of
> about 5 zillion hits for "Perl Unicode Spanish" at the rate of about one
> page per minute, I decided to cheat and just ask.
>
> Using Debian etch.
> I am trying to build a set of Spanish flash cards using TK, and I need to
> be able to display the accented characters. I know that I need to specify
> them in some unicode besides utf-8, but an example of the actual Perl code
> to activate(?) and use the proper unicode table is what I can't find.
> Actually, Unicode is such a big topic that as a Perl beginner I might not
> recognise the answer if I managed to google it up.
>
> Anybody?
>
> Thanks
> Dan
Not sure if usenet will like the accented chars, but this works on the
command line:
perl -e 'use Tk; new MainWindow(-title=>"Cómo está?"); MainLoop'
** Posted from http://www.teranews.com **
------------------------------
Date: Fri, 20 Jun 2008 15:27:25 GMT
From: Jürgen Exner <jurgenex@hotmail.com>
Subject: Re: Print Spanish characters in Perl?
Message-Id: <tfin54lk22l6i0ov8nssq0baornad8ajfb@4ax.com>
Bill H <bill@ts1000.us> wrote:
>On Jun 20, 12:37 am, Jürgen Exner <jurge...@hotmail.com> wrote:
>> DanB <dbxxxx...@yahoo.com> wrote:
>> >I am trying to build a set of Spanish flash cards using TK, and I need to
>> >be able to display the accented characters. I know that I need to specify
>> >them in some unicode besides utf-8,
>>
>> Actually, you don't. Just put them into your code in your favourite
>> editor and treat them like any ASCII character.
>>
>> Problems arise only if your editor saves the file in a different
>> encoding than your display device expects. Typical examples are e.g.
>> saving as UTF-8, then including the text in an HTML page but forgetting
>> to specify UTF-8 as charset. In this case the browser defaults to
>> ISO-Latin-1 and the non-ASCII characters will be messed up, of course.
>> Or saving the file as Windows-1252 (or ISO-Latin-1) and then viewing the
>> output in a DOS Window which for western languages uses OEM CP 850.
>
>I know if I am writing some code using Edit.com (yes dos - can't get
>away from the simplicity of it) I can add the foreign language
>characters to my programs just using the ALT+0??? code and it works
>fine. I haven't tried doing it in a windows based editor.
Oh, now that you mention it, maybe the OP wasn't asking about how to
correctly print/display non-ASCII characters from his Perl program but
about how to enter them on his keyboard in the editor. Two very
different things.
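For the printing half of that distinction, a small sketch of the mismatch described in the quoted text: the same characters take a different number of bytes in UTF-8 and in ISO-Latin-1, so writer and display have to agree on the encoding. It assumes the script itself is saved as UTF-8 (hence "use utf8") and that the terminal expects UTF-8. The sample string is made up.

```perl
use strict;
use warnings;
use utf8;                      # the source file itself is assumed to be UTF-8
use Encode qw(encode);

my $text = "¿Cómo está?";

# The same characters, serialised two different ways.
my $as_utf8   = encode('UTF-8',      $text);
my $as_latin1 = encode('ISO-8859-1', $text);

printf "characters:    %d\n", length $text;
printf "UTF-8 bytes:   %d\n", length $as_utf8;
printf "Latin-1 bytes: %d\n", length $as_latin1;

# Tell Perl which encoding the output device expects before printing.
binmode(STDOUT, ':encoding(UTF-8)');
print "$text\n";
```

If the display expects Latin-1 instead, the layer would be ':encoding(ISO-8859-1)'.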
The easiest way would be to switch the keyboard into Spanish mode. How
to do that depends on your OS.
If you have to type text in multiple different languages frequently you
might want to check out those keyboards that have little LCDs on the
keys, which change to actually match the national layout for the current
keyboard mode.
jue
------------------------------
Date: Fri, 20 Jun 2008 12:52:31 -0500
From: Ted Zlatanov <tzz@lifelogs.com>
Subject: Re: Print Spanish characters in Perl?
Message-Id: <86tzforo2o.fsf@lifelogs.com>
On Fri, 20 Jun 2008 15:27:25 GMT Jürgen Exner <jurgenex@hotmail.com> wrote:
JE> Oh, now that you mention it, maybe the OP wasn't asking about how to
JE> correctly print/display non-ASCII characters from his Perl program but
JE> about how to enter them on his keyboard in the editor. Two very
JE> different things.
JE> The easiest way would be to switch the keyboard into Spanish mode. How
JE> to do that depends on your OS.
JE> If you have to type text in multiple different languages frequently you
JE> might want to check out those keyboards that have little LCDs on the
JE> keys, which change to actually match the national layout for the current
JE> keyboard mode.
As a follow up I wanted to mention my favorite tools for this.
Yudit is a good Unicode editor that can do a lot of edge cases for
input.
Yet another way is to use Emacs with Quail. It lets you set up
transliterated inputs; for example with quail-cyrillic-translit I can
type `къща №5' with k ~ /t a /no 5
There are equivalent methods for Western European inputs:
latin1-alt-postfix for example has this table (from the docs):
           | postfix | examples
-----------+---------+----------
acute      |    '    | a' -> á
grave      |    `    | a` -> à
circumflex |    ^    | a^ -> â
diaeresis  |   \"    | a\" -> ä
tilde      |    ~    | a~ -> ã
cedilla    |    /    | c/ -> ç
nordic     |    /    | d/ -> ð   t/ -> þ   a/ -> å   e/ -> æ   o/ -> ø
others     |   /<>   | s/ -> ß   ?/ -> ¿   !/ -> ¡
           | various | << -> «   >> -> »   o_ -> º   a_ -> ª
This is not directly related to Perl, but it's really hard (IMHO) to set
up easy input of UCS characters in a consistent way across platforms, so
inside Emacs Quail is a pretty good solution. I hope someone finds it
useful.
Ted
------------------------------
Date: Fri, 20 Jun 2008 08:51:26 -0700 (PDT)
From: aquadoll <samikr@gmail.com>
Subject: Substituting in a group
Message-Id: <998891da-3a95-41d4-8faf-17cf005bfeb0@n19g2000prg.googlegroups.com>
(Duplicate copy - not sure if the previous msg got posted !!)
Hello,
I am having the following kind of lines:
ABC XXX,2231,"Math, Physics",0.45,2
PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
ABC PQR,1213,Physics,0.5,1
I want to detect when there are groups of subjects in the 3rd column,
remove the quotes in those cases and replace the comma by # inside the
groups. So, the above lines would be transformed to:
ABC XXX,2231,Math# Physics,0.45,2
PQR ERR,2217,Physics# Chemistry# Math,0.21,5
ABC PQR,1213,Physics,0.5,1
I could not think of any one-liner, so I tried the following:
(Assuming I am reading each line in a variable called $Entry)
if($Entry =~ /"[A-Za-z\s]*(,[A-Za-z\s]*)+"/)
{
my $TempEntry=$Entry;
$TempEntry =~ s/"([A-Za-z\s]*([,][A-Za-z\s]*)+)"/$1/;
# Change comma to # in this phrase
$TempEntry =~ s/,/#/g;
print "TempEntry=$TempEntry\n";
# Now replace the original phrase with this phrase in the original entry
$Entry =~ s/"[A-Za-z\s]*(,[A-Za-z\s]*)+"/$TempEntry/;
print "New Entry=$Entry\n";
}
The above does not work - for some reason all commas get transformed
into # for the first two lines. Where is the problem?
Also, is there a not-so-cryptic one-liner for this one?
Thanks.
------------------------------
Date: Fri, 20 Jun 2008 09:51:21 -0700 (PDT)
From: patrick <pgovern@u.washington.edu>
Subject: Re: Substituting in a group
Message-Id: <186514f6-0005-4010-9678-b8a6ef8593e3@q27g2000prf.googlegroups.com>
On Jun 20, 8:51 am, aquadoll <sam...@gmail.com> wrote:
> (Duplicate copy - not sure if the previous msg got posted !!)
>
> Hello,
> I am having the following kind of lines:
>
> ABC XXX,2231,"Math, Physics",0.45,2
> PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
> ABC PQR,1213,Physics,0.5,1
>
> I want to detect when there are groups of subjects in the 3rd column,
> remove the quotes in those cases and replace the comma by # inside the
> groups. So, the above lines would be transformed to:
>
> ABC XXX,2231,Math# Physics,0.45,2
> PQR ERR,2217,Physics# Chemistry# Math,0.21,5
> ABC PQR,1213,Physics,0.5,1
>
> I could not think of any one-liner, so I tried the following:
> (Assuming I am reading each line in a variable called $Entry)
>
> if($Entry =~ /"[A-Za-z\s]*(,[A-Za-z\s]*)+"/)
> {
>    my $TempEntry=$Entry;
>    $TempEntry =~ s/"([A-Za-z\s]*([,][A-Za-z\s]*)+)"/$1/;
>    # Change comma to # in this phrase
>    $TempEntry =~ s/,/#/g;
>    print "TempEntry=$TempEntry\n";
>    # Now replace the original phrase with this phrase in the original entry
>    $Entry =~ s/"[A-Za-z\s]*(,[A-Za-z\s]*)+"/$TempEntry/;
>    print "New Entry=$Entry\n";
>
> }
>
> The above does not work - for some reason all commas get transformed
> into # for the first two lines. Where is the problem?
>
> Also, is there a not-so-cryptic one-liner for this one?
>
> Thanks.
You might try
perl -F'"' -lane '$F[0] =~ s/"//; $F[1] =~ s/"//;$F[1] =~ s/,/#/;print @F' in.txt > out.txt
Patrick
------------------------------
Date: Fri, 20 Jun 2008 10:02:32 -0700 (PDT)
From: Paul Lalli <mritty@gmail.com>
Subject: Re: Substituting in a group
Message-Id: <69e139c7-8251-44f4-8212-18a4c4f4d5c2@z72g2000hsb.googlegroups.com>
On Jun 20, 11:51 am, aquadoll <sam...@gmail.com> wrote:
> (Duplicate copy - not sure if the previous msg got posted !!)
>
> Hello,
> I am having the following kind of lines:
>
> ABC XXX,2231,"Math, Physics",0.45,2
> PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
> ABC PQR,1213,Physics,0.5,1
>
> I want to detect when there are groups of subjects in the 3rd column,
> remove the quotes in those cases and replace the comma by # inside the
> groups. So, the above lines would be transformed to:
>
> ABC XXX,2231,Math# Physics,0.45,2
> PQR ERR,2217,Physics# Chemistry# Math,0.21,5
> ABC PQR,1213,Physics,0.5,1
> I could not think of any one-liner, so I tried the following:
> (Assuming I am reading each line in a variable called $Entry)
>
> if($Entry =~ /"[A-Za-z\s]*(,[A-Za-z\s]*)+"/)
> {
>    my $TempEntry=$Entry;
>    $TempEntry =~ s/"([A-Za-z\s]*([,][A-Za-z\s]*)+)"/$1/;
This gets rid of all the quotes in the TempEntry.
>    # Change comma to # in this phrase
>    $TempEntry =~ s/,/#/g;
This changes ALL commas in the entire entry, not just the commas that
were originally part of the quoted material.
>    print "TempEntry=$TempEntry\n";
>    # Now replace the original phrase with this phrase in the original entry
>    $Entry =~ s/"[A-Za-z\s]*(,[A-Za-z\s]*)+"/$TempEntry/;
>    print "New Entry=$Entry\n";
>
> }
>
> The above does not work - for some reason all commas get transformed
> into # for the first two lines. Where is the problem?
$TempEntry is the whole line, not just the part of $Entry you cared
about.
#First obtain the grouped items substring
my ($group) = ($TempEntry =~ /("[^"]+?")/);
#Create a copy of the group string to modify:
my $mod_group = $group;
#Replace all commas in the group with '#'
$mod_group =~ tr/,/#/;
#Remove the quotes from the group:
$mod_group =~ s/^"|"$//g;
#Replace the original group with the modified group in the original Entry
#(\Q...\E keeps any regex metacharacters in $group literal)
$TempEntry =~ s/\Q$group\E/$mod_group/;
Hope that helps,
Paul Lalli
------------------------------
Date: Fri, 20 Jun 2008 19:20:15 +0200
From: Dave B <daveb@addr.invalid>
Subject: Re: Substituting in a group
Message-Id: <g3gp0v$ngc$1@registered.motzarella.org>
aquadoll wrote:
> ABC XXX,2231,"Math, Physics",0.45,2
> PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
> ABC PQR,1213,Physics,0.5,1
>
> I want to detect when there are groups of subjects in the 3rd column,
> remove the quotes in those cases and replace the comma by # inside the
> groups. So, the above lines would be transformed to:
>
> ABC XXX,2231,Math# Physics,0.45,2
> PQR ERR,2217,Physics# Chemistry# Math,0.21,5
> ABC PQR,1213,Physics,0.5,1
>[snip]
> Also, is there a not-so-cryptic one-liner for this one?
I'm a beginner in perl, so please forgive any naivety. This oneliner seems
to work:
$ perl -pe 'if (s/"([^"]*)"/$1/) {$m=$n=$1; $n=~s/,/#/g; s/$m/$n/;}' file
ABC XXX,2231,Math# Physics,0.45,2
PQR ERR,2217,Physics# Chemistry# Math,0.21,5
ABC PQR,1213,Physics,0.5,1
This assumes that the text between double quotes (the part that is matched
in the first place) does not appear elsewhere before the double quotes, and
assumes that it's the only text in double quotes in the line.
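One way to sidestep both assumptions is to do the comma replacement inside the substitution itself with the /e modifier, so there is no second pass that has to find the text again. This is only a sketch of that alternative, not what the one-liner above does. The function name hash_quoted_groups is made up.

```perl
use strict;
use warnings;

# Strip the quotes from every "..." group and turn its commas into '#'.
sub hash_quoted_groups {
    my ($line) = @_;
    # /e evaluates the replacement as code: copy the capture, translate
    # commas in the copy, and return the copy (without the quotes).
    # /g repeats this for every quoted group on the line.
    $line =~ s{"([^"]*)"}{ (my $group = $1) =~ tr/,/#/; $group }ge;
    return $line;
}

my @lines = (
    'ABC XXX,2231,"Math, Physics",0.45,2',
    'PQR ERR,2217,"Physics, Chemistry, Math",0.21,5',
    'ABC PQR,1213,Physics,0.5,1',
);
print hash_quoted_groups($_), "\n" for @lines;
```

Commas outside the quotes are never touched, because the translation only ever sees the captured text.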
--
D.
------------------------------
Date: Fri, 20 Jun 2008 10:24:42 -0700 (PDT)
From: aquadoll <samikr@gmail.com>
Subject: Re: Substituting in a group
Message-Id: <57fc25f3-a6db-4a5c-b599-6fe78ef37476@v1g2000pra.googlegroups.com>
On Jun 20, 11:02 am, Paul Lalli <mri...@gmail.com> wrote:
> On Jun 20, 11:51 am, aquadoll <sam...@gmail.com> wrote:
>
>
>
> > (Duplicate copy - not sure if the previous msg got posted !!)
>
> > Hello,
> > I am having the following kind of lines:
>
> > ABC XXX,2231,"Math, Physics",0.45,2
> > PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
> > ABC PQR,1213,Physics,0.5,1
>
> > I want to detect when there are groups of subjects in the 3rd column,
> > remove the quotes in those cases and replace the comma by # inside the
> > groups. So, the above lines would be transformed to:
>
> > ABC XXX,2231,Math# Physics,0.45,2
> > PQR ERR,2217,Physics# Chemistry# Math,0.21,5
> > ABC PQR,1213,Physics,0.5,1
> > I could not think of any one-liner, so I tried the following:
> > (Assuming I am reading each line in a variable called $Entry)
>
> > if($Entry =~ /"[A-Za-z\s]*(,[A-Za-z\s]*)+"/)
> > {
> >    my $TempEntry=$Entry;
> >    $TempEntry =~ s/"([A-Za-z\s]*([,][A-Za-z\s]*)+)"/$1/;
>
> This gets rid of all the quotes in the TempEntry.
>
> >    # Change comma to # in this phrase
> >    $TempEntry =~ s/,/#/g;
>
> This changes ALL commas in the entire entry, not just the commas that
> were originally part of the quoted material.
>
> >    print "TempEntry=$TempEntry\n";
> >    # Now replace the original phrase with this phrase in the original entry
> >    $Entry =~ s/"[A-Za-z\s]*(,[A-Za-z\s]*)+"/$TempEntry/;
> >    print "New Entry=$Entry\n";
>
> > }
>
> > The above does not work - for some reason all commas get transformed
> > into # for the first two lines. Where is the problem?
>
> $TempEntry is the whole line, not just the part of $Entry you cared
> about.
>
> #First obtain the grouped items substring
> my ($group) = ($TempEntry =~ /("[^"]+?")/);
> #Create a copy of the group string to modify:
> my $mod_group = $group;
> #Replace all commas in the group with '#'
> $mod_group =~ tr/,/#/;
> #Remove the quotes from the group:
> $mod_group =~ s/^"|"$//g;
> #Replace the original group with the modified group in the original Entry
> #(\Q...\E keeps any regex metacharacters in $group literal)
> $TempEntry =~ s/\Q$group\E/$mod_group/;
>
> Hope that helps,
> Paul Lalli
Hello,
Thanks for all the replies. I was actually trying to get the part of
$Entry I am interested in, in $TempEntry.
I used the following 2 lines (as shown in the OP):
$TempEntry=$Entry;
$TempEntry =~ s/"([A-Za-z\s]*([,][A-Za-z\s]*)+)"/$1/;
Why did the above not get "the part of $Entry I am interested in"
in $TempEntry? What did I do wrong?
Thanks.
------------------------------
Date: Fri, 20 Jun 2008 17:50:55 GMT
From: "John W. Krahn" <someone@example.com>
Subject: Re: Substituting in a group
Message-Id: <36S6k.302$2G6.179@edtnps83>
patrick wrote:
> On Jun 20, 8:51 am, aquadoll <sam...@gmail.com> wrote:
>>
>> I am having the following kind of lines:
>>
>> ABC XXX,2231,"Math, Physics",0.45,2
>> PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
>> ABC PQR,1213,Physics,0.5,1
>>
>> I want to detect when there are groups of subjects in the 3rd column,
>> remove the quotes in those cases and replace the comma by # inside the
>> groups. So, the above lines would be transformed to:
>>
>> ABC XXX,2231,Math# Physics,0.45,2
>> PQR ERR,2217,Physics# Chemistry# Math,0.21,5
>> ABC PQR,1213,Physics,0.5,1
>
> You might try
> perl -F'"' -lane '$F[0] =~ s/"//; $F[1] =~ s/"//;$F[1] =~ s/,/#/;print
> @F' in.txt > out.txt
split() *removes* the expression you are splitting on so there are no
'"' characters in @F to remove so that could be simplified to:
perl -F'"' -lane '$F[1] =~ s/,/#/;print @F' in.txt > out.txt
But that only changes the first ',' to a '#' and not all of them so you
probably want this instead:
perl -F'"' -lane '$F[1] =~ s/,/#/g;print @F' in.txt > out.txt
Or:
perl -F'"' -lane '$F[1] =~ tr/,/#/;print @F' in.txt > out.txt
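The same idea as a small script, for anyone who prefers it spelled out. This is a sketch that generalises the one-liner slightly: after splitting on '"', every odd-numbered field was inside quotes, so all of them get translated rather than just $F[1]. It assumes the quotes on each line are balanced. The function name rewrite_by_split is made up.

```perl
use strict;
use warnings;

sub rewrite_by_split {
    my ($line) = @_;
    # split removes the '"' separators; the -1 limit keeps trailing empty fields.
    my @fields = split /"/, $line, -1;
    # Fields 1, 3, 5, ... are the ones that sat between a pair of quotes.
    for my $i (grep { $_ % 2 } 0 .. $#fields) {
        $fields[$i] =~ tr/,/#/;
    }
    # Joining with the empty string leaves the quotes out, like "print @F".
    return join '', @fields;
}

print rewrite_by_split('PQR ERR,2217,"Physics, Chemistry, Math",0.21,5'), "\n";
print rewrite_by_split('ABC PQR,1213,Physics,0.5,1'), "\n";
```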
John
------------------------------
Date: Fri, 20 Jun 2008 17:51:49 +0000 (UTC)
From: Willem <willem@stack.nl>
Subject: Re: Substituting in a group
Message-Id: <slrng5nrhl.66j.willem@snail.stack.nl>
aquadoll wrote:
) ABC XXX,2231,"Math, Physics",0.45,2
) PQR ERR,2217,"Physics, Chemistry, Math",0.21,5
) ABC PQR,1213,Physics,0.5,1
)
) I want to detect when there are groups of subjects in the 3rd column,
) remove the quotes in those cases and replace the comma by # inside the
) groups. So, the above lines would be transformed to:
)
) ABC XXX,2231,Math# Physics,0.45,2
) PQR ERR,2217,Physics# Chemistry# Math,0.21,5
) ABC PQR,1213,Physics,0.5,1
)
) I could not think of any one-liner, so I tried the following:
) (Assuming I am reading each line in a variable called $Entry)
How about:
while (s/(")(.*?)"/$2/) { substr($_,$+[1]-1,$+[2]-$+[1]) =~ s/,/#/g }
Which should do what you want, even for multiple quoted strings.
SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
------------------------------
Date: Fri, 20 Jun 2008 17:56:31 +0000 (UTC)
From: Willem <willem@stack.nl>
Subject: Re: Substituting in a group
Message-Id: <slrng5nrqf.66j.willem@snail.stack.nl>
aquadoll wrote:
) Hello,
) Thanks for all the replies. I was actually trying to get the part of
) $Entry I am interested in, in $TempEntry.
) I used the following 2 lines (as shown in the OP):
) $TempEntry=$Entry;
) $TempEntry =~ s/"([A-Za-z\s]*([,][A-Za-z\s]*)+)"/$1/;
)
) Why did the above not get "the part of $Entry I am interested in"
) in $TempEntry? What did I do wrong?
It's a substitution. You substitute the quoted part with the part
between quotes. The rest remains intact.
To get just the part between quotes, use this:
my ($TempEntry) = $Entry =~ /"(.*?)"/;
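A short side-by-side sketch of the difference, using one of the sample lines from the original post. The variable names $substituted, $captured and $matched are made up for the example.

```perl
use strict;
use warnings;

my $Entry = 'ABC XXX,2231,"Math, Physics",0.45,2';

# s/// edits a copy of the whole line; only the quotes disappear.
(my $substituted = $Entry) =~ s/"(.*?)"/$1/;

# A match in list context returns the captures, so this keeps only group 1.
my ($captured) = $Entry =~ /"(.*?)"/;

# The same match in scalar context just reports success (1) or failure.
my $matched = $Entry =~ /"(.*?)"/;

print "substituted: $substituted\n";
print "captured:    $captured\n";
print "matched:     $matched\n";
```

The parentheses around the variable on the left-hand side are what force list context.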
Why the complicated match string by the way ?
Do you only want to match quoted strings that contain a comma ?
It seems needlessly complex.
SaSW, Willem
------------------------------
Date: Fri, 20 Jun 2008 18:01:28 +0000 (UTC)
From: Willem <willem@stack.nl>
Subject: Re: Substituting in a group
Message-Id: <slrng5ns3o.6kv.willem@snail.stack.nl>
Willem wrote:
) while (s/(")(.*?)"/$2/) { substr($_,$+[1]-1,$+[2]-$+[1]) =~ s/,/#/g }
Of course,
while (s/"(.*?)"/$1/) { substr($_,$-[1]-1,$+[1]-$-[1]) =~ s/,/#/g }
is slightly easier.
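For readers who have not met @- and @+ before: $-[1] and $+[1] are the start and end offsets of capture group 1 in the string as it was when the match ran. A small sketch of why the loop subtracts 1 follows. The sample line is from the original post, and the variables $start, $end and $copy are made up for the example.

```perl
use strict;
use warnings;

my $line = 'PQR ERR,2217,"Physics, Chemistry, Math",0.21,5';

my ($start, $end);
if ($line =~ /"(.*?)"/) {
    # Offsets of group 1 in the unmodified string.
    ($start, $end) = ($-[1], $+[1]);
    print "group 1 spans offsets $start..$end: ",
          substr($line, $start, $end - $start), "\n";
}

# In the loop, s/// has already deleted the opening quote by the time
# substr runs, so the group now begins one position earlier: $-[1] - 1.
my $copy = $line;
for ($copy) {
    while (s/"(.*?)"/$1/) {
        substr($_, $-[1] - 1, $+[1] - $-[1]) =~ s/,/#/g;
    }
}
print "$copy\n";
```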
SaSW, Willem
------------------------------
Date: Fri, 20 Jun 2008 10:08:53 -0500
From: Ignoramus7021 <ignoramus7021@NOSPAM.7021.invalid>
Subject: Why is ftp.perl.org down all the time?
Message-Id: <BZqdnTWc_r2YVcbVnZ2dnUVZ_qHinZ2d@giganews.com>
Inquiring minds want to know.
--
Due to extreme spam originating from Google Groups, and their inattention
to spammers, I and many others block all articles originating
from Google Groups. If you want your postings to be seen by
more readers you will need to find a different means of
posting on Usenet.
http://improve-usenet.org/
------------------------------
Date: Fri, 20 Jun 2008 10:35:57 -0500
From: brian d foy <brian.d.foy@gmail.com>
Subject: Re: Why is ftp.perl.org down all the time?
Message-Id: <200620081035573117%brian.d.foy@gmail.com>
In article <BZqdnTWc_r2YVcbVnZ2dnUVZ_qHinZ2d@giganews.com>,
Ignoramus7021 <ignoramus7021@NOSPAM.7021.invalid> wrote:
> Inquiring minds want to know.
What are you trying to do?
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 1663
***************************************