[23622] in Perl-Users-Digest
Perl-Users Digest, Issue: 5829 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Nov 19 14:05:54 2003
Date: Wed, 19 Nov 2003 11:05:11 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Wed, 19 Nov 2003 Volume: 10 Number: 5829
Today's topics:
Re: [OT] Re: Creating UNICODE filenames with PERL 5.8 <usenet@morrow.me.uk>
Re: [OT] Re: Creating UNICODE filenames with PERL 5.8 <flavell@ph.gla.ac.uk>
Re: Creating UNICODE filenames with PERL 5.8 <ben.liddicott@comodogroup.com>
Re: Creating UNICODE filenames with PERL 5.8 <usenet@morrow.me.uk>
Re: Creating UNICODE filenames with PERL 5.8 (Malcolm Dew-Jones)
Re: Excel question. <ben.liddicott@comodogroup.com>
Re: How do I substitute the 2nd occurance of a comma fr (Greg Gallagher)
How to extract info from a HTML input box <bluecat22@go.com>
Re: How to extract info from a HTML input box <usenet@morrow.me.uk>
Image::Magick memory leak question <stanb@panix.com>
Re: Inserting the same thing multi times into array. <spikeywan@bigfoot.com.delete.this.bit>
Re: Inserting the same thing multi times into array. <feralboncer@netscape.net>
Re: Inserting the same thing multi times into array. <spikeywan@bigfoot.com.delete.this.bit>
Re: Inserting the same thing multi times into array. <nobull@mail.com>
Re: Match and cut regex? <raisin@delete-this-trash.mts.net>
Re: Match and cut regex? (Tad McClellan)
Matching two arrays, and returning the "rest" (petersson)
Re: Matching two arrays, and returning the "rest" <usenet@morrow.me.uk>
Re: Matching two arrays, and returning the "rest" <nobull@mail.com>
Re: Matching two arrays, and returning the "rest" <usenet@morrow.me.uk>
Re: mod_perl win2k mail::sender problem <news@ducati.demon.co.uk>
Need some help... (warpman)
Re: Need some help... <xaonon@hotpop.com>
Re: perl LibXML <ernst-udo.wallenborn@freenet.de>
Re: Project Organization <ben.liddicott@comodogroup.com>
Re: Project Organization <usenet@morrow.me.uk>
Re: Project Organization <ben.liddicott@comodogroup.com>
Re: Project Organization <trammell+usenet@hypersloth.invalid>
Re: Project Organization (Tad McClellan)
Re: regex to convert 1000000 -> 1,000,000 ? (Tad McClellan)
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Wed, 19 Nov 2003 14:46:58 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: [OT] Re: Creating UNICODE filenames with PERL 5.8
Message-Id: <bpfvp2$2mn$1@wisteria.csv.warwick.ac.uk>
"Ben Liddicott" <ben.liddicott@comodogroup.com> wrote:
> Some history required...
>
>
> "Ben Morrow" <usenet@morrow.me.uk> wrote in message
> news:bpel00$sl6$1@wisteria.csv.warwick.ac.uk...
> > yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones) wrote:
> > > Ben Morrow (usenet@morrow.me.uk) wrote:
> > > : OK, your problem here is that Win2k is being stupid about Unicode: any
> > > : sensible OS that understood UTF8 would be fine :).
> > >
> > > Hum, NT has been handling unicode for at least ten years (3.5, 1993) by
> > > the simple expedient of using 16 bit characters. It is hardware that is
> > > stupid, by continuing to use ancient tiny 8 bit elementary units.
> >
> > OK, I invited that with gratuitous OS-bashing :)... nevertheless:
> >
> > 1. Unicode is *NOT* a 16-bit character set. UTF16 is an evil bodge to
> > work around those who started assuming it was before the standards
> > were properly in place.
>
> Unicode 1.0 WAS a 16-bit character set. So there. UTF16 is a representation
> of Unicode 3.0 which is selected to be backwards compatible with Unicode
> 1.0.
OK. This doesn't stop it being completely wrong. Given the choice
between breaking compatibility with the few people who implemented
Unicode 1.0, breaking compatibility with everyone else who was still
assuming everything was a superset of ASCII, and creating seven[1]
different, incompatible representations of the supposed answer to
character encoding problems, it is fairly clear to me at least which is
the right answer.
Not to mention that, because of the endianness problem, ucs-2 was
broken as an encoding from the start.
[1] utf8, utf16 BE, LE and with BOM, utf32 ditto.
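As a rough illustration of footnote [1], the sketch below encodes one character in each of the seven forms with the core Encode module. The choice of U+20AC (EURO SIGN) and the %hex hash are assumptions made for the example only.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One sample character; any non-ASCII character would do.
my $char = "\x{20AC}";    # EURO SIGN

# The seven representations from the footnote: utf8, then utf16 and
# utf32 each as BE, LE and "with BOM" (Encode's plain UTF-16 / UTF-32).
my %hex;
for my $enc (qw(UTF-8 UTF-16BE UTF-16LE UTF-16 UTF-32BE UTF-32LE UTF-32)) {
    $hex{$enc} = unpack 'H*', encode($enc, $char);
    printf "%-9s %s\n", $enc, $hex{$enc};
}
```

Every line of output is a different byte sequence for the same character, which is the incompatibility being complained about.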
> So you can knock them for not having the foresight to know that 65535
> characters wouldn't be enough.
I can also knock them for not having changed in the ten years since
NT3.5 was released. It is not *that* difficult a change to implement,
as Perl 5.8 has demonstrated; even though it has some nasty bits,
ditto.
Ben
--
$.=1;*g=sub{print@_};sub r($$\$){my($w,$x,$y)=@_;for(keys%$x){/main/&&next;*p=$
$x{$_};/(\w)::$/&&(r($w.$1,$x.$_,$y),next);$y eq\$p&&&g("$w$_")}};sub t{for(@_)
{$f&&($_||&g(" "));$f=1;r"","::",$_;$_&&&g(chr(0012))}};t # ben@morrow.me.uk
$J::u::s::t, $a::n::o::t::h::e::r, $P::e::r::l, $h::a::c::k::e::r, $.
------------------------------
Date: Wed, 19 Nov 2003 14:58:53 +0000
From: "Alan J. Flavell" <flavell@ph.gla.ac.uk>
Subject: Re: [OT] Re: Creating UNICODE filenames with PERL 5.8
Message-Id: <Pine.LNX.4.53.0311191454390.25320@ppepc56.ph.gla.ac.uk>
On Wed, 19 Nov 2003, Ben Liddicott wrote:
> Guess what: Decision on character set had to be made in the eighties.
Yeah: as far as I recall, IBM invented DBCS EBCDIC. Doubtless a fine
standard for its time. But things move on.
------------------------------
Date: Wed, 19 Nov 2003 15:37:14 -0000
From: "Ben Liddicott" <ben.liddicott@comodogroup.com>
Subject: Re: Creating UNICODE filenames with PERL 5.8
Message-Id: <bpg2nf$qg7$1@kylie.comodogroup.com>
Probably your best bet is to try to use Unicode::String to convert the file
names to utf-8. It is obviously reading the filenames using the Unicode API,
(otherwise you would get REPLACEMENT CHARACTER instead), but not recognising
that it has done so.
Alternatively, with Win32::API you can use Win32 FindFirstFileW,
FindNextFileW, FindClose. This should be pretty much guaranteed to work.
Alternatively you can see if File::Find works, though I suspect it may
suffer the same problems.
Alternatively again, you can try spawning a cmd shell, and parsing the
output. This is only going to be any good if ${^WIDE_SYSTEM_CALLS} affects
qx() or open("command |"), and I don't know if it does or not.
If you specify /u to cmd.exe, it sets the console output to UTF-16, which
you could convert back by hand, using Unicode::String. I'm not entirely sure
how one could send unicode in through $sDirName, though. Experimentation may
tell you.
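use IO::File;
# /u means unicode, /c means run command and exit
my $sDirCommand = qq(cmd.exe /u /c dir /a "$sDirName");
# the trailing "|" makes IO::File open a pipe from the command
my $fh = IO::File->new("$sDirCommand |") or die "Cannot run dir: $!";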
Cheers,
Ben Liddicott
"Allan Yates" <allan@yates.ca> wrote in message
news:d6f51524.0311171301.168edab3@posting.google.com...
> The key was the missing "-C". I didn't clue in from the documentation
> that this was important. Once I added that command line parameter, the
> file was created with the correct name.
>
> My next step was to read the file name from the directory. However, I
> thought I read in some documentation somewhere that 'readdir' is not
> UNICODE aware. I seemed to prove this by reading the directory
> containing the file I just created. It comes back with a two character
> file name that 'ord' into 0xd8 and 0xb6 as you indicated.
>
> Do you know of a method of reading directories to get the UNICODE file
> names?
>
>
------------------------------
Date: Wed, 19 Nov 2003 16:11:19 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Creating UNICODE filenames with PERL 5.8
Message-Id: <bpg4n7$6pp$1@wisteria.csv.warwick.ac.uk>
[stop top-posting]
"Ben Liddicott" <ben.liddicott@comodogroup.com> wrote:
> "Allan Yates" <allan@yates.ca> wrote in message
> news:d6f51524.0311171301.168edab3@posting.google.com...
> > The key was the missing "-C". I didn't clue in from the documentation
> > that this was important. Once I added that command line parameter, the
> > file was created with the correct name.
Note that the functionality of -C no longer exists under 5.8.1, and
perl581delta claims it didn't work under 5.8.0 either.
> > My next step was to read the file name from the directory. However, I
> > thought I read in some documentation somewhere that 'readdir' is not
> > UNICODE aware. I seemed to prove this by reading the directory
> > containing the file I just created. It comes back with a two character
> > file name that 'ord' into 0xd8 and 0xb6 as you indicated.
> >
> > Do you know of a method of reading directories to get the UNICODE file
> > names?
>
> Probably your best bet is to try to use Unicode::String to convert the file
> names to utf-8. It is obviously reading the filenames using the Unicode API,
> (otherwise you would get REPLACEMENT CHARACTER instead), but not recognising
> that it has done so.
No. The right answer is to use Encode::decode to convert *from* utf16.
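A minimal sketch of that decode step follows. It assumes the bytes in hand really are UTF-16LE; whether readdir returns them in that form on a given Windows build is not something this example establishes. The sample name (U+0636 followed by ".txt") is made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical raw bytes, built here so the example is self-contained.
my $raw = encode('UTF-16LE', "\x{0636}.txt");

# Convert *from* UTF-16LE bytes to a Perl character string.
my $name = decode('UTF-16LE', $raw);

printf "%d bytes in, %d characters out, first is U+%04X\n",
    length $raw, length $name, ord $name;
```

If the bytes turn out to be in some other encoding, only the first argument to decode changes.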
> Alternatively you can see if File::Find works, though I suspect it may
> suffer the same problems.
Why don't you look? A quick grep through perldoc -m File::Find shows
that the names come straight out of readdir, so yes, it will suffer
exactly the same problems.
> Alternatively again, you can try spawning a cmd shell, and parsing the
> output. This is only going to be any good if ${^WIDE_SYSTEM_CALLS} affects
> qx() or open("command |"), and I don't know if it does or not.
Bleech. And no, -C will have no effect on this; rather, it will be
affected by the PerlIO layers pushed onto the filehandle.
> If you specify /u to cmd.exe, it sets the console output to UTF-16, which
> you could convert back by hand, using Unicode::String. I'm not entirely sure
> how one could send unicode in through $sDirName, though.
Either -C will use a Unicode-aware pipe-opening API, and it will Just
Work, or use Encode::encode to encode it into whatever Windows expects
command lines to be specified in, probably utf16.
Ben
--
I've seen things you people wouldn't believe: attack ships on fire off the
shoulder of Orion; I've watched C-beams glitter in the darkness near the
Tannhauser Gate. All these moments will be lost, in time, like tears in rain.
Time to die. |-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-| ben@morrow.me.uk
------------------------------
Date: 19 Nov 2003 10:18:41 -0800
From: yf110@vtn1.victoria.tc.ca (Malcolm Dew-Jones)
Subject: Re: Creating UNICODE filenames with PERL 5.8
Message-Id: <3fbbb401@news.victoria.tc.ca>
Alan J. Flavell (flavell@ph.gla.ac.uk) wrote:
: On Wed, 18 Nov 2003, Malcolm Dew-Jones wrote:
: > Hum, NT has been handling unicode for at least ten years (3.5, 1993) by
: > the simple expedient of using 16 bit characters.
: ...which unfortunately turns out to be somewhat of a mistake, seeing
: that Unicode went and broke the 16-bit boundary.
Which was also a mistake. "character" now includes all the hieroglyphics
of places like China, (but why not all the hieroglyphics of, say, ancient
Egypt? why not all the standardized international road symbols?). When
the Arabians invented the modern idea of characters then it became widely
recognized as much more powerful, fundamentally better, and fundamentally
"different" than the old single-picture-means-a-word method of writing.
Now we have jumped backwards 1800 years. Things like Chinese writing
should not be treated using standardized application level encodings, just
as we now standardize many markup languages for encoding other higher level
data. ($0.02)
: > It is hardware that is
: > stupid, by continuing to use ancient tiny 8 bit elementary units.
: utf-8 is the closest they managed to get to variable-length character
: encoding. It's not perfect, but it gets around quite a lot of the
: compatibility problems that exist with other approaches.
: > Imagine if all that hardware still used 16 or 24 bit memory addresses.
: Imagine if every us-ascii character were required to occupy 64 bits?
First, it would never be 64 bits for a character. Even if we hardcoded
current unicode values, it would be no more than 24 bits per character.
That's three (or two at 16 bits) times the space, which for the vast
majority of users would be irrelevant anyway due to the enormous increase
in storage capacities.
Also, it is almost a norm to store any static data in compressed format,
and compression tools would utilize the larger character size to pack more
data, so the total storage space required for a lot of data would not
increase.
Things that would truly be affected, such as humungous databases, already
have to use many mechanisms to be able to manipulate the data, and I'm
sure they could find ways to handle the larger volumes, probably by using
the exact reverse of wide characters.
: And then there's legacy data to think about.
stored on legacy systems, and manipulated using legacy software and
hardware.
This is Murphy's law. Because the old systems have been successful, new
systems can't be made better.
: > Character size was always a compromise between functionality and memory.
: Agreed.
: > Character size continually increased from the first character manipulating
: > electronic equipment of the (gee, way way back 1930's or so, believe it or
: > not)
: Interestingly, those early codes regularly had shift-in and shift-out
: codes to extend their repertoire. A practice which faded out for a
: while,
yes, as soon as hardware costs made larger characters possible, they got
rid of the kludginess.
: almost got reborn in a big way in ISO-2022, and then -
: iso-10646/Unicode and associated encodings. I wonder what the future
: holds in store? ;-)
: > Character size remains frozen due to one of Murphy's laws regarding the
: > success of hardware first built using compromises that were appropriate
: > twenty years ago.
: It's easy to poke fun, but it's harder to come up with a viable
: compromise IMHO.
I am out of time, to say more.
------------------------------
Date: Wed, 19 Nov 2003 15:40:32 -0000
From: "Ben Liddicott" <ben.liddicott@comodogroup.com>
Subject: Re: Excel question.
Message-Id: <bpg2tk$qoq$1@kylie.comodogroup.com>
Converting VB to Win32::OLE is easy. The ActiveState documentation contains
several examples.
See under "Using OLE with Perl" in your documentation, which should be
under:
html/faq/Windows/ActivePerl-Winfaq12.html
in your perl installation directory.
Cheers,
Ben Liddicott
"Richard S Beckett" <spikeywan@bigfoot.com.delete.this.bit> wrote in message
news:bpfktd$iqs$1@newshost.mot.com...
> Guys,
>
> I want to be able to work out how to do things in excel by myself, but I
> don't know how to convert between the macro syntax, and that required in
> perl modules.
------------------------------
Date: 19 Nov 2003 08:04:41 -0800
From: ggallagher@swcis.nhs.uk (Greg Gallagher)
Subject: Re: How do I substitute the 2nd occurance of a comma from a text file
Message-Id: <58950242.0311190804.617da714@posting.google.com>
Thanks to everyone who offered help. Tore, you obviously have worked
with csv files before!!
I checked out cpan and found some whizzy code which has pretty much
sorted out the problem with some horrid data :o)
Can't say that I fully understand it, but here it is:
@new = ();
push(@new, $+) while $_ =~ m{
"([^\"\\]*(?:\\.[^\"\\]*)*)",?
| ([^,]+),?
| ,
}gx;
push(@new, undef) if substr($_,-1,1) eq ',';
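For anyone else puzzling over it, here is the same pattern wrapped in a small sub and run on one sample line. The sub name split_csv_line and the sample input are made up for this sketch; the regex itself is unchanged apart from the comments.

```perl
use strict;
use warnings;

# Hypothetical helper: split one CSV line into fields.
sub split_csv_line {
    my ($text) = @_;
    my @new;
    # $+ holds whichever capture group matched last in each iteration.
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # a quoted field, backslash escapes allowed
      | ([^,]+),?                        # an unquoted, non-empty field
      | ,                                # an empty field
    }gx;
    # A trailing comma means one more (empty) field at the end.
    push(@new, undef) if substr($text, -1, 1) eq ',';
    return @new;
}

my @fields = split_csv_line('a,"b,c",d');
print join('|', @fields), "\n";    # a|b,c|d
```

The comma inside the quotes stays with its field, which is exactly what a plain split on commas gets wrong.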
Cheers
Greg
------------------------------
Date: Tue, 18 Nov 2003 16:57:23 -0500
From: "Blue Cat" <bluecat22@go.com>
Subject: How to extract info from a HTML input box
Message-Id: <bpfsbi0276b@enews2.newsguy.com>
I'm trying to write an HTML page with a PerlScript. Something like what you
see below. How do I do it?
<HTML>
<SCRIPT LANGUAGE = "PerlScript">
sub DoSomething {
$a = [data recovered from an input box called txtNumber];
$b = [a mathematical operation with $a];
[Put $b into an input box called Result];
}
</SCRIPT>
<BODY>
<INPUT TYPE = TEXT NAME = "txtNumber" SIZE = 25>
<INPUT TYPE = BUTTON VALUE = "Click Me" onclick = "DoSomething()"> <BR>
<INPUT TYPE = TEXT NAME = "Result" VALUE = 0 SIZE = 25>
</BODY>
</HTML>
------------------------------
Date: Wed, 19 Nov 2003 14:49:03 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: How to extract info from a HTML input box
Message-Id: <bpfvsv$2mn$2@wisteria.csv.warwick.ac.uk>
"Blue Cat" <bluecat22@go.com> wrote:
> I'm trying to write an HTML page with a PerlScript. Something like what you
> see below. How do I do it?
>
> <HTML>
> <SCRIPT LANGUAGE = "PerlScript">
> sub DoSomething {
> $a = [data recovered from an input box called txtNumber];
> $b = [a mathematical operation with $a];
> [Put $b into an input box called Result];
> }
> </SCRIPT>
> <BODY>
> <INPUT TYPE = TEXT NAME = "txtNumber" SIZE = 25>
> <INPUT TYPE = BUTTON VALUE = "Click Me" onclick = "DoSomething()"> <BR>
> <INPUT TYPE = TEXT NAME = "Result" VALUE = 0 SIZE = 25>
> </BODY>
> </HTML>
Umm... like that. For information on how to interact with the elements
on the page, see the documentation for JavaScript in IE and
translate. The object model is exactly the same.
Ben
--
If you put all the prophets, | You'd have so much more reason
Mystics and saints | Than ever was born
In one room together, | Out of all of the conflicts of time.
ben@morrow.me.uk |----------------+---------------| The Levellers, 'Believers'
------------------------------
Date: Wed, 19 Nov 2003 17:46:32 +0000 (UTC)
From: Stan Brown <stanb@panix.com>
Subject: Image::Magick memory leak question
Message-Id: <bpga9o$ouj$2@reader2.panix.com>
OK, I've got this memory leak down to a 3 line example:
my $image = Image::Magick->new(magick=>'GIF',font=>'clean');
$image->Read($l_tmpfile);
undef $image;
In a loop leaks memory.
Can anyone tell me if I'm doing something wrong here?
--
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin
------------------------------
Date: Wed, 19 Nov 2003 14:12:15 -0000
From: "Richard S Beckett" <spikeywan@bigfoot.com.delete.this.bit>
Subject: Re: Inserting the same thing multi times into array.
Message-Id: <bpftqs$1us$1@newshost.mot.com>
A very nice man said:
> > splice @array, 2, 0, 7*(""); # i.e. 7 lots of ""
> splice @array, 2, 0, ("")x7;
Oooh! So close! :-) Thanks.
Someone else said:
> Pls do a perldoc perlop and search for 'repeat'
Well, I tried perldoc, but without asking this question, I wouldn't have
thought of trying the word 'repeat'. Similarly, I searched google, before
asking. As both were fruitless I asked. That _is_ what this newsgroup is
for, right?
--
R.
GPLRank +79.699
------------------------------
Date: 19 Nov 2003 15:01:11 GMT
From: Ferine Boncer <feralboncer@netscape.net>
Subject: Re: Inserting the same thing multi times into array.
Message-Id: <slrnbrn19r.2fk.feralboncer@auk.jeekay>
Richard S Beckett <spikeywan@bigfoot.com.delete.this.bit> wrote:
> A very nice man said:
> > > splice @array, 2, 0, 7*(""); # i.e. 7 lots of ""
> > splice @array, 2, 0, ("")x7;
>
> Oooh! So close! :-) Thanks.
>
> Someone else said:
> > Pls do a perldoc perlop and search for 'repeat'
>
> Well, I tried perldoc, but without asking this question, I wouldn't have
> thought of trying the word 'repeat'. Similarly, I searched google, before
> asking. As both were fruitless I asked. That _is_ what this newsgroup is
> for, right?
I also mentioned that the answer is to use the operator 'x' and *then*
requested you to read the perldoc and also directed you to search for
the appropriate word so that you know all the ins-and-outs of the
operator.
I wasn't trying to be unhelpful...
:)
--
FB
------------------------------
Date: Wed, 19 Nov 2003 15:26:59 -0000
From: "Richard S Beckett" <spikeywan@bigfoot.com.delete.this.bit>
Subject: Re: Inserting the same thing multi times into array.
Message-Id: <bpg270$845$1@newshost.mot.com>
> I also mentioned that the answer is to use the operator 'x' and *then*
> requested you to read the perldoc and also directed you to search for
> the appropriate word so that you know all the ins-and-outs of the
> operator.
> I wasn't trying to be unhelpful...
> :)
Sorry.
--
R.
GPLRank +79.699
------------------------------
Date: 19 Nov 2003 17:38:53 +0000
From: Brian McCauley <nobull@mail.com>
Subject: Re: Inserting the same thing multi times into array.
Message-Id: <u9fzgks1g2.fsf@wcl-l.bham.ac.uk>
Ben Morrow <usenet@morrow.me.uk> writes:
> "Richard S Beckett" <spikeywan@bigfoot.com.delete.this.bit> wrote:
> > splice @array, 2, 0, ("", "", "", "", "", "", "");
> > Is there a way that I can do something like?:
> >
> > splice @array, 2, 0, 7*(""); # i.e. 7 lots of ""
>
> splice @array, 2, 0, map "", 1..7; #untested
As others have pointed out the x operator is more appropriate than map()
in this case.
It is, however, worth pointing out that if you were inserting
reference to anonymous things rather than strings then you probably
would want map().
splice @array, 2, 0, map [], 1..7; # DWIM - refs to 7 empty arrays
splice @array, 2, 0, ([]) x 7; # !DWIM - 7 refs to one empty array
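A short demonstration of the difference, as a sketch: three elements instead of seven, and the array names @shared and @separate are invented for the example.

```perl
use strict;
use warnings;

# x repeats the *same* reference; map evaluates [] afresh each time.
my @shared   = ([]) x 3;
my @separate = map [], 1 .. 3;

# Push onto the first element of each and look at the second.
push @{ $shared[0] },   'x';
push @{ $separate[0] }, 'x';

print "shared[1] has ",   scalar @{ $shared[1] },   " element(s)\n";   # 1
print "separate[1] has ", scalar @{ $separate[1] }, " element(s)\n";   # 0
```

With x, a change made through one slot shows up in all of them, because every slot points at the one anonymous array.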
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
------------------------------
Date: Wed, 19 Nov 2003 08:28:36 -0600
From: Master Web Surfer <raisin@delete-this-trash.mts.net>
Subject: Re: Match and cut regex?
Message-Id: <MPG.1a253a0fdc4b16639896d9@news.mts.net>
[This followup was posted to comp.lang.perl.misc]
In article <j0xub.6699$9q7.54480104@newssvr21.news.prodigy.com>,
bryan@akanta.com says...
> If I have a 'cut' phrase:
> my $cut = "ABBCCCD";
>
> and a 'sentence':
> my $sentence = "ALSSDJOOASABBCCCDUUSIIASDLLLPP";
>
> What regex do I use to match the ABBCCCD and then chop off everything
> after it?
>
> Thanks,
> B
When you do pattern matching in Perl there are several "special"
variables that can be of great help to you :
$PREMATCH - text prior to matched pattern
$MATCH - text matched by pattern
$POSTMATCH - text after matched pattern
#!/usr/bin/perl -w
use English;
if ( $string =~ m/some pattern stuff/ ) {
$new_string = $PREMATCH . $MATCH;
}
------------------------------
Date: Wed, 19 Nov 2003 09:23:32 -0600
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Match and cut regex?
Message-Id: <slrnbrn2nk.i0k.tadmc@magna.augustmail.com>
Master Web Surfer <raisin@delete-this-trash.mts.net> wrote:
> [This followup was posted to comp.lang.perl.misc]
We can tell (because we are reading postings to comp.lang.perl.misc...).
> In article <j0xub.6699$9q7.54480104@newssvr21.news.prodigy.com>,
> bryan@akanta.com says...
>> If I have a 'cut' phrase:
>> my $cut = "ABBCCCD";
>>
>> and a 'sentence':
>> my $sentence = "ALSSDJOOASABBCCCDUUSIIASDLLLPP";
>>
>> What regex do I use to match the ABBCCCD and then chop off everything
>> after it?
> When you do pattern matching in Perl there are several "special"
> variables that can be of great help to you :
>
> $PREMATCH - text prior to matched pattern
> $MATCH - text matched by pattern
> $POSTMATCH - text after matched pattern
But you should give consideration to the warning given in perlre.pod
if you think you must use those variables:
WARNING: Once Perl sees that you need one of $&, $`, or $'
anywhere in the program, it has to provide them for every
pattern match. This may substantially slow your program.
...
But if you never use $&, $` or
$', then patterns without capturing parentheses will not
be penalized. So avoid $&, $', and $` if you can, but if
you can't...
They are not needed for the task described, so they should not
be used for the task described.
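A minimal sketch of two ways to do it without those variables, using the $cut and $sentence values from the original question. The names $trimmed and $trimmed2 are made up here.

```perl
use strict;
use warnings;

my $cut      = "ABBCCCD";
my $sentence = "ALSSDJOOASABBCCCDUUSIIASDLLLPP";

# Regex version: capture the cut phrase, drop everything after it.
# \Q...\E quotes any regex metacharacters in $cut.
(my $trimmed = $sentence) =~ s/(\Q$cut\E).*/$1/s;

# No-regex version: find the phrase and keep up to its end.
my $pos = index($sentence, $cut);
my $trimmed2 = $pos >= 0
    ? substr($sentence, 0, $pos + length($cut))
    : $sentence;

print "$trimmed\n$trimmed2\n";    # both print ALSSDJOOASABBCCCD
```

If the phrase does not occur, both versions leave the sentence as it was.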
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 19 Nov 2003 07:33:54 -0800
From: petersson@my-deja.com (petersson)
Subject: Matching two arrays, and returning the "rest"
Message-Id: <b35c5297.0311190733.133d75d6@posting.google.com>
Hi,
I have two arrays. The first one contains a list of items. For
example:
abc
bca
bac
cab
acb
My second one contains fewer items, but parts of the individual items
match with the items in my first array.
12_bca
21_acb
22_cab
Hence that the "bca" in "12_bca" above matches with the single "bca"
in my first array, like this:
abc
bca : 12_bca
bac
cab : 22_cab
acb : 21_acb
Now I want a function that returns the items from my first array that
doesn't partially match any of the items in my second array. Like
this:
abc
bac
Note: the example above is extremely simplified. Splitting the second
array on the "_" sign won't work... Anyone done this before? Knows of
a simple solution?
Thanks in advance
/s
------------------------------
Date: Wed, 19 Nov 2003 15:48:09 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Matching two arrays, and returning the "rest"
Message-Id: <bpg3bp$5ao$1@wisteria.csv.warwick.ac.uk>
petersson@my-deja.com (petersson) wrote:
> Hi,
>
> I have two arrays. The first one contains a list of items. For
> example:
>
> abc
> bca
> bac
> cab
> acb
I'll call this @a1, since you didn't give it a name.
> My second one contains fewer items, but parts of the individual items
> match with the items in my first array.
>
> 12_bca
> 21_acb
> 22_cab
and this @a2
> Hence that the "bca" in "12_bca" above matches with the single "bca"
> in my first array, like this:
>
> abc
> bca : 12_bca
> bac
> cab : 22_cab
> acb : 21_acb
>
> Now I want a function that returns the items from my first array that
> doesn't partially match any of the items in my second array. Like
> this:
>
> abc
> bac
grep { my $x = qr|\Q$_|; 0 == grep $x, @a2 } @a1; #untested
Ben
--
For the last month, a large number of PSNs in the Arpa[Inter-]net have been
reporting symptoms of congestion ... These reports have been accompanied by an
increasing number of user complaints ... As of June,... the Arpanet contained
47 nodes and 63 links. [ftp://rtfm.mit.edu/pub/arpaprob.txt] * ben@morrow.me.uk
------------------------------
Date: 19 Nov 2003 17:29:11 +0000
From: Brian McCauley <nobull@mail.com>
Subject: Re: Matching two arrays, and returning the "rest"
Message-Id: <u9k75ws1w8.fsf@wcl-l.bham.ac.uk>
Ben Morrow <usenet@morrow.me.uk> writes:
> grep { my $x = qr|\Q$_|; 0 == grep $x, @a2 } @a1; #untested
You are missing / /
grep { my $x = qr|\Q$_|; 0 == grep /$x/, @a2 } @a1; #still untested
IMHO it looks neater as:
grep { my $x = qr/\Q$_/; ! grep /$x/, @a2 } @a1; #still untested
Of course it's probably faster to use index rather than m//
grep { my $x = $_; ! grep index($_,$x) > -1, @a2 } @a1; #still untested
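Here is the index version as a complete script, run against the sample data from the original post. The array names @a1 and @a2 are the ones used earlier in the thread; @rest is made up for this sketch.

```perl
use strict;
use warnings;

my @a1 = qw(abc bca bac cab acb);
my @a2 = qw(12_bca 21_acb 22_cab);

# Keep each item of @a1 that is not a substring of any item in @a2.
my @rest = grep {
    my $x = $_;                          # outer $_ is hidden by the inner grep
    !grep { index($_, $x) > -1 } @a2;
} @a1;

print "@rest\n";    # abc bac
```

The copy into $x matters: inside the inner grep, $_ refers to the elements of @a2.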
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
------------------------------
Date: Wed, 19 Nov 2003 19:00:50 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Matching two arrays, and returning the "rest"
Message-Id: <bpgel2$fei$1@wisteria.csv.warwick.ac.uk>
Brian McCauley <nobull@mail.com> wrote:
> Ben Morrow <usenet@morrow.me.uk> writes:
>
> > grep { my $x = qr|\Q$_|; 0 == grep $x, @a2 } @a1; #untested
>
> You are missing / /
D'oh! I took it out automatically when I thought 'why not use qr//?'.
> IMHO it looks neater as:
>
> grep { my $x = qr/\Q$_/; ! grep /$x/, @a2 } @a1; #still untested
Agreed.
> Of course it's probably faster to use index rather than m//
>
> grep { my $x = $_; ! grep index($_,$x) > -1, @a2 } @a1; #still untested
I would hope that this would come down to the same thing when using
qr// and no regex metachars...
[formatting adjusted]
% perl -MBenchmark=cmpthese -e'my @a = 1..10_000;
cmpthese -10, {
qr => sub { my $x = qr/\Q4/; grep /$x/, @a },
index => sub { grep index($_, '4') > -1, @a }
}'
Benchmark: running index, qr for at least 10 CPU seconds...
index: 13 wallclock secs (10.57 usr + 0.00 sys = 10.57 CPU) @
137.18/s (n=1450)
qr: 12 wallclock secs (10.56 usr + 0.00 sys = 10.56 CPU) @
98.48/s (n=1040)
Rate qr index
qr 98.5/s -- -28%
index 137/s 39% --
...but it seems you are right :). Moving the qr// outside the cmpthese
makes no difference.
Ben
--
And if you wanna make sense / Whatcha looking at me for? (Fiona Apple)
* ben@morrow.me.uk *
------------------------------
Date: Wed, 19 Nov 2003 18:09:36 -0000
From: "Roger Moffatt" <news@ducati.demon.co.uk>
Subject: Re: mod_perl win2k mail::sender problem
Message-Id: <bpgbki$qc3$1$830fa79f@news.demon.co.uk>
PS. Just to say that my application is nothing to do with bulk mailing - it
sends single e-mails to people requesting the content.
------------------------------
Date: 19 Nov 2003 10:52:15 -0800
From: warpman999@netscape.net (warpman)
Subject: Need some help...
Message-Id: <99977611.0311191052.3c1f5ff1@posting.google.com>
I need some help. I'm new to Perl and I have borrowed the following
code. This sub keeps on giving me the following date:
November 5, 103 at 10:16:33:
I don't understand why it keeps on giving me the year as 1xx instead of
2 digits. Any help would be greatly appreciated. Thanks in advance.
=====
sub get_variables {
if ($FORM{'followup'}) {
$followup = "1";
@followup_num = split(/,/,$FORM{'followup'});
$num_followups = @followups = @followup_num;
$last_message = pop(@followups);
$origdate = "$FORM{'origdate'}";
$origname = "$FORM{'origname'}";
$origsubject = "$FORM{'origsubject'}";
}
else {
$followup = "0";
}
if ($FORM{'name'}) {
$name = "$FORM{'name'}";
$name =~ s/"//g;
$name =~ s/<//g;
$name =~ s/>//g;
$name =~ s/\&//g;
}
else {
&error(no_name);
}
if ($FORM{'email'} =~ /.*\@.*\..*/) {
$email = "$FORM{'email'}";
}
if ($FORM{'subject'}) {
$subject = "$FORM{'subject'}";
$subject =~ s/\&/\&amp\;/g;
$subject =~ s/"/\&quot\;/g;
}
else {
&error(no_subject);
}
if ($FORM{'url'} =~ /.*\:.*\..*/ && $FORM{'url_title'}) {
$message_url = "$FORM{'url'}";
$message_url_title = "$FORM{'url_title'}";
}
if ($FORM{'img'} =~ /.*tp:\/\/.*\..*/) {
$message_img = "$FORM{'img'}";
}
if ($FORM{'body'}) {
$body = "$FORM{'body'}";
$body =~ s/\cM//g;
$body =~ s/\n\n/<p>/g;
$body =~ s/\n/<br>/g;
$body =~ s/&lt;/</g;
$body =~ s/&gt;/>/g;
$body =~ s/&quot;/"/g;
}
else {
&error(no_body);
}
if ($quote_text == 1) {
$hidden_body = "$body";
$hidden_body =~ s/</&lt;/g;
$hidden_body =~ s/>/&gt;/g;
$hidden_body =~ s/"/&quot;/g;
}
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime(time);
if ($sec < 10) {
$sec = "0$sec";
}
if ($min < 10) {
$min = "0$min";
}
if ($hour < 10) {
$hour = "0$hour";
}
if ($mon < 10) {
$mon = "0$mon";
}
if ($mday < 10) {
$mday = "0$mday";
}
$month = ($mon + 1);
@months = ("January","February","March","April","May","June","July","August","September","October","November","December");
if ($use_time == 1) {
$date = "$hour\:$min\:$sec $month/$mday/$year";
}
else {
$date = "$month/$mday/$year";
}
chop($date) if ($date =~ /\n$/);
$long_date = "$months[$mon] $mday, $year at $hour\:$min\:$sec";
# $long_date = "$months[$mon] $mday, 19$year at $hour\:$min\:$sec";
}
------------------------------
Date: 19 Nov 2003 19:04:10 GMT
From: Xaonon <xaonon@hotpop.com>
Subject: Re: Need some help...
Message-Id: <slrnbrnfke.t2n.xaonon@xaonon.local>
Ned i bach <99977611.0311191052.3c1f5ff1@posting.google.com>, warpman
<warpman999@netscape.net> teithant i thiw hin:
> I need some help. I'm new to Perl and I have borrowed the following
> code. This sub keeps on giving me the following date:
>
> November 5, 103 at 10:16:33:
>
> I don't understand why it keeps on giving me the year as 1xx instead of
> 2 digits. Any help would be greatly appreciated. Thanks in advance.
perldoc -f localtime
-> $year is the number of years since 1900. That is, $year is 123 in year
-> 2023.... The proper way to get a complete 4-digit year is simply:
->
-> $year += 1900;
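Applied to the posted sub, the fix might look like the sketch below. The field values are fixed samples so the run is reproducible (they correspond to the date in the question); real code would take them from localtime(time). Using sprintf for the zero padding is a substitution of mine, not what the original script does.

```perl
use strict;
use warnings;

my @months = qw(January February March April May June July
                August September October November December);

# Sample values in localtime order: sec, min, hour, mday, mon, year.
# $mon is 0-based and $year counts from 1900, so 103 means 2003.
my ($sec, $min, $hour, $mday, $mon, $year) = (33, 16, 10, 5, 10, 103);

$year += 1900;

my $long_date = sprintf "%s %d, %d at %02d:%02d:%02d",
    $months[$mon], $mday, $year, $hour, $min, $sec;

print "$long_date\n";    # November 5, 2003 at 10:16:33
```

The commented-out "19$year" line at the end of the posted sub is the classic form of this bug: it would give 19103 for the same input.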
--
Xaonon, EAC Chief of Mad Scientists and informal BAAWA, aa #1821, Kibo #: 1
http://xaonon.dyndns.org/ Guaranteed content-free since 1999. No refunds.
"Uploading isn't a >H goal because it's one step closer to some mythical and
unknowable perfection, but because it'll be jolly practical." -- Rich Artym
------------------------------
Date: 19 Nov 2003 16:02:01 +0100
From: Ernst-Udo Wallenborn <ernst-udo.wallenborn@freenet.de>
Subject: Re: perl LibXML
Message-Id: <m3smkkpfkm.udo@no.domain.net>
trevor_obba@yahoo.co.uk (trevor) writes:
> I installed the following perl module with my perl version 5.8.0
Where? Your system seems to be unixoid, but i am only guessing this.
How? From source, as binary or rpm|deb|pkg package?
> libxml2-2.6.2
> XML-SAX-0.12
> XML-NamespaceSupport-1.08
> XML-LibXML-1.56
> XML-LibXSLT-1.53
> XML-LibXML-Common-0.13
...
> error log:
> Warning: program compiled against libxml 206 using older 204
Though there are several possibilities, I guess: Your XML-LibXSLT is
linked against libxml2 2.6.x, either because you compiled it from source and
linked it specifically against this version, or because you installed it as a
binary and it was linked this way by the binaries' creator. On your system,
there seems to be an older version of libxml2, namely 2.4.x, and for some
reason the user that runs the CGI finds this older version first.
Try to make sure the CGI finds the newer version of the lib instead of the
older.
--
Ernst-Udo Wallenborn
------------------------------
Date: Wed, 19 Nov 2003 15:55:14 -0000
From: "Ben Liddicott" <ben.liddicott@comodogroup.com>
Subject: Re: Project Organization
Message-Id: <bpg3p7$rpv$1@kylie.comodogroup.com>
Good point.
My latest is a daemon which chdirs to its bin directory in a BEGIN block. I
forgot that this doesn't work in the general case.
I do think one should search the script directory first, nevertheless.
Cheers,
Ben Liddicott
"Ben Morrow" <usenet@morrow.me.uk> wrote in message
news:bpft41$ai$1@wisteria.csv.warwick.ac.uk...
> [please stop top-posting]
[There is more than one way to do it. I do it this way.]
>
> "Ben Liddicott" <ben.liddicott@comodogroup.com> wrote:
(...)
> > > 4) 'use lib "../dir/dir";',
> > > followed by 'use module;' (which resides in '../dir/dir/')
>
> This Won't Work unless your program is always started from the same
> working directory.
>
> > > 5) something I don't know about
> >
> > My philosophy is always 5:
> >
> > If it comes with the distribution, leave it where it is.
> > If it belongs with an application, even if it is a file which might
> > conceivably be of use in other applications, put it with the
> > application script.
>
> It won't be found unless the application is started from its installed
> directory... you need to use FindBin to find where the program file
> is, and then use lib with that path.
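
The FindBin suggestion quoted above might look something like this. It
is a sketch only: the "lib" subdirectory and the module name are
made-up placeholders, so substitute whatever layout the application
really uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# FindBin works out, at compile time, the directory holding the running
# script and exports it as $Bin.
use FindBin qw($Bin);

# "lib" is a hypothetical subdirectory sitting next to the script.
# Because the path is built from $Bin, it does not depend on the
# working directory the program was started from.
use lib "$Bin/lib";

# use My::Module;    # hypothetical module that would live in $Bin/lib

print "script directory: $Bin\n";
print "module directory is in \@INC\n" if grep { $_ eq "$Bin/lib" } @INC;
```

This keeps the modules with the application script, as suggested, and
still works when the program is started from cron or from another
directory.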
------------------------------
Date: Wed, 19 Nov 2003 16:12:23 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: Project Organization
Message-Id: <bpg4p7$6pp$2@wisteria.csv.warwick.ac.uk>
"Ben Liddicott" <ben.liddicott@comodogroup.com> wrote:
> "Ben Morrow" <usenet@morrow.me.uk> wrote in message
> news:bpft41$ai$1@wisteria.csv.warwick.ac.uk...
> > [please stop top-posting]
> [There is more than one way to do it. I do it this way.]
*plonk*
--
Musica Dei donum optimi, trahit homines, trahit deos. |
Musica truces molit animos, tristesque mentes erigit. | ben@morrow.me.uk
Musica vel ipsas arbores et horridas movet feras. |
------------------------------
Date: Wed, 19 Nov 2003 16:38:11 -0000
From: "Ben Liddicott" <ben.liddicott@comodogroup.com>
Subject: Re: Project Organization
Message-Id: <bpg69s$vg6$1@kylie.comodogroup.com>
It's rather rude to plonk people publicly. It makes you look like a rude
person.
I might care, but you do this all the time, and for equally trivial reasons;
so I don't.
If anyone else wants to plonk, plink or plunk me for top-posting, please do
so now, but quietly so as not to annoy the other posters.
And yes, I do think you are reading this message. I don't think you could
resist seeing if I responded.
Cheers,
Ben Liddicott
"Ben Morrow" <usenet@morrow.me.uk> wrote in message
news:bpg4p7$6pp$2@wisteria.csv.warwick.ac.uk...
> "Ben Liddicott" <ben.liddicott@comodogroup.com> wrote:
> > "Ben Morrow" <usenet@morrow.me.uk> wrote in message
> > news:bpft41$ai$1@wisteria.csv.warwick.ac.uk...
> > > [please stop top-posting]
> > [There is more than one way to do it. I do it this way.]
>
> *plonk*
------------------------------
Date: Wed, 19 Nov 2003 17:05:43 +0000 (UTC)
From: "John J. Trammell" <trammell+usenet@hypersloth.invalid>
Subject: Re: Project Organization
Message-Id: <slrnbrn8n7.cp6.trammell+usenet@hypersloth.el-swifto.com.invalid>
On Wed, 19 Nov 2003 16:38:11 -0000, Ben Liddicott
<ben.liddicott@comodogroup.com> TOFU-ed:
> If anyone else wants to plonk, plink or plunk me for top-posting,
> please do so now, but quietly so as not to annoy the other posters.
>
*plop*
------------------------------
Date: Wed, 19 Nov 2003 12:13:25 -0600
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Project Organization
Message-Id: <slrnbrncm5.icv.tadmc@magna.augustmail.com>
Ben Liddicott <ben.liddicott@comodogroup.com> wrote:
> If anyone else wants to plonk, plink or plunk me for top-posting, please do
> so now, but quietly so as not to annoy the other posters.
If you want to top-post please do it quietly so as not to annoy
the other posters. (or do you get to annoy folks but nobody
else does?)
*plonk*
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: Wed, 19 Nov 2003 08:11:22 -0600
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: regex to convert 1000000 -> 1,000,000 ?
Message-Id: <slrnbrmuga.hqg.tadmc@magna.augustmail.com>
Pinocchio <wsanford@wallysanford.com> wrote:
> jsut
Yet another alias?
It must be getting crowded in there...
> why some newbies seem to
> give blank stairs
Yesterday upon the stair
I met a man who wasn't there.
He wasn't there again today
Oh how I wish he'd go away.
-- Hughes Mearns (1875-1965)
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 5829
***************************************