[30984] in Perl-Users-Digest


Perl-Users Digest, Issue: 2229 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Mon Feb 23 03:09:40 2009

Date: Mon, 23 Feb 2009 00:09:08 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 23 Feb 2009     Volume: 11 Number: 2229

Today's topics:
        new CPAN modules on Mon Feb 23 2009 (Randal Schwartz)
    Re: Once again: Rolling Frame! <mstep@podiuminternational.org>
    Re: sort question <nick@maproom.co.uk>
    Re: sort question sln@netherlands.com
    Re: sort question sln@netherlands.com
    Re: Sorting based on existence of keys sln@netherlands.com
    Re: utf8 and chomp <ben@morrow.me.uk>
    Re: utf8 and chomp <whynot@pozharski.name>
        XML Simple force array <thesnake_123@-NO-S_P_A_M-hotmail.com>
    Re: XML Simple force array <thepoet_nospam@arcor.de>
    Re: XML Simple force array <perl@marc-s.de>
    Re: XML Simple force array <perl@marc-s.de>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Mon, 23 Feb 2009 05:42:26 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Mon Feb 23 2009
Message-Id: <KFI7uq.10I2@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

App-Sequence-0.03_05
http://search.cpan.org/~kimoto/App-Sequence-0.03_05/
pluggable subroutine engine. 
----
CPAN-Mini-Indexed-0.01_01.1
http://search.cpan.org/~nkh/CPAN-Mini-Indexed-0.01_01.1/
Index the content of your CPAN mini repository 
----
Catalyst-Authentication-Store-Tangram-0.010
http://search.cpan.org/~bobtfish/Catalyst-Authentication-Store-Tangram-0.010/
A storage class for Catalyst authentication from a class stored in Tangram 
----
Class-Attribute-0.025
http://search.cpan.org/~deepfryed/Class-Attribute-0.025/
A fast and light weight alternative for defining class attributes. 
----
Class-MOP-0.77_01
http://search.cpan.org/~drolsky/Class-MOP-0.77_01/
A Meta Object Protocol for Perl 5 
----
Class-Monadic-0.02
http://search.cpan.org/~gfuji/Class-Monadic-0.02/
Provides monadic methods (a.k.a. singleton methods) 
----
CouchDB-ExternalProcess-0.02
http://search.cpan.org/~fansipans/CouchDB-ExternalProcess-0.02/
Make creating Perl-based external processes for CouchDB easy 
----
Data-OpenStruct-Deep-0.02
http://search.cpan.org/~masaki/Data-OpenStruct-Deep-0.02/
allows you to create data objects and set arbitrary attributes deeply 
----
Data-OpenStruct-Deep-0.03
http://search.cpan.org/~masaki/Data-OpenStruct-Deep-0.03/
allows you to create data objects and set arbitrary attributes deeply 
----
DateTime-Format-Natural-0.75_01
http://search.cpan.org/~schubiger/DateTime-Format-Natural-0.75_01/
Create machine readable date/time with natural parsing logic 
----
DateTime-Format-Strptime-1.0900
http://search.cpan.org/~rickm/DateTime-Format-Strptime-1.0900/
Parse and format strp and strf time patterns 
----
DayDayUp-0.09
http://search.cpan.org/~fayland/DayDayUp-0.09/
good good study, day day up 
----
Doc-Simply-0.03
http://search.cpan.org/~rkrimen/Doc-Simply-0.03/
Generate POD-like documentation from embedded comments in JavaScript, Java, C, C++ source 
----
File-Find-Object-0.2.0
http://search.cpan.org/~shlomif/File-Find-Object-0.2.0/
An object oriented File::Find replacement 
----
File-Find-Object-Rule-0.0101
http://search.cpan.org/~shlomif/File-Find-Object-Rule-0.0101/
Alternative interface to File::Find::Object 
----
Foorum-1.000005
http://search.cpan.org/~fayland/Foorum-1.000005/
forum system based on Catalyst 
----
GStreamer-0.15
http://search.cpan.org/~tsch/GStreamer-0.15/
Perl interface to the GStreamer library 
----
Graphics-Primitive-0.39
http://search.cpan.org/~gphat/Graphics-Primitive-0.39/
Device and library agnostic graphic primitives 
----
Graphics-Primitive-Driver-CairoPango-0.52
http://search.cpan.org/~gphat/Graphics-Primitive-Driver-CairoPango-0.52/
Cairo/Pango backend for Graphics::Primitive 
----
HTML-FormHandler-0.18
http://search.cpan.org/~gshank/HTML-FormHandler-0.18/
form handler written in Moose 
----
Hash-Merge-Simple-0.04
http://search.cpan.org/~rkrimen/Hash-Merge-Simple-0.04/
Recursively merge two or more hashes, simply 
----
JSON-XS-2.232
http://search.cpan.org/~mlehmann/JSON-XS-2.232/
JSON serialising/deserialising, done correctly and fast 
----
Kools-Okapi-2.6.3.L3.002
http://search.cpan.org/~muguet/Kools-Okapi-2.6.3.L3.002/
Perl extension for the OKAPI api of Kondor+ 2.6 
----
MFor-0.05
http://search.cpan.org/~cornelius/MFor-0.05/
A module for multi-dimension looping. 
----
MFor-0.051
http://search.cpan.org/~cornelius/MFor-0.051/
A module for multi-dimension looping. 
----
MFor-0.052
http://search.cpan.org/~cornelius/MFor-0.052/
A module for multi-dimension looping. 
----
Math-GSL-0.17_01
http://search.cpan.org/~leto/Math-GSL-0.17_01/
Perl interface to the GNU Scientific Library (GSL) 
----
Moose-0.71_01
http://search.cpan.org/~drolsky/Moose-0.71_01/
A postmodern object system for Perl 5 
----
MouseX-Object-Pluggable-0.02
http://search.cpan.org/~kitano/MouseX-Object-Pluggable-0.02/
Mouse port of MooseX::Object::Pluggable 
----
Net-Generatus-0.31
http://search.cpan.org/~shiny/Net-Generatus-0.31/
----
Net-SMTP-Pipelining-v0.0.2
http://search.cpan.org/~marcb/Net-SMTP-Pipelining-v0.0.2/
Send email using ESMTP PIPELINING extension 
----
Number-Phone-CountryCode-0.02
http://search.cpan.org/~mschout/Number-Phone-CountryCode-0.02/
Country phone dialing prefixes 
----
POE-1.003_01
http://search.cpan.org/~rcaputo/POE-1.003_01/
portable multitasking and networking framework for Perl 
----
POE-Test-Loops-1.003_01
http://search.cpan.org/~rcaputo/POE-Test-Loops-1.003_01/
Reusable tests for POE::Loop authors 
----
Panotools-Script-0.21
http://search.cpan.org/~bpostle/Panotools-Script-0.21/
Panorama Tools scripting 
----
Parse-Eyapp-1.140
http://search.cpan.org/~casiano/Parse-Eyapp-1.140/
Extensions for Parse::Yapp 
----
Path-Resource-0.071
http://search.cpan.org/~rkrimen/Path-Resource-0.071/
URI/Path::Class combination. 
----
Rose-HTML-Objects-0.602
http://search.cpan.org/~jsiracusa/Rose-HTML-Objects-0.602/
Object-oriented interfaces for HTML. 
----
SOAP-WSDL-2.00.08
http://search.cpan.org/~mkutter/SOAP-WSDL-2.00.08/
SOAP with WSDL support 
----
SQL-Beautify-0.01
http://search.cpan.org/~jkramer/SQL-Beautify-0.01/
----
Search-Indexer-Incremental-MD5-0.04.15
http://search.cpan.org/~nkh/Search-Indexer-Incremental-MD5-0.04.15/
Incrementally index your files 
----
Shipwright-2.1.1
http://search.cpan.org/~sunnavy/Shipwright-2.1.1/
Best Practical Builder 
----
Shipwright-2.1.2
http://search.cpan.org/~sunnavy/Shipwright-2.1.2/
Best Practical Builder 
----
Simo-0.0804
http://search.cpan.org/~kimoto/Simo-0.0804/
Very simple framework for Object Oriented Perl. 
----
Simo-Wrapper-0.0203
http://search.cpan.org/~kimoto/Simo-Wrapper-0.0203/
Object wrapper to manipulate attrs and methods. 
----
Simo-Wrapper-0.0204
http://search.cpan.org/~kimoto/Simo-Wrapper-0.0204/
Object wrapper to manipulate attrs and methods. 
----
Simo-Wrapper-0.0205
http://search.cpan.org/~kimoto/Simo-Wrapper-0.0205/
Object wrapper to manipulate attrs and methods. 
----
Test-LeakTrace-0.02
http://search.cpan.org/~gfuji/Test-LeakTrace-0.02/
Traces memory leaks (EXPERIMENTAL) 
----
Test-Server-0.05_02
http://search.cpan.org/~jkutej/Test-Server-0.05_02/
what about test driven administration? 
----
Video-FrameGrab-0.01
http://search.cpan.org/~mschilli/Video-FrameGrab-0.01/
Grab a frame from a video 
----
XTerm-Conf-0.06
http://search.cpan.org/~srezic/XTerm-Conf-0.06/
change configuration of a running xterm 
----
dvdrip-0.98.10
http://search.cpan.org/~jred/dvdrip-0.98.10/
----
minismokebox-0.16
http://search.cpan.org/~bingos/minismokebox-0.16/
a small lightweight SmokeBox 


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Sun, 22 Feb 2009 20:47:33 -0800 (PST)
From: Marek <mstep@podiuminternational.org>
Subject: Re: Once again: Rolling Frame!
Message-Id: <5ce88179-968a-4b28-b5db-861b6f800b1d@d32g2000yqe.googlegroups.com>


sln Thank you! Great help!


Best greetings marek


------------------------------

Date: Sun, 22 Feb 2009 23:26:30 +0000
From: Nick Wedd <nick@maproom.co.uk>
Subject: Re: sort question
Message-Id: <6j5BVpEm8doJFA6O@maproom.demon.co.uk>

In message <Xns9BBAAF54BFFC0asu1cornelledu@127.0.0.1>, A. Sinan Unur 
<1usa@llenroc.ude.invalid> writes
>Nick Wedd <nick@maproom.co.uk> wrote in
>news:qE51cBBk2boJFAaq@maproom.demon.co.uk:
>
>> This is exactly what I hoped for.  In fact it is better (more useful
>> to me) than I feel I have any right to expect.  When it sorts on one
>> criterion, it leaves the items that tie under that criterion in the
>> order they were in before.
>>
>> Now this is exactly what I want it to do.  But no documentation that I
>> can recall promises that it will do that.
>
>Which version of Perl are you using?

Version 5.6.1

>http://perldoc.perl.org/functions/sort.html

So what I am looking for is a "stable" sort.  Knowing what it is called 
will make further investigations easier for me.

That page tells me that the sort in 5.6 is not stable;  the one in 5.7 
is stable;  and in 5.8 I can use a pragma to ensure that it uses a 
stable sort.  So I have been lucky so far, maybe because my arrays have 
never had more than seven elements.
>
>Perl 5.6 and earlier used a quicksort algorithm to implement sort. That
>algorithm was not stable, and could go quadratic. (A stable sort
>preserves the input order of elements that compare equal. Although
>quicksort's run time is O(NlogN) when averaged over all arrays of length
>N, the time can be O(N**2), quadratic behavior, for some inputs.) In
>5.7, the quicksort implementation was replaced with a stable mergesort
>algorithm whose worst-case behavior is O(NlogN). But benchmarks
>indicated that for some inputs, on some platforms, the original
>quicksort was faster. 5.8 has a sort pragma for limited control of the
>sort. Its rather blunt control of the underlying algorithm may not
>persist into future Perls, but the ability to characterize the input or
>output in implementation independent ways quite probably will. See sort.
>
>http://perldoc.perl.org/sort.html
>
> use sort 'stable';            # guarantee stability
>
>
>-- Sinan
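
As a rough illustration of the stability described above, here is a minimal
sketch. It assumes Perl 5.8 or later (where the sort pragma exists), reuses
the sample data from my program, and the variable names are made up for
this example. Sorting on the secondary field first and the primary field
second gives the combined order only because ties keep their earlier order.

```perl
use strict;
use warnings;
use sort 'stable';    # ask for a stable sort (Perl 5.8 and later)

# Same sample data as in the original question: [label, field1, field2].
my @rows = ( ['a',3,1], ['b',2,2], ['c',1,1], ['d',1,2], ['e',2,1], ['f',3,2] );

# Sort on the secondary field first, then on the primary field.
# Because the sort is stable, rows that tie on field 1 keep the
# field-2 order established by the first pass.
my @by_field2 = sort { $a->[2] <=> $b->[2] } @rows;
my @by_both   = sort { $a->[1] <=> $b->[1] } @by_field2;

my $stable_output = join ' ', map { join '', @$_ } @by_both;
print "$stable_output\n";    # c11 d12 e21 b22 a31 f32
```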

Thank you.

Nick
-- 
Nick Wedd    nick@maproom.co.uk


------------------------------

Date: Mon, 23 Feb 2009 02:09:38 GMT
From: sln@netherlands.com
Subject: Re: sort question
Message-Id: <vj04q4d4uvk7vk3dl7tu7d4u8dqtam6h80@4ax.com>

On Sun, 22 Feb 2009 21:03:32 +0000, Nick Wedd <nick@maproom.co.uk> wrote:

>Here is my program:
>
>
>use strict;
>
>sub by_incsub1 { $$a[1] <=> $$b[1]; }
>sub by_incsub2 { $$a[2] <=> $$b[2]; }
>
>my $i;
>my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );
>
>my @result = sort by_incsub1 @a;
>foreach $i ( 0..5 )
>   { print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
>print "\n";
>@result = sort by_incsub2 @result;
>foreach $i ( 0..5 )
>   { print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
>print "\n";
>@result = sort by_incsub1 @result;
>foreach $i ( 0..5 )
>   { print "$result[$i][0]$result[$i][1]$result[$i][2] "; }
>
>
>and here is its output:
>
>
>c11 d12 b22 e21 a31 f32
>c11 e21 a31 d12 b22 f32
>c11 d12 e21 b22 a31 f32
>
>
>This is exactly what I hoped for.  In fact it is better (more useful to 
>me) than I feel I have any right to expect.  When it sorts on one 
>criterion, it leaves the items that tie under that criterion in the 
>order they were in before.
>
>Now this is exactly what I want it to do.  But no documentation that I 
>can recall promises that it will do that.  Output like
>c11 d12 e21 b22 a31 f32
>a31 c11 e21 b22 d12 f32
>d12 c11 e21 b22 f32 a31
>would still meet the specification of "sort".
>
>Can I rely on Perl's sort to continue to do what I want, or is it 
>implementation-dependent?
>
>Nick

Sorting on primary, secondary, and tertiary fields works the same way in any
language, and the same result will be arrived at no matter what. So regardless
of which sort method is used, the way you interpret the relationships is the
same: each comparison is either less than, greater than, or equal.

That said, there is no need to re-sort on field after field when it can be
done in one pass. This is no guarantee of a speed benefit, but in general,
the framework below is how it is done in one pass. It's up to you to customize
it as your requirements dictate.

Doing the sort in one pass is usually quicker. However, it requires custom,
user-supplied comparison functions.

Below is a sample of what's possible. There is minimal error checking; it is
assumed that you would know what to do.

There is a prototype "key" function at the top that is just there to explain the
logic of sorting. Once you understand that, you can write any custom bomb you want.
And you should. Don't lay all the responsibility on Perl; it's up to you to craft
a sort protocol.

Good luck!
-sln

-------------------------------------------------------------------------------
## iii.pl
## More sort junk
## -sln

use warnings;
use strict;

sub Sort_Template_Protype_aka_By_Both
{
  if ( $$a[1] < $$b[1] ) {return -1}
  if ( $$a[1] > $$b[1] ) {return 1}
  if ( $$a[1] == $$b[1]) {
	if (($$a[2] < $$b[2])) {return -1}
  	if (($$a[2] > $$b[2])) {return 1}
	# if element 2's are equal {
	#	.. check element 3, etc..
	return 0
  }
}

sub By_Both
{
  my $element_compare = $$a[1] <=> $$b[1];
  $element_compare == 0 ? ($$a[2] <=> $$b[2]) : $element_compare;
}

sub By_Field_Range
{
	my ($start,$end) = @_;
	return $$a[$start] <=> $$b[$start] if (!defined $end || $end <= $start);
	for ($start..$end)
	{
		my $element_compare = $$a[$_] <=> $$b[$_];
		next if ($element_compare == 0);
		return $element_compare;
	}
	$$a[$_] <=> $$b[$_];
}

sub By_Field_Array
{
	return 0 if (!@_);
	return $$a[$_[0]] <=> $$b[$_[0]] if (scalar(@_) == 1);
	for (@_)
	{
		my $element_compare = $$a[$_] <=> $$b[$_];
		next if ($element_compare == 0);
		return $element_compare;
	}
	$$a[$_] <=> $$b[$_];
}

## --------------------------------------------------------
my @result;
my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );

sub Print_Results
{
	foreach my $i ( 0..5 ) {
		print "$result[$i][0]$result[$i][1]$result[$i][2] ";
	}
	print "\n";
}

## Stuff to play with..
## ---------------------

@result = sort { By_Both } @a;		      Print_Results();
@result = sort { By_Field_Range ( 1 ) } @a;   Print_Results();		#need params
@result = sort { By_Field_Range (1,1) } @a;   Print_Results();		#need params
@result = sort { By_Field_Range (1,3) } @a;   Print_Results();		#need params
@result = sort { By_Field_Array ( 1 ) } @a;   Print_Results();		#need params
@result = sort { By_Field_Array (1,2) } @a;   Print_Results();		#need params
@result = sort { By_Field_Array ( 2 ) } @a;   Print_Results();		#need params


__END__

Output:

c11 d12 e21 b22 a31 f32
c11 d12 b22 e21 a31 f32
c11 d12 b22 e21 a31 f32
c11 d12 e21 b22 a31 f32
c11 d12 b22 e21 a31 f32
c11 d12 e21 b22 a31 f32
a31 c11 e21 b22 d12 f32
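
The By_Both idea can also be written more compactly. The following is only
a sketch, not a replacement for the code above: since <=> returns 0 on a
tie, comparisons can be chained with ||. The helper name make_field_sorter
is hypothetical, and the sketch assumes everything lives in package main so
that the closure sees the same $a and $b that sort sets.

```perl
use strict;
use warnings;

# Hypothetical helper: build a comparator over the given field indices.
# The closure is compiled in package main, so it reads $main::a and
# $main::b, which is what sort sets when it is called from main.
sub make_field_sorter {
    my @fields = @_;
    return sub {
        for my $f (@fields) {
            my $cmp = $a->[$f] <=> $b->[$f];
            return $cmp if $cmp;    # first non-tie decides
        }
        return 0;                   # every listed field tied
    };
}

my @rows = ( ['a',3,1], ['b',2,2], ['c',1,1], ['d',1,2], ['e',2,1], ['f',3,2] );

# Chained form: <=> yields 0 on a tie, so || falls through to the next key.
my @chained = sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] } @rows;

# Generated form: the same ordering, with the field list as data.
my $by_1_then_2 = make_field_sorter(1, 2);
my @generated   = sort $by_1_then_2 @rows;

print join(' ', map { join '', @$_ } @chained),   "\n";   # c11 d12 e21 b22 a31 f32
print join(' ', map { join '', @$_ } @generated), "\n";   # c11 d12 e21 b22 a31 f32
```

Returning an explicit 0 after the loop avoids relying on $_ once the loop
has finished.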




------------------------------

Date: Mon, 23 Feb 2009 02:48:17 GMT
From: sln@netherlands.com
Subject: Re: sort question
Message-Id: <j434q49k7vk4c4mmqhasg6at693kurl99h@4ax.com>

On Mon, 23 Feb 2009 02:09:38 GMT, sln@netherlands.com wrote:

>On Sun, 22 Feb 2009 21:03:32 +0000, Nick Wedd <nick@maproom.co.uk> wrote:
>
[snip for code correction]
>
>-------------------------------------------------------------------------------
>## iii.pl
>## More sort junk
>## -sln
>
>use warnings;
>use strict;
>
>sub Sort_Template_Protype_aka_By_Both
>{
>  if ( $$a[1] < $$b[1] ) {return -1}
>  if ( $$a[1] > $$b[1] ) {return 1}
>  if ( $$a[1] == $$b[1]) {
>	if (($$a[2] < $$b[2])) {return -1}
>  	if (($$a[2] > $$b[2])) {return 1}
>	# if element 2's are equal {
>	#	.. check element 3, etc..
>	return 0
>  }
>}
>
>sub By_Both
>{
>  my $element_compare = $$a[1] <=> $$b[1];
>  $element_compare == 0 ? ($$a[2] <=> $$b[2]) : $element_compare;
>}
>
>sub By_Field_Range
>{
>	my ($start,$end) = @_;
>	return $$a[$start] <=> $$b[$start] if (!defined $end || $end <= $start);
>	for ($start..$end)
>	{
>		my $element_compare = $$a[$_] <=> $$b[$_];
>		next if ($element_compare == 0);
>		return $element_compare;
>	}
fix>	$$a[$_] <=> $$b[$_];
        ^
        0
This should certainly be 0, not only because $element_compare equals 0 here
but also because $_ could be undefined (my own template says so).

>}
>
>sub By_Field_Array
>{
>	return 0 if (!@_);
>	return $$a[$_[0]] <=> $$b[$_[0]] if (scalar(@_) == 1);
>	for (@_)
>	{
>		my $element_compare = $$a[$_] <=> $$b[$_];
>		next if ($element_compare == 0);
>		return $element_compare;
>	}
fix>	$$a[$_] <=> $$b[$_];
        ^
        0
This should certainly be 0, not only because $element_compare equals 0 here
but also because $_ could be undefined (my own template says so).

>}
>
>## --------------------------------------------------------
>my @result;
>my @a = ( ['a',3,1],['b',2,2],['c',1,1],['d',1,2],['e',2,1],['f',3,2] );
>
>sub Print_Results
>{
>	foreach my $i ( 0..5 ) {
>		print "$result[$i][0]$result[$i][1]$result[$i][2] ";
>	}
>	print "\n";
>}
>
>## Stuff to play with..
>## ---------------------
>
>@result = sort { By_Both } @a;		      Print_Results();
>@result = sort { By_Field_Range ( 1 ) } @a;   Print_Results();		#need params
>@result = sort { By_Field_Range (1,1) } @a;   Print_Results();		#need params
fix>@result = sort { By_Field_Range (1,3) } @a;   Print_Results();		#need params
                                       ^
                                  3 is out of range for the rows in @a and is not checked in the sort function; set it to 2.
                                  It's up to the user to check parameters.

>@result = sort { By_Field_Array ( 1 ) } @a;   Print_Results();		#need params
>@result = sort { By_Field_Array (1,2) } @a;   Print_Results();		#need params
>@result = sort { By_Field_Array ( 2 ) } @a;   Print_Results();		#need params
>
>
>__END__

-sln



------------------------------

Date: Mon, 23 Feb 2009 03:19:02 GMT
From: sln@netherlands.com
Subject: Re: Sorting based on existence of keys
Message-Id: <ek44q4dj3dju1niflvcb9ul54d7c2faklb@4ax.com>

On Sun, 22 Feb 2009 15:56:19 -0500, Uri Guttman <uri@stemsystems.com> wrote:

>>>>>> "JE" == Jürgen Exner <jurgenex@hotmail.com> writes:
>
>  JE> Uri Guttman <uri@stemsystems.com> wrote:
>  >>>>>>> "EP" == Eric Pozharski <whynot@pozharski.name> writes:
>  >> 
>  EP> On 2009-02-19, Uri Guttman <uri@stemsystems.com> wrote:
>  >> >> 
>  JE> $h{$a} and $h{$b} exist  ===>  length($h{$a}) <=> length($h{$b}) 
>  JE> $h{$a} exists but $h{$b} doesn't  ===> -1
>  JE> $h{$a} doesn't exist but $h{$b} does  ===> 1
>  JE> Neither $h{$a} nor $h{$b} exists  ===>  0
>  >> >> 
>  >> >> this is why doing a prefilter on the sort keys makes life much
>  >> >> simpler. 
>  JE> [...]
>  >> when you add that back
>  >> you get much more complicated comparisons. i haven't even brought up
>  >> speed for which the presort key extraction is needed. see my other post
>  >> for an example which should work if i typed it cleanly and is simpler,
>  >> clearer and faster.
>
>  JE> Different approaches to the same problem.
>
>  JE> You are favouring reducing/adjusting the data domain such that you can
>  JE> use standard Perl operators while I favour adding a new comparison
>  JE> operator to my data algebra, i.e. I have a given data domain and create
>  JE> the proper comparison operator for that given domain.
>  JE> To me my approach is much cleaner and simpler because I don't have to
>  JE> tweak the data set just to make the comparison work. Also, I am not
>  JE> convinced that your speed argument is correct, but it's really not
>  JE> important enough to write a big benchmark test. 
>  JE> To everyone his own, I guess.
>
>the prefilter design simplifies the logic no matter how you slice
>it. one common bug in multi-key sorts is getting the extraction right
>for each key and also keeping the proper order of
>comparisons. prefiltering reduces bugs because the extraction is coded
>one time and not twice with $a and $b. and it keeps the actual
>comparison code shorter as well so it is easier to manage the key order
>issues (sort up/down, etc.). as for speed, sort::maker comes with a
>benchmark script and the ability to generate a typical sort block like
>you have been doing as well as faster versions. it is easy to find where
>the breakeven point is for speed. the prefilter design is well known to
>be much faster for larger data sets and especially so for multi-key and
>complex sorts. nuff said here, i don't need to defend my point as it has
>been proven many times.
>
>uri

You keep saying 'pre-filter' as if it actually means anything in relation to sort.
This is a particular case of a sort implementation that has specific logic. There are no
multiple sort keys or fields in this case.

The logic still has to be adhered to, as far as this truth table goes:
  $h{$a} and $h{$b} exist  ===>  length($h{$a}) <=> length($h{$b}) 
  $h{$a} exists but $h{$b} doesn't  ===> -1
  $h{$a} doesn't exist but $h{$b} does  ===> 1
  Neither $h{$a} nor $h{$b} exists  ===>  0
(I didn't check it, but it looks right.)

It doesn't matter if you do it as a pre-filter or all at once. It may look cleaner
as a pre-filter, but that doesn't count for squat as far as speed goes. The same number of
operations has to be performed no matter where the logic is.

As for multi-key/field sorting, doing it in one pass, regardless of the sort method,
is always faster: fewer runs through the function or block.
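
For anyone comparing the two approaches, here is a sketch of each one
applied to the four rules above. The hash contents, the item list, and the
sub name are invented for illustration, and no timing claim is made either
way. The stable pragma is used only so that ties come out in a predictable
order.

```perl
use strict;
use warnings;
use sort 'stable';    # keep tied items in input order, for a predictable result

my %h     = ( apple => 'xx', fig => 'x', kiwi => 'xxx' );   # made-up data
my @items = qw(pear kiwi apple plum fig);                    # pear and plum are not in %h

# Approach 1: all of the logic lives in the comparison function.
sub by_existence_then_length {
    my ( $ea, $eb ) = ( exists $h{$a}, exists $h{$b} );
    return length( $h{$a} ) <=> length( $h{$b} ) if $ea && $eb;
    return -1 if $ea;
    return  1 if $eb;
    return  0;
}
my @direct = sort by_existence_then_length @items;

# Approach 2: pre-filter. Extract [item, rank, length] once per item,
# sort on the extracted keys, then strip the keys off again.
my @prefiltered =
    map  { $_->[0] }
    sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }
    map  { exists $h{$_} ? [ $_, 0, length $h{$_} ] : [ $_, 1, 0 ] }
    @items;

print "@direct\n";         # fig apple kiwi pear plum
print "@prefiltered\n";    # fig apple kiwi pear plum
```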

-sln



------------------------------

Date: Sun, 22 Feb 2009 23:37:21 +0000
From: Ben Morrow <ben@morrow.me.uk>
Subject: Re: utf8 and chomp
Message-Id: <h64976-mq1.ln1@osiris.mauzo.dyndns.org>


Quoth Josef Feit <jfeit@ics.muni.cz>:
> 
> I have run across a Perl behaviour which I do not
> understand:
> 
> I am trying to analyze some text with utf8 characters,
> eg a file with "nXlXx", where the 'X' stands for
> some utf8 encoded character.  eg. "náláx"
> (not sure whether it gets through).
> 
> Please change the 'X' in the %ascii for some
> utf8 character (should be 'á').
> 
> 
> #!/usr/bin/perl
> # -----------------------------------------------------------
> use warnings;
> use strict;
> use encoding 'utf-8';
> use 5.010;
> 
> my %ascii = (
>       'X' => 'a',
> );
> 
> my $line = <>;
> chomp $line;    # to chomp or not to chomp
> print length($line), ": ";;
> for( my $i = 0; $i < length($line); $i++ ){
>    my $znak = substr($line, $i, 1);

This is more cleanly written as something like

    for my $znak (split //, $line) {

>    if( exists( $ascii{$znak} ) ){
>       print "+";
>    }else{
>       print "-";
>    }
> }
> print "\n";
> 
> ---
> The problem is with the chomp:
> 
> In case I chomp the $line, the output is as
> expected: 5: -+-+-
> 
> If I comment out the chomp, the result is
> 8: --------
> so the Perl does not consider the $line to be
> utf8 encoded.

Here, with 5.10.0 built for i386-freebsd-thread-multi, I get -+-+- with
the chomp and -+-+-- without, as expected. Are you sure you are using
the same input both times?

Ben



------------------------------

Date: Mon, 23 Feb 2009 03:47:45 +0200
From: Eric Pozharski <whynot@pozharski.name>
Subject: Re: utf8 and chomp
Message-Id: <slrngq403t.fng.whynot@orphan.zombinet>

On 2009-02-22, Josef Feit <jfeit@ics.muni.cz> wrote:
*SKIP*
> The problem is with the chomp:
>
> In case I chomp the $line, the output is as
> expected: 5: -+-+-
>
> If I comment out the chomp, the result is
> 8: --------
> so the Perl does not consider the $line to be
> utf8 encoded.
>
> Is this a side effect of chomp or do I have it
> wrong? I need not to chomp and get the utf8.

Just checked -- I can't recreate that.  I have C<5: -+-+-> with B<chomp>
and C<6: -+-+--> without.  Consider forcing I<$line> to be utf8
(C<perldoc Encode> has more).
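
A minimal sketch of what forcing the decoding might look like, assuming the
input really is UTF-8 bytes. The helper name mark_chars is hypothetical, and
the sample bytes are hard-coded here instead of being read from a file.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Characters to flag; "\x{e1}" is a-acute, written as an escape so this
# file does not depend on its own encoding.
my %ascii = ( "\x{e1}" => 'a' );

# Hypothetical helper: decode raw UTF-8 bytes into characters, then mark
# each character '+' if it is a key of %ascii and '-' otherwise.
sub mark_chars {
    my ($bytes) = @_;
    my $line = decode( 'UTF-8', $bytes );
    return join '', map { exists $ascii{$_} ? '+' : '-' } split //, $line;
}

# The UTF-8 bytes of: n, a-acute, l, a-acute, x, newline.
my $raw = "n\xc3\xa1l\xc3\xa1x\n";

( my $chomped = $raw ) =~ s/\n\z//;
print mark_chars($chomped), "\n";    # -+-+-
print mark_chars($raw),     "\n";    # -+-+--
```

With the bytes decoded explicitly, the trailing newline only adds one more
C<-> and does not change how the other characters are counted.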

p.s.  And rewrite your C in Perl.


-- 
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom


------------------------------

Date: Mon, 23 Feb 2009 04:06:02 GMT
From: "-Brad-" <thesnake_123@-NO-S_P_A_M-hotmail.com>
Subject: XML Simple force array
Message-Id: <Kgpol.22881$cu.22286@news-server.bigpond.net.au>

Hi all,

I have an xml file that looks like :

<control_file name="XXXX_99.ctl">
    <files_ps count="2">
        <file seq="1" name="file1.gz" size="1107045" />
        <file seq="2" name="file2.gz" size="1107045" />
    </files_ps>
</control_file>

I would like to be able to loop through all the child elements under 
files_ps, and print out their attribute values.
I was planning on using forcearray on the 'file' node so I can loop through 
all the array elements, but I can't seem to get it to work.

use Data::Dumper;
use XML::Simple;

my $xs = new XML::Simple();
my $xml = $xs->XMLin(<->,
                     keeproot  => '1',
                     forcearray=> ['file',]
                     );

print Dumper($xml);  # it still looks like a hash, not an array!


Any help would be appreciated!

Cheers





------------------------------

Date: Mon, 23 Feb 2009 07:03:45 +0100
From: Christian Winter <thepoet_nospam@arcor.de>
Subject: Re: XML Simple force array
Message-Id: <49a23c2a$0$31339$9b4e6d93@newsspool4.arcor-online.net>

-Brad- wrote:
> I have an xml file that looks like :
> 
> <control_file name="XXXX_99.ctl">
>     <files_ps count="2">
>         <file seq="1" name="file1.gz" size="1107045" />
>         <file seq="2" name="file2.gz" size="1107045" />
>     </files_ps>
> </control_file>
> 
> I would like to be able to loop through all the child elements under 
> files_ps, and print out their attribute values.
> I was planning on using forcearray on the 'file' node so I can loop through 
> all the array elements, but I can't seem to get it to work.
> 
> use Data::Dumper;
> use XML::Simple;
> 
> my $xs = new XML::Simple();
> my $xml = $xs->XMLin(<->,
>                      keeproot  => '1',
>                      forcearray=> ['file',]

"forcearray" ne "ForceArray". Perl is case sensitive.

>                      );
> 
> print Dumper($xml);  # it still looks like a hash, not an array!

-Chris


------------------------------

Date: Mon, 23 Feb 2009 07:10:54 +0100
From: Marc Lucksch <perl@marc-s.de>
Subject: Re: XML Simple force array
Message-Id: <gntek2$2sln$1@ariadne.rz.tu-clausthal.de>

-Brad- schrieb:
> Hi all,
> 
> I have an xml file that looks like :
> 
> <control_file name="XXXX_99.ctl">
>     <files_ps count="2">
>         <file seq="1" name="file1.gz" size="1107045" />
>         <file seq="2" name="file2.gz" size="1107045" />
>     </files_ps>
> </control_file>
> 
> I would like to be able to loop through all the child elements under 
> files_ps, and print out their attribute values.
> I was planning on using forcearray on the 'file' node so I can loop through 
> all the array elements, but I can't seem to get it to work.
> 
That is because XML::Simple treats the name special and as a unique 
identifier, you will have to switch that off:

use Data::Dumper;
use XML::Simple;

my $xs = new XML::Simple();
my $xml = $xs->XMLin("test.xml",
                      KeyAttr => [],
                      keeproot  => '1',
                      forcearray=> ['file',]
                      );

print Dumper($xml);
__END__

$VAR1 = {
     'control_file' => {
         'name' => 'XXXX_99.ctl',
         'files_ps' => {
             'count' => '2',
             'file' => [
                 {
                     'name' => 'file1.gz',
                     'seq' => '1',
                     'size' => '1107045'
                 },
                 {
                     'name' => 'file2.gz',
                     'seq' => '2',
                     'size' => '1107045'
                 }
             ]
         }
     }
};

On the other hand, if name is unique you could also use foreach keys to 
iterate.
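
With the structure dumped above, looping over the files could look roughly
like this. It is a sketch that assumes XML::Simple is installed and feeds
the XML in as a string instead of reading test.xml; the variable names are
made up for the example.

```perl
use strict;
use warnings;
use XML::Simple;

my $xml_text = <<'XML';
<control_file name="XXXX_99.ctl">
    <files_ps count="2">
        <file seq="1" name="file1.gz" size="1107045" />
        <file seq="2" name="file2.gz" size="1107045" />
    </files_ps>
</control_file>
XML

my $data = XMLin(
    $xml_text,
    KeyAttr    => [],          # do not fold <file> elements on 'name'
    KeepRoot   => 1,           # keep <control_file> as the top-level key
    ForceArray => ['file'],    # always an array ref, even for a single <file>
);

# Walk the array of <file> elements and format each one's attributes.
my @lines;
for my $file ( @{ $data->{control_file}{files_ps}{file} } ) {
    push @lines, sprintf 'seq=%s name=%s size=%s', @{$file}{qw(seq name size)};
}
print "$_\n" for @lines;
```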

Marc "Maluku" Lucksch


------------------------------

Date: Mon, 23 Feb 2009 07:13:14 +0100
From: Marc Lucksch <perl@marc-s.de>
Subject: Re: XML Simple force array
Message-Id: <gnteof$2sln$2@ariadne.rz.tu-clausthal.de>

Christian Winter schrieb:
> -Brad- wrote: 
> "forcearray" ne "ForceArray". 
Yes
> Perl is case sensitive.
Nope, that ain't it, XML::Simple doesn't care.

Marc "Maluku" Lucksch


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2229
***************************************

