[23916] in Perl-Users-Digest


Perl-Users Digest, Issue: 6118 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Feb 11 18:10:44 2004

Date: Wed, 11 Feb 2004 15:10:14 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 11 Feb 2004     Volume: 10 Number: 6118

Today's topics:
    Re: RFC: utils.pm <tore@aursand.no>
    Re: RFC: utils.pm <tassilo.parseval@rwth-aachen.de>
    Re: RFC: utils.pm <uri@stemsystems.com>
    Re: RFC: utils.pm <tore@aursand.no>
    Re: RFC: utils.pm <tore@aursand.no>
    Re: RFC: utils.pm <uri@stemsystems.com>
    Re: RFC: utils.pm <usenet@morrow.me.uk>
    Re: RFC: utils.pm <usenet@morrow.me.uk>
        Simple question <barty@fuelfix.com>
    Re: Simple question (Walter Roberson)
    Re: Simple question <tadmc@augustmail.com>
    Re: Simple question <tore@aursand.no>
    Re: Simple question (Walter Roberson)
        strange behaviour with map inside a hash <nospam@nospam.org>
    Re: strange behaviour with map inside a hash <bik.mido@tiscalinet.it>
    Re: strange behaviour with map inside a hash <uri@stemsystems.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 11 Feb 2004 20:36:05 +0100
From: Tore Aursand <tore@aursand.no>
Subject: Re: RFC: utils.pm
Message-Id: <pan.2004.02.11.19.31.24.54398@aursand.no>

On Wed, 11 Feb 2004 15:23:37 +0000, Ben Morrow wrote:
>> split_csv( $value ) - Easy CSV splitting. (*)

> As you say, there are modules to do this. Or
> 
> @values = split /,/, $values;

Not quite as powerful as this one;

  my @array = ();
  push( @array, $+ ) while $string =~ m{
      "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
      | ([^,]+),?
      | ,
  }gx;
  push( @array, undef ) if ( substr($string, -1, 1) eq ',' );
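
To make the difference from a plain split concrete, here is a minimal sketch that wraps the pattern above in a sub. The name split_csv_sketch is hypothetical, not the name used in the proposed module, and the sample line is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper name; the pattern is the one quoted above.
sub split_csv_sketch {
    my ($string) = @_;
    my @fields;
    push @fields, $+ while $string =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # a quoted field (backslash escapes allowed)
        | ([^,]+),?                      # an unquoted field with at least one character
        | ,                              # an empty field
    }gx;
    # A trailing comma means one more (empty) field at the end.
    push @fields, undef if substr($string, -1, 1) eq ',';
    return @fields;
}

my $line   = 'a,"b,c",d';
my @naive  = split /,/, $line;          # cuts the quoted field in two
my @quoted = split_csv_sketch($line);   # keeps b,c together

print scalar(@naive), " fields vs ", scalar(@quoted), " fields\n";
```

The plain split treats every comma as a separator, so it only fits data that never contains quoted commas.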

>> random_string( $length ) - Generates a random string $length
>>                            characters long.

> This is a little harder... also *much* less useful.

Maybe, but I find myself constantly using this one when I want to create
cookies.  Nice to have. :)

>> as_string( $value, [$default] ) - Always returns a defined value,
>>                                   optionally $default if $value
>>                                   isn't defined.

> Eh what? You don't *need* to cast in Perl.
> 
> $value || $default

This only returns $default if $value isn't true.  So if we have this
function:

  sub as_string {
      my $value   = shift;
      my $default = shift;
      return $value || $default;
  }

What happens when you pass 0 to this function?

> or defined($value) ? $value : $default

Better, but I'm tired of writing this god damn line each time I want to
make sure that a value is defined. :)

My 'as_string' function will only return the $default value if a) it is
given, and b) if the input $value is undefined.
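
A minimal sketch of that behaviour, for comparison with the || version. The name as_string_sketch is hypothetical, and falling back to the empty string when no default is given is an assumption based on "always returns a defined value".

```perl
use strict;
use warnings;

# Hypothetical sketch; the '' fallback is an assumption, not taken from
# the module under discussion.
sub as_string_sketch {
    my ($value, $default) = @_;
    return $value   if defined $value;
    return $default if defined $default;
    return '';
}

print as_string_sketch(0, 'n/a'), "\n";       # 0 is defined, so it is kept
print as_string_sketch(undef, 'n/a'), "\n";   # undef falls back to the default
print 0 || 'n/a', "\n";                       # the || version loses the 0
```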

When it comes to "casting": maybe it's not the correct word to use, but
when dealing with e.g. CGI parameters, I find my functions useful.  Here's
an example;

  my $firstname = as_string( $cgi->param('firstname') );
  my $lastname  = as_string( $cgi->param('lastname') );
  my $age       = as_int( $cgi->param('age') );

>> as_int( $value ) - Always returns $value as an integer.

> err.. int($value)

What if $value isn't a number?  Ah.  A warning occurs, of course.  I don't
want warnings.  'as_int' clearly says that I want _any_ value passed to it
returned as an integer.  If that's not possible, return 0.
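
One way to get that behaviour without triggering a warning, as a rough sketch: check the value first with looks_like_number from the core Scalar::Util module. The name as_int_sketch is hypothetical and this is not the implementation from the module being discussed.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical sketch: anything that is not numeric becomes 0, and the
# check happens before int() so no "isn't numeric" warning is raised.
sub as_int_sketch {
    my ($value) = @_;
    return 0 unless defined $value && looks_like_number($value);
    return int $value;    # int() truncates toward zero
}

print as_int_sketch($_), "\n" for '42', '3.9', 'abc';
```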

> Or POSIX::floor, or POSIX::ceil.

I know about them.  I just didn't want to deal with more than one module,
so I created my own functions instead. :)

>> as_decimal( $value, [$decimals] ) - Always returns $value as a
>>                                     decimal number with $decimals
>>                                     numbers after the decimal point.

> sprintf

Exactly what 'as_decimal' uses, except that I prefer writing

  my $value = as_decimal( 1.23456789, 2 );

instead of

  my $value = sprintf( '%.2f', 1.23456789 );
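
For illustration, a wrapper of this kind can be very small. The sketch below is hypothetical (including the name and the default of two decimals) and uses the '*' precision form of sprintf so that the number of decimals can be passed in.

```perl
use strict;
use warnings;

# Hypothetical sketch; the default of 2 decimals is an assumption.
sub as_decimal_sketch {
    my ($value, $decimals) = @_;
    $decimals = 2 unless defined $decimals;
    return sprintf('%.*f', $decimals, $value);   # '*' takes the precision from the argument list
}

print as_decimal_sketch(1.23456789, 2), "\n";
```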

>> as_boolean( $value ) - Always returns $value as a boolean value (i.e.
>>                        TRUE/1 or FALSE/0).

> !!$value

Yeah, but 'as_boolean' takes care of "other" boolean values too;

  sub as_boolean {
      my $value = as_string( shift );
      return ( $value =~ m,^(?:1|y|yes|on|true)$,i ) ? 1 : 0;
  }

I needed this function when dealing with output from an application where
a true value could be almost anything.

No, I didn't write that program, and it wasn't in Perl (or any other
interpreted language), so I couldn't change the output.
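
A note on patterns of this kind: the anchors only apply to every alternative when the alternation is grouped. The sketch below compares an ungrouped and a grouped version; the sub names and sample inputs are hypothetical.

```perl
use strict;
use warnings;

# Ungrouped: ^ binds only to "1" and $ only to "true", so any string
# that merely contains "y", "yes" or "on" also matches.
sub matches_ungrouped { return $_[0] =~ /^1|y|yes|on|true$/i ? 1 : 0 }

# Grouped: both anchors apply to every alternative.
sub matches_grouped   { return $_[0] =~ /^(?:1|y|yes|on|true)$/i ? 1 : 0 }

for my $input (qw(yes maybe 10 off)) {
    printf "%-5s ungrouped=%d grouped=%d\n",
        $input, matches_ungrouped($input), matches_grouped($input);
}
```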

>> as_date( $value ) - Always returns $value as a date (YYYY-MM-DD).
>> as_time( $value ) - Always returns $value as a time (HH:MM:SS).
>> as_datetime( $value ) - Always returns $value as a datetime (which
>>                         means combining as_date() and as_time()).

> POSIX::strftime

'as_date', 'as_time' and 'as_datetime' are - of course - a lot easier to
use than 'strftime'. :)

I don't use the 'strftime' much, and every time I had to use it, I had to
look it up in the documentation.  I don't want to do that.  I'm lazy. :)

>> VALIDATION
>>   Each of the CASTING functions also has an is_* function, which returns
>>   TRUE/1 or FALSE/0 depending on whether the input argument conforms to
>>   the datatype.

> Ummm... everything is (can be) a string.

Yes, but not everything can be everything _else_ than a string.  Let's
look at my CGI parameter example again.  Instead of writing

  my $nr = (defined $cgi->param('nr')) ? $cgi->param('nr') : 0;
  unless ( $nr >= 0 && $nr <= 12 ) {
      # Error
  }

I want to write

  my $nr = as_int( $cgi->param('nr') );
  unless ( is_int($nr, 0, 12) ) {
      # Error
  }

> is_int is simply $value =~ /^\d+$/.

No.  Take a look at this:

  my @values = ( 2, -2, +2, '2', '-2', '+2' );
  foreach ( @values ) {
      print "Is $_ an integer? ";
      ( /^\d+$/ ) ? print 'Yes' : print 'No';
      print "\n";
  }

Of course, one could argue about whether a stringified '-2' (or '+2')
really is an integer, but functions like these are nice to have if you
want to _avoid_ errors:

  # $value appears from somewhere, possibly from a stupid user :)
  if ( is_int($value) ) {
      # Ok, we can use this value
      $value = as_int( $value ); # 'Cast' it to integer
  }
  else {
      # Sorry, but $value isn't even close to being an integer
  }
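
A minimal sketch of such a check, accepting an optional sign and an optional range as in the is_int($nr, 0, 12) call earlier. The name is_int_sketch and the exact rules (no whitespace, no decimal point) are assumptions, not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical sketch: optional sign, digits only, optional min/max.
sub is_int_sketch {
    my ($value, $min, $max) = @_;
    # \z (rather than $) also rejects a trailing newline.
    return 0 unless defined $value && $value =~ /^[-+]?\d+\z/;
    return 0 if defined $min && $value < $min;
    return 0 if defined $max && $value > $max;
    return 1;
}

for my $candidate ( 2, -2, '+2', '2.5', 'two' ) {
    print "Is $candidate an integer? ", ( is_int_sketch($candidate) ? 'Yes' : 'No' ), "\n";
}
```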

> Dates can and should be handled by your favourite date-time parsing
> module: the code is non-trivial, so should be reused.

I guess you're right.  I created many of them (the functions) "by hand",
though, and tested them against Date::Calc and Date::Manip.  Man, that was
one useful learning session. :)

> I can't think offhand how I'd do is_decimal, probably because I've
> never had occasion to.

I once (...) encountered a problem with a web form where the user had to
register some data, and he/she _had_ to enter a number as a decimal.  That
was to make sure that the data entered was as accurate as possible, or
something like that;

  sub is_decimal {
      my $value = as_string( shift );
      return ( $value =~ m,^[-+]?(\d+)?\.\d+$, ) ? 1 : 0;
  }

>> round( $value ) - Rounds a number to the nearest integer.

> int($value + 0.5) is usually good enough.

Yeah, except when you want to round negative numbers as well;

  return ( $value > 0 ) ? int($value + 0.5) : int($value - 0.5);
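
The reason the sign matters is that int() truncates toward zero. A small sketch, with a hypothetical sub name, that rounds halves away from zero:

```perl
use strict;
use warnings;

# Hypothetical name; rounds to the nearest integer, halves away from zero.
sub round_sketch {
    my ($value) = @_;
    return $value >= 0 ? int($value + 0.5) : int($value - 0.5);
}

# int(-2.6 + 0.5) is int(-2.1), which truncates to -2 instead of -3.
print int(-2.6 + 0.5), " vs ", round_sketch(-2.6), "\n";
```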

>> random_number( $min, $max ) - Returns a random number in the range
>>                               $min to $max.

> (rand * ($max - $min)) + $min

Right, but what if only one of $min or $max is known?

>> format_number( $value, $separator ) - Formats a number with a given
>>                                       separator; 1234 becomes 1,234.

> Less trivial... is probably better done by something locale-aware.

Who says that a 'Utils' module can't be locale-aware? :)
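
For reference, a non-locale-aware version is only a few lines. The sketch below follows the well-known substitution loop for inserting thousands separators; the sub name is hypothetical and the separator is simply whatever the caller passes in.

```perl
use strict;
use warnings;

# Hypothetical sketch, not locale-aware: repeatedly split off the last
# three digits of the leading integer part.
sub format_number_sketch {
    my ($value, $separator) = @_;
    $separator = ',' unless defined $separator;
    my $text = "$value";
    1 while $text =~ s/^([-+]?\d+)(\d{3})/$1$separator$2/;
    return $text;
}

print format_number_sketch(1234, ','), "\n";
```

A locale-aware version would have to take the separator (and the grouping) from the locale rather than from an argument.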

>> unique( $arrayref ) - Returns only the unique elements in $array.
>> intersection( $arrayref1, $arrayref2 ) - Computes the intersection
>>                                          of two array references.
>> union( $arrayref1, $arrayref2 ) - Computes the union of two array
>>                                   references.

> These should all be in a module called Set::Util... feel free to write
> it.

It isn't in a module already?  Doesn't it fit into List::Util?


-- 
Tore Aursand <tore@aursand.no>
"Scientists are complaining that the new "Dinosaur" movie shows
 dinosaurs with lemurs, who didn't evolve for another million years.
 They're afraid the movie will give kids a mistaken impression. What
 about the fact that the dinosaurs are singing and dancing?" -- Jay
 Leno


------------------------------

Date: 11 Feb 2004 20:07:33 GMT
From: "Tassilo v. Parseval" <tassilo.parseval@rwth-aachen.de>
Subject: Re: RFC: utils.pm
Message-Id: <c0e225$216$1@nets3.rz.RWTH-Aachen.DE>

Also sprach Tore Aursand:

> On Wed, 11 Feb 2004 17:17:59 +0000, Tassilo v. Parseval wrote:
>>>> The real reasons though for not including them is indeed the size
>>>> argument. A recent Perl distribution is really big and the amount of
>>>> code in it (that has to be maintained) scary.
> 
>>> I agree on this one.  The core size of Perl should always be a minimum,
>>> but I don't think that a module like this one would have much impact on
>>> the size. :)
> 
>> Haha, you are too optimistic, I am afraid. :-)  If such functions ever
>> made it into the core, clearly they would be heavily optimized. In the
>> end, they'd be written in C (note that there is no core function that is
>> written in pure Perl).
> 
> My fault entirely:  I did _not_ mean the Perl *core*, but the set of
> standard modules shipped with Perl.  Doh.  Can't even blame it on having
> slept too little in the last few days. :)
> 
> All the functions I listed are pure Perl at the moment.  I'm not too
> familiar with C, but I understand that "XS programming" is something to
> investigate further.

No doubt you should do that. Also, XS is a good opportunity to learn or
refresh your C.

I guess some of your proposed string functions are good candidates for
XSization. Since you say that you use them often, you'll gain a little
bit of speed in your scripts if you come up with a not too unreasonable
C implementation.

Tassilo
-- 
$_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
$_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval


------------------------------

Date: Wed, 11 Feb 2004 21:28:38 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: RFC: utils.pm
Message-Id: <x7lln971bt.fsf@mail.sysarch.com>

>>>>> "TA" == Tore Aursand <tore@aursand.no> writes:

  TA> When it comes to "casting": maybe it's not the correct word to use, but
  TA> when dealing with e.g. CGI parameters, I find my functions useful.  Here's
  TA> an example;

  TA>   my $firstname = as_string( $cgi->param('firstname') );
  TA>   my $lastname  = as_string( $cgi->param('lastname') );

cgi params are always strings. nothing but strings can be passed to a
web server.

  TA>   my $age       = as_int( $cgi->param('age') );

my $age = $cgi->param('age') + 0 ;

not needed. as soon as you use it as a number it becomes a number. and
having perl do that is faster than a sub call.

  >>> as_int( $value ) - Always returns $value as an integer.

  >> err.. int($value)

  TA> What if $value isn't a number?  Ah.  A warning occurs, of course.  I don't
  TA> want warnings.  'as_int' clearly says that I want _any_ value passed to it
  TA> returned as an integer.  If that's not possible, return 0.

that is not the meaning of 'as_int'. call it verify_int then. and you
can trap or ignore the warning. and checking if it is an int is trivial:

	my $int = $param =~ /\D/ ? 0 : $param ;

  TA>   sub as_boolean {
  TA>       my $value = as_string( shift );
  TA>       return ( $value =~ m,^(?:1|y|yes|on|true)$,i ) ? 1 : 0;
  TA>   }

  TA> I needed this function when dealing with output from an application where
  TA> a true value could be almost anything.

that isn't almost anything. it doesn't handle undef or the null string
which are false. in fact it only tests for YOUR true values which are
not universal. not useful in a general purpose module.

  >> POSIX::strftime

  TA> 'as_date', 'as_time' and 'as_datetime' are - of course - a lot easier to
  TA> use than 'strftime'. :)

and as i said, they have a bug. they don't take time() as an argument so
there can be skewed times between calls.

  TA> I don't use the 'strftime' much, and every time I had to use it, I had to
  TA> look it up in the documentation.  I don't want to do that.  I'm lazy. :)

but strftime is correct which is lazier

  TA> Yes, but not everything can be everything _else_ than a string.  Let's
  TA> look at my CGI parameter example again.  Instead of writing

  TA>   my $nr = (defined $cgi->param('nr')) ? $cgi->param('nr') : 0;
  TA>   unless ( $nr >= 0 && $nr <= 12 ) {
  TA>       # Error
  TA>   }

  TA> I want to write

  TA>   my $nr = as_int( $cgi->param('nr') );
  TA>   unless ( is_int($nr, 0, 12) ) {
  TA>       # Error
  TA>   }

  >> is_int is simply $value =~ /^\d+$/.

  TA> No.  Take a look at this:

  TA>   my @values = ( 2, -2, +2, '2', '-2', '+2' );
  TA>   foreach ( @values ) {
  TA>       print "Is $_ an integer? ";
  TA>       ( /^\d+$/ ) ? print 'Yes' : print 'No';
  TA>       print "\n";
  TA>   }

so prefix a [+-]? to it.

and call it verify_int.

and better yet, wrap it so it does all of that in one sub with the cgi
by subclassing:

sub CGI::verify_int {

	my( $self, $param ) = @_ ;

	my $cgi_val = $self->param( $param ) ;

	return unless defined $cgi_val ;

	return unless $cgi_val =~ /^\s*[-+]?\d+$/ ;

	return $cgi_val + 0 ;
}

better and simpler to use than your code.

	my $int = $cgi->verify_int( 'foo' ) ;
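
one thing to watch with a sub like this: a valid 0 is false, so the caller has to test the result with defined rather than for truth. a minimal sketch of the same checks as a plain function, without the CGI object (the name verify_int_value is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical plain-function variant of the checks above.
sub verify_int_value {
    my ($val) = @_;
    return unless defined $val;
    return unless $val =~ /^\s*[-+]?\d+$/;
    return $val + 0;
}

my $int = verify_int_value('0');
# Test for definedness: a valid 0 would fail a plain truth test.
print defined $int ? "valid: $int\n" : "invalid\n";
```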

  >> Dates can and should be handled by your favourite date-time parsing
  >> module: the code is non-trivial, so should be reused.

  TA> I guess you're right.  I created many of them (the functions) "by hand",
  TA> though, and tested them against Date::Calc and Date::Manip.  Man, that was
  TA> one useful learning session. :)

and you couldn't have possibly covered all date formats.

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: Wed, 11 Feb 2004 22:53:42 +0100
From: Tore Aursand <tore@aursand.no>
Subject: Re: RFC: utils.pm
Message-Id: <pan.2004.02.11.21.47.56.856600@aursand.no>

On Wed, 11 Feb 2004 21:28:38 +0000, Uri Guttman wrote:
>> my $firstname = as_string( $cgi->param('firstname') );
>> my $lastname  = as_string( $cgi->param('lastname') );

> cgi params are always strings. nothing but strings can be passed to a
> web server.

Of course, but the values of 'firstname' and/or 'lastname' can be
undefined.  It's so much simpler - and faster - to let a function take
care of making sure that $firstname and $lastname are defined.

>> my $age = as_int( $cgi->param('age') );

> my $age = $cgi->param('age') + 0 ;

Except when you want to do range checking and set the value to a
default if the input argument is out of range.  'as_int()' and a
lot of the other 'as_*' functions take care of that;

  my $value = as_int( $input, $min, $max, $default );

>> What if $value isn't a number?  Ah.  A warning occurs, of course.  I don't
>> want warnings.  'as_int' clearly says that I want _any_ value passed to it
>> returned as an integer.  If that's not possible, return 0.

> that is not the meaning of 'as_int'. call it verify_int then.

Ok, maybe it should be renamed.  I just thought 'as_int()' made sense,
'cause no matter what I send as arguments to that function, I want an
integer back. :)

>> sub as_boolean {
>>     my $value = as_string( shift );
>>     return ( $value =~ m,^(?:1|y|yes|on|true)$,i ) ? 1 : 0;
>> }

> that isn't almost anything. it doesn't handle undef or the null string
> which are false.

Yes, it does.  'as_string(shift)' makes sure that $value is defined.  And
the function only returns true if $value matches the regular expression. 
A blank value is - of course - false.

> in fact it only tests for YOUR true values which are not universal. not
> useful in a general purpose module.

That is - of course - correct.

>> 'as_date', 'as_time' and 'as_datetime' are - of course - a lot
>> easier to use than 'strftime'. :)

> and as i said, they have a bug. they don't take time() as an argument so
> there can be skewed times between calls.

Hmm.  Pardon my English, but what do you really mean here?  The three
functions above can be used like this (with 'as_date()' as example):

  my $date = as_date( '2004-02-11' );
  my $date = as_date( '2004-02-11 12:34' );
  # and other variations of the above
  my $date = as_date( 2004, 02, 11 );
  my $date = as_date( time );

> [...]
> and call it verify_int.
> 
> and better yet, wrap it so it does all of that in one sub with the cgi
> by subclassing:
> [...]

Ah - forget about CGI.  I just used it as an example. :)  Most of my work
involves creating CGI scripts (and/or mod_perl-based web applications).

> [...]
> and you couldn't have possibly covered all date formats.

"There can be only one..." :)


-- 
Tore Aursand <tore@aursand.no>
"Writing is a lot like sex. At first you do it because you like it.
 Then you find yourself doing it for a few close friends and people you
 like. But if you're any good at all, you end up doing it for money."
 -- Unknown


------------------------------

Date: Wed, 11 Feb 2004 22:53:43 +0100
From: Tore Aursand <tore@aursand.no>
Subject: Re: RFC: utils.pm
Message-Id: <pan.2004.02.11.21.35.19.479315@aursand.no>

On Wed, 11 Feb 2004 20:07:33 +0000, Tassilo v. Parseval wrote:
>> All the functions I listed are pure Perl at the moment.  I'm not too
>> familiar with C, but I understand that "XS programming" is something to
>> investigate further.

> No doubt you should do that. Also, XS is a good opportunity to learn or
> refresh your C.

Yeah.  Does anyone have related documents for me (apart from perlxs and
perlxstut)?

> I guess some of your proposed string functions are good candidates for
> XSization. Since you say that you use them often, you'll gain a little
> bit of speed in your scripts if you come up with a not too unreasonable
> C implementation.

Premature optimization is the root of all evil. :)  The functions I
listed are - of course - subject to optimization at some time, but I don't
need them to be fast.  I need them to do my work faster (and better).


-- 
Tore Aursand <tore@aursand.no>
"The purpose of all war is ultimately peace." -- Saint Augustine


------------------------------

Date: Wed, 11 Feb 2004 21:59:46 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: RFC: utils.pm
Message-Id: <x7y8r95lbh.fsf@mail.sysarch.com>

>>>>> "TA" == Tore Aursand <tore@aursand.no> writes:


  >>> 'as_date', 'as_time' and 'as_datetime' are - of course - a lot
  >>> easier to use than 'strftime'. :)

  >> and as i said, they have a bug. they don't take time() as an argument so
  >> there can be skewed times between calls.

  TA> Hmm.  Pardon my English, but what do you really mean here?  The three
  TA> functions above can be used like this (with 'as_date()' as example):

i meant the now* funcs. they call time internally and can be skewed
between neighboring calls. see my first followup.
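
a minimal sketch of the usual way around that kind of skew: read the clock once and format both strings from the same value. the sub name is hypothetical; strftime comes from the core POSIX module.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: both strings are built from one epoch value, so
# they always describe the same instant.
sub date_and_time {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;   # read the clock once
    my @tm = localtime $epoch;
    return ( strftime('%Y-%m-%d', @tm), strftime('%H:%M:%S', @tm) );
}

my ($date, $time) = date_and_time();
print "$date $time\n";
```

calling two separate helpers that each read time() on their own can straddle a second (or midnight) boundary, which is the skew described above.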

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: Wed, 11 Feb 2004 22:06:45 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: RFC: utils.pm
Message-Id: <c0e91l$9nq$1@wisteria.csv.warwick.ac.uk>


Tore Aursand <tore@aursand.no> wrote:
> On Wed, 11 Feb 2004 15:23:37 +0000, Ben Morrow wrote:
> >> split_csv( $value ) - Easy CSV splitting. (*)
> 
> > As you say, there are modules to do this. Or
> > 
> > @values = split /,/, $values;
> 
> Not quite as powerful as this one;
> 
>   my @array = ();
>   push( @array, $+ ) while $string =~ m{
>       "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
            ^ ^
            Unnecessary

>       | ([^,]+),?
>       | ,
>   }gx;
>   push( @array, undef ) if ( substr($string, -1, 1) eq ',' );

use Text::Balanced qw/extract_multiple extract_delimited/;

extract_multiple $string,
    [ sub { extract_delimited $_[0], '"' }, qr/[^,]*/ ],
    undef, 1;

Or, there are modules to do it.

> >> random_string( $length ) - Generates a random string $length
> >>                            characters long.
> 
> > This is a little harder... also *much* less useful.
> 
> Maybe, but I find myself constantly using this one when I want to create
> cookies.  Nice to have. :)

Well, yeah; but hardly of general utility. Perl != CGI.

> >> as_string( $value, [$default] ) - Always returns a defined value,
> >>                                   optionally $default if $value
> >>                                   isn't defined.
> 
> > Eh what? You don't *need* to cast in Perl.
> > 
> > $value || $default
> 
> This only returns $default if $value isn't true.  So if we have this
> function:

Yeah, yeah; I know that. It's good enough most of the time,
though. Roll on 5.10 when (I think) they're going to put // in (which
tests for definedness rather than truth).
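
A minimal sketch of the difference, assuming a perl that has the operator (5.10 or later). The sub names are hypothetical.

```perl
use strict;
use warnings;
use 5.010;    # the // operator needs Perl 5.10 or later

sub default_with_or  { my ($value, $default) = @_; return $value || $default }
sub default_with_dor { my ($value, $default) = @_; return $value // $default }

print default_with_or(0, 18), "\n";    # 0 is false, so the default is used
print default_with_dor(0, 18), "\n";   # 0 is defined, so it is kept
```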

> > or defined($value) ? $value : $default
> 
> Better, but I'm tired of writing this god damn line each time I want to
> make sure that a value is defined. :)

But it's a lot clearer... if the default is not supplied, your
function is a noop; otherwise, it is a defined-or-default
function. Not what I'd call as_string.

> When it comes to "casting": maybe it's not the correct word to use, but
> when dealing with e.g. CGI parameters, I find my functions useful.  Here's
> an example;
> 
>   my $firstname = as_string( $cgi->param('firstname') );

What, precisely, is the difference between this and

    my $firstname = $cgi->param('firstname');

?

> >> as_int( $value ) - Always returns $value as an integer.
> 
> > err.. int($value)
> 
> What if $value isn't a number?  Ah.  A warning occurs, of course.  I don't
> want warnings.

So turn it off. Not exactly hard. And, again, an explicit statement to
a reader of your code that you are prepared to tolerate non-numeric
inputs.

> >> as_decimal( $value, [$decimals] ) - Always returns $value as a
> >>                                     decimal number with $decimals
> >>                                     numbers after the decimal point.
> 
> > sprintf
> 
> Exactly what 'as_decimal' uses, except that I prefer writing
> 
>   my $value = as_decimal( 1.23456789, 2 );
> 
> instead of
> 
>   my $value = sprintf( '%.2f', 1.23456789 );

Why? This slightly peculiar taste of yours hardly justifies inclusion
in a core module.

> >> as_boolean( $value ) - Always returns $value as a boolean value (i.e.
> >>                        TRUE/1 or FALSE/0).
> 
> > !!$value
> 
> Yeah, but 'as_boolean' takes care of "other" boolean values too;
> 
>   sub as_boolean {
>       my $value = as_string( shift );
>       return ( $value =~ m,^(?:1|y|yes|on|true)$,i ) ? 1 : 0;
>   }

Whoa... not what it said on the tin *at* *all*. I'd definitely much
prefer that to be explicit in the code; and you *really* don't need
that ?:: m// already returns a boolean value...

> >> as_date( $value ) - Always returns $value as a date (YYYY-MM-DD).
> >> as_time( $value ) - Always returns $value as a time (HH:MM:SS).
> >> as_datetime( $value ) - Always returns $value as a datetime (which
> >>                         means combining as_date() and as_time()).
> 
> > POSIX::strftime
> 
> 'as_date', 'as_time' and 'as_datetime' are - of course - a lot easier to
> use than 'strftime'. :)

'Of course'? I think not. Apart from anything else, it is clear from a
strftime call what format the result will be in, whereas your
functions are not.

> >> VALIDATION
> >>   Each of the CASTING functions also has an is_* function, which returns
> >>   TRUE/1 or FALSE/0 depending on whether the input argument conforms to
> >>   the datatype.
> 
> > Ummm... everything is (can be) a string.
> 
> Yes, but not everything can be everything _else_ than a string.  Let's
> look at my CGI parameter example again.  Instead of writing
> 
>   my $nr = (defined $cgi->param('nr')) ? $cgi->param('nr') : 0;
>   unless ( $nr >= 0 && $nr <= 12 ) {
>       # Error
>   }
> 
> I want to write
> 
>   my $nr = as_int( $cgi->param('nr') );
>   unless ( is_int($nr, 0, 12) ) {
>       # Error
>   }

I wasn't debating is_int, but is_string. What, in fact, does this
function *do*?

> > is_int is simply $value =~ /^\d+$/.
> 
> No.  Take a look at this:
> 
>   my @values = ( 2, -2, +2, '2', '-2', '+2' );
>   foreach ( @values ) {
>       print "Is $_ an integer? ";
>       ( /^\d+$/ ) ? print 'Yes' : print 'No';
>       print "\n";
>   }

OK, my mistake... $value =~ /^[+-]?\d+$/.

> > Dates can and should be handled by your favourite date-time parsing
> > module: the code is non-trivial, so should be reused.
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> I guess you're right.  I created many of them (the functions) "by hand",
> though, and tested them against Date::Calc and Date::Manip.  Man, that was
> one useful learning session. :)

Right, it's a good way to learn... but in production, use the
peer-reviewed code. *That's what it's there for*.

> > I can't think offhand how I'd do is_decimal, probably because I've
> > never had occasion to.
> 
> I once (...) encountered a problem with a web form where the user had to
> register some data, and he/she _had_ to enter a number as a decimal.  That
> was to make sure that the data entered was as accurate as possible, or
> something like that;
> 
>   sub is_decimal {
>       my $value = as_string( shift );
>       return ( $value =~ m,^[-+]?(\d+)?\.\d+$, ) ? 1 : 0;
                                   ^^^^^^
                                    \d*
>   }

Right. Once. As I said.

> >> random_number( $min, $max ) - Returns a random number in the range
> >>                               $min to $max.
> 
> > (rand * ($max - $min)) + $min
> 
> Right, but what if only one of $min or $max is known?

Err... what if? What do you *want* to happen in that case? (And how do
you implement it?... insofar as I am aware, the random-number generator
will not accept 'infinity' as an argument.)
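
For the case where both bounds are known, a minimal sketch using the argument form of rand. The sub name is hypothetical; what should happen when only one bound is given is left open here, as it is in the thread.

```perl
use strict;
use warnings;

# Hypothetical sketch: rand(EXPR) returns a value >= 0 and < EXPR, so the
# result is >= $min and < $max.
sub random_between {
    my ($min, $max) = @_;
    return $min + rand($max - $min);
}

print random_between(5, 10), "\n";
```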

> >> format_number( $value, $separator ) - Formats a number with a given
> >>                                       separator; 1234 becomes 1,234.
> 
> > Less trivial... is probably better done by something locale-aware.
> 
> Who says that a 'Utils' module can't be locale-aware? :)

OK; but personally I'd rather that sprintf be made aware of the '
modifier from SUSv2 (which does just this).

> >> unique( $arrayref ) - Returns only the unique elements in $array.
> >> intersection( $arrayref1, $arrayref2 ) - Computes the intersection
> >>                                          of two array references.
> >> union( $arrayref1, $arrayref2 ) - Computes the union of two array
> >>                                   references.
> 
> > These should all be in a module called Set::Util... feel free to write
> > it.
> 
> It isn't in a module already?  Doesn't it fit into List::Util?

Well, maybe, but they aren't in there. Personally, I'd have said
there's a useful distinction to be made between 'operations on ordered
sets of data' (List::Util) and 'operations on unordered sets of data'
(Set::Util).

Anyway, if you thought it was in a module already *why aren't you
using it*?
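
For reference, the three operations are short enough to sketch with a hash. The sub names below are hypothetical, take array references as in the original proposal, and return a new array reference with the order of first appearance preserved.

```perl
use strict;
use warnings;

# Hypothetical sketches; elements are compared as strings (hash keys).
sub unique_sketch {
    my ($list) = @_;
    my %seen;
    return [ grep { !$seen{$_}++ } @$list ];
}

sub union_sketch {
    my ($first, $second) = @_;
    return unique_sketch( [ @$first, @$second ] );
}

sub intersection_sketch {
    my ($first, $second) = @_;
    my %in_second = map { ( $_ => 1 ) } @$second;
    return unique_sketch( [ grep { $in_second{$_} } @$first ] );
}

print join( ',', @{ intersection_sketch( [ 1, 2, 3 ], [ 2, 3, 4 ] ) } ), "\n";
```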

Ben

-- 
   If you put all the prophets,   |   You'd have so much more reason
   Mystics and saints             |   Than ever was born
   In one room together,          |   Out of all of the conflicts of time.
ben@morrow.me.uk |----------------+---------------| The Levellers, 'Believers'


------------------------------

Date: Wed, 11 Feb 2004 22:13:08 +0000 (UTC)
From: Ben Morrow <usenet@morrow.me.uk>
Subject: Re: RFC: utils.pm
Message-Id: <c0e9dk$9nq$3@wisteria.csv.warwick.ac.uk>


Tore Aursand <tore@aursand.no> wrote:
> Yeah.  Does anyone have related documents for me (apart from perlxs and
> perlxstut)?

perlguts and perlapi. And the perl source.

I found http://gisle.aas.no/perl/illguts/ helpful.

Ben

-- 
'Deserve [death]? I daresay he did. Many live that deserve death. And some die
that deserve life. Can you give it to them? Then do not be too eager to deal
out death in judgement. For even the very wise cannot see all ends.'
 :-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-: ben@morrow.me.uk


------------------------------

Date: Wed, 11 Feb 2004 20:15:16 +0100
From: Barty Slartfast <barty@fuelfix.com>
Subject: Simple question
Message-Id: <c0dv81$hq4$04$1@news.t-online.com>

I have a string of which the middle part is an unknown generated by user 
input, for example:

ladiddfa BEGIN balblabla END and whatever else..."

The middle part being --> balblabla <---

I would like to extract whatever string is within the BEGIN and END 
delimiters, but I don't know in advance what the middle part is.
It may be any number of characters, numbers, empty spaces, linebreaks, or 
nothing at all.
Any idea of the simplest possible perl code to do this?



------------------------------

Date: 11 Feb 2004 20:22:03 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Simple question
Message-Id: <c0e2tb$6vk$1@canopus.cc.umanitoba.ca>

In article <c0dv81$hq4$04$1@news.t-online.com>,
Barty Slartfast  <barty@fuelfix.com> wrote:
:I have a string of which the middle part is an unknown generated by user 
:input, for example:

:ladiddfa BEGIN balblabla END and whatever else..."

:The middle part being --> balblabla <---

:I would like to extract whatever string is within the BEGIN and END 
:delimiters, but I don't know in advance what the middle part is.
:It may be any number of characters, numbers, empty spaces, linebreaks, or 
:nothing at all.
:Any idea of the simplest possible perl code to do this?

if ( m/BEGIN\s(.*?)\sEND/s ) {
  $middle = $1;
} else {
  ???
}
-- 
Strange but true: there are entire WWW pages devoted to listing
programs designed to obfuscate HTML.


------------------------------

Date: Wed, 11 Feb 2004 14:46:42 -0600
From: Tad McClellan <tadmc@augustmail.com>
Subject: Re: Simple question
Message-Id: <slrnc2l55i.9mo.tadmc@magna.augustmail.com>

Barty Slartfast <barty@fuelfix.com> wrote:

> Subject: Simple question

Too simple Subject.


Please put the subject of your article in the Subject of your article.


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Wed, 11 Feb 2004 22:53:43 +0100
From: Tore Aursand <tore@aursand.no>
Subject: Re: Simple question
Message-Id: <pan.2004.02.11.21.50.29.260314@aursand.no>

On Wed, 11 Feb 2004 20:22:03 +0000, Walter Roberson wrote:
>> ladiddfa BEGIN balblabla END and whatever else..."
>> 
>> The middle part being --> balblabla <---
>> 
>> I would like to extract whatever string is within the BEGIN and END
>> delimiters, but I don't know in advance what the middle part is. It may
>> be any number of characters, numbers, empty spaces, linebreaks, or
>> nothing at all.

> if ( m/BEGIN\s(.*?)\sEND/s ) {
>   $middle = $1;
> } else {
>   ???
> }

Better yet, IMO;

  if ( m,BEGIN\s*(.*?)\s*END,s ) {
      my $middle = $1;
  }
  else {
      # No match
  }

Just off the top of my head, but I'm sure that the OP doesn't want any
extra spaces. :)
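
To make the difference between the two patterns concrete, here is a small
sketch that runs both against a few sample strings. The second and third
samples are made up for illustration; they are not from the original post.
It suggests that the \s version fails when the middle part is empty and
only one space separates the delimiters, a case the OP explicitly allowed.

```perl
use strict;
use warnings;

# Sample inputs; only the first comes from the original question.
my @samples = (
    "ladiddfa BEGIN balblabla END and whatever else...",
    "x BEGIN END y",                      # empty middle, a single space
    "x BEGIN\n  two\nlines  \nEND y",     # middle spans several lines
);

my @results;
for my $text (@samples) {
    # Exactly one whitespace character on each side of the capture.
    my $strict = $text =~ /BEGIN\s(.*?)\sEND/s   ? $1 : undef;
    # Any amount of whitespace, including none, on each side.
    my $loose  = $text =~ /BEGIN\s*(.*?)\s*END/s ? $1 : undef;

    push @results, [ $strict, $loose ];
    printf "strict=%s loose=%s\n",
        map { defined $_ ? "[$_]" : 'no match' } $strict, $loose;
}
```

With \s* the surrounding whitespace is left out of the capture, while the
\s version keeps everything except one whitespace character on each side.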



-- 
Tore Aursand <tore@aursand.no>
"A car is not the only thing that can be recalled by its maker." --
 Unknown


------------------------------

Date: 11 Feb 2004 22:00:04 GMT
From: roberson@ibd.nrc-cnrc.gc.ca (Walter Roberson)
Subject: Re: Simple question
Message-Id: <c0e8l4$9g2$1@canopus.cc.umanitoba.ca>

In article <pan.2004.02.11.21.50.29.260314@aursand.no>,
Tore Aursand  <tore@aursand.no> wrote:
:Better yet, IMO;

:  if ( m,BEGIN\s*(.*?)\s*END,s ) {
:      my $middle = $1;
:  }
:  else {
:      # No match
:  }

but $middle is going to disappear after that 'if', which is not necessarily
to be desired.
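
One way around that, as a minimal sketch: declare the variable before the
'if', or assign the match in list context. The variable names $text and
$also are placeholders chosen for this example.

```perl
use strict;
use warnings;

my $text = "ladiddfa BEGIN balblabla END and whatever else...";

# Declared outside the block, so the value is still there afterwards.
my $middle;
if ( $text =~ m/BEGIN\s*(.*?)\s*END/s ) {
    $middle = $1;
}

# Alternative: a match in list context returns its captures, so $also
# is simply left undefined when the pattern does not match.
my ($also) = $text =~ m/BEGIN\s*(.*?)\s*END/s;

print defined $middle ? "middle: '$middle'\n" : "no match\n";
```

Either form leaves the variable undefined on failure, which the caller can
test with defined().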
-- 
   Entropy is the logarithm of probability   -- Boltzmann


------------------------------

Date: Wed, 11 Feb 2004 16:14:13 -0500
From: "Christian Caron" <nospam@nospam.org>
Subject: strange behaviour with map inside a hash
Message-Id: <c0e5v5$j7g2@nrn2.NRCan.gc.ca>

######### First example with map
#!/usr/local/bin/perl

use Date::Calc qw(:all);

my $y = 2004;
my $m = '02';
my $d = '09';

my %conversion = (      '%W' => map { /^\d{1}$/ ? sprintf('%02d', $_) : $_ }
(Week_of_Year($y,$m,$d))[0],
                        '%w' => Day_of_Week($y,$m,$d),
                        '%j' => Day_of_Year($y,$m,$d),
                        '%Y' => "$y",
                        '%y' => eval($y-2000),
                        '%m' => "$m",
                        '%d' => "$d" );

foreach $k (sort keys %conversion) {
 print "$k - $conversion{$k}\n";
}
#########
nrn6# perl test1
%W - 07
%Y - 2004
%d - 09
%j - 40
%m - 02
%w - 01
%y - 04

######### Second example without map
#!/usr/local/bin/perl

use Date::Calc qw(:all);

my $y = 2004;
my $m = '02';
my $d = '09';

my %conversion = (      '%W' => (Week_of_Year($y,$m,$d))[0],
                        '%w' => Day_of_Week($y,$m,$d),
                        '%j' => Day_of_Year($y,$m,$d),
                        '%Y' => "$y",
                        '%y' => eval($y-2000),
                        '%m' => "$m",
                        '%d' => "$d" );

foreach $k (sort keys %conversion) {
 print "$k - $conversion{$k}\n";
}
#########
nrn6# perl test1
%W - 7
%Y - 2004
%d - 09
%j - 40
%m - 02
%w - 1
%y - 4


As you see, there are three numbers returned without leading "0". Why if I
put only one map command in one of them, all three get a leading "0"?

Thanks!

Christian




------------------------------

Date: Wed, 11 Feb 2004 22:53:51 +0100
From: Michele Dondi <bik.mido@tiscalinet.it>
Subject: Re: strange behaviour with map inside a hash
Message-Id: <6q8l20hh08so5knkj3gllmq45perorqqvv@4ax.com>

On Wed, 11 Feb 2004 16:14:13 -0500, "Christian Caron"
<nospam@nospam.org> wrote:

>use Date::Calc qw(:all);
>
>my $y = 2004;
>my $m = '02';
>my $d = '09';
>
>my %conversion = (      '%W' => map { /^\d{1}$/ ? sprintf('%02d', $_) : $_ }
[snip rest]

Huh! Not exactly what I'd call a *minimal* example exhibiting the
problem. I'm not even trying to read it all... though, the subject
line and the last quoted line above suggest that you may want
something like

  my %conversion = ('%W' => [ map { whatever } @whatever ],
  #                         ^                            ^
  ...


Michele
-- 
you'll see that it shouldn't be so. AND, the writting as usuall is
fantastic incompetent. To illustrate, i quote:
- Xah Lee trolling on clpmisc,
  "perl bug File::Basename and Perl's nature"


------------------------------

Date: Wed, 11 Feb 2004 21:55:57 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: strange behaviour with map inside a hash
Message-Id: <x73c9h702b.fsf@mail.sysarch.com>

>>>>> "CC" == Christian Caron <nospam@nospam.org> writes:

  CC> ######### First example with map
  CC> #!/usr/local/bin/perl

  CC> use Date::Calc qw(:all);

  CC> my $y = 2004;
  CC> my $m = '02';
  CC> my $d = '09';

  CC> my %conversion = (      '%W' => map { /^\d{1}$/ ? sprintf('%02d', $_) : $_ }
  CC> (Week_of_Year($y,$m,$d))[0],

why are you using map there? map is for generating a list, not for some
single scalar expression. use a do{} block for that and a lexical var

      '%W' => do { my $woy = (Week_of_Year($y,$m,$d))[0] ;
		   $woy =~ /^\d{1}$/ ? sprintf('%02d', $woy) : $woy },


do you know the syntax of map? it will slurp up all the list arguments
passed to it. this includes all of these below which is not what you want.

  CC>                         '%w' => Day_of_Week($y,$m,$d),
  CC>                         '%j' => Day_of_Year($y,$m,$d),
  CC>                         '%Y' => "$y",


  CC> As you see, there are three numbers returned without leading "0". Why if I
  CC> put only one map command in one of them, all three get a leading "0"?

see comment above.
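
a rough sketch of the slurping, with plain numbers standing in for the
Date::Calc calls (the values 7 and 1 and the hash names are assumptions
made up for this example). it also shows two ways to keep the padding to
a single value besides the do{} block above.

```perl
use strict;
use warnings;

# Stand-in values instead of Week_of_Year() and Day_of_Week().
my ( $week, $dow ) = ( 7, 1 );

# map's LIST runs to the end of the enclosing list, so '%w' and $dow
# go through the block too and the single digit 1 becomes '01'.
my %slurped = ( '%W' => map { /^\d$/ ? sprintf( '%02d', $_ ) : $_ } $week,
                '%w' => $dow );

# Parentheses around the whole map call stop it at $week.
my %limited = ( '%W' => ( map { /^\d$/ ? sprintf( '%02d', $_ ) : $_ } $week ),
                '%w' => $dow );

# For one scalar, sprintf alone will do: %02d pads only when needed.
my %plain = ( '%W' => sprintf( '%02d', $week ), '%w' => $dow );

printf "slurped %%w=%s  limited %%w=%s  plain %%W=%s\n",
    $slurped{'%w'}, $limited{'%w'}, $plain{'%W'};
```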

uri

-- 
Uri Guttman  ------  uri@stemsystems.com  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6118
***************************************
