[28004] in Perl-Users-Digest


Perl-Users Digest, Issue: 9368 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Jun 24 14:05:47 2006

Date: Sat, 24 Jun 2006 11:05:04 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sat, 24 Jun 2006     Volume: 10 Number: 9368

Today's topics:
    Re: Moving C code from 32 to 64 bit (aka ? the Platypus)
    Re: nl_langinfo - problem <benmorrow@tiscali.co.uk>
    Re: Python and cellular automata (It works this time!) bearophileHUGS@lycos.com
    Re: Python and cellular automata (It works this time!) <defcon8@gmail.com>
        Regex: Exact semantics of ^ and $ when using /m <news@nana.franken.de>
    Re: Saying "latently-typed language" is making a catego <david.nospam.hopwood@blueyonder.co.uk>
    Re: Saying "latently-typed language" is making a catego <marshall.spight@gmail.com>
        unpack 'C' (was: use binary operator on ascii text stri <hjp-usenet2@hjp.at>
    Re: unpack 'C' <David.Squire@no.spam.from.here.au>
    Re: Unwanted character "^@" in perl output <flavell@physics.gla.ac.uk>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 24 Jun 2006 15:21:23 GMT
From: "David Formosa (aka ? the Platypus)" <dformosa@dformosa.zeta.org.au>
Subject: Re: Moving C code from 32 to 64 bit
Message-Id: <slrne9qkvv.32k.dformosa@dformosa.zeta.org.au>

On 23 Jun 2006 18:39:10 -0700, Robert Hicks <sigzero@gmail.com> wrote:
> I have a co-worker at work that created a module for Perl coded in "C"
> that works fine in a 32-bit world. We moved to the Itanium processor
> and now it is flaky and getting memory errors. Is there a resource for
> Perl that describes moving your "C" coded module from 32 to 64-bits?

Is your co-worker still with you?  Porting C between platforms is not
a purely mechanical task, and if the code hasn't been written to be
portable it will be flaky.  You will need to rewrite it taking the
64-bitness into account.
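
As a starting point for such a port, it can help to see which C type
sizes the perl on each machine was built with. The following is only a
minimal sketch using the core Config module; the %size hash is a name
chosen here for illustration, and the output depends entirely on the
local build.

```perl
use strict;
use warnings;
use Config;

# Sizes, in bytes, of the C types this perl was compiled with.
# %size is an illustrative name, not part of any module.
my %size = map { $_ => $Config{$_} } qw(intsize longsize ptrsize ivsize);

printf "%-8s = %s bytes\n", $_, $size{$_} for sort keys %size;
```

If ptrsize or longsize differs between the old and the new machine, C
code that stores a pointer in an int, or that assumes a long is 4 bytes
wide, is a likely suspect.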

-- 
Please excuse my spelling as I suffer from agraphia. See
http://dformosa.zeta.org.au/~dformosa/Spelling.html to find out more.
Free the Memes.  Will set followups on crossposts of 3 or more


------------------------------

Date: Sat, 24 Jun 2006 16:27:39 +0100
From: Ben Morrow <benmorrow@tiscali.co.uk>
Subject: Re: nl_langinfo - problem
Message-Id: <b820n3-os7.ln1@osiris.mauzo.dyndns.org>

[newsgroups truncated and f'ups set]

Quoth Adam Smith <adamsmith@econ.com>:
> UNIX FreeBSD -V 4.9 O/S, i386 arch, Perl -V 5.8.2 Installation

Please don't start a new thread with the same question: continue with
the one you had.

Please don't cross-post to so many groups: pick one that's most relevant
(clpm in this case).

If you have a valid reason for cross-posting, please set followups.

Please don't post to comp.lang.perl: it doesn't exist.

Ben

-- 
Like all men in Babylon I have been a proconsul; like all, a slave ... During
one lunar year, I have been declared invisible; I shrieked and was not heard,
I stole my bread and was not decapitated.
~ benmorrow@tiscali.co.uk ~            Jorge Luis Borges, 'The Babylon Lottery'


------------------------------

Date: 24 Jun 2006 09:49:32 -0700
From: bearophileHUGS@lycos.com
Subject: Re: Python and cellular automata (It works this time!)
Message-Id: <1151167772.881199.11020@u72g2000cwu.googlegroups.com>

A few coding suggestions:
- Don't mix spaces and tabs;
- Don't write lines (comments) that are too long;
- Don't post too much code here;
- For this program maybe Pygame is a better fit (to show the images in
real time) than PIL;
- Maybe Psyco can help speed up this program;
- Maybe ShedSkin will support part of the cimg library; this kind of
program is a good fit for it :-)

Bye,
bearophile



------------------------------

Date: 24 Jun 2006 10:30:17 -0700
From: "defcon8" <defcon8@gmail.com>
Subject: Re: Python and cellular automata (It works this time!)
Message-Id: <1151170217.279140.306370@m73g2000cwd.googlegroups.com>

blog-of-justin.blogspot.com

Sorry for the error.



------------------------------

Date: Sat, 24 Jun 2006 19:10:43 +0200
From: Wolfgang Thomas <news@nana.franken.de>
Subject: Regex: Exact semantics of ^ and $ when using /m
Message-Id: <449d7214$0$11069$9b4e6d93@newsread4.arcor-online.net>

Hi,

I am afraid that this question has been asked before, but I could not 
find the answer in the FAQ nor in the "Programming Perl" book, nor by 
googling.

My question refers to the /m modifier for regular expressions.
According to "Programming Perl" /m lets ^ and $ match next to new lines 
within the string instead of considering only the beginning and end of 
the string.

Therefore I wonder why the following example does not match:

my $s = "123\n456";
if ($s =~ /3$^4/m) {print "match (4)\n";}

Even more confusing (for me) is that
if ($s =~ /3$4/m) {print "match (2)\n";}
matches, whereas
if ($s =~ /34/m) {print "match (3)\n";}
does not match.

Could someone please point me to an explanation of that behavior?
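
A minimal sketch of what seems to be going on, offered as an
assumption-laden illustration rather than a definitive answer. Under
/m, $ matches just before a newline and ^ matches just after one, but
neither consumes the newline itself, so the pattern has to match the
"\n" explicitly between them. Separately, $4 written inside a pattern
literal is interpolated as the capture variable $4 (empty here), which
leaves just /3/. The two test patterns are kept in single-quoted
strings so that no variable interpolation interferes; the variable
names are made up for this example.

```perl
use strict;
use warnings;

my $s = "123\n456";

# Single-quoted, so the regex engine sees these characters unchanged.
my $anchors_only = '3$^4';     # $ and ^ adjacent, newline not consumed
my $with_newline = '3$\n^4';   # newline matched between the anchors

my $r_anchors = ($s =~ /$anchors_only/m) ? 1 : 0;
my $r_newline = ($s =~ /$with_newline/m) ? 1 : 0;

# In a pattern literal, $4 is the (undefined) capture variable, so the
# pattern that is actually compiled is just /3/.
my $r_dollar4 = do {
    no warnings 'uninitialized';
    ($s =~ /3$4/m) ? 1 : 0;
};

print "anchors only:  $r_anchors\n";
print "with newline:  $r_newline\n";
print "literal 3\$4:   $r_dollar4\n";
```

The first pattern fails because the position before the newline is an
end of line but not a start of line; the second succeeds once the
newline is consumed; the third succeeds because it only looks for "3".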


------------------------------

Date: Sat, 24 Jun 2006 17:11:24 GMT
From: David Hopwood <david.nospam.hopwood@blueyonder.co.uk>
Subject: Re: Saying "latently-typed language" is making a category mistake
Message-Id: <0neng.213767$8W1.1948@fe1.news.blueyonder.co.uk>

Patricia Shanahan wrote:
> Vesa Karvonen wrote:
> ...
> 
>> An example of a form of informal reasoning that (practically) every
>> programmer does daily is termination analysis.  There are type systems
>> that guarantee termination, but I think it is fair to say that it is
>> not yet understood how to make a practical general purpose language, whose
>> type system would guarantee termination (or at least I'm not aware of
>> such a language).  It should also be clear that termination analysis need
>> not be done informally.  Given a program, it may be possible to formally
>> prove that it terminates.
> 
> To make the halting problem decidable one would have to do one of two
> things: Depend on memory size limits, or have a language that really is
> less expressive, at a very deep level, than any of the languages
> mentioned in the newsgroups header for this message.

I don't think Vesa was talking about trying to solve the halting problem.

A type system that required termination would indeed significantly restrict
language expressiveness -- mainly because many interactive processes are
*intended* not to terminate.

A type system that required an annotation on all subprograms that do not
provably terminate, OTOH, would not impact expressiveness at all, and would
be very useful. Such a type system could work by treating some dependent
type parameters as variants which must strictly decrease in a recursive
call or loop. For example, consider a recursive quicksort implementation.
The type of the 'sort' routine would take an array of length
(dependent type parameter) n. Since it only performs recursive calls to
itself with parameter strictly less than n, it is not difficult to prove
automatically that the quicksort terminates. The programmer would probably
just have to give hints in some cases as to which parameters are to be
treated as variants; the rest can be inferred.
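
The quicksort argument can be sketched in Perl, with the caveat that
Perl has no dependent types: here the list length plays the role of
the variant n, and a runtime check stands in for the static proof that
such a type system would carry out. The sub name and the check are
hypothetical illustrations, not a description of any existing system.

```perl
use strict;
use warnings;

# Recursive quicksort in which the list length acts as the variant.
sub quicksort {
    my @items = @_;
    my $n = @items;                    # the variant for this call
    return @items if $n < 2;           # base case: nothing to sort

    my $pivot  = shift @items;         # n - 1 elements remain
    my @lower  = grep { $_ <  $pivot } @items;
    my @higher = grep { $_ >= $pivot } @items;

    # Runtime stand-in for the static obligation: every recursive call
    # receives a list strictly shorter than n.
    die "variant did not decrease" unless @lower < $n && @higher < $n;

    return (quicksort(@lower), $pivot, quicksort(@higher));
}

my @sorted = quicksort(5, 3, 8, 1, 9, 2);
print "@sorted\n";
```

Because the pivot is removed before partitioning, both sublists have at
most n - 1 elements, so the check can never fire and the recursion must
bottom out.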

-- 
David Hopwood <david.nospam.hopwood@blueyonder.co.uk>


------------------------------

Date: 24 Jun 2006 10:46:49 -0700
From: "Marshall" <marshall.spight@gmail.com>
Subject: Re: Saying "latently-typed language" is making a category mistake
Message-Id: <1151171209.684163.143410@c74g2000cwc.googlegroups.com>

David Hopwood wrote:
>
> A type system that required an annotation on all subprograms that do not
> provably terminate, OTOH, would not impact expressiveness at all, and would
> be very useful.

Interesting. I have always imagined doing this by allowing an
annotation on all subprograms that *do* provably terminate. If
you go the other way, you have to annotate every function that
uses general recursion (or iteration if you swing that way) and that
seems like it might be burdensome. Further, it imposes the
annotation requirement even where the programmer might not
care about it, which the reverse does not do.


Marshall



------------------------------

Date: Sat, 24 Jun 2006 18:30:31 +0200
From: "Peter J. Holzer" <hjp-usenet2@hjp.at>
Subject: unpack 'C' (was: use binary operator on ascii text string)
Message-Id: <7bpj7e.5bk.ln@teal.hjp.at>

David Squire wrote:
> Peter J. Holzer wrote:
>> David Squire wrote:
>>>       my @line_array = unpack 'C*', $line;
>> 
>> I don't think this is a good idea, as it depends on whether $line is
>> stored as bytes or as UTF-8 internally, which shouldn't make any
>> semantic difference.
> 
> It was not clear to me from the OP what the actual application was. I 
> guess I suspect that bit masking is more likely to be applied to bytes 
> of data than characters...

Yes, but I would still argue that the "bytes" in $line are what you get
by splitting it into "characters", not by using unpack 'C*'.

(In fact, I'm not sure if the behaviour of unpack 'C*' is correct - the
docs aren't clear and it does violate the principle of least
astonishment).

Consider this script:

#!/usr/bin/perl
use warnings;
use strict;

my $x = "\x{FC}";
utf8::upgrade($x); 
my $y = "\x{FC}";

print "\$x and \$y are", ($x eq $y ? "" : " not"), " equal\n";

my @x = unpack 'C*', $x;

print "\$x is_utf8: ", utf8::is_utf8($x), "\n";
for (@x) { print "$_\n" }

my @y = unpack 'C*', $y;

print "\$y is_utf8: ", utf8::is_utf8($y), "\n";
for (@y) { print "$_\n" }
__END__

With perl, v5.8.4 built for i386-linux-thread-multi, it prints:

$x and $y are equal
$x is_utf8: 1
195
188
$y is_utf8: 
252

So while perl thinks that $x and $y are equal, unpacking them with C*
yields different results. I don't think this should be the case, as it
can introduce hard-to-find bugs if a string of (0..255) is for some
reason stored as UTF-8.
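
For comparison, a minimal sketch of the "split into characters"
approach mentioned above. It reuses $x and $y from the script; the
array names are chosen for this example. Since ord works on characters
rather than on the internal storage, both strings should yield the same
list.

```perl
use strict;
use warnings;

my $x = "\x{FC}";
utf8::upgrade($x);          # stored internally as UTF-8
my $y = "\x{FC}";           # stored internally as a single byte

# One number per character, independent of the internal representation.
my @x_ords = map { ord($_) } split //, $x;
my @y_ords = map { ord($_) } split //, $y;

print "x: @x_ords\n";
print "y: @y_ords\n";
```

If the data really is binary, calling utf8::downgrade on the string
before unpacking is another option; it dies if the string contains a
character above 255.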

        hp

-- 
   _  | Peter J. Holzer    | Man könnte sich [die Diskussion] auch
|_|_) | Sysadmin WSR/LUGA  | sparen, wenn man sie sich einfach sparen
| |   | hjp@hjp.at         | würde.
__/   | http://www.hjp.at/ |   -- Ralph Angenendt in dang 2006-04-15


------------------------------

Date: Sat, 24 Jun 2006 18:10:43 +0100
From: David Squire <David.Squire@no.spam.from.here.au>
Subject: Re: unpack 'C'
Message-Id: <e7jrmk$nl4$1@news.ox.ac.uk>

Peter J. Holzer wrote:
> David Squire wrote:
>> Peter J. Holzer wrote:
>>> David Squire wrote:
>>>>       my @line_array = unpack 'C*', $line;
>>> I don't think this is a good idea, as it depends on whether $line is
>>> stored as bytes or as UTF-8 internally, which shouldn't make any
>>> semantic difference.
>> It was not clear to me from the OP what the actual application was. I 
>> guess I suspect that bit masking is more likely to be applied to bytes 
>> of data than characters...
> 
> Yes, but I would still argue that the "bytes" in $line are what you get
> by splitting it into "characters", not by using unpack 'C*'.

Well, to me a byte is a byte is a byte: 8 bits. I agree that the OP's 
example used a line of text as the example, so using unpack 'C*' is not 
a good idea.

> (In fact, I'm not sure if the behaviour of unpack 'C*' is correct - the
> docs aren't clear and it does violate the principle of least
> astonishment).

I don't think the docs are that unclear. In perlfunc#pack it says:

"C   An unsigned char value.  Only does bytes.  See U for Unicode."

I agree that calling this a char, and using the mnemonic 'C' is 
potentially confusing in today's world of multiple multi-byte character 
sets.

So, if I want bytes, that's what I would use. Mind you, I would only be 
doing this for something like a bit-based set representation, not when I 
was playing with characters intended to represent text (which may or may 
not be stored as bytes).


Regards,

DS


------------------------------

Date: Sat, 24 Jun 2006 16:26:43 +0100
From: "Alan J. Flavell" <flavell@physics.gla.ac.uk>
Subject: Re: Unwanted character "^@" in perl output
Message-Id: <Pine.LNX.4.64.0606241552450.17323@ppepc87.ph.gla.ac.uk>

On Sat, 24 Jun 2006, Peter J. Holzer wrote:

> Todd W wrote:
> > 
> > Where did you learn all this stuff?
> 
> On Usenet :-)

;-)

> Since I've been programming for almost 23 years and discussing 
> it on usenet for about 18 years now and character sets have been a 
> constant source of problems during all this time, I've accrued a bit 
> of knowledge about this topic (and I'm still learning something new 
> all the time).

Until these recent rounds of unicode-ification of Perl, most of my 
encounters with the character coding issue have been in the context of 
HTML (or of usenet discussions of HTML, which can actually make the 
problem even more difficult and confusing - especially when goo-groups 
sees fit to intervene and parse their strange characters in creative 
ways).

I've come to the conclusion, over the years, that the most difficult 
"cases" are people who already believe that they have a firm grasp of 
the principles (when in fact they haven't), and who demand a simple 
answer to what they consider to be a simple question.  The reality is 
that their "simple question" reveals that they actually need to 
un-learn substantial parts of what they had previously been taking for 
granted, and start again.  But convincing them of that is hard.  
Doing it both diplomatically *and* effectively is especially hard.

Those who confidently know what they mean by the term "character set" 
can be particularly stubborn.  This is compounded by the fact that the 
MIME attribute called "charset=" specifies what we would nowadays call 
a "character encoding scheme" (such as utf-8), *NOT* a coded character 
set.

> > I'd like to find some hardcopy for the bookshelf. Those that 
> > provide detail in relation to perl, apache, relational databases, 
> > and web browsers would be the most useful to me. Any suggestions?
> 
> No, sorry, I don't know any book which covers those topics. 

Chapter 2 of the Unicode specification (available online in PDF) 
is quite readable, considering the nature of its content: it puts 
the terminology into the context of Unicode itself, and shows the 
layering in terms of assigning to the characters of a character 
repertoire, non-negative integer values to form a "coded character 
set", and from there to define a "character encoding form" and thence 
a "character encoding scheme".  However, it isn't so very informative 
in terms of how these differentiated terms work when applied to legacy 
encodings such as us-ascii, iso-8859-1 etc. where the distinction 
between the terms is less evident.
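
A small sketch of that layering, using the core Encode module. The
choice of the character u-diaeresis and the helper hex_bytes are
assumptions made for this example only. The code point is the integer
assigned by the coded character set; the byte sequences are what
different encodings produce for that same code point.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "\x{FC}";    # LATIN SMALL LETTER U WITH DIAERESIS

# Illustrative helper: render a byte string as space-separated hex.
sub hex_bytes {
    return join ' ', map { sprintf '%02X', ord($_) } split //, $_[0];
}

# Coded character set level: the integer assigned to the character.
my $code_point = sprintf 'U+%04X', ord($char);

# Encoding level: the bytes that represent that integer on the wire.
my %bytes = map { $_ => hex_bytes(encode($_, $char)) }
            qw(iso-8859-1 UTF-8 UTF-16BE);

print "code point: $code_point\n";
print "$_: $bytes{$_}\n" for sort keys %bytes;
```

With iso-8859-1 the byte value happens to equal the code point, which
is one reason the distinction between the terms is easy to overlook for
legacy encodings.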

> As reference material I prefer original documentation and standards, 
> and if I'm looking for some specific information, it is much faster 
> to search the web than order and read half a dozen books which may 
> or may not contain it.

Indeed, but the web is unfortunately also awash with unreliable 
"information", from people whose enthusiasm to share their discoveries 
exceeds their technical competence.  For example: don't get me started 
on "symbol fonts in HTML" :-{{

hope these comments help a bit.


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 9368
***************************************

