[16465] in Perl-Users-Digest


Perl-Users Digest, Issue: 3877 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Aug 1 21:10:25 2000

Date: Tue, 1 Aug 2000 18:10:15 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <965178615-v9-i3877@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Tue, 1 Aug 2000     Volume: 9 Number: 3877

Today's topics:
    Re: Parse::RecDescent with left-recursive grammar. <ocschwar@mit.edu>
    Re: Parse::RecDescent with left-recursive grammar. (Damian Conway)
    Re: Parse::RecDescent with left-recursive grammar. <ocschwar@mit.edu>
        Perl - w (PROBLEM) BUG????? jffuller@my-deja.com
    Re: Perl CGI - files occasionally truncated mikelot@my-deja.com
    Re: Perl CGI - files occasionally truncated (Eric Bohlman)
    Re: Perl Module in C - Question. <lr@hpl.hp.com>
        perlcc <jane-paterson@ntlworld.com>
    Re: question about tr (Craig Berry)
    Re: question about tr (T. Postel)
    Re: Regex not matching ... character using OROMatcher <dietmar.staab@t-online.de>
    Re: Removing newline chars <godzilla@stomp.stomp.tokyo>
    Re: select/vec/pipes question <margit@us.ibm.com>
        RE: Short read from a file (WAS: Syntax Question) <lr@hpl.hp.com>
    Re: splitting on spaces <dilworth@megsinet.net>
    Re: Syntax Question (Greg Bacon)
    Re: When does it pay to presize an array? <dietmar.staab@t-online.de>
    Re: When does it pay to presize an array? <lr@hpl.hp.com>
    Re: When does it pay to presize an array? <dietmar.staab@t-online.de>
    Re: Which Win32 Perl? <eyalb@aks.com>
        Digest Administrivia (Last modified: 16 Sep 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Tue, 01 Aug 2000 18:53:48 -0400
From: Omri Schwarz <ocschwar@mit.edu>
Subject: Re: Parse::RecDescent with left-recursive grammar.
Message-Id: <398754FB.3735AADD@mit.edu>

Damian Conway wrote:
> 
> abigail@foad.org (Abigail) writes:
> 
> >Omri Schwarz (ocschwar@mit.edu) wrote on MMDXXVI September MCMXCIII in
> ><URL:news:398525F0.ABCA33F9@mit.edu>:
> >--
> >-- Say, um, a frieend of mine, yeh, that's it, a friend of mine,
> >-- writes a left-recursive grammar using RecDescent and sets 'nocheck',
> >-- and runs it, what could happen?
> 
> >Did your friend try?
> 
> :-)
> 
> Well, <nocheck> is still vapourware (as I explained at TPC)
> If you run a left-recursive grammar under <nocheck> you
> will quickly get a "100 levels of recursion" warning, followed
> eventually by an "out of memory" failure.
> 
> Left-recursion is forever! ;-)

That's most unfortunate.

My project to write the C-to-English translator requires
the flexibility $item{} and $arg{} offer in RecDescent, 
(so no Parse::Yapp usage possible, it seems), but attempts
to turn C into a right-recursive grammar require doing
 ...';' in many odd places and other contortions that
if I continue to try I will turn into a protagonist in
an H.P. Lovecraft story.

 Well, thanks for all your help, everyone. If I lapse in
taking my medicines again I'll start the project again.

Regards,

Omri


------------------------------

Date: 1 Aug 2000 23:45:20 GMT
From: damian@cs.monash.edu.au (Damian Conway)
Subject: Re: Parse::RecDescent with left-recursive grammar.
Message-Id: <8m7neg$7ij$1@towncrier.cc.monash.edu.au>


>My project to write the C-to-English translator requires
>the flexibility $item{} and $arg{} offer in RecDescent, 
>(so no Parse::Yapp usage possible, it seems), but attempts
>to turn C into a right-recursive grammar require doing
>...';' in many odd places and other contortions that
>if I continue to try I will turn into a protagonist in
>an H.P. Lovecraft story.

Heaven forfend!

A wonderful person named Juan Karlo Del Mundo took the demo_Cgrammar.pl
file I included in the latest distribution and de-left-ified it.

It will be available in the next release, but in order to prevent your
descent into horror and madness, here it is.

Happy parsing...

-----------cut-----------cut-----------cut-----------cut-----------cut----------

#!/usr/bin/perl -w

use Parse::RecDescent;

local $/;
my $grammar = <DATA>;
my $parser = Parse::RecDescent->new($grammar);

my $text = <STDIN>;

my $parse_tree = $parser->translation_unit($text) or die "bad C code";

__DATA__

<autotree>

primary_expression:
          IDENTIFIER
        | CONSTANT
        | STRING_LITERAL
        | '(' expression ')'

postfix_expression:
          primary_expression
        | (primary_expression)(s) '[' expression ']'
        | (primary_expression)(s) '(' ')'
        | (primary_expression)(s) '(' argument_expression_list ')'
        | (primary_expression)(s) '.' IDENTIFIER
        | (primary_expression)(s) PTR_OP IDENTIFIER
        | (primary_expression)(s) INC_OP
        | (primary_expression)(s) DEC_OP

argument_expression_list:
          (assignment_expression ',')(s?) assignment_expression

unary_expression:
          postfix_expression
        | INC_OP unary_expression
        | DEC_OP unary_expression
        | unary_operator cast_expression
        | SIZEOF unary_expression
        | SIZEOF '(' type_name ')'

unary_operator:
          '&'
        | '*'
        | '+'
        | '-'
        | '~'
        | '!'

cast_expression:
          unary_expression
        | '(' type_name ')' cast_expression

multiplicative_expression:
          (cast_expression '*')(s?) cast_expression
        | (cast_expression '/')(s?) cast_expression
        | (cast_expression '%')(s?) cast_expression

additive_expression:
          (multiplicative_expression '+')(s?) multiplicative_expression
        | (multiplicative_expression '-')(s?) multiplicative_expression

shift_expression:
          (additive_expression LEFT_OP)(s?) additive_expression
        | (additive_expression RIGHT_OP)(s?) additive_expression

relational_expression:
          (shift_expression '<')(s?) shift_expression
        | (shift_expression '>')(s?) shift_expression
        | (shift_expression LE_OP)(s?) shift_expression
        | (shift_expression GE_OP)(s?) shift_expression

equality_expression:
          (relational_expression EQ_OP)(s?) relational_expression
        | (relational_expression NE_OP)(s?) relational_expression

and_expression:
          (equality_expression '&')(s?) equality_expression

exclusive_or_expression:
          (and_expression '^')(s?) and_expression

inclusive_or_expression:
          (exclusive_or_expression '|')(s?) exclusive_or_expression

logical_and_expression:
          (inclusive_or_expression AND_OP)(s?) inclusive_or_expression

logical_or_expression:
          (logical_and_expression OR_OP)(s?) logical_and_expression

conditional_expression:
          logical_or_expression
        | logical_or_expression '?' expression ':' conditional_expression

assignment_expression:
          conditional_expression
        | unary_expression assignment_operator assignment_expression

assignment_operator:
          '='
        | MUL_ASSIGN
        | DIV_ASSIGN
        | MOD_ASSIGN
        | ADD_ASSIGN
        | SUB_ASSIGN
        | LEFT_ASSIGN
        | RIGHT_ASSIGN
        | AND_ASSIGN
        | XOR_ASSIGN
        | OR_ASSIGN

expression:
          (assignment_expression ',')(s?) assignment_expression

constant_expression:
          conditional_expression

declaration:
          declaration_specifiers ';'
          { print "We have a match!\n"; }
        | declaration_specifiers init_declarator_list ';'

declaration_specifiers:
          storage_class_specifier
        | storage_class_specifier declaration_specifiers
        | type_specifier
        | type_specifier declaration_specifiers
        | type_qualifier
        | type_qualifier declaration_specifiers

init_declarator_list:
          (init_declarator ',')(s?) init_declarator

init_declarator:
          declarator
        | declarator '=' initializer

storage_class_specifier:
          TYPEDEF
        | EXTERN
        | STATIC
        | AUTO
        | REGISTER

type_specifier:
          VOID
        | CHAR
        | SHORT
        | INT
        | LONG
        | FLOAT
        | DOUBLE
        | SIGNED
        | UNSIGNED
        | struct_or_union_specifier
        | enum_specifier
        | TYPE_NAME

struct_or_union_specifier:
          struct_or_union IDENTIFIER '{' struct_declaration_list '}'
        | struct_or_union '{' struct_declaration_list '}'
        | struct_or_union IDENTIFIER

struct_or_union:
          STRUCT
        | UNION

struct_declaration_list:
          struct_declaration(s)

struct_declaration:
          specifier_qualifier_list struct_declarator_list ';'

specifier_qualifier_list:
          type_specifier specifier_qualifier_list
        | type_specifier
        | type_qualifier specifier_qualifier_list
        | type_qualifier

struct_declarator_list:
          (struct_declarator ',')(s?) struct_declarator

struct_declarator:
          declarator
        | ':' constant_expression
        | declarator ':' constant_expression

enum_specifier:
          ENUM '{' enumerator_list '}'
        | ENUM IDENTIFIER '{' enumerator_list '}'
        | ENUM IDENTIFIER

enumerator_list:
          (enumerator ',')(s?) enumerator

enumerator:
          IDENTIFIER
        | IDENTIFIER '=' constant_expression

type_qualifier:
          CONST
        | VOLATILE

declarator:
          pointer direct_declarator
        | direct_declarator

direct_declarator:
          IDENTIFIER
        | '(' declarator ')'
        | (IDENTIFIER)(s?) ('(' declarator ')')(s?) '[' constant_expression ']'
        | (IDENTIFIER)(s?) ('(' declarator ')')(s?) '[' ']'
        | (IDENTIFIER)(s?) ('(' declarator ')')(s?) '(' parameter_type_list ')'
        | (IDENTIFIER)(s?) ('(' declarator ')')(s?) '(' identifier_list ')'
        | (IDENTIFIER)(s?) ('(' declarator ')')(s?) '(' ')'

pointer:
          '*'
        | '*' type_qualifier_list
        | '*' pointer
        | '*' type_qualifier_list pointer

type_qualifier_list:
          type_qualifier(s)

parameter_type_list:
          parameter_list
        | parameter_list ',' ELLIPSIS

parameter_list:
          (parameter_declaration ',')(s?) parameter_declaration

parameter_declaration:
          declaration_specifiers declarator
        | declaration_specifiers abstract_declarator
        | declaration_specifiers

identifier_list:
          (IDENTIFIER ',')(s?) IDENTIFIER

type_name:
          specifier_qualifier_list
        | specifier_qualifier_list abstract_declarator

abstract_declarator:
          pointer
        | direct_abstract_declarator
        | pointer direct_abstract_declarator

direct_abstract_declarator:
          '(' abstract_declarator ')'
        | '[' ']'
        | '[' constant_expression ']'
        | DAD '[' ']'
        | DAD '[' constant_expression ']'
        | '(' ')'
        | '(' parameter_type_list ')'
        | DAD '(' ')'
        | DAD '(' parameter_type_list ')'

DAD:    #macro for direct_abstract_declarator 
          ( '(' abstract_declarator ')' )(s?)
          ( '[' ']' )(s?)
          ( '[' constant_expression ']' )(s?)
          ( '(' ')' )(s?)
          ( '(' parameter_type_list ')' )(s?)

initializer:
          assignment_expression
        | '{' initializer_list '}'
        | '{' initializer_list ',' '}'

initializer_list:
          (initializer ',')(s?) initializer

statement:
          labeled_statement
        | compound_statement
        | expression_statement
        | selection_statement
        | iteration_statement
        | jump_statement

labeled_statement:
          IDENTIFIER ':' statement
        | CASE constant_expression ':' statement
        | DEFAULT ':' statement

compound_statement:
          '{' '}'
        | '{' statement_list '}'
        | '{' declaration_list '}'
        | '{' declaration_list statement_list '}'

declaration_list:
          declaration(s)

statement_list:
          statement(s)

expression_statement:
          ';'
        | expression ';'

selection_statement:
          IF '(' expression ')' statement
        | IF '(' expression ')' statement ELSE statement
        | SWITCH '(' expression ')' statement

iteration_statement:
          WHILE '(' expression ')' statement
        | DO statement WHILE '(' expression ')' ';'
        | FOR '(' expression_statement expression_statement ')' statement
        | FOR '(' expression_statement expression_statement expression ')' statement

jump_statement:
          GOTO IDENTIFIER ';'
        | CONTINUE ';'
        | BREAK ';'
        | RETURN ';'
        | RETURN expression ';'

translation_unit:
          external_declaration(s)

external_declaration:
          function_definition
        | declaration

function_definition:
          declaration_specifiers declarator declaration_list compound_statement
        | declaration_specifiers declarator compound_statement
        | declarator declaration_list compound_statement
        | declarator compound_statement

# TERMINALS

reserved_word:
	AUTO     | BREAK   | CASE     | CHAR   | CONST    |
	CONTINUE | DEFAULT | DO       | DOUBLE | ENUM     |
	EXTERN   | FLOAT   | FOR      | GOTO   | IF       |
	INT      | LONG    | REGISTER | RETURN | SHORT    | 
	SIGNED   | SIZEOF  | STATIC   | STRUCT | SWITCH   |
	TYPEDEF  | UNION   | UNSIGNED | VOID   | VOLATILE |
	WHILE


ADD_ASSIGN:	'+='
AND_ASSIGN:	'&='
AND_OP:		'&&'
AUTO:		'auto'
BREAK:		'break'
CASE:		'case'
CHAR:		'char'
CONST:		'const'
CONSTANT:	/[+-]?(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?/
CONTINUE:	'continue'
DEC_OP:		'--'
DEFAULT:	'default'
DIV_ASSIGN:	'/='
DO:		'do'
DOUBLE:		'double'
ELLIPSIS:	'...'
ELSE:           'else'
ENUM:		'enum'
EQ_OP:		'=='
EXTERN:		'extern'
FLOAT:		'float'
FOR:		'for'
GE_OP:		'>='
GOTO:		'goto'
IDENTIFIER:	...!reserved_word /[a-z]\w*/i
IF:		'if'
INC_OP:		'++'
INT:		'int'
LEFT_ASSIGN:	'<<='
LEFT_OP:	'<<'
LE_OP:		'<='
LONG:		'long'
MOD_ASSIGN:	'%='
MUL_ASSIGN:	'*='
NE_OP:		'!='
OR_ASSIGN:	'|='
OR_OP:		'||'
PTR_OP:         '->'
REGISTER:	'register'
RETURN:		'return'
RIGHT_ASSIGN:	'>>='
RIGHT_OP:	'>>'
SHORT:		'short'
SIGNED:		'signed'
SIZEOF:		'sizeof'
STATIC:		'static'
STRING_LITERAL:	{ extract_delimited($text,'"') }
STRUCT:		'struct'
SUB_ASSIGN:	'-='
SWITCH:		'switch'
TYPEDEF:	'typedef'
TYPE_NAME:	# NONE YET
UNION:		'union'
UNSIGNED:	'unsigned'
VOID:		'void'
VOLATILE:	'volatile'
WHILE:		'while'
XOR_ASSIGN:	'^='
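
A side note on the rewrite technique used throughout the grammar above: a left-recursive rule such as `expr: expr '-' NUM | NUM` calls itself before consuming any input, so a recursive-descent parser never makes progress. The `(x op)(s?) x` productions replace that recursion with repetition. The sketch below shows the same idea in plain Perl, without Parse::RecDescent; the helper name `parse_additive` and the tiny number-only language are assumptions made for illustration, not part of the posted grammar.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate "NUM (('+'|'-') NUM)*" from left to right.
# The while loop plays the role of the repeated group in the grammar, so the
# function never has to call itself before consuming a token.
sub parse_additive {
    my ($text) = @_;
    my @tokens = $text =~ /\d+|[-+]/g;

    my $value = shift @tokens;
    die "expected a number\n" unless defined $value && $value =~ /^\d+$/;

    while (@tokens) {
        my $op  = shift @tokens;
        my $rhs = shift @tokens;
        die "expected an operator\n" unless $op =~ /^[-+]$/;
        die "expected a number after '$op'\n"
            unless defined $rhs && $rhs =~ /^\d+$/;
        # Folding as we go keeps the operators left-associative.
        $value = $op eq '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

print parse_additive("10 - 3 - 2"), "\n";   # grouped as (10 - 3) - 2
```

Folding inside the loop is what preserves left associativity; a purely right-recursive rewrite would group the same input as 10 - (3 - 2).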



------------------------------

Date: Tue, 01 Aug 2000 20:15:16 -0400
From: Omri Schwarz <ocschwar@mit.edu>
Subject: Re: Parse::RecDescent with left-recursive grammar.
Message-Id: <39876814.28BA0342@mit.edu>

Damian Conway wrote:
> 
> >My project to write the C-to-English translator requires
> >the flexibility $item{} and $arg{} offer in RecDescent,
> >(so no Parse::Yapp usage possible, it seems), but attempts
> >to turn C into a right-recursive grammar require doing
> >...';' in many odd places and other contortions that
> >if I continue to try I will turn into a protagonist in
> >an H.P. Lovecraft story.
> 
> Heaven forfend!
> 
> A wonderful person named Juan Karlo Del Mundo took the demo_Cgrammar.pl
> file I included in the latest distribution and de-left-ified it.
> 

Excellent! I'll merge it into what I have at the moment.

Thank you!


------------------------------

Date: Tue, 01 Aug 2000 22:09:38 GMT
From: jffuller@my-deja.com
Subject: Perl - w (PROBLEM) BUG?????
Message-Id: <8m7hqv$i5j$1@nnrp1.deja.com>

I hope somebody can help me with this problem.  I'm having a problem
running Perl programs through my Apache Web server.  It's very strange: I
installed the latest version of Perl on my server (RedHat v. 6.2), and
to run a program I can't use the usual: #!/usr/local/bin/perl

I have to add the -w switch, e.g.
#!/usr/local/bin/perl -w

When I use the above header everything works correctly.

Does anybody have any ideas on how to fix this weird problem?

P.S. if you could email me at:  Info@esitez.com I'd definitely
appreciate it.

Sincerely,

Jeremy Fuller


Sent via Deja.com http://www.deja.com/
Before you buy.


------------------------------

Date: Tue, 01 Aug 2000 22:23:38 GMT
From: mikelot@my-deja.com
Subject: Re: Perl CGI - files occasionally truncated
Message-Id: <8m7il6$imo$1@nnrp1.deja.com>

In article <8m7edi$bnj$1@provolone.cs.utexas.edu>,
  logan@cs.utexas.edu (Logan Shaw) wrote:
> In article <8m7aag$c6s$1@nnrp1.deja.com>,  <mikelot@my-deja.com> wrote:
> >I have a Perl CGI which basically updates a flatfile database based on
> >form input. I'm following the recommended process (from the FAQ) of
> >creating a new temporary file, writing to that, then doing a rename to
> >the original file name as the last step.
> >
> >I'm finding (very infrequently, as in months before a failure) that the
> >file is not always updated correctly - that it may in fact be left
> >empty or with only a few records. Is it possible that the perl cgi is
> >getting killed in the middle of the rename step? What's the best
> >workaround - a signal handler?
>
> Are you locking the file?  What if two CGIs run at the same time and
> try to update the file?  In many cases, this kind of a scenario leads
> to corruption of flat file databases.
>
> Personally, I'd consider going to a database.  MySQL is free and it
> will almost definitely serve your needs.  Plus, it will probably be
> more efficient than loading and re-writing the file every time you make
> a small change in it.
>
>   - Logan
>

yes- the files are locked whenever they are accessed (sorry, should
have mentioned that!)

Regarding MySQL - in my case, users want to be able to do a text search
of the records for matching keywords (ie almost like a grep), so I
didn't think MySQL would be that much benefit. But not having used it,
I could be way off base! Seems like I'd have to build some sort of
indexing system to make it useful in this particular application.





------------------------------

Date: 1 Aug 2000 22:43:04 GMT
From: ebohlman@netcom.com (Eric Bohlman)
Subject: Re: Perl CGI - files occasionally truncated
Message-Id: <8m7jpo$14o$6@slb6.atl.mindspring.net>

mikelot@my-deja.com wrote:
: yes- the files are locked whenever they are accessed (sorry, should
: have mentioned that!)

Make sure that you aren't falling into the common beginner's trap of 
opening the file for read, locking it, reading it into an array, closing 
the file (which releases the lock), massaging the array's contents, 
opening the file for write, locking it, writing out the array, and then 
closing the file.  There are two points in this sequence where a race 
condition can occur.  If another copy of the script reads the file in 
between the first close and the second open, it will get an original copy 
of the file, and when it writes it out it will overwrite any changes the 
first copy of the script made.  If the second copy reads the file in 
between the second open and the second lock, it will read an empty file 
(because opening a file for writing wipes out any previous copy) and when 
it writes the changes out it will overwrite *everything*.

What you need to do is open the file for read/write, lock it, read the 
contents in, massage them, seek to the beginning of the file, truncate 
the file to zero length, write out the changed contents and then, and 
only then, close the file.  That way the file remains locked throughout 
the entire update sequence.
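
The sequence described above can be sketched roughly as follows. This is a minimal illustration rather than a drop-in fix: the helper name `update_locked` and its callback interface are assumptions, and it presumes the file already exists and that flock() works on the filesystem in question.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);

# Hypothetical helper: apply $edit (a code ref that takes and returns a list
# of lines) to $path while holding one exclusive lock for the whole update.
sub update_locked {
    my ($path, $edit) = @_;

    # '+<' opens for reading and writing without clobbering the contents.
    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $path: $!";

    my @lines = <$fh>;
    @lines = $edit->(@lines);

    # Rewind and empty the file only after the new contents are ready.
    seek($fh, 0, SEEK_SET)    or die "Cannot rewind $path: $!";
    truncate($fh, 0)          or die "Cannot truncate $path: $!";
    print {$fh} @lines        or die "Cannot write $path: $!";

    # Closing the handle is what finally releases the lock.
    close($fh)                or die "Cannot close $path: $!";
    return scalar @lines;
}
```

A caller might write `update_locked('records.txt', sub { grep { !/^#/ } @_ })`, where `records.txt` is a made-up file name. Because the handle is never closed between the read and the write, no other locking process can see the file in an intermediate state.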



------------------------------

Date: Tue, 1 Aug 2000 15:06:44 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: Perl Module in C - Question.
Message-Id: <MPG.13f0e9d0de84846898ac1d@nntp.hpl.hp.com>

In article <965166460.25924@itz.pp.sci.fi> on 1 Aug 2000 21:49:26 GMT, 
Ilmari Karonen <iltzu@sci.invalid> says...
> In article <8m7e9r$bmk$1@provolone.cs.utexas.edu>, Logan Shaw wrote:
> >In article <nneeoschbm15pski7lpdcrpdqlo1dcm4uu@4ax.com>,
> >John Fortin  <fortinj@attglobal.net> wrote:
> >>hmmm, Activestate perl for windows doesn't seem to have manpages.
> >>A bit 'unix'centric are we??
> >
> >It has them.  They're just hidden somewhere and called something
> 
>   s/somewhere/in the Start menu/, IIRC.

Start:Programs:ActivePerl:Documentation

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Tue, 1 Aug 2000 23:54:00 +0100
From: "Jane Paterson" <jane-paterson@ntlworld.com>
Subject: perlcc
Message-Id: <OvIh5.1461$jw1.29188@news2-win.server.ntlworld.com>

Has anyone managed to get perlcc working?

I can get the C source code but it will not compile to an executable. It
says

Couldn't Open !

Any help??

Colin





------------------------------

Date: Tue, 01 Aug 2000 22:20:47 GMT
From: cberry@cinenet.net (Craig Berry)
Subject: Re: question about tr
Message-Id: <soej9vkidbm125@corp.supernews.com>

T. Postel (T.Postel@ieee.org) wrote:
: I have been using tr to untaint input like this:
: $Group =~ tr/A-Za-z0-9 -_//cd;
: But I found, and I can't explain why, it doesn't work.
: This does work and I've changed to it:
: $Group =~ tr/_A-Za-z0-9 -//cd;
: I haven't found an explanation why adding an underscore to the end of
: the searchlist causes the tr to fail.

It's not the underscore at the end that makes it fail, but rather the
hyphen in the middle.  Hyphens are special in tr match lists; they indicate
a character range.  For example, tr/a-z//cd deletes everything that isn't
a lowercase alphabetic character.  In your non-working pattern, the last
three characters are ' -_', which is being interpreted by Perl as "the
characters from ' ' to '_', in ASCII order, inclusive".  That's codes
32-95 (decimal), a span which includes a lot of the punctuation
characters.

The trick for including hyphen in a tr pattern (or character class, in a
regex) is to put it first or last -- as you accidentally discovered. :)
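
To make the difference concrete, here is a small sketch that wraps the two tr lists from the original question in helper subs. The sub names and the sample input are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two tr lists from the question.
sub strip_range_bug {
    my ($s) = @_;
    # ' -_' is read as the range from space (32) to underscore (95),
    # so most punctuation survives the "cleanup".
    (my $copy = $s) =~ tr/A-Za-z0-9 -_//cd;
    return $copy;
}

sub strip_literal_hyphen {
    my ($s) = @_;
    # With the hyphen last it is a literal character, not a range operator.
    (my $copy = $s) =~ tr/_A-Za-z0-9 -//cd;
    return $copy;
}

my $input = 'web-team_1; rm *';
printf "[%s]\n", strip_range_bug($input);       # ';' and '*' are kept
printf "[%s]\n", strip_literal_hyphen($input);  # ';' and '*' are removed
```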

-- 
   |   Craig Berry - http://www.cinenet.net/users/cberry/home.html
 --*--  "Turning and turning in the widening gyre
   |   The falcon cannot hear the falconer." - Yeats, "The Second Coming"


------------------------------

Date: Wed, 02 Aug 2000 00:35:01 GMT
From: T.Postel@ieee.org (T. Postel)
Subject: Re: question about tr
Message-Id: <G0Kh5.4$eJZ4.1703975@news.randori.com>

In article <MPG.13f0c86a39c7cd6b98ac19@nntp.hpl.hp.com>, Larry Rosler <lr@hpl.hp.com> wrote:

>Because it creates a range ' ' to '_', which is a whole lot of 
>characters!  :-)
>
>Moving the '_' to the front puts the '-' at the end, so it loses its 
>metasemantics.
>
Man I am so dumb!
Thanks one and all, I just didn't see the trees for the forest.


-- 
While my e-mail address is not munged,     | T.Postel@ieee.org
I probably won't read anything sent there. |


------------------------------

Date: Wed, 02 Aug 2000 01:14:29 -0500
From: "Dietmar Staab" <dietmar.staab@t-online.de>
Subject: Re: Regex not matching ... character using OROMatcher
Message-Id: <8m7lje$pes$15$1@news.t-online.com>

In article <8m7398$rna$1@bob.news.rcn.net>, "Peter Lyons"
<pl22andfu@yahoo.com> wrote:

> Hi,
> 
>   I am using Daniel Savarese's OROMatcher package
> ( http://www.savarese.org/oro/software/OROMatcher1.1.html ), which allows me
> to use Perl5 regular expressions in my Java programs.  I am using this
> regex:
> 
> (?:<[^>]+>|[^<]+)
> 
> to break a web page up into chunks that are either tags or hunks of
> content. It is working well, but it stumbles on the ... character in a
> web page.  I believe this is octal \205 or \xC9 (mac) or \x87 (pc) (not
> sure though). Anyway, emacs shows it as \205 and windows NotePad shows a
> black square.  I need [^<]+ to match this character, but it isn't. 

Hi Peter,  is it always the same ... character or different ones? If it's
only _one_ character and you don't need this, get rid of this character
and remove it with s//. If you need this character in your output, then my
suggestion would be to replace it first with a filler which is not in your
text - do your matching and undo the first replacement (filler back to
character ...). I have no other solution in mind - maybe there exists a
better one.
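
For comparison only, here is how the same pattern behaves in Perl itself when the text contains the byte \x85 (octal \205). This says nothing about what OROMatcher does internally; the sub name `chunk_html` and the sample markup are assumptions for the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: split markup into tag chunks and content chunks
# using the pattern quoted above.
sub chunk_html {
    my ($html) = @_;
    # With /g in list context and no capture groups, every match is returned.
    return $html =~ /(?:<[^>]+>|[^<]+)/g;
}

my @chunks = chunk_html("<p>wait\x85</p>");
printf "%d chunks; middle chunk ends with byte 0x%02X\n",
    scalar(@chunks), ord(substr($chunks[1], -1));
```

In Perl, [^<]+ matches any character other than '<', the \x85 byte included, so if the Java side behaves differently the cause is more likely in how the page is decoded or in the library than in the pattern.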

BTW, for parsing HTML have a look at CPAN for the module.

Greetings, Dietmar


------------------------------

Date: Tue, 01 Aug 2000 17:47:32 -0700
From: "Godzilla!" <godzilla@stomp.stomp.tokyo>
Subject: Re: Removing newline chars
Message-Id: <39876FA4.28055EDD@stomp.stomp.tokyo>

Logan Shaw wrote:


(snipped stuff about \r\n and lynx) 


> It happens with lynx 2.7.1 on Solaris 7 (SPARC) and lynx-2.8.1rel.2 on
> Solaris 8 (Intel).  It does not happen with lynx 2.8.3dev.6 on Redhat
> Linux 6.0.
 
> So, it's clearly browser-specific.  In fact, the fact that it doesn't
> happen on a newer version of lynx may indicate that they decided it was
> a bug and decided to fix it.

I've made note of this, added this info to my scrapbook
under Lynx. Accordingly, I will modify my responses on
 =~ s/\r\n/ /;  and add a note about lynx browser 
specific oddities and add code much like yours if 
not yours.

Your code, =~ s/[\r\n]+/ /g; would seem best choice
on taking care of this carriage return / newline thing.
There is no problem with doubling up spaces using your
character set as compared to \r|\n type regex. Tested
your code, works very well indeed. I couldn't break it.
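
A quick side-by-side of the two substitutions, as a sketch; the sub names and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers comparing the two substitutions discussed above.
sub collapse_class {
    my ($s) = @_;
    $s =~ s/[\r\n]+/ /g;    # a whole run of CR/LF becomes one space
    return $s;
}

sub collapse_alternation {
    my ($s) = @_;
    $s =~ s/\r|\n/ /g;      # each CR and each LF becomes its own space
    return $s;
}

my $text = "line one\r\nline two\n";
printf "[%s]\n", collapse_class($text);         # single space between lines
printf "[%s]\n", collapse_alternation($text);   # two spaces where \r\n was
```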

This is a good learning experience. Thank you.

Godzilla!

-- 
With a little help from my friends, I will rock you.
  http://la.znet.com/~callgirl3/friends.mid


------------------------------

Date: Tue, 01 Aug 2000 18:13:15 -0400
From: Margit Meyer <margit@us.ibm.com>
Subject: Re: select/vec/pipes question
Message-Id: <39874B78.6574F850@us.ibm.com>



Ilja Tabachnik wrote:

> Margit Meyer wrote:
> >
> > Ilja Tabachnik wrote:
>
> ...some code skipped...
>
> >
> > > Sorry, I really cannot understand what process is the parent,
> > > what is the child, who is supposed to sleep (and when) and who
> > > is supposed to print (or to return ?) something.
> > >
> > > If you could specify your goals more hm-m... precisely,
> > > I could try to help you.
> > >
> >
> > I specified a snippet of code which performs the print of the stdout received
> > from the child process. To give you some background, the actual program will
> > execute on another system scripts or commands native to that system and
> > return the output to the calling system. For instance, system A invokes a
> > script which resides on system B containing  ls /tmp; sleep 20; exit 0. Once
> > the exit is completed the output of the script from system B should be
> > printed through stdout on system A all at once. This is not the case here.
> > System A has one line of output printed, through stdout(from the snippet of
> > code specified), from the script(ls /tmp;sleep 20; exit 0) on system B. Then
> > the sleep 20 seconds occurs on system B from the local script, then the exit
> > from the local script occurs on system B causing the rest of the ls /tmp from
> > system B to be printed on system A through stdout(again from the snippet of
> > code). We, my colleagues and I, think the line of code:
> > $nready = select($rout = $rin, undef, undef, undef);
> > is supposed to wait until the child process has completed (doing the ls /tmp;
> > sleep 20; exit). Maybe our assumption is incorrect.
>
> Your assumption is incorrect. AFAIK Perl's 4-argument select() does
> nothing more (but no less) than the underlying OS's select(2) system call.
> It has nothing to do with wait'ing for a child process to complete.
> If you need to wait for a child - use wait() or waitpid().
>
> If all you want is to capture child's stdout and then print it out
> (maybe with some processing), you may use open() (to open a pipe)
> or backticks.
>
> For more information about IPC solutions:
>
> perldoc perlipc
> perldoc perlfaq8
> perldoc -f <every_function_you_are_interested_in>
>
> > Maybe someone can explain exactly what the select and vec statements
> > appear to be doing.
>
> perldoc -f vec
>
> For select() details see your OS (AIX?) manpage: man select
>
> If perl docs unavailable locally, take a look at
> http://www.cpan.org/doc/manual/html/pod/index.html.
>
> BTW, I'm still curious about one hidden detail:
> How do you create a child process on system B from a parent
> process on system A :-O
>
> Hope this helps.
> Ilja.

Hi again...

I guess I'm not explaining myself very clearly and appreciate your answers. I found
another append to someone else which I believe answers my question. I am catching
output from other perl scripts (child process kicks them off) We are using the
select() statement to catch the stdout but do not want to print until the entire
output is in stdout and the other process (child exit 0) has completed. We have
everything handled except that one line seems to print before the child is
done...then the rest. I thought that the select() wouldn't be 'triggered' until all
the output is in stdout but it looks like otherwise.
I have looked at all the man pages/documentation etc. available but do not find a
detailed enough explanation of the select() or vec() statement, thus the forum.
For your curiosity...we are on an SP machine; the program uses rsh under the covers
to 'kick' off the child processes on the other systems. We then capture the output
through the select() statement. All works well until you delay ending the child
process(sleep 20 in the script) which allows 1 line to be printed before child ends
and all the rest of the output is then printed.
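
For reference, the behaviour described here matches how select() works: it reports a handle as readable as soon as any data is available, not when the writer has exited. One way to hold output back until the child is finished is to read the pipe to end-of-file before printing anything. The sketch below assumes a Unix-like system; the helper name `run_and_collect` and the demo child command are made up for illustration and do not reproduce the rsh setup.

```perl
use strict;
use warnings;

# Hypothetical helper: run @cmd, collect all of its stdout, and return the
# output together with the exit status once the child has finished.
sub run_and_collect {
    my (@cmd) = @_;

    # List-form pipe open forks and execs @cmd without involving a shell.
    open(my $out, '-|', @cmd) or die "Cannot start $cmd[0]: $!";

    # A list-context read returns only at end-of-file, that is, once the
    # child has closed its stdout.
    my @lines = <$out>;

    # close() on a pipe waits for the child and stores its status in $?.
    close($out);
    return (join('', @lines), $? >> 8);
}

# Demo: the child prints, pauses, then prints again; nothing is shown
# until both lines have arrived.
my ($output, $status) =
    run_and_collect($^X, '-e', 'print "first\n"; sleep 1; print "rest\n"');
print $output;
print "exit status: $status\n";
```

When several children must be serviced at once, select() is still useful for deciding which handle to read next, but the data would need to be buffered per child until sysread() reports end-of-file on that handle.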

Thanks again for your help.
Margit Meyer



------------------------------

Date: Tue, 1 Aug 2000 15:01:23 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: RE: Short read from a file (WAS: Syntax Question)
Message-Id: <MPG.13f0e88af41594598ac1c@nntp.hpl.hp.com>

In article <x74s544qth.fsf@home.sysarch.com> on Tue, 01 Aug 2000 
21:25:15 GMT, Uri Guttman <uri@sysarch.com> says...
> >>>>> "LR" == Larry Rosler <lr@hpl.hp.com> writes:

 ...

>   LR> This is my first deep dive into the perl source, and after I catch my 
>   LR> breath I may file a bug report.  The test at line 1542 should be for any 
>   LR> length smaller than requested, not just 0.
> 
> what if you are reading a non-blocking socket? you typically will get a
> smaller amount read than you requested.

Then using '-s FILE' as we have done wouldn't mean much, would it?

>                                         i don't know if that code you
> want fixed could handle that properly. IMO perl just passes the
> semantics of c's read up the line. <FOO> will cause multiple system
> reads to read the whole file any of which could fail. the file might
> even be truncated between the last and next read calls, then you could
> get a short read.

The code in question is in a conditional part that deals with ordinary 
files only.  Sockets are handled in a different branch.

If there is a short read from a file, there is an error.

The patch should check for a short read, and set the number of bytes 
read to -1 and rejoin the error-handling function.  Right now, it only 
checks for 0 bytes read.

> so let the perl coder handle that as they wish. it is not trivial for
> the perl core to know what is a read error based on a short read.

If there is a short read from a file, there is an error.
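
Whatever the core ends up doing, a caller who shares this view can enforce it in user code. The following is only a sketch of that policy for plain files; the helper name `read_exactly` is hypothetical, and the check is not appropriate for sockets or pipes, where partial reads are normal.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $want bytes from $fh or die. It treats
# a short read from a plain file as an error, the position argued above.
sub read_exactly {
    my ($fh, $want) = @_;

    # read() returns the number of bytes read, 0 at end-of-file,
    # or undef on error.
    my $got = read($fh, my $buf, $want);
    die "read failed: $!\n" unless defined $got;
    die "short read: wanted $want bytes, got $got\n" if $got != $want;
    return $buf;
}
```

A typical call would pass the size reported by `-s` on the open handle, for example `read_exactly($fh, -s $fh)`, so that a file truncated between the stat and the read is reported instead of silently accepted.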

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Tue, 01 Aug 2000 19:57:14 -0400
From: Dilworth <dilworth@megsinet.net>
Subject: Re: splitting on spaces
Message-Id: <398763DA.39FC3223@megsinet.net>

Dietmar:

Oopsie!  My mistake(s).  The second "my $x =" is a typo - should lose the "my"
- d'oh!
 The s/ +//g should indeed be s/ +/ /g (i.e. space between the 2nd and 3rd
"/").  Fat fingers today.  Sorry about that.  I pride myself on being
embarrassed as much and as often as possible and am glad that folks point out
my shortcomings. :-(

Red Faced and properly humbled,

Bob Dilworth
Toledo, Ohio
bdilworth@mco.edu

Dietmar Staab wrote:

> In article <39873223.CA4F1197@mco.edu>, Bob Dilworth <bdilworth@mco.edu>
> wrote:
>
> Hi Bob,
> > Try this:
> >
> > my $x = "Hello how can    I       help  you";  my $x =~ s/ +//g; # this
>
> $x =~s/ +/ /g or you will end up with a sentence with no spaces at all and
> why declaring the variable twice (with "my")?
>
> > regex will compress multi spaces to 1 space  my @array = split(/ /,$x);
> >
> > You should end up with 6 array elements - one for each word.
> >
> If you change the right part of the substitution.
>
> But try this string
> my $x = "    Hello how can    I       help  you";
>
> I bet you won't get six array elements. ;-)
>
> Greetings, Dietmar
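
A minimal sketch of the leading-space case raised above, assuming the corrected
substitution s/ +/ /g.  The variable names $squeezed, @naive and @words are
made up for illustration.  It compares split / / against the special pattern
' ', which skips leading whitespace:

```perl
use strict;
use warnings;

my $x = "    Hello how can    I       help  you";

# Compress runs of spaces to one space, then split on a single space.
# The one leading space that remains produces an empty first field.
(my $squeezed = $x) =~ s/ +/ /g;
my @naive = split / /, $squeezed;

# The special pattern ' ' skips leading whitespace and splits on
# runs of whitespace, so no substitution is needed first.
my @words = split ' ', $x;

print scalar(@naive), " fields from split / /\n";   # 7 (first one is empty)
print scalar(@words), " fields from split ' '\n";   # 6
```

With split ' ' the s/// step can be dropped altogether.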



------------------------------

Date: Tue, 01 Aug 2000 22:18:23 GMT
From: gbacon@HiWAAY.net (Greg Bacon)
Subject: Re: Syntax Question
Message-Id: <soej5fr5dbm12@corp.supernews.com>

In article <MPG.13f0d47a598e77b798ac1a@nntp.hpl.hp.com>,
    Larry Rosler  <lr@hpl.hp.com> wrote:

: In article <soe9bferdbm114@corp.supernews.com> on Tue, 01 Aug 2000 
: 19:30:55 GMT, Greg Bacon <gbacon@HiWAAY.net> says...
:
: + The documentation and your understanding seem to be at odds.  Note the
: + use of the word 'attempts' used in comparison to what is "actually
: + read".
: 
: See below.  Failure to read what is requested is an error, not a partial 
: read.

Bull.  Uri pointed out an excellent counterexample to that claim.  I
know we're talking about files, but remember that one uses the same
interface for files, sockets, devices, etc.

: + Return values are there for a reason.  Would you dismiss checking
: + open()'s return value as paranoia?
: 
: Of course not.  But you clipped my observation that you made no error 
: checks on the <FILE> operation.

The return values from readline make it impossible to distinguish an
error from an ordinary end-of-file condition.  The documentation does
not mention error handling.

: + Can you please cite a reference in the documentation (or even the
: + source) that supports your claim?
: 
: The perl source (5.6.0), and the ANSI/ISO C Standard.
: 
: In pp_sys.c, PP(pp_read) calls PP(pp_sysread) calls PerlIO_read (at line 
: 1540), which calls fread() (perlio.c, line 368).  The returned value 
: gets passed up, and there are no loops.
: 
: The ANSI/ISO C Standard says (4.9.8.1):
: 
:     The *fread* function returns the number of elements successfully
:     read, which may be less than *nmemb* if a read error or end-of-file
:     is encountered.
: 
: The comment about fread() in the perl source:
: 
: 	length = PerlIO_read(IoIFP(io), buffer+offset, length);
: 	/* fread() returns 0 on both error and EOF */
: 	if (length == 0 && PerlIO_error(IoIFP(io)))
: 	    length = -1;
:     }
:     if (length < 0) {
:         ... deal with the error ...
: 
: is patently incorrect, and the I/O error return isn't analyzed for a 
: partial read, but it is an error nevertheless.

EINTR?  EAGAIN?  What if the file is an empty named pipe?  -s will
return 0, and the read will fail.  What about platforms that don't
use stdio fread?

: So a 'short' read or sysread represents an error, not something that can 
: be completed by a subsequent read.

My point remains: a readline on a valid filehandle cannot fail.  It is
safer and easy to get right.
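
For reference, a sketch of what checking read's return value can look like in
calling code.  It does not settle whether a short read from an ordinary file
should count as an error.  It only shows the three cases the builtin reports:
undef on error, 0 at end-of-file, otherwise the number of bytes read.  The
helper name read_fully and the sample data are made up for illustration, and
the in-memory filehandle assumes a perl newer than the 5.6.0 discussed here:

```perl
use strict;
use warnings;

# Read up to $want bytes, looping over short reads.
# Returns what was read; dies on a real I/O error.
sub read_fully {
    my ($fh, $want) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        # The fourth argument is the offset, so each read appends to $buf.
        my $got = read($fh, $buf, $want - length($buf), length($buf));
        die "read failed: $!" unless defined $got;   # undef means error
        last if $got == 0;                           # 0 means end-of-file
    }
    return $buf;
}

# Usage: read from an in-memory filehandle (hypothetical data).
my $data = "line one\nline two\n";
open my $fh, '<', \$data or die "open failed: $!";
my $buf = read_fully($fh, 1024);
close $fh;

print length($buf), " bytes read\n";   # 18
```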

Greg
-- 
If my children will live a better life than I did by my getting brain damage,
by my being brain dead then let it be.
    -- Mike Tyson


------------------------------

Date: Wed, 02 Aug 2000 00:34:25 -0500
From: "Dietmar Staab" <dietmar.staab@t-online.de>
Subject: Re: When does it pay to presize an array?
Message-Id: <8m7j8a$ng8$13$1@news.t-online.com>

In article <uwvi2fef1.fsf@demog.berkeley.edu>, aperrin@demog.berkeley.edu
(Andrew J. Perrin) wrote:

> I'm working on a project in which some fairly large arrays (around
> 100,000 to 200,000 elements) will be loaded from a file. I will know
> ahead of time how many elements will be read in, so it is feasible to
> pre-size the arrays using $#array = nnn where nnn is the size of the
> array. perldoc perldata suggests that this will speed up the array
> filling process.

Hi Andrew,  
does $#array = nnn presize a hash (I don't know if it works)?
The feature of presizing a hash was introduced in release 5.004 of Perl
and the presizing is done by something like

keys(%myhash) = 1024;

This will presize the hash with 1024 buckets. The number of
buckets you reserve has to be a power of 2 (that's due to Perl's internal
data structures). 

The performance depends on the distribution of your keys to the buckets
(and you can't change Perl's hash algorithm if it works badly on your
key values).

Greetings, Dietmar


------------------------------

Date: Tue, 1 Aug 2000 15:56:38 -0700
From: Larry Rosler <lr@hpl.hp.com>
Subject: Re: When does it pay to presize an array?
Message-Id: <MPG.13f0f57d2fed3de798ac1f@nntp.hpl.hp.com>

In article <8m7j8a$ng8$13$1@news.t-online.com> on Wed, 02 Aug 2000 
00:34:25 -0500, Dietmar Staab <dietmar.staab@t-online.de> says...
> In article <uwvi2fef1.fsf@demog.berkeley.edu>, aperrin@demog.berkeley.edu

 ...

> does $#array = nnn presize a hash (I don't know if it works)?

Perhaps you mean 'array', not 'hash'.  You can find out if it works by 
trying it.
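
A quick way to try it, as a sketch.  The element count $n is a made-up figure
in the range the original poster mentions.  Note that $#array is the last
index, so an array of n elements is presized by assigning n - 1:

```perl
use strict;
use warnings;

my $n = 100_000;          # hypothetical element count, known ahead of time

my @array;
$#array = $n - 1;         # last index is n-1, so the array now has n slots

print scalar(@array), " elements after presizing\n";   # 100000

# Fill by index; push would append after the presized (undef) slots.
$array[$_] = $_ * 2 for 0 .. $n - 1;

print scalar(@array), " elements after filling\n";     # 100000
```

Whether this actually saves time for a given load is best measured, for
example with the standard Benchmark module.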

> The feature of presizing a hash was introduced in release 5.004 of Perl
> and the presizing is done by something like
> 
> keys(%myhash) = 1024;
> 
> This will presize the hash with 1024 buckets. The number of
> buckets you reserve has to be a power of 2 (that's due to Perl's internal
> data structures).

No, it doesn't have to be a power of 2.  Perl will take care of it for 
you.

Why not read the documentation for keys() before volunteering an answer 
about it?

perldoc -f keys

 ...

If you say

    keys %hash = 200;

then %hash will have at least 200 buckets allocated for it--256 of them, 
in fact, since it rounds up to the next power of two. 

 ...
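
A small sketch of the documented behaviour.  The key names are made up for
illustration.  Presizing only allocates buckets; it does not create any keys:

```perl
use strict;
use warnings;

my %hash;
keys(%hash) = 200;        # ask for at least 200 buckets; per perldoc -f keys
                          # perl rounds this up to the next power of two

# The hash is still empty: buckets were reserved, no keys were added.
print scalar(keys %hash), " keys after presizing\n";   # 0

$hash{"key$_"} = $_ for 1 .. 200;

print scalar(keys %hash), " keys after filling\n";     # 200
```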

-- 
(Just Another Larry) Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Wed, 02 Aug 2000 01:59:41 -0500
From: "Dietmar Staab" <dietmar.staab@t-online.de>
Subject: Re: When does it pay to presize an array?
Message-Id: <8m7o87$ueo$17$1@news.t-online.com>

In article <MPG.13f0f57d2fed3de798ac1f@nntp.hpl.hp.com>, Larry Rosler
<lr@hpl.hp.com> wrote:

Hi Andrew, hi Larry,

I've made a mistake - I'm sorry. When reading your posting, I switched in
my mind from arrays to hashes. I wrote $#array in my answer and thought about
hashes - oh no, what a substantial error. I saw Larry's follow-up and
reread what I wrote and the poster's question. The subject line also
contains "array" - I'm ashamed of myself.

Number of buckets - "has to be" was expressed wrongly, it should be "is".
It's not my day - I'd better leave the newsgroup now and perhaps come
back when I'm well rested. Then I'll read what I post twice to avoid such
stupid mistakes.

I hope you accept my apology, Dietmar


------------------------------

Date: 02 Aug 2000 03:24:37 +0300
From: Eyal Ben-David <eyalb@aks.com>
Subject: Re: Which Win32 Perl?
Message-Id: <m28zugqzlm.fsf@localhost.localdomain>

Iain Georgeson <iain@kremlinux.demon.co.uk> writes:

> [X-posted because I'm after the input of the Tk folks. FU set to .misc]
> 
> I'm currently trying out a selection of Perl ports to Win32. I'm after
> a port that allows me to use Tk and also build the SNMP module
> ("build" is an important word there - I'd like to be able to fiddle
> with the C source). I'm also considering hacking together the bastard
> progeny of the BSD in.tftpd and xsub, so IWBNI that built with
> whatever compiler I end up with.
> 
> So far, I've played with cygwin 1.1 (which can't build Tk),
> mingw32/egcs 2.91.66 (doesn't appear to be able to build SNMP) and
> ActiveState (requires me to buy MS Visual C++ and has an obnoxious
> redistribution clause).
> 
[...]

Hello,

You can try Borland C++ 5.5, which is now a free compiler ('Free Beer' sense).

I use perl 5.6.0 with Borland C++ 5.4 (from BCB 4).

I had to change 3 or 4 lines in the source code and the Makefile.
The standard distro assumes BC 5.02. The changes I made are related to
BC support for anonymous structs in C which are supported in BC++ 5.4

Once compiled all the tests passed. I installed many CPAN modules including
libwww, libwin32, libnet, DBI and many more. It runs fine in a production
environment without problems at all.

(I want to upload the changes but I don't know to whom)

Last time I tried the free 5.5 compiler I didn't succeed (miniperl dies
immediately, printing garbage and beeping). I didn't have the time to find
out what the problem is.

Eyal.


------------------------------

Date: 16 Sep 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 16 Sep 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

| NOTE: The mail to news gateway, and thus the ability to submit articles
| through this service to the newsgroup, has been removed. I do not have
| time to individually vet each article to make sure that someone isn't
| abusing the service, and I no longer have any desire to waste my time
| dealing with the campus admins when some fool complains to them about an
| article that has come through the gateway instead of complaining
| to the source.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 3877
**************************************

