[31044] in Perl-Users-Digest


Perl-Users Digest, Issue: 2289 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Mar 22 03:09:42 2009

Date: Sun, 22 Mar 2009 00:09:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 22 Mar 2009     Volume: 11 Number: 2289

Today's topics:
    Re: how to extract bunch of things from ( ) sln@netherlands.com
    Re: How to separate a big text file (say 400 news stori <huxiankui@gmail.com>
    Re: How to separate a big text file (say 400 news stori <1usa@llenroc.ude.invalid>
    Re: How to separate a big text file (say 400 news stori <tadmc@seesig.invalid>
        How to separate a big text file (say 400 news stories)  <huxiankui@gmail.com>
        new CPAN modules on Sun Mar 22 2009 (Randal Schwartz)
    Re: software design question (David Combs)
    Re: software design question sln@netherlands.com
    Re: String parsed wrong by perl <Alexander.Farber@gmail.com>
    Re: String parsed wrong by perl <nospam-abuse@ilyaz.org>
    Re: String parsed wrong by perl sln@netherlands.com
    Re: use warnings unless $foo <nospam-abuse@ilyaz.org>
    Re: use warnings unless $foo sln@netherlands.com
    Re: use warnings unless $foo sln@netherlands.com
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sun, 22 Mar 2009 00:35:10 GMT
From: sln@netherlands.com
Subject: Re: how to extract bunch of things from ( )
Message-Id: <1g1bs45chavmcsoccu6h7m5k00f2li5tjc@4ax.com>

On Thu, 19 Mar 2009 16:48:11 -0700 (PDT), sherry <syangs04@gmail.com> wrote:

>I need to extract all the args from this line of code:
>
>foo3( "something %s", ( BYTE )arg1, arg2.m_pcContentLocation );
>
>the result is something like this:
>
>"something %s", ( BYTE )arg1, arg2.m_pcContentLocation
>
>I'm fairly new to perl. Not sure if this is doable.
>
>Thanks for your help!
>
>Sherry

This looks suspiciously like C++. Finding the enclosing parentheses is
no easier than parsing the function parameters themselves.

This very limited algorithm gets you to the parameters, where you can
then try some parsing techniques on them.

Beware: functions of functions and parameters of parameters will
make you crazy.

-sln
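
For the single line in the question, a plain depth counter may be enough. This is only a minimal sketch and not the parser that follows: extract_args is a hypothetical helper name, and it assumes no parentheses hide inside string literals or comments.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the first '(' and its
# matching ')'. Assumes no parentheses inside string literals or comments.
sub extract_args {
    my ($line) = @_;
    my $start = index($line, '(');
    return if $start < 0;                  # no opening parenthesis at all

    my $depth = 0;
    for my $i ($start .. length($line) - 1) {
        my $ch = substr($line, $i, 1);
        $depth++ if $ch eq '(';
        $depth-- if $ch eq ')';
        if ($depth == 0) {                 # matching close found
            my $args = substr($line, $start + 1, $i - $start - 1);
            $args =~ s/^\s+|\s+$//g;       # trim surrounding whitespace
            return $args;
        }
    }
    return;                                # unbalanced parentheses
}

my $line = 'foo3( "something %s", ( BYTE )arg1, arg2.m_pcContentLocation );';
print extract_args($line), "\n";
```

It returns nothing when the parentheses are unbalanced, so the caller can tell a parse failure from an empty argument list.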

## ===============================================
## C_FunctionParser_v3.pl  @  3/21/09
## -------------------------------
## C/C++ Style Function Parser
##  Idea - To parse out C/C++ style functions
##  that have parenthetical closures (some don't).
##  (Could be a package some day, dunno, maybe ..)
##  - sln  
## ===============================================
my $VERSION = 3.0;
$|=1;

use strict;
use warnings;

# Prototypes
  sub Find_Function(\$\@);

# File-scoped variables
  my ($FxParse,$FName,$Preamble);

# Set default function name
  SetFunctionName();

## --------
## main 

# Source file
   my $Source = join '', <DATA>;

# Extended, possibly non-compliant, function name - pattern examples:
#  SetFunctionName(qr/_T/);
#  SetFunctionName(qr/\(\s*void\s*\)\s*function/);
#  SetFunctionName("\\(\\s*void\\s*\\)\\s*function");

# Parse some functions
   # func ...
    my @Funct = ();
    Find_Function($Source, @Funct);

# Print @Funct functions found
# Note that segments can be modified and collated.
   if (!@Funct) {
	print "Function name pattern: '$FName' not found!\n";
   } else {
	print "\nFound ".@Funct." matches.\nFunction pattern: '$FName' \n";
   }
   for my $ref (@Funct) {
	printf "\n\@: %6d - %s\n", $$ref[3], substr($Source, $$ref[0], $$ref[2] - $$ref[0]);
   }

## end main 
## ----------

# Set the parser's function regex pattern
#
sub SetFunctionName
{
	if (!@_) {
		$FName = "_*[a-zA-Z][\\w]*"; # Matches all compliant function names (default)
	} else {
		$FName = shift;
	}
	$Preamble   = "\\s*\\(";

	# Compile function parser regular expression
	  # Regex condensed:
	  # $FxParse = qr!//(?:[^\\]|\\\n?)*?\n|/\*.*?\*/|\\.|'["()]'|(")|($FName$Preamble)|(\()|(\))!s;
	  #                                    |         |   |       |1 1|2               2|3  3|4  4
	  # Note - Non-Captured, matching items, are meant to consume!
	  # -----------------------------------------------------------
	  # Regex /xpanded (with commentary):
	  $FxParse =                      # Regex Precedence (items MUST be in this order):
	    qr!                           # -----------------------------------------------
	         //                       # comment - //
	            (?:                   #    grouping
	                [^\\]             #       any non-continuation character ^\
	              |                   #         or
	                \\\n?             #       any continuation character followed by 0-1 newline \n
	            )*?                   #    to be done 0-many times, stopping at the first end of comment
	         \n         |             #  end of comment - //
	         /\*.*?\*/   |            # comment - /*  + anything + */
	         \\.       |              # escaped char - backslash + ANY character
	         '["()]'    |             # single quote char - quote then one of ", (, or ), then quote
	         (")       |              # capture $1 - double quote as a flag
	         ($FName$Preamble) |      # capture $2 - $FName + $Preamble
	         (\()  |                  # capture $3 - ( as a flag
	         (\))                     # capture $4 - ) as a flag
	  !xs;
}

# Procedure that finds C/C++ style functions
# (the engine)
# Notes:
#   - This is not a syntax checker !!!
#   - Nested functions index and closure are cached. The search is single pass.
#   - Parenthetical closures are determined via cached counter.
#   - This precedence avoids all ambiguous parenthetical open/close conditions:
#       1. Dual comment styles.
#       2. Escapes.
#       3. Single quoted characters.
#       4. Double quotes, flip-flopped to determine closure.
#   - Improper closures are reported, with the last one reliably being the likely culprit
#     (this would be a syntax error, ie: the code won't compile, but it is reported as a closure error).
#
sub Find_Function(\$\@)
{
	my ($src,$Funct) = @_;
	my @Ndx     = ();
	my @Closure = ();
	my ($Lines,$offset,$closure,$dquotes) = (1,0,0,0);

	while ($$src =~ /$FxParse/g)
	{
		if (defined $1)  # double quote "
		{
			$dquotes = !$dquotes;
		}
		next if ($dquotes);

		if (defined $2)  # 'function name'
		{
			# ------------------------------------
			# Placeholder for exclusions......
			# ------------------------------------

			# Cache the current function index and current closure
			  push  @Ndx, scalar(@$Funct);
			  push  @Closure, $closure;

			  my ($funcpos, $parampos) = ( $-[0], pos($$src) );

			# Get newlines since last function
			  $Lines += substr ($$src, $offset, $funcpos - $offset) =~ tr/\n//;
			  # print $Lines,"\n";

			# Save positions:   function(   parms     )
			  push  @$Funct  ,  [$funcpos, $parampos, 0, $Lines];

			# Assign new offset
			  $offset = $funcpos;
			# Closure is now 1 because of preamble '('
			  $closure = 1;
		}
		elsif (defined $3)  # '('
		{
			++$closure;
		}
		elsif (defined $4)  # ')'
		{
			--$closure;
			if ($closure <= 0)
			{
				$closure = 0;
				if (@Ndx)
				{
					# Pop index and closure, store position
					  $$Funct[pop @Ndx][2] = pos($$src);
					  $closure = pop @Closure;
				}
			}
		}
	}

	# To test an error, either take off the closure of a function in its source,
	# or force it this way (pseudo error, make sure you have data in @$Funct):
	# push @Ndx, 1;

	# It's an error if index stack has elements.
	# The last one reported is the likely culprit.
	if (@Ndx)
	{
		## BAD RETURN ...
		## All elements in stack have to be fixed up
		while (@Ndx) {
			my $func_index = shift @Ndx;
			my $ref = $$Funct[$func_index];
			$$ref[2] = $$ref[1];
			print STDERR "** Bad return, index = $func_index\n";
			print "** Error! Unclosed function [$func_index], line ".
			     $$ref[3].": '".substr ($$src, $$ref[0], $$ref[2] - $$ref[0] )."'\n";
		}
		return 0;
	}
	return 1;
}

__DATA__


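The ordering idea behind $FxParse (alternatives that only consume come first, flag captures come last) can be tried in isolation. The sketch below is a simplified stand-in, not the regex above: count_parens and $scan are hypothetical names, and it swallows a whole string literal in one alternative instead of flip-flopping a double-quote flag.

```perl
use strict;
use warnings;

# Order matters: comments and strings are consumed before a bare
# parenthesis ever gets a chance to match.
my $scan = qr{
      //[^\n]*            # line comment: consumed, not captured
    | /\*.*?\*/           # block comment: consumed, not captured
    | "(?:\\.|[^"\\])*"   # whole string literal: consumed, not captured
    | (\()                # capture 1: open parenthesis
    | (\))                # capture 2: close parenthesis
}xs;

# Count parentheses that are real code, ignoring comments and strings.
sub count_parens {
    my ($src) = @_;
    my ($open, $close) = (0, 0);
    while ($src =~ /$scan/g) {
        $open++  if defined $1;
        $close++ if defined $2;
    }
    return ($open, $close);
}

my $code = qq{f( ")" ); // (((\n/* ) */ g( (1) );\n};
my ($o, $c) = count_parens($code);
print "open=$o close=$c\n";
```

The parentheses inside the string, the line comment and the block comment never reach captures 1 and 2, so only the three real pairs are counted.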

------------------------------

Date: Sat, 21 Mar 2009 22:57:00 -0700 (PDT)
From: william <huxiankui@gmail.com>
Subject: Re: How to separate a big text file (say 400 news stories) to many  small text files?
Message-Id: <dd4138d5-7410-493b-b553-45836066bd0e@33g2000yqm.googlegroups.com>

Tad,

Thank you very much for providing a script for me!

I only modified one place to ignore anything before _ of _ DOCUMENTS.
It would be neat to have the date converted to the format you
mentioned. Since I'm not a Perl expert, I may have to do it later
using SAS. Below is the script I used. (I'm fascinated and overwhelmed
by Perl's capabilities. I still have many questions related to
Perl.)

I really appreciate your help.

William

#!/usr/bin/perl
use warnings;
use strict;
open IN,"dell" or die "could not open $!";
my($article, $num, $date);
$num=0;
while ( <IN> ) {
    if ( /(\d+) of \d+ DOCUMENTS/ ) {
        if ( $article && $num ) {  # output then empty the article buffer
            print "Output for file $num.\n";
            open my $ARTICLE, '>', "${date}_$num.txt" or die "could not open $!";
            print $ARTICLE $article;
            close $ARTICLE;
            $article = '';
        }
        $num = $1;
    }
    if ( /LOAD-DATE: (.*)/ ) {
        $date = $1;
        $date =~ tr/ ,/_/s;
        chop($date);
    }
    $article .= $_;
}

print "Output for the last file $num.\n";

if ( $article ) {
    open my $ARTICLE, '>', "${date}_$num.txt" or die "could not open $!";
    print $ARTICLE $article;
    close $ARTICLE;
}
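
The date conversion mentioned above (for example "December_5_2008" into "20081205") can also be done in Perl. The following is only a sketch of one way to do it: load_date_to_ymd and %month_num are hypothetical names, and it assumes full English month names as they appear in the LOAD-DATE lines.

```perl
use strict;
use warnings;

# Month name to number; assumes full English month names.
my %month_num = (
    January => 1, February => 2,  March    => 3,  April    => 4,
    May     => 5, June     => 6,  July     => 7,  August   => 8,
    September => 9, October => 10, November => 11, December => 12,
);

# Hypothetical helper: accepts "December 5, 2008" or "December_5_2008"
# and returns "20081205"; returns nothing if the text does not parse.
sub load_date_to_ymd {
    my ($text) = @_;
    my ($month, $day, $year) =
        $text =~ /^([A-Za-z]+)[_ ]+(\d{1,2})[_, ]+(\d{4})/
        or return;
    my $m = $month_num{$month} or return;
    return sprintf '%04d%02d%02d', $year, $m, $day;
}

print load_date_to_ymd('December_5_2008'), "\n";
print load_date_to_ymd('August 29, 2008'), "\n";
```

In the script above it could be called on $date right after the LOAD-DATE match, before the file name is built.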




------------------------------

Date: Sun, 22 Mar 2009 02:48:17 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: How to separate a big text file (say 400 news stories) to many small  text files?
Message-Id: <Xns9BD5E7FB8535Dasu1cornelledu@127.0.0.1>

william <huxiankui@gmail.com> wrote in news:5525dff8-f988-446b-9a96-
ee996e413fbf@v19g2000yqn.googlegroups.com:

> I have downloaded a big text file from Lexis-Nexis. This text file has
> 412 news stories. Each story begins with, for example, 1 of 412
> DOCUMENTS or 2 of 412 DOCUMENTS and close to the end it has LOAD-DATE:
> information.  I want to process this one big text file into 412
> separate text files and name them according to the LOAD-DATE, for
> example, DELL_20081205_1.txt or DELL_20081205_2.txt. Can anybody give
> me some hint on how to achieve this using Perl? The actual text file
> is attached below.

This group is for programmers to get help with *code*. In particular, 
this is not a place to get ready-made scripts to solve your specific 
problem.

You need to show what you have so far and what you need help with.

There are many ways of getting what you want. All follow the same basic 
structure: Find the start of a record, save all content up to the start 
of the next record and do this until there are no more records left.

There is a loop or two, a couple of regexes and very simple command line 
processing involved.

If you want someone to write this for you, you should post the job at 
http://jobs.perl.org/

Sinan

-- 
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/


------------------------------

Date: Sat, 21 Mar 2009 22:28:34 -0500
From: Tad J McClellan <tadmc@seesig.invalid>
Subject: Re: How to separate a big text file (say 400 news stories) to many small  text files?
Message-Id: <slrngsbc32.noc.tadmc@tadmc30.sbcglobal.net>

william <huxiankui@gmail.com> wrote:
> I have downloaded a big text file from Lexis-Nexis. This text file has
> 412 news stories. Each story begins with, for example, 1 of 412
> DOCUMENTS or 2 of 412 DOCUMENTS and close to the end it has LOAD-DATE:
> information.  I want to process this one big text file into 412
> separate text files and name them according to the LOAD-DATE, for
> example, DELL_20081205_1.txt or DELL_20081205_2.txt. Can anybody give
> me some hint on how to achieve this using Perl?


Read the file.

Buffer up an article, making note of the number and date when they go by.

When at end of article print it to a file.

Which of those are you having trouble with?


Have a fish.

When you come back asking how to convert "December_5_2008"
into "20081205" we will expect to see the code that you've
written so far.


 ------------------------------
#!/usr/bin/perl
use warnings;
use strict;

my($article, $num, $date);
while ( <DATA> ) {
    if ( /(\d+) of \d+ DOCUMENTS/ ) {
        if ( $article ) {  # output then empty the article buffer
            open my $ARTICLE, '>', "${date}_$num.txt" or die "could not open $!";
            print $ARTICLE $article;
            close $ARTICLE;
            $article = '';
        }
        $num = $1;
    }

    if ( /LOAD-DATE: (.*)/ ) {
        $date = $1;
        $date =~ tr/ ,/_/s;
    }

    $article .= $_;
}

if ( $article ) {
    open my $ARTICLE, '>', "${date}_$num.txt" or die "could not open $!";
    print $ARTICLE $article;
    close $ARTICLE;
}

__DATA__
                               1 of 412 DOCUMENTS

                       Austin American-Statesman (Texas)

                            December 5, 2008 Friday
                                 Final Edition

CENTRAL TEXAS DIGEST

BYLINE: FROM STAFF REPORTS

SECTION: BUSINESS; Pg. B08

LENGTH: 375 words


   COMPUTER MAKERS

   Dell stockholders ask for investigation

   Dell Inc. said that it received a shareholder-demand letter asking
the board
to investigate allegations that some current and former directors and
officers
imprudently invested and managed funds in the 401(k) plan.
 ......

LOAD-DATE: December 5, 2008

LANGUAGE: ENGLISH

PUBLICATION-TYPE: Newspaper


                               2 of 412 DOCUMENTS


                        Contra Costa Times (California)

                             August 29, 2008 Friday

Stocks on the move: International Paper, Magma Design, PetSmart

BYLINE: wire

SECTION: BUSINESS

LENGTH: 677 words


   By Fabio Alves

   Bloomberg News

   The following companies are having unusual price changes in U.S.
markets this
afternoon.

   Dell Inc. (DELL US) dropped 12 percent, the most since November, to
$22.08.
The world's second-largest personal-computer said the U.S. slump in
technology
spending has moved abroad. Second-quarter profit before one-time items
was 33
cents a share, short of the 36-cent average projection compiled by
Bloomberg.

LOAD-DATE: August 29, 2008

LANGUAGE: ENGLISH


 ------------------------------


-- 
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"


------------------------------

Date: Sat, 21 Mar 2009 19:02:09 -0700 (PDT)
From: william <huxiankui@gmail.com>
Subject: How to separate a big text file (say 400 news stories) to many small  text files?
Message-Id: <5525dff8-f988-446b-9a96-ee996e413fbf@v19g2000yqn.googlegroups.com>

I have downloaded a big text file from Lexis-Nexis. This text file has
412 news stories. Each story begins with, for example, 1 of 412
DOCUMENTS or 2 of 412 DOCUMENTS and close to the end it has LOAD-DATE:
information.  I want to process this one big text file into 412
separate text files and name them according to the LOAD-DATE, for
example, DELL_20081205_1.txt or DELL_20081205_2.txt. Can anybody give
me some hint on how to achieve this using Perl? The actual text file
is attached below.

Thank you very much.

William

                               1 of 412 DOCUMENTS

                       Austin American-Statesman (Texas)

                            December 5, 2008 Friday
                                 Final Edition

CENTRAL TEXAS DIGEST

BYLINE: FROM STAFF REPORTS

SECTION: BUSINESS; Pg. B08

LENGTH: 375 words


   COMPUTER MAKERS

   Dell stockholders ask for investigation

   Dell Inc. said that it received a shareholder-demand letter asking
the board
to investigate allegations that some current and former directors and
officers
imprudently invested and managed funds in the 401(k) plan.
 ......

LOAD-DATE: December 5, 2008

LANGUAGE: ENGLISH

PUBLICATION-TYPE: Newspaper


                               2 of 412 DOCUMENTS


                        Contra Costa Times (California)

                             August 29, 2008 Friday

Stocks on the move: International Paper, Magma Design, PetSmart

BYLINE: wire

SECTION: BUSINESS

LENGTH: 677 words


   By Fabio Alves

   Bloomberg News

   The following companies are having unusual price changes in U.S.
markets this
afternoon.

   Dell Inc. (DELL US) dropped 12 percent, the most since November, to
$22.08.
The world's second-largest personal-computer said the U.S. slump in
technology
spending has moved abroad. Second-quarter profit before one-time items
was 33
cents a share, short of the 36-cent average projection compiled by
Bloomberg.

LOAD-DATE: August 29, 2008

LANGUAGE: ENGLISH




------------------------------

Date: Sun, 22 Mar 2009 04:42:27 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Sun Mar 22 2009
Message-Id: <KGw52r.Mt8@zorch.sf-bay.org>

The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN).  You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.

API-ISPManager-0.0101
http://search.cpan.org/~nrg/API-ISPManager-0.0101/
----
Acme-TextLayout-0.01
http://search.cpan.org/~thecramps/Acme-TextLayout-0.01/
Layout things in a grid, as described textually 
----
Alien-WiX-v0.305120
http://search.cpan.org/~csjewell/Alien-WiX-v0.305120/
Installing and finding Windows Installer XML (WiX) 
----
Apache2-ASP-2.36
http://search.cpan.org/~johnd/Apache2-ASP-2.36/
ASP for Perl, reloaded. 
----
App-Toodledo-0.03
http://search.cpan.org/~pjs/App-Toodledo-0.03/
Interacting with the Toodledo task management service. 
----
Async-Hooks-0.03
http://search.cpan.org/~melo/Async-Hooks-0.03/
Hook system with asynchronous capabilities 
----
Async-Hooks-0.04
http://search.cpan.org/~melo/Async-Hooks-0.04/
Hook system with asynchronous capabilities 
----
CPANPLUS-Dist-Arch-0.02
http://search.cpan.org/~juster/CPANPLUS-Dist-Arch-0.02/
----
CPANPLUS-Dist-Arch-0.03
http://search.cpan.org/~juster/CPANPLUS-Dist-Arch-0.03/
CPANPLUS backend for building Archlinux pacman packages 
----
Catalyst-Log-Log4perl-1.03
http://search.cpan.org/~bobtfish/Catalyst-Log-Log4perl-1.03/
Log::Log4perl logging for Catalyst 
----
Class-Accessor-Grouped-0.08003
http://search.cpan.org/~claco/Class-Accessor-Grouped-0.08003/
Lets you build groups of accessors 
----
Data-Inspect-0.02
http://search.cpan.org/~owl/Data-Inspect-0.02/
human-readable object representations 
----
Devel-Loading-0.01
http://search.cpan.org/~sartak/Devel-Loading-0.01/
Run code before each module is loaded 
----
Foorum-1.000006
http://search.cpan.org/~fayland/Foorum-1.000006/
forum system based on Catalyst 
----
Foorum-1.000007
http://search.cpan.org/~fayland/Foorum-1.000007/
forum system based on Catalyst 
----
Geo-ReadGRIB-0.98
http://search.cpan.org/~frankcox/Geo-ReadGRIB-0.98/
Perl extension that gives read access to GRIB 1 "GRIdded Binary" format Weather data files. 
----
Graphics-Primitive-Driver-CairoPango-0.54
http://search.cpan.org/~gphat/Graphics-Primitive-Driver-CairoPango-0.54/
Cairo/Pango backend for Graphics::Primitive 
----
HTML-ExtractContent-0.06
http://search.cpan.org/~tarao/HTML-ExtractContent-0.06/
An HTML content extractor with scoring heuristics 
----
HTML-WikiConverter-0.68
http://search.cpan.org/~diberri/HTML-WikiConverter-0.68/
Convert HTML to wiki markup 
----
IO-Stream-Proxy-HTTPS-1.0.2
http://search.cpan.org/~powerman/IO-Stream-Proxy-HTTPS-1.0.2/
HTTPS proxy plugin for IO::Stream 
----
JSON-CPAN-Meta-4.000
http://search.cpan.org/~rjbs/JSON-CPAN-Meta-4.000/
JSON is YAML; emit JSON into META.yml 
----
JSON-CPAN-Meta-5.000
http://search.cpan.org/~rjbs/JSON-CPAN-Meta-5.000/
JSON is YAML; emit JSON into META.yml 
----
Lingua-LinkParser-1.13
http://search.cpan.org/~dbrian/Lingua-LinkParser-1.13/
Perl module implementing the Link Grammar Parser by Sleator, Temperley and Lafferty at CMU. 
----
Log-Simplest-1.0
http://search.cpan.org/~dmytro/Log-Simplest-1.0/
Simple log module. Writes log messages to file and/or STDERR. 
----
Loop-Control-0.01
http://search.cpan.org/~marcel/Loop-Control-0.01/
FIRST and NEXT functions for loops 
----
MP3-Podcast-0.06_1aa
http://search.cpan.org/~jmerelo/MP3-Podcast-0.06_1aa/
Perl extension for podcasting directories full of MP3 files 
----
Mobile-Devices-0.01
http://search.cpan.org/~jkutej/Mobile-Devices-0.01/
search for and get mobile device information 
----
Net-SSH2-0.19
http://search.cpan.org/~nrg/Net-SSH2-0.19/
Support for the SSH 2 protocol via libssh2. 
----
PAR-Packer-0.991
http://search.cpan.org/~smueller/PAR-Packer-0.991/
PAR Packager 
----
Pod-POM-0.24
http://search.cpan.org/~andrewf/Pod-POM-0.24/
POD Object Model 
----
Railsish-0.20
http://search.cpan.org/~gugod/Railsish-0.20/
A web application framework. 
----
SSH-Command-0.0.1
http://search.cpan.org/~nrg/SSH-Command-0.0.1/
----
Text-CSV-1.11
http://search.cpan.org/~makamaka/Text-CSV-1.11/
comma-separated values manipulator (using XS or PurePerl) 
----
TryCatch-1.000003
http://search.cpan.org/~ash/TryCatch-1.000003/
first class try catch semantics for Perl, without source filters. 
----
Util-Any-0.05
http://search.cpan.org/~ktat/Util-Any-0.05/
Export any utilities and To create your own Util::Any 
----
WWW-YahooJapan-KanaAddress-0.1.4
http://search.cpan.org/~hiratara/WWW-YahooJapan-KanaAddress-0.1.4/
translating the address in Japan into kana. 


If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.

This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
  http://www.stonehenge.com/merlyn/LinuxMag/col82.html

print "Just another Perl hacker," # the original

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion


------------------------------

Date: Sun, 22 Mar 2009 03:23:05 +0000 (UTC)
From: dkcombs@panix.com (David Combs)
Subject: Re: software design question
Message-Id: <gq4aup$66s$1@reader1.panix.com>

In article <x7k56yf97c.fsf@mail.sysarch.com>,
Uri Guttman  <uri@stemsystems.com> wrote:
>>>>>> "R" == Ruud  <rvtol+usenet@xs4all.nl> writes:
>
>  R> Uri Guttman wrote:
>  >> [printing late] but scalar refs
>  >> solves the passing around big strings. you still need at least one large
>  >> buffer for it though.
>
>  R> Or store the strings in an array (and pass around its ref) because
>
>  R>      print LIST
>
>arrays take up more storage than a large scalar. and i like to process
>whole files with regexes vs looping over an array of lines. it is faster
>and usually simpler too. line loops usually require some form of state
>(in a paragraph or not, in pod or not, etc.) but you can do the same by
>grabbing a whole section in a regex and looping with while. last week i
>was working with my intern group and someone wrote a basic pod extractor
>with looping over lines. it was longish (20 or so lines) and
>stateful. we replaced it with a single regex in a while loop working on
>the whole file in a string.
>
>uri


Uri, could you show a quick example of each, even using pseudocode --
just enough to make this important principle a bit more concrete?

Thanks!

David
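
To make the contrast concrete, here is a rough sketch of both styles applied to a toy pod extractor. It is not Uri's actual code: pod_by_lines and pod_by_regex are hypothetical names, and the pod rules are simplified (a block starts at a line beginning with "=" plus a letter and ends at "=cut" or end of input).

```perl
use strict;
use warnings;

# Sample input built line by line so this file contains no real pod.
my $source = join '', map { "$_\n" }
    'my $x = 1;', '', '=head1 NAME', '', 'demo', '', '=cut', '', 'print $x;';

# Style 1: loop over lines and carry state ($in_pod) between iterations.
sub pod_by_lines {
    my ($text) = @_;
    my @blocks;
    my $in_pod = 0;
    my $buf    = '';
    for my $line (split /^/, $text) {
        if (!$in_pod && $line =~ /^=[a-zA-Z]/) {
            $in_pod = 1;
            $buf    = '';
        }
        if ($in_pod) {
            $buf .= $line;
            if ($line =~ /^=cut\b/) {
                push @blocks, $buf;
                $in_pod = 0;
            }
        }
    }
    push @blocks, $buf if $in_pod;     # pod running to end of input
    return @blocks;
}

# Style 2: one regex in a while loop over the whole file in a string.
sub pod_by_regex {
    my ($text) = @_;
    my @blocks;
    while ($text =~ /^(=[a-zA-Z].*?(?:^=cut\b[^\n]*\n?|\z))/msg) {
        push @blocks, $1;
    }
    return @blocks;
}

my @a = pod_by_lines($source);
my @b = pod_by_regex($source);
print "line loop found ", scalar(@a), ", regex found ", scalar(@b), "\n";
```

The regex version needs no flag variable because the pattern itself spans from the opening directive to the closing one.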




------------------------------

Date: Sun, 22 Mar 2009 03:53:51 GMT
From: sln@netherlands.com
Subject: Re: software design question
Message-Id: <5gdbs49k7f2mp94v2dirbsehkomvbp86m6@4ax.com>

On Sun, 22 Mar 2009 03:23:05 +0000 (UTC), dkcombs@panix.com (David Combs) wrote:

>In article <x7k56yf97c.fsf@mail.sysarch.com>,
>Uri Guttman  <uri@stemsystems.com> wrote:
>>>>>>> "R" == Ruud  <rvtol+usenet@xs4all.nl> writes:
>>
>>  R> Uri Guttman wrote:
>>  >> [printing late] but scalar refs
>>  >> solves the passing around big strings. you still need at least one large
>>  >> buffer for it though.
>>
>>  R> Or store the strings in an array (and pass around its ref) because
>>
>>  R>      print LIST
>>
>>arrays take up more storage than a large scalar. and i like to process
>>whole files with regexes vs looping over an array of lines. it is faster
>>and usually simpler too. line loops usually require some form of state
>>(in a paragraph or not, in pod or not, etc.) but you can do the same by
>>grabbing a whole section in a regex and looping with while. last week i
>>was working with my intern group and someone wrote a basic pod extractor
>>with looping over lines. it was longish (20 or so lines) and
>>stateful. we replaced it with a single regex in a while loop working on
>>the whole file in a string.
>>
>>uri
>
>
>Uri, could you show a quick example of each, even using pseudocode --
>just enough to make this important principle a bit more concrete?
>
>Thanks!
>
>David
>

OMG another important principle to massage ego..
gimme a break man.

-sln


------------------------------

Date: Sat, 21 Mar 2009 13:11:42 -0700 (PDT)
From: "A. Farber" <Alexander.Farber@gmail.com>
Subject: Re: String parsed wrong by perl
Message-Id: <778797a9-cf0b-4512-b44f-627a162bf125@t3g2000yqa.googlegroups.com>

On 21 Mar., 18:09, Tad J McClellan <ta...@seesig.invalid> wrote:
> Finding a needle next to a stalk of hay is easier than finding
> a needle in a haystack.  :-)
>
> printf makes it easy to see the variables that are used in the statement:
>
>     printf '[color=#FF0000:%s]%s[/color:%s]',
>            $self->{BBCODE}, $card->{HTML}, $self->{BBCODE};
>

:-) Ok, maybe sprintf is better

Thanks


------------------------------

Date: Sat, 21 Mar 2009 20:19:33 GMT
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: String parsed wrong by perl
Message-Id: <slrngsaiul.75q.nospam-abuse@chorin.math.berkeley.edu>

On 2009-03-21, A. Farber <Alexander.Farber@gmail.com> wrote:
> I'm just still a bit surprised, that perl tries
> to parse a regex from inside of a string...

It is not "inside of string".  Some parts of double-quoted regions are
STRINGS, some are CODE.  E.g.

  >perl -wle "qq(abc) =~ /(.)(.)(.)/; print qq(${1+2})"
  c 

Likewise for $a[CODE], $a{CODE}, and @a[CODE], @a{CODE}; modify for ->
accordingly...  One can avoid this by backwacking $, {, [, or -.

Hope this helps,
Ilya
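
A small sketch of the point about code inside double-quoted strings, under the assumption that the variable names and values here are purely illustrative. It shows a subscript evaluated as code, the same text kept literal with a backslash, and the sprintf form from earlier in the thread, which keeps the variables out of the string entirely.

```perl
use strict;
use warnings;

my @n = (10, 20, 30);
my $n = 'scalar';
my $i = 1;

my $as_code = "got $n[$i + 1]";      # subscript is evaluated as code
my $as_text = "got $n\[$i + 1]";     # backslashed [ stays literal text
my $inline  = "sum: @{[ 1 + 2 ]}";   # arbitrary expression inside a string

# Hypothetical values standing in for $self->{BBCODE} and $card->{HTML}.
my ($bbcode, $html) = ('abc', '<b>K</b>');
my $tag = sprintf '[color=#FF0000:%s]%s[/color:%s]', $bbcode, $html, $bbcode;

print "$_\n" for $as_code, $as_text, $inline, $tag;
```

With sprintf and a single-quoted format, nothing after a variable can be mistaken for a subscript.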


------------------------------

Date: Sun, 22 Mar 2009 01:11:02 GMT
From: sln@netherlands.com
Subject: Re: String parsed wrong by perl
Message-Id: <ko3bs4d202312frh3q74t62j3jvd93gj8c@4ax.com>

On Sat, 21 Mar 2009 20:19:33 GMT, Ilya Zakharevich <nospam-abuse@ilyaz.org> wrote:

>On 2009-03-21, A. Farber <Alexander.Farber@gmail.com> wrote:
>> I'm just still a bit surprised, that perl tries
>> to parse a regex from inside of a string...
>
>It is not "inside of string".  Some parts of double-quoted regions are
>STRINGS, some are CODE.  E.g.
>
>  >perl -wle "qq(abc) =~ /(.)(.)(.)/; print qq(${1+2})"
>  c 
>
>Likewise for $a[CODE], $a{CODE}, and @a[CODE], @a{CODE}; modify for ->
>accordingly...  One can avoid this by backwacking $, {, [, or -.
                                       ^^^^^^^^^^^
>
>Hope this helps,
>Ilya

Whoa, now there's a technical term "backwacking".
What the hell does that mean?

Is that like itchy finger hoosamacalla?

Unfortunately, I've seen this term before, but I would never, ever give
credence to its existence as a legitimate phrase used to technically describe
anything, nor validate its legitimacy.

-sln


------------------------------

Date: Sat, 21 Mar 2009 20:25:31 GMT
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: use warnings unless $foo
Message-Id: <slrngsaj9r.75q.nospam-abuse@chorin.math.berkeley.edu>

On 2009-03-21, Jürgen Exner <jurgenex@hotmail.com> wrote:
>>>use warnings if $foo;

>>>Is there a name for those keywords which do not form a normal
>>>expression? I.e. where I can not append a statement modifier.

> Misleading way of thinking. The problem is NOT that you couldn't
> use statement modifiers. The problem is that that BEGIN/END enforces
> execution at compile time, i.e. before perl had any chance to look at
> the condition 'if $foo'.

This is not a fully kosher explanation.  Note that arguments to `use'
ARE executed at compile time.  Why not if/unless conditions?

Note that this bug in Perl has a fix - for about a decade.  It is just
that some %@$#$* rejected my patch which fixed this bug.

Thus I needed to create this `use if ...' ugliness...

Hope this helps,
Ilya
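
For readers who have not met it, this is roughly how the `use if` form looks in practice. It is a minimal sketch: $enable is a hypothetical flag, and the important assumption is that it must already have its value at compile time, hence the BEGIN block.

```perl
use strict;

my $enable;
BEGIN { $enable = 1 }            # assumption: decided at compile time

use if $enable, 'warnings';      # loads warnings only when $enable is true

my @caught;
{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @caught, $_[0] };
    my $undef;
    my $joined = 'x' . $undef;   # warns only if warnings are enabled
}
print 'warnings caught: ', scalar(@caught), "\n";
```

Setting $enable to a false value in the BEGIN block leaves warnings off for the rest of the file, and the concatenation then passes silently.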


------------------------------

Date: Sun, 22 Mar 2009 01:19:06 GMT
From: sln@netherlands.com
Subject: Re: use warnings unless $foo
Message-Id: <8e4bs49s5iei6juibiohc3dcnallo2ernh@4ax.com>

On Sat, 21 Mar 2009 10:30:50 -0500, Tad J McClellan <tadmc@seesig.invalid> wrote:

>Florian Kaufmann <sensorflo@gmail.com> wrote:
>
>
>Please choose one posting address, and stick to it.
>
>
>> It's not that I currently wan't to do such sort of thing
>
>
>It's not that you currently wa not to do such sort of thing?

I just wan't to say Tad, you should learn English because I do'nt
think yo've been to school lately.

-sln


------------------------------

Date: Sun, 22 Mar 2009 01:23:26 GMT
From: sln@netherlands.com
Subject: Re: use warnings unless $foo
Message-Id: <ji4bs4pvrlunqielffs9mn8tcuihavlsm8@4ax.com>

On Fri, 20 Mar 2009 13:08:10 -0700 (PDT), Florian Kaufmann <sensorflo@gmail.com> wrote:

>use, no, package are special in that I can not say
>
>use warnings if $foo;
>
>Is there a name for those keywords which do not form a normal
>expression? I.e. where I can not append a statement modifier.
>
>Where in the perldoc do I find a complete list of those keywords?
>
>Thank you
>
>Flo

Yeah, aren't some use's pragmata though? If not then all pragmata are
packages then. How bizarre.. Never get what you wan't huh. Not out of Perl.
If you did, there would be no air time for the hackers with no jobs here.

-sln


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 2289
***************************************

