[10842] in Perl-Users-Digest


Perl-Users Digest, Issue: 4443 Volume: 8

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Wed Dec 16 22:07:24 1998

Date: Wed, 16 Dec 98 19:00:19 -0800
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Wed, 16 Dec 1998     Volume: 8 Number: 4443

Today's topics:
    Re: @INC   and perl <zenin@bawdycaste.org>
    Re: @INC   and perl <tasburfoot@yahoo.com>
        ASP <donfykes@airnet.net>
        barcodes and perl <melton@diagdata.com>
        counterintuitive behavior of "shift" (Dragomir R. Radev)
    Re: counterintuitive behavior of "shift" (Sam Holden)
        CPAN package of top 10 Modules (John)
    Re: Disk Free <metcher@spider.herston.uq.edu.au>
        Encryption/Decryption of strings (Steven Edwards)
    Re: Hashref Compatibility with Perl 4.0 -- TIA (Martien Verbruggen)
    Re: how to add path to @INC permanently (Martien Verbruggen)
    Re: making html change (Tad McClellan)
    Re: Need some speed tips on this script.. <uri@ibnets.com>
        perlcc on Win32 link error <the.irwins@worldnet.att.net>
    Re: Pleas help with outputting messages to an html page imchat@ionet.net
    Re: Pleas help with outputting messages to an html page (Clay Irving)
    Re: problem with grep and readdir for subdirectories (Martien Verbruggen)
    Re: querying/updating Access db in WinPerl? <metcher@spider.herston.uq.edu.au>
    Re: school me on RE please (David Formosa)
        Searching through a 10MB file (Christian M. Aranda)
    Re: Searching through a 10MB file <uri@ibnets.com>
    Re: Searching through a 10MB file (Christian M. Aranda)
    Re: Searching through a 10MB file (Larry Rosler)
    Re: Why doesn't this work? <rick.delaney@home.com>
    Re: Writing Perl with Notepad (B. Mann)
        Special: Digest Administrivia (Last modified: 12 Dec 98) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 17 Dec 1998 02:20:51 GMT
From: Zenin <zenin@bawdycaste.org>
Subject: Re: @INC   and perl
Message-Id: <913861376.269982@thrush.omix.com>

Kevin R. Price <kprice@iocenter.net> wrote:
: I am trying to get PERL to run.  I have perl 5.005 01 installed on an
: AIX 4.2.1.0 box.

	Was this a package install, or did you build it from the source?

: When I run perl test2.pl  I get:
: Can't locate IO/Socket.pm in @INC (@INC contains:
: /usr/local/lib/perl5/aix/5.00463 /usr/local/lib/perl5
: /usr/local/lib/perl5/site_perl/aix /usr/local/lib/perl5/site_perl .)
: at UPSship.pm line 7.
:
: Where or what is the @INC file, and how can I edit it to put in the
: correct path to where those library modules can be found?

	@INC is not a file.  It's a Perl array of search paths for module
	lookup.

	You're running 5.005, but have a 5.004 in your @INC.  That, along
	with the fact that IO::Socket is a core module in 5.005 makes me
	think your perl is broken (ie, installed incorrectly or
        incompletely).

	If you used a package to install perl, you must use whatever default
	prefix the package wants to use, unless you're prepared to do some
	heavy hacking.  Using a different prefix will likely break your perl.

	When at all possible, always install from source.  This goes for
	pretty much anything on Unix.
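
A minimal sketch of what "@INC is an array" means in practice. The two
directory names are hypothetical placeholders, not paths from the post;
IO::Socket is used only because it is the module named in the error
message.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical private module directory; substitute a real path.
# "use lib" takes effect at compile time, before later "use" lines run.
use lib '/home/me/perl/lib';

# A plain push onto @INC happens at run time, which is too late for "use"
# but early enough for a "require" executed afterwards.
push @INC, '/home/me/perl/other';

# Show the search path in the order perl will try it.
print "$_\n" for @INC;

# %INC maps each loaded file to the place it was found.
require IO::Socket;
print "IO::Socket came from $INC{'IO/Socket.pm'}\n";
```

If the last line reports a directory that belongs to a different perl
version than the one you are running, that would be consistent with a
mixed or incomplete install.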

-- 
-Zenin (zenin@archive.rhps.org)           From The Blue Camel we learn:
BSD:  A psychoactive drug, popular in the 80s, probably developed at UC
Berkeley or thereabouts.  Similar in many ways to the prescription-only
medication called "System V", but infinitely more useful. (Or, at least,
more fun.)  The full chemical name is "Berkeley Standard Distribution".


------------------------------

Date: Wed, 16 Dec 1998 18:45:42 -0800
From: "Joe" <tasburfoot@yahoo.com>
Subject: Re: @INC   and perl
Message-Id: <vf_d2.14$XY6.836@news.san.rr.com>

Here is a question along the same note.  I'd like to use a module that is
NOT in the path.  The ISP has not been helpful in installing anything else.
Is there a way that i can add to the path long enough for my script to work.
Maybe ->

push (@INC, "/location");

Also, I was looking at the module itself and I don't understand how to
create the module.  The module is the date::time one.  It mentions a lot
about using C to compile it.  Any references or information on this would
be great.

Joe





------------------------------

Date: Wed, 16 Dec 1998 20:22:38 -0600
From: "Don Fykes" <donfykes@airnet.net>
Subject: ASP
Message-Id: <9EURqNWK#GA.129@samson.airnet.net>

I need to have a perl script fire an ASP.

open(MAIL,"|$mailprog -t");

where $mailprog would be mail.asp

Can this be done?
thanks
don




------------------------------

Date: Thu, 17 Dec 1998 01:48:04 GMT
From: A Melton <melton@diagdata.com>
Subject: barcodes and perl
Message-Id: <367860D5.781C0710@diagdata.com>

any way to take flat ascii files,
1 line each and convert them to
UPC bar codes with perl.

Alan Melton


------------------------------

Date: 16 Dec 1998 20:53:59 -0500
From: radev@news.cs.columbia.edu (Dragomir R. Radev)
Subject: counterintuitive behavior of "shift"
Message-Id: <759o7n$2f5@disco.cs.columbia.edu>

----------------------------------------------------------------------------
bla.pl:

#!/bin/perl

$par1 = shift || 1;
$par2 = shift || 2;

print "$par1 $par2\n";
----------------------------------------------------------------------------

% bla.pl 5 6
% 5 6
% bla.pl 0 6
% 1 6             instead of "0 6"

I find this logical from Perl's point of view, yet very
counterintuitive and confusing.


Drago


-- 
Dragomir R. Radev                     http://www.cs.columbia.edu/~radev
Natural Language Processing Group     Columbia University CS Department
Home: 212-749-9770                   Office: 914-784-7899, 212-939-7121


------------------------------

Date: 17 Dec 1998 02:30:21 GMT
From: sholden@hons.cs.usyd.edu.au (Sam Holden)
Subject: Re: counterintuitive behavior of "shift"
Message-Id: <slrn77gr5t.ddh.sholden@hons.cs.usyd.edu.au>

On 16 Dec 1998 20:53:59 -0500, Dragomir R. Radev <radev@news.cs.columbia.edu>
	wrote:
>----------------------------------------------------------------------------
>bla.pl:
>
>#!/bin/perl
>
>$par1 = shift || 1;
>$par2 = shift || 2;
>
>print "$par1 $par2\n";
>----------------------------------------------------------------------------
>
>% bla.pl 5 6
>% 5 6
>% bla.pl 0 6
>% 1 6             instead of "0 6"
>
>I find this logical from Perl's point of view, yet very
>counterintuitive and confusing.

The fact that 0 is regarded as false should not be counterintuitive, unless
you are used to programming in shell of course... ;)

Try defined() if you want to know if something exists.

Don't use boolean tests when you are really testing if something is defined,
you will get burnt.
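
A small sketch of the defined() approach applied to the script in
question. The helper name arg_or_default is made up for illustration,
and the sample values (0, 6) are used only when no arguments are given.

```perl
#!/usr/bin/perl -w
use strict;

# Return the first element of the argument list, or the default when
# there is no such element.  Only undef triggers the default, so a
# literal 0 (or an empty string) survives.
sub arg_or_default {
    my ($args, $default) = @_;
    my $value = shift @$args;
    return defined $value ? $value : $default;
}

my @args = @ARGV ? @ARGV : (0, 6);   # sample input when run without arguments
my $par1 = arg_or_default(\@args, 1);
my $par2 = arg_or_default(\@args, 2);
print "$par1 $par2\n";
```

On Perl 5.10 and later, the defined-or operator says the same thing
more compactly: shift(@ARGV) // 1.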

-- 
Sam

It has been discovered that C++ provides a remarkable facility for
concealing the trivial details of a program--such as where its bugs are.
	--David Keppel


------------------------------

Date: 16 Dec 1998 22:55:16 GMT
From: falstaff@lennon.postino.com (John)
Subject: CPAN package of top 10 Modules
Message-Id: <759dok$102$1@lennon.postino.com>

X-Newsreader: TIN [version 1.2 PL2]

Is there a single package available that contains the ten most used
CPAN modules? I just went through the pain of installing the
libwww-perl module, which required me to get the URI module, and
that required the MIME::Base64 module, which required MailTools,
which needed Net::Domain.

It would be really nice if I could just download one huge file
and be done with it.

-- 
John



------------------------------

Date: Thu, 17 Dec 1998 11:50:08 +1000
From: Jaime Metcher <metcher@spider.herston.uq.edu.au>
Subject: Re: Disk Free
Message-Id: <36786350.E4F93905@spider.herston.uq.edu.au>

God!  Did I really post that?  This is what it should be:

# This time actually cut and pasted from a working script,
# not just read off the console next to me.
print (`cmd /c dir c:` =~ /([\d,]+) bytes free/);
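
For readers without an NT box at hand, here is a sketch of why context
matters in that line. The sample string stands in for the output of
"dir" and is made up. One common way to end up with a bare 1 is to
evaluate the match in scalar context; whether that was the cause in the
earlier post is not shown here.

```perl
#!/usr/bin/perl -w
use strict;

# Made-up stand-in for the last line of "cmd /c dir c:" output.
my $dir_output = "               3 Dir(s)   1,234,567,890 bytes free\n";

# List context: the match returns the captured text.
my ($free) = $dir_output =~ /([\d,]+) bytes free/;

# Scalar context: the match returns only true or false.
my $matched = $dir_output =~ /([\d,]+) bytes free/;

# Drop the thousands separators to get a plain number.
(my $bytes = $free) =~ tr/,//d;

print "free: $free ($bytes bytes), scalar result: $matched\n";
```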

-- 
Jaime Metcher

Pep Mico wrote:
> 
> Hi Jaime,
> Thanks for your suggestion but,
> I'm using this perl release :5.005_02-Binary build 507 for windows NT,
> 
> And when I executed your script line it just outputs number "1".
> 
> Regards
> pep_mico@hp.com
>


------------------------------

Date: Wed, 16 Dec 1998 20:46:08 GMT
From: rand@mindless.com (Steven Edwards)
Subject: Encryption/Decryption of strings
Message-Id: <36781b4a.22314516@news.cchat.com>

Hi,

I was wondering if there were any good routines available to decrypt
an encrypted string. I can encrypt the string using the crypt
function, but I'm trying to find a way to decrypt a string based on a
password.

Suggestions? :)

Steven Edwards - rand@mindless.com
Spinal Confusion - http://come.to/SpinalConfusion


------------------------------

Date: Thu, 17 Dec 1998 02:01:44 GMT
From: mgjv@comdyn.com.au (Martien Verbruggen)
Subject: Re: Hashref Compatibility with Perl 4.0 -- TIA
Message-Id: <cCZd2.427$595.570@nsw.nnrp.telstra.net>

In article <757m8l$qcb@slip.net>,
	emclean@slip.net (Emmett McLean) writes:

>>you may say that, but I cannot think of one single reason that you
>>cannot upgrade to perl 5. I can think of many many reasons why you
>>should.
> 
>  Well for one, he might not be the sys admin on the machine.

so?

>>Use a comma (,) instead.
> 
>  Someone already suggested that.

so? What makes you think that I knew that?

>  Some people just like to grump.

And you seem to be the self-appointed grump in here.

*shrug*. As soon as I reload this newsgroup, I'll be rid of you.

Martien
-- 
Martien Verbruggen                  | 
Webmaster www.tradingpost.com.au    | I'm desperately trying to figure out
Commercial Dynamics Pty. Ltd.       | why kamikaze pilots wore helmets - Dave
NSW, Australia                      | Edison 


------------------------------

Date: Thu, 17 Dec 1998 02:09:12 GMT
From: mgjv@comdyn.com.au (Martien Verbruggen)
Subject: Re: how to add path to @INC permanently
Message-Id: <cJZd2.430$595.570@nsw.nnrp.telstra.net>

In article <7592mt$76$1@nnrp1.dejanews.com>,
	otis@my-dejanews.com writes:
> Hello,
> 
> Is there a way to add a certain directory path to @INC permanently?
> 
> After installing perl 5.005_2 over 5.004_4 @INC seems to have changed on my
> system and /usr/lib/perl5/site_perl is no longer a part of it (instead
> /usr/lib/perl5/site_perl/5.005 is).
> 
> How do I add my old /usr/lib/perl5/site_perl to @INC permanently?
> perl -V gives this:

You shouldn't do that. Any modules that are compiled will be
incompatible between the 5.004 and 5.005 versions. You just need to
reinstall the modules you need for 5.005_02. Most of the 'pure perl'
modules will probably still work, but there is a good reason that this
split was put there.

If you really do want to change it anyway, you can edit Config.pm in,
I believe

$PERLLIBDIR/5.00502/PLATFORM/Config.pm

Martien
-- 
Martien Verbruggen                  | 
Webmaster www.tradingpost.com.au    | I think there is a world market for
Commercial Dynamics Pty. Ltd.       | maybe five computers. --Thomas Watson,
NSW, Australia                      | chairman IBM, 1943


------------------------------

Date: Wed, 16 Dec 1998 19:50:27 -0600
From: tadmc@metronet.com (Tad McClellan)
Subject: Re: making html change
Message-Id: <31o957.9ak.ln@magna.metronet.com>

Kevin Johnson (kevin@utig2.ig.utexas.edu) wrote:
: Tad McClellan (tadmc@metronet.com) wrote:
: : Kevin Johnson (kevin@utig.ig.utexas.edu) wrote:

: : : What I have is a directory full of html files. I would like to search
: : : for the names of
: : : these files in a html document I specify. When I find a match, I would
: : : like to change
: : : the text to an href pointing to the html file it corresponds to.

: :    I'm missing something simple here I'm sure.

: :    What problem are you having?


: Ok. Maybe I didn't give enough information. 

: I have a directory. Call it /files, that has a bunch of files
: that look like 100.htm, 101.htm, ....

: I have an htm file that has text inside of it that looks like

: 100  --  This file is the first file
: 101  --  This file is the second file, etc. 

: I want to actually link the "100" to the files /files/100.htm
: I want to make the html change to do that. 

: How I am trying to do that: for every line of my htm file, I search
: for each of the file names from /files after clipping off the ".htm"
: from each name. If the search matches, I replace the 100 with the
: href.


   Since you have a s///g, you then replace the 100 part of the 100.htm
   that you just put in with 100.htm, yielding 100.htm.htm.

   Then you replace the 100 part again:  100.htm.htm.htm

   It is becoming obvious that something is Not Right.

   If you might have more than one "filename fragment" on a line,
   then you have a problem...

   Otherwise lose the g option.



: This is all accomplished by the 

: s/$files[$i]/$href/g


   I see that you did stop breaking your promise.

   That's good.

   I guess it didn't fix the problem though, huh?


: What appears to be happening though is that there is never a match,


   So then you added print statements to see what was in @files and $_

   What did you see?


: which leads me to think that the $files[$i] is not actually being
: expanded correctly inside the / /'s. I couldn't find anything in 
: the camel book to indicate how or if this is done. 



I note that now you give filenames like 100.htm, but in your code
you say:        $files[$i] = substr($files[$i],0,9);

That's not going to work too well unless you have 9 characters
before the '.htm' part...

If you want to remove the '.htm' at the end of the string, then
just say so:         s/\.htm$//;



You really ought to take more care in formatting your code.

Whitespace is cheap, so use it to help humans understand your code.

You should only use double quotes when you _need_ them. You have
a bunch of extraneous ones.


You are also doing C programming with Perl  ;-)

Does this come close to doing it for you?

----------------------------
#!/usr/bin/perl -w

$ifile = 'foo.htm';

### get a list of filename fragments
opendir(THIS, '.') || die "Can't opendir $!";
@files = grep(/\.htm$/, readdir(THIS));   # keep only names ending in .htm
closedir(THIS);

foreach (@files) {
  s/\.htm$//;        # remove the filename extension
}


open(HTML, $ifile) || die "Can't open $ifile  $!";
open(OUT, ">$ifile.bak") || die "Can't open $ifile.bak  $!";

while (<HTML>) {
 foreach $fname (@files) {
   $href = qq(<a href="../patches/$fname.htm">$fname</a>);
   s/\Q$fname\E/$href/;   # \Q...\E keeps metacharacters in the name literal
   }
 print OUT;
}

close(OUT);
close(HTML);

rename($ifile, "$ifile.old") || die "could not rename '$ifile'  $!";
rename("$ifile.bak", $ifile) || die "could not rename '$ifile.bak'  $!";
----------------------------
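
A possible single-pass variant of the inner loop, sketched under
assumptions: the fragment list is hard-coded here instead of coming from
readdir, and link_line is a made-up helper name. Because all names go
into one alternation, text inserted by the substitution is never scanned
again, so the 100.htm.htm problem cannot occur even with the g option.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical fragment list; in the script above it comes from readdir.
my @files = qw(100 101 1000);

# Build one alternation, longest names first so 1000 is tried before 100.
my $alt = join '|',
          map  { quotemeta($_) }
          sort { length($b) <=> length($a) } @files;

sub link_line {
    my ($line) = @_;
    # Each match is replaced exactly once; inserted text is not rescanned.
    $line =~ s{\b($alt)\b}{<a href="../patches/$1.htm">$1</a>}g;
    return $line;
}

print link_line("100  --  This file is the first file\n");
print link_line("101 and 1000 on one line\n");
```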


--
    Tad McClellan                          SGML Consulting
    tadmc@metronet.com                     Perl programming
    Fort Worth, Texas


------------------------------

Date: 16 Dec 1998 19:06:48 -0500
From: Uri Guttman <uri@ibnets.com>
To: lr@hpl.hp.com (Larry Rosler)
Subject: Re: Need some speed tips on this script..
Message-Id: <391zlz638n.fsf@ibnets.com>

>>>>> "LR" == Larry Rosler <lr@hpl.hp.com> writes:

  LR> In article <x7k8zspzcc.fsf@sysarch.com> on 15 Dec 1998 21:58:27 -0500, 
  LR> Uri Guttman <uri@sysarch.com> says...

  >> join is one of the fastest complex perl funcs and is underused by many
  >> folks.

hey thanx for proving me right.

  LR>       For0: 13 secs ( 7.61 usr  0.15 sys =  7.76 cpu)
  LR>       For1: 16 secs ( 7.37 usr  0.15 sys =  7.52 cpu)
  LR>       Join:  4 secs ( 2.13 usr  0.10 sys =  2.23 cpu)
  LR>       Map0: 19 secs ( 9.16 usr  0.17 sys =  9.33 cpu)
  LR>       Map1: 26 secs (13.13 usr  0.23 sys = 13.36 cpu)
  LR>       Seps:  8 secs ( 3.65 usr  0.12 sys =  3.77 cpu)

  LR> That is an eyeopener.  The single print of a string of >10K characters 
  LR> is the winner, *big* over multiple print arguments produced by copying 
  LR> $_ and "\n" into a string, also over multiple print arguments separated 
  LR> by $,.  Not localizing the global variables in the last case makes it 
  LR> somewhat faster (3.27 sec), but not enough to change the conclusions.

it may be because print probably calls the low level C printf (and
family) routines for each argument value (including each element of an
array) it sees. that would make sense since it has to be smart about
conversions, etc. while join is one low level c routine which just
appends strings to a buffer (and remallocing when needed). it then makes
only one call to print.

the motto is "memory management is faster than formatting".

though i wouldn't go so far as to put a join in front of every print, i
do use it when there are arrays. for speed i should use it if there are
many arguments.
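
anyone who wants to check this on their own perl can start from a sketch
like the one below. it is not the benchmark quoted above: the data, the
iteration count and the two sub names are made up, and it only compares
one print per element against a single print of a joined string.

```perl
#!/usr/bin/perl -w
use strict;
use Benchmark qw(timethese);
use File::Spec;

my @lines = map { "line $_" } 1 .. 1000;   # made-up test data

# One print call per element, each with its own newline.
sub print_each {
    my ($fh) = @_;
    print {$fh} $_, "\n" for @lines;
}

# Build a single string with join and hand it to print once.
sub print_joined {
    my ($fh) = @_;
    print {$fh} join("\n", @lines), "\n";
}

# Write to the null device so disk speed stays out of the picture.
open(my $null, '>', File::Spec->devnull) or die "Can't open null device: $!";
timethese(2000, {
    Each => sub { print_each($null) },
    Join => sub { print_joined($null) },
});
close($null);
```

both subs produce the same bytes, so any difference in the timings comes
from how the output is assembled and handed to print.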

now every one say this together: i like join! we like join! you like
join!

  LR> Thanks for puncturing what I thought was a good technique!

they don't call me a prick for nothing! :-)

uri

-- 
Uri Guttman                             Hacking Perl for Ironbridge Networks
uri@sysarch.com				uri@ironbridgenetworks.com	


------------------------------

Date: 17 Dec 1998 00:37:06 GMT
From: "Scott Irwin" <the.irwins@worldnet.att.net>
Subject: perlcc on Win32 link error
Message-Id: <01be2955$bfc573a0$5a0c4a0c@sirwin.ati.att.com>

I have built Perl5.005_02 on my NT box with VC++ 6.0.

My print "Hello World\n"; compiles, links and executes fine.

Now I need to compile a somewhat larger program.  The link step fails when
it can't resolve _runops().  It appears that many references to variations
of _runops appear in the source code.  I executed the nmake -f Makefile
test script and all passed except one POSIX test (no big deal).

It was late and I never tried putting the -lperl again on the file list. 
I've done compiles before on Unix where that actually resolved the problem.
 I'll give it a shot Thursday, but I'm not holding my breath.

Thanks in advance.....

Scott Irwin


------------------------------

Date: Thu, 17 Dec 1998 01:21:49 GMT
From: imchat@ionet.net
Subject: Re: Pleas help with outputting messages to an html page while the user   waits.
Message-Id: <36785bfa.1118301923@news.ionet.net>

	Just popup a small window using Javascript by adding the
onClick command to the submit button field.
<input type=submit value=submit onClick="javascript:openWin()">

On Wed, 16 Dec 1998 16:58:26 -0500, Rama Murthy <rmurthy@plexstar.com>
wrote:

>Hi,
>     I am using PERL/CGI script to generate static HTML page after the
>user submits the form. While the user is waiting I would like to display
>messages like "please wait it is procesing etc." . How can do that in
>PERL/CGI script probably using JavaScript.
>
>Thanks
>Rama
>
>
>
>



------------------------------

Date: 16 Dec 1998 21:06:18 -0500
From: clay@panix.com (Clay Irving)
Subject: Re: Pleas help with outputting messages to an html page while the user waits.
Message-Id: <759ouq$rab@panix.com>

In <36782D02.9B24E0DA@plexstar.com> Rama Murthy <rmurthy@plexstar.com> writes:

>     I am using PERL/CGI script to generate static HTML page after the
>user submits the form. While the user is waiting I would like to display
>messages like "please wait it is procesing etc." . How can do that in
>PERL/CGI script probably using JavaScript.

If you are using CGI.pm:

    print "please wait it is processing etc.";

-- 
Clay Irving
clay@panix.com


------------------------------

Date: Thu, 17 Dec 1998 01:59:48 GMT
From: mgjv@comdyn.com.au (Martien Verbruggen)
Subject: Re: problem with grep and readdir for subdirectories
Message-Id: <oAZd2.426$595.570@nsw.nnrp.telstra.net>

In article <757m11$q5q@slip.net>,
	emclean@slip.net (Emmett McLean) writes:
> In article <xbEd2.99$595.86@nsw.nnrp.telstra.net>,
> Martien Verbruggen <mgjv@comdyn.com.au> wrote:
>>
>>Please, please, please, always consult the documentation first if you
>>have a problem with a certain function.
>>
>  As is typical in this group, don't give any answers and
>  wine that the answer is in the documentation.

Huh? And who the * are you anyway?

If someone asks a question that has an answer in the documentation, I
can either:

1) Ignore it. No one gains a thing

2) Quote from the documentation. The poster will come back next time,
   because they don't know where to find answers.

3) Give a pointer to the relevant documentation. The poster will know
next time where to find it.

4) A combination of 2) and 3)

5) Just whine.

I prefer to use option 4, and have done so for the last few years on
this group. You are the first one to complain about that approach.

If an answer is in the documentation, I will point people to the
documentation. I will also include a snippet of the documentation that
is relevant to the question.

And there is nothing you can say to change my mind about it.

Emmett, meet my killfile. Killfile, meet ....*plonk*

Martien
-- 
Martien Verbruggen                  | 
Webmaster www.tradingpost.com.au    | The world is complex; sendmail.cf
Commercial Dynamics Pty. Ltd.       | reflects this.
NSW, Australia                      | 


------------------------------

Date: Thu, 17 Dec 1998 11:26:56 +1000
From: Jaime Metcher <metcher@spider.herston.uq.edu.au>
Subject: Re: querying/updating Access db in WinPerl?
Message-Id: <36785DE0.AD52E418@spider.herston.uq.edu.au>

There are two options:

1. DBI and DBD::ODBC
2. Win32::ODBC

I use DBI, but probably your first step is to see what's already
installed with perl.

-- 
Jaime Metcher

David Askov wrote:
> 
> [ Article crossposted from comp.lang.perl ]
> [ Author was David Askov ]
> [ Posted on Wed, 16 Dec 1998 22:26:16 GMT ]
> 
> Hi,
> 
> I'd like to be able to query, insert and update records in an MS Access
> database using the Win32 version of PERL. I'm working on a Win95 system.
> I'm trying to write a program that eventually could maintain an Access
> database via a web interface, without having Access loaded on that
> machine.
> 
> Is this possible? If so, how? As much detail as you can provide would be
> really great, including snippets of code. I know basic SQL syntax & I
> know PERL, but I don't know how to talk to the database or what to do
> with the results once I get them.
> 
> Any other useful suggestions on how to do this are welcomed.
> 
> Thanks, will summarize,
> 
> David Askov
> askov@digitalics.com


------------------------------

Date: 17 Dec 1998 01:57:26 GMT
From: dformosa@zeta.org.au (David Formosa)
Subject: Re: school me on RE please
Message-Id: <slrn77gp86.91b.dformosa@godzilla.zeta.org.au>

In article <759f3h$n5j$1@tilde.csc.ti.com>, keystroke@bigfoot.com wrote:

[...]

>$HTML_FILE =~ s|<TD>(.*?STRING.*?)</TD>|<TD>whatever</TD>|im;
>
>I understand what is happening.  The RE is matching the first '<TD>' to
>the string and to the next '</TD>'.  But I thought that the '?'
>following a quantifier makes it be non-greedy.

It does. The regexp engine finds the first <TD> and then finds the next
</TD> after STRING, which means that the match may span several
<TD> </TD> pairs.

What you are looking for is something like this.

s|<TD>([^<]*?STRING.*?)</TD>|<TD>whatever</TD>|im;

This will work if there is no markup inside the TD. A better option
would be to use the HTML parsing module.
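
A short sketch of the difference between the two patterns, using a
made-up one-line table row as input. The variable names are for
illustration only.

```perl
#!/usr/bin/perl -w
use strict;

# Made-up sample row with the target text in the middle cell.
my $html = '<TR><TD>first</TD><TD>has STRING here</TD><TD>last</TD></TR>';

# Non-greedy, but the match still starts at the leftmost <TD>,
# so it swallows the first cell as well.
(my $loose = $html) =~ s|<TD>(.*?STRING.*?)</TD>|<TD>whatever</TD>|i;

# [^<] cannot cross a tag, so the match has to start at the cell
# that actually contains STRING.
(my $tight = $html) =~ s|<TD>([^<]*?STRING.*?)</TD>|<TD>whatever</TD>|i;

print "loose: $loose\n";
print "tight: $tight\n";
```

Non-greedy only means "as short as possible from where the match
starts"; it does not move the starting point to the right.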

-- 
Please excuse my spelling as I suffer from agraphia. See
http://www.zeta.org.au/~dformosa/Spelling.html to find out more.



------------------------------

Date: Thu, 17 Dec 1998 00:24:59 GMT
From: christian.aranda@iiginc.com (Christian M. Aranda)
Subject: Searching through a 10MB file
Message-Id: <759j7u$91p$1@news-1.news.gte.net>

Hey Folks -

I need to search for markers in a file.  Once I find a marker, I need
some lines of text after it.  No problem here... my code works fine.

Here is the problem:  the file is 10MB.  It takes aprox. 10 mins to
search for something at the very bottom (1,800,000 lines).  This is
unacceptable because I am using this to load about 8000 records and
that would take forever.

Here's what I have considered:
Loading the entire thing into a variable.
Deleting the text that I get from the file.  Once I read the text, I
don't need it again.
Shooting my self in the head.

Platform: HP-UX 10.20
Perl: 4.0  (there is NO ESCAPING it)

All ideas and suggestions are welcomed!  TIA -


Christian M. Aranda
Impact Innovations Group
------------------------
Decide what you want then decide
what you'll give up for it.  Me?
I'll give up sleep.


------------------------------

Date: 16 Dec 1998 20:05:58 -0500
From: Uri Guttman <uri@ibnets.com>
Subject: Re: Searching through a 10MB file
Message-Id: <39yao74lxl.fsf@ibnets.com>

>>>>> "CMA" == Christian M Aranda <christian.aranda@iiginc.com> writes:

  CMA> I need to search for markers in a file.  Once I find a marker, I need
  CMA> some lines of text after it.  No problem here... my code works fine.

it doesn't if it takes that long!

  CMA> Here is the problem:  the file is 10MB.  It takes aprox. 10 mins to
  CMA> search for something at the very bottom (1,800,000 lines).  This is
  CMA> unacceptable because I am using this to load about 8000 records and
  CMA> that would take forever.

  CMA> Here's what I have considered:
  CMA> Loading the entire thing into a variable.

you must be doing something very wrong to make it that slow. are all the
markers in the order in the file? are you seeking around looking for
stuff? are you rereading the file each time? 

we don't know how to help unless we know what the data and markers look
like and what you actually want to do.

  CMA> Shooting my self in the head.

how about posting some code to see if we can speed it up.

  CMA> Perl: 4.0  (there is NO ESCAPING it)

now you should shoot yourself in the head. :-)

uri

-- 
Uri Guttman                             Hacking Perl for Ironbridge Networks
uri@sysarch.com				uri@ironbridgenetworks.com	


------------------------------

Date: Thu, 17 Dec 1998 01:36:43 GMT
From: christian.aranda@iiginc.com (Christian M. Aranda)
Subject: Re: Searching through a 10MB file
Message-Id: <759ndr$svo$5@news-2.news.gte.net>

On 16 Dec 1998 20:05:58 -0500, Uri Guttman <uri@ibnets.com> wrote:

>you must be doing something very wrong to make it that slow. are all the
>markers in the order in the file? are you seeking around looking for
>stuff? are you rereading the file each time? 
>
>we don't know how to help unless we know what the data and markers look
>like and what you actually want to do.

Here is a sample from the datafile I'm searching:

Start: INS09837
data
 .
 .
 .
data
End: INS09837

Start: INS20989
data
 .
 .
 .
data
End: INS20989

The data appears in no particular order, unfortunately, so I can't just
continue searching from where I left off.

Here is a code snippet where I am doing the work:
sub get_ddts_record
{

$status = open(DDTS_DATA, "$ddts_file");
&err_msg("fatal", "Unable to open data file", "open", $status) unless
($status);

   $BGSBugId = $ddts_value{identifier};
   $BugIdFound = 'FALSE';
   $GetData = 'FALSE';
   $Done = 'FALSE';
   $title = 'History';
   $end_record = 'End:';
   $split_text = 'Related-file:';
   undef($record{comment_tab});

   while (<DDTS_DATA>) {
      chop($_);
      if (/^Start: $BGSBugId/) { $BugIdFound = 'TRUE'; }
      if (/^$title/ && $BugIdFound eq 'TRUE') { $GetData = 'TRUE'; }
      if (/^$end_record/ && $BugIdFound eq 'TRUE') {
         $GetData = 'FALSE';
         $Done = 'TRUE';
      }

      if ($GetData eq 'TRUE' && $Done eq 'FALSE') {
            $record{comment_tab} .= "$_\n";
      }
   }

@comments = split(/$split_text/,$record{comment_tab});
close(DDTS_DATA);

#print "$comments[0]:\n";
         
}


I don't know what else to provide.  If there is anything else which
can help you help me, let me know!

Thanks again -

Christian M. Aranda
Impact Innovations Group
------------------------
Decide what you want then decide
what you'll give up for it.  Me?
I'll give up sleep.


------------------------------

Date: Wed, 16 Dec 1998 18:46:29 -0800
From: lr@hpl.hp.com (Larry Rosler)
Subject: Re: Searching through a 10MB file
Message-Id: <MPG.10e2105b789f14ee98995e@nntp.hpl.hp.com>

[Posted to comp.lang.perl.misc and copy mailed.]

In article <759ndr$svo$5@news-2.news.gte.net> on Thu, 17 Dec 1998 
01:36:43 GMT, Christian M. Aranda <christian.aranda@iiginc.com> says...
 ...
>    $BGSBugId = $ddts_value{identifier};
>    $BugIdFound = 'FALSE';
>    $GetData = 'FALSE';
>    $Done = 'FALSE';
>    $title = 'History';
>    $end_record = 'End:';
>    $split_text = 'Related-file:';
>    undef($record{comment_tab});
> 
>    while (<DDTS_DATA>) {
>       chop($_);
>       if (/^Start: $BGSBugId/) { $BugIdFound = 'TRUE'; }
>       if (/^$title/ && $BugIdFound eq 'TRUE') { $GetData = 'TRUE'; }
>       if (/^$end_record/ && $BugIdFound eq 'TRUE') { $GetData =
> 'FALSE'; $Done = 'TRUE'; } 

In the last three statements, you are compiling a regex every time 
through the loop which is the same every time.  This is very expensive.  
Furthermore, as you are looking for a fixed string, you don't need a 
regex at all, but should use the 'index' function (assuming there is one 
in Perl 4 :-).  I would do it like this:

In the loop initialization:

     $start = "Start: $BGSBugId";

In the loop:

        if (index($_, $start) == 0) { $BugIdFound = 'TRUE'; }
        if (index($_, $title) == 0 && $BugIdFound eq 'TRUE') {
            $GetData = 'TRUE';
        }
        if (index($_, $end_record) == 0 && $BugIdFound eq 'TRUE') {
            $GetData = 'FALSE';
            $Done = 'TRUE';
        }

You can also save small amounts of time by replacing 'TRUE' by 1 and 
'FALSE' by 0, and doing logical tests instead of string compares, but 
compared to that other issue this is minor.

Please let us know if replacing the regexes as shown solves your 
problem.
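
Putting the pieces together, the loop might look like the sketch below.
It makes several assumptions beyond the advice above: it is written for
Perl 5.8 or later (lexical variables and an in-memory filehandle for the
sample data), get_record is a made-up name, the sample records are
invented, and it stops reading at the matching End: line, which the
original loop does not do.

```perl
#!/usr/bin/perl -w
use strict;

# Return the lines from the "History" line up to (not including) the
# "End:" line of the record that starts with "Start: $id".
sub get_record {
    my ($fh, $id) = @_;
    my $start = "Start: $id";     # built once, outside the loop
    my ($found, $collecting, $text) = (0, 0, '');

    while (my $line = <$fh>) {
        chomp $line;
        if (!$found) {
            $found = 1 if index($line, $start) == 0;
            next;
        }
        last if index($line, 'End:') == 0;   # assumption: nothing more to read
        $collecting = 1 if index($line, 'History') == 0;
        $text .= "$line\n" if $collecting;
    }
    return $text;
}

# Invented sample data in the Start:/End: layout shown in the thread.
my $data = join "\n",
    'Start: INS09837', 'Title: a', 'History', 'h1', 'End: INS09837',
    'Start: INS20989', 'History', 'h2', 'Related-file: x', 'End: INS20989', '';

open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
print get_record($fh, 'INS20989');
close($fh);
```

Numeric flags replace the 'TRUE'/'FALSE' strings, and leaving the loop
at the End: line avoids scanning the rest of the file for records near
the top.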

-- 
(Just Another Larry) Rosler
Hewlett-Packard Company
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com


------------------------------

Date: Thu, 17 Dec 1998 00:47:57 GMT
From: Rick Delaney <rick.delaney@home.com>
Subject: Re: Why doesn't this work?
Message-Id: <36785677.D6E4BB0E@home.com>

[posted & mailed]

JYB wrote:
> 
> henlif@elsfl.com (Henry Lifton) writes:
> >
> > ($id,$address,$city,$type,$imp,$desc,$terms,$comments)=split('\t',$info);
> Seems correct. (maybe "\t" instead of '\t')

It is correct, though it would probably be better written as /\t/, to
emphasize the fact that the first argument to split is a regex.  

There is no difference between the following:

    split /\t/;
    split '\t';
    split "\t";
    split m|\t|;

The quotes are misleading because they don't require the 'm' operator.  This
is to accommodate the special case of splitting on space.  The following are
all equivalent and will split on whitespace after stripping off all leading
whitespace:

    split ' ';
    split " ";
    split q/ /;
    split qq/ /;
    split;

These are not the same as the preceding five since they will split on a single
space character:

    split / /;
    split m' ';
    split m" ";

All very confusing, isn't it?  This is why I always use /PATTERN/, unless
splitting on whitespace (when I use ' ' if I have a second argument).
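
A quick sketch that shows the two behaviours side by side. The sample
strings and variable names are made up for illustration.

```perl
#!/usr/bin/perl -w
use strict;

my $line = "  alpha  beta gamma";   # two leading spaces, one double space

# Special case: leading whitespace is dropped and runs are collapsed.
my @awk = split ' ', $line;

# Real regex: every single space separates fields, so empty
# fields appear for the leading spaces and the double space.
my @space = split / /, $line;

# A quoted '\t' is still used as the pattern \t, so it splits on tabs.
my @tab = split '\t', "a\tb\tc";

print scalar(@awk),   " fields: ", join('|', @awk),   "\n";
print scalar(@space), " fields: ", join('|', @space), "\n";
print scalar(@tab),   " fields: ", join('|', @tab),   "\n";
```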

-- 
Rick Delaney
rick.delaney@shaw.wave.ca


------------------------------

Date: 16 Dec 1998 18:58:32 PST
From: BMan@concentric.net (B. Mann)
Subject: Re: Writing Perl with Notepad
Message-Id: <759s0o$hmb@chronicle.concentric.net>

If you use MS write, and save as a text file, it usually comes out
fine.  Occasionally, not.  Just do a tr from linux to be sure.  It
works fine for me.  Vi is fine as far as I'm concerned, but, working
in windows can be mighty convenient.

dragons@scescape.net (Matthew Bafford) wrote:

->In article <36730C4B.D0A1B217@technologist.com>, evanp@technologist.com 
->says...
->=> This is a multi-part message in MIME format.
->=> --------------31C935D452E45C802F4A6D88
->=> Content-Type: text/plain; charset=us-ascii
->=> Content-Transfer-Encoding: 7bit

->Please don't do that.

->=> I have no problems writing Perl scripts with vi.  Yesterday though, I
->=> tried to write a script with Notepad but after I saved it on Linux and
->=> tried to execute it gave me an error complaining about linefeeds or
->=> something like that.  Can I use Notepad or different windows editor

->Something like that?

->=> for script writing? I have a class of high school students and using
->=> vi is like pulling teeth.

->Sure.  You have to watch for (at least) two things when using a windows 
->editor:

->1) Make sure the line endings are 'fixed'.
->2) Make sure the last line, if the last line of a format or of a here-  
->   doc, has a blank line after it.

->To get the proper line endings, either transfer as ASCII (when FTPing the 
->file).

->Or:

->Do a

->perl -i -pe 'tr/\r//d;' files

->from the Linux prompt.

->=> --------------31C935D452E45C802F4A6D88
->[snip VCARD]

->Please don't do that, either.

->=> Thanks,

->HTH!

->--Matthew
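
For anyone who wants to see what the one-liner does, here is a small
sketch with a made-up two-line script as input. The sub names are
hypothetical. The usual reason a script saved with DOS line endings
fails on Linux is that the kernel reads the interpreter name on the #!
line with the carriage return still attached.

```perl
#!/usr/bin/perl -w
use strict;

# Remove every carriage return, as the tr/\r//d one-liner does.
sub strip_all_cr {
    my ($text) = @_;
    $text =~ tr/\r//d;
    return $text;
}

# Remove a carriage return only where it directly precedes a newline.
sub strip_crlf {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Made-up script text as a Windows editor would save it.
my $dos = "#!/usr/bin/perl\r\nprint \"hi\\n\";\r\n";
print strip_all_cr($dos);
```

The two subs give the same result for ordinary text files; they differ
only if a file contains carriage returns that are not part of a line
ending.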




------------------------------

Date: 12 Dec 98 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Special: Digest Administrivia (Last modified: 12 Dec 98)
Message-Id: <null>


Administrivia:

Well, after 6 months, here's the answer to the quiz: what do we do about
comp.lang.perl.moderated. Answer: nothing. 

]From: Russ Allbery <rra@stanford.edu>
]Date: 21 Sep 1998 19:53:43 -0700
]Subject: comp.lang.perl.moderated available via e-mail
]
]It is possible to subscribe to comp.lang.perl.moderated as a mailing list.
]To do so, send mail to majordomo@eyrie.org with "subscribe clpm" in the
]body.  Majordomo will then send you instructions on how to confirm your
]subscription.  This is provided as a general service for those people who
]cannot receive the newsgroup for whatever reason or who just prefer to
]receive messages via e-mail.

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I would not have time
to answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V8 Issue 4443
**************************************
