[9310] in Perl-Users-Digest
Perl-Users Digest, Issue: 2904 Volume: 8
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Jun 18 18:07:16 1998
Date: Thu, 18 Jun 98 15:00:32 -0700
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Thu, 18 Jun 1998 Volume: 8 Number: 2904
Today's topics:
Re: --- Is there a way to run Perl scripts from my Win9 (Steve Linberg)
Re: 2 questions about lists <jdporter@min.net>
Re: add a record into file DBF with perl (Steve Linberg)
Re: Array Combinations Question <jdporter@min.net>
C/XS code to accept oddball Perl constructs? (christopher f. chiesa)
Checking returns from system calls (Was: Please HELP co (Larry Rosler)
Creating the Cartesian product of a set of sets -- LONG <steve.tolkin@fmr.com>
Dynix/ptx build problems <mdc0788@fugue.ca.boeing.com>
Re: Faster Search <rootbeer@teleport.com>
Re: Faster Search (Michael J Gebis)
Re: first language (Larry Rosler)
Re: Have we got a good free Perl manual? (christopher f. chiesa)
Re: Help! Database file modification (Steve Linberg)
Re: Help! Database file modification <rootbeer@teleport.com>
Re: How to scrub Ctl-Z? (Craig Berry)
Re: How to sort alphanumeric string using "sort" functi (Steve Linberg)
IIS4.0 won't run perl scripts which call external comma postmanager@my-dejanews.com
Re: Making Life Easy - Templates, XSSI, Variables and s (Steve Linberg)
Digest Administrivia (Last modified: 8 Mar 97) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Thu, 18 Jun 1998 17:13:10 -0400
From: linberg@literacy.upenn.edu (Steve Linberg)
Subject: Re: --- Is there a way to run Perl scripts from my Win95 machine?
Message-Id: <linberg-1806981713100001@projdirc.literacy.upenn.edu>
In article <358806DA.7E2B@all.com>, Webcruiser <none@all.com> wrote:
> I am new to Perl, and would like to be able to test my scripts before
> ftping them to my host site. What is involved in doing this?
Reading the FAQ, and the appropriate documentation, especially that found
in CGI.pm, which should be sitting on your machine. Good luck!
_____________________________________________________________________
Steve Linberg National Center on Adult Literacy
Systems Programmer &c. University of Pennsylvania
linberg@literacy.upenn.edu http://www.literacyonline.org
------------------------------
Date: Thu, 18 Jun 1998 21:23:31 GMT
From: John Porter <jdporter@min.net>
Subject: Re: 2 questions about lists
Message-Id: <3589870C.61CA@min.net>
Larry Rosler wrote:
>
> The distinction between list and array arguments for foreach is that
> lists cannot move during the iteration, so the refs to their values are
> valid throughout. Writing via the refs happens, but has no visible
> effect because the list values are discarded after the iteration. But
> array values persist.
Maybe I'm misunderstanding what you're saying here, but...
$a = 0;
for ( $a ) { $_ = 1; }
sets $a to 1, so the effect is "visible"; whereas...
for ( 0 ) { $_ = 1; }
is fatal:
Modification of a read-only value attempted
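A minimal sketch of the same point, assuming a reasonably modern perl: the
loop variable is an alias for whatever is being iterated over, so writes
persist for variables and array elements, and die for a literal constant.
The variable names here are illustrative only.

```perl
use strict;
use warnings;

# Aliasing a scalar variable: the write goes straight through to $x.
my $x = 0;
for ($x) { $_ = 1 }
print "scalar is now $x\n";

# Aliasing array elements: each element is modified in place.
my @nums = (1, 2, 3);
for (@nums) { $_ *= 2 }
print "array is now @nums\n";

# Aliasing a literal constant: the assignment dies, so trap it with eval.
my $ok = eval { for (0) { $_ = 1 } 1 };
print "literal: $@" unless $ok;
```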
--
John Porter
------------------------------
Date: Thu, 18 Jun 1998 17:06:25 -0400
From: linberg@literacy.upenn.edu (Steve Linberg)
Subject: Re: add a record into file DBF with perl
Message-Id: <linberg-1806981706250001@projdirc.literacy.upenn.edu>
In article <6mbs8o$gd0$1@nnrp1.dejanews.com>, peacemakers@my-dejanews.com wrote:
> How to add a record into file DBF with Perl.
> To read I'm actually using module Xbase.pm
> and I don't know how to write...
Your question is impossibly vague. How do you expect anyone to help you?
What, specifically, is your problem? You don't understand the
documentation? You need a lesson in Xbase.pm or database programming? We
don't know your operating platform, your system, what you're doing, or
even if you know anything about programming.
If you have a specific question that is not covered by any of the
documentation, and you've worked hard to figure it out and can't, and can
provide a small code sample that you're having trouble with, and you're
very polite and friendly, then maybe someone can help you. It doesn't
sound like you've done all your homework yet.
_____________________________________________________________________
Steve Linberg National Center on Adult Literacy
Systems Programmer &c. University of Pennsylvania
linberg@literacy.upenn.edu http://www.literacyonline.org
------------------------------
Date: Thu, 18 Jun 1998 21:27:13 GMT
From: John Porter <jdporter@min.net>
Subject: Re: Array Combinations Question
Message-Id: <358987EC.2FF2@min.net>
Randal Schwartz wrote:
>
> >>>>> "F" == F Quednau <quednauf@nortel.co.uk> writes:
>
> F> John Porter wrote:
> >> for ( rec( \@array ) ) {
> >>     print "$_\n";
> >> }
> >>
> >> sub rec {
> >>     my( $ar, @stack ) = @_;
> >>     return "@stack" unless @$ar;
> >>     my( $head, @a ) = @$ar;
> >>     map { rec( \@a, @stack, $_ ) } @$head;
> >> }
>
> F> How many years of Perl programming do I have to do to produce an answer like
> F> that ?
>
> Well, one year, presuming you have also done two years of Prolog. :)
>
> That's certainly speaking Perl with a Prolog accent. Or is it Lisp?
>
> If that was *my* code, I'd have put the comment "NO USER SERVICEABLE
> PARTS INSIDE" above it. :-)
Well, I think of it more as lisp-like.
Although I have thought it might be nice to have a prolog-like module
for perl...
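For anyone who wants to try the quoted rec(), here is a self-contained
sketch of it under strict, run on a small sample. The sample data and the
renamed variables ($sets, @rest) are illustrative and not from the thread.

```perl
use strict;
use warnings;

# Same recursion as the quoted rec(): peel off the first set, and for each
# of its members recurse on the remaining sets with that member appended.
sub rec {
    my ($sets, @stack) = @_;
    return "@stack" unless @$sets;
    my ($head, @rest) = @$sets;
    return map { rec(\@rest, @stack, $_) } @$head;
}

my @array  = ( [qw(a b)], [qw(1 2)] );   # sample data, not from the thread
my @tuples = rec(\@array);
print "$_\n" for @tuples;
```

Each returned string is one tuple, with the first set varying slowest.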
--
John Porter
------------------------------
Date: 18 Jun 1998 17:11:56 -0400
From: cfchiesa@cyber1.servtech.com (christopher f. chiesa)
Subject: C/XS code to accept oddball Perl constructs?
Message-Id: <6mbvqs$bf4@cyber1.servtech.com>
Hello, Perlsters...
I'm developing an application to translate arrays of numeric data from
ASCII/text rows-and-columns to a file format implemented via a set of
custom callable APIs. My boss insisted that the "reader" be written in
Perl, but the nature of the output API requires that it be used from C.
Although other solutions are possible, the organizationally-"cleanest"
approach appears to be the use of XSUBs to bridge the gap between Perl
and C (and the custom API).
Despite being new to my job, Unix, Perl, regular expressions, and
Makefiles, let alone XSUBs and the custom output API, I have managed to
complete two-thirds of the job: I have a working "reader" in Perl, a
working C output writer which I have translated into Perl, and "first
drafts" of the XSUB(s) which provide the output API in Perl. I can in
fact open/create and close an output file using the API, so "I know I'm
CLOSE!"
Unfortunately, some of the API-routine arguments are supposed to be C
arrays (of integers), whereas Perl stuffs the read-in integers into an
"anonymous list (or array) of references to anonymous lists (or arrays) of
values," a far different in-memory animal as you know! So now I need to
write either Perl or C/XS code to convert the Perl representation to a C-
array-like form, either before or after passing the data to the output
API.
At my present level of knowledge, the first option appears impossible
(though perhaps judicious use of unpack --?), so I am attempting the
second. By reading manpages, books, and Websites, and doing a LOT of
trial-and-error programming I was able to create an XSUB that for SMALL
(say, 10 rows x 15 columns) datasets DOES appear to perform the necessary
conversion and return it to Perl as a string (containing binary data).
Unfortunately, when tested on a dataset whose dimensions approximate those
expected in the "real" data -- 15000 rows x 15 columns -- it fails badly.
It takes up about 99% of the available CPU on my SPARCstation 10, and even
so cranks out only about three rows per second -- then overflows the system
swap file and coredumps with an "out of memory" error!
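On the first option, a minimal sketch of doing the conversion on the Perl
side with pack rather than in XS. It assumes the data is a reference to an
array of row references and that the C side wants native doubles in
row-major order; the function name rows_to_packed_doubles is hypothetical.

```perl
use strict;
use warnings;

# Flatten a reference to an array of row references into one packed
# string of native doubles, row by row (row-major order).
sub rows_to_packed_doubles {
    my ($rows) = @_;
    return pack('d*', map { @$_ } @$rows);
}

my $rows   = [ [1, 2, 3], [4, 5, 6] ];      # sample data only
my $packed = rows_to_packed_doubles($rows);
my @back   = unpack('d*', $packed);         # round trip, to check the layout

print "bytes:  ", length($packed), "\n";
print "values: @back\n";
```

The resulting scalar holds the raw bytes of the array, so it could in
principle be passed to an XSUB as a string argument. For integer arrays the
'd*' template would need to change to an integer template such as 'l*'.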
My code (appended beneath my signature, below) makes a single call to
safemalloc(), which successfully allocates (#rows x #columns x
sizeof(double) = 1.8 megs) memory. Thereafter I use a lot of the SV
macros described in the perlguts manpage, to navigate through the Perl
structure and extract the data values buried within. I *think* I'm
successfully returning the extracted elements to Perl "as a string," --
using the XPUSHs() macro -- but I'm not entirely sure I'm using it
correctly, or in the right place, or in the correct relationship to the
rest of the code.
I conclude that either
a) my swapfile is smaller than 1.8 megs (plus whatever Perl has
already used);
b) there's a hellacious memory-leak in the SV... macro package
and/or whatever the macros themselves call;
or
c) I am somehow misusing either the SV... macros or the XPUSHs()
macro.
Granted, (c) is by far the LIKELIEST explanation -- but I don't see HOW
I could possibly be misusing the SV... stuff, and the routine runs out of
memory and coredumps BEFORE the point where it would have executed the
XPUSHs() ... So I am at a rather complete loss for explanations at the
moment, and would like to ask whether anyone READING here can offer one I
haven't thought of. Anyone?
For what it's worth, I HAVE looked at TONS of documentation, but only
a small percentage even discusses the use of XS or writing of XSUBs; and
of THAT material, essentially NONE of it (except, by implication, the
perlguts manpage) discusses passing "complex" _Perl_ structures _into_ C
-- all examples of "complex structure" passing, focus on returning _C_
structures _back_ to Perl as function return-values, and all examples of
passing Perl data into C, discuss only scalars!
The sheer lack of coverage of the subject makes me wonder if maybe my
whole APPROACH to the problem is somehow wrong; in that case, PLEASE tell
me the CORRECT approach. I can't imagine this being a highly UNUSUAL
PROBLEM to solve.
Thanks in advance. I will attempt to read your responses here -- I've
subscribed to this group as of now -- but my newsreading facilities are
unbelievably flaky and I've learned NOT to DEPEND ON them. Thus, if you
take the time to post, and it's not too hard to send a COPY via e-mail,
I'd appreciate the consideration. I'll be happy to do the same for YOU
someday, if you happen to ask a question on which *I* am knowledgeable!
Chris Chiesa
cfchiesa@servtech.com
---- CODE EXAMPLE FOLLOWS ----
SV *
carray(lArray, lNrows, lNcols)
        void * lArray
        long lNrows
        long lNcols
    PPCODE:
        SV *psvBase, *psvRow, *psvEntry, *psvReturn;
        long lRow, lCol;
        double *pWorking, *pStorage;
        printf("Debug: Malloc()'ing... %d bytes\n",
               lNrows * lNcols * sizeof (double));
        printf("That's %d * %d * %d ...\n", lNrows, lNcols, sizeof(double));
        pWorking = pStorage = (double *) safemalloc(lNrows * lNcols * sizeof(double));
        printf("pWorking = pStorage = %p\n", pWorking);
        if (pStorage == 0)
        {
            printf("Couldn't get memory in carray()!\n");
            exit (-1);
        }
        psvBase = ST(0);
        if (SvROK(psvBase) && (SvTYPE(psvBase) == SVt_RV))
        {
            for (lRow = 0; lRow < lNrows; lRow++)
            {
                psvRow = *av_fetch(SvRV(psvBase), lRow, 0);
                if (SvROK(psvRow) && (SvTYPE(psvRow) == SVt_RV))
                {
                    printf("%3d: ", lRow);
                    for (lCol = 0; lCol < lNcols; lCol++)
                    {
                        psvEntry = *av_fetch(SvRV(psvRow), lCol, 0);
                        /* This test is essentially unnecessary... */
                        if (SvIOK(psvEntry) || SvNOK(psvEntry))
                        {
                            /* Value is now SvNV(psvEntry) or SvIV(psvEntry) */
                            /* depending on datatype! */
                            if (SvIOK(psvEntry))
                            {
                                printf("%5.5g ", (double) SvIV(psvEntry));
                                printf("at %p\n", pWorking);
                                *(pWorking++) = (double) SvIV(psvEntry);
                            } else {
                                printf("%5.5g ", SvNV(psvEntry));
                                printf("at %p\n", pWorking);
                                *(pWorking++) = (double) SvNV(psvEntry);
                            }
                        } else {
                            printf("Scalar entries not IV or NV (%d instead).\n",
                                   SvTYPE(psvEntry));
                            /*
                            if (SvTYPE(psvEntry) == SVt_PV)
                                printf("\tValue: \"%s\"\n",
                                       (char *) SvPV(psvEntry));
                            */
                        }
                    }
                    printf("\n");
                    /* Create and return output entity! */
                    psvReturn = newSVpv((char *) pStorage,
                                        (lNrows * lNcols * sizeof(double)));
                    XPUSHs(psvReturn);
                } else {
                    printf("Row reference not reference type (%d instead).\n",
                           SvTYPE(psvRow));
                }
            }
        } else {
            printf("Base arg not reference-to-array. (%d instead)\n",
                   SvTYPE(psvBase));
        }
        free(pStorage);
------------------------------
Date: Thu, 18 Jun 1998 13:53:15 -0700
From: lr@hpl.hp.com (Larry Rosler)
Subject: Checking returns from system calls (Was: Please HELP convert a SIMPLE 2 Line DOS Batch File!!)
Message-Id: <MPG.ff31e17c67b8e2c98969b@nntp.hpl.hp.com>
[This followup was posted to comp.lang.perl.misc and a copy was sent to
the cited author.]
In article <6m9sh6$5ol$1@mathserv.mps.ohio-state.edu>,
ilya@math.ohio-state.edu says...
> [A complimentary Cc of this posting was sent to Larry Rosler
> <lr@hpl.hp.com>],
> who wrote in article <MPG.ff202e96f92f019989699@nntp.hpl.hp.com>:
...
> > If one wants to know whether print() succeeds, shouldn't one ask print()
> > or die? But no one *ever* does that. I know print() can fail on a full
> > file system, for example, but would that necessarily be reflected by a
> > failure to close()?
>
> print() is usually called in a loop, so it may be expensive to check it.
>
> Thus people check close().
That is instructive. The 'may be expensive to check it' can hardly refer
to the performance impact of a simple 'or die ...' after each print
statement, which is negligible compared to the processing of the print
itself. So it must refer to visual clutter or programmer boredom or
both.
As Perl doesn't have macros, the obvious C solution wouldn't apply.
Encapsulating 'print LIST or die ...' in a subroutine has real
performance expense.
WIBNI (Wouldn't It Be Nice If) Perl had a cheap robust mechanism for
handling software exceptions, so that all the clutter caused by checking
of Perl functions that invoke system calls (like 'open' -- "always check
your open()") could be handled centrally. But I don't find SIGSOFT or
catch/try in the documentation. And 'eval BLOCK' doesn't catch this kind
of exception. Sigh...
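For what it's worth, a minimal sketch of how far one can get today, under
the assumption that the subroutine-call overhead mentioned above is
acceptable: a hypothetical checked_print() turns a failed print into a die,
and a single eval BLOCK around the whole write phase then serves as the
central handler. The temporary file comes from File::Temp and is only there
to make the sketch self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: print to a handle, and die if print reports failure.
sub checked_print {
    my $fh = shift;
    print {$fh} @_ or die "print failed: $!\n";
    return 1;
}

my ($out, $file) = tempfile(UNLINK => 1);

# One eval around the whole write phase acts as the central handler.
my $ok = eval {
    checked_print($out, "line $_\n") for 1 .. 3;
    close($out) or die "close failed: $!\n";
    1;
};
die "Writing $file failed: $@" unless $ok;

# Read the file back to confirm what was written.
open(my $in, '<', $file) or die "Cannot reopen $file: $!\n";
my @lines = <$in>;
close($in);
print "wrote ", scalar(@lines), " lines\n";
```

The check on close() stays in either way, since buffered output may not
reach the file system until the handle is closed.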
--
Larry Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com
------------------------------
Date: Thu, 18 Jun 1998 17:01:36 -0400
From: Steven Tolkin <steve.tolkin@fmr.com>
To: Joel Coltoff <joel@wmi0.wmi.com>
Subject: Creating the Cartesian product of a set of sets -- LONG
Message-Id: <3589802E.CEBCE438@fmr.com>
Joel Coltoff (joel@wmi0.wmi.com --who will receive a CC: to this posting)
wrote:
> I'm looking for a clever solution to the following problem. I've
> got a small number of lists of lists. I want to generate all the
> combinations of sets that can be formed from these lists. The problem
> is that I don't know until all the data has been generated how many
> lists there will be. This is usually a number between 4 and 10 and
> I suppose I could write a separate case for each. In the code below what
> would happen if I added a list of control chars? I don't really want to
> add another foreach(). Also, the data is generated on the fly so
> I end up the array @chars. Perhaps there is a better way to do this
> as well. That however is only an issue if it impacts the rest of
> the problem.
>
> @numbers = (['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']);
> @letters = (['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']);
> @symbols = (['!', '@', '#'], ['$', '%', '^'], ['&', '*', '(']);
> push(@chars, [@numbers], [@letters], [@symbols]);
> foreach $number (@numbers) {
>     foreach $letter (@letters) {
>         foreach $symbol (@symbols) {
>             print "@$number -- @$letter -- @$symbol\n";
>         }
>     }
> }
>
What you actually seem to want, based on the code, is not the "combination"
but the Cartesian product of those sets.
A while ago I asked how to do this in comp.lang.perl.misc
and got some useful advice.
Here is the program I wrote as a result.
I am giving it freely (whatever that means) to the perl community and the
general public.
#!/usr/local/bin/perl -w
# $Id: product.pm,v 1.10 1998/04/06 20:29:12 sy71046 Exp $
# Written by Steven Tolkin -- steve.tolkin@fmr.com
# There is no warranty expressed or implied.
# If you use it, or have any suggestions on how to improve it
# I'd appreciate you sending me email.
# Who When What
# tolkin 04/01/98 started from a program by John Redford
# Renamed variables, changed termination condition etc.
# tolkin 04/03/98 changed main loop to use a carry based approach
# Changed to emit the results in row major by default
# but allow column major if preferred. (NOT hooked up to
# an user option yet.)
# tolkin 04/06/98 Fixed location of safety check, formatting, renamed
# product.pl to product.pm, and wrote testprod.pl which see.
#
# Usage: in the caller e.g. testprod.pl
# require product;
# product($my_list_of_sets, \&my_func);
# This program produces the Cartesian product of a list of sets or multi-sets.
# If the input "sets" have duplicates then duplicate tuples will be produced.
# For each tuple in the result a user supplied function is called
# with the output tuple as its only argument.
# (I prefer things emitted in row major order, as in the C language: like an
# odometer, the rightmost array index varies most rapidly.)
# But column major is used by OLE DB for OLAP (MDX) API from Microsoft.
# For other approaches on how to do this use e.g. DejaNews to
# see the responses to my posting to
# news://comp.lang.perl.misc on Friday, March 27, 1998 with Subject:
# How to generate a Cartesian product of a varying number of sets?
#
# One reason I chose this implementation is that it just
# iterates over arrays, with no hashes, stacks, trees, or recursion.
# It can be ported to almost any language, and should have good efficiency.
# (But generating the tuples is likely
# to be very cheap using any approach, compared to processing them.)
# Do Later:
# Design a different interface, e.g. produce the tuples using an iterator
# rather than passing the processing function into this.
# Rename to make_tuples or make_cartesian_product.
# Get a newer version of perl installed so I can restore use strict and my.
# Maybe make into a module. Figure out where this would fit in CPAN.
# We emit a warning message if there are no input sets, or if
# any are empty. Maybe later allow caller to suppress this.
# Have print_tuple be the default function; allow user to provide none.
# Let caller specify row major or column major.
# Later use the test data to provide a self-test feature.
# Maybe use the return code from user_func to decide to terminate sooner.
# ********* Code Begins
# use strict; # RATS needed to comment out my due to my old 5.003 perl !!!
# The default order to change array indexes is "row major" which means the
# rightmost array index changes most rapidly, as in C.
# The opposite is column major.
my $isrowmajor = 1;
sub product ($$)
{
    ($sets, $user_func) = @_;   # originally had my !!!
    my $n = @$sets;             # Number of sets
    my @tuple;                  # Current tuple
    my @last;                   # Last index in set
    my @curr;                   # Current index in set
    my $s;                      # Active set
    my $expected = 1;           # For safety check, done by multiplication
    my $actual = 0;             # Count of tuples as we create them
    # *** Initialize
    # In theory it is not an error to have no sets,
    # or to have some sets be empty, but we return
    # immediately for performance and to let the code assume non-empty sets.
    if (0 == $n) {
        warn "Warning: No sets in input\n";
        return;
    }
    for ($s = 0; $s < $n; $s++) {
        my $size = @{$$sets[$s]};   # Number of values in current set
        if ( $size == 0 ) {
            warn "Warning: At least one empty set in the input.\n";
            return;
        }
        $expected *= $size;
        # set the index of the last value in each set
        $last[$s] = $size - 1;
        # Create the first tuple from the first (0th) value in each set
        $curr[$s] = 0;
        $tuple[$s] = $$sets[$s][0];
    }
    my $fastestset;   # Which end has the most rapidly changing set
    if ($isrowmajor) {
        $fastestset = $n - 1;
    }
    else {
        $fastestset = 0;
    }
    # ****** Main Loop
    mainloop: while ( 1 ) {
        $actual++;
        # Call the user supplied function with the current tuple.
        &$user_func(@tuple);
        # Each iteration increments by one the index in the fastest set.
        # But if we were at the last value we need to "carry" over
        # to the next set, as in elementary addition. The direction of
        # next depends on whether we are using row major or column major.
        # Incrementing an index might require several carries, e.g.
        # adding 1 second to a time of 7:59:59 becomes 8:00:00.
        # Rather than carry past the last set we exit the main loop.
        $s = $fastestset;
        carry: while ( 1 ) {
            if ($curr[$s] < $last[$s]) {
                $curr[$s]++;
                $tuple[$s] = $$sets[$s][$curr[$s]];
                last carry;
            }
            else {
                $curr[$s] = 0;
                $tuple[$s] = $$sets[$s][0];
                if ($isrowmajor) {
                    $s--;
                    if ($s < 0) { last mainloop; }
                }
                else {
                    $s++;
                    if ($s == $n) { last mainloop; }
                }
            }
        } # end carry
    } # end mainloop
    # Safety check. (If we *know* there are never bugs this can be omitted.)
    if ($actual != $expected) {
        die "Error: Produced $actual tuples but expected $expected\n";
    }
}
# Sample Test data
my $test_sets = [
    [qw(a b)],
    [qw(red green blue)],
    [qw(singleton)],
    # [], # empty set
];
# Sample processing function
sub print_tuple (@) {
    my @list_of_scalars = @_;
    print "@list_of_scalars\n";
}
1; # library code must return true
# end of file
/////////////////////////////
Here is a simple test program
#!/usr/local/bin/perl -w
# $Id: testprod.pl,v 1.3 1998/04/06 20:31:09 sy71046 Exp $
# Written by Steven Tolkin
# Who When What
# tolkin 04/06/98 Renamed product.pl to product.pm
# Usage:
# product.pl # in Unix
# In windows it is more complex. You can first create a batch file
# and then run it, via the two commands run in a DOS box:
# pl2bat product.pl
# product
# But this does not permit using redirection
# So a better way is:
# perl product.pl > foo.txt
require product; # Maybe later make product a module and use: use Product.
# Sample Test data
my $test_sets = [
    [qw(a b)],
    [qw(red green blue)],
    [qw(singleton)],
    # [], # empty set
];
# Why do I get the warning message:
# Subroutine print_tuple redefined at product.pm line 150.
# when both files product.pl and product.pm contain
# the following code lines, but with the name print_tuple.
# The manual said under require "This form of loading of modules
# does not risk altering your namespace."
# Sample processing function
sub my_print_tuple (@) {
    my @list_of_scalars = @_;
    print "@list_of_scalars\n";
}
product($test_sets, \&my_print_tuple);
# end of file
--
Hopefully helpfully yours,
Steve
---
Steven Tolkin steve.tolkin@fmr.com 617-563-0516
Fidelity Investments 82 Devonshire St. R27C Boston MA 02109
There is nothing so practical as a good theory. Comments are by me,
not Fidelity Investments, its subsidiaries or affiliates.
------------------------------
Date: Thu, 18 Jun 1998 20:30:31 GMT
From: "Marty D. Cudmore" <mdc0788@fugue.ca.boeing.com>
Subject: Dynix/ptx build problems
Message-Id: <358978E7.7F36@fugue.ca.boeing.com>
I am attempting to build Perl on a Sequent box running Dynix/ptx Version
4.0. The build seems to go ok, except when I run the test. I get the
following error during the posix test:
-- snip --
lib/posix.........Can't load '../lib/auto/POSIX/POSIX.so' for module
POSIX: dynamic linker: ./perl: relocation error: symbol not found:
_fpgetround at ../lib/DynaLoader.pm line 166.
at ./lib/posix.t line 13
BEGIN failed--compilation aborted at ./lib/posix.t line 13.
FAILED at test 0
-- end of snip --
I went ahead and did the make install, added DBD::Oracle and got a
similar error when it attempts to suck in DBD::Oracle.
Anybody have any ideas on this one?
Please respond by email as well as a post.
Many thanks and cheers,
Marty D. Cudmore (mdc0788@fugue.ca.boeing.com)
------------------------------
Date: Thu, 18 Jun 1998 21:04:27 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Faster Search
Message-Id: <Pine.GSO.3.96.980618140309.13348p-100000@user2.teleport.com>
On Thu, 18 Jun 1998, Vincent M. Probasco wrote:
> I've written a search in perl to go through
>
> html docs in a directory. The problem is there are over 3000 files
>
> to search through. The only way I know how to do this is open every
>
> file and then look at every line in those files. Is there any way
>
> that this might be done faster in Perl ?
It might help to stop double-spacing. :-)
But you could build an index once and use it repeatedly. Any good text on
building an index should be able to help you. Good luck!
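A minimal sketch of the idea, assuming the documents fit in memory and that
a whole-word lookup is enough. It builds an inverted index (word => set of
file names) once, after which each search is a hash lookup instead of a
pass over every file. The function names, file names and sample text are
hypothetical, and the tag stripping is deliberately crude.

```perl
use strict;
use warnings;

# Build an inverted index: lowercased word => { document name => 1 }.
sub build_index {
    my (%docs) = @_;
    my %index;
    while (my ($name, $text) = each %docs) {
        $text =~ s/<[^>]*>/ /g;     # crude tag stripping, not a real HTML parser
        $index{lc $1}{$name} = 1 while $text =~ /(\w+)/g;
    }
    return \%index;
}

# Return the sorted names of the documents containing the word.
sub lookup {
    my ($index, $word) = @_;
    my @hits = sort keys %{ $index->{lc $word} || {} };
    return @hits;
}

my $index = build_index(
    'a.html' => '<p>Perl is fun</p>',
    'b.html' => '<h1>Indexing</h1> with perl',
    'c.html' => '<p>nothing relevant</p>',
);
my @hits = lookup($index, 'Perl');
print "perl: @hits\n";
```

To reuse the index across runs it would have to be saved to disk, for
example with the Storable module; that step is left out here.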
--
Tom Phoenix Perl Training and Hacking Esperanto
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: 18 Jun 1998 21:16:29 GMT
From: gebis@albrecht.ecn.purdue.edu (Michael J Gebis)
Subject: Re: Faster Search
Message-Id: <6mc03d$ea@mozo.cc.purdue.edu>
linberg@literacy.upenn.edu (Steve Linberg) writes:
}In article <3589733C.EBF832EF@h8mail.laf.cat.com>, "Vincent M. Probasco"
}<probavm@h8mail.laf.cat.com> wrote:
}> I've written a search in perl to go through
}> html docs in a directory. The problem is there are over 3000 files
}> to search through. The only way I know how to do this is open every
}> file and then look at every line in those files. Is there any way
}Build an index.
}This is not really a Perl question, btw.
Steve's right, this really isn't a perl question. However, glimpse is
cool, so I figured I would give it a free plug:
http://glimpse.cs.arizona.edu/index.html
From the web page:
Glimpse is a very powerful indexing and query system that
allows you to search through all your files very quickly.
It even has some cool features like approximate matches. It's only
Unix right now, as far as I know, but it appears there is some
information about porting efforts listed on the page. There are
comparable products for win and mac, but you'll have to ask on a group
where people know stuff about win and mac to get an answer.
--
Mike Gebis gebis@ecn.purdue.edu mgebis@eternal.net
------------------------------
Date: Thu, 18 Jun 1998 14:12:48 -0700
From: lr@hpl.hp.com (Larry Rosler)
Subject: Re: first language
Message-Id: <MPG.ff322ab6ba60ca098969c@nntp.hpl.hp.com>
In article <6mbktf$5qp$1@msunews.cl.msu.edu>, nguyend7@egr.msu.edu
says...
...
> PASCAL C Perl
Pascal :-)
As one of the first who tried to teach C many years ago, I can vouch that
it is a poor choice for beginners, for one specific reason that is seldom
discussed: the difficulty of doing simple text input with data
conversion.
Once one gets past single-character input (getchar or getc) or perhaps
line-at-a-time-and-parse-it-yourself input (gets or fgets, atoi, atof,
...), one encounters the horrible scanf function, which demands an
understanding of pointers and internal representations. Fuggedaboudit!
C++ is better on input conversions, and Perl can rely on text isolation
via regexes and automatic conversions. Regexes are hard unless one has
been weaned on ed/vi/grep/awk/sed/... but the student must learn them
right away to get much useful work done anyway. But Perl references can
wait till much later, while C pointers cannot.
Don't teach C to beginners!
--
Larry Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com
------------------------------
Date: 18 Jun 1998 17:38:08 -0400
From: cfchiesa@cyber1.servtech.com (christopher f. chiesa)
Subject: Re: Have we got a good free Perl manual?
Message-Id: <6mc1c0$d6o@cyber1.servtech.com>
In article <58Wg1.137$8W3.572756@ptah.visi.com>,
Todd Lehman <lehman@visi.com> wrote:
>Barry Margolin <barmar@bbnplanet.com> writes:
>> Well, the words he often chooses are easily misinterpreted. For instance,
>> his references to the existing Perl documentation said, "but they were no
>> good because they weren't free." By "no good" he meant "not acceptable" or
>> "not appropriate", but it's easy to understand why people would interpret
>> it as "not good" == "bad".
>
>Is that sort of thing intentional or is he just a crummy writer? Or does
>he expect all readers to speak his dialect of English? Curious,
Sounds to me like he's just got a MAD ON all the time. I get like
that over a lot of 'standard' ways-and-means in this industry
MYSELF, and will say things in a dramatic, inflammatory style just
to convey the fact that I'm TICKED OFF. :-)
Chris Chiesa
cfchiesa@servtech.com
------------------------------
Date: Thu, 18 Jun 1998 16:59:25 -0400
From: linberg@literacy.upenn.edu (Steve Linberg)
Subject: Re: Help! Database file modification
Message-Id: <linberg-1806981659250001@projdirc.literacy.upenn.edu>
In article <davendontlikespam-1806981327140001@208.8.190.31>,
davendontlikespam@ldr.com (Dave Neuer) wrote:
> I have a tab-delimited database file which contains several lines, each of
> which looks like this:
>
> mailing_list_id# list_name employee_id#1 employee_id#2, employee_id#3 <etc>
>
> I need to come up with a script that, given an array of valid employee
> id#'s, goes through each line of the file and deletes any invalid id#'s.
>
> Anyone have any idea how to do this (without a million lines of code)?
>
> Thanks,
>
> Dave Neuer
Here's one way:
Build a hash of valid ID's whose values are also the IDs.
Go through your lines with regular expressions replacing each ID with its
corresponding valid one from the hash.
Invalid ones will be undefined and therefore be stripped.
Shouldn't be a million lines of code. Good luck!
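A minimal sketch of the hash idea. Note that it uses split and grep rather
than the regular-expression substitution described above, and it assumes
every field after the first two on a line is an employee ID. The function
name and the sample IDs are hypothetical.

```perl
use strict;
use warnings;

# Keep the list ID and list name, and drop any employee ID not in %$valid.
sub strip_invalid_ids {
    my ($line, $valid) = @_;
    chomp $line;
    my ($list_id, $list_name, @ids) = split /\t/, $line;
    return join "\t", $list_id, $list_name, grep { exists $valid->{$_} } @ids;
}

# Hash of valid IDs, built once from the array of valid IDs.
my @valid_ids = qw(101 102 104);                 # sample data only
my %valid     = map { $_ => 1 } @valid_ids;

my $cleaned = strip_invalid_ids("7\tstaff\t101\t103\t104\n", \%valid);
print "$cleaned\n";
```

In a real script the same call would sit inside a loop reading the
database file line by line and writing the cleaned lines to a new file.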
_____________________________________________________________________
Steve Linberg National Center on Adult Literacy
Systems Programmer &c. University of Pennsylvania
linberg@literacy.upenn.edu http://www.literacyonline.org
------------------------------
Date: Thu, 18 Jun 1998 21:02:58 GMT
From: Tom Phoenix <rootbeer@teleport.com>
Subject: Re: Help! Database file modification
Message-Id: <Pine.GSO.3.96.980618140224.13348o-100000@user2.teleport.com>
On Thu, 18 Jun 1998, Dave Neuer wrote:
> I need to come up with a script that, given an array of valid employee
> id#'s, goes through each line of the file and deletes any invalid id#'s.
>
> Anyone have any idea how to do this (without a million lines of code)?
The FAQ has a question and answer about working with a file by lines. Hope
this helps!
--
Tom Phoenix Perl Training and Hacking Esperanto
Randal Schwartz Case: http://www.rahul.net/jeffrey/ovs/
------------------------------
Date: 18 Jun 1998 21:50:45 GMT
From: cberry@cinenet.net (Craig Berry)
Subject: Re: How to scrub Ctl-Z?
Message-Id: <6mc23l$b8c$1@marina.cinenet.net>
Greg Carey (gacarey@domain.com) wrote:
: I started getting improper ^Z characters in my mainframe download files
: and the middleware pgm treats it as EOF. I am attempting to use Perl as
: a scrubber, but unfortunately, Perl also sees it as EOF. Is there a way
: to substitute ^Z with a space? If so, how would I deal with the "real"
: EOF?
:
: I've tried:
: s/\x1A/" "/g
Do you really want to replace ctrl-Z with a doublequote character, a
space, and another doublequote character? That's what this would do.
What you probably mean is
s/\x1A/ /g
or
s/\cZ/ /g
or even
tr/\cZ/ /
As for Perl seeing it as an eof, I presume you're on a DOS-based system.
Try using binmode() to get into binary-read mode. After that, you should
slurp the entire file as a single operation into a scalar (undef $/) and
operate on that.
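Putting those pieces together, a minimal sketch: open the file, switch the
handle to binary mode, slurp it in one read, and translate each ctrl-Z to a
space. The function name scrub_ctrl_z is hypothetical, and the temporary
file exists only to make the sketch runnable as it stands.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file in binary mode and replace every ctrl-Z with a space.
sub scrub_ctrl_z {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!\n";
    binmode($in);                          # stop ^Z being treated as EOF
    my $data = do { local $/; <$in> };     # slurp: $/ is undef in this block
    close($in);
    $data =~ tr/\x1A/ /;
    return $data;
}

# Demonstration on a small temporary file containing a stray ctrl-Z.
my ($tmp, $file) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "abc\x1Adef\n" or die "print failed: $!\n";
close($tmp) or die "close failed: $!\n";

my $clean = scrub_ctrl_z($file);
print $clean;
```

With the file read this way, the "real" EOF is simply the end of the
slurped data.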
---------------------------------------------------------------------
| Craig Berry - cberry@cinenet.net
--*-- Home Page: http://www.cinenet.net/users/cberry/home.html
| Member of The HTML Writers Guild: http://www.hwg.org/
"Every man and every woman is a star."
------------------------------
Date: Thu, 18 Jun 1998 17:11:51 -0400
From: linberg@literacy.upenn.edu (Steve Linberg)
Subject: Re: How to sort alphanumeric string using "sort" function
Message-Id: <linberg-1806981711510001@projdirc.literacy.upenn.edu>
In article <35896405.62761A69@tandem.com>, Margaret Lee
<margaret.lee@tandem.com> wrote:
> Hi,
>
> In my program I have alphanumeric strings hash keys like the
> following:
> abc1d2, abc11d2, abc2d2, abc2e, def<0>, def<11>, def<1>, def<2>
>
> I would like to retrieve the keys in the following order:
> abc1d2, abc2d2, abc11d2, abc2e, def<0>, def<1>, def<2>, def<11>
>
> I have been using the following format and have been unsuccessful in
> generating the appropriate MY_SORT function to give the above
> result.
> foreach $name (sort MY_SORT keys %list)
> { ... }
>
> The sorted list that my MY_SORT generated thus far is
> abc1d2, abc11d2, abc2d2, abc2de, def<0>, def<11>, def<1> , def<2>
> which is not quite what I want.
>
> If you have any suggestion as to what MY_SORT should be, please
> let me know. Any suggestions will be greatly appreciated.
You're trying to do an alphabetic sort on data that is not purely
alphabetic. How about creating a hash that maps each key to the value you
would actually like to sort by, and sorting on those values? Then use the
keys in whatever application you want. How you are going to create those
values is another question. It looks like you're trying to interpret parts
of your strings numerically, although they are ASCII data. You'll need an
algorithm to convert them to whatever makes sense to you and will sort
correctly. Good luck!
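One possible conversion, offered only as a sketch. It assumes the wanted
order is "compare the strings with every run of digits blanked out, then
break ties by comparing the digit runs as numbers, left to right", which
is a guess that happens to reproduce the order you listed. The names
sort_key and by_shape_then_numbers are made up for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: precompute the sort key for one string.
# Returns [ original, shape, [numbers] ], where "shape" is the string
# with each run of digits replaced by '#'.
sub sort_key {
    my ($s) = @_;
    my @nums = $s =~ /(\d+)/g;
    (my $shape = $s) =~ s/\d+/#/g;
    return [ $s, $shape, \@nums ];
}

# Compare two precomputed keys: shape first (as text), then the
# numbers pairwise (as numbers).
sub by_shape_then_numbers {
    my ($x, $y) = @_;
    my $c = $x->[1] cmp $y->[1];
    return $c if $c;
    my ($nx, $ny) = ($x->[2], $y->[2]);
    for my $i (0 .. $#$nx) {    # equal shapes imply equal counts
        $c = $nx->[$i] <=> $ny->[$i];
        return $c if $c;
    }
    return 0;
}

my %list = map { $_ => 1 }
    qw(abc1d2 abc11d2 abc2d2 abc2e def<0> def<11> def<1> def<2>);

# Compute each key once, sort on it, then recover the original strings.
my @sorted = map  { $_->[0] }
             sort { by_shape_then_numbers($a, $b) }
             map  { sort_key($_) }
             keys %list;

print join(", ", @sorted), "\n";
# prints: abc1d2, abc2d2, abc11d2, abc2e, def<0>, def<1>, def<2>, def<11>
```

If your real rule is different (say, abc2e should come before abc11d2),
only sort_key and the comparison need to change; the surrounding
map/sort/map stays the same.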
_____________________________________________________________________
Steve Linberg National Center on Adult Literacy
Systems Programmer &c. University of Pennsylvania
linberg@literacy.upenn.edu http://www.literacyonline.org
------------------------------
Date: Thu, 18 Jun 1998 21:36:26 GMT
From: postmanager@my-dejanews.com
Subject: IIS4.0 won't run perl scripts which call external commands
Message-Id: <6mc18q$nni$1@nnrp1.dejanews.com>
I am having a problem running a Perl script which uses an rsh command. This
script worked fine in IIS3.0 but seems to fail in IIS4.0. I am able to get a
standard Perl script to run just fine, but the scripts seem to fail once
external commands are added to them. Has anyone else had this problem?
------------------------------
Date: Thu, 18 Jun 1998 17:08:14 -0400
From: linberg@literacy.upenn.edu (Steve Linberg)
Subject: Re: Making Life Easy - Templates, XSSI, Variables and such
Message-Id: <linberg-1806981708140001@projdirc.literacy.upenn.edu>
In article <6mbqn8$stf@sjx-ixn11.ix.netcom.com>, "Geoff Hudik"
<geoffhudik@cyberdude.com> wrote:
> I'm considering ways to make updating my site easier (before it gets too
> big). I'm thinking that I can use Perl and/or XSSI to do this. Say, for
> example, that I want to change colors on my site, but I don't want to go
> through and edit every individual web page. I could do the following:
>
> <!--#set var="color" value="#99CCCC" -->
>
> <td width="26%" bgcolor="<!--#echo var='color' -->">
>
> but that only works on one page. Is there a way to use some sort of
> "global" variable so I can change the color variable once and it will change
> every color attribute on my site? I thought I'd be clever and use a SSI to
> include a file full of nothing but variables, but this did not work.
>
> I'm basically just looking for ways to make updating my site easier. The
> color problem is just one example. I'd like to use a template, variables,
> Perl, whatever... so that I will have to change a value once, and only once.
>
> Is there any way to do this?
Yes, there are many ways to do this. Head over to CPAN and browse the
module list. There are more HTML/text compilers than you can shake a
stick at. Or you could write one of your own.
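If you do roll your own, a bare-bones version might look something like
the sketch below. It keeps the site-wide values in one hash and expands
the same <!--#echo var="..." --> markers from your example at build time.
The %site hash and the expand function are invented for this
illustration, not taken from any CPAN module:

```perl
use strict;
use warnings;

# Hypothetical site-wide settings: change a value here, once.
my %site = (
    color => '#99CCCC',
);

# Replace each <!--#echo var="name" --> (single or double quotes) with
# the matching value from the hash; unknown names expand to nothing.
sub expand {
    my ($text, $vars) = @_;
    $text =~ s{<!--#echo\s+var=(["'])(\w+)\1\s*-->}
              {exists $vars->{$2} ? $vars->{$2} : ''}ge;
    return $text;
}

my $page = q{<td width="26%" bgcolor="<!--#echo var='color' -->">};
print expand($page, \%site), "\n";
# prints: <td width="26%" bgcolor="#99CCCC">
```

Run every page through something like this before uploading and the
color lives in exactly one place.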
_____________________________________________________________________
Steve Linberg National Center on Adult Literacy
Systems Programmer &c. University of Pennsylvania
linberg@literacy.upenn.edu http://www.literacyonline.org
------------------------------
Date: 8 Mar 97 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 8 Mar 97)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I would not have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V8 Issue 2904
**************************************