[13219] in Perl-Users-Digest
Perl-Users Digest, Issue: 629 Volume: 9
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Aug 24 10:07:11 1999
Date: Tue, 24 Aug 1999 07:05:11 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Tue, 24 Aug 1999 Volume: 9 Number: 629
Today's topics:
Can't build perl with gcc <fx77@dial.pipex.com>
Catching writes to files in Modules <simon@profero.com>
cflow for perl??? <dougb@capecod.net>
Re: DBI, fetchrow_array and sorting (mysql) (Gary O'Keefe)
Desperately searching for perl lint (T. Alex Beamish)
Flash & Perl or CGI? (Kent Delcastillo)
Re: Help for the newbie (Gary O'Keefe)
I just don't get the damn hash thing.. <pdobbs@home.com>
Re: I just don't get the damn hash thing.. <jpeterson@office.colt.net>
Re: I just don't get the damn hash thing.. <hove@ido.phys.ntnu.no>
intallation on AIX 4.3.2 macisdvm@my-deja.com
Re: PERL EDITOR (Lars Gregersen)
Perl won't compile (gcc 2.8, AIX) <ccoving@uhc.com>
Re: POP is pooped! <flavell@mail.cern.ch>
Processing of the Authorization form (Status: 401) <karpat@eeh.ee.ethz.ch>
Re: Processing of the Authorization form (Status: 401) <gellyfish@gellyfish.com>
Re: Shamefully simple question. (N. Albers)
Re: sorting files randomly out of a list <garethr@cre.canon.co.uk>
Re: sorting files randomly out of a list <mmilovan@grolier.fr>
Re: spider - stripping useless words <contact@nativetongues.com>
split inside split - can it be done ? <assakhof@nospam.mimos.my>
Re: split inside split - can it be done ? <nospam.newton@gmx.net>
Re: User's Operative Sistem???? <rassmann@sdm.de>
Re: Why use Perl when we've got Python?! (Gary O'Keefe)
Re: xml::parser help (Arved Sandstrom)
Digest Administrivia (Last modified: 1 Jul 99) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Tue, 24 Aug 1999 14:14:00 +0100
From: Technical Services - UK Link <fx77@dial.pipex.com>
Subject: Can't build perl with gcc
Message-Id: <37C29A98.DE4C84B@dial.pipex.com>
I am trying to build perl (5.005_57) and mysql on a HP-UX 10.20 box.
Because I need to build mysql with gcc I am trying to build perl with
gcc as well.
I have done:
./Configure -desO -D cc='gcc'
make
'Configure' runs fine, 'make' runs for a bit and then I get the
following error:
`sh cflags libperl.a util.o` util.c
CCCMD = gcc -DPERL_CORE -c -D_HPUX_SOURCE -O
util.c: In function `Perl_cast_iv':
util.c:2548: parse error before `l'
*** Error exit code 1
Stop.
When I build perl with cc it works fine, but I can't compile mysql with
cc.
Any ideas???
Kees Vonk
------------------------------
Date: Tue, 24 Aug 1999 13:59:27 +0100
From: Simon Wistow <simon@profero.com>
Subject: Catching writes to files in Modules
Message-Id: <37C2972F.D87719C9@profero.com>
Is there a way to get a file retrieved by Net::FTP read into a variable
rather than written out to a file? I could try to hack the module so that it
accepted a filehandle instead of a filename and do it that way, but that isn't
exactly what you'd call portable.
I mean, is there a general way of doing this? There have been a few times when I
wished I could do this (the MIME modules for one).
Is there some clever trick you could do with named pipes? Although I'd prefer
not to do this.
Simon
--
Simon Wistow Development
simon@profero.com Profero Ltd
Phone : 0171 700 9960 Fax : 0171 700 9961
------------------------------
Date: Tue, 24 Aug 1999 10:06:29 -0500
From: Doug Blaisdell <dougb@capecod.net>
Subject: cflow for perl???
Message-Id: <37C2B4F5.5AF756F9@capecod.net>
Like cflow for C code--to show subroutine calling hierarchies and
filenames?
------------------------------
Date: Tue, 24 Aug 1999 11:55:06 GMT
From: gary@onegoodidea.com (Gary O'Keefe)
Subject: Re: DBI, fetchrow_array and sorting (mysql)
Message-Id: <37c2864c.12171070@news.hydro.co.uk>
A keyboard was whacked upside Michael Preminger's head and out came:
>Hei!
>
>I am using perl DBI against a mysql database, and fetching an ordered set of
>records from the database.
>
>I use the construct:
> $IndexedDocs=$dbh->prepare("select Document_inc_id from
>Term_indexes_Document where Term_inc_id=$old_n order by Document_inc_id
>");
> $IndexedDocs->execute;
> while (@doc_id=$IndexedDocs->fetchrow_array){
> $tmpTermEntry.="$doc_id[1] ";
> }
Are you sure you want to look at $doc_id[1]? $IndexedDocs only asks
for one column. Unless you've reassigned $[, you probably want to look
at $doc_id[0].
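To make the indexing point concrete, here is a minimal sketch that uses a plain array as a stand-in for one fetched row. The value 1017 is made up and no database is involved; with DBI the list would come from fetchrow_array instead.

```perl
use strict;
use warnings;

# Stand-in for one row from a one-column SELECT. With DBI the list
# would come from $sth->fetchrow_array (not called here).
my @doc_id = (1017);

my $count = scalar @doc_id;                    # one column selected => one element
my $first = $doc_id[0];                        # the Document_inc_id value
my $second_is_missing = !defined $doc_id[1];   # nothing lives at index 1

print "columns: $count, value: $first\n";
print "index 1 is empty\n" if $second_is_missing;
```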
>To my surprize the list I get has lost the sorting imposed by the select
>call. Trying to run the same call directly against the base retains the
>sorting.
>
>Question:
>
>Why do I lose the sorting the way I implement it, and how do I retain
>the sorting within DBI
DBI hands rows back in the order the database returns them, so ORDER BY
has always worked fine for me.
Gary
--
Gary O'Keefe
gary@onegoodidea.com
You know the score - my current employer has nothing to do with what I post
------------------------------
Date: Tue, 24 Aug 1999 13:37:06 GMT
From: talexb@tabsoft.on.ca (T. Alex Beamish)
Subject: Desperately searching for perl lint
Message-Id: <37c29f88.141743940@news1.on.sympatico.ca>
Hello,
My background (such as it is) in software development in C has taught
me that it's always a good idea to use good indentation practices,
lint your code and run it under a debugger to make absolutely sure
that you, the compiler and the machine code agree on what's supposed
to happen.
Since moving over to perl a year ago I've tried to follow the same
practices, but I do not see a perl lint anywhere. Before I attempt to
write one myself (in perl, naturally), is there a tool that anyone can
suggest?
Thanks so much.
T. Alex Beamish, Principal -- TAB Software
Toronto, Ontario -- www.tabsoft.on.ca
------------------------------
Date: 24 Aug 1999 13:07:52 GMT
From: delcasti@cs.fsu.edu (Kent Delcastillo)
Subject: Flash & Perl or CGI?
Message-Id: <7pu5f8$23f$1@news.fsu.edu>
Has anyone done anything trying to combine the two?
--
Kent Del Castillo
kent@kentd.com
------------------------------
Date: Tue, 24 Aug 1999 11:01:02 GMT
From: gary@onegoodidea.com (Gary O'Keefe)
Subject: Re: Help for the newbie
Message-Id: <37c275b4.7923291@news.hydro.co.uk>
A keyboard was whacked upside Tim Allen's head and out came:
>Is there a tutorial somewhere that can show me how to manipulate an sql
>database with a perl program?
>
>thanks
>
>tim
Check out the documentation that comes with the DBI module. It should
be exactly what you are looking for (it's pretty straightforward and
it corresponds to the procedures used in generating a dynamic SQL
request in Oracle's PL/SQL).
http://www.perl.com/CPAN/
Gary
--
Gary O'Keefe
gary@onegoodidea.com
You know the score - my current employer has nothing to do with what I post
------------------------------
Date: Tue, 24 Aug 1999 12:04:52 GMT
From: Paul Dobbs <pdobbs@home.com>
Subject: I just don't get the damn hash thing..
Message-Id: <37C28C7B.F3BE4056@home.com>
Not an emergency need to know, but I'm certain I'm missing out on
fun/simplified methods by not using any hashes in my little programs.
I just don't get the whole damn thing.. I've been reading Perl Annotated
Archives, the ActivePerl web site, this newsgroup.. I just don't seem to
get it though (or the foreach concept either).
If someone has an explanation that really made it clear for them and thinks
it will help here, please feel free to share.
For the time being, I'm stick'n to if, then and goto to get the job done
for what I want, and so far it works well. I just want to learn how to use
other commands. I've worked with Vic20, C64, C128, Quickbasic, VB and VBA
over the years to various degrees for personal interests, so I do have a good
base to build from. Just get stuck on the new math sometimes :-)
------------------------------
Date: Tue, 24 Aug 1999 13:20:33 GMT
From: Jon Peterson <jpeterson@office.colt.net>
Subject: Re: I just don't get the damn hash thing..
Message-Id: <B_ww3.262$u07.1896@news.colt.net>
Paul Dobbs <pdobbs@home.com> wrote:
> Not an emergency need to know, but I'm certain I'm missing out on
> fun/simplified methods by not using any hashes in my little programs.
Hmmm...
Imagine a hash being like a set of mailboxes in an apartment block. Each
mailbox has a name on it, and something inside. Let's say the apartment block
and the hash are called 'Maple_Mansions'. In Perl, a hash name always has a
% character.
%Maple_Mansions
Let us suppose someone lives there, and they are called Sally. You might
refer to their mailbox as "Sally, Maple_Mansions". Or, the other way round:
"Maple_Mansions, Sally". Perl does it with the hash name first and the
element name second. It looks like this in Perl:
$Maple_Mansions{Sally}
Wah?! You said hashes had a % character, and now it starts with a $ character?
Well, I said % represents a hash (an apartment block). But now we are talking
about a particular mailbox, not the whole block. A mailbox (in our analogy)
can hold a single letter, and a 'scalar' can hold a single value. We are
now talking about a particular mailbox: "Sally, Maple_Mansions", or, in perl,
a particular scalar: "$Maple_Mansions{Sally}"
As I said, this mailbox can hold a letter. A single value, such as a number or
a single string of text:
$Maple_Mansions{Sally} = "Dear Sally, please pay your gas bill.";
Let's have another person in the apartment block, too.
$Maple_Mansions{Eric} = "Hi Eric, how are you?";
So, %Maple_Mansions has two entities. Each entity (a mailbox, in our analogy)
has two interesting things. It has a label on it, and something inside it. In
Perl the label is the 'key' and the thing inside is the 'value'. Sally is a key
and "Dear Sally, please pay your gas bill." is a value.
Suppose you wanted to find out all the different keys for the hash
%Maple_Mansions? The Perl function 'keys' will do that for you:
print "All the people in Maple_Mansions: ";
print keys %Maple_Mansions;
The keys function returns a list, and each element of that list is a key in
the hash it is called on. 'keys %Maple_Mansions' returns a list of all the
keys in %Maple_Mansions, in no particular order.
Suppose you wanted to do something with the contents of every mailbox in
Maple_Mansions. Well, the foreach command lets you perform an action on each
element of a list. And keys returns a list of elements. Put them together
and...
foreach $person (keys %Maple_Mansions)
{
print "The person $person who lives at Maple_Mansions has a letter:";
print " The letter says...\n";
print $Maple_Mansions{$person};
}
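The pieces above can be put together into one complete script. This is only a sketch: running under strict and warnings and sorting the keys are additions made here so the output is the same on every run; they are not part of the explanation above.

```perl
use strict;
use warnings;

my %Maple_Mansions = (
    Sally => "Dear Sally, please pay your gas bill.",
    Eric  => "Hi Eric, how are you?",
);

# keys returns the names in no particular order, so sort them
# to get the same listing on every run.
my @people = sort keys %Maple_Mansions;

foreach my $person (@people) {
    print "$person has a letter: $Maple_Mansions{$person}\n";
}
```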
Hope that helps!
------------------------------
Date: 24 Aug 1999 15:37:56 +0200
From: Joakim Hove <hove@ido.phys.ntnu.no>
Subject: Re: I just don't get the damn hash thing..
Message-Id: <k0nvha5fgej.fsf@ido.phys.ntnu.no>
Well, first of all, in my opinion they are not damn hashes, on the
contrary they are very useful :)
My way to think of a hash is as a sort of generalised array, but instead
of using numbers to index the entries you use keywords as
indexes. Let's say you wanted to store personal information in a Perl
program; a way to do this with an array would be :
@person = ('John Smith',22,'Unemployed',2)
But, then you would have to remember that :
$person[0] = name of the person
$person[1] = age
$person[2] = Work status
$person[3] = # kids
quite inconvenient in my opinion. On the other hand, using a hash
instead you could write :
%person = (name => 'John Smith',
age => 22,
work => 'Unemployed',
kids => 2);
Observe that this can also be written in the more array-like way :
%person = ('name','John Smith','age',22,'work','Unemployed','kids',2);
However in this case you must have '' surrounding the keywords.
Then, when you needed to access the name of a person, you would have
it in $person{'name'}. Observe that :
1. A hash as a whole is prefixed with the "%" sign.
2. Accessing a particular value from a hash you use "$", ie
$person{'name'}. Similar to @ -> $ for arrays.
3. The key is enclosed in a { } pair, similar to the [ ] pair for
arrays.
This little program shows, in a very limited way, what you can achieve with
hashes:
#!/path/to/your/perl -w
%person = (name => 'John Smith',
age => 22,
kids => 2,
work => 'Programmer');
#
# We start with initialising the hash with a typical person.
#
print("These are the vitals of $person{'name'} : \n");
foreach $key (keys %person) {
print("$key -> $person{$key} \n");
}
#
# Using a foreach loop we print out all the information we have
# about this person. The foreach construction goes through all the
# elements in an array. But using "(keys %person)" we get an array
# of all the keys in the hash %person.
#
# Hence the foreach statement above is similar to the construction
#
# foreach $key ('name','age','kids','work')
#
# However - you should observe that although we have inserted the
# elements in the hash in a particular order - they will generally
# _not_ come out in that order.
#
print("\nWhat do you want to change about $person{'name'} => ");
chomp($key = <STDIN>);
#
# OK - now we allow the user to change the status of our person somewhat.
# The user can type in one of the keywords : name, age, kids, or work
#
# With the if (defined $person{$key}) test we check whether this particular
# keyword is indeed defined. If the user for instance enters "Car", because
# he wants to assign John Smith a Mercedes, we bail out.
if (defined $person{$key}) {
print("\nWhat is the new value for $key => ");
chomp($value = <STDIN>);
$person{$key} = $value;
#
# We have verified that the user has entered one of the valid keywords,
# then we ask for a new value. For instance John Smith might have got another
# kid, and after entering the keyword "kids" we can enter the new value 3.
#
# Finally we assign the new value with $person{$key} = $value, and print out
# the new personal information about john smith.
#
print("\nThese are the NEW vitals of $person{'name'} :\n");
foreach $key (keys %person) {
print("$key -> $person{$key} \n");
}
} else {
print("Sorry the hash has not defined any $key keyword :-( \n");
}
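A side note on the defined test used in the program: it works there because every value in %person is defined, but Perl also has exists, which asks whether the key is present at all. The sketch below shows the difference; the car and boat keys are made up for illustration.

```perl
use strict;
use warnings;

my %person = (name => 'John Smith', age => 22, car => undef);

# exists asks "is this key in the hash?", defined asks
# "does this key hold a value other than undef?"
my $car_exists  = exists  $person{car}  ? 1 : 0;   # key is present
my $car_defined = defined $person{car}  ? 1 : 0;   # but its value is undef
my $boat_exists = exists  $person{boat} ? 1 : 0;   # key was never stored

print "car: exists=$car_exists defined=$car_defined\n";
print "boat: exists=$boat_exists\n";
```

For a "is this a valid keyword" check, exists is usually the closer fit, since a key whose value happens to be undef would fail the defined test.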
HTH Joakim
--
=== Joakim Hove www.phys.ntnu.no/~hove/ ======================
# Institutt for fysikk (735) 93637 / 352 GF | Skøyensgate 10D #
# N - 7034 Trondheim hove@phys.ntnu.no | N - 7030 Trondheim #
=====================================================================
------------------------------
Date: Tue, 24 Aug 1999 11:21:49 GMT
From: macisdvm@my-deja.com
Subject: intallation on AIX 4.3.2
Message-Id: <7ptv85$pp6$1@nnrp1.deja.com>
Help. I am trying to install perl5.004_5 on my AIX
4.3.2 server. After running sh Configure, I try
to run make and get the following error
./miniperl -Ilib pod/pod2html.PL
make: 1254-059 The signal code from the last
command is 11.
Stop.
Any suggestions?
Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.
------------------------------
Date: Tue, 24 Aug 1999 11:30:03 GMT
From: lg@kt.dtu.dk (Lars Gregersen)
Subject: Re: PERL EDITOR
Message-Id: <37c281a4.97295322@news.dtu.dk>
On Thu, 19 Aug 1999 15:03:32 -0300, "Webmaster"
<webmaster@compre-ya.com> wrote:
>Does anyone of you know an good and nice editor for writting perl code?
Just because nobody else has recommended it I will: Try TextPad.
www.textpad.com
It's shareware. Other good shareware editors can be found at
www.editplus.com
www.ultraedit.com
Ultraedit has built in ftp which makes it rather nice when you have
remote script on a Unix computer that you want to edit and you don't
like to use vi or emacs.
Lars
------------------------------
Lars Gregersen (lg@kt.dtu.dk)
http://www.gbar.dtu.dk/~matlg
------------------------------
Date: Tue, 24 Aug 1999 08:08:20 -0500
From: Chris Covington <ccoving@uhc.com>
Subject: Perl won't compile (gcc 2.8, AIX)
Message-Id: <37C29944.9CA100E4@uhc.com>
I am trying to compile perl5.005_03 on AIX 4.3.2 using gcc 2.8.1.
Here are the commands that I use:
./Configure -Dcc=gcc -Dprefix=/usr/local/perl -Uinstallusrbinperl -des
# That completes OK
make
# The make command errors out. Here is where the error starts:
gcc -L/usr/local/lib -o miniperl miniperlmain.o libperl.a
-lnsl
-lgdbm -ldbm -ldl -lld -lm -lc -lcrypt -lbsd -lPW
./miniperl -w -Ilib -MExporter -e 0 || make minitest
/bin/sh: 73904 Illegal instruction(coredump)
rm -f lib/re.pm
cat ext/re/re.pm > lib/re.pm
You may see some irrelevant test failures if you have been unable
to build lib/Config.pm.
cd t && (rm -f perl; /usr/bin/ln -s ../miniperl perl) &&
./perl
TEST base/*.t comp/*.t cmd/*.t io/*.t op/*.t pragma/*.t
Also I noticed that doing the
./Configure gives a couple of "WHOA THERE"s for "the recommended value
for $d_setueuid on this machine is 'undef'" early on, but does not
cause the configure script to abort. Here is the real myconfig:
Summary of my perl5 (5.0 patchlevel 5 subversion 3) configuration:
Platform:
osname=aix, osvers=4.3.2.0, archname=aix
uname='aix yeti 3 4 000018538000 '
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef useperlio=undef d_sfio=undef
Compiler:
cc='gcc', optimize='-O', gccversion=2.8.1
cppflags='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE
-I/usr/local/include'
ccflags ='-D_ALL_SOURCE -D_ANSI_C_SOURCE -D_POSIX_SOURCE
-I/usr/local/include'
stdchar='unsigned char', d_stdstdio=define, usevfork=false
intsize=4, longsize=4, ptrsize=4, doublesize=8
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=8
alignbytes=8, usemymalloc=n, prototype=define
Linker and Libraries:
ld='ld', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib /usr/ccs/lib
libs=-lnsl -lgdbm -ldbm -ldl -lld -lm -lc -lcrypt -lbsd -lPW
libc=, so=a, useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_aix.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Xlinker
-bE:perl.exp'
cccdlflags='-fpic', lddlflags='-bhalt:4 -bM:SRE
-bI:$(PERL_INC)/perl.exp -bE:$(BASEEXT).exp -b noentry -lc
-L/usr/local/lib'
Any clues?
Chris Covington, ccoving@uhc.com
------------------------------
Date: Tue, 24 Aug 1999 13:28:49 +0200
From: "Alan J. Flavell" <flavell@mail.cern.ch>
Subject: Re: POP is pooped!
Message-Id: <Pine.HPP.3.95a.990824132509.26651F-100000@hpplus03.cern.ch>
On 24 Aug 1999, Sam Holden wrote:
> > flock(KEYS, 2);
>
> You should really use the human readable constants LOCK_SH, etc...
Yes, I was trying to make that point to a student recently, and then
noticed to my surprise that Merlyn's Web techniques articles are all
using numbers:
http://www.stonehenge.com/cgi/wtsearch?search=flock
Comments?
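For reference, a minimal sketch of the named-constant style being recommended. It locks a scratch file created with File::Temp rather than the KEYS handle from the quoted code, so it runs on its own; the constants come from the Fcntl module.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);          # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw(tempfile);   # scratch file so the sketch is self-contained

my ($fh, $filename) = tempfile(UNLINK => 1);

# LOCK_EX requests an exclusive lock; 2 is its usual numeric value,
# which is what the quoted flock(KEYS, 2) relies on.
my $locked = flock($fh, LOCK_EX) ? 1 : 0;
print {$fh} "some data\n" if $locked;
my $unlocked = flock($fh, LOCK_UN) ? 1 : 0;
close $fh or die "close failed: $!";

print "locked=$locked unlocked=$unlocked\n";
```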
------------------------------
Date: Tue, 24 Aug 1999 14:35:38 +0200
From: Andrei Karpatchev <karpat@eeh.ee.ethz.ch>
Subject: Processing of the Authorization form (Status: 401)
Message-Id: <37C2919A.88892E13@eeh.ee.ethz.ch>
Sorry, if this is wrong place for this question...
Can somebody tell me how I can process the data a user enters into the
authorisation dialog that appears when my Perl script sends "Status: 401",
like this:
print 'Status: 401 Unauthorized to access the document' ."\n";
print 'WWW-authenticate: Basic realm="foobar"' ."\n";
print 'Content-type: text/plain' ."\n\n";
print 'Unauthorised to access this document' ."\n";
Thanks.
------------------------------
Date: 24 Aug 1999 14:35:58 +0100
From: Jonathan Stowe <gellyfish@gellyfish.com>
Subject: Re: Processing of the Authorization form (Status: 401)
Message-Id: <37c29fbe_2@newsread3.dircon.co.uk>
Andrei Karpatchev <karpat@eeh.ee.ethz.ch> wrote:
> Sorry, if this is wrong place for this question...
>
> Can somebody tell me how I can process the data a user enters into the
> authorisation dialog that appears when my Perl script sends "Status: 401",
> like this:
>
> print 'Status: 401 Unauthorized to access the document' ."\n";
> print 'WWW-authenticate: Basic realm="foobar"' ."\n";
> print 'Content-type: text/plain' ."\n\n";
> print 'Unauthorised to access this document' ."\n";
>
Despite your using Perl to do this there is nothing Perl specific about
this question : I would see the relevant part of the CGI faq at :
<http://www.webthing.com/tutorials/cgifaq.3.html#11>
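As background to the question: after a 401 with a Basic challenge, the browser repeats the request with an Authorization header carrying the base64 of "user:password". Whether a CGI script ever sees that header depends on the server; many servers check the credentials themselves and only pass on REMOTE_USER. The sketch below therefore only shows the decoding step. parse_basic_auth is a hypothetical helper, and the sample credentials are the well-known example from the HTTP authentication RFC.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# parse_basic_auth is a hypothetical helper: it takes the value of an
# Authorization header and returns (user, password), or an empty list
# if the value is not a usable Basic credential.
sub parse_basic_auth {
    my ($header) = @_;
    return unless defined $header && $header =~ /^Basic\s+(\S+)\s*$/i;
    my $decoded = decode_base64($1);
    my ($user, $password) = split /:/, $decoded, 2;   # password may contain ':'
    return unless defined $password;
    return ($user, $password);
}

my ($user, $password) = parse_basic_auth('Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==');
print "user=$user password=$password\n";
```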
/J\
--
"Buzz Aldrin was the second man to walk on the moon and the first to
fill his pants" - Violet Berlin, The Big Bang
------------------------------
Date: 24 Aug 1999 11:12:44 GMT
From: nieka@dsv.nl (N. Albers)
Subject: Re: Shamefully simple question.
Message-Id: <7ptunc$r1e$1@enterprise.cistron.net>
In article <7ps16m$cv3$1@nnrp1.deja.com>
mrbog@my-deja.com wrote:
>
>
> Alright I'm ashamed to ask this, because I'm a somewhat experienced
> perl programmer, but I couldn't find this in the camel book (or the
> cookbook, or the advanced perl book, etc etc see I TOLD you I was
> fairly experienced!)
>
> All I want to do is, instead of a cgi constructing and returning a page
> to the user, I want it to push the user to a URL.
>
> That's it!
>
> Up to now whenever I'd need to do that, I'd like give them a page that
> either had a meta tag that pushed them where I wanted them to go, or
> I'd have something like <body onload="document.href='http:sfdsdf'">
>
> hehe! I'm so ashamed..
>
> -Mike
>
>
>
> Sent via Deja.com http://www.deja.com/
> Share what you know. Learn what you don't.
What about the redirect method from CGI.pm, where $query is your CGI object:
print $query->redirect('http://somewhere.else/in/movie/land');
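If CGI.pm is not to hand, the same effect can be sketched by printing the header directly: a CGI script that outputs a Location header followed by a blank line asks the server to redirect the client. redirect_header is a hypothetical helper name, and the URL is the placeholder from the line above.

```perl
use strict;
use warnings;

# redirect_header is a hypothetical helper that builds the CGI response
# for a redirect: a Location header followed by the blank line that
# ends the headers.
sub redirect_header {
    my ($url) = @_;
    return "Location: $url\n\n";
}

my $response = redirect_header('http://somewhere.else/in/movie/land');
print $response;
```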
------------------------------
Date: Tue, 24 Aug 1999 12:44:53 GMT
From: Gareth Rees <garethr@cre.canon.co.uk>
Subject: Re: sorting files randomly out of a list
Message-Id: <si9071wdoa.fsf@cre.canon.co.uk>
Gareth Rees wrote:
> Aside: is there a way to avoid the sort when computing the combination?
> In other words, is there an O(n) algorithm for generating a combination
> of k elements from 0..n-1?
David Cassell wrote:
> When this came up before [oh, a couple months ago], Larry Rosler
> suggested the equivalent of:
>
> my @p = 0 .. $n-1;
> map splice (@p, rand @p, 1), 1..$k;
This computes a permutation, not a combination (at least not by my
definition in the previous post, in which combinations are always in the
original order).
Note also that since `splice' is an O(n) operation, this algorithm costs
O(kn), which is O(n^2) when k grows in proportion to n. (Having said that,
since Larry's algorithm spends its time "in core" it runs faster than mine
for small n.)
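On the aside about avoiding the sort: one standard answer is selection sampling, which walks 0..n-1 once and keeps each index with probability (still needed)/(still remaining). This is a different technique from either algorithm discussed in the thread, offered as a sketch; the sub name combination simply mirrors the earlier post.

```perl
use strict;
use warnings;

# combination(k, n) returns k distinct indices from 0..n-1 in increasing
# order, chosen by selection sampling: one pass over 0..n-1, no sort.
sub combination {
    my ($k, $n) = @_;
    0 <= $k and $k <= $n or die "k=$k, n=$n does not satisfy 0<=k<=n";
    my @chosen;
    for my $i (0 .. $n - 1) {
        my $needed    = $k - @chosen;   # how many are still to be picked
        my $remaining = $n - $i;        # how many candidates are left
        last if $needed == 0;
        # Keep $i with probability needed/remaining.
        push @chosen, $i if rand($remaining) < $needed;
    }
    return @chosen;
}

my @sample = combination(5, 20);
print "@sample\n";
```

Because an index is always kept once the number needed equals the number remaining, the sub returns exactly k indices, already in their original order.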
--
Gareth Rees
------------------------------
Date: Tue, 24 Aug 1999 15:27:07 +0200
From: "m m m" <mmilovan@grolier.fr>
Subject: Re: sorting files randomly out of a list
Message-Id: <7pu6kf$sim$1@front6.grolier.fr>
>After defining
>
> # permutation(k,n) is a random permutation of k elements from 0..n-1.
> sub permutation {
> my ($k, $n) = @_;
> 0 <= $k and $k <= $n or die "k=$k, n=$n does not satisfy 0<=k<=n";
> my @p = (0 .. $n - 1);
> my $i = $k;
> while ($i-- > 0) {
> my $j = int rand $n--;
> @p[$n,$j] = @p[$j,$n];
> }
> return @p[-$k .. -1];
> }
>
> # combination(k,n) is a random combination of k elements from 0..n-1.
> sub combination { sort { $a <=> $b} permutation(@_) }
>
>you can write an expression like
>
> @files[combination(30, scalar @files)]
>
>to get a list of 30 random files from @files.
>Gareth Rees
Thanks a lot, it works perfectly well.
marko m milovanovic
------------------------------
Date: Tue, 24 Aug 1999 11:48:26 +0100
From: "Martin" <contact@nativetongues.com>
Subject: Re: spider - stripping useless words
Message-Id: <7pu2l6$lag$1@gxsn.com>
Oh dear, I want to cry...
I've done 90% of the work already, but thanks for the information. It's always
useful to have a way to check my approach to the problem, if
nothing else.
Mind you, having said that, I am also writing it to do a lot more than simply
reverse index the thing: there's vocabulary checking, subject relevance
checks and more, so I'm guessing there's still a lot that I would have had to
do anyway.
Thanks for the pointer though. :-)
Martin
Jon Peterson <jpeterson@office.colt.net> wrote in message
news:%Otw3.258$u07.1625@news.colt.net...
> Martin <contact@nativetongues.com> wrote:
> > Hi,
>
> > I'm writing a search engine spider which will be hosted on a site using only
> > around 80Mb of space and thinking ahead I'm trying to keep the resulting
> > database as compact as possible. Part of what the site does is reverse index
>
> > Thanks in advance if you can help or even point me in the right direction.
>
> There's a Search::InvertedIndex module on CPAN that might save you a whole
> ton of work. Last I looked it was very much a toolkit rather than an end
> to end solution, but no doubt very helpful to someone like you.
>
> If you are unaware of CPAN check www.perl.com/cpan and prepare to be happy.
>
------------------------------
Date: Tue, 24 Aug 1999 20:38:08 +0800
From: assakhof <assakhof@nospam.mimos.my>
Subject: split inside split - can it be done ?
Message-Id: <37C29230.9CEAAB63@nospam.mimos.my>
Hi,
This is a header line in my file:
#********** CUSTOMER[Customer Name-ID987] **********
I want to get
$cust = Customer Name and
$id = ID987
My idea is to use split inside split like this (but it does not work):
open (CUST, $custFile) or die "ERROR: Cannot open $custFile";
while (<CUST>) {
if (/CUSTOMER/) {
($cust,$id) = split(/-/,split(/]/, split (/[/, $_)[1])[0]);
print "$cust:$id\n";
}
}
Any idea how to get it in one line ?
Thanks in advance.
--assakhof
#note: to reply-me: remove nospam in my add
------------------------------
Date: Tue, 24 Aug 1999 15:00:49 +0200
From: "Philip 'Yes, that's my address' Newton" <nospam.newton@gmx.net>
Subject: Re: split inside split - can it be done ?
Message-Id: <37C29781.FB48A8AE@gmx.net>
assakhof wrote:
>
> #********** CUSTOMER[Customer Name-ID987] **********
[...]
> open (CUST, $custFile) or die "ERROR: Cannot open $custFile";
> while (<CUST>) {
> if (/CUSTOMER/) {
> ($cust,$id) = split(/-/,split(/]/, split (/[/, $_)[1])[0]);
> print "$cust:$id\n";
> }
> }
How about:
open (CUST, $custFile) or die "ERROR: Cannot open $custFile: $!";
while (<CUST>) {
if (/CUSTOMER/) {
($cust,$id) = /CUSTOMER\[([^-]*)-([^]]*)]/;
print "$cust:$id\n";
}
}
Using // in list context returns the contents of capturing parentheses.
Also, putting $! in the error message will tell you why the open failed.
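The list-context behaviour can be tried without a file. The sketch below runs the same pattern against the sample header line from the original post; the only change is that the closing brackets are escaped for readability.

```perl
use strict;
use warnings;

my $line = '#********** CUSTOMER[Customer Name-ID987] **********';

# In list context the match returns the two captured groups;
# if the line does not match, the list is empty and both stay undef.
my ($cust, $id) = $line =~ /CUSTOMER\[([^-]*)-([^\]]*)\]/;

print "$cust:$id\n" if defined $cust;
```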
Cheers,
Philip
------------------------------
Date: Tue, 24 Aug 1999 13:13:57 +0200
From: Thomas Rassmann <rassmann@sdm.de>
Subject: Re: User's Operative Sistem????
Message-Id: <37C27E75.76BFDB76@sdm.de>
Rodrigo Cortes wrote:
>
> How i know the user's Operative Sistem with a perl script???
>
You can also try querying environment variables like
$ENV{'OS'} # for Windows
$ENV{'OSTYPE'} # for some Unixes (especially Solaris)
but I think $^O is the best choice.
You can list the environment variables with `set` on the command line.
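A small sketch of both approaches side by side. Note that $^O describes the machine the script itself runs on, and that the two environment variables are optional, so the code falls back to a placeholder string when neither is set.

```perl
use strict;
use warnings;

# $^O holds the name of the operating system this perl was built for,
# for example "linux", "MSWin32" or "aix".
my $os = $^O;

# The environment variables are optional and may not be set at all.
my $os_env = defined $ENV{OS}     ? $ENV{OS}
           : defined $ENV{OSTYPE} ? $ENV{OSTYPE}
           :                        '(not set)';

print "perl reports: $os\n";
print "environment says: $os_env\n";
```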
Tom
--
Thomas Rassmann mailto:Thomas.Rassmann@sdm.de
sd&m AG http://www.sdm.de
software design & management
Thomas-Dehler-Str. 27, D-81737 Muenchen, Germany
Tel +49 89 63812-346 Fax -410
------------------------------
Date: Tue, 24 Aug 1999 12:40:17 GMT
From: gary@onegoodidea.com (Gary O'Keefe)
Subject: Re: Why use Perl when we've got Python?!
Message-Id: <37c28ac9.13320348@news.hydro.co.uk>
A keyboard was whacked upside Xah's head and out came:
[ a vast and rambling discourse snipped ]
>AppleScript only looks English, but does not have all the essential
>qualities of natural languages. Like Python and others, it's too consistent
>and ram some OOP or Object Models down your throat. Don't start me on Apple.
>When it comes to Apple, they always push for useless change. In the past
>years they degraded programers' social status with the oh--sooo-cute color
>iMacs and G3 towers, and in September they'll have this
>oh--sooo-cute-I'm-drooling iBook and AirPort technology probably semi-stolen
>from unix's RFC standards. (RFC is an abbreviation for Really Fucking
>Common)
IIRC, AirPort is licensed from Lucent.
Anyway... damn straight! Back to the bad old days of gnawing out the
holes in punched cards with your canines, 'cos the hole punching
machine costs another $500,000, fits into a small warehouse, uses
enough electricity to electrocute England and weighs more than the
Earth itself. Programming that got respect. Real spit and baling-wire
programming. None of this namby-pamby open-up-a-spreadsheet-record-
a-macro-and-knock-it-about-a-bit "programming" for us, eh?
How unfortunate that computers (and programming) have been brought to
the attention of the bungled and the botched by the Great Satan Apple
and their evil counterparts in Redmond. When the proletariat figure out
how easy all this stuff *really* is, then they're going to turn on us
and rip us to shreds like a pack of hungry wolves for charging them so
damned much all these years.
>As Larry has well said, Perl is a post-modern language. The language of the
>New Age era.
Pity all the hippies have got colostomy bags and Zimmer frames these
days. We're post-post-modern now, or hadn't you heard? (Turn up the
hearing aid... ;)
Gary
p.s. To the other poster (who I really cannot be bothered searching
for) who asked if the point of learning programming was just to earn
vast lumps of cold, hard cash at a F500 company: yes it is.
--
Gary O'Keefe
gary@onegoodidea.com
You know the score - my current employer has nothing to do with what I post
------------------------------
Date: Tue, 24 Aug 1999 10:11:08 -0300
From: Arved_37@chebucto.ns.ca (Arved Sandstrom)
Subject: Re: xml::parser help
Message-Id: <Arved_37-2408991011080001@dyip-8.chebucto.ns.ca>
In article <7prpi4$2nq$1@news2.vas-net.net>, "Chris Denman"
<chris@inta.net.uk> wrote:
> I have been playing with XML::Parser and have had some results setting up
> handlers, but the order I receive the data is wrong.
>
The order won't be wrong, but you may be acting on the wrong tags. I fired
up a quick script, a la
use XML::Parser;
my $char = "";
$p1 = new XML::Parser(
Handlers => {End => \&handle_end, Char => \&handle_char});
$p1->parsefile('testb.xml');
sub handle_end {
my ($xp, $end) = @_;
print "$end: $char\n";
$char = "";
}
sub handle_char {
my ($xp, $str) = @_;
$char .= $str;
}
and this returns Fred's data first, followed by Frank's, which is what you want.
Arved
------------------------------
Date: 1 Jul 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 1 Jul 99)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq" from
almanac@ruby.oce.orst.edu. The real FAQ, as it appeared last in the
newsgroup, can be retrieved with the request "send perl-users FAQ" from
almanac@ruby.oce.orst.edu. Due to their sizes, neither the Meta-FAQ nor
the FAQ is included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq" from
almanac@ruby.oce.orst.edu.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V9 Issue 629
*************************************