[18757] in Perl-Users-Digest


Perl-Users Digest, Issue: 925 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu May 17 14:10:51 2001

Date: Thu, 17 May 2001 11:10:15 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <990123015-v10-i925@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Thu, 17 May 2001     Volume: 10 Number: 925

Today's topics:
    Re: i want to tokenize a string <peb@bms.umist.ac.uk>
    Re: i want to tokenize a string <tee_joker@yahoo.com>
    Re: i want to tokenize a string (Craig Berry)
    Re: Posting Guidelines for comp.lang.perl.misc ($Revisi (Mark Jason Dominus)
    Re: Posting Guidelines for comp.lang.perl.misc ($Revisi (Tad McClellan)
    Re: splitting strings (E.Chang)
    Re: splitting strings <steve_dob@totalise.co.uk>
    Re: splitting strings <julien.quint@imag.fr>
    Re: splitting strings <Peter.Dintelmann@dresdner-bank.com>
    Re: splitting strings (Tad McClellan)
    Re: splitting strings <joe+usenet@sunstarsys.com>
    Re: splitting strings <peb@bms.umist.ac.uk>
    Re: splitting strings (Anno Siegel)
    Re: splitting strings (Rafael Garcia-Suarez)
    Re: splitting strings (Anno Siegel)
        System calls in Windows NT <mtx064@coventry.ac.uk>
    Re: timeout after STDIN ? (Mark Jason Dominus)
    Re: transform html to xhtml <julien.quint@imag.fr>
    Re: URGENT: CGI program wanted! <uri@sysarch.com>
    Re: What's wrong with my scope? (Mark Jason Dominus)
    Re: word doc to txt <alexis.roda@si.urv.es>
    Re: word doc to txt (Tad McClellan)
    Re: word doc to txt <bart.lateur@skynet.be>
        Writing to Unix passwd <nabeards@wpi.edu>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 17 May 2001 16:09:48 +0100
From: Paul Boardman <peb@bms.umist.ac.uk>
Subject: Re: i want to tokenize a string
Message-Id: <3B03E9BC.DFB0D724@bms.umist.ac.uk>

Anno Siegel wrote:
> 
> According to Paul Boardman  <peb@bms.umist.ac.uk>:
> > Rafael Garcia-Suarez wrote:
<snip>

> >[...] "Use a pattern match
> > when you know what you want to keep. Use split when you know what you
> > want to throw away."
> >
> > how about something like:-
> >
> > $string = "mooo     ninja randy    dandy";
> > @results = $string =~ m/(\w+\s+)/g;
> 
> Fair enough, but the rule, as stated, doesn't decide this case.  We
> know both what to keep (whole words plus trailing spaces) and what
> to throw away (nothing, but we want to throw it away at the beginning
> of each word :).


He he!  I hadn't thought of it like that :-)

Paul


------------------------------

Date: Thu, 17 May 2001 23:24:27 +0700
From: "Tee" <tee_joker@yahoo.com>
Subject: Re: i want to tokenize a string
Message-Id: <9e0u0l$3f5$1@chatta.samart.co.th>

You can
"mooo"." "
like these
"nachoman" <nachoman@ehpt.com> wrote in message
news:1103_990094705@kspc377...
> if i have a string like
>
> "mooo     ninja randy    dandy"
>
> and use the split command
>
> i get
>
> "mooo"
> "ninja"
> "randy"
> "dandy"
>
> however i want to include the whitespaces up to the next word and get the
> output like this
>
> "mooo   "
> "ninja "
> "randy   "
> "dandy"
>
> any suggestions on how to do this?
>




------------------------------

Date: Thu, 17 May 2001 17:05:31 -0000
From: cberry@cinenet.net (Craig Berry)
Subject: Re: i want to tokenize a string
Message-Id: <tg816r220r1u4e@corp.supernews.com>

nachoman (nachoman@ehpt.com) wrote:
: if i have a string like
: "mooo     ninja randy    dandy"
[snip]
: however i want to include the whitespaces up to the next word and get the output like this
: "mooo   "
: "ninja "
: "randy   "
: "dandy"

  @tokens = $str =~ /\w+\s*/g;

-- 
   |   Craig Berry - http://www.cinenet.net/~cberry/
 --*--  "God becomes as we are that we may be as he is."
   |               - William Blake
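
A minimal sketch comparing the two patterns suggested in this thread,
using the sample string from the original question (variable names are
illustrative).  The \s+ version requires trailing whitespace, so it
drops a final word that has none; the \s* version keeps it.

```perl
use strict;
use warnings;

my $str = "mooo     ninja randy    dandy";

# \s+ needs at least one whitespace character after each word,
# so the last word ("dandy") is not returned.
my @with_plus = $str =~ /(\w+\s+)/g;

# \s* allows zero trailing whitespace, so every word is returned.
my @with_star = $str =~ /\w+\s*/g;

print scalar(@with_plus), " tokens with \\s+\n";
print scalar(@with_star), " tokens with \\s*\n";
print qq{"$_"\n} for @with_star;
```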


------------------------------

Date: Thu, 17 May 2001 17:08:44 GMT
From: mjd@plover.com (Mark Jason Dominus)
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 1.1 $)
Message-Id: <3b04059c.443f$91@news.op.net>

In article <slrn9g5r2n.nun.tadmc@tadmc26.august.net>,
Tad McClellan <tadmc@augustmail.com> wrote:
>Mark Jason Dominus <mjd@plover.com> wrote:
>>In article <slrn9g4433.90p.bernard.el-hagin@gdndev25.lido-tech>,
>>Bernard El-Hagin <bernard.el-hagin@lido-tech.net> wrote:
>>>You don't seem to understand that the 'must' means 'if you want expert
>>>advice from some of the best Perl programmers in the world you must'
>              ^^^^
>>If that's what it means, then it should be removed because it is wrong.
>
>
>I don't see that it is wrong. What is wrong about it?
>
>
>"some" implies "more than one", it does not imply "all".

'Some' has several meanings in English.  Consider:

        "If you want to get some diamonds, you must buy them from DeBeers."

There is no implication here of an alternate supply of diamonds which 
is not monopolized by DeBeers.  

This was my misunderstanding of Bernard's remark.

However, I think that from what you originally wrote, nobody could
reasonably be expected to understand what you meant.  You originally
said:

    "This section describes things that you *must* do before posting to
    clpmisc."

with no qualification whatsoever.  If what Bernard said above is what
you intended to say with this, I do not think people will understand
you correctly.  I suggest you change this section to read:

    "Many of the posters to this group will ignore your message unless
    you do these things:"

and omit the word 'must', which, unless qualified somehow, is wrong.


-- 
@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print


------------------------------

Date: Thu, 17 May 2001 12:33:59 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: Posting Guidelines for comp.lang.perl.misc ($Revision: 1.1 $)
Message-Id: <slrn9g7vbn.qcs.tadmc@tadmc26.august.net>

Mark Jason Dominus <mjd@plover.com> wrote:
>In article <slrn9g5r2n.nun.tadmc@tadmc26.august.net>,
>Tad McClellan <tadmc@augustmail.com> wrote:
>>Mark Jason Dominus <mjd@plover.com> wrote:
>>>In article <slrn9g4433.90p.bernard.el-hagin@gdndev25.lido-tech>,
>>>Bernard El-Hagin <bernard.el-hagin@lido-tech.net> wrote:
>>>>You don't seem to understand that the 'must' means 'if you want expert
>>>>advice from some of the best Perl programmers in the world you must'
>>              ^^^^
>>>If that's what it means, then it should be removed because it is wrong.
>>
>>
>>I don't see that it is wrong. What is wrong about it?


>'Some' has several meanings in English.  Consider:


>However, I think that from what you originally wrote, nobody could
>reasonably be expected to understand what you meant.  


Right. I'm pretty sure I have already admitted that somewhere.


>You originally
>said:
>
>    "This section describes things that you *must* do before posting to
>    clpmisc."
>
>with no qualification whatsoever.  If what Bernard said above is what
>you intended to say with this, I do not think people will understand
>you correctly.  I suggest you change this section to read:
>
>    "Many of the posters to this group will ignore your message unless
>    you do these things:"
>
>and omit the word 'must', which, unless qualified somehow, is wrong.


Gotcha.

But s/ignore/will not respond to/; 

I've become more sensitive to connotations of late  :-)


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: Thu, 17 May 2001 15:12:36 GMT
From: echang@netstorm.net (E.Chang)
Subject: Re: splitting strings
Message-Id: <Xns90A4727457BC5echangnetstormnet@207.106.93.86>

Christian Seeberger <cseeberg@sgi1.chemie.uni-hamburg.de> wrote in 
<3B03E3E5.535224BD@sgi1.chemie.uni-hamburg.de>:

> $string = "abcdefg";

 @array = split //, $string;

-- 
EBC


------------------------------

Date: Thu, 17 May 2001 15:13:24 GMT
From: "Stephen Dobinson" <steve_dob@totalise.co.uk>
Subject: Re: splitting strings
Message-Id: <oURM6.1007$hk3.165890@news1.cableinet.net>

> $string = "abcdefg";
> @array = split /magic reg exp/, $string;
>
> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ?? Is it possible to do it this

Drop the 'magic reg exp'

@array = split //, $string;

--
Stephen Dobinson
http://a2zcomms.freeshell.org/
Free unix shell account




------------------------------

Date: 17 May 2001 17:14:29 +0200
From: Julien Quint <julien.quint@imag.fr>
Subject: Re: splitting strings
Message-Id: <khvvgn0m2i2.fsf@imag.fr>

Christian Seeberger <cseeberg@sgi1.chemie.uni-hamburg.de> writes:

> Hi all !
> 
> I want to split a string, so that each letter of it is an element in an
> array. My idea is something like:
> 
> $string = "abcdefg";
> @array = split /magic reg exp/, $string;

The magic regexp is very simple:

	@array = split //, $string

perldoc -f split gives lots of examples. Please read it (also read perldoc
perlre).

-- 
Julien


------------------------------

Date: Thu, 17 May 2001 17:15:14 +0200
From: "Dr. Peter Dintelmann" <Peter.Dintelmann@dresdner-bank.com>
Subject: Re: splitting strings
Message-Id: <9e0pev$bpd7@news-1.bank.dresdner.net>

    Hi,

"Christian Seeberger" <cseeberg@sgi1.chemie.uni-hamburg.de> wrote in message
news:3B03E3E5.535224BD@sgi1.chemie.uni-hamburg.de...

> $string = "abcdefg";
> @array = split /magic reg exp/, $string;
>
> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ??

    //

> Is it possible to do it this
> way at all ??

    yepp.

            Peter





------------------------------

Date: Thu, 17 May 2001 10:27:02 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: splitting strings
Message-Id: <slrn9g7ntm.q33.tadmc@tadmc26.august.net>

Christian Seeberger <cseeberg@sgi1.chemie.uni-hamburg.de> wrote:
>
>I want to split a string, so that each letter of it is an element in an
>array. My idea is something like:
>
>$string = "abcdefg";
>@array = split /magic reg exp/, $string;
>
>@array should look like (a,b,c,d,e,f,g) after the split. What is the
>'magic reg exp' I need in the code above ??


Strangely enough, the answer to your question is:

   "nothing"

heh.


Split on the empty string:

   @array = split //, $string;


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas


------------------------------

Date: 17 May 2001 11:36:18 -0400
From: Joe Schaefer <joe+usenet@sunstarsys.com>
Subject: Re: splitting strings
Message-Id: <m3bsosxa19.fsf@mumonkan.sunstarsys.com>

Christian Seeberger <cseeberg@sgi1.chemie.uni-hamburg.de> writes:

> I want to split a string, so that each letter of it is an element in an
> array. My idea is something like:
> 
> $string = "abcdefg";
> @array = split /magic reg exp/, $string;

According to the documentation for split() in perlfunc,
your (non-magical) regexp must match a null (zero-length) string.  
Did you look there before posting?

> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ?? Is it possible to do it this
> way at all ?? Up to now I use a construct with substr() and suchlike,
> but I just have the feeling, that there is a better, more elegant way
> of doing this.

There is.  It is loosely discussed in the documentation 
in perlfunc and perlre, but be careful.  In general, there 
is a difference between "null pattern" and "only matches a null 
string", although split() does not distinguish between them.

-- 
Joe Schaefer     "Never put off until tomorrow that which can be done the day
                                       after tomorrow."
                                               --Mark Twain
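
A small sketch of the distinction drawn above between the "null
pattern" and a pattern that only matches a null string.  In an ordinary
match, an empty // reuses the last successfully matched pattern, while
split treats // as a genuinely empty pattern.  The strings and variable
names here are illustrative only.

```perl
use strict;
use warnings;

# A successful match; Perl remembers this pattern.
"abc" =~ /b/ or die "expected a match";

# The empty pattern in a plain match reuses /b/, so this fails on "xyz".
my $reused = ("xyz" =~ //) ? 1 : 0;

# split does not apply that rule: // splits between every character.
my @chars = split //, "xyz";

# (?:) is a non-empty pattern that matches the empty string.
my $explicit = ("xyz" =~ /(?:)/) ? 1 : 0;

print "empty pattern after /b/ matched: $reused\n";
print "split // gave: @chars\n";
print "(?:) matched: $explicit\n";
```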


------------------------------

Date: Thu, 17 May 2001 16:02:13 +0100
From: Paul Boardman <peb@bms.umist.ac.uk>
Subject: Re: splitting strings
Message-Id: <3B03E7F5.B5752E5D@bms.umist.ac.uk>

Christian Seeberger wrote:
> 
> Hi all !
> 
> I want to split a string, so that each letter of it is an element in an
> array. My idea is something like:
> 
> $string = "abcdefg";
> @array = split /magic reg exp/, $string;
> 
> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ?? Is it possible to do it this
> way at all ?? Up to now I use a construct with substr() and suchlike,
> but I just have the feeling, that there is a better, more elegant way of
> doing this.
> 

@array = split //, $string;

HTH

Paul


------------------------------

Date: 17 May 2001 15:50:36 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: splitting strings
Message-Id: <9e0s0c$n2u$1@mamenchi.zrz.TU-Berlin.DE>

According to Christian Seeberger  <cseeberg@sgi1.chemie.uni-hamburg.de>:
> Hi all !
> 
> I want to split a string, so that each letter of it is an element in an
> array. My idea is something like:
> 
> $string = "abcdefg";
> @array = split /magic reg exp/, $string;
> 
> @array should look like (a,b,c,d,e,f,g) after the split. What is the
> 'magic reg exp' I need in the code above ?? Is it possible to do it this
> way at all ?? Up to now I use a construct with substr() and suchlike,
> but I just have the feeling, that there is a better, more elegant way of
> doing this.

When you think you need a list of characters, there is often a more
elegant, or at least more "perlish"  way to solve the problem that
doesn't need this.  Thinking in arrays of characters is one of the
hallmarks of a C accent in Perl.

However, assuming this is not the case, the idiom is

    @array = split //, $string;

The pattern // matches an empty string, which means it matches
between any two characters.  So the string is split in the desired
way.  This is, by the way, mentioned in "perldoc -f split".

It would uncontestedly be the idiom if we didn't since, uh...
yesterday?, have Randal's Rule: "If you know what to keep, use a
regex.  If you know what to throw away, use split."  Or something.
Randal will set me right if I botched the wording.

"Know" should be read here as "know how to match".  We want to keep
single characters, and we know how to match them, so the contestant
is

    @array = $string =~ /./;

This doesn't look bad at all, and it's probably clearer than the
split once you've got used to matches in list context.  It benchmarks
a little slower on my machine, taking about 140% of the time of a
split.

Anno


------------------------------

Date: 17 May 2001 16:06:12 GMT
From: rgarciasuarez@free.fr (Rafael Garcia-Suarez)
Subject: Re: splitting strings
Message-Id: <slrn9g7tpv.tiq.rgarciasuarez@rafael.kazibao.net>

Anno Siegel wrote in comp.lang.perl.misc:
} According to Christian Seeberger  <cseeberg@sgi1.chemie.uni-hamburg.de>:
} > 
} > I want to split a string, so that each letter of it is an element in an
} > array.
[...snip...]
} 
} However, assuming this is not the case, the idiom is
} 
}     @array = split //, $string;
} 
} The pattern // matches an empty string, which means it matches
} between any two characters.  So the string is split in the desired
} way.  This is, by the way, mentioned in "perldoc -f split".

It's time to say that // matches an empty string *when used with split*.
As perlop says,

  m/PATTERN/cgimosx
    ...
    If the PATTERN evaluates to the empty string, the last
    successfully matched regular expression is used instead.

This specific behavior of // in split is not well documented.
I suspect that there is some cargo-cultism here.

Note : split '', $string also works, but I don't know why.

} It would uncontestedly be the idiom if we didn't since, uh...
} yesterday?, have Randal's Rule: "If you know what to keep, use a
} regex.  If you know what to throw away, use split."  Or something.
} Randal will set me right if I botched the wording.
} 
} "Know" should be read here as "know how to match".  We want to keep
} single characters, and we know how to match them, so the contestant
} is
} 
}     @array = $string =~ /./;
} 
} This doesn't look bad at all, and it's probably clearer than the
} split once you've got used to matches in list context.  It benchmarks
} a little slower on my machine, taking about 140% of the time of a
} split.

This doesn't work. You meant
     @array = $string =~ /./gs;

-- 
Rafael Garcia-Suarez / http://rgarciasuarez.free.fr/
$japh="Just another Perl hacker,\n";@j=split/(?= )/,$japh;for my $i
(0..3){*{(($x)=$j[3-$i]=~/\w+/g)[0]}=sub(@){print$j[$i]}}eval$japh;
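
A short sketch of the variants discussed in this thread, using the
original poster's sample string.  It shows why the match needs /g (and
/s if newlines should count as characters).  Variable names other than
$string are illustrative.

```perl
use strict;
use warnings;

my $string = "abcdefg";

# Split on the empty pattern: one element per character.
my @by_split = split //, $string;

# Match every character; /g returns all matches in list context.
my @by_match = $string =~ /./gs;

# Without /g and without capture groups, a successful match in list
# context returns the single-element list (1), not the characters.
my @no_g = $string =~ /./;

# Without /s, "." does not match a newline, so it is silently skipped.
my $with_newline = "a\nb";
my @no_s   = $with_newline =~ /./g;
my @with_s = $with_newline =~ /./gs;

print join(",", @by_split), "\n";    # prints a,b,c,d,e,f,g
print join(",", @by_match), "\n";    # prints a,b,c,d,e,f,g
print scalar(@no_s), " vs ", scalar(@with_s), " characters\n";
```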


------------------------------

Date: 17 May 2001 16:13:39 GMT
From: anno4000@lublin.zrz.tu-berlin.de (Anno Siegel)
Subject: Re: splitting strings
Message-Id: <9e0tbj$n2u$2@mamenchi.zrz.TU-Berlin.DE>

According to Rafael Garcia-Suarez <rgarciasuarez@free.fr>:
> Anno Siegel wrote in comp.lang.perl.misc:

> }     @array = $string =~ /./;

[...]

> This doesn't work. You meant
>      @array = $string =~ /./gs;

Yes.  I don't know where the /g went.  Didn't have /s.  Thanks.

Anno


------------------------------

Date: Thu, 17 May 2001 16:38:12 +0100
From: Dominic Hibbs <mtx064@coventry.ac.uk>
Subject: System calls in Windows NT
Message-Id: <Pine.OSF.3.91.1010517163258.4661E-100000@leofric>

In my UNIX script I accept a password without echo to the screen using 

system 'stty','-echo';
my $pass = <STDIN>;
system 'stty','echo';

When run on an NT machine, the perl interpreter (or the NT operating system)
complains about the two system lines, but the password is still accepted (and
echoed) and the program continues to work.

What is the Win NT equivalent to suppress echo?

TIA

-----------------------------------------------------
Dominic Hibbs (Senior Lecturer)
School of Maths and Information Sciences
Coventry University
Priory Street
Coventry
CV1 5FB
02476 631313 Ext 7063
-----------------------------------------------------



------------------------------

Date: Thu, 17 May 2001 17:28:23 GMT
From: mjd@plover.com (Mark Jason Dominus)
Subject: Re: timeout after STDIN ?
Message-Id: <3b040a2b.44ab$5d@news.op.net>

In article <slrn9g7de5.pk3.bernard.el-hagin@gdndev25.lido-tech>,
Bernard El-Hagin <bernard.el-hagin@lido-tech.net> wrote:
>alarm (10) or $input = <STDIN>;

That 'or' does not make any sense to me.  What is it doing there, and
why is it not a semicolon?
-- 
@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print
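
On the 'or' in question: alarm returns the number of seconds remaining
on any previously set timer, which is 0 when none was pending.  So the
right-hand side does run in the common case, but it would be skipped if
an earlier alarm were still active; a semicolon avoids that dependence.
Below is a sketch of the usual eval/alarm idiom described in
"perldoc -f alarm".  The helper name with_timeout is hypothetical, and
the sketch assumes a platform where alarm and SIGALRM are supported.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns the code's result, or undef if the timer fired first.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;
    my $ok = eval {
        # The "\n" stops die from appending file and line information.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $result = $code->();
        alarm 0;            # cancel the timer on success
        1;
    };
    my $err = $@;
    alarm 0;                # make sure no timer is left pending
    return $result if $ok;
    die $err unless $err eq "timeout\n";   # rethrow unrelated errors
    return;
}

# Interactive use would look like this:
#   my $input = with_timeout(10, sub { scalar <STDIN> });
#   print "timed out\n" unless defined $input;
```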


------------------------------

Date: 17 May 2001 17:13:07 +0200
From: Julien Quint <julien.quint@imag.fr>
Subject: Re: transform html to xhtml
Message-Id: <khvy9rwm2kc.fsf@imag.fr>

Philip Newton <pne-news-20010517@newton.digitalspace.net> writes:

> On Thu, 17 May 2001 14:36:24 +0200, "Marco Tölle" <marco@uni.de> wrote:
> 
> > Can anyone tell me how to transform HTML to XHTML using Perl ?
> > Is there a module ?
> 
> Not specifically. I suppose you could hack something together using
> HTML::TokeParser (or another HTML parser of your choice), however.

The best choice would be to use HTML Tidy, written by Dave Raggett. More info
at

	http://www.w3.org/People/Raggett/tidy/

I'm sorry that it is not a Perl solution, but it is a Good solution.

-- 
Julien


------------------------------

Date: Thu, 17 May 2001 17:58:13 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: URGENT: CGI program wanted!
Message-Id: <x7bsor97t6.fsf@home.sysarch.com>

>>>>> "WH" == Walter Hafner <hafner-usenet@ze.tu-muenchen.de> writes:

  WH> Uri Guttman <uri@sysarch.com> writes:
  >> that web site is off limits. you just marked yourself as a disciple of
  >> the matt wright school of bad perl programming. 

  WH> Care to give an explanation of your statement?

go to groups.google.com and search for his name in this group. his
scripts were written when he was in high school, are notorious for their
low quality, bugs and security holes, etc. so anyone recommending them
obviously does not know about decent perl and so their opinions on other
perl matters take on a decidedly lower ranking.

clear enough? matt's code stinks. anyone who associates themselves with
his code takes on that skunky scent.

uri

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture and Stem Development ------ http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info:     http://www.sysarch.com/perl/OOP_class.html


------------------------------

Date: Thu, 17 May 2001 17:21:49 GMT
From: mjd@plover.com (Mark Jason Dominus)
Subject: Re: What's wrong with my scope?
Message-Id: <3b0408a2.448e$6a@news.op.net>

In article <jkovgn01aed.fsf@myrtle.ukc.ac.uk>,
J.C.Posey <jcp@myrtle.ukc.ac.uk> wrote:
>> > 	$urls{$fields[1]} += [$fields[2], $fields[3], $fields[4], $fields[5]];
>
>Okay, I was under the impression to create a hash with an array you did:
>
>	my $hash{"a_key"} = ["a value", "another", $my_var];

You can. (Except for the 'my' here, which is wrong.)

But += means to do addition of numbers.  An array is not a number.

>> > for my $domain (sort keys %urls){
>> >     print $domain, @{ $urls->{$domain}}."\n";
>> 
>I was trying to create an array reference...hmmm, didn't quite get there.

If you do

        $hash{"a_key"} =  [ ... ];

that already creates an array reference and stores it in the hash.
[ ... ] is a reference to an array.  Then @{$hash{"a_key"}} recovers
the array.

Have you read the reference tutorial at 
        http://perl.plover.com/FAQs/References.html
?  You might find it helpful.

-- 
@P=split//,".URRUU\c8R";@d=split//,"\nrekcah xinU / lreP rehtona tsuJ";sub p{
@p{"r$p","u$p"}=(P,P);pipe"r$p","u$p";++$p;($q*=2)+=$f=!fork;map{$P=$P[$f^ord
($p{$_})&6];$p{$_}=/ ^$P/ix?$P:close$_}keys%p}p;p;p;p;p;map{$p{$_}=~/^[P.]/&&
close$_}%p;wait until$?;map{/^r/&&<$_>}%p;$_=$d[$q];sleep rand(2)if/\S/;print
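
A minimal sketch of the structure under discussion: a hash whose values
are array references, built with push rather than +=.  It assumes the
intent was to collect several records per key; the sample records and
field layout are made up for illustration.  Note also that
$urls->{$domain} refers to a scalar $urls, not to the hash %urls; the
hash element is $urls{$domain}.

```perl
use strict;
use warnings;

# Hypothetical input: field 1 is the domain, fields 2..5 are details.
my @records = (
    [ 'r1', 'example.com', '/index.html', 200, 1024, 'GET'  ],
    [ 'r2', 'example.org', '/about.html', 200, 2048, 'GET'  ],
    [ 'r3', 'example.com', '/form.cgi',   500, 512,  'POST' ],
);

my %urls;
for my $record (@records) {
    my @fields = @$record;
    # push autovivifies the array reference on first use of a key.
    push @{ $urls{ $fields[1] } }, [ @fields[2 .. 5] ];
}

for my $domain (sort keys %urls) {
    # @{ $urls{$domain} } recovers the array stored under that key.
    for my $hit (@{ $urls{$domain} }) {
        print "$domain: @$hit\n";
    }
}
```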


------------------------------

Date: Thu, 17 May 2001 17:20:44 +0200
From: Alexis Roda <alexis.roda@si.urv.es>
Subject: Re: word doc to txt
Message-Id: <3B03EC4C.9E90A47B@si.urv.es>

sven wrote:
> 
> Hi,
> 
> I want to extract all ascii strings in a microsoft word document. I am
> not interested in layout or anything, just in the text.
> 
> I did something like:
> 
> while ($line = <FileHandle>) {
>   $line =~ s/[^A-Za-z0-9]+/ /g;
>   ...
> }
> 
> Any suggestions ?

Not a Perl script, but take a look at http://www.wvWare.com


HTH
-- 
                                  ////
                                 (@ @)
---------------------------oOO----(_)----OOo------------------------
        Los pecados de los tres mundos desapareceran conmigo.
Alexis Roda - Universitat Rovira i Virgili - Reus, Tarragona (Spain)
--------------------------------------------------------------------


------------------------------

Date: Thu, 17 May 2001 10:31:43 -0400
From: tadmc@augustmail.com (Tad McClellan)
Subject: Re: word doc to txt
Message-Id: <slrn9g7o6f.q33.tadmc@tadmc26.august.net>

sven <huhusven@xs4all.nl> wrote:
>
>I want to extract all ascii strings in a microsoft word document. 


A MS Word document is a binary file.

If you are running perl on Windows, then have a look at:

   perldoc -f binmode


If you are running on *nix, then it gets even easier:

   man strings

"print the strings of printable characters in files"


-- 
    Tad McClellan                          SGML consulting
    tadmc@augustmail.com                   Perl programming
    Fort Worth, Texas
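
A rough Perl sketch of what strings(1) does, for cases where that tool
is not available.  The function name ascii_strings and the minimum
length of 4 are assumptions made for illustration.  It only finds runs
of single-byte printable ASCII, so text that Word stores in a two-byte
encoding would not be picked up this way.

```perl
use strict;
use warnings;

# Hypothetical helper: return every run of at least $min_len printable
# ASCII characters found in the binary string $data.
sub ascii_strings {
    my ($data, $min_len) = @_;
    $min_len = 4 unless defined $min_len;
    return $data =~ /([\x20-\x7E]{$min_len,})/g;
}

# Reading a real file would use binmode so no bytes are translated:
#   open(my $fh, '<', $path) or die "Can't open $path: $!";
#   binmode($fh);
#   my $data = do { local $/; <$fh> };

my $sample = "\x00\x01Hello, world\x00ab\x02text\xFF";
my @found  = ascii_strings($sample, 4);
print "$_\n" for @found;
```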


------------------------------

Date: Thu, 17 May 2001 16:40:41 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: word doc to txt
Message-Id: <osu7gtcfl3s31k5j6grfp2u2ppblo59nr3@4ax.com>

sven wrote:

>I want to extract all ascii strings in a microsoft word document. I am
>not interested in layout or anything, just in the text.

If you have Word on your PC, you can use Win32::OLE to tell Word to give
you the text contents from the file.

Otherwise, at least in theory, it ought to be possible to use
OLE::Storage or OLE::Storage_Lite to extract the contents. That's how
SpreadSheet::ParseExcel works, for Excel documents. But AFAIK this has
not been done yet, or at least: there's no equivalent module to the
Excel parsing module, for Word, available on CPAN.

-- 
	Bart.


------------------------------

Date: Thu, 17 May 2001 12:47:00 -0400
From: "Neil A. Beardsley" <nabeards@wpi.edu>
Subject: Writing to Unix passwd
Message-Id: <Pine.OSF.4.33.0105171244590.27644-100000@bert.WPI.EDU>

Hi,

I'm trying to use the Unix/Linux command passwd from a script.  I open a
pipe to it and try to print to it, but it doesn't work.  Are there any other
ways to do this?  Does anyone know of a module to perform this?

Here's the code:

open(PASSWD, "|passwd $username") || die "Can't fork to passwd: $!";
local $SIG{PIPE} = sub { die "Passwd pipe broke." };
print PASSWD "$password" || die "Can't write password.";
print PASSWD "$password" || die "Can't write password 2.";
close PASSWD || die "Failed to close: $!  $?";

Thanks for any help,
Neil
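
Two things stand out in the code above, sketched here under stated
assumptions.  First, || binds more tightly than the list operators, so
'print PASSWD "$password" || die ...' only dies when $password is
false, and 'close PASSWD || die ...' can never die; the low-precedence
'or' checks the call itself.  Second, the passwords are printed without
a trailing newline.  Separately, passwd on many systems reads from the
controlling terminal rather than standard input, in which case no pipe
will reach it; a pseudo-terminal approach such as the CPAN Expect
module is one commonly suggested route.  The sketch uses an in-memory
filehandle in place of the pipe so that it runs anywhere; $password is
a placeholder value.

```perl
use strict;
use warnings;

my $password = "s3cret";    # placeholder value for illustration
my $buffer   = '';

# An in-memory handle stands in for the pipe to passwd.
open(my $out, '>', \$buffer) or die "Can't open handle: $!";

# 'or' applies to the result of print, not to the string being printed.
print {$out} "$password\n" or die "Can't write password: $!";
print {$out} "$password\n" or die "Can't write password 2: $!";

# Parentheses plus 'or' make the check apply to close itself.
close($out) or die "Failed to close: $!  $?";

print $buffer;
```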



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 925
**************************************
