[31455] in Perl-Users-Digest
Perl-Users Digest, Issue: 2707 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Dec 5 06:09:36 2009
Date: Sat, 5 Dec 2009 03:09:04 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 5 Dec 2009 Volume: 11 Number: 2707
Today's topics:
"negative" regex matching? <seven.reeds@gmail.com>
Re: "negative" regex matching? <tadmc@seesig.invalid>
Re: "negative" regex matching? <justin.0912@purestblue.com>
Extract file properties from Office documents <marcusdau@googlemail.com>
Re: Extract file properties from Office documents <justin.0911@purestblue.com>
Re: Extract file properties from Office documents <marcusdau@googlemail.com>
Re: Extract file properties from Office documents <justin.0912@purestblue.com>
Re: Extract file properties from Office documents <newsojo@web.de>
Re: FAQ 3.4 How do I find which modules are installed o <justin.0911@purestblue.com>
Re: FAQ 3.4 How do I find which modules are installed o <brian.d.foy@gmail.com>
free redemption site <robin1@cnsp.com>
Must have modules <projecktzero@yahoo.com>
Want to judge some remote hosts online or not quickly o <hongyi.zhao@gmail.com>
Re: Want to judge some remote hosts online or not quick <Peter@PSDT.com>
Re: Want to judge some remote hosts online or not quick <smallpond@juno.com>
Re: Want to judge some remote hosts online or not quick <m@rtij.nl.invlalid>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Fri, 4 Dec 2009 14:50:59 -0800 (PST)
From: "seven.reeds" <seven.reeds@gmail.com>
Subject: "negative" regex matching?
Message-Id: <22a51860-1d32-4b8d-a17f-086d7d07c962@m26g2000yqb.googlegroups.com>
Hi,
I have a regex question. I have arbitrary text and I want to search
it for a set of terms/substrings. In the simple case of one term
it is easy to find the match(es) and then mark them up with HTML
"span" tags. My issue is with more than one term.
Here is an example to illustrate. If I have the string:
Sarah likes Johnny's cooking
and the single term: "john" then I can match and highlight the match
resulting in:
Sarah likes <span>John</span>ny's cooking
Now what if I have two terms: "Johnny" & "john" -- in that order? I
can easily let myself end up with (in sequence):
<apply Johnny match>
Sarah likes <span>Johnny</span>'s cooking
<apply john match>
Sarah likes <span><span>John</span>ny</span>'s cooking
Ok, so what I want is to be able to search for and mark each term in
the string as long as that term is not already in a "span" clause.
I've done some digging in Friedl's RegEx book but I'm not sure if I
know enough to know what I am looking for?
ideas?
------------------------------
Date: Fri, 04 Dec 2009 17:10:43 -0600
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: "negative" regex matching?
Message-Id: <slrnhhj5hk.ghp.tadmc@tadbox.sbcglobal.net>
seven.reeds <seven.reeds@gmail.com> wrote:
> My issue is with more than one term.
>
> Here is an example to illustrate. If I have the string:
>
> Sarah likes Johnny's cooking
>
> and the single term: "john" then I can match and highlight the match
> resulting in:
>
> Sarah likes <span>John</span>ny's cooking
>
> Now what if I have two terms: "Johnny" & "john" -- in that order? I
> can easily let myself end up with (in sequence):
^^^^^^^^^^^
If you do them all in one go, rather than "in sequence",
then the problem disappears!
> <apply Johnny match>
> Sarah likes <span>Johnny</span>'s cooking
> <apply john match>
> Sarah likes <span><span>John</span>ny</span>'s cooking
<apply both matches>
s{(Johnny|john)} {<span>$1</span>}gi;
> Ok, so what I want is to be able to search for and mark each term in
> the string as long as that term is not already in a "span" clause.
If you really need to process HTML, then use a module that understands
HTML rather than trying to kludge it with regular expressions.
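
The single-pass substitution above can be generalized to an arbitrary list of terms. The following is only a sketch, assuming plain text with no existing markup; the helper name highlight_terms is made up for illustration. It sorts the terms longest first so that "Johnny" is tried before "john", and escapes each term with quotemeta so that regex metacharacters in a term are matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): wrap every occurrence of any
# term in <span> tags in ONE pass, so text that has already been wrapped
# is never scanned again.
sub highlight_terms {
    my ($text, @terms) = @_;

    # Ignore undefined or empty terms; an empty alternative would match
    # at every position.
    @terms = grep { defined($_) && length($_) } @terms;
    return $text unless @terms;

    # Longest terms first, so "Johnny" wins over "john" at the same position.
    # quotemeta keeps characters such as "." literal.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } @terms;

    (my $marked = $text) =~ s{($alternation)}{<span>$1</span>}gi;
    return $marked;
}

print highlight_terms("Sarah likes Johnny's cooking", 'john', 'Johnny'), "\n";
```

Because the substitution runs once over the original string, the order in which the terms are supplied does not matter, and nested span tags cannot arise.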
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Sat, 5 Dec 2009 00:18:18 +0000
From: Justin C <justin.0912@purestblue.com>
Subject: Re: "negative" regex matching?
Message-Id: <aflou6-409.ln1@purestblue.com>
In article <22a51860-1d32-4b8d-a17f-086d7d07c962@m26g2000yqb.googlegroups.com>, seven.reeds wrote:
> Hi,
>
> I have a regex question. I have arbitrary text and I want to search
> it for a set of terms/substrings. In the simple case of one term
> it is easy to find the match(es) and then mark them up with HTML
> "span" tags. My issue is with more than one term.
>
> Here is an example to illustrate. If I have the string:
>
> Sarah likes Johnny's cooking
>
> and the single term: "john" then I can match and highlight the match
> resulting in:
>
> Sarah likes <span>John</span>ny's cooking
>
> Now what if I have two terms: "Johnny" & "john" -- in that order? I
> can easily let myself end up with (in sequence):
>
> <apply Johnny match>
> Sarah likes <span>Johnny</span>'s cooking
> <apply john match>
> Sarah likes <span><span>John</span>ny</span>'s cooking
>
> Ok, so what I want is to be able to search for and mark each term in
> the string as long as that term is not already in a "span" clause.
>
> I've done some digging in Friedl's RegEx book but I'm not sure if I
> know enough to know what I am looking for?
I think I understand what you are saying.
What you want is "or", and the pattern memory. "or", in this situation,
is most easily achieved with (x|y), and pattern memory with parentheses.
If you wrap from the beginning of the string up until the match in one
set, then from the match to the end of the string in another set, you can
then: print "$1<span>$2</span>$3\n";
Show us some code and we can give you a clue.
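
A minimal sketch of the three-capture idea described above, assuming the two terms from the original question; the helper name mark_first is made up for illustration. Note that this form marks only the first match in the string.

```perl
use strict;
use warnings;

# Hypothetical helper: mark the first occurrence of either term.
# $1 = text before the match (non-greedy, so the FIRST match is found),
# $2 = the matched term,
# $3 = everything after the match.
sub mark_first {
    my ($text) = @_;
    if ($text =~ /\A(.*?)(Johnny|john)(.*)\z/is) {
        return "$1<span>$2</span>$3";
    }
    return $text;    # no term found, return the text unchanged
}

print mark_first("Sarah likes Johnny's cooking"), "\n";
```

To mark every occurrence rather than just the first, a global substitution with s///g is probably the simpler tool.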
Justin.
--
Justin C, by the sea.
------------------------------
Date: Fri, 4 Dec 2009 04:53:33 -0800 (PST)
From: Marcus Dau <marcusdau@googlemail.com>
Subject: Extract file properties from Office documents
Message-Id: <522a16ca-5b8c-4fe8-87ea-657ac4f934f5@m3g2000yqf.googlegroups.com>
Hy!
I want to parse on a unix system (!!!) a file structure with many many
many office documents (doc, xls, ppt) and pdf documents. The goal is
to get all words stored in these documents.
For normal information in these documents I used catppt, xls2csv and
catdoc or wvText. But I cannot access the file properties (File ->
Properties [in German Datei -> Eigenschaften]) where I can find fields
like title or author.
Is there any Perl module, Perl script or something else that I can
use in my Perl script to read the whole document and the properties?
This might be 4 different modules, this doesn't matter.
I hope you can help me!
Thx in advance
marcus
------------------------------
Date: Fri, 04 Dec 2009 13:04:48 -0000
From: Justin C <justin.0911@purestblue.com>
Subject: Re: Extract file properties from Office documents
Message-Id: <71a.4b1908f0.5becd@zem>
On 2009-12-04, Marcus Dau <marcusdau@googlemail.com> wrote:
> Hy!
>
> I want to parse on a unix system (!!!) a file structure with many many
> many office documents (doc, xls, ppt) and pdf documents. The goal is
> to get all words stored in these documents.
>
> For normal information in these documents i used catppt, xls2csv and
> catdoc or wvText. But i cannot access the file properties (File ->
> Properties [in German Datei -> Eigenschaften]) where i can find fields
> like title or author.
The module Spreadsheet::ParseExcel can reach at least some of the
fields that show in the properties dialogue, according to its
documentation. I've never needed that data from a spreadsheet so I can't
say for sure that it works, nor if it can access all the data you
require.
I don't know about the other file types.
Justin.
--
Justin C, by the sea.
------------------------------
Date: Fri, 4 Dec 2009 06:29:54 -0800 (PST)
From: Marcus Dau <marcusdau@googlemail.com>
Subject: Re: Extract file properties from Office documents
Message-Id: <338b4f90-7978-49e4-bcfd-018c074f0ecf@a32g2000yqm.googlegroups.com>
Where in the documentation can I find this feature???
But this is only Excel. I also need the other file types...
------------------------------
Date: Fri, 4 Dec 2009 16:28:34 +0000
From: Justin C <justin.0912@purestblue.com>
Subject: Re: Extract file properties from Office documents
Message-Id: <iupnu6-s1v.ln1@purestblue.com>
In article <338b4f90-7978-49e4-bcfd-018c074f0ecf@a32g2000yqm.googlegroups.com>, Marcus Dau wrote:
> Where in the documentation can I find this feature???
I can't find it in the documentation on CPAN, but it's in the documentation for the installed module... perhaps I have an old version and it's been removed?
Anyway, with the module installed, search the documentation for properties and you'll find it soon enough, under Workbook Properties:
A workbook object exposes a number of properties as shown below:
$workbook->{Worksheet }->[$index]
$workbook->{File}
$workbook->{Author}
$workbook->{Flg1904}
$workbook->{Version}
$workbook->{SheetCount}
$workbook->{PrintArea }->[$index]
$workbook->{PrintTitle}->[$index]
These properties are generally only of interest to advanced users.
Casual users can skip this section.
> But this is only Excel. I do need also the other filetypes...
You already said that. I replied with what I'm familiar with. Others may yet
reply with other solutions.
What suggestions did CPAN make?
Justin.
--
Justin C, by the sea.
------------------------------
Date: 04 Dec 2009 18:37:23 GMT
From: Oliver 'ojo' Bedford <newsojo@web.de>
Subject: Re: Extract file properties from Office documents
Message-Id: <4b1956e3$0$7622$9b4e6d93@newsspool1.arcor-online.net>
Am Fri, 04 Dec 2009 04:53:33 -0800 schrieb Marcus Dau:
> Hy!
>
> I want to parse on a unix system (!!!) a file structure with many many
> many office documents (doc, xls, ppt) and pdf documents. The goal is to
> get all words stored in these documents.
>
> For normal information in these documents i used catppt, xls2csv and
> catdoc or wvText. But i cannot access the file properties (File ->
> Properties [in German Datei -> Eigenschaften]) where i can find fields
> like title or author.
For pdf: PDF::API2
Oliver
------------------------------
Date: Fri, 04 Dec 2009 13:14:56 -0000
From: Justin C <justin.0911@purestblue.com>
Subject: Re: FAQ 3.4 How do I find which modules are installed on my system?
Message-Id: <891.4b190b50.ab9e0@zem>
On 2009-12-03, Justin C <justin.0911@purestblue.com> wrote:
> On 2009-12-02, brian d foy <brian.d.foy@gmail.com> wrote:
>>
>> It's the cpan-script or App::Cpan module. The one that comes with
>> CPAN.pm is old.
>
> OK, thanks, I'll look into that.
That's interesting, my mirror didn't know about this module yesterday,
but it does today. Are you writing FAQs to cover modules you're in the
process of writing?
Justin.
--
Justin C, by the sea.
------------------------------
Date: Fri, 04 Dec 2009 09:28:07 -0600
From: brian d foy <brian.d.foy@gmail.com>
Subject: Re: FAQ 3.4 How do I find which modules are installed on my system?
Message-Id: <041220090928071030%brian.d.foy@gmail.com>
In article <891.4b190b50.ab9e0@zem>, Justin C
<justin.0911@purestblue.com> wrote:
> On 2009-12-03, Justin C <justin.0911@purestblue.com> wrote:
> > On 2009-12-02, brian d foy <brian.d.foy@gmail.com> wrote:
> >>
> >> It's the cpan-script or App::Cpan module. The one that comes with
> >> CPAN.pm is old.
> That's interesting, my mirror didn't know about this module yesterday,
> but it does today. Are you writing FAQs to cover modules you're in the
> process of writing?
No, I'm fixing problems as I find them. App::Cpan hadn't indexed
correctly and now it does.
The FAQ is getting ready for Perl 5.12, and if you aren't using the
latest version of Perl, you might find that the latest version of the
FAQ doesn't match up with your older Perl. Life marches on.
------------------------------
Date: Fri, 4 Dec 2009 21:56:40 -0800 (PST)
From: Robin <robin1@cnsp.com>
Subject: free redemption site
Message-Id: <58651301-f701-4ae9-9c34-21b85a25df88@z41g2000yqz.googlegroups.com>
http://redemption.zxq.net/index.php?a
this rocks guys
-robin
------------------------------
Date: Fri, 4 Dec 2009 11:38:45 -0800 (PST)
From: projecktzero <projecktzero@yahoo.com>
Subject: Must have modules
Message-Id: <02b018f3-217b-4ec5-83b0-fbaabd98c4d2@d20g2000yqh.googlegroups.com>
Out of curiosity, what modules do you find yourself always installing
when you have a fresh install of Perl? I primarily do web development,
and I'd like to know which modules you web devs rely on. Is there a
list of the most popular modules on CPAN?
------------------------------
Date: Fri, 04 Dec 2009 21:28:38 +0800
From: Hongyi Zhao <hongyi.zhao@gmail.com>
Subject: Want to judge some remote hosts online or not quickly over WAN.
Message-Id: <sg3ih5568gbmgum85pkh2tb7un9g3rdcbf@4ax.com>
Hi all,
I want to quickly determine whether some remote hosts are online over a
WAN. I've learned that the ping command will not work in this case if the
ICMP echo reply is blocked locally by a firewall. Is it possible for me to
do this job with Perl code?
Best regards.
--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
------------------------------
Date: Fri, 04 Dec 2009 14:13:50 GMT
From: Peter Scott <Peter@PSDT.com>
Subject: Re: Want to judge some remote hosts online or not quickly over WAN.
Message-Id: <yO8Sm.36011$cX4.13416@newsfe10.iad>
On Fri, 04 Dec 2009 21:28:38 +0800, Hongyi Zhao wrote:
> I want to judge some remote hosts online or not quickly over WAN. I've
> learned that ping command will not work in this case if the icmp ack is
> blocked locally by firewall. Is it possible for me to do this job by
> perl codes?
Whatever it is that you or others will want to do with those remote hosts
if they are online, try that. If they're web servers, check port 80, etc.
For mail, port 25. Even if they were pingable, that wouldn't mean that the
desired services were running.
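
One way to try the service port directly, as suggested above, is a TCP connect with a timeout. The following is only a sketch using the core IO::Socket::INET module; the helper name port_is_open, the 3-second default timeout, and the host:port argument format are assumptions made for illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: returns 1 if a TCP connection to $host:$port
# succeeds within $timeout seconds, 0 otherwise.
sub port_is_open {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => defined $timeout ? $timeout : 3,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Targets are given on the command line as host:port pairs, for example:
#   perl check.pl www.example.com:80 mail.example.com:25
# (the file name and host names are placeholders)
for my $target (@ARGV) {
    my ($host, $port) = split /:/, $target, 2;
    next unless defined $port && $port =~ /\A\d+\z/;
    printf "%s:%d is %s\n", $host, $port,
        port_is_open($host, $port, 3) ? 'reachable' : 'unreachable';
}
```

A successful connect only shows that something is listening on the port; whether the service behind it answers correctly would need a protocol-level check.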
--
Peter Scott
http://www.perlmedic.com/
http://www.perldebugged.com/
http://www.informit.com/store/product.aspx?isbn=0137001274
------------------------------
Date: Fri, 4 Dec 2009 08:37:51 -0800 (PST)
From: smallpond <smallpond@juno.com>
Subject: Re: Want to judge some remote hosts online or not quickly over WAN.
Message-Id: <4e75f9de-2d96-4ba2-ae42-44056e78d439@f16g2000yqm.googlegroups.com>
On Dec 4, 8:28 am, Hongyi Zhao <hongyi.z...@gmail.com> wrote:
> Hi all,
>
> I want to judge some remote hosts online or not quickly over WAN. I've
> learned that ping command will not work in this case if the icmp ack
> is blocked locally by firewall. Is it possible for me to do this job
> by perl codes?
So the problem is that they are online but you aren't.
There is no valid reason to block ICMP packets.
------------------------------
Date: Fri, 4 Dec 2009 17:49:19 +0100
From: Martijn Lievaart <m@rtij.nl.invlalid>
Subject: Re: Want to judge some remote hosts online or not quickly over WAN.
Message-Id: <f5rnu6-9le.ln1@news.rtij.nl>
On Fri, 04 Dec 2009 21:28:38 +0800, Hongyi Zhao wrote:
> Hi all,
>
> I want to judge some remote hosts online or not quickly over WAN. I've
> learned that ping command will not work in this case if the icmp ack is
> blocked locally by firewall. Is it possible for me to do this job by
> perl codes?
Try tcping, tcptrace or hping.
M4
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
Back issues are available via anonymous ftp from
ftp://cil-www.oce.orst.edu/pub/perl/old-digests.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 2707
***************************************