[28927] in Perl-Users-Digest


Perl-Users Digest, Issue: 171 Volume: 11

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Feb 27 00:14:01 2007

Date: Mon, 26 Feb 2007 21:14:28 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Mon, 26 Feb 2007     Volume: 11 Number: 171

Today's topics:
        Automating Internet Explorer <g_m@remove-comcast.net>
    Re: Automating Internet Explorer <mark.clementsREMOVETHIS@wanadoo.fr>
    Re: Automating Internet Explorer <g_m@remove-comcast.net>
    Re: Automating Internet Explorer <g_m@remove-comcast.net>
    Re: Automating Internet Explorer <g_m@remove-comcast.net>
        Can I hack this perl thing ? <mkakkad@gmail.com>
    Re: Can I hack this perl thing ? <jurgenex@hotmail.com>
    Re: Can I hack this perl thing ? <wahab-mail@gmx.de>
    Re: Can I hack this perl thing ? <joe@inwap.com>
        carriage returns in HERE statements <bdalzell@qis.net>
    Re: carriage returns in HERE statements <spamtrap@dot-app.org>
    Re: carriage returns in HERE statements usenet@DavidFilmer.com
    Re: carriage returns in HERE statements <bdalzell@qis.net>
        DocumentHTML ? <g_m@remove-comcast.net>
    Re: DocumentHTML ? <1usa@llenroc.ude.invalid>
    Re: DocumentHTML ? <g_m@remove-comcast.net>
        Fussy Date::Parse... <DJStunks@gmail.com>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Sat, 24 Feb 2007 15:57:49 -0500
From: "~greg" <g_m@remove-comcast.net>
Subject: Automating Internet Explorer
Message-Id: <hZKdnbUd0eZXO33YnZ2dnUVZ_sqdnZ2d@comcast.com>

Hello,

I am trying to find a reliable leak-proof way to control Internet Explorer
by way of buttons (or whatever) in a parallel Tk window.

The script below is where it's at at the moment.
(Please forgive the learner's-comments,
I just clipped the script exactly as it currently is.)

At the moment the Tk button is supposed to switch the IE document
in and out of edit mode (via document.designMode = 'On' and 'Off'.)

And it works!

However, there seems to be a leak.
And my question is, how to stop it?
(I'd like to get this right, and not just seem to be right.)

The printout is this:

  BEGIN Loop
  TEST 1:
  TEST 2:
  TEST 3:
  TEST 4:
  TEST 5:
  TEST 6:
  BEGIN OnQuit event
  END OnQuit event
  END Loop
  BEGIN Test for leaks:
   Object=Win32::OLE=HASH(0x1db18b0) Class=DispHTMLDocument
   Object=Win32::OLE=HASH(0x1d99bc0) Class=IWebBrowser2
   Object=Win32::OLE=HASH(0x225530) Class=IWebBrowser2
  END test for leaks.
  BEGIN MyOleQuit
  END MyOleQuit

So the leak is that those 3 objects still exist
when the END block is executed.


I'm thinking that I'm supposed to call
  Win32::OLE->Uninitialize();
and/or
  Win32::OLE->FreeUnusedLibraries();

but I don't know,
and I don't know if FreeUnusedLibraries() has to be protected,
from the VB bug the kludgy way mentioned in the OLE doc
(--as quoted below after the __END__)

and I don't know where to call them, if they are the answer -- 
(-- in the 'OnQuit' event handler ?
 -- in 'MyOnQuit()' ?
 -- after the loop exit ?
 -- in an END block ?
)

Currently the script exits when the X is clicked in the IE window,
which sends the 'OnQuit' message to MyIEHandler(),
which then stops OLE from listening to IE events,
and destroys the Tk window.
So I'm doing all that in the event handler,
while the MyOleQuit() and loop drop-out do nothing.


I hope I've made it at least a little bit clear what I'm asking!


(Another question: every time I switch in and out
of edit-mode, the page seems to get refreshed.
Can that be stopped?)

~greg.

(note: I've seen something like this (ie, simultaneous Tk and OLE loops)
being done
(here:
D:\Us\Programming\Perl\Internet Explorer\Notes\Scripting iTunes with Perl - Part 2 at cyberrazor.htm
)
by using both Tk's MainLoop and OLE's MessageLoop,
like this:
  $tkWin->waitVariable(\$iTunes_OLE);
  MainLoop;
  Win32::OLE->MessageLoop();

and I think that might have some advantages
in terms of MainLoop and MessageLoop being well tweaked
so as not to take up too much cpu time when you're doing
other things. And I got it to work too, but it got
too confusing for me, and the script always hung
when I quit the IE.

In general, I don't know where to close things.
(as this post is now an example! :)

In particular, I don't know why $TkWindow->destroy();
can't be called just anywhere at all, completely
independent of the OLE automation, but it can't.
The only place I can put it so that Tk doesn't stay hanging
after I've closed IE is in the 'OnQuit' event-handler case.



# ------------------------------------------------------------------------
# the script: ...
# ------------------------------------------------------------------------

use strict;
$|=1;

my $TkWindow;
my $IEWindow;
my $Document;
my $Looping = 1;

# ------------------------------------------------------------------------
# IE

use Win32::OLE qw(EVENTS in);
#use Win32::OLE::Variant;

my $IE = Win32::OLE->new("InternetExplorer.Application", \&MyOleQuit)
|| die "Could not start Internet Explorer.Application\n";

$IE->{visible} = 1;

$IE->Navigate("http://www.google.com");

Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

sub MyIEHandler
{
  my ($obj,$event,@args) = @_;
  #print " Event triggered: $event\n";
  if ($event eq "DocumentComplete")
  {
    $IEWindow = shift @args;
    $Document = $IEWindow->{Document};
    #print "URL: " . $Document->URL . "\n";
  }
  elsif($event eq 'OnQuit')
  {
    print "BEGIN OnQuit event\n";
    Win32::OLE->WithEvents($IE); # stop trying to get messages from IE
    $TkWindow->destroy(); # end Tk
    $Looping = 0;
    print "END OnQuit event\n";
  }
}

# ------------------------------------------------------------------------
# Tk

use Tk;

$TkWindow = MainWindow->new;
$TkWindow->title("Control Box");
$TkWindow->Button(-text => 'TEST', -command => \&Test)->pack;

my $TestNumber=0;
sub Test
{
  print 'TEST ', ++$TestNumber, ': ';
  #if ($Document->title)
  #{
  #  print "The title is " . $Document->title;
  #}
  $Document->{designMode} =
    $Document->{designMode} eq 'On' ? 'Off' : 'On';
  # starts undef? In any case, not eq 'On'.
  print "\n";
}

# ------------------------------------------------------------------------
# Loop
print "BEGIN Loop\n";

while($Looping)
{
  # "A delay of 50 milliseconds typically is fine"
  Win32::Sleep(50);
  $TkWindow->update(); # process Tk messages
  Win32::Sleep(50);
  Win32::OLE->SpinMessageLoop(); # process IE messages
}

# ------------------------------------------------------------------------
# End

print "END Loop\n";

sub MyOleQuit
{
  # This really is After everything!
  print "BEGIN MyOleQuit\n";
  print "END MyOleQuit\n";
}

END
{
  print "BEGIN Test for leaks:\n";
  Win32::OLE->EnumAllObjects
  (
    sub
    {
      my $object = shift;
      my $class = Win32::OLE->QueryObjectType($object);
      $class = '?' if ! defined $class;
      printf " Object=%s Class=%s\n", $object, $class;
    }
  );
  print "END test for leaks.\n";
  # "The EnumAllObjects() method is primarily a debugging tool.
  # It can be used e.g. in an END block to check if all
  # external connections have been properly destroyed."
}


__END__

# "Win32::OLE->Uninitialize
#
# The Uninitialize() class method
# uninitializes the OLE subsystem.
# It also destroys the hidden top level window
# created by OLE for single threaded apartments.
# All OLE objects will become invalid after this call!
# It is possible to call the Initialize() class method again
# with a different apartment model
# after shutting down OLE with Uninitialize()."

# "Win32::OLE->FreeUnusedLibraries
#
# The FreeUnusedLibraries() class method
# unloads all unused OLE resources.
# These are the libraries of those classes of which
# all existing objects have been destroyed.
# The unloading of object libraries
# is really only important for long running processes
# that might instantiate a huge number of different objects
# over time.
# Be aware that objects implemented in Visual Basic
# have a buggy implementation of this functionality:
# They pretend to be unloadable
# while they are actually still running their cleanup code.
# Unloading the DLL at that moment
# typically produces an access violation.
# The probability for this problem can be reduced
# by calling the SpinMessageLoop() method
# and sleep()ing for a few seconds."



# microsoft DHTML reference:
# http://msdn.microsoft.com/library/default.asp?url=/workshop/author/dhtml/reference/dhtml_reference_entry.asp




------------------------------

Date: Sat, 24 Feb 2007 22:46:48 +0100
From: Mark Clements <mark.clementsREMOVETHIS@wanadoo.fr>
Subject: Re: Automating Internet Explorer
Message-Id: <45e0b23b$0$5065$ba4acef3@news.orange.fr>

~greg wrote:
> Hello,
> 
> I am trying to find a reliable leak-proof way to control Internet Explorer
> by way of buttons (or whatever) in a parallel Tk window.
> 
> The script below is where it's at at the moment.
> (Please forgive the learner's-comments,
> I just clipped the script exactly as it currently is.)
> 

Other than using Win32::OLE, you could look at

Win32::IE::Mechanize

or using Selenium to generate perl code that controls the browser.

Mark


------------------------------

Date: Sat, 24 Feb 2007 18:43:50 -0500
From: "~greg" <g_m@remove-comcast.net>
Subject: Re: Automating Internet Explorer
Message-Id: <tsadnWWaEbYrUH3YnZ2dnUVZ_q2pnZ2d@comcast.com>

> Other than using Win32::OLE, you could look at
>
> Win32::IE::Mechanize
>
> or using Selenium to generate perl code that controls the browser.
>
> Mark



Thanks,

"Selenium" seems to be both more, and less, than what I want.

Also, "Selenium uses JavaScript and Iframe" -- not just perl.

(Also, Selenium appears to have something or other to do with "Agile teams",
--which may, or may not, have something to do with The "Agile group",
--which Leonard Cohen had a very nasty run-in with, about a year or so ago.
And I am a great fan of Leonard Cohen. :)

As for IE::Mechanize, my version of Win32::IE::Mechanize is 0.009.
and the doc says:

  "This module tries to be a sort of drop-in replacement
  for the WWW::Mechanize manpage.
  It uses the Win32::OLE manpage to manipulate the Internet Explorer.
  Don't expect it to be like the mech in that the class
  is not derived from the user-agent class (like LWP).
  WARNING: This is a work in progress and my first priority
  will be to implement the WWW::Mechanize interface
  (which is still in full development). Where ever possible
   and needed I will also implement LWP::UserAgent methods
   that the mech inherits and will help make this thing useful."


I have been learning Mechanize better and better, and I will
be using it in this thing of mine. So, since IE::Mechanize
isn't really Mechanize yet, it would just be one more
unnecessary layer for me to have to learn.

~~
For what it's worth, my approach began
by using Dave Roth's "Win32 Perl Programming",
and Henry Wasserman's essay: "Automating Windows Applications with Win32::OLE", April 21, 2005,
at:
D:\Us\Programming\Perl\Internet Explorer\Notes\perl_com Automating Windows Applications with Win32OLE.htm

Wasserman ends his essay by mentioning the further evolution
of the idea in SAMIE, "Simple Automation Module For Internet Explorer",
here: http://samie.sourceforge.net/

And SAMIE may be exactly the wheel I'm trying to re-invent.
I don't know. But it's not available via ActiveState,
and the download includes exes (--which, I got the impression
from somewhere, aren't open-source, --which always bothers me.)

In any case, I am close enough to what I want
just by using Tk and Win32::OLE
(--which I think all the other ways use anyway)
that I'd prefer to stick with it.
I just need to fix the leak.

~greg












http://www.perl.com/pub/a/2005/04/21/win32ole.html





------------------------------

Date: Sat, 24 Feb 2007 18:56:33 -0500
From: "~greg" <g_m@remove-comcast.net>
Subject: Re: Automating Internet Explorer
Message-Id: <Z76dnX3-yK5YTX3YnZ2dnUVZ_qyjnZ2d@comcast.com>


Sorry about those local links!

Scripting iTunes with Perl - Part 2
is here:
http://cyberrazor.com/2006/07/06/scripting-itunes-with-perl-part-2/

And  Henry Wasserman's essay:
Automating Windows Applications with Win32::OLE
is here:
http://www.perl.com/pub/a/2005/04/21/win32ole.html 




------------------------------

Date: Sat, 24 Feb 2007 22:45:55 -0500
From: "~greg" <g_m@remove-comcast.net>
Subject: Re: Automating Internet Explorer
Message-Id: <adWdnXZ_uaPsm3zYnZ2dnUVZ_rSjnZ2d@comcast.com>



If anyone cares, I tried putting
  Win32::OLE->Uninitialize();
in MyOleQuit()
and (of course) got a "Deep recursion" error.

Then I tried putting it in my END block.
And then right after the loop exit.

Each of which got these error calls:

  BEGIN Test for leaks:
  Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
   Object=Win32::OLE=HASH(0x1db18f0) Class=?
  Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
   Object=Win32::OLE=HASH(0x1d99c1c) Class=?
  Win32::OLE(0.1707): GetOleObject() Not a Win32::OLE object at ...
   Object=Win32::OLE=HASH(0x225530) Class=?
  END test for leaks


Which is confusing because
why would
Win32::OLE->EnumAllObjects()
be enumerating objects
that aren't
Win32::OLE objects?

~~

Finally I read
Win32::OLE::NEWS - What's new in Win32::OLE
for the version (0.18) which I have.

It says, in effect, that since version 0.1007,
I should not have to worry about leaks.

Quote:

  "more robust global destruction of Win32::OLE objects

  The final destruction of Win32::OLE objects
  has always been somewhat fragile. The reason for this
  is that Perl doesn't honour reference counts during
  global destruction but destroys objects in seemingly
  random order. This can lead to leaked database connections
  or unterminated external objects. The only solution
  was to make all objects lexical and hope that no object
  would be trapped in a closure. Alternatively all objects
  could be explicitly set to undef, which doesn't work
  very well with exception handling.

  With version 0.1007 of Win32::OLE this problem should be gone:
  The module keeps a list of active Win32::OLE objects.
  It uses an END block to destroy all objects at program
  termination before the Perl's global destruction starts.
  Objects still existing at program termination
  are now destroyed in reverse order of creation.
  The effect is similar to explicitly calling
  Win32::OLE->Uninitialize() just prior to termination."


So I guess that the only thing for me to do
is to run the script a few thousand times
and watch ram usage.

If it goes up monotonically, then there's a problem.
Otherwise not.


~greg
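
The NEWS excerpt above contrasts reference-counted destruction with Perl's
global destruction. A minimal sketch of that difference, assuming a
hypothetical Tracked class as a stand-in for the Win32::OLE objects (no OLE
is involved, so it runs on any platform): explicitly undef'ing the handles
after the loop, in reverse order of creation, makes DESTROY run at a known
point. Whether this clears the three entries reported by EnumAllObjects in
the real script has not been verified here.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an OLE wrapper object; not part of Win32::OLE.
package Tracked;

our @destroyed;    # names, in the order DESTROY ran

sub new {
    my ($class, $name) = @_;
    return bless { name => $name }, $class;
}

sub DESTROY {
    my ($self) = @_;
    push @destroyed, $self->{name};
}

package main;

# File-scoped handles, like $IE, $IEWindow and $Document in the script.
my $ie       = Tracked->new('IE');
my $window   = Tracked->new('IEWindow');
my $document = Tracked->new('Document');

# ... the event loop would run here ...

# Release in reverse order of creation once the loop has exited.
# Each undef drops the last reference, so DESTROY runs immediately.
undef $document;
undef $window;
undef $ie;

print "destroyed: @Tracked::destroyed\n";
```

In the posted script the equivalent step would go right after the while
loop, before the END block runs its check.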




------------------------------

Date: 25 Feb 2007 22:48:28 -0800
From: "Mihir" <mkakkad@gmail.com>
Subject: Can I hack this perl thing ?
Message-Id: <1172472508.393332.264120@z35g2000cwz.googlegroups.com>

I am a beginner to perl. I have set up a page on an apache server
which has its addr like
http:// <name of server> :8088/cgi-bin/names.pl?id1=xx&id2=yy

This page contains a list of names of a few friends. This page is made
when a friend of mine registers in my guestbook. Now the question is
that this above address is displayed in the browser every time a friend
accesses their account. So he/she can see their own page but can a
friend of mine get to this page and somehow modify its contents and
see the list of all my friends that exist and show up when the xx
value of id1 or id2 change?

Can somebody please advise, so that I can know how secure this page of
mine is.....

Thank you for your time in advance ....


--
MK



------------------------------

Date: Mon, 26 Feb 2007 07:01:08 GMT
From: "Jürgen Exner" <jurgenex@hotmail.com>
Subject: Re: Can I hack this perl thing ?
Message-Id: <UAvEh.6204$iF.2495@trndny03>

Mihir wrote:
> I am a beginner to perl.

Irrelevant because your question has nothing at all to do with Perl.

> I have set up a page on an apache server
> which has its addr like
> http:// <name of server> :8088/cgi-bin/names.pl?id1=xx&id2=yy
>
> This page contains a list of names of a few friends. This page is made
> when a friend of mine registers in my guestbook. Now the question is
> that this above address is displayed in the browser every time a friend
> accesses their account. So he/she can see their own page but can a
> friend of mine get to this page and somehow modify its contents and
> see the list of all my friends that exist and show up when the xx
> value of id1 or id2 change?

Maybe, impossible to tell from your description. Do you authenticate your 
users?
And assign permissions accordingly?

> Can somebody please advise, so that I can know how secure this page of
> mine is.....

Without a thorough security analysis of your system, starting with the OS,
including the web server setup, and then last but not least your code it is
impossible to answer the question. A trivial test would be to just try it.
If you can get in as John Doe then you know it's not secure. Of course if
you can't get in that only means that _you_ weren't able to find a hole,
someone else might very well still be.

Just to give you an idea of the complexity: Professional software security
companies charge 6-digit sums to do a security analysis of medium-sized web 
applications.

Anyway, as I mentioned before: your question has nothing to do with Perl.

jue 




------------------------------

Date: Mon, 26 Feb 2007 10:27:27 +0100
From: Mirco Wahab <wahab-mail@gmx.de>
Subject: Re: Can I hack this perl thing ?
Message-Id: <eru9g3$rdk$1@mlucom4.urz.uni-halle.de>

Mihir wrote:
> I am a beginner to perl. I have set up a page on an apache server
> which has its addr like
> http:// <name of server> :8088/cgi-bin/names.pl?id1=xx&id2=yy
> 
> This page contains a list of names of a few friends. This page is made
> when a friend of mine registers in my guestbook. Now the question is
> that this above address is displayed in the browser every time a friend
> accesses their account. So he/she can see their own page but can a
> friend of mine get to this page and somehow modify its contents and
> see the list of all my friends that exist and show up when the xx
> value of id1 or id2 change?

I'd create a sha1-hash of "xx_yy", like

    ...
    use Digest::SHA1 qw(sha1_hex);
    ...
    my $friends_name = "xx";
    my $friends_email= "yy";
    my $newid = sha1_hex( $friends_name . '_' . $friends_email  );
    ...
    # now: $newid = "1df1f88fa38f0906cf09da207e1c4ae005a146bd";
    ...


gives then:

  http:// <name of server> :8088/cgi-bin/names.pl?id=1df1f88fa38f0906cf09da207e1c4ae005a146bd

or (with working /path_info/)

  http:// <name of server> :8088/cgi-bin/names.pl/1df1f88fa38f0906cf09da207e1c4ae005a146bd

Of course, the "ID" of your people will be this
key from now on. But nobody ever on earth will
be able to make guesses ;-)

Regards

M.
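
One caveat on the hash above: a plain SHA-1 of name and e-mail can be
recomputed by anyone who knows or guesses both values. A hedged sketch of a
keyed variant, using the HMAC functions from Digest::SHA; the secret and the
helper names make_id and id_is_valid are assumptions for illustration, and
an unguessable URL is still no substitute for authenticating users.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical server-side secret; in practice it would be loaded from a
# file outside the web root, never hard-coded or sent to the browser.
my $SECRET = 'replace-with-a-long-random-string';

# Derive an opaque ID that cannot be recomputed without the secret.
sub make_id {
    my ($name, $email) = @_;
    return hmac_sha256_hex($name . '_' . $email, $SECRET);
}

# Check that an ID taken from the URL really belongs to this name/e-mail.
sub id_is_valid {
    my ($id, $name, $email) = @_;
    return $id eq make_id($name, $email);
}

my $id = make_id('xx', 'yy');
print "id=$id\n";
print id_is_valid($id, 'xx', 'yy') ? "valid\n" : "invalid\n";
```

The ID would be used in the URL exactly as in the examples above; only the
way it is derived changes.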


------------------------------

Date: Mon, 26 Feb 2007 13:04:19 -0800
From: Joe Smith <joe@inwap.com>
Subject: Re: Can I hack this perl thing ?
Message-Id: <uqSdndlUYLHB1n7YnZ2dnUVZ_vCknZ2d@comcast.com>

Mihir wrote:

> http:// <name of server> :8088/cgi-bin/names.pl?id1=xx&id2=yy
>
> So he/she can see their own page but can a
> friend of mine get to this page and somehow modify its contents and
> see the list of all my friends that exist and show up when the xx
> value of id1 or id2 change?

If names.pl implements some sort of password scheme, then I expect that
the friend of yours won't be able to change anything.

If names.pl does not use passwords, then you are in deep doo-doo.
In that case, delete everything and start over.
	-Joe


------------------------------

Date: 26 Feb 2007 10:28:14 -0800
From: "bdz" <bdalzell@qis.net>
Subject: carriage returns in HERE statements
Message-Id: <1172514494.514501.320360@s48g2000cws.googlegroups.com>

I am running Ubuntu GNU/Linux Edgy.

I have a program that is using a here statement to print a message in
a terminal window.

sub info {
print << "HERE"
\~\~\~\~\~\~\~\~\~\~\~\~\n
Line 1 \n
Line 2 \n
Line 3\n Line 4\nLine 5 \n

\~\~\~\~\~\~\~\~\~\~\~\~\n
HERE
}

The message is printed out in this fashion

~~~~~~~~~~~~

Line 1

Line 2

Line 3
Line 4
Line5

~~~~~~~~~~~~

That is the carriage returns not marked by \n in the script are
printing to the
terminal as carriage returns giving me double spaced lines unless I
group all the lines on one line (as in 3, 4, and 5).

I did not have this problem when I was using HERE statements under my
previous (obscure non-windows) operating system. Can anyone illuminate
this matter?



------------------------------

Date: Mon, 26 Feb 2007 13:42:22 -0500
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: carriage returns in HERE statements
Message-Id: <m2wt24rca9.fsf@local.wv-www.com>

"bdz" <bdalzell@qis.net> writes:

> sub info {
> print << "HERE"
> \~\~\~\~\~\~\~\~\~\~\~\~\n
> Line 1 \n
> Line 2 \n
> Line 3\n Line 4\nLine 5 \n
>
> \~\~\~\~\~\~\~\~\~\~\~\~\n
> HERE
> }
>
> The message is printed out in this fashion
>
> ~~~~~~~~~~~~
>
> Line 1
>
> Line 2
>
> Line 3
> Line 4
> Line5
>
> ~~~~~~~~~~~~
>
> That is the carriage returns not marked by \n in the script are
> printing to the
> terminal as carriage returns giving me double spaced lines unless I
> group all the lines on one line (as in 3, 4, and 5).
>
> I did not have this problem

What problem? You're getting exactly the output you're asking for. Here
documents are multi-line by nature, and include the newlines. Since there's
a newline already there, at the end of each line, the extra \n's cause the
output to be double-spaced.

If your other OS required explicit \n's at the end of each line of a here
document, then you had a problem *there* - that's not normal. The behavior
you're seeing now is normal.
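
A minimal sketch of the point above, with made-up two-line strings rather
than the original message: it counts the newline characters in a here
document written without explicit \n and in one written with them.

```perl
use strict;
use warnings;

# Each source line of a here document already ends in a newline.
my $plain = <<"HERE";
Line 1
Line 2
HERE

# Adding \n in a double-quoted here document yields a second newline per line.
my $doubled = <<"HERE";
Line 1\n
Line 2\n
HERE

# Count the newline characters in each string.
my $plain_count   = () = $plain   =~ /\n/g;
my $doubled_count = () = $doubled =~ /\n/g;

print "plain: $plain_count newlines, doubled: $doubled_count newlines\n";
print $plain;
```

Dropping the \n escapes from the original sub should therefore give
single-spaced output.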

sherm--

-- 
Web Hosting by West Virginians, for West Virginians: http://wv-www.net
Cocoa programming in Perl: http://camelbones.sourceforge.net


------------------------------

Date: 26 Feb 2007 10:46:14 -0800
From: usenet@DavidFilmer.com
Subject: Re: carriage returns in HERE statements
Message-Id: <1172515574.021470.110520@h3g2000cwc.googlegroups.com>

On Feb 26, 10:28 am, "bdz" <bdalz...@qis.net> wrote:

> That is the carriage returns not marked by \n in the script are
> printing to the terminal as carriage returns

You typed the program output (and made three typoos in the process) -
please don't do that. cut-and-paste instead.

Your program is apparently producing the correct output (we don't know
for sure, since you typed the output instead of cut-and-paste).  When
you use a heredoc, every single character is included (including any
newlines that you put in the text by way of hitting your ENTER key).
If you terminate a line with a \n you get two newlines - one that you
included by hitting ENTER and one that you included by typing \n.

If you saw different behavior from another program it's probably
because $\ had been redefined.

--
The best way to get a good answer is to ask a good question.
David Filmer (http://DavidFilmer.com)



------------------------------

Date: 26 Feb 2007 16:36:55 -0800
From: "bdz" <bdalzell@qis.net>
Subject: Re: carriage returns in HERE statements
Message-Id: <1172536615.219446.81370@j27g2000cwj.googlegroups.com>

Thanks for the explanation. I am sorry I typed it, I was trying to
simplify. However knowing that the newlines that are in the text will
print to the terminal without having to put in a \n solves my problem.
Amigas do not respond that way when here documents are used in a
terminal but I am aiming for cross OS capability.

Thanks for your useful comments on my typing. I see the three
"typoos", forgetting the ; after "HERE" and  Line5 instead of Line 5
in the output and the code should have produced two blanks before the
final set of tildes.



------------------------------

Date: Mon, 26 Feb 2007 16:04:58 -0500
From: "~greg" <g_m@remove-comcast.net>
Subject: DocumentHTML ?
Message-Id: <pZWdnZ_VB-MT1n7YnZ2dnUVZ_oGlnZ2d@comcast.com>

I am trying to get an InternetExplorer.Application to print out
the whole HTML document as text,
from the <HTML> (or before) to the </HTML>.
(-so as to feed it to a TreeBuilder parse).


print $Document->Body->innerHTML works,
but returns only the body's innerHTML.

print $Document->Body->outterHTML,
and print $Document->DocumentHTML,
don't work.

The error is:
   Win32::OLE(0.1707) error 0x80020003: "Member not found"
   in METHOD/PROPERTYGET "" at ...


Any hints, please?

~greg


use strict;
$|=1;
my $IEWindow;
my $Document;
my $Looping = 1;
use Win32::OLE qw(EVENTS in);
my $IE = Win32::OLE->new("InternetExplorer.Application")
|| die "Could not start Internet Explorer.Application\n";
Win32::OLE->WithEvents($IE,\&MyIEHandler,"DWebBrowserEvents2");
sub MyIEHandler
{
  my ($obj,$event,@args) = @_;
  if ($event eq "DocumentComplete")
  {
    $IEWindow = shift @args;
    $Document = $IEWindow->{Document};
    #print $Document->DocumentHTML;    # doesn't work
    #print $Document->Body->outterHTML; # doesn't work
    print $Document->Body->innerHTML;   # works
  }
  elsif($event eq 'OnQuit')
  {
    Win32::OLE->WithEvents($IE);
    $Looping = 0;
  }
}

$IE->{visible} = 1;
$IE->Navigate("http://www.google.com");

while($Looping)
{
  Win32::Sleep(40);
  Win32::OLE->SpinMessageLoop();
} 




------------------------------

Date: Tue, 27 Feb 2007 01:07:45 GMT
From: "A. Sinan Unur" <1usa@llenroc.ude.invalid>
Subject: Re: DocumentHTML ?
Message-Id: <Xns98E3CCC373A43asu1cornelledu@127.0.0.1>

"~greg" <g_m@remove-comcast.net> wrote in news:pZWdnZ_VB-MT1n7YnZ2dnUVZ_oGlnZ2d@comcast.com:

> I am trying to get an InternetExplorer.Application to print out
> the whole HTML document as text,
> from the <HTML> (or before) to the </HTML>.
> (-so as to feed it to a TreeBuilder parse).
> 
> 
> print $Document->Body->innerHTML works,
> but returns only the body's innerHTML.
> 
> print $Document->Body->outterHTML,
> and print $Document->DocumentHTML,
> don't work.
> 
> The error is:
>    Win32::OLE(0.1707) error 0x80020003: "Member not found"
>    in METHOD/PROPERTYGET "" at ...
> 
> 
> Any hints, please?

Well, the first one would be to use

http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/

I have successfully used that module to do some really complicated
automated downloading of about 10 GB of HTML from various web sites
(sorry can't be more specific).

Note the comment at

http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent

> use strict;

use warnings; # do not leave it out.

#!/usr/bin/perl

use strict;
use warnings;


$|=1;
my $IEWindow;
my $Document;
my $Looping = 1;

use Win32::OLE qw(EVENTS in);

my $IE = Win32::OLE->new("InternetExplorer.Application")
    or die "Could not start Internet Explorer.Application\n";

Win32::OLE->WithEvents($IE, \&MyIEHandler, "DWebBrowserEvents2");

sub MyIEHandler {
    my ($obj, $event, @args) = @_;
    
    if ($event eq "DocumentComplete") {
        my $IEWindow = shift @args;
        print $IEWindow->Document->documentElement->{outerHTML};
    }
    elsif($event eq 'OnQuit') {
        Win32::OLE->WithEvents($IE);
        $Looping = 0;
    }
}

$IE->{visible} = 1;
$IE->Navigate("http://www.google.com");

while ($Looping) {
  Win32::Sleep(40);
  Win32::OLE->SpinMessageLoop();
} 

__END__

Sinan


------------------------------

Date: Mon, 26 Feb 2007 23:01:38 -0500
From: "~greg" <g_m@remove-comcast.net>
Subject: Re: DocumentHTML ?
Message-Id: <o-KdnS_QLYiwMH7YnZ2dnUVZ_qOpnZ2d@comcast.com>


"A. Sinan Unur" > wrote ...
> "~greg" > wrote ...
>> ...
>> Any hints, please?
>
> Well, the first one would be to use
>
> http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/
>
> I have successfully used that module to do some really complicated
> automated downloading of about 10 GB of HTML from various web sites
> (sorry can't be more specific).
>
> Note the comment at
>
> http://search.cpan.org/~abeltje/Win32-IE-Mechanize-0.009_17/lib/Win32/IE/Mechanize.pm#%24ie-%3Econtent
>
>> use strict;
>
> use warnings; # do not leave it out.




Thanks.

I do use Mechanize, and TreeBuilder, together, quite a bit.

But what I am really trying to do is to add value to my regular browser
(i.e, IE), --without having to write COM plug-ins
(or whatever they're called these days.)

~~

I don't know what you mean by "the comment" at the link
to cpan's Win32::IE::Mechanize,

but the DESCRIPTION of its current state is not at all encouraging
(---"Don't expect it to be like the mech in that the class is not derived
from the user-agent class (like LWP). WARNING: This is a work in progress ... ")

and the CAVEATS  (---"...This means that you may need
to set your security settings to a low and possibly unsafe level. ...")

sounds downright dire to me.

(Part of what I mean by adding value to IE is ADDING security, not subtracting it!)

~~~

But of course I use warnings!

You didn't see it in my snippet because I always run scripts
from within a text editor that has it on the command line:
  perl.exe -w -Mstrict ...

But I do want to thank you because you made me look
at the setup again, and it turns out that I'd had it as:
  perl.exe -w mstrict ...

- with small 'm' instead of capital 'M',
-- which is why I had to still use "use strict;"
in all my scripts!

And now I don't have to look at either one of them - "use strict;" or "use warnings;"
ever again!   :)


(Next I've got to figure out how to hide "$|=1;" )


~greg











------------------------------

Date: 26 Feb 2007 21:04:45 -0800
From: "DJ Stunks" <DJStunks@gmail.com>
Subject: Fussy Date::Parse...
Message-Id: <1172552685.042520.299990@v33g2000cwv.googlegroups.com>

All,

I love Date::Parse but it's struggling parsing what I consider to be a
pretty unambiguous date time that Date::Manip handles just fine...

C:\>perl -MDate::Parse \
-e "print str2time('2/26/2007 14:38:13 PM') ? 'y' : 'n'"
n

C:\>perl -MDate::Manip \
-e "print UnixDate('2/26/2007 14:38:13 PM','%s') ? 'y' : 'n'"
y

C:\>perl -MDate::Manip \
-e "print UnixDate('2/26/2007 14:38:13 PM','%m/%d/%Y %T')"
02/26/2007 14:38:13

I don't really have a question because Date::Manip is ok, but I needed
epoch time.

-jp
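
The string combines a 24-hour value (14) with a PM marker, which may be what
Date::Parse objects to; that is a guess, not something confirmed here. A
sketch of getting epoch time for this one layout with a regex and the core
Time::Local module. The helper name epoch_from_mdy and its optional UTC flag
are hypothetical, and the AM/PM marker is ignored when the hour is already
above 12.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal timegm);

# Convert 'M/D/YYYY HH:MM:SS [AM|PM]' to epoch seconds.
# Returns undef if the string does not match that layout.
sub epoch_from_mdy {
    my ($str, $utc) = @_;
    my ($mon, $day, $year, $hour, $min, $sec, $ampm) =
        $str =~ m{^\s*(\d{1,2})/(\d{1,2})/(\d{4})\s+(\d{1,2}):(\d{2}):(\d{2})\s*([AP]M)?\s*$}i
        or return;

    # Apply AM/PM only to 12-hour values; '14:38:13 PM' is already 24-hour.
    if (defined $ampm && $hour <= 12) {
        $hour  = 0  if $hour == 12;          # 12 AM -> 0; 12 PM -> 12 via the next line
        $hour += 12 if uc $ampm eq 'PM';
    }

    my @fields = ($sec, $min, $hour, $day, $mon - 1, $year);
    return $utc ? timegm(@fields) : timelocal(@fields);
}

print epoch_from_mdy('2/26/2007 14:38:13 PM'), "\n";      # local time zone
print epoch_from_mdy('2/26/2007 14:38:13 PM', 1), "\n";   # interpreted as UTC
```

Time::Local croaks on out-of-range fields (for example day 31 in February),
so untrusted input would need an eval or an extra range check.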



------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V11 Issue 171
**************************************
