[10900] in Perl-Users-Digest

Perl-Users Digest, Issue: 4501 Volume: 8

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Dec 27 13:08:28 1998

Date: Sun, 27 Dec 98 10:00:19 -0800
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 27 Dec 1998     Volume: 8 Number: 4501

Today's topics:
        ANNOUNCE: HTML::FromText <garethr@cre.canon.co.uk>
        ANNOUNCE: Parse-Yapp-0.21, Christmas Release <desar@club-internet.fr>
        ANNOUNCE: Statistics::MaxEntropy v0.9 <terdoest@cs.utwente.nl>
        ANNOUNCE: v1998.1204 Squeeze.pm -- Shorten text to page (Jari Aalto+mail.perl)
    Re: Basic Perl DOS/Win95 + WWW + CGI course for Newbies <mlabor@sprintmail.com>
    Re: Get Title <gellyfish@btinternet.com>
    Re: get webpage with perl <gellyfish@btinternet.com>
        Java/Perl Tool Available as Open Source Software <silver@oreilly.com>
        Makepatch version 2.00 released (Johan Vromans)
    Re: mkdir and -p <tchrist@mox.perl.com>
        News::Newsrc 1.07 released (Steven W McDougall)
        Set::IntSpan 1.07 released (Steven W McDougall)
        Special: Digest Administrivia (Last modified: 12 Dec 98 (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 27 Dec 1998 17:02:23 GMT
From: Gareth Rees <garethr@cre.canon.co.uk>
Subject: ANNOUNCE: HTML::FromText
Message-Id: <765p6v$36o$1@play.inetarena.com>

See http://www.perl.com/CPAN/authors/id/G/GD/GDR/HTML-FromText-1.000.tar.gz

  NAME
	HTML::FromText - flexibly mark up plain text as HTML

  SYNOPSIS
	use HTML::FromText 'text2html';
	print text2html($text, paras => 1, urls => 1);

  DESCRIPTION
	The function `text2html' converts plain text to HTML.  It can
	apply the following transformations (each transformation is
	selected by passing the appropriate flag as an argument):

	* Turn HTML metacharacters into HTML entities.

	* Spot URLs and convert them to links.

	* Spot e-mail addresses and convert them to `mailto:' links.

	* Preserve line breaks.

	* Expand tabs and preserve spaces throughout the text.

	* Mark up words surrounded with *asterisks* as bold.

	* Mark up words surrounded with _underscores_ as underlined.

	* Format the text as paragraphs.

	* Spot paragraphs where every line begins with whitespace, and
	  mark them up as block quotes.

	* Spot bulleted paragraphs and mark them up as an unordered
          list.

	* Spot numbered paragraphs and mark them up as an ordered list.

	* Spot headings (paragraphs starting with numbers) and mark them
	  up as headings of the appropriate level.

	* Format the first paragraph of the text as a first-level
          heading.
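
The first two transformations in the list can be illustrated with a small
stand-alone sketch. It does not use HTML::FromText; `sketch_text2html` is a
hypothetical name, and the URL pattern is a deliberately simple assumption.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of HTML::FromText: escapes HTML
# metacharacters, then wraps http/https/ftp URLs in anchor tags.
sub sketch_text2html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $text =~ s/([&<>"])/$entity{$1}/g;
    # Link URLs after escaping, so the inserted tags are not escaped.
    $text =~ s{\b((?:https?|ftp)://[^\s<>"]+)}{<a href="$1">$1</a>}g;
    return $text;
}

print sketch_text2html("See http://www.perl.com/CPAN/ for <more> & more"), "\n";
```

Escaping before linking keeps the two steps independent, which is presumably
why a flag-per-transformation interface is workable.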

  INSTALLATION
	perl Makefile.PL && make && make test && make install

  BUGS
	* There are lots of transformations it doesn't do.

-- 
Gareth Rees




------------------------------

Date: 27 Dec 1998 17:01:48 GMT
From: Francois Desarmenien <desar@club-internet.fr>
Subject: ANNOUNCE: Parse-Yapp-0.21, Christmas Release
Message-Id: <765p5s$36e$1@play.inetarena.com>

I'm pleased to announce that Parse-Yapp-0.21 (Christmas Release) has
been uploaded to CPAN. It should be available soon on your nearest CPAN
mirror.

Note that beginning with version 0.20, Parse::Yapp is no longer alpha
software: it has been promoted to beta.

Enjoy and merry Christmas

François Désarménien
-----------------------------------------------------------------------------

Parse::Yapp - Yet Another Perl Parser compiler

Compiles yacc-like LALR grammars to generate Perl OO parser modules.

COPYRIGHT

(c) 1998 Francois Desarmenien, all rights reserved.
(see the Copyright section in Yapp.pm for usage and distribution rights)

IMPORTANT NOTES

THIS IS BETA SOFTWARE.

Though it has been tested a lot, there are probably bugs in it ;-)

The BETA status does not reflect the quality of the code, but the
possible changes in the generated parser modules.

I need FEEDBACK for every problem or bug you could encounter so I can
fix them in the next release. Comments are welcome too.
But I also need FEEDBACK if you use it and have it work fine so I can
step to production releases. Just drop me a mail.

The Parse::Yapp pod section is the main documentation and it assumes
you already have a good knowledge of yacc. If not, I suggest the GNU
Bison manual which is a very good tutorial to LALR parsing and yacc's
grammar syntax.

The documentation is only a draft and should be rewritten (I think).
Any help on this issue would be very welcome.

DESCRIPTION

This is the beta release 0.21 of the Parse::Yapp parser generator.

It lets you create Perl OO fully reentrant LALR(1) parser
modules (see the Yapp.pm pod pages for more details) and has
been designed to be functionally as close as possible to yacc,
but using the full power of Perl and open to enhancements.

REQUIREMENTS

Requires perl5.004 or better :)

It is written only in Perl, with standard distribution modules,
so you don't need a compiler or any special modules.

INSTALLATION

perl Makefile.PL
make
make test
make install

WARRANTY

This software comes with absolutely NO WARRANTY of any kind.
I just hope it can be useful.


FEEDBACK

Send feedback, comments and bug reports to:

Francois Desarmenien
desar@club-internet.fr





------------------------------

Date: 27 Dec 1998 17:03:39 GMT
From: Hugo ter Doest <terdoest@cs.utwente.nl>
Subject: ANNOUNCE: Statistics::MaxEntropy v0.9
Message-Id: <765p9b$374$1@play.inetarena.com>


CHANGES:

- Now has own support for sparse vectors, no longer requires
  Bit::Vector, and no longer supports it!

- Added support for the (Abney 1997) Newton estimation.

- Enumeration of event space no longer stored in memory.


README:

NAME
    MaxEntropy - Perl5 module for Maximum Entropy Modeling and
    Feature Induction

SYNOPSIS
      use Statistics::MaxEntropy;

      # debugging messages; default 0
      $Statistics::MaxEntropy::debug = 0;

      # maximum number of iterations for Newton's method; default 100
      $Statistics::MaxEntropy::NEWTON_max_it = 100;

      # minimal distance between new and old x for Newton's method;
      # default 0.001
      $Statistics::MaxEntropy::NEWTON_min = 0.001;

      # maximum number of iterations for IIS; default 100
      $Statistics::MaxEntropy::KL_max_it = 100;

      # minimal difference between successive KL divergences in IIS;
      # default 0.001
      $Statistics::MaxEntropy::KL_min = 0.001;

      # the size of Monte Carlo samples; default 1000
      $Statistics::MaxEntropy::SAMPLE_size = 1000;

      # creation of a new event space from an events file
      $events = Statistics::MaxEntropy::new($file);

      # Generalised Iterative Scaling, "corpus" means no sampling
      $events->scale("corpus", "gis");

      # Improved Iterative Scaling, "mc" means Monte Carlo sampling
      $events->scale("mc", "iis");

      # Feature Induction algorithm, also see Statistics::Candidates POD
      $candidates = Statistics::Candidates->new($candidates_file);
      $events->fi("iis", $candidates, $nr_to_add, "mc");

      # writing new events, candidates, and parameters files
      $events->write($some_other_file);
      $events->write_parameters($file);
      $events->write_parameters_with_names($file);

      # dump/undump the event space to/from a file
      $events->dump($file);
      $events->undump($file);

DESCRIPTION
    This module is an implementation of the Generalised and Improved
    Iterative Scaling (GIS, IIS) algorithms and the Feature
    Induction (FI) algorithm as defined in (Darroch and Ratcliff
    1972) and (Della Pietra et al. 1997). The purpose of the scaling
    algorithms is to find the maximum entropy distribution given a
    set of events and (optionally) an initial distribution. Also a
    set of candidate features may be specified; then the FI
    algorithm may be applied to find and add the candidate
    feature(s) that give the largest `gain' in terms of
    Kullback-Leibler divergence when added to the current set of
    features.

    Events are specified in terms of a set of feature functions
    (properties) f_1...f_k that map each event to {0,1}: an event is
    a string of bits. In addition, the frequency of each event is
    given. We assume the event space to have a probability
    distribution that can be described by

    p(x) = 1/Z e^{sum_i alpha_i f_i(x)}
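
    As a rough illustration of this formula, the sketch below enumerates
    every bit-string event for three features and normalises the weights.
    The alpha values are made up, and features are indexed left to right
    here for simplicity (the module itself indexes from the right); none
    of this is the module's own code.

```perl
use strict;
use warnings;

# Illustrative values only: three features with made-up weights alpha_i.
my @alpha = (0.5, -1.0, 0.25);

# Unnormalised weight exp(sum_i alpha_i * f_i(x)) for a bit-string event.
sub weight {
    my ($bits) = @_;
    my @f = split //, $bits;
    my $sum = 0;
    $sum += $alpha[$_] * $f[$_] for 0 .. $#alpha;
    return exp($sum);
}

# Enumerate all 2^k events to obtain the normalising constant Z.
my $k = scalar @alpha;
my @events = map { sprintf "%0${k}b", $_ } 0 .. 2**$k - 1;
my $Z = 0;
$Z += weight($_) for @events;

# p(x) = weight(x) / Z
sub p { return weight($_[0]) / $Z }

printf "p(%s) = %.4f\n", $_, p($_) for @events;
```

    Full enumeration is only feasible for a handful of features, which is
    presumably why the module also offers "corpus" and "mc" sampling.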

    The module requires the `Bit::SparseVector' module by Steffen
    Beyer and the `Data::Dumper' module by Gurusamy Sarathy. Both
    can be obtained from CPAN just like this module.

  CONFIGURATION VARIABLES

    `$Statistics::MaxEntropy::debug'
        If set to `1', lots of debug information, and intermediate
        results will be output. Default: `0'

    `$Statistics::MaxEntropy::NEWTON_max_it'
        Sets the maximum number of iterations in Newton's method.
        Newton's method is applied to find the new parameters
        \alpha_i of the features `f_i'. Default: `100'.

    `$Statistics::MaxEntropy::NEWTON_min'
        Sets the minimum difference between x' and x in Newton's
        method (used for computing parameter updates in IIS); if
        either the maximum number of iterations is reached or the
        difference between x' and x is small enough, the iteration
        is stopped. Default: `0.001'. Sometimes features have
        Infinity or -Infinity as a solution; these features are
        excluded from future iterations.

    `$Statistics::MaxEntropy::KL_max_it'
        Sets the maximum number of iterations applied in the IIS
        algorithm. Default: `100'.

    `$Statistics::MaxEntropy::KL_min'
        Sets the minimum difference between KL divergences of two
        distributions in the IIS algorithm; if either the maximum
        number of iterations is reached or the difference between
        the divergences is small enough, the iteration is stopped.
        Default: `0.001'.

    `$Statistics::MaxEntropy::SAMPLE_size'
        Determines the number of (unique) events a sample should
        contain. Only makes sense if for sampling "mc" is selected
        (see below). Its default is `1000'.

  METHODS

    `new'
         $events = Statistics::MaxEntropy::new($events_file);

        A new event space is created, and the events are read from
        `$events_file'. The events file is required; its syntax is
        described in the section on "FILE SYNTAX".

    `write'
         $events->write($file);

        Writes the events to a file. Its syntax is described in the
        section on "FILE SYNTAX".

    `scale'
         $events->scale($sample, $scaler);

        If `$scaler' equals `"gis"', the Generalised Iterative
        Scaling algorithm (Darroch and Ratcliff 1972) is applied on
        the event space; if `$scaler' equals `"iis"', the Improved
        Iterative Scaling Algorithm (Della Pietra et al. 1997) is
        used. If `$sample' is `"corpus"', there is no sampling done
        to re-estimate the parameters (the events previously read
        are considered a good sample); if it equals `"mc"' Monte
        Carlo (Metropolis-Hastings) sampling is performed to obtain
        a random sample; if `$sample' is `"enum"' the complete event
        space is enumerated.

    `fi'
         fi($scaler, $candidates, $nr_to_add, $sampling);

        Calls the Feature Induction algorithm. The parameter
        `$nr_to_add' is for the number of candidates it should add.
        If this number is greater than the number of candidates, all
        candidates are added. Meaningful values for `$scaler' are
        `"gis"' and `"iis"'; default is `"gis"' (see previous item).
        `$sampling' should be one of `"corpus"', `"mc"', `"enum"'.
        `$candidates' should be in the `Statistics::Candidates'
        class:

         $candidates = Statistics::Candidates->new($file);

        See the Statistics::Candidates manpage.

    `write_parameters'
         $events->write_parameters($file);

    `write_parameters_with_names'
         $events->write_parameters_with_names($file);

    `dump'
         $events->dump($file);

        `$events' is written to `$file' using `Data::Dumper'.

    `undump'
         $events = Statistics::MaxEntropy->undump($file);

        The contents of file `$file' are read and eval'ed into
        `$events'.

FILE SYNTAX
    Lines that start with a `#' and empty lines are ignored.

    Below we give the syntax of input and output files.

  EVENTS FILE (input/output)

    Syntax of the event file (`n' features, and `m' events); the
    following holds for features:

    *   each line is an event;

    *   each column represents a feature function; the co-domain of a
        feature function is {0,1};

    *   no space between feature columns;

    *   constant features (i.e. columns that are completely 0 or 1) are
        forbidden;

    *   2 or more events should be specified (this is in fact a
        consequence of the previous requirement).

    The frequency of each event precedes the feature columns.
    Features are indexed from right to left. This is a consequence
    of how `Bit::SparseVector' reads bit strings. Each `f_ij' is a
    bit and `freq_i' an integer in the following schema:

        name_n <tab> name_n-1 ... name_2 <tab> name_1 <newline>
        freq_1 <white> f_1n ... f_13 f_12 f_11 <newline>
          .                     .
          .                     .
          .                     .
        freq_i <white> f_in ... f_i3 f_i2 f_i1 <newline>
          .                     .
          .                     .
          .                     .
        freq_m <white> f_mn ... f_m3 f_m2 f_m1

    (`m' events, `n' features) The feature names are separated by
    tabs, not white space. The line containing the feature names
    will be split on tabs; this implies that (non-tab) white space
    may be part of the feature names.
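
    The layout above could be read with something like the following
    sketch. `parse_events` is a hypothetical helper written for this
    illustration, not the module's reader; it assumes integer frequencies
    and takes the file contents as a string.

```perl
use strict;
use warnings;

# Hypothetical parser for the events-file layout described above.
sub parse_events {
    my ($text) = @_;
    my (@names, @events);
    for my $line (split /\n/, $text) {
        # Skip comment lines and empty lines.
        next if $line =~ /^#/ or $line =~ /^\s*$/;
        if (!@names) {
            # Names are tab-separated, listed name_n ... name_1;
            # reverse so that $names[0] is name_1.
            @names = reverse split /\t/, $line;
            next;
        }
        my ($freq, $bits) = $line =~ /^(\d+)\s+([01]+)$/
            or die "malformed event line: $line\n";
        die "wrong number of feature bits: $line\n"
            unless length($bits) == @names;
        # The rightmost bit belongs to feature 1, so reverse the bits too.
        push @events, { freq => $freq, f => [ reverse split //, $bits ] };
    }
    return (\@names, \@events);
}

my ($names, $events) = parse_events("# demo\nb\ta\n\n3 10\n1 01\n");
print "$names->[0]: ", join(" ", map { $_->{f}[0] } @$events), "\n";
```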

  PARAMETERS FILE (input/output)

    Syntax of the initial parameters file; one parameter per line:

        par_1 <newline>
         .
         .
         .
        par_i <newline>
         .
         .
         .
        par_n

    The syntax of the output distribution is the same. The
    alternative procedure for saving parameters to a file
    `write_parameters_with_names' writes files that have the
    following syntax

        n <newline>
        name_1 <tab> par_1 <newline>
         .
         .
         .
        name_i <tab> par_i <newline>
         .
         .
         .
        name_n <tab> par_n <newline>
        bitmask

    where bitmask can be used to tell other programs what features
    to use in computing probabilities. Features that were ignored
    during scaling or because they are constant functions, receive a
    `0' bit.

  DUMP FILE (input/output)

    A dump file contains the event space (which is a hash blessed
    into class `Statistics::MaxEntropy') as a Perl expression that
    can be evaluated with eval.
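
    The round trip through `Data::Dumper' and eval can be sketched as
    follows. The class name `My::EventSpace' and the hash fields are
    placeholders, and the helpers work on strings rather than files; the
    module's own dump format may differ.

```perl
use strict;
use warnings;
use Data::Dumper;

# Placeholder object: a blessed hash standing in for an event space.
my $space = bless({ names => ['a', 'b'], freq => [3, 1] }, 'My::EventSpace');

# Serialise an object as a bare Perl expression.
sub dump_to_string {
    my ($obj) = @_;
    local $Data::Dumper::Terse = 1;   # omit the leading "$VAR1 ="
    return Dumper($obj);
}

# Rebuild the object by evaluating the expression.
sub undump_from_string {
    my ($text) = @_;
    my $obj = eval $text;             # only eval dumps you trust
    die "undump failed: $@" if $@;
    return $obj;
}

my $copy = undump_from_string(dump_to_string($space));
print ref($copy), " with ", scalar @{ $copy->{names} }, " features\n";
```

    Because the dump is executable Perl, loading a dump file from an
    untrusted source would run whatever code it contains.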

BUGS
    It's slow.

SEE ALSO
    the perl(1) manpage, the Statistics::Candidates manpage, the
    Statistics::SparseVector manpage, the Bit::Vector manpage, the
    Data::Dumper manpage, the POSIX manpage, the Carp manpage.

DIAGNOSTICS
    The module dies with an appropriate message if

    *   it cannot open a specified events file;

    *   if you specified a constant feature function (in the events file
        or the candidates file);

    *   if the events file, candidates file, or the parameters file is
        not consistent; possible causes are (a.o.): insufficient or
        too many features for some event; inconsistent candidate
        lines; insufficient or too many event lines in the
        candidates file.

    The module captures `SIGQUIT' and `SIGINT'. On a `SIGINT'
    (typically <CONTROL-C>) it will dump the current event space(s)
    and die. If a `SIGQUIT' (<CONTROL-BACKSLASH>) occurs it dumps
    the current event space as soon as possible after the first
    iteration it finishes.

REFERENCES
    (Abney 1997)
        Steven P. Abney, Stochastic Attribute Value Grammar,
        Computational Linguistics 23(4).

    (Darroch and Ratcliff 1972)
        J. Darroch and D. Ratcliff, Generalised Iterative Scaling
        for log-linear models, Ann. Math. Statist., 43, 1470-1480,
        1972.

    (Jaynes 1983)
        E.T. Jaynes, Papers on probability, statistics, and
        statistical physics. Ed.: R.D. Rosenkrantz. Kluwer Academic
        Publishers, 1983.

    (Jaynes 1997)
        E.T. Jaynes, Probability theory: the logic of science, 1997,
        unpublished manuscript.
        `URL:http://omega.math.albany.edu:8008/JaynesBook.html'

    (Della Pietra et al. 1997)
        Stephen Della Pietra, Vincent Della Pietra, and John
        Lafferty, Inducing features of random fields, In:
        Transactions Pattern Analysis and Machine Intelligence,
        19(4), April 1997.

VERSION
    Version 0.8.

AUTHOR
Hugo WL ter Doest, terdoest@cs.utwente.nl

COPYRIGHT
    `Statistics::MaxEntropy' comes with ABSOLUTELY NO WARRANTY and
    may be copied only under the terms of the GNU Library General
    Public License (version 2, or later), which may be found in the
    distribution.




------------------------------

Date: 27 Dec 1998 17:03:28 GMT
From: jari.aalto@poboxes.com (Jari Aalto+mail.perl)
Subject: ANNOUNCE: v1998.1204 Squeeze.pm -- Shorten text to pagers and GSM phones
Message-Id: <765p90$373$1@play.inetarena.com>


What's New: Variable SQZ_OPTIMIZE_LEVEL


Title

        ANNOUNCE: v1998.1204 Squeeze.pm -- Shorten text to minimum syllables
        The version number is based on date format YYYY.MMDD

Download

        Home page:

            (eg. ftp://ftp.funet.fi/pub/languages/perl/CPAN/)
            CPAN//modules/by-module/Lingua/

        Perl language interpreter pointers at (Win32/Unix etc.)
        Perl: http://language.perl.com/info/software.html

Description

        A module that I use to compress text from email before it is
        sent to my Cellular phone. If you have a pager, you know how
        tight the space is and every character saved is a plus.

        A shortened POD page follows. The Module's Interface functions
        and interface variables are not included in this announcement.

        I would welcome more text compression rules, so feel free to
        suggest more hash entries like:

                WORD       => CONVERSION
                MULTI WORD => CONVERSION

NAME
    Squeeze.pm - Shorten text to minimum syllables by using hash and vowel
    deletion

REVISION
    $Id: Squeeze.pm,v 1.24 1998/10/08 14:58:15 jaalto Exp $

SYNOPSIS
        use Squeeze;            # import only function
        use Squeeze qw( :ALL ); # import all functions and variables
        use English;

        while (<>)
        {
            print SqueezeText $ARG;
        }


DESCRIPTION
    Squeeze English text to the most compact format possible, so that it is
    barely readable. You should convert all text to lowercase for maximum
    compression, because optimizations have been designed mostly for
    uncapitalised letters.

        `Warning: Each line is processed multiple times, so prepare for slow
        conversion time'

    You can use this module e.g. to preprocess text before it is sent to
    electronic media that has some maximum text size limit. For example
    pagers have an arbitrary text size limit, typically 200 characters,
    which you want to fill as much as possible. Alternatively you may have
    GSM cellular phone which is capable of receiving Short Messages (SMS),
    whose message size limit is 160 characters. For demonstration of this
    module's SqueezeText() function, the description text of this paragraph
    has been converted below. See yourself if it's readable (Yes, it takes
    some time to get used to). The compression ratio is typically 30-40%.

        u _n use thi mod e.g. to prprce txt bfre i_s snt to
        elrnic mda has som max txt siz lim. f_xmple pag
        hv  abitry txt siz lim, tpcly 200 chr, W/ u wnt
        to fll as mch as psbleAlternatvly u may hv GSM cllar P8
        w_s cpble of rcivng Short msg (SMS), WS/ msg siz
        lim is 160 chr. 4 demonstrton of thi mods SquezText
        fnc ,  dsc txt of thi prgra has ben cnvd_ blow
        See uself if i_s redble (Yes, it tak som T to get usdto
        compr rat is tpcly 30-40

    And if $SQZ_OPTIMIZE_LEVEL is set to non-zero

        u_nUseThiModE.g.ToPrprceTxtBfreI_sSntTo
        elrnicMdaHasSomMaxTxtSizLim.F_xmplePag
        hvAbitryTxtSizLim,Tpcly200Chr,W/UWnt
        toFllAsMchAsPsbleAlternatvlyUMayHvGSMCllarP8
        w_sCpbleOfRcivngShortMsg(SMS),WS/MsgSiz
        limIs160Chr.4DemonstrtonOfThiModsSquezText
        fnc,DscTxtOfThiPrgraHasBenCnvd_Blow
        SeeUselfIfI_sRedble(Yes,ItTakSomTToGetUsdto
        comprRatIsTpcly30-40

    A comparison of these two shows

        Original text   : 627 characters
        Level 0         : 433 characters    reduction 31 %
        Level 1         : 345 characters    reduction 45 %  (+14 improvement)
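
    The reduction percentages follow directly from the character counts in
    the table; a quick check, with `reduction_pct` as a helper name chosen
    for this illustration:

```perl
use strict;
use warnings;

# Percentage saved when $original characters shrink to $squeezed.
sub reduction_pct {
    my ($original, $squeezed) = @_;
    return 100 * ($original - $squeezed) / $original;
}

printf "Level 0: %.0f %%\n", reduction_pct(627, 433);
printf "Level 1: %.0f %%\n", reduction_pct(627, 345);
```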

    There are a few grammar rules which are used to shorten some English
    tokens very much:

        Word that has _ is usually a verb

        Word that has / is usually a substantive, noun,
                        pronoun or other non-verb

    For example, these tokens must be understood before text can be read.
    This is not yet like Geek code, because you don't need an external parser
    to understand this, just some common sense and time to adapt
    yourself to this text. *For a complete up-to-date list, you have to peek
    at the source code*

        automatically => 'acly_'

        for           => 4
        for him       => 4h
        for her       => 4h
        for them      => 4t
        for those     => 4t

        can           => _n
        does          => _s

        it is         => i_s
        that is       => t_s
        which is      => w_s
        that are      => t_r
        which are     => w_r

        less          => -/
        more          => +/
        most          => ++

        however       => h/ver
        think         => thk_

        useful        => usful

        you           => u
        your          => u/
        you'd         => u/d
        you'll        => u/l
        they          => t/
        their         => t/r

        will          => /w
        would         => /d
        with          => w/
        without       => w/o
        which         => W/
        whose         => WS/

    Time is expressed with big letters

        time          => T
        minute        => MIN
        second        => SEC
        hour          => HH
        day           => DD
        month         => MM
        year          => YY

    Other Big letter acronyms

        phone         => P8

EXAMPLES
    To add new words, e.g. to the word conversion hash table, define your
    custom set and merge it with the existing ones. Do likewise for
    `%SQZ_WXLATE_MULTI_HASH' and `$SQZ_ZAP_REGEXP', and then start using the
    conversion function.

        use English;
        use Squeeze qw( :ALL );

        my %myExtraWordHash =
        (
              'new-word1'  => 'conversion1'
            , 'new-word2'  => 'conversion2'
            , 'new-word3'  => 'conversion3'
            , 'new-word4'  => 'conversion4'
        );

        #   First take the existing tables and merge them with my
        #   translation table

        my %myCustomWordHash =
        (
              %SQZ_WXLATE_HASH
            , %SQZ_WXLATE_EXTRA_HASH
            , %myExtraWordHash
        );

        my $myXlat = 0;                             # state flag

        while (<>)
        {
            if ( $condition )
            {
                SqueezeHashSet \%myCustomWordHash;  # Use MY conversions
                $myXlat = 1;
            }

            if ( $myXlat and $condition )
            {
                SqueezeHashSet "reset";             # Back to default table
                $myXlat = 0;
            }

            print SqueezeText $ARG;
        }

    Similarly you can redefine the multi-word translation table by supplying
    another hash reference in the call to SqueezeHashSet(), and to kill more
    text immediately in addition to the default, just concatenate the regexps
    to *$SQZ_ZAP_REGEXP*

KNOWN BUGS
    There may be a lot of false conversions, and if you think that some word
    squeezing went too far, please turn on the debug and send the log to the
    maintainer. To see how the conversion goes e.g. for word *Messages*:

        use English;
        use Lingua::EN::Squeeze;

        SqueezeDebug( 1, '(?i)Messages' );

        $ARG = "This line has some Messages in it";
        print SqueezeText $ARG;


AVAILABILITY
    The author can be reached at jari.aalto@poboxes.com. The home page, via a
    forwarding service, is at http://www.netforward.com/poboxes/?jari.aalto;
    alternatively the absolute URL is ftp://cs.uta.fi/pub/ssjaaa/ but this
    may move without notice. Prefer keeping the forwarding service link in
    your bookmarks.

    Latest version of this module can be found at
    $CPAN/modules/by-module/Lingua/

AUTHOR
    Copyright (C) 1998-1999 Jari Aalto. All rights reserved. This program is
    free software; you can redistribute it and/or modify it under the same
    terms as Perl itself or in terms of Gnu General Public licence v2 or
    later.




------------------------------

Date: Sun, 27 Dec 1998 11:42:42 -0500
From: "Manual Labor" <mlabor@sprintmail.com>
Subject: Re: Basic Perl DOS/Win95 + WWW + CGI course for Newbies , Christmas free  offer .
Message-Id: <765o1c$j0$1@fir.prod.itd.earthlink.net>

I would be very interested in subjecting myself to your expert tutelage.
Contact me at manlabor@hotmail.com





Franklin wrote in message <3687237a.387639806@news>...
>How do I access your course?
>
>On Sat, 26 Dec 1998 17:51:37 -0800, TRG Software
><chatmaster@c-zone.net> wrote:
>
>>Expert wrote:
>>>
>>> I would like to give basic Perl course for newbies.
>>> Integration of CGI Perl scripts with WWW pages.
>>> Setting up simple WIN95/ web server and setting up web pages + CGI
>>> programs running on your PC , for testing purposes.
>>>
>>> Hope this course to be free, interactive, mayby on shareware basis.
>>>
>>> If some of you guys are interested I set up  news group to move us
>>> there.
>>> regards,
>>> Jack
>>
>>I might be interested in helping you with this (if you need any more
>>help), but I'll need more details. :-)
>>
>>Good luck
>





------------------------------

Date: 27 Dec 1998 13:56:10 -0000
From: Jonathan Stowe <gellyfish@btinternet.com>
Subject: Re: Get Title
Message-Id: <765e9q$1f4$1@gellyfish.btinternet.com>

On Sun, 27 Dec 1998 13:29:24 +0100 Frank de Bot <debot@xs4all.nl> wrote:
> Does anybody know a script to get the Title of a webpage?
> Some help for making a own script is OK.


You will probably want to use the HTML::HeadParser module (part of the 
HTML::Parser package available from CPAN) to obtain the title of the
document from the HTML - if you require to do this via HTTP rather than from
a file then you will also probably want to use the LWP::UserAgent module to
obtain the document.

The following is an example of using both modules to obtain the title of
a document from a URL given on the command line:


#!/usr/bin/perl

use HTML::HeadParser;
use LWP::UserAgent;


$ua = new LWP::UserAgent;
$ua->agent("$0/0.1 " . $ua->agent);

$req = new HTTP::Request 'GET' => $ARGV[0];
$req->header('Accept' => 'text/html');

# send request
$res = $ua->request($req);
if ($res->is_success) 
  {
    my $parser = HTML::HeadParser->new;
    $parser->parse($res->content);
    $outtitle = $parser->header('Title');
    print $outtitle,"\n";
  }
else
  {
    print $res->status_line,"\n";
  }
__END__
-- 
Jonathan Stowe <jns@btinternet.com>
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>
Hastings: <URL:http://www.newhoo.com/Regional/UK/England/East_Sussex/Hastings>


------------------------------

Date: 27 Dec 1998 14:09:12 -0000
From: Jonathan Stowe <gellyfish@btinternet.com>
Subject: Re: get webpage with perl
Message-Id: <765f28$1fq$1@gellyfish.btinternet.com>

On Sun, 27 Dec 1998 11:57:31 GMT h9250293@obelix.wu-wien.ac.at wrote:
> hi!
> 
> I would like to go through webpages on the web and grep the essential
> information for me with PERL and mail it to my account. How can i do that? is
> there anywhere a ready source for retriving the HTML of a webpage?
> 

You will probably want to use the LWP::* modules available from CPAN to do
this - the distribution of this package comes with a document 'lwpcook' that
has an example that would form a reasonable basis to do this.  If you
require a program that will do 'spidering' - following the links on each
successive page then you will require the HTML::Parser module to parse the
documents and retrieve the URLs of the linked pages.

/J\
-- 
Jonathan Stowe <jns@btinternet.com>
Some of your questions answered:
<URL:http://www.btinternet.com/~gellyfish/resources/wwwfaq.htm>
Hastings: <URL:http://www.newhoo.com/Regional/UK/England/East_Sussex/Hastings>


------------------------------

Date: 27 Dec 1998 17:03:01 GMT
From: Ellen Maremont Silver <silver@oreilly.com>
Subject: Java/Perl Tool Available as Open Source Software
Message-Id: <765p85$371$1@play.inetarena.com>

This announcement was sent to the press on December 1, 1998.
For further information please see http://www.perl.com
and http://perl.oreilly.com.

JAVA/PERL TOOL AVAILABLE AS OPEN SOURCE SOFTWARE
Programmers Can Use Strengths of Two Popular Languages in the Same Environment

Java/Perl Lingo (JPL), software which enables programmers to use the
strengths of both Java and Perl in the same environment, is now freely
available as open source software. Until now, the tool has been available
exclusively in O'Reilly & Associates' Perl Resource Kit-UNIX Edition, a
commercial product. JPL was developed by Larry Wall, creator of Perl and
Senior Software Developer at O'Reilly & Associates.

JPL, available since November, 1997, is a unique project whose goal is to
seamlessly unite the two popular languages in a way which lets them
complement each other's strengths. Java excels at helping computers across
a network or the Internet communicate and share data; Perl is used
especially for system administration and interactive Web sites. JPL enables
programmers to implement Java methods with Perl, and for Perl code to
access Java via the Java Native Interface (JNI). It includes a translator
and build system that make it easy to create JPL applications.

The JPL tool and its source code are being made available as part of the
latest development release of Perl (version 5.005_54) and can be obtained
at http://www.perl.com/CPAN/authors/id/GSAR/ (perl5.005_54.patch.gz and
perl5.005_54.tar.gz: see directory notes for important caveats).
Subscription information for the
JPL mailing list is available at http://www.perl.org/maillist.html.

"O'Reilly has been a strong supporter of open source software, so releasing
JPL as open source matches our company values," said Gina Blaber, Director
of O'Reilly's Software Products. "JPL will benefit from the attention of
the broader development community. Further, our Perl books and software are
an important part of the O'Reilly business, so we want to thank and support
the open source community by making the JPL source available."

O'Reilly first released the Perl Resource Kit-UNIX in November, 1997, and
followed it in August with the Perl Resource Kit-Win32 Edition.

#  #  #


-------------------------------------------------------
Ellen Maremont Silver (formerly Elias)                   Publicist
O'Reilly & Associates, Inc.
101 Morris St., Sebastopol, CA 95472
Cambridge - Koeln - Paris - Sebastopol - Tokyo
phone: (707) 829-0515 ext. 322  fax: (707) 829-0104
Online: software.oreilly.com, www.oreilly.com





------------------------------

Date: 27 Dec 1998 17:03:17 GMT
From: JVromans@Squirrel.nl (Johan Vromans)
Subject: Makepatch version 2.00 released
Message-Id: <765p8l$372$1@play.inetarena.com>

I'm very pleased to announce release 2.00 of the makepatch package.

  URL: $CPAN/authors/id/JV/makepatch-2.00a.tar.gz

This package contains a pair of programs to assist in the generation
and application of patch kits to synchronise source trees.

INTRODUCTION

Traditionally, source trees are updated with the 'patch' program,
processing patch information that is generated by the 'diff' program.
Although 'diff' and 'patch' do a very good job at patching file
contents, most versions do not handle creating and deleting files and
directories, and adjusting of file modes and time stamps. Newer
versions of 'diff' and 'patch' seem to be able to create files, and
very new versions of 'patch' can remove files. But that's about it.

Another typical problem is that patch kits are usually downloaded
from the Internet, or transmitted via electronic mail. It is often
desirable to verify the correctness of a patch kit before even
attempting to apply it.

The makepatch package is designed to overcome these limitations.

DESCRIPTION

The makepatch package contains two programs, both written in Perl:
'makepatch' and 'applypatch'.

'makepatch' will generate a patch kit from two source trees. 
It traverses the source directory and runs a 'diff' on each pair of
corresponding files, accumulating the output into a patch kit. It
knows about the conventions for patch kits: if a file named
patchlevel.h exists, it is handled first, so 'patch' can check the
version of the source tree. Also, to deal with the non-perfect
versions of 'patch' that are in use, it supplies 'Index:' and
'Prereq:' lines, so 'patch' can correctly locate the files to patch,
and it relocates the patch to the current directory to avoid problems
with creating new files.
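
The traversal step can be sketched roughly as follows. This is only an
illustration of the idea, not makepatch's actual code: the helper names
list_files and compare_trees are invented here. It uses the standard
File::Find module to sort files into those only in the old tree, those
only in the new tree, and those present in both (the candidates for
'diff').

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Return a hash ref whose keys are the plain files below $root,
# as paths relative to $root. (Hypothetical helper.)
sub list_files {
    my ($root) = @_;
    my %files;
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;
            my $rel = $File::Find::name;
            $rel =~ s{^\Q$root\E/?}{};      # strip the tree prefix
            $files{$rel} = 1;
        },
    }, $root);
    return \%files;
}

# Classify files as removed (old tree only), added (new tree only),
# or common (in both trees). (Hypothetical helper.)
sub compare_trees {
    my ($old, $new) = @_;
    my ($o, $n) = (list_files($old), list_files($new));
    my (@removed, @common);
    for my $f (sort keys %$o) {
        if ($n->{$f}) { push @common, $f } else { push @removed, $f }
    }
    my @added = grep { !$o->{$_} } sort keys %$n;
    return (\@removed, \@added, \@common);
}

# Usage: perl compare.pl old-tree new-tree
if (@ARGV == 2) {
    my ($removed, $added, $common) = compare_trees(@ARGV);
    print "remove: $_\n" for @$removed;
    print "create: $_\n" for @$added;
    print "diff:   $_\n" for @$common;
}
```

A real patch kit generator would go on to run 'diff' on each common
pair and collect the output; that part is left out of the sketch.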

The list of files can be specified in a so-called 'manifest' file, but
it can also be generated by recursively traversing the source tree.
Files can be excluded using shell style wildcards and Perl regex
patterns.
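
As a rough illustration of how both kinds of exclusion pattern could be
applied to a file list, here is a minimal sketch. It does not use
makepatch's own options or code: wildcard_to_regex and filter_files are
names made up for this example, and only the '*' and '?' wildcards are
handled.

```perl
use strict;
use warnings;

# Translate a shell-style wildcard into an anchored Perl regex.
# Only '*' and '?' are supported in this sketch; note that '*' here
# also matches across '/' characters.
sub wildcard_to_regex {
    my ($pat) = @_;
    my $re = quotemeta $pat;
    $re =~ s/\\\*/.*/g;     # '*' matches any run of characters
    $re =~ s/\\\?/./g;      # '?' matches exactly one character
    return qr/\A$re\z/;
}

# Return the files that match none of the exclusion patterns.
sub filter_files {
    my ($files, $wildcards, $regexes) = @_;
    my @tests = map { wildcard_to_regex($_) } @$wildcards;
    push @tests, map { qr/$_/ } @$regexes;
    return grep {
        my $f = $_;
        !grep { $f =~ $_ } @tests;
    } @$files;
}

my @kept = filter_files(
    [qw(main.c main.o README core.1234 doc/notes.txt~)],
    ['*.o', 'core.????'],       # shell style wildcards
    ['~$'],                     # Perl regex pattern
);
print "$_\n" for @kept;
```

With the sample list above, only main.c and README survive the filter.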

Moreover, 'makepatch' prepends a small shell script in front of the
patch kit that creates the necessary files and directories for the
patch process. By running the patch kit as a shell script your source
directory is prepared for the patching process.

But that is not all! 'makepatch' also inserts some additional
information in the patch kit for use by the 'applypatch' program.

The 'applypatch' program will do the following:

  - It will extensively verify that the patch kit is complete and not
    corrupted during transfer.
  - It will apply some heuristics to verify that the directory in
    which the patch will be applied does indeed contain the expected
    sources.
  - It creates files and directories as necessary.
  - It applies the patch by running the 'patch' program.
  - Upon completion, obsolete files, directories and .orig files are
    removed, file modes of new files are set, and the timestamps of
    all patched files are adjusted.

Note that 'applypatch' only requires the 'patch' program. It does not
rely on a shell or shell tools. This makes it possible to apply
'makepatch' generated patches on non-Unix systems.

REQUIREMENTS

  - Perl 5.005 standard installation.
  - For 'makepatch': the 'diff' program.
  - For 'applypatch': the 'patch' program.

AVAILABILITY

CPAN and its mirrors, e.g.

  http://www.perl.com/CPAN/authors/id/JV/makepatch-2.00a.tar.gz

--------------------------------------------------------------------------
Johan Vromans                                         jvromans@squirrel.nl
Squirrel Consultancy                              Haarlem, the Netherlands
http://www.squirrel.nl              http://www.squirrel.nl/people/jvromans
PGP Key 2048/4783B14D KFP=65 44 CA 66 B3 50 0B 34  CE 0E FB CA 2D 95 34 D0
---------------------- "Arms are made for hugging" -----------------------




------------------------------

Date: 27 Dec 1998 16:21:52 GMT
From: Tom Christiansen <tchrist@mox.perl.com>
Subject: Re: mkdir and -p
Message-Id: <765mr0$a3q$1@csnews.cs.colorado.edu>

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc, 
    webmaster@skatesearch.com writes:
:I'd like to invoke mkdir -p  from perl.  I know I can shell it out to do this,
:but is there a way to do this from the built in version of mkdir in perl?

No, there isn't.  mkdir makes one directory.  It's ok to use the
toolbox now and then you know.

You might look at the File::Path module.
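
A minimal sketch of that approach, using File::Path's mkpath function.
The directory name is made up for the example, and the error handling
shown is just one way to do it:

```perl
use strict;
use warnings;
use File::Path;     # exports mkpath and rmtree by default
use File::Spec;

# A made-up nested directory under the system temp directory.
my $dir = File::Spec->catdir(File::Spec->tmpdir, "perl-demo-$$", qw(a b c));

# mkpath creates the directory and any missing parents, much like
# 'mkdir -p'. Arguments: path, verbose flag, permission mode.
# It dies on failure, so trap that with eval.
eval { mkpath($dir, 0, 0755) };
die "Couldn't create $dir: $@" if $@;

print "created $dir\n" if -d $dir;
```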

--tom
-- 
    double value;                /* or your money back! */
    short changed;               /* so triple your money back! */
            --Larry Wall in cons.c from the 4.0 perl source code


------------------------------

Date: 27 Dec 1998 17:02:34 GMT
From: swmcd@world.std.com (Steven W McDougall)
Subject: News::Newsrc 1.07 released
Message-Id: <765p7a$36p$1@play.inetarena.com>

News::Newsrc 1.07 has been uploaded to PAUSE and will soon propagate
through CPAN.


From the README file:

News::Newsrc VERSION 1.07 - manage newsrc files

DESCRIPTION
News::Newsrc manages newsrc files, of the style

    alt.foo: 1-21,28,31-34
    alt.bar! 3,5,9-2900,2902
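
To make the format concrete, here is a small stand-alone sketch that
parses lines of this style without using News::Newsrc itself. The
function name parse_newsrc_line is invented for this example; it
relies only on the usual newsrc convention that ':' marks a subscribed
group and '!' an unsubscribed one.

```perl
use strict;
use warnings;

# Parse one newsrc line into (group, subscribed flag, ranges), where
# ranges is a reference to a list of [low, high] pairs.
# Returns an empty list if the line does not look like a newsrc entry.
sub parse_newsrc_line {
    my ($line) = @_;
    my ($group, $mark, $list) = $line =~ /^\s*([^\s:!]+)([:!])\s*(.*?)\s*$/
        or return;
    my @ranges;
    for my $item (split /,/, $list) {
        my ($lo, $hi) = $item =~ /^(\d+)(?:-(\d+))?$/
            or return;
        push @ranges, [$lo, defined $hi ? $hi : $lo];
    }
    return ($group, $mark eq ':', \@ranges);
}

for my $line ("alt.foo: 1-21,28,31-34", "alt.bar! 3,5,9-2900,2902") {
    my ($group, $subscribed, $ranges) = parse_newsrc_line($line)
        or next;
    printf "%s (%s): %d range(s)\n",
        $group,
        $subscribed ? "subscribed" : "unsubscribed",
        scalar @$ranges;
}
```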


From the Changes file:

Revision history for Perl extension News::Newsrc

1.07  1998 Dec 21
	- added import_rc and export_rc
	- added VERSION_FROM, DISTNAME, ABSTRACT, AUTHOR, and dist 
          keys in Makefile.PL



Thanks to Philip Hallstrom for suggesting the import/export methods.


- SWM




------------------------------

Date: 27 Dec 1998 17:02:41 GMT
From: swmcd@world.std.com (Steven W McDougall)
Subject: Set::IntSpan 1.07 released
Message-Id: <765p7h$36q$1@play.inetarena.com>

Set::IntSpan 1.07 has been uploaded to PAUSE and will soon propagate
through CPAN. 


From the Changes file:

Revision history for Perl extension Set::IntSpan

1.07  1998 Dec 03
	- fixes to facilitate subclassing
	  o use ref $this instead of hardcoded "Set::IntSpan"
	  o made internal functions into methods
	  o use method call syntax on all internal method calls,
	    not function call syntax
	  o use direct object syntax on all internal method calls,
	    because indirect object syntax sometimes parses as a 
	    function call
	- added ABSTRACT and AUTHOR keys in Makefile.PL


From the README file:

Set::IntSpan VERSION 1.07 - Manages sets of integers

DESCRIPTION
Set::IntSpan manages sets of integers.  It is optimized for sets that
have long runs of consecutive integers.  These arise, for example, in
.newsrc files, which maintain lists of articles:

    alt.foo: 1-21,28,31
    alt.bar: 1-14192,14194,14196-14221

Sets are stored internally in a run-length coded form.  This provides
for both compact storage and efficient computation.  In particular,
set operations can be performed directly on the encoded
representation.
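
The general idea can be illustrated with a short stand-alone sketch.
This is not Set::IntSpan's code or its internal format: here a set is
simply a sorted list of [low, high] spans, and the function names
(parse_runs, normalize, union_runs, format_runs, member) are invented
for the example. The point is that a union can be computed by merging
spans, without ever expanding them into individual integers.

```perl
use strict;
use warnings;

# Sort spans by lower bound and merge any that overlap or touch.
sub normalize {
    my @sorted = sort { $a->[0] <=> $b->[0] } @_;
    my @out;
    for my $s (@sorted) {
        if (@out && $s->[0] <= $out[-1][1] + 1) {
            $out[-1][1] = $s->[1] if $s->[1] > $out[-1][1];
        } else {
            push @out, [@$s];       # copy, so callers' spans are untouched
        }
    }
    return @out;
}

# Parse a run list such as "1-21,28,31" into normalized spans.
# Items that are not a number or a number range are skipped.
sub parse_runs {
    my ($text) = @_;
    my @spans;
    for my $item (split /,/, $text) {
        my ($lo, $hi) = $item =~ /^(\d+)(?:-(\d+))?$/ or next;
        push @spans, [$lo, defined $hi ? $hi : $lo];
    }
    return normalize(@spans);
}

# Union of two span lists (passed as array refs), computed on the spans.
sub union_runs { return normalize(@{ $_[0] }, @{ $_[1] }) }

# Turn spans back into run-list text.
sub format_runs {
    return join ",",
        map { $_->[0] == $_->[1] ? $_->[0] : join("-", @$_) } @_;
}

# Membership test: is $n inside any span?
sub member {
    my ($n, @spans) = @_;
    for my $s (@spans) {
        return 1 if $n >= $s->[0] && $n <= $s->[1];
    }
    return 0;
}

my @a = parse_runs("1-21,28,31");
my @b = parse_runs("20-27,29-30,40");
print format_runs(union_runs(\@a, \@b)), "\n";    # prints 1-31,40
```

The cost of the union depends on the number of spans rather than the
number of integers, which is why long runs of consecutive articles are
cheap to handle in this kind of representation.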


Thanks to Chris Sidi for showing me how to fix the code to support
subclassing. 


- SWM




------------------------------

Date: 12 Dec 98 21:33:47 GMT (Last modified)
From: Perl-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Special: Digest Administrivia (Last modified: 12 Dec 98)
Message-Id: <null>


Administrivia:

Well, after 6 months, here's the answer to the quiz: what do we do about
comp.lang.perl.moderated. Answer: nothing. 

]From: Russ Allbery <rra@stanford.edu>
]Date: 21 Sep 1998 19:53:43 -0700
]Subject: comp.lang.perl.moderated available via e-mail
]
]It is possible to subscribe to comp.lang.perl.moderated as a mailing list.
]To do so, send mail to majordomo@eyrie.org with "subscribe clpm" in the
]body.  Majordomo will then send you instructions on how to confirm your
]subscription.  This is provided as a general service for those people who
]cannot receive the newsgroup for whatever reason or who just prefer to
]receive messages via e-mail.

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V8 Issue 4501
**************************************
