[18515] in Perl-Users-Digest


Perl-Users Digest, Issue: 683 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Apr 12 06:10:45 2001

Date: Thu, 12 Apr 2001 03:10:18 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <987070217-v10-i683@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Thu, 12 Apr 2001     Volume: 10 Number: 683

Today's topics:
        Request <chris62vw@hotmail.com>
    Re: Request <comdog@panix.com>
    Re: Request (Martien Verbruggen)
    Re: Stripping non standard control characters from a fi <ronda@panix.com>
    Re: Stripping non standard control characters from a fi (Logan Shaw)
    Re: Stripping non standard control characters from a fi (Martien Verbruggen)
    Re: Stripping non standard control characters from a fi <ronda@panix.com>
    Re: Stripping non standard control characters from a fi <ronda@panix.com>
    Re: Stripping non standard control characters from a fi <uri@sysarch.com>
    Re: what are the new languages? <mitiaNOSPAM@northwestern.edu.invalid>
    Re: Why Perl? <bart.lateur@skynet.be>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Wed, 11 Apr 2001 23:21:04 -0700
From: "Chris A." <chris62vw@hotmail.com>
Subject: Request
Message-Id: <tdai7kf4n69t5c@corp.supernews.com>

I am posting this here as this is the .*.perl.misc group.  If you would
rather not read this message that doesn't mostly pertain to a help question,
please cease reading now, or forever hold your peace (or rage).

There is a script that I have been working on at
http://chrisangell.com/unix/gallery/avirt .  I would like people to make
suggestions pertaining  to the usage of this script and to the viability of
this script.  I know I could work on the part where the $username and $dest
are checked for errors.

Thanks for everything,

Chris A.




------------------------------

Date: Thu, 12 Apr 2001 02:52:45 -0400
From: brian d foy <comdog@panix.com>
Subject: Re: Request
Message-Id: <comdog-66258D.02524512042001@news.panix.com>

In article <tdai7kf4n69t5c@corp.supernews.com>, "Chris A." 
<chris62vw@hotmail.com> wrote:

> I am posting this here as this is the .*.perl.misc group.  If you would
> rather not read this message that doesn't mostly pertain to a help question,
> please cease reading now, or forever hold your peace (or rage).

nice to meet you too.

*plonk*

-- 
brian d foy <comdog@panix.com>



------------------------------

Date: Thu, 12 Apr 2001 17:24:16 +1000
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: Request
Message-Id: <slrn9dam10.482.mgjv@martien.heliotrope.home>

On Wed, 11 Apr 2001 23:21:04 -0700,
	Chris A. <chris62vw@hotmail.com> wrote:
> I am posting this here as this is the .*.perl.misc group.  If you would
> rather not read this message that doesn't mostly pertain to a help question,
> please cease reading now, or forever hold your peace (or rage).

You may very well say this, but it isn't going to stop anyone from
commenting about it, if they feel like it. In fact, putting a comment
like this at the start of your post only _invites_ comment.

> There is a script that I have been working on at
> http://chrisangell.com/unix/gallery/avirt .  I would like people to make
> suggestions pertaining  to the usage of this script and to the viability of
> this script.  I know I could work on the part where the $username and $dest
> are checked for errors.

I've only given it a cursory glance. If you want an in-depth analysis
you'll have to hire someone to do that for you.

You're not using strict. Most people will have stopped reading by now.

This line, and the following:

$virtusertable = '/etc/mail/virtusertable' || 
	die "$virtusertable doesn\'t exist or has file permissions 
		that don\'t allow access\n";
	
makes no sense. If that assignment ever returns a false value, you have
much more severe problems than the ones you indicate. In fact, you won't
ever get a false value caused by anything you indicate. That assignment
will always return a true value, unless your perl binary is totally sick.
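
A rough sketch of the point above, and of what was probably intended:
test the file itself rather than the string holding its name. The
check_table helper is a made-up name for illustration, not part of the
original script.

```perl
use strict;
use warnings;

# A non-empty string literal is always true, so the right-hand side of ||
# is never evaluated and the die can never fire.
my $virtusertable = '/etc/mail/virtusertable' || die "unreachable\n";

# Hypothetical helper: test the file, not the string holding its name.
sub check_table {
    my ($path) = @_;
    return "$path doesn't exist"  unless -e $path;
    return "$path isn't readable" unless -r $path;
    return;    # nothing to complain about
}

my $problem = check_table($virtusertable);
print defined $problem ? "$problem\n" : "$virtusertable looks usable\n";
```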

Your script is badly formatted; that could use some work. Some of the
comments are unnecessary. You seem to have an odd idea about what
constitutes valid email addresses, which may very well fit your local
business rules, but won't work out there in the wide world in the
general case.

The error messages from a failed open() should include $!, so that you
know why it failed.
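
For example, something along these lines puts the system's reason into
the message. The open_or_reason helper and the path are invented for
the sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: try to open a file for reading and say why it failed.
sub open_or_reason {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or return (undef, "Can't open $path: $!");    # $! holds the OS error
    return ($fh, undef);
}

my ($fh, $err) = open_or_reason('/no/such/dir/virtusertable');
print "$err\n" if defined $err;
```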

You put in too many backwhacks. A single quote in a double quoted string
doesn't need to be backwhacked. Too many backwhacks can be as dangerous
as too few.
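
A small illustration of both halves of that remark; the variable names
are made up for the example.

```perl
use strict;
use warnings;

my $escaped   = "doesn\'t exist";    # the backslash is legal but does nothing
my $unescaped = "doesn't exist";     # identical string, easier to read
print "same string\n" if $escaped eq $unescaped;

# Where one backslash too many does change things: it stops interpolation.
my $file  = 'virtusertable';
my $wrong = "can't open \$file";     # literal text: can't open $file
my $right = "can't open $file";      # interpolated: can't open virtusertable
print "$wrong\n$right\n";
```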

The regex where you check for 'special characters' should be done with a
character class, instead of the alternation you use. It'll make it much
shorter, and much more efficient.
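
A sketch of the difference. The list of characters here is an assumed
stand-in, since the script's own list isn't quoted in this post, and
both sub names are invented.

```perl
use strict;
use warnings;

# Assumed stand-in list of "special" characters; the script's list may differ.
sub has_special_alternation { return $_[0] =~ /!|#|%|\^|&|\*/ ? 1 : 0 }
sub has_special_class       { return $_[0] =~ /[!#%^&*]/      ? 1 : 0 }

for my $input ('chris', 'chr!s', 'a&b', '100%') {
    printf "%-6s alternation=%d class=%d\n", $input,
        has_special_alternation($input), has_special_class($input);
}
```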

All those elsifs don't need to be there. You should probably rework that
control structure a bit to make it more useable and maintainable.

The regexp /.+@.+\.[a-z]{2}/ is a very limited way to match 'valid'
email addresses. It'll allow many pieces of crud through. Maybe you
meant to anchor it at the start and/or the end? I'm a bit puzzled about
what you're trying to match here. You allow any string, followed by @,
followed by any string as long as it contains somewhere after the first
character a dot followed by two lowercase characters. There are much
better ways to check email addresses, and this is even discussed in the
FAQ. In this case, you might as well not check at all.
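
To make that concrete, here is the regexp run against a few made-up
strings, next to an anchored variant. The anchored pattern is only an
assumption about what might have been meant; it is still not a real
address validator.

```perl
use strict;
use warnings;

my $loose = qr/.+\@.+\.[a-z]{2}/;    # the regexp discussed above (@ escaped)

# Still NOT a proper validator; just anchored and a bit less permissive.
my $anchored = qr/^[^\@\s]+\@[^\@\s]+\.[A-Za-z]{2,}$/;

for my $candidate ('a@b.com', 'this is @ total crud.zzzz!!!',
                   '@@@.xx', 'user@EXAMPLE.COM') {
    printf "%-32s loose=%-3s anchored=%s\n", "'$candidate'",
        ($candidate =~ $loose    ? 'yes' : 'no'),
        ($candidate =~ $anchored ? 'yes' : 'no');
}
```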

You are duplicating a lot of code where you check $username and $dest.
This could be done more efficiently by wrapping those checks in subs.
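
Something like the following, say. The two rules are placeholders for
whatever the script really checks, and field_problem is an invented
name.

```perl
use strict;
use warnings;

# Hypothetical rules standing in for the script's real checks.
sub field_problem {
    my ($label, $value) = @_;
    return "$label is empty"            unless defined $value && length $value;
    return "$label contains whitespace" if $value =~ /\s/;
    return;    # nothing wrong
}

# One sub, used for both fields, instead of two copies of the same tests.
for my $pair ([username => 'chris'], [dest => 'chris @example.com']) {
    my ($label, $value) = @$pair;
    my $problem = field_problem($label, $value);
    print defined $problem ? "$problem\n" : "$label ok\n";
}
```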

The logic determining whether getdest or getname need to be rerun is
flawed, especially in the case of getdest.

You keep calling subroutines with an ampersand and no parens. Read
perlsub to find out what that means. We're not doing perl 4 anymore.
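
In short, as perlsub describes it: &name with no parens hands the
caller's current @_ to the callee. A minimal demonstration, with
made-up sub names:

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub caller_sub {
    my $with_amp    = &count_args;     # no parens: callee sees caller's @_
    my $with_parens = count_args();    # explicit empty argument list
    return ($with_amp, $with_parens);
}

my ($amp, $parens) = caller_sub('username', 'dest');
print "with ampersand: $amp args, with parens: $parens args\n";
```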

Apart from the logging, how is this script better than vi
/etc/mail/virtusertable, followed by a makemap? Is this to allow users
who don't know what they're doing to add stuff to the virtusers table?
If so, are you sure you want to do that? If not, then I doubt a bit that
it's useful. Most people who know what they're doing would most likely
prefer to edit the file directly. If you need that logged, put it under
control of sudo and CVS or something like that.

I personally don't see that this script is viable in the wide world,
although it may very well be useful inside of your company. I'd clean it
up a bit before taking it into production though, and I'd certainly stop
enforcing lowercase letters for everything. It will break things, and it
is totally unnecessary. I'd also either write a reasonably decent email
address validation routine (check the FAQ) or I'd leave it out
altogether. Half-baked attempts only give one a false sense of security.
People will believe/think that it has all been checked, only to find out
that it didn't really check for anything important at all.

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | I took an IQ test and the results
Commercial Dynamics Pty. Ltd.   | were negative.
NSW, Australia                  | 


------------------------------

Date: 12 Apr 2001 04:43:27 GMT
From: Ronda Hauben <ronda@panix.com>
Subject: Re: Stripping non standard control characters from a file
Message-Id: <9b3bpf$gl7$1@news.panix.com>

Ilmari Karonen <iltzu@sci.invalid> wrote:

: In article <9b1tco$d6i$1@news.panix.com>, Ronda Hauben wrote:
:>I am working on using ord in a loop to convert each character of a file 
:>and test to see whether the character is in the range of standard ascii 
:>characters of 32-127
:>

thanks for the various suggestions how to deal with the problem
I posted.

What I learned from trying them is that Wordstar is more complicated
than I thought. It uses characters above 128 to represent the last
letter in many of the words. What I found is if I looked at the 
file in vi I would find a hex number like xee at the end of a word.
When I converted that to decimal I had 238. I subtracted 128 from that
and ended up with 110 or "n". Similarly I had the hex number xf8. I
subtracted 128 and ended up with 120, which is the decimal for
"x".
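
The arithmetic above can be checked with a few lines. Subtracting 128
from a byte that has its high bit set is the same as clearing that bit;
strip_high_bit is an invented name.

```perl
use strict;
use warnings;

# Clear the high bit of a byte value; for values of 128 and up this is
# the same as subtracting 128.
sub strip_high_bit {
    my ($byte) = @_;
    return $byte & 0x7f;
}

for my $byte (0xee, 0xf8) {
    my $low = strip_high_bit($byte);
    printf "0x%02x (%d) -> %d -> '%s'\n", $byte, $byte, $low, chr $low;
}
```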

I tried to post some of the wordstar code but the tin newsreader
wouldn't post the non-ASCII characters, so I will put some wordstar code
at http://www.columbia.edu/~rh120/other/tocovert.ws 

Thanks for any help with this.

Ronda
ronda@panix.com



------------------------------

Date: 12 Apr 2001 00:14:26 -0500
From: logan@cs.utexas.edu (Logan Shaw)
Subject: Re: Stripping non standard control characters from a file
Message-Id: <9b3dji$1qn$1@charity.cs.utexas.edu>

In article <9b3bpf$gl7$1@news.panix.com>,
Ronda Hauben  <ronda@panix.com> wrote:
>What I learned from trying them is that Wordstar is more complicated
>than I thought. It uses characters above 128 to represent the last
>letter in many of the words. What I found is if I looked at the 
>file in vi I would find a hex number like xee at the end of a word.
   :
   :
>I tried to post some of the wordstar code but the tin newsreader
>wouldn't post the non ascii. so I will put some wordstar code
>at http://www.columbia.edu/~rh120/other/tocovert.ws 

Here's some documentation on the WordStar format:

http://www.geocities.com/SiliconValley/Lakes/2160/fformats/files/wordstar.txt

Based on what I read there (in the second paragraph), I think a decent
first crack would be this:

	while (<>)
	{
		tr/\x80-\xff/\x00-\x7f/;	# mask most significant bit
		tr/\n\x20-\x7f//dc;		# nuke control characters
		s/ +$//;			# nuke spaces at end of line
		print;
	}

Hope that helps.

  - Logan
-- 
my  your   his  her   our   their   _its_
I'm you're he's she's we're they're _it's_


------------------------------

Date: Thu, 12 Apr 2001 15:26:14 +1000
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: Stripping non standard control characters from a file
Message-Id: <slrn9daf3l.482.mgjv@martien.heliotrope.home>

On 12 Apr 2001 04:43:27 GMT,
	Ronda Hauben <ronda@panix.com> wrote:
> Ilmari Karonen <iltzu@sci.invalid> wrote:
> 
>: In article <9b1tco$d6i$1@news.panix.com>, Ronda Hauben wrote:
>:>I am working on using ord in a loop to convert each character of a file 
>:>and test to see whether the character is in the range of standard ascii 
>:>characters of 32-127
>:>
> 
> thanks for the various suggestions how to deal with the problem
> I posted.
> 
> What I learned from trying them is that Wordstar is more complicated
> than I thought.

Maybe you should have a look at a description of the wordstar file
format at:

http://www.csdn.net/dev/Format/text/wordst.htm

and take it from there.

Especially the sentence:

    ...a raw stream of the printable text in a wordstar file could
    therefore be discerned by masking off the 8th bit and discarding
    codes in the range of 00h through 1fh.

should give you the only two operations you need to remove the wordstar
special stuff from the file.
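
As a sketch, those two operations could look like this. Keeping "\n" is
an added assumption so that line structure survives, the sample bytes
are made up rather than real WordStar data, and wordstar_to_text is an
invented name.

```perl
use strict;
use warnings;

# The two operations from the quoted sentence, applied to a byte string.
sub wordstar_to_text {
    my ($bytes) = @_;
    $bytes =~ tr/\x80-\xff/\x00-\x7f/;     # mask off the 8th bit
    $bytes =~ tr/\x00-\x09\x0b-\x1f//d;    # discard 00h-1fh, except "\n" (0ah)
    return $bytes;
}

my $sample = "Perl i\xf3 fu\xee\x02\n";    # made-up sample bytes
print wordstar_to_text($sample);
```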

Martien
-- 
Martien Verbruggen              | 
Interactive Media Division      | Never hire a poor lawyer. Never buy
Commercial Dynamics Pty. Ltd.   | from a rich salesperson.
NSW, Australia                  | 


------------------------------

Date: 12 Apr 2001 05:49:59 GMT
From: Ronda Hauben <ronda@panix.com>
Subject: Re: Stripping non standard control characters from a file
Message-Id: <9b3fm7$iao$1@news.panix.com>

Logan Shaw <logan@cs.utexas.edu> wrote:
: In article <9b3bpf$gl7$1@news.panix.com>,
: Ronda Hauben  <ronda@panix.com> wrote:
:>What I learned from trying them is that Wordstar is more complicated
:>than I thought. It uses characters above 128 to represent the last
:>letter in many of the words. What I found is if I looked at the 
:>file in vi I would find a hex number like xee at the end of a word.
:    :
:    :
:>I tried to post some of the wordstar code but the tin newsreader
:>wouldn't post the non ascii. so I will put some wordstar code
:>at http://www.columbia.edu/~rh120/other/tocovert.ws 

: Here's some documentation on the WordStar format:

: http://www.geocities.com/SiliconValley/Lakes/2160/fformats/files/wordstar.txt

: Based on what I read there (in the second paragraph), I think a decent
: first crack would be this:

: 	while (<>)
: 	{
: 		tr/\x80-\xff/\x00-\x7f/;	# mask most significant bit
: 		tr/\n\x20-\x7f//dc;		# nuke control characters
: 		s/ +$//;			# nuke spaces at end of line
: 		print;
: 	}

: Hope that helps.


:   - Logan


I did try it on a short file and it seemed to work. I'll try
this more extensively tomorrow. Thanks :-)

#!/usr/bin/perl

my $source = shift @ARGV;
my $destination = shift @ARGV;

open IN, $source or die "Can't read source file $source: $!\n";
open OUT, ">$destination" or die "Can't write on file $destination: $!\n";

print "Copying $source to $destination\n";

while (<IN>) {

        {
                tr/\x80-\xff/\x00-\x7f/;        # mask most significant bit
                tr/\n\x20-\x7f//dc;             # nuke control characters
                s/ +$//;                        # nuke spaces at end of line
                print;
        }


print OUT $_;

}


Ronda



------------------------------

Date: 12 Apr 2001 05:54:03 GMT
From: Ronda Hauben <ronda@panix.com>
Subject: Re: Stripping non standard control characters from a file
Message-Id: <9b3ftr$iao$2@news.panix.com>

Martien Verbruggen <mgjv@tradingpost.com.au> wrote:
: On 12 Apr 2001 04:43:27 GMT,
: 	Ronda Hauben <ronda@panix.com> wrote:
:> Ilmari Karonen <iltzu@sci.invalid> wrote:
:> 
:>: In article <9b1tco$d6i$1@news.panix.com>, Ronda Hauben wrote:
:>:>I am working on using ord in a loop to convert each character of a file 
:>:>and test to see whether the character is in the range of standard ascii 
:>:>characters of 32-127
:>:>
:> 
:> thanks for the various suggestions how to deal with the problem
:> I posted.
:> 
:> What I learned from trying them is that Wordstar is more complicated
:> than I thought.

: Maybe you should have a look at a description of the wordstar file
: format at:

: http://www.csdn.net/dev/Format/text/wordst.htm

: and take it from there.

: Especially the sentence:

:     ...a raw stream of the printable text in a wordstar file could
:     therefore be discerned by masking off the 8th bit and discarding
:     codes in the range of 00h through 1fh.

: should give you the only two operations you need to remove the wordstar
: special stuff from the file.

: Martien
: -- 

I made up a small program to do all this a few years
ago, but then forgot what the format of wordstar was that I had found
at that time. I'll take a look at the format material.
Thanks for pointing me to it.

Ronda



------------------------------

Date: Thu, 12 Apr 2001 06:52:32 GMT
From: Uri Guttman <uri@sysarch.com>
Subject: Re: Stripping non standard control characters from a file
Message-Id: <x7itkaobio.fsf@home.sysarch.com>

>>>>> "RH" == Ronda Hauben <ronda@panix.com> writes:


  RH> #!/usr/bin/perl

  RH> my $source = shift @ARGV;
  RH> my $destination = shift @ARGV;

  RH> open IN, $source or die "Can't read source file $source: $!\n";
  RH> open OUT, ">$destination" or die "Can't write on file $destination: $!\n";

learn about the -n and -p options for perl. they are perfect for this
kind of stuff.

  RH>                 print;
  RH>         }

  RH> print OUT $_;

you are printing to the file and stdout? other than for debugging, why?
try this with: strip_wstar input_file > output_file

#!/usr/local/bin/perl -p

	tr/\x80-\xff/\x00-\x7f/;        # mask most significant bit
	tr/\n\x20-\x7f//dc;             # nuke control characters
	s/ +$//;                        # nuke spaces at end of line

uri

-- 
Uri Guttman  ---------  uri@sysarch.com  ----------  http://www.sysarch.com
SYStems ARCHitecture and Stem Development ------ http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info:     http://www.sysarch.com/perl/OOP_class.html


------------------------------

Date: Thu, 12 Apr 2001 02:10:41 -0500
From: Dmitry Epstein <mitiaNOSPAM@northwestern.edu.invalid>
Subject: Re: what are the new languages?
Message-Id: <3AD554F1.FD2E895B@northwestern.edu.invalid>

Bart Lateur wrote:
> 
> Dmitry Epstein wrote:
> 
> >I tried installing the IDE, but right away the installer warned me that
> >the program won't work with directories that have spaces in them.  What
> >the hell?!  I've never seen this before.
> 
> You're not used to much, then. The same restriction pretty much applies
> to perl itself.

Don't get me wrong: I do not approve of spaces in filenames.  However,
I've never seen a program that would outright refuse to work with them. 
Either it's a case of snobbery (well screw them then!) or incompetence. 
None of the Perls that I used on PC ever had a problem with spaces, by
the way.  You quote the path or backslash the space, that's all.


------------------------------

Date: Thu, 12 Apr 2001 08:00:17 GMT
From: Bart Lateur <bart.lateur@skynet.be>
Subject: Re: Why Perl?
Message-Id: <s2oadt0ggfq4oso0qljcjc42centjbrrmj@4ax.com>

GrapeApe wrote:

>I imagine the lists archives are at
>http://www.its.unimelb.edu.au/hma/pub/macperl/
>I will peruse those before I sub. Don't see a sub form there anyway.

It's not your imagination, but it's not the only archive. The base for
the mailing lists is at:

	<http://www.macperl.org/depts/mlist.html>

where you can find info on how to subscribe/unsubscribe, and access the
archives.

-- 
	Bart.


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 683
**************************************
