[19758] in Perl-Users-Digest


Perl-Users Digest, Issue: 1953 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Thu Oct 18 03:05:31 2001

Date: Thu, 18 Oct 2001 00:05:10 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <1003388709-v10-i1953@ruby.oce.orst.edu>
Content-Type: text

Perl-Users Digest           Thu, 18 Oct 2001     Volume: 10 Number: 1953

Today's topics:
    Re: Array / Regular Expression / Searching (Garry Williams)
    Re: Change Last Modified for all files in a directory (BUCK NAKED1)
    Re: DBI::ProxyServer error on WindowsNT (Keith Clay)
    Re: How can I do the same in perl ? - VB Command SenKey <goldbb2@earthlink.net>
    Re: ImageMagick or Gimp?? <goldbb2@earthlink.net>
        newbie timeout issue <aj@atomicawebdesign.com>
        Pbm Plus on Win32? <Gala@nonono.com>
    Re: pcl printer codes <pne-news-20011018@newton.digitalspace.net>
    Re: Perl CGI problem printing Javascript... (BUCK NAKED1)
    Re: Perl CGI problem printing Javascript... <joe+usenet@sunstarsys.com>
    Re: precedence question <dtweed@acm.org>
    Re: precedence question (Martien Verbruggen)
    Re: precedence question <joe+usenet@sunstarsys.com>
    Re: put more simply: putting $1..$9 into a replacement  <goldbb2@earthlink.net>
    Re: putting $1..$9 into a replacement string such as $r <goldbb2@earthlink.net>
    Re: reference as a sub name <goldbb2@earthlink.net>
    Re: Scaling a DNA string <dtweed@acm.org>
    Re: Scaling a DNA string (Jay Tilton)
    Re: Scaling a DNA string (Martien Verbruggen)
    Re: Specifying a range of values <goldbb2@earthlink.net>
    Re: Splitting on value pairs <mbudash@sonic.net>
    Re: Splitting on value pairs <tintin@snowy.calculus>
    Re: String To List <goldbb2@earthlink.net>
    Re: Troubleshootng Bundle::libnet installation <Tassilo.Parseval@post.rwth-aachen.de>
    Re: Writing and reading encrypted string (password) <goldbb2@earthlink.net>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: Thu, 18 Oct 2001 04:07:15 GMT
From: garry@ifr.zvolve.net (Garry Williams)
Subject: Re: Array / Regular Expression / Searching
Message-Id: <slrn9sslbj.l1i.garry@zfw.zvolve.net>

[ Post re-ordered.  Please don't top post. ]

On Thu, 18 Oct 2001 03:54:00 GMT, Geoff Clark <gclark@wavetel.com> wrote:
> "Garry Williams" <garry@ifr.zvolve.net> wrote in message
> news:slrn9ssjm8.l1i.garry@zfw.zvolve.net...
>> On Thu, 18 Oct 2001 02:29:56 GMT, Geoff Clark <gclark@wavetel.com> wrote:
>>
>> It would be a big help, if you gave us complete code that illustrates
>> your problem.  The code you show has no way of "displaying the
>> result".
>>
>> Maybe this will help:
>>
>>   while ($current_line = <TEXT_FILE>) {
>>     if ( substr($current_line, 0, 14) eq $mac ) {
>>       print +(split " ", $current_line)[2], "\n";
>>     }
>>   }
>>
>> > I am not real comfortable with
>> > regular Expressions.  If you someone could assist with this.  I would
>> > greatly appreciate it.
>>
>> You haven't defined a requirement for a regular expression.
> 
> This is what I have tried :
> 
> #!/usr/bin/perl
> use warnings;
> use strict;
> 
> use vars qw($current_line $mac);


Lose this statement.  There's no need to declare package variables.  


> my $mac = 'p10020D6ACB204';


  open(DHCP, "dhcp.txt") || die "can't open `dhcp.txt': $!\n";


> while ($current_line = <dhcp.txt>) {


  while ( my $current_line = <DHCP>) {


>  if ( substr($current_line, 0, 14) eq $mac ) {
>   print +(split " ", $current_line)[2], "\n";
>  }
> }
> 
> I am receiving a blank result.

Because you didn't open the file you wish to read and you are reading
from a nonexistent filehandle.  
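
Putting the corrections above together, a complete version might look
like the sketch below.  The sample data and its layout (identifier in the
first 14 columns, wanted value in the third whitespace-separated field)
are assumptions made for illustration, and an in-memory filehandle stands
in for dhcp.txt so the example is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $mac = 'p10020D6ACB204';

# Hypothetical sample data; the real dhcp.txt layout is an assumption.
my $data = "p10020D6ACB204 lease 192.168.0.10\n"
         . "p10020D6ACB999 lease 192.168.0.11\n";

# Open a filehandle before reading from it.  For a real file this would be
#   open(my $fh, '<', 'dhcp.txt') or die "can't open `dhcp.txt': $!\n";
open(my $fh, '<', \$data) or die "can't open in-memory file: $!\n";

my @found;
while ( my $current_line = <$fh> ) {
    # Compare the first 14 characters of the line with the identifier.
    next unless substr($current_line, 0, 14) eq $mac;
    # Keep the third whitespace-separated field.
    push @found, (split " ", $current_line)[2];
}
close($fh);

print "$_\n" for @found;    # prints 192.168.0.10
```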

-- 
Garry Williams


------------------------------

Date: Wed, 17 Oct 2001 22:54:07 -0500 (CDT)
From: dennis100@webtv.net (BUCK NAKED1)
Subject: Re: Change Last Modified for all files in a directory
Message-Id: <28417-3BCE525F-45@storefull-246.iap.bryant.webtv.net>

> goldbb2@earthlink.net  (Benjamin Goldberg) 
> > What A Man ! wrote: 
> > My script below does not change the 
> > last modified date. 

> It *did* change them, but you didn't 
> press reload, so you didn't see it. 

> [ snip some good info from Ben ]

>  So press shift when reloading when 
>  you run into this kind of problem. 

I was using IE5, and had my browser open for several hours, hitting
"Cntrl R"  (along with Refresh in the toolbar) to refresh. Thanks for
the OT tip... so "Cntrl, Shift, R" reloads, eh? I don't always see a
Reload function in the IE toolbar, so I thought just a plain "Cntrl R"
was reloading it.

BTW... this group is lucky to have people like you. You are a wealth of
information, kind, and have a noted sense of humor.

Regards,
Dennis (aka whataman@home.com)

---------------
"Bathing yourself is a simple task at age 20, difficult at 50, and
almost impossible to do after 70... unless of course, you're fat, then
it's a big hassle at any age."



------------------------------

Date: 17 Oct 2001 21:47:36 -0700
From: clayk@acu.edu (Keith Clay)
Subject: Re: DBI::ProxyServer error on WindowsNT
Message-Id: <8947fc5.0110172047.3e46716f@posting.google.com>

Ron Reidy <rereidy@indra.com> wrote in message 
> This problem is from Log.pm.  Investigate this by looking at the file in
> an editor.

What I found out after posting is that perl 5.6.1 does not
support the 5.005 threading so it appears that we are going to go back
to 5.005 and see if this fixes the problem.  Any additional input is
welcome.

keith


------------------------------

Date: Thu, 18 Oct 2001 00:18:12 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: How can I do the same in perl ? - VB Command SenKeys
Message-Id: <3BCE5804.34BDE1B4@earthlink.net>

Tonino Sclavont wrote:
> 
> I'd like to send Keys to an external application (Win32)
> With VB it's look like this (and it works)
>     AppActivate 2048
>     SendKeys "tagada"
> Where 2048 is the process number of an external application.
> 
> Is it possible to do the same with perl ?

Probably.  Look at the Win32:: hierarchy of modules.

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 01:34:24 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: ImageMagick or Gimp??
Message-Id: <3BCE69E0.7D6748D0@earthlink.net>

Bart Lateur wrote:
> 
> tszeto wrote:
> 
> >I'd like to write a script to convert Adobe Photoshop and Illustrator
> >files to thumbnails.
> >
> >What's the best way to do this? Should I use ImageMagick or the Gimp?
> 
> Photoshop, I'd think. Photoshop 5 supports a primitive form of
> scripting. But then you can't do it using Perl.

Of course you can.  You could use either IPC::Open2 or Expect.pm from
perl to run photoshop :)

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 16:45:48 +1000
From: "aj" <aj@atomicawebdesign.com>
Subject: newbie timeout issue
Message-Id: <TTuz7.812$Qj7.60809@ozemail.com.au>

hello

why would I get a cgi-server timeout problem when I try to access a perl
script? I have set perl.exe as the default application to handle perl file
extensions on windows 2000 IIS, and have checked that perl is installed
properly.

thanks.




------------------------------

Date: Thu, 18 Oct 2001 06:10:48 GMT
From: "Gala" <Gala@nonono.com>
Subject: Pbm Plus on Win32?
Message-Id: <Inuz7.29856$gT6.18183104@news1.rdc1.sfba.home.com>

Is it possible to install and use Pbm Plus on a win box?

Also, can it be installed on a unix box without having telnet access? I only have ftp access to our unix web server.

If there's a better place to ask about Pbm Plus could someone please direct me there?

Thanks a lot for any help.

-Gala




------------------------------

Date: Thu, 18 Oct 2001 08:23:06 +0200
From: Philip Newton <pne-news-20011018@newton.digitalspace.net>
Subject: Re: pcl printer codes
Message-Id: <1gssstse7tnan982672m8gjadci6i5ja5m@4ax.com>

On Thu, 18 Oct 2001 03:16:19 GMT, garry@ifr.zvolve.net (Garry Williams)
wrote:

> the OP apparently meant decimal numbers in his original post

He almost certainly did -- the first number was '027' which is the code
for the Escape character in decimal.

Cheers,
Philip
-- 
Philip Newton <nospam.newton@gmx.li>
That really is my address; no need to remove anything to reply.
If you're not part of the solution, you're part of the precipitate.


------------------------------

Date: Wed, 17 Oct 2001 23:17:43 -0500 (CDT)
From: dennis100@webtv.net (BUCK NAKED1)
Subject: Re: Perl CGI problem printing Javascript...
Message-Id: <28418-3BCE57E7-20@storefull-246.iap.bryant.webtv.net>

<SCRIPT LANGUAGE="Javascript" SRC="www.pschallenge.com/cgi-bin/bc.pl"
type="text/javascript"> 
</script>

Don't know if this is what's causing your prob, but you're using
"<script language=javascript... blah... type=text/javascript>", and then
you use a PERL file for your script file. IOW, shouldn't your SRC file
end in .js, and be a JAVASCRIPT file?

Regards,
--Dennis



------------------------------

Date: 18 Oct 2001 02:56:37 -0400
From: Joe Schaefer <joe+usenet@sunstarsys.com>
Subject: Re: Perl CGI problem printing Javascript...
Message-Id: <m33d4hbfmi.fsf@mumonkan.sunstarsys.com>

Brian Carlson <bcarlso4@bellsouth.net> writes:

> I have a perl cgi script, that when ran, prints out javascript.  I have
> the following in my page....
> 
> <SCRIPT LANGUAGE="Javascript" SRC="www.pschallenge.com/cgi-bin/bc.pl"
> type="text/javascript">
> </script>
> 
> This doesn't work.  
  ^^^^^^^^^^^^^^^^^

Poor choice of words- but based on your description of what you 
claim "does" work, I'd say you need a valid URI in the SRC attribute,
or maybe a different LANGUAGE value, or maybe an uppercase </SCRIPT>
tag, or maybe a different Content-Type header (text/plain?) from 
your Perl script, or maybe ...

In any case, the first order of business is to recognize that none of
this is Perl related, and to try again in a group that deals with such 
inter-networking issues.

-- 
Joe Schaefer     "We are all in the gutter, but some of us are looking at the
                                           stars."
                                               -- Oscar Wilde



------------------------------

Date: Thu, 18 Oct 2001 04:30:57 GMT
From: Dave Tweed <dtweed@acm.org>
Subject: Re: precedence question
Message-Id: <3BCE59A1.1A00BAB3@acm.org>

Richard Trahan wrote:
> I disagree. Right- and left-associativness are only used as tie-
> breakers when two operators of the same precedence are delivered to
> the parser; right-associative causes a parser "shift", and left-
> causes a "reduce".

Granted, you could have a right-associative operator that evaluates its
LHS first, but most people would find this extremely counterintuitive
and it just isn't done that way. It only matters when evaluating the
operands has side-effects anyway.

> my $a=5;
> (print("left $a\n")),$a = (print ("right $a\n")),7;
> print "$a\n";
> 
> The output clearly shows that the left print statement is being
> evaluated (I assume you mean 'executed', not tokenized) before
> the right. Notice that $a winds up with a value of 1, which is
> the return value of the right print (1 for true); this is because
> the precedence of "=" is higher than that of ",", otherwise $a
> would wind up as 7.

Exactly. This is a sequence of three expressions, evaluated in order
left-to-right ("," is left-associative). The middle expression happens
to be an assignment, whose right-hand side gets evaluated before its
left-hand side. All as described in the documentation.
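
A small sketch of the same point, using a list instead of void context
so that no "useless use of constant" warning gets in the way.  The
note() helper is hypothetical; it only records the order in which its
calls run.

```perl
use strict;
use warnings;

my @order;

# Hypothetical helper: remember the label, hand back the value.
sub note {
    my ($label, $value) = @_;
    push @order, $label;
    return $value;
}

my $v = 5;

# "=" binds tighter than ",", so this is a three-element list whose
# middle element is the assignment  $v = note('right', 20).
my @list = ( note('left', 10), $v = note('right', 20), 30 );

print "order: @order\n";    # prints order: left right
print "v = $v\n";           # prints v = 20
print "list: @list\n";      # prints list: 10 20 30
```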

> There are two other bizarre aspects of the above code. If run
> as is, you get a "useless use of constant in void context"
> warning, presumably the 7, which is reasonable because it's not
> used for anything. But if you change the 7 to a 1, the warning
> goes away. Could this be due to "constant folding"? If so, this
> is surely a bug in the warning system.

An expression in void context that evaluates to a constant 1 (or 0)
is an idiomatic usage that is specifically allowed. It is often used
as the last line of a module or other "require"d file to indicate to
its caller that it has initialized correctly (or not).

> Second, if you remove the "my $a=5" line, the program will warn
> about concatenation of an unitialized variable, which makes sense,
> but more importantly, it doesn't die for lack of a "my" or "our"
> declaration for $a. In fact, "use strict" doesn't seem to work at
> all in toplevel code (but works in subs). How come?

$a and $b are special symbols, used in particular in sort functions.
They're sort of pre-declared, so no warnings or compilation errors
occur. Try the same thing with $x and you'll see the difference.
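
Here is a minimal sketch of that difference.  It compiles two snippets
with a string eval so that the compile-time failure can be caught and
reported instead of aborting the whole program.

```perl
use strict;
use warnings;

# $a is exempt from "use strict 'vars'", so this compiles and runs.
my $ok_a = eval q{ use strict; $a = 5; 1 };

# $x is an ordinary undeclared variable, so compilation fails and
# eval returns undef with the reason left in $@.
my $ok_x = eval q{ use strict; $x = 5; 1 };
my $error = $@;

print 'with $a: ', ( $ok_a ? "compiles" : "fails" ), "\n";
print 'with $x: ', ( $ok_x ? "compiles" : "fails" ), "\n";
print "error was: $error" if !$ok_x;
```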

That's the neat thing about Perl. It's got all of these little twists
that make it easy to do common things, but that can really trip you
up if you try to think about it too formally.

-- Dave Tweed

P.S. Do you know what the following C code prints, and why?

   int i = 2;
   printf ("%d %d\n", i++, i++);


------------------------------

Date: Thu, 18 Oct 2001 05:56:42 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: precedence question
Message-Id: <slrn9ssrok.e62.mgjv@verbruggen.comdyn.com.au>

On Thu, 18 Oct 2001 04:30:57 GMT,
	Dave Tweed <dtweed@acm.org> wrote:
> Richard Trahan wrote:

[Stuff about how perl deals with multiple auto-increments in a single
expression]

> P.S. Do you know what the following C code prints, and why?

#include <stdio.h>

>    int i = 2;
>    printf ("%d %d\n", i++, i++);

I fail to see how this has any bearing on how the current
implementation of perl does these things [1]. The C standard defines the
language in very different terms from how Perl is defined, and in fact
explicitly defines many things (or declares them undefined or
implementation-specific) on which the Perl manuals and documentation
are simply silent. You can't deduce perl's behaviour from how an
'equivalent' C program behaves.

\begin{offtopic}

Apart from that, the above is allowed to print several things. While
there is a sequence point just before the function call, the order of
evaluation of its arguments isn't specified, as explicitly stated in
the C specification.

This is why the second part of your question ("and why?") doesn't
really make sense.

The above code prints "3 2\n" on Linux, with gcc -ansi, and "2 3\n"
on Solaris with gcc -ansi and cc -Xc.

\end{offtopic}

Martien

[1] And we've had discussions about this on this newsgroup before. It
isn't at all clear _how_ perl is _supposed_ to do this. However, since
there is only a single implementation of Perl, and no formal and
complete language specification, this implementation is the only thing
that defines 'correct' behaviour.
-- 
Martien Verbruggen              | 
                                | 42.6% of statistics is made up on the
Trading Post Australia Pty Ltd  | spot.
                                | 


------------------------------

Date: 18 Oct 2001 02:31:53 -0400
From: Joe Schaefer <joe+usenet@sunstarsys.com>
Subject: Re: precedence question
Message-Id: <m37kttbgrq.fsf@mumonkan.sunstarsys.com>

Dave Tweed <dtweed@acm.org> writes:

[...]

> The middle expression happens to be an assignment, whose right-hand
> side gets evaluated before its left-hand side. All as described in the
> documentation.  

Where exactly in the Perl documentation does it say that the RHS 
of an assignment always gets _fully evaluated_ *before* the LHS does?
Citing simple rules of associativity is insufficient for this- C has
essentially the same associativity and precedence rules as Perl,
but you certainly can't make the same claim for C.

However, I believe it is possible to craft a narrow and very carefully 
worded statement that supports your claim, but I have not been able to 
find such a thing in the standard documentation.

[...other stuff I completely agree with snipped...]

> P.S. Do you know what the following C code prints, and why?
> 
>    int i = 2;
>    printf ("%d %d\n", i++, i++);

Who cares what it prints?  It is a textbook example of the dreaded
"undefined behavior".  From K&R v2 p53:

  ... Similarly, the order in which function arguments are evaluated
  is not specified, so the statement

      printf ("%d %d\n", ++n, power(2,n)); /* WRONG */

  can produce different results with different compilers, ...

  ...continued on p54...
  The moral is that writing code that depends on order of evaluation
  is bad programming practice in any language.  Naturally, it is 
  necessary to know what things to avoid, but if you don't know _how_
  they are done on various machines, you won't be tempted to take 
  advantage of a particular implementation.

-- 
Joe Schaefer
perl -wle '$,=" ";{ my @x;sub x {if(@_){push @x,@_; return sub{push @x,@_;@x}}
                                                           sub{push @x,@_;@x}}
                  }     print x ("Just")->("another"), x -> ("perl","hacker,")'



------------------------------

Date: Thu, 18 Oct 2001 02:38:12 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: put more simply: putting $1..$9 into a replacement string such as $repl="=csinfo$1$2^$3"
Message-Id: <3BCE78D4.52F218FB@earthlink.net>

Bart Lateur wrote:
[snip]
> One solution would be to put your string between quotes, and eval the
> RHS. Or use the /e modifier. Like this:
> 
>         $strText =~ s/$search/'"'.$repl.'"'/giee;
> 
> You might already guess what can go wrong with this... first of all:
> what happens with embedded quotes? Backslashes? What about likely
> unsafe occurrences of "@{[...]}", "${\(...)}", or even "$foo{code()}"
> and "$foo[code()]"?
[snip]
> Am I missing another option?

Use something like the above, but instead of evaling it with /ee, do
something like:

my $safe = Safe->new();
$safe->permit_only(qw[uc lc quotemeta concat]);
$strText =~ s/$search/$safe->reval(qq[qq[$repl]])/gie;

You might need to allow a few more ops than that, but this should
provide a reasonable amount of safety.

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 01:03:01 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: putting $1..$9 into a replacement string such as $repl="=csinfo$1$2^$3"
Message-Id: <3BCE6285.A123ADDE@earthlink.net>

sh�wd�g wrote:
> 
> problem
> 
> the program reads tokens from a text file, with lines like
>     \=csinfo(.*)([A-Za-z]+)\/([A-Za-z]+)|=csinfo$1$2^$3
> these lines are separated by the |, and are read into $search and
> $replace:
> 
> $search="\=csinfo(.*)([A-Za-z]+)\/([A-Za-z]+)";
> $replace="=csinfo$1$2^$3";
> 
> I then try
>     s{$search}{$replace}g;
> and I get (for example):
> 
> =csinfo 00000130c/c overruled on other grounds
>   becomes
> =csinfo$1$2^$3 overruled on other grounds
> 
> so the numbered variables are not being evaluated, they are being
> treated as literal strings.  What shall I do?


$search =  q[=csinfo(.*)([A-Za-z]+)/([A-Za-z]+)];
$replace = q[=csinfo$1$2^$3];

One way is:

s[$search][qq[qq[$replace]]]ee;

Another way is:

eval qq[s[\$search][$replace]];

Both of these result in $replace being evaled as a "" type string.
This can be unsafe, but the alternative would be to parse out the
variables, and do the replacement 'by hand':

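if( m[$search] ) {
    # @- holds the start offsets and @+ the end offsets of the match
    # and its groups; copy them before the inner match overwrites them.
    my @s = @-; my @e = @+;
    # Start with the text that precedes the match.
    my ($out, $lastpos) = (substr($_, 0, $s[0]), 0);
    while( $replace =~ /\$(\d+)/g ) {
        # Literal part of the replacement up to this $N ...
        $out .= substr($replace, $lastpos, $-[0] - $lastpos);
        # ... then the text captured by group N.
        $out .= substr($_, $s[$1], $e[$1] - $s[$1]);
        $lastpos = pos($replace);
    }
    $out .= substr($replace, $lastpos);
    # Finish with the text that follows the match.
    $out .= substr($_, $e[0]);
    $_ = $out;
}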

Isn't that ugly?  Not to mention that it doesn't handle any kind of
escapes in the replacement, like \n, \$, etc.

The safest thing would be to make a variant of
    eval qq[s[\$search][$replace]];
which used the Safe.pm module, turning everything off except matching,
substitution, and concatenation.
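
A middle ground that evaluates nothing at all is to collect the captures
from a list-context match and substitute them into the template with a
second, fixed regex.  The sketch below assumes that only $1..$9
placeholders need support; expand_template is a hypothetical helper
name, and only the first match is replaced (no /g behaviour).

```perl
use strict;
use warnings;

# Hypothetical helper: expand $1..$9 in a template from a list of
# captured strings.  The template is never evaluated as Perl code.
sub expand_template {
    my ($template, @captures) = @_;
    my $out = $template;
    # A literal dollar sign followed by one digit selects a capture;
    # groups that did not take part in the match expand to ''.
    $out =~ s/\$([1-9])/defined $captures[$1 - 1] ? $captures[$1 - 1] : ''/ge;
    return $out;
}

my $search  = q[=csinfo(.*)([A-Za-z]+)/([A-Za-z]+)];
my $replace = q[=csinfo$1$2^$3];
my $text    = '=csinfo 00000130c/c overruled on other grounds';

if ( my @captures = $text =~ /$search/ ) {
    # Remember where the whole match sits before calling the helper.
    my ($start, $length) = ( $-[0], $+[0] - $-[0] );
    substr($text, $start, $length) = expand_template($replace, @captures);
}

print "$text\n";    # prints =csinfo 00000130c^c overruled on other grounds
```

Escapes such as \n or \$ in the replacement are still not handled; they
would need their own rules in expand_template.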

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 02:17:11 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: reference as a sub name
Message-Id: <3BCE73E7.2103388B@earthlink.net>

perl misk wrote:
> 
> reference as a sub name (I think)
> 
> I have a config file:
> 
> config.file
> @abc = qw %a b c%;
> @num = qw %1 2 3%;
> 
> and a Perl module that requires the above file.

It would be better to have the config file as:
{
    abc => [qw %a b c%],
    num => [qw %1 2 3%],
};

and a perl module which do()s the file, as follows (do returns the
hash reference, so dereference it):
my %config = %{ do "./config.file" };

> A sub within the pm is called with a user name as a parameter:
> 
> sub get_list {
>     my $user = shift;
> 
>     <taint check code snipped>
> 
>     no strict 'refs';
>     print @main::{$user};       # how can I access the users' array?
> }

sub get_list {
    my $user = shift;
    print @{ $config{$user} };
}
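
A self-contained sketch of this arrangement follows.  It writes a
throwaway config file with File::Temp so it can run anywhere; the file
contents and the leading "+" (which forces the braces to be read as an
anonymous hash rather than a block) are choices made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a small config file; it evaluates to a reference to a hash.
my ($fh, $path) = tempfile( UNLINK => 1 );
print {$fh} <<'END_CONFIG';
+{
    abc => [qw(a b c)],
    num => [qw(1 2 3)],
};
END_CONFIG
close($fh) or die "can't close $path: $!\n";

# do FILE returns the value of the last expression in the file.
my $config = do $path;
die "can't load $path: ", ( $@ || $! ), "\n" unless ref $config eq 'HASH';

# Look the user up in the hash; unknown users give an empty list.
sub get_list {
    my $user = shift;
    my $list = $config->{$user} or return;
    return @$list;
}

print join(' ', get_list('abc')), "\n";    # prints a b c
print join(' ', get_list('num')), "\n";    # prints 1 2 3
```

Because the user name is only ever a hash key, no symbolic references
are needed and an unexpected name cannot reach other package variables.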

> The above assumes I want to get an element, instead I want the whole
> array.

Actually, your problem is different.  @main::{$user} does not do what
you think that it does.  $main::{$user} is a glob which contains a
variable in package "main", and @main::{$user} is a one element list
containing that glob.

To access a variable whose name is $user, you can do one of the
following:

    $some_glob = $main::{$user}; # get the glob.
    $aref = *$some_glob{ARRAY}; # get the array portion of it.
    print @$aref; # dereference the array.

Or:
    $aref = *{$main::{$user}}{ARRAY};
    print @$aref; # dereference the array.

Or:
    print @{*{$main::{$user}}{ARRAY}};

Or:
    print do{no strict 'refs'; @{"main::$user"}};

> And is there a way to "use extra-strict".  Something that I don't like
> about Perl is I now have to check each sub at the start to ensure it
> was called with valid parameters, maybe this is too bigger overhead to
> worry about but I am very pessimistic - but not desperate so I wont
> use Java.

Well, you could use subroutine prototypes... but perl5's prototyping
often doesn't help you as much as you'd like.  Perl6 will have more
useful prototype checking, which you can have in your perl5 programs by
adding a line:
    use Perl6::Parameters;
You have to get this from CPAN.

However, I don't think there's really all that serious a need for
them...  Why do you assume that folks will call your subs wrong?  Or
that you will call them wrong yourself?

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 05:29:34 GMT
From: Dave Tweed <dtweed@acm.org>
Subject: Re: Scaling a DNA string
Message-Id: <3BCE6752.1E500A87@acm.org>

DocDodge wrote:
> What I really want to scale the string to is this ...-GC-G--... ( or this
> ...-GC--G-..., small position shifts are inconsequential as long as the
> number of motifs is accurate).
> 
> I hope this is clear and not to long winded.  I figured the problem would be
> more fun if the background context was included.

Very clear, and an interesting problem.

Here's my approach, which shows the relative lengths of the gaps between
motifs using a logarithmic scale. I'm assuming that any sequence of four
*or more* Cs or Gs is a motif, but you can change the first argument of
the split as needed.

You can fiddle with the base of the logarithm (2 in this example) to get
different scalings.

-- Dave Tweed

$sequence = 'ACGACGTCCAGGGGGGTTGTTACGTCCCCAATCAGTCGGGGCTATTCAGTC';

# split the sequence into motifs and non-motifs
# all of the odd positions in this array will be motifs
@m = split /(C{4,}|G{4,})/, $sequence;

while (@m) {
    my $non = shift @m;
    my $motif = shift @m;
    # show the relative length of the non-motif using a logarithmic scale
    print '-' x int((log length $non)/log 2);
    # show the motif as a single character
    print chop $motif if defined $motif;
}
print "\n";
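
One caveat: if the sequence begins with a motif, or two motifs sit next
to each other, split returns an empty non-motif and log(0) is a fatal
error in Perl.  A guarded variant might look like the sketch below;
scale_sequence and its $base parameter are names invented for this
example.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the approach above: returns the scaled
# string instead of printing it, and skips empty gaps.
sub scale_sequence {
    my ($sequence, $base) = @_;
    my @pieces = split /(C{4,}|G{4,})/, $sequence;
    my $out = '';
    while (@pieces) {
        my $gap   = shift @pieces;
        my $motif = shift @pieces;
        my $len   = length $gap;
        # log(0) would die, so only scale gaps that contain something.
        $out .= '-' x int( log($len) / log($base) ) if $len > 0;
        # Show the motif as its first character.
        $out .= substr($motif, 0, 1) if defined $motif;
    }
    return $out;
}

# Leading motif, a gap of five, then a second motif.
print scale_sequence('GGGGAAAAACCCC', 2), "\n";    # prints G--C
```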


------------------------------

Date: Thu, 18 Oct 2001 06:01:41 GMT
From: tiltonj@erols.com (Jay Tilton)
Subject: Re: Scaling a DNA string
Message-Id: <3bce660d.407138606@news.erols.com>

On Wed, 17 Oct 2001 22:52:57 -0400, "DocDodge" <DocDodge@hotmail.com> wrote:

>a motif composed of four G's or
>four C's in a row might have implications for how some genes are regulated.
>
>So, I grabbed 2000 characters from nearby an important gene and stored it in
>a string.  I replaced all the unimportant characters with at dash so we can see
>where these motifs are:
>
>...----------GGGGGG---------CCCC--------GGGG----------...
>
>What I need to do is scale the
>string down to a reasonable size.  What I've tried to do is use a for loop
>with and index of 10 to replace all the instances or 10 dashes with a single
>dash.  If a G or C is found in the 10 character region, a single G or C is
>printed.  
>
>The last GGGG gets counted twice because it is crosses over
>two different 10 character long chunks.
>And, if I scale the string by looking for GGGG or CCCC in any 10 character
>long chunk, I will underestimated the number of motifs 
>
>What I really want to scale the string to is this ...-GC-G--... ( or this
>...-GC--G-..., small position shifts are inconsequential as long as the
>number of motifs is accurate).

Summarizing the apparent key points,
1. Sequences of four G's or four C's in the string are important.
2. Condense sequences of -'s in the string.
3. Chewing on the string in 10-character chunks is not working well because
a single motif may cross the boundary between chunks.

How about crunching four or more G's or C's down to one,
  $sequence =~ s/([GC])\1{3,}/$1/g;
then crunching multiple -'s down to one, so the motifs are visually
separate.
  $sequence =~ s/(-)+/$1/g;
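
Applied to the dashed example from the original post, the two
substitutions can be tried out like this (a minimal sketch; the sample
string is the one quoted above):

```perl
use strict;
use warnings;

my $sequence = '----------GGGGGG---------CCCC--------GGGG----------';

# Four or more identical G's or C's collapse to a single character.
$sequence =~ s/([GC])\1{3,}/$1/g;

# Runs of dashes collapse to a single dash.
$sequence =~ s/(-)+/$1/g;

print "$sequence\n";    # prints -G-C-G-
```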

What should happen where there are three or fewer C's or G's in a row?  Keep
them, discard them, or some other notation indicating "something there, but
not what I'm really looking for"?


------------------------------

Date: Thu, 18 Oct 2001 06:52:32 GMT
From: mgjv@tradingpost.com.au (Martien Verbruggen)
Subject: Re: Scaling a DNA string
Message-Id: <slrn9ssv1e.e62.mgjv@verbruggen.comdyn.com.au>

On Wed, 17 Oct 2001 22:52:57 -0400,
    DocDodge <DocDodge@hotmail.com> wrote:
> Hi,
> 
> I have what is a tricky biology problem, but an easy perl problem.
> Unfortunately, while I'm competent biologist, I'm a complete newbie at perl.

Does that mean that these problems are approximately equally hard to
you? :)

> As you may know, the DNA which codes for all the instructions of life is
> simply a string composed of the four characters A, C, G, or T.  We have
> experimental evidence that suggests that a motif composed of four G's or
> four C's in a row might have implications for how some genes are regulated.
> But, it would matter how close these motifs are to the genes they are
> suppose to control.

Ok, so far I'm still here... :)

> So, I grabbed 2000 characters from nearby an important gene and stored it in
> a string.  Here is what part of that might look like:
> 
> ...ACGACGTCCAGGGGGGTTGTTACGTCCCCAATCAGTCGGGGCTATTCAGTC...
> 
> Next I replaced all the unimportant characters with at dash so we can see
> where these motifs are:
> 
> ...----------GGGGGG---------CCCC--------GGGG----------...
> 
> But as you might imagine, I'm having trouble printing out these 2000
> character long strings in a readable format.  What I need to do is scale the
> string down to a reasonable size.  What I've tried to do is use a for loop
> with and index of 10 to replace all the instances or 10 dashes with a single
> dash.  If a G or C is found in the 10 character region, a single G or C is
> printed.  Here is how the scaled version looks:

But what happens if both a G and a C are found in this 10 character
sequence? And what happens if more than one sequence falls in one
slice?

"GGGG--CCCC" => G or C or something else?

"GGGG-GGGG-" => Only one G? or something else?

> ...-GC-GG-...
> 
> The problem is that someone looking at this scaled version would think that
> this string of DNA characters has four motifs in this short region.  But it
> only has three.  The last GGGG gets counted twice because it is crosses over
> two different 10 character long chunks.

And indeed, what do you do with motifs that straddle the boundary? Or
worse, what if you have a string like this:

0123456789012345678901234567890123456789
---CCCC-GGGGG--CCCCCCC--GGGGGGGG--CCCC--

In the above, every 10-character sequence has at least one G and at
least one C. 3 out of 5 motifs straddle a boundary. There are 5
motifs, but only 4 10-character sequences.  How do you want that
expressed?

What about coming up with a totally different encoding scheme? One
that simply tells you how many repeats it's seen? Your string would be
encoded as

-10G6-9C4-8G4-10

or maybe

-10 G6 -9 C4 -8 G4 -10

Mine would be:

-3 C4 -1 G5 -2 C7 -2 G8 -2 C4 -2

The more repetition there is, the more compression you get. You could
even consider not putting the minuses in.

3 C4 1 G5 2 C7 2 G8 2 C4 2

All this is assuming that the length of these things, and the gap
between them, is somehow important. If it isn't, you probably would
have just compressed multiple characters into one (with tr///s, for
example)

I'm not sure whether that's readable enough, but at least it doesn't
suffer from the problems that you've outlined, and the extra problems
that I noted.

Assuming that you already have the minuses in the string:

$_ = "----------GGGGGG---------CCCC--------GGGG----------";
s/(([-CG])\2*)/ ($2 eq "-" ? "" : $2) . length($1) . " " /ge;

Of course, there are other ways to do this, even when starting with
the raw string, instead of the one with minuses.

Something like the following might be a good start:

while (m/(.*?)(([CG])\3{3,})/g)
{
    print length($1), " ", length($2), $3, " ";
}
print "\n";

This misses out on the last sequence, if it isn't a motif. I have to
go home now, so I don't have time to think of a way to fix that
hurdle, without resorting to $' (if that works at all).  :)

Maybe pos() with substr() could be used, although I think pos() will
actually be undefined...
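
One possible way over that hurdle, sketched here as an assumption rather
than a tested recommendation: remember pos() inside the loop while it is
still defined, then take the length of whatever follows the last motif
once the loop has finished.  This version writes the motif as letter
then count, as in the encoding proposed earlier.

```perl
use strict;
use warnings;

my $dna = 'ACGACGTCCAGGGGGGTTGTTACGTCCCCAATCAGTCGGGGCTATTCAGTC';

my @parts;
my $last = 0;    # offset just past the most recent motif

while ( $dna =~ m/(.*?)(([CG])\3{3,})/g ) {
    # Gap length, then the motif as letter plus repeat count.
    push @parts, length($1), $3 . length($2);
    # pos() is still defined here; it is reset once the match fails.
    $last = pos($dna);
}

# Whatever is left after the last motif is the trailing gap.
push @parts, length($dna) - $last;

print "@parts\n";    # prints 10 G6 9 C4 8 G4 10
```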

Martien
-- 
Martien Verbruggen              | 
                                | That's not a lie, it's a
Trading Post Australia Pty Ltd  | terminological inexactitude.
                                | 


------------------------------

Date: Thu, 18 Oct 2001 00:35:15 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Specifying a range of values
Message-Id: <3BCE5C03.B562E42C@earthlink.net>

Gary wrote:
> 
> Probably a simple question, but I was wondering how I can specify a
> range of vaules to use in an if statement.
> 
> For example, I want to specify for an action to be taken if a user
> scores between X and Y.   How do I code that?  Thanks.

use Quantum::Superpositions;
if( $score == any($X..$Y) ) {
    ....
}
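
Without any modules, two ordinary numeric comparisons do the same job
and also cope with non-integer scores, which a list built with .. does
not contain.  A minimal sketch; in_range and the bounds 50 and 100 are
made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: true when $low <= $score <= $high.
sub in_range {
    my ($score, $low, $high) = @_;
    return ( $score >= $low && $score <= $high ) ? 1 : 0;
}

for my $score ( 49, 50, 75.5, 100, 101 ) {
    printf "%5s: %s\n", $score,
        in_range($score, 50, 100) ? 'in range' : 'out of range';
}
```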

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 05:35:04 GMT
From: Michael Budash <mbudash@sonic.net>
Subject: Re: Splitting on value pairs
Message-Id: <mbudash-576DAB.22350517102001@news.sonic.net>

In article <_Doz7.11$5p2.237596@news.interact.net.au>, "Tintin" 
<tintin@snowy.calculus> wrote:

> I'm trying to parse a WELF (Webtrends Extended Log Format) file.
> 
> Each record is on a single line terminated by CRLF, and each field is a
> simple id/value pair.  Each field is separated by whitespace, but if 
> there
> is whitespace in the value, it is enclosed in quotes.  For example:
> 
> id=firewall time="2001-10-14 12:01:05" fw=199.9.9.9 src=199.9.9.9
> 
> Obviously, I could do a simple split (if all values had no whitespace) to
> get each value pair, but how do I cater for the quoted values?
> 
> I tried
> 
> my @records = split(/[\w"] [a-z]/);
> 
> which comes very close to what I need, except that it slurps the first
> and last character of each field pair.
> 
> Of course, I'm always open to other ways to achieve the result.
> Ultimately, I want to have all the fields in a hash.
> 
> 

here's an alternative thought path... why not start with the '='s?

$_ = q|id=firewall fw=199.9.9.9 src=199.9.9.9 time="2001-10-14 12:01:05"|;
my (@h, %h);
my @a = split /\s*=\s*/;
for my $i (0..$#a) {
    $a[$i] =~ s/^"(.+)"$/$1/;
    if ($i == 0 || $i == $#a) {
        push @h, $a[$i];
    } else {
        $a[$i] =~ /^(.+)\s+(\S+)$/;
        push @h, $1, $2;
    }
}
%h = @h;
while (my ($k, $v) = each %h) {
    print "$k=>$v\n";
}

this produces:

src=>199.9.9.9
fw=>199.9.9.9
id=>firewall
time=>2001-10-14 12:01:05

hth-
-- 
Michael Budash ~~~~~~~~~~ mbudash@sonic.net


------------------------------

Date: Thu, 18 Oct 2001 16:28:45 +1000
From: "Tintin" <tintin@snowy.calculus>
Subject: Re: Splitting on value pairs
Message-Id: <IFuz7.22$wo2.221240@news.interact.net.au>


"Michael Budash" <mbudash@sonic.net> wrote in message
news:mbudash-576DAB.22350517102001@news.sonic.net...
> here's an alternative thought path... why not start with the '='s?
>
> $_ = q|id=firewall fw=199.9.9.9 src=199.9.9.9 time="2001-10-14 12:01:05"|;
> my (@h, %h);
> my @a = split /\s*=\s*/;
> for my $i (0..$#a) {
>     $a[$i] =~ s/^"(.+)"$/$1/;
>     if ($i == 0 || $i == $#a) {
>         push @h, $a[$i];
>     } else {
>         $a[$i] =~ /^(.+)\s+(\S+)$/;
>         push @h, $1, $2;
>     }
> }
> %h = @h;
> while (my ($k, $v) = each %h) {
>     print "$k=>$v\n";
> }
>
> this produces:
>
> src=>199.9.9.9
> fw=>199.9.9.9
> id=>firewall
> time=>2001-10-14 12:01:05

True, that does work, although I ended up with a pretty simple solution, i.e.:

while (<>) {
    my %record;
    foreach (split(/ (?=[a-z]+=)/)) {
        my ($field,$value) = split(/=/, $_, 2);  # limit 2 keeps any '=' in the value
        $record{$field}=$value;
    }
}
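
Another way to approach the quoted values, sketched here under a few assumptions: field names consist of word characters, quoted values contain no embedded double quotes, and the line terminator may be CRLF or LF. Instead of splitting, it matches each name=value pair directly, so the enclosing quotes are dropped as part of the match. The sub name parse_welf is made up for illustration.

```perl
use strict;
use warnings;

# Parse one WELF line into a hash reference of field => value.
sub parse_welf {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # strip a trailing CRLF or LF
    my %record;
    # Each field is name="quoted value" or name=bareword.
    while ($line =~ /(\w+)=(?:"([^"]*)"|(\S*))/g) {
        $record{$1} = defined $2 ? $2 : $3;
    }
    return \%record;
}

my $rec = parse_welf(
    qq{id=firewall time="2001-10-14 12:01:05" fw=199.9.9.9 src=199.9.9.9\r\n}
);
print "$_=>$rec->{$_}\n" for sort keys %$rec;
```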




------------------------------

Date: Thu, 18 Oct 2001 00:10:01 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: String To List
Message-Id: <3BCE5619.4CAB0825@earthlink.net>

Jeff Zucker wrote:
[snip]
> > # I'm not sure what DBI::AnyData uses for LIKE...  If the author of
> > # it was smart, he'd have used File::Glob, in which case it's * and
> > # ? so you would get rid of the above line.
> 
> DBD::AnyData's LIKE depends on SQL::Statement which uses standard
> percent signs.
> 
> But as the author of both AnyData and the upcoming version of
> SQL::Statement (and you're right about me not being smart :-) I would
> be interested to know what you mean about File::Glob.  Please email me
> off list if it's not relevant to the public.

It's just a matter of code reuse, that's all.

Obviously File::Glob has to have some function which tests if a filename
is "like" a pattern, and if there's an external interface [which there
unfortunately doesn't seem to be... I only checked just now, not when I
sent that earlier post], then you would be able to use *that* for your
"like".

Having discovered that there isn't such a thing, I suppose that for
doing "like" comparisons, the simplest way of doing it would be:

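my $like = "%foo_bar";    # SQL LIKE wildcards are % and _
# Single quotes here: in double quotes "\z" would be just "z".
my ($anchorstart, $anchorend) = ('^', '\z');
my $len = length $like;
$like =~ s [(\\.)|([_%])|(\W)] {
    if( defined $1 ) {
        $1                      # escaped character: pass it through
    } elsif( defined $3 ) {
        quotemeta $3            # any other regex metacharacter: quote it
    } elsif( $2 eq "_" ) {
        "."
    } elsif( $-[0] == 0 ) {
        $anchorstart = ""       # leading %: match may start anywhere
    } elsif( $+[0] == $len ) {
        $anchorend  =  ""       # trailing %: match may end anywhere
    } else {
        ".*"
    }
}sge;
$like = qr/$anchorstart$like$anchorend/s;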

I'm posting this to usenet since very similar code could be used for
glob-like comparisons.
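
A rough sketch of what that glob-like variant might look like, assuming only * and ? are wildcards and a backslash escapes the next character. Character classes and brace expansion are deliberately left out, and the sub name glob_to_regex is made up for illustration; none of this comes from File::Glob.

```perl
use strict;
use warnings;

# Turn a simple glob (only * and ? are special, \ escapes) into a regex.
sub glob_to_regex {
    my ($glob) = @_;
    my $re = '';
    # Walk the pattern one token at a time: an escaped character,
    # a wildcard, or any other single character.
    while ($glob =~ /\G(?:\\(.)|([*?])|(.))/gcs) {
        if    (defined $1) { $re .= quotemeta $1 }
        elsif (defined $2) { $re .= $2 eq '*' ? '.*' : '.' }
        else               { $re .= quotemeta $3 }
    }
    return qr/^$re\z/s;
}

my $re = glob_to_regex('*.txt');
for my $name (qw(notes.txt notes.text fooxtxt)) {
    print "$name: ", ($name =~ $re ? "match" : "no match"), "\n";
}
```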

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: Thu, 18 Oct 2001 06:50:32 +0200
From: Tassilo von Parseval <Tassilo.Parseval@post.rwth-aachen.de>
Subject: Re: Troubleshootng Bundle::libnet installation
Message-Id: <3BCE5F98.4010804@post.rwth-aachen.de>

bill wrote:

> I tried to install Bundle::libnet (following CPAN.pm's
> recommendation), but the installation fails for reasons I don't
> understand.  I include CPAN.pm's entire output below, but the crucial
> portion, as far as I can tell, is this:
> 
>     Running make test
>     PERL_DL_NONLAZY=1 /usr/local/bin/perl -Iblib/arch -Iblib/lib -I/usr/local/lib/perl5/5.6.1/i586-linux -I/usr/local/lib/perl5/5.6.1 -e 'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;' t/*.t
>     t/ftp...............skipped test on this platform
>     t/hostname..........Use of uninitialized value in pattern match (m//) at blib/lib/Net/Domain.pm line 226.
>     Use of uninitialized value in split at blib/lib/Net/Domain.pm line 233.
>     ok
>     t/nntp..............skipped test on this platform
>     t/require...........FAILED tests 8-9
> 	    Failed 2/11 tests, 81.82% okay


That's an already known shortcoming of the test-suite that comes with 
libnet. require.t eval()s several require()-statements, amongst those 
testing for Net::SNPP and Net::PH which are not part of libnet. Just 
edit t/require.t and remove the two lines that test for those modules or 
install them before you 'make test' libnet.
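
To see beforehand which of the two modules are present, something along these lines might do. It is only a sketch that mirrors the eval/require idea described above, not the actual contents of t/require.t, and the sub name module_available is made up for illustration.

```perl
use strict;
use warnings;

# Returns 1 if the named module can be loaded, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Only accept plain package names, since the name goes into a string eval.
    return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
    return eval "require $module; 1" ? 1 : 0;
}

for my $module (qw(Net::SNPP Net::PH)) {
    printf "%-10s %s\n", $module,
        module_available($module) ? "installed" : "missing";
}
```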

[...]

Tassilo


-- 
$a=[(74,116)];$b=[($a->[1]-1,$a->[1]++,0x20)];$c=[(97,110)];$d=[($c->
[1]+1,$b->[1],"her")];for(@{[$a,$b,$c,$d]}){for(@{$_}){$_=~/\d+/?print
(chr($_)):print;}}$c=sub{$l=shift;[(0x20+$l-1,0x50,0x65,0x73-0x01,108
),(0x20,0x68,0x61,)]};print(map{chr($_)}@{($c->(1))});$h={a=>33*3,b=>
10**2+7,c=>"1"."0"."1",d=>0162};@h=sort(keys(%$h));for(@h){print(chr(
ord(chr($h->{$_}))))};



------------------------------

Date: Thu, 18 Oct 2001 02:40:07 -0400
From: Benjamin Goldberg <goldbb2@earthlink.net>
Subject: Re: Writing and reading encrypted string (password)
Message-Id: <3BCE7947.528EB230@earthlink.net>

Lars Oeschey wrote:
> 
> On Tue, 16 Oct 2001 09:46:16 -0500, trammell@haqq.hypersloth.invalid
> (John J. Trammell) wrote:
> 
> >You could obfuscate it with MIME::Base64.
> 
> hm, that could work indeed. Though I want to stay away from modules
> not included with ActiveState Perl a bit, since I don't know how many
> machines the program will run on later, and the module would then have
> to be installed on every one of them (on NT with ppm and proxy
> settings that is a bit of work)

Then how about you obfuscate it with pack/unpack "u" ?
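
A minimal sketch of that idea, needing no modules at all. The "u" template uuencodes the string, which only hides it from a casual glance and is not encryption. The sub names obfuscate and deobfuscate and the sample password are made up for illustration.

```perl
use strict;
use warnings;

# pack "u" uuencodes its argument; unpack "u" reverses it.
sub obfuscate   { return pack   "u", $_[0] }
sub deobfuscate { return unpack "u", $_[0] }

my $hidden = obfuscate("s3cret");   # ends in a newline, so it can be
print $hidden;                      # written to a file as-is
my $plain = deobfuscate($hidden);
print "$plain\n";
```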

-- 
"What does stupid old man mean pidgin talk?
Shampoo does not talk like a bird."


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 1953
***************************************
