Perl-Users Digest, Issue: 744 Volume: 11
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sat Aug 11 03:09:40 2007
Date: Sat, 11 Aug 2007 00:09:06 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sat, 11 Aug 2007 Volume: 11 Number: 744
Today's topics:
[OT] Dealing with spam (was: Perl On Apache) <noreply@gunnar.cc>
Re: File::Find problem? <tadmc@seesig.invalid>
Re: File::Find problem? <rkb@i.frys.com>
Re: File::Find problem? <rkb@i.frys.com>
Re: File::Find problem? <spamtrap@dot-app.org>
Re: File::Find problem? <spamtrap@dot-app.org>
Re: File::Find problem? <uri@stemsystems.com>
Re: File::Find problem? <dummy@example.com>
Re: how to transpose a huge text file xhoster@gmail.com
Re: how to transpose a huge text file <nospam-abuse@ilyaz.org>
Re: match and group across 2 lines <tadmc@seesig.invalid>
Re: Need help writing a basic script print to text file <stoupa@practisoft.cz>
new CPAN modules on Sat Aug 11 2007 (Randal Schwartz)
Re: Out of memory in vec <paduille.4061.mumia.w+nospam@earthlink.net>
Re: Out of memory in vec <paduille.4061.mumia.w+nospam@earthlink.net>
Re: Perl On Apache <stoupa@practisoft.cz>
Perl script to track UPS packages by tracking number. <ignoramus22443@NOSPAM.22443.invalid>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sat, 11 Aug 2007 05:55:17 +0200
From: Gunnar Hjalmarsson <noreply@gunnar.cc>
Subject: [OT] Dealing with spam (was: Perl On Apache)
Message-Id: <5i4qfuF3l1b0nU1@mid.individual.net>
Petr Vileta wrote:
> (My server rejects all messages from Yahoo and Hotmail.
Funny; I get almost no spam from Yahoo or Hotmail servers.
Or are you saying that you reject messages based on the faked From
addresses??
--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
------------------------------
Date: Fri, 10 Aug 2007 19:45:24 -0500
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: File::Find problem?
Message-Id: <slrnfbq1l4.it6.tadmc@tadmc30.sbcglobal.net>
Ron Bergin <rkb@i.frys.com> wrote:
> Try changing the open call to this:
>
> open(HDR_FILE, '<', "$_") || die "Can't open file or input: $!";
perldoc -q vars
What's wrong with always quoting "$vars"?
then try instead:
open(HDR_FILE, '<', $_) || die "Can't open file or input: $!";
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Fri, 10 Aug 2007 19:24:36 -0700
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File::Find problem?
Message-Id: <1186799076.017416.119690@q3g2000prf.googlegroups.com>
On Aug 10, 5:45 pm, Tad McClellan <ta...@seesig.invalid> wrote:
> Ron Bergin <r...@i.frys.com> wrote:
> > Try changing the open call to this:
>
> > open(HDR_FILE, '<', "$_") || die "Can't open file or input: $!";
>
> perldoc -q vars
>
> What's wrong with always quoting "$vars"?
>
> then try instead:
>
> open(HDR_FILE, '<', $_) || die "Can't open file or input: $!";
>
> --
> Tad McClellan
> email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
Tad,
I'm aware of the reasons for not always quoting "$vars", and I rarely
do quote them. However, I did in this case to safeguard against the
possibility of spaces in the filename, in which case the open
call would fail if it was not quoted. I know that spaces in filenames
aren't really common on *nix systems, but on occasion it does happen, and
on Windows systems it's very common.
I probably should have been clearer about my reasoning when I
posted that suggestion. Additionally, we both left out one important
piece: we should have included $_ or $File::Find::name in the die
statement.
------------------------------
Date: Fri, 10 Aug 2007 19:37:47 -0700
From: Ron Bergin <rkb@i.frys.com>
Subject: Re: File::Find problem?
Message-Id: <1186799867.631524.149740@q4g2000prc.googlegroups.com>
On Aug 10, 7:24 pm, Ron Bergin <r...@i.frys.com> wrote:
> On Aug 10, 5:45 pm, Tad McClellan <ta...@seesig.invalid> wrote:
>
>
>
> > Ron Bergin <r...@i.frys.com> wrote:
> > > Try changing the open call to this:
>
> > > open(HDR_FILE, '<', "$_") || die "Can't open file or input: $!";
>
> > perldoc -q vars
>
> > What's wrong with always quoting "$vars"?
>
> > then try instead:
>
> > open(HDR_FILE, '<', $_) || die "Can't open file or input: $!";
>
> > --
> > Tad McClellan
> > email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
>
> Tad,
>
> I'm aware of the reasons for not always quoting "$vars" and I rarely
> do quote them, however, I did in this case to safeguard against the
> possibility of having spaces in the filename, in which case the open
> call would fail if it was not quoted.
Hmmm. It looks like I need to correct myself. I just ran a test on my
*nix box without quoting the var, and it did open a file with spaces in
the name. I know that I've had problems in the past with spaces in
the names, but that may have been on an older version of Perl on Windows.
------------------------------
Date: Fri, 10 Aug 2007 23:12:41 -0400
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: File::Find problem?
Message-Id: <m2odhebw7q.fsf@dot-app.org>
Ron Bergin <rkb@i.frys.com> writes:
> I'm aware of the reasons for not always quoting "$vars" and I rarely
> do quote them, however, I did in this case to safeguard against the
> possibility of having spaces in the filename, in which case the open
> call would fail if it was not quoted.
Where on Earth did you get that idea???
Filenames with spaces need to be quoted when used in a shell command,
because the command interpreter would otherwise interpret the spaces as
delimiters between arguments. Perl's open() is not a shell command.
sherm--
--
Web Hosting by West Virginians, for West Virginians: http://wv-www.net
Cocoa programming in Perl: http://camelbones.sourceforge.net
------------------------------
Date: Fri, 10 Aug 2007 23:17:54 -0400
From: Sherm Pendley <spamtrap@dot-app.org>
Subject: Re: File::Find problem?
Message-Id: <m2k5s2bvz1.fsf@dot-app.org>
Ron Bergin <rkb@i.frys.com> writes:
> Hmmm It looks like I need to correct myself. I just ran a test on my
> *nix box without quoting the var and it did open a file with spaces in
> the name. I know that I've had problems in the past with spaces in
> the names, but that may have been on older version of Perl on Windows.
The Perl version is irrelevant; spaces are treated no differently in Perl
now than they were ten years ago.
You may have been passing a string to a command interpreter with system; in
that case you would have needed to insert the proper quoting in the command.
system 'copy c:\foo "c:\foo bar baz"';
But that's got nothing to do with Perl; you would need the quotes if you
were running the command from Ruby, C, or typing it by hand.
sherm--
--
Web Hosting by West Virginians, for West Virginians: http://wv-www.net
Cocoa programming in Perl: http://camelbones.sourceforge.net
------------------------------
Date: Sat, 11 Aug 2007 03:22:59 GMT
From: Uri Guttman <uri@stemsystems.com>
Subject: Re: File::Find problem?
Message-Id: <x77io2daaz.fsf@mail.sysarch.com>
>>>>> "RB" == Ron Bergin <rkb@i.frys.com> writes:
RB> Hmmm It looks like I need to correct myself. I just ran a test on my
RB> *nix box without quoting the var and it did open a file with spaces in
RB> the name. I know that I've had problems in the past with spaces in
RB> the names, but that may have been on older version of Perl on Windows.
spaces in file names are allowed under almost all filesystems, and most
any programming language's api for open can handle spaces just fine. you
are probably thinking about quoting those names when used in the shell.
when you shell out, the quotes belong at the shell level but have to be
written inside a string in the perl code, and that can be confusing.
uri
--
Uri Guttman ------ uri@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org
------------------------------
Date: Sat, 11 Aug 2007 05:19:12 GMT
From: "John W. Krahn" <dummy@example.com>
Subject: Re: File::Find problem?
Message-Id: <kFbvi.73340$Io4.20087@edtnps89>
Monty wrote:
> The O/S portion of the error is "No such file or directory".
>
> The file exists where $File::Find::name says it is. I wonder if the
> problem is in the relative pathing?
That is exactly the problem. $File::Find::name contains the complete path of
the file. The problem is that File::Find::find defaults to changing
directories so that the file in question is always in the current directory.
So if $File::Find::name contains './one/two/three/file.txt' you are trying to
open a file in './one/two/three/./one/two/three/file.txt'. You need to either
use the "no_chdir" option or use just the file name in $_.
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
------------------------------
Date: 11 Aug 2007 01:17:08 GMT
From: xhoster@gmail.com
Subject: Re: how to transpose a huge text file
Message-Id: <20070810211712.245$QE@newsreader.com>
Ted Zlatanov <tzz@lifelogs.com> wrote:
> On Fri, 10 Aug 2007 22:28:34 +0000 (UTC) Ilya Zakharevich
> <nospam-abuse@ilyaz.org> wrote:
>
> IZ> [A complimentary Cc of this posting was sent to
> IZ> Ted Zlatanov
> IZ> <tzz@lifelogs.com>], who wrote in article
> <m2absz82te.fsf@lifelogs.com>:
> >> >> Now you can write each inverted output line by looking in
> >> >> break.txt, reading every line, chomp() it, and append it to your
> >> >> current output line if it's divisible by 1000 (so 0, 1000, 2000,
> >> >> etc. will match). Write "\n" to end the current output line.
> >>
> IZ> Good. So what you suggest, is 1000 passes over a 4GB file. Good
> IZ> luck!
> >>
> >> I suggested a database, actually.
>
> IZ> And why do you think this would decrease the load on head seeks?
> IZ> Either the data fits in memory (then database is not needed), or it
> IZ> is read from disk (which would, IMO, imply the same amount of seeks
> IZ> with database as with any other file-based operation).
>
> Look, databases are optimized to store large amounts of data
> efficiently.
For some not very general meanings of "efficiently", sure. They generally
expand the data quite a bit upon storage; they aren't very good at straight
retrieval unless you have just the right index structures in place and your
queries have a high selectivity; most of them put a huge amount of effort
into transactionality and concurrency, which may not be needed here but
imposes a high overhead whether you use it or not. One of the major
gene-chip companies was very proud that in one of their upgrades, they
started using a database instead of plain files for storing the data. And
then their customers were very pleased when in a following upgrade they
abandoned that, and went back to using plain files for the bulk data and
using the database just for the small DoE metadata.
> You can always create a hand-tuned program that will do
> one task (e.g. transposing a huge text file) well, but you're missing
> the big picture: future uses of the data. I really doubt the only thing
> anyone will ever want with that data is to transpose it.
And I really doubt that any single database design is going to support
everything that anyone may ever want to do with the data, either.
>
> IZ> One needs not a database, but a program with built-in caching
> IZ> optimized for non-random access to 2-dimensional arrays. AFAIK,
> IZ> imagemagick is mostly memory-based. On the other side of the spectrum,
> IZ> GIMP is based on tile-caching algorithms; if there were a way to
> IZ> easily hook into this algorithm (with no screen display involved),
> IZ> one could handle much larger datasets.
>
> You and everyone else are overcomplicating this.
>
> Rewrite the original input file for fixed-length records.
Actually, that is just what I initially recommended.
> Then you just
> need to seek to a particular offset to read a record, and the problem
> becomes transposing a matrix piece by piece. This is fairly simple.
I think you are missing the big picture. Once you make a seekable file
format, that probably does away with the need to transpose the data in the
first place; whatever operation you wanted to do with the transposition can
probably be done on the seekable file instead.
Xho
--
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service $9.95/Month 30GB
------------------------------
Date: Sat, 11 Aug 2007 03:51:27 +0000 (UTC)
From: Ilya Zakharevich <nospam-abuse@ilyaz.org>
Subject: Re: how to transpose a huge text file
Message-Id: <f9jbnv$p7n$1@agate.berkeley.edu>
[A complimentary Cc of this posting was sent to
Ted Zlatanov
<tzz@lifelogs.com>], who wrote in article <m2tzr67vzu.fsf@lifelogs.com>:
> On Fri, 10 Aug 2007 22:28:34 +0000 (UTC) Ilya Zakharevich <nospam-abuse@ilyaz.org> wrote:
> IZ> And why do you think this would decrease the load on head seeks?
> IZ> Either the data fits in memory (then database is not needed), or it is
> IZ> read from disk (which would, IMO, imply the same amount of seeks with
> IZ> database as with any other file-based operation).
> Look, databases are optimized to store large amounts of data
> efficiently.
Words words words. You can't do *all the things* efficiently.
Databases are optimized for some particular access patterns. I doubt
that even "good databases" are optimized for *this particular* access
pattern. And, AFAIK, MySQL is famous for its lousiness...
> You can always create a hand-tuned program that will do
> one task (e.g. transposing a huge text file) well, but you're missing
> the big picture: future uses of the data. I really doubt the only thing
> anyone will ever want with that data is to transpose it.
If the transposed form is well tuned for further manipulation (of
which we were not informed), then databasing looks like overkill.
If not, then indeed.
> You and everyone else are overcomplicating this.
>
> Rewrite the original input file for fixed-length records. Then you just
> need to seek to a particular offset to read a record, and the problem
> becomes transposing a matrix piece by piece. This is fairly simple.
Sure. Do 1e9 seeks in your spare time...
> IZ> Yet another way might be compression; suppose that there are only
> IZ> (e.g.) 130 "types" of entries; then one can compress the matrix into
> IZ> 1GB of data, which should be handled easily by almost any computer.
>
> You need 5 bits per item: it has 16 possible values ([ACTG]{2}), plus
> "--".
> A database table, to come back to my point, would store these items as
> enums. Then you, the user, don't have to worry about the bits per item
> in the storage, and you can just use the database.
Of course one does not care about anything - IF the solution using the
database is going to give an answer during the following month. Which
I doubt...
Hope this helps,
Ilya
------------------------------
Date: Fri, 10 Aug 2007 19:38:51 -0500
From: Tad McClellan <tadmc@seesig.invalid>
Subject: Re: match and group across 2 lines
Message-Id: <slrnfbq18r.it6.tadmc@tadmc30.sbcglobal.net>
ktl <ktlind@gmail.com> wrote:
> On Aug 10, 8:22 am, "John W. Krahn" <du...@example.com> wrote:
>> ktl...@gmail.com wrote:
>> > I would like to group six numbers separated by commas into $1 thru $6.
>> > The problem is that four of the numbers are on the first line and two
>> > of the numbers are on the second line. Here is an example of those 2
>> > lines:
>>
>> > SLN499 = LINE/994.455930,-49.320125,347.561019,994.456333 $
>> > ,-49.320486,347.560579
>>
>> > The $ on the end of the first line is a continuation symbol for the
>> > next line.
>>
>> > I can group the first four numbers with:
>>
>> > ^SLN\d\d\d = LINE\/(.*),(.*),(.*),(.*) \$$
>>
>> > How can I reach down one more line and get the other two numbers?
>>
>> $ echo "SLN499 = LINE/994.455930,-49.320125,347.561019,994.456333 $
>> ,-49.320486,347.560579
>>
>> " | \
>> perl -lne'
>> if ( s/\$$// ) {
>> $_ .= <>;
>> redo;
>> }
>> if ( s/^SLN\d\d\d = LINE\/// ) {
>> @x = split /\s*,\s*/;
>> print "@x";
>> }
>> '
>> 994.455930 -49.320125 347.561019 994.456333 -49.320486 347.560579
>>
>> John
>> --
>> Perl isn't a toolbox, but a small machine shop where you
>> can special-order certain sorts of tools at low cost and
>> in short order. -- Larry Wall
>
> Thanks for the quick response John.
> But what is this?
>
> " | \
The ending delimiter for the argument to echo, a pipe, and
a line continuation character.
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
------------------------------
Date: Sat, 11 Aug 2007 03:46:02 +0200
From: "Petr Vileta" <stoupa@practisoft.cz>
Subject: Re: Need help writing a basic script print to text file
Message-Id: <f9j90k$22ot$1@ns.felk.cvut.cz>
Bill H wrote:
> Heres a brute force example:
>
> $filename = time();
> $filename += ".txt";
Yes, really very brute :-)
$filename += ".txt";
should be
$filename .= ".txt";
or simply
$filename = time . '.txt';
--
Petr Vileta, Czech republic
(My server rejects all messages from Yahoo and Hotmail. Send me your mail
from another non-spammer site please.)
------------------------------
Date: Sat, 11 Aug 2007 04:42:13 GMT
From: merlyn@stonehenge.com (Randal Schwartz)
Subject: new CPAN modules on Sat Aug 11 2007
Message-Id: <JMLEED.wEI@zorch.sf-bay.org>
The following modules have recently been added to or updated in the
Comprehensive Perl Archive Network (CPAN). You can install them using the
instructions in the 'perlmodinstall' page included with your Perl
distribution.
ACME-ESP-1.001001
http://search.cpan.org/~tyemq/ACME-ESP-1.001001/
The power to implant and extract strings' thoughts.
----
ACME-ESP-1.001002
http://search.cpan.org/~tyemq/ACME-ESP-1.001002/
The power to implant and extract strings' thoughts.
----
Acme-ESP-1.001003
http://search.cpan.org/~tyemq/Acme-ESP-1.001003/
The power to implant and extract strings' thoughts.
----
Acme-Isnt-0.01
http://search.cpan.org/~apeiron/Acme-Isnt-0.01/
----
CPAN-Reporter-0.99_03
http://search.cpan.org/~dagolden/CPAN-Reporter-0.99_03/
Provides Test::Reporter support for CPAN.pm
----
Class-InsideOut-1.07
http://search.cpan.org/~dagolden/Class-InsideOut-1.07/
a safe, simple inside-out object construction kit
----
DBIx-FileSystem-1.3
http://search.cpan.org/~afrika/DBIx-FileSystem-1.3/
Manage tables like a filesystem
----
DashProfiler-1.05
http://search.cpan.org/~timb/DashProfiler-1.05/
collect call count and timing data aggregated by context
----
Date-Holidays-UK-EnglandAndWales-0.01
http://search.cpan.org/~lgoddard/Date-Holidays-UK-EnglandAndWales-0.01/
Public Holidays in England and Wales
----
Eludia-07.08.10
http://search.cpan.org/~dmow/Eludia-07.08.10/
----
Geo-ICAO-0.10
http://search.cpan.org/~jquelin/Geo-ICAO-0.10/
Airport and ICAO codes lookup
----
Geo-ICAO-0.11
http://search.cpan.org/~jquelin/Geo-ICAO-0.11/
Airport and ICAO codes lookup
----
Geo-ICAO-0.12
http://search.cpan.org/~jquelin/Geo-ICAO-0.12/
Airport and ICAO codes lookup
----
Geo-ICAO-0.20
http://search.cpan.org/~jquelin/Geo-ICAO-0.20/
Airport and ICAO codes lookup
----
Getopt-Std-WithCheck-0.01
http://search.cpan.org/~tpaba/Getopt-Std-WithCheck-0.01/
Perl extension for process command line arguments with custom check on them
----
Getopt-Std-WithCheck-0.03
http://search.cpan.org/~tpaba/Getopt-Std-WithCheck-0.03/
Perl extension for process command line arguments with custom check on them
----
HTML-Merge-3.53
http://search.cpan.org/~razinf/HTML-Merge-3.53/
Embedded HTML/SQL/Perl system.
----
IO-Socket-SSL-1.08
http://search.cpan.org/~sullr/IO-Socket-SSL-1.08/
Nearly transparent SSL encapsulation for IO::Socket::INET.
----
Kx-0.01
http://search.cpan.org/~markpf/Kx-0.01/
Perl extension for Kdb+ http://kx.com
----
Mail-DKIM-0.28
http://search.cpan.org/~jaslong/Mail-DKIM-0.28/
Signs/verifies Internet mail with DKIM/DomainKey signatures
----
Math-BigApprox-0.001003
http://search.cpan.org/~tyemq/Math-BigApprox-0.001003/
Fast and small way to closely approximate very large values.
----
Math-BigApprox-0.001004
http://search.cpan.org/~tyemq/Math-BigApprox-0.001004/
Fast and small way to closely approximate very large values.
----
Math-BigApprox-0.001005
http://search.cpan.org/~tyemq/Math-BigApprox-0.001005/
Fast and small way to closely approximate very large values.
----
POE-Component-Server-NRPE-0.01
http://search.cpan.org/~bingos/POE-Component-Server-NRPE-0.01/
A POE Component implementation of NRPE Daemon.
----
POE-Component-Server-NRPE-0.02
http://search.cpan.org/~bingos/POE-Component-Server-NRPE-0.02/
A POE Component implementation of NRPE Daemon.
----
POE-Component-Server-NRPE-0.03
http://search.cpan.org/~bingos/POE-Component-Server-NRPE-0.03/
A POE Component implementation of NRPE Daemon.
----
Perl-Critic-More-0.15
http://search.cpan.org/~cdolan/Perl-Critic-More-0.15/
Supplemental policies for Perl::Critic
----
Slay-Makefile-Gress-0.02
http://search.cpan.org/~nodine/Slay-Makefile-Gress-0.02/
Use Slay::Makefile for software regression testing
----
Talk-NothingIsFaster510-0.01
http://search.cpan.org/~avar/Talk-NothingIsFaster510-0.01/
----
Time-Fuzzy-0.01
http://search.cpan.org/~jquelin/Time-Fuzzy-0.01/
Time read like a human, with some fuzziness
----
Time-Fuzzy-0.10
http://search.cpan.org/~jquelin/Time-Fuzzy-0.10/
Time read like a human, with some fuzziness
----
Time-Fuzzy-0.11
http://search.cpan.org/~jquelin/Time-Fuzzy-0.11/
Time read like a human, with some fuzziness
----
Time-Fuzzy-0.20
http://search.cpan.org/~jquelin/Time-Fuzzy-0.20/
Time read like a human, with some fuzziness
----
UWO-Student
http://search.cpan.org/~frequency/UWO-Student/
Provides Perl object representation of a University of Western Ontario student.
----
UWO-Student-0.01
http://search.cpan.org/~frequency/UWO-Student-0.01/
Provides Perl object representation of a University of Western Ontario student.
----
XML-Compile-0.51
http://search.cpan.org/~markov/XML-Compile-0.51/
Compilation based XML processing
----
ack-1.65_01
http://search.cpan.org/~petdance/ack-1.65_01/
grep-like text finder
If you're an author of one of these modules, please submit a detailed
announcement to comp.lang.perl.announce, and we'll pass it along.
This message was generated by a Perl program described in my Linux
Magazine column, which can be found on-line (along with more than
200 other freely available past column articles) at
http://www.stonehenge.com/merlyn/LinuxMag/col82.html
print "Just another Perl hacker," # the original
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
------------------------------
Date: Fri, 10 Aug 2007 22:47:09 -0500
From: "Mumia W." <paduille.4061.mumia.w+nospam@earthlink.net>
Subject: Re: Out of memory in vec
Message-Id: <13bqcdchv0phsce@corp.supernews.com>
On 08/10/2007 01:22 PM, Mumia W. wrote:
> On 08/10/2007 11:26 AM, Ted Zlatanov wrote:
>> On 10 Aug 2007 07:35:48 GMT anno4000@radom.zrz.tu-berlin.de wrote:
>>
>> a> Why four? You only need two bits to encode four bases.
>>
>> "--" was also allowed as data besides a pair of letters, so you have
>>
>> [ACGT][ACGT] = 4 bits (which is what Mumia means by "item", I think)
>> plus "--" as a value = 5 bits
>>
>> Ted
> [...]
>
> Now that Jie has given us some data, I can find a way to get [ACGT][ACGT]
> into a single four-bit group.
Duh. Sorry Anno. Sorry Ted. Of course I only need two bits for [ACGT].
I guess it's obvious that I haven't needed to count in binary for a while
:-(
But the "_" characters do throw a monkey wrench into my plans.
I wrote another version of the program that used bytes. It took over 80
minutes to complete and consumed 930MB of memory--mostly swap.
However, I'll still try to get a vec() version working because keeping
the data out of the swap space will probably speed things up by 100 times.
One half of 930MB is 465MB. If I exit Gnome and don't use my PC for
browsing the web while the program is running, I'll probably have enough
RAM for the program to run without using any swap. But I've got to get
vec() working.
PS.
Thanks again Anno. I changed the program to use two bits, and it worked.
------------------------------
Date: Sat, 11 Aug 2007 01:12:41 -0500
From: "Mumia W." <paduille.4061.mumia.w+nospam@earthlink.net>
Subject: Re: Out of memory in vec
Message-Id: <13bql0f10tp0lc7@corp.supernews.com>
On 08/10/2007 11:26 AM, Ted Zlatanov wrote:
>
> "--" was also allowed as data besides a pair of letters, so you have
>
> [ACGT][ACGT] = 4 bits (which is what Mumia means by "item", I think)
> plus "--" as a value = 5 bits
>
> Ted
Alright, I admit I cheated :-)
Five bits were needed, so I used five bits. No, I didn't hack the vec()
function in the Perl core. I just used two buffers:
#!/usr/local/bin/perl5.9.4
use strict;
use warnings;
use Fatal qw/open close/;
use Data::Dumper;

my $infile  = shift() || 'BIG.txt';
my $outfile = 'out';

# Pairs over A C G T - : 25 possible values, hence 5 bits per item.
my %bases = (
     0,'AA',  1,'AC',  2,'AG',  3,'AT',  4,'A-',  5,'CA',  6,'CC',
     7,'CG',  8,'CT',  9,'C-', 10,'GA', 11,'GC', 12,'GG', 13,'GT',
    14,'G-', 15,'TA', 16,'TC', 17,'TG', 18,'TT', 19,'T-',
    20,'-A', 21,'-C', 22,'-G', 23,'-T', 24,'--',
);
my %rbases = reverse %bases;

my $buffer1 = '';
my $buffer2 = '';
my $maxcols = 0;
my $maxrows = 0;
my $pos     = 0;
my (%inp, %out);

open $inp{hand}, '<', $infile;
open $out{hand}, '>', $outfile;

scalar readline $inp{hand};    # skip the header line
my $startpos = tell($inp{hand});

while (readline $inp{hand}) {
    my @f = split;
    shift @f;
    bpstore($_, $pos++) for @f;
    # print "@f\n";
    $maxcols = @f if @f > $maxcols;
    $maxrows++;
    # last if $inp{lines}++ > 4;
}

select $out{hand};
for my $col (0 .. $maxcols-1) {
    for my $row (0 .. $maxrows-1) {
        # print "($row, $col) ";
        my $npos = $col + ($row * $maxcols);
        my $bp = $bases{bpfetch($npos)};
        print "$bp ";
    }
    print "\n";
}

close $inp{hand};
close $out{hand};

system(ps => 'up', $$);
# system(cat => $outfile);

##############################
# Store one item: low four bits in $buffer1, the fifth bit in $buffer2.
sub bpstore {
    my ($bp, $pos) = @_;
    $bp = $rbases{$bp};
    vec($buffer1, $pos, 4) = 0xF & $bp;
    vec($buffer2, $pos, 1) = 1 & ($bp >> 4);
}

# Recombine the two buffers into the 5-bit code for position $pos.
sub bpfetch {
    my $pos = shift;
    my $bp = vec($buffer1, $pos, 4);
    $bp = $bp | (vec($buffer2, $pos, 1) << 4);
}
__END__
This works on Jie's submitted data, and it takes less than two minutes
to run and requires only 10MB of memory. I'm happy about that, but I
suspect it may encounter problems when dealing with over a gigabyte of
input data.
------------------------------
Date: Sat, 11 Aug 2007 03:51:07 +0200
From: "Petr Vileta" <stoupa@practisoft.cz>
Subject: Re: Perl On Apache
Message-Id: <f9j90l$22ot$2@ns.felk.cvut.cz>
Bill H wrote:
> On Aug 10, 2:13 pm, Louis <t051...@hotmail.com> wrote:
> Louis - you need to enable it. Look in your httpd.conf file, either
> in : /usr/local/etc/apache/httpd.conf or in /etc/apache/httpd.conf -
or in /etc/httpd/conf/httpd.conf
on Linux, of course. On Windows this could be somewhere under C:\Program
Files\Apache.
--
Petr Vileta, Czech republic
(My server rejects all messages from Yahoo and Hotmail. Send me your mail
from another non-spammer site please.)
------------------------------
Date: Fri, 10 Aug 2007 21:35:24 -0500
From: Ignoramus22443 <ignoramus22443@NOSPAM.22443.invalid>
Subject: Perl script to track UPS packages by tracking number.
Message-Id: <jvWdnT_1XJfxvSDbnZ2dneKdnZydnZ2d@giganews.com>
To interrupt our usual patterns of worthless spams
(alt.marketing.online.ebay) or making snide remarks
(comp.lang.perl.misc), here is a useful perl script for tracking UPS
packages.
To run it on the command line, you give it a list of UPS tracking numbers
(no spaces inside the numbers). You can append a comment (no spaces)
after a "/" character to each tracking number.
Example:
ups-track.pl 1Z903475387458375/Newegg 1Z387847878676764/Fred 1Z938983989489898
It prints something like this:
1Z1467XF0366666414 Toshiba-2.2-kW Status: In Transit - On Time Scheduled: 08/14/2007 Weight: 13.00 Lbs
1Z1467XF0346652024 Toshiba-3.7-kW Status: In Transit - On Time Scheduled: 08/14/2007 Weight: 15.00 Lbs
1ZA4Y0130366628290 DinoRight Status: In Transit - On Time Scheduled: 08/14/2007 Weight: 12.00 Lbs
1Z2R899R0355530175 Bargainland Status: In Transit
1Z2R899R0315555584 Status: In Transit
#!/usr/bin/perl
######################################################################
# perl script for tracking UPS packages.
# To run it on the command line, you give it a list of UPS tracking numbers
# (no spaces inside the numbers). You can append a comment (no spaces)
# after a "/" character to each tracking number.
#
# Example:
#
# ups-track.pl 1Z903475387458375/Newegg 1Z387847878676764/Fred 1Z938983989489898
#
# Copyright 2007 Igor Chudov ichudov@algebra.com
# Released under GNU GPL version 3.
#
######################################################################
use strict;
use warnings;
use vars qw( $ua );
use LWP::UserAgent;
use HTTP::Request::Common;
use HTML::TreeBuilder;
#use Data::Dumper;
$ua = LWP::UserAgent->new;
#$cookies = new HTTP::Cookies( file => "cookies.txt", autosave => 1 );
sub make_tree {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new;
    $tree->parse( $html );
    return $tree;
}

sub get_request {
    my ($req) = (@_);
    #$cookies->add_cookie_header($req);
    my $res = $ua->request($req);
    if ($res->is_success) {
        #$cookies->extract_cookies($res);
        return $res;
    } else {
        print STDERR "Failed to execute HTTP request: ", $res->status_line, "\n";
        print STDERR $res->as_string;
    }
    return undef;
}

sub get_webpage {
    my ($url) = @_;
    my $req = HTTP::Request->new(GET => $url);
    my $result = get_request( $req );
    if( !$result ) {
        print STDERR "Failed to get url '$url'.\n";
    }
    return $result;
}
my $usage = "USAGE: $0 tracknum";
foreach my $track (@ARGV) {
    my $comment = "";
    $comment = $1 if $track =~ s#/(.*)$##;
    my $url = "http://wwwapps.ups.com/WebTracking/processInputRequest?sort_by=status&tracknums_displayed=1&TypeOfInquiryNumber=T&loc=en_US&InquiryNumber1=$track&track.x=0&track.y=0";
    my $text = get_webpage( $url )->as_string;
    my $tree = make_tree( $text );
    #$tree->dump;
    my @table = $tree->look_down( '_tag', 'table',
        sub {
            return $_[0]->as_text =~ /Status:/;
        }
    );
    #print Dumper( @table );
    my $table = pop @table;
    #print Dumper( $table );
    #exit 0;
    my $t = $table->as_HTML;
    my @rows = $table->content_list;
    my $item = {};
    foreach my $row (@rows) {
        #print "ROW=$row.\n";
        #$row->dump; print "\n================================\n";
        next unless ref( $row );
        my @cols = $row->content_list;
        next unless 2 <= @cols;
        my ($key, $value) = ($cols[0]->as_text, $cols[1]->as_text);
        $key   =~ s/^\s+//; $key   =~ s/\s+$//; $key   =~ s/ +/ /g; $key   =~ tr/\x80-\xFF//d;
        $value =~ s/^\s+//; $value =~ s/\s+$//; $value =~ s/ +/ /g; $value =~ tr/\x80-\xFF//d;
        #print "$key=>$value.\n";
        next unless $key =~ /(.*):$/;
        $key = $1;
        $item->{$key} = $value;
    }
    print "$track";
    print sprintf( " %19s", $comment );
    print " Status: $item->{Status}" if defined $item->{Status};
    print " Scheduled: $item->{'Scheduled Delivery'}" if defined $item->{'Scheduled Delivery'};
    print " Weight: $item->{Weight}" if defined $item->{Weight};
    print "\n";
    if( 0 ) {
        foreach my $k (sort keys %$item ) {
            print "\t\t$k ==> $item->{$k}\n";
        }
        print "\n";
    }
}
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc. For subscription or unsubscription requests, send
#the single line:
#
# subscribe perl-users
#or:
# unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.
NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.
#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V11 Issue 744
**************************************