Perl-Users Digest, Issue: 6813 Volume: 10

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Tue Aug 3 13:36:44 2004

Date: Tue, 3 Aug 2004 10:36:05 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Tue, 3 Aug 2004     Volume: 10 Number: 6813

Today's topics:
        Multipart form upload causes script to hang after 16K p (John)
    Re: Multipart form upload causes script to hang after 1 <nobull@mail.com>
    Re: Multipart form upload causes script to hang after 1 (John)
        Multiple File Grep (Blake)
    Re: Multiple File Grep <josef.moellers@fujitsu-siemens.com>
    Re: Multiple File Grep ctcgag@hotmail.com
    Re: Multiple File Grep (Greg Bacon)
    Re: Multiple File Grep <jgibson@mail.arc.nasa.gov>
        Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 27 Jul 2004 08:21:14 -0700
From: montclairguy@hotmail.com (John)
Subject: Multipart form upload causes script to hang after 16K printed
Message-Id: <f7606929.0407270721.4438aace@posting.google.com>

Firstly, the OS is FreeBSD 4.7-RELEASE-p27 (VKERN) #33, Perl is
v5.6.1, and Apache is 1.3.27.

I'm using the following (very pared-down) HTML form for submitting
up to 5 files as uploads:

<html><body>
<form enctype="multipart/form-data" action="upload.cgi" method="post"
name="upload_form">
<input type="file" name="file[1]" size=25
accept="image/*,application/*"><br>
<input type="file" name="file[2]" size=25 ... (this repeats until
file[5])
<input type="submit" value="Upload" name="Upload">
<INPUT TYPE="reset" value="Reset" name="Reset">
</form>
</html></body>

After a ton of debugging, I consistently encounter a problem when the
upload data is over 16K, and the script is attempting to output
-ANYTHING- over 16K.  And I mean, even when the script doesn't even
process the submitted data!

I've debugged the script down to virtually nothing, and the processing
of the uploaded data doesn't seem to have anything to do with the
problem.  For example, here's upload.cgi which has really nothing to
do with the submitted data anymore:

#------------------------
#!/usr/local/bin/perl -w

use strict;

my $header_page = $ENV{'DOCUMENT_ROOT'} .
'/scripts/uploadheader.html';

# $|=0; #nope this doesn't help
# $|=1; #neither does this

print "Content-type: text/html\n\n\<HTML><HEAD><TITLE>File
Upload</TITLE>\n";
&spitout_file($header_page);
exit;
#-----------------
sub spitout_file {
        my $file    = $_[0];
        my $endbyte = (-s $file);
        my $string  = '';
        if ($endbyte > 0) {
                # note: "open FILE, $file || die ..." never dies, because
                # || binds to $file rather than to open's return value
                open FILE, $file or die "Unable to open file $file: $!";
                read FILE, $string, $endbyte;
                close FILE;
                print $string;
                undef $string;
        } else {
                return 0;
        }
        return 1;
}

I've also tried spitout_file like this:

#-----------------
sub spitout_file {
        my $file = $_[0];
        open (AFILE, "<$file") || die "Unable to open file $file: $!";
        while (<AFILE>) { print $_; }
        close (AFILE);
}
#-----------------

I've tried other variations on spitout_file as well, including reading
1000, 8192, and 16384 bytes at a time.  Every method, every time, fails
to print the contents of $file if it's more than 16384 bytes and the
submitted data is more than 16384 bytes.  All methods -will- read the
file, but they won't print.  It's as if "print" is broken, or some
buffer is full and can't take any more, so perl just quits.
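
For reference, here is the shape of the chunked variant (a sketch of
what I tried; 8192 was one of the chunk sizes):

#-----------------
sub spitout_file {
        my $file = $_[0];
        my $buf;
        open my $fh, '<', $file or die "Unable to open $file: $!";
        while (read $fh, $buf, 8192) {  # read one chunk at a time
                print $buf;             # still stalls once ~16K is out
        }
        close $fh;
}
#-----------------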

If the submitted data is <16384 bytes, the script works -or- if $file
is <16384 bytes, the script works.  And, of course, the script works
fine when called by itself (not as the action of this upload form).

I've got no error messages, no Apache error_log messages, and nothing
to go on other than what's above, and have tried everything I can
think of.  I've searched the groups and the web and cannot find
anything resembling this.

What am I missing?


------------------------------

Date: 27 Jul 2004 18:20:09 +0100
From: Brian McCauley <nobull@mail.com>
Subject: Re: Multipart form upload causes script to hang after 16K printed
Message-Id: <u93c3dcqpi.fsf@wcl-l.bham.ac.uk>

montclairguy@hotmail.com (John) writes:

> Firstly, the OS is FreeBSD 4.7-RELEASE-p27 (VKERN) #33, Perl is
> v5.6.1, and Apache is 1.3.27.
> 
> I'm using the following (very pared-down) HTML form for submitting
> up to 5 files as uploads:

> After a ton of debugging, I consistently encounter a problem when the
> upload data is over 16K, and the script is attempting to output
> -ANYTHING- over 16K.  And I mean, even when the script doesn't even
> process the submitted data!

If the CGI script does not process (or at least discard) the data
presented to it on STDIN by an HTTP server, then that data will just
be left sitting in the FIFO between the HTTP server process and the
CGI process.  If the data exceeds the size of a FIFO buffer, and the
HTTP server process doesn't bother trying to read from the CGI's
STDOUT until it has finished writing the request body to the CGI's
STDIN, then the web server (or at least one handler
thread/subprocess) will stall.  And once the server stops reading,
your own prints block as soon as the output pipe fills, which is why
the hang shows up at 16K.  I don't know enough about the internals of
Apache 1.3 to be sure it behaves this way, but from what you describe
I'd guess this is what's happening.
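
Something along these lines should unstick it (an untested sketch;
the point is just to consume STDIN before you print anything):

#-----------------
#!/usr/local/bin/perl -w
use strict;

# Read and discard the whole request body first, so the server can
# finish writing to our STDIN before we start producing output.
if (($ENV{'REQUEST_METHOD'} || '') eq 'POST') {
        my $left = $ENV{'CONTENT_LENGTH'} || 0;
        my $buf;
        while ($left > 0) {
                my $got = read STDIN, $buf, ($left < 8192 ? $left : 8192);
                last unless $got;       # EOF or read error
                $left -= $got;
        }
}

print "Content-type: text/html\n\n";
# ... now output of any size is safe
#-----------------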

-- 
     \\   ( )
  .  _\\__[oo
 .__/  \\ /\@
 .  l___\\
  # ll  l\\
 ###LL  LL\\


------------------------------

Date: 28 Jul 2004 08:08:01 -0700
From: montclairguy@hotmail.com (John)
Subject: Re: Multipart form upload causes script to hang after 16K printed
Message-Id: <f7606929.0407280708.1b620241@posting.google.com>

Brian McCauley <nobull@mail.com> wrote in message news:<u93c3dcqpi.fsf@wcl-l.bham.ac.uk>...
> montclairguy@hotmail.com (John) writes:
> 
> > Firstly, the OS is FreeBSD 4.7-RELEASE-p27 (VKERN) #33, Perl is
> > v5.6.1, and Apache is 1.3.27.
> > 
> > I'm using the following (very pared-down) HTML form for submitting
> > up to 5 files as uploads:
>  
> > After a ton of debugging, I consistently encounter a problem when the
> > upload data is over 16K, and the script is attempting to output
> > -ANYTHING- over 16K.  And I mean, even when the script doesn't even
> > process the submitted data!
> 
> If the CGI script does not process (or at least discard) the data
> presented to it on STDIN by an HTTP server, then that data will just
> be left sitting in the FIFO between the HTTP server process and the
> CGI process.  If the data exceeds the size of a FIFO buffer, and the
> HTTP server process doesn't bother trying to read from the CGI's
> STDOUT until it has finished writing the request body to the CGI's
> STDIN, then the web server (or at least one handler
> thread/subprocess) will stall.  And once the server stops reading,
> your own prints block as soon as the output pipe fills, which is why
> the hang shows up at 16K.  I don't know enough about the internals
> of Apache 1.3 to be sure it behaves this way, but from what you
> describe I'd guess this is what's happening.

Hi Brian,

The actual script does process the form data.  I just omitted that
from my post since the script wouldn't even get that far.  Regardless,
you are correct.  I changed the script to process the submitted data
prior to performing any output, and it works fine!

Funny how something so obvious is hard to see after hours of mulling
over code.  Thanks for pointing it out and solving the problem!

Regards,
Dave


------------------------------

Date: 27 Jul 2004 06:21:24 -0700
From: phpuser@chek.com (Blake)
Subject: Multiple File Grep
Message-Id: <6ca67b3b.0407270521.1e99ec8e@posting.google.com>

I'm trying to figure out how to grep a bunch of log files into one
file.

Basically I have virtual hosts set up like this

/home/user/site/log

There are about 85 like that. Within /log/ there is an access_log file

So what I want to do is to be able to grep out all the hits in a
certain hour to see who's killing the server.

So what I'd like to do is something like

find /home/ -name access_log
while { there's a log }
grep July 27 10pm 
send that output to a single file

Then I can grep out the hits from that one file to see who's killing
me.

What's the best way to do that?


------------------------------

Date: Tue, 27 Jul 2004 15:30:19 +0200
From: Josef Moellers <josef.moellers@fujitsu-siemens.com>
Subject: Re: Multiple File Grep
Message-Id: <ce5l59$652$1@nntp.fujitsu-siemens.com>

Blake wrote:
> I'm trying to figure out how to grep a bunch of log files into one
> file.
>
> Basically I have virtual hosts set up like this
>
> /home/user/site/log
>
> There are about 85 like that. Within /log/ there is an access_log file
>
> So what I want to do is to be able to grep out all the hits in a
> certain hour to see who's killing the server.
>
> So what I'd like to do is something like
>
> find /home/ -name access_log
> while { there's a log }
> grep July 27 10pm
> send that output to a single file
>
> Then I can grep out the hits from that one file to see who's killing
> me.
>
> What's the best way to do that?

I'd use File::Find.
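
For instance, a minimal sketch of that approach (the hour pattern is
only an example; adjust it to your log's timestamp format):

#!/usr/local/bin/perl
use strict;
use warnings;
use File::Find;

my $pattern = qr{27/Jul/2004:22};       # e.g. 10pm on July 27

# find() invokes the callback for every file under /home; by default
# it chdirs into each directory and sets $_ to the basename.
find(sub {
    return unless $_ eq 'access_log' && -f $_;
    open my $fh, '<', $_
        or do { warn "open $File::Find::name: $!\n"; return };
    while (my $line = <$fh>) {
        print $line if $line =~ $pattern;
    }
    close $fh;
}, '/home');

Redirect the output to collect all the hits in a single file.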

-- 
Josef Möllers (Pinguinpfleger bei FSC)
	If failure had no penalty success would not be a prize
						-- T.  Pratchett



------------------------------

Date: 27 Jul 2004 16:04:51 GMT
From: ctcgag@hotmail.com
Subject: Re: Multiple File Grep
Message-Id: <20040727120451.898$RA@newsreader.com>

phpuser@chek.com (Blake) wrote:
> I'm trying to figure out how to grep a bunch of log files into one
> file.
>
> Basically I have virtual hosts set up like this
>
> /home/user/site/log
>
> There are about 85 like that. Within /log/ there is an access_log file
>
> So what I want to do is to be able to grep out all the hits in a
> certain hour to see who's killing the server.
>
> So what I'd like to do is something like
>
> find /home/ -name access_log
> while { there's a log }
> grep July 27 10pm
> send that output to a single file
>
> Then I can grep out the hits from that one file to see who's killing
> me.
>
> What's the best way to do that?


system q{
grep 'July 27 10pm' /home/*/site/log/access_log > a_single_file
};



Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service                        $9.95/Month 30GB


------------------------------

Date: Tue, 27 Jul 2004 16:35:34 -0000
From: gbacon@hiwaay.net (Greg Bacon)
Subject: Re: Multiple File Grep
Message-Id: <10gd12m5omtu2e9@corp.supernews.com>

In article <6ca67b3b.0407270521.1e99ec8e@posting.google.com>,
    Blake <phpuser@chek.com> wrote:

: [...]
: So what I want to do is to be able to grep out all the hits in a
: certain hour to see who's killing the server.
: 
: So what I'd like to do is something like
: 
: find /home/ -name access_log
: while { there's a log }
: grep July 27 10pm 
: send that output to a single file
: 
: Then I can grep out the hits from that one file to see who's killing
: me.

Consider the code below.  Example usage:

    % ghrp /home 27/Jul/2004:22

#! /usr/local/bin/perl

# ghrp: search for an hour and print

use warnings;
use strict;

sub usage { "Usage: $0 root-dir time-pattern\n" }

my %mon = (
    Jan =>  1, Feb =>  2, Mar =>  3, Apr =>  4, May =>  5, Jun =>  6,
    Jul =>  7, Aug =>  8, Sep =>  9, Oct => 10, Nov => 11, Dec => 12,
);

sub date {
    my $date = shift;

    my($d,$m,$y,$hr,$min,$sec);

    # e.g., 03/Feb/1998:17:42:15 -0500
    if ($date =~ m!(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+)!) {
        ($d,$m,$y,$hr,$min,$sec) = ($1,$2,$3,$4,$5,$6);

        $m = $mon{$m} || 0;
    }
    else {
        $d = $m = $y = $hr = $min = $sec = 0;
    }

    ($d,$m,$y,$hr,$min,$sec);
}

sub date_asc {
    $a->[2] <=> $b->[2]  # year
            ||
    $a->[1] <=> $b->[1]  # month
            ||
    $a->[0] <=> $b->[0]  # day
            ||
    $a->[3] <=> $b->[3]  # hour
            ||
    $a->[4] <=> $b->[4]  # min
            ||
    $a->[5] <=> $b->[5]  # sec
}

## main
die usage unless @ARGV == 2;

my $root = shift;
die "$0: '$root' is not a directory!\n" . usage unless -d $root;

(my $time = shift) =~ s,/,\\/,g;
my $pat  = eval "qr/" . $time . "/";

unless (defined $pat) {
    die "$0: bad time pattern\n";
}

# from http://stein.cshl.org/WWW/docs/handout.html#Log_Parsing
my $line = qr/^\S+ \S+ \S+ \[($pat[^]]+)\] "\w+ \S+.*" \d+ \S+/;

my @hits;
for (`find $root -name access_log 2>&1`) {
    chomp;

    # assume this line is a warning if it's not a filename
    unless (-f $_) {
        warn $_ . "\n";
        next;
    }

    my $fh;
    unless (open $fh, "<", $_) {
        warn "$0: open $_: $!\n";
        next;
    }

    while (<$fh>) {
        push @hits, [ date($1), $_ ] if /$line/;
    }
}

print $_ for map $_->[6], sort date_asc @hits;

__END__

Hope this helps,
Greg
-- 
The logic is flawless: when a private business *accidentally* kills 146
people, we need to increase the power of the government, an entity that
*deliberately* kills millions.
    -- Gene Callahan


------------------------------

Date: Tue, 27 Jul 2004 17:50:33 -0700
From: Jim Gibson <jgibson@mail.arc.nasa.gov>
Subject: Re: Multiple File Grep
Message-Id: <270720041750333947%jgibson@mail.arc.nasa.gov>

In article <6ca67b3b.0407270521.1e99ec8e@posting.google.com>, Blake
<phpuser@chek.com> wrote:

> I'm trying to figure out how to grep a bunch of log files into one
> file.
> 
> Basically I have virtual hosts set up like this
> 
> /home/user/site/log
> 
> There are about 85 like that. Within /log/ there is an access_log file
> 
> So what I want to do is to be able to grep out all the hits in a
> certain hour to see who's killing the server.
> 
> So what I'd like to do is something like
> 
> find /home/ -name access_log
> while { there's a log }
> grep July 27 10pm 
> send that output to a single file
> 
> Then I can grep out the hits from that one file to see who's killing
> me.
> 
> What's the best way to do that?

You don't always have to use Perl. If you are using Unix:

#!/bin/csh
# match the hour in Apache's own timestamp format, e.g. 27/Jul/2004:22
set date = `date '+%d/%b/%Y:%H'`
find /home -name access_log -exec grep -H "$date" {} \;

(and direct output of this shell script to a file).
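
For example (script name hypothetical):

    % ./lasthour.csh > /tmp/all_hits
    % awk -F: '{print $1}' /tmp/all_hits | sort | uniq -c | sort -rn

Since grep -H prefixes each match with its file name, the first
colon-separated field is the access_log path, so the second command
counts hits per virtual host.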


------------------------------

Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>


Administrivia:

#The Perl-Users Digest is a retransmission of the USENET newsgroup
#comp.lang.perl.misc.  For subscription or unsubscription requests, send
#the single line:
#
#	subscribe perl-users
#or:
#	unsubscribe perl-users
#
#to almanac@ruby.oce.orst.edu.  

NOTE: due to the current flood of worm email banging on ruby, the smtp
server on ruby has been shut off until further notice. 

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

#To request back copies (available for a week or so), send your request
#to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
#where x is the volume number and y is the issue number.

#For other requests pertaining to the digest, send mail to
#perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
#sending perl questions to the -request address, I don't have time to
#answer them even if I did know the answer.


------------------------------
End of Perl-Users Digest V10 Issue 6813
***************************************

