[19868] in Perl-Users-Digest
Perl-Users Digest, Issue: 2063 Volume: 10
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Nov 4 06:10:25 2001
Date: Sun, 4 Nov 2001 03:10:09 -0800 (PST)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Message-Id: <1004872209-v10-i2063@ruby.oce.orst.edu>
Content-Type: text
Perl-Users Digest Sun, 4 Nov 2001 Volume: 10 Number: 2063
Today's topics:
Re: reading flat-file db and replacing a word <bootsy52@gmx.net>
Re: reading flat-file db and replacing a word <wwonko@rdwarf.com>
Re: reading flat-file db and replacing a word <please@no.spam>
Re: Sending Content Type in email <Tassilo.Parseval@post.rwth-aachen.de>
Re: Sending Content Type in email nobull@mail.com
Re: Split output into multiple pages? <please@no.spam>
what means about "e" : $name=~ s/.../.../egi (hugh1)
Re: what means about "e" : $name=~ s/.../.../egi <Tassilo.Parseval@post.rwth-aachen.de>
Digest Administrivia (Last modified: 6 Apr 01) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: Sun, 04 Nov 2001 05:10:34 +0100
From: "Carsten Menke" <bootsy52@gmx.net>
Subject: Re: reading flat-file db and replacing a word
Message-Id: <pan.2001.11.04.05.10.31.548.13516@gmx.net>
On Sun, 04 Nov 2001 02:19:44 +0100, Tad McClellan wrote:
>>>
>>I have made some thoughts, and, well I can hear all people say use a DB
>>then, but this is only what I thought of:
>>
>>I have my file, for every line in the file I use a fixed line length
> ^^^^^^^^^^^^^^^^^
>>then for input, the # character is forbidden, because it is used as a
>>seperator.
>
>
> Eh? That looks to me like semi-colon is being used as a separator, and #
> is being used as a padding character.
Yes, of course I meant padding character. Sorry, I'm not a native speaker.
>
> Why not use fixed length fields within your fixed length records?
I could use fixed-length fields for the first field, but not for the
4th field, for example, because I do not know how much greeting or
comment text my boss will want to fill in. (This field is reserved for
the greeting string the user receives when he logs on.) If I assume 100
lines, he will come the next day and say he needs 110 lines because he
wants to greet a group of 10 people one by one.
> Then there would be no need for spending space on separators at all.
Would be nice, but... you don't know my boss. As soon as you have finished
some code, he has more ideas about what he wants implemented.
>
>
> I think you may have overlooked money. Lots of people count that in the
> "important" category :-)
Yes, I had not taken that into account at all.
> The FAQ shows you how to avoid your second objection, but you've never
> said why you do not want to make a copy of the file itself.
>
> Why do you not want to make a copy of the file itself?
>
> I don't see anything in your description that would preclude just using
> the -i command line switch and ripping through your data with a while <>
> loop.
>
> Development time would be, I dunno, 10 or 20 times more if you must not
> use a temporary file. Disk space is cheap. Programmer time is expensive.
>
> I have not seen anything that is forcing you to go with the (greatly)
> more expensive route.
>
> Why do you feel compelled to spend so much time in order to save disk
> space that will be used for only a few seconds?
>
> Perhaps you have a good reason and just haven't shared it with us?
>
Hmmmh, my reason is not really that good. Maybe I'm a little bit
paranoid, but my fear is that any disk write could fail, and
therefore I'm afraid of corrupted (inconsistent) data files. But maybe
you can tell me whether I should be afraid of this, or whether the
inline edit carries the same risk as a temporary file. BTW, of course
I will make backup copies.
The second reason why I want to do this is speed. I'll tell you exactly
what I want to use this code for. The examples shown are part of a
password file (yes, my boss said encryption is not necessary; it was not
my idea). There are two things which have to be done. First, at every
login (done via CGI) the time of that login has to be recorded. So if I
made a temporary copy and the user hits the stop button of his browser,
what happens then? With inline editing, only his record may be
corrupted; with a temporary file which replaces the original file, all
records may be corrupted (or at least that is what I think).
The other thing is that I want to provide a CGI-based interface so that
one can easily add, change, and remove members. For that I have
currently made 5 input fields (but it is flexible, if more are needed).
So if someone wants to change 5 users at a time, I need to make a
temporary copy, replace the first entry, then unlink/overwrite the temp
file with the modified one, make the second change, unlink/overwrite the
temp with the modified one again, and so on until all changes are made.
But mainly I'm concerned that a copy/unlink operation could fail. You are more
experienced, so I would really like to hear your point of view.
Thanx Carsten
------------------------------
Date: Sun, 4 Nov 2001 07:06:49 +0000 (UTC)
From: Louis Erickson <wwonko@rdwarf.com>
Subject: Re: reading flat-file db and replacing a word
Message-Id: <9s2pe9$dmf$1@holly.rdwarf.com>
Carsten Menke <bootsy52@gmx.net> wrote:
: On Sun, 04 Nov 2001 02:19:44 +0100, Tad McClellan wrote:
:> The FAQ shows you how to avoid your second objection, but you've never
:> said why you do not want to make a copy of the file itself.
:>
:> Why do you not want to make a copy of the file itself?
:>
:> I don't see anything in your description that would preclude just using
:> the -i command line switch and ripping through your data with a while <>
:> loop.
:>
:> Development time would be, I dunno, 10 or 20 times more if you must not
:> use a temporary file. Disk space is cheap. Programmer time is expensive.
:>
:> I have not seen anything that is forcing you to go with the (greatly)
:> more expensive route.
:>
:> Why do you feel compelled to spend so much time in order to save disk
:> space that will be used for only a few seconds?
:>
:> Perhaps you have a good reason and just haven't shared it with us?
:>
I have to agree with Tad here. Generally, people have disk space, and
modern disk drives don't just randomly fail. Also, doing the copy/update
then the rename/unlink will have more easily handled error conditions,
and be less likely to corrupt your file.
With a change to an existing file, an error means you have to try and
recover the file, undoing your work, which will be complex. With a
copy, you just erase the temp file, and fail. The one place that
a real failure can occur is if you can't rename the temp file to
the real file name.
I would also probably consider using a lock file of some sort to
keep other processes from mangling this file at the same time.
The FAQ has suggestions on how to do this. There are also more
details in "The Perl Cookbook" by Christiansen & Torkington,
published by O'Reilly. (See Recipes 7.5, 7.8, 7.9, 7.10, and
7.11 for the things I've talked about here.) It is, IMO, a
very useful book to have around.
: Hmmmh, my reason is not really that good. Maybe I'm a little bit
: paranoid, but my fear is that any disk write could fail, and
: therefore I'm afraid of corrupted (inconsistent) data files. But maybe
: you can tell me whether I should be afraid of this, or whether the
: inline edit carries the same risk as a temporary file. BTW, of course
: I will make backup copies.
By making backup copies, you've used as much space as making your
changes by making a copy of the file.
: The second reason why I want to do this is speed. I'll tell you exactly
: what I want to use this code for. The examples shown are part of a
: password file (yes, my boss said encryption is not necessary; it was not
: my idea). There are two things which have to be done. First, at every
: login (done via CGI) the time of that login has to be recorded. So if I
: made a temporary copy and the user hits the stop button of his browser,
: what happens then? With inline editing, only his record may be
: corrupted; with a temporary file which replaces the original file, all
: records may be corrupted (or at least that is what I think).
If you're worried about speed, try making up a file with that many
lines, or more in it. Write a Perl program to generate some, or
just copy and paste data in to your text file, and find out.
It may be, you can handle big files easily, and it may be that
you're not going to be fast enough anyway. Test and know.
: The other thing is that I want to provide a CGI-based interface so that
: one can easily add, change, and remove members. For that I have
: currently made 5 input fields (but it is flexible, if more are needed).
: So if someone wants to change 5 users at a time, I need to make a
: temporary copy, replace the first entry, then unlink/overwrite the temp
: file with the modified one, make the second change, unlink/overwrite the
: temp with the modified one again, and so on until all changes are made.
If you know all five users you are updating, one update pass should
be able to handle all of them, or however many you're changing.
: But mainly I'm concerned that a copy/unlink operation could fail. You are more
: experienced, so I would really like to hear your point of view.
I'm pleased to hear you're worried about a disk operation failing. That
shows great concern and wisdom. However, the more complex your
operations, the more likely a failure is. My experience says simple is
almost always better.
I would handle this by reading the records, one line at a time, and
writing out updated records, or skipping a record to be deleted. Rather
than trying to write Perl for this off the cuff, I'm going to give you
a little pseudocode of how I might handle this. If you're not familiar
with that term (hard to say, since you say you're not a native English
speaker; you're doing pretty well so far), it means that it's not real
code and won't run. It's a hybrid of code-like notation and commentary
that should explain enough for you to understand what you'll have to
go off and write.
(While many in this newsgroup seem able to write useful bits of
Perl quickly and easily, it's not a knack I have; I have to
work over it slowly and carefully and go one step at a time.
By figuring out like this ahead of time, I know what I'm doing.)
step 1: Build a hash of lines to be changed. Use the "primary key",
        whatever it may be, as the index value for the hash, and
        the line to be output as the value. Use a "special" value
        for lines to be deleted. When built, your hash might contain:

            $changes{"erickson"} = "Louis;Erickson;whatever this field is;something...";
            $changes{"foo"} = "Foo;Bar;long field value;nothing";
            $changes{"delete"} = ";;delete-record;;";

step 2: Open the original file for read, and a temp file for writing.
        (See perldoc -q temp for notes on how to make a temp file
        name.) Put the file in the same directory as the real file,
        so you can rename it later.

step 3: Loop through, reading a line, checking to see if it's one you care
        about, and writing it out. If you don't care about this line,
        write it unchanged. If you do, skip it or write the value in
        the hash prepared above.

            while (<READ>)
            {
                $whole_line = $_;
                # split the line just read into the appropriate keys
                if (defined $changes{"primary for this line"})
                {
                    if ($changes{"primary for this line"} ne ";;delete-record;;")
                    {
                        print WRITE $changes{"primary for this line"} . "\n";
                    }
                }
                else
                {
                    print WRITE $whole_line;
                }
            }

step 4: Close input and output files.

step 5: Rename the database to database.hold

Note that if any errors happen to this point, you can just delete the
temp file, and return an error, and your original data is still intact.
This is good! If you're messing with your only copy of the file, errors
will be much harder to clean up.

step 6: Rename the temp file to the database file. Errors here mean
        you have to rename the .hold file back before you fail.
        Renames don't fail too often, but they can. If both of them
        fail, your program is hosed, but the data is there on the disk
        where a person can come along and find it.

step 7: Delete the .hold file.
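
For readers who want something runnable, here is a minimal Perl sketch of steps 2 through 7. It is only one way to do it, under assumptions not in the post: records are semicolon-separated, the first field is the key, and the sub name update_records is made up for illustration. It does no locking, and if it dies before the renames the temp file is left behind for a person to inspect.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Marker value for records to delete, taken from the steps above.
my $DELETE = ";;delete-record;;";

# Rewrite $db through a temp file. %$changes maps a key (assumed to be
# the first semicolon-separated field) to a replacement line, or to
# $DELETE to drop the record.
sub update_records {
    my ($db, $changes) = @_;

    # Steps 2-4: copy to a temp file in the same directory.
    open my $in, '<', $db or die "Cannot read $db: $!";
    my ($out, $tmp) = tempfile('dbXXXXXX', DIR => dirname($db));
    while (my $line = <$in>) {
        my ($key) = $line =~ /^([^;\n]*)/;   # always matches, may be empty
        if (exists $changes->{$key}) {
            next if $changes->{$key} eq $DELETE;
            print {$out} $changes->{$key}, "\n" or die "Write failed: $!";
        }
        else {
            print {$out} $line or die "Write failed: $!";
        }
    }
    close $in;
    close $out or die "Cannot close $tmp: $!";

    # Steps 5-7: swap the files, keeping the old copy until we are done.
    rename $db, "$db.hold" or die "Cannot rename $db: $!";
    unless (rename $tmp, $db) {
        my $err = $!;
        rename "$db.hold", $db;              # try to put the original back
        die "Cannot install new file: $err";
    }
    unlink "$db.hold";
    return 1;
}
```

A call might look like update_records("users.db", { foo => "foo;Bar;new greeting", bar => $DELETE }), which handles any number of changes in a single pass over the file.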
Also, as this is triggered by the Web, you'll need to consider what
happens if two people do this at once; the answer is probably, "bad
things". You'll need to prevent this with some sort of locking.
This is, unfortunately, nontrivial. See CPAN and perldoc -q lock
for some leads that way.
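
As one possible starting point, the sketch below serializes updates with an advisory lock on a separate lock file, using Perl's built-in flock. The ".lock" suffix and the sub name with_lock are assumptions made for this example, and advisory locks only help if every program that touches the file uses the same scheme. flock may also be unreliable on some network filesystems.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Run $code while holding an exclusive lock on "$db.lock".
# Returns whatever $code returns (scalar context only).
sub with_lock {
    my ($db, $code) = @_;
    open my $lock, '>>', "$db.lock" or die "Cannot open $db.lock: $!";
    flock($lock, LOCK_EX)           or die "Cannot lock $db.lock: $!";

    my $result;
    my $ok  = eval { $result = $code->(); 1 };
    my $err = $@;

    close $lock;                    # closing the handle releases the lock
    die $err unless $ok;            # re-throw after the lock is released
    return $result;
}
```

The whole read, rewrite, and rename sequence would go inside the code reference passed to with_lock, so that two CGI requests cannot interleave their updates.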
I hate to say this, but what you're describing here is a database.
This is what databases do, and someone ELSE has written and debugged
and tested that whole big mess of code so you don't have to. A
good database handles all the problems of multiple people making
changes to the data at once, and keeping the data consistent
when errors happen.
I do understand that saying, "We need a database!" is a big step,
and kind of scary for a lot of people, but it is the right answer.
You will also be faster that way, and be able to handle far more
load and many unforeseen things, whereas this code will
be much harder to maintain. Perl's DBI lets you access many
databases, such as MySQL, ODBC, Sybase, and others.
I hope my pseudocode up there helped, and good luck. What
you're doing isn't as straightforward as you'd like, and
doing it right will be tricky.
Good luck!
------------------------------
Date: Sun, 04 Nov 2001 09:30:51 GMT
From: Andrew Cady <please@no.spam>
Subject: Re: reading flat-file db and replacing a word
Message-Id: <87d72yril5.fsf@homer.cghm>
"Carsten Menke" <bootsy52@gmx.net> writes:
> On Sun, 04 Nov 2001 02:19:44 +0100, Tad McClellan wrote:
> I could use fixed-length fields for the first field, but not for the
> 4th field, for example, because I do not know how much greeting or
> comment text my boss will want to fill in. (This field is reserved
> for the greeting string the user receives when he logs on.) If I
> assume 100 lines, he will come the next day and say he needs 110
> lines because he wants to greet a group of 10 people one by one.
Why store these in the same file? Why not e.g. store a filename in
the record, and put the greeting in the file? Or keep them in a
separate database or something?
> Hmmmh, my reason is not really that good. Maybe I'm a little bit
> paranoid, but my fear is that any disk write could fail, and
> therefore I'm afraid of corrupted (inconsistent) data files. But
> maybe you can tell me whether I should be afraid of this, or whether
> the inline edit carries the same risk as a temporary file.
Your reason isn't good. But you can't replace the file here, for
another reason. Writes could clobber each other and corrupt on-going
reads. This applies to your approach as well, although it's slightly
better. If you added record-locking to your approach, that would make
it viable, but it would be far too much work.
Use a database. A simple, flat DBM (via a tied hash) should suffice.
GDBM is the best. All your operations will be in constant time, and
best of all, perl will handle all the gory details for you. You just
use it like a hash.
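
To illustrate the tied-hash idea, here is a minimal sketch. It uses SDBM_File rather than GDBM only because SDBM_File ships with Perl; GDBM_File is tied in a similar way if it is installed. The sub name open_users, the file name, and the record layout are hypothetical. Note that SDBM limits each key/value pair to roughly 1 KB, so long greeting strings would need a different DBM, and a tied DBM file still needs locking if several processes write to it.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;                  # bundled with Perl
use File::Temp qw(tempdir);

# Tie a hash to an on-disk DBM file and return a reference to it.
sub open_users {
    my ($path) = @_;
    tie my %users, 'SDBM_File', $path, O_RDWR | O_CREAT, 0600
        or die "Cannot open $path: $!";
    return \%users;
}

my $dir   = tempdir(CLEANUP => 1);          # throwaway location for the demo
my $users = open_users("$dir/users");

$users->{erickson} = "Louis;Erickson;some greeting";   # add or change
$users->{foo}      = "Foo;Bar;another greeting";
delete $users->{foo};                                  # remove

print "$_ => $users->{$_}\n" for sort keys %$users;
untie %$users;                              # flush and close the DBM file
```

Each add, change, or delete touches only the record involved, so there is no whole-file rewrite to worry about.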
> The second reason why I want to do this is speed. I'll tell you
> exactly what I want to use this code for. The examples shown are
> part of a password file (yes, my boss said encryption is not
> necessary; it was not my idea). There are two things which have to
> be done. First, at every login (done via CGI) the time of that login
> has to be recorded. So if I made a temporary copy and the user hits
> the stop button of his browser, what happens then?
That's not how CGI works, and even if it was, it's not how you design
interfaces. Validate the entire input before you start writing any of
it to disk.
(What exactly DO you think is going to happen if the user hits
"stop"?)
You should not be storing timestamps in the password file. And you
say you're worried about disk failure!
> The other thing is that I want to provide a CGI-based interface so
> that one can easily add, change, and remove members. For that I have
> currently made 5 input fields (but it is flexible, if more are
> needed). So if someone wants to change 5 users at a time, I need to
> make a temporary copy, replace the first entry, then unlink/overwrite
> the temp file with the modified one, make the second change,
> unlink/overwrite the temp with the modified one again, and so on
> until all changes are made.
>
> But mainly I'm concerned that a copy/unlink operation could
> fail. You are more experienced, so I would really like to hear your
> point of view.
Well so what if it does? You don't lose any data. Even if the rename
fails, you don't lose any data (POSIX rename() guarantees this, at
least). Of course, these operations will never fail. They quite
possibly will never use the physical disk (if the file is only 1 or
2k).
You WILL lose data if two processes try it at once, though, so
use a database.
------------------------------
Date: Sun, 04 Nov 2001 09:37:34 +0100
From: Tassilo von Parseval <Tassilo.Parseval@post.rwth-aachen.de>
Subject: Re: Sending Content Type in email
Message-Id: <3BE4FE4E.7020609@post.rwth-aachen.de>
Glenn White wrote:
[...]
> I can get the email to display in Courier by sending it as html, but that
> gets nasty for the people who use a plain vanilla reader. I've been trying
> to use:
>
> $mail{'Content-type'} = 'text/plain; charset="iso-8859-1"';
> $mail{'Content-type'} = 'text/plain; charset="US-ASCII"';
> and a few other variations.
Note that this does not say anything about the font used to render the
body. It just specifies the type (text/plain) and the charset which
should be enough to display each character in the body.
> I can see the content type listed in the email header, but the mail
> programs will not display it as a mono-type font. I've tried playing with
> the font options in Eudora, but have not had any success.
>
> Is it possible for a Perl program to send a "content type" so the email
> will be displayed as a mono-type font (such as Courier) even if the user's
> default settings are something else? Since I'm trying to do this from a
> Perl program, I thought I would ask here first, then try one of the other
> related groups.
This is not possible even if you had a Python-program sending a
content-type field in the header. There is no way to tell the
recipient's user-agent to use a particular font for displaying an email
if the email in question is of text/plain-type. You'd indeed need a
multipart/alternative email with one part text/plain and the other one
for instance text/html. But don't do it and instead leave the font used
as a choice for the recipient.
Tassilo
--
$a=[(74,116)];$b=[($a->[1]-1,$a->[1]++,0x20)];$c=[(97,110)];$d=[($c->
[1]+1,$b->[1],"her")];for(@{[$a,$b,$c,$d]}){for(@{$_}){$_=~/\d+/?print
(chr($_)):print;}}$c=sub{$l=shift;[(0x20+$l-1,0x50,0x65,0x73-0x01,108
),(0x20,0x68,0x61,)]};print(map{chr($_)}@{($c->(1))});$h={a=>33*3,b=>
10**2+7,c=>"1"."0"."1",d=>0162};@h=sort(keys(%$h));for(@h){print(chr(
ord(chr($h->{$_}))))};
------------------------------
Date: 04 Nov 2001 10:13:50 +0000
From: nobull@mail.com
Subject: Re: Sending Content Type in email
Message-Id: <u9668qlu75.fsf@wcl-l.bham.ac.uk>
spam.killer@home.com_nospam (Glenn White) writes:
> Is it possible for a Perl program to send a "content type" so the email
> will be displayed as a mono-type font (such as Courier) even if the user's
> default settings are something else?
If such a header existed Perl could send it.
> Since I'm trying to do this from a Perl program, I thought I would
> ask here first, then try one of the other related groups.
I guess the netizens over in rec.food.drink.coffee should feel
themselves lucky you weren't also drinking a cup of coffee at the
time.
--
\\ ( )
. _\\__[oo
.__/ \\ /\@
. l___\\
# ll l\\
###LL LL\\
------------------------------
Date: Sun, 04 Nov 2001 09:47:56 GMT
From: Andrew Cady <please@no.spam>
Subject: Re: Split output into multiple pages?
Message-Id: <87vggqq389.fsf@homer.cghm>
ccking@consultant.com (Charles King) writes:
> Have a problem with a script which prints designated fields from
> *all records* in a flat file database. The problem is how to split
> what gets printed into multiple pages with links at the bottom to
> the other pages.
[...]
> while (<DATABASE>)
> {
> $row = $_;
> chop $row;
> @stuff = split (/\|/, $row);
> &table_row; # prints designated fields in the record
$row shouldn't be global. Nor @stuff. Pass the values to print to
table_row.
> $count++;
> }
> close (DATABASE);
> $count--;
Why $count-- ? Start the count at 0. Unless it's a global you're
using in table_row, in which case don't do that.
> &table_footer;
> }
>
> print "$footertemplate";
>
> ____________________________________________________________________
>
> The following code works well to split the pages in a Search script.
> I'm hoping it could be adapted, or maybe there's a much simpler way?
[snip unreadable kludge]
I remember this code. Still trying to get that monstrosity to work,
huh? I think you should give up and try to write something on your
own from scratch... or find something better to start from at least.
Anyway, the basic idea is this:
    sub PAGERECS () { 30 }   # empty prototype, so "PAGERECS == 0" parses

    my $i = 0;
    while (<DATABASE>) {
        chomp;
        print_header() if ($i % PAGERECS == 0);
        print_row(split /\|/);
        print_footer() if (++$i % PAGERECS == 0);
    }
    print_footer() unless ($i % PAGERECS == 0); # print unless we just did
That's to split into pages, but print ALL RECORDS, which is what you
said you wanted, although it seems like it might make more sense to
print only a portion of the records, which is just as easy:
    my ($from, $to) = (30, 60); # print records [30,60[
    my $i = 0;
    while (<DATABASE>) {
        ++$i;
        next unless $i >= $from;
        last unless $i < $to;
        # do whatever...
    }
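
For the "links at the bottom" part of the question, a self-contained sketch along the same lines might look like this. It works on an array already read into memory; the sub names page_slice and page_links and the "page" query parameter are invented for the example and are not from the original script.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Return the records on one page plus the total number of pages.
# $page is 1-based; out-of-range values are clamped.
sub page_slice {
    my ($records, $page, $per_page) = @_;
    my $pages = ceil(@$records / $per_page) || 1;
    $page = 1      if $page < 1;
    $page = $pages if $page > $pages;
    my $from = ($page - 1) * $per_page;
    my $to   = $from + $per_page - 1;
    $to = $#$records if $to > $#$records;
    return ([ @{$records}[$from .. $to] ], $pages);
}

# Build the row of links for the bottom of a page.
# The current page is shown in brackets instead of as a link.
sub page_links {
    my ($page, $pages) = @_;
    return join ' ',
        map { $_ == $page ? "[$_]" : qq{<a href="?page=$_">$_</a>} } 1 .. $pages;
}

my @records = map { "record $_" } 1 .. 65;
my ($rows, $pages) = page_slice(\@records, 3, 30);
print "$_\n" for @$rows;                 # records 61 through 65
print page_links(3, $pages), "\n";
```

In a CGI script the page number would come from the request, and it should be validated as an integer before being used.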
------------------------------
Date: 4 Nov 2001 00:31:45 -0800
From: weiwe1@yeah.net (hugh1)
Subject: what means about "e" : $name=~ s/.../.../egi
Message-Id: <7dcf30ba.0111040031.5fb1e572@posting.google.com>
thanx
------------------------------
Date: Sun, 04 Nov 2001 09:43:18 +0100
From: Tassilo von Parseval <Tassilo.Parseval@post.rwth-aachen.de>
Subject: Re: what means about "e" : $name=~ s/.../.../egi
Message-Id: <3BE4FFA6.8060509@post.rwth-aachen.de>
hugh1 wrote:
[what means about "e" : $name=~ s/.../.../egi]
It means that the right side of the substitution is interpreted as Perl-code that
gets executed (in fact _e_valuated):
1)
    $a = "hello";
    $a =~ s/(.*)/uc($1)/e;

versus

2)
    $a = "hello";
    $a =~ s/(.*)/uc($1)/;

1) results in $a="HELLO"
2) results in $a="uc(hello)"
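
Since the subject line asks about all three modifiers together, here is a small additional sketch; the sample string and variable names are made up. /g repeats the substitution for every match and /i makes the match case-insensitive, while /e works as described above.

```perl
use strict;
use warnings;

my $name = "Total: 3 apples and 4 Pears";

# /e evaluates the replacement as code, /g applies it to every match
(my $doubled = $name) =~ s/(\d+)/$1 * 2/eg;

# /i also matches "Pears"; ucfirst(lc ...) normalizes each hit
(my $fixed = $name) =~ s/(apples|pears)/ucfirst(lc $1)/egi;

print "$doubled\n";   # Total: 6 apples and 8 Pears
print "$fixed\n";     # Total: 3 Apples and 4 Pears
```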
Tassilo
--
$a=[(74,116)];$b=[($a->[1]-1,$a->[1]++,0x20)];$c=[(97,110)];$d=[($c->
[1]+1,$b->[1],"her")];for(@{[$a,$b,$c,$d]}){for(@{$_}){$_=~/\d+/?print
(chr($_)):print;}}$c=sub{$l=shift;[(0x20+$l-1,0x50,0x65,0x73-0x01,108
),(0x20,0x68,0x61,)]};print(map{chr($_)}@{($c->(1))});$h={a=>33*3,b=>
10**2+7,c=>"1"."0"."1",d=>0162};@h=sort(keys(%$h));for(@h){print(chr(
ord(chr($h->{$_}))))};
------------------------------
Date: 6 Apr 2001 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 6 Apr 01)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address, I don't have time to
answer them even if I did know the answer.
------------------------------
End of Perl-Users Digest V10 Issue 2063
***************************************