Perl-Users Digest, Issue: 111 Volume: 9
daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Jul 11 14:17:32 1999
Date: Sun, 11 Jul 1999 11:10:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)
Perl-Users Digest Sun, 11 Jul 1999 Volume: 9 Number: 111
Today's topics:
Re: Random Numbers (Kai Henningsen)
Re: Random Numbers (Kai Henningsen)
Re: regex to eat all html tags (Kai Henningsen)
Re: WEB DEVELOPERS (Kai Henningsen)
Digest Administrivia (Last modified: 1 Jul 99) (Perl-Users-Digest Admin)
----------------------------------------------------------------------
Date: 11 Jul 1999 14:08:00 +0200
From: kaih=7Kg9yrTmw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: Random Numbers
Message-Id: <7Kg9yrTmw-B@khms.westfalen.de>
lr@hpl.hp.com (Larry Rosler) wrote on 08.07.99 in <MPG.11ee921a92212f97989c75@nntp.hpl.hp.com>:
> The little program below will demonstrate the problem very clearly. A
> quicker way to see it is to do the command `perl -V:randbits`, which
> gives 15 on each of the systems I have tried.
$ perl -V:randbits
randbits='31';
$ uname -a
Linux khms.westfalen.de 2.2.7 #1 Son Mai 2 17:44:43 CEST 1999 i486 unknown
$ ldd $( which perl )
/lib/nfslock.so.0 => /lib/nfslock.so.0 (0x40001000)
libnsl.so.1 => /lib/libnsl.so.1 (0x40009000)
libdb.so.2 => /lib/libdb.so.2 (0x4001e000)
libgdbm.so.1 => /usr/lib/libgdbm.so.1 (0x4002b000)
libdl.so.2 => /lib/libdl.so.2 (0x40031000)
libm.so.6 => /lib/libm.so.6 (0x40034000)
libc.so.6 => /lib/libc.so.6 (0x40050000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0x4013c000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x2aaaa000)
$ perl -V
Summary of my perl5 (5.0 patchlevel 4 subversion 4) configuration:
Platform:
osname=linux, osvers=2.0.36, archname=i386-linux
uname='linux perv 2.0.36 #2 wed nov 18 03:00:48 pst 1998 i686 unknown '
hint=recommended, useposix=true, d_sigaction=define
bincompat3=n useperlio=undef d_sfio=undef
Compiler:
cc='cc', optimize='-O2', gccversion=2.7.2.3
cppflags='-Dbool=char -DHAS_BOOL -D_REENTRANT'
ccflags ='-Dbool=char -DHAS_BOOL -D_REENTRANT'
stdchar='char', d_stdstdio=define, usevfork=false
voidflags=15, castflags=0, d_casti32=define, d_castneg=define
intsize=4, alignbytes=4, usemymalloc=n, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lndbm -lgdbm -ldbm -ldb -ldl -lm -lc -lposix -lcrypt
libc=, so=so
useshrplib=false, libperl=libperl.a
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'
Characteristics of this binary (from libperl):
Built under linux
Compiled at Feb 3 1999 00:52:40
@INC:
/usr/lib/perl5/i386-linux/5.004
/usr/lib/perl5
/usr/local/lib/site_perl/i386-linux
/usr/local/lib/site_perl
.
$
> #!/usr/local/bin/perl -w
> use strict;
>
> @_{map sprintf('%07d', rand 10_000_000) => 1 .. 100_000} = ();
> print scalar keys %_, "\n";
99520
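As a rough check on the figures in this exchange, the count printed by the quoted program can be compared with what a uniform generator should give. The sketch below is not from the thread; the helper names possible_values and expected_distinct are made up for illustration, and it assumes the formatted result of rand is uniformly distributed over the values the generator can actually produce, which is only approximately true.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;

# Number of different strings sprintf('%07d', rand 10_000_000) can yield
# when rand() has only $bits bits of resolution (hypothetical helper).
sub possible_values {
    my ($bits) = @_;
    my $from_generator = 2 ** $bits;
    return $from_generator < 10_000_000 ? $from_generator : 10_000_000;
}

# Expected count of distinct results after $draws uniform draws from
# $values equally likely outcomes: N * (1 - (1 - 1/N)**n).
sub expected_distinct {
    my ($values, $draws) = @_;
    return $values * (1 - exp($draws * log(1 - 1 / $values)));
}

# 15 and 31 are the values reported in the thread; the last entry is
# whatever this perl was built with (the same value `perl -V:randbits` shows).
for my $bits (15, 31, $Config{randbits}) {
    printf "randbits=%-2d  expect about %.0f distinct keys out of 100000\n",
        $bits, expected_distinct(possible_values($bits), 100_000);
}
```

With 15 bits the count can never exceed 32768, however many draws are made. The 99520 reported above is close to the roughly 99,500 expected from a generator that can reach all ten million values.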
Kai
--
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
- Russ Allbery (rra@stanford.edu)
------------------------------
Date: 11 Jul 1999 14:19:00 +0200
From: kaih=7Kg9z571w-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: Random Numbers
Message-Id: <7Kg9z571w-B@khms.westfalen.de>
gellyfish@gellyfish.com (Jonathan Stowe) wrote on 09.07.99 in <7m5vqt$1vl$1@gellyfish.btinternet.com>:
> On 09 Jul 1999 15:07:54 -0400 Uri Guttman wrote:
> >>>>>> "LR" == Larry Rosler <lr@hpl.hp.com> writes:
> >
> > LR> In article <378638DB.1C12D9C2@mail.cor.epa.gov> on Fri, 09 Jul 1999
> > LR> 11:00:59 -0700, David Cassell <cassell@mail.cor.epa.gov> says...
> > LR> ...
> > >> Math::TrulyRandom may not be random anyway, merely chaotic.
> > >> I haven't seen an adequate analysis of it.
> >
> > LR> Hello! There are no random-number algorithms, by definition. Special
> > LR> hardware (such as a radioactive-decay detector) is required. See Knuth,
> >
> > how would you classify /dev/rand which is based on data in the kernel? i
> > don't know exactly how it creates the numbers, but it is not predictive
> > as it is influenced by random input like kernel interrupts.
> >
>
> From :
>
>
> RANDOM(4) Linux Programmer's Manual RANDOM(4)
>
> <snip description of /dev stuff>
>
> The random number generator gathers environmental noise
> from device drivers and other sources into an entropy
> pool. The generator also keeps an estimate of the number
> of bits of noise in the entropy pool. From this
> entropy pool random numbers are created.
>
> I dunno I no mathmetician ...
From drivers/char/random.c:
/*
* random.c -- A strong random number generator
*
* Version 1.04, last modified 26-Apr-98
*
* Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998. All rights
* reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, and the entire permission notice in its entirety,
* including the disclaimer of warranties.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* 3. The name of the author may not be used to endorse or promote
* products derived from this software without specific prior
* written permission.
*
* ALTERNATIVELY, this product may be distributed under the terms of
* the GNU Public License, in which case the provisions of the GPL are
* required INSTEAD OF the above restrictions. (This clause is
* necessary due to a potential bad interaction between the GPL and
* the restrictions contained in a BSD-style copyright.)
*
* THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
* WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
* DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
* INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
* (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
* STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
* OF THE POSSIBILITY OF SUCH DAMAGE.
*/
/*
* (now, with legal B.S. out of the way.....)
*
* This routine gathers environmental noise from device drivers, etc.,
* and returns good random numbers, suitable for cryptographic use.
* Besides the obvious cryptographic uses, these numbers are also good
* for seeding TCP sequence numbers, and other places where it is
* desirable to have numbers which are not only random, but hard to
* predict by an attacker.
*
* Theory of operation
* ===================
*
* Computers are very predictable devices. Hence it is extremely hard
* to produce truly random numbers on a computer --- as opposed to
* pseudo-random numbers, which can easily be generated by using an
* algorithm. Unfortunately, it is very easy for attackers to guess
* the sequence of pseudo-random number generators, and for some
* applications this is not acceptable. So instead, we must try to
* gather "environmental noise" from the computer's environment, which
* must be hard for outside attackers to observe, and use that to
* generate random numbers. In a Unix environment, this is best done
* from inside the kernel.
*
* Sources of randomness from the environment include inter-keyboard
* timings, inter-interrupt timings from some interrupts, and other
* events which are both (a) non-deterministic and (b) hard for an
* outside observer to measure. Randomness from these sources is
* added to an "entropy pool", which is mixed using a CRC-like function.
* This is not cryptographically strong, but it is adequate assuming
* the randomness is not chosen maliciously, and it is fast enough that
* the overhead of doing it on every interrupt is very reasonable.
* As random bytes are mixed into the entropy pool, the routines keep
* an *estimate* of how many bits of randomness have been stored into
* the random number generator's internal state.
*
* When random bytes are desired, they are obtained by taking the SHA
* hash of the contents of the "entropy pool". The SHA hash avoids
* exposing the internal state of the entropy pool. It is believed to
* be computationally infeasible to derive any useful information
* about the input of SHA from its output. Even if it is possible to
* analyze SHA in some clever way, as long as the amount of data
* returned from the generator is less than the inherent entropy in
* the pool, the output data is totally unpredictable. For this
* reason, the routine decreases its internal estimate of how many
* bits of "true randomness" are contained in the entropy pool as it
* outputs random numbers.
*
* If this estimate goes to zero, the routine can still generate
* random numbers; however, an attacker may (at least in theory) be
* able to infer the future output of the generator from prior
* outputs. This requires successful cryptanalysis of SHA, which is
* not believed to be feasible, but there is a remote possibility.
* Nonetheless, these numbers should be useful for the vast majority
* of purposes.
*
* Exported interfaces ---- output
* ===============================
*
* There are three exported interfaces; the first is one designed to
* be used from within the kernel:
*
* void get_random_bytes(void *buf, int nbytes);
*
* This interface will return the requested number of random bytes,
* and place it in the requested buffer.
*
* The two other interfaces are two character devices /dev/random and
* /dev/urandom. /dev/random is suitable for use when very high
* quality randomness is desired (for example, for key generation or
* one-time pads), as it will only return a maximum of the number of
* bits of randomness (as estimated by the random number generator)
* contained in the entropy pool.
*
* The /dev/urandom device does not have this limit, and will return
* as many bytes as are requested. As more and more random bytes are
* requested without giving time for the entropy pool to recharge,
* this will result in random numbers that are merely cryptographically
* strong. For many applications, however, this is acceptable.
*
* Exported interfaces ---- input
* ==============================
*
* The current exported interfaces for gathering environmental noise
* from the devices are:
*
* void add_keyboard_randomness(unsigned char scancode);
* void add_mouse_randomness(__u32 mouse_data);
* void add_interrupt_randomness(int irq);
* void add_blkdev_randomness(int irq);
*
* add_keyboard_randomness() uses the inter-keypress timing, as well as the
* scancode as random inputs into the "entropy pool".
*
* add_mouse_randomness() uses the mouse interrupt timing, as well as
* the reported position of the mouse from the hardware.
*
* add_interrupt_randomness() uses the inter-interrupt timing as random
* inputs to the entropy pool. Note that not all interrupts are good
* sources of randomness! For example, the timer interrupt is not a
* good choice, because the periodicity of the interrupts is too
* regular, and hence predictable to an attacker. Disk interrupts are
* a better measure, since the timing of the disk interrupts is more
* unpredictable.
*
* add_blkdev_randomness() times the finishing time of block requests.
*
* All of these routines try to estimate how many bits of randomness a
* particular randomness source contributes. They do this by keeping track of the
* first and second order deltas of the event timings.
*
* Ensuring unpredictability at system startup
* ============================================
*
* When any operating system starts up, it will go through a sequence
* of actions that are fairly predictable by an adversary, especially
* if the start-up does not involve interaction with a human operator.
* This reduces the actual number of bits of unpredictability in the
* entropy pool below the value in entropy_count. In order to
* counteract this effect, it helps to carry information in the
* entropy pool across shut-downs and start-ups. To do this, put the
* following lines in an appropriate script which is run during the boot
* sequence:
*
* echo "Initializing random number generator..."
* random_seed=/var/run/random-seed
* # Carry a random seed from start-up to start-up
* # Load and then save 512 bytes, which is the size of the entropy pool
* if [ -f $random_seed ]; then
* cat $random_seed >/dev/urandom
* fi
* dd if=/dev/urandom of=$random_seed count=1
* chmod 600 $random_seed
*
* and the following lines in an appropriate script which is run as
* the system is shutdown:
*
* # Carry a random seed from shut-down to start-up
* # Save 512 bytes, which is the size of the entropy pool
* echo "Saving random seed..."
* random_seed=/var/run/random-seed
* dd if=/dev/urandom of=$random_seed count=1
* chmod 600 $random_seed
*
* For example, on most modern systems using the System V init
* scripts, such code fragments would be found in
* /etc/rc.d/init.d/random. On older Linux systems, the correct script
* location might be in /etc/rc.d/rc.local or /etc/rc.d/rc.0.
*
* Effectively, these commands cause the contents of the entropy pool
* to be saved at shut-down time and reloaded into the entropy pool at
* start-up. (The 'dd' in the addition to the bootup script is to
* make sure that /var/run/random-seed is different for every start-up,
* even if the system crashes without executing rc.0.) Even with
* complete knowledge of the start-up activities, predicting the state
* of the entropy pool requires knowledge of the previous history of
* the system.
*
* Configuring the /dev/random driver under Linux
* ==============================================
*
* The /dev/random driver under Linux uses minor numbers 8 and 9 of
* the /dev/mem major number (#1). So if your system does not have
* /dev/random and /dev/urandom created already, they can be created
* by using the commands:
*
* mknod /dev/random c 1 8
* mknod /dev/urandom c 1 9
*
* Acknowledgements:
* =================
*
* Ideas for constructing this random number generator were derived
* from Pretty Good Privacy's random number generator, and from private
* discussions with Phil Karn. Colin Plumb provided a faster random
* number generator, which sped up the mixing function of the entropy
* pool, taken from PGPfone. Dale Worley has also contributed many
* useful ideas and suggestions to improve this driver.
*
* Any flaws in the design are solely my responsibility, and should
* not be attributed to Phil, Colin, or any of the authors of PGP.
*
* The code for SHA transform was taken from Peter Gutmann's
* implementation, which has been placed in the public domain.
* The code for MD5 transform was taken from Colin Plumb's
* implementation, which has been placed in the public domain. The
* MD5 cryptographic checksum was devised by Ronald Rivest, and is
* documented in RFC 1321, "The MD5 Message Digest Algorithm".
*
* Further background information on this topic may be obtained from
* RFC 1750, "Randomness Recommendations for Security", by Donald
* Eastlake, Steve Crocker, and Jeff Schiller.
*/
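For readers who want to use the character devices described in the comment above from Perl, here is a minimal sketch of reading /dev/urandom. It is not part of the original post. The helper name random_bytes is hypothetical, and the code assumes a system that provides /dev/urandom and a Perl recent enough to accept an I/O layer in three-argument open.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read exactly $count bytes from a random device (hypothetical helper).
# The device defaults to /dev/urandom, which per the comment above
# returns as many bytes as are requested.
sub random_bytes {
    my ($count, $device) = @_;
    $device = '/dev/urandom' unless defined $device;

    open my $fh, '<:raw', $device
        or die "Cannot open $device: $!";

    my $buf = '';
    while (length($buf) < $count) {
        # Append to $buf at its current end; a read may return fewer
        # bytes than asked for, so loop until the buffer is full.
        my $got = read($fh, $buf, $count - length($buf), length($buf));
        die "Read from $device failed: $!" unless defined $got;
        die "Unexpected end of file on $device" if $got == 0;
    }
    close $fh;
    return $buf;
}

# Print 16 random bytes as hex.
print unpack('H*', random_bytes(16)), "\n";

# Seed Perl's own generator from four kernel-supplied bytes.
srand(unpack('L', random_bytes(4)));
```

Passing '/dev/random' as the second argument selects the other device instead; by the description above it hands out no more bits than the kernel's entropy estimate allows, so it suits small, high-value requests such as key generation.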
[snip actual code, about 1700 lines]
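The comment says the input routines estimate entropy from the first and second order deltas of the event timings. One way such an estimate could work is sketched below. This is an illustration under stated assumptions, not the snipped kernel code: the name make_estimator, the 12-bit cap, and the use of log2 of the smaller delta are all choices made here for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns a closure that credits each event time with an entropy estimate.
sub make_estimator {
    my $max_bits = 12;                 # assumed cap, chosen for illustration
    my ($last_time, $last_delta) = (0, 0);

    return sub {
        my ($time) = @_;               # integer timestamp of the event
        my $delta  = $time - $last_time;       # first-order delta
        my $delta2 = $delta - $last_delta;     # second-order delta
        ($last_time, $last_delta) = ($time, $delta);

        # Take the smaller magnitude, so a steady rhythm earns nothing.
        my ($d1, $d2) = (abs($delta), abs($delta2));
        my $smaller = $d1 < $d2 ? $d1 : $d2;

        # Credit roughly log2 of that value, in whole bits.
        my $bits = 0;
        $bits++ while $smaller >>= 1;
        return $bits < $max_bits ? $bits : $max_bits;
    };
}

my $regular   = make_estimator();
my $irregular = make_estimator();
print "regular:   ", join(' ', map { $regular->($_) }   (100, 200, 300, 400)),  "\n";
print "irregular: ", join(' ', map { $irregular->($_) } (100, 250, 1000, 1003)), "\n";
```

In this sketch, perfectly periodic events are credited with zero bits once the period is established, because the second-order delta is zero. That matches the comment's remark that the timer interrupt is a poor source.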
Kai
------------------------------
Date: 11 Jul 1999 17:45:00 +0200
From: kaih=7KgA0-5Hw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: regex to eat all html tags
Message-Id: <7KgA0-5Hw-B@khms.westfalen.de>
abigail@delanet.com (Abigail) wrote on 06.07.99 in <slrn7o59qa.tch.abigail@alexandra.delanet.com>:
> Here are a few things where your simplistic approach fails utterly:
>
> <IMG SRC = "a_greater_b.gif" ALT = "a > b">
> <!-- <IMG SRC = "foo.gif"> -->
> We break a line using <[CDATA [ <br> ]]>
> <# This is text, not a tag #>
> <SCRIPT>document.write ("<BR>")</SCRIPT>
Hmmm ... what would be needed to make that legal?
nsgmls:test:2:0:E: no document type declaration; will parse without validation
nsgmls:test:4:37:E: marked section end not in marked section declaration
nsgmls:test:6:39:E: end tag for "BR" omitted, but its declaration does not permit this
nsgmls:test:6:25: start tag was here
nsgmls:test:7:1:E: end tag for "BR" omitted, but its declaration does not permit this
nsgmls:test:4:32: start tag was here
nsgmls:test:7:1:E: end tag for "IMG" omitted, but its declaration does not permit this
nsgmls:test:2:0: start tag was here
ASRC CDATA a_greater_b.gif
AALT CDATA a > b
(IMG
-We break a line using <[CDATA [
(BR
- \n<# This is text, not a tag #>\n
(SCRIPT
-document.write ("
(BR
-")
)BR
)SCRIPT
-\n
)BR
)IMG
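To make the quoted failure cases concrete, here is a small sketch (not from the thread) that runs two tag-stripping substitutions over some of Abigail's examples. The function names strip_naive and strip_quote_aware are made up for illustration. The second pattern is only a partial improvement that skips over quoted attribute values; it is not offered as a correct HTML stripper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The simplistic approach: delete everything from '<' to the next '>'.
sub strip_naive {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;
    return $text;
}

# A slightly better attempt: treat quoted strings inside a tag as units,
# so a '>' inside an attribute value does not end the tag.
sub strip_quote_aware {
    my ($text) = @_;
    $text =~ s/<(?:[^>"']|"[^"]*"|'[^']*')*>//g;
    return $text;
}

my @samples = (
    '<IMG SRC = "a_greater_b.gif" ALT = "a > b">',
    '<!-- <IMG SRC = "foo.gif"> -->',
    '<SCRIPT>document.write ("<BR>")</SCRIPT>',
);

for my $html (@samples) {
    printf "%-12s [%s]\n", 'input:',       $html;
    printf "%-12s [%s]\n", 'naive:',       strip_naive($html);
    printf "%-12s [%s]\n", 'quote-aware:', strip_quote_aware($html);
    print "\n";
}
```

The naive pattern stops at the '>' inside the ALT value and leaves part of the tag behind. The quote-aware pattern handles that case but still mangles the comment and the SCRIPT content, which is the point of the list above: a real parser, such as the HTML::Parser module from CPAN, is the safer route.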
Kai
------------------------------
Date: 11 Jul 1999 16:28:00 +0200
From: kaih=7Kg9-b0mw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: WEB DEVELOPERS
Message-Id: <7Kg9-b0mw-B@khms.westfalen.de>
cassell@mail.cor.epa.gov (David Cassell) wrote on 09.07.99 in <378675D9.6476C66E@mail.cor.epa.gov>:
> Jonathan Stowe wrote:
> > [snip]
> > The funny thing about this is that I got a nice message from a cancelbot
> > operating in the it.* hierarchy - so presumably the original post was
> > likewise cancelled. I wonder if we could get such a 'bot together here
> > ...
>
> Sure we could... But once TomC 'improved' it, it would cancel
> every post that included any of the following phrases:
> 'probably a FAQ'
> 'didn\'t read the FAQ'
> 'newbie'
> 'write me a script'
> .
> .
> .
>
> Wait, on second thought I'm liking this bot idea more and more...
>
> [Now John Stanley will walk over here and smack me one. :-]
If John doesn't, let me do it instead.
> David, ducking in advance :-)
Nice of you to tell me where to aim:
*SMACK*
Kai
------------------------------
Date: 1 Jul 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin)
Subject: Digest Administrivia (Last modified: 1 Jul 99)
Message-Id: <null>
Administrivia:
The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc. For subscription or unsubscription requests, send
the single line:
subscribe perl-users
or:
unsubscribe perl-users
to almanac@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.
To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.
To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.
The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.
The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.
For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.
------------------------------
End of Perl-Users Digest V9 Issue 111
*************************************