[12702] in Perl-Users-Digest


Perl-Users Digest, Issue: 111 Volume: 9

daemon@ATHENA.MIT.EDU (Perl-Users Digest)
Sun Jul 11 14:17:32 1999

Date: Sun, 11 Jul 1999 11:10:09 -0700 (PDT)
From: Perl-Users Digest <Perl-Users-Request@ruby.OCE.ORST.EDU>
To: Perl-Users@ruby.OCE.ORST.EDU (Perl-Users Digest)

Perl-Users Digest           Sun, 11 Jul 1999     Volume: 9 Number: 111

Today's topics:
    Re: Random Numbers (Kai Henningsen)
    Re: Random Numbers (Kai Henningsen)
    Re: regex to eat all html tags (Kai Henningsen)
    Re: WEB DEVELOPERS (Kai Henningsen)
        Digest Administrivia (Last modified: 1 Jul 99) (Perl-Users-Digest Admin)

----------------------------------------------------------------------

Date: 11 Jul 1999 14:08:00 +0200
From: kaih=7Kg9yrTmw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: Random Numbers
Message-Id: <7Kg9yrTmw-B@khms.westfalen.de>

lr@hpl.hp.com (Larry Rosler)  wrote on 08.07.99 in <MPG.11ee921a92212f97989c75@nntp.hpl.hp.com>:

> The little program below will demonstrate the problem very clearly.  A
> quicker way to see it is to do the command `perl -V:randbits`, which
> gives 15 on each of the systems I have tried.

$ perl -V:randbits
randbits='31';
$ uname -a
Linux khms.westfalen.de 2.2.7 #1 Son Mai 2 17:44:43 CEST 1999 i486 unknown
$ ldd $( which perl )
	/lib/nfslock.so.0 => /lib/nfslock.so.0 (0x40001000)
	libnsl.so.1 => /lib/libnsl.so.1 (0x40009000)
	libdb.so.2 => /lib/libdb.so.2 (0x4001e000)
	libgdbm.so.1 => /usr/lib/libgdbm.so.1 (0x4002b000)
	libdl.so.2 => /lib/libdl.so.2 (0x40031000)
	libm.so.6 => /lib/libm.so.6 (0x40034000)
	libc.so.6 => /lib/libc.so.6 (0x40050000)
	libcrypt.so.1 => /lib/libcrypt.so.1 (0x4013c000)
	/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x2aaaa000)
$ perl -V
Summary of my perl5 (5.0 patchlevel 4 subversion 4) configuration:
  Platform:
    osname=linux, osvers=2.0.36, archname=i386-linux
    uname='linux perv 2.0.36 #2 wed nov 18 03:00:48 pst 1998 i686 unknown '
    hint=recommended, useposix=true, d_sigaction=define
    bincompat3=n useperlio=undef d_sfio=undef
  Compiler:
    cc='cc', optimize='-O2', gccversion=2.7.2.3
    cppflags='-Dbool=char -DHAS_BOOL -D_REENTRANT'
    ccflags ='-Dbool=char -DHAS_BOOL -D_REENTRANT'
    stdchar='char', d_stdstdio=define, usevfork=false
    voidflags=15, castflags=0, d_casti32=define, d_castneg=define
    intsize=4, alignbytes=4, usemymalloc=n, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldbm -ldb -ldl -lm -lc -lposix -lcrypt
    libc=, so=so
    useshrplib=false, libperl=libperl.a
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'


Characteristics of this binary (from libperl):
  Built under linux
  Compiled at Feb  3 1999 00:52:40
  @INC:
    /usr/lib/perl5/i386-linux/5.004
    /usr/lib/perl5
    /usr/local/lib/site_perl/i386-linux
    /usr/local/lib/site_perl
    .
$

> #!/usr/local/bin/perl -w
> use strict;
>
> @_{map sprintf('%07d', rand 10_000_000) => 1 .. 100_000} = ();
> print scalar keys %_, "\n";

99520
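
For comparison, a rough sketch of the count one would expect from that
program if rand() were ideal.  Drawing n values uniformly from N equally
likely outcomes yields about N * (1 - exp(-n/N)) distinct values.  The
name expected_distinct is made up for this illustration, and awk is used
only to do the arithmetic.

```shell
# Expected number of distinct values when n draws are made uniformly
# from N equally likely outcomes: N * (1 - exp(-n/N)).
# (Hypothetical helper; assumes an ideal, uniform generator.)
expected_distinct() {
    awk -v N="$1" -v n="$2" 'BEGIN { printf "%.0f\n", N * (1 - exp(-n / N)) }'
}

# 100_000 draws over the 10_000_000 possible seven-digit strings
expected_distinct 10000000 100000    # prints 99502

# If rand() delivers only 15 bits, at most 2**15 = 32768 values can occur
expected_distinct 32768 100000       # prints 31219
```

So 99520 is about what an ideal generator would give, whereas a 15-bit
rand() cannot get past 32768 distinct values however many draws are made.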


Kai
--
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
  - Russ Allbery (rra@stanford.edu)


------------------------------

Date: 11 Jul 1999 14:19:00 +0200
From: kaih=7Kg9z571w-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: Random Numbers
Message-Id: <7Kg9z571w-B@khms.westfalen.de>

gellyfish@gellyfish.com (Jonathan Stowe)  wrote on 09.07.99 in <7m5vqt$1vl$1@gellyfish.btinternet.com>:

> On 09 Jul 1999 15:07:54 -0400 Uri Guttman wrote:
> >>>>>> "LR" == Larry Rosler <lr@hpl.hp.com> writes:
> >
> >   LR> In article <378638DB.1C12D9C2@mail.cor.epa.gov> on Fri, 09 Jul 1999
> >   LR> 11:00:59 -0700, David Cassell <cassell@mail.cor.epa.gov> says...
> >   LR> ...
> >   >> Math::TrulyRandom may not be random anyway, merely chaotic.
> >   >> I haven't seen an adequate analysis of it.
> >
> >   LR> Hello!  There are no random-number algorithms, by definition.
> >   LR> Special hardware (such as a radioactive-decay detector) is
> >   LR> required.  See Knuth,
> >
> > how would you classify /dev/rand which is based on data in the kernel? i
> > don't know exactly how it creates the numbers, but it is not predictive
> > as it is influenced by random input like kernel interrupts.
> >
>
> From :
>
>
> RANDOM(4)           Linux Programmer's Manual           RANDOM(4)
>
> <snip description of /dev stuff>
>
>        The random number generator  gathers  environmental  noise
>        from  device  drivers  and  other  sources into an entropy
>        pool.  The generator also keeps an estimate of the  number
>        of  bits of  the  noise  in  the  entropy pool.  From this
>        entropy pool random numbers are created.
>
> I dunno I no mathmetician ...

From drivers/char/random.c:

/*
 * random.c -- A strong random number generator
 *
 * Version 1.04, last modified 26-Apr-98
 *
 * Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998.  All rights
 * reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, and the entire permission notice in its entirety,
 *    including the disclaimer of warranties.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. The name of the author may not be used to endorse or promote
 *    products derived from this software without specific prior
 *    written permission.
 *
 * ALTERNATIVELY, this product may be distributed under the terms of
 * the GNU Public License, in which case the provisions of the GPL are
 * required INSTEAD OF the above restrictions.  (This clause is
 * necessary due to a potential bad interaction between the GPL and
 * the restrictions contained in a BSD-style copyright.)
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
 * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
 * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
 * OF THE POSSIBILITY OF SUCH DAMAGE.
 */

/*
 * (now, with legal B.S. out of the way.....)
 *
 * This routine gathers environmental noise from device drivers, etc.,
 * and returns good random numbers, suitable for cryptographic use.
 * Besides the obvious cryptographic uses, these numbers are also good
 * for seeding TCP sequence numbers, and other places where it is
 * desirable to have numbers which are not only random, but hard to
 * predict by an attacker.
 *
 * Theory of operation
 * ===================
 *
 * Computers are very predictable devices.  Hence it is extremely hard
 * to produce truly random numbers on a computer --- as opposed to
 * pseudo-random numbers, which can easily be generated by using an
 * algorithm.  Unfortunately, it is very easy for attackers to guess
 * the sequence of pseudo-random number generators, and for some
 * applications this is not acceptable.  So instead, we must try to
 * gather "environmental noise" from the computer's environment, which
 * must be hard for outside attackers to observe, and use that to
 * generate random numbers.  In a Unix environment, this is best done
 * from inside the kernel.
 *
 * Sources of randomness from the environment include inter-keyboard
 * timings, inter-interrupt timings from some interrupts, and other
 * events which are both (a) non-deterministic and (b) hard for an
 * outside observer to measure.  Randomness from these sources is
 * added to an "entropy pool", which is mixed using a CRC-like function.
 * This is not cryptographically strong, but it is adequate assuming
 * the randomness is not chosen maliciously, and it is fast enough that
 * the overhead of doing it on every interrupt is very reasonable.
 * As random bytes are mixed into the entropy pool, the routines keep
 * an *estimate* of how many bits of randomness have been stored into
 * the random number generator's internal state.
 *
 * When random bytes are desired, they are obtained by taking the SHA
 * hash of the contents of the "entropy pool".  The SHA hash avoids
 * exposing the internal state of the entropy pool.  It is believed to
 * be computationally infeasible to derive any useful information
 * about the input of SHA from its output.  Even if it is possible to
 * analyze SHA in some clever way, as long as the amount of data
 * returned from the generator is less than the inherent entropy in
 * the pool, the output data is totally unpredictable.  For this
 * reason, the routine decreases its internal estimate of how many
 * bits of "true randomness" are contained in the entropy pool as it
 * outputs random numbers.
 *
 * If this estimate goes to zero, the routine can still generate
 * random numbers; however, an attacker may (at least in theory) be
 * able to infer the future output of the generator from prior
 * outputs.  This requires successful cryptanalysis of SHA, which is
 * not believed to be feasible, but there is a remote possibility.
 * Nonetheless, these numbers should be useful for the vast majority
 * of purposes.
 *
 * Exported interfaces ---- output
 * ===============================
 *
 * There are three exported interfaces; the first is one designed to
 * be used from within the kernel:
 *
 * 	void get_random_bytes(void *buf, int nbytes);
 *
 * This interface will return the requested number of random bytes,
 * and place it in the requested buffer.
 *
 * The two other interfaces are two character devices /dev/random and
 * /dev/urandom.  /dev/random is suitable for use when very high
 * quality randomness is desired (for example, for key generation or
 * one-time pads), as it will only return a maximum of the number of
 * bits of randomness (as estimated by the random number generator)
 * contained in the entropy pool.
 *
 * The /dev/urandom device does not have this limit, and will return
 * as many bytes as are requested.  As more and more random bytes are
 * requested without giving time for the entropy pool to recharge,
 * this will result in random numbers that are merely cryptographically
 * strong.  For many applications, however, this is acceptable.
 *
 * Exported interfaces ---- input
 * ==============================
 *
 * The current exported interfaces for gathering environmental noise
 * from the devices are:
 *
 * 	void add_keyboard_randomness(unsigned char scancode);
 * 	void add_mouse_randomness(__u32 mouse_data);
 * 	void add_interrupt_randomness(int irq);
 * 	void add_blkdev_randomness(int irq);
 *
 * add_keyboard_randomness() uses the inter-keypress timing, as well as the
 * scancode as random inputs into the "entropy pool".
 *
 * add_mouse_randomness() uses the mouse interrupt timing, as well as
 * the reported position of the mouse from the hardware.
 *
 * add_interrupt_randomness() uses the inter-interrupt timing as random
 * inputs to the entropy pool.  Note that not all interrupts are good
 * sources of randomness!  For example, the timer interrupt is not a
 * good choice, because the periodicity of the interrupts is too
 * regular, and hence predictable to an attacker.  Disk interrupts are
 * a better measure, since the timing of the disk interrupts is more
 * unpredictable.
 *
 * add_blkdev_randomness() times the finishing time of block requests.
 *
 * All of these routines try to estimate how many bits of randomness a
 * particular randomness source provides.  They do this by keeping track of the
 * first and second order deltas of the event timings.
 *
 * Ensuring unpredictability at system startup
 * ============================================
 *
 * When any operating system starts up, it will go through a sequence
 * of actions that are fairly predictable by an adversary, especially
 * if the start-up does not involve interaction with a human operator.
 * This reduces the actual number of bits of unpredictability in the
 * entropy pool below the value in entropy_count.  In order to
 * counteract this effect, it helps to carry information in the
 * entropy pool across shut-downs and start-ups.  To do this, put the
 * following lines in an appropriate script which is run during the boot
 * sequence:
 *
 *	echo "Initializing random number generator..."
 * 	random_seed=/var/run/random-seed
 *	# Carry a random seed from start-up to start-up
 *	# Load and then save 512 bytes, which is the size of the entropy pool
 * 	if [ -f $random_seed ]; then
 *		cat $random_seed >/dev/urandom
 * 	fi
 *	dd if=/dev/urandom of=$random_seed count=1
 * 	chmod 600 $random_seed
 *
 * and the following lines in an appropriate script which is run as
 * the system is shutdown:
 *
 *	# Carry a random seed from shut-down to start-up
 *	# Save 512 bytes, which is the size of the entropy pool
 *	echo "Saving random seed..."
 * 	random_seed=/var/run/random-seed
 *	dd if=/dev/urandom of=$random_seed count=1
 * 	chmod 600 $random_seed
 *
 * For example, on most modern systems using the System V init
 * scripts, such code fragments would be found in
 * /etc/rc.d/init.d/random.  On older Linux systems, the correct script
 * location might be in /etc/rcb.d/rc.local or /etc/rc.d/rc.0.
 *
 * Effectively, these commands cause the contents of the entropy pool
 * to be saved at shut-down time and reloaded into the entropy pool at
 * start-up.  (The 'dd' in the addition to the bootup script is to
 * make sure that /var/run/random-seed is different for every start-up,
 * even if the system crashes without executing rc.0.)  Even with
 * complete knowledge of the start-up activities, predicting the state
 * of the entropy pool requires knowledge of the previous history of
 * the system.
 *
 * Configuring the /dev/random driver under Linux
 * ==============================================
 *
 * The /dev/random driver under Linux uses minor numbers 8 and 9 of
 * the /dev/mem major number (#1).  So if your system does not have
 * /dev/random and /dev/urandom created already, they can be created
 * by using the commands:
 *
 * 	mknod /dev/random c 1 8
 * 	mknod /dev/urandom c 1 9
 *
 * Acknowledgements:
 * =================
 *
 * Ideas for constructing this random number generator were derived
 * from Pretty Good Privacy's random number generator, and from private
 * discussions with Phil Karn.  Colin Plumb provided a faster random
 * number generator, which sped up the mixing function of the entropy
 * pool, taken from PGPfone.  Dale Worley has also contributed many
 * useful ideas and suggestions to improve this driver.
 *
 * Any flaws in the design are solely my responsibility, and should
 * not be attributed to Phil, Colin, or any of the authors of PGP.
 *
 * The code for SHA transform was taken from Peter Gutmann's
 * implementation, which has been placed in the public domain.
 * The code for MD5 transform was taken from Colin Plumb's
 * implementation, which has been placed in the public domain.  The
 * MD5 cryptographic checksum was devised by Ronald Rivest, and is
 * documented in RFC 1321, "The MD5 Message Digest Algorithm".
 *
 * Further background information on this topic may be obtained from
 * RFC 1750, "Randomness Recommendations for Security", by Donald
 * Eastlake, Steve Crocker, and Jeff Schiller.
 */

[snip actual code, about 1700 lines]
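
For the question that started this thread, the practical upshot is that
a program can simply read bytes from the device.  A minimal sketch,
assuming a system where /dev/urandom exists and od accepts the POSIX
-A, -N and -t options; rand32 is a made-up name:

```shell
# Print one unsigned 32-bit integer built from 4 bytes of /dev/urandom.
# (Hypothetical helper; assumes /dev/urandom is present and readable.)
rand32() {
    # -An: no offset column, -N4: read 4 bytes, -tu4: one unsigned 4-byte int
    od -An -N4 -tu4 /dev/urandom | tr -d ' \n'
    echo
}

rand32
```

Reading from /dev/random instead gives the stricter behaviour described
in the comment above, at the price that a read may have to wait until
the pool has gathered more entropy.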


Kai
-- 
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
  - Russ Allbery (rra@stanford.edu)


------------------------------

Date: 11 Jul 1999 17:45:00 +0200
From: kaih=7KgA0-5Hw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: regex to eat all html tags
Message-Id: <7KgA0-5Hw-B@khms.westfalen.de>

abigail@delanet.com (Abigail)  wrote on 06.07.99 in <slrn7o59qa.tch.abigail@alexandra.delanet.com>:

> Here are a few things where your simplistic approach fails utterly:
>
> <IMG SRC = "a_greater_b.gif" ALT = "a > b">
> <!-- <IMG SRC = "foo.gif"> -->
> We break a line using <[CDATA [ <br> ]]>
> <# This is text, not a tag #>
> <SCRIPT>document.write ("<BR>")</SCRIPT>

Hmmm ... what would be needed to make that legal?

nsgmls:test:2:0:E: no document type declaration; will parse without validation
nsgmls:test:4:37:E: marked section end not in marked section declaration
nsgmls:test:6:39:E: end tag for "BR" omitted, but its declaration does not permit this
nsgmls:test:6:25: start tag was here
nsgmls:test:7:1:E: end tag for "BR" omitted, but its declaration does not permit this
nsgmls:test:4:32: start tag was here
nsgmls:test:7:1:E: end tag for "IMG" omitted, but its declaration does not permit this
nsgmls:test:2:0: start tag was here
ASRC CDATA a_greater_b.gif
AALT CDATA a > b
(IMG
-We break a line using <[CDATA [
(BR
- \n<# This is text, not a tag #>\n
(SCRIPT
-document.write ("
(BR
-")
)BR
)SCRIPT
-\n
)BR
)IMG
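
To make the failure concrete: the original poster's regex is not quoted
here, so the pattern below is only a guess at the kind of "simplistic
approach" meant, namely deleting everything from a < up to the next >.
strip_tags is a made-up name.

```shell
# Naive tag stripper: delete from each '<' up to the first following '>'.
# (Hypothetical; only a guess at the approach being criticised.)
strip_tags() {
    sed 's/<[^>]*>//g'
}

# The '>' inside the ALT value ends the match too early.
printf '%s\n' '<IMG SRC = "a_greater_b.gif" ALT = "a > b">' | strip_tags
# prints:  b">

# The comment is cut at the first '>' as well, leaving its tail behind.
printf '%s\n' '<!-- <IMG SRC = "foo.gif"> -->' | strip_tags
# prints:  -->
```

A real parser such as nsgmls keeps the attribute value "a > b" intact,
as the ALT line in the output above shows.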

Kai
-- 
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
  - Russ Allbery (rra@stanford.edu)


------------------------------

Date: 11 Jul 1999 16:28:00 +0200
From: kaih=7Kg9-b0mw-B@khms.westfalen.de (Kai Henningsen)
Subject: Re: WEB DEVELOPERS
Message-Id: <7Kg9-b0mw-B@khms.westfalen.de>

cassell@mail.cor.epa.gov (David Cassell)  wrote on 09.07.99 in <378675D9.6476C66E@mail.cor.epa.gov>:

> Jonathan Stowe wrote:
> > [snip]
> > The funny thing about this is that I got a nice message from a cancelbot
> > operating in the it.* hierarchy - so presumably the original post was
> > likewise cancelled.  I wonder if we could get such a 'bot together here
> > ...
>
> Sure we could...  But once TomC 'improved' it, it would cancel
> every post that included any of the following phrases:
> 'probably a FAQ'
> 'didn\'t read the FAQ'
> 'newbie'
> 'write me a script'
>    .
>    .
>    .
>
> Wait, on second thought I'm liking this bot idea more and more...
>
> [Now John Stanley will walk over here and smack me one.  :-]

If John doesn't, let me do it instead.

> David, ducking in advance :-)

Nice of you to tell me where to aim:

*SMACK*


Kai
-- 
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
  - Russ Allbery (rra@stanford.edu)


------------------------------

Date: 1 Jul 99 21:33:47 GMT (Last modified)
From: Perl-Users-Request@ruby.oce.orst.edu (Perl-Users-Digest Admin) 
Subject: Digest Administrivia (Last modified: 1 Jul 99)
Message-Id: <null>


Administrivia:

The Perl-Users Digest is a retransmission of the USENET newsgroup
comp.lang.perl.misc.  For subscription or unsubscription requests, send
the single line:

	subscribe perl-users
or:
	unsubscribe perl-users

to almanac@ruby.oce.orst.edu.  

To submit articles to comp.lang.perl.misc (and this Digest), send your
article to perl-users@ruby.oce.orst.edu.

To submit articles to comp.lang.perl.announce, send your article to
clpa@perl.com.

To request back copies (available for a week or so), send your request
to almanac@ruby.oce.orst.edu with the command "send perl-users x.y",
where x is the volume number and y is the issue number.

The Meta-FAQ, an article containing information about the FAQ, is
available by requesting "send perl-users meta-faq". The real FAQ, as it
appeared last in the newsgroup, can be retrieved with the request "send
perl-users FAQ". Due to their sizes, neither the Meta-FAQ nor the FAQ
are included in the digest.

The "mini-FAQ", which is an updated version of the Meta-FAQ, is
available by requesting "send perl-users mini-faq". It appears twice
weekly in the group, but is not distributed in the digest.

For other requests pertaining to the digest, send mail to
perl-users-request@ruby.oce.orst.edu. Do not waste your time or mine
sending Perl questions to the -request address; I don't have time to
answer them, even if I did know the answer.


------------------------------
End of Perl-Users Digest V9 Issue 111
*************************************

