[7621] in bugtraq
Re: Eudora executes (Java) URL
daemon@ATHENA.MIT.EDU (Alec Kosky)
Tue Aug 11 22:17:05 1998
Date: Tue, 11 Aug 1998 15:34:12 -0700
Reply-To: alec@dakotacom.net
From: Alec Kosky <alec@DAKOTACOM.NET>
To: BUGTRAQ@NETSPACE.ORG
In-Reply-To: <98Aug11.155335edt.32260@pcbhi266.bhsi.com>
On 11-Aug-98 Vitiello, Eric (BHS) wrote:
>> [From an anti-mail-exploit-procmail-filter-perl-script (see
>> http://www.wolfenet.com/~jhardin/procmail-security.html):]
>> > s/<BODY\s+(([^">]+("(\\.|[^"])*")?)*)ONLOAD/<BODY $1 DEFANGED-ONLOAD/gi;
>>
>> This Pattern will catch lines like
>> <body onload="badthings()">
>> converted to
>> <BODY DEFANGED-ONLOAD="badthings()">
>> but not
>> <body onload="badthings()" onload="badthings()">
>> converted to
>> <BODY onload="badthings()" DEFANGED-ONLOAD="badthings()">]
>> So one onload=... will stay and act.
>>
>> Also, things like < body ... > won't be caught. I don't know whether
>> those leading spaces are proper HTML, but even if not, one should
>> not assume that all bad HTML will be rejected.
>
> The following can Fix all of that:
>
> s/<\s+BODY\s+((([^">]+("(\\.|[^"])*")?)*)ONLOAD)*?\s+/<BODY $1 DEFANGED-ONLOAD/gi;
Actually, I believe the RE that you are looking for is this:
s/<\s*BODY\s+((([^">]+("(\\.|[^"])*")?)*)ONLOAD)*?\s*/<BODY $1 DEFANGED-ONLOAD/gi;
The \s+ matches only one or more whitespace characters, so
<body onload="badthings()" onload="badthings()"> would not be caught, because
there is no whitespace between < and body; \s* matches zero or more
whitespace characters. This will catch
<body onload="badthings()" onload="badthings()">
and
< body onload="badthings()" onload="badthings()" >
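
The difference between the two quantifiers can be checked with a minimal
sketch. It is written in Python rather than Perl purely for convenience (an
assumption; Python's re module accepts the same \s+ and \s* syntax). The names
STRICT, LOOSE and matches are hypothetical, and the sketch tests only the
leading "<\s+BODY" versus "<\s*BODY" part of the pattern, not the whole
substitution:

```python
import re

# Leading part of each pattern only; hypothetical names, not from the filter script.
STRICT = re.compile(r'<\s+BODY', re.IGNORECASE)  # requires whitespace after '<'
LOOSE = re.compile(r'<\s*BODY', re.IGNORECASE)   # whitespace after '<' is optional


def matches(pattern, text):
    """Return True if pattern is found anywhere in text."""
    return pattern.search(text) is not None


tight = '<body onload="badthings()" onload="badthings()">'
spaced = '< body onload="badthings()" onload="badthings()" >'

# Print, for each quantifier, whether the tight and the spaced tag are found.
for name, pattern in (("\\s+", STRICT), ("\\s*", LOOSE)):
    print(name, matches(pattern, tight), matches(pattern, spaced))
```

With \s+ only the tag that has a space after < is found; with \s* both tags
are found.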
--Alec--