[22344] in bugtraq
Re: HTML email "bug", of sorts.
daemon@ATHENA.MIT.EDU (Bear Giles)
Mon Aug 20 20:08:37 2001
From: Bear Giles <bear@coyotesong.com>
Message-Id: <200108202133.PAA17268@eris.coyotesong.com>
In-Reply-To: <Pine.LNX.4.21.0108191610200.3928-100000@wakko.bitey.net> from Alex
Prestin at "Aug 19, 2001 04:19:12 pm"
To: Alex Prestin <wakko@bitey.net>
Date: Mon, 20 Aug 2001 15:33:26 -0600 (MDT)
Cc: bugtraq@securityfocus.com
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
> On Sun, 19 Aug 2001, David F. Skoll wrote:
>
> What I was more interested in finding out is how admins and people who
> *are* technically adept can filter these types of things out on a massive
> scale (on their mailservers, for example) *without* affecting the delivery
> of legitimate mail.
Here's a simple solution if you can run your inbound mail (or web
pages) through a filter.
1) run them through a simple filter for image tags. With regex,
the pattern could be as simple as "<img ([^>]+)>", case insensitive.
Depending on your regex tool, you might need to backslash-escape some
of those characters.
For everything that matches, look for any height and width attributes
for the image. If they're 1, you have a web bug. Even if they're 2-8
or so, it's probably still a web bug.
Either comment it out or delete it. The latter may be preferable
if you don't want to break scripts.
Of course, this basic approach can also be used to knock out scripts
and uuencoded HTML content (which I've also noticed recently - a lot of
filters don't seem to bother checking for scripts or web bugs within
a uuencoded HTML block, even though many readers will automatically
unpack, load and execute them).
2) on a related note, if you see anything like
<img src="http://spammer.com/images/foo.gif?some-random-string-here">
you can snip the "?some-random-string-here" part. Their logs may
still show your IP address, but they won't show a unique identifier
string.
3) if you want to be a bit more forceful, you could take the presence
of a small image as presumptive proof that the message is spam and treat
it as such.
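
For what it's worth, steps 1-3 could be sketched in Python roughly as
below. This is only an illustration, not a hardened filter: the names
filter_html, is_web_bug and MAX_BUG_SIZE are made up for the example,
the threshold of 8 is taken from the "2-8 or so" guess above, and I'm
assuming an image counts as a web bug only when every height/width it
declares is that small. Regexes will miss oddly formed tags, so a real
deployment would probably want a proper HTML parser.

```python
import re
from typing import Tuple

# Step 1: the image-tag pattern from above, case insensitive.
IMG_TAG = re.compile(r"<img\s+([^>]+)>", re.IGNORECASE)
# Pixel-valued height/width attributes; percentages such as "50%" are skipped.
DIMENSION = re.compile(
    r"\b(height|width)\s*=\s*[\"']?(\d+)(?![\d%])", re.IGNORECASE
)
# Step 2: the "?some-random-string-here" part of a src attribute.
SRC_QUERY = re.compile(
    r"(\bsrc\s*=\s*[\"']?[^\"'\s>?]+)\?[^\"'\s>]*", re.IGNORECASE
)

MAX_BUG_SIZE = 8  # assumption: anything this small is treated as a web bug


def is_web_bug(attrs: str) -> bool:
    """True if the tag declares a size and every declared size is tiny."""
    sizes = [int(value) for _, value in DIMENSION.findall(attrs)]
    return bool(sizes) and all(size <= MAX_BUG_SIZE for size in sizes)


def filter_html(html: str) -> Tuple[str, int]:
    """Delete suspected web bugs; strip query strings from other images.

    Returns the filtered HTML and the number of web bugs deleted.
    """
    removed = 0

    def handle(match):
        nonlocal removed
        if is_web_bug(match.group(1)):
            removed += 1
            return ""  # delete the tag rather than comment it out
        # Keep the image but drop any unique identifier after the "?".
        return SRC_QUERY.sub(r"\1", match.group(0))

    return IMG_TAG.sub(handle, html), removed


sample = (
    '<p>Hi</p>'
    '<IMG SRC="http://spammer.com/bug.gif?id=42" WIDTH=1 HEIGHT=1>'
    '<img src="http://spammer.com/images/foo.gif?some-random-string-here"'
    ' width="300" height="200">'
)
cleaned, removed = filter_html(sample)
print(cleaned)
print(removed)
```

For step 3, the count returned alongside the filtered text could feed
whatever spam handling you already have, e.g. treating any message with
removed > 0 as presumptive spam.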
> The problems I see with this approach are:
>
> 1) how do you determine what's legitimate HTML email and what isn't? Can
> pattern-matching of web bugs be as easy as "*.gif\?.*" or something
> similar?
It's impossible to be 100% certain, but you can certainly answer with
broad strokes. Is there ever a legitimate reason to use a 1x1 image?
If not, what is the harm in deleting them immediately?
> 2) where is this type of filter ethically the right thing to do? on a
> server at work? (I would think "yes".) What if you work at an ISP? (I
> would be less inclined to think "yes" if I might somehow be restricting
> the experience of paying customers.)
What about an ISP that decides to filter executable attachments?
Should the ISP offer an opt-out choice on that, or is it something
where they can legitimately say that it's a matter of "public hygiene"?
If the city can force homeowners to remove trash and mow tall grass
that provides a haven to vermin, then it seems that an ISP could take
similar actions against e-vermin like executable attachments or
web bugs used to validate spam lists.
--
Bear Giles
bgiles+bqt@coyotesong.com