[22354] in bugtraq

Re: HTML email "bug", of sorts.

daemon@ATHENA.MIT.EDU (Sean Straw / PSE)
Tue Aug 21 12:37:30 2001

Message-Id: <5.1.0.14.2.20010820212720.07dfc370@mail.professional.org>
Date: Mon, 20 Aug 2001 21:41:24 -0700
To: bugtraq@securityfocus.com
From: PSE-L@mail.professional.org (Sean Straw / PSE)
In-Reply-To: <200108202133.PAA17268@eris.coyotesong.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

At 15:33 2001-08-20 -0600, Bear Giles wrote:

>1) run them through a simple filter for image tags.  With regex,
>the pattern could be as simple as "<img ([^>]+)>", case insensitive.
>You might need to include some backslash quotes.

.. which immediately screws up _CODE_ embedded in messages.  "Here, Joe, 
the solution to the niggling problem is to replace the code in somefunction 
with  <img src..."

KLUNK.  This method would have broken valid code - code which may be 
expected to be copied and pasted as-is.

>For everything that matches, look for any height and width attributes
>for the image.  If it's 1, you have a web bug.  Even if it's 2-8 or so,
>it's probably still a web bug.

And for code embedded in valid pages, it may not be.  What about images 
without explicit height and width attributes?  Many clients don't show a 
preview, or at least show an outline (even on single-pixel images), so the 
dimensions wouldn't matter in email.  In fact, the 'web bug' could just as 
easily be a *REGULAR GRAPHIC* (such as a horizontal rule), since you're 
viewing HTML email, and by the time you realize an image is being loaded - 
whether it is visible or not - the request has already been made.

>Either comment it out or delete it.  The latter may be preferable
>if don't want to break scripts.

Now you're stuck needing to match brackets, which very likely will not work 
properly the instant you receive a quoted message:

 > the tag <img src="some tag"
 > height="1" width="1">

Where does the IMG SRC closing bracket appear when you're using a simple 
regexp?  What if the second line doesn't appear?
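
A minimal sketch of that failure, assuming Python's re module and the exact 
pattern suggested above (the variable names are mine, for illustration 
only).  On a quoted message, the quote marker on the second line is itself a 
">", so the pattern closes the "tag" there and never sees the height and 
width attributes:

```python
import re

# The pattern proposed in the quoted message, case insensitive.
IMG_RE = re.compile(r"<img ([^>]+)>", re.IGNORECASE)

# A tag wrapped across two quoted lines, as in the example above.
quoted = ' > the tag <img src="some tag"\n > height="1" width="1">\n'
match = IMG_RE.search(quoted)

# [^>]+ happily crosses the newline, then stops at the quote marker.
matched_text = match.group(0)

# If the second line never arrives, there is no closing bracket at all.
truncated = ' > the tag <img src="some tag"\n'
no_match = IMG_RE.search(truncated)

print(repr(matched_text))
print(no_match)
```

Commenting out or deleting matched_text here would eat the quote marker and 
leave the dangling attributes behind in the message.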

Arguably, if the message body is HTML, the MIME type should indicate as 
much, there should be an opening HTML tag (but there might not be, and 
email HTML renderers are pretty lax about this), and any < and > characters 
that aren't part of the HTML markup would be properly escaped.  Then again, 
what stops the spammer from obfuscating their code in the same way?  Try 
embedding ORDINALS (numeric character references) in your page: a good HTML 
renderer will render it fine, but most regexps will fail to find a match.  
(I use ordinals to "mailfuscate" mailto URLs and even non-URL plaintext 
email addresses on all of my webpages - it significantly reduces the spam 
that arrives via web-spidering spambots.)
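
To illustrate the ordinal trick, here is a rough sketch, assuming Python's 
standard html and re modules.  The address, the EMAIL_RE pattern and the 
encode_ordinals helper are all hypothetical; they stand in for whatever a 
simple spider or filter might use.  The same text that a renderer decodes 
without complaint gives the regexp nothing to match until the references 
are decoded first:

```python
import html
import re

# Hypothetical, deliberately simple address pattern of the kind a
# spambot or a naive filter might use.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def encode_ordinals(text):
    """Replace every character with its decimal character reference."""
    return "".join("&#%d;" % ord(ch) for ch in text)

address = "user@example.com"  # placeholder address
obfuscated = encode_ordinals(address)
page = '<a href="%s%s">%s</a>' % (
    encode_ordinals("mailto:"), obfuscated, obfuscated)

# Scanning the raw markup finds nothing: there is no literal "@" in it.
naive_hits = EMAIL_RE.findall(page)

# Decoding the references first, as a renderer would, exposes the address.
decoded_hits = EMAIL_RE.findall(html.unescape(page))
```

The point cuts both ways: a filter that wants to catch obfuscated tags has 
to decode like a renderer does before it matches anything.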

Besides BGSOUND, page backgrounds and even TABLE backgrounds could utilize 
an embedded image, in which case, you won't even see it as an IMG SRC 
tag.  Suddenly, your filter needs to fully parse HTML in order to have a 
prayer of stripping these tags.
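
As a sketch of what "fully parse HTML" might look like, assuming Python's 
standard html.parser module: the class below walks every start tag and 
collects any src or background attribute, whatever tag it hangs off.  The 
class name, the URL_ATTRS set and the example.com URLs are my own 
placeholders, and the attribute list is certainly incomplete (style 
attributes, for one, are not covered):

```python
from html.parser import HTMLParser

# Attributes assumed to trigger a fetch; an illustrative, partial list.
URL_ATTRS = {"src", "background"}

class RemoteRefFinder(HTMLParser):
    """Collect (tag, attribute, value) for attributes that load content."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name in URL_ATTRS and value:
                self.refs.append((tag, name, value))

def find_remote_refs(markup):
    finder = RemoteRefFinder()
    finder.feed(markup)
    finder.close()
    return finder.refs

sample = (
    '<BODY BACKGROUND="http://example.com/bg.gif">'
    '<BGSOUND SRC="http://example.com/s.wav">'
    '<TABLE BACKGROUND="http://example.com/t.gif"><TR><TD>hi</TD></TR></TABLE>'
    '</BODY>'
)
refs = find_remote_refs(sample)
```

None of the three references found in sample is an IMG tag, so the simple 
pattern would have let all of them through.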

Which makes blocking (via RBL, etc) and effectively filtering spam a pretty 
darn good solution.


Someone mentioned having a port-80 filter on your firewall -- what of dot 
trackers which reference a specific port number?

         <img src="http://www.somesite.com:110/dot_tracker.file?uniqueid">

Anyone running a firewall would probably block certain services -- but all 
the spammer has to do is run their tracking system on a port for a standard 
service which a mail client would be expected to access, and that 
firewalling isn't going to do you much good.  The exception is a firewall 
that only allows POP3 (110) out to one specific server, and that is an 
unlikely setup: joe user won't configure their machine this way, joe 
poweruser probably won't because they have multiple accounts, and joe 
corporateadmin won't because too many users check their various mail 
accounts from the office, and limiting them in this fashion would be too 
grievous.
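
A toy model of the port-80 filter, assuming Python's urllib.parse and a 
made-up BLOCKED_PORTS rule set (a real firewall works on packets, not URLs, 
so this only shows which destination port the request would go to):

```python
from urllib.parse import urlsplit

# Hypothetical rule set: the filter only knows about the web port.
BLOCKED_PORTS = {80}

def destination_port(url):
    """Return the explicit port in the URL, else the scheme's default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return {"http": 80, "https": 443}.get(parts.scheme)

def blocked_by_port_filter(url):
    return destination_port(url) in BLOCKED_PORTS

plain = "http://www.somesite.com/dot_tracker.file?uniqueid"
on_pop3_port = "http://www.somesite.com:110/dot_tracker.file?uniqueid"

print(blocked_by_port_filter(plain))
print(blocked_by_port_filter(on_pop3_port))
```

The second URL is still an ordinary HTTP request; it just goes out on a 
port the filter was told to leave alone.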


Sorry if I've pointed out another exploit that the spammers could use to 
circumvent such firewall rules.

---
  Please DO NOT carbon me on list replies.  I'll get my copy from the list.

  Sean B. Straw / Professional Software Engineering
  Post Box 2395 / San Rafael, CA  94912-2395
