[5257] in www-talk@info.cern.ch


Re: Caching Servers Considered Harmful (was: Re: Finger URL)

daemon@ATHENA.MIT.EDU (Steven D. Majewski)
Mon Aug 22 19:10:15 1994

Date: Tue, 23 Aug 1994 01:07:02 +0200
Errors-To: listmaster@www0.cern.ch
Reply-To: sdm7g@elvis.med.virginia.edu
From: "Steven D. Majewski" <sdm7g@elvis.med.virginia.edu>
To: Multiple recipients of list <www-talk@www0.cern.ch>

On Aug 22, 18:08, "Rob Raisch, The Internet Company" wrote:
> 
> Because anyone running a caching server runs the dual risk of presenting 
> out-of-date information to their users and can be in direct violation of 
> international copyright law.
> 
> The first point is by far the most important in my mind.  As more and 
> more professional publishers come online, you will see this becoming 
> much more of an issue.  
> 
>  [ ... various remarks about timeliness of information ... ] 

But, isn't that what (HTTP) Expires: is for?
If the information may change over time, then it should be marked so.

There are some documents that the provider knows *WILL* be superseded,
although he doesn't know exactly when. It would seem that a reasonable
procedure would be to set Expires: to now + 1 unit, and that the client 
or caching server should use "If-Modified-Since:" to check if it 
has *actually* expired. ( Or does having no explicit Expires: imply 
this ?  Whichever, there should probably be a usage note somewhere to
document the proper procedure. ) 
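
A minimal sketch of that procedure in Python, seen from the caching
server's side. The names here (is_fresh, plan_request, the fields of the
entry dictionary) are made up for illustration, and treating a missing or
unparseable Expires: as "always revalidate" is only one possible answer
to the question above, not something the protocol settles:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def is_fresh(expires_header, now):
    """Return True if a cached copy may be served without asking the origin."""
    if expires_header is None:
        # Assumption: no explicit Expires: means "check every time".
        return False
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Unparseable date: be conservative and revalidate.
        return False
    if expires.tzinfo is None:
        # Treat a date without a usable zone as GMT.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires


def plan_request(entry, now):
    """Decide what to do with one cached entry.

    Returns ("serve", {}) when the copy is still fresh, otherwise
    ("revalidate", headers) with the conditional request headers to send.
    """
    if is_fresh(entry.get("expires"), now):
        return "serve", {}
    headers = {}
    if entry.get("last_modified") is not None:
        headers["If-Modified-Since"] = format_datetime(
            entry["last_modified"], usegmt=True
        )
    return "revalidate", headers


now = datetime(1994, 8, 22, 19, 10, 15, tzinfo=timezone.utc)
entry = {
    # "now + 1 unit", with one minute standing in for the unit
    "expires": format_datetime(now + timedelta(minutes=1), usegmt=True),
    "last_modified": datetime(1994, 8, 20, 12, 0, 0, tzinfo=timezone.utc),
}
print(plan_request(entry, now))
print(plan_request(entry, now + timedelta(minutes=5)))
```

A 304 reply to the conditional request would mean the cached copy is
still good; a full reply would replace it.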


> Of course, I can mark my information as being uncacheable, but will you
> honor that request?  Your interest is to provide content to your users
> with as little impact on your communications resources as possible.  I
> believe that your goals and mine are not compatible. 


I think both ends should have a compatible goal: to follow a common
protocol. Is the problem here: 
 (1) That a new feature/header-field needs to be added to the protocol. 
     ( i.e. is Expires: being forced to carry too much of a "semantic load"?)
 (2) That a usage clarification about how to properly USE the protocol
     needs to be added. ( Semantics of Expires: needs to be defined. )
 (3) That there are just some broken or misconfigured servers out there?
I have listed those in what I think is increasing order of probability,
but I would consider any of them a more reasonable conclusion than that 
caching servers should be considered harmful. 


> The copyright issue is the more difficult one.   In light of the previous 
> argument, you are archiving an original work.  This is called "copying" 
> in copyright law and if it is done without permission, is against the law.

 The data is likely to get "copied" numerous times in transit from the 
provider to the client. ( And probably cached on the client - and what 
if my client provides cross-session global caching ? ) 
 The only technological fix is for the copyrighted data to be encrypted
and viewable only by the authorized client/customer. ( i.e. cached and
encrypted data is useless for another client with a different key.
This could be an argument for (#1) above. Trying to overload too many
functions onto Expires may make erroneous results ambiguous. ) 
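
To make the point concrete, here is a toy sketch in Python of a cache
holding only encrypted bytes. Everything in it is hypothetical: seal and
open_sealed are invented names, and the hash-based keystream is a
stand-in for a real cipher (no nonce, one key for everything), so it
illustrates the idea and should not be used to protect anything:

```python
import hashlib
import hmac


def _keystream(key, length):
    """Toy keystream: SHA-256 of key + counter, repeated as needed."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key, plaintext):
    """Encrypt plaintext and prepend a 32-byte tag so a wrong key is detectable."""
    body = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def open_sealed(key, blob):
    """Return the plaintext, or None if the key does not match the blob."""
    tag, body = blob[:32], blob[32:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(a ^ b for a, b in zip(body, _keystream(key, len(body))))


# The cache stores only what came over the wire: sealed bytes.
document = b"Copyrighted article text"
customer_key = b"key issued to the paying customer"
other_key = b"key held by some other client"
cache = {"http://publisher.example/article": seal(customer_key, document)}

blob = cache["http://publisher.example/article"]
print(open_sealed(customer_key, blob) == document)
print(open_sealed(other_key, blob) is None)
```

The caching server can hand the same blob to anyone who asks, but only
the client holding the matching key gets the document back.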


> (I'm ignoring any arguments that copyright law must be redesigned in light
> of digital distribution.  I don't think anyone would disagree with this. 
> However, I doubt that copyright is going away and in fact, I expect the
> body of law will be strengthened not diluted.)


I would take this, and the implication of legal culpability on the part
of the server for copyright violation, as an argument for specifying
"reasonable behaviour" or semantics more strongly: to distinguish that
the server (and its maintainer) can't be responsible for things that 
the client doesn't tell it! True - RFCs carry no legal weight in any
courts I know of, but specification in an internet standard, plus a 
couple of expert witnesses to testify on what the "community" considers
to be prudent behaviour might just make the difference in a court 
deciding on what exactly constitutes negligent behaviour. 

 
> I expect that most professional publishers will not serve content to any 
> site which caches unless they can enter into a business relationship with 
> that site.  Unfortunately, this presents a very interesting N by N 
> problem, as publishers and caching servers proliferate.

I don't think the WWW is "ready for prime time" commercial use yet - 
better authentication, security, encryption, etc. need to be 
standardized, implemented and deployed ( i.e. in *common* use )
first. But I think you are wrong in picking caching servers as the
scapegoat that would prevent it. I think, rather, that they are going
to be a useful and (practically) necessary piece of technology to 
bring the Web to the (commercial) masses. 



-- Steve Majewski       (804-982-0831)      <sdm7g@Virginia.EDU> --
-- UVA Department of Molecular Physiology and Biological Physics --
-- Box 449 Health Science Center        Charlottesville,VA 22908 --
   [ "Cognitive Science is where Philosophy goes when it dies ... 
	if it hasn't been good!" - Jerry Fodor  ]

