[6192] in cryptography@c2.net mail archive

Re: Semantic Forests, from CWD (fwd)

daemon@ATHENA.MIT.EDU (Steven M. Bellovin)
Thu Dec 2 17:28:52 1999

From: "Steven M. Bellovin" <smb@research.att.com>
To: "Arnold G. Reinhold" <reinhold@world.std.com>
Cc: Udhay Shankar N <udhay@pobox.com>, cryptography@c2.net
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Thu, 02 Dec 1999 17:24:46 -0500
Message-Id: <19991202222451.A964A41F16@SIGABA.research.att.com>

In message <v04210100b46c3822f6c2@[24.218.56.92]>, "Arnold G. Reinhold" writes:

>>In April 1999, a report commissioned by the Parliament's Office of
>>Scientific and Technological Options Assessment (STOA), concluded that
>>"effective voice 'wordspotting' systems do not exist" and "are not in use".
>
> I wonder about the European Parliament. They sometimes make our
> Congress look intelligent. The existence of speech recognition
> technology is hardly a secret. It's been on the market for years, has
> been improving steadily and is now being offered commercially for
> similar applications. I don't know how effective it is right now at
> telephone monitoring, but it will only get better. Here is an excerpt
> from one vendor's web site:
> http://www.dragonsystems.com/products/audiomining/
>
> "New AudioMining Technology Uses Award-Winning Speech Recognition
> Engine to Quickly Capture and Index Information Contained in Recorded
> Video Footage, Radio Broadcasts, Telephone Conversations, Call Center
> Dialogues, Help Desk Recordings, and More

etc.

The problem, from the perspective of an intelligence agency, is figuring out
what to listen to.  Let's do some arithmetic.

The product you cite requires at least a 133 MHz Pentium; 200 MHz preferred.
How many such chips are needed?  Well, according to a map on a wall near my
office (see http://www.telegeography.com/Publications/cmap99.html), there are
currently about 150 Gbps worth of fiber across the Atlantic.  That's about 2.7
million potential phone channels.  A lot of that is data, of course -- shall
we say 75%?  That still leaves us with ~675K simultaneous calls.  That's an
awful lot of CPU power, even by NSA's standards.

And it gets worse -- within a year, the FLAG and TAT-14 cables will come
online, adding at least 800 Gbps of capacity...
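
The arithmetic above can be reproduced with a short sketch.  Two inputs are
assumptions that the message does not state: a 56 kbps voice channel (the
rate at which 150 Gbps works out to roughly 2.7 million channels) and one
Pentium-class CPU per call being recognized in real time.  The function and
constant names are illustrative only.

```python
# Rough capacity estimate; a sketch under stated assumptions, not a measurement.

# Assumptions NOT in the message:
VOICE_CHANNEL_BPS = 56_000   # assumed bit rate of one phone channel
CPUS_PER_CALL = 1            # assumed: one CPU per call, in real time

# Figures taken from the message:
CURRENT_CAPACITY_BPS = 150e9  # fiber across the Atlantic today
ADDED_CAPACITY_BPS = 800e9    # FLAG and TAT-14, "at least"
DATA_FRACTION = 0.75          # share of capacity guessed to carry data


def simultaneous_calls(capacity_bps, data_fraction=DATA_FRACTION):
    """Voice calls that fit in the non-data share of the capacity."""
    channels = capacity_bps / VOICE_CHANNEL_BPS
    return channels * (1.0 - data_fraction)


def cpus_needed(capacity_bps):
    """CPUs required to run recognition on every call at once."""
    return simultaneous_calls(capacity_bps) * CPUS_PER_CALL


scenarios = [
    ("today", CURRENT_CAPACITY_BPS),
    ("with new cables", CURRENT_CAPACITY_BPS + ADDED_CAPACITY_BPS),
]
for label, bps in scenarios:
    print(f"{label}: about {cpus_needed(bps):,.0f} CPUs")
```

Under these assumptions the first figure comes out near 670K, slightly below
the ~675K in the text, which rounds the channel count to 2.7 million before
taking 25%.  The second figure lands in the millions.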

Tentative conclusion:  they need to listen to the signaling channels, so that
they can focus their efforts.  *Then* they can do the voice recognition and
pattern-matching tricks.

		--Steve Bellovin