[6192] in cryptography@c2.net mail archive
Re: Semantic Forests, from CWD (fwd)
daemon@ATHENA.MIT.EDU (Steven M. Bellovin)
Thu Dec 2 17:28:52 1999
From: "Steven M. Bellovin" <smb@research.att.com>
To: "Arnold G. Reinhold" <reinhold@world.std.com>
Cc: Udhay Shankar N <udhay@pobox.com>, cryptography@c2.net
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Thu, 02 Dec 1999 17:24:46 -0500
Message-Id: <19991202222451.A964A41F16@SIGABA.research.att.com>
In message <v04210100b46c3822f6c2@[24.218.56.92]>, "Arnold G. Reinhold" writes:
>>In April 1999, a report commissioned by the Parliament's Office of
>>Scientific and Technological Options Assessment (STOA), concluded that
>>"effective voice 'wordspotting' systems do not exist" and "are not in use".
>
> I wonder about the European Parliament. They sometimes make our
> Congress look intelligent. The existence of speech recognition
> technology is hardly a secret. It's been on the market for years, has
> been improving steadily and is now being offered commercially for
> similar applications. I don't know how effective it is right now at
> telephone monitoring, but it will only get better. Here is an excerpt
> from one vendor's web site:
> http://www.dragonsystems.com/products/audiomining/
>
> "New AudioMining Technology Uses Award-Winning Speech Recognition
> Engine to Quickly Capture and Index Information Contained in Recorded
> Video Footage, Radio Broadcasts, Telephone Conversations, Call Center
> Dialogues, Help Desk Recordings, and More
etc.
The problem, from the perspective of an intelligence agency, is figuring
out what to listen to. Let's do some arithmetic.
The product you cite requires at least a 133 MHz Pentium; 200 MHz
preferred. How many such chips are needed? Well, according to a map on a
wall near my office (see
http://www.telegeography.com/Publications/cmap99.html), there is
currently about 150 Gbps worth of fiber across the Atlantic. That's
about 2.7 million potential phone channels. A lot of that is data, of
course -- shall we say 75%? That still leaves us with ~675K simultaneous
calls. That's an awful lot of CPU power, even by NSA's standards.
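
For anyone who wants to rerun the arithmetic above with other numbers,
here is a rough sketch. Two things in it are assumptions and not figures
from this message: the per-channel rate (56 kbps is used because it
roughly reproduces the 2.7 million figure; a standard 64 kbps channel
gives a somewhat lower count) and the guess that one Pentium keeps up
with exactly one call in real time. The function and constant names are
made up for illustration.

```python
# Back-of-the-envelope estimate of the monitoring load.
# ASSUMPTIONS (not from the message): CHANNEL_BPS and CPUS_PER_CALL.

TRANSATLANTIC_BPS = 150e9   # ~150 Gbps of transatlantic fiber (from the message)
CHANNEL_BPS = 56e3          # assumed bit rate of one voice channel
DATA_FRACTION = 0.75        # share of capacity guessed to be data (from the message)
CPUS_PER_CALL = 1.0         # assumed: one Pentium handles one call in real time


def estimate(total_bps=TRANSATLANTIC_BPS,
             channel_bps=CHANNEL_BPS,
             data_fraction=DATA_FRACTION,
             cpus_per_call=CPUS_PER_CALL):
    """Return (potential channels, simultaneous voice calls, CPUs needed)."""
    channels = total_bps / channel_bps
    voice_calls = channels * (1.0 - data_fraction)
    cpus = voice_calls * cpus_per_call
    return channels, voice_calls, cpus


if __name__ == "__main__":
    channels, calls, cpus = estimate()
    print(f"potential channels: {channels / 1e6:.2f} million")
    print(f"simultaneous calls: {calls / 1e3:.0f}K")
    print(f"CPUs needed:        {cpus / 1e3:.0f}K")
```

Under these assumptions the result lands within a few percent of the
~675K figure. Calling estimate(total_bps=950e9) gives a feel for the
load once another 800 Gbps is added, if the same data fraction still
holds.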
And it gets worse -- within a year, the FLAG and TAT-14 cables will come
online, adding at least 800 Gbps of capacity...
Tentative conclusion: they need to listen to the signaling channels, so
that they can focus their efforts. *Then* they can do the voice
recognition and pattern-matching tricks.
--Steve Bellovin