[94785] in North American Network Operators' Group
Re: Time Series databases
daemon@ATHENA.MIT.EDU (Rodrick Brown)
Thu Feb 8 08:59:14 2007
Date: Thu, 8 Feb 2007 08:54:16 -0500
From: "Rodrick Brown" <rodrick.brown@gmail.com>
To: "michael.dillon@bt.com" <michael.dillon@bt.com>
Cc: nanog@merit.edu
In-Reply-To: <2DA00C5A2146FB41ABDB3E9FCEBC74C1D77221@i2km07-ukbr.domain1.systemhost.net>
Errors-To: owner-nanog@merit.edu
On 2/8/07, michael.dillon@bt.com <michael.dillon@bt.com> wrote:
>
> > > Going back to this thread, http://www.kx.com/ deals in financial
> > > transaction databases where they store millions of ticks. They
> > > appear to have a transactional based language with a solution that
> > > appears to be robust and fail resistant.
>
> > hmm, that is quite interesting. and apparently people out there _are_
> > using it for things like counter values and what not - based on their
> > FAQ. I'd absolutely love to know more about the algorithms and math
> > behind something like kdb+
>
> KX publish a bunch of information about their product. Their lineage
> goes back to APL and the J language, both of which found most of their
> users in financial services.
>
> However, the general issue of time-series databases is more interesting.
> Google will take you to lots of research using keywords like:
>
> time-series database delta wavelet search indexing maxima
>
> Of course, don't use them all at once. To give you a flavor of the stuff
> that people have done, here is a slide presentation on compression and
> indexing that does not use averages like RRD does:
> http://www.cs.cmu.edu/~eugene/research/talks/major-extrema.ppt
>
> In addition to Google, it is a good idea to search CiteSeer
> http://citeseer.ist.psu.edu/ because it allows you to quickly track down
> references to other papers so you can read them all as a set.
>
> I don't think there are any full-blown open-source implementations that
> you could integrate into your own systems. There is stuff like Metakit
> http://www.equi4.com/metakit.html which stores data by column rather
> than by row. And people who have thought about how to efficiently store
> time-series probably cobbled together their own systems using bsddb or
> HDF5.
>
> If you are stuck in the SQL world, then check out these articles on star
> and snowflake schemas. http://en.wikipedia.org/wiki/Snowflake_schema
> http://en.wikipedia.org/wiki/Star_schema and follow up the references at
> the bottom of the page.
>
>
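For anyone who has not seen a star schema applied to counter data, here
is a rough sketch of the idea in Python with sqlite3. It is only an
illustration: the table names (dim_device, dim_counter, fact_sample),
the hostname, and the sample values are all made up for this example and
do not come from the linked articles. The point is that the descriptive
attributes live in small dimension tables, while the large fact table
holds only keys, a timestamp, and the value.

```python
import sqlite3


def build_demo():
    """Create an in-memory star schema with hypothetical names and load a few samples."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: one row per device and per counter name.
        CREATE TABLE dim_device (
            device_id INTEGER PRIMARY KEY,
            hostname  TEXT UNIQUE
        );
        CREATE TABLE dim_counter (
            counter_id INTEGER PRIMARY KEY,
            name       TEXT UNIQUE
        );
        -- Fact table: one narrow row per sample, keyed by the dimensions.
        CREATE TABLE fact_sample (
            ts         INTEGER NOT NULL,
            device_id  INTEGER NOT NULL REFERENCES dim_device(device_id),
            counter_id INTEGER NOT NULL REFERENCES dim_counter(counter_id),
            value      INTEGER NOT NULL,
            PRIMARY KEY (device_id, counter_id, ts)
        );
    """)
    conn.execute("INSERT INTO dim_device VALUES (1, 'rtr1.example.net')")
    conn.execute("INSERT INTO dim_counter VALUES (1, 'ifInOctets')")
    samples = [(0, 1, 1, 1000), (300, 1, 1, 4000), (600, 1, 1, 10000)]
    conn.executemany("INSERT INTO fact_sample VALUES (?, ?, ?, ?)", samples)
    return conn


def series(conn, hostname, counter):
    """Return (ts, value) pairs for one device/counter, ordered by time."""
    rows = conn.execute(
        """
        SELECT f.ts, f.value
        FROM fact_sample f
        JOIN dim_device  d ON d.device_id  = f.device_id
        JOIN dim_counter c ON c.counter_id = f.counter_id
        WHERE d.hostname = ? AND c.name = ?
        ORDER BY f.ts
        """,
        (hostname, counter),
    )
    return rows.fetchall()
```

Calling series(build_demo(), 'rtr1.example.net', 'ifInOctets') walks the
fact table through both dimensions. The composite primary key doubles as
the index for per-series time-range scans.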
There have been numerous technical discussions over at EliteTrader.com
about tick database implementations built on a variety of technologies,
covering the pros and cons of SQL, KX, Vhayu, TimesTen, Hibernate, and
HDF5. They are a must-read for anyone interested.

The threads can be found on the Elite Trader automated trading forums:
http://www.elitetrader.com/vb/showthread.php?s=&threadid=81345&perpage=6&pagenumber=1
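To give a flavor of what a home-grown tick store looks like, here is a
minimal sketch in Python of storing ticks by column rather than by row,
with timestamps kept as deltas. It is not taken from any of the threads
or products above; the TickColumns class and its field names are
hypothetical, and a real system would sit on top of something like HDF5
or bsddb rather than in-memory arrays.

```python
from array import array


class TickColumns:
    """Column-wise tick store: one typed array per field, not one record per tick."""

    def __init__(self):
        # Timestamps are stored as differences from the previous tick.
        # The first entry is the absolute timestamp (delta from zero).
        # 64-bit deltas save nothing by themselves; the gain comes when the
        # small deltas are packed into a narrower type or compressed.
        self.ts_delta = array("q")
        self.price = array("d")
        self.size = array("q")
        self._last_ts = 0

    def append(self, ts, price, size):
        """Add one tick; ts is an integer timestamp in any fixed unit."""
        self.ts_delta.append(ts - self._last_ts)
        self._last_ts = ts
        self.price.append(price)
        self.size.append(size)

    def timestamps(self):
        """Rebuild absolute timestamps by summing the stored deltas."""
        out, t = [], 0
        for d in self.ts_delta:
            t += d
            out.append(t)
        return out

    def vwap(self):
        """Volume-weighted average price; reads only the price and size columns."""
        total = sum(self.size)
        return sum(p * s for p, s in zip(self.price, self.size)) / total
```

The design choice worth noting is that a query such as vwap() never
touches the timestamp column at all, which is the main reason column
layouts suit this kind of aggregate-heavy workload.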
--
Rodrick R. Brown