Discords  
A primer on finding discords in time series.
Updated Apr 19, 2010 by sen...@gmail.com

Introduction

In 2005, Keogh, Lin, and Fu published a paper that applies SAX to finding subsequences with unusual behavior within a time series: E. Keogh, J. Lin and A. Fu (2005). "HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence". In The Fifth IEEE International Conference on Data Mining.

The authors define discords as the "...subsequences of a longer time series that are maximally different to all the rest of the time series subsequences..."
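
To make the definition concrete, here is a minimal brute-force sketch in plain Java. It is not the HOT SAX algorithm and not JMotif code: it simply scores every window by the distance to its nearest non-overlapping window and keeps the window with the largest score. The class and method names are hypothetical, the distance is plain Euclidean without z-normalization (a simplifying assumption), and the demo series is made up.

```java
public class BruteForceDiscord {

  /** Euclidean distance between the length-n windows starting at p and q. */
  static double distance(double[] series, int p, int q, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
      double d = series[p + i] - series[q + i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Returns the start index of the top discord of length n, or -1 if there is none. */
  static int findDiscord(double[] series, int n) {
    int bestPosition = -1;
    double bestDistance = -1.0;
    int last = series.length - n; // last valid window start
    for (int p = 0; p <= last; p++) {
      // distance from window p to its nearest non-self match
      double nearest = Double.POSITIVE_INFINITY;
      for (int q = 0; q <= last; q++) {
        if (Math.abs(p - q) >= n) { // skip overlapping (self) matches
          nearest = Math.min(nearest, distance(series, p, q, n));
        }
      }
      // the discord is the window whose nearest match is farthest away
      if (nearest != Double.POSITIVE_INFINITY && nearest > bestDistance) {
        bestDistance = nearest;
        bestPosition = p;
      }
    }
    return bestPosition;
  }

  public static void main(String[] args) {
    // Made-up data: a square wave with one distorted sample at index 33.
    double[] series = new double[64];
    for (int i = 0; i < series.length; i++) {
      series[i] = (i % 8 < 4) ? 1.0 : 0.0;
    }
    series[33] = 5.0;
    System.out.println("Discord starts at index " + findDiscord(series, 8));
  }
}
```

The two nested loops make this quadratic in the number of windows, which is the cost the HOT SAX paper sets out to reduce by choosing a better order in which to visit candidates.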

Details

JMotif provides a convenient method for finding discords within time series.

Following the article, I downloaded the TEK16 dataset (Space Shuttle Marotta Valve Series) from the UCR Time Series Classification/Clustering Page. The discord location was found with a single call to the SAXFactory.instances2Discords method:

  public static void main(String[] args) throws Exception {

    // get the data first
    Instances tsData = readTSData();

    // find the discords using a sliding window of size 40 and an alphabet of size 3;
    // attribute, windowSize and alphabetSize are assumed to be defined elsewhere in the class
    DiscordRecords dr = SAXFactory.instances2Discords(tsData, attribute, windowSize, alphabetSize);

    // print out the discord occurrences
    System.out.println(dr.toString());
  }

  /**
   * Reads the time series data into WEKA format.
   *
   * @return The time series.
   * @throws Exception If an error occurs.
   */
  private static Instances readTSData() throws Exception {
    Instances data = DataSource.read("data//ts_data//TEK16.arff");
    return data;
  }
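
For readers wondering what the window size and alphabet size control, the sketch below shows how a single window can be turned into a SAX word: z-normalize, average over equal segments (PAA), then map each average to a letter. This is an illustration only and not JMotif's implementation. The class name, the method names, and the paaSize parameter are hypothetical (the call above does not expose a PAA size), and the breakpoints -0.43 and 0.43 are the usual SAX cut points for an alphabet of size 3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SaxSketch {

  // Cut points splitting the standard normal distribution into three equiprobable regions.
  static final double[] CUTS = {-0.43, 0.43};

  /** Converts one window into a SAX word of paaSize letters over the alphabet {a, b, c}. */
  static String toSaxWord(double[] window, int paaSize) {
    int n = window.length;
    if (paaSize <= 0 || n % paaSize != 0) {
      throw new IllegalArgumentException("window length must be a multiple of paaSize");
    }
    // mean and standard deviation for z-normalization
    double mean = 0.0;
    for (double v : window) {
      mean += v;
    }
    mean /= n;
    double var = 0.0;
    for (double v : window) {
      var += (v - mean) * (v - mean);
    }
    double std = Math.sqrt(var / n);

    int segment = n / paaSize;
    StringBuilder word = new StringBuilder();
    for (int s = 0; s < paaSize; s++) {
      // PAA: average of the z-normalized values in this segment
      double avg = 0.0;
      for (int i = s * segment; i < (s + 1) * segment; i++) {
        avg += (std > 1e-9) ? (window[i] - mean) / std : 0.0; // flat window maps to zeros
      }
      avg /= segment;
      // discretize: count how many cut points the average exceeds
      char symbol = 'a';
      for (double cut : CUTS) {
        if (avg > cut) {
          symbol++;
        }
      }
      word.append(symbol);
    }
    return word.toString();
  }

  /** Slides a window over the series and returns one SAX word per position. */
  static List<String> slidingWords(double[] series, int windowSize, int paaSize) {
    List<String> words = new ArrayList<>();
    for (int start = 0; start + windowSize <= series.length; start++) {
      words.add(toSaxWord(Arrays.copyOfRange(series, start, start + windowSize), paaSize));
    }
    return words;
  }

  public static void main(String[] args) {
    double[] series = {0, 0, 5, 5, 10, 10, 5, 5, 0, 0};
    System.out.println(slidingWords(series, 6, 3));
  }
}
```

A larger window captures longer patterns, while a larger alphabet keeps more amplitude detail in each word.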

The TEK16 dataset analysis:

Figure: raw time series with the discord found by JMotif highlighted.

Figure: zoomed view of the discord, with similar fragments from the time series and their clustering.

