Towards Never-Ending Learning from Time Series Streams (NELTS)
This webpage was built in support of NELTS. The core idea is
that an agent examines an unbounded stream of data and occasionally asks a teacher (which may be a human or an algorithm) for a label.
NELTS was developed by:
- UC Riverside: Yuan Hao, Yanping Chen, Jesin Zakaria, Bing Hu, Eamonn Keogh
- Kasetsart University: Thanawin Rakthanmanon
Examples of S and P
Recall that our work assumes the existence of two data streams: S, which is high-dimensional and could be an audio stream, a video stream, a text document stream, etc., and P, which is a low-dimensional proxy for S. Below we give some concrete examples of S and P, both from our own work and from the literature, that we feel fit naturally into our framework.
- Application for long-term sleep studies
This dataset, from [a], has 4400 hours of video, which we see as S, and wrist sensor data, which we see as P.
- Application for behavior of piercing-sucking insects
The electrical penetration graph (EPG), developed by Dr. Elaine Backus (shown), is a widely used tool for research into the behavior of piercing-sucking insects.
To use it, researchers connect the insect and plant to an electronic monitor that, like an electrocardiogram, reads electrical charges produced by tiny changes in voltage that occur as the insect feeds.
Here we see video of the insect behavior as S, and the single time series stream as P [b].
Overview of System Architecture
The main challenge of our work is that our system must learn prototypical time series templates from real-valued data that it can see only once.
To solve this problem, our system has three main subroutines: Subsequence Processing, Frequent Pattern Maintenance, and the Active Learning System.
The structure is shown below.
- Frequent Pattern Maintenance
Maintaining real-valued, high-dimensional frequent items from unbounded streams in bounded space is a challenging task.
The intuition behind our method is to find dense subtrees in the dendrogram in constant space.
The figure below gives an example of finding a dense subtree; please see the paper for details.
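To make the dense-subtree intuition concrete, here is a minimal sketch and not the algorithm from the paper. It assumes a small, fixed-size buffer of equal-length subsequences (so space stays bounded), builds a dendrogram with naive single-linkage clustering, and treats the subtree with the lowest merge height among those with at least `min_leaves` leaves as the "densest". The linkage choice, the density criterion, and all function names are illustrative assumptions.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_dendrogram(items):
    """Naive single-linkage agglomerative clustering over a small buffer.

    Returns the merges in the order they happen, each as
    (merge_height, sorted_leaf_indices). Every merge is one subtree.
    """
    clusters = [[i] for i in range(len(items))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(euclidean(items[a], items[b])
                    for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]
        merges.append((d, sorted(merged)))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


def densest_subtree(merges, min_leaves):
    """Assumed criterion: lowest merge height among subtrees that
    contain at least min_leaves leaves. Returns None if none qualify."""
    candidates = [m for m in merges if len(m[1]) >= min_leaves]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m[0])


# Hypothetical buffer: three near-identical items, plus scattered ones.
buffer = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [5.0, 5.0], [9.0, 1.0], [5.2, 5.1]]
height, leaves = densest_subtree(build_dendrogram(buffer), min_leaves=3)
print(leaves)
```

In this toy buffer the three near-identical items form the tightest subtree with three or more leaves, which is the kind of group a frequent-pattern dictionary would want to keep while sparse items are discarded.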
- Active Learning System
The active learning system includes two broad approaches for classification: strong teacher and weak teacher.
Given our dictionary-based model, two questions are worth exploring: how often to trigger a query to the teacher, and what action to take given the teacher's feedback.
Let SR be the sampling rate of P, and QR be the mean number of seconds between queries that the teacher is willing to tolerate.
More details on how we calculate the trigger threshold are here.
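As a rough illustration of how SR and QR interact, the sketch below computes only the query budget they imply: the mean number of samples of P that arrive between two queries, and how many queries a stream of a given duration allows. It is not the trigger threshold formula itself, which is given at the link above. The function names and the example values (100 Hz, one query per 10 minutes, a 20-hour stream) are assumptions chosen for illustration.

```python
def samples_between_queries(sr_hz: float, qr_seconds: float) -> int:
    """Mean number of samples of P arriving between two teacher queries."""
    if sr_hz <= 0 or qr_seconds <= 0:
        raise ValueError("SR and QR must both be positive")
    return round(sr_hz * qr_seconds)


def queries_allowed(stream_seconds: float, qr_seconds: float) -> int:
    """Number of queries the teacher tolerates over a stream of this length."""
    if qr_seconds <= 0:
        raise ValueError("QR must be positive")
    return int(stream_seconds // qr_seconds)


# Hypothetical values: P sampled at 100 Hz, one query per 10 minutes.
print(samples_between_queries(100, 600))   # samples seen per query
print(queries_allowed(20 * 3600, 600))     # queries over a 20-hour stream
```

With these assumed values the system sees 60,000 samples for every query it may ask, which is why the choice of when to trigger a query matters so much.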
An illustration of a weak teacher example is shown below.
Code:
The code.
Data:
- The Activity Dataset consists of a 13.3-minute video sequence and 721 optical flow time series, and came from paper.
- The Flying Insects sample Dataset is available here. Please email Yanping Chen for a 20GB hard drive containing all the data.
- The 20-hour long ECG Dataset came from Physionet_BIDMC_ch07.
- The Tawny Owl Dataset is available here. The code to extract MFCC features is here.
- The Sapsucking Insect Dataset came from here (UCR Entomology Department).
- The elder care (Weak Teacher example) Dataset came from IADL Housekeeping Activities.
[a] Marko Borazio and Kristof Van Laerhoven (2012). Combining Wearable and Environmental Sensing into an Unobtrusive Tool for Long-Term Sleep Studies. 2nd ACM SIGHIT International Health Informatics Symposium (IHI 2012).
[b] Photos by Stephen Ausmus. Figure from: Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover (2009). Exact Discovery of Time Series Motifs. SDM 2009.