Wednesday, September 30, 2015

Recurrent Neural Networks Can Detect Anomalies in Time Series


A recurrent neural network is trained on the blue line (which is some kind of physiologic signal). The signal has a regular pattern to it except at t ≈ 300, where it behaves 'anomalously'. The green line (not on the same scale) is the error between the original signal and the network's reconstruction of it. At t ≈ 300 the network cannot reconstruct the signal, so the error there becomes significantly higher.
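
The anomaly score in the graph is nothing more than the pointwise reconstruction error. Here is a minimal sketch of that idea in NumPy (the thresholding rule and the k parameter are just for illustration, not exactly what my code did):

```python
import numpy as np

def reconstruction_error(signal, reconstruction):
    # Pointwise error between the original signal and the network's output
    # (this is the green line in the graph).
    return np.abs(signal - reconstruction)

def flag_anomalies(error, k=4.0):
    # Mark timesteps where the error is far above its typical level,
    # here "more than k standard deviations above the mean error".
    threshold = error.mean() + k * error.std()
    return np.where(error > threshold)[0]
```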

Why is this cool??

  • unsupervised: I did not need labels separating data with anomalies from data without anomalies
  • trained with the anomaly in the data: as long as most of the data is normal, the algorithm was robust enough to learn the data's pattern even though the anomaly was present during training
  • no domain knowledge applied: no expert on this kind of time series provided input on how to analyze the data

More details for the more technical people (a rough code sketch follows the list):
- training algo: RMSprop
- input noise added
- the network is an LSTM autoencoder
- it's a fairly small network
- code: theanets 
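
For the curious, here is a rough sketch of that recipe. I'm writing it in PyTorch rather than theanets just to illustrate the idea; the layer sizes, noise level, and learning rate are placeholders, not the exact values from the thesis:

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    # A small LSTM autoencoder: encode the input window, decode it back,
    # and project the decoder states to the input dimension.
    def __init__(self, n_features=1, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.output = nn.Linear(hidden, n_features)

    def forward(self, x):
        encoded, _ = self.encoder(x)
        decoded, _ = self.decoder(encoded)
        return self.output(decoded)

model = LSTMAutoencoder()
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(batch, noise_std=0.1):
    # batch: tensor of shape (batch, time, 1).
    # Input noise: corrupt the inputs, but reconstruct the clean signal.
    noisy = batch + noise_std * torch.randn_like(batch)
    optimizer.zero_grad()
    loss = loss_fn(model(noisy), batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Running windows of the signal through a model like this and taking the per-timestep error gives the green line in the graph above.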

And that's my master's thesis in one graph!


---
update 12/2018:
This post has been getting a lot of attention, which I appreciate. However, I feel obligated to point readers to more recent research that has largely superseded RNNs. A good introduction to that work is The Fall of RNN LSTM.