A recurrent neural network is trained on the blue line, a physiologic signal. The signal follows a regular pattern except around t=300, where it behaves anomalously. The green line (plotted on a different scale) is the reconstruction error: the difference between the original signal and the network's reconstruction of it. Around t=300 the network fails to reconstruct the signal, so the error spikes.
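To make the green line concrete: once you have the network's reconstruction, scoring is trivial. Here is a minimal sketch, assuming you already have the original signal and its reconstruction as arrays (the function names and the n-sigma threshold are illustrative choices, not what the thesis used):

```python
import numpy as np

def reconstruction_error(signal, reconstruction):
    """Point-wise squared error between the signal and its reconstruction."""
    signal = np.asarray(signal, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    return (signal - reconstruction) ** 2

def flag_anomalies(error, n_sigmas=4.0):
    """Return time steps where the error is far above its typical level.

    The n-sigma rule is an illustrative choice, not the thesis's criterion.
    """
    threshold = error.mean() + n_sigmas * error.std()
    return np.where(error > threshold)[0]

# The error spikes around t=300, so those indices get flagged:
# t_anomalous = flag_anomalies(reconstruction_error(signal, reconstruction))
```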
Why is this cool??
- unsupervised: I never had to label which data contained anomalies and which did not
- trained with the anomaly in the data: as long as most of the data is normal, the algorithm proved robust enough to learn the signal's pattern even with the anomaly present
- no domain knowledge applied: no expert in this kind of time series provided input on how to analyze it
More details for the technically inclined:
- training algo: RMSprop
- input noise added
- the network is an LSTM autoencoder
- it's a fairly small network
- code: theanets
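For the curious, here is roughly what that setup could look like in theanets. This is a sketch from memory, not the thesis's actual code: the window shape, layer sizes, noise level, learning rate, and the signal.npy path are all illustrative guesses, and the keyword names follow the theanets 0.7-era docs as I recall them, so check them against your version:

```python
import numpy as np
import theanets

signal = np.load('signal.npy')  # hypothetical path to the 1-D physiologic signal

def make_windows(signal, steps=64, stride=8):
    """Slice the 1-D signal into overlapping windows shaped
    (num-examples, num-time-steps, num-features), which is what
    theanets' recurrent models expect, as I recall."""
    windows = [signal[i:i + steps] for i in range(0, len(signal) - steps + 1, stride)]
    return np.array(windows, dtype='f')[:, :, None]

# A fairly small LSTM autoencoder: 1 input feature -> LSTM -> 1 output feature.
net = theanets.recurrent.Autoencoder(layers=(1, dict(form='lstm', size=32), 1))

# RMSprop with input noise, as in the post; the other values are guesses.
windows = make_windows(signal)
net.train(windows, algo='rmsprop', input_noise=0.1, learning_rate=0.001)

reconstruction = net.predict(windows)  # feed into the error scoring sketched above
```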
And that's my master's thesis in one graph!
---
update 12/2018:
This post has been getting a lot of attention, which I appreciate. However, I feel obliged to point readers to more recent research that has largely superseded RNNs. A good introduction to that work is The Fall of RNN LSTM.