This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

Series with periodic #85

Open
hit9 opened this issue May 8, 2014 · 7 comments

Comments


hit9 commented May 8, 2014

For example, take a series with a period of 1 day: the value at 12:00 is a peak (e.g. 1000) and the value at 0:00 is 10. So 1000 at 12:00 should be normal, and 10 at 12:00 should be anomalous.

But Skyline treats 10 as normal.
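
To make the report concrete, here is a minimal sketch of why a check against the whole series misses this case. The triangle-shaped daily series, the helper names, and the plain 3-sigma rule are all assumptions made for illustration; this is not Skyline's actual code.

```python
import statistics

def triangle_day(hour):
    # Hypothetical daily shape: 10 at 0:00, rising linearly to 1000 at 12:00.
    return 10 + 990 * (1 - abs(hour - 12) / 12)

# Seven days of hourly datapoints.
history = [triangle_day(h) for _ in range(7) for h in range(24)]

def three_sigma_global(history, value):
    # Compare the value against the mean and standard deviation of the
    # whole series, ignoring the time of day.
    mean = statistics.mean(history)
    std = statistics.pstdev(history)
    return abs(value - mean) > 3 * std

# Both the midnight value and the noon value sit inside the global band,
# so a 10 arriving at 12:00 is not flagged.
print(three_sigma_global(history, 10))
print(three_sigma_global(history, 1000))
```

Because the daily swing itself inflates the standard deviation, any value between the trough and the peak looks normal no matter when it arrives.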

Contributor

astanway commented May 8, 2014

Pull requests accepted...

Seasonal algorithms are hard to automatically fit. Working on it, though...


Author

hit9 commented May 8, 2014

One way is to use a Fast Fourier Transform to detect the series's period, fetch the datapoints at the same phase, and then analyze that new dataset.

I am looking into it now...
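
The same-phase idea can be sketched as follows, assuming the period is already known (for example from an FFT) and that the series is evenly sampled. The function names, the synthetic data, and the 3-sigma rule are illustrative assumptions, not code from Skyline.

```python
import statistics

def same_phase_points(series, period, phase):
    # Every datapoint that falls at the same offset within its period.
    return series[phase % period::period]

def is_anomalous_at_phase(history, period, value):
    # `value` is the next datapoint, arriving right after `history`.
    phase = len(history) % period
    peers = same_phase_points(history, period, phase)
    mean = statistics.mean(peers)
    std = statistics.pstdev(peers)
    if std == 0:
        return value != mean
    return abs(value - mean) > 3 * std

def sample(day, hour):
    # Hypothetical hourly series: daily triangle wave plus a small jitter.
    return 10 + 990 * (1 - abs(hour - 12) / 12) + (day % 3 - 1) * 5

# Seven full days, then day eight up to 11:00, so the next point is 12:00.
history = [sample(d, h) for d in range(7) for h in range(24)]
history += [sample(7, h) for h in range(12)]

print(is_anomalous_at_phase(history, 24, 10))    # 10 at 12:00
print(is_anomalous_at_phase(history, 24, 1000))  # 1000 at 12:00
```

Here the new value is compared only with earlier 12:00 values, so the daily swing no longer widens the band.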

Contributor

astanway commented May 8, 2014

Yep! That's what I was leaning towards - use FFT to get periodicity, and maybe use that to populate an ARIMA or use a KS test along windowed intervals? cc @toufic


Author

hit9 commented May 8, 2014

I'm not so sure about the last question, but for detecting periodicity I got some information from: http://stackoverflow.com/questions/15261122/determine-frequency-from-signal-data-in-matlab

And this function may help:

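import numpy as np

def guess_period(x):
    x = np.array(x)
    n = np.size(x)
    m = np.mean(x)
    # Subtract the mean so the zero-frequency (DC) bin does not dominate.
    p = np.abs(np.fft.fft(x - m))
    # Index of the strongest frequency bin.
    i = np.argmax(p)
    # i == 0 means no dominant non-zero frequency; return None implicitly.
    if i:
        return n / float(i)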

This might give a series's period, but it sometimes fails:

>>> x = [1, 20, 2, 20, 1, 21, 2, 22, 1, 19]
>>> guess_period(x)
2.0
>>> import itertools
>>> source = itertools.cycle([1, 10, 20, 10, 1])
>>> x = [source.next() for _ in range(101)]
>>> guess_period(x)
5.05
>>> x = [source.next() for _ in range(103)]
>>> guess_period(x)
4.904761904761905
>>> x = [source.next() for _ in range(105)]
>>> guess_period(x)
1.25  # fails

I think we can maintain a dict ({period: hit_times}); the period that is hit most often wins.
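
A rough sketch of that voting idea, under several assumptions that are not in the original function: the guess is run over sliding windows, each guess is rounded to the nearest integer before counting, and `np.fft.rfft` is used so that only non-negative frequencies are considered. The 1.25 result above equals 105 / 84, which is consistent with `argmax` landing on the mirrored bin 84 = 105 - 21; restricting to `rfft` avoids that case. The names `vote_period`, `window`, and `step` are hypothetical.

```python
from collections import Counter
from itertools import cycle, islice

import numpy as np

def guess_period(x):
    x = np.asarray(x, dtype=float)
    if x.size < 2:
        return None
    # rfft returns only the non-negative frequency bins.
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    # Skip bin 0 (DC) and take the strongest remaining bin.
    i = int(np.argmax(spectrum[1:])) + 1
    if spectrum[i] == 0:
        return None  # flat series, no period to report
    return x.size / i

def vote_period(series, window, step=1):
    # {period: hit_times}, as suggested above; the most-hit period wins.
    votes = Counter()
    for start in range(0, len(series) - window + 1, step):
        period = guess_period(series[start:start + window])
        if period is not None:
            votes[int(round(period))] += 1
    return votes.most_common(1)[0][0] if votes else None

series = list(islice(cycle([1, 10, 20, 10, 1]), 200))
print(vote_period(series, window=101))
```

Rounding to integers is a design choice that suits evenly sampled metrics whose period is a whole number of samples; a series with a fractional period would need a different bucketing scheme.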

Contributor

astanway commented May 8, 2014

Awesome. You can use Crucible (github.com/astanway/crucible) to refine the algorithm.

Author

hit9 commented Jul 16, 2014

Any progress on this?

Author

hit9 commented Sep 1, 2014

Hi @astanway, I have created another monitor similar to skyline: https://github.com/eleme/node-bell. It's only for periodic metrics, and the only algorithm it uses is 3-sigma. Thanks for this project, it gave me a lot of ideas!
