Add a generic wrapper for forecasting classes. #162
base: master
Conversation
Codecov Report
@@ Coverage Diff @@
## master #162 +/- ##
==========================================
- Coverage 88.39% 88.34% -0.05%
==========================================
Files 185 185
Lines 13566 13595 +29
==========================================
+ Hits 11992 12011 +19
- Misses 1574 1584 +10
Thanks for the PR @mertozer94. Pinging @jmread to get feedback on this.
This looks promising!
Thanks for the advice. I will add some test cases as soon as possible. Also, next time I will try to run 'codecov' locally, but are there any other things I should check before pushing my commits, like code coverage?
Thank you for the response. You are right, I meant (x=x[t-1], y=x[t+k]). I will modify it right now. But I am open to discussion on this subject.
Actually, I think I responded a bit too quickly. What I had in mind first was that training was done via (x=x[t-1], y=x[t]), and since I was thinking that forecasting was basically predicting the next value k times, we could simply call the predict function repeatedly, k times. I now see that this is not possible. I realize that what I had in mind was (x=x[t-k:t-1], y=x[t]). As I said, I am open to discussion on this one.
I think the most generic option is (x=x[t-k:t-1], y=x[t:t+l]), to allow multi-step forecasting into the future, since multi-output learning is also an integral part of scikit-multiflow (of course, l=1 would be a nice default). But maybe the t:t+l part could be an extension of the wrapper.
I agree that this is more general. But this means that users need to specify
Yes, I believe that. There are numerous options that could be considered to extend this even further. For example, y=x[t+l] (a single, but long-range forecast). This has some relation to delayed labels. But this kind of flexibility could be added later. Even the simplest case with k=1, l=1 will open up many possibilities for the framework, leveraging existing methods.
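The windowed pairing scheme discussed above can be sketched as a small helper that slices a raw stream into inputs of the last k observations and targets of the next l observations. The function name and signature here are illustrative only, not the PR's actual API:

```python
import numpy as np

def make_forecast_pairs(stream, k, l=1):
    """Build (x, y) training pairs from a raw 1-D stream.

    x is the window of the k most recent observations and y is the
    next l observations, i.e. x = stream[t-k:t], y = stream[t:t+l].
    (Illustrative helper; not part of the PR's code.)
    """
    X, Y = [], []
    for t in range(k, len(stream) - l + 1):
        X.append(stream[t - k:t])
        Y.append(stream[t:t + l])
    return np.array(X), np.array(Y)

# Example: stream 0..9, window k=3, horizon l=2
X, Y = make_forecast_pairs(list(range(10)), k=3, l=2)
print(X[0], Y[0])   # [0 1 2] [3 4]
```

With l=1 this reduces to one-step-ahead pairs, which matches the simplest case mentioned in the comment.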
064e82c to 6eb7f01
Hey @jmread, I have included the changes. However, for the unit tests, I can only imagine testing it via its subclasses. Do you have any recommendations on that?
Other than testing the handling of different values for the parameters k and l, indeed most of the testing can be done with the actual methods employed for forecasting.
I have added unit tests without creating a subclass. |
This is a base class where a raw stream of inputs x is transformed into a data stream x, y1, y2, ..., yk, where k is defined as a parameter to the filter (the number of steps to forecast). Training of the model uses x=x[t-k:t-1], y=x[t:t+l] and is transparent to the user.
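One way such a wrapper could look is a class that buffers the last k observations and, on each new value, trains a wrapped incremental regressor on (window, next-l-values) pairs. All names below are hypothetical; the PR's actual class and signatures may differ, and any model exposing partial_fit/predict could be plugged in:

```python
import numpy as np

class ForecastingWrapper:
    """Illustrative sketch of a generic forecasting wrapper
    (hypothetical names; not the PR's actual implementation).

    Buffers incoming values; each update trains the wrapped
    incremental regressor on (x = previous k values, y = next l
    values). Forecasting rolls the window forward, feeding the
    model's own predictions back in as inputs.
    """

    def __init__(self, regressor, k=1, l=1):
        self.regressor = regressor   # any model with partial_fit/predict
        self.k = k
        self.l = l
        self._buffer = []

    def partial_fit(self, value):
        self._buffer.append(float(value))
        # Need k past values plus l targets before training is possible.
        if len(self._buffer) >= self.k + self.l:
            x = np.array(self._buffer[-(self.k + self.l):-self.l]).reshape(1, -1)
            y = np.array(self._buffer[-self.l:])
            self.regressor.partial_fit(x, y if self.l > 1 else y.ravel())
        return self

    def predict(self, n_steps=1):
        # Roll the window forward, reusing predictions as inputs.
        window = list(self._buffer[-self.k:])
        out = []
        for _ in range(n_steps):
            y_hat = self.regressor.predict(np.array(window).reshape(1, -1))
            out.append(float(np.ravel(y_hat)[0]))
            window = window[1:] + [out[-1]]
        return out
```

This keeps the stream-to-supervised transformation hidden from the user, as the commit message describes, while leaving the choice of the underlying incremental learner open.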
c79e630 to e3c413d
Hello @jacobmontiel @jmread, it's been a long time, but do you have any idea on how we can improve the code coverage?
@mertozer94 Can you share a sample code for the forecasting model which can handle stream data in an online fashion and is compatible with other functions of multiflow, like DriftDetectors? |
@mertozer94 I am unable to check out this commit; it throws a "fatal: reference is not a tree" error even when using the correct SHA. Can someone help?
Issue
This is a base class where a raw stream of inputs x is transformed into a data stream x, y1, y2, ..., yk, where k is a parameter giving the number of steps to forecast, and training is done as (x=x[t-1], y=x[t+k]) at moment t. Training of the model is also transparent to the user.
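The single-lag pairing in the issue text, where each input x[t-1] is paired with the value k steps ahead, can be sketched in a few lines (illustrative helper only, not part of the PR):

```python
def single_lag_pairs(stream, k):
    """Pair each value with the one k steps ahead:
    (x = stream[t], y = stream[t+k]). Illustrative only."""
    return [(stream[t], stream[t + k]) for t in range(len(stream) - k)]

pairs = single_lag_pairs([10, 20, 30, 40, 50], k=2)
# → [(10, 30), (20, 40), (30, 50)]
```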
Changes proposed in this pull request:
Checklist