Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a generic wrapper for forecasting classes. #162

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mertozer94
Copy link
Contributor

@mertozer94 mertozer94 commented Nov 23, 2019

Issue

This is a base class where raw stream of inputs x are transformed into a data stream x, y1,y2,...,yk where k is defined as a parameter to the number of steps to forecast and the training is done such as (x=x[t-1], y=x[t+k]) at moment t. Also training of the model is trivial to the user.

Changes proposed in this pull request:

  • A new base class for forecasting.

Checklist

  • Code complies with PEP-8 and is consistent with the framework.
  • Code is properly documented.
  • Tests are included for new functionality or updated accordingly.
  • Travis CI build passes with no errors.
  • Test Coverage is maintained (threshold is -0.2%).
  • Files changed (update, add, delete) are in the PR's scope (no extra files are included).

@codecov
Copy link

codecov bot commented Nov 23, 2019

Codecov Report

Merging #162 into master will decrease coverage by 0.04%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master     #162      +/-   ##
==========================================
- Coverage   88.39%   88.34%   -0.05%     
==========================================
  Files         185      185              
  Lines       13566    13595      +29     
==========================================
+ Hits        11992    12011      +19     
- Misses       1574     1584      +10     
Impacted Files Coverage Δ
src/skmultiflow/core/base.py 77.37% <62.96%> (-2.01%) ⬇️
src/skmultiflow/core/__init__.py 100.00% <100.00%> (ø)

@jacobmontiel
Copy link
Contributor

Thanks for the PR @mertozer94

Pinging @jmread to get feedback on this.

Note: codcov error is due to missing tests for the new code.

@jmread
Copy link
Contributor

jmread commented Nov 27, 2019

This looks promising!
In your comment above, I think you mean (x=x[t-1], y=x[t+k])? (To take into account k). Or you do mean (x=x[t-k:t-1], y=x[t]). Or a combination of these (there is one k or two?).

@mertozer94
Copy link
Contributor Author

Thanks for the PR @mertozer94

Pinging @jmread to get feedback on this.

Note: codcov error is due to missing tests for the new code.

Thanks for the advice I will add some test cases as soon as possible. Also next time I will try to run locally 'codecov' but are there any other things that I should better check before pushing my commits? Like code coverage etc .

@mertozer94
Copy link
Contributor Author

This looks promising!
In your comment above, I think you mean (x=x[t-1], y=x[t+k])? (To take into account k). Or you do mean (x=x[t-k:t-1], y=x[t]). Or a combination of these (there is one k or two?).

Thank you for the response. You are right, I meant (x=x[t-1], y=x[t+k]). I will modify it right now. But I am open for discussion on this subject.

@mertozer94
Copy link
Contributor Author

This looks promising!
In your comment above, I think you mean (x=x[t-1], y=x[t+k])? (To take into account k). Or you do mean (x=x[t-k:t-1], y=x[t]). Or a combination of these (there is one k or two?).

Thank you for the response. You are right, I meant (x=x[t-1], y=x[t+k]). I will modify it right now. But I am open for discussion on this subject.

Actually, I think I have responded a bit quickly. What I had in my mind first was the training was done via (x=x[t-1], y=x[t]), and since I was thinking that forecasting was basically k times predicting the next value we could simply call predict function repeatedly k times. I now see that this is not possible.

I realize that I had in mind (x=x[t-k:t-1], y=x[t]). As I said, I am open for discussion on this one.

@jmread
Copy link
Contributor

jmread commented Dec 1, 2019

I realize that I had in mind (x=x[t-k:t-1], y=x[t]). As I said, I am open for discussion on this one.

I think the most generic option is (x=x[t-k:t-1], y=x[t::t+l]) to allow multi-step forecasting into the future since multi-output learning is also an integral part of scikit-multiflow (of course, l=1 would be a nice default). But maybe the t:t+l part could be an extension of the wrapper.

@mertozer94
Copy link
Contributor Author

I realize that I had in mind (x=x[t-k:t-1], y=x[t]). As I said, I am open for discussion on this one.

I think the most generic option is (x=x[t-k:t-1], y=x[t::t+l]) to allow multi-step forecasting into the future since multi-output learning is also an integral part of scikit-multiflow (of course, l=1 would be a nice default). But maybe the t:t+l part could be an extension of the wrapper.

I agree that this is more generalized. But this means that, users needs to specify k and l before the training phase. Like you said, if they are not present, we can have some default values for them.

@jmread
Copy link
Contributor

jmread commented Dec 4, 2019

I agree that this is more generalized. But this means that, users needs to specify k and l before the training phase. Like you said, if they are not present, we can have some default values for them.

Yes, I believe that k=1 and l=1 will make good defaults.

There are numerous options that could be considered to extend this even further. For example, y=x[t+l] (a single, but long-range forecast). This has some relation to delayed labels. But this kind of flexibility could be added later. Even the simplest case with k=1,l=1 will open up many possibilities for the framework leveraging existing methods.

@mertozer94
Copy link
Contributor Author

Hey @jmread, I have included changes. Although, for the unit test, I can only imagine to test it via it's sub classes. Do you have any recommendation on that?

@jmread
Copy link
Contributor

jmread commented Jan 5, 2020

Hey @jmread, I have included changes. Although, for the unit test, I can only imagine to test it via it's sub classes. Do you have any recommendation on that?

Other than testing the handling of different values for the parameters k, l, indeed most of the testing can be done with the actual methods employed for the forecasting.

@mertozer94
Copy link
Contributor Author

Hey @jmread, I have included changes. Although, for the unit test, I can only imagine to test it via it's sub classes. Do you have any recommendation on that?

Other than testing the handling of different values for the parameters k, l, indeed most of the testing can be done with the actual methods employed for the forecasting.

I have added unit tests without creating a subclass.

This is a base class where raw stream of inputs x are transformed into a data stream x, y1,y2,...,yk where k is defined as a parameter to the filter (number of steps to forecast) and training of the model is x=x[t-k:t-1], y=x[t::t+l] and it is trivial to the user
@mertozer94
Copy link
Contributor Author

Hello @jacobmontiel @jmread It's been a long time, but do you have any idea on how we can improve the code coverage ?

Hey @jmread, I have included changes. Although, for the unit test, I can only imagine to test it via it's sub classes. Do you have any recommendation on that?

Other than testing the handling of different values for the parameters k, l, indeed most of the testing can be done with the actual methods employed for the forecasting.

I have added unit tests without creating a subclass.

@AnkurDebnath35
Copy link

@mertozer94 Can you share a sample code for the forecasting model which can handle stream data in an online fashion and is compatible with other functions of multiflow, like DriftDetectors?

@AnkurDebnath35
Copy link

@mertozer94 I am unable to checkout this commit, throws out a "fatal:reference is not a tree" error while using the correct sha. Can someone help?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants