First generation is ignored #34

Open
ClimbsRocks opened this issue Sep 8, 2017 · 4 comments


@ClimbsRocks
Contributor

A duplicate of #27 but hopefully easier to understand with a reproducible code block.

to reproduce, simply set generations_number=1 in the test.ipynb notebook.

when you do, you'll see the following:
image

there are a number of things to note here (presumably all coming from the same root error of the first generation being skipped)

  1. despite generations_number=1, it actually runs for 2 generations
  2. the .cv_results_ table only has 22 rows, which is the number of evaluations in the second generation. it should have at least 50 rows (the number of evaluations in the first generation)
  3. there's an intermittent error being thrown: AttributeError: 'EvolutionaryAlgorithmSearchCV' object has no attribute 'best_score_'. i haven't yet figured out when this error is thrown and when it is not, but given that nobody else has complained about it, i'm hoping it's also related to having just 1 generation. it does work sometimes though, even with just 1 generation
  4. the information for the 0th generation does appear to be saved and available for the statement that's printing "best individual with fitness X", even when it's not available in .cv_results_. see the screenshot below
    image
    it's also worth noting on this same point that i have sometimes seen .best_score_ print out results that only report the best score from the second generation (which would be 0.923205 in this one-off example i created just for point number 4).

the full .cv_results_ table is below. interestingly, it seems to have some awareness that 50 other rows should be present (the index column starts at 51), but there are only 22 rows present:
image
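to make the numbers above concrete, here's a toy sketch (plain python, not sklearn-deap code) of how this pattern could come about. it assumes every evaluated individual gets a running index, but only evaluations after generation 0 are written to the results table. the function name `run_toy_search` and its parameters are hypothetical, and the 50 and 22 are just the counts from this report:

```python
def run_toy_search(population_size, offspring_per_generation, generations, record_initial):
    """Toy loop: return the rows that end up in the results table."""
    next_index = 1  # running index, incremented for every evaluation
    recorded = []

    def evaluate(generation, record):
        nonlocal next_index
        if record:
            recorded.append({"index": next_index, "generation": generation})
        next_index += 1  # the index advances even when the row is dropped

    # generation 0: the whole initial population is evaluated
    for _ in range(population_size):
        evaluate(0, record_initial)

    # later generations: only new offspring are evaluated
    for generation in range(1, generations + 1):
        for _ in range(offspring_per_generation):
            evaluate(generation, True)

    return recorded


buggy = run_toy_search(50, 22, generations=1, record_initial=False)
fixed = run_toy_search(50, 22, generations=1, record_initial=True)

print(len(buggy), buggy[0]["index"])  # 22 51
print(len(fixed), fixed[0]["index"])  # 72 1
```

if the real code behaves like the `record_initial=False` case, that would explain both the 22 rows and the index starting at 51. this is only a guess at the mechanism, and it doesn't settle whether generations_number=1 is meant to include a second generation at all.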

i'm hoping this is all an easily-fixed off-by-one error somewhere.

@ClimbsRocks
Contributor Author

@ryanpeach and @rsteca hopefully this is a clearer bug report than before! i'm back from vacation now, and should hopefully be a bit more responsive.

@ryanpeach
Contributor

Very interesting, thanks for the extra info. I bet you're onto something with those missing index numbers. It's entirely possible we are not fully scraping the history file into the outputs. Unfortunately I won't have time to examine it for a while, I just got a new job! But I'll try whenever I get the chance. Feel free to dig into the source code. It's just one file, so even if it's complicated it's not a lot to learn.

@ClimbsRocks
Contributor Author

@ryanpeach congrats on the new job! i've got a lead on the possible cause, if anyone's got a minute to check it out: https://github.com/rsteca/sklearn-deap/blob/master/evolutionary_search/cv.py#L416

it looks like we've registered only the mutation and mate stage, and not the initial population. it looks like we try to update the history with the population, but it doesn't appear to work.
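for anyone following along, here's a minimal sketch of the pattern i mean. DEAP's `tools.History` exposes a `decorator` (to wrap operators such as mate and mutate) and an `update` method (to register individuals directly). the `ToyHistory` class below is a pure-python stand-in i wrote for illustration, not DEAP's implementation, and the `mutate` operator and `count_records` helper are hypothetical. it only shows what happens to the record count when the initial population never reaches the history:

```python
import functools


class ToyHistory:
    """Stand-in for a genealogy history: stores every individual it is shown."""

    def __init__(self):
        self.records = {}
        self._next_index = 1

    def update(self, individuals):
        # Register individuals directly (what the initial population needs).
        for individual in individuals:
            self.records[self._next_index] = list(individual)
            self._next_index += 1

    def decorator(self, operator):
        # Wrap a variation operator so everything it returns is registered.
        @functools.wraps(operator)
        def wrapped(*args, **kwargs):
            result = operator(*args, **kwargs)
            self.update(result)
            return result
        return wrapped


def mutate(individual):
    """Hypothetical operator: flip the first bit and return a 1-tuple."""
    child = list(individual)
    child[0] = 1 - child[0]
    return (child,)


def count_records(register_initial_population):
    history = ToyHistory()
    tracked_mutate = history.decorator(mutate)
    population = [[0, 0], [0, 1], [1, 1]]
    if register_initial_population:
        history.update(population)  # without this, generation 0 is invisible
    for individual in population:
        tracked_mutate(individual)
    return len(history.records)


print(count_records(False))  # 3
print(count_records(True))   # 6
```

in the toy version the initial population only shows up when `update` is called on it explicitly. whether that's what is going wrong in cv.py, or whether the update happens but gets lost later, i haven't confirmed.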

hope this is an easy fix for someone! maybe @rsteca can figure this out.

@rsteca
Owner

rsteca commented Sep 14, 2017

Sorry, I don't have time right now to look into this. But if someone can figure this out and opens a pull request, I will gladly merge it.
