This repository was archived by the owner on Sep 18, 2024. It is now read-only.

btel commented Oct 15, 2019

Summary

This PR proposes to clarify how the validation set is generated in flow_from_directory when the validation_split option is used. This can matter when the image files in a directory have meaningful names (for example, when the names encode a particular instance of an object). One can imagine the following organisation of files:

Cats/
    balinese1.jpg
    balinese2.jpg
    siamese1.jpg
    siamese2.jpg
Dogs/
    Shepard1.jpg
    Shepard2.jpg
    Terrier2.jpg

In such a setting, the validation set might get all the siamese cats and terrier dogs, while the training set might get only the balinese cats and shepherd dogs. This would heavily affect validation metrics, because the model would be validated on instances it never saw during training.
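
To illustrate the concern, here is a minimal sketch of a position-based split. It assumes the split is taken as a contiguous slice of the sorted file names within each class directory; the helper name split_by_position is hypothetical and is not part of the library, and the sketch does not claim to reproduce the library's exact behaviour.

```python
def split_by_position(filenames, validation_split):
    """Hypothetical helper: split a sorted file list into two contiguous parts.

    Assumption: the validation subset is a contiguous slice of the sorted
    names. Which end of the list is used does not change the conclusion.
    """
    ordered = sorted(filenames)
    n_val = int(len(ordered) * validation_split)
    training = ordered[n_val:]
    validation = ordered[:n_val]
    return training, validation


cats = ["balinese1.jpg", "balinese2.jpg", "siamese1.jpg", "siamese2.jpg"]
train, val = split_by_position(cats, validation_split=0.5)

print(train)  # ['siamese1.jpg', 'siamese2.jpg']
print(val)    # ['balinese1.jpg', 'balinese2.jpg']
```

Because the names sort by breed, each subset ends up containing a single breed. Shuffling the files before splitting, or naming them so that the sort order does not follow the instance, would avoid this under the same assumption.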

Related Issues

PR Overview

  • [n] This PR requires new unit tests [y/n] (make sure tests are included)
  • [y] This PR requires to update the documentation [y/n] (make sure the docs are up-to-date)
  • [y] This PR is backwards compatible [y/n]
  • [n] This PR changes the current API [y/n] (all API changes need to be approved by fchollet)
