
Fix shape mismatch error during backpropagation in MLP optimizer #96

Open

wants to merge 1 commit into master
Conversation

achal-khanna

This submission addresses the issue tracked in #78.

Root Cause

In optimizers such as Adam and SGD, a single self.cache was shared across all layers, and its keys were simply W and b. When different layers updated their parameters, they all read and wrote the same cache entries, so the optimizer state accumulated for one layer collided with the differently shaped parameters of another. Because the updates for different layers were not isolated, this produced shape mismatches during backpropagation.

For instance, the cache should have unique keys like layer1-W, layer1-b, layer2-W, etc., but instead, all parameters were using the same keys, resulting in conflicts during backpropagation.
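The collision can be reproduced with a minimal sketch (the SGD class and key names here are illustrative, not the repo's actual code): one optimizer instance caches momentum under the bare key "W", so the second layer's update tries to combine its gradient with the first layer's cached state.

```python
import numpy as np

# Hypothetical momentum-SGD illustrating the bug: a single cache
# keyed only by parameter name ("W"), shared by every layer.
class SGD:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.cache = {}  # shared across all layers

    def update(self, key, param, grad):
        # velocity is looked up by bare key, regardless of which layer calls
        v = self.cache.get(key, np.zeros_like(param))
        v = self.momentum * v - self.lr * grad
        self.cache[key] = v
        return param + v

opt = SGD()
W1, g1 = np.ones((4, 3)), np.ones((4, 3))
W2, g2 = np.ones((3, 2)), np.ones((3, 2))

opt.update("W", W1, g1)      # cache["W"] now holds a (4, 3) velocity
try:
    opt.update("W", W2, g2)  # same key, (3, 2) gradient -> broadcast error
except ValueError as e:
    print("shape mismatch:", e)
```

Prefixing the keys per layer (layer1-W, layer2-W, …) or giving each layer its own cache avoids the collision.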

Solution

The solution ensures that each layer maintains its own cache: during initialization, each layer is given a deepcopy of the optimizer, so every layer manages its optimizer state independently and no cache entries are shared between layers.
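A self-contained sketch of that fix, under assumed class and method names (Dense, step, and the SGD class below are illustrative, not the repo's actual API): the layer deep-copies the optimizer it receives, so each layer owns an independent cache.

```python
import copy
import numpy as np

# Minimal momentum-SGD with a per-instance cache (illustrative).
class SGD:
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr, self.momentum = lr, momentum
        self.cache = {}

    def update(self, key, param, grad):
        v = self.cache.get(key, np.zeros_like(param))
        v = self.momentum * v - self.lr * grad
        self.cache[key] = v
        return param + v

# Hypothetical layer: deep-copies the optimizer at init, so its cache
# is never shared with any other layer.
class Dense:
    def __init__(self, in_dim, out_dim, optimizer):
        self.W = np.zeros((in_dim, out_dim))
        self.optimizer = copy.deepcopy(optimizer)  # per-layer copy

    def step(self, grad_W):
        self.W = self.optimizer.update("W", self.W, grad_W)

template = SGD()
l1, l2 = Dense(4, 3, template), Dense(3, 2, template)
l1.step(np.ones((4, 3)))
l2.step(np.ones((3, 2)))  # no collision: each layer has its own cache
```

Deep-copying trades a little memory (one state dict per layer) for isolation; the alternative of namespacing the keys (layer1-W, layer2-W, …) keeps a single optimizer but requires threading a unique layer identifier through every update call.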

All Submissions

  • Is the code you are submitting your own work?
  • Have you followed the contributing guidelines?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Changes to Existing Models

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes, as applicable?
  • Have you successfully run tests with your changes locally?
