ECS capacity errors have inconsistent behavior. #3047

Closed
bensherman opened this issue Jun 17, 2024 · 2 comments
Labels
documentation This is a problem with documentation. service-api General API label for AWS Services.

bensherman commented Jun 17, 2024

Describe the bug

If ECS managed scaling is not being used and a task can't be placed because of a capacity error, a nice, parsable error message is generated and added to the response payload.

If ECS managed scaling is being used and there is a capacity error because the per-cluster quota of tasks in the PROVISIONING state (500) has been reached, a ClientException (Aws::ECS::Errors::ClientException) is raised instead.

This is unclear in the documentation.

Expected Behavior

A capacity error should return an error, not raise an exception.

Current Behavior

If ECS managed scaling is being used and the per-cluster quota of tasks in the PROVISIONING state (500) is reached, a ClientException is raised.

Reproduction Steps

This code does not work on managed clusters. An exception must be caught and parsed to handle it.

def run
  result = ecs_client.run_task(
    task_definition: @name,
    cluster: @cluster,
    placement_strategy: [
      { field: "cpu", type: "binpack" }
    ],
    count: count
  )
  if result.failures.empty?
    return { success: true }
  end
  # ... parse the result and handle the retry if needed.
end
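
For comparison, here is a minimal sketch of the kind of wrapper this forces on callers, so that both capacity paths come back in the same shape. It is not part of the SDK: `run_task_normalized` and `PROVISIONING_LIMIT_MESSAGE` are hypothetical names, the message text is copied from the backtrace in this issue, and matching on that text is an assumption that breaks if the service changes the wording. The stand-in error class exists only so the sketch runs without the gem installed.

```ruby
begin
  require "aws-sdk-ecs"
rescue LoadError
  # Stand-in so this sketch runs without the gem. It mirrors the
  # (context, message) constructor used by the SDK's service errors.
  module Aws
    module ECS
      module Errors
        class ClientException < StandardError
          def initialize(_context, message, _data = nil)
            super(message)
          end
        end
      end
    end
  end
end

# Assumption: text taken from the backtrace in this issue, not a documented contract.
PROVISIONING_LIMIT_MESSAGE = "Tasks provisioning capacity limit exceeded"

# Hypothetical wrapper: returns { success:, failures: } whether the capacity
# problem arrives in result.failures or as a raised ClientException.
def run_task_normalized(ecs_client, **params)
  result = ecs_client.run_task(**params)
  return { success: true, failures: [] } if result.failures.empty?

  { success: false, failures: result.failures.map(&:reason) }
rescue Aws::ECS::Errors::ClientException => e
  # ClientException also covers other client-side problems; let those propagate.
  raise unless e.message.include?(PROVISIONING_LIMIT_MESSAGE)

  { success: false, failures: [e.message] }
end
```

With this in place, the caller can branch on `success` alone and keep the retry logic in one spot, for example `run_task_normalized(ecs_client, cluster: @cluster, task_definition: @name)`.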

Possible Solution

No response

Additional Information/Context

This is the backtrace from my code up:

Aws::ECS::Errors::ClientException: Tasks provisioning capacity limit exceeded.
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/seahorse/client/plugins/raise_response_errors.rb in call at 17
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/aws-sdk-core/plugins/checksum_algorithm.rb in call at 111
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/aws-sdk-core/plugins/jsonvalue_converter.rb in call at 16
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/aws-sdk-core/plugins/idempotency_token.rb in call at 19
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/aws-sdk-core/plugins/param_converter.rb in call at 26
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/seahorse/client/plugins/request_callback.rb in call at 71
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/aws-sdk-core/plugins/response_paging.rb in call at 12
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/seahorse/client/plugins/response_target.rb in call at 24
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-core-3.159.0/lib/seahorse/client/request.rb in send_request at 72
  /usr/share/rbenv/versions/3.0.6/lib/ruby/gems/3.0.0/gems/aws-sdk-ecs-1.102.0/lib/aws-sdk-ecs/client.rb in run_task at 6731
  /app/app/models/task_definition.rb in run at 14

Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version

aws-sdk-core (3.159.0), aws-sdk-ecs (1.102.0)

Environment details (Version of Ruby, OS environment)

Ruby 3.0.6, Ubuntu Linux.

@bensherman bensherman added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 17, 2024
@RanVaknin RanVaknin self-assigned this Jun 18, 2024
@RanVaknin

Hi @bensherman,

Thanks for reaching out. The SDK maps the returned service error to the predefined list of errors that the service specifies in its API model.

In ECS's RunTask documentation, ClientException is in fact a modeled exception that can be returned for this specific operation. While I agree that the service should have modeled a more specific exception for this case, the SDK team cannot introduce new exceptions, because the SDK is code-generated directly from each service's API model.

From a service perspective, changing the returned exception would be considered a breaking change, since some customers may be explicitly relying on this error when handling capacity errors.

Because of this, this is not actionable by the SDK team.

I do, however, agree that the documentation is too narrow and does not cover this specific case:

ClientException

These errors are usually caused by a client action. This client action might be using an action or resource on behalf of a user that doesn't have permissions to use the action or resource. Or, it might be specifying an identifier that isn't valid.

HTTP Status Code: 400
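
Because that definition also covers permission and identifier problems, code that wants to retry only the provisioning-limit case currently has to inspect the message. A rough sketch of that, under stated assumptions: `with_capacity_retry` and `CAPACITY_LIMIT_TEXT` are hypothetical names, the message text comes from the backtrace in this issue and is not a documented contract, and the delays are arbitrary. The error class is passed in so the sketch has no gem dependency. In real use it would be `Aws::ECS::Errors::ClientException`.

```ruby
# Assumption: wording taken from the backtrace in this issue; it may change.
CAPACITY_LIMIT_TEXT = "Tasks provisioning capacity limit exceeded"

# Hypothetical helper: runs the block and retries with exponential backoff,
# but only when error_class is raised with the capacity-limit message.
# Any other error, or the last failed attempt, is re-raised unchanged.
def with_capacity_retry(error_class:, max_attempts: 5, base_delay: 2.0,
                        sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    yield
  rescue error_class => e
    raise unless e.message.include?(CAPACITY_LIMIT_TEXT) && attempts < max_attempts

    # Wait base_delay, then 2x, 4x, ... before trying again.
    sleeper.call(base_delay * (2**(attempts - 1)))
    retry
  end
end

# Usage (requires the aws-sdk-ecs gem and a configured client):
#   with_capacity_retry(error_class: Aws::ECS::Errors::ClientException) do
#     ecs_client.run_task(cluster: cluster, task_definition: name)
#   end
```

Keeping the message check in one helper means only one place needs updating if the service later models a dedicated exception for this case.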

I will create a documentation request with the ECS service team to expand the definition of this exception to cover this undocumented edge case, but we cannot commit to a timeline for the fix, since the change would need to be made by the ECS team itself and not the SDK team. If you wish, you can submit additional documentation-related feedback using the Feedback button at the top right of each documentation page.

Thanks again,
Ran~

@RanVaknin RanVaknin added service-api General API label for AWS Services. documentation This is a problem with documentation. and removed needs-triage This issue or PR still needs to be triaged. bug This issue is a bug. labels Jun 18, 2024
@RanVaknin RanVaknin closed this as not planned Jun 18, 2024