Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single AppSync Api - Multi CloudFormation Stacks - Concurrency issue #595

Open
Gerroll opened this issue Apr 17, 2023 · 20 comments
Open

Single AppSync Api - Multi CloudFormation Stacks - Concurrency issue #595

Gerroll opened this issue Apr 17, 2023 · 20 comments

Comments

@Gerroll
Copy link

Gerroll commented Apr 17, 2023

Hi, thank you for this plugin!

My team and I are facing a bug when updating the GraphQL schema with multiple stacks while using the plugin. Since we have a huge project, we decided to split it into multiple entities and use the Serverless Compose Framework. Each entity has a dedicated stack with more or fewer resources, which allows us to deploy them individually. Additionally, we have a stack dedicated to AppSync to instantiate the API. The overall structure looks something like this :

serverless-compose.yml
services
  |  appsync
  |    |  schema.graphql
  |    |  serverless.ts
  |  entityX
  |    |  getEntityX
  |    |  addEntityX
  |    |  ...
  |    |  serverless.ts
  |  entityY
  |    |  getEntityY
  |    |  addEntityY
  |    |  ...
  |    |  serverless.ts

Using the Serverless Compose Framework, we first deploy the AppSync service to provide API information to all other services (Entity X, Entity Y). Then, we deploy all other services simultaneously still using the serverless-appsync-plugin(v1), which is configured to use the API ID. The goal of these services is to add resolvers, data sources, and pipelines to the AppSync API that was deployed in the AppSync service.

However, after deploying all these services (15 for now), we noticed that the GraphQL schema is being updated too frequently, causing some resolvers to become unattached from a mutation/query. When we contacted AWS support, they suggested that too much concurrent updating of the GraphQL schema could be the cause of our problem.

IMHO, the ideal solution would be to have an option to prevent updating the GraphQL schema. This would allow the plugin to only modify external configurations such as resolvers, data sources, and pipelines. It's possible that I have overlooked this solution, and it may have other implications for other features of AppSync. What are your thoughts on this?

(Edit): The goal is to configure the AppSync service to deploy only the schema, while other services do not update it.

@bboure
Copy link
Collaborator

bboure commented Apr 17, 2023

Hi,

What I'm not sure is why the service stacks update the schema?
Does each individual stack re-pass the (same?) GraphQL schema in the AppSync config?

@Gerroll
Copy link
Author

Gerroll commented Apr 17, 2023

Each stack is updating the same GraphQL schema, essentially making each stack a complete 'app' that uses the plugin once. This occurs a total of 15 times - once to create the API and the others to use the API ID to add resolvers.
Yes, they pass the same GraphQL schema between the stacks.

@jeremycare
Copy link

Hey @bboure, thanks for replying quickly.

We have the first service, in charge of deploying the API and the schema.

export const appSyncMain = {
  name: `api`,
  authenticationType: 'API_KEY',
  schema: '../appsync/dist/*.graphql',
  apiKeys: [
    {
      name: 'api-key',
      description: 'basic api key',
      expiresAfter: '30d',
    },
  ],
  mappingTemplatesLocation: '.',
  defaultMappingTemplates: {
    request: false,
    response: false,
  },
  mappingTemplates: [],
  dataSources: [],
  functionConfigurations: [],
};

Then we have all the other services, in charge of plugging resolvers and mapping templates to that API and schema.

export const appSyncService = ({
  apiId, dataSources, mappingTemplates, functionConfigurations,
}: AppSyncService) => ({
  apiId,
  schema: '../appsync/dist/*.graphql', // <---- HERE we have to pass the path of the schema again.
  mappingTemplatesLocation: '.',
  defaultMappingTemplates: {
    request: false,
    response: false,
  },
  mappingTemplates,
  dataSources,
  functionConfigurations,
});

The plugin requires specifying the path of the graphql schema. But that API / Schema has been deployed previously by the main service.

We end up with multiple updates on the schema. The AWS Core team told us this was provoking our issues with resolvers that got detached from the API.

We would like to know, how we can achieve to just do a reference to the APPSync API and Schema instead of modifying the resource.

Since we provide the apiId, does it make sense to require the schema? Do we need to give the resourceId of the schema instead?

@jeremycare
Copy link

Okay seems that the v2 doesn't support apiId anymore.

So I believe the plugin doesn't support our use case of AppSync right @bboure?

@jeremycare
Copy link

Should we fork and add our support ourselves or do you want us to do a PR to add this functionality?

@jeremycare
Copy link

Blank diagram - Page 1 (3)

Here's the architecture we have today and would like to have support with.

Also considering creating a custom plugin on our project that will correct the cloudformation stacks on top of this one to add a link instead of an update.

@Hideman85
Copy link

Hi there, I am looking forward to this because our team just reached the limit of resources per stack and so we have to split our mono service into multiple one keeping a single AppSync API.

Is there any progress on this? Workaround? or even post-install patches?

@jeremycare
Copy link

The workaround we are doing today is to do a depends_on in serverless-compose so we don't have concurrent updates (not ideal this the deployment time is very long now since we cannot do concurrent deployment on our microservices).

@jeremycare
Copy link

@Hideman85, I think it's a common topic on this repository, so you might find better solutions for your need in previous issues

@HumbleBeck
Copy link

Hey @jeremycare I have the same use case as you. I've opened a PR and hope it will be merged soon. For now, Im using my fork on dev, but so far, everything works fine. I also considered your point about schema and agree that it should be managed from one place.

@jeremycare
Copy link

@bboure Any thoughts on that?

@bboure
Copy link
Collaborator

bboure commented Apr 28, 2023

This is a common debate/topic 🙂

I purposefully removed support for passing an apiId in v2. In the past, it has been misused and would generate other issues. The problems you are facing are a symptom of that.
Serverless Framework (or any IaC) is not really made for deploying the same service into multiple stacks. An AppSync API should be considered as one service and it should not be broken apart.
When you look at how SF works "natively", there is no way to split an API Gateway into several stacks through compose (as far as I know).

Right now, the only valid use case I maybe see is when reaching the stack limit. In general, 500 should be plenty enough, but I understand that in some cases, it might be reached. The workaround/preferred way of solving that issue for now is using https://www.npmjs.com/package/serverless-plugin-split-stacks

Eventually, I think what most of you are looking for is API federation/API merging.
Unfortunately, this is not available today.

I'm open to discussing this further and would be happy to try and provide solutions for all, but at the same time I'd like avoid introducing pitfalls.

@Hideman85
Copy link

Yes indeed this looks to be a big topic and when you are not aware and just face the issue it can be really frustrating. So some feedback about our experience through this issue.

The first point about the limit 500, each lambda produces min 4 resources:

  • AWS::Lambda::Function
  • AWS::Logs::LogGroup
  • AWS::AppSync::DataSource
  • AWS::AppSync::Resolver

If you consider a pattern like CRUD that is like 16 resources per Type. In addition you tend to also link your types together with sub-resolver, provide feeds for realtime (most of the time to bypass the AppSync limitation you tend to do a none datasource called internally) you can really quickly reach the 500. Like us we have a relatively small project and boom 526, from one day to another, your deployment is stuck.

Split stacks plugin, I got a look at it and it feels to be more like a workaround than a permanent solution. Even the author would consider more optimal to handle it manually than using the plugin.

Now come, nested stacks vs standalone/root stacks. Actually, there is also a limitation that has been open here, hard limit of 2.5k resources updated in per CF update. Okey the limit start to be big enough to have already something.

What about the performance? When you update with nested stack you have to update everything and it takes ages. We made the choice of multi services and 1 AppSync because it greatly follows the split of concerns/responsibility and we are able to quickly deploy only the part that has changed and by the members/team that own it. I fairly not see a way to correctly split a GQL schema in 2+ APIs (how would you sub-resolve something if not in this API? + pretty annoying to swap endpoints, I just dont see it possible)

The last point about ApiGateway (havent used serverless for that we did manually in past) but in terms of CF this is the same, you have an ApiId that need to be linked to methods/resources and then you can create routes the same way AppSync does. In addition, in the past, not using serverless you can still cross stack import Outputs to another stack that how we dealt with it.

If we want to go forward, standalone stack mean you can fine tune deployment permission using service role that you cant in nested since it use the root one.

.

You are also speaking about miss use of apiId? Do you think we can detect the valid usage and throw an error for the other usages? Maybe the question is does apiId can be use if you are not using service composition? Lets detect a compose file and warn/error the user about it?
Maybe also a documentation about how properly splitting your API could help. Pros, cons, what best suite multi developers same API? Personally I faced the issue of the 500 I could not find great documentation/pattern, I had to search in issues and make my own experiments and choice.

Last thing, this is pretty hidden piece of doc, but we discovered that serverless support TS config file. So since we needed to migrate from v1 to v2 and explode everything we just refactor in TS, made some helper functions to declare and split the resources and now it works just fine with serverless-compose.
We are also using offline mode (with the patch/PR for v2) because this helps a lot when you are several members working on the API (where deploy in the same env would override each other, different envs extra costs...) In the end the FE have local version why not BE? Oh and huge benefit you can run in isolation full locally the entire of your application ecosystem with a bunch of integration testing that gives you high confidence. So that is pretty awesome works that helps a lot, and big thanks for those plugins.

@bboure
Copy link
Collaborator

bboure commented May 8, 2023

Thanks for all the feedback.

I will take it into consideration and start thinking/planning about what is the best way to handle this.

@bboure
Copy link
Collaborator

bboure commented May 8, 2023

After some research, I found this in the sls doc.
It looks like I was wrong when I said this is not supported for API Gateway.

I will go through this to see how we can "port" this to AppSync

@jeremycare
Copy link

It's important that the schema is not updatable when the APPSync API is referenced and not created by the plugin.

We had to talk to AppSync service team to troubleshoot our issues, and if there are multiple schema updates at the same time, cloud formation doesn't handle it correctly and some resolvers might disappear. (It caused us a lot of time to troubleshoot and find the issue for that kind of error, CF was not showing errors correctly :) )

@jeremycare
Copy link

Any update on that? @bboure

@plezan
Copy link

plezan commented Nov 14, 2024

@Hideman85 We are facing exactly the same issue with both the 500 resources stack limit and the offline mode for team work. Did you find any way to make it work ?
With SLS v3 dropping support pretty soon as well, I'm wondering if AppSync is still a good option for our needs

@jeremycare
Copy link

@plezan , we forked this plugin (on a private repo) to enable us to do the reference ourselves. It was not as bad as expected to add new params to do so.

We are also moving away from appsync in our project to API Gateway (We don't benefit enough from graphql feature to make the complexity worth it).

@plezan
Copy link

plezan commented Nov 29, 2024

Since the original PR seams to be dead, I forked/updated it.

It still needs a bit more testing but here is a draft PR: #647

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants