DXE-3042 Question about provider behaviour destroy then create - edgedns record #462

Open
hightoxicity opened this issue Aug 29, 2023 · 12 comments

@hightoxicity

hightoxicity commented Aug 29, 2023

Hi there,

I would like some details about the provider's behaviour on terraform apply.
I know that Terraform's default behaviour is to destroy things before creating new ones, and that behaviour is welcome in the use case explained below.

Terraform version: 1.2.9
Akamai provider version: 3.2.1

We regularly run into an issue with akamai_dns_record in the following situation:

  • Imagine a test.mydomain.com A record that already exists both on the Akamai Edge DNS side and in the Terraform state.
  • We convert this record from an A record to a CNAME.
  • The terraform plan tells us that it is about to destroy one record and create a new one.
  • On apply, it fails like this:
    • It starts by destroying the test.mydomain.com A record
    • Then it fails to create the CNAME with the following error:
│ Error: Recordset create failure
│ 
│   with akamai_dns_record.ak-cname-records["mydomain.com#test.mydomain.com"],
│   on cname.tf line 32, in resource "akamai_dns_record" "ak-cname-records":
│   32: resource "akamai_dns_record" "ak-cname-records" {
│ 
│ Title: Invalid rdata - CNAME collision; Type:
│ https://problems.luna.akamaiapis.net/authoritative-dns/cnameCollision;
│ Detail: CNAME must be the only record for a given name

If I re-run terraform plan after this failure, it now reports that only the CNAME record (test.mydomain.com) will be created, and the subsequent terraform apply succeeds.

My conclusion is that the provider does not synchronously destroy the previous A record before creating the new CNAME with the same name, which is very annoying behaviour.

Is there an existing workaround (a way to get a synchronous destroy of resources), or is this a bug that should be fixed?

Thanks.

@hightoxicity
Author

For GTM, akamai_gtm_property has a wait_on_complete attribute; I guess we need the same here.

@hightoxicity hightoxicity changed the title Question about provider behaviour destroy then create Question about provider behaviour destroy then create - edgedns record Aug 29, 2023
@majakubiec majakubiec changed the title Question about provider behaviour destroy then create - edgedns record DXE-3042 Question about provider behaviour destroy then create - edgedns record Aug 31, 2023
@aka-mark aka-mark added the STG label Aug 31, 2023
@majakubiec
Contributor

majakubiec commented Sep 1, 2023

Hi @hightoxicity

I am looking at this issue and trying to reproduce with no luck so far. Does this happen often?

Were there any other records before the first apply? I wonder whether you had more than one record, one of which was the A record you mentioned, and then replaced that A record with a CNAME, completely removed the other records from the .tf file, and hit apply. I can see how that could cause a synchronization issue, but having only one record and replacing it with another record should not have caused this error 🤔.

@hightoxicity
Author

hightoxicity commented Sep 1, 2023

Hi @majakubiec, yes, the A record is replaced by a CNAME, so it no longer appears in the tf inputs under its old type.
It happens randomly. We have thousands of records; I do not know whether volume has any influence.

@hightoxicity
Author

hightoxicity commented Sep 16, 2023

Hi, we hit this case again yesterday. Here are the details, starting with the terraform plan before apply:

Unless you have made equivalent changes to your configuration, or ignored the
relevant attributes using ignore_changes, the following plan may include
actions to undo or respond to these changes.


─────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create
  - destroy

Terraform will perform the following actions:

  # akamai_dns_record.ak-a-records["springcm.com#inboundeu21.springcm.com"] will be created
  + resource "akamai_dns_record" "ak-a-records" {
      + answer_type = (known after apply)
      + dns_name    = (known after apply)
      + id          = (known after apply)
      + name        = "inboundeu21.springcm.com"
      + record_sha  = (known after apply)
      + recordtype  = "A"
      + serial      = (known after apply)
      + target      = [
          + "31.186.230.163",
        ]
      + ttl         = 300
      + zone        = "springcm.com"
    }

  # akamai_dns_record.ak-cname-records["springcm.com#inboundeu21.springcm.com"] will be destroyed
  # (because key ["springcm.com#inboundeu21.springcm.com"] is not in for_each map)
  - resource "akamai_dns_record" "ak-cname-records" {
      - id         = "springcm.com#inboundeu21.springcm.com#CNAME" -> null
      - name       = "inboundeu21.springcm.com" -> null
      - record_sha = "ef9c44a44448da00cc4c48ea4208f5235b49356a" -> null
      - recordtype = "CNAME" -> null
      - target     = [
          - "origin-c-inboundeu21.springcm.com.",
        ] -> null
      - ttl        = 300 -> null
      - zone       = "springcm.com" -> null
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Then terraform apply failure:

Starting: Terraform apply
==============================================================================
Task         : Terraform CLI
Description  : Execute terraform cli commands
Version      : 0.7.8
Author       : Charles Zipp
Help         : 
==============================================================================
/usr/bin/terraform version
Terraform v1.2.9
on linux_amd64
+ provider registry.terraform.io/akamai/akamai v3.2.1
+ provider registry.terraform.io/hashicorp/local v2.3.0

Your version of Terraform is out of date! The latest version
is 1.5.7. You can update by downloading from https://www.terraform.io/downloads.html
/usr/bin/terraform apply -auto-approve -parallelism=50 -input=false -lock=true -lock-timeout=360s tfplan-production.zip
╷
│ Warning: "use_microsoft_graph": [DEPRECATED] This field now defaults to `true` and will be removed in v1.3 of Terraform Core due to the deprecation of ADAL by Microsoft.
│ 
│ 
╵
akamai_dns_record.ak-cname-records["springcm.com#inboundeu21.springcm.com"]: Destroying... [id=springcm.com#inboundeu21.springcm.com#CNAME]
akamai_dns_record.ak-a-records["springcm.com#inboundeu21.springcm.com"]: Creating...
akamai_dns_record.ak-cname-records["springcm.com#inboundeu21.springcm.com"]: Destruction complete after 1s
╷
│ Error: Recordset create failure
│ 
│   with akamai_dns_record.ak-a-records["springcm.com#inboundeu21.springcm.com"],
│   on a.tf line 32, in resource "akamai_dns_record" "ak-a-records":
│   32: resource "akamai_dns_record" "ak-a-records" {
│ 
│ Title: Invalid rdata - CNAME collision; Type:
│ https://problems.luna.akamaiapis.net/authoritative-dns/cnameCollision;
│ Detail: CNAME must be the only record for a given name
╵
##[error]Terraform command 'apply' failed with exit code '1'.
##[error]╷
│ Error: Recordset create failure
│ 
│   with akamai_dns_record.ak-a-records["springcm.com#inboundeu21.springcm.com"],
│   on a.tf line 32, in resource "akamai_dns_record" "ak-a-records":
│   32: resource "akamai_dns_record" "ak-a-records" {
│ 
│ Title: Invalid rdata - CNAME collision; Type:
│ https://problems.luna.akamaiapis.net/authoritative-dns/cnameCollision;
│ Detail: CNAME must be the only record for a given name
|

@majakubiec
Contributor

Hi,
thank you for all the details, now it makes so much more sense to me 😄
I thought you only changed the record type and hit apply, but I see you also changed the resource name "ak-a-records" -> "ak-cname-records". In that case Terraform handles it differently. From Terraform's perspective these are two separate resources, so it assumes it can run the two operations in parallel. Sometimes the delete runs first and then the create, and everything is fine; sometimes it's the other way around and it blows up.
I think it may be a bit tricky to implement this cross-resource synchronization in a Terraform plugin, but we'll see what we can do :D

@hightoxicity
Author

hightoxicity commented Sep 18, 2023

I understand perfectly now! Thanks for the explanation. Yes, it would be very nice if something could be done on the provider side, because record types that do not conflict can legitimately share the same name (I cannot use a single ak-records structure, since I can have test.toto.tld TXT "something" alongside test.toto.tld A X.X.X.X). The conflict arises mainly between CNAME and every other record type when a type conversion is attempted (ANY to CNAME or CNAME to ANY).

I have (I think) a temporary workaround, which consists of forcing the -parallelism value to 1 on terraform apply...

Thx for your help

@hightoxicity
Author

hightoxicity commented Sep 18, 2023

Do you think it could work if I declare the following dependencies:

  • resource "akamai_dns_record" "ak-a-records" depends_on akamai_dns_record.ak-cname-records
  • resource "akamai_dns_record" "ak-txt-records" depends_on akamai_dns_record.ak-cname-records
  • ...

https://developer.hashicorp.com/terraform/language/meta-arguments/depends_on

Edit: I think it would work for a CNAME to "another type" conversion but not for "another type" to CNAME, because it would try to create the CNAME before deleting the record of the original type

@majakubiec
Contributor

Edit: I think it would work for a CNAME to "another type" conversion but not for "another type" to CNAME, because it would try to create the CNAME before deleting the record of the original type

I think you are right with the above.

We'll try to help you with this but could you please provide a minimal version of your current configuration? This would allow us to understand your setup more clearly, and potentially reproduce and debug the issue you're facing.

Please make sure to redact any sensitive information before sharing your configuration.

@hightoxicity
Author

hightoxicity commented Sep 20, 2023

Hi @majakubiec, I have shared some tf HCL with you in a private GitHub repo.

@majakubiec
Contributor

Hi @hightoxicity
Thanks :D I'll take a look

@dstopka
Contributor

dstopka commented Sep 22, 2023

@hightoxicity, we're going to evaluate possible solutions that could mitigate the risk of running into the problem you described. This will require some time, but once we're set on anything we'll let you know. Unfortunately, we were not able to find any specific workarounds that would work with the config you shared with us.

@hightoxicity
Author

Hi @dstopka, thanks for the update and for the work you are about to do on this.

To avoid running into this annoying issue again, we have added a temporary fix at the CI level on our side:

  • We inspect the content of a persisted/approved tf plan using terraform show --json plan_to_apply
  • We find the CNAMEs that are about to be deleted and use yq to look up records with the same name that are about to be created
  • We destroy those CNAMEs with terraform destroy and the -target parameter
  • Then we run a new tf plan, which no longer contains those deletions
  • To get the same behaviour when a record of any type is replaced by a CNAME, we use tf HCL to declare that actions on CNAMEs depend on the operations on all other record types (a one-way dependency)

Something like this before applying:

        - task: Bash@3
          name: cnames_to_be_deleted
          displayName: Look for cnames to be deleted
          inputs:
            targetType: inline
            workingDirectory: ${{ parameters.workingDir }}/${{ parameters.publishedArtifactName }}
            script: |
              set -x
              deleted_cnames=$(terraform show --json ./tfplan-${{ parameters.environment }}.zip | yq e -o=j -I=0 -r '(.resource_changes as $rc) | $rc | map(select((.type == "akamai_dns_record") and (.change.actions | contains(["delete"]) and (.change.before.recordtype == "CNAME")))) | map(.index as $eltIndex | $rc | map(select(.index == $eltIndex and .type == "akamai_dns_record" and (.change.actions | contains(["create"])) and (.address | test("^akamai_dns_record\.ak-cname-records") == false)) | .index) | sort | unique) | @csv' | tr -d '\n')
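              # Split the comma-separated keys and destroy each matching CNAME ahead of the apply.
              while IFS=, read -ra cnames; do
                for cn in "${cnames[@]}"; do
                  # Only target resources that are still in state; a missing one must not abort the step.
                  set +e
                  terraform state show "akamai_dns_record.ak-cname-records[\"${cn}\"]"
                  result=$?
                  set -e

                  if [ "${result}" -eq 0 ]; then
                    terraform destroy -target="akamai_dns_record.ak-cname-records[\"${cn}\"]" -lock=true -lock-timeout=120s -auto-approve
                  fi
                done
              done <<< "${deleted_cnames}"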
              set +x
              echo "##vso[task.setvariable variable=deletedCnames;isOutput=true]${deleted_cnames}"
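
The detection step described above (find records that are destroyed while a different resource creates a record with the same name) can also be sketched in Python. This is only an illustration, not what the pipeline above runs. It assumes the plan has been exported with terraform show -json and relies on the resource_changes, change.actions, change.before and change.after fields of that output, plus the zone, name and recordtype attributes visible in the plan shown earlier in this thread. The function name find_cname_collisions and the SAMPLE_PLAN data are hypothetical. Unlike the yq expression, it flags both directions (CNAME to another type and another type to CNAME).

```python
import json
from typing import Any, Optional


def _key(values: dict[str, Any]) -> Optional[tuple[str, str]]:
    """Return a normalised (zone, name) pair, or None if either is not known at plan time."""
    zone, name = values.get("zone"), values.get("name")
    if not isinstance(zone, str) or not isinstance(name, str):
        return None
    return (zone.lower().rstrip("."), name.lower().rstrip("."))


def find_cname_collisions(plan: dict[str, Any]) -> list[dict[str, str]]:
    """Pair each destroyed record with a same-name record created by a different
    resource address, keeping only pairs where one side is a CNAME."""
    records = [
        rc for rc in plan.get("resource_changes", [])
        if rc.get("type") == "akamai_dns_record"
    ]

    # Index the records that the plan destroys by (zone, name).
    destroyed: dict[tuple[str, str], list[tuple[str, str]]] = {}
    for rc in records:
        before = rc["change"].get("before")
        if "delete" in rc["change"]["actions"] and before:
            key = _key(before)
            if key is not None:
                destroyed.setdefault(key, []).append(
                    (rc["address"], before["recordtype"])
                )

    collisions = []
    for rc in records:
        after = rc["change"].get("after")
        if "create" not in rc["change"]["actions"] or not after:
            continue
        key = _key(after)
        if key is None:
            continue
        for address, old_type in destroyed.get(key, []):
            # Terraform sequences the two steps of a single-resource replacement itself.
            if address == rc["address"]:
                continue
            if "CNAME" in (old_type, after.get("recordtype")):
                collisions.append({
                    "destroy": address,
                    "create": rc["address"],
                    "old_type": old_type,
                    "new_type": str(after.get("recordtype")),
                })
    return collisions


# Hypothetical sample mirroring the CNAME -> A plan from this thread.
_KEY = "springcm.com#inboundeu21.springcm.com"
_ATTRS = {"zone": "springcm.com", "name": "inboundeu21.springcm.com"}
SAMPLE_PLAN = {
    "resource_changes": [
        {
            "address": f'akamai_dns_record.ak-a-records["{_KEY}"]',
            "type": "akamai_dns_record",
            "change": {"actions": ["create"], "before": None,
                       "after": {**_ATTRS, "recordtype": "A"}},
        },
        {
            "address": f'akamai_dns_record.ak-cname-records["{_KEY}"]',
            "type": "akamai_dns_record",
            "change": {"actions": ["delete"], "after": None,
                       "before": {**_ATTRS, "recordtype": "CNAME"}},
        },
    ]
}

print(json.dumps(find_cname_collisions(SAMPLE_PLAN), indent=2))
```

With a real plan, the dictionary would come from json.load on the output of terraform show -json for the saved plan file, and each reported "destroy" address is a candidate for a targeted destroy before the main apply.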


4 participants