Importing Live DNS into Terraform Without Downtime
If your DNS already serves traffic and you want it under Terraform, import the records - do not recreate them. I moved ~220 records across 9 zones onto Cloudflare and into Terraform this way, and the migration changed nothing that resolves. The one decision that makes that true: adopt the live records, rather than declaring a desired state and letting Terraform converge to it.
The recreate trap
The obvious way to put existing infrastructure into code is to write a desired-state list - here are the records I want - and terraform apply it. Against live records, that is the wrong default, for two reasons.
First, Terraform's create path will try to make records that already exist. Depending on the provider you get duplicate records or an API error, neither of which you want mid-migration.
Second, and worse: anything your list forgot gets deleted as drift. DNS is exactly the place you forget things. A DKIM selector a vendor set up two years ago, an _acme-challenge TXT, a verification record for some SaaS - none are in your head, all of them matter, and a desired-state apply silently removes every one you didn't transcribe. The web record you remember keeps working; the mail domain you forgot quietly breaks. A bad trade for tidier source files.
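To make the second failure concrete, here is a minimal sketch that lists which live records a hand-written desired-state list would miss, and so would be destroyed on apply. The record shape (dicts with type, name, and content keys) and the function names are assumptions for illustration, not any provider's actual API.

```python
def record_key(record):
    """Normalize a record to a comparable (type, name, content) tuple."""
    return (
        record["type"].upper(),
        record["name"].lower().rstrip("."),
        record["content"],
    )


def forgotten_records(live, desired):
    """Return live records that have no counterpart in the desired list."""
    wanted = {record_key(r) for r in desired}
    return [r for r in live if record_key(r) not in wanted]


# Hypothetical data: what is serving traffic vs. what was transcribed.
live = [
    {"type": "A", "name": "www.example.com", "content": "192.0.2.10"},
    {"type": "TXT", "name": "sel1._domainkey.example.com", "content": "v=DKIM1; p=abc"},
]
desired = [
    {"type": "a", "name": "WWW.example.com.", "content": "192.0.2.10"},
]

for r in forgotten_records(live, desired):
    print("would be deleted:", r["type"], r["name"])
```

Run against a real export, anything this prints is a record the recreate approach would have removed.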
Import instead
Importing inverts the direction. Instead of declaring what should exist and deleting the rest, you take what exists and write configuration that matches it, changing nothing.
The mechanical recipe, with Terraform 1.5+:
- Pull every record from the provider's API and emit one import block per record (with to set to the resource address and id set to the provider's record id). A short script does this and stays useful afterward as a drift audit.
- Run terraform plan -generate-config-out=generated.tf. Terraform writes the matching resource configuration for every imported record, derived from reality rather than guessed.
- Review the generated config, then run a plain plan.
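The first step of the recipe can be sketched as follows. This is a minimal illustration, not the generator from the repository. It assumes the records have already been fetched as dicts with id, type, and name keys, that the resource type is cloudflare_record, and that the provider's import id takes the form zone id, slash, record id. Check all three against the provider version you run. The naming scheme in resource_name is hypothetical.

```python
import re


def resource_name(record):
    """Build a Terraform-safe identifier: type, slugged name, short id suffix."""
    slug = re.sub(r"[^a-z0-9]+", "_", record["name"].lower()).strip("_")
    return f"{record['type'].lower()}_{slug}_{record['id'][:8]}"


def import_block(record, zone_id, resource_type="cloudflare_record"):
    """Render one import block; resource type and id format are assumptions."""
    address = f"{resource_type}.{resource_name(record)}"
    return (
        "import {\n"
        f"  to = {address}\n"
        f'  id = "{zone_id}/{record["id"]}"\n'
        "}\n"
    )


def render_imports(records, zone_id):
    """Render import blocks for every record, separated by blank lines."""
    return "\n".join(import_block(r, zone_id) for r in records)


# Hypothetical record, shaped like one entry of an API listing.
records = [
    {"id": "abcdef1234567890", "type": "TXT", "name": "_acme-challenge.example.com"},
]
print(render_imports(records, "zone123"))
```

The short id suffix keeps names unique when several records share a type and name, which is common for TXT and MX.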
The plan that proves it
Drive it until the plan says this:
Plan: 221 to import, 0 to add, 0 to change, 0 to destroy.
That row of zeros is the acceptance test. The code now matches what is serving traffic - no record created, deleted, or modified. The baseline plan is a no-op, which is the property you wanted: the next time plan shows a change, the change is real and intended, and the diff means something. A configuration you cannot trust to be a no-op is one you cannot use to review changes.
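If you want the acceptance test enforced rather than eyeballed, a small check on the plan summary is enough. The sketch below parses the summary line and fails on anything other than zero adds, changes, and destroys. It assumes the plan output was captured as plain text (for example with -no-color). The function names are hypothetical.

```python
import re

# Matches the summary line; the "to import" part is absent when nothing is imported.
SUMMARY = re.compile(
    r"Plan: (?:(\d+) to import, )?(\d+) to add, (\d+) to change, (\d+) to destroy\."
)


def parse_plan_summary(text):
    """Extract the four counters from captured plan output."""
    match = SUMMARY.search(text)
    if match is None:
        raise ValueError("no plan summary line found")
    imported, added, changed, destroyed = match.groups()
    return {
        "import": int(imported or 0),
        "add": int(added),
        "change": int(changed),
        "destroy": int(destroyed),
    }


def is_pure_import(text):
    """True only when the plan creates, changes, and destroys nothing."""
    counts = parse_plan_summary(text)
    return counts["add"] == 0 and counts["change"] == 0 and counts["destroy"] == 0


summary = "Plan: 221 to import, 0 to add, 0 to change, 0 to destroy."
print(parse_plan_summary(summary), is_pure_import(summary))
```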
The gotcha: SRV records
Generated config is not always valid config. For my SRV records, the generator emitted both a content string and a structured data block, which the provider rejects as mutually exclusive. The fix was a deterministic post-process that strips the redundant content attribute from any resource block that also has a data block. The general shape is worth noting: importing surfaces provider quirks at import time, all at once, instead of ambushing you on some unrelated edit six months later. I would rather meet them on day one.
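A post-process of that kind might look like the sketch below. It is an illustration under assumptions, not the script actually used: it works line by line, so it relies on the one-attribute-per-line layout that generated config has, and it only touches content attributes at the top level of a resource block that also contains a data block. All names are hypothetical.

```python
import re

STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')   # quoted strings, so braces inside are ignored
DATA_RE = re.compile(r"\s*data\s*(=\s*)?\{")    # "data {" or "data = {"
CONTENT_RE = re.compile(r"\s*content\s*=")


def _brace_delta(line):
    """Net change in brace depth for one line, ignoring braces inside strings."""
    bare = STRING_RE.sub('""', line)
    return bare.count("{") - bare.count("}")


def _clean(block):
    """Drop top-level content lines from a resource block that has a data block."""
    depths, depth = [], 0
    for line in block:
        depths.append(depth)          # depth before this line is processed
        depth += _brace_delta(line)
    has_data = any(d == 1 and DATA_RE.match(l) for l, d in zip(block, depths))
    if not has_data:
        return block
    return [l for l, d in zip(block, depths) if not (d == 1 and CONTENT_RE.match(l))]


def strip_redundant_content(hcl):
    """Apply _clean to every resource block; pass everything else through."""
    out, block, depth = [], [], 0
    for line in hcl.splitlines():
        if depth == 0 and not line.startswith("resource "):
            out.append(line)
            continue
        block.append(line)
        depth += _brace_delta(line)
        if depth == 0:
            out.extend(_clean(block))
            block = []
    out.extend(block)                 # unterminated block: leave untouched
    return "\n".join(out) + "\n"
```

Because the transform is deterministic, it can be rerun on freshly generated config at any time and should produce the same result.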
When recreating is actually fine
The import-first rule is for adopting infrastructure already serving traffic. For a greenfield zone with no live records, recreate is correct: author the desired state and let Terraform build it, no import dance. The same holds for a record set small enough to recreate inside a maintenance window where a brief inconsistency is acceptable. The distinction is just whether real users are resolving those names right now.
What it costs
Honesty about the price: generated config is verbose and machine-named. Importing 200-some records gives you 200-some resource blocks with generated identifiers, not the hand-curated names you would write greenfield. You can rename them in state afterward, but that is separate work I have not decided is worth doing for records I rarely touch by hand. The tidiness I gave up is real; the zero-downtime adoption I bought is worth more. Whether the rename cleanup earns its keep is a question I am leaving open until the verbosity gets in my way.
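Should the rename ever become worth doing, one option is to let Terraform track it with moved blocks instead of editing state by hand. The sketch below renders those blocks from a rename map. The resource type and both example names are hypothetical, and the resource blocks themselves would still need the same renames applied in the configuration.

```python
def moved_blocks(renames, resource_type="cloudflare_record"):
    """Render one moved block per old-name -> new-name pair, in sorted order."""
    blocks = []
    for old, new in sorted(renames.items()):
        blocks.append(
            "moved {\n"
            f"  from = {resource_type}.{old}\n"
            f"  to   = {resource_type}.{new}\n"
            "}\n"
        )
    return "\n".join(blocks)


# Hypothetical generated identifier mapped to a hand-curated name.
print(moved_blocks({"terraform_managed_resource_abc": "www_a"}))
```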
The full sanitized setup - Terraform, the import generator, the Ansible roles - is on GitHub, and the longer reasoning is written up as ADR 0014, alongside the surrounding IaC decisions: staged declarative provisioning (0013) and keeping state off the provider it provisions (0015).