Bootstrap Failure with Terraform Deployment to GCP


L1 Bithead

I'm unable to configure a static IP address on the management interface, or a default password, on a VM-Series firewall in GCP. I'm using code modified from GCP-Terraform-Samples/FW-3-Interfaces/Variables.tf with Terraform v0.12.28. I've created a storage bucket named "fw-bootstrap-bucket" with 4 folders and uploaded an init-cfg.txt containing "type=static" and an IP address to the "config/" folder in the bucket. I've also tried several "bootstrap.xml" files.

 

The firewall instance boots up with a dynamically assigned IP address instead of the IP address I statically assigned in the init-cfg.txt file, so I know that the firewall instance either:

  • never tries to read the init-cfg.txt file when it boots up
  • tries to read it, but can't access the file in the storage bucket

The metadata section of the .tf file says:

// Adding METADATA key/value pairs to the VM-Series GCE instance
metadata = {
  vmseries-bootstrap-gce-storagebucket = var.bootstrap_bucket_fw
  serial-port-enable                   = true
  ssh-keys                             = var.public_key
}

The variables file says:

variable "bootstrap_bucket_fw" {
  default = "fw-bootstrap-bucket"
}

I've tried several different ways of specifying the location of 'fw-bootstrap-bucket'; none work.

 

So I have a firewall deployed with a dynamically assigned management IP address, and I can't log into it: although I pass SSH keys to it, I don't have an admin password because bootstrap.xml never configured one.

 

How do I specify the bootstrap bucket location?  How do I look at bucket access logs to see if the firewall instance is trying to connect to the bucket to download the init-cfg.txt?

 


8 Replies

L1 Bithead

I used "gsutil" to turn on logging on the bootstrap bucket. Not much help right now, since it takes an hour for bucket access messages to show up. SMH.

 

"Access logs provide information for all of the requests made on a specified bucket and are created hourly."   

The management interface is set to DHCP, and that cannot be changed: you risk locking yourself out if you mistype or otherwise enter incorrect information for the management interface. You _can_, however, tell GCP to assign a static IP to that interface so that whenever it sends a DHCP request, it always gets the same IP.
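
A minimal Terraform sketch of that reservation, assuming a management subnetwork named "mgmt-subnet", a region of us-central1, and a desired address of 10.0.0.10 (all placeholders):

// Reserve a static internal address in the management subnet
resource "google_compute_address" "fw_mgmt" {
  name         = "fw-mgmt-ip"
  address_type = "INTERNAL"
  subnetwork   = "mgmt-subnet"
  address      = "10.0.0.10"
  region       = "us-central1"
}

// Then hand it to the instance's management network_interface
network_interface {
  subnetwork = "mgmt-subnet"
  network_ip = google_compute_address.fw_mgmt.address
}

The firewall's management interface still does DHCP; GCP just always answers with the reserved address.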

 

Regarding bootstrapping, you shouldn't need to tell the FW where the bucket is located. Bucket names have to be unique in the DNS namespace, so it is sufficient to specify the name you gave the bucket. The FW also has to have permission to access the bucket, and the bucket has to be configured with _all_ of the required folders (config, content, license, software), so that might be worth verifying as well.
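
For example, uploading the config files and verifying the layout could look like this (bucket name from the original post; note that GCS "folders" are really just object-name prefixes, so the empty content/license/software folders are usually created in the console or with placeholder objects):

gsutil cp init-cfg.txt gs://fw-bootstrap-bucket/config/
gsutil cp bootstrap.xml gs://fw-bootstrap-bucket/config/
gsutil ls -r gs://fw-bootstrap-bucket/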

 

If you specify a correctly-formatted SSH key at deployment time, it should be inserted into the admin account even if bootstrapping fails. The correct format for the ssh key is "ssh-rsa <public_key> <username>".
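
Following that format, the metadata value might look something like this (key truncated, purely illustrative; in the template it would normally come from a variable such as the original post's var.public_key):

ssh-keys = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAB...snip... admin"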

 

There are some examples of Terraform deployments with bootstrapping here:

 

https://github.com/wwce/terraform/tree/master/gcp

 

Most of them create/populate the bootstrap bucket at runtime. The process is the same and they might provide a template for your development.
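
A rough sketch of that create-and-populate-at-runtime pattern, with placeholder names and local file paths:

// Create the bootstrap bucket...
resource "google_storage_bucket" "bootstrap" {
  name     = "fw-bootstrap-bucket"
  location = "US"
}

// ...and upload the local init-cfg.txt into its config/ folder
resource "google_storage_bucket_object" "init_cfg" {
  name   = "config/init-cfg.txt"
  source = "bootstrap_files/init-cfg.txt"
  bucket = google_storage_bucket.bootstrap.name
}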

Seeing the bucket name in a working example of code helps a lot.  Thanks!  

 

About the bucket: I have all the folders created in the bucket, but I'm unsure about the permissions on it. How do I ensure the firewall/GCE instance has the correct permissions to access the files in the bucket? Currently the "members" of the bucket are Project owners, editors, and viewers, and the bucket's location type is "multi-region".

Accepted Solution

That is controlled by the service scopes assigned at deployment time. Have a look at the Terraform docs for google_compute_instance for more details:

 

https://www.terraform.io/docs/providers/google/r/compute_instance.html

 

In the GH link I posted, there are a few examples of this. Note that they could be defined in variables files or explicitly set in the TF template but would look something like this:

 

variable "scopes" {
  type = list(string)

  default = [
    "https://www.googleapis.com/auth/compute.readonly",
    "https://www.googleapis.com/auth/cloud.useraccounts.readonly",
    "https://www.googleapis.com/auth/devstorage.read_only",
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/monitoring.write",
  ]
}
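
The scopes are then attached to the instance via its service_account block; a minimal sketch (omitting the email argument uses the project's default compute service account):

service_account {
  scopes = var.scopes
}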

 

If the storage bucket is not in the same project as the FW, then you will need to grant the FW service account permission to access the bucket. You can also make the bucket readable by anyone or by all authenticated users, but that is insecure and not recommended.
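
A minimal Terraform sketch of that cross-project grant, assuming a hypothetical var.fw_service_account holding the firewall's service account email:

resource "google_storage_bucket_iam_member" "fw_bootstrap_read" {
  bucket = "fw-bootstrap-bucket"
  role   = "roles/storage.objectViewer"
  member = "serviceAccount:${var.fw_service_account}"
}

The equivalent one-off gsutil command would be along the lines of:

gsutil iam ch serviceAccount:<fw-service-account-email>:objectViewer gs://fw-bootstrap-bucket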

"If you specify a correctly-formatted SSH key at deployment time, it should be inserted into the admin account even if bootstrapping fails."

 

So the bootstrap process has to succeed or the firewall comes up with no admin password, right? And if the bootstrap process fails, I can't get into the firewall CLI even if my public SSH key is installed, right? Because the firewall is asking me for a password, and it won't take the password in the bootstrap.xml file, which is the password from the PDF guide in the same folder in the repo.

 

Correct on the first question. The virtual FW does not have a built-in password the way the hardware FWs do, since virtual firewalls can be deployed/accessed across the internet. By specifying an SSH key at deployment time, you can authenticate to the CLI and set the admin password for use when logging into the GUI. The SSH key gets inserted into the admin account, so you would need to connect as "admin".

 

If the bootstrap fails, you should still be able to get into the CLI using SSH, provided you specified an SSH key at deployment time, even though no admin password exists on the box.

 

It can take several minutes for the FW to completely boot up post-deployment. During that time, you will get prompted for a password, but authentication will fail since the management-plane processes are not yet completely up. If you cannot authenticate after 10-15 minutes, chances are that the SSH key was not formatted in a manner the FW could accept.

 

If the bootstrap fails but you can get into the FW, you can see information from the bootstrap attempt with the following command:

 

show system bootstrap status

So the firewall deploys now, kinda. I set the bucket to just allow all users from everywhere, deployed a firewall, and I was able to SSH into it. So I'll need to look at the scopes in TF and the bucket permissions more closely.

 

Now the auto-commit hangs and the firewall says the bootstrap failed, but the firewall's name is the name I set in init-cfg.txt. So it is reading the init-cfg.txt file and, I assume, the bootstrap.xml file.

 

The hung auto-commit is a separate issue, I think?

 

Thanks for helping with this, Glynn.  

Progress at last.

I've seen the initial auto-commit fail/hang before. Hangs usually sort themselves out if you give them some additional time. It might be instructive to temporarily remove the bootstrap.xml file so that all you have is the init-cfg.txt file in the config directory. That should allow the FW to bootstrap but remove any configuration weirdness from the picture. You should be able to access the CLI via SSH key authentication and then use "show jobs all" to quickly check the status of the initial auto-commit. If it is successful, then you can look at your bootstrap.xml file more closely.

 

If nothing else, you can have the TAC look at it. Since you can now access the FW, they should be able to take a peek and tell you what's going on.
