AWS Reference Architecture, Subnet Sizes and Automation

mb_equate · ‎01-30-2024

1. The AWS Reference Architectures (AWS - Palo Alto Networks) and associated automation libraries all use a /16 CIDR for the Security VPC and a /24 for each subnet, including those for the TGW attachments and GWLB endpoints. AWS recommends deploying these resources in the smallest subnet available, a /28, as they use a single IP and should not host any other resources. In fact, AWS's own reference architecture uses /28 for all subnets.

I understand these are for reference, and customers will be using their own subnets, but I can't imagine any deployment where a subnet larger than /28 would provide any benefit. That holds even for (centralised) inbound flows: the ALB/NLB can only use the firewall's public IP, with different ports for multiple apps, and does not support a "floating IP" for the firewall to NAT in the way an Azure ALB does.

This being the case, should the reference architecture be updated to align to AWS best practices and use practical subnet sizes? And is there any reason to deploy subnets larger than /28 in a production security VPC?
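To put numbers on the sizing question, here is a minimal sketch of the address arithmetic. It relies on the fact that AWS reserves five addresses in every subnet (the first four and the last one). The VPC CIDR `10.100.0.0/16` is a hypothetical placeholder, not a value taken from the reference architecture.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_hosts(prefix_len: int) -> int:
    """Addresses left for ENIs in an AWS subnet of the given prefix length."""
    return 2 ** (32 - prefix_len) - AWS_RESERVED_PER_SUBNET


# Hypothetical Security VPC CIDR, used only for illustration.
vpc = ipaddress.ip_network("10.100.0.0/16")

for prefix_len in (24, 28):
    subnets_in_vpc = 2 ** (prefix_len - vpc.prefixlen)
    print(
        f"/{prefix_len}: {usable_hosts(prefix_len)} usable addresses per subnet, "
        f"{subnets_in_vpc} such subnets fit in the /{vpc.prefixlen}"
    )
```

A /24 leaves 251 usable addresses, while a /28 still leaves 11, which is more than enough for a subnet that hosts a single TGW attachment or GWLB endpoint interface.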

2. The automation libraries appear to differ from the deployment guide in several areas, as far as the Terraform modules for Centralised Design Model are concerned...

Design / Deployment guide	Automation Libraries	My 2c
Single GWLBe is deployed in each AZ for both east/west and outbound	Two GWLBes are deployed in each AZ, one for east/west and the other for outbound	I don't see the point in the separation. With overlay routing enabled especially, you end up confusing admins with different source zones for the same source networks depending on the destination.
Subnet names (tags) are readable e.g. security-mgmt-2a, security-fw-2a	Subnet names are concatenated e.g. mgmta, privatea and do not contain a user-defined prefix	I prefer to use a common prefix for these names and a separator for the AZ per the guide - while the tfvars example allows a user-defined prefix for the VPC, the code does not apply this to the subnet names

With these in mind, how hard would it be for a consultant (i.e. me) to update the libraries to align to the design, reduce the subnet sizes and achieve a sensible naming convention for the resources?
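As an illustration of the naming convention described in the table, here is a minimal sketch of the string logic, written in Python rather than in the modules' own Terraform. The function name `subnet_name`, its parameters, and the AZ `eu-west-2a` are hypothetical and do not come from the automation libraries; only the target names `security-mgmt-2a` and `security-fw-2a` are taken from the guide's examples.

```python
def subnet_name(prefix: str, role: str, az: str, sep: str = "-") -> str:
    """Build a readable Name tag such as 'security-mgmt-2a'.

    `az` is a full AZ name like 'eu-west-2a'; only the trailing region
    number and zone letter ('2a') are kept, matching the guide's examples.
    """
    az_suffix = az.rsplit("-", 1)[-1]
    return sep.join([prefix, role, az_suffix])


# Hypothetical prefix, roles and AZ, chosen to reproduce the guide's examples.
for role in ("mgmt", "fw"):
    print(subnet_name("security", role, "eu-west-2a"))
```

The same idea in the Terraform modules would mean passing the user-defined prefix through to the subnet tags instead of concatenating the role and AZ letter directly.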

Cloud engineers seem to be as lax about address consumption as they ~~are~~ were about security... /s

Sources: Transit gateway design best practices - Amazon VPC, Centralized inspection architecture with AWS Gateway Load Balancer and AWS Transit Gateway | Network...

mb_equate · ‎01-30-2024

Further to this, I was also hoping to clarify the requirement to host the GWLB, GWLBe and firewall private interfaces in different subnets. Putting inbound flows aside...

The AWS architecture, even without overlay routing, achieves this with 3 subnets within each AZ (4 if you include mgmt):

1 for the TGW attachment

1 for the GWLBe (both outbound & east/west use a single endpoint), GWLB itself and the firewall private interface ("appliance subnet", a mere /28)

1 for the NAT gateway (or public interface, because we can use overlay routing)

The PAN reference architecture splits the GWLB, GWLBe and firewall private interfaces into separate subnets, resulting in 6 (including mgmt):

1 for the TGW attachment

1 for the GWLBe (both outbound & east/west)

1 for the GWLB

1 for the firewall private interface

1 for the public interface

1 for mgmt

The PAN automation libraries throw in another subnet for the east/west GWLBe, making it 7 in total. Add another 2 if you include the application and network load balancers to support inbound flows.
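The subnet counts above can be turned into a rough per-AZ address budget. This is a sketch only: the counts are the ones given in this post (management subnet included), and it assumes every subnet in a design uses the same prefix length, which a real deployment may not.

```python
# Subnets per AZ as counted in this post (management subnet included).
designs = {
    "AWS architecture": 4,
    "PAN reference architecture": 6,
    "PAN automation libraries": 7,
    "PAN automation libraries + ALB/NLB": 9,
}


def addresses_allocated(subnet_count: int, prefix_len: int) -> int:
    """Total addresses carved out of the VPC, assuming uniform subnet sizes."""
    return subnet_count * 2 ** (32 - prefix_len)


for name, count in designs.items():
    print(
        f"{name}: {count} subnets/AZ -> "
        f"{addresses_allocated(count, 24)} addresses at /24, "
        f"{addresses_allocated(count, 28)} at /28"
    )
```

Under these assumptions, seven /24 subnets allocate 1792 addresses per AZ, against 112 for seven /28 subnets.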

My question is, is it necessary to place the GWLBe, GWLB and firewall private interface in different subnets? I can't see any logical reason to, even without support for overlay routing or separation of outbound & east/west flows, because:

1. The route table for the east/west GWLBe must forward private subnets to the TGW attachment

2. The route table for the outbound GWLBe must forward private subnets to the TGW attachment (and to the NAT GW if no overlay routing)

3. The route table for the firewall private interface must forward private networks to the TGW attachment (centralised inbound)

4. The GWLB itself does not need to reach anything other than the firewall instances

Given the above, can we simplify the design to consolidate these components and remove unnecessary subnets, route tables and network hops? I'm all for flexibility and future proofing but in this case there doesn't appear to be any value in using separate subnets.
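To make the consolidation argument concrete, here is a minimal sketch that models the routes from points 1 to 3 as destination-to-target maps and checks whether they can share a single route table without conflict. The CIDR `10.0.0.0/8`, the target labels `tgw` and `nat-gw`, and the assumption that the NAT gateway route is a default route are all placeholders for illustration. It checks only the routes listed above and does not model the GWLB itself or inbound flows.

```python
PRIVATE = "10.0.0.0/8"  # hypothetical summary of the private (spoke) networks
DEFAULT = "0.0.0.0/0"


def route_tables(overlay_routing: bool) -> dict[str, dict[str, str]]:
    """Routes each component needs, per points 1-3 (destination -> target)."""
    outbound = {PRIVATE: "tgw"}
    if not overlay_routing:
        # Assumed to be a default route towards the NAT gateway.
        outbound[DEFAULT] = "nat-gw"
    return {
        "east-west GWLBe": {PRIVATE: "tgw"},
        "outbound GWLBe": outbound,
        "firewall private interface": {PRIVATE: "tgw"},
    }


def merge(tables: dict[str, dict[str, str]]) -> dict[str, str]:
    """Union the tables, failing if any destination needs two different targets."""
    merged: dict[str, str] = {}
    for name, routes in tables.items():
        for dest, target in routes.items():
            if merged.setdefault(dest, target) != target:
                raise ValueError(f"{name}: conflicting target for {dest}")
    return merged


for overlay in (True, False):
    print(f"overlay routing={overlay}: {merge(route_tables(overlay))}")
```

In this simplified model the three tables merge cleanly in both cases, which is consistent with the argument that a shared subnet and route table would be sufficient.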

KISS KISS
