PANCast™ Episode 41: Scaling on Prisma Access

ozheng · 04-24-2024

Episode Transcript:

John:

Hello PANCasters. Today we have two special guests who are going to talk to us about scaling features in Prisma Access. We have with us Nripendra Shrestha from the product management team and Amit Srivastava, Director of SRE and DevOps. Welcome Nripendra and Amit. To start with, can you tell us a bit about yourselves?

Nripendra:

I am Nripendra - a JAPAC Regional Product Manager for Prisma SASE. I am based in Japan and I have been in this role for about two years now. As a regional product manager, I own the Prisma SASE product portfolio in the JAPAC region along with my colleagues who are based out of India. I have been working in the cloud, cyber security, and compliance space for over a decade now. I have been a GIAC Advisory Board member since 2016. And before joining Palo Alto Networks, I was a Specialist Solutions Architect - for highly regulated industry customers at Amazon Web Services.

Amit:

Hi, this is Amit. I manage the SASE SRE Customer Operations team. I have been in the role for a year and a half now. I am responsible for all Prisma SASE customer operations from the infrastructure side, and my team is also responsible for handling escalations from customer-facing teams.

I have been in the networking, cloud and cybersecurity industry for over two decades now.

John:

Thank you, Nripendra and Amit. So, what is scaling and why do we need it?

What is Scaling?

Nripendra:

Scaling can mean different things depending on the context. In Prisma Access, when we say scaling, we mean the ability to increase or decrease the processing power of Prisma Access in order to process network traffic as required, without an impact on the user experience. Even during high-traffic hours, users will not see any degradation in the user experience, resilience, or security processing capability of Prisma Access. Bearing this in mind, Prisma Access has been designed to scale.

To add - Prisma Access is a SaaS service – and we understand that any infrastructure availability - including scaling-up and scaling-down events – should be transparent to the customers.

Prisma Access infrastructure primarily consists of 3 types of processing nodes - Mobile User Gateway (MU) nodes for mobile users, Remote Network (RN) nodes for users at branch offices, and Service Connection (SC) nodes for users to access private applications hosted at data centers.

Mobile User Gateways are able to scale out horizontally to provide processing power on demand. What that means is that additional instances will be spawned once a threshold is breached. In contrast, Remote Network (RN) and Service Connection (SC) nodes do not scale horizontally. They are designed to scale vertically instead, for pre-determined traffic patterns. We plan to discuss this in depth in a future episode.

John:

Great, thank you. What are the various ways we can scale in Prisma Access?

Scaling in Prisma Access

Amit:

First up, we have horizontal auto-scaling, which means provisioning new Security Processing Nodes (SPNs) when existing nodes experience high network traffic. This is also termed “SCALE-OUT”. Incoming network traffic is distributed to the new SPNs to share the additional load. Once the load on the network goes down, the SPNs are gracefully de-provisioned in a process termed “SCALE-IN”. This is on-demand scaling, and it is enabled by default for all our customers using Prisma Access Mobile User Gateways.

Secondly, we have “scheduled” auto-scaling. This is recommended when we are fairly certain when network traffic is going to be high. For example, if we expect network traffic to peak between 9 am and 12 noon, we recommend “scheduled” auto-scaling. Here the required SPNs are spun up before the expected high-traffic period, so they are ready to use when the traffic arrives and the user experience is not compromised. To use this feature, customers need to contact their customer-facing Palo Alto Networks team.

Thirdly, we have vertical auto-scaling, which means replacing the existing SPNs with SPNs of higher or lower processing capacity, that is, “SCALING-UP” or “SCALING-DOWN” the capacity of the SPN. This is also enabled by default for our customers.
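
As a rough illustration of the on-demand and scheduled modes described above, here is a minimal Python sketch. It is not how Prisma Access is implemented. The thresholds, node limits, lead time, and the names `ScalingPolicy`, `on_demand_target` and `scheduled_target` are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class ScalingPolicy:
    # Illustrative values only; these are not Prisma Access settings.
    scale_out_threshold: float = 0.80  # load above this adds a node
    scale_in_threshold: float = 0.30   # load below this removes a node
    min_nodes: int = 1
    max_nodes: int = 10


def on_demand_target(current_nodes: int, load: float, policy: ScalingPolicy) -> int:
    """Return the desired node count for an observed load between 0.0 and 1.0."""
    if load > policy.scale_out_threshold:
        return min(current_nodes + 1, policy.max_nodes)  # scale-out
    if load < policy.scale_in_threshold:
        return max(current_nodes - 1, policy.min_nodes)  # scale-in
    return current_nodes


def scheduled_target(now: time, window_start: time, window_end: time,
                     lead_minutes: int, peak_nodes: int, base_nodes: int) -> int:
    """Return peak_nodes from lead_minutes before the window until it closes.

    Assumes the window does not cross midnight.
    """
    now_min = now.hour * 60 + now.minute
    start_min = window_start.hour * 60 + window_start.minute - lead_minutes
    end_min = window_end.hour * 60 + window_end.minute
    return peak_nodes if start_min <= now_min < end_min else base_nodes
```

With the 9 am to 12 noon example and a hypothetical 30-minute lead time, `scheduled_target(time(8, 45), time(9, 0), time(12, 0), 30, 4, 2)` asks for the peak node count before the traffic arrives, which is the point of the scheduled mode.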

John:

Got it. What should we be aware of with scaling in Prisma Access?

Nripendra:

There are three things which come to the top of my mind.

First, users should understand that whenever there is scaling out/provisioning, there will also be scaling in/de-provisioning. Be it on-demand or scheduled scaling, some of the existing sessions will be terminated during de-provisioning, and these users may have to reconnect to Prisma Access.

Second, enabling “scheduled” auto-scaling, which we mentioned before, requires coordination led by the Palo Alto Networks customer-facing team.

Third, whenever there is autoscaling, each new processing node is assigned a new public IP address from the allocated pool of active and reserved IP addresses. So we recommend that users register all active and reserved IP addresses in the respective allowlists of their SaaS, public or private applications. These IP addresses can be obtained from our API, and we also support webhooks.

However, some SaaS Applications or Public Clouds may have limitations that cannot be controlled by Palo Alto Networks. For example, we have seen that AWS Security Group policy only allows registration of up to 60 CIDRs. And some SaaS applications only allow the registration of IP addresses instead of CIDRs.
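
To make the allowlist constraint concrete, here is a small Python sketch that uses only the standard `ipaddress` module. It collapses a list of IPv4 addresses into the fewest CIDR blocks and checks the result against an entry limit, and it also expands CIDRs back into single addresses for applications that accept only individual IPs. The sample addresses are documentation ranges, not real Prisma Access egress IPs, and the function names and the default limit of 60 (taken from the example above) are assumptions for illustration. Retrieving the addresses from the API is not shown.

```python
import ipaddress


def build_allowlist(addresses, max_entries=60):
    """Collapse IPv4 addresses/CIDRs and report whether they fit the limit."""
    networks = [ipaddress.ip_network(a) for a in addresses]
    # collapse_addresses merges adjacent or overlapping networks.
    cidrs = [str(n) for n in ipaddress.collapse_addresses(networks)]
    return cidrs, len(cidrs) <= max_entries


def expand_to_addresses(cidrs):
    """List every individual address, for apps that do not accept CIDRs."""
    return [str(ip) for c in cidrs for ip in ipaddress.ip_network(c)]


# Hypothetical active and reserved addresses (documentation ranges).
sample = ["203.0.113.4", "203.0.113.5", "203.0.113.6", "203.0.113.7",
          "198.51.100.10"]
cidrs, fits = build_allowlist(sample)
```

Here the four contiguous addresses collapse into a single `/30`, so five addresses need only two allowlist entries.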

John:

Some great info, thanks. I understand there are additional features like IP optimization.

Can you tell us a bit more about this?

IP Optimization Features

Amit:

Yes. I think you are referring to our NextGen Prisma Access Infrastructure, that is also called NGPA.

In this, we have two remarkable features coming up, namely Network Load Balancer, or NLB, and Egress Network Address Translation, that is, Egress NAT.

First up with NLB - here we will be able to scale your MU GW SPNs behind a fixed public IP.

Secondly with NAT on the egress side, customers will be able to aggregate outgoing network traffic by using a fixed set of allocated IP addresses. This is a very desirable feature as it alleviates the problem of allowlisting multiple egress IP addresses for Mobile User gateways.

This feature is available by default for all users running Prisma Access version 4.0 with a minimum dataplane version of 10.2.4. For brownfield customers, moving to NGPA requires a manual migration. In such cases, customers need to contact their account team for guidance.

John:

Really great information so thank you. It sounds like there is continuous evolution with scaling and Prisma Access.

Amit:

Yes certainly. We have a continuous learning culture at Palo Alto. We have learned a lot from Prisma Access deployments globally. And Auto-scale is no exception. We have put a lot of these learnings into Auto-scale to enhance the customer’s experience with our service.

At a very high level, the new Auto-scale conceptually separates the feature implementation into 3 services that run completely independently: Trigger, Action and Readiness. As the name suggests, Trigger is a service that monitors parameters like CPU, users and sessions to initiate the next service, which is Action. Let's talk about Trigger here because it differs from the way it worked in the past. The new autoscale triggers are computed differently: the data we use is a combination of CPU metrics from all cores of the MU instances in the entire region. This gives us a better understanding of the ‘compute’ requirements, and we can make a more informed decision about instant scale.

Similarly, as part of scale-down events, the algorithm now schedules instance deletions during off-peak hours to eliminate forced logouts.
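
The region-wide trigger and the off-peak deletion scheduling can be sketched in Python as follows. This is only a sketch under stated assumptions: the transcript says the trigger combines CPU metrics from all cores of all MU instances in a region but does not say how, so a plain average is assumed here. The threshold, the off-peak window, and every function name are hypothetical as well.

```python
from statistics import mean


def regional_cpu(instances):
    """Average CPU (0-100) over every core of every instance in a region.

    instances maps an instance name to its list of per-core utilizations.
    Averaging is an assumption; the real combination is not described.
    """
    return mean(core for cores in instances.values() for core in cores)


def should_scale_out(instances, threshold=70.0):
    """Trigger: fire when the region-wide CPU exceeds the threshold."""
    return regional_cpu(instances) > threshold


def deletion_hour(requested_hour, off_peak_start=22, off_peak_end=6):
    """Return the hour (0-23) at which an instance deletion should run.

    Deletions requested outside the off-peak window are deferred to its
    start; the window is assumed to wrap past midnight.
    """
    in_off_peak = requested_hour >= off_peak_start or requested_hour < off_peak_end
    return requested_hour if in_off_peak else off_peak_start


# Hypothetical region with two dual-core MU instances.
region = {"mu-1": [90, 80], "mu-2": [40, 30]}
```

In this made-up region, one instance is busy but the region-wide figure stays below the threshold, so no scale-out is triggered. That is the kind of decision a per-region view changes compared with looking at a single instance.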

John:

OK. So Nripendra and Amit, what are the key takeaways for our listeners?

Episode Key Takeaways

Amit:

Prisma Access auto-scales both vertically and horizontally. This is triggered by spikes in CPU and user counts. We also have time-based triggers to address on-demand and predictive horizontal auto-scale requests. Palo Alto Engineering has a culture of continuous improvement. As a part of this journey, we have recently released Next Generation IP Optimization capabilities. Both our existing and new customers can take advantage of these managed services to optimize their SASE adoption and operations.

Nripendra:

We are listening to our customers' feedback for enhancements and improvements. Please share your feedback through your Palo Alto Networks customer-facing team.

John:

Thank you so much, Amit and Nripendra for sharing some really good information on scaling and Prisma Access. For our PANCast™ listeners, as always, the transcript of this episode will be on live.paloaltonetworks.com, and you will also find links related to this episode.

Related Content:

Prisma Access

jnathan · 04-24-2024

Good overview on Prisma Access scaling.

Thank you @jarena , @ozheng Amit and Nripendra
