Disaster Recovery and High Availability

oDarweesh2 · ‎10-04-2022

Dears,

I would appreciate your clarification on the following scenario in my environment:

-- We have an XSOAR deployment and need to add HA and DR to it. Currently we have a basic installation with BoltDB.

-- After reading the guide, I arrived at the following two solutions:

1- Using the High Availability module. This requires migrating my BoltDB to Elasticsearch and configuring a load balancer. The question here is: how can we configure a DR server for this solution, given that there is no Live Backup with Elasticsearch?

2- Using Live Backup. This option is much easier, as we don't need to migrate to Elasticsearch. The question here is: since we cannot configure more than one standby server, how can we deploy DR for this solution?

I would appreciate your answer.

chrking · ‎10-04-2022

HA and Live Backup are mutually exclusive. If you use one, you can't use the other. Live Backup is sometimes referred to as "DR" because it is intended to be used for disaster recovery, but disaster recovery is a more general concept and not restricted to any one specific technology or architecture.

Disaster recovery as a concept generally has levels, depending on what you're trying to protect against. Are you trying to protect against a single server failure? A partial data center outage (e.g., one rack, one row, etc.)? A complete data center outage? A wide-ranging major disaster that destroys all data centers in a geographical region?

HA will protect just fine against a single server failure, assuming everything is configured correctly and you have sufficient ES nodes. It could protect against partial data center outages if you're careful about your placement of nodes. For anything beyond that, you'd need to carefully consider the inter-node latency, as excessive latency will cause performance issues. In that situation, taking ES backups and storing them in geographically dispersed locations would allow you to set up a new environment in another location without losing all of your data. Depending on your requirements, that may be a sufficient DR plan, as the kinds of wide-ranging disasters that would necessitate this are generally infrequent.
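To make the backup idea a bit more concrete, here is a minimal Python sketch that only builds the two Elasticsearch snapshot REST requests (register a repository, then take a snapshot) and does not send anything. The repository name, bucket name, and snapshot name are placeholders invented for illustration, and the `s3` repository type assumes S3 repository support is available on your Elasticsearch version. Treat it as an outline of the approach, not a tested procedure for XSOAR.

```python
import json


def snapshot_requests(es_url, repo, bucket, snapshot_name):
    """Build (method, url, body) tuples for an Elasticsearch snapshot.

    Nothing is sent over the network; the caller decides how to issue
    the requests (curl, an HTTP client, an ES client library, ...).
    """
    base = es_url.rstrip("/")
    # Step 1: register a snapshot repository that lives off-site.
    # "s3" and the bucket name are assumptions for this example.
    register = (
        "PUT",
        f"{base}/_snapshot/{repo}",
        {"type": "s3", "settings": {"bucket": bucket}},
    )
    # Step 2: take a snapshot. With no request body, Elasticsearch
    # snapshots all indices by default.
    take = (
        "PUT",
        f"{base}/_snapshot/{repo}/{snapshot_name}?wait_for_completion=true",
        None,
    )
    return [register, take]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    for method, url, body in snapshot_requests(
        "http://localhost:9200", "xsoar_dr", "example-dr-bucket", "xsoar-2022-10-05"
    ):
        print(method, url)
        if body is not None:
            print(json.dumps(body, indent=2))
```

Restoring at the other location would follow the same pattern: register the same repository on the new cluster and restore the snapshot from it.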

Live Backup doesn't have the same latency requirements, so you can deploy the standby server further away without issue, assuming you still have sufficient bandwidth. If you use Live Backup, the standby server essentially *is* your disaster recovery plan: manually fail over to it, update any DNS records etc. that are required, and your DR plan is fulfilled. It's not clear to me why you think you need more than a single Live Backup server to provide DR.

oDarweesh2 · ‎10-05-2022

OK, the point of having two Live Backup servers is to fulfill the High Availability requirement at the main site, so that if the production server goes down (for any reason) we can switch to another server at the same site, and to fulfill the Disaster Recovery requirement in another geographical location (a second site in another city), so that if our main site (data center) is damaged we can switch to the environment in the other city.

So can this be done, one way or another, to fulfill both the HA and the DR requirements?

chrking · ‎10-05-2022

Live Backup requires manual failover and therefore cannot provide seamless HA. IMHO having a second host on the main site adds nothing to this configuration above and beyond what the secondary site host already adds.

The way I see it you basically have three choices:

1) Live Backup with 1 host at each site, on the understanding that manual failover is required and this will cause some limited downtime when failover is needed. Additionally, you need to understand that in the event of a break-glass failover (i.e., unable to configure the main site to gracefully become standby), you will need to rebuild your main site host. This has the most disadvantages, but also the lowest hardware requirements and the simplest configuration.

2) HA with a geographically close secondary site. If your secondary site is sufficiently close that you can realistically expect network round-trip times of <= 10 ms, you can deploy an HA configuration with an Elasticsearch cluster that spans both sites. You'll need to configure your Elasticsearch settings to ensure that data is stored on both sites. I would strongly suggest having a skilled Elasticsearch administrator for this, because most of the cross-site redundancy configuration is out of scope of the XSOAR documentation.

3) Single-site HA, with Elasticsearch backups to your secondary site. If your main site is destroyed, you'll be building new servers on the secondary site to get running again (so, relatively long downtime), but there are no latency requirements and this will still provide good protection against single-host hardware failures.
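The three choices above can be condensed into a small decision helper. This is only one possible reading of the reasoning in this post, not an official sizing tool: the function name, parameter names, and returned labels are invented for illustration, and the 10 ms default is the rule of thumb quoted in option 2.

```python
def choose_architecture(needs_automatic_failover, cross_site_rtt_ms, max_rtt_ms=10.0):
    """Pick one of the three options from the post (hypothetical helper).

    needs_automatic_failover: True if manual failover downtime is not acceptable.
    cross_site_rtt_ms: measured round-trip time between the two sites in
        milliseconds, or None if it has not been measured.
    max_rtt_ms: latency threshold for spanning a cluster across sites.
    """
    if not needs_automatic_failover:
        # Option 1: one Live Backup host per site, manual failover.
        return "live-backup"
    if cross_site_rtt_ms is not None and cross_site_rtt_ms <= max_rtt_ms:
        # Option 2: HA with an Elasticsearch cluster spanning both sites.
        return "ha-cross-site"
    # Option 3: HA at the main site, Elasticsearch backups shipped off-site.
    return "ha-single-site-with-backups"


if __name__ == "__main__":
    # Example inputs are made up.
    print(choose_architecture(False, 40.0))
    print(choose_architecture(True, 4.5))
    print(choose_architecture(True, 40.0))
```

An unmeasured round-trip time is deliberately treated like a slow link, so the helper never suggests a cross-site cluster without a latency figure to back it up.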

I hope that helps.
