RecoverPoint GEN6 Hardware Appliance

New RecoverPoint Hardware Platform 

Introducing the latest generation of RecoverPoint Appliance, the ‘EMC Europa 1U’ Gen6, which replaces the ‘Intel R1000’ Gen5 appliance. The Gen6 RPA is available from RP code levels 4.1.3 and 4.4 SP1+. Gen4 (Dell PowerEdge R610), Gen5 and Gen6 appliances are all supported at these code levels and may be mixed in the same cluster, but the recommended best practice is to configure the latest appliance types as RPA 1 and RPA 2 (the cluster control RPAs) to improve system function and performance.

  • Gen6 and Gen5 can exist in the same RecoverPoint cluster
  • Gen6 appliances have to be the first and second RPA in the cluster. This is because those RPAs run the Cluster Control services for the cluster.
  • To achieve this in an existing cluster, replace RPA1 and RPA2 with the Gen6 RPAs, then add the existing Gen5 RPAs back into the cluster.
  • Gen6 and Gen5 can be used in a mixed-site system where Gen6 RPAs reside in one or more sites and Gen5 RPAs reside in one or more other sites.
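
The placement rule in the bullets above can be written as a quick check. This is only an illustrative Python sketch: the function name and the idea of describing a cluster as a list of generation numbers (ordered RPA1, RPA2, ...) are assumptions made for the example, not part of any RecoverPoint tooling.

```python
def control_rpas_are_latest(rpa_generations):
    """Return True when RPA1 and RPA2 are both of the newest generation present.

    rpa_generations is a hypothetical input: generation numbers ordered by
    RPA number, e.g. [6, 6, 5, 5] for a four-node cluster.
    """
    if len(rpa_generations) < 2:
        raise ValueError("expected at least RPA1 and RPA2")
    newest = max(rpa_generations)
    # RPA1 and RPA2 run the cluster control services, so both should be newest.
    return rpa_generations[0] == newest and rpa_generations[1] == newest


print(control_rpas_are_latest([6, 6, 5, 5]))  # Gen6 as RPA1/RPA2, Gen5 added back
print(control_rpas_are_latest([5, 6, 6, 5]))  # a Gen5 is still RPA1
```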

Supported arrays are VMAX 10K, 20K, 40K, VNX, VNXe3200, VNX-F, Unity, VPLEX and XtremIO, plus VMAX3 arrays via VPLEX. Support for Gen6 is also included with Vblock/VxBlock CI systems.

Gen6 Spec:


EMC RecoverPoint for VMs ESXi Splitter Installation

The following steps detail the installation of the ‘EMC RecoverPoint for VMs’ splitter VIB on ESXi hosts. The ‘RP for VMs’ splitter must be installed on each ESXi server hosting VMs that require RP protection. The recommendation is to allow 800MB of RAM per ESXi host for the RP4VM splitter.

Note:

  • The use of VUM is not supported for the splitter install at present.
  • vRPAs must be deployed on an ESXi with a splitter.

Begin by ensuring that SSH is enabled on the ESXi hosts where the splitter will be installed, in order to issue the ESXCLI commands (SSH may be disabled again once the splitter install is complete).

Use WinSCP to securely copy the RP splitter VIB to /tmp.

Change directory to /tmp where the VIB has been copied to and ensure the VIB is present:
cd /tmp
ls -l
du -ah

Run the installation of the VIB file on ESXi host:
# esxcli software vib install -v /tmp/kdriver_RPESX-00.4.3.0.1.0.c.122_md5_d4e7e95a89e74c7ca17be8f4344830b8.vib --no-sig-check

Retrieve the names of the packages installed on the vSphere host confirming the splitter installation:
# esxcli software vib list
# esxcli software vib list | grep RP
Note: VIB stands for vSphere Installation Bundle
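
When the `esxcli software vib list` output has been captured to a file or variable, the same `grep RP` style check can be scripted. The Python sketch below is illustrative only: the sample output is hypothetical (including the VIB names and versions), and it assumes the usual column layout of a header row, a dashed separator row, then one VIB per line.

```python
def grep_vibs(vib_list_output, needle="RP"):
    """Return (name, version) for every VIB row containing `needle`.

    Mirrors `esxcli software vib list | grep RP`: the match is
    case-sensitive and applies to the whole row.
    """
    matches = []
    # Skip the header row and the dashed separator row.
    for line in vib_list_output.splitlines()[2:]:
        fields = line.split()
        if len(fields) >= 2 and needle in line:
            matches.append((fields[0], fields[1]))
    return matches


# Hypothetical captured output, for illustration only.
SAMPLE = """\
Name                 Version      Vendor  Acceptance Level    Install Date
-------------------  -----------  ------  ------------------  ------------
example-RP-splitter  4.3.0.1-122  EMC     CommunitySupported  2016-04-08
net-example          1.0.0-1      VMW     VMwareCertified     2016-01-10
"""

for name, version in grep_vibs(SAMPLE):
    print(name, version)
```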

 

PowerCLI Script:

##########################################################
# RP4VM Splitter Install Version 1.0
# Date: 2016-04-08
# Created by: David Ring
##########################################################

############# vCenter Connectivity Details ################
Write-Host "Please enter the vCenter Host IP Address:" -ForegroundColor Yellow -NoNewline
$VMHost = Read-Host
Write-Host "Please enter the vCenter Username:" -ForegroundColor Yellow -NoNewline
$User = Read-Host
Write-Host "Please enter the vCenter Password:" -ForegroundColor Yellow -NoNewline
$Pass = Read-Host
Connect-VIServer -Server $VMHost -User $User -Password $Pass

####### Please enter the Cluster to install RP4VM Splitter #######
Write-Host "Clusters Associated with this vCenter:" -ForegroundColor Green
$VMcluster = '*'
ForEach ($VMcluster in (Get-Cluster -name $VMcluster)| sort)
{
    Write-Host $VMcluster
}
Write-Host "Please enter the Cluster to install RP4VM Splitter:" -ForegroundColor Yellow -NoNewline
$VMcluster = Read-Host

################# Enabling SSH ######################
Write-Host "Enabling SSH on all hosts in your specified cluster:" -ForegroundColor Green
Get-Cluster $VMcluster | Get-VMHost | ForEach {Start-VMHostService -HostService ($_ | Get-VMHostService | Where {$_.Key -eq "TSM-SSH"})}

############ Please enter the VMFS datastore #############
Write-Host "From the list provided - Please enter the VMFS datastore where the VIB has been uploaded to:" -ForegroundColor Green
$Datastore = '*'
ForEach ($Datastore in (Get-Datastore -name $Datastore)| sort)
{
    Write-Host $Datastore
}
Write-Host "Please enter the VMFS datastore Name:" -ForegroundColor Yellow -NoNewline
$Datastore = Read-Host

########## Please enter the VIB name ##########
Write-Host "Please enter the VIB name e.g. kdriver_RPESX-00.4.3.0.1.0.c.122_md5_d4e7e95a89e74c7ca17be8f4344830b8.vib:" -ForegroundColor Yellow -NoNewline
$VIB = Read-Host

########## Installing RP4VM Splitter ###########
Write-Host "Installing RP4VM Splitter" -ForegroundColor Green
$hosts = Get-Cluster $VMcluster | Get-VMHost
foreach($vihost in $hosts)
{
    $esxcli = get-vmhost $vihost | Get-EsxCli
    $esxcli.software.vib.install($null,$false,$false,$false,$false,$true,$null,$null,"/vmfs/volumes/$Datastore/$VIB")
}

###### Confirm Splitter Installed Successfully #######
Write-Host "Confirm Splitter Installed Successfully" -ForegroundColor Green
$hosts = Get-Cluster $VMcluster | Get-VMHost
forEach ($vihost in $hosts)
{
    $esxcli = get-vmhost $vihost | Get-EsxCli
    $esxcli.software.vib.list() | Where { $_.Name -like "*RP*"} | Select @{N="VMHost";E={$ESXCLI.VMHost}}, Name, Version
}

####### Enter each host in maintenance mode and reboot (use with caution!) #######
Write-Host "Enter all Cluster hosts in maintenance mode and reboot (use with caution!)" -ForegroundColor Yellow -NoNewline
Write-Host " Y/N:" -ForegroundColor Red -NoNewline
$Reboot = Read-Host
if ($Reboot -eq "y") {
    $hosts = Get-Cluster $VMcluster | Get-VMHost
    forEach ($vihost in $hosts)
    {
        $esxcli = get-vmhost $vihost | Get-EsxCli
        $esxcli.system.maintenanceMode.set($true)
        $esxcli.system.shutdown.reboot(10,"RP4VM Splitter")
    }
}

######## Disabling SSH #########
Write-Host "Disabling SSH" -ForegroundColor Green
Get-Cluster $VMcluster | Get-VMHost | ForEach {Stop-VMHostService -HostService ($_ | Get-VMHostService | Where {$_.Key -eq "TSM-SSH"}) -Confirm:$FALSE}

######### RP4VM INSTALLATION COMPLETE #########
Write-Host "RP4VM INSTALLATION COMPLETE" -ForegroundColor Green

EMC VNX – Registering RecoverPoint Initiators

The VNX will have been previously zoned to the RPAs at this stage. For example purposes the config below will have the RPA1-port-3 Zoned to the VNX SP-A&B Port 4 on Fabric-A and RPA1-port-1 Zoned to the VNX SP-A&B Port 5 on Fabric-B. Note: In a synchronous RP solution all 4 RPA ports should be zoned.

Parameters as follows:
Initiator Type = RecoverPoint Appliance (-type 31)
Failover Mode = 4 (ALUA – this mode allows the initiators to send I/O to a LUN regardless of which VNX Storage Processor owns the LUN)
RPA1_IP = IP Address of RPA1
RPA1_NAME = Appropriate name for RPA1 (E.g. RPA1-SITE1)

RPA WWNs can be recognized in the SAN by their 50:01:24:81:….. prefix.
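
As a quick sanity check when reading WWNs off the fabric, the prefix test can be scripted. The Python sketch below is illustrative: the helper names are made up for the example, and it assumes the VNX initiator UID (`-hbauid`) is the 8-octet WWNN followed by the 8-octet WWPN, as in the example values used in this post.

```python
RPA_WWPN_PREFIX = "50:01:24:81"  # RPA prefix quoted in the text


def split_hbauid(hbauid):
    """Split a 16-octet initiator UID into (WWNN, WWPN)."""
    octets = hbauid.split(":")
    if len(octets) != 16:
        raise ValueError("expected 16 colon-separated octets (WWNN:WWPN)")
    return ":".join(octets[:8]), ":".join(octets[8:])


def is_rpa_port(wwpn):
    """True when the WWPN starts with the RecoverPoint appliance prefix."""
    return wwpn.upper().startswith(RPA_WWPN_PREFIX)


wwnn, wwpn = split_hbauid("50:01:24:80:00:64:1C:E3:50:01:24:81:00:64:1C:E3")
print(wwnn, wwpn, is_rpa_port(wwpn))
```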

Example:
Create a storage group for all RPAs on Site1:
naviseccli -User sysadmin -Password sysadmin -Scope 0 -h SP_IP storagegroup -create -gname RPA-Site1-SG

##############
## FABRIC A: ##
##############

RPA1-Port-3 initiator registered to both VNX SP A&B Port 4:

naviseccli -User sysadmin -Password sysadmin -Scope 0 -h SP_IP storagegroup -setpath -gname RPA-Site1-SG -hbauid 50:01:24:80:00:64:1C:E3:50:01:24:81:00:64:1C:E3 -type 31 -ip RPA1_IP -host RPA1_NAME -sp a -spport 4 -failovermode 4 -o

naviseccli -User sysadmin -Password sysadmin -Scope 0 -h SP_IP storagegroup -setpath -gname RPA-Site1-SG -hbauid 50:01:24:80:00:64:1C:E3:50:01:24:81:00:64:1C:E3 -type 31 -ip RPA1_IP -host RPA1_NAME -sp b -spport 4 -failovermode 4 -o

##############
## FABRIC B: ##
##############

RPA1-Port-1 initiator registered to both VNX SP A&B Port 5:

naviseccli -User sysadmin -Password sysadmin -Scope 0 -h SP_IP storagegroup -setpath -gname RPA-Site1-SG -hbauid 50:01:24:80:00:64:1C:E1:50:01:24:81:00:64:1C:E1 -type 31 -ip RPA1_IP -host RPA1_NAME -sp a -spport 5 -failovermode 4 -o

naviseccli -User sysadmin -Password sysadmin -Scope 0 -h SP_IP storagegroup -setpath -gname RPA-Site1-SG -hbauid 50:01:24:80:00:64:1C:E1:50:01:24:81:00:64:1C:E1 -type 31 -ip RPA1_IP -host RPA1_NAME -sp b -spport 5 -failovermode 4 -o
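
With several RPAs and four SP paths per appliance, the setpath commands become repetitive, so generating them can reduce copy and paste errors. The following Python sketch only builds the command strings shown above, it does not run anything. The function name and the port table are assumptions for the example; SP_IP, RPA1_IP and RPA1_NAME are the same placeholders used in this post.

```python
def build_setpath(gname, hbauid, sp, spport, sp_ip="SP_IP",
                  rpa_ip="RPA1_IP", host="RPA1_NAME",
                  user="sysadmin", password="sysadmin"):
    """Build one 'storagegroup -setpath' command in the format used above."""
    return (
        f"naviseccli -User {user} -Password {password} -Scope 0 -h {sp_ip} "
        f"storagegroup -setpath -gname {gname} -hbauid {hbauid} -type 31 "
        f"-ip {rpa_ip} -host {host} -sp {sp} -spport {spport} -failovermode 4 -o"
    )


# (initiator UID, SP port) pairs taken from the example in this post.
RPA1_PORTS = [
    ("50:01:24:80:00:64:1C:E3:50:01:24:81:00:64:1C:E3", 4),  # RPA1-Port-3, Fabric A
    ("50:01:24:80:00:64:1C:E1:50:01:24:81:00:64:1C:E1", 5),  # RPA1-Port-1, Fabric B
]

# Each initiator is registered against both SP A and SP B.
commands = [
    build_setpath("RPA-Site1-SG", hbauid, sp, spport)
    for hbauid, spport in RPA1_PORTS
    for sp in ("a", "b")
]

for command in commands:
    print(command)
```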

Registered Initiators displayed in Unisphere:

[Screenshot VNX-RP-INIT1: registered initiators in Unisphere]

EMC VNX – RecoverPoint Enabler Installation

Installing the RecoverPoint Enabler using NAVICLI:

Check the list of all ENABLERS currently installed on the VNX:
naviseccli -h SP_IP ndu -list

A series of rule checks needs to be performed in advance, and any rule failures corrected, before proceeding. List the rules that will be run:
naviseccli -h SP_IP ndu -runrules -listrules

Your configuration will run the following rules
===============================================
Host Connectivity
Redundant SPs
No Thin Provisioning Transitions
Version Compatibility
No Active Replication I/O
Acceptable Processor Utilization
Statistics Logging Disabled
No Transitions
All Packages Committed
Special Conditions
No Trespassed LUNs
No System Faults
No Interrupted Operations
No Incompatible Operations
FAST Cache Status
No Un-owned LUNs

Run through the Pre-installation rules to ensure the success of this software upgrade:
naviseccli -h SP_IP ndu -runrules -verbose

A common result is a warning for trespassed LUNs:
RULE NAME: No Trespassed LUNs
RULE STATUS: Rule has warning.
RULE DESCRIPTION: This rule checks for trespassed LUNs on the storage system.
A total of 1 trespassed LUNs were found.
RULE INSTRUCTION: If these LUNs are not trespassed back, connectivity will be disrupted.

To remediate this rule failure and change the Current Owner, execute a trespass command on the LUN using naviseccli, or right-click the LUN in Unisphere and select the trespass option:
naviseccli -h SP_IP trespass lun 1
When changing multiple LUNs, running the trespass mine command against an SP trespasses all the LUNs for which that SP has DEFAULT ownership. For example, to trespass back the LUNs with Default Ownership of ‘SP B’ which are currently owned by ‘SP A’:
naviseccli -h SPB_IP trespass mine

If the result is ‘Statistics Logging Disabled : Rule failed.’ then turn off statistics logging:
naviseccli -h SP_IP setstats -off

Confirm all rule checks for RPSplitterEnabler are met:
naviseccli -h SP_IP ndu -runrules c:\VNX\Enablers\RPSplitterEnabler-01.01.5.002.ena -verbose

Running install rules...
===============================================
Version Compatibility : Rule passed.
Redundant SPs : Rule passed.
Acceptable Processor Utilization : Rule passed.
No Trespassed LUNs : Rule passed.
No Transitions : Rule passed.
No System Faults : Rule passed.
All Packages Committed : Rule passed.
Special Conditions : Rule passed.
Statistics Logging Disabled : Rule passed.
Host Connectivity : Rule passed.
No Un-owned LUNs : Rule passed.
No Active Replication I/O : Rule passed.
No Thin Provisioning Transitions : Rule passed.
No Incompatible Operations : Rule passed.
No Interrupted Operations : Rule passed.
FAST Cache Status : Rule passed.
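
If the rule check output is captured to a file, picking out anything that did not pass can be automated. This Python sketch is illustrative and assumes every result line has the "rule name : status" shape shown above; the function names and the shortened sample text are made up for the example.

```python
def parse_rule_results(output):
    """Map rule name -> status for lines shaped like 'Name : Rule passed.'."""
    results = {}
    for line in output.splitlines():
        name, sep, status = line.rpartition(" : ")
        if sep:  # banner and separator lines carry no ' : ' and are skipped
            results[name.strip()] = status.strip()
    return results


def failing_rules(results):
    """Return the names of rules whose status is anything but a pass."""
    return [name for name, status in results.items() if status != "Rule passed."]


# Shortened sample in the format shown in this post.
SAMPLE_RULES = """\
Running install rules...
===============================================
Version Compatibility : Rule passed.
Redundant SPs : Rule passed.
Statistics Logging Disabled : Rule failed.
No Trespassed LUNs : Rule has warning.
"""

print(failing_rules(parse_rule_results(SAMPLE_RULES)))
```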

Install the RPSplitterEnabler:
naviseccli -h SP_IP ndu -install "c:\VNX Enablers\RPSplitterEnabler-01.01.5.002.ena" -delay 360 -force -gen -verbose

Name of the software package: -RecoverpointSplitter
Already Installed Revision NO
Installable YES
Disruptive upgrade: NO
NDU Delay: 360 secs

Monitoring the progress of the installation:
naviseccli -h SP_IP ndu -status
Is Completed: NO
Status: Activating software on primary SP
Operation: Install

naviseccli -h SP_IP ndu -status
Is Completed: NO
Status: Completing install on secondary SP
Operation: Install

naviseccli -h SP_IP ndu -status
Is Completed: YES
Status: Operation completed successfully
Operation: Install
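
Rather than re-running the status command by hand, the captured output can be parsed to decide whether the install has finished. This is a minimal Python sketch that assumes the three-line "key: value" format shown above; the function names are hypothetical, and actually invoking naviseccli in a polling loop is left out.

```python
def parse_ndu_status(output):
    """Turn 'Key: value' lines from 'ndu -status' into a dictionary."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def install_finished(output):
    """True once the 'Is Completed' field reports YES."""
    return parse_ndu_status(output).get("Is Completed", "").upper() == "YES"


IN_PROGRESS = """\
Is Completed: NO
Status: Activating software on primary SP
Operation: Install
"""

DONE = """\
Is Completed: YES
Status: Operation completed successfully
Operation: Install
"""

print(install_finished(IN_PROGRESS), install_finished(DONE))
```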

naviseccli -h SP_IP ndu -list -name -RPSplitterEnabler

Commit Required: NO
Revert Possible: NO
Active State: YES
Is installation completed: YES
Is this System Software: NO

Re-enable stats logging:
naviseccli -h SP_IP setstats -on

If an uninstall is required:
naviseccli -h SP_IP ndu -messner -uninstall -RPSplitterEnabler -delay 360
Uninstall operation will uninstall
-RPSplitterEnabler
from both SPs Set NDU delay with interval time of 360 secs.Do you still want to continue. (y/n)? y

Installing the RecoverPoint Enabler via Unisphere Service Manager (USM):

[Screenshots: Enabler_Install_USM1 through Enabler_Install_USM14]

EMC RecoverPoint – Updated FC Port Labeling on RP Gen5 Server

This is one that caught us by surprise! As per the previous blog post “EMC RecoverPoint Architecture and Basic Concepts”, this is the port configuration as of the initial release of Gen5:

[Image GEN5_1: original port labeling, ABCD]

With the latest RPAs shipping, there is an update to the port labeling on the rear of the Gen5 server (from ABCD to 3210):

[Image GEN5_2: updated port labeling, 3210]

Note: There is no change to the FC adapter; it remains the “Qlogic QLE2564 4-port 8-Gb Fibre”.

EMC RecoverPoint Journal and Replica Volume – Performance Considerations

In this post I will detail some considerations for RecoverPoint Journal and Replica Volumes from a performance perspective. This will give you some insight into the workings and designs that go into a RecoverPoint solution. EMC-qualified personnel and performance tools are best placed to size a solution precisely against each customer's requirements.

For every write on the Source Production Volume there will be five IOs on the Target Side:
1. RPA Write to Journal Volume – sequential write to the Do Stream
2. RPA Read from Journal Volume – sequential read of the oldest data
3. RPA Read from Replica Volume – read current data from location to be written to
4. RPA Write to Journal Volume – write the current data so that the replica volume may be rolled back
5. RPA Write to Replica Volume – write the new data from the Do Stream to the replica volume

This five stage replication model is known as Five-Phase distribution and is the default distribution mode used by RecoverPoint. For more detailed information on the distribution phases and types please see the Administrator’s Guide at support.emc.com

The pattern of the IO therefore is 2*Read and 3*Write on the target side which is split between Journal and Replica Volumes. The exact breakdown of IO type is:
Journal Volume = 2 sequential writes and 1 sequential read
Replica Volume = 1 random read and 1 random write (Production Volume IO Pattern)

The IOs to the Journal Volume are batched together into large IOs, giving the Journal Volume a large sequential IO workload profile (3 sequential IOs to the streams).

The IO profile of the Replica Volume is the same as the write IO profile on the Source Volume plus reads of the same blocks; so for example if you have 8KB random writes on the source you will get 8KB random reads and then 8KB random writes to the same location on the Replica Volume.

The distribution to the Journal Volume uses large cumulative sequential writes independently of what the user I/O pattern is. Thus the Journal Volume should support at least 3 times the incoming rate of the Source Volume. (3x bandwidth of Source Volume, but not as much IOPs requirement)

Example of Journal Volume Throughput (MB/s) requirement:
Given an application workload of 4000 IOPS with an IO pattern of 25% write (1:3 W/R ratio) and an average IO size of 8KB, the required disk throughput is:
1000 write IOPS * 8KB = 8000KB/s
8000KB/s * 3 = 24,000KB/s = ~24MB/s (Throughput Requirement)
What matters is the throughput, not the IOPS, since the Journal Volume uses very large IOs. EMC recommends running a benchmark such as IOMETER on the Journal Volume to measure the throughput performance.
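
The same arithmetic can be wrapped in a small helper for trying other workloads. This is just a sketch of the rule of thumb from this post (three journal IOs per source write, 1000KB per MB); the function and parameter names are my own for the example.

```python
def journal_throughput_mb_s(total_iops, write_fraction, io_size_kb,
                            journal_ios_per_write=3):
    """Estimate the Journal Volume throughput requirement in MB/s.

    Only the write portion of the workload is replicated, and each source
    write results in three journal IOs (2 writes + 1 read).
    """
    write_iops = total_iops * write_fraction
    kb_per_second = write_iops * io_size_kb * journal_ios_per_write
    return kb_per_second / 1000  # KB/s -> MB/s, using 1000KB per MB as above


# Worked example from the post: 4000 IOPS, 25% write, 8KB average IO size.
print(journal_throughput_mb_s(4000, 0.25, 8))
```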

Example of Replica Volume IOPs requirement:
However, for the Replica Volume, those 1000 * 8KB write IOs translate to 1000 * 8KB read misses + 1000 * 8KB writes on the Replica Volume target device = 2000 IOPS requirement.

Remember these calculations are from the fact that the Journal Volume endures 2 writes and 1 read large sequential IOs, whereas the Replica Volume endures 1 write and 1 read random IOs for each 8KB write IO on Source Volume.
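
To make the tally explicit, the five phases can be listed as data and counted. The Python sketch below is illustrative only: the phase table simply restates the five steps listed earlier in this post, and the names are made up for the example.

```python
from collections import Counter

# (volume, operation) for each of the five distribution phases, in order.
FIVE_PHASES = [
    ("journal", "write"),  # 1. write incoming data to the Do Stream
    ("journal", "read"),   # 2. read the oldest data from the journal
    ("replica", "read"),   # 3. read current data from the target location
    ("journal", "write"),  # 4. save the current data to allow rollback
    ("replica", "write"),  # 5. write the new data to the replica
]


def target_ios_per_write():
    """Count target-side IOs per source write, grouped by volume and type."""
    return Counter(FIVE_PHASES)


def replica_iops(source_write_iops):
    """Replica Volume IOPS requirement for a given source write rate."""
    per_write = sum(count for (volume, _), count in target_ios_per_write().items()
                    if volume == "replica")
    return source_write_iops * per_write


print(target_ios_per_write())
print(replica_iops(1000))
```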

It is important to emphasise the Replica Volume performance, since the performance of the Journal can often overshadow the Replica performance requirements. Be mindful of this when sizing solutions: a miscalculation will result in “distribution too slow” warning messages, which may lead to high loads (a high load is a bottleneck somewhere in the flow that forces RecoverPoint to drop replication until it clears).

Another consideration to be mindful of is the performance on Failover of the Replica Volume as the Replica Volume needs to be capable of running the production systems during a failover (DR) situation. Many customers may consider a Replica Volume performance to be acceptable at 70% to 90% of Production (Source Volume) but other customers want 100% performance. In this situation IOPs figures will be a very important consideration on the Replica Volume.

A slow Journal Volume or slow Replica Volume may also have an impact on Journal Lag. If Journal Lag is increasing, investigating the Replica and Journal performance characteristics may be a worthwhile exercise.

It is also good practice to allocate a dedicated RAID group for the Journal Volume(s) (RAID5 is a good option for large sequential IO performance). As pointed out above the IO pattern of the Journal Volume is large sequential IO thus it is best practice not to mix these LUNs with any LUNs of a random IO type profile.

As always, these are my own ramblings; always leverage EMC (or partners) in developing your optimum RecoverPoint solution. In a later post I will cover the WAN performance considerations along with Compression and Deduplication considerations.

EMC RecoverPoint Considerations for Journal sizing – Protection Windows

A very important consideration is the capacity sizing of your RecoverPoint journals, which is done on a per Consistency Group (CG) basis. This will dictate the level of failback allowed (Protection Window) on a per CG configuration. The Journal Volume (JV) must have the correct performance characteristics in order to handle the total write performance of the LUN(s) being protected (covered in the next blog). It must also have the capacity to store all the writes of the protected LUN(s). The two most important questions to ask are:

• What change rate does the source LUN generate?
• What retention window is required?

To calculate the journal capacity, you must measure the rate of change on the production LUNs within a CG. This can be done by analyzing the data per second values using the array analysis tools (for example, on the EMC VNX array, data per second values are found using Unisphere Analyzer). This figure may change over the course of, say, a 24-hour period, so it is good practice to average out the rate of change requirement. Also note that EMC as well as partners have access to EMC Business Continuity Solution Designer (BCSD) to help in sizing journals.
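
Averaging the change rate over the day is simple to script once the samples have been exported from the analysis tool. The Python sketch below is illustrative: the sample values are hypothetical, and it assumes the samples are evenly spaced and already expressed in Mb/s.

```python
from statistics import mean


def average_change_rate(samples_mbps):
    """Average evenly spaced change-rate samples (Mb/s) over the period."""
    if not samples_mbps:
        raise ValueError("at least one sample is required")
    return mean(samples_mbps)


# Hypothetical samples across a day, in Mb/s: quiet overnight, busy at midday.
samples = [4, 6, 20, 10]

print(average_change_rate(samples))  # average rate used for capacity sizing
print(max(samples))                  # peak rate, useful for performance checks
```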

The Journal Volume Sizing formula is:
Journal size = (data per second in Mb/s) * (required rollback time in seconds) / (1 - target side log size) * 1.05

Twenty percent of the journal must be reserved for the target side log and five percent for internal system needs.
For example, to support a 24-hour rollback requirement (86,400 seconds), with 10 megabits per second (Mb/s) of new data writes to the replication volumes in a consistency group, the calculation is:

(10 * 86,400) / (1 - 0.2) * 1.05 = 1,134,000Mb = 141,750MB = ~140GB
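
The sizing formula above is easy to script for different change rates and protection windows. This is a minimal Python sketch of that formula, with the 20% target side log and 5% system overhead as defaults; the function names are my own, and the conversion uses 8 bits per byte and 1000MB per GB.

```python
def journal_size_megabits(change_rate_mbps, rollback_seconds,
                          target_log_fraction=0.2, system_overhead=1.05):
    """Journal size in megabits for a given change rate and rollback window."""
    usable_fraction = 1 - target_log_fraction  # space left after the target side log
    return change_rate_mbps * rollback_seconds / usable_fraction * system_overhead


def megabits_to_gigabytes(megabits):
    """Convert megabits to gigabytes (8 bits per byte, 1000MB per GB)."""
    return megabits / 8 / 1000


# Worked example from the post: 10 Mb/s of new writes, 24-hour rollback.
size_mb = journal_size_megabits(10, 86_400)
print(round(size_mb), "Mb")
print(round(megabits_to_gigabytes(size_mb), 2), "GB")
```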

The example here is to give you an idea of the planning behind the capacity requirements. While the above calculation is used for sizing capacity of the Journal my next blog will detail the performance characteristics required by both the Journal and Replica volumes.

Note: When the change rate values are not available then a general rule of thumb is to size the journal at 20% of the data being replicated.