Azure Storage Account - Automate the Archived blob rehydration and copy the files using Azure Data Factory
Introduction
I have already posted about copying data from on-premises to an Azure storage account. This post is a continuation of that one: it shows how to restore data from the Azure Archive tier back to Azure or on-premises with automation.
Archived data can't be copied directly; it has to be rehydrated first, and only then can the data be copied. I worked on this solution and was able to accomplish it with very little manual effort.
I used the following tools to achieve this requirement:
Azure Storage account - data stored in the Archive tier
Batch account - executes the script that performs the rehydration of the Azure blobs
Azure Data Factory - copies the data back to on-premises
Logic Apps - sends a mail once the activity completes
The final Azure Data Factory activity flow is: Web activity (retrieve the SAS token from Key Vault) -> Custom activity (blob rehydration) -> Copy activity -> success/failure mail via Logic Apps.
I created a storage account and uploaded some blobs in the Archive tier (a quick sketch of doing this with PowerShell is below).
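A minimal sketch of staging a test blob directly in the Archive tier, assuming the Az.Storage module is installed and signed in; the account, container and folder names are the illustrative ones reused later in this post:
# Build a storage context for the test account (uses the signed-in Azure account)
$ctx = New-AzStorageContext -StorageAccountName "adfazstoragecopytesting" -UseConnectedAccount
# Upload a sample file straight into the Archive tier
Set-AzStorageBlobContent -File ".\sample.dat" `
    -Container "target" `
    -Blob "folder4/sample.dat" `
    -Context $ctx `
    -StandardBlobTier Archive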
Then, in the data factory, create the required parameters under the "Global Parameters" section, for example:
Source,
Target,
Folder (if you want to restore a single folder),
Target Tier (used to change the blob tier; the script also checks the blob tier against this value)
Create the source and destination linked services and datasets with parameterized locations. Configure the source dataset first, then repeat the same for the target dataset; a sketch of a parameterized dataset definition follows.
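Below is a minimal sketch of what a parameterized source dataset definition can look like and one way to publish it with the Az.DataFactory module. The resource group, factory, linked service, dataset and parameter names are assumptions for illustration only; the same result can be achieved through the ADF authoring UI.
# Hypothetical parameterized Binary dataset pointing at Azure Blob Storage;
# the container and folder are supplied at run time from the pipeline.
$datasetJson = @'
{
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLS",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "containerName": { "type": "string" },
            "folderName": { "type": "string" }
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": { "value": "@dataset().containerName", "type": "Expression" },
                "folderPath": { "value": "@dataset().folderName", "type": "Expression" }
            }
        }
    }
}
'@
# Write the definition to a file and publish it to the factory (names are illustrative)
$datasetJson | Set-Content -Path ".\SourceBlobDataset.json"
Set-AzDataFactoryV2Dataset -ResourceGroupName "my-rg" -DataFactoryName "my-adf" `
    -Name "SourceBlobDataset" -DefinitionFile ".\SourceBlobDataset.json"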
Once they have been created, let's create the pipeline. The pipeline activities I used are:
Web activity,
Custom activity,
Copy activity.
1. Web Activity:- Used to retrieve the storage account SAS token from Azure Key Vault. In the Web activity settings, set the following values (a quick verification sketch follows the list):
URL -> URL of the secret stored in Azure Key Vault, e.g. https://<vault-name>.vault.azure.net/secrets/<secret-name>?api-version=7.0
Method -> GET
Authentication -> Managed Identity
Resource -> https://vault.azure.net
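As a quick check that the secret is retrievable before wiring it into the pipeline, you can read it with the Az.KeyVault module (the vault and secret names here are hypothetical):
# Returns the same string the Web activity exposes as output.value
Get-AzKeyVaultSecret -VaultName "my-keyvault" -Name "storage-sas-token" -AsPlainText
Note that the data factory's managed identity must be granted Get permission on secrets in the vault for the Web activity call to succeed.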
2. Custom Activity:- Used to set the blob tier to Hot or Cool and wait until the rehydration completes and the tier actually changes. All of its inputs are parameterized using the global parameter values mentioned above. Also add the mail functionality in case it fails.
This activity requires a Batch service. I'm not going to explain how to create the Batch account, as it's relatively straightforward; once created, link the Batch account to the custom activity.
Next, click on Settings and define the parameters and the script path.
Script command:
powershell .\AzSet-BlobRehydration-v2.ps1
-AzureStorageAccount @{pipeline().globalParameters.AzureStorageAccount}
-AzureStorageAccountSASToken '@{activity('Get-AzureStorageSecret').output.value}'
-AzureContainerName @{pipeline().globalParameters.AzureContainerName}
-AzureFoldername @{pipeline().globalParameters.AzureFoldername}
-AzureTargettier @{pipeline().globalParameters.AzureTargettier}
PowerShell Script for Blob Rehydration:
###################Script Starts Here######################
param
(
    [Parameter(Mandatory=$true,
        Position=1,
        HelpMessage="Provide the name of the Azure Storage account")]
    $AzureStorageAccount,
    [Parameter(Mandatory=$true,
        Position=2,
        HelpMessage="SAS token used to authenticate against the storage account")]
    $AzureStorageAccountSASToken,
    [Parameter(Mandatory=$true,
        Position=3,
        HelpMessage="Provide the container name where the data is stored")]
    $AzureContainerName,
    [Parameter(Mandatory=$true,
        Position=4,
        HelpMessage="Provide the folder name from the container, e.g. abxis245675,adapt245803")]
    $AzureFoldername,
    [Parameter(Mandatory=$true,
        Position=5,
        HelpMessage="Provide the tier the blobs should be changed to, e.g. Hot")]
    $AzureTargettier
)
<# Example values for local testing:
$AzureStorageAccount = "adfazstoragecopytesting"
$AzureFoldername = "folder4"
$AzureTargettier = "Hot"
$AzureContainerName = "target" #>

# Create a storage context from the SAS token
$sasToken = "$AzureStorageAccountSASToken"
$accContext = New-AzStorageContext -StorageAccountName $AzureStorageAccount -SasToken $sasToken

# Get all blobs in the container and filter them down to the requested folder
$allBlobs = Get-AzStorageBlob -Container $AzureContainerName -Context $accContext
$filteredBlobs = $allBlobs | Where-Object { $_.Name -match $AzureFoldername }

# Request the tier change; on archived blobs this starts the rehydration
foreach ($blob in $filteredBlobs) {
    $blob.ICloudBlob.SetStandardBlobTier("$AzureTargettier")
}

# Poll until every filtered blob reports the target access tier.
# Rehydration from the Archive tier can take several hours, so this activity runs long.
while ($filteredBlobs.AccessTier -ne "$AzureTargettier") {
    Write-Output "The blob tier $($filteredBlobs.AccessTier) is not yet changed to $AzureTargettier"
    Start-Sleep -Seconds 2
    $allBlobs = Get-AzStorageBlob -Container $AzureContainerName -Context $accContext
    $filteredBlobs = $allBlobs | Where-Object { $_.Name -match $AzureFoldername }
    if ($filteredBlobs.AccessTier -eq "$AzureTargettier") {
        Write-Host "Blob has been changed to $AzureTargettier"
    }
}
###################Script Ends Here######################
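Before plugging the script into the Batch custom activity, it can be run locally as a sanity check (the values are illustrative and the SAS token is truncated):
.\AzSet-BlobRehydration-v2.ps1 -AzureStorageAccount "adfazstoragecopytesting" `
    -AzureStorageAccountSASToken "?sv=..." `
    -AzureContainerName "target" `
    -AzureFoldername "folder4" `
    -AzureTargettier "Hot"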
3. Failure email:- Sent if the storage account SAS token retrieval fails.
4. Copy Activity:- Used to copy the data between source and destination with parameterized values.
Also, add a Web activity to send a mail if the copy activity fails.
5. Mail notification:- Sent if the PowerShell (custom) activity fails.
6. Finally, add Success and Failure Web activities linked to the Copy data activity.
Body of the Send-Mail web activity:
{"Subject":"@{pipeline().Pipeline}-pipeline","DataFactoryName":"@{pipeline().DataFactory}",
"ActivityRunId":"@{activity('copy-storageaccount-files').ActivityRunId}",
"StartTime":"@{activity('copy-storageaccount-files').output.executionDetails[0].start}",
"DataSource":"@{activity('copy-storageaccount-files').output.executionDetails[0].source}",
"DataTarget":"@{activity('copy-storageaccount-files').output.executionDetails[0].sink}",
"SourceSizeInBytes":"@{activity('copy-storageaccount-files').output.dataRead}",
"TargetSizeInBytes":"@{activity('copy-storageaccount-files').output.dataWritten}",
"TotalSourceFilesRead":"@{activity('copy-storageaccount-files').output.filesRead}",
"TotalTargetFilesWritten":"@{activity('copy-storageaccount-files').output.filesWritten}",
"CopyDurationinMin":"@{activity('copy-storageaccount-files').output.copyDuration}",
"RehydrationDurationInSec":"@{activity('Set-AzBlobRehydration').output.executionDuration}",
"Throughput":"@{activity('copy-storageaccount-files').output.throughput}",
"Status":"@{activity('copy-storageaccount-files').output.executionDetails[0].status}",
"PipelineName":"@{pipeline().Pipeline}"}
7. Logic App:- Sends those activity details by mail, along with the logged attachments.
HTTP POST JSON schema:
{
    "properties": {
        "CopyDurationinMin": { "type": "string" },
        "DataFactoryName": { "type": "string" },
        "DataSource": { "type": "string" },
        "DataTarget": { "type": "string" },
        "ErrorMessage": { "type": "string" },
        "PipelineName": { "type": "string" },
        "RehydrationDurationInSec": { "type": "string" },
        "SourceSizeInBytes": { "type": "string" },
        "StartTime": { "type": "string" },
        "Status": { "type": "string" },
        "Subject": { "type": "string" },
        "TargetSizeInBytes": { "type": "string" },
        "Throughput": { "type": "string" },
        "TotalSourceFilesRead": { "type": "string" },
        "TotalTargetFilesWritten": { "type": "string" },
        "activityRunId": { "type": "string" }
    },
    "type": "object"
}
For the List Blobs, Filter Array, and Get blob content using path (V2) actions, please refer to my previous post for more information.
Next, configure the Send an email action.
Finally, I received the success email, which completes the blob rehydration along with the copy activity.