VCS Hybrid Cloud Implementation Using Pure Storage on Microsoft Azure

From the “Key Highlights from SNUG 2022”:

“One of the most attractive uses of the cloud for chip development is bursting VCS workloads. “Bursting” to the cloud is all about dynamic deployment of applications and allows customers to leverage the huge scale of compute that the cloud offers. While hosting design data completely on the cloud is simpler and more efficient, many customers want a hybrid scenario where they can store data on a wholly owned storage solution while leveraging Cloud as a Service (CaaS).

Microsoft Azure has worked with Pure Storage and Equinix to offer such a colocation hybrid solution for customers to gain the desired performance for EDA workloads. On Day 2 of SNUG 2022, Microsoft’s senior program manager, Raymond Meng-Ru Tsai, and Pure Storage’s technical director, Bikash Roy Choudhury, led a joint session to provide attendees with an in-depth perspective of running the industry’s highest performance simulation solution, Synopsys VCS® via Microsoft Azure and Pure Storage FlashBlade® at scale. They discussed best practices to verify parameters such as completion time, storage throughput patterns, and network route capabilities. This discussion also provided attendees with granular details of a tried-and-tested method to store data on a wholly owned FlashBlade device located in an Equinix data center while being connected to the Azure cloud for compute.”

Synopsys users will be able to access SNUG content at SolvNetPlus.

Create RAID Array on Azure Linux VM

This article shows how to create a RAID 0 array (for best performance) or a RAID 1 array (for fault tolerance) on an Azure Linux VM. It uses HB120v3 as the example and stripes its 2 local 960GiB NVMe disks.

1. Use the lsblk command to find the device names.

On HB120v3, you will see 2 NVMe devices: “nvme0n1” and “nvme1n1”.
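
If you want the device names in a form you can pass straight to mdadm, a small filter over the lsblk output can help. The sketch below assumes the local disks have names beginning with nvme, as they do on HB120v3. The function name list_nvme_disks is made up for this example.

```shell
# Print /dev paths of whole NVMe disks, reading `lsblk -d -n -o NAME,TYPE`
# output from stdin. list_nvme_disks is a hypothetical helper name.
list_nvme_disks() {
    awk '$2 == "disk" && $1 ~ /^nvme/ { print "/dev/" $1 }'
}

# Query the real system only when lsblk is available.
if command -v lsblk >/dev/null 2>&1; then
    lsblk -d -n -o NAME,TYPE | list_nvme_disks
fi
```

On HB120v3 this should list the two NVMe devices named above, each prefixed with /dev/.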

2. Create RAID 0 Array.

# Create a logical RAID 0 device named "NVME_RAID".
# Change --level=0 to --level=1 if you would like to create a RAID 1 array instead
sudo mdadm --create --verbose /dev/md0 --level=0 --name=NVME_RAID --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
# Create an ext4 file system with label "NVME_RAID"
sudo mkfs.ext4 -L NVME_RAID /dev/md0
# To ensure RAID array is reassembled automatically on boot
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
# Create a new ramdisk image
sudo dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)
# Create a mount point
sudo mkdir -p /mnt/raid
# Mount the RAID device
sudo mount LABEL=NVME_RAID /mnt/raid
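
The mount command above lasts only until the next reboot. One common way to make the mount persistent is an /etc/fstab entry. The sketch below is an optional addition, not part of the original procedure. It assumes the ext4 label and mount point used above, and add_fstab_entry is a hypothetical helper name. The nofail option lets the VM boot even if the array is missing.

```shell
# Append an fstab entry for a labeled ext4 file system unless one already exists.
# Arguments: <label> <mount point> [fstab path, default /etc/fstab]
# add_fstab_entry is a hypothetical helper; run it as root for /etc/fstab.
add_fstab_entry() {
    label="$1"
    mount_point="$2"
    fstab="${3:-/etc/fstab}"
    entry="LABEL=${label} ${mount_point} ext4 defaults,nofail 0 2"
    if ! grep -q "^LABEL=${label}[[:space:]]" "$fstab"; then
        printf '%s\n' "$entry" >> "$fstab"
    fi
}

# Usage on the VM (as root):
#   add_fstab_entry NVME_RAID /mnt/raid
```

The function is safe to run more than once because it skips the append when an entry for the label is already present.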

3. Verify that the 2 NVMe devices have been striped together as a 1.8TiB RAID 0 array.

[hpcadmin@hb120v3 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        221G     0  221G   0% /dev
tmpfs           221G     0  221G   0% /dev/shm
tmpfs           221G  9.0M  221G   1% /run
tmpfs           221G     0  221G   0% /sys/fs/cgroup
/dev/sda2        30G   14G   16G  45% /
/dev/sda1       494M  113M  382M  23% /boot
/dev/sda15      495M   12M  484M   3% /boot/efi
/dev/sdb1       473G   73M  449G   1% /mnt/resource
/dev/md0        1.8T   77M  1.7T   1% /mnt/raid
tmpfs            45G     0   45G   0% /run/user/1000
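
Besides df, the kernel reports the array state in /proc/mdstat. As a minimal sketch, the helper below prints the state and RAID level for one md device. The name md_status is an assumption for this example, and the parsing assumes the usual "md0 : active raid0 ..." line format.

```shell
# Print "<state> <level>" for an md device, e.g. "active raid0".
# Arguments: <device name, e.g. md0> [mdstat path, default /proc/mdstat]
# md_status is a hypothetical helper name.
md_status() {
    dev="$1"
    mdstat="${2:-/proc/mdstat}"
    awk -v dev="$dev" '$1 == dev && $2 == ":" { print $3, $4 }' "$mdstat"
}

# Usage on the VM:
#   md_status md0
```

For more detail, including member devices and chunk size, sudo mdadm --detail /dev/md0 gives the full report.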

For a Windows VM, see: Create RAID Array on Azure Windows VM – Raymond’s Tech Thoughts

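How to Quickly Build an HPC Environment on Azure

High Performance Computing (HPC) underpins workloads such as big-data machine learning, semiconductor EDA simulation, and weather forecasting. Traditionally, only government agencies (such as a weather bureau) or large enterprises (such as TSMC) have had the budget to procure and build this kind of computing environment, which combines tens of thousands of CPU cores or more with storage and high-speed networking.

Now you can build an HPC environment of whatever size you choose on Azure in tens of minutes, as the basis for a POC and for later moving production compute to the cloud.

This article uses the open-source tool AzureHPC to build a simple HPC environment: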

A Virtual Network named hpcvnet, containing a subnet named compute.
A Virtual Machine named headnode, with the following installed:
- PBS batch scheduling system
- 2TB NFS file system
A Virtual Machine Scale Set named compute, containing 2 instances.

Prerequisites:

A Linux environment, such as WSL 2 on Windows 10, with the Azure CLI installed. Alternatively, you can run the steps directly in Cloud Shell on the Azure Portal.
An Azure subscription with sufficient quota:
- 1xDS8_v3 (8 cores)
- 2xHC44rs (88 cores)
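
The core counts above are instance count times cores per VM. The sketch below works through that arithmetic. The function name cores_needed is made up for this example, and the family labels in the output are an assumption about how Azure groups these VM sizes for quota.

```shell
# Total cores for <count> VMs of a size with <cores_per_vm> cores each.
# cores_needed is a hypothetical helper name.
cores_needed() {
    echo $(( $1 * $2 ))
}

echo "DSv3 family: $(cores_needed 1 8) cores"    # headnode, 1 x DS8_v3
echo "HC family:   $(cores_needed 2 44) cores"   # compute, 2 x HC44rs

# To compare against your current quota (replace the placeholder region):
#   az vm list-usage --location <your-region> --output table
```

Quota is tracked per VM family and per region, so check the region you plan to deploy to.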

Steps:

1. Download the AzureHPC repo:

# log in to your Azure subscription
$ az login

# mkdir your working environment
$ mkdir airlift 
$ cd airlift

# Clone the AzureHPC repo 
$ git clone https://github.com/Azure/azurehpc.git
$ cd azurehpc

# Source the install script 
$ source install.sh

The AzureHPC repo contains many prebuilt templates, all under the /examples directory.

2. Edit the config.json file:

# This example uses the /examples/simple_hpc_pbs template
$ cd examples/simple_hpc_pbs

# Edit config.json with your preferred editor
# vi config.json 
$ code .

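Fill in the fields including location, resource_group, and vm_type. Also read through the rest of this configuration file: networking, storage, and other resources are all defined here, and you can change them to suit your needs.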
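
After editing, it can help to list the lines that set these three fields and confirm that none was missed. The sketch below only searches the file for the key names mentioned above. It assumes they appear literally as JSON keys, which may not match every template, and show_config_fields is a hypothetical helper name.

```shell
# Print, with line numbers, the lines that set the fields named in this step.
# Argument: [config path, default config.json]
# show_config_fields is a hypothetical helper name.
show_config_fields() {
    config="${1:-config.json}"
    grep -n -E '"(location|resource_group|vm_type)"[[:space:]]*:' "$config"
}

# Usage from the template directory:
#   show_config_fields config.json
```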

3. Build the HPC environment.

$ azhpc-build

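The build takes about ten minutes. If an error occurs, fix it and run the command again; the tool automatically checks for and skips the steps that have already completed.

4. Log in to the headnode and check the HPC environment you built. Note the /share directory: its three subdirectories /apps, /data, and /home are all accessible from the PBS nodes and hold executables and shared data.

You can also log in to the Azure portal to check.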

$ azhpc-connect -u hpcuser headnode
Fri Jun 28 09:18:04 UTC 2019 : logging in to headnode (via headnode6cfe86.westus2.cloudapp.azure.com)
$ pbsnodes -avS
$ df -h

5. Submit jobs with PBS and monitor their status:

$ qstat -Q
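
To have something to monitor, you need a job in the queue. The sketch below writes a small PBS job script that lists the nodes the job was given. The resource request (2 nodes with 44 cores each) is an assumption based on the 2 HC44rs instances in this environment, and the file name and job name are made up for the example.

```shell
# Write a minimal PBS job script to the given path (default: hello.pbs).
# write_hello_job is a hypothetical helper name.
write_hello_job() {
    out="${1:-hello.pbs}"
    cat > "$out" <<'EOF'
#!/bin/bash
#PBS -N hello
#PBS -l select=2:ncpus=44
#PBS -l walltime=00:05:00
#PBS -j oe
cd "$PBS_O_WORKDIR"
# Print each node assigned to this job once
sort -u "$PBS_NODEFILE"
EOF
}

# Usage on the headnode:
#   write_hello_job hello.pbs
#   qsub hello.pbs    # submit the job; prints the job ID
#   qstat -a          # list jobs and their states
```

The job's output file appears in the directory you submitted from once the job finishes.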

6. Delete the environment:

Delete the entire resource group directly in the Azure portal.
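
If you prefer the command line, the Azure CLI can delete the resource group as well. The following is a minimal sketch and an alternative to the portal. The wrapper name delete_resource_group is made up for this example, and you must pass the resource group name you set in config.json.

```shell
# Delete a resource group without prompting and without waiting for completion.
# delete_resource_group is a hypothetical wrapper around `az group delete`.
delete_resource_group() {
    rg="$1"
    if [ -z "$rg" ]; then
        echo "usage: delete_resource_group <resource-group-name>" >&2
        return 1
    fi
    az group delete --name "$rg" --yes --no-wait
}

# Usage:
#   delete_resource_group <your-resource-group>
```

Deleting the resource group removes every resource inside it, so double-check the name first.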

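Performance Tuning for EDA Workloads on Azure NetApp Files (ANF)

Microsoft's NFS solution, Azure NetApp Files (ANF), has been widely adopted across industries, including by many semiconductor companies that run their electronic design automation (EDA) workloads on Azure.

Azure NetApp Files offers 3 service levels that guarantee throughput, supports the NFSv3, NFSv4.1, and SMB mount protocols from Windows or Linux VMs, is simple to operate, and takes only a few minutes to set up. Enterprises can migrate their applications to Azure seamlessly, with an experience and performance similar to on-premises NetApp.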
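
As a rough illustration of how the service level determines throughput, the sketch below multiplies provisioned capacity by a per-TiB figure. The level names and the figures (16, 64, and 128 MiB/s per TiB for Standard, Premium, and Ultra) are not from this article. They are taken from Azure documentation as an assumption and should be checked against the current limits. The function name is made up for this example.

```shell
# Estimate the throughput limit in MiB/s for a service level and a size in TiB.
# Per-TiB figures are assumptions; verify them against current Azure documentation.
# anf_throughput_mibps is a hypothetical helper name.
anf_throughput_mibps() {
    level="$1"
    tib="$2"
    case "$level" in
        Standard) per_tib=16 ;;
        Premium)  per_tib=64 ;;
        Ultra)    per_tib=128 ;;
        *) echo "unknown service level: $level" >&2; return 1 ;;
    esac
    echo $(( per_tib * tib ))
}

anf_throughput_mibps Premium 4   # prints 256
```

The same capacity at a higher service level yields proportionally more throughput, which is the trade-off a cost analysis has to weigh.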

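The purpose of this article is to share the experience gained from running tests such as the SPEC EDA benchmark and FIO on Azure NetApp Files, and to:

Provide practical, real-world performance best practices.
Examine ANF's scale-out capability using multiple NFS volumes.
Provide a cost-effectiveness analysis to help users choose the most suitable ANF service level.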

Performance best practice- running EDA workloads on Azure NetApp Files (microsoft.com)

Running NCBI BLAST on Azure

BLAST compares DNA or protein sequences. It can be used, for example, to trace the origin of Covid-19 or to measure the genetic similarity between humans and Neanderthals. Running BLAST on an ordinary machine is usually very time-consuming. This article shows how to run BLAST on Azure and covers the optimization process and best practices.

Running NCBI BLAST on Azure – Performance, Scalability and Best Practice (microsoft.com)