Create new ASM Disks for an Oracle RAC in VirtualBox

In this post we try to explain the minimum steps needed to add a new disk to an Oracle ASM instance virtualized with VirtualBox. We use the command line for all the tasks (most of them can also be performed with the different GUIs).

The first step is to “create” the virtual disk; this also “registers” the disk in the VirtualBox Media Manager. In this case the disk is 10 GB in size (10,240 MB) and is created in the directory where the command is executed, with the file name “asm5.vdi”.


vboxmanage createhd --filename asm5.vdi --size 10240 --format VDI --variant Fixed
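
If you want to double-check that the disk has been created and registered in the Media Manager, you can list the media known to VirtualBox (a quick verification; the output format depends on the VirtualBox version):

vboxmanage list hdds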

Next we need to “attach” this disk to all the servers in our RAC configuration (in this case a two-node RAC); the commands to do this are:

vboxmanage storageattach ol7-121-rac1 --storagectl "SATA" --port 5 --device 0 --type hdd --medium asm5.vdi --mtype shareable
vboxmanage storageattach ol7-121-rac2 --storagectl "SATA" --port 5 --device 0 --type hdd --medium asm5.vdi --mtype shareable

The names of the two virtual machines are ol7-121-rac1 and ol7-121-rac2. The --storagectl parameter names the controller (in our case a virtual SATA controller), --device indicates the device number on that port (here “0”), and --port must be an unused port on the controller.
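
If you are not sure which ports are already taken, the current storage attachments of a VM can be listed before choosing one (the grep pattern assumes the controller is called “SATA”, as in our configuration):

vboxmanage showvminfo ol7-121-rac1 | grep "SATA"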

Finally we must mark the disk as “shareable” to allow both servers to use it at the same time:

vboxmanage  modifyhd asm5.vdi --type shareable
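
To verify that the image is now marked as shareable we can inspect it (only fixed-size images, like the one created above, can be made shareable; in recent VirtualBox versions this subcommand is also available as showmediuminfo):

vboxmanage showhdinfo asm5.vdi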

The tasks on the VirtualBox side are done; next we go to the virtual machines to prepare the new volume.

Now a message like the following should appear in the system log of both servers (no reboot is necessary), showing that the new device has been detected:

[ 1451.887229] scsi 7:0:0:0: Direct-Access     ATA      VBOX HARDDISK    1.0  PQ: 0 ANSI: 5
[ 1451.888322] sd 7:0:0:0: [sdg] 20971520 512-byte logical blocks: (10.7 GB/10.0 GiB)
[ 1451.888338] sd 7:0:0:0: [sdg] Write Protect is off
[ 1451.888340] sd 7:0:0:0: [sdg] Mode Sense: 00 3a 00 00
[ 1451.888347] sd 7:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1451.889102] sd 7:0:0:0: Attached scsi generic sg7 type 0
[ 1451.889468]  sdg: unknown partition table
[ 1451.889556] sd 7:0:0:0: [sdg] Attached SCSI disk
[ 1452.328731] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
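
If the new disk does not show up by itself, a rescan of the SCSI hosts normally makes it visible without a reboot (a minimal sketch, assuming the host adapters are exposed under /sys/class/scsi_host):

# run as root on each node where the disk is missing
for host in /sys/class/scsi_host/host*; do
  echo "- - -" > ${host}/scan
done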

We must capture the “scsi id” of the newly created device with this command, on any of the servers:

[root@ol7-121-rac1 ~]# /usr/lib/udev/scsi_id -g -u -d /dev/sdg
1ATA_VBOX_HARDDISK_VB7d862d36-95a92590
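
If it is not obvious which device is the new one, the same command can be run in a loop over all the disks and the ids compared (a small helper, assuming the disks are named /dev/sda, /dev/sdb, and so on):

for disk in /dev/sd[a-z]; do
  echo "${disk}: $(/usr/lib/udev/scsi_id -g -u -d ${disk})"
done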

The next step, executed on only one of the servers, is to create a partition on the new volume (sdg in this case).

[root@ol7-121-rac1 ~]# fdisk /dev/sdg
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0x62330d87.

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-20971519, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-20971519, default 20971519):
Using default value 20971519
Partition 1 of type Linux and of size 10 GiB is set

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
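
The same single partition can also be created non-interactively, which is handy if you script the whole procedure (an alternative sketch using parted instead of the interactive fdisk session shown above):

# create an msdos label and one primary partition starting at sector 2048 (1 MiB aligned)
parted -s /dev/sdg mklabel msdos
parted -s /dev/sdg mkpart primary 2048s 100%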

After that, the second server has no knowledge of the new partition on the device; to make the new partition visible to the other server we must execute this command on it:

[root@ol7-121-rac2 ~]# partx /dev/sdg
NR START      END  SECTORS SIZE NAME UUID
1  2048 20971519 20969472  10G
[root@ol7-121-rac2 ~]# partx -a /dev/sdg
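
After that, the partition should also appear in the kernel’s view of the disk on the second node (lsblk is just one quick way to check):

lsblk /dev/sdg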

Now all servers have a new partitioned block device. However, the device names may change between reboots (and may not be the same on all servers), and the default permissions may not allow the ASM instances to use the device. The next step is therefore to ensure consistent device naming and permissions at every boot and across all servers.

This can be accomplished in various ways, for example with Oracle’s ASMLib or with “udev” (dynamic device management) from the OS. In this case we use the second approach.

We have created a udev rules file called 99-oracle-asmdevices.rules:

[root@ol7-121-rac1 ~]# cat /etc/udev/rules.d/99-oracle-asmdevices.rules
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VB00ecec07-395a2a52", SYMLINK+="oracleasm/asm-disk1", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VB1453d286-361fb64b", SYMLINK+="oracleasm/asm-disk2", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VB2ed01775-17d58006", SYMLINK+="oracleasm/asm-disk3", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VB0badefb4-2d8464ae", SYMLINK+="oracleasm/asm-disk4", OWNER="oracle", GROUP="dba", MODE="0660"
KERNEL=="sd?1", SUBSYSTEM=="block", PROGRAM=="/usr/lib/udev/scsi_id -g -u -d /dev/$parent", RESULT=="1ATA_VBOX_HARDDISK_VB7d862d36-95a92590", SYMLINK+="oracleasm/asm-disk5", OWNER="oracle", GROUP="dba", MODE="0660"

When udev runs during boot, it configures the devices following the rules in this file.

Every time the kernel detects a device matching the defined pattern “sd?1”, it fires an event and udev walks through the rules in the file: first it executes a command (scsi_id) to retrieve the “id” of the device (this id never changes during the life of the device), then it compares that id with the predefined ids and, if one matches, it creates a symbolic link to the device with a fixed name and changes the permissions on the device (this is where the “scsi id” captured a few steps ago comes into play).
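
Once the id of the new disk is known, the corresponding rule line (the last one in the file above, for asm-disk5) can even be generated instead of typed by hand; this is only a small sketch, and the symlink name, owner and group are the ones used in this environment:

ID=$(/usr/lib/udev/scsi_id -g -u -d /dev/sdg)
echo "KERNEL==\"sd?1\", SUBSYSTEM==\"block\", PROGRAM==\"/usr/lib/udev/scsi_id -g -u -d /dev/\$parent\", RESULT==\"${ID}\", SYMLINK+=\"oracleasm/asm-disk5\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\""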

Finally we must “reload” the udev rules (it is possible to run a “test” first with the command udevadm test /block/sdg1):

[root@ol7-121-rac1 ~]# udevadm control --reload-rules
[root@ol7-121-rac1 ~]# udevadm trigger
[root@ol7-121-rac1 ~]# ls -ltr /dev/oracleasm/
total 0
lrwxrwxrwx. 1 root root 7 Jun  9 17:26 asm-disk3 -> ../sdd1
lrwxrwxrwx. 1 root root 7 Jun  9 17:26 asm-disk5 -> ../sdg1
lrwxrwxrwx. 1 root root 7 Jun  9 17:26 asm-disk2 -> ../sdc1
lrwxrwxrwx. 1 root root 7 Jun  9 17:26 asm-disk1 -> ../sdb1
lrwxrwxrwx. 1 root root 7 Jun  9 17:26 asm-disk4 -> ../sde1
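
The symlinks point at the regular device nodes, so the ownership and mode set by the rule can be checked on the device itself (sdg1 in this example):

ls -l /dev/sdg1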

Note: The permissions are changed on the device itself, not on the link.

Note 2: These rules only work on “partitioned” devices, because the kernel pattern that triggers them is “sd?1”; a device without any partition does not match this pattern.

Finally we must connect to the ASM instance and add the new disk to a disk group:

ALTER DISKGROUP data ADD DISK '/dev/oracleasm/asm-disk5';

Note: This command takes only a few seconds, but in the background the ASM instance launches a “rebalance” operation to move part of the existing data onto the new disk.
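
The progress of that rebalance can be followed from the ASM instance, for example through the V$ASM_OPERATION view (the query returns no rows once the rebalance has finished):

SELECT group_number, operation, state, power, est_minutes
  FROM v$asm_operation;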

You can now use all the “new” space in your databases or cloud file systems.

Rafael.

Cluster Health Monitor Database size, or who is filling my clusterware volume?

Inside most installations of Oracle Clusterware 11g there is a Berkeley DB database used to store system performance data. This database also exists in the initial release of 12c (12.1.0.1); after that (from the first 12c patch set, 12.1.0.2) the Berkeley DB is replaced by an Oracle database called the “Grid Infrastructure Management Repository”.

This database is limited to 1 GB in size, but due to some bugs or to configuration changes (changing the default data retention) it can exceed this limit.

In the case explained here, a bug caused the database to grow beyond 80 GB, and we needed to resize it back to normal values. To do this we used the method explained in MOS note 1343105.1; the procedure basically consists of stopping the Berkeley DB database, dropping all its data files and starting it again. It is also good practice to review the retention time for the data and correct it if it is not the expected one. Note that this procedure removes all historic performance data.
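
As a rough outline of what such a procedure looks like on 11.2 (this is only a sketch; follow the MOS note for the exact steps and paths of your release, and verify the repository location, for example with “oclumon manage -get reppath”, before deleting anything):

# run as root, with GRID_HOME pointing to the Grid Infrastructure home
crsctl stop res ora.crf -init                 # stop the Cluster Health Monitor resource
rm -f $GRID_HOME/crf/db/$(hostname)/*         # usual default location of the Berkeley DB files
crsctl start res ora.crf -init                # restart; an empty repository is recreated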
