Replication when time is of the essence – Snapshot Standby
We have a couple of environments which have 14 databases replicated from production every day. Our current method is to use a storage-based mechanism where the disks are “snapped” and then mounted on another server. Then the file systems are copied from this other server to the file systems in our replicated environments. For all but one database, the entire process takes just under a half hour. The last database on it’s own takes over 80 minutes and as it is growing daily, takes longer each day. Many methods involving many technologies have been considered. I suggested a method involving a physical standby and flashback. I had used this method to test the primary and disaster recovery (DR) sites at the same time for another company a few years ago (albeit, that was a 100% manual process then). I set up a demo a few months ago in my lab and now have a demo running at work. In my lab, I have a primary database called mydb on a RHEL5 server called centora. The standby will have a SERVICE_NAME of mydbd and runs on a RHEL5 server called centora2. Here’s how it goes:
On both servers, ensure there is a definition for the primary and standby database in the $ORACLE_HOME/network/admin/tnsnames.ora file:
mydb = (description = (address = (protocol = tcp)(host = centora.lidh.com)(port = 1521)) (connect_data = (server = dedicated)(service_name = mydb)) ) mydbd = (description = (address = (protocol = tcp)(host = centora2.lidh.com)(port = 1521)) (connect_data = (server = dedicated)(service_name = mydbd)) )
Then on the primary database, check the db_unique_name. It will need to be different from the standby.
sys@MYDB> show parameter unique NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ db_unique_name string mydb
Now it’s time to set some parameters in the primary database to make it a primary:
sys@MYDB> alter system set log_archive_config='DG_CONFIG=(mydb,mydbd)'; System altered. sys@MYDB> alter system set log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=mydb'; System altered. sys@MYDB> alter system set log_archive_dest_2='SERVICE=mydbd lgwr async valid_for=(online_logfiles, primary_role) db_unique_name=mydbd'; System altered. sys@MYDB> alter system set log_archive_dest_state_2='DEFER'; System altered. sys@MYDB> alter system set log_archive_max_processes=3; System altered. sys@MYDB> alter system set dg_broker_start=true; System altered.
Now we need to enable forced logging.
sys@MYDB> alter database force logging; Database altered.
Now I backup the database including a standby controlfile.
RMAN> backup format '/u02/backup/mydb/mydb_stby_%U' database include current controlfile for standby;
From the primary server, I create a new init<SID>.ora file from the spfile and I copy the init<SID>.ora and password file over to the standby server. For the standby, I make some modifications to the parameter file:
*.db_unique_name='mydbd' *.service_names='mydbd' *.fal_client='mydbd' *.fal_server='mydb' *.standby_archive_dest=USE_DB_RECOVERY_FILE_DEST *.log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST valid_for=(all_logfiles,all_roles) db_unique_name=mydbd'
In order to use USE_DB_RECOVERY_FILE_DEST, you have to already have the db_recovery_file_dest parameter defined (which I do). I then create an spfile from the pfile and start the standby database in nomount mode. Now I’m ready to restore the standby.
RMAN> duplicate target database for standby nofilenamecheck;
You might have realized that I didn’t set the fal_client or fal_server parameters in the primary or the log_archive_dest_2 parameter in the standby. For now, there’s no intention of ever using this standby as a DR database for the primary. If that ever becomes the case, I’ll set up the rest of the parameters. Once the standby is restored, it’s time to put it into managed recovery.
idle> alter database flashback on; Database altered. idle> alter database recover managed standby database disconnect from session; Database altered.
The final step is in the primary, and that’s to enable the archive destination to the standby.
sys@MYDB> alter system set log_archive_dest_state_2=’ENABLE’; System altered.
So now we have a one-way standby environment set up. Note that in the standby, we also enabled flashback. Flashback isn’t required for the primary database so if you have a 10g database, you don’t have to worry about bouncing production.
So at a predetermined and scheduled (in cron) point in time, I decide I’m ready to use my standby as a read/write development database. In the primary I switch a log file and then turn off log shipping to the standby:
prmy> alter system archive log current; prmy> alter system set dg_broker_start=false; prmy> alter system set log_archive_dest_state_2=defer;
Then five minutes later, I “activate” my standby. This is not to be confused with the 11g Active Data Guard which has your standby up in read-only mode and applying logs from the primary. That is a licensed and expensive option which doesn’t give us a development database.
stby> alter database recover managed standby database cancel; stby> create restore point devdb guarantee flashback database; stby> alter database activate standby database; stby> alter database open;
Now I can make independent changes in both databases without affecting the other. I ran my load generator against both databases at the same time. At the time I’m ready to “deactivate” my standby, the primary database looks like this:
appuser@MYDB> select a.name name,b.numwidgets numwidgets,b.orderdate from mycustomer a,custorder b where a.id = b.custid and b.orderdate = (select max(orderdate) from custorder); NAME NUMWIDGETS ORDERDATE ------------------------------ ---------- -------------------- Bush 11 14-FEB-2012 12:52:17 Rivera 13 14-FEB-2012 12:52:17 Martin 58 14-FEB-2012 12:52:17 appuser@MYDB> select count(*) from mycustomer; COUNT(*) ---------- 192 appuser@MYDB>
And my standby looks like this:
appuser@MYDB> select a.name name,b.numwidgets numwidgets,b.orderdate from mycustomer a,custorder b where a.id = b.custid and b.orderdate = (select max(orderdate) from custorder); NAME NUMWIDGETS ORDERDATE ------------------------------ ---------- -------------------- Wallace 79 14-FEB-2012 12:52:02 Fitzgerald 86 14-FEB-2012 12:52:02 appuser@MYDB> select count(*) from mycustomer; COUNT(*) ---------- 183 appuser@MYDB>
So they are definitely different. Now, after a busy day developing, we’re ready to put our development database back into managed standby and catch it up with production. In the standby:
stby> shutdown immediate stby> startup mount force; stby> flashback database to restore point devdb; stby> alter database convert to physical standby; stby> startup mount force; stby> alter database recover managed standby database disconnect from session; stby> drop restore point devdb;
And then in the primary, we re-enable log shipping:
prmy> alter system set log_archive_dest_state_2=enable; prmy> alter system set dg_broker_start=true;
So with flashback, we remove all the changes that occurred in development and then with the standby back in managed recovery, we apply all the changes that occurred in the primary. Just to check, I’ll open the standby read-only and check the numbers:
idle> alter database recover managed standby database cancel; Database altered. idle> alter database open; Database altered. idle> connect appuser/appuser Connected. appuser@MYDB> select a.name name,b.numwidgets numwidgets,b.orderdate from mycustomer a,custorder b where a.id = b.custid and b.orderdate = (select max(orderdate) from custorder); NAME NUMWIDGETS ORDERDATE ------------------------------ ---------- -------------------- Bush 11 14-FEB-2012 12:52:17 Rivera 13 14-FEB-2012 12:52:17 Martin 58 14-FEB-2012 12:52:17 appuser@MYDB> select count(*) from mycustomer; COUNT(*) ---------- 192 appuser@MYDB>
And as you can see, my standby now matches my primary again. I did script the “activate” and “deactivate” commands but they are far from elegant at this point. When I add exception handling and notification, I’ll post them. Also, since we have archiving on in the standby, I added a script to the process to clear out old archive logs to keep the destination from filling up.
So the current storage-based solution takes 80 minutes to replicate the larger of our 14 databases. My standby/flashback method takes just under a minute and is more current.
Topics: Flashback, Oracle, Physical Standby, Snapshot Standby | No Comments »
Flashback database and archive logs
So at work, I’m going to be testing the impact of a global non-partitioned index on a partitioned table (approx 1.6 billion rows). This table gets loaded every morning with a lot of records and then some partitions get compressed which invalidates the indexes. I want to see how the load will be affected by another index and how long it will take to rebuild the index after the compression.
As a test, i decided to allow a load to occur and then flashback the database. Then I would create the index and redo the load. After the compression I would rebuild the index and time all the steps as well as evaluate how much more is being used in terms of system resources (CPU, I/O, etc) as a result of the new index being there. Under normal circumstances, we don’t have archiving on this database. So five days ago, I prepped the system by:
- Enable archiving
- Set the parameters db_recovery_file_dest_size, db_recovery_file_dest, and db_flashback_retention_target
- Enable flashback
So now, when I query flashback_on from v$database, it says “YES”. Since we didn’t have a lot of room on the server, I purged archive logs periodically but ensured I had all my flashback logs. But this morning, when I went to flashback the database about five hours back, I got this:
ORA-38754: FLASHBACK DATABASE not started; required redo log is not available
I hadn’t ever done flashback database on a database which doesn’t normally have archiving enabled. Normally I have two days of archive logs on disk for production databases. As a test, I went to one of my 10gR2 lab databases. This database has three redo groups. I ran my JDBC load program for about 15 minutes and switched log files four times during the load. My database had been up since 10:02am and I started the load around 10:08am. Now, let’s make sure flashback is still enabled:
[/home/oracle/dev] $ sqlp SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jan 27 10:17:46 2012 Copyright (c) 1982, 2007, Oracle. All Rights Reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options sys@MYDB> select flashback_on from v$database; FLASHBACK_ON ------------------ YES sys@MYDB>
Now I want to delete all my archived logs during that time.
[/home/oracle/dev] $ rman target / Recovery Manager: Release 10.2.0.4.0 - Production on Fri Jan 27 10:18:29 2012 Copyright (c) 1982, 2007, Oracle. All rights reserved. connected to target database: MYDB (DBID=2677001426) RMAN> delete noprompt archivelog all; using target database control file instead of recovery catalog allocated channel: ORA_DISK_1 channel ORA_DISK_1: sid=136 devtype=DISK List of Archived Log Copies Key Thrd Seq S Low Time Name ------- ---- ------- - --------- ---- 70 1 71 A 25-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_71_7l5p8j14_.arc 71 1 72 A 26-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_72_7l5p8jch_.arc 69 1 73 A 27-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_73_7l5p8hnw_.arc 72 1 1 A 27-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_1_7l5pogw0_.arc 73 1 2 A 27-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_2_7l5prd2n_.arc 74 1 3 A 27-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_3_7l5pv8fb_.arc 75 1 4 A 27-JAN-12 /u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_4_7l5q14d3_.arc deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_71_7l5p8j14_.arc recid=70 stamp=773661856 deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_72_7l5p8jch_.arc recid=71 stamp=773661856 deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_73_7l5p8hnw_.arc recid=69 stamp=773661856 deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_1_7l5pogw0_.arc recid=72 stamp=773662271 deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_2_7l5prd2n_.arc recid=73 stamp=773662364 deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_3_7l5pv8fb_.arc recid=74 stamp=773662457 deleted archive log archive log filename=/u02/fra/MYDB/archivelog/2012_01_27/o1_mf_1_4_7l5q14d3_.arc recid=75 stamp=773662644 Deleted 7 objects RMAN>
I haven’t touched my flashback logs so I’ll try to flashback to 10:09am.
[/home/oracle/dev]
$ sqlp
SQL*Plus: Release 10.2.0.4.0 - Production on Fri Jan 27 10:19:15 2012
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
sys@MYDB> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
sys@MYDB> startup mount exclusive
ORACLE instance started.
Total System Global Area 209715200 bytes
Fixed Size 2082784 bytes
Variable Size 142608416 bytes
Database Buffers 58720256 bytes
Redo Buffers 6303744 bytes
Database mounted.
sys@MYDB> flashback database to timestamp to_timestamp('2012-01-27 10:09','yyyy-mm-dd hh24:mi');
flashback database to timestamp to_timestamp('2012-01-27 10:09','yyyy-mm-dd hh24:mi')
*
ERROR at line 1:
ORA-38754: FLASHBACK DATABASE not started; required redo log is not available
ORA-38761: redo log sequence 1 in thread 1, incarnation 2 could not be accessed
sys@MYDB>
So my lesson here is not to remove the archived logs generated during the time I want to flashback through.
Topics: Flashback, Oracle | No Comments »
Solaris 11/Oracle 11gR2 RAC on VirtualBox – Epilogue
So I did install the 11gR2 (11.2.0.3.0) RDBMS on my cluster and it went so smoothly, I didn’t bother blogging about it. There are far better bloggers on database installation than me. I will mention that the two zip files I mentioned in the beginning of the last post, p10404530_112030_Solaris86-64_1of6.zip and p10404530_112030_Solaris86-64_2of6.zip are needed for the installation. I then created a clustered database called srac with an instance on each node. I also increased the RAM on each node to 6 GB. Later, I found that my RMAN commands were throwing errors saying
TNS-12549: TNS:operating system resource quota exceeded
I looked and saw I had 1.4 GB of unused RAM. I knew I had plenty of file system space so I couldn’t figure out what resource I was low on. I checked out /var/adm/messages and found lots of these:
Dec 13 22:18:20 solrc2.lidh.com genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 464 (oracle)
So my lack of swap (again, set up by the Solaris installer) was biting me. In my Linux-y way, I figured I would just do a
root> mkfile 3g /var/newswap root> swap -a /var/newswap
But I quickly figured out that this doesn’t work with ZFS in Solaris 11. So how do you add swap?
root> zfs set volsize=6G rpool/swap
When I did a swap -l, I did notice that the space didn’t show up. So I bounced the servers and now
$ swap -l -h swapfile dev swaplo blocks free /dev/zvol/dsk/rpool/swap 195,2 4K 6.0G 6.0G
And my databases are happy and I haven’t seen a swap message in 17 hours since I rebooted.
Topics: Oracle, RAC/Grid Infrastructure, Solaris | No Comments »
Solaris 11/Oracle 11gR2 RAC on VirtualBox – Part II
With this post, I’m going to step through the Oracle 11gR2 Grid Infrastructure installation. Because of the networking issues I encountered earlier, I had to deinstall my software two times. I’ve done it with the documentation months ago but I followed the steps outlined here which worked perfectly.
Since I wanted the latest version of Grid Infrastructure (GI) and the RDBMS, I went to My Oracle Support (MOS) and downloaded p10404530_112030_Solaris86-64_1of6.zip, p10404530_112030_Solaris86-64_2of6.zip, and p10404530_112030_Solaris86-64_3of6.zip. For the GI installation, I just need the “3of6″ zip file. As root, I ran the rootpre.sh which looks to see if SunCluster is running and the platform. It came back with “No action needed.” which is always nice to see.
Next step was to set up “user equivalence” between my Oracle users on my two nodes. For this, I referred to my old notes:
On node 1
solrc1> mkdir ~/.ssh solrc1> chmod 700 ~/.ssh solrc1> ssh-keygen -t rsa solrc1> ssh-keygen -t dsa
Then on node 2
solrc2> mkdir ~/.ssh solrc2> chmod 700 ~/.ssh solrc2> ssh-keygen -t rsa solrc2> ssh-keygen -t dsa
Back to node 1
solrc1> cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys solrc1> cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys solrc1> ssh solrc2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys solrc1> ssh solrc2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys solrc1> scp ~/.ssh/authorized_keys solrc2:~/.ssh/authorized_keys
At this point, I can run the runcluvfy.sh script as root. The cluster verification script didn’t like my 1 GB swap (which Oracle Solaris set up for me) because I have 4 GB RAM. Since I’m going to use ASM to hold my OCR and voting disk, I need to have my disks set up for Oracle to use. For this, I went to the Oracle documentation . It wasn’t as helpful as I like but workable.
root@solrc1:~# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c3t0d0 <ATA-VBOX HARDDISK-1.0 cyl 4565 alt 2 hd 255 sec 63>
/pci@0,0/pci8086,2829@d/disk@0,0
1. c3t3d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@3,0
2. c3t4d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@4,0
3. c3t5d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@5,0
4. c3t6d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@6,0
5. c3t7d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@7,0
6. c3t8d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@8,0
7. c3t9d0 <ATA-VBOX HARDDISK-1.0 cyl 2558 alt 2 hd 128 sec 32>
/pci@0,0/pci8086,2829@d/disk@9,0
Specify disk (enter its number): 1
selecting c3t3d0
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
fdisk - run the fdisk program
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show disk ID
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> partition
PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> 1
Part Tag Flag Cylinders Size Blocks
1 unassigned wm 0 0 (0/0/0) 0
Enter partition id tag[unassigned]: usr
Enter partition permission flags[wm]:
Enter new starting cyl[0]: 1
Enter partition size[0b, 0c, 1e, 0.00mb, 0.00gb]: 4.99gb
partition> label
Ready to label disk, continue? y
partition> print
Current partition table (original):
Total disk cylinders available: 2557 + 2 (reserved cylinders)
Part Tag Flag Cylinders Size Blocks
0 unassigned wm 0 0 (0/0/0) 0
1 usr wm 1 - 2555 4.99GB (2555/0/0) 10465280
2 backup wu 0 - 2556 4.99GB (2557/0/0) 10473472
3 unassigned wm 0 0 (0/0/0) 0
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 unassigned wm 0 0 (0/0/0) 0
7 unassigned wm 0 0 (0/0/0) 0
8 boot wu 0 - 0 2.00MB (1/0/0) 4096
9 unassigned wm 0 0 (0/0/0) 0
partition> quit
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
fdisk - run the fdisk program
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show disk ID
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> quit
And then I do the same thing for disks 2-7. According to the documentation, “Oracle Solaris fdisk partitions must start at cylinder 1, not cylinder 0.” Finally, the mode and ownership of the disk slices have to be modified so the asmadmin (in my case, oracle) owns them.
root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t3d0s1 root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t4d0s1 root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t5d0s1 root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t6d0s1 root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t7d0s1 root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t8d0s1 root@solrc1:~# chown oracle:asmadmin /dev/rdsk/c3t9d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t3d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t4d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t5d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t6d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t7d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t8d0s1 root@solrc1:~# chmod 660 /dev/rdsk/c3t9d0s1
Now everything should be ready for the GI install. As oracle, I kick off the runInstaller.sh.
I choose to “Skip software updates” and then the “Next” button.
Since this is a brand new cluster, I want to install and configure it. I’m not upgrading from a previous version.
I actually tried both the typical and the advanced installation during my failed attempts. The advanced asked me more questions but I wound up taking the defaults most of the time anyway.
I had already put an unused IP address with a SCAN name in my /etc/hosts file, so I entered that name. Then I added my second node and it’s VIP name (also already in the /etc/hosts file).
I added the software base and installation locations. I also entered my SYSASM password, which Oracle doesn’t like…but I don’t care. I also chose ASM for my storage. No raw devices for my OCR or voting disk.
The next step discovered the disks I just set up. Since this is for my DATA disk group, I only chose the first four disks. The next two I’ve dedicated for my Fast Recovery Area (FRA) and the last is a spare. Make sure you choose external redundancy if you’re short on disks like me.
Here’s my inventory location. This was the default so I took it.
Now it does a verification for the prerequisites. I know it won’t like my swap.
The “Device checks for ASM” confused me. When I looked at it, this is what it said.
So it just can’t figure out if the storage is shared. I’m not running Veritas or asmlib or anything so I guess that makes sense. I decide to ignore the warnings and press onward!
With the network setup I described in the last post, the two root.sh scripts ran with no issues.
The final cluster verification utility check did fail but I already knew it would. But when I closed up the window, my cluster was working. I have ASM and the listener running and all my cluster services are running, as well. I used the asmca utility to configure my FRA with the next two disks and I’m ready to install the database software.
Topics: Oracle, RAC/Grid Infrastructure, Solaris | No Comments »
Solaris 11/Oracle 11gR2 RAC on VirtualBox – Part I
So I’ve been wanting to build a cluster with something besides Linux for a long time. In real life, I did build a 3-node 9i cluster on Tru64 and I’ve supported RAC on HP-UX, Solaris (SPARC), Windows Server (with ASM), and RHEL5 but in my home environment, I’ve only done Linux…until now. I should state up front that my Solaris administration skills are basic and I don’t know anyone personally with any Solaris 11 experience. Part I of this two-part post is just the Solaris server setup for my two nodes. I’m going to leave out the epic failures I had getting the networking set up.
I already have VirtualBox installed and running. I got my Solaris ISO from Oracle’s Technology download site. I used the Live Media although I’m sure the others would work. As opposed to some virtual Linux RAC blogs, I didn’t clone my host. I don’t know enough about Solaris to attempt it so I created and configured each node separately. I gave both nodes 4 GB of RAM and a 35 GB drive for the OS and eventually the Grid Infrastructure and RDBMS software. I used a bridged connection to my VMware Player vmnet8 (NAT) network for my public network and a host-only network for the private interconnect between my nodes. As far as I know, the VirtualBox NAT still doesn’t let you choose your IP address.
My example for doing the Solaris install will be my second node. I set up my Live Media iso file as the CD mounted. When I started the machine, I chose the first option:
I took the defaults for keyboard and language. They aren’t really important at this time because this isn’t actually the install, yet. This is kind of a “try before you buy” version.
And then I had a running Solaris machine. I like the looks of the OS with the old-school Gnome menus and the pretty background. But the important thing at this point is the desktop icon to “Install Oracle Solaris.”
I double-clicked the install icon and this is where I actually start creating my Solaris virtual machine. The first screen looks similar to the Oracle 11g install screens. Nothing exciting, so we just click “Next.”
At this point, I decided to let Oracle/Solaris use my whole virtual disk (35 GB). I probably should have done the partitioning myself because Oracle only allocated 1 GB for swap even though I gave it 4 GB of RAM. Oracle would then complain about my lack of swap when I started the Grid Infrastructure install. Sigh.
As with any operating system install, I selected my time zone.
The next step is to create a user. I will be creating my Oracle user later so I created my personal account. The root password will be the same as this one and you’ll be required to change it when you first su to root.
Now we wait as Oracle installs Solaris.
When it’s done, you’ll see this:
This time when my VM boots up, I choose the option of booting from the hard drive. If you leave it to the default, you’ll get the “live media” version which is not what we want.
This time when it boots up, it’s your real virtual server. Now the interesting stuff happens. For some reason, Oracle’s default for networking is DHCP. This makes sense for a laptop but not a server, to me. So the first step is to disable automatic networking.
root@solrc1:~# netadm enable -p ncp DefaultFixed
The next step is to set up the static IP and default route. I created my /etc/hosts file for my cluster which looks like this:
# # Copyright 2009 Sun Microsystems, Inc. All rights reserved. # Use is subject to license terms. # # Internet host table # 127.0.0.1 localhost loghost # For the Solaris cluster 172.16.65.200 solrc1.lidh.com solrc1 192.168.56.200 solrc1-priv.lidh.com solrc1-priv 172.16.65.210 solrc1-vip.lidh.com solrc1-vip # 172.16.65.201 solrc2.lidh.com solrc2 192.168.56.201 solrc2-priv.lidh.com solrc2-priv 172.16.65.211 solrc2-vip.lidh.com solrc2-vip # 172.16.65.190 solscan.lidh.com solscan
If I do an ifconfig -a, I see my two networks are net0 (public) and net1(private). So to configure these for my first node, I did the following:
root@solrc1:~# ipadm create-ip net0 root@solrc1:~# ipadm create-addr -T static -a 172.16.65.200/24 net0/v4static root@solrc1:~# route -p add default 172.16.65.2 root@solrc1:~# ipadm create-ip net1 root@solrc1:~# ipadm create-addr -T static -a 192.168.56.200/24 net1/v4static
And then I wanted to make sure the servers used my files (/etc/hosts, /etc/defaultrouter, /etc/resolv.conf) so I did this:
root@solrc1:~# svccfg svc:> select name-service/switch svc:/system/name-service/switch> setprop config/host = astring: "files dns" svc:/system/name-service/switch> setprop config/ipnodes = astring: "files dns" svc:/system/name-service/switch> select name-service/switch:default svc:/system/name-service/switch:default> refresh svc:/system/name-service/switch:default> validate svc:/system/name-service/switch:default> exit
My second node looks the same except the specific IP addresses:
root@solrc2:~# ipadm create-ip net0 root@solrc2:~# ipadm create-addr -T static -a 172.16.65.201/24 net0/v4static root@solrc2:~# route -p add default 172.16.65.2 root@solrc2:~# ipadm create-ip net1 root@solrc2:~# ipadm create-addr -T static -a 192.168.56.201/24 net1/v4static root@solrc2:~# svccfg svc:> select name-service/switch svc:/system/name-service/switch> setprop config/host = astring: "files dns" svc:/system/name-service/switch> setprop config/ipnodes = astring: "files dns" svc:/system/name-service/switch> select name-service/switch:default svc:/system/name-service/switch:default> refresh svc:/system/name-service/switch:default> validate svc:/system/name-service/switch:default> exit
I tried three other methods to get the static IP to work (including Oracle documentation) but this method was the only one that allowed the creation of the VIP during the Grid Infrastructure install to work.
The next step for me was to create my Oracle user and groups. I also set up the /u01 file system for the Oracle user to do the installs.
root@solrc1:~# mkdir /u01 root@solrc1:~# groupadd dba root@solrc1:~# groupadd oinstall root@solrc1:~# groupadd asmdba root@solrc1:~# groupadd asmadmin root@solrc1:~# echo "oracle localhost:/export/home/&" >> /etc/auto_home root@solrc1:~# useradd -g oinstall -G dba,asmdba,asmadmin -m -d /export/home/oracle -s /bin/ksh oracle 80 blocks root@solrc1:~# passwd oracle New Password: Re-enter new Password: passwd: password successfully changed for oracle root@solrc1:~# chown -R oracle:oinstall /u01
Next, I configure the settings for an Oracle database environment.
root@solrc1:~# projadd -U oracle -K "project.max-shm-memory=(priv,5g,deny)" user.oracle root@solrc1:~# projmod -sK "project.max-sem-nsems=(priv,512,deny)" user.oracle root@solrc1:~# projmod -sK "project.max-sem-ids=(priv,128,deny)" user.oracle root@solrc1:~# projmod -sK "project.max-shm-ids=(priv,128,deny)" user.oracle root@solrc1:~# projmod -sK "project.max-file-descriptor=(priv,65536,deny)" user.oracle root@solrc1:~# ipadm set-prop -p smallest_anon_port=9000 tcp root@solrc1:~# ipadm set-prop -p largest_anon_port=65500 tcp root@solrc1:~# ipadm set-prop -p smallest_anon_port=9000 udp root@solrc1:~# ipadm set-prop -p largest_anon_port=65500 udp
Oracle Solaris doesn’t come with everything installed in order to install the database products like Grid Infrastructure and the RDBMS so we need to install what’s missing. There’s a list in the documentation which says what needs to be there. Some of it is already there and two of the packages don’t exist (nor are needed, from what I can tell). So this is what I had to install on top of the existing environments for both of my nodes:
root@solrc1:~# pkg set-publisher -Pe -O http://pkg.oracle.com/solaris/release solaris root@solrc1:~# pkg install compatibility/packages/SUNWxwplt SUNWmfrun SUNWarc SUNWhea SUNWlibm
Finally, for some reason during the Grid Infrastructure, the soft limit on file descriptors wasn’t getting picked up by the cluster verification utility so the one parameter I had to set in the /etc/system file is this:
# Set the soft limit on file descriptors set rlim_fd_cur = 65536
Now the servers are configured and just about ready for the Grid Infrastructure installation. The only thing I need to do is power down the servers and use VirtualBox to add the shared disks to be used for the ASM. These are the commands I used to create the virtual disks and associate them with the two servers. I also have to make them shareable.
[lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm1.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm2.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm3.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm4.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm5.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm6.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage createhd --filename /u04/vm/solshared/asm7.vdi --size 5120 --format VDI --variant Fixed [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 3 --device 0 --type hdd --medium asm1.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 4 --device 0 --type hdd --medium asm2.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 5 --device 0 --type hdd --medium asm3.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 6 --device 0 --type hdd --medium asm4.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 7 --device 0 --type hdd --medium asm5.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 8 --device 0 --type hdd --medium asm6.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc1 --storagectl "SATA Controller" --port 9 --device 0 --type hdd --medium asm7.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm1.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm2.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm3.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm4.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm5.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm6.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage modifyhd asm7.vdi --type shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 3 --device 0 --type hdd --medium asm1.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 4 --device 0 --type hdd --medium asm2.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 5 --device 0 --type hdd --medium asm3.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 6 --device 0 --type hdd --medium asm4.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 7 --device 0 --type hdd --medium asm5.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 8 --device 0 --type hdd --medium asm6.vdi --mtype shareable [lidh@lidhsrv2 solshared]$ VBoxManage storageattach solrc2 --storagectl "SATA Controller" --port 9 --device 0 --type hdd --medium asm7.vdi --mtype shareable
I created seven 5 GB disks. My plan is to use the first four for my DATA disk group, the next two for my FRA disk group and the last to be a spare. Now I can boot up my two servers and I’m ready to prep for my Grid Infrastructure installation. I’ll pick that up in Part II.
Topics: Oracle, RAC/Grid Infrastructure, Solaris | No Comments »
Upgrading SQL Server 2005 to 2008 r2
So we have another “real world” scenario. At work, we have a task coming up where we need to upgrade a SQL Server (MSSQL) 2005 database which is sitting on a Windows Server 2003 machine to a MSSQL 2008 r2 database sitting on Windows Server 2008 r2. I wondered if MSSQL supported applying logs from one version to another? The answer is, “Sort of.” I went to my Technet subscription and downloaded Windows Server 2003 r2 and MSSQL 2005. I built the Windows Server, applied the 108 patches, and then installed MSSQL. I created a user called “klidh” and a database called “myld” and populated it with the tables I use from my load test. I modified my Java/JDBC code to run against this new database and started it running. While the load was running, I took a full backup of the database and copied the backup over to one of my Windows Server 2008 r2 servers.
So will a MSQL 2008 even let me restore a MSSQL 2005 backup? Let’s find out together. I right-clicked on “Databases” and selected “Restore Database…”.
Then I entered the name of my new database “myld” and selected “From Device” and then clicked the “…” button to pick the file I’m going to restore from. I went to where the backup was, remembering to select “All Files (*)” since I still don’t know how to use the right file extension.
Then, on a whim, under “Options,” I decided to see if I could restore it as a standby. The documentation I read said I couldn’t but why not see?
Okay, that’s fine. Since my source database is still changing, I want to restore it with the NORECOVERY option so I can apply the changes later.
That worked! Now, I can take transaction log backups and apply them to the new copy of the database until I’m ready to switch over to the new database.
You notice that my database shows up in the list with a status of “Restoring…”. In Oracle terms, this would be akin to “managed recovery.” So with changes still happening to my MSSQL 2005 source database, I take a transaction log backup. This time, I just use my command line method for the backup and restore.
RESTORE LOG myld FROM disk = 'c:\myld\backup\tlogs\myld_log_201111231013.bck' WITH NORECOVERY go
Now I’m ready to switch my application over to my new server. I stop the load on the source database and with TSQL, I query to see when the last record was inserted.
[lidh@lidhsrv2 mssql]$ tsql -S my05 -U klidh Password: locale is "en_US.UTF-8" locale charset is "UTF-8" using default charset "UTF-8" 1> select max(orderdate) from CUSTORDER; 2> go Nov 23 2011 10:20:08:440AM (1 row affected) 1> exit [lidh@lidhsrv2 mssql]$
Now I take a final transaction log backup, copy it over, and restore it.
RESTORE LOG myld FROM disk = 'c:\myld\backup\tlogs\myld_log_201111231024.bck' WITH NORECOVERY go
Now for the moment of truth. I’m ready to open my database so I restore my database WITH recovery.
[lidh@lidhsrv2 mssql]$ tsql -S myserver -U sa Password: locale is "en_US.UTF-8" locale charset is "UTF-8" using default charset "UTF-8" 1> restore database myld with recovery; 2> go Msg 944 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Converting database 'myld' from version 611 to the current version 661." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 611 to version 621." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 621 to version 622." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 622 to version 625." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 625 to version 626." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 626 to version 627." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 627 to version 628." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 628 to version 629." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 629 to version 630." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 630 to version 631." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 631 to version 632." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 632 to version 633." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 633 to version 634." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 634 to version 635." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 635 to version 636." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 636 to version 637." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 637 to version 638." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 638 to version 639." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 639 to version 640." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 640 to version 641." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 641 to version 642." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 642 to version 643." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 643 to version 644." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 644 to version 645." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 645 to version 646." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 646 to version 647." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 647 to version 648." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 648 to version 649." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 649 to version 650." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 650 to version 651." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 651 to version 652." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 652 to version 653." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 653 to version 654." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 654 to version 655." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 655 to version 660." Msg 951 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "Database 'myld' running the upgrade step from version 660 to version 661." Msg 3014 (severity 0, state 1) from LIDHSRV0801\SQLIDH Line 1: "RESTORE DATABASE successfully processed 0 pages in 2.060 seconds (0.000 MB/sec)." 1>
So what this tells me is that I can ship and apply transaction log backups between versions BUT the version of the restored database is still MSSQL 2005 until I restore the database with recovery. Now it’s a shiny new MSSQL 2008 r2 database! Let’s check the data to see if that last record made it over. Since I just restored it, my “klidh” user in my 2008 instance doesn’t have any privileges to the new database. So I’ll log in as my “sa” user and check the data.
[lidh@lidhsrv2 mssql]$ tsql -S myserver -U sa Password: locale is "en_US.UTF-8" locale charset is "UTF-8" using default charset "UTF-8" 1> use myld 2> go 1> select max(orderdate) from CUSTORDER; 2> go Nov 23 2011 10:20:08:440AM (1 row affected) 1>
I have all the data from the old database! But in order for me to use this, I’ll need to set the roles and privileges for my “klidh” user to access the database.
I modified my Java/JDBC code to point to the new version of the database and it ran as if nothing had changed. This will be my recommended method for upgrading our database.
Topics: SQL Server, Upgrade | No Comments »
Oracle Heterogeneous Service (it could happen)
So we got a task at work to allow an Oracle database to be able to see data in a SQL Server database. Our method of choice is to use Oracle’s Heterogeneous Services. I’ve never done it before so I had to test it out. In my case, I’m going from a 11gR2 Oracle database (11.2.0.2.0 RAC) to a SQL Server 2008 r2 database.
So the first step is to install Oracle on the Windows Server where the SQL Server database is located. I wanted to go with the latest version of Oracle for Windows which is 11.2.0.2.0. I also remembered that starting with that version, every patch is a full install so I went to My Oracle Support (MOS) and downloaded patch p10098816_112020_MSWIN-x86-64_1of7.zip through p10098816_112020_MSWIN-x86-64_7of7.zip before (of course) remembering that I only need the first two for the database install. After my install, I went to work configuring my service.
First, I started the ODBC Data Source Administrator and clicked on the System DSN for my SQL Server database. Make sure you choose SQL Server and not Oracle here. Since I already created mine, I’ll just step through the properties.
So now I have my ODBC Data Source and it can connect to my SQL Server database. The next step is to set up the listener. I go into my %ORACLE_HOME%\NETWORK\ADMIN directory and create a listener.ora file. One thing to note is that if you did this with older versions of Oracle, the program was called hsodbc but now it’s called dg4odbc.
Now I just need to start the listener. If you try to start the listener and get a TNS-12560 followed by a TNS-00530 error, you likely didn’t start the listener as the Administrator. I was logged in as a member of the group but got this error.
But if you open a Command Prompt as Administrator, then it works.
So now we’re almost there. Like a good server, my server has port 1522 locked down. So now I go to the Windows Firewall with Advanced Security screen and create an “Inbound Rule” for port 1522. I call mine, Oracle1522.
One last step on the Windows Server is to set up the initdg4odbc.ora file for my instance in your %ORACLE_HOME%\hs\admin directory. My instance name to the Oracle listener is called mymsdb so I create a copy of the initdg4odbc.ora file called initmymsdb.ora and for the HD_FDS_CONNECT_INFO parameter, I use the name of my ODBC DSN which was myMSDBds.
So now, my Windows Server should be set up with my Oracle Heterogeneous Service. I go to one of my RHEL5 RAC nodes and set up my tnsnames.ora file. The entry looks like this:
mymsdb = (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST=lidhsrv0801)(PORT=1522)) ) (CONNECT_DATA = (SID=mymsdb)) (HS=OK) )
You need to include the (HS=OK) to tell Oracle that it is using the heterogeneous service. Then I ping to make sure my listener is visible from outside the Windows Server.
[/u01/app/oracle/product/11.2.0.2/network/admin] $ tnsping mymsdb TNS Ping Utility for Linux: Version 11.2.0.2.0 - Production on 02-NOV-2011 21:33:12 Copyright (c) 1997, 2010, Oracle. All rights reserved. Used parameter files: Used TNSNAMES adapter to resolve the alias Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST=lidhsrv0801)(PORT=1522))) (CONNECT_DATA = (SID=mymsdb)) (HS=OK)) OK (730 msec) [/u01/app/oracle/product/11.2.0.2/network/admin] $
It works! So can we actually see the SQL Server database from inside an Oracle database? Let’s see. First, I created a public database link. Make sure you give it an authorized database user in the SQL Server database.
sys@UPGR> create public database link mymsdb connect to klidh identified by <klidh's password> using 'MYMSDB'; Database link created. sys@UPGR>
Now let’s test the link. I have a table called mycustomer in the database. How many records are there?
sys@UPGR> select count(*) from mycustomer@mymsdb; COUNT(*) ---------- 230 sys@UPGR>
That’s the correct number. So that’s how you set it up. I created a user in my Oracle database called msuser and granted “create synonym” to him. As msuser, I created an alias to the tables in the SQL Server database as such:
msuser@UPGR> create synonym mycustomer for mycustomer@mymsdb; Synonym created. msuser@UPGR> select count(*) from mycustomer; COUNT(*) ---------- 230 msuser@UPGR>
Now it’s just like being there! Let’s do something a little more complex (and I mean a little).
msuser@UPGR> desc custorder Name Null? Type ----------------------------------------------------------------- -------- -------------------------------------------- custid NOT NULL NUMBER(10) numwidgets NOT NULL NUMBER(10) orderdate NOT NULL DATE msuser@UPGR> select max(orderdate) from custorder; select max(orderdate) from custorder * ERROR at line 1: ORA-00904: "ORDERDATE": invalid identifier msuser@UPGR>
Well that doesn’t look good. There is a column called “orderdate” If I use TSQL, the same query works:
1> select max(orderdate) from custorder; 2> go Nov 2 2011 10:27:19:490AM (1 row affected) 1>
The column names are case sensitive so in order to be case specific in your Oracle query, you have to double quote the column name(s):
msuser@UPGR> select max("orderdate") from custorder;
MAX("ORDERDATE")
--------------------
02-NOV-2011 10:27:19
msuser@UPGR>
And that should be enough to get your heterogeneous service working!
Topics: Heterogeneous Service, Oracle, SQL Server | No Comments »
P270X
Yes, I know there isn’t such thing as P270X but it’s my nickname for the workout program I do. I took a year-long sabbatical from running due to the back issues I had last summer. I figured I would give my L5 a chance to get better before beating on it again. I needed something else to do so at first, I cobbled together a weight program and I liked it. But then my job had me traveling quite a bit (6 times in 5 months) so I switched to bands. I got a good set, cobbled together a band program, and took it on the road. It seemed to work pretty well.
Then I saw a P90X commercial, as I’m sure we all have, and the concept of having multiple workouts plus a diet program seemed like a good idea. When I got the set of DVDs and started playing them, it seemed pretty daunting but I was determined. My job had me busy during all hours and I struggled to find consecutive days to work out. Then, I changed jobs and had to completely change my routine.
With my new job, I have to go into the office every day and it’s not close. So I decided if I wanted to work, work out, and still have time for family, I needed to go to bed earlier, get up earlier, and work out before going to work as opposed to afterward. So now I’m in bed by 10pm and up at 4am. I work out until about 5:10am. Unfortunately, the P90X program requires a bigger time demand than I can afford even with this new schedule. So I’m doing a modified version of P90X which I jokingly refer to as P270X. I do Day 1 on a day but not the ab workout afterward. Day 2, I do the ab workout and treadmill hills for a mile an a half. Day 3 I do the Day 2 workout. Then Day 4, abs and treadmill. If I can’t do a work out for some reason, I don’t beat myself up for it, I just pick it back up the next day. I figure for me to go through the 90-day program, I’ll have to do this for 270 days.
So far, I can see results (but no before or after pictures like they recommend) so I’m happy. I feel better and I feel like I’m in better shape. I’m about 70 days into this. We’ll see where it goes. The big thing about programs like this to me is that it isn’t a 90-day or 270-day thing. It’s a lifestyle change which means it needs to become part of you, not just an add-on.
I highly recommend getting a good set of bands and the P90X program. In the long run, it’s cheaper than bands and you don’t have to wait for a machine or bench to open up. Plus you can work the program around your schedule and no commutes in between.
Topics: Life | 2 Comments »
SQL Server Standby automation (the hard way)
Now I have a standby database but I need to keep it in sync with my primary database. The process is that I take a backup of my transaction log, copy it over to the standby server, and then restore from the transaction log backup. Currently, my log backup runs every odd-hour plus a minute (1:01, 3:01, etc.). My backup copy process runs every odd-hour plus 30 minutes (1:30, 3:30, etc.). What I need is a process to apply those transactions so I turn to my trusty cron and bsqldb method.
First, I modified my transaction log backup script to create the script for the restore. I use the date to make each backup file name unique and since the script already knew the current name of the backup file, I could just use that in the recover script.
#!/bin/ksh
export SCRIPT_DIR=/home/lidh/dev/scripts
. ${SCRIPT_DIR}/.sqlenv
export STAMP=`date +%Y%m%d%H%M`
export BFILE=${SCRIPT_DIR}/backup_log_mymsdb_${STAMP}.sql
export LFILE=${SCRIPT_DIR}/backup_log_mymsdb_${STAMP}.log
export OFILE=${SCRIPT_DIR}/backup_log_mymsdb_${STAMP}.out
export RFILE=${SCRIPT_DIR}/restore_stby_mymsdb.rst
echo "backup log mymsdb" > ${BFILE}
echo " to disk = 'C:\backup\tlogs\mysqldb_log_${STAMP}.bck'" >> ${BFILE}
echo " with format," >> ${BFILE}
echo " medianame='C_SQLServerLogBackups'," >> ${BFILE}
echo " name = 'Log backup of mymsdb';" >> ${BFILE}
echo "go" >> ${BFILE}
/usr/local/bin/bsqldb -S myserver -U klidh -P ${PWD} -i ${BFILE} -o ${LFILE} -e ${OFILE}
echo "exec sp_kill" > ${RFILE}
echo "go" >> ${RFILE}
echo "RESTORE LOG mymsdb" > ${RFILE}
echo " FROM disk = 'c:\backup\tlogs\mysqldb_log_${STAMP}.bck'" >> ${RFILE}
echo " WITH STANDBY = 'c:\backup\mysqldb_undo_log.bck'" >> ${RFILE}
echo "go" >> ${RFILE}
echo "...and done"
So ${RFILE} is my recovery script name. In it, I put the name of the backup file so when my scheduled recovery job runs, it’ll pass this newly created script name and apply my backed up transactions. Simple enough? Well, there’s one more step I have to consider. In SQL Server, you can do read-only transactions in the standby database any time except when you’re restoring the backup. So I needed a way to kick out any lingering sessions just prior to the restore. My SQL Server DBA (and architect) buddy Charlie provided me with a stored procedure called sp_kill which kills any non sa process. Since I didn’t write the script, I won’t be providing the contents but he assures me there are multiple versions out on the interweb. So I wrote a restore script to be called from cron which kills the lingering processes and then applies the transaction backup referenced in the earlier script.
#!/bin/ksh
export SCRIPT_DIR=/home/lidh/dev/scripts
. ${SCRIPT_DIR}/.sqlenv
export STAMP=`date +%Y%m%d%H%M`
export LFILE=${SCRIPT_DIR}/restore_stby_mymsdb_${STAMP}.log
export OFILE=${SCRIPT_DIR}/restore_stby_mymsdb_${STAMP}.out
export RFILE=${SCRIPT_DIR}/restore_stby_mymsdb.rst
/usr/local/bin/bsqldb -S mydr -U ${USER} -P ${PWD} -i ${RFILE} -o ${LFILE} -e ${OFILE}
echo "...and done"
Seems simple enough. I scheduled this job to run every even-hour plus a minute (12:01, 2:01, etc.). I put this into cron and it seemed to work but let’s test it to see. I used my transaction generator against the primary database for about an hour (12:04pm – 1:04pm).
12:04pm - Start transaction test
1:01pm - Log backup runs on primary database
1:04pm - Stop transaction test
So at this point, the data on the primary has changed (a lot) but not at all on the standby. Also, only the transactions up to 1:01pm should be in the transaction log. This is important to remember later.
On Primary:
1> select sysdatetime(),@@servername 2> go 2011-10-24 13:06:40.5156810 LIDHSRV0801\SQLIDH (1 row affected) 1> select max(orderdate) from custorder; 2> go Oct 24 2011 01:04:22:783PM (1 row affected) 1>
On Standby:
1> select sysdatetime(),@@servername 2> go 2011-10-24 13:07:00.0276250 LIDHSRV0802\SQLIDH (1 row affected) 1> select max(orderdate) from custorder; 2> go Oct 20 2011 11:02:16:257AM (1 row affected) 1>
1:30pm - Transaction log backup copied over to standby server
At this point, the data hasn’t changed in the standby, but in an emergency, the log could be applied and the database could become the primary with only 29 minutes of loss.
2:01pm - Transaction log backup applied to standby database
So now, we should see new data in the standby database. I left my session open to see if I was kicked out.
On Standby:
1> select sysdatetime(),@@servername 2> go Error 20006 (severity 9): Write to the server failed OS error 104, "Connection reset by peer" tds_submit_query() failed 1> +++ New session +++ 1> select sysdatetime(),@@servername 2> go 2011-10-24 15:36:23.9182500 LIDHSRV0802\SQLIDH (1 row affected) 1> select max(orderdate) from custorder; 2> go Oct 24 2011 01:01:01:243PM (1 row affected) 1>
So the two things to notice are that my session was kicked out prior to applying the transaction log backup and the timestamp of the last update was 1:01:01 pm even though my transaction generator ran until around 1:04pm. I waited until this morning to check and see if the other transactions made it over.
On Standby:
1> select sysdatetime(),@@servername 2> go 2011-10-25 09:09:05.8401250 LIDHSRV0802\SQLIDH (1 row affected) 1> select max(orderdate) from custorder; 2> go Oct 24 2011 01:04:22:783PM (1 row affected) 1>
And you can see the data made it over in the next backup/restore sequence concluding at 4:01pm.
Topics: SQL Server, TSQL/BSQLDB, Warm Standby | No Comments »
SQL Serv-ing up a “warm” standby
So in my quest to learn more about SQL Server, I wondered if I could create a standby. In Oracle, you have physical standbys where you have a point-in-time physical copy of the primary database and the updates are carried over via archived redo or standby redo logs. Prior to 11g, your standby would be in “managed recovery” mode which means you couldn’t see the contents of the database until you opened the database. In 11g, you have the option (for an additional cost) of having the database open “read only” while the transactions are applied. This is called Oracle Active Data Guard. You also have the option of a logical standby in Oracle which is where the physical structures don’t have to match and the logical change records (LCR) are created from the archived redo and applied to the standby. This database is fully open for use. I have quite a bit of experience with physical and some experience with logical standbys.
In SQL Server, we have warm standbys. This is sort of a cross between a physical standby and active data guard in Oracle. You don’t have to “open” the standby in order to read from it but you can’t have any sessions active in it to apply the transactions. But let’s take a step back and see how to create one.
My SQL Server 2008 r2 Setup:
- I have full backup that runs from a Linux server via cron every morning at 12:01am
- I have a (transaction) log backup that runs every two hours from a Linux server starting at 1:01am.
- I have a job that runs via the Windows Task Scheduler that copies the transaction log backups from the primary server to the standby server every two hours starting at 1:30am.
- I also have a cleanup job which deletes backups older than 48 hours that runs on both servers to keep things clean.
So first, I look at my primary server to see if I have a current backup.
Then I check my standby server to ensure the backups (or at least the latest backup) are there.
So now we go to Microsoft Sql Server Management Studio on my standby server and click the [+] by Databases and I see I don’t currently have a database called mymsdb.
Then I right click on the Databases line and click on “Restore Database…”.
As with most Windows applications, I get a Restore Database wizard. My “To database” is mymsdb and I’m going to restore from a file.
When I click the “…” button next to the “From device” option, I can look for my backup. Of course, the default file extension is “.bak” and I use “.bck” so I had to look at “All Files (*)” but that’s okay. I choose the latest full backup so I won’t have to apply as many log backups to catch my standby up.
Then I see my backup in the list below. i select the backup and then click on “Options” on the left. If you don’t check the box by the backup first, it will error when you select “Options.”
Under “Options,” select the “WITH STANDBY” option. I took the defaults with everything else including the standby log file.
If you click the “OK” button, it does the restore. Then you have a shiny new (Standby / Read Only) database on your standby server.
But this database is only current to when your full backup on your primary side started. For this to be a true DR database, we need to (1) catch it up and (2) keep it caught up. To catch it up manually, we can right-click on Databases again and choose the “Restore Files and Filegroups…” selection.
This time we can select the mymsdb database since it exists now. And we are restoring from disk so we choose “Device” and click the “…” button and look for our transaction log backups. Before I took the next screen capture, I already restored the first transaction log backup after the full (1:01am) so now I am going to apply the next one here (3:01am).
You get this window as a result of your selection…
And you want to ensure you continue to select the WITH STANDBY option or you change the mode of your standby database.
Then you keep applying your transaction backups in order until you catch up.
But, if you recall, I get a new transaction log backup every two hours.
So now I need to set up my automation for my standby. This will be in my next post.
Topics: SQL Server, Warm Standby | No Comments »






















































