Oracle Streams Replication Example February 21, 2008
Filed under: scripts,streams,tips — prodlife @ 4:04 am
I promised to show off our Streams implementation, because I found so
few examples out there. Keep in mind that your Streams implementation
will probably look nothing like mine, and even if it does, you can get
the same results in very different ways. This example is not an excuse
to avoid reading the docs.
Our goal: Set up a reporting server, with replication of schemas
arriving from many source DBs. We also want to filter out and not
replicate some of the less critical tables.
Decisions, decisions: When implementing streams there are several
things you want to ask yourself: Do you want bi-directional
replications? Do you want to capture the changes on the source DB,
destination DB, or a third server? Do you do the instantiation (the
first copy of data) with Data Pump or RMAN?
We went with one-directional replication, capturing will be done on
the destination server, and we’ll use DataPump to initially move the
schemas to the destination.
Create users: You want to create a user on both the source and
destination servers that will own all the Streams-related objects.
Also, you want to give it its own tablespace (or at least keep its
objects away from the SYSTEM tablespace), because in case of errors
lots of data can accumulate in the Streams queues. The docs recommend
making the tablespace autoextend, but I’m very much against that. After
creating the strmadmin user, the rest of my instructions should be run
as this user.
CREATE TABLESPACE streams_tbs DATAFILE
'/data/oradata/DB05/streams_tbs.dbf' SIZE 25M;

CREATE USER strmadmin
IDENTIFIED BY strmadminpw
DEFAULT TABLESPACE streams_tbs
QUOTA UNLIMITED ON streams_tbs;

GRANT DBA TO strmadmin;

BEGIN
DBMS_STREAMS_AUTH.GRANT_ADMIN_PRIVILEGE(
grantee => 'strmadmin',
grant_privileges => true);
END;
/
DB links: You will need a DB link from the source to the destination
DB. For convenience, I also created a DB link from destination to
source, which I’ll use during the configuration of Streams and will
drop afterwards. Of course, the servers should be able to tnsping each
other before any of this will work.
-- on source
CREATE DATABASE LINK REP01 CONNECT TO strmadmin IDENTIFIED BY strmadminpw
USING 'REP01';
select count(*) from dba_tables@REP01;

-- on target
CREATE DATABASE LINK DB05 CONNECT TO strmadmin IDENTIFIED BY strmadminpw
USING 'DB05';
select count(*) from dba_tables@DB05;
Directory Objects: I decided to use Datapump for instantiation and to
use DBMS_STREAMS_ADM to create a configuration script. This means I
needed two directory objects on the destination (for the script and
Datapump) and one on the source (just Datapump). I used existing
directories, so I had no need to create them, but in general you
create a directory like so:
create directory EXP_DP_VOL as '/exp_dp';
Copying Archive redo logs: You need to get the archived redo logs from
the source server to the server doing the capturing (destination
server in my case, but can be a 3rd server). There are two decisions
to make here: do you use ARCH or LGWR to write the log files to the
remote location? And will the new archiving destination be MANDATORY
or OPTIONAL (i.e. does the writing have to succeed before the
corresponding redo file at the source can be overwritten)?
Also, when specifying the destination directory, you want it to be a
separate directory from where REP01 redo logs are archived, and make
extra sure that the directory really exists as you wrote it.
I initially specified a non-existent directory; this resulted in errors
in the alert log about failure to archive (scary!), and after a few
retries the archiver stopped attempting to write to the new
destination. So, after creating the directory I had to reset the
log_archive_dest_2 parameter (i.e. set it to '' and then set it back
again).
On the source server, we run:
ALTER SYSTEM set LOG_ARCHIVE_DEST_2='SERVICE=REP01 ARCH OPTIONAL';
ALTER SYSTEM set LOG_ARCHIVE_DEST_STATE_2=ENABLE;
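To confirm the new destination is healthy (and to catch mistakes like my non-existent directory early), you can check the archive destination status on the source. A sketch, assuming the destination is number 2 as above:

```sql
-- STATUS should be VALID; ERROR shows the last archiving failure, if any
SELECT DEST_ID, STATUS, ERROR
  FROM V$ARCHIVE_DEST
 WHERE DEST_ID = 2;
```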
Create instantiation and configuration script: Since we do replication
of (almost) entire schemas, we could use
DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS to do both the instantiation and the
configuration of Streams.
I used the option that creates a script with the instructions (instead
of running everything immediately). I had many problems with the script
and ended up using only parts of it, doing other parts in different
ways. I’m not sure whether this means that using a script is a bad idea
(since I had so many problems) or a good idea (since the same problems
would have been more annoying without the script). To create the
script, I ran the following command on the destination database. If you
are using a 3rd server for capture, run it on the 3rd server:
BEGIN
DBMS_STREAMS_ADM.MAINTAIN_SCHEMAS(
schema_names => 'DEV110',
source_directory_object => 'EXPORTS_DIR',
destination_directory_object => 'EXP_DP_VOL',
source_database => 'DB05',
destination_database => 'REP01',
perform_actions => false,
script_name => 'src_pre_inst_streams.sql',
script_directory_object => 'EXP_DP_VOL',
dump_file_name => 'dev110.dmp',
capture_queue_table => 'rep_capture_queue_table',
capture_queue_name => 'rep_capture_queue',
capture_queue_user => NULL,
apply_queue_table => 'rep_dest_queue_table',
apply_queue_name => 'rep_dest_queue',
apply_queue_user => NULL,
capture_name => 'capture_110dev',
propagation_name => 'prop_110dev',
apply_name => 'apply_110dev',
log_file => 'export_110dev.clg',
include_ddl => true,
instantiation => DBMS_STREAMS_ADM.INSTANTIATION_SCHEMA);
END;
/
Running the resulting script: Try running the script you just created
on the same DB where you ran the procedure that created it. Note that
it asks you for connection details of three sites. The first should be
the source, the second the destination, and the third can be garbage,
as it is not used in the script at all.
If the stars are aligned right, you may be able to run this script to
the finish and end up with a working schema replication. In my case, it
ran fine until the Datapump part, having created all the capture
processes and queues. However, the Datapump export failed with very
unusual errors. I spent two days trying to get it to work, then gave up
and simply used Datapump manually to export the schema from the source
and import it into the destination. After that I continued running the
rest of the script, and things were fine.
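For reference, the manual export/import can be done with the expdp and impdp command-line tools. This is only a sketch reusing the directory objects and dump file name from above; the exact parameters (and any needed flashback_scn for instantiation consistency) depend on your environment:

```shell
# sketch only: export the schema on the source server...
expdp strmadmin/strmadminpw schemas=DEV110 directory=EXPORTS_DIR dumpfile=dev110.dmp

# ...copy the dump file over, then import on the destination server
impdp strmadmin/strmadminpw directory=EXP_DP_VOL dumpfile=dev110.dmp
```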
Fun and games: Now that you have your own schema replication, it is
time to enjoy it. Create new tables, create as select, update data,
insert new data, delete data, drop tables, create procedures and
types. Everything replicates!
Just remember that if you don’t commit, it won’t replicate. The reasons
should be obvious. Also, if you are testing on a low-volume system, you
probably want to run alter system switch logfile every few minutes so
you can see your changes faster.
Watch the replication process: While you are replicating changes, you
probably want to watch how they move from source to destination:
You can see when the last message was read from redo archive:
SELECT CAPTURE_NAME,
TO_CHAR(AVAILABLE_MESSAGE_CREATE_TIME, 'HH24:MI:SS MM/DD/YY') AVAILABLE_TIME
FROM V$STREAMS_CAPTURE;
You can see when the changes were entered into the capture queue:
SELECT CAPTURE_NAME,
TO_CHAR(ENQUEUE_MESSAGE_CREATE_TIME, 'HH24:MI:SS MM/DD/YY') CREATE_TIME,
TO_CHAR(ENQUEUE_TIME, 'HH24:MI:SS MM/DD/YY') ENQUEUE_TIME
FROM V$STREAMS_CAPTURE;
And you can see when the apply process started working on them:
SELECT APPLY_NAME,
TO_CHAR(DEQUEUED_MESSAGE_CREATE_TIME,'HH24:MI:SS MM/DD/YY') CREATION,
TO_CHAR(DEQUEUE_TIME,'HH24:MI:SS MM/DD/YY') LAST_DEQUEUE
FROM V$STREAMS_APPLY_READER;
Another nice thing to try is to generate errors. For example, you can
delete a table from the destination, make a change to it in the
source, and see how it doesn’t replicate. The error will be found in
DBA_APPLY_ERROR.
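A sketch of how to inspect the failed transactions and, once the underlying cause is fixed (e.g. the dropped table is recreated), retry them with DBMS_APPLY_ADM, using the apply process name from above:

```sql
-- see what failed and why
SELECT APPLY_NAME, LOCAL_TRANSACTION_ID, ERROR_MESSAGE
  FROM DBA_APPLY_ERROR;

-- after fixing the cause, retry all failed transactions for this apply process
BEGIN
  DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS(apply_name => 'apply_110dev');
END;
/
```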
And you should also check which archived logs can be deleted:
SELECT r.NAME, r.MODIFIED_TIME, r.PURGEABLE
FROM DBA_REGISTERED_ARCHIVED_LOG r, DBA_CAPTURE c
WHERE r.CONSUMER_NAME = c.CAPTURE_NAME;
The last thing I did was filter out the tables I did not need to
replicate. Those are tables the application uses as queues or temps –
tons of inserts and deletes, but the data is useless for reports.
I used the DBMS_STREAMS_ADM.ADD_TABLE_RULES procedure. I chose to do
the filtering during capture (you can also do it during apply).
The important bits: table_name MUST contain the schema name in it.
Otherwise the procedure will assume that the table belongs to
strmadmin, and nothing will get filtered.
inclusion_rule=>FALSE is the part that indicates that we don’t want this table.
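Putting those bits together, a sketch of the call. The table name DEV110.SOME_QUEUE_TABLE is hypothetical; the streams and queue names are the ones created by the MAINTAIN_SCHEMAS script above:

```sql
BEGIN
  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name     => 'DEV110.SOME_QUEUE_TABLE', -- hypothetical; MUST include the schema
    streams_type   => 'capture',
    streams_name   => 'capture_110dev',
    queue_name     => 'strmadmin.rep_capture_queue',
    include_dml    => true,
    include_ddl    => true,
    inclusion_rule => false); -- FALSE creates a negative rule: do NOT replicate this table
END;
/
```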
You can then see the rule you created by running select * from DBA_RULES.
What now: The process I just described got me safely past a POC. There
is obviously a lot more to be done before this solution goes to
production. The performance of capture, and the impact of lag, are of
concern. Also, the entire system needs to be recreated from scratch
whenever we do “alter database open resetlogs” on the source DB
(hopefully not often). But the immediate next step for me is to
prepare a nice presentation to management showing what a great
solution we prepared, how useful it will be for the business, and how
much it will be worth the investment.