Revision 1.1
Fri Aug 15 20:41:20 2003 UTC by dpavlin
Branch: MAIN
CVS Tags: before_onlytables, before_multmaster, r_0_3, HEAD
contribution from Grant McLean: explains how RServ works

=for comment
use 'perldoc ./rserv-explained.pod' to read this doc with formatting


=head1 How Rserv Works

Rserv is made up of the following components:

=over 4

=item *

several functions written in 'C' and compiled into the shared library rserv.so

=item *

several database tables '_rserv_*' used to track replication metadata

=item *

one trigger for each replicated table that fires on every insert/update/delete
and calls one of the 'C' functions

=item *

a collection of Perl scripts for initialising the metadata and replicating the
database updates. Most of the Perl code is in Rserv.pm and the routines can be
run from custom scripts or from simple wrapper scripts that come with the
distribution

=back

Rserv assumes each table has a single column that uniquely identifies each row.

When the master database is first created, the 'rserv_init.pl' script should be
run with the '-m' option to do the following:

=over 4

=item *

create four tables:

    _rserv_tables_    stores the name of the unique column for each table
    _rserv_log_       tracks which rows of each table have been updated
    _rserv_servers_   details of slave servers (not used?)
    _rserv_sync_      tracks which updates have been seen by each slave

=item *

call MasterAddTable once for each table in the database, to add one row to
_rserv_tables_ and to create a trigger. (Note: rserv_init.pl identifies the
unique column by locating the first column with a unique index.)

=back

Once the initialisation script has been run, the master is ready for use.
Whenever a database update occurs, a trigger fires and a row is added to the
_rserv_log_ table. Note that this table only tracks which row in which table
was updated; it does not log the values that were changed.
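
The logging behaviour described above can be illustrated with a small Python
sketch. It is only a model: it uses an in-memory SQLite database rather than
PostgreSQL, and the 'customer' table, the two-column 'rserv_log' layout and the
trigger names are assumptions made for illustration. SQLite needs one trigger
per event, whereas Rserv uses a single trigger per table that calls a 'C'
function.

```python
import sqlite3

# Hypothetical stand-ins: "customer" is an example replicated table and
# "rserv_log" is a simplified model of _rserv_log_ (the real layout differs).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE rserv_log (tablename TEXT, row_key INTEGER);

    CREATE TRIGGER customer_ins AFTER INSERT ON customer
    BEGIN
        INSERT INTO rserv_log VALUES ('customer', NEW.id);
    END;

    CREATE TRIGGER customer_upd AFTER UPDATE ON customer
    BEGIN
        INSERT INTO rserv_log VALUES ('customer', NEW.id);
    END;

    CREATE TRIGGER customer_del AFTER DELETE ON customer
    BEGIN
        INSERT INTO rserv_log VALUES ('customer', OLD.id);
    END;
""")

# Each change fires a trigger that records only the table name and the key.
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("UPDATE customer SET name = 'Alicia' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 1")

log_rows = conn.execute(
    "SELECT tablename, row_key FROM rserv_log ORDER BY rowid"
).fetchall()
print(log_rows)
```

After the three statements the log holds three entries naming the table and
the key, but none of the column values.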

When a slave database is first created, the 'rserv_init.pl' script should be
run with the '-s' option to do the following:

=over 4

=item *

create two tables:

    _rserv_slave_tables_   stores the name of the unique column for each table
    _rserv_slave_sync_     tracks which updates have been seen by this slave

=item *

call SlaveAddTable once for each table in the database, to add one row to
_rserv_slave_tables_. (Note: rserv_init.pl identifies the unique column by
locating the first column with a unique index.)

=back

Once the initialisation script has been run, replication can begin. One
replication cycle between a master and one slave consists of the following
steps:

=over 4

=item *

PrepareSnapshot (either the function in Rserv.pm or the wrapper script of the
same name) is used to create a text file of all updates the slave has not yet
seen

=item *

ApplySnapshot is used to apply those updates to the slave

=item *

GetSyncID is used to retrieve the syncid just applied to the slave

=item *

MasterSync is used to store the syncid in the master's _rserv_sync_ record for
the slave

=back

The next time PrepareSnapshot is run, it will only include updates which
occurred since the last snapshot the slave has seen. However, the log entries
will still exist in the _rserv_log_ table. The CleanLog routine can be used to
purge entries up to a specified syncid.

The Replicate script performs all the steps listed above except CleanLog.
Note: although the metadata framework and Rserv.pm support multiple slaves, the
wrapper scripts are all hardcoded for a single slave (number 0).
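
The master-side bookkeeping in this cycle can be modelled in a few lines of
Python. This is a sketch under stated assumptions, not Rserv's implementation:
the list and dictionary stand in for _rserv_log_ and _rserv_sync_, the function
names are hypothetical, and treating the CleanLog bound as inclusive is a guess.

```python
# Simplified in-memory model; the data layout is an assumption.
log = []          # models _rserv_log_: (syncid, table, key)
sync_record = {}  # models _rserv_sync_: slave number -> last syncid seen


def unseen_entries(slave):
    """Entries a snapshot for this slave would cover, as far as the master knows."""
    last_seen = sync_record.get(slave, 0)
    return [entry for entry in log if entry[0] > last_seen]


def record_master_sync(slave, syncid):
    """Model MasterSync: store the syncid the slave reported via GetSyncID."""
    sync_record[slave] = syncid


def clean_log(up_to_syncid):
    """Model CleanLog: purge entries up to the given syncid (inclusive here)."""
    log[:] = [entry for entry in log if entry[0] > up_to_syncid]


# First cycle: two updates are logged, snapshotted and acknowledged.
log.extend([(1, "customer", 10), (1, "orders", 7)])
first = unseen_entries(slave=0)
record_master_sync(slave=0, syncid=1)

# Second cycle: only the newer update is pending, yet the log still holds
# the old entries until clean_log is called.
log.append((2, "customer", 11))
second = unseen_entries(slave=0)
entries_before_clean = len(log)
clean_log(1)
entries_after_clean = len(log)
```

Keeping the purge separate from the cycle matters when several slaves share
one log: an entry can only be removed once every slave has seen it.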

=head2 SyncIDs

Rserv uses the concept of 'SyncIDs' to track how up to date a slave is.
SyncIDs are simply an ascending series of numbers derived from a PostgreSQL
sequence. A group of transactions may share the same SyncID:

=over 4

=item *

As updates are logged on the master, they are assigned the current value of the
_rserv_sync_seq_ sequence (not the next value)

=item *

When a snapshot is prepared, the sequence is incremented

=item *

One snapshot will include updates for all SyncIDs that the slave has not yet
seen

=item *

Applying a snapshot to a slave updates I<the slave's> record of which snapshots
it has seen

=item *

GetSyncID + MasterSync are used to update I<the master's> record of which
snapshots a slave has seen

=back
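
The way several updates come to share one SyncID can be shown with a toy
counter in Python. The class and function names are hypothetical, the counter
only imitates the _rserv_sync_seq_ sequence, and whether Rserv increments the
sequence before or after collecting the pending updates is an assumption here.

```python
class SyncSequence:
    """Toy stand-in for the _rserv_sync_seq_ sequence."""

    def __init__(self):
        self.value = 1

    def current(self):
        return self.value

    def advance(self):
        self.value += 1


seq = SyncSequence()
update_log = []  # (syncid, key)


def log_update(key):
    # Updates take the current sequence value, not the next one, so every
    # update logged between two snapshots shares the same SyncID.
    update_log.append((seq.current(), key))


def take_snapshot(last_seen):
    # Collect everything newer than what the slave has seen, then advance
    # the sequence so that later updates fall under a new SyncID.
    snapshot_id = seq.current()
    pending = [key for syncid, key in update_log if syncid > last_seen]
    seq.advance()
    return snapshot_id, pending


log_update("a")
log_update("b")
first_id, first_keys = take_snapshot(last_seen=0)
log_update("c")
second_id, second_keys = take_snapshot(last_seen=first_id)
```

Updates "a" and "b" share one SyncID because no snapshot was prepared between
them; "c" lands under the next one.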

=head2 Snapshots

A snapshot is a sequence of instructions that should be applied to a slave to
bring it up to date with a given SyncID. The file does not contain SQL
statements; it contains commands/comments (preceded by '--') and tab-delimited
data.

There are two types of instruction: DELETE and UPDATE. When the snapshot is
applied, all the records listed in the snapshot are deleted from the slave
database, and then the records listed in UPDATE instructions are inserted.

One consequence of this design is that it is perfectly safe to apply the same
snapshot more than once.

Another consequence of the design is that if the SyncID is not updated on the
master, then the next snapshot will include everything from the last snapshot
plus all updates since then. This is also perfectly safe.
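
The delete-then-insert behaviour, and why repeating it is harmless, can be
sketched in Python. The dictionary-based snapshot below is an assumption made
for illustration and is not the on-disk snapshot format; the function name is
hypothetical as well.

```python
def apply_snapshot(slave_rows, snapshot):
    """Delete every key listed in the snapshot, then insert the UPDATE rows."""
    listed = set(snapshot["delete"]) | set(snapshot["update"])
    for key in listed:
        slave_rows.pop(key, None)  # deleting a missing row is a no-op
    for key, row in snapshot["update"].items():
        slave_rows[key] = row      # insert the values carried by the snapshot
    return slave_rows


slave = {1: "old name", 2: "to be removed", 3: "untouched"}
snapshot = {"delete": [2], "update": {1: "new name", 4: "new row"}}

once = apply_snapshot(dict(slave), snapshot)
twice = apply_snapshot(dict(once), snapshot)  # same snapshot applied again
```

Because every listed row is removed before anything is inserted, the second
application leaves the slave in the same state as the first.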

=cut
