RServ design considerations

This document tries to describe RServ implementation details as well as I
(Dobrica Pavlinusic) can describe them, and to provide some insight into
problems which occurred during the implementation of RServ version 0.3
(dubbed "improved by community").

I would particularly like to thank Alen Lovrencic for an interesting
discussion on database design, and for a good lunch during it.


* How to find a field (just one) which is a unique identifier for a table?

The quick answer to this question is to use the primary key. But not all
databases "in the wild" have primary keys defined.

How to find a unique identifier for a row: find a unique, not null index on
that table (that is the definition of a primary key). Other solutions can
provide unique identifiers, but only for the currently available data.

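A minimal sketch of that lookup, assuming SQLite (through Python's standard
sqlite3 module) as a stand-in for the real database. The table and index
names are made up for illustration, and the catalog queries would look
different on PostgreSQL.

```python
import sqlite3


def unique_not_null_indexes(conn, table):
    """Return the column lists of unique indexes whose columns are all NOT NULL.

    The table name is interpolated directly, so it must be a trusted identifier.
    """
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    not_null = {
        row[1] for row in conn.execute(f"PRAGMA table_info({table})") if row[3]
    }
    candidates = []
    # PRAGMA index_list rows: (seq, name, unique, origin, partial)
    index_rows = conn.execute(f"PRAGMA index_list({table})").fetchall()
    for _seq, index_name, is_unique, *rest in index_rows:
        is_partial = rest[1] if len(rest) > 1 else 0
        if not is_unique or is_partial:
            continue  # a partial index does not cover every row
        # PRAGMA index_info rows: (seqno, cid, name)
        columns = [
            row[2] for row in conn.execute(f"PRAGMA index_info({index_name})")
        ]
        if columns and all(col in not_null for col in columns):
            candidates.append(columns)
    return candidates


# Hypothetical table without a declared primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (ssn TEXT NOT NULL, nick TEXT, name TEXT NOT NULL)"
)
conn.execute("CREATE UNIQUE INDEX person_ssn ON person (ssn)")
conn.execute("CREATE UNIQUE INDEX person_nick ON person (nick)")  # nullable
print(unique_not_null_indexes(conn, "person"))
```

The index on nick is rejected because the column allows NULL, so only the
index on ssn qualifies as a row identifier.
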
We have to consider the problem of associative tables (whose primary key is
a compound of foreign keys). One possible solution is to add an additional
column which has a unique, not null index. A sequence comes to mind, but it
can't be used, because that column must hold the same value in both
databases if we want to be sure that our data is consistent.

Examine UUID implementation.

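One way to picture the UUID idea is the sketch below. It is only an
illustration, not how RServ works: the enrollment table, the row_uuid column
and the master/slave in-memory SQLite databases are all hypothetical. The
identifier is generated once at the source and copied verbatim, so it is the
same in both databases, which a locally incremented sequence could not
promise. Random UUIDs make a collision extremely unlikely, not impossible.

```python
import sqlite3
import uuid


def make_db():
    conn = sqlite3.connect(":memory:")
    # Associative table: the natural key is the compound of two foreign keys.
    # row_uuid is the extra single-column identifier (hypothetical name).
    conn.execute(
        """CREATE TABLE enrollment (
               row_uuid   TEXT    NOT NULL UNIQUE,
               student_id INTEGER NOT NULL,
               course_id  INTEGER NOT NULL)"""
    )
    return conn


master, slave = make_db(), make_db()


def insert_on_master(student_id, course_id):
    row_uuid = str(uuid.uuid4())  # generated once, at the source
    master.execute(
        "INSERT INTO enrollment VALUES (?, ?, ?)",
        (row_uuid, student_id, course_id),
    )
    return row_uuid


def replicate():
    # Copy rows verbatim, identifier included; already present rows are skipped.
    rows = master.execute(
        "SELECT row_uuid, student_id, course_id FROM enrollment"
    ).fetchall()
    slave.executemany("INSERT OR IGNORE INTO enrollment VALUES (?, ?, ?)", rows)


insert_on_master(1, 10)
insert_on_master(2, 10)
replicate()
print(slave.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0])
```
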
Hashing might seem like a solution, but it doesn't guarantee the unique
mapping between the source data set (data in the database) and the newly
created identifier, which is what we require.

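The sketch below shows why. A hash maps an unbounded set of rows onto a
fixed number of values, so two different rows can end up with the same
identifier. To make that visible quickly, the hash here is deliberately
truncated to 16 bits. That truncation and the sample rows are assumptions
for the demonstration: a full-width hash only makes collisions rarer, it
does not rule them out.

```python
import hashlib
from itertools import count


def short_hash(row, bits=16):
    """Hash a row down to `bits` bits (at most 32) to force collisions early."""
    digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def first_collision(bits=16):
    """Return two different rows that share the same short hash."""
    seen = {}
    for i in count():
        row = ("user", i)  # made-up row content
        h = short_hash(row, bits)
        if h in seen:
            return seen[h], row, h
        seen[h] = row


row_a, row_b, shared = first_collision()
print(row_a, row_b, hex(shared))
```

With 16 bits there are only 65536 possible values, so the loop must find a
collision within 65537 rows at the latest.
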
If a unique, not null index doesn't exist, we can provide different methods,
with complexity ranging from n*nrattr (if we start with all fields and drop
them one by one) to 2**nrattr (if we try every possible combination to find
a minimal key, but the exponential complexity prohibits us from doing this).

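Both methods can be sketched as follows, assuming rows are held in memory as
dictionaries; the function names and sample data are made up. The greedy
version makes one pass over the rows per attribute. Its result cannot be
shrunk further by dropping a single attribute, but it is not necessarily the
smallest possible key. The exhaustive version tries subsets by increasing
size and in the worst case visits all of them. Either way the result holds
only for the data currently in the table.

```python
from itertools import combinations


def is_key(rows, attrs):
    """True if projecting the rows onto attrs yields no duplicates."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def greedy_key(rows, attrs):
    """Start with all attributes and try to drop each one once."""
    key = list(attrs)
    if not is_key(rows, key):
        return None  # fully duplicated rows: no key exists
    for attr in attrs:
        trial = [k for k in key if k != attr]
        if trial and is_key(rows, trial):
            key = trial
    return key


def smallest_key(rows, attrs):
    """Try every combination, smallest first (up to 2**nrattr subsets)."""
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if is_key(rows, combo):
                return list(combo)
    return None


rows = [
    {"first": "Ana", "last": "Kovac", "city": "Zagreb"},
    {"first": "Ana", "last": "Horvat", "city": "Split"},
    {"first": "Ivo", "last": "Kovac", "city": "Split"},
]
attrs = ["first", "last", "city"]
print(greedy_key(rows, attrs))
print(smallest_key(rows, attrs))
```
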
---

Consider whether we need indexes on the replication log tables. If
replication is done frequently, the overhead of indexes might be
considerable. Test this on data and find the breaking point.
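
A rough harness for such a test might look like the sketch below. It is only
a starting point: repl_log and its columns are a hypothetical log table, not
RServ's actual schema, and an in-memory SQLite database stands in for the
real server, so the absolute numbers will not carry over. The idea is to
time the same batch of log inserts with and without an index at growing
sizes.

```python
import sqlite3
import time


def time_inserts(n_rows, with_index):
    """Insert n_rows log entries; return (elapsed seconds, rows stored)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repl_log (tbl TEXT, row_key TEXT, logged_at REAL)"
    )
    if with_index:
        conn.execute("CREATE INDEX repl_log_key ON repl_log (tbl, row_key)")
    rows = [("t%d" % (i % 10), "k%d" % i, float(i)) for i in range(n_rows)]
    start = time.perf_counter()
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO repl_log VALUES (?, ?, ?)", rows)
    elapsed = time.perf_counter() - start
    stored = conn.execute("SELECT COUNT(*) FROM repl_log").fetchone()[0]
    conn.close()
    return elapsed, stored


for n in (1_000, 10_000, 50_000):
    plain, _ = time_inserts(n, with_index=False)
    indexed, _ = time_inserts(n, with_index=True)
    print(f"{n:>6} rows: no index {plain:.4f}s, with index {indexed:.4f}s")
```

The same comparison should be repeated for the reads that the replication
step performs on the log, since that is where an index is supposed to pay
for itself.
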