rserv/doc/rserv-design.txt

RServ design considerations

This document tries to describe RServ implementation details as well as I
(Dobrica Pavlinusic) can, and to provide some insight into problems which
occurred during the implementation of RServ version 0.3 (dubbed "improved by
community").

I would particularly like to thank Alen Lovrencic for an interesting
discussion on database design, and a good lunch during it.


* How to find a field (just one) which is a unique identifier for a table?

The quick answer to this question is to use the primary key. However, not
all databases "in the wild" have primary keys defined.

How to find a unique identifier for a row: find a unique, not null index on
that table (that is the definition of a primary key). Other solutions can
provide unique identifiers, but only for the currently available data.
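
The lookup described above can be sketched as follows. This is only an
illustration: SQLite (from the Python standard library) stands in for the
real database so that the sketch is runnable, the table "person" and the
function unique_not_null_keys are hypothetical, and the catalog queries for
another database will look different.

```python
import sqlite3

def unique_not_null_keys(conn, table):
    """Return the column lists of unique indexes whose columns are all NOT NULL."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    not_null = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")
                if row[3]}
    keys = []
    # PRAGMA index_list rows start with: (seq, name, unique, ...)
    for idx in conn.execute(f"PRAGMA index_list({table})").fetchall():
        if not idx[2]:
            continue  # not a unique index
        # PRAGMA index_info rows: (seqno, cid, name)
        cols = [r[2] for r in conn.execute(f"PRAGMA index_info({idx[1]})")]
        if all(c in not_null for c in cols):
            keys.append(cols)
    return keys

# Hypothetical table: only "email" is both unique and not null.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        email TEXT NOT NULL,
        nick  TEXT,
        name  TEXT NOT NULL
    );
    CREATE UNIQUE INDEX person_email ON person (email);
    CREATE UNIQUE INDEX person_nick  ON person (nick);
    CREATE INDEX person_name ON person (name);
""")
keys = unique_not_null_keys(conn, "person")
print(keys)
```

The index on "nick" is rejected because the column allows NULL, and the one
on "name" because it is not unique.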

We have to consider the problem of associative tables (whose primary key is
a compound of foreign keys). One possible solution is to add an additional
column with a unique, not null index. A sequence comes to mind, but it can't
be used because that column must have the same value in both databases if we
want to be sure that our data is consistent.

Examine UUID implementations.
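
As a starting point for that, here is a minimal sketch of how a UUID could
fill the extra column. It assumes Python's standard uuid module; the column
name "row_id" and the row contents are hypothetical. The point is that the
value is generated once, where the row is created, and then copied unchanged
to the other database, which a per-database sequence cannot promise.

```python
import uuid

def add_row_id(row):
    """Return a copy of the row with a freshly generated 'row_id' column."""
    return {"row_id": str(uuid.uuid4()), **row}

# Row of a hypothetical associative table, created on the master.
master_row = add_row_id({"student_id": 7, "course_id": 42})

# Replication copies the identifier as-is; the replica never generates one.
replica_row = dict(master_row)

print(master_row["row_id"] == replica_row["row_id"])
```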

Hashing might seem like a solution, but it doesn't guarantee a unique
mapping between the source data set (data in the database) and the newly
created identifier, which is required.
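
The sketch below makes the collision problem visible. It deliberately cuts a
hash down to 8 bits (an assumption made only for the demonstration; the
function names and rows are hypothetical), so that 257 distinct rows must
produce at least one collision. A real hash has far more bits, which makes
collisions unlikely but still not impossible.

```python
import hashlib

def tiny_hash(row, bits=8):
    """Hash a row (tuple of values) down to `bits` bits; deliberately tiny."""
    digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def first_collision(rows, bits=8):
    """Return the first two distinct rows sharing a hash value, or None."""
    seen = {}
    for row in rows:
        h = tiny_hash(row, bits)
        if h in seen:
            return seen[h], row
        seen[h] = row
    return None

# 257 distinct rows but only 256 possible hash values: a collision must exist.
rows = [(i, f"name{i}") for i in range(257)]
collision = first_collision(rows)
print(collision)
```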

If a unique, not null index doesn't exist, we can provide different methods,
with complexity ranging from n*nrattr (if we start with all fields and drop
them one by one) to 2**nrattr (if we try every possible combination to find
a minimal key; exponential complexity prohibits this in practice).
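
A minimal sketch of both searches, working on rows held in memory. The
function names and the sample data are hypothetical, and reading n as the
number of rows is an assumption. Either search can only prove uniqueness for
the data it is given.

```python
from itertools import combinations

def is_key(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def greedy_key(rows, attrs):
    """Start with all attributes; drop each one whose removal keeps uniqueness."""
    key = list(attrs)
    if not is_key(rows, key):
        return None  # even the full row is not unique
    for a in attrs:
        trial = [k for k in key if k != a]
        if trial and is_key(rows, trial):
            key = trial
    return key

def smallest_key(rows, attrs):
    """Try every combination, smallest first: up to 2**len(attrs) checks."""
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if is_key(rows, combo):
                return list(combo)
    return None

rows = [
    {"a": 1, "b": 1, "c": 1},
    {"a": 2, "b": 1, "c": 2},
    {"a": 3, "b": 2, "c": 1},
    {"a": 4, "b": 2, "c": 2},
]
greedy = greedy_key(rows, ["a", "b", "c"])
smallest = smallest_key(rows, ["a", "b", "c"])
print(greedy, smallest)
```

On this data the drop-one-by-one search ends with the pair (b, c), because
it discards "a" first, while the exhaustive search finds that "a" alone is
enough. The cheap method gives a key that cannot be reduced further, but not
necessarily the smallest one.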

---

Consider whether we need indexes on the replication log tables. If
replication is done frequently, the overhead of indexes might be
considerable. Test this on data and find the breaking point.
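
One way to start such a test is sketched below. It is only a rough harness:
SQLite in memory stands in for the real database, and the table "repl_log",
its columns and the index are hypothetical, not the actual RServ log schema.
It measures only the cost of writing log rows with and without an index.

```python
import sqlite3
import time

def time_log_inserts(n_rows, with_index):
    """Insert n_rows into a fresh log table; return (seconds, row count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repl_log (tbl TEXT, key TEXT, logtime REAL)")
    if with_index:
        conn.execute("CREATE INDEX repl_log_key ON repl_log (tbl, key)")
    rows = [("person", str(i), float(i)) for i in range(n_rows)]
    start = time.perf_counter()
    with conn:  # commit all inserts as one transaction
        conn.executemany("INSERT INTO repl_log VALUES (?, ?, ?)", rows)
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT count(*) FROM repl_log").fetchone()[0]
    conn.close()
    return elapsed, count

for n in (1_000, 10_000):
    plain, _ = time_log_inserts(n, with_index=False)
    indexed, _ = time_log_inserts(n, with_index=True)
    print(f"{n:>7} rows: no index {plain:.4f}s, with index {indexed:.4f}s")
```

To find the breaking point, the same comparison has to include the reads
done by each replication run, since that is where an index pays off.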
