The Malete server protocol.

* introduction

The Malete server is based on message passing, with messages represented
as records. The only interface to the server can be regarded as a single
function "send", which takes a record as parameter and returns a record.
The result record is itself a valid message.
This "send" can be invoked in one of two ways:

- by having the server in process
    i.e. by actually calling the C function "send",
    possibly via some wrapper to interface another programming language.
    This is the way the Malete Tcl extension works.
- via some bytestream
    This can be regarded as just one of the wrappers, interfacing a
    bytestream by deserializing message records from the bytestream
    and serializing result records to the bytestream.
    The standard server process uses stdin and stdout and thus can
    be invoked by executing it from pipes or by contacting it via TCP,
    when running from
    > http://openisis.org/Doc/UcspiSsl tcpserver.
    As a special case, the record data file itself is such a bytestream,
    although it contains only simple write messages.

The server maintains a session state bound to a bytestream,
e.g. one TCP connection.

* messages and data

In Malete, every record has a "header", which is the value of the first field.
The header specifies which message the record represents,
with the following fields ("body") containing parameter data for the message.

Recall that
- the first field's tag denotes the number of fields in the record
- a "data record" is a record that can be written to a database.
    This requires a record id (MFN), which, however, can be 0
    to denote an append with the next available id.
- for a data record read from or written to the database,
    the header will/must be empty or start with a digit.
    The general format is 'rid[@pos][*TAB*leader]'.
    Rid is the record id (MFN), which on write may be 0 to append a new record.
    Pos is the optional old position to guard an updating write
    against concurrent changes.
    Leader contains arbitrary data, e.g. a MARC leader,
    a record key or a message header.

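The header format 'rid[@pos][*TAB*leader]' can be taken apart mechanically.
The following Python sketch is only an illustration, not part of Malete:
the names DataHeader, is_data_header and parse_data_header are made up here,
and it assumes that pos is a plain decimal number, which the text above
does not state explicitly.

```python
import re
from typing import NamedTuple, Optional


class DataHeader(NamedTuple):
    rid: int               # record id (MFN); 0 means "append" on write
    pos: Optional[int]     # optional old position guarding an update
    leader: Optional[str]  # arbitrary leader data, if present


# rid, optional '@pos', optional TAB followed by the leader.
# Assumption: pos is written as decimal digits.
_DATA_HEADER = re.compile(r"([0-9]+)(?:@([0-9]+))?(?:\t(.*))?\Z", re.DOTALL)


def is_data_header(header: str) -> bool:
    """A data record header is empty or starts with a digit."""
    return header == "" or header[0] in "0123456789"


def parse_data_header(header: str) -> DataHeader:
    """Split 'rid[@pos][TAB leader]' into its parts."""
    if header == "":
        # An empty header is treated like rid 0 (append).
        return DataHeader(0, None, None)
    m = _DATA_HEADER.match(header)
    if m is None:
        raise ValueError("not a data record header: %r" % header)
    rid, pos, leader = m.groups()
    return DataHeader(int(rid), int(pos) if pos is not None else None, leader)
```

For example, parse_data_header("42@1007\tkey") yields rid 42, pos 1007
and leader "key", while a header such as "R\t42" is rejected because it
is a proper message header.
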
Proper message headers are not empty and do not start with a digit.
The first token of a message header (up to a *TAB* or end of value)
is the message name, optionally qualified by a message target,
i.e. an object to receive the message (usually a database).

However, messages and data are converted into each other canonically:
- If a data record header is encountered where a message is expected,
    it is treated as a write message as if 'W*TAB*' were prepended
    (which obviously will write just this record).
    Even the empty message (a record with 0 fields) is a valid message
    and will append an empty record when sent to a database.
- If a message is treated as data, its header is treated as leader
    as if '0*TAB*' were prepended.

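The two canonical conversions amount to a prefix test on the header.
A minimal Python sketch, with function names (as_message, as_data)
invented for this illustration:

```python
def _is_data(header: str) -> bool:
    # Data record headers are empty or start with a digit.
    return header == "" or header[0] in "0123456789"


def as_message(header: str) -> str:
    """Header to use when the record is expected to be a message."""
    if _is_data(header):
        return "W\t" + header   # treated as a write of just this record
    return header


def as_data(header: str) -> str:
    """Header to use when the record is expected to be data."""
    if _is_data(header):
        return header
    return "0\t" + header       # the message header becomes the leader
```

Under this reading, as_data("Q\tfoo") gives "0\tQ\tfoo", i.e. an append
whose leader is the original message header.
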
* message targets are objects

A server processes messages by first looking up a target object by
inspecting and stripping an initial addressing part of the message header
(or resorting to some default) and then passing the message to this object.
(Actually, even this dispatching is done by an object, the session).

In general, objects are free in how they process messages.
For example, an object might represent a (session on a) remote server,
and simply pass every message there. Objects using the same processing
function are said to be in the same "class". Commonly, processing functions
handle only some known messages and pass anything else on to the function
of another class, which is called "inheriting from this class".

Objects to which messages can be sent are
- a structure
    is a collection of other (child) objects like databases (tables).
    It does basically nothing but pass messages to its children.
    It may support a listing of the known children.
    The structure interface may be implemented locally or as a remote server.
- a database (table)
    supports reading and writing of record and query data.
    A database is a structure; it may support children, e.g. to provide views.
- a session
    is a structure representing the connection to a (local or remote) server.
    It passes messages to the server's children (like databases) and maintains
    some state, called the environment.

Any object should recognize
- the comment '#'
    a special message used to pass additional info (echo/error)
- rooting '.'
    the message is passed to the session as is.
    A session strips the '.' and processes the rest as usual.
- options '=' (optional extension)
    to get or set values of object options (not implemented).
- messages starting with other special characters like '|' and ';'
    are reserved for future special processing

A structure in addition recognizes
- child addressing '.'
    if the message name starts with a letter and contains a dot '.',
    everything up to the dot is taken as the name of a child.
    After stripping the child qualifier, the message is sent to the child.
    With no additional message, the child's existence is tested
    and returned in a comment.
    The qualification can contain several dots, which are processed from the left.
    Therefore, 'a.b.c' means to send message 'b.c' to target 'a',
    which could be for example a remote server, which in turn is expected
    to somehow dispatch message 'c' to its local child 'b'.

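Child addressing strips one qualifier at a time from the left.
The Python sketch below shows one way a structure might do this;
split_child is a hypothetical helper, and it assumes that "letter"
means an ASCII letter, as required for standard names.

```python
from typing import Optional, Tuple


def split_child(header: str) -> Tuple[Optional[str], str]:
    """Return (child, remaining header), or (None, header) if the
    message name does not address a child."""
    # The message name is the first token, up to a TAB or end of value.
    name, sep, params = header.partition("\t")
    starts_with_letter = bool(name) and name[0].isascii() and name[0].isalpha()
    if starts_with_letter and "." in name:
        child, _, rest = name.partition(".")
        return child, rest + sep + params
    return None, header
```

With this helper, split_child("a.b.c\t12") returns ("a", "b.c\t12"):
target 'a' receives message 'b.c' and is left to dispatch 'c' to its own
child 'b'. An empty remainder, as in split_child("db."), corresponds to
the existence test.
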
A session also supports:
- default path (optional extension)
    Similar to a current working directory, a default path can be set
    as session option '@', which is then lexically prepended to any
    unrooted request to the session (not implemented).

The standard messages a database should recognize are
- the write message W
    writing one or more records to a database
- the read message R
    reading records by record id
- the query message Q
    to search the query data (btree index)
- the index message X
    to write index data

Standard message and object names always start with an ASCII letter.
As a convention, message names should start uppercase and
object names lowercase.

Every message returns an error comment message in case of error
or another message as specified (possibly the empty message).

The body of a message (i.e. the fields following the header)
may define a fixed or variable number of parameter fields
or one or more records, which are in turn, depending on the message,
used as message or data records (generally regardless of their contents):
- header only:
    The message does not use any fields or records as parameters.
    Such messages treat any body as embedded records (see below) specifying
    one or more chained messages, which are then processed in turn.
    A possible but currently unused generalization of this is
    a fixed number of parameter fields.
- parameter list:
    The contents of the following fields are interpreted by the message itself.
    Many messages use only one type of parameter field and ignore the tags.
- embedded records:
    Each of the records begins with a proper header field,
    with the tag being its negative length (including the header).
    A tag of 0 is treated as using all available fields.
    Should such a tag be positive or specify a length
    exceeding the number of available fields, the result is undefined,
    but is either an error or treating it as a record using all available fields.
- immediate record:
    Some messages also support a short form, where they do not themselves
    take all of their header, but only chop off some initial part of it,
    using the remaining message as record.

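The embedded records form can be illustrated by splitting a body into its
records. In this Python sketch a field is a (tag, value) pair and a body is
a list of fields; both the representation and the name split_embedded are
assumptions made for the example. For the undefined case (positive or
oversized length tag) it chooses to raise an error, which is one of the two
behaviours the text allows.

```python
from typing import List, Tuple

Field = Tuple[int, str]   # (tag, value); hypothetical representation


def split_embedded(body: List[Field]) -> List[List[Field]]:
    """Split a message body into embedded records.

    Each record starts with a header field whose tag is the negative
    record length, counting the header field itself.
    """
    records = []
    i = 0
    while i < len(body):
        tag = body[i][0]
        remaining = len(body) - i
        if tag == 0:
            length = remaining          # 0: use all available fields
        elif tag < 0 and -tag <= remaining:
            length = -tag
        else:
            # Undefined by the protocol; this sketch reports an error.
            raise ValueError("bad length tag %d at field %d" % (tag, i))
        records.append(body[i:i + length])
        i += length
    return records
```

A body of six fields tagged -3, 100, 200, -1, 0, 100 thus splits into
records of three, one and two fields.
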
* write

The write message takes one of two forms:
- short write (immediate record):
    The header is of the form 'W*TAB*rid[@pos][*TAB*leader]',
    and the following fields are the body of a record to write.
    This message writes the record with header 'rid[@pos][*TAB*leader]'
    and the body as given by the following fields.
    It returns a short read message with the record id written.
- long write (embedded records):
    The header is a single 'W'. The body contains any number of embedded records.
    A long write returns a long read message with the record ids written.
    With an empty body, long write can be used to test the existence
    and writability of a database.

Note that there is no special support for deleting records;
writing empty records has the same effect.

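As an illustration of the two write forms, the following Python sketch
builds both. A message is modelled as a (header, body) pair with the body
a list of (tag, value) fields; this model and the names short_write and
long_write are assumptions of the example, not Malete API.

```python
from typing import List, Optional, Tuple

Field = Tuple[int, str]
Record = Tuple[str, List[Field]]   # (header, body); hypothetical model


def short_write(rid: int, body: List[Field],
                pos: Optional[int] = None,
                leader: Optional[str] = None) -> Record:
    """Build 'W TAB rid[@pos][TAB leader]' with the record body attached."""
    header = "W\t%d" % rid
    if pos is not None:
        header += "@%d" % pos
    if leader is not None:
        header += "\t" + leader
    return header, list(body)


def long_write(records: List[Record]) -> Record:
    """Build a 'W' message whose body holds the records as embedded records."""
    fields: List[Field] = []
    for header, body in records:
        # The header field is tagged with the negative record length,
        # counting the header field itself.
        fields.append((-(len(body) + 1), header))
        fields.extend(body)
    return "W", fields
```

For example, long_write([("0", [(100, "a")]), ("7@120", [])]) produces the
body fields (-2, "0"), (100, "a"), (-1, "7@120"): an append of a one-field
record followed by a guarded write of an empty record 7, which has the
effect of a delete.
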
* read

Like write, the read message takes one of two forms,
both returning a long write for the retrieved records:
- short read (header only):
    The header is of the form 'R*TAB*rid[*TAB*count]'.
    It reads count (default 1) records starting at record rid.
    A count of 0 reads as many records as are available, within the read limit.
    Note that a read of record 0 retrieves the metadata.
- long read (parameter list):
    The header is a single 'R'.
    The following fields contain one record id each.

Note that
- the number of records read at once is limited by the session option 'r'
- read might retrieve older versions of records,
    if the database has a snapshot position set

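The two read forms could be built as follows. This is a Python sketch
using the same hypothetical (header, body) message model as elsewhere in
these examples; the parameter fields of a long read are given tag 0 here,
which is an arbitrary choice.

```python
from typing import List, Optional, Tuple

Field = Tuple[int, str]


def short_read(rid: int, count: Optional[int] = None) -> Tuple[str, List[Field]]:
    """Build 'R TAB rid[TAB count]'; count defaults to 1 on the server side."""
    header = "R\t%d" % rid
    if count is not None:
        header += "\t%d" % count
    return header, []


def long_read(rids: List[int]) -> Tuple[str, List[Field]]:
    """Build an 'R' message with one record id per parameter field."""
    # Assumption: tag 0 for the parameter fields.
    return "R", [(0, str(rid)) for rid in rids]
```

So short_read(12, 0) asks for as many records as the read limit allows,
starting at record 12, and short_read(0) asks for the metadata.
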
* query

The query message is of the form 'Q[*TAB*query]',
where query is an expression in the
> Query Malete query language.
With parameters, the query message creates a new query as the current.
With or without parameters, the query message returns an echo
of the estimated remaining result set size, followed by a long write
containing the next 'r' records from the current query set
(subject to a snapshot like read).

The query can contain two parts, separated by a '?':
- an index based search defining a result set.
    If it is empty, the search result set is the entire database.
- a filter to be applied on record retrieval.
    If no filter is specified (i.e. no '?'), only record ids are returned.
    An empty filter selects every record with all fields.
    Other filters will select records and/or fields.

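The difference between "no filter" and "empty filter" is easy to miss.
The following Python sketch separates the two parts; it assumes that the
first '?' in the expression is the separator, which may not hold if the
query language allows a literal '?' inside the search part. The name
split_query is made up for this example.

```python
from typing import Optional, Tuple


def split_query(query: str) -> Tuple[str, Optional[str]]:
    """Split a query expression into (search, filter).

    filter is None when no '?' is present (only record ids are returned)
    and '' when the filter is empty (every record with all fields).
    """
    search, sep, flt = query.partition("?")
    return search, (flt if sep else None)


def query_header(query: Optional[str] = None) -> str:
    """Build 'Q[TAB query]'; without a query the current query is continued."""
    return "Q" if query is None else "Q\t" + query
```

Under this reading, split_query("?") describes the entire database with
all fields, while split_query("") describes the entire database with
record ids only.
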
In future versions, one or both parts might be specified as embedded
records. For now, however, the query message is header only.

Note that
- the session keeps a total of 'q' queries, each with its query expression,
    cursor (offset of next record to retrieve) and search result set.
    If a query expression is only a reference '#n' to an open query,
    this query is used from its current position without establishing
    a new query.
- the size of a search result set is limited by the session option 's'.
    This limit applies also to any intermediate result, thus the
    actual set might be much smaller or even empty due to the limit.
    Some search expressions might allow larger set sizes,
    in particular the empty one (since no record ids need to be stored).

The returned echo contains several numbers:
- estimated number of remaining records, including the ones just read.
    This number may be wrong for a number of reasons; in particular it does
    not account for filtering. However, if it equals the number of returned
    records, it is safe to assume that there are no more records.
    This number is the primary echo code; if it is negative,
    the rest of the echo is some error message.
- number of the query, by which it can be referenced.
    These numbers are per database.
- truncation record id. If not 0, this is a record id where the search
    was truncated due to the result set size limit.
    Future versions might support transparent continuation after truncation.

* terms

The terms message has one of the forms
- 'T*TAB*from*TAB*to'
    Selects terms greater than or equal to the first parameter and less than
    the second. Where the second parameter is empty, no upper bound is used.
- 'T*TAB*prefix'
    Selects terms with the parameter as prefix.
    Using a prefix ABC is just a shorthand for from ABC to ABD.
- 'T*TAB*from*TAB*to*TAB*tag'
    Like the first form, but restricts matches to the given tag (number).

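The equivalence of a prefix and a from/to range can be sketched in Python
as below. The sketch assumes that terms are compared bytewise, which the
text does not state (an index with collation info may order terms
differently); prefix_range is a name invented for this example.

```python
from typing import Tuple


def prefix_range(prefix: bytes) -> Tuple[bytes, bytes]:
    """Return (from, to) bounds selecting the same terms as the prefix.

    An empty 'to' means that no upper bound is used.
    """
    upper = bytearray(prefix)
    # A trailing 0xFF byte cannot be incremented: drop it and
    # increment the byte before it instead.
    while upper and upper[-1] == 0xFF:
        upper.pop()
    if upper:
        upper[-1] += 1
    return prefix, bytes(upper)
```

For the example from the text, prefix_range(b"ABC") returns
(b"ABC", b"ABD").
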
Terms are returned as a list (record with 0-tagged fields),
where each field value is a count of hits of the term,
followed by a *TAB* and the term.
The list is limited to the result set size.
The full index can be looped over by using the last returned term
as the from parameter for the next invocation.

When not restricting to a tag, the hit count is just the number of all
index entries for the selected terms. This may be higher than the number
of matched records where a term has multiple hits for the same record.

With a restriction to a tag, the count is the actual number of records
(even where a term has multiple entries for the same record and tag).
If the database uses the traditional fulltext index format (the default),
tag 0 selects any tag, else tag 0 selects actual tag 0 entries (unique keys).

* index

The index message 'X' takes a parameter list of data and control fields.
Control fields have tag 0 and change the way the data fields are processed.
All other fields contain index data. During processing of the message,
a position counter is maintained which is incremented by one for every word
(in word or split mode), to the next multiple of the field step (default 65536)
for every field (1 in word mode), and reset to 0 on tag change.

Every control field contains one or more instructions
(as always, separated by TABs):
- f[pos]
    sets default (full field) indexing mode where every data field contains
    one index entry. The position is set to the given value, or 0, and then
    incremented to the field step.
- w[pos]
    Like field mode, but incrementing the position by one.
- s[pos]
    Split mode, where each data field is split into words according
    to collation info.
    If the index has no collation info, all characters but the well-known
    ASCII non-letters are assumed to be word characters.
- a[pos]
    set add mode (default)
- d[pos]
    set delete mode: following index entries are deleted.
- m[mode]
    mode 'H' selects traditional conversion of angle brackets:
    <a[=b]> is replaced by b (or nothing).
    mode 'P' or none turns this off.
- p*pfx*
    prepend prefix pfx to index entries
- r*id*
    set record id (defaults to the session's last written record)
- [+|-]*tag*
    where tag is a number, stops processing of the field and treats
    everything after the next *TAB* as data field with *tag*.
    With a leading + or -, sets the mode to add or delete, respectively.

Control instructions may also be part of the message header.
The index message echoes a count of the index entries made.

* comment

The comment message '#' is used to augment other messages.
It is header only (executing any body) of the form '#*TAB*code[*TAB*message]',
where code is a number.
A nonnegative code indicates a success, typically some count.
A negative code indicates some sort of error (-1..-10) or notification.
Message is arbitrary.
This message copies itself to the result.

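A client will typically want to pick the code out of a returned comment.
A minimal Python sketch, with the hypothetical helpers parse_comment and
is_error; it relies on Python's int() to read the code, which is slightly
more lenient than a strict digits-only parser would be.

```python
from typing import Optional, Tuple


def parse_comment(header: str) -> Tuple[int, Optional[str]]:
    """Split '# TAB code[TAB message]' into (code, message)."""
    parts = header.split("\t", 2)
    if parts[0] != "#" or len(parts) < 2:
        raise ValueError("not a comment header: %r" % header)
    code = int(parts[1])   # raises ValueError if code is not a number
    message = parts[2] if len(parts) > 2 else None
    return code, message


def is_error(code: int) -> bool:
    """Codes -1..-10 indicate errors; other negatives are notifications."""
    return -10 <= code <= -1
```

For instance, parse_comment("#\t-2\tno such database") returns
(-2, "no such database"); the message text in this example is made up.
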
* options

Some objects have options, which can be given as subfields
in some configuration header for the object and be set and retrieved
using the '=' message. The '=' message echoes a comment containing
some or all options as subfields.

- a single '=' echoes all options
- '=' immediately followed by option characters echoes these options
- additional subfields set options and, after a single '=', echo these.

* special message processing

optional extensions

There are more special messages envisioned which are used to control or
modify the processing of one or more other messages.
Given here is a rough sketch as a guide for future implementation;
it may not be implemented yet and is still subject to change.

The pipe '|' reuses the result created by one message as or for another message.
It scans its header for occurrences of '*TAB*|' (i.e. tab-separated subfields
with subfield code '|'), each of which starts a new submessage.
Iteratively, the part of the header up to the next submessage is processed
as a message, creating a result.

Then if the next submessage
(the part of the header starting with the next character after the pipe
and extending to the character before the next '*TAB*|' or end of header)
- is empty,
    the result is processed as message.
    This is convenient to immediately execute the read returned by a query.
- starts with a *TAB*,
    the submessage (including the *TAB*) is appended to the result's header,
    and the result is processed as message.
- else,
    the result's header is echoed to the final (not the next intermediate)
    result and then replaced by the submessage before processing.

As a special case, if the pipe message header does not contain any '*TAB*|',
it is treated as if it had a '*TAB*|' at the end, i.e. the only submessage's
result is executed (mimicking the effect of backticks).

In a long form, where the pipe message header is only the '|',
the submessages are embedded records in the body.
Here, in each step, any body fields of the following submessage
are prepended to the result before execution.

The composition ';' processes several messages, appending to the same result.
In the long form, submessages are embedded records.
In the compact form, the header is split into submessages as for the pipe.
(Details to be specified.)

* serialization

Messages can be represented in byte streams according to the following rules:
- Field values (including the header) MUST NOT contain a newline character,
    else the results are undefined. Where an application must be prepared
    to handle newlines, it must take care of encoding them (see below).
- If the message header is empty, no header is printed
- else if the message is a regular message (not starting with a digit),
    the header is printed followed by a newline.
- else 'W*TAB*' is printed followed by the header and a newline.
- All body fields are printed as the tag followed by a *TAB*,
    the value and a newline.
- A single newline is printed to terminate the message.

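The serialization rules translate almost line by line into code.
The Python sketch below models a message as a header string plus a list
of (tag, value) body fields; this model and the name serialize are
assumptions of the example. It always prints the tag and the TAB.

```python
from typing import List, Tuple

Field = Tuple[int, str]


def serialize(header: str, body: List[Field]) -> str:
    """Serialize one message according to the rules above."""
    for value in [header] + [v for _, v in body]:
        if "\n" in value:
            raise ValueError("field values must not contain newlines")
    lines = []
    if header:
        if header[0] in "0123456789":
            # Data record header: print it as a write message.
            lines.append("W\t" + header)
        else:
            lines.append(header)
    for tag, value in body:
        lines.append("%d\t%s" % (tag, value))
    # Every line ends with a newline; one more newline ends the message.
    return "".join(line + "\n" for line in lines) + "\n"
```

A record with header '12@40' and one field (100, "a") is thus written as
the three lines 'W*TAB*12@40', '100*TAB*a' and an empty line.
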
On deserialization, if a message starts with a number (digit or '-' sign),
this is the tag of the first body field, and an empty header is to
be assumed (equivalent to a 'W*TAB*0' append message).

For all body fields, the deserialization must be done in the following steps:
- take an initial '-' sign and any digits as tag, defaulting to 0
- skip one following *TAB* character
- use anything up to a newline as value
Consequently, on serialization:
- a tag of 0 may and commonly will be omitted
- where a value does not start with a TAB,
    the TAB may be omitted
- where a value does not start with a '-', digit or TAB,
    both a 0 tag and the TAB may be omitted
- where values containing newlines are used unencoded,
    they will in most cases result in following 0 tagged fields
However, omitting the TAB is considered bad style.

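The three deserialization steps for a body field, and the header rule for
a whole message, might be implemented as in this Python sketch. The names
parse_field and parse_message are invented here, and the sketch assumes
the message text has already been decoded to a string and ends at the
first empty line.

```python
import re
from typing import List, Tuple

Field = Tuple[int, str]

# Optional '-' sign and digits, one optional TAB, then the value.
_FIELD = re.compile(r"(-?[0-9]*)\t?(.*)\Z", re.DOTALL)


def parse_field(line: str) -> Field:
    """Parse one body line (without its newline) into (tag, value)."""
    digits, value = _FIELD.match(line).groups()
    tag = int(digits) if digits not in ("", "-") else 0
    return tag, value


def parse_message(text: str) -> Tuple[str, List[Field]]:
    """Parse one serialized message into (header, body)."""
    lines = text.split("\n")
    if "" in lines:
        lines = lines[:lines.index("")]   # stop at the terminating empty line
    header = ""
    # A first line starting with a digit or '-' is already a body field.
    if lines and lines[0][0] not in "-0123456789":
        header = lines.pop(0)
    return header, [parse_field(line) for line in lines]
```

Note how the omission rules fall out of the steps: parse_field("abc") and
parse_field("0\tabc") both give (0, "abc"), whereas a value that itself
starts with a TAB needs the separating TAB to be written.
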
The record data ("master") file is simply a stream of data record messages,
using headerless mode where possible (i.e. appends of leaderless records).

Some easy common encodings are suggested to deal with newline characters:
- in "field mode",
    discard newlines by replacing them with spaces or tabs.
- in "text mode",
    newlines are replaced with vertical tabs VT (ASCII 11, ^K).
    This may be reversed to restore newline-separated lines if needed,
    but e.g. on printing the VT will have the desired effect.
- in "binary mode",
    a newline is replaced by a VT followed by a byte value 1
    if the newline is followed by a byte value 0 or 1, else by a single VT.
    A VT is replaced by a VT and a 0 byte.
- as an "ultra robust binary mode", use BASE64.

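The binary mode rules can be checked with a small round trip. In the
Python sketch below, encode_binary follows the rules as stated; the
decoder is not spelled out in the text and is inferred here as the
inverse of the encoder. Both function names are made up for the example.

```python
NL, VT = 0x0A, 0x0B


def encode_binary(data: bytes) -> bytes:
    """Encode arbitrary bytes so that the result contains no newline."""
    out = bytearray()
    for i, b in enumerate(data):
        if b == NL:
            out.append(VT)
            # Disambiguate when the next byte could be read as a marker.
            if i + 1 < len(data) and data[i + 1] in (0, 1):
                out.append(1)
        elif b == VT:
            out += bytes((VT, 0))
        else:
            out.append(b)
    return bytes(out)


def decode_binary(data: bytes) -> bytes:
    """Inverse of encode_binary (inferred, not specified in the text)."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        i += 1
        if b != VT:
            out.append(b)
        elif i < len(data) and data[i] == 0:
            out.append(VT)      # VT 0 stands for a literal VT
            i += 1
        else:
            out.append(NL)      # VT or VT 1 stands for a newline
            if i < len(data) and data[i] == 1:
                i += 1
    return bytes(out)
```

For text without the bytes 0, 1 and 11 the encoding is the same as text
mode, e.g. encode_binary(b"a\nb") equals b"a\x0bb". A value consisting
only of VTs doubles in size.
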
The advantages of text mode over binary mode are
- it is slightly faster than the binary translation
- the serialized records do not need more space
    (whereas the binary serialization might need twice the space)

The binary mode has the advantage of not losing vertical tab characters that
might have been contained in the original field values.
It is fully transparent and can be used to store any binary data like images
with an average overhead of 0.4% (as compared to +33% with BASE64 encoding).
Note that for plain text not containing control characters 0, 1 or 11,
text and binary mode have the same results, thus it is reasonably safe
for client libraries to use binary mode by default on all communication.

However, BASE64 has the advantage of even surviving a character set recoding,
and is thus more robust for databases which may be exchanged internationally.
Also, the overhead of BASE64 is fixed at 33% (4 bytes for every 3),
while the binary mode has a worst case of +100% (on all VTs).

---
$Id: Protocol.txt,v 1.12 2004/06/15 11:11:16 kripke Exp $