fst lines read
    "tag tec mode,fmt",
where
- tag  is the field number used in the index posting
- tec  is the indexing technique (see below)
- mode is an Mmc formatting mode or, for tec 5..8, a prefixing format
- fmt  is some format expression, typically a single field like v24

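The line layout above can be sketched as a small parser. This is only an illustration in Python: the names FstLine and parse_fst_line are hypothetical, and two behaviours are assumptions rather than documented facts (a missing mode falls back to the default MPL, and tec 5..8 insists on a quoted prefixing format).

```python
import re
from typing import NamedTuple


class FstLine(NamedTuple):
    tag: int   # field number used in the index posting
    tec: int   # indexing technique, 0..8
    mode: str  # Mmc mode, or the quoted prefixing format for tec 5..8
    fmt: str   # format expression, e.g. v24


_HEAD = re.compile(r"\s*(\d+)\s+(\d+)\s+(.*)$")
_MODE = re.compile(r"(M[PHD][LU])\s*,\s*", re.IGNORECASE)
_PREFIX = re.compile(r"('[^']*')\s*,\s*")


def parse_fst_line(line: str) -> FstLine:
    """Split one 'tag tec mode,fmt' line into its four parts."""
    head = _HEAD.match(line)
    if head is None:
        raise ValueError(f"not an fst line: {line!r}")
    tag, tec = int(head.group(1)), int(head.group(2))
    rest = head.group(3).strip()
    if not 0 <= tec <= 8:
        raise ValueError(f"unknown indexing technique: {tec}")
    if tec >= 5:
        lead = _PREFIX.match(rest)
        if lead is None:
            # Assumption: tec 5..8 always carries a prefixing format.
            raise ValueError("tec 5..8 needs a quoted prefixing format")
        return FstLine(tag, tec, lead.group(1), rest[lead.end():])
    lead = _MODE.match(rest)
    if lead is None:
        # Assumption: no explicit mode means the default MPL.
        return FstLine(tag, tec, "MPL", rest)
    return FstLine(tag, tec, lead.group(1).upper(), rest[lead.end():])
```

For instance, parse_fst_line("24 4 MHU,v24") yields tag 24, tec 4, mode MHU and fmt v24.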
* formatting modes
indicated by Mmc, where m is a character indicating some translations,
and c is a character indicating the handling of case; default MPL

m
- P proof mode: no changes applied
- H heading mode: angle brackets are removed,
    ^x is replaced by ';' for x=a, ',' for x=b..i, '.' for others
- D data mode: like heading mode plus '. ' after each field

c
- L lower case: no changes applied
- U upper case: (ASCII-)characters are converted to "uppercase" as listed
    in the file ISISUC.TAB (a text file containing 256 replacement byte values
    like ... '096 065 066' ... in 16 lines of 16 numbers each)

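As a rough illustration of the modes, the following Python sketch applies the translations exactly as listed above to the bytes of one field. All function names are hypothetical. ascii_uc_table is a stand-in for a real ISISUC.TAB, and treating subfield codes case-insensitively is an assumption.

```python
def load_uc_table(text: str) -> bytes:
    """Parse ISISUC.TAB-style text: 256 decimal byte values."""
    values = [int(token) for token in text.split()]
    if len(values) != 256:
        raise ValueError(f"expected 256 values, got {len(values)}")
    return bytes(values)


def ascii_uc_table() -> bytes:
    """Stand-in table (assumption): maps a-z to A-Z, keeps other bytes."""
    return bytes(b - 32 if 97 <= b <= 122 else b for b in range(256))


def heading(data: bytes) -> bytes:
    """Heading mode: drop angle brackets, turn ^x into punctuation."""
    out = bytearray()
    i = 0
    while i < len(data):
        ch = data[i]
        if ch in b"<>":
            i += 1
        elif ch == ord("^") and i + 1 < len(data):
            x = chr(data[i + 1]).lower()  # assumption: ^A behaves like ^a
            out += b";" if x == "a" else b"," if "b" <= x <= "i" else b"."
            i += 2
        else:
            out.append(ch)
            i += 1
    return bytes(out)


def apply_mode(field: bytes, mode: str, uc_table: bytes) -> bytes:
    """Apply an Mmc mode such as 'MPL' or 'MHU' to the bytes of one field."""
    m, c = mode[1].upper(), mode[2].upper()
    if m in "HD":
        field = heading(field)
    if m == "D":
        field += b". "  # data mode: '. ' after each field
    if c == "U":
        field = field.translate(uc_table)
    return field
```

With the stand-in table, apply_mode(b"<Water>^bsupply", "MHU", ascii_uc_table()) gives b"WATER,SUPPLY", while MPL leaves the bytes untouched.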
* indexing techniques
- 0 one entry per line (format should produce appropriate newlines with /)
- 1 one entry per line or subfield (use proof mode MPx to retain ^x)
- 2 one entry per term in <angle brackets> (use proof mode)
- 3 one entry per term /between slashes/
- 4 one entry per "word":
    a word is a sequence of 'alphabetical characters' as listed in the file
    ISISAC.TAB (a text file containing numbers like ... '098 099 100 101' ...).
    "stopwords" (listed in the db's .stw file) are ignored
- 5..8 like 1..4, but instead of applying a mode Mmc,
    the mode slot holds a prefixing format 'dpd' (delimiter d, prefix p)
    and every entry is given the prefix p, like
        1 8 '/TI=/',v24   (delimiter /, entries prefixed with TI=)
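
The techniques can be sketched in Python as one function that turns already formatted text into index entries. This is a simplified reading of the list above, not the actual implementation. The names are hypothetical, ascii_alpha stands in for a real ISISAC.TAB, and two points are assumptions: stopwords are compared in upper case, and the prefixing format is a quoted literal whose first and last characters are the delimiter.

```python
import re
import string


def load_ac_table(text: str) -> set:
    """Parse ISISAC.TAB-style text: byte numbers of alphabetical characters."""
    return {chr(int(token)) for token in text.split()}


def ascii_alpha() -> set:
    """Stand-in for ISISAC.TAB (assumption): plain ASCII letters."""
    return set(string.ascii_letters)


def prefix_of(literal: str) -> str:
    """Extract p from a quoted 'dpd' literal such as '/TI=/'."""
    body = literal.strip("'")
    if len(body) < 2 or body[0] != body[-1]:
        raise ValueError(f"bad prefixing format: {literal!r}")
    return body[1:-1]


def words(text: str, alpha: set) -> list:
    """Split text into maximal runs of alphabetical characters."""
    out, cur = [], []
    for ch in text:
        if ch in alpha:
            cur.append(ch)
        elif cur:
            out.append("".join(cur))
            cur = []
    if cur:
        out.append("".join(cur))
    return out


def index_entries(text, tec, prefix="", stopwords=frozenset(), alpha=None):
    """Return the index entries of formatted text for technique 0..8."""
    base = tec - 4 if tec >= 5 else tec  # 5..8 behave like 1..4
    if base == 0:
        items = text.split("\n")
    elif base == 1:
        items = [part for ln in text.split("\n") for part in re.split(r"\^.", ln)]
    elif base == 2:
        items = re.findall(r"<([^<>]*)>", text)
    elif base == 3:
        items = re.findall(r"/([^/]*)/", text)
    else:
        found = words(text, alpha if alpha is not None else ascii_alpha())
        items = [w for w in found if w.upper() not in stopwords]
    lead = prefix_of(prefix) if tec >= 5 else ""
    return [lead + item.strip() for item in items if item.strip()]
```

For the example line above, index_entries("Water supply", 8, prefix="'/TI=/'") gives ["TI=Water", "TI=supply"].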