--- trunk/lib/WebPAC/Manual.pod 2005/06/25 20:23:23 1
+++ trunk/lib/WebPAC/Manual.pod 2007/10/07 22:51:54 892
@@ -12,35 +12,68 @@

  step

- source file CDS/ISIS, MARC, Excel, robots, ...
+ source data CDS/ISIS, MARC, Excel, robots, ...
  |
- 1 | apply import normalisation rules (xml)
+ 0 | apply lookup rules (optional)
+ 1 | apply input normalisation rules (xml or yaml)
  V
- intermidiate this data is re-formatted source data converted
- data to chunks based on tag names from import_xml
+ intermidiate this data is re-formatted source data converted
+ data to chunks based on tag names from config/input/
  |
- 2 | apply output filter (TT2)
+ 2 | optionally apply output filter (TT2)
  V
- data search engine, HTML, OAI, RDBMS
+ data search engine, HTML, OAI, RDBMS
  |
  3 | filter using query in REST format
  4 | apply output filter (TT2)
  V
- client Web browser, SOAP
+ client Web browser (html), JSON

-=head2 Normalisation and Intermidiate data
+=head2 Source data

-This is first step in working with your data.
+WebPAC supports various input formats:
+
+=over 2
+
+=item L CDS/ISIS data
+
+=item L for MARC records
+
+=item L Microsoft Excel C<.xls> support
+
+=item L support legacy tables (e.g. Clipper)
+
+=item L for RDF catalog data from Project Gutenberg
+
+=back
+
+=head2 Create data lookups
+
+Before you can begin normalisation, you might want to create lookups which store
+C<< key -> value(s) >> pair(s). Lookups are especially useful if you want to
+I lookup value of some other record using some sort of identifier.
+
+Lookup are described in more details in L.
+
+=head2 Normalisation to intermidiate data
+
+Intermidiate data is internal representation of data on which WebPAC operates.

 You are creating mappings, one-to-one from source data records to documents
-in webpac. You can split or merge data from input records, apply filters
-(perl subroutines), use lookups within same source file or do simple
-evaluations while producing output.
-
-All that is controlled with C configuration file. You will want
-to create fine-grained chunks of data (like separate first and last name),
-which will later be used to produce output. You can think of conversation
-process as application of C recepie on every input record.
+in WebPAC. You can split or merge data from input records, apply regexes,
+use lookups within same source file, do conditions, branches and/or
+simple evaluations while producing intermidiate data.
+
+All that is controlled with C configuration file.
+This file is in human-readable YAML format, and it describes all configuration of
+WebPAC and it's front-end Webpacus.
+
+
+All that is controlled with C configuration files. You
+will want to create fine-grained chunks of data (like separate first and
+last name), which will later be used to produce output. You can think of
+conversation process as application of C recepie on
+every input record.

 Each tag within recepie is creating one new records as long as there are
 fields in input format (which can be repeatable) that satisfy at least one
@@ -50,6 +83,9 @@
 formatting or specification of output type and that granularity of each tag
 has increased.

+B
+
 =head2 Output filter

 Now that we have normalized record, we can create some output. You can create