Diff of /trunk/lib/WebPAC/Manual.pod

revision 1 by dpavlin, Sat Jun 25 20:23:23 2005 UTC revision 892 by dpavlin, Sun Oct 7 22:51:54 2007 UTC
# Line 12  of your data: Line 12  of your data:
12    
13    step    step
14    
15     source file          CDS/ISIS, MARC, Excel, robots, ...     source data      CDS/ISIS, MARC, Excel, robots, ...
16        |        |
17    1   | apply import normalisation rules (xml)    0   | apply lookup rules (optional)
18      1   | apply input normalisation rules (xml or yaml)
19        V        V
20     intermediate         this data is re-formatted source data converted     intermediate     this data is re-formatted source data converted
21       data               to chunks based on tag names from import_xml       data           to chunks based on tag names from config/input/
22        |        |
23    2   | apply output filter (TT2)    2   | optionally apply output filter (TT2)
24        V        V
25       data               search engine, HTML, OAI, RDBMS       data           search engine, HTML, OAI, RDBMS
26        |        |
27    3   | filter using query in REST format    3   | filter using query in REST format
28    4   | apply output filter (TT2)    4   | apply output filter (TT2)
29        V        V
30      client              Web browser, SOAP      client          Web browser (html), JSON
31    
32  =head2 Normalisation and Intermediate data  =head2 Source data
33    
34  This is the first step in working with your data.  WebPAC supports various input formats:
35    
36    =over 2
37    
38    =item L<WebPAC::Input::ISIS> CDS/ISIS data
39    
40    =item L<WebPAC::Input::MARC> for MARC records
41    
42    =item L<WebPAC::Input::Excel> Microsoft Excel C<.xls> support
43    
44    =item L<WebPAC::Input::DBF> support legacy tables (e.g. Clipper)
45    
46    =item L<WebPAC::Input::Gutemberg> for RDF catalog data from Project Gutenberg
47    
48    =back
49    
50    =head2 Create data lookups
51    
52    Before you begin normalisation, you might want to create lookups, which store
53    C<< key -> value(s) >> pairs. Lookups are especially useful if you want to,
54    I<well>, look up a value from some other record using some sort of identifier.
55    
56    Lookups are described in more detail in L<WebPAC::Lookup>.
57    
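The C<< key -> value(s) >> idea above can be sketched in a few lines. This is only an illustration in Python, not the L<WebPAC::Lookup> API: the function name C<build_lookup>, the record layout and the field names are all assumptions made up for the example.

```python
from collections import defaultdict


def build_lookup(records, key_field, value_field):
    """Map each key to the list of values seen for it (key -> value(s))."""
    lookup = defaultdict(list)
    for record in records:
        key = record.get(key_field)
        value = record.get(value_field)
        # Skip records that lack either the key or the value.
        if key is not None and value is not None:
            lookup[key].append(value)
    return dict(lookup)


# Hypothetical records; the field names are illustrative only.
records = [
    {"id": "001", "parent": None, "title": "Collected works"},
    {"id": "002", "parent": "001", "title": "Volume 1"},
    {"id": "003", "parent": "001", "title": "Volume 2"},
]

# Look up a record's title by its identifier ...
titles_by_id = build_lookup(records, "id", "title")
# ... or collect the titles of all records that point at the same parent.
children_by_parent = build_lookup(records, "parent", "title")
```

A lookup built this way lets a later normalisation step resolve an identifier stored in one record to a value held in another.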
58    =head2 Normalisation to intermediate data
59    
60    Intermediate data is the internal representation of data on which WebPAC operates.
61    
62  You are creating one-to-one mappings from source data records to documents  You are creating one-to-one mappings from source data records to documents
63  in webpac. You can split or merge data from input records, apply filters  in WebPAC. You can split or merge data from input records, apply regexes,
64  (perl subroutines), use lookups within the same source file or do simple  use lookups within the same source file, use conditions, branches and/or
65  evaluations while producing output.  simple evaluations while producing intermediate data.
66    
67  All that is controlled with the C<import_xml> configuration file. You will want  All that is controlled with the C<config/config.yml> configuration file.
68  to create fine-grained chunks of data (like separate first and last name),  This file is in human-readable YAML format, and it describes the whole configuration of
69  which will later be used to produce output. You can think of the conversion  WebPAC and its front-end Webpacus.
70  process as the application of the C<import_xml> recipe to every input record.  
71    
72    All that is controlled with the C<config/input/> configuration files. You
73    will want to create fine-grained chunks of data (like separate first and
74    last name), which will later be used to produce output. You can think of
75    the conversion process as the application of the C<config/input/> recipe to
76    every input record.
77    
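As a rough illustration of "applying a recipe to every input record", the sketch below splits an author field into separate first and last name chunks. It is written in Python and does not use WebPAC's actual rule syntax: the recipe layout, the helper names C<split_name> and C<normalise>, and the field tags C<200> and C<700> are assumptions chosen for the example.

```python
def split_name(value):
    """Split 'Last, First' into two fine-grained chunks."""
    last, _, first = value.partition(",")
    return {"last_name": last.strip(), "first_name": first.strip()}


def normalise(record, recipe):
    """Apply every recipe rule to one record.

    Input fields are repeatable, so each output chunk is a list of values.
    """
    out = {}
    for source_field, rule in recipe.items():
        for value in record.get(source_field, []):
            for name, chunk in rule(value).items():
                out.setdefault(name, []).append(chunk)
    return out


# Hypothetical recipe: source field tag -> rule producing named chunks.
recipe = {
    "700": split_name,
    "200": lambda v: {"title": v.strip()},
}

# Hypothetical input record with one repeated field.
record = {
    "200": [" Faust "],
    "700": ["Goethe, Johann Wolfgang", "Taylor, Bayard"],
}

doc = normalise(record, recipe)
```

Keeping first and last names as separate chunks means a later output step can combine or sort them however it needs, without parsing the source field again.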
78  Each tag within the recipe creates one new record as long as there are  Each tag within the recipe creates one new record as long as there are
79  fields in the input format (which can be repeatable) that satisfy at least one  fields in the input format (which can be repeatable) that satisfy at least one
# Line 50  Users of older webpac should note that t Line 83  Users of older webpac should note that t
83  formatting or specification of output type and that granularity of each tag  formatting or specification of output type and that granularity of each tag
84  has increased.  has increased.
85    
86    B<this document should really be updated to reflect Webpacus front-end from
87    this point...>
88    
89  =head2 Output filter  =head2 Output filter
90    
91  Now that we have a normalized record, we can create some output. You can create  Now that we have a normalized record, we can create some output. You can create
