--- trunk/README.pod 2006/07/11 14:11:42 60
+++ trunk/README.pod 2007/07/13 10:12:54 87
@@ -1,37 +1,72 @@
-=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL
+=head1 pgestraier - PostgreSQL full-text search using Hyper Estraier
 
-This package is essentially PostgreSQL C function which calls Hyper Estraier
-API and returns results in user defined format.
+This package is essentially composed of two different parts:
+
+=over 4
+
+=item search function
+
+PostgreSQL function to search Hyper Estraier full-text index, using
+full-text queries and attribute filtering to return user-specified
+table of results.
+
+This function can mimic SQL C, C and C
+functionality much faster than using those SQL constructs on search
+results.
+
+=item trigger function
+
+PostgreSQL trigger function to keep Hyper Estraier in sync with PostgreSQL.
+It triggers after insert, update or delete and update full-text index
+accordingly.
+
+=back
+
+Both functions are written in C, while test framework and supporting
+utilities are written in perl.
+
+You can use just one of those functions. If you want just to search existing
+Hyper Estraier index or generate it off-line (after nightly batch jobs, for
+example), just use search function.
+
+On the other hand, if you want just to keep your Hyper Estraier index in
+sync with PostgreSQL data, you can use just trigger function to achieve that.
 
 =head1 Why is it written?
 
-Aside from providing single API to query your RDBMS and full text index
+Aside from providing single query language (SQL) to RDBMS and full text index
 (using any language that has PostgreSQL client libraries), real power is
 hidden in ability to join results from full text index and structured data
 in RDBMS.
 
 For simple real-life example which address problem
-I
-see L.
+C<< WHERE name LIKE '%foo%' OR surname LIKE '%foo%' >>
+is slow see L and L documentation.
 
 =head1 How to install
 
 Installation should be simple. However, you will have to have following
-software already installed before you try this function:
+software already installed before you try this functions:
 
 =over
 
 =item *
 
-PostgreSQL (tested with versions 7.4 and 8.0) with development libraries
+PostgreSQL (tested with versions 7.4, 8.0 and 8.1) with development libraries
 
 =item *
 
-Hyper Estraier (tested with various versions, recommended 1.2.4 of newer)
+Hyper Estraier (tested with various versions, recommended 1.2.4 or newer)
+with development headers
+
+=item *
+
+working C compiler (tested with gcc)
 
 =back
 
-To run tests you will also need:
+If you want to use helper script to create consistency triggers to keep
+Hyper Estraier in sync with PostgreSQL database, you will also need:
 
 =over
 
@@ -41,11 +76,22 @@
 
 =item *
 
-perl modules C, C, C and optionally C
+perl modules C, C and C
+
+=back
+
+To run tests you will also need:
+
+=over
 
 =item *
 
-C from Internet Movie Database in C directory
+perl module C
+
+=item *
+
+C from Internet Movie Database in C directory.
+You can download it from L
 
 =item *
 
@@ -85,7 +131,7 @@
 
 See also included file C for more examples of usage.
 
-=head1 Usage of pgest from SQL
+=head1 Usage of search function pgest from SQL
 
 C PostgreSQL function tries to mimic usage of normal database tables
 (with support for attribute filtering, limit and offset) in following way:
@@ -144,6 +190,49 @@
 index grows bigger, you might consider using node API to remove overhead of
 database opening on each query.
 
+=head1 Usage of trigger function pgest_trigger from SQL
+
+Let's first say that I really suggest that you use C helper script to
+create triggers because it already supports following steps automatically:
+
+=over
+
+=item begin transaction
+
+Transaction is needed to catch updates which might happen while creation
+of full-text index is in progress (and on huge collections this can take a while,
+just like normal index creation in PostgreSQL).
+
+=item insert all existing data in full-text index
+
+This will be done directly from PostgreSQL database to Hyper Estraier index.
+This is somewhat faster than waiting for trigger to fire for each existing
+row.
+
+=item create insert, update and delete triggers
+
+Which will keep data in sync later
+
+=item commit transaction
+
+=back
+
+If you still want to do that manually, you will need to know format of
+C function:
+
+  CREATE TRIGGER pgest_trigger_insert AFTER INSERT
+    ON table FOR EACH ROW
+    EXECUTE PROCEDURE pgest_trigger(
+      -- node URI, login and password
+      'http://localhost:1978/node/trivia', 'admin', 'admin',
+      -- name of primary key column
+      'id',
+      -- names of all other columns to index (one or more)
+      'column', 'another_one', 'and_another'
+    )
+
+You have to create triggers for C and C in similar way.
+
 =head1 Who wrote this?
 
 Hyper Estraier is written by Mikio Hirabayashi.
@@ -151,4 +240,32 @@
 PostgreSQL is written by hackers calling themselves PostgreSQL Global
 Development Group.
 
-This small C function is written by L, dpavlin@rot13.org.
+This small C functions are written by L, dpavlin@rot13.org.
+
+=head1 See also
+
+=over
+
+=item *
+
+L - how to create first full-text index in under 10 minutes!
+
+=item *
+
+L - what has changed since last version
+
+=item *
+
+L - helper script to create index and triggers
+
+=item *
+
+L hosts home page of this project
+
+=item *
+
+L
+has a documentaton about query format. C is using noraml queries (with
+C, C etc.) and not simplified queryies (with C<|>).
+
+=back