Diff of /trunk/README.pod

revision 42 by dpavlin, Sat Sep 10 21:48:46 2005 UTC | revision 60 by dpavlin, Tue Jul 11 14:11:42 2006 UTC
# Line 10  Aside from providing single API to query Line 10  Aside from providing single API to query
10  hidden in ability to join results from full text index and structured data  hidden in ability to join results from full text index and structured data
11  in RDBMS.  in RDBMS.
12    
13    For simple real-life example which addresses problem
14    I<where like '%foo%' is slow>
15    see L<Tutorial>.
16    
17  =head1 How to install  =head1 How to install
18    
19  Installation should be simple. However, you will have to have following  Installation should be simple. However, you will have to have following
# Line 19  software already installed before you tr Line 23  software already installed before you tr
23    
24  =item *  =item *
25    
26  PostgreSQL (tested with version 7.4.8) with development libraries  PostgreSQL (tested with versions 7.4 and 8.0) with development libraries
27    
28  =item *  =item *
29    
30  Hyper Estraier (tested with versions 0.3.9 and 0.3.10)  Hyper Estraier (tested with various versions, recommended 1.2.4 or newer)
31    
32  =back  =back
33    
# Line 37  working perl installation Line 41  working perl installation
41    
42  =item *  =item *
43    
44  perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<HyperEstraier>  perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<Search::Estraier>
45    
46    =item *
47    
48    C<trivia.list.gz> from Internet Movie Database in C<data/> directory
49    
50  =item *  =item *
51    
52  C<trivia.list.gz> from Internet Movie Database in data/ directory  PostgreSQL database C<test> with permissions for current user
53    
54  =item *  =item *
55    
56  database "test" with permissions for current user  Hyper Estraier C<estmaster> running with permissions for C<admin> user
57    to create C<trivia> node.
58    
59  =back  =back
60    
# Line 54  If you have all that, you should be able Line 63  If you have all that, you should be able
63    make    make
64    
65  and see sample results. You will be asked your password once (via sudo) to  and see sample results. You will be asked your password once (via sudo) to
66  install pgest.so shared library in system-wide location so that PostgreSQL  install C<pgest.so> shared library in system-wide location so that PostgreSQL
67  could access it.  could access it.
68    
69  Next, you will have to create test index. You have two options:  =head2 Create sample index using Hyper Estraier perl bindings
70    
71  =head2 Create index using estcmd  Perl bindings for Hyper Estraier are available at CPAN:
72    
73  This will create temporary files on disk and index them using estcmd gather  L<http://search.cpan.org/~dpavlin/Search-Estraier/>
74    
   cd data  
   make index  
   cd ..  
75    
76  B<Warning:> this method is incomplete and won't create node index needed  After installing C<Search::Estraier> you can create index using following commands:
 to run last examples in C<test.sql> correctly. Solution is simple: either  
 symlink your newly created index to Hyper Estraier C<_node> directory or  
 create node and re-create index using C<estcall>.  
   
 =head2 Create index using Hyper Estraier perl bindings  
   
 For this, you will have to install perl bindings from  
   
 L<http://hyperestraier.sourceforge.net/binding/>  
   
 If you installed bindings as documented in README file, you can use  
 perl binding to create index about three times faster. However, you will  
 first need to create node I<trivia> using Hyper Estraier's administration  
 interface at L<http://localhost:1978/masterui>. You will also need user  
 C<admin> with password C<admin> because those values are hard-coded in  
 C<indexer.pl>. If you want to use different user or index name, feel  
 free to change script.  
77    
78    cd data    cd data
79    make perl    make index
80    cd ..    cd ..
81    
82  To run tests (which require that you have estcmd in your $PATH) issue  To run tests (which require that you have estcmd in your $PATH) issue
83    
84    make test    make test
85    
86  See also included file test.sql for more examples of usage.  See also included file C<test.sql> for more examples of usage.
87    
88  =head1 Usage of pgest from SQL  =head1 Usage of pgest from SQL
89    
90  C<pgest> PostgreSQL function has two different prototypes (number of arguments) depending on usage.  C<pgest> PostgreSQL function tries to mimic usage of normal database tables (with support for attribute filtering, limit and offset) in following way:
91    
92          SELECT          SELECT
93                  -- columns to return (defined later)                  -- columns to return (defined later)
94                  id,title,size                  id,title,size
95          FROM pgest(          FROM pgest(
96                  -- path to index OR URL to node, user-name and password                  -- node URI, login, password and depth of search
97                  -- you will need JUST ONE of following two lines, depending                  'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
                 -- on your usage described below  
                 '/full/path/to/casket',  
                 'http://localhost:1978/node/trivia', 'admin', 'admin',  
98                  -- query                  -- query
99                  'blade runner',                  'blade runner',
100                  -- additional attributes, use NULL or '' to disable                  -- additional attributes, use NULL or '' to disable
# Line 127  C<pgest> PostgreSQL function has two dif Line 113  C<pgest> PostgreSQL function has two dif
113                  id text, title text, size text                  id text, title text, size text
114          );          );
115    
116    You should note that Hyper Estraier uses UTF-8 encoding, while your
117    PostgreSQL installation might use different encoding. To fix that, use
118    C<convert> function in PostgreSQL to convert encodings.
119    
120    =head2 Using index via C<estmaster> server process
121    
122    This is default and recommended way to use C<pgest> functionality. In this
123    case, C<pgest> will use node API and access index through C<estmaster>
124    process which should be running on (local or remote) machine.
125    
126    This will remove database opening overhead, at a cost of (small) additional network
127    traffic. However, you can have Hyper Estraier C<estmaster> process running on
128    different machine or update index while doing searches, so benefits of this
129    approach are obvious.
130    
131  =head2 Accessing database directly  =head2 Accessing database directly
132    
133  If you want to access database directly (without running C<estmaster> process), first argument is full path to database file.  B<Please note that direct access to database is deprecated.> As such, it's
134    not stated in example, and it's kept just for backward compatibility, but it
135    will probably be removed in future versions of C<pgest>.
136    
137    If you want to access database directly (without running C<estmaster> process), you
138    have to replace node URI, login, password and depth with full path to database file.
139    
140  Have in mind that C<postgres> user under which PostgreSQL is running must  Have in mind that C<postgres> user under which PostgreSQL is running must
141  have read permission on Hyper Estraier database files.  have read permission on Hyper Estraier database files.
# Line 138  This will work a bit faster on really sm Line 144  This will work a bit faster on really sm
144  index grows bigger, you might consider using node API to remove overhead of  index grows bigger, you might consider using node API to remove overhead of
145  database opening on each query.  database opening on each query.
146    
 =head2 Using index via C<estmaster> server process  
   
 If first argument is URL to node (like C<http://localhost:1978/node/trivia>)  
 and there are two additional parameters (user-name and password) after it,  
 C<pgest> will use node API and access index through C<estmaster> process which should be running on (local or remote) machine.  
   
 This will remove database opening overhead, at a cost of additional network  
 traffic. However, you can have Hyper Estraier C<estmaster> process running on  
 different machine or update index while doing searches, so benefits of this  
 approach are obvious.  
   
 B<Note:> Currently, there is no support to search more than one index (depth  
 of search is always 0). This will be fixed.  
   
147  =head1 Who wrote this?  =head1 Who wrote this?
148    
149  Hyper Estraier is written by Mikio Hirabayashi.  Hyper Estraier is written by Mikio Hirabayashi.
150    
 Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.  
   
151  PostgreSQL is written by hackers calling themselves PostgreSQL Global  PostgreSQL is written by hackers calling themselves PostgreSQL Global
152  Development Group.  Development Group.
153    
154  This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.  This small C function is written by L<Dobrica Pavlinusic|http://www.rot13.org/~dpavlin/>, dpavlin@rot13.org.
