Diff of /trunk/README.pod

trunk/README revision 29 by dpavlin, Wed Jul 6 11:47:56 2005 UTC trunk/README.pod revision 51 by dpavlin, Tue May 9 22:55:42 2006 UTC
# Line 1  Line 1 
1  1. pgestraier - search Hyper Estraier indexes from PostgreSQL  =head1 pgestraier - search Hyper Estraier indexes from PostgreSQL
2    
3  This package is essentially PostgreSQL C function which calls Hyper Estraier  This package is essentially PostgreSQL C function which calls Hyper Estraier
4  API and returns results in user defined format.  API and returns results in user defined format.
5    
6  2. Why is it written?  =head1 Why is it written?
7    
8  Aside from providing single API to query your RDBMS and full text index  Aside from providing single API to query your RDBMS and full text index
9  (using any language that has PostgreSQL client libraries), real power is  (using any language that has PostgreSQL client libraries), real power is
10  hidden in ability to join results from full text index and structured data  hidden in ability to join results from full text index and structured data
11  in RDBMS.  in RDBMS.
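
As a rough sketch of such a join: the query below assumes a hypothetical table movies(title, year) in the same PostgreSQL database (it is not part of pgestraier) and uses the node form of pgest with the argument order shown in the usage section of this README. The node URL, credentials and depth are the sample values from that section.

```sql
-- Hypothetical table, for illustration only: movies(title text, year integer)
SELECT m.title, m.year, hits.size
FROM pgest(
        -- node URI, login, password and depth of search
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        -- full text query
        'blade runner',
        -- no additional attribute conditions
        NULL,
        -- order results by title
        '@title STRA',
        -- no limit, no offset
        NULL, NULL,
        -- attributes to return as columns
        ARRAY['@title','@size']
) AS hits (title text, size text)
JOIN movies m ON m.title = hits.title
WHERE m.year < 1990;
```

The full text hits behave like any other row source here, so they can be filtered, joined and sorted with ordinary SQL.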
12    
13  3. How to install  =head1 How to install
14    
15  Installation should be simple. However, you will have to have following  Installation should be simple. However, you will have to have following
16  software already installed before you try this function:  software already installed before you try this function:
17    
18   * PostgreSQL (tested with version 7.4.8) with development libraries  =over
19   * Hyper Estraier (tested with versions 0.3.9 and 0.3.10)  
20    =item *
21    
22    PostgreSQL (tested with versions 7.4 and 8.0) with development libraries
23    
24    =item *
25    
26    Hyper Estraier (tested with 0.5.0-1.0.0+, versions newer than 0.9.6 are
27    recommended)
28    
29    =back
30    
31  To run tests you will also need:  To run tests you will also need:
32    
33   * working perl installation  =over
34   * perl modules DBI, DBD::Pg, Test::More  
35   * trivia.list.gz from Internet Movie Database in data/ directory  =item *
36   * database "test" with permissions for current user  
37    working perl installation
38    
39    =item *
40    
41    perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<HyperEstraier>
42    
43    =item *
44    
45    C<trivia.list.gz> from Internet Movie Database in C<data/> directory
46    
47    =item *
48    
49    PostgreSQL database C<test> with permissions for current user
50    
51    =item *
52    
53    Hyper Estraier node C<trivia> with permissions for C<admin> user.
54    
55    =back
56    
57  If you have all that, you should be able to type  If you have all that, you should be able to type
58    
59    make    make
60    
61  and see sample results. You will be asked your password once (via sudo) to  and see sample results. You will be asked your password once (via sudo) to
62  install pgest.so shared library in system-wide location so that PostgreSQL  install C<pgest.so> shared library in system-wide location so that PostgreSQL
63  could access it.  could access it.
64    
65  Next, you will have to create test index. You have two options:  Next, you will have to create test index. You have two options:
66    
67  3.1. Create index using estcmd  =head2 Create index using estcmd
68    
69  This will create temporary files on disk and index them using estcmd gather  This will create temporary files on disk and index them using estcmd gather
70    
# Line 43  This will create temporary files on disk Line 72  This will create temporary files on disk
72    make index    make index
73    cd ..    cd ..
74    
75  3.2. Create index using Hyper Estraier perl bindings  B<Warning:> this method is incomplete and won't create node index needed
76    to run last examples in C<test.sql> correctly. Solution is simple: either
77  For this, you will have to install perl bindings from  symlink your newly created index to Hyper Estraier C<_node> directory or
78    create a node and re-create the index using C<estcall>.
79    http://tokuhirom.dnsalias.org/~tokuhirom/archive/hyper_estraier_wrappers-0.0.6.tar.gz  
80    =head2 Create index using Hyper Estraier perl bindings
81  If you installed bindings as documented in README file, you can issue  
82  following commands to create index about three times faster than using  Perl bindings for Hyper Estraier are available at
83  estcmd:  
84    L<http://hyperestraier.sourceforge.net/binding/>
85    
86    However, they don't support node API (yet), so you will have to use
87    my modified version which is available at
88    L<http://svn.rot13.org/> in C<hyperestraier_wrappers> repository.
89    
90    If you installed bindings as documented in README file, you can use
91    perl binding to create index about three times faster than using C<estcmd>
92    (to be fair, I must say that creation of intermediate files takes most of the time,
93    not indexing).
94    
95    However, you will first need to create node I<trivia> using Hyper Estraier's
96    administration interface at L<http://localhost:1978/masterui>. You will also
97    need user C<admin> with password C<admin> because those values are
98    hard-coded in C<indexer.pl>. If you want to use a different user or index
99    name, feel free to change the script.
100    
101    cd data    cd data
102    make perl    make perl
# Line 63  To run tests (which require that you hav Line 108  To run tests (which require that you hav
108    
109  See also included file test.sql for more examples of usage.  See also included file test.sql for more examples of usage.
110    
111  4. Who wrote this?  =head1 Usage of pgest from SQL
112    
113    The C<pgest> PostgreSQL function has two prototypes, which differ in the number of arguments, depending on usage.
114    
115            SELECT
116                    -- columns to return (defined later)
117                    id,title,size
118            FROM pgest(
119                    -- path to index OR URL to node, user-name and password
120                    -- you will need JUST ONE of following two lines, depending
121                    -- on your usage described below, for direct access
122                    '/full/path/to/casket',
123                    -- or for node API specify node URI, login, password
124                    -- and depth of search
125                    'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
126                    -- query
127                    'blade runner',
128                    -- additional attributes, use NULL or '' to disable
129                    -- multiple attributes conditions can be separated by {{!}}
130                    '@title ISTRINC blade',
131                    -- order results by
132                    '@title STRA',
133                    -- limit, use NULL or 0 to disable
134                    null,
135                    -- offset, use NULL or 0 to disable
136                    null,
137                    -- attributes to return as columns
138                    ARRAY['@id','@title','@size']
139            ) AS (
140                    -- specify names and types of returned attributes
141                    id text, title text, size text
142            );
143    
144    =head2 Accessing database directly
145    
146    If you want to access database directly (without running C<estmaster> process), first argument is full path to database file.
147    
148    Keep in mind that the C<postgres> user under which PostgreSQL is running must
149    have read permission on the Hyper Estraier database files.
150    
151    This will work a bit faster on really small indexes. However, when your
152    index grows bigger, you might consider using node API to remove overhead of
153    database opening on each query.
154    
155    B<Please note that direct access to database is deprecated.>
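
For reference, a minimal sketch of the direct-access form, derived from the combined example above by keeping only the path line. The casket path is the placeholder from that example and would need to point at a real index.

```sql
SELECT id, title, size
FROM pgest(
        -- full path to the index (placeholder taken from the example above)
        '/full/path/to/casket',
        -- query
        'blade runner',
        -- additional attribute condition
        '@title ISTRINC blade',
        -- order results by
        '@title STRA',
        -- limit and offset disabled
        NULL, NULL,
        -- attributes to return as columns
        ARRAY['@id','@title','@size']
) AS (
        id text, title text, size text
);
```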
156    
157    =head2 Using index via C<estmaster> server process
158    
159    If first argument is URL to node (like C<http://localhost:1978/node/trivia>)
160    and there are two additional parameters (user-name and password) after it,
161    C<pgest> will use node API and access index through C<estmaster> process which should be running on (local or remote) machine.
162    
163    This will remove database opening overhead, at a cost of additional network
164    traffic. However, you can have Hyper Estraier C<estmaster> process running on
165    different machine or update index while doing searches, so benefits of this
166    approach are obvious.
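
A minimal sketch of the node form, again derived from the combined example above. It assumes the sample node C<trivia> and the C<admin>/C<admin> credentials mentioned earlier; the limit and offset values are arbitrary and only illustrate paging through results.

```sql
SELECT id, title, size
FROM pgest(
        -- node URI, login, password and depth of search
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        -- query
        'blade runner',
        -- additional attribute condition
        '@title ISTRINC blade',
        -- order results by
        '@title STRA',
        -- limit: return at most 10 rows
        10,
        -- offset: skip the first 20 rows
        20,
        -- attributes to return as columns
        ARRAY['@id','@title','@size']
) AS (
        id text, title text, size text
);
```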
167    
168    =head1 Who wrote this?
169    
170  Hyper Estraier is written by Mikio Hirabayashi.  Hyper Estraier is written by Mikio Hirabayashi.
171    
172    Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.
173    
174  PostgreSQL is written by hackers calling themselves PostgreSQL Global  PostgreSQL is written by hackers calling themselves PostgreSQL Global
175  Development Group.  Development Group.
176    
