Annotation of /trunk/README.pod

Revision 42
Sat Sep 10 21:48:46 2005 UTC by dpavlin
File size: 4966 byte(s)
updated documentation to reflect two different ways to call pgest with example

=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL

This package is essentially a PostgreSQL C function which calls the Hyper
Estraier API and returns results in a user-defined format.

=head1 Why is it written?

Aside from providing a single API to query your RDBMS and full-text index
(using any language that has PostgreSQL client libraries), the real power is
hidden in the ability to join results from the full-text index with
structured data in the RDBMS.
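
As a minimal sketch of such a join, assume a hypothetical table C<movies>
with columns C<title> and C<year> (it is not part of this package) and an
index whose C<@title> attribute holds the same titles. The arguments follow
the order described in the section on usage of C<pgest> from SQL:

    -- hypothetical table "movies"; only pgest() comes from this package
    SELECT m.title, m.year, hits.size
    FROM pgest(
        '/full/path/to/casket',     -- path to index
        'blade runner',             -- query
        NULL,                       -- no additional attribute conditions
        '@title STRA',              -- order results by title
        NULL,                       -- no limit
        NULL,                       -- no offset
        ARRAY['@title','@size']     -- attributes to return as columns
    ) AS hits (title text, size text)
    JOIN movies m ON m.title = hits.title;

Full-text hits and relational rows are combined in one statement, so the
client needs no glue code to merge the two result sets.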

=head1 How to install

Installation should be simple. However, you will need the following
software installed before you try this function:

=over

=item *

PostgreSQL (tested with version 7.4.8) with development libraries

=item *

Hyper Estraier (tested with versions 0.3.9 and 0.3.10)

=back

To run the tests you will also need:

=over

=item *

a working perl installation

=item *

perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<HyperEstraier>

=item *

C<trivia.list.gz> from the Internet Movie Database in the C<data/> directory

=item *

a database named "test" with permissions for the current user

=back

If you have all that, you should be able to type

    make

and see sample results. You will be asked for your password once (via sudo)
to install the C<pgest.so> shared library in a system-wide location so that
PostgreSQL can access it.

Next, you will have to create a test index. You have two options:

=head2 Create index using estcmd

This will create temporary files on disk and index them using
C<estcmd gather>:

    cd data
    make index
    cd ..

B<Warning:> this method is incomplete and won't create the node index needed
to run the last examples in C<test.sql> correctly. The solution is simple:
either symlink your newly created index into the Hyper Estraier C<_node>
directory, or create a node and re-create the index using C<estcall>.

=head2 Create index using Hyper Estraier perl bindings

For this, you will have to install the perl bindings from

L<http://hyperestraier.sourceforge.net/binding/>

If you installed the bindings as documented in their README file, you can
use them to create the index about three times faster. However, you will
first need to create the node I<trivia> using Hyper Estraier's administration
interface at L<http://localhost:1978/masterui>. You will also need a user
C<admin> with password C<admin>, because those values are hard-coded in
C<indexer.pl>. If you want to use a different user or index name, feel
free to change the script.

    cd data
    make perl
    cd ..

To run the tests (which require that you have C<estcmd> in your C<$PATH>),
issue

    make test

See also the included file C<test.sql> for more usage examples.

=head1 Usage of pgest from SQL

The C<pgest> PostgreSQL function has two different prototypes (differing in
the number of arguments), depending on usage.

    SELECT
        -- columns to return (defined later)
        id,title,size
    FROM pgest(
        -- path to index OR URL to node, user-name and password
        -- you will need JUST ONE of following two lines, depending
        -- on your usage described below
        '/full/path/to/casket',
        'http://localhost:1978/node/trivia', 'admin', 'admin',
        -- query
        'blade runner',
        -- additional attributes, use NULL or '' to disable
        -- multiple attributes conditions can be separated by {{!}}
        '@title ISTRINC blade',
        -- order results by
        '@title STRA',
        -- limit, use NULL or 0 to disable
        null,
        -- offset, use NULL or 0 to disable
        null,
        -- attributes to return as columns
        ARRAY['@id','@title','@size']
    ) AS (
        -- specify names and types of returned attributes
        id text, title text, size text
    );

=head2 Accessing database directly

If you want to access the database directly (without running the
C<estmaster> process), the first argument is the full path to the database
file.

Keep in mind that the C<postgres> user under which PostgreSQL is running
must have read permission on the Hyper Estraier database files.

This will work a bit faster on really small indexes. However, when your
index grows bigger, you might consider using the node API to remove the
overhead of opening the database on each query.
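
A sketch of the direct-access form (seven arguments) follows. The path is a
placeholder, and the second attribute condition assumes that the index
stores a numeric C<@size> attribute and uses Hyper Estraier's C<NUMGT>
operator; adjust both to your own index:

    SELECT id, title, size
    FROM pgest(
        '/full/path/to/casket',                       -- path to index (placeholder)
        'blade runner',                               -- query
        '@title ISTRINC blade{{!}}@size NUMGT 1000',  -- two conditions (assumed attributes)
        '@title STRA',                                -- order results by title
        10,                                           -- limit: first ten hits
        0,                                            -- offset: disabled
        ARRAY['@id','@title','@size']                 -- attributes to return
    ) AS (id text, title text, size text);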

=head2 Using index via C<estmaster> server process

If the first argument is a URL to a node (like
C<http://localhost:1978/node/trivia>) and it is followed by two additional
parameters (user-name and password), C<pgest> will use the node API and
access the index through an C<estmaster> process, which must be running on
the local or a remote machine.

This removes the overhead of opening the database, at the cost of additional
network traffic. In return, you can run the Hyper Estraier C<estmaster>
process on a different machine, or update the index while doing searches.
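
The node form (nine arguments) can be sketched as below. The node URL and
the C<admin>/C<admin> credentials are the sample values used elsewhere in
this document and are assumptions about your setup:

    SELECT id, title
    FROM pgest(
        'http://localhost:1978/node/trivia',  -- URL to node (sample value)
        'admin',                              -- user-name (sample value)
        'admin',                              -- password (sample value)
        'blade runner',                       -- query
        NULL,                                 -- no additional attribute conditions
        '@title STRA',                        -- order results by title
        NULL,                                 -- no limit
        NULL,                                 -- no offset
        ARRAY['@id','@title']                 -- attributes to return
    ) AS (id text, title text);

Only the leading arguments differ from the direct-access form, so switching
between the two means replacing the path with the URL, user-name and
password.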

B<Note:> Currently, there is no support for searching more than one index
(depth of search is always 0). This will be fixed.

=head1 Who wrote this?

Hyper Estraier is written by Mikio Hirabayashi.

Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.

PostgreSQL is written by hackers calling themselves the PostgreSQL Global
Development Group.

This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.
