Contents of /trunk/README.pod

Revision 49 - (show annotations)
Sat Oct 29 18:54:40 2005 UTC (18 years, 6 months ago) by dpavlin
File size: 5363 byte(s)
added depth to node API version of pgest, note that you have to use modified
perl wrapper with node API

=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL

This package is essentially a PostgreSQL C function that calls the Hyper
Estraier API and returns results in a user-defined format.

=head1 Why is it written?

Aside from providing a single API to query both your RDBMS and your full-text
index (using any language that has PostgreSQL client libraries), the real
power lies in the ability to join results from the full-text index with
structured data in the RDBMS.
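
As a rough sketch of such a join, the query below matches full-text hits against an ordinary table. The table C<movie_ratings> and its columns are hypothetical, and the casket path is a placeholder; the argument order follows the direct-access form described in the usage section.

```sql
-- Hypothetical table, used only for illustration:
--   CREATE TABLE movie_ratings (title text PRIMARY KEY, rating numeric);
SELECT hits.title, r.rating
FROM pgest(
        '/full/path/to/casket',      -- placeholder path to the index
        'blade runner',              -- full-text query
        NULL,                        -- no additional attribute conditions
        '@title STRA',               -- order results by title
        NULL,                        -- no limit
        NULL,                        -- no offset
        ARRAY['@id', '@title']       -- attributes returned as columns
    ) AS hits (id text, title text)
JOIN movie_ratings r ON r.title = hits.title
ORDER BY r.rating DESC;
```

Because C<pgest> returns a set of records, the column definition list after C<AS> is what lets PostgreSQL treat the hits like any other table in the join.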

=head1 How to install

Installation should be simple. However, the following software must already
be installed before you try this function:

=over

=item *

PostgreSQL (tested with versions 7.4 and 8.0) with development libraries

=item *

Hyper Estraier (tested with 0.5.0-1.0.0+; versions newer than 0.9.6 are
recommended)

=back

To run the tests you will also need:

=over

=item *

a working Perl installation

=item *

Perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and, optionally, C<HyperEstraier>

=item *

C<trivia.list.gz> from the Internet Movie Database in the C<data/> directory

=item *

a PostgreSQL database C<test> with permissions for the current user

=item *

a Hyper Estraier node C<trivia> with permissions for the C<admin> user

=back

If you have all that, you should be able to type

    make

and see sample results. You will be asked for your password once (via sudo)
to install the C<pgest.so> shared library in a system-wide location so that
PostgreSQL can access it.

Next, you will have to create a test index. You have two options:

=head2 Create index using estcmd

This will create temporary files on disk and index them using C<estcmd gather>:

    cd data
    make index
    cd ..

B<Warning:> this method is incomplete and won't create the node index needed
to run the last examples in C<test.sql> correctly. The solution is simple:
either symlink your newly created index into the Hyper Estraier C<_node>
directory, or create a node and re-create the index in it using C<estcall>.

=head2 Create index using Hyper Estraier Perl bindings

Perl bindings for Hyper Estraier are available at

L<http://hyperestraier.sourceforge.net/binding/>

However, they don't support the node API (yet), so you will have to use
my modified version, which is available at
L<http://svn.rot13.org/> in the C<hyperestraier_wrappers> repository.

If you installed the bindings as documented in the README file, you can use
the Perl bindings to create the index about three times faster than with
C<estcmd> (to be fair, creating the intermediate files takes most of the
time, not indexing).

However, you will first need to create the node I<trivia> using Hyper Estraier's
administration interface at L<http://localhost:1978/masterui>. You will also
need a user C<admin> with password C<admin> because those values are
hard-coded in C<indexer.pl>. If you want to use a different user or index
name, feel free to change the script.

    cd data
    make perl
    cd ..

To run the tests (which require C<estcmd> in your C<$PATH>), issue

    make test

See also the included file C<test.sql> for more usage examples.

=head1 Usage of pgest from SQL

The C<pgest> PostgreSQL function has two prototypes, which differ in the
number of arguments, depending on usage.

    SELECT
        -- columns to return (defined later)
        id, title, size
    FROM pgest(
        -- path to index OR URL to node, user name and password;
        -- you need JUST ONE of the following two forms, depending
        -- on the usage described below. For direct access:
        '/full/path/to/casket',
        -- or, for the node API, specify node URI, login, password
        -- and depth of search
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        -- query
        'blade runner',
        -- additional attributes, use NULL or '' to disable
        -- multiple attribute conditions can be separated by {{!}}
        '@title ISTRINC blade',
        -- order results by
        '@title STRA',
        -- limit, use NULL or 0 to disable
        null,
        -- offset, use NULL or 0 to disable
        null,
        -- attributes to return as columns
        ARRAY['@id','@title','@size']
    ) AS (
        -- specify names and types of returned attributes
        id text, title text, size text
    );

=head2 Accessing database directly

If you want to access the database directly (without running the C<estmaster>
process), the first argument is the full path to the database file.

Keep in mind that the C<postgres> user under which PostgreSQL is running must
have read permission on the Hyper Estraier database files.

This will work a bit faster on really small indexes. However, as your
index grows, you might consider using the node API to remove the overhead of
opening the database on each query.
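
A concrete direct-access call might look like the sketch below. The casket path is a placeholder, and the second attribute condition (C<@size NUMGE 1000>) and the C<@size NUMD> ordering are assumptions chosen to illustrate the C<{{!}}> separator and paging with limit and offset; adjust them to the attributes your index actually stores.

```sql
-- Direct access: seven arguments, the first being the path to the index.
SELECT id, title, size
FROM pgest(
        '/full/path/to/casket',                       -- placeholder index path
        'blade runner',                               -- full-text query
        '@title ISTRINC blade{{!}}@size NUMGE 1000',  -- two attribute conditions
        '@size NUMD',                                 -- order by size, descending
        10,                                           -- limit: at most 10 rows
        20,                                           -- offset: skip the first 20 hits
        ARRAY['@id', '@title', '@size']               -- attributes returned as columns
    ) AS (id text, title text, size text);
```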

=head2 Using index via C<estmaster> server process

If the first argument is a URL to a node (like C<http://localhost:1978/node/trivia>)
followed by additional connection parameters (user name and password, plus
the search depth shown in the example above), C<pgest> will use the node API
and access the index through an C<estmaster> process, which should be running
on a local or remote machine.

This removes the database opening overhead, at the cost of additional network
traffic. However, you can run the Hyper Estraier C<estmaster> process on a
different machine or update the index while searches are running, so the
benefits of this approach are obvious.
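
For comparison, a node API call might look like the following sketch. It reuses the node URL and the C<admin>/C<admin> credentials from the installation notes; the depth value of 0 is an assumption, meant to keep the search on this node only.

```sql
-- Node API: ten arguments, starting with node URL, login, password and depth.
SELECT id, title
FROM pgest(
        'http://localhost:1978/node/trivia',  -- node URL served by estmaster
        'admin', 'admin',                     -- user name and password
        0,                                    -- search depth (assumed: this node only)
        'blade runner',                       -- full-text query
        NULL,                                 -- no additional attribute conditions
        '@title STRA',                        -- order results by title
        10,                                   -- limit: at most 10 rows
        NULL,                                 -- no offset
        ARRAY['@id', '@title']                -- attributes returned as columns
    ) AS (id text, title text);
```

Only the leading connection arguments differ from the direct-access form; the query, conditions, ordering, paging and attribute list stay the same.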

=head1 Who wrote this?

Hyper Estraier is written by Mikio Hirabayashi.

Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.

PostgreSQL is written by hackers calling themselves the PostgreSQL Global
Development Group.

This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.
