Revision 43
Sat Sep 10 22:33:36 2005 UTC by dpavlin
File size: 5162 byte(s)
few more changes to documentation

=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL

This package is essentially a PostgreSQL C function which calls the Hyper
Estraier API and returns results in a user-defined format.

=head1 Why is it written?

Aside from providing a single API to query both your RDBMS and your full-text
index (using any language that has PostgreSQL client libraries), the real
power lies in the ability to join results from the full-text index with
structured data in the RDBMS.
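
As a rough sketch of such a join, assume a hypothetical table C<movies> with
C<title> and C<year> columns stored next to the trivia index. Neither the
table nor its columns come with this package, and whether C<@title> in your
index matches your table is an assumption. The C<pgest> arguments are the
ones explained under L</Usage of pgest from SQL>.

```sql
-- Hypothetical table, NOT part of pgestraier:
-- CREATE TABLE movies (title text, year int);

SELECT m.year, e.title
FROM pgest(
        '/full/path/to/casket',   -- path to index (direct access)
        'blade runner',           -- query
        NULL,                     -- no additional attribute conditions
        '@title STRA',            -- order results by title
        NULL,                     -- no limit
        NULL,                     -- no offset
        ARRAY['@id','@title']     -- attributes to return as columns
     ) AS e (id text, title text)
JOIN movies AS m ON m.title = e.title;
```

Since the function is used like any other row source in C<FROM>, the usual
C<WHERE>, C<GROUP BY> and C<ORDER BY> clauses can be applied to the joined
rows.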

=head1 How to install

Installation should be simple. However, the following software must already
be installed before you try this function:

=over

=item *

PostgreSQL (tested with versions 7.4 and 8.0) with development libraries

=item *

Hyper Estraier (version 0.5.0 or newer)

=back

To run the tests you will also need:

=over

=item *

a working perl installation

=item *

perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<HyperEstraier>

=item *

C<trivia.list.gz> from the Internet Movie Database in the C<data/> directory

=item *

a PostgreSQL database C<test> with permissions for the current user

=item *

a Hyper Estraier node C<trivia> with permissions for the C<admin> user

=back

If you have all that, you should be able to type

 make

and see sample results. You will be asked for your password once (via C<sudo>)
to install the C<pgest.so> shared library in a system-wide location so that
PostgreSQL can access it.

Next, you will have to create a test index. You have two options:

=head2 Create index using estcmd

This will create temporary files on disk and index them using C<estcmd gather>:

 cd data
 make index
 cd ..

B<Warning:> this method is incomplete and won't create the node index needed
to run the last examples in C<test.sql> correctly. The solution is simple:
either symlink your newly created index into the Hyper Estraier C<_node>
directory or create a node and re-create the index in it using C<estcall>.

=head2 Create index using Hyper Estraier perl bindings

For this, you will have to install the perl bindings from

L<http://hyperestraier.sourceforge.net/binding/>

If you installed the bindings as documented in the README file, you can use
them to create the index about three times faster than with C<estcmd>
(to be fair, I must say that creating the intermediate files takes most of
the time, not indexing).

However, you will first need to create the node I<trivia> using Hyper
Estraier's administration interface at L<http://localhost:1978/masterui>.
You will also need a user C<admin> with password C<admin> because those
values are hard-coded in C<indexer.pl>. If you want to use a different user
or index name, feel free to change the script.

 cd data
 make perl
 cd ..

To run the tests (which require that you have C<estcmd> in your C<$PATH>) issue

 make test

See also the included file C<test.sql> for more usage examples.

=head1 Usage of pgest from SQL

The C<pgest> PostgreSQL function has two different prototypes (differing in
the number of arguments) depending on usage.

 SELECT
     -- columns to return (defined later)
     id,title,size
 FROM pgest(
     -- path to index OR URL to node, user-name and password
     -- you will need JUST ONE of following two lines, depending
     -- on your usage described below
     '/full/path/to/casket',
     'http://localhost:1978/node/trivia', 'admin', 'admin',
     -- query
     'blade runner',
     -- additional attributes, use NULL or '' to disable
     -- multiple attributes conditions can be separated by {{!}}
     '@title ISTRINC blade',
     -- order results by
     '@title STRA',
     -- limit, use NULL or 0 to disable
     null,
     -- offset, use NULL or 0 to disable
     null,
     -- attributes to return as columns
     ARRAY['@id','@title','@size']
 ) AS (
     -- specify names and types of returned attributes
     id text, title text, size text
 );

=head2 Accessing database directly

If you want to access the database directly (without running the
C<estmaster> process), the first argument is the full path to the database
file.

Keep in mind that the C<postgres> user under which PostgreSQL is running must
have read permission on the Hyper Estraier database files.

This will work a bit faster on really small indexes. However, when your
index grows bigger, you might consider using the node API to remove the
overhead of opening the database on each query.
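
Reduced to the direct-access form, the template above might look like the
following sketch. The casket path is the placeholder from the template, not a
real location, and the query and attribute values are only illustrative.

```sql
-- Direct access: the first argument is a path on the PostgreSQL server,
-- which the postgres user must be able to read.
SELECT id, title, size
FROM pgest(
    '/full/path/to/casket',          -- path to index
    'blade runner',                  -- query
    '@title ISTRINC blade',          -- additional attribute condition
    '@title STRA',                   -- order results by
    NULL,                            -- limit (disabled)
    NULL,                            -- offset (disabled)
    ARRAY['@id','@title','@size']    -- attributes to return as columns
) AS (id text, title text, size text);
```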

=head2 Using index via C<estmaster> server process

If the first argument is a URL to a node (like
C<http://localhost:1978/node/trivia>) and it is followed by two additional
parameters (user-name and password), C<pgest> will use the node API and
access the index through an C<estmaster> process, which should be running on
the local or a remote machine.

This removes the database opening overhead, at the cost of additional network
traffic. In return, you can run the Hyper Estraier C<estmaster> process on a
different machine and update the index while searches are running.
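
The node form of the same call could be sketched as below. The node URL and
the C<admin>/C<admin> credentials are the ones used elsewhere in this
document for the test setup. The limit of 10 is only an illustrative value.

```sql
-- Node API access: URL, user-name and password replace the index path.
SELECT id, title, size
FROM pgest(
    'http://localhost:1978/node/trivia',   -- URL to node
    'admin',                               -- user-name
    'admin',                               -- password
    'blade runner',                        -- query
    '@title ISTRINC blade',                -- additional attribute condition
    '@title STRA',                         -- order results by
    10,                                    -- limit: at most 10 rows
    0,                                     -- offset (0 disables it)
    ARRAY['@id','@title','@size']          -- attributes to return as columns
) AS (id text, title text, size text);
```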

B<Note:> Currently, there is no support for searching more than one index
(depth of search is always 0). This will be fixed.

=head1 Who wrote this?

Hyper Estraier is written by Mikio Hirabayashi.

Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.

PostgreSQL is written by hackers calling themselves the PostgreSQL Global
Development Group.

This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.
