Contents of /trunk/README.pod

Revision 42 - (show annotations)
Sat Sep 10 21:48:46 2005 UTC (18 years, 7 months ago) by dpavlin
File size: 4966 byte(s)
updated documentation to reflect two different ways to call pgest with example

=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL

This package is essentially a PostgreSQL C function which calls the Hyper
Estraier API and returns results in a user-defined format.

=head1 Why is it written?

Aside from providing a single API to query your RDBMS and full-text index
(using any language that has PostgreSQL client libraries), the real power
lies in the ability to join results from the full-text index with structured
data in the RDBMS.

=head1 How to install

Installation should be simple. However, the following software must
already be installed before you try this function:

=over

=item *

PostgreSQL (tested with version 7.4.8) with development libraries

=item *

Hyper Estraier (tested with versions 0.3.9 and 0.3.10)

=back

To run the tests you will also need:

=over

=item *

a working perl installation

=item *

the perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<HyperEstraier>

=item *

C<trivia.list.gz> from the Internet Movie Database in the C<data/> directory

=item *

a database named C<test> with permissions for the current user

=back

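The prerequisites listed above can be checked before running C<make>. The
following is only a minimal sketch in Python, not part of this package. The
executable names (C<pg_config>, C<estcmd>, C<perl>, C<sudo>) are assumptions
about a typical installation, and the sketch does not check the perl modules
or the C<test> database.

```python
import shutil
from pathlib import Path

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ("pg_config", "estcmd", "perl", "sudo")
# Location of the IMDb trivia file, relative to the source directory.
DATA_FILE = Path("data") / "trivia.list.gz"


def missing_prerequisites(tools=REQUIRED_TOOLS, data_file=DATA_FILE,
                          which=shutil.which):
    """Return human-readable descriptions of missing prerequisites."""
    missing = [
        f"executable '{name}' not found in PATH"
        for name in tools
        if which(name) is None
    ]
    if not Path(data_file).is_file():
        missing.append(f"data file '{data_file}' not found")
    return missing


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

An empty result only means that the tools and the data file were found; it
says nothing about their versions.
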
If you have all that, you should be able to type

    make

and see sample results. You will be asked for your password once (via sudo)
to install the C<pgest.so> shared library in a system-wide location so that
PostgreSQL can access it.

Next, you will have to create a test index. You have two options:

=head2 Create index using estcmd

This will create temporary files on disk and index them using C<estcmd gather>:

    cd data
    make index
    cd ..

B<Warning:> this method is incomplete and won't create the node index needed
to run the last examples in C<test.sql> correctly. The solution is simple:
either symlink your newly created index into the Hyper Estraier C<_node>
directory, or create a node and re-create the index using C<estcall>.

=head2 Create index using Hyper Estraier perl bindings

For this, you will have to install the perl bindings from

L<http://hyperestraier.sourceforge.net/binding/>

If you installed the bindings as documented in their README file, you can use
them to create the index about three times faster. However, you will
first need to create the node I<trivia> using Hyper Estraier's administration
interface at L<http://localhost:1978/masterui>. You will also need a user
C<admin> with password C<admin>, because those values are hard-coded in
C<indexer.pl>. If you want to use a different user or index name, feel
free to change the script.

    cd data
    make perl
    cd ..

To run the tests (which require C<estcmd> in your C<$PATH>), issue

    make test

See also the included file C<test.sql> for more usage examples.

=head1 Usage of pgest from SQL

The C<pgest> PostgreSQL function has two different prototypes (differing in
the number of arguments), depending on usage.

    SELECT
        -- columns to return (defined later)
        id,title,size
    FROM pgest(
        -- path to index OR URL to node, user-name and password
        -- you will need JUST ONE of following two lines, depending
        -- on your usage described below
        '/full/path/to/casket',
        'http://localhost:1978/node/trivia', 'admin', 'admin',
        -- query
        'blade runner',
        -- additional attributes, use NULL or '' to disable
        -- multiple attributes conditions can be separated by {{!}}
        '@title ISTRINC blade',
        -- order results by
        '@title STRA',
        -- limit, use NULL or 0 to disable
        null,
        -- offset, use NULL or 0 to disable
        null,
        -- attributes to return as columns
        ARRAY['@id','@title','@size']
    ) AS (
        -- specify names and types of returned attributes
        id text, title text, size text
    );

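Because the two prototypes differ only in their leading arguments, client
code can select one based on the form of the first argument. The following
is a hedged sketch in Python, not part of this package: the helper name
C<build_pgest_query> is hypothetical, and it assumes a DB-API driver that
uses C<%s> placeholders and adapts Python lists to SQL arrays (as psycopg2
does). It treats anything starting with C<http://> as a node URL, and it
assumes every returned attribute should become a C<text> column named after
the attribute without its leading C<@>.

```python
import re

# Column names are interpolated into SQL, so restrict them to plain identifiers.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_pgest_query(index, query, attr=None, order=None, limit=None,
                      offset=None, columns=("@id", "@title", "@size"),
                      user=None, password=None):
    """Return (sql, params) for a pgest() call.

    `index` is either a full path to an index (7-argument prototype) or a
    node URL, which also requires `user` and `password` (9-argument one).
    """
    if index.startswith("http://"):
        if user is None or password is None:
            raise ValueError("a node URL requires user and password")
        index_args = [index, user, password]
    else:
        index_args = [index]

    # Argument order follows the SQL example above.
    params = index_args + [query, attr, order, limit, offset, list(columns)]
    placeholders = ", ".join(["%s"] * len(params))

    col_defs = []
    for name in columns:
        col = name.lstrip("@")
        if not _IDENT.fullmatch(col):
            raise ValueError(f"cannot derive a column name from {name!r}")
        col_defs.append(f"{col} text")

    sql = f"SELECT * FROM pgest({placeholders}) AS ({', '.join(col_defs)})"
    return sql, params
```

With such a driver, the result could be passed on as
C<cursor.execute(sql, params)>. Values travel as bound parameters, and only
the validated column names are interpolated into the statement text.
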
=head2 Accessing database directly

If you want to access the database directly (without running the C<estmaster>
process), the first argument is the full path to the database file.

Keep in mind that the C<postgres> user under which PostgreSQL is running must
have read permission on the Hyper Estraier database files.

This will work a bit faster on really small indexes. However, as your
index grows, you might consider using the node API to remove the overhead of
opening the database on each query.

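The read-permission requirement can be verified ahead of time. The sketch
below is an illustration in Python, not part of this package, and the
function name C<unreadable_paths> is hypothetical. It reports which paths
the current user cannot read, so it is only meaningful when run as the
C<postgres> user. It accepts either a single file or a directory of index
files, since the on-disk layout is an assumption here.

```python
import os


def unreadable_paths(casket):
    """Return the paths at or below `casket` that the current user cannot read."""
    if not os.path.exists(casket):
        return [casket]
    if os.path.isfile(casket):
        return [] if os.access(casket, os.R_OK) else [casket]

    problems = []
    # Directories need both read and search (execute) permission.
    if not os.access(casket, os.R_OK | os.X_OK):
        problems.append(casket)
    for root, dirs, files in os.walk(casket):
        for name in dirs:
            path = os.path.join(root, name)
            if not os.access(path, os.R_OK | os.X_OK):
                problems.append(path)
        for name in files:
            path = os.path.join(root, name)
            if not os.access(path, os.R_OK):
                problems.append(path)
    return problems
```

An empty list means every file that was found is readable; a missing path is
reported as unreadable.
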
=head2 Using index via C<estmaster> server process

If the first argument is a URL to a node (like C<http://localhost:1978/node/trivia>)
and it is followed by two additional parameters (user-name and password),
C<pgest> will use the node API and access the index through the C<estmaster>
process, which should be running on the local or a remote machine.

This removes the database opening overhead, at the cost of additional network
traffic. However, you can run the Hyper Estraier C<estmaster> process on a
different machine, or update the index while doing searches, so the benefits
of this approach are obvious.

B<Note:> Currently, there is no support for searching more than one index
(depth of search is always 0). This will be fixed.

=head1 Who wrote this?

Hyper Estraier is written by Mikio Hirabayashi.

The perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.

PostgreSQL is written by hackers calling themselves the PostgreSQL Global
Development Group.

This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.
