This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!

Revision 59 - (show annotations)
Thu May 25 18:46:49 2006 UTC (17 years, 11 months ago) by dpavlin
File size: 4578 byte(s)
begin tutorial
=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL

This package is essentially a PostgreSQL C function which calls the Hyper
Estraier API and returns results in a user-defined format.

=head1 Why is it written?

Aside from providing a single API to query both your RDBMS and your full-text
index (using any language that has PostgreSQL client libraries), the real
power lies in the ability to join results from the full-text index with
structured data in the RDBMS.

For a simple real-life example which addresses the problem
I<where like '%foo%' is slow>
see L<Tutorial>.

=head1 How to install

Installation should be simple. However, the following software must already
be installed before you try this function:

=over

=item *

PostgreSQL (tested with versions 7.4 and 8.0) with development libraries

=item *

Hyper Estraier (tested with various versions; 1.2.4 or newer is recommended)

=back

To run the tests you will also need:

=over

=item *

a working perl installation

=item *

perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<Search::Estraier>

=item *

C<trivia.list.gz> from the Internet Movie Database in the C<data/> directory

=item *

a PostgreSQL database C<test> with permissions for the current user

=item *

Hyper Estraier C<estmaster> running with permissions for the C<admin> user
to create the C<trivia> node.

=back

If you have all that, you should be able to type

 make

and see sample results. You will be asked for your password once (via sudo) to
install the C<pgest.so> shared library in a system-wide location so that
PostgreSQL can access it.

=head2 Create sample index using Hyper Estraier perl bindings

Perl bindings for Hyper Estraier are available at CPAN:

L<http://search.cpan.org/~dpavlin/Search-Estraier/>

After installing C<Search::Estraier> you can create the index using the
following commands:

 cd data
 make index
 cd ..

To run the tests (which require that you have C<estcmd> in your C<$PATH>), issue

 make test

See also the included file C<test.sql> for more usage examples.

=head1 Usage of pgest from SQL

The C<pgest> PostgreSQL function tries to mimic the usage of normal database
tables (with support for attribute filtering, limit and offset) in the
following way:

 SELECT
     -- columns to return (defined later)
     id, title, size
 FROM pgest(
     -- node URI, login, password and depth of search
     'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
     -- query
     'blade runner',
     -- additional attributes, use NULL or '' to disable
     -- multiple attribute conditions can be separated by {{!}}
     '@title ISTRINC blade',
     -- order results by
     '@title STRA',
     -- limit, use NULL or 0 to disable
     null,
     -- offset, use NULL or 0 to disable
     null,
     -- attributes to return as columns
     ARRAY['@id','@title','@size']
 ) AS (
     -- specify names and types of returned attributes
     id text, title text, size text
 );

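Because C<pgest> is used in the C<FROM> clause like any other row source, its
results can be joined with ordinary tables. The following is only a sketch:
the C<movies> table and its C<title> and C<year> columns are hypothetical and
not part of this package; only the C<pgest> arguments follow the usage
described in this section.

 SELECT
     m.year, hits.title
 FROM pgest(
     'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
     'blade runner',
     -- no attribute filter
     NULL,
     '@title STRA',
     -- no limit, no offset
     NULL, NULL,
     ARRAY['@id','@title']
 ) AS hits (
     id text, title text
 )
 -- movies is a hypothetical table holding the structured data
 JOIN movies m ON m.title = hits.title
 WHERE m.year < 1990;

In such a query the full-text search is answered by Hyper Estraier, while the
join and the C<year> filter are evaluated by PostgreSQL.
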
You should note that Hyper Estraier uses UTF-8 encoding, while your
PostgreSQL installation might use a different encoding. To fix that, use the
C<convert> function in PostgreSQL to convert between encodings.

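As a rough sketch of such a conversion, assume a database encoded in
C<LATIN2> (an assumption made only for this example; substitute your own
encoding) and a PostgreSQL release from the 7.4 or 8.0 series, where
C<convert> accepts a text value followed by the source and destination
encoding names. Later PostgreSQL releases changed this function, so check the
documentation for your version.

 SELECT
     -- results come back as UTF-8, convert them to the database encoding
     id, convert(title, 'UNICODE', 'LATIN2') AS title
 FROM pgest(
     'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
     -- assumption: the query text is passed on unchanged,
     -- so it is converted to UTF-8 first
     convert('blade runner', 'LATIN2', 'UNICODE'),
     NULL, '@title STRA', NULL, NULL,
     ARRAY['@id','@title']
 ) AS (
     id text, title text
 );
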
=head2 Using index via C<estmaster> server process

This is the default and recommended way to use C<pgest> functionality. In this
case, C<pgest> uses the node API and accesses the index through an
C<estmaster> process which should be running on a local or remote machine.

This removes the overhead of opening the database on each query, at the cost
of a small amount of additional network traffic. It also lets you run the
Hyper Estraier C<estmaster> process on a different machine, or update the
index while searches are running.

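A query against an C<estmaster> running on another machine differs only in
the node URI. In the sketch below, C<search.example.com> is a placeholder
host name, and the second attribute condition (C<@size NUMGE 1000>) is
included only to illustrate the C<{{!}}> separator together with limit and
offset.

 SELECT
     id, title
 FROM pgest(
     -- search.example.com is a placeholder for the machine running estmaster
     'http://search.example.com:1978/node/trivia', 'admin', 'admin', 42,
     'blade runner',
     -- two attribute conditions separated by {{!}}
     '@title ISTRINC blade{{!}}@size NUMGE 1000',
     '@title STRA',
     -- limit 10, offset 20
     10, 20,
     ARRAY['@id','@title']
 ) AS (
     id text, title text
 );
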
=head2 Accessing database directly

B<Please note that direct access to the database is deprecated.> As such, it
is not shown in the example above; it is kept only for backward compatibility
and will probably be removed in future versions of C<pgest>.

If you want to access the database directly (without running an C<estmaster>
process), you have to replace the node URI, login, password and depth with the
full path to the database file.

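Following that description, a direct-access call would presumably look like
the sketch below. The path C</path/to/index> is only a placeholder, and the
remaining arguments are assumed to keep the same order as in the node API
form.

 SELECT
     id, title, size
 FROM pgest(
     -- full path to the index; this path is only a placeholder
     '/path/to/index',
     'blade runner',
     '@title ISTRINC blade',
     '@title STRA',
     NULL, NULL,
     ARRAY['@id','@title','@size']
 ) AS (
     id text, title text, size text
 );
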
Keep in mind that the C<postgres> user under which PostgreSQL is running must
have read permission on the Hyper Estraier database files.

Direct access works a bit faster on really small indexes. However, as your
index grows, you should consider using the node API to remove the overhead of
opening the database on each query.

=head1 Who wrote this?

Hyper Estraier is written by Mikio Hirabayashi.

Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.

PostgreSQL is written by hackers calling themselves the PostgreSQL Global
Development Group.

This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.
