Revision 54
Thu May 11 10:19:46 2006 UTC (17 years, 11 months ago) by dpavlin
File size: 4448 byte(s)
updated documentation and example for convert
=head1 pgestraier - search Hyper Estraier indexes from PostgreSQL

This package is essentially a PostgreSQL C function which calls the Hyper
Estraier API and returns results in a user-defined format.

=head1 Why was it written?

Aside from providing a single API to query both your RDBMS and your full-text
index (using any language that has PostgreSQL client libraries), the real
power lies in the ability to join results from the full-text index with
structured data in the RDBMS.

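The join mentioned above might look like the following sketch. The C<movies>
table and its C<title> and C<year> columns are hypothetical and exist only for
illustration; the C<pgest> arguments follow the usage section of this document.

    -- hypothetical table: movies(title text, year integer)
    SELECT m.title, m.year, est.size
    FROM movies AS m
    JOIN pgest(
        -- node URI, login, password and depth of search
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        -- query
        'blade runner',
        -- no additional attribute conditions
        NULL,
        -- order results by
        '@title STRA',
        -- no limit, no offset
        NULL, NULL,
        -- attributes to return as columns
        ARRAY['@title','@size']
    ) AS est (title text, size text)
        ON est.title = m.title;
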
=head1 How to install

Installation should be simple. However, the following software must already
be installed before you try this function:

=over

=item *

PostgreSQL (tested with versions 7.4 and 8.0) with development libraries

=item *

Hyper Estraier (tested with various versions; 1.2.4 or newer is recommended)

=back

To run the tests you will also need:

=over

=item *

a working perl installation

=item *

the perl modules C<DBI>, C<DBD::Pg>, C<Test::More> and optionally C<Search::Estraier>

=item *

C<trivia.list.gz> from the Internet Movie Database in the C<data/> directory

=item *

a PostgreSQL database C<test> with permissions for the current user

=item *

a Hyper Estraier node C<trivia> with permissions for the C<admin> user

=back

If you have all that, you should be able to type

    make

and see sample results. You will be asked for your password once (via sudo) to
install the C<pgest.so> shared library in a system-wide location so that
PostgreSQL can access it.

=head2 Create sample index using Hyper Estraier perl bindings

Perl bindings for Hyper Estraier are available on CPAN:

L<http://search.cpan.org/~dpavlin/Search-Estraier/>

After installing C<Search::Estraier> you can create the index using the
following commands:

    cd data
    make index
    cd ..

To run the tests (which require C<estcmd> in your C<$PATH>), issue

    make test

See also the included file C<test.sql> for more usage examples.

=head1 Usage of pgest from SQL

The C<pgest> PostgreSQL function tries to mimic the usage of normal database
tables (with support for attribute filtering, limit and offset) in the
following way:

    SELECT
        -- columns to return (defined later)
        id,title,size
    FROM pgest(
        -- node URI, login, password and depth of search
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        -- query
        'blade runner',
        -- additional attributes, use NULL or '' to disable
        -- multiple attribute conditions can be separated by {{!}}
        '@title ISTRINC blade',
        -- order results by
        '@title STRA',
        -- limit, use NULL or 0 to disable
        null,
        -- offset, use NULL or 0 to disable
        null,
        -- attributes to return as columns
        ARRAY['@id','@title','@size']
    ) AS (
        -- specify names and types of returned attributes
        id text, title text, size text
    );

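As a further sketch, attribute conditions can be combined with limit and
offset to page through results. The second condition, the ordering and the
page size are illustrative assumptions (C<NUMGT> and C<NUMD> are Hyper
Estraier's numeric "greater than" and "descending" operators), not values
taken from this distribution:

    -- third page of 10 results, largest documents first
    SELECT id, title, size
    FROM pgest(
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        'blade runner',
        -- two attribute conditions separated by {{!}}
        '@title ISTRINC blade{{!}}@size NUMGT 100',
        -- order results by
        '@size NUMD',
        -- limit
        10,
        -- offset
        20,
        ARRAY['@id','@title','@size']
    ) AS (
        id text, title text, size text
    );
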
Note that Hyper Estraier uses UTF-8 encoding, while your PostgreSQL
installation might use a different encoding. To fix that, use the C<convert>
function in PostgreSQL to convert between encodings.

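One possible way to apply C<convert> is sketched below. It assumes the
three-argument form C<convert(text, source_encoding, destination_encoding)>
of the PostgreSQL versions this package was tested with (7.4 and 8.0), where
UTF-8 is named C<UNICODE>, and a database encoded in C<LATIN2>, which is only
an example. Newer PostgreSQL releases changed the signature of C<convert>, so
check the documentation for your version.

    SELECT
        -- result converted from UTF-8 to the database encoding
        convert(title, 'UNICODE', 'LATIN2') AS title
    FROM pgest(
        'http://localhost:1978/node/trivia', 'admin', 'admin', 42,
        -- query converted from the database encoding to UTF-8
        convert('blade runner', 'LATIN2', 'UNICODE'),
        NULL,
        '@title STRA',
        NULL, NULL,
        ARRAY['@title']
    ) AS (
        title text
    );
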
=head2 Using index via C<estmaster> server process

This is the default and recommended way to use C<pgest>. In this case,
C<pgest> uses the node API and accesses the index through an C<estmaster>
process, which must be running on a local or remote machine.

This removes the overhead of opening the database on each query, at the cost
of a small amount of additional network traffic. It also lets you run the
Hyper Estraier C<estmaster> process on a different machine and update the
index while searches are in progress.

=head2 Accessing database directly

B<Please note that direct access to the database is deprecated.> As such, it
is not shown in the example. It is kept only for backward compatibility and
will probably be removed in future versions of C<pgest>.

If you want to access the database directly (without running an C<estmaster>
process), replace the node URI, login, password and depth with the full path
to the database file.

Keep in mind that the C<postgres> user under which PostgreSQL is running must
have read permission on the Hyper Estraier database files.

This works a bit faster on really small indexes. However, as your index
grows, consider using the node API to avoid the overhead of opening the
database on each query.

=head1 Who wrote this?

Hyper Estraier is written by Mikio Hirabayashi.

Perl bindings for Hyper Estraier are written by MATSUNO Tokuhiro.

PostgreSQL is written by hackers calling themselves the PostgreSQL Global
Development Group.

This small C function is written by Dobrica Pavlinusic, dpavlin@rot13.org.
