1 |
Object Server (Virtual Data Warehouse) |
2 |
====================================== |
3 |
|
4 |
I. What is it? |
5 |
|
6 |
Object Server is a sql parser with perl embedded, that maps virtual table |
7 |
column names onto perl subroutines, and runs the subs needed based on the |
8 |
columns in the "select" and "where" clauses of the query, then returns the |
9 |
data after filtering it through the conditions in the "where" clause. |
10 |
|
11 |
The column-to-subroutine map is in obj_srvr.tables, a file which obj_srvr.pm |
12 |
reads at "embed time" (when the parser is started). (The parser embeds |
13 |
obj_srvr.pl, which is a wrapper for obj_srvr.pm.) |
14 |
|
15 |
The data model relies on various Banner SIS related perl modules in |
16 |
woup:[wou_sis_mods.com] |
17 |
|
18 |
WOU_Admit.pm |
19 |
WOU_Person.pm |
20 |
WOU_SIS_Util.pm |
21 |
WOU_Student.pm |
22 |
WOU_Util.pm |
23 |
|
24 |
However, obj_srvr.pm can work with ANY data model. All you need is the |
25 |
map in obj_srvr.tables, and the subs to return the data. (* obj_srvr.pm |
26 |
needs subs in WOU_Util.pm, regardless of data model used. *) |
27 |
|
28 |
Conventions of the Data Model: |
29 |
|
30 |
1. "from" clause can only have one table. |
31 |
|
32 |
You can build whatever joins you need with wrapper subs, see |
33 |
stu_schedule "table" for example. |
34 |
|
35 |
2. Driver sub (the sub associated with the virtual table in %table_objs) |
36 |
returns a ref to an array of hash refs. The array is like a table, |
37 |
and the hashes are like records. The data looks like this: |
38 |
|
39 |
term_students = [ |
40 |
{ "pidm" => 12345, |
41 |
"term" => "200203", |
42 |
"level" => "UG", |
43 |
"confidential" => "N", |
44 |
"dead" => "N", |
45 |
"majr_code" => "5CSC", |
46 |
"minr_code" => "4PSY", |
47 |
"con1_code" => "", # actually, might be null |
48 |
"con2_code" => "", |
49 |
"con3_code" => "", |
50 |
"major" => "Computer Science", |
51 |
"minor" => "Pyschology", |
52 |
"conc1" => "", |
53 |
"conc2" => "", |
54 |
"conc3" => "", |
55 |
"resident" => "R", |
56 |
"deg_plan" => "PPF", |
57 |
"admt_code" => "FT", |
58 |
"styp_code" => "C", |
59 |
"orsn_code" => "E", |
60 |
"rate_code" => "BASIC" }, |
61 |
|
62 |
{ "pidm" => 12346, |
63 |
"term" => "200203", |
64 |
"level" => "UG", |
65 |
"confidential" => "N", |
66 |
"dead" => "N", |
67 |
"majr_code" => "5CSM", |
68 |
"minr_code" => "", |
69 |
"con1_code" => "", |
70 |
"con2_code" => "", |
71 |
"con3_code" => "", |
72 |
"major" => "Computer Science/Math", |
73 |
"minor" => "", |
74 |
"conc1" => "", |
75 |
"conc2" => "", |
76 |
"conc3" => "", |
77 |
"resident" => "N", |
78 |
"deg_plan" => "", |
79 |
"admt_code" => "FR", |
80 |
"styp_code" => "C", |
81 |
"orsn_code" => "S1", |
82 |
"rate_code" => "BASIC" }, |
83 |
. |
84 |
. |
85 |
. ]; |
86 |
|
87 |
3. Supporting subs (subs associated with the columns in %obj_accessor) |
88 |
return a hash ref (a single record). Data looks like this for |
89 |
get_addr ($dbh, $pidm, $atyp1, $atyp2, $atyp3): |
90 |
|
91 |
{ "atyp" => "CU", |
92 |
"street" => "223 NW Front Street", |
93 |
"street2" => "", # actually, might be null |
94 |
"street3" => "", |
95 |
"city" => "Salem", |
96 |
"state" => "OR", |
97 |
"zip" => "97301", |
98 |
"nation" => "", |
99 |
"natn_code" => "", |
100 |
"county" => "Marion" }; |
101 |
|
102 |
For supporting subs that return a ref to an array of hashes (a table), |
103 |
make them part of the driver sub results by using a wrapper sub to join |
104 |
(see stu_schedule example below). |
105 |
|
106 |
# ========================================================= |
107 |
# stu_schedule - All students and their class schedule for |
108 |
# given term and level. Level may use wildcards. |
109 |
# ========================================================= |
110 |
sub stu_schedule { |
111 |
my ($dbh, $term, $level) = @_; |
112 |
|
113 |
my (@stu_sched, $ra_students, $rh_stu, $ra_sched, $rh_class, $rh_new); |
114 |
|
115 |
$ra_students = term_students($dbh, $term, $level); |
116 |
|
117 |
foreach $rh_stu ( @{ $ra_students } ) { |
118 |
$ra_sched = schedule($dbh, $rh_stu->{pidm}, $term); |
119 |
foreach $rh_class ( @{ $ra_sched } ) { |
120 |
|
121 |
$rh_new = { }; # get new memory |
122 |
add2hash($rh_new, $rh_stu); # "dup" $rh_stu |
123 |
|
124 |
add2hash($rh_new, $rh_class); |
125 |
push @stu_sched, $rh_new; |
126 |
} |
127 |
} |
128 |
|
129 |
return \@stu_sched; |
130 |
} |
131 |
|
132 |
II. Why? |
133 |
|
134 |
1. Any Programmer can pull reliable data from system |
135 |
|
136 |
The problem with Database-level data models (i.e. the tables in |
137 |
Banner, for instance) is that even a good programmer can't trust |
138 |
herself to write reliable queries without a lot of research, until |
139 |
she has had a year of experience working with the data. So maybe |
140 |
only one programmer on the staff can do quick queries from a given |
141 |
module of Banner, for example. But with higher-level objects (like |
142 |
address, grades, schedule, gpa) available in subroutines, any |
143 |
programmer can hook these subs together and get quick, reliable |
144 |
results. This functionality is provided by the perl modules mentioned |
145 |
above, and doesn't need obj_srvr. An experienced programmer can |
146 |
benefit from the higher-level subs too, because they eliminate a lot |
147 |
of error checking, and guarantee uniformity in results. |
148 |
|
149 |
2. Users can get good answers to ad hoc questions without asking a |
150 |
programmer |
151 |
|
152 |
obj_srvr provides an interface for hooking the subs together. Users |
153 |
can do this through BI Query (in alpha testing), and programmers |
154 |
can do it by doing sql queries inside the obj_srvr parser. For a |
155 |
programmer this is still better than writing even a simple perl script |
156 |
to hook together the subs, because the script would have to be checked |
157 |
for errors, since things can go wrong due to oversight in only a few |
158 |
lines of code. |
159 |
|
160 |
III. Why not just use a real Data Warehouse? |
161 |
|
162 |
1. Data is live |
163 |
|
164 |
2. Easier to change and add to model |
165 |
|
166 |
Changes can be made by merely adding field, table, and/or subnames |
167 |
to obj_srvr.tables, and writing the needed perl subs. With a |
168 |
warehouse you'd have to add fields to database tables or add tables, |
169 |
and mod the warehouse populate program accordingly. |
170 |
|
171 |
3. Cleaner and more trustworthy |
172 |
|
173 |
Warehouse populate programs are multi-step processes that can fail |
174 |
at any point due to lack of disk space, lack of tablespace space, |
175 |
violation of integrity constraints, etc. If you plan on using the data |
176 |
you need to verify that the process ran to completion every day. |
177 |
Initially you need to confirm for every step that any failure will |
178 |
abort the whole process. If you modify the process you must be careful |
179 |
not to introduce code that could fail without aborting the process. |
180 |
|
181 |
|
182 |
4. Enhancements to obj_srvr data model are available to regular programs, |
183 |
and vice-versa |
184 |
|
185 |
The subs for the data model are in perl modules, and can be used by |
186 |
the warehouse or by other perl programs and reports. It doesn't |
187 |
matter if they were created for use in a report, or for use in the |
188 |
warehouse. |
189 |
|
190 |
IV. How implemented |
191 |
|
192 |
obj_srvr.exe is generated from obj_srvr.l and obj_srvr.y (lex and yacc), |
193 |
then compiled with exec_sql.c. The SQL lexer and parser were copied from |
194 |
the O'Reilly Lex & Yacc book, 2nd Edition, 1995, by John R. Levine, |
195 |
Tony Mason, and Doug Brown, and then enhanced (embedded perl and some |
196 |
parser "action" blocks). |
197 |
|
198 |
V. Building |
199 |
|
200 |
from this directory (developer$disk:[wosc_common.uid.obj_srvr] |
201 |
|
202 |
@make clean |
203 |
@make |
204 |
|
205 |
(There is a unix makefile in the directory, but didn't have make on VMS |
206 |
so wrote make.com. Later installed make, but makefile will need mod'ed |
207 |
a bit to work in VMS.) |
208 |
|
209 |
Will work on unix using the unix makefile. (at least it DID on SunOS, may |
210 |
have done some stuff in the VMS version that screwed it up for unix, will |
211 |
have to check.) obj_srvr.pm handles unix or VMS (only a couple lines of code |
212 |
where it matters). The Banner SIS perl modules have a little code that is |
213 |
specific to VMS, need to mod this for unix (mostly the @INC path). |
214 |
|
215 |
VI. Using |
216 |
|
217 |
As a SQL-like Query Tool |
218 |
======================== |
219 |
|
220 |
from this directory (developer$disk:[wosc_common.uid.obj_srvr] |
221 |
|
222 |
orauser wouprd |
223 |
|
224 |
runpgm := $developer$disk:[wosc_common.uid.obj_srvr]obj_srvr.exe |
225 |
|
226 |
runpgm <uid> <passwd> I |
227 |
|
228 |
(I is for Interactive, i.e. command line rather than BI-Query through tnslsnr.) |
229 |
|
230 |
(wait about 10 seconds for the parser to embed perl) |
231 |
|
232 |
sql |
233 |
|
234 |
SQL> select lname, fname, major, city from term_students |
235 |
where term = '200203' and level = '%' and |
236 |
city in ('Monmouth', 'Independence') and state = 'OR' and |
237 |
majr_code like '5%' order by major, city, lname, fname; |
238 |
|
239 |
(will take about 2 minutes) |
240 |
|
241 |
sql |
242 |
|
243 |
SQL> (another query) |
244 |
|
245 |
quit |
246 |
|
247 |
(back to the vms prompt - you can also type quit from the SQL> prompt). |
248 |
|
249 |
** any "where clause" operators that don't translate directly to perl need |
250 |
to be implemented in obj_srvr.y. (I.e., "and" and "or" mean the same |
251 |
thing in perl as in sql, as do parentheses. "=", "!=", "<", ">" do |
252 |
not, if the item being compared is a string, rather than a number.) |
253 |
Currently, am always translating "=" to "eq", need to leave alone for |
254 |
numbers. (looks like it's okay to use eq, ne, lt, gt on numbers as long as |
255 |
both items being compared are numbers, so maybe this is okay.) |
256 |
HAVE implemented "like", "not like", "in", "not in", =, <>, !=, <>, <=, >=. |
257 |
(=, !=, >, etc. implementation is translating to stringwise equivalent |
258 |
(eq, ne, gt, etc.), which seems to work for either strings or numbers, as |
259 |
long as both comparison items are either strings, or numbers (but not |
260 |
mixed, which seems pretty unlikely). |
261 |
Everything else is coming across unchanged. ** |
262 |
|
263 |
(am using eval (actually reval) on the "where clause" expression to |
264 |
determine if any given record satisfies the "where clause". ) |
265 |
|
266 |
|
267 |
From BI Query |
268 |
============= |
269 |
|
270 |
1. Put a tnsnames.ora with the entry for obj_srvr in your windows directory |
271 |
(use tnsnames.ora in [.related] subdir as example) and set the properties |
272 |
for BI Query User to start in the windows directory. |
273 |
|
274 |
2. In BI Query User, create a connection to obj_srvr. |
275 |
|
276 |
3. STILL TO DO (see Spoofing Oracle, below) |
277 |
** This now works: is in alpha testing. Need to add handling |
278 |
for rows longer than 2K, and clean up some stuff, etc. ** |
279 |
|
280 |
|
281 |
VII. Spoofing Oracle (so a desktop client like BI Query can be used as a front-end) |
282 |
|
283 |
If you start a tns listener using the listener.ora and tnsnames.ora files |
284 |
in the [.related] subdir, it will look for the oracle server executable |
285 |
in the ORACLE_HOME/bin of each SID_NAME in the SID_LIST_LISTENER parameter. |
286 |
I have it set to /home/hjd so it will look for /home/hjd/bin/oracle on |
287 |
my linux PC (where I was testing this). This "oracle" is a perl script. |
288 |
When it is started by tnslsnr, it sends tnslsnr a service_register command |
289 |
(See SERVICE_REGISTER entry in tns_cmds.txt). This means that when tnslsnr |
290 |
gets connection requests for obj_srvr, it routes them to "oracle". (You |
291 |
can call it another name, say obj_srvr, if you have a (PROGRAM=obj_srvr) |
292 |
entry in the SID_DESC in listener.ora. Then the spoofing script should be |
293 |
named obj_srvr.) |
294 |
|
295 |
TO DO: |
296 |
|
297 |
Make "oracle" do connection conversation with BI Query client, get sql |
298 |
request from BI Query, and pass it to obj_srvr.exe. |
299 |
** Now Works: connects, sends query, "oracle" sends back result from wouprd, |
300 |
and BI Query displays ** |
301 |
|
302 |
You can see the bytes in the packets exchanged between the sqlplus client and |
303 |
the DB by creating a .sqlnet.ora file in your home directory (at least on |
304 |
unix), setting the trace level at the max, and running sqlplus and executing |
305 |
a query. Use the .sqlnet.ora in the [.related] subdir, but set |
306 |
trace_level_client=16. See cli_trace.log in [.related] for an example of the |
307 |
info this gives you. In windows you can see the bytes from a BI-Query/DB |
308 |
conversation. Go into Oracle Net8 Assistant and click on "Profile". Set |
309 |
trace level to "SUPPORT" in "Tracing" - it may then force you to put in a TNS |
310 |
Time Out value in "Advanced". It seems to have a problem closing the log file |
311 |
after a session; you may have to do some gyrations to be able to read it. |
312 |
Logging off the network and back on again frees up the file. |
313 |
|
314 |
The oracle packet type is given in bytes 11, 12. 13 is a sequence number. |
315 |
|
316 |
---------------- |
317 |
|
318 |
Jeremy Hickerson, 6/13/2003 |
319 |
7/20/2003 |
320 |
8/24/2004 |