/[vdw]/trunk/readme_vdw.txt
This is repository of my old source code which isn't updated any more. Go to git.rot13.org for current projects!
ViewVC logotype

Annotation of /trunk/readme_vdw.txt

Parent Directory Parent Directory | Revision Log Revision Log


Revision 2 - (hide annotations)
Sun Feb 6 05:32:59 2005 UTC (19 years, 2 months ago) by dpavlin
File MIME type: text/plain
File size: 13998 byte(s)
renamed files back to original extension

1 dpavlin 1 Object Server (Virtual Data Warehouse)
2     ======================================
3    
4     I. What is it?
5    
6     Object Server is a sql parser with perl embedded, that maps virtual table
7     column names onto perl subroutines, and runs the subs needed based on the
8     columns in the "select" and "where" clauses of the query, then returns the
9     data after filtering it through the conditions in the "where" clause.
10    
11     The column-to-subroutine map is in obj_srvr.tables, a file which obj_srvr.pm
12     reads at "embed time" (when the parser is started). (The parser embeds
13     obj_srvr.pl, which is a wrapper for obj_srvr.pm.)
14    
15     The data model relies on various Banner SIS related perl modules in
16     woup:[wou_sis_mods.com]
17    
18     WOU_Admit.pm
19     WOU_Person.pm
20     WOU_SIS_Util.pm
21     WOU_Student.pm
22     WOU_Util.pm
23    
24     However, obj_srvr.pm can work with ANY data model. All you need is the
25     map in obj_srvr.tables, and the subs to return the data. (* obj_srvr.pm
26     needs subs in WOU_Util.pm, regardless of data model used. *)
27    
28     Conventions of the Data Model:
29    
30     1. "from" clause can only have one table.
31    
32     You can build whatever joins you need with wrapper subs, see
33     stu_schedule "table" for example.
34    
35     2. Driver sub (the sub associated with the virtual table in %table_objs)
36     returns a ref to an array of hash refs. The array is like a table,
37     and the hashes are like records. The data looks like this:
38    
39     term_students = [
40     { "pidm" => 12345,
41     "term" => "200203",
42     "level" => "UG",
43     "confidential" => "N",
44     "dead" => "N",
45     "majr_code" => "5CSC",
46     "minr_code" => "4PSY",
47     "con1_code" => "", # actually, might be null
48     "con2_code" => "",
49     "con3_code" => "",
50     "major" => "Computer Science",
51     "minor" => "Pyschology",
52     "conc1" => "",
53     "conc2" => "",
54     "conc3" => "",
55     "resident" => "R",
56     "deg_plan" => "PPF",
57     "admt_code" => "FT",
58     "styp_code" => "C",
59     "orsn_code" => "E",
60     "rate_code" => "BASIC" },
61    
62     { "pidm" => 12346,
63     "term" => "200203",
64     "level" => "UG",
65     "confidential" => "N",
66     "dead" => "N",
67     "majr_code" => "5CSM",
68     "minr_code" => "",
69     "con1_code" => "",
70     "con2_code" => "",
71     "con3_code" => "",
72     "major" => "Computer Science/Math",
73     "minor" => "",
74     "conc1" => "",
75     "conc2" => "",
76     "conc3" => "",
77     "resident" => "N",
78     "deg_plan" => "",
79     "admt_code" => "FR",
80     "styp_code" => "C",
81     "orsn_code" => "S1",
82     "rate_code" => "BASIC" },
83     .
84     .
85     . ];
86    
87     3. Supporting subs (subs associated with the columns in %obj_accessor)
88     return a hash ref (a single record). Data looks like this for
89     get_addr ($dbh, $pidm, $atyp1, $atyp2, $atyp3):
90    
91     { "atyp" => "CU",
92     "street" => "223 NW Front Street",
93     "street2" => "", # actually, might be null
94     "street3" => "",
95     "city" => "Salem",
96     "state" => "OR",
97     "zip" => "97301",
98     "nation" => "",
99     "natn_code" => "",
100     "county" => "Marion" };
101    
102     For supporting subs that return a ref to an array of hashes (a table),
103     make them part of the driver sub results by using a wrapper sub to join
104     (see stu_schedule example below).
105    
106     # =========================================================
107     # stu_schedule - All students and their class schedule for
108     # given term and level. Level may use wildcards.
109     # =========================================================
110     sub stu_schedule {
111     my ($dbh, $term, $level) = @_;
112    
113     my (@stu_sched, $ra_students, $rh_stu, $ra_sched, $rh_class, $rh_new);
114    
115     $ra_students = term_students($dbh, $term, $level);
116    
117     foreach $rh_stu ( @{ $ra_students } ) {
118     $ra_sched = schedule($dbh, $rh_stu->{pidm}, $term);
119     foreach $rh_class ( @{ $ra_sched } ) {
120    
121     $rh_new = { }; # get new memory
122     add2hash($rh_new, $rh_stu); # "dup" $rh_stu
123    
124     add2hash($rh_new, $rh_class);
125     push @stu_sched, $rh_new;
126     }
127     }
128    
129     return \@stu_sched;
130     }
131    
132     II. Why?
133    
134     1. Any Programmer can pull reliable data from system
135    
136     The problem with Database-level data models (i.e. the tables in
137     Banner, for instance) is that even a good programmer can't trust
138     herself to write reliable queries without a lot of research, until
139     she has had a year of experience working with the data. So maybe
140     only one programmer on the staff can do quick queries from a given
141     module of Banner, for example. But with higher-level objects (like
142     address, grades, schedule, gpa) available in subroutines, any
143     programmer can hook these subs together and get quick, reliable
144     results. This functionality is provided by the perl modules mentioned
145     above, and doesn't need obj_srvr. An experienced programmer can
146     benefit from the higher-level subs too, because they eliminate a lot
147     of error checking, and guarantee uniformity in results.
148    
149     2. Users can get good answers to ad hoc questions without asking a
150     programmer
151    
152     obj_srvr provides an interface for hooking the subs together. Users
153     can do this through BI Query (in alpha testing), and programmers
154     can do it by doing sql queries inside the obj_srvr parser. For a
155     programmer this is still better than writing even a simple perl script
156     to hook together the subs, because the script would have to be checked
157     for errors, since things can go wrong due to oversight in only a few
158     lines of code.
159    
160     III. Why not just use a real Data Warehouse?
161    
162     1. Data is live
163    
164     2. Easier to change and add to model
165    
166     Changes can be made by merely adding field, table, and/or subnames
167     to obj_srvr.tables, and writing the needed perl subs. With a
168     warehouse you'd have to add fields to database tables or add tables,
169     and mod the warehouse populate program accordingly.
170    
171     3. Cleaner and more trustworthy
172    
173     Warehouse populate programs are multi-step processes that can fail
174     at any point due to lack of disk space, lack of tablespace space,
175     violation of integrity constraints, etc. If you plan on using the data
176     you need to verify that the process ran to completion every day.
177     Initially you need to confirm for every step that any failure will
178     abort the whole process. If you modify the process you must be careful
179     not to introduce code that could fail without aborting the process.
180    
181    
182     4. Enhancements to obj_srvr data model are available to regular programs,
183     and vice-versa
184    
185     The subs for the data model are in perl modules, and can be used by
186     the warehouse or by other perl programs and reports. It doesn't
187     matter if they were created for use in a report, or for use in the
188     warehouse.
189    
190     IV. How implemented
191    
192     obj_srvr.exe is generated from obj_srvr.l and obj_srvr.y (lex and yacc),
193     then compiled with exec_sql.c. The SQL lexer and parser were copied from
194     the O'Reilly Lex & Yacc book, 2nd Edition, 1995, by John R. Levine,
195     Tony Mason, and Doug Brown, and then enhanced (embedded perl and some
196     parser "action" blocks).
197    
198     V. Building
199    
200     from this directory (developer$disk:[wosc_common.uid.obj_srvr]
201    
202     @make clean
203     @make
204    
205     (There is a unix makefile in the directory, but didn't have make on VMS
206     so wrote make.com. Later installed make, but makefile will need mod'ed
207     a bit to work in VMS.)
208    
209     Will work on unix using the unix makefile. (at least it DID on SunOS, may
210     have done some stuff in the VMS version that screwed it up for unix, will
211     have to check.) obj_srvr.pm handles unix or VMS (only a couple lines of code
212     where it matters). The Banner SIS perl modules have a little code that is
213     specific to VMS, need to mod this for unix (mostly the @INC path).
214    
215     VI. Using
216    
217     As a SQL-like Query Tool
218     ========================
219    
220     from this directory (developer$disk:[wosc_common.uid.obj_srvr]
221    
222     orauser wouprd
223    
224     runpgm := $developer$disk:[wosc_common.uid.obj_srvr]obj_srvr.exe
225    
226     runpgm <uid> <passwd> I
227    
228     (I is for Interactive, i.e. command line rather than BI-Query through tnslsnr.)
229    
230     (wait about 10 seconds for the parser to embed perl)
231    
232     sql
233    
234     SQL> select lname, fname, major, city from term_students
235     where term = '200203' and level = '%' and
236     city in ('Monmouth', 'Independence') and state = 'OR' and
237     majr_code like '5%' order by major, city, lname, fname;
238    
239     (will take about 2 minutes)
240    
241     sql
242    
243     SQL> (another query)
244    
245     quit
246    
247     (back to the vms prompt - you can also type quit from the SQL> prompt).
248    
249     ** any "where clause" operators that don't translate directly to perl need
250     to be implemented in obj_srvr.y. (I.e., "and" and "or" mean the same
251     thing in perl as in sql, as do parentheses. "=", "!=", "<", ">" do
252     not, if the item being compared is a string, rather than a number.)
253     Currently, am always translating "=" to "eq", need to leave alone for
254     numbers. (looks like it's okay to use eq, ne, lt, gt on numbers as long as
255     both items being compared are numbers, so maybe this is okay.)
256     HAVE implemented "like", "not like", "in", "not in", =, <>, !=, <>, <=, >=.
257     (=, !=, >, etc. implementation is translating to stringwise equivalent
258     (eq, ne, gt, etc.), which seems to work for either strings or numbers, as
259     long as both comparison items are either strings, or numbers (but not
260     mixed, which seems pretty unlikely).
261     Everything else is coming across unchanged. **
262    
263     (am using eval (actually reval) on the "where clause" expression to
264     determine if any given record satisfies the "where clause". )
265    
266    
267     From BI Query
268     =============
269    
270     1. Put a tnsnames.ora with the entry for obj_srvr in your windows directory
271     (use tnsnames.ora in [.related] subdir as example) and set the properties
272     for BI Query User to start in the windows directory.
273    
274     2. In BI Query User, create a connection to obj_srvr.
275    
276     3. STILL TO DO (see Spoofing Oracle, below)
277     ** This now works: is in alpha testing. Need to add handling
278     for rows longer than 2K, and clean up some stuff, etc. **
279    
280    
281     VII. Spoofing Oracle (so a desktop client like BI Query can be used as a front-end)
282    
283     If you start a tns listener using the listener.ora and tnsnames.ora files
284     in the [.related] subdir, it will look for the oracle server executable
285     in the ORACLE_HOME/bin of each SID_NAME in the SID_LIST_LISTENER parameter.
286     I have it set to /home/hjd so it will look for /home/hjd/bin/oracle on
287     my linux PC (where I was testing this). This "oracle" is a perl script.
288     When it is started by tnslsnr, it sends tnslsnr a service_register command
289     (See SERVICE_REGISTER entry in tns_cmds.txt). This means that when tnslsnr
290     gets connection requests for obj_srvr, it routes them to "oracle". (You
291     can call it another name, say obj_srvr, if you have a (PROGRAM=obj_srvr)
292     entry in the SID_DESC in listener.ora. Then the spoofing script should be
293     named obj_srvr.)
294    
295     TO DO:
296    
297     Make "oracle" do connection conversation with BI Query client, get sql
298     request from BI Query, and pass it to obj_srvr.exe.
299     ** Now Works: connects, sends query, "oracle" sends back result from wouprd,
300     and BI Query displays **
301    
302     You can see the bytes in the packets exchanged between the sqlplus client and
303     the DB by creating a .sqlnet.ora file in your home directory (at least on
304     unix), setting the trace level at the max, and running sqlplus and executing
305     a query. Use the .sqlnet.ora in the [.related] subdir, but set
306     trace_level_client=16. See cli_trace.log in [.related] for an example of the
307     info this gives you. In windows you can see the bytes from a BI-Query/DB
308     conversation. Go into Oracle Net8 Assistant and click on "Profile". Set
309     trace level to "SUPPORT" in "Tracing" - it may then force you to put in a TNS
310     Time Out value in "Advanced". It seems to have a problem closing the log file
311     after a session; you may have to do some gyrations to be able to read it.
312     Logging off the network and back on again frees up the file.
313    
314     The oracle packet type is given in bytes 11, 12. 13 is a sequence number.
315    
316     ----------------
317    
318     Jeremy Hickerson, 6/13/2003
319     7/20/2003
320     8/24/2004

  ViewVC Help
Powered by ViewVC 1.1.26