This is a repository of my old source code, which is no longer updated. Go to git.rot13.org for current projects!

Log of /trunk/lib/WebPAC/Input.pm


Revision 1307 - (view) (annotate) - [select for diffs]
Modified Mon Sep 21 16:42:25 2009 UTC (14 years, 6 months ago) by dpavlin
File length: 17600 byte(s)
Diff to previous 1306
cleanup WebPAC::Input

Revision 1306 - (view) (annotate) - [select for diffs]
Modified Mon Sep 21 15:48:52 2009 UTC (14 years, 6 months ago) by dpavlin
File length: 17621 byte(s)
Diff to previous 1304
extract size from ll_db if supported

Revision 1304 - (view) (annotate) - [select for diffs]
Modified Sun Sep 20 21:38:15 2009 UTC (14 years, 6 months ago) by dpavlin
File length: 17511 byte(s)
Diff to previous 1286
fix offset to skip records just like SQL databases do, and not position us on it

Revision 1286 - (view) (annotate) - [select for diffs]
Modified Fri Sep 18 21:30:30 2009 UTC (14 years, 6 months ago) by dpavlin
File length: 17557 byte(s)
Diff to previous 1236
push offset and limit to low-level modules for optimizations


Revision 1236 - (view) (annotate) - [select for diffs]
Modified Fri Jul 10 13:54:55 2009 UTC (14 years, 8 months ago) by dpavlin
File length: 17498 byte(s)
Diff to previous 1221
save ll_db earlier so that input_module won't die on empty input


Revision 1221 - (view) (annotate) - [select for diffs]
Modified Tue Jun 9 21:37:32 2009 UTC (14 years, 9 months ago) by dpavlin
File length: 17481 byte(s)
Diff to previous 1168
accessor for low-level input_module


Revision 1168 - (view) (annotate) - [select for diffs]
Modified Sat Apr 25 17:13:20 2009 UTC (14 years, 11 months ago) by dpavlin
File length: 17444 byte(s)
Diff to previous 1128
 r1838@llin:  dpavlin | 2009-04-25 19:13:18 +0200
 don't ever never use blib (so I don't have to re-run make)


Revision 1128 - (view) (annotate) - [select for diffs]
Modified Tue Apr 21 21:03:52 2009 UTC (14 years, 11 months ago) by dpavlin
File length: 17439 byte(s)
Diff to previous 1122
 r1764@llin:  dpavlin | 2009-04-21 18:56:13 +0200
 


Revision 1122 - (view) (annotate) - [select for diffs]
Modified Mon Nov 17 21:30:05 2008 UTC (15 years, 4 months ago) by dpavlin
File length: 17305 byte(s)
Diff to previous 1121
dump error from $@


Revision 1121 - (view) (annotate) - [select for diffs]
Modified Mon Nov 17 21:18:16 2008 UTC (15 years, 4 months ago) by dpavlin
File length: 17303 byte(s)
Diff to previous 1107
better report modify errors


Revision 1107 - (view) (annotate) - [select for diffs]
Modified Mon Aug 4 21:47:27 2008 UTC (15 years, 7 months ago) by dpavlin
File length: 17259 byte(s)
Diff to previous 1100
migrate internal encoding to utf-8


Revision 1100 - (view) (annotate) - [select for diffs]
Modified Sat Aug 2 23:46:41 2008 UTC (15 years, 7 months ago) by dpavlin
File length: 17308 byte(s)
Diff to previous 1076
Clean up encodings, moving webpac closer to having
an internal utf-8 representation.

This will break current code, but it is a really necessary
step toward checking input encoding for validity


Revision 1076 - (view) (annotate) - [select for diffs]
Modified Wed Nov 28 22:51:43 2007 UTC (16 years, 4 months ago) by dpavlin
File length: 17025 byte(s)
Diff to previous 910
tweaks to statistics:
- support stats from simple field => 'value' structure
- sort fields which are not numeric correctly


Revision 910 - (view) (annotate) - [select for diffs]
Modified Tue Oct 30 01:51:20 2007 UTC (16 years, 5 months ago) by dpavlin
File length: 16937 byte(s)
Diff to previous 909
 r1364@llin:  dpavlin | 2007-10-30 02:51:21 +0100
 generalize the idea a bit, and sort every subfield which has more than
 one char (and is thus "special" or wrong :-) in front.


Revision 909 - (view) (annotate) - [select for diffs]
Modified Tue Oct 30 01:46:41 2007 UTC (16 years, 5 months ago) by dpavlin
File length: 16930 byte(s)
Diff to previous 873
 r1362@llin:  dpavlin | 2007-10-30 02:46:05 +0100
 Show indicators (available when using WebPAC::Input::MARC) as
 first two subfields in statistics instead of in alphabetical order


Revision 873 - (view) (annotate) - [select for diffs]
Modified Fri Jun 22 00:03:46 2007 UTC (16 years, 9 months ago) by dpavlin
File length: 16725 byte(s)
Diff to previous 868
 r1298@llin:  dpavlin | 2007-06-22 02:03:23 +0200
 input_config can be given to new or open now


Revision 868 - (view) (annotate) - [select for diffs]
Modified Thu Jun 21 21:26:17 2007 UTC (16 years, 9 months ago) by dpavlin
File length: 16700 byte(s)
Diff to previous 860
 r1289@llin:  dpavlin | 2007-06-21 23:26:10 +0200
 * transfer input configuration hash as input_config to input module


Revision 860 - (view) (annotate) - [select for diffs]
Modified Sun May 27 19:10:43 2007 UTC (16 years, 10 months ago) by dpavlin
File length: 16599 byte(s)
Diff to previous 855
call low-level dump_ascii as it should


Revision 855 - (view) (annotate) - [select for diffs]
Modified Sun May 27 14:44:58 2007 UTC (16 years, 10 months ago) by dpavlin
File length: 16597 byte(s)
Diff to previous 844
 r1267@llin:  dpavlin | 2007-05-27 16:44:54 +0200
 sort fields in stats


Revision 844 - (view) (annotate) - [select for diffs]
Modified Sat May 26 10:40:01 2007 UTC (16 years, 10 months ago) by dpavlin
File length: 16597 byte(s)
Diff to previous 823
work with fields which have number 0 (as opposed to 000) which has been
noticed in the wild (invalid, but --stats shouldn't really die)


Revision 823 - (view) (annotate) - [select for diffs]
Modified Wed Apr 11 12:22:37 2007 UTC (16 years, 11 months ago) by dpavlin
File length: 16552 byte(s)
Diff to previous 818
 r1203@llin:  dpavlin | 2007-04-11 14:22:28 +0200
 special handling for empty subfields [0.18]


Revision 818 - (view) (annotate) - [select for diffs]
Modified Thu Apr 5 21:53:52 2007 UTC (16 years, 11 months ago) by dpavlin
File length: 16048 byte(s)
Diff to previous 800
fix warning


Revision 800 - (view) (annotate) - [select for diffs]
Modified Sun Feb 4 23:10:18 2007 UTC (17 years, 1 month ago) by dpavlin
File length: 15994 byte(s)
Diff to previous 799
decorate output from regexp modify with filename and line


Revision 799 - (view) (annotate) - [select for diffs]
Modified Sun Feb 4 15:09:01 2007 UTC (17 years, 1 month ago) by dpavlin
File length: 15677 byte(s)
Diff to previous 797
minor tweaks to test modify_file


Revision 797 - (view) (annotate) - [select for diffs]
Modified Sun Feb 4 13:28:30 2007 UTC (17 years, 1 month ago) by dpavlin
File length: 15533 byte(s)
Diff to previous 793
finish tweaking mock framework, test and fix problem with slashes in modify_record


Revision 793 - (view) (annotate) - [select for diffs]
Modified Sun Feb 4 12:19:51 2007 UTC (17 years, 1 month ago) by dpavlin
File length: 15473 byte(s)
Diff to previous 784
small tweaks on seek


Revision 784 - (view) (annotate) - [select for diffs]
Modified Wed Dec 6 23:43:45 2006 UTC (17 years, 3 months ago) by dpavlin
File length: 15423 byte(s)
Diff to previous 774
added regex: to modify_records


Revision 774 - (view) (annotate) - [select for diffs]
Modified Fri Nov 3 20:56:21 2006 UTC (17 years, 4 months ago) by dpavlin
File length: 15340 byte(s)
Diff to previous 771
another sweeping API change: input->dump is gone, replaced
with input->dump_ascii which is more understandable.
If you want to override default behaviour
(which is to use Data::Dump's dump in input->fetch_rec)
define dump_ascii in low-level WebPAC::Input:: API


Revision 771 - (view) (annotate) - [select for diffs]
Modified Fri Nov 3 20:40:33 2006 UTC (17 years, 4 months ago) by dpavlin
File length: 15326 byte(s)
Diff to previous 761
 r1123@llin:  dpavlin | 2006-11-03 21:38:14 +0100
 implement fallback dump if low-level API isn't exposing dump_rec [0.15]


Revision 761 - (view) (annotate) - [select for diffs]
Modified Wed Oct 25 17:10:08 2006 UTC (17 years, 5 months ago) by dpavlin
File length: 15290 byte(s)
Diff to previous 760
implemented load_row and save_row closures to serialize
input databases (using WebPAC::Store probably).
This will allow lookups to share on-disk storage with
low_mem option of WebPAC::Input, which is now gone
(under pressure of 600000+ record database which we
are now testing on)


Revision 760 - (view) (annotate) - [select for diffs]
Modified Wed Oct 25 15:56:44 2006 UTC (17 years, 5 months ago) by dpavlin
File length: 17026 byte(s)
Diff to previous 757
Turn on option low_mem (which needs a rewrite to use db/row) if there
are more than 10000 rows (hardcoded, but should go away).

This prevents webpac from running out of memory with databases
of about 300000 records on 4Gb of (virtual) memory.


Revision 757 - (view) (annotate) - [select for diffs]
Modified Tue Oct 10 10:57:59 2006 UTC (17 years, 5 months ago) by dpavlin
File length: 16788 byte(s)
Diff to previous 726
fix dump (ugly, needs re-visiting)


Revision 726 - (view) (annotate) - [select for diffs]
Modified Fri Sep 29 19:52:17 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 16744 byte(s)
Diff to previous 707
 r1045@llin:  dpavlin | 2006-09-29 21:38:42 +0200
 change low-level API to be OO (and remove various ugly kludges).


Revision 707 - (view) (annotate) - [select for diffs]
Modified Mon Sep 25 15:26:12 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 16992 byte(s)
Diff to previous 697
 r1008@llin:  dpavlin | 2006-09-25 17:23:42 +0200
 lookup creation somewhat works


Revision 697 - (view) (annotate) - [select for diffs]
Modified Mon Sep 25 09:49:28 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 16972 byte(s)
Diff to previous 652
 r988@llin:  dpavlin | 2006-09-25 11:47:07 +0200
 fix die


Revision 652 - (view) (annotate) - [select for diffs]
Modified Thu Sep 7 15:01:45 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 16969 byte(s)
Diff to previous 636
refactored internal WebPAC::Input::* API a bit, added dump_rec,
validate is now more clever and reports all errors from database at end


Revision 636 - (view) (annotate) - [select for diffs]
Modified Wed Sep 6 19:25:22 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 16825 byte(s)
Diff to previous 634
implement new modify_file format which is (hopefully) simpler than yaml and/or perl [2.27]
(yes, I know... It's a sin...)


Revision 634 - (view) (annotate) - [select for diffs]
Modified Wed Sep 6 18:08:30 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 15059 byte(s)
Diff to previous 626
move logging to debug level


Revision 626 - (view) (annotate) - [select for diffs]
Modified Mon Sep 4 16:15:07 2006 UTC (17 years, 6 months ago) by dpavlin
File length: 14985 byte(s)
Diff to previous 625
fix MARC encoding woes


Revision 625 - (view) (annotate) - [select for diffs]
Modified Sat Aug 26 12:00:36 2006 UTC (17 years, 7 months ago) by dpavlin
File length: 14927 byte(s)
Diff to previous 624
 r878@llin:  dpavlin | 2006-08-26 14:00:08 +0200
 removed some debugging output (or moved it to debug level), few tweaks [2.26]


Revision 624 - (view) (annotate) - [select for diffs]
Modified Sat Aug 26 12:00:31 2006 UTC (17 years, 7 months ago) by dpavlin
File length: 14912 byte(s)
Diff to previous 619
 r877@llin:  dpavlin | 2006-08-25 21:55:05 +0200
 removed traces of Text::Iconv and replaced them with Encode,
 code page 852 is now cp852 (instead of just 852) because Encode
 likes it that way, record encoding is now hard-coded to utf-8


Revision 619 - (view) (annotate) - [select for diffs]
Modified Fri Aug 25 12:31:06 2006 UTC (17 years, 7 months ago) by dpavlin
File length: 14936 byte(s)
Diff to previous 613
 r867@llin:  dpavlin | 2006-08-25 14:32:05 +0200
 statistics now show data before modify_records


Revision 613 - (view) (annotate) - [select for diffs]
Modified Wed Aug 23 11:04:32 2006 UTC (17 years, 7 months ago) by dpavlin
File length: 14924 byte(s)
Diff to previous 606
 r857@llin:  dpavlin | 2006-08-23 13:04:58 +0200
 modify_records is now applied only once for each field to prevent looping of regexps


Revision 606 - (view) (annotate) - [select for diffs]
Modified Tue Aug 1 13:59:47 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 14699 byte(s)
Diff to previous 599
added --parallel option to utilize multiple CPUs in machine


Revision 599 - (view) (annotate) - [select for diffs]
Modified Thu Jul 13 13:55:19 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 14698 byte(s)
Diff to previous 598
 r835@llin:  dpavlin | 2006-07-13 15:56:53 +0200
 test modify_record


Revision 598 - (view) (annotate) - [select for diffs]
Modified Thu Jul 13 13:55:15 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 14523 byte(s)
Diff to previous 597
 r834@llin:  dpavlin | 2006-07-13 14:49:23 +0200
 fix pod


Revision 597 - (view) (annotate) - [select for diffs]
Modified Thu Jul 13 11:54:33 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 14523 byte(s)
Diff to previous 593
 r831@llin:  dpavlin | 2006-07-13 13:56:19 +0200
 first cut at implementing modify_records using automatically generated regexps


Revision 593 - (view) (annotate) - [select for diffs]
Modified Sun Jul 9 15:22:39 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 12538 byte(s)
Diff to previous 585
 r823@llin:  dpavlin | 2006-07-09 17:23:28 +0200
 stats now report repeatable subfields


Revision 585 - (view) (annotate) - [select for diffs]
Modified Wed Jul 5 19:52:45 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 12267 byte(s)
Diff to previous 523
 r810@llin:  dpavlin | 2006-07-05 21:53:01 +0200
 change of parameters to WebPAC::Input


Revision 523 - (view) (annotate) - [select for diffs]
Modified Sun May 21 19:29:26 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 11617 byte(s)
Diff to previous 519
transfer all input variables to open_db in input module


Revision 519 - (view) (annotate) - [select for diffs]
Modified Thu May 18 13:48:51 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 11604 byte(s)
Diff to previous 513
 r689@llin:  dpavlin | 2006-05-18 15:45:23 +0200
 treat field names as strings, not numbers (Excel field names are chars, not numbers)


Revision 513 - (view) (annotate) - [select for diffs]
Modified Tue May 16 13:08:31 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 11604 byte(s)
Diff to previous 507
dump debug not info if skipping to mfn


Revision 507 - (view) (annotate) - [select for diffs]
Modified Mon May 15 13:15:01 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 11603 byte(s)
Diff to previous 506
 r669@llin:  dpavlin | 2006-05-15 15:18:36 +0200
 added nicely formatted stats and --stats flag to run.pl


Revision 506 - (view) (annotate) - [select for diffs]
Modified Mon May 15 09:59:05 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 10920 byte(s)
Diff to previous 496
 r663@llin:  dpavlin | 2006-05-15 12:02:43 +0200
 added stats gathering


Revision 496 - (view) (annotate) - [select for diffs]
Modified Sun May 14 19:45:26 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 10216 byte(s)
Diff to previous 487
 r651@llin:  dpavlin | 2006-05-14 21:47:08 +0200
 allow 0 as valid db handle


Revision 487 - (view) (annotate) - [select for diffs]
Modified Sun May 14 12:34:50 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 10207 byte(s)
Diff to previous 483
 r634@llin:  dpavlin | 2006-05-14 13:12:43 +0200
 don't use version which doesn't exist


Revision 483 - (view) (annotate) - [select for diffs]
Modified Sun May 14 09:34:05 2006 UTC (17 years, 10 months ago) by dpavlin
File length: 10212 byte(s)
Diff to previous 416
 r625@llin:  dpavlin | 2006-05-14 11:37:22 +0200
 added no_progress_bar for tests and cron usage


Revision 416 - (view) (annotate) - [select for diffs]
Modified Sun Feb 26 23:21:50 2006 UTC (18 years, 1 month ago) by dpavlin
File length: 9943 byte(s)
Diff to previous 339
 r494@llin:  dpavlin | 2006-02-27 00:22:59 +0100
 implemented recode option to input (for now, just for MARC)


Revision 339 - (view) (annotate) - [select for diffs]
Modified Sat Dec 31 16:50:11 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9352 byte(s)
Diff to previous 338
 r346@llin:  dpavlin | 2005-12-31 17:53:29 +0100
 rename $offset and $limit variables to $from_rec and $to_rec to avoid confusion
 with parameters which have same names


Revision 338 - (view) (annotate) - [select for diffs]
Modified Sat Dec 31 16:50:06 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9325 byte(s)
Diff to previous 308
 r345@llin:  dpavlin | 2005-12-31 17:50:23 +0100
 better output


Revision 308 - (view) (annotate) - [select for diffs]
Modified Tue Dec 20 19:01:22 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9306 byte(s)
Diff to previous 307
 r335@athlon:  dpavlin | 2005-12-20 20:01:21 +0100
 added debug output for record fetched from low-level API


Revision 307 - (view) (annotate) - [select for diffs]
Modified Tue Dec 20 00:03:04 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9250 byte(s)
Diff to previous 301
moved clean into WebPAC::Output::Estraier, cleanup


Revision 301 - (view) (annotate) - [select for diffs]
Modified Mon Dec 19 21:26:04 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9261 byte(s)
Diff to previous 292
 r322@athlon:  dpavlin | 2005-12-19 22:27:06 +0100
 make run.pl moderately chatty (along with other modules), added command line options
 (try perldoc run.pl) new target index (to reindex all) and run (to index
 first 100 records of each database)


Revision 292 - (view) (annotate) - [select for diffs]
Modified Sun Dec 18 23:34:30 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9282 byte(s)
Diff to previous 290
 r11790@llin:  dpavlin | 2005-12-19 06:35:06 +0100
 and small fix for codepage


Revision 290 - (view) (annotate) - [select for diffs]
Modified Sun Dec 18 23:10:02 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9276 byte(s)
Diff to previous 289
 r11787@llin:  dpavlin | 2005-12-19 06:10:47 +0100
 MARC indexing seems to work


Revision 289 - (view) (annotate) - [select for diffs]
Modified Sun Dec 18 22:16:44 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9197 byte(s)
Diff to previous 287
 r11784@llin:  dpavlin | 2005-12-19 05:17:24 +0100
 don't use Exporter after all


Revision 287 - (view) (annotate) - [select for diffs]
Modified Sun Dec 18 21:06:51 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9124 byte(s)
Diff to previous 286
 r11779@llin:  dpavlin | 2005-12-19 04:07:22 +0100
 and fixes to make it work


Revision 286 - (view) (annotate) - [select for diffs]
Modified Sun Dec 18 21:06:46 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 9124 byte(s)
Diff to previous 285
 r11778@llin:  dpavlin | 2005-12-19 03:59:54 +0100
 move work on input


Revision 285 - (view) (annotate) - [select for diffs]
Modified Sun Dec 18 21:06:39 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 8633 byte(s)
Diff to previous 11
 r11777@llin:  dpavlin | 2005-12-19 00:02:47 +0100
 refactor Input::ISIS::* [0.02]


Revision 11 - (view) (annotate) - [select for diffs]
Modified Sat Jul 16 20:54:28 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 3764 byte(s)
Diff to previous 10
fix


Revision 10 - (view) (annotate) - [select for diffs]
Modified Sat Jul 16 20:35:30 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 3780 byte(s)
Diff to previous 9
ISIS input is finished, low_mem option has code (and not only documentation :-)


Revision 9 - (view) (annotate) - [select for diffs]
Modified Sat Jul 16 17:14:43 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 2992 byte(s)
Diff to previous 6
a bit more work on WebPAC::Input::ISIS


Revision 6 - (view) (annotate) - [select for diffs]
Modified Sat Jul 16 14:44:38 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 2740 byte(s)
Diff to previous 3
added WebPAC::Input::ISIS


Revision 3 - (view) (annotate) - [select for diffs]
Modified Sat Jul 16 11:07:38 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 2845 byte(s)
Diff to previous 1
moved implementation of lookups from older code-base


Revision 1 - (view) (annotate) - [select for diffs]
Added Sat Jun 25 20:23:23 2005 UTC (18 years, 9 months ago) by dpavlin
File length: 1197 byte(s)
initial import of some documentation and module structure

