/[webpac2]/trunk/lib/WebPAC/Normalize.pm
This is a repository of my old source code, which isn't updated any more. Go to git.rot13.org for current projects!

Log of /trunk/lib/WebPAC/Normalize.pm


Revision 592
Modified Sun Jul 9 15:22:30 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 17647 byte(s)
Diff to previous 589
 r822@llin:  dpavlin | 2006-07-09 17:14:07 +0200
 prefix doesn't die if first parameter is undef


Revision 589
Modified Fri Jul 7 21:48:09 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 17684 byte(s)
Diff to previous 586
 r817@llin:  dpavlin | 2006-07-07 23:48:50 +0200
 support repeatable subfields from Biblio::Isis 0.20


Revision 586
Modified Thu Jul 6 10:31:13 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 17610 byte(s)
Diff to previous 583
better _debug(2) output


Revision 583
Modified Wed Jul 5 00:12:08 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 17538 byte(s)
Diff to previous 579
rec and join_with now return '' if there are no results, so they are safe to
use inside marc_compose


Revision 579
Modified Tue Jul 4 11:08:43 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 17355 byte(s)
Diff to previous 574
 r798@llin:  dpavlin | 2006-07-04 13:08:44 +0200
 changed _get_marc_fields to return arrayref, tests and fix for marc_remove(field)


Revision 574
Modified Mon Jul 3 21:08:07 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 17340 byte(s)
Diff to previous 572
added marc_duplicate and marc_remove


Revision 572
Modified Mon Jul 3 14:32:40 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 13726 byte(s)
Diff to previous 571
fix warning with fields < 10


Revision 571
Modified Mon Jul 3 14:30:22 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 13689 byte(s)
Diff to previous 568
marc() now supports fields < 10 which don't have indicators and subfields


Revision 568
Modified Sun Jul 2 21:30:00 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 13440 byte(s)
Diff to previous 566
 r779@llin:  dpavlin | 2006-07-02 23:30:17 +0200
 more tuning of debug logging


Revision 566
Modified Sun Jul 2 21:17:54 2006 UTC (17 years, 8 months ago) by dpavlin
File length: 13421 byte(s)
Diff to previous 565
test split_rec_on corner cases, and fix one


Revision 565
Modified Sun Jul 2 20:33:13 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 13359 byte(s)
Diff to previous 564
skip empty values in marc_compose


Revision 564
Modified Sun Jul 2 20:14:21 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 13397 byte(s)
Diff to previous 562
 r772@llin:  dpavlin | 2006-07-02 22:14:37 +0200
 rough implementation of marc_leader (not tested enough)


Revision 562
Modified Sun Jul 2 16:14:41 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 13107 byte(s)
Diff to previous 554
added marc_compose to manually specify subfield order in MARC and
split_rec_on to split a single field into parts based on a regex


Revision 554
Modified Sat Jul 1 10:19:29 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 11273 byte(s)
Diff to previous 551
 r756@llin:  dpavlin | 2006-07-01 12:17:24 +0200
 pod improvements, added _debug


Revision 551
Modified Fri Jun 30 20:43:09 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 10822 byte(s)
Diff to previous 550
 r750@llin:  dpavlin | 2006-06-30 22:34:44 +0200
 check if marc_record has values


Revision 550
Modified Fri Jun 30 18:48:33 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 10735 byte(s)
Diff to previous 548
 r748@llin:  dpavlin | 2006-06-30 20:48:29 +0200
 re-implement magic again (so that it actually works consistently in all cases).
 Depend on Data::Dump to enable nice output.


Revision 548
Modified Thu Jun 29 23:29:02 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 8870 byte(s)
Diff to previous 547
 r744@llin:  dpavlin | 2006-06-30 01:31:00 +0200
 don't chew indicators with 0 value, removed debugging warning


Revision 547
Modified Thu Jun 29 23:19:26 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 8948 byte(s)
Diff to previous 544
 r742@llin:  dpavlin | 2006-06-30 01:21:24 +0200
 added marc_repetable_subfield and marc_indicators, renamed marc21 to marc [2.23]


Revision 544
Modified Thu Jun 29 21:52:51 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 7737 byte(s)
Diff to previous 543
 r736@llin:  dpavlin | 2006-06-29 23:54:24 +0200
 oh, another bit of magic missing...


Revision 543
Modified Thu Jun 29 21:19:08 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 7678 byte(s)
Diff to previous 542
 r732@llin:  dpavlin | 2006-06-29 23:20:46 +0200
 document magic (that is how WebPAC detects repeatable fields) and fix it to actually work :-)


Revision 542
Modified Thu Jun 29 21:18:59 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 7406 byte(s)
Diff to previous 541
 r731@llin:  dpavlin | 2006-06-29 23:02:08 +0200
 implement magic to create fields and repeatable fields (which might be broken for some cases).


Revision 541
Modified Thu Jun 29 21:18:50 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 6895 byte(s)
Diff to previous 540
 r730@llin:  dpavlin | 2006-06-29 21:33:48 +0200
 use MARC::Record 2.0 to support utf-8 encoding in MARC
 http://marcpm.sourceforge.net/


Revision 540
Modified Thu Jun 29 15:29:41 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 6608 byte(s)
Diff to previous 538
 r726@llin:  dpavlin | 2006-06-29 17:31:13 +0200
 add marc21 to normalize and create MARC file from those data [2.22]


Revision 538
Modified Thu Jun 29 15:29:19 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 5692 byte(s)
Diff to previous 536
 r722@llin:  dpavlin | 2006-06-26 21:29:56 +0200
 prefix private functions with _


Revision 536
Modified Mon Jun 26 16:39:51 2006 UTC (17 years, 9 months ago) by dpavlin
File length: 5409 byte(s)
Diff to previous 436
 r719@llin:  dpavlin | 2006-06-26 18:40:57 +0200
 big refactoring: deprecate and remove all normalisation formats except .pl sets (but
 old code is still available in WebPAC::Lookup::Normalize because lookups use it) [2.20]


Revision 436
Modified Sun Apr 30 12:17:19 2006 UTC (17 years, 11 months ago) by dpavlin
File length: 18199 byte(s)
Diff to previous 433
 r531@llin:  dpavlin | 2006-04-30 14:18:00 +0200
 fix warning on undef vars


Revision 433
Modified Mon Apr 17 16:01:12 2006 UTC (17 years, 11 months ago) by dpavlin
File length: 18168 byte(s)
Diff to previous 397
 r524@llin:  dpavlin | 2006-04-17 18:01:04 +0200
 added all_tags() to get sorted list of all tags in input xml


Revision 397
Modified Wed Feb 15 15:54:12 2006 UTC (18 years, 1 month ago) by dpavlin
File length: 17757 byte(s)
Diff to previous 375
 r458@llin:  dpavlin | 2006-02-15 17:01:53 +0100
 fix warnings


Revision 375
Modified Sun Jan 8 22:21:24 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 17693 byte(s)
Diff to previous 373
 r417@llin:  dpavlin | 2006-01-08 23:21:35 +0100
 fixed another corner-case


Revision 373
Modified Sun Jan 8 22:09:33 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 17678 byte(s)
Diff to previous 372
 r414@llin:  dpavlin | 2006-01-08 23:09:49 +0100
 and finally fix for all weird cases (I hope) [2.10]


Revision 372
Modified Sun Jan 8 21:50:34 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 17619 byte(s)
Diff to previous 371
 r412@llin:  dpavlin | 2006-01-08 22:50:49 +0100
 more refactoring: joined paste_to_arr and fill_in_to_arr to _rec_to_arr


Revision 371
Modified Sun Jan 8 21:16:27 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 17970 byte(s)
Diff to previous 368
 r409@llin:  dpavlin | 2006-01-08 22:16:39 +0100
 collect record sizes


Revision 368
Modified Sun Jan 8 20:32:06 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 17529 byte(s)
Diff to previous 364
 r403@llin:  dpavlin | 2006-01-08 21:31:43 +0100
 refactor and better document get_data


Revision 364
Modified Sun Jan 8 20:27:11 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 17136 byte(s)
Diff to previous 346
 r393@llin:  dpavlin | 2006-01-08 20:50:40 +0100
 better logging and minor fix to fill_arr


Revision 346
Modified Sat Jan 7 03:28:10 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 16982 byte(s)
Diff to previous 344
fixed warning


Revision 344
Modified Sat Jan 7 02:05:55 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 16958 byte(s)
Diff to previous 340
 r356@llin:  dpavlin | 2006-01-07 01:05:14 +0100
 fix failing test


Revision 340
Modified Mon Jan 2 10:58:26 2006 UTC (18 years, 2 months ago) by dpavlin
File length: 16945 byte(s)
Diff to previous 333
 r349@llin:  dpavlin | 2006-01-02 12:02:07 +0100
 fixed s999 fields


Revision 333
Modified Sat Dec 31 13:42:11 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16778 byte(s)
Diff to previous 317
try to fix infinite loop (not working)


Revision 317
Modified Fri Dec 23 21:37:05 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16678 byte(s)
Diff to previous 312
 r12234@llin:  dpavlin | 2005-12-23 23:38:41 +0100
 bug fix to skip delimiter before first occurrence of field in format


Revision 312
Modified Tue Dec 20 23:31:37 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16447 byte(s)
Diff to previous 295
 r343@athlon:  dpavlin | 2005-12-21 00:32:50 +0100
 fixed error output


Revision 295
Modified Mon Dec 19 15:34:47 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16454 byte(s)
Diff to previous 268
 r11795@llin:  dpavlin | 2005-12-19 16:35:30 +0100
 fix regex filter, moved development version to real one [2.07]


Revision 268
Modified Fri Dec 16 21:09:42 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16452 byte(s)
Diff to previous 261
 r11742@llin:  dpavlin | 2005-12-17 01:26:41 +0100
 cleanup


Revision 261
Modified Fri Dec 16 16:00:18 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16517 byte(s)
Diff to previous 260
 r11729@llin:  dpavlin | 2005-12-16 21:00:26 +0100
 warn about non-defined filters just once


Revision 260
Modified Fri Dec 16 14:40:55 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 16436 byte(s)
Diff to previous 253
 r11727@llin:  dpavlin | 2005-12-16 19:41:08 +0100
 added filter{regex(s/foo/bar/)} [2.00_5]


Revision 253
Modified Thu Dec 15 17:01:10 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 15631 byte(s)
Diff to previous 252
 r11712@llin:  dpavlin | 2005-12-15 21:01:03 +0100
 lookups now work [2.00_3]


Revision 252
Modified Thu Dec 15 17:01:04 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 15631 byte(s)
Diff to previous 219
 r11711@llin:  dpavlin | 2005-12-15 20:02:16 +0100
 various tweaks to make lookups work


Revision 219
Modified Mon Dec 5 17:48:08 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 15470 byte(s)
Diff to previous 217
 r11541@llin:  dpavlin | 2005-12-05 16:47:44 +0100
 added prefix [0.04]


Revision 217
Modified Mon Dec 5 17:47:51 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 15189 byte(s)
Diff to previous 209
 r11536@llin:  dpavlin | 2005-12-05 15:29:47 +0100
 change to load_ds and save_ds, which now accept ONLY a hash (and an optional
 database name if not specified when calling new WebPAC::Store)


Revision 209
Modified Mon Dec 5 17:46:57 2005 UTC (18 years, 3 months ago) by dpavlin
File length: 15183 byte(s)
Diff to previous 125
 r11518@llin:  dpavlin | 2005-12-04 19:43:29 +0100
 renamed WebPAC::DB to WebPAC::Store


Revision 125
Modified Thu Nov 24 11:47:15 2005 UTC (18 years, 4 months ago) by dpavlin
File length: 15176 byte(s)
Diff to previous 74
 r9089@llin:  dpavlin | 2005-11-24 12:47:02 +0100
 fixed for new WebPAC::DB 0.02


Revision 74
Modified Sun Nov 20 20:13:39 2005 UTC (18 years, 4 months ago) by dpavlin
File length: 15659 byte(s)
Diff to previous 70
 r8988@llin:  dpavlin | 2005-11-20 20:46:12 +0100
 added real implementation for WebPAC::Output::Estraier along with run.pl
 script which runs test indexing (which will at some point move to
 WebPAC::Simple or something like that)


Revision 70
Modified Sat Nov 19 23:48:24 2005 UTC (18 years, 4 months ago) by dpavlin
File length: 15658 byte(s)
Diff to previous 64
 r8980@llin:  dpavlin | 2005-11-20 00:49:22 +0100
 implement data_structure that returns HASH and not ARRAY.
 
 A little explanation of the rationale:
 
 Array was needed back in WebPAC v1 because the order of tags in import_xml was
 important. However, since we no longer depend on the order of tags in
 input/*.xml, a hash is a much better choice.


Revision 64
Modified Tue Nov 15 16:56:44 2005 UTC (18 years, 4 months ago) by dpavlin
File length: 15574 byte(s)
Diff to previous 39
 r8894@llin:  dpavlin | 2005-11-15 17:56:56 +0100
 fixed WebPAC::Normalize::get_data to work when called with a subfield which
 doesn't exist, added tests


Revision 39
Modified Sat Nov 12 21:31:47 2005 UTC (18 years, 4 months ago) by dpavlin
File length: 15540 byte(s)
Diff to previous 38
check for current_filename and die if needed (needs more work)


Revision 38
Modified Sat Nov 12 21:21:50 2005 UTC (18 years, 4 months ago) by dpavlin
File length: 15397 byte(s)
Diff to previous 31
added ForceContent so that tags without attributes work, added strict checking


Revision 31
Modified Sun Jul 24 15:03:11 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 15304 byte(s)
Diff to previous 29
re-worked logging, added no_log option to disable logging


Revision 29
Modified Sun Jul 24 11:17:44 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 14636 byte(s)
Diff to previous 22
some logging improvements and sample configuration file


Revision 22
Modified Sun Jul 17 22:48:25 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 14443 byte(s)
Diff to previous 18
beginning of unit testing and various fixes


Revision 18
Modified Sun Jul 17 14:53:37 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 14441 byte(s)
Diff to previous 15
first cut into WebPAC::DB


Revision 15
Modified Sun Jul 17 10:42:23 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 16320 byte(s)
Diff to previous 14
WebPAC::Common cleanup, most code moved to WebPAC::Normalize. Added
documentation about the order of data munging when normalising data.


Revision 14
Modified Sun Jul 17 00:04:25 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 11728 byte(s)
Diff to previous 13
small fixes


Revision 13
Modified Sat Jul 16 23:56:14 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 11714 byte(s)
Diff to previous 10
data_source seems to work


Revision 10
Added Sat Jul 16 20:35:30 2005 UTC (18 years, 8 months ago) by dpavlin
File length: 643 byte(s)
ISIS input is finished, low_mem option has code (and not only documentation :-)

