Welcome to Bryan's Home Page for MARC-related Perl code


My name is Bryan Baldus. I am a cataloger at Quality Books, Inc., in Oregon, Illinois.

This page has been set up, initially, to distribute a number of Perl scripts (and modules) I have written to deal with MARC21/USMARC records.

Please see the manifest.htm and readme.htm for more information, along with the modules and scripts themselves.

Perl code files on the site end in .txt to facilitate downloading. After downloading, change the extension to .pm for BBMARC, Lintadditions, Errorchecks, and CodeData (MARC::Lint::CodeData), and to .pl for the others.
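The renaming rule above can be sketched in a few lines of Perl. This is only an illustration: local_name() is a hypothetical helper, and the assumption that a module is posted as, for example, Errorchecks.txt is mine; check the actual file names in manifest.htm.

```perl
use strict;
use warnings;

# Modules named above get .pm; everything else is treated as a script (.pl).
my %is_module = map { $_ => 1 } qw(BBMARC Lintadditions Errorchecks CodeData);

# Hypothetical helper: map a downloaded *.txt name to the name to save it under.
sub local_name {
    my ($downloaded) = @_;
    (my $stem = $downloaded) =~ s/\.txt\z//i;      # drop the download suffix
    return $stem if $stem =~ /\.(?:pl|pm)\z/i;     # e.g. splitmarcfile.pl.txt
    return $is_module{$stem} ? "$stem.pm" : "$stem.pl";
}

print local_name($_), "\n"
    for qw(Errorchecks.txt 008langblanktozxx.txt splitmarcfile.pl.txt);
```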

The modules mentioned above are based on, or are extensions to, the MARC::Record distribution, and are named MARC::[module name] (with CodeData being in a MARC::Lint directory).
They are referred to on this site as either *.pm or MARC::*.

The modules contain a number of known issues/to-do lists, and some checks are specific to Quality Books Inc.'s records.


Site arrangement:

bryanmodules

fullrecscripts

cleanupscripts

inprocess

prevversions

Each directory's contents are described in manifest.htm


Changes:

(June 23, 2013)

Module updates:

Errorchecks.pm:

Version 1.17: Updated Oct. 8, 2012 to June 22, 2013. Released June 23, 2013.

(Aug. 6, 2012)

Module updates:

Errorchecks.pm:

Version 1.16: Updated May 16-Nov. 14, 2011. Released July 7, 2012.

Version 1.15: Updated June 24-August 16, 2009. Released internally.

Lintadditions.pm:

Version 1.15: Updated May 21, 2012. Released Aug. 6, 2012.

Version 1.14: Updated July 6, 2009. Released 2009.

MARC::Global_Replace.pm:

Version 0.07--Updated May 8, 2010. Released Aug. 2, 2012

(July 28, 2009)

New scripts (fullrecscripts/Cleanup_full_recs):

008langblanktozxx.txt: Converts 008/35-37 from 3 blank spaces to zxx.

440to490-830.txt: Converts 440 to 490-830 pairs.
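A minimal sketch of the two conversions described above, written against plain Perl data so it runs without MARC::Record installed. It is not the posted scripts. The function names and the hash layout for a field are assumptions made for illustration, and a real 440 conversion also has to decide about punctuation and about subfields that 490 does not define (here only $a, $v, and $x are carried into the 490).

```perl
use strict;
use warnings;

# 008/35-37 holds the language code; three blanks become 'zxx'.
sub fix_008_language {
    my ($f008) = @_;
    return $f008 unless length($f008) >= 38;       # too short to have 35-37
    substr($f008, 35, 3) = 'zxx' if substr($f008, 35, 3) eq '   ';
    return $f008;
}

# A field is assumed to be a hash:
#   { tag, ind1, ind2, subfields => [ [code, value], ... ] }
# Returns a traced 490 (first indicator 1) and an 830 that keeps the
# 440's nonfiling indicator.
sub split_440 {
    my ($f440) = @_;
    my @all  = map  { [ @$_ ] } @{ $f440->{subfields} };    # copy the pairs
    my @s490 = grep { $_->[0] =~ /^[avx]\z/ } map { [ @$_ ] } @all;
    my %f490 = (tag => '490', ind1 => '1', ind2 => ' ', subfields => \@s490);
    my %f830 = (tag => '830', ind1 => ' ', ind2 => $f440->{ind2},
                subfields => \@all);
    return (\%f490, \%f830);
}

# Illustrative 40-character 008 with a blank language code.
my $f008 = '040720s2004    ilu' . (' ' x 17) . '   ' . ' d';
print substr(fix_008_language($f008), 35, 3), "\n";

my ($f490, $f830) = split_440({
    tag => '440', ind1 => ' ', ind2 => '4',
    subfields => [ [ a => 'The example series ;' ], [ v => 'v. 1' ] ],
});
print "$f490->{tag} $f830->{tag} (830 second indicator: $f830->{ind2})\n";
```

With MARC::Record, the 008 string would come from the control field's data() and be written back with update().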

(May 25, 2008)

Module updates:

Errorchecks.pm:

Version 1.14: Updated Oct. 21, 2007, Jan. 21, 2008, May 20, 2008. Released May 25, 2008.

(Oct. 21, 2007)

Module updates:

Lintadditions.pm:

Version 1.13: Updated Oct. 21, 2007. Released Oct. 21, 2007.

(Oct. 3, 2007)

Module updates:

Errorchecks.pm:

Version 1.13: Updated Aug. 26, 2007. Released Oct. 3, 2007.

MARC::Lint::CodeData.pm:

Versions 1.15 to 1.18: Updated Feb. 28, 2007-Aug. 14, 2007.

Lintadditions.pm:

Version 1.12: Updated Mar. 1-Aug 26, 2007. Released Oct. 3, 2007.

(Feb. 25, 2007)

Module updates:

Errorchecks.pm:

Version 1.12: Updated July 5-Nov. 17, 2006. Released Feb. 25, 2007.

MARC::Lint::CodeData.pm:

Versions 1.09 to 1.14: Updated June 26, 2006-Jan. 8, 2007.

Lintadditions.pm:

Version 1.11: Updated June 12, 2006-Feb. 7, 2007. Released Feb. 25, 2007.

Script updates:

LCSHchangesparserpl110.txt

Version 1.10: Updated Dec. 7, 2006

Version 1.09: Updated Sept. 8, 2006

Version 1.08: Updated Sept. 4, 2006

New Module in process:

MARC::Lint::Lint_Authority.pm:

Version 0.01--Feb. 21, 2007. Posted Feb. 25, 2007

(June 19, 2006)

Module updates:

MARC::Global_Replace.pm:

Version 0.06--Updated June 18, 2006. Released June 19, 2006.

(June 6, 2006)

Module updates:

Errorchecks.pm:

Version 1.11: Updated June 5, 2006. Released June 6, 2006.

MARC::Lint::CodeData.pm:

(Most current version is available through CVS on SourceForge with MARC::Lint.)

Lintadditions.pm:

Version 1.10: Updated Oct. 17, 2005-May 18, 2006. Released June 6, 2006.

MARC::Global_Replace.pm:

Version 0.05--Updated May 1, 2006. Released June 6, 2006.

Version 0.04--Updated Feb. 13, 2006. Unreleased

Script updates:

LCSHchangesparserpl107.txt

Version 1.07: Updated May 8, 2006

Version 1.06: Updated Oct. 5, 2005

Version 1.05: Updated Aug. 25, 2005

parsedeathdateslists.pl.txt

No version. Very preliminary test code

(Jan. 2, 2006)

Module updates:

Errorchecks.pm:

Version 1.10: Updated Sept. 5-Jan. 2, 2006. Released Jan. 2, 2006.

MARC::Lint::CodeData.pm:

(Most current version is available through CVS on SourceForge with MARC::Lint.)

Version 1.04: Updated Oct. 13, 2005.

Version 1.03: Updated Aug. 31, 2005.

(Aug. 14, 2005)

Module updates:

Errorchecks.pm:

Version 1.09: Updated July 18, 2005. Released July 19, 2005 (Aug. 14, 2005 to CPAN).

Module in process:

MARC::File::MARCMaker.pm: (zipped and uncompressed as /marc-marcmaker/)

Version 0.03: Updated Aug. 2, 2005. Released Aug. 14, 2005.

MARC::Global_Replace.pm:

Version 0.03--Updated Aug. 3, 2005. Posted Aug. 14, 2005

Script updates:

LCSHchangesparserpl104.txt

Version 1.04: Updated July 28-Aug. 4, 2005


Current planned and in-progress tasks:

1. Clean LCSH weekly lists to identify cancelled->replaced headings. Preliminary code for this is in the inprocess directory.

2. Use the cancel/replace headings from the cleaned LCSH weekly lists to do global subject heading replacement. I am working on a new module, MARC::Global_Replace, to do this. It is at a very early stage and has been posted to the inprocess directory. It includes a script, global_replace_ident.pl, that identifies changed headings given 'allhash.txt' (generated by the LCSH changes parser, v. 1.07+) and a file of MARC records. The script and module have undergone only minimal testing, but seem to do OK at reporting possible changed headings in MARC records.

3. Clean up some of the templatified (full record/cleanup) scripts, adding type/creator information when using MacOS, for example, along with documentation.

4. Write additional lint checks, including (these will go into MARC::Errorchecks):

Item 4 is shorter now, as I added a number of check_XXX functions in MARC::Lintadditions.pm and MARC::Errorchecks.pm.

5. Work on integrating MARC::Lintadditions functionality into MARC::Lint. This has begun, with check_041, check_043, and check_245. The main hold-up is getting tests written for each check_xxx method.

6. Write tests for MARC::Errorchecks, MARC::Lintadditions, and MARC::BBMARC.

7. Work on creating MARC::File::MARCMaker. This is a rewrite of the MARCMaker-related code in MARC.pm, to allow MARC::Record to work with LC's MARCMaker format files (http://www.loc.gov/marc/makrbrkr.html). This has been uploaded to SourceForge CVS (marcpm, alongside MARC::Record, MARC::Lint, etc.)

8. Move file handling and other subroutines from internal MARC::QBI::Misc module to public MARC::BBMARC module. The functions in MARC::QBI::Misc are mainly for non-command line users, to put up prompts and reduce unwanted overwriting of files.

9. Work on creating MARC::Lint::Lint_Authority.pm. This will be a module essentially copying MARC::Lint, but with a data section and methods for validating MARC format for Authority data rather than Bibliographic. An initial version of this module appears in the inprocess directory.

With the added checks, the lint checker runs a bit slower, so I welcome any suggestions for improved efficiency.

I welcome any help with any of the above, especially number 7.


(July 19, 2005)

Module updates:

Errorchecks.pm:

Version 1.09: Updated July 18, 2005. Released July 19, 2005.

(July 16, 2005)

Module updates:

Lintadditions.pm:

Version 1.09: Updated Mar. 31-Apr., 2005. Released July 16, 2005.

Errorchecks.pm:

Version 1.08: Updated Feb. 15-July 11, 2005. Released July 16, 2005.

MARC::Lint::CodeData.pm:

Version 1.02: Updated June 21-July 12, 2005. Released (to CPAN) with new version of MARC::Errorchecks. Also posted to CVS on SourceForge with MARC::Lint.

Module in process:

MARC::File::MARCMaker.pm: (zipped and uncompressed as /marc-marcmaker/)

Version 0.02: Updated July 12-13, 2005. Released July 16, 2005.

Added and changed scripts:

Updated LCSH Changes Parser script, LCSHchangesparserpl103.txt:

(Mar. 7, 2005)

New Module in process:

MARC::File::MARCMaker.pm: (zipped and uncompressed as /marc-marcmaker/)

Version 0.01: Initial version, Nov. 21, 2004-Mar. 7, 2005. Released Mar. 7, 2005.

(Feb. 27, 2005)

Module updates:

Lintadditions.pm:

Version 1.08: Updated Feb. 21-27, 2005. Released Feb. 27, 2005.

(Feb. 13, 2005)

Module updates:

Errorchecks.pm:

Version 1.07: Updated Dec. 11-Feb. 2005. Released Feb. 13, 2005.

Lintadditions.pm:

Version 1.07: Updated Jan. 2-Feb. 1, 2005. Released Feb. 13, 2005.

MARC::Lint::CodeData.pm:

Version 1.01: Updated Jan. 5-Feb. 10, 2005. Released (to CPAN) Feb. 13, 2005 (with new version of MARC::Errorchecks).

Added and changed scripts:

See the manifest.htm page for more information about these. All are in fullrecscripts.

(Dec. 5, 2004)

New module:

MARC::Lint::CodeData.pm:

Version 1.00 (original version): First release, Dec. 5, 2004. Uploaded to SourceForge CVS, Jan. 3, 2005.

Module updates:

Errorchecks.pm:

Version 1.04: Updated Nov. 4-Dec. 4, 2004. Released Dec. 5, 2004.

Lintadditions.pm:

Version 1.06: Updated Nov. 21-24, 2004. Released Dec. 5, 2004.

BBMARC.pm:

Version 1.08: Updated Oct 31, 2004. Released Dec. 5, 2004.

Added and changed scripts:

See the manifest.htm page for more information about these. All but the last are in fullrecscripts. The last is in inprocess.

(Oct. 17, 2004)

Module updates:

Errorchecks.pm:

Version 1.03: Updated Aug. 30-Oct. 16, 2004. Released Oct. 17. First CPAN version.

Lintadditions.pm:

Version 1.05: Updated Aug. 30-Oct. 16, 2004. Released Oct. 17, 2004.

BBMARC.pm:

Version 1.07: Updated Aug. 30-Oct. 16, 2004. Released Oct. 16, 2004.

Added and changed scripts:

Reorganized Full Record Script directory as seen below. Note: Many of the scripts have not been reviewed lately, and so may not work with the current versions of my modules. This is particularly true of the items in Tests for Errorchecks and Tests for Lintadditions.

Cleanup full recs

Code list cleanup

Counting

Extraction

findmultiplefields.txt
hasbeenupdated.txt

Linting

mermarcfiles.txt
outputchangestogether.txt
printrecordasformatted.txt
rawanddecodedscan.txt
splitmarcfile.pl.txt

Tests for Errorchecks

Tests for Lintadditions

(Aug. 22, 2004):

Module updates:

Errorchecks.pm:

Version 1.02: Updated Aug. 11-22, 2004. Released Aug. 22, 2004.

Lintadditions.pm:

Version 1.04: Updated Aug. 10-22, 2004. Released Aug. 22, 2004.

BBMARC.pm:

Version 1.06: Updated Aug. 10-22, 2004. Released Aug. 15, 2004.

Planned (next release):

Added and changed scripts:

Updated LCSH Changes Parser script, LCSHchangesparser2.txt:

(Aug. 15, 2004):

Module updates as described above (prereleased).

Added and changed scripts:

Updated LCSH Changes Parser script, LCSHchangesparserpl2.txt:

(Aug. 8, 2004):

Module updates:

Errorchecks.pm:

Version 1.01: Updated July 20-Aug. 7, 2004. Released Aug. 8, 2004.

Lintadditions.pm:

Version 1.03: Updated July 20-Aug. 7, 2004. Released Aug. 8, 2004.

Added and changed scripts:

Most of these are test scripts created while writing the subroutines listed above.
The subroutines in the modules may have code not in the scripts, so it is best to use the module rather than the script for those checks (the last 3 full record scripts).

(July 17, 2004):

Module updates:

Errorchecks.pm:

Version 1.00 (update to 0.95): First release, July 17, 2004.

Lintadditions.pm:

Version 1.02: Updated July 2-16, 2004. Released July 17, 2004.

BBMARC.pm:

Version 1.05: Updated July 3, 2004, released July 17, 2004

Added and changed scripts:

(June 22, 2004):

New module:

Errorchecks.pm (MARC::Errorchecks): Collection of error checking subroutines similar to MARC::Lint and MARC::Lintadditions. This is currently version 0.95 due to problems with the subroutine calls to check_003 and check_010. Warnings indicate use of uninitialized Array references.

Associated script for using MARC::Errorchecks: lintallchecks.txt. This can replace most of the error checking scripts, along with the checking portion of the cleanup full record scripts. It should also work without changes as Errorchecks.pm is updated with new subroutines.

(June 20, 2004):

Two new scripts:

findmultispacesafter010.txt: Looks for multiple spaces in a field, for fields after 010. Could be improved by accounting for other fields where multiple spaces would be acceptable (such as 035).

010cleanupscript.txt: For 010 fields with only an 8 or 10 digit LCCN in subfield 'a', makes sure proper spacing precedes and follows the number and replaces that subfield in the record. Reports any problems with cleaning the subfield.
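The two checks above can be sketched in core Perl as follows. This is a simplified illustration, not the posted scripts. The function names are hypothetical, the skip list implements the improvement suggested for 035, and the spacing rules assume the standard 12-character 010 $a layout (three blanks, 8 digits, one trailing blank; or two blanks and 10 digits).

```perl
use strict;
use warnings;

# Tags where runs of spaces are acceptable; 035 is the example given above.
my %spaces_ok = map { $_ => 1 } qw(035);

# Takes [tag, text] pairs and returns the tags after 010 whose text
# contains two or more consecutive spaces.
sub find_multispaces {
    my (@fields) = @_;
    return map  { $_->[0] }
           grep {    $_->[0] =~ /^\d{3}\z/
                  && $_->[0] > 10
                  && !$spaces_ok{ $_->[0] }
                  && $_->[1] =~ / {2,}/ }
           @fields;
}

# Returns the 12-character form of a digits-only LCCN, or nothing when
# the value has a prefix or an unexpected length and needs human review.
sub normalize_lccn {
    my ($raw) = @_;
    (my $digits = $raw) =~ s/\s+//g;
    return "   $digits " if $digits =~ /^\d{8}\z/;    # 3 blanks, 8 digits, 1 blank
    return "  $digits"   if $digits =~ /^\d{10}\z/;   # 2 blanks, 10 digits
    return;
}

my @flagged = find_multispaces(
    [ '010', '   85153773 ' ],
    [ '035', '(OCoLC)  1' ],
    [ '245', 'A title  with a gap' ],
);
print "Multiple spaces in: @flagged\n";
printf "[%s]\n", normalize_lccn('85153773');
```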

Changes to my main modules:

Lintadditions.pm:

Version 1.01: Updated June 17, 2004. Released June 20, 2004.

BBMARC.pm:

Version 1.04: Updated June 16, 2004, released June 20, 2004

Version 1.03: Updated June 10, not released.

(May 31, 2004):

Reorganized site arrangement. I removed separate directories for Mac, Win, and Unix, consolidating the files into the following directories:

cleanupscripts
fullrecscripts
inprocess
prevversions
bryanmodules

Each directory's contents are described in manifest.htm

The new inprocess directory contains alpha-or-so stage code, or code I may be having trouble with.

Currently this contains an LCSH Weekly Lists parser, which condenses a folder/directory of files into a file of tag-old-new headings, separated by tabs. It also compiles a file of all changed headings in the files in the input directory.
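As an illustration of how the tab-separated tag-old-new file described above might be consumed, here is a minimal reader. The column order is taken from that description; load_changes() is a hypothetical name, and the sample headings are placeholders rather than real LCSH changes.

```perl
use strict;
use warnings;

# Parse "tag<TAB>old heading<TAB>new heading" lines into
# { tag => { old heading => new heading } }.
sub load_changes {
    my ($fh) = @_;
    my %changes;
    while (my $line = <$fh>) {
        chomp $line;
        my ($tag, $old, $new) = split /\t/, $line, 3;
        next unless defined $new && length $new;    # skip incomplete lines
        $changes{$tag}{$old} = $new;
    }
    return \%changes;
}

# In-memory sample so the sketch runs without any input file.
my $sample = "150\tOld heading (placeholder)\tNew heading (placeholder)\n"
           . "incomplete line without tabs\n";
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $changes = load_changes($fh);
close $fh;

print $changes->{'150'}{'Old heading (placeholder)'}, "\n";
```

Keying the lookup by tag first keeps headings with the same text but different tags from overwriting each other.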

Updated MARC::BBMARC:

Added new module, MARC::Lintadditions.pm. This is an extension to MARC::Lint.pm, with added check_XXX functions (see the module for details).

Added script to go with Lintadditions.pm, lintwithadditions.pl, based on Example V3 of the MARC::Doc::Tutorial.

Added cleantrailingspaces.pl, which removes the space from the end of each field > 010. I have not yet dealt with the 010 trailing spaces cleanup.

Updated fieldextraction.pl. This should fix the problem created when I updated MARC::BBMARC::getthreedigits() to allow periods (so 6.. will retrieve all 6xx fields).
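The period notation mentioned above can be sketched like this. It is a stand-alone illustration, not the getthreedigits() code: both function names are hypothetical, and treating each period as a one-character wildcard is an assumption based on the "6.." example.

```perl
use strict;
use warnings;

# Accept exactly three characters, each a digit or a period.
sub is_tag_pattern {
    my ($pattern) = @_;
    return (defined($pattern) && $pattern =~ /^[0-9.]{3}\z/) ? 1 : 0;
}

# Return the tags that match the pattern; each period matches any one character.
sub select_tags {
    my ($pattern, @tags) = @_;
    die "Bad tag pattern '$pattern'\n" unless is_tag_pattern($pattern);
    my $re = qr/^$pattern\z/;
    return grep { $_ =~ $re } @tags;
}

print join(',', select_tags('6..', qw(245 600 650 651 700))), "\n";
```

MARC::Record's field() method accepts the same kind of tag specification, so a validated pattern can be passed straight to it.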

(May 1, 2004):

Updated BBMARC with a new function, validate008, along with other changes, as listed in BBMARC.pm, including version number (not fully implemented).

Moved BBMARC to a separate directory, MARC-BBMARC-[version number], which also includes the two main subroutines as separate files (validate007 and validate008).

Added 008checker.pl to go along with validate008.


About the author:

I am a cataloger/librarian with very limited programming experience. I began teaching myself Perl, using Coriolis' Perl Black Book, along with online documentation and books, around November, 2003. The extent of my knowledge is limited to knowing enough to start using MARC::Record's modules.


Copyright (c) 2003-2013
Bryan Baldus
eijabb@cpan.org

Last updated June 23, 2013.