This program takes publication information from library records, stored in OPACs and retrieved online or imported from export files, and plots that information on maps. A given location is marked with some kind of aggregate value, such as the number of chosen books that come from there or the earliest date of publication.
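The two aggregates mentioned above (a count per place and the earliest publication date) can be sketched roughly as follows. This is only an illustration in Python, not the program's own code; the record layout and the name `aggregate` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical input: (place of publication, year) pairs already extracted
# from library records. The real program's data model may differ.
records = [
    ("Cambridge, MA", 1962),
    ("Cambridge, MA", 1951),
    ("Boston, MA", 1899),
    ("Cambridge, UK", 1611),
]

def aggregate(records):
    """Return {place: {"count": n, "earliest": year}} for the given records."""
    summary = defaultdict(lambda: {"count": 0, "earliest": None})
    for place, year in records:
        entry = summary[place]
        entry["count"] += 1
        if entry["earliest"] is None or year < entry["earliest"]:
            entry["earliest"] = year
    return dict(summary)

# One map marker per place, labeled with either aggregate value.
markers = aggregate(records)
```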
Unlike OCLC WorldMap, which displays large amounts of mostly static data with authoritative geographical information at a coarse country level, this program tries to display moderate amounts of data arrived at dynamically through a query and to refine geographical positions down to the city level, starting from more free-form data.
The aim is to use as much bibliographic information as is available to determine as accurately as possible which place a record refers to. For instance, "Cambridge" in the human-readable record is ambiguous between Cambridge, Massachusetts and Cambridge, Cambridgeshire. But given an additional machine-readable record that says MA or UK, we can disambiguate. Lacking that, we can use the knowledge that the publisher is the Harvard University Press or the Cambridge University Press. In fact, given no more information than Harvard University Press, as might happen with Amazon data, we can make a good bet that it is from greater Boston. This is done with a user-extended rule-based system.
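The fallback order described above (place plus machine-readable code first, then publisher) might look something like this sketch. It is written in Python for brevity and is not the program's actual rule engine; the tables, the function `resolve_place`, and the substring matching are all assumptions for illustration. The codes `mau` and `enk` are the MARC country codes for Massachusetts and England.

```python
# Made-up rule tables for illustration only; the real rules are user-extended.
PLACE_BY_COUNTRY = {
    ("cambridge", "mau"): "Cambridge, Massachusetts, USA",
    ("cambridge", "enk"): "Cambridge, Cambridgeshire, UK",
}
PUBLISHER_RULES = [
    ("harvard university press", "Cambridge, Massachusetts, USA"),
    ("cambridge university press", "Cambridge, Cambridgeshire, UK"),
]

def resolve_place(place=None, country_code=None, publisher=None):
    """Return a resolved place name, or None if no rule applies."""
    # Strip the punctuation that cataloging rules leave around place names.
    key = place.strip(" ,:;[]").lower() if place else None
    # 1. Prefer the machine-readable country code when one is present.
    if key and country_code:
        hit = PLACE_BY_COUNTRY.get((key, country_code.strip().lower()))
        if hit:
            return hit
    # 2. Otherwise fall back on what is known about the publisher.
    if publisher:
        name = publisher.lower()
        for pattern, resolved in PUBLISHER_RULES:
            if pattern in name:
                return resolved
    # 3. Still ambiguous: leave it for a geocoder or a new rule.
    return None
```

A publisher-only record, as with Amazon data, would go through the second step: `resolve_place(publisher="Harvard University Press")`.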
The approach can also make up for cataloging anomalies.
Consider a 260$a field containing "Boston, New York" instead
of two fields separated by a semicolon. As it happens, there is a town named
Boston in New York State (ZIP 14025; pop. about 3K); straightforwardly
passing the field to a geocoder would find it. If
008/15-17 has mau, that candidate will fail
a validity check. Even without that, an exception can be added to avoid
this town.
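A minimal sketch of that validity check, in Python rather than the program's own code: geocoder candidates are filtered against the 008/15-17 code, and an exception list is used when no code is available. The candidate tuples, the two tables, and `pick_candidate` are hypothetical; `mau` and `nyu` are the MARC codes for Massachusetts and New York State.

```python
# Hypothetical lookup from MARC country code (008/15-17) to US state.
MARC_COUNTRY_TO_STATE = {"mau": "MA", "nyu": "NY"}

# Hypothetical exception list: candidates to avoid unless the record
# itself says otherwise.
EXCEPTIONS = {("Boston", "NY")}

def pick_candidate(candidates, marc_country=None):
    """Return the first (name, state) candidate that passes the checks."""
    expected = MARC_COUNTRY_TO_STATE.get(marc_country)
    for name, state in candidates:
        if expected is not None:
            # Validity check: the candidate must agree with 008/15-17.
            if state != expected:
                continue
        elif (name, state) in EXCEPTIONS:
            # No machine-readable code: fall back on the exception list.
            continue
        return (name, state)
    return None

# What a geocoder might return for "Boston, New York", best match first.
candidates = [("Boston", "NY"), ("Boston", "MA")]
```

With these inputs, both `pick_candidate(candidates, "mau")` and `pick_candidate(candidates)` skip the town in New York State, while a record that really is coded `nyu` still resolves to it.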
This began as a further home cataloging exercise. The idea was enhanced based on some of the recent "mashup" discussions on the LibraryThing blogs.
LibraryThing data obtained from Amazon does not have place of publication. I cleaned up my own data to add it; I suspect others would prefer to do so as well. Another possibility, mentioned above, is to add patterns that match the publisher name, together with the unknown place, to the mappings database. This would certainly work for the major publishers in New York City, like Random House under its various brands. A representative place could be chosen for ones like the Oxford University Press.
Client-side XSLT rendering does not quite work with Firefox. The problem is that the document object model presented to JavaScript still has parts of the original XML DOM tree in addition to the transformed HTML DOM tree. Different stylesheet settings can get it close to working, although they have problems with accented characters. But even those fail deep inside the Google Maps API.
There is no technical reason that mapping could not be to street level locations, rather than just city. More realistically, one might want to distinguish that Harvard University Press is 02138 and MIT Press is 02139. (Although ZIP codes can be problematic. I live in one that has parts in three different municipalities, which are in turn in three different counties.)
It is somewhat inconvenient to have to wave the mouse around to see the associated value information. But many label windows all stacked on top of one another are a mess, too. The attempt to have some slight color gradation for the values doesn't seem quite good enough, either. Perhaps something more garish really is needed.
One of the ideas that came up in the online discussions is displaying the oldest book for a number of locations. Doing this without any further qualification on the books yields too many records to process; many servers refuse to return result sets larger than 10,000. It should still be possible by asking the query server to sort the records ascending by date and then taking just the first record and ignoring the rest. This query would then be run once for each of a set of country codes or city matches. Unfortunately, I was unable to find an unsecured Z39.50 server that supports sorting.
Related to the above, I thought that a map of sources of incunabula
would be interesting. But I am unable to find a server that supports
a bib-1 relation attribute other than the default. I may not have
figured out the necessary other attributes properly. The alternative
is to have about fifty separate queries, each matching one year from 1450 to
1500. Or use an entirely different search; Yale tags incunabula with
a 690 local subject.
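The one-query-per-year workaround could be generated along these lines. The sketch assumes queries are written in Prefix Query Format (PQF), as used by Z39.50 toolkits such as YAZ, and that the server maps bib-1 use attribute 31 (Date-publication); whether this program or any particular server works that way is an assumption. The helper `year_queries` is hypothetical.

```python
def year_queries(first_year, last_year, use_attribute=31):
    """Build one PQF query per year, relying on the default (equal) relation."""
    return [
        f"@attr 1={use_attribute} {year}"
        for year in range(first_year, last_year + 1)
    ]

# The range is inclusive at both ends, so 1450 to 1500 gives 51 queries.
queries = year_queries(1450, 1500)
```

Each string would then be sent as a separate search and the result sets merged on the client side.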
The place group feature is somewhat less useful than it might be because the Z39.50 bib-1 use attributes do not have anything to match the 008/15-17 field, only the 260$a field. So one cannot count books by state, for instance.
There are issues with character sets. USMARC records should be in the MARC-8 character set, an extension of ANSEL. But they are sometimes in the ISO-8859-1 character set.
Getting a deep link from a record in a query result for later
display is library-specific. OCLC registers how to do this given an
ISBN. (Or without one, although I have found that the non-ISBN searches often
don't work.) In the case where multiple books from the same place /
publisher are aggregated into a count greater than one, the info
window should have a link to a search that combines the original query with
a place condition. I could not get this to work with any of the
libraries I used. LOC comes close, but there is no SRW
bath. field for place of publication, only publisher.
Rules for publisher-only records could be automated by recording the place as complete library records pass through the system.
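One way this could work, sketched in Python and not taken from the program: tally the places seen for each publisher in complete records, then suggest the most common one once it has been seen often enough. The class `PublisherPlaceLearner`, its methods, and the `min_count` threshold are all assumptions for illustration.

```python
from collections import Counter, defaultdict

class PublisherPlaceLearner:
    """Tally places seen per publisher and suggest the most frequent one."""

    def __init__(self):
        self._seen = defaultdict(Counter)

    def observe(self, publisher, place):
        # Only complete records (both fields present) teach us anything.
        if publisher and place:
            self._seen[publisher.strip().lower()][place.strip()] += 1

    def suggest(self, publisher, min_count=2):
        # Require a few sightings so a single odd record does not become a rule.
        counts = self._seen.get(publisher.strip().lower())
        if not counts:
            return None
        place, count = counts.most_common(1)[0]
        return place if count >= min_count else None

learner = PublisherPlaceLearner()
for publisher, place in [
    ("MIT Press", "Cambridge, Mass."),
    ("MIT Press", "Cambridge, Mass."),
    ("MIT Press", "London"),
]:
    learner.observe(publisher, place)

# Later, a publisher-only record can borrow the learned place.
suggested = learner.suggest("mit press")
```

A suggestion produced this way would probably still deserve review before being added to the mappings database, since publishers move and have multiple offices.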
All server queries and map generation can be done standalone from the command line with a minimum of requirements (pretty much just the Java Runtime). To refine the rules to deal with data anomalies or gaps in the external geocoders' knowledge, a simple browser interface runs against the server (which can be local). A good sense of how much (or how little) this interface does can be gathered from the associated help.
| Either one of: | Precompiled | Compile yourself |
Some shell / command prompt scripts are included to make running standalone a little easier. All they do is invoke the Java programs.
The .cmd files are for Windows (tested on XP) and the .sh files are for Unix (tested on Debian Linux).
D:\zlibmap>bin\createdb
D:\zlibmap>bin\simplequery simple mit title Aeneid
D:\zlibmap>bin\runquery aeneid
D:\zlibmap>bin\LibraryThing \temp\LibraryThing_TD.xls MMcM "vegetarianism,cookbook" results\veggie.xml
D:\zlibmap>bin\xslt -in results\aeneid.xml -out results\aeneid.html
This software is released as open source under the MIT License. You use it at your own risk. See license.txt for details.