
De-duping issues


1. What are we going to match on?
OCLC number
LCCN
ISBN
author/title/date key?
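
One way the candidate match points above could be turned into comparable keys is sketched below. This is only a rough illustration, not a settled design: the flat record layout (a dict with "oclc", "lccn", "isbn", "author", "title", "date" entries), the function names, and the shape of the author/title/date key (first author word, first 20 title characters, four-digit year) are all assumptions made for the example. The LCCN handling is deliberately simplistic and does not follow the full LC normalization rules.

```python
import re
import unicodedata


def normalize_oclc(raw):
    """Drop prefixes such as "(OCoLC)" or "ocm" and any leading zeros."""
    return re.sub(r"\D", "", raw).lstrip("0") or None


def normalize_isbn(raw):
    """Return a 13-digit ISBN so ISBN-10 and ISBN-13 forms of one number match."""
    m = re.match(r"[\d-]+[Xx]?", raw.strip())
    if not m:
        return None
    isbn = m.group().replace("-", "").upper()
    if len(isbn) == 13 and isbn.isdigit():
        return isbn
    if len(isbn) != 10 or not isbn[:9].isdigit():
        return None
    # ISBN-10 to ISBN-13: prefix 978, drop the old check digit, recompute it.
    core = "978" + isbn[:9]
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)


def _fold(text):
    """Lowercase, strip diacritics, and collapse punctuation to single spaces."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def atd_key(author, title, date):
    """Hypothetical fallback key: first author word | 20 title chars | year."""
    title_part = "".join(_fold(title).split())[:20]
    if not title_part:
        return None
    author_part = _fold(author).split(" ")[0]
    year = re.search(r"\d{4}", date)
    return "|".join([author_part, title_part, year.group() if year else ""])


def match_keys(rec):
    """List the keys a record offers, strongest identifier first."""
    candidates = [
        ("oclc", normalize_oclc(rec.get("oclc", ""))),
        ("lccn", "".join(rec.get("lccn", "").split()).lower() or None),
        ("isbn", normalize_isbn(rec.get("isbn", ""))),
        ("atd", atd_key(rec.get("author", ""), rec.get("title", ""),
                        rec.get("date", ""))),
    ]
    return [f"{name}:{value}" for name, value in candidates if value]
```

Two records would then be candidates for merging when they share any key, with the identifier-based keys presumably trusted more than the author/title/date key.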

2. What role does FRBR play in clustering versions, etc.?

3. Getting the most robust, full record as the final version.

4. Retaining information worth keeping -- MeSH headings, contents notes, etc.

5. Keeping the SIDs from all the institutions that have that record/work?

6. What non-bib data do we collect? E.g., sublibrary/collection codes, call numbers, URLs (do we keep multiples only if they're different?)
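
To make points 3 through 6 concrete, here is a minimal sketch of merging a cluster of records already judged to be duplicates. Everything about it is an assumption for discussion: the record layout (a "sid", a "fields" dict of MARC tag to list of values, and a "holdings" list of non-bib data), the idea that "fullest" means "most populated fields", and the choice of 505 (contents notes) and 650 (subject headings, where MeSH terms appear) as the tags whose values are kept from every record.

```python
# Tags whose values are pooled from every record in the cluster (an assumption).
KEEP_ALL = ("505", "650")


def merge_cluster(records):
    """Merge duplicate records into one, keeping SIDs and distinct holdings."""
    # Hypothetical fullness measure: the record with the most populated fields.
    base = max(records, key=lambda r: sum(1 for v in r["fields"].values() if v))
    merged = {
        "fields": {tag: list(values) for tag, values in base["fields"].items()},
        "sids": [],
        "holdings": [],
    }
    for rec in records:
        # Point 5: remember every institution that contributed this work.
        if rec["sid"] not in merged["sids"]:
            merged["sids"].append(rec["sid"])
        # Point 4: pool headings and notes, skipping exact repeats.
        for tag in KEEP_ALL:
            for value in rec["fields"].get(tag, []):
                if value not in merged["fields"].setdefault(tag, []):
                    merged["fields"][tag].append(value)
        # Point 6: keep multiple holdings entries only when they differ.
        for holding in rec.get("holdings", []):
            if holding not in merged["holdings"]:
                merged["holdings"].append(holding)
    return merged
```

Counting populated fields is a crude stand-in for "most robust"; an encoding-level or cataloging-source check might turn out to be a better test.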

7. Do we decompose the MARC record into some other format and, if so, what?

8. ADD MORE