        |
De-duping issues
1. What are we going to match on?
   - OCLC number
   - LCCN
   - ISBN
   - author/title/date key?
2. What role does FRBR play in clustering versions, etc.?
3. How do we get the fullest, most robust record as the final version?
4. Retaining information worth keeping -- MeSH headings, contents notes, etc.
5. Do we keep the SIDs from all the institutions that have that record/work?
6. What non-bib data do we collect? E.g., sublibrary/collection codes, call numbers, URLs (do we keep multiples only if they're different?)
7. Do we decompose the MARC record into some other format and, if so, what?
8. ADD MORE
|
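To make questions 1 and 5 concrete, here is a minimal Python sketch of one possible approach, not a decision. It assumes each record has already been reduced to a plain dict with hypothetical fields (sid, oclc, lccn, isbns, author, title, date); none of these names come from an existing system. Records are clustered when they share any normalized identifier, and the author/title/date key is used only as a fallback when a record has no identifier at all. The LCCN handling is a deliberate simplification of the Library of Congress normalization rules.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_isbn(raw):
    """Return the ISBN as 13 digits, or None if it does not look like an ISBN.

    The check digit is recomputed, not validated.
    """
    match = re.match(r"[0-9Xx-]+", raw.strip())  # ignore qualifiers such as "(pbk.)"
    if not match:
        return None
    chars = match.group(0).replace("-", "").upper()
    if len(chars) == 10:
        chars = "978" + chars[:9]  # drop the ISBN-10 check digit
    elif len(chars) == 13:
        chars = chars[:12]
    else:
        return None
    if not chars.isdigit():
        return None
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(chars))
    return chars + str((10 - total % 10) % 10)


def normalize_text(value):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(re.sub(r"[^a-z0-9]+", " ", stripped.lower()).split())


def author_title_date_key(author, title, date):
    """Build a fallback key; assumes the author is given as 'Surname, Forename'."""
    surname = normalize_text(author.split(",")[0])
    year = re.search(r"\d{4}", date or "")
    return "|".join([surname, normalize_text(title), year.group(0) if year else ""])


def match_keys(record):
    """Return (type, value) keys for a record dict (field names are hypothetical)."""
    keys = []
    if record.get("oclc"):
        number = re.sub(r"\D", "", record["oclc"]).lstrip("0")
        if number:
            keys.append(("oclc", number))
    if record.get("lccn"):
        # Simplified: the full LC normalization also pads hyphenated serials.
        keys.append(("lccn", re.sub(r"\s+", "", record["lccn"]).lower()))
    for raw in record.get("isbns", []):
        isbn = normalize_isbn(raw)
        if isbn:
            keys.append(("isbn", isbn))
    if not keys and record.get("title"):
        # Assumption: fall back to author/title/date only when no identifier exists.
        keys.append(("atd", author_title_date_key(
            record.get("author", ""), record["title"], record.get("date", ""))))
    return keys


def cluster(records):
    """Group records that share at least one match key (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    first_seen = {}
    for idx, rec in enumerate(records):
        for key in match_keys(rec):
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[find(idx)].append(rec)
    return list(groups.values())
```

Each cluster keeps every member record, so the contributing SIDs can be read back with [r["sid"] for r in group]. How the fullest record is chosen within a cluster, and which local fields are merged into it, is left open, as in questions 3, 4 and 6.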