Class DleseIMSFileIndexingWriter

All Implemented Interfaces:
DocWriter

public class DleseIMSFileIndexingWriter extends ItemFileIndexingWriter
Creates a Lucene Document from a DLESE-IMS XML source file.

In addition to the fields listed for FileIndexingServiceWriter, this class creates the following Lucene Document fields:

doctype - Set to 'dlese_ims'. Stored. Note: the actual indexing of this field happens in the superclass FileIndexingServiceWriter.
additional fields - A number of additional fields are defined. See the Java code for method addFrameworkFields(Document, Document) for details.
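The division of labor described above (the superclass writes the doctype field, the subclass supplies the value and any framework-specific fields) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in: a Map takes the place of a Lucene Document, and the class names, the create() method, and the "exampleField" name are assumptions for illustration, not part of the DLESE API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base class: it, not the subclass, writes the 'doctype' field.
abstract class BaseWriterSketch {
    // Subclass supplies the doctype value.
    abstract String getDocType();

    // Subclass hook for fields specific to its metadata framework.
    abstract void addFrameworkFields(Map<String, String> doc);

    // Template method: common fields first, then the subclass hook.
    final Map<String, String> create() {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("doctype", getDocType());
        addFrameworkFields(doc);
        return doc;
    }
}

// Hypothetical subclass standing in for a DLESE-IMS writer.
class DleseImsWriterSketch extends BaseWriterSketch {
    @Override
    String getDocType() {
        return "dlese_ims";
    }

    @Override
    void addFrameworkFields(Map<String, String> doc) {
        // Assumed field name, for illustration only.
        doc.put("exampleField", "exampleValue");
    }
}
```

Calling new DleseImsWriterSketch().create() yields a map whose "doctype" entry is set by the base class, even though the value comes from the subclass.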

Author:
John Weatherley, Ryan Deardorff
  • Constructor Details

    • DleseIMSFileIndexingWriter

      public DleseIMSFileIndexingWriter()
      Create a DleseIMSFileIndexingWriter
  • Method Details

    • initItem

      public void initItem(File source, org.apache.lucene.document.Document existingDoc) throws Exception
      Initialize the XML map prior to processing
      Specified by:
      initItem in class ItemFileIndexingWriter
      Parameters:
      source - The source file being indexed.
      existingDoc - A Document that previously existed in the index for this item, if present
      Throws:
      Exception - Thrown if error reading the XML map
    • destroy

      protected void destroy()
      Release map resources for GC after processing.
      Specified by:
      destroy in class ItemFileIndexingWriter
    • getReaderClass

      public String getReaderClass()
      Gets the name of the concrete DocReader class that is used to read this type of Document, which is "ItemDocReader".
      Specified by:
      getReaderClass in interface DocWriter
      Specified by:
      getReaderClass in class ItemFileIndexingWriter
      Returns:
      The String "org.dlese.dpc.index.reader.ItemDocReader".
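Returning a class name rather than an instance suggests that a caller can load the reader reflectively. Whether the DLESE code does so is not stated here, so the following is only a sketch under that assumption. SketchItemDocReader and ReaderLoaderSketch are hypothetical names; only Class.forName and Constructor.newInstance are real Java APIs.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for a DocReader implementation; not the DLESE class.
class SketchItemDocReader {
    String describe() {
        return "item reader";
    }
}

class ReaderLoaderSketch {
    // Instantiates a class by its fully qualified name using its
    // no-argument constructor, wrapping reflection failures in an
    // unchecked exception.
    static Object newReader(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Constructor<?> ctor = cls.getDeclaredConstructor();
            return ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load reader " + className, e);
        }
    }
}
```

A caller would pass the String returned by getReaderClass() to newReader and cast the result to the expected reader type.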
    • getValidationReport

      protected String getValidationReport() throws Exception
      Gets a report detailing any errors found in the validation of the data, or null if no error was found.
      Specified by:
      getValidationReport in class ItemFileIndexingWriter
      Returns:
      Null if no data validation errors were found, otherwise a String that details the nature of the error.
      Throws:
      Exception - If error in performing the validation.
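The "null means valid" convention is easy to misread at the call site, so a small sketch may help. The validation rule (a required title) and the class and method names are assumptions made for illustration; the actual checks performed by getValidationReport are not described here.

```java
class ValidationSketch {
    // Returns null when no problem is found, otherwise a message
    // describing the problem, mirroring the convention described above.
    static String validationReport(String title) {
        if (title == null || title.trim().isEmpty()) {
            return "Missing required title";
        }
        return null;
    }

    // A report of null signals that the data passed validation.
    static boolean isValid(String report) {
        return report == null;
    }
}
```

Callers test the report against null rather than against an empty String.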
    • getDocType

      public final String getDocType()
      Gets the docType attribute of the DleseIMSFileIndexingWriter, which is 'dlese_ims'.
      Specified by:
      getDocType in interface DocWriter
      Specified by:
      getDocType in class ItemFileIndexingWriter
      Returns:
      The docType, which is 'dlese_ims'.
    • _getIds

      protected final String[] _getIds() throws Exception
      Gets the ID(s) of the DleseIMSFileIndexingWriter object
      Specified by:
      _getIds in class XMLFileIndexingWriter
      Returns:
      The ID values
      Throws:
      Exception - If an error occurs
    • getTitle

      public final String getTitle() throws Exception
      Gets the title attribute of the DleseIMSFileIndexingWriter object
      Specified by:
      getTitle in class XMLFileIndexingWriter
      Returns:
      The title value
      Throws:
      Exception - If an error occurs
    • getDescription

      public final String getDescription() throws Exception
      Gets the description attribute of the DleseIMSFileIndexingWriter object
      Specified by:
      getDescription in class XMLFileIndexingWriter
      Returns:
      The description value
      Throws:
      Exception - If an error occurs
    • getKeywords

      protected String getKeywords() throws Exception
      Returns the item's keywords. An empty String or null is acceptable. The String is tokenized, stored and indexed under the field key 'keywords' and is also indexed in the 'default' field.
      Specified by:
      getKeywords in class ItemFileIndexingWriter
      Returns:
      The keywords String
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
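Because an empty String or null is an acceptable return value, whatever consumes the keywords has to tolerate both. The sketch below shows one way to do that, assuming blank input is simply skipped; how the real superclass treats empty values is not stated here. A Map stands in for a Lucene Document, and KeywordFieldSketch is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeywordFieldSketch {
    // Adds the keywords under both 'keywords' and 'default',
    // skipping null or blank input (an assumption of this sketch).
    static Map<String, String> index(String keywords) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (keywords == null || keywords.trim().isEmpty()) {
            return fields; // nothing to index
        }
        fields.put("keywords", keywords);
        fields.put("default", keywords);
        return fields;
    }
}
```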
    • getCreatorLastName

      protected String getCreatorLastName() throws Exception
      Returns the last name of the item's creator. An empty String or null is acceptable. The String is tokenized, stored and indexed in the 'default' field only.
      Specified by:
      getCreatorLastName in class ItemFileIndexingWriter
      Returns:
      The creator's last name String
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getAssociatedMmdRecs

      protected MmdRec[] getAssociatedMmdRecs()
      Returns the MmdRecs for records in other collections that catalog the same resource. Does not include myMmdRec.
      Specified by:
      getAssociatedMmdRecs in class ItemFileIndexingWriter
      Returns:
      null
    • getAllMmdRecs

      protected MmdRec[] getAllMmdRecs()
      Returns the MmdRecs for all records associated with this resource, including myMmdRec.
      Specified by:
      getAllMmdRecs in class ItemFileIndexingWriter
      Returns:
      null
    • getMyMmdRec

      protected MmdRec getMyMmdRec()
      Returns the MmdRec for this record only.
      Specified by:
      getMyMmdRec in class ItemFileIndexingWriter
      Returns:
      null
    • getCreator

      protected String getCreator() throws Exception
      Returns the full name of the item's creator. An empty String or null is acceptable. The String is tokenized, stored and indexed under the field key 'creator' and is also indexed in the 'default' field.
      Specified by:
      getCreator in class ItemFileIndexingWriter
      Returns:
      Creator's full name
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getContent

      protected String getContent()
      Returns null.
      Specified by:
      getContent in class ItemFileIndexingWriter
      Returns:
      null
    • getContentType

      protected String getContentType()
      Returns null.
      Specified by:
      getContentType in class ItemFileIndexingWriter
      Returns:
      null
    • getAccessionStatus

      protected String getAccessionStatus() throws Exception
      Returns the accession status of this record, for example 'accessioned'. The String is tokenized, stored and indexed under the field key 'accessionstatus'.
      Specified by:
      getAccessionStatus in class ItemFileIndexingWriter
      Returns:
      The accession status.
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getHasRelatedResource

      protected boolean getHasRelatedResource() throws Exception
      Returns false (not implemented).
      Specified by:
      getHasRelatedResource in class ItemFileIndexingWriter
      Returns:
      False.
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getRelatedResourceIds

      protected String[] getRelatedResourceIds() throws Exception
      Returns the IDs of related resources that are cataloged by ID, or null if none are present
      Specified by:
      getRelatedResourceIds in class ItemFileIndexingWriter
      Returns:
      Related resource IDs, or null if none are available
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getRelatedResourceUrls

      protected String[] getRelatedResourceUrls() throws Exception
      Returns the URLs of related resources that are cataloged by URL, or null if none are present
      Specified by:
      getRelatedResourceUrls in class ItemFileIndexingWriter
      Returns:
      Related resource URLs, or null if none are available
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getUrls

      public final String[] getUrls() throws Exception
      Gets the URL(s) of the DleseIMSFileIndexingWriter object
      Specified by:
      getUrls in class XMLFileIndexingWriter
      Returns:
      The URL values
      Throws:
      Exception - If an error occurs
    • getWhatsNewDate

      protected Date getWhatsNewDate() throws Exception
      Returns the date used to determine "What's new" in the library, which is null (unknown).
      Overrides:
      getWhatsNewDate in class ItemFileIndexingWriter
      Returns:
      The what's new date for the item
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getAccessionDate

      protected Date getAccessionDate() throws Exception
      Returns the accession date, which is null (unknown).
      Specified by:
      getAccessionDate in class ItemFileIndexingWriter
      Returns:
      The accession date for the item
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getCreationDate

      protected Date getCreationDate() throws Exception
      Returns null.
      Specified by:
      getCreationDate in class ItemFileIndexingWriter
      Returns:
      null
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • getWhatsNewType

      protected String getWhatsNewType() throws Exception
      Returns null (unknown).
      Overrides:
      getWhatsNewType in class ItemFileIndexingWriter
      Returns:
      null.
      Throws:
      Exception - This method should throw an Exception with an appropriate error message if an error occurs.
    • indexFullContentInDefaultAndStems

      public boolean indexFullContentInDefaultAndStems()
      Default and stems fields handled here, so do not index full content.
      Specified by:
      indexFullContentInDefaultAndStems in class XMLFileIndexingWriter
      Returns:
      False
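A boolean hook like this one typically lets a base class decide whether to add the full file content to the default and stems fields. The sketch below illustrates that pattern only; the way XMLFileIndexingWriter actually consults the flag is not documented here, and all class names and the defaultFieldValues() method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class that consults the hook before indexing content.
abstract class FullContentHookSketch {
    abstract boolean indexFullContentInDefaultAndStems();

    abstract String fullContent();

    // Collects the values that would go into the 'default' field.
    List<String> defaultFieldValues() {
        List<String> values = new ArrayList<>();
        if (indexFullContentInDefaultAndStems()) {
            values.add(fullContent());
        }
        return values;
    }
}

// Stand-in for a writer that populates 'default' and 'stems' itself,
// so it opts out of full-content indexing.
class ImsLikeWriterSketch extends FullContentHookSketch {
    @Override
    boolean indexFullContentInDefaultAndStems() {
        return false;
    }

    @Override
    String fullContent() {
        return "<record>example</record>";
    }
}
```

With the hook returning false, defaultFieldValues() stays empty and the subclass remains responsible for what reaches the default field.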
    • addFrameworkFields

      protected final void addFrameworkFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc) throws Exception
      Adds custom fields to the index that are unique to DLESE-IMS
      Specified by:
      addFrameworkFields in class ItemFileIndexingWriter
      Parameters:
      newDoc - The new Document to which the framework fields are added
      existingDoc - A Document that previously existed in the index for this item, if present
      Throws:
      Exception - If an error occurs