Package org.dlese.dpc.index.writer
Class DleseCollectionFileIndexingWriter
java.lang.Object
org.dlese.dpc.index.writer.FileIndexingServiceWriter
org.dlese.dpc.index.writer.XMLFileIndexingWriter
org.dlese.dpc.index.writer.DleseCollectionFileIndexingWriter
- All Implemented Interfaces:
DocWriter
Used to write a Lucene Document for a DLESE Collection XML record. The reader for this type of Document is DleseCollectionDocReader.
- Author:
- John Weatherley
-
Nested Class Summary
Nested Classes
Modifier and Type - Description
static class - Allows sorting of Collection accession status XML Nodes by date, giving precedence to status = accessioned if dates are equal.
-
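The ordering described for the nested class can be sketched as follows. This is only an illustration under stated assumptions, not the class's actual code: StatusEntry is a hypothetical stand-in for an accession status XML Node, the "deaccessioned" status value is made up for the example, and the sort direction (most recent date first) is assumed because the description does not state it.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AccessionStatusSortSketch {

    // Hypothetical stand-in for one accession status XML node.
    static final class StatusEntry {
        final String status;
        final LocalDate date;

        StatusEntry(String status, String isoDate) {
            this.status = status;
            this.date = LocalDate.parse(isoDate);
        }
    }

    // Sorts by date, most recent first (an assumed direction). On equal dates,
    // an entry whose status is "accessioned" takes precedence over any other.
    static final class StatusComparator implements Comparator<StatusEntry> {
        @Override
        public int compare(StatusEntry a, StatusEntry b) {
            int byDate = b.date.compareTo(a.date);
            if (byDate != 0) {
                return byDate;
            }
            boolean aAccessioned = "accessioned".equals(a.status);
            boolean bAccessioned = "accessioned".equals(b.status);
            return Boolean.compare(bAccessioned, aAccessioned);
        }
    }

    // Returns the status that sorts first, or null for an empty list.
    static String firstStatus(List<StatusEntry> entries) {
        if (entries.isEmpty()) {
            return null;
        }
        List<StatusEntry> copy = new ArrayList<>(entries);
        copy.sort(new StatusComparator());
        return copy.get(0).status;
    }

    public static void main(String[] args) {
        List<StatusEntry> entries = Arrays.asList(
            new StatusEntry("deaccessioned", "2004-06-01"),
            new StatusEntry("accessioned", "2004-06-01"),
            new StatusEntry("accessioned", "2003-01-15"));
        System.out.println(firstStatus(entries));
    }
}
```

With this ordering, the first element after sorting is a natural candidate for "the most recent accession status", although whether the class uses the comparator that way is not stated here.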
Constructor Summary
Constructors
Constructor - Description
DleseCollectionFileIndexingWriter() - Create a DleseCollectionFileIndexingWriter.
-
Method Summary
Modifier and Type, Method - Description
protected String[] _getIds() - Gets the ID of this collection record.
protected final void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, File sourceFile) - Adds fields to the index that are part of the collection-level Document.
protected void destroy() - This method is called at the conclusion of processing and may be used for tear-down.
protected void finalize() - Perform finalization...
protected Date getAccessionDate - Returns the accession date or null if this collection is not currently accessioned.
protected String getAccessionStatus - Gets the most recent accession status found in the XML record.
getAdditionalMetadata - Gets the additional metadata for this collection that was indicated in org.dlese.dpc.repository.RepositoryManager.putRecord when the collection was created, inside an additionalMetadata element, or null.
protected String getCollectionStatuses - Gets the collectionStatus attribute of the DleseCollectionFileIndexingWriter object.
protected String getCost() - Gets the cost associated with this collection.
static final String getCurrentCollectionStatus(org.dom4j.Document doc) - Gets the status of the collection based on the values in the collection-level record.
org.apache.lucene.document.Document getDeletedDoc_OFF_2006_08_23(org.apache.lucene.document.Document existingDoc) - Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to current time.
getDescription - The description for the collection.
getDocType - Gets the docType attribute of the DleseCollectionFileIndexingWriter, which is 'dlese_collect'.
protected String getFormatOfRecords - Gets the format of the records in this collection.
protected String getFullTitle - Returns the full title for the collection.
protected String[] getGradeRanges - Gets the gradeRanges for this collection.
protected String getKey() - Gets the collection key used to identify the items in the collection this record refers to.
protected String getKeywords - Gets the keywords associated with this collection.
static long getNumInstances() - Gets the numInstances attribute of the DleseCollectionFileIndexingWriter class.
protected String getPartOfDRC - Gets whether the collection is part of the DRC [true|false].
getReaderClass - Gets the name of the concrete DocReader class that is used to read this type of Document, which is "DleseCollectionDocReader".
protected String getReviewProcess - Gets the collection's review process statement.
protected String getReviewProcessUrl - Gets the URL to the collection's review process statement.
protected String getScopeUrl - Gets the URL to the collection's scope statement.
protected String getShortTitle - Returns the short title for the collection.
protected String[] getSubjects - Gets the subjects for this collection.
getTitle() - Gets the full title.
String[] getUrls() - Gets the URL to the collection.
protected String getValidationReport - Gets a report detailing any errors found in the XML validation of the collection record, or null if no error was found.
protected Date getWhatsNewDate - Returns the date used to determine "What's new" in the library.
protected String getWhatsNewType - Returns 'collection'.
boolean indexFullContentInDefaultAndStems() - Default and stems fields handled here, so do not index full content.
void init - Performs the necessary init functions (nothing done).

Methods inherited from class org.dlese.dpc.index.writer.XMLFileIndexingWriter
addCustomFields, getBoundingBox, getCollections, getDeletedDoc, getDocGroup, getDom4jDoc, getFieldContent, getFieldContent, getFieldName, getIds, getIndex, getMyAnnoResultDocs, getMyCollectionDoc, getOaiModtime, getPrimaryId, getRecordDataService, getRelatedIds, getRelatedIdsMap, getRelatedUrls, getRelatedUrlsMap, getTermStringFromStringArray, getXmlIndexer, getXmlIndexerFieldsConfig

Methods inherited from class org.dlese.dpc.index.writer.FileIndexingServiceWriter
abortIndexing, addDocToRemove, addToAdminDefaultField, addToDefaultField, create, getConfigAttributes, getDocsource, getFileContent, getFileIndexingPlugin, getFileIndexingService, getLuceneDoc, getPreviousRecordDoc, getSessionAttributes, getSourceDir, getSourceFile, isMakingDeletedDoc, isValidationEnabled, prtln, prtlnErr, setConfigAttributes, setDebug, setFileIndexingPlugin, setFileIndexingService, setIsMakingDeletedDoc, setValidationEnabled
-
Constructor Details
-
DleseCollectionFileIndexingWriter
public DleseCollectionFileIndexingWriter()
Create a DleseCollectionFileIndexingWriter.
-
-
Method Details
-
finalize
Perform finalization... closing resources, etc.
-
getNumInstances
public static long getNumInstances()
Gets the numInstances attribute of the DleseCollectionFileIndexingWriter class.
- Returns:
- The numInstances value
-
getFullTitle
Returns the full title for the collection.
- Returns:
- The fullTitle value
- Throws:
Exception - If error reading XML.
-
getShortTitle
Returns the short title for the collection.
- Returns:
- The shortTitle value
- Throws:
Exception - If error reading XML.
-
getTitle
Gets the full title.
- Specified by:
getTitle in class XMLFileIndexingWriter
- Returns:
- The title value
- Throws:
Exception - If an error occurs.
-
getDescription
The description for the collection.
- Specified by:
getDescription in class XMLFileIndexingWriter
- Returns:
- The description String
- Throws:
Exception - If error reading XML.
-
getAdditionalMetadata
Gets the additional metadata for this collection that was indicated in org.dlese.dpc.repository.RepositoryManager.putRecord when the collection was created, inside an additionalMetadata element, or null.
- Returns:
- The additional metadata element as a String, or null if none.
-
getPartOfDRC
Gets whether the collection is part of the DRC [true|false].
- Returns:
- The partOfDRC value
- Throws:
Exception - If an error occurs.
-
getAccessionStatus
Gets the most recent accession status found in the XML record.
- Returns:
- The most recent accession status.
- Throws:
Exception - If an error occurs.
-
getCollectionStatuses
Gets the collectionStatus attribute of the DleseCollectionFileIndexingWriter object.
- Returns:
- The collectionStatus value
- Throws:
Exception - If an error occurs.
-
getKey
Gets the collection key used to identify the items in the collection this record refers to. For example, dcc or comet.
- Returns:
- The key value
- Throws:
Exception - If an error occurs.
-
getUrls
Gets the URL to the collection.
- Specified by:
getUrls in class XMLFileIndexingWriter
- Returns:
- The collectionUrl value
- Throws:
Exception - If an error occurs.
-
getScopeUrl
Gets the URL to the collection's scope statement.
- Returns:
- The URL to the collection's scope statement, or null if none.
- Throws:
Exception - If an error occurs.
-
getReviewProcessUrl
Gets the URL to the collection's review process statement.
- Returns:
- The URL to the collection's review process statement.
- Throws:
Exception - If an error occurs.
-
getReviewProcess
Gets the collection's review process statement.
- Returns:
- The collection's review process statement.
- Throws:
Exception - If an error occurs.
-
getFormatOfRecords
Gets the format of the records in this collection.
- Returns:
- The records format.
- Throws:
Exception - If an error occurs.
-
getCost
Gets the cost associated with this collection.
- Returns:
- The cost.
- Throws:
Exception - If an error occurs.
-
getKeywords
Gets the keywords associated with this collection.
- Returns:
- All keywords, separated by spaces.
- Throws:
Exception - NOT YET DOCUMENTED
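The return format described for getKeywords, all keywords in one String separated by spaces, might be produced along these lines. This is a sketch only: joinKeywords and its List input are hypothetical, and how the class actually reads keywords from the XML record is not shown here.

```java
import java.util.Arrays;
import java.util.List;

class KeywordsJoinSketch {

    // Hypothetical helper: joins keyword values into one space-separated String.
    static String joinKeywords(List<String> keywords) {
        return String.join(" ", keywords);
    }

    public static void main(String[] args) {
        // Example keyword values are made up for illustration.
        System.out.println(joinKeywords(Arrays.asList("geology", "oceanography")));
    }
}
```

Note that a multi-word keyword cannot be told apart from two separate keywords once the values are joined this way.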
-
getGradeRanges
Gets the gradeRanges for this collection.
- Returns:
- The gradeRanges value
- Throws:
Exception - NOT YET DOCUMENTED
-
getSubjects
Gets the subjects for this collection.
- Returns:
- The subjects value
- Throws:
Exception - NOT YET DOCUMENTED
-
_getIds
Gets the ID of this collection record.
- Specified by:
_getIds in class XMLFileIndexingWriter
- Returns:
- The ID
- Throws:
Exception - If an error occurs.
-
getDocType
Gets the docType attribute of the DleseCollectionFileIndexingWriter, which is 'dlese_collect'.
- Specified by:
getDocType in interface DocWriter
- Specified by:
getDocType in class FileIndexingServiceWriter
- Returns:
- The docType, which is 'dlese_collect'.
-
getReaderClass
Gets the name of the concrete DocReader class that is used to read this type of Document, which is "DleseCollectionDocReader".
- Specified by:
getReaderClass in interface DocWriter
- Specified by:
getReaderClass in class FileIndexingServiceWriter
- Returns:
- The String "org.dlese.dpc.index.reader.DleseCollectionDocReader".
-
getAccessionDate
Returns the accession date or null if this collection is not currently accessioned.
- Returns:
- The accession date or null
- Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
-
getWhatsNewDate
Returns the date used to determine "What's new" in the library. Just returns the file mod date.
- Specified by:
getWhatsNewDate in class XMLFileIndexingWriter
- Returns:
- The what's new date for the item
- Throws:
Exception - This method should throw an Exception with an appropriate error message if an error occurs.
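Returning "the file mod date" can be illustrated with the standard java.io.File API. The sketch below assumes the date is taken from the last-modified timestamp of the source file being indexed; the whatsNewDate helper is hypothetical and is not this class's method.

```java
import java.io.File;
import java.io.IOException;
import java.util.Date;

class WhatsNewDateSketch {

    // Assumption: "file mod date" means the file's last-modified timestamp.
    // File.lastModified() returns 0L if the file does not exist.
    static Date whatsNewDate(File sourceFile) {
        return new Date(sourceFile.lastModified());
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a collection XML record.
        File record = File.createTempFile("collection", ".xml");
        record.deleteOnExit();
        System.out.println(whatsNewDate(record));
    }
}
```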
-
getWhatsNewType
Returns 'collection'.
- Specified by:
getWhatsNewType in class XMLFileIndexingWriter
- Returns:
- The string 'collection'.
-
init
Performs the necessary init functions (nothing done).
- Specified by:
init in class XMLFileIndexingWriter
- Parameters:
source - The source file being indexed
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
- Throws:
Exception - If an error occurred during set-up.
-
destroy
protected void destroy()
This method is called at the conclusion of processing and may be used for tear-down.
- Specified by:
destroy in class FileIndexingServiceWriter
-
getValidationReport
Gets a report detailing any errors found in the XML validation of the collection record, or null if no error was found.
- Overrides:
getValidationReport in class FileIndexingServiceWriter
- Returns:
- Null if no data validation errors were found, otherwise a String that details the nature of the error.
- Throws:
Exception - If error in performing the validation.
-
indexFullContentInDefaultAndStems
public boolean indexFullContentInDefaultAndStems()
Default and stems fields handled here, so do not index full content.
- Specified by:
indexFullContentInDefaultAndStems in class XMLFileIndexingWriter
- Returns:
- False
-
addFields
protected final void addFields(org.apache.lucene.document.Document newDoc, org.apache.lucene.document.Document existingDoc, File sourceFile) throws Exception
Adds fields to the index that are part of the collection-level Document.
- Specified by:
addFields in class XMLFileIndexingWriter
- Parameters:
newDoc - The new Document that is being created for this resource
existingDoc - An existing Document that currently resides in the index for the given resource, or null if none was previously present
sourceFile - The sourceFile that is being indexed.
- Throws:
Exception - If an error occurs
-
getDeletedDoc_OFF_2006_08_23
public org.apache.lucene.document.Document getDeletedDoc_OFF_2006_08_23(org.apache.lucene.document.Document existingDoc) throws Throwable
Creates a Lucene Document from an existing CollectionFileIndexing Document by setting the field "deleted" to "true" and making the modtime equal to current time.
- Parameters:
existingDoc - An existing FileIndexingService Document that currently resides in the index for the given resource.
- Returns:
- A Lucene FileIndexingService Document with the field "deleted" set to "true" and modtime set to current time.
- Throws:
Throwable - Thrown if error occurs
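The transformation described above, copying an existing document, setting "deleted" to "true" and updating the modification time, can be sketched without Lucene. In this illustration a Map of field names to values stands in for the Lucene Document. The "modtime" field name, its encoding as a millisecond string, and the markDeleted helper are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DeletedDocSketch {

    // Hypothetical stand-in: a field-name-to-value Map replaces the Lucene Document.
    // Returns a copy of the existing fields with "deleted" set to "true" and
    // "modtime" (assumed field name and encoding) set to the supplied time.
    static Map<String, String> markDeleted(Map<String, String> existingDoc, long nowMillis) {
        Map<String, String> deletedDoc = new LinkedHashMap<>(existingDoc);
        deletedDoc.put("deleted", "true");
        deletedDoc.put("modtime", Long.toString(nowMillis));
        return deletedDoc;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("key", "dcc");
        existing.put("modtime", "0");
        System.out.println(markDeleted(existing, System.currentTimeMillis()));
    }
}
```

Copying instead of mutating leaves the existing document untouched, which makes the sketch easy to test. Whether the real method copies or modifies its argument is not stated in this documentation.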
-
getCurrentCollectionStatus
Gets the status of the collection based on the values in the collection-level record.
- Parameters:
doc - A dlese_collect XML Document
- Returns:
- The currentCollectionStatus value
-