Package org.dlese.dpc.index.reader
Class FileIndexingServiceDocReader
java.lang.Object
org.dlese.dpc.index.reader.DocReader
org.dlese.dpc.index.reader.FileIndexingServiceDocReader
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
ErrorDocReader, SimpleFileIndexingServiceDocReader, XMLDocReader
An abstract bean for accessing the data stored in a Lucene Document that was created by a FileIndexingServiceWriter. This class may be extended for each Document type that might be returned in a search.
- Author:
- John Weatherley
-
Field Summary
-
Constructor Summary
Constructors
Modifier, Constructor, Description
- protected FileIndexingServiceDocReader(): Constructor that initializes an empty DocReader.
- protected FileIndexingServiceDocReader(org.apache.lucene.document.Document doc): Constructor that may be used programmatically to wrap a reader around a Lucene Document that was created by a DocWriter.
-
Method Summary
Modifier and Type, Method, Description
- boolean fileExists(): Determine whether the file associated with this Document exists.
- getDateFileWasIndexed: Gets the date this record was indexed.
- getDateFileWasIndexedString: Gets the date and time this record was indexed, as a String.
- protected static final String getDateStamp: Return a string for the current time and date, suitable for display in log files and output to standard out.
- getDeleted: Determine whether the status of this Document is deleted, indicated by a return value of "true".
- getDocDir: Gets the absolute path of the directory that contained the File used to index the Document.
- getDocsource: Gets the absolute path of the file that was used to index the Document.
- getDocsourceEncoded: Gets the absolute path of the file that was used to index the Document, encoded.
- getDoctype: Gets the doctype associated with the Document, for example 'dlese_ims', 'adn', or 'html'.
- getFile(): Gets the File that was used to index the Document.
- getFileExists: Determine whether the file associated with this Document exists, indicated by a return value of "true".
- getFileName: Gets the name of the File that was used to index the Document.
- final String getFullContent: Gets the full content of the file that was used to index the Document.
- final String getFullContentEncodedAs(String characterEncoding): Gets the full content of the file that was used to index the Document, returned in the given character encoding, for example UTF-8.
- long getLastModified(): Gets the File modification time of the File used to index the Document.
- getLastModifiedAsUTC: Gets the file modification date in UTC format for the given record.
- getLastModifiedString: Gets a String representation of the File modification time of the File used to index the Document.
- boolean isDeleted(): Determine whether the status of this Document is deleted.
- protected static void prtln: Output a line of text to standard out, with datestamp, if debug is set to true.
- protected static void prtlnErr: Output a line of text to error out, with datestamp.
- protected static final void setDebug(boolean db): Sets the debug attribute.
Methods inherited from class org.dlese.dpc.index.reader.DocReader
doInit, getAttribute, getDocMap, getDocument, getIndex, getLazyDocMap, getQuery, getReaderType, getRepositoryManager, getScore, init, setDoc
-
Constructor Details
-
FileIndexingServiceDocReader
protected FileIndexingServiceDocReader(org.apache.lucene.document.Document doc)
Constructor that may be used programmatically to wrap a reader around a Lucene Document that was created by a DocWriter.
- Parameters:
doc - A Lucene Document.
-
FileIndexingServiceDocReader
protected FileIndexingServiceDocReader()
Constructor that initializes an empty DocReader.
-
-
Method Details
-
getFullContent
Gets the full content of the file that was used to index the Document. This includes all XML or HTML tags, etc.
- Returns:
- The full content as text, or empty string if unable to process.
-
getFullContentEncodedAs
Gets the full content of the file that was used to index the Document, returned in the given character encoding, for example UTF-8.
- Parameters:
characterEncoding - The character encoding to return, for example 'UTF-8'
- Returns:
- The full content as text, or empty string if unable to process.
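As a rough illustration of the "empty string if unable to process" contract, the sketch below reads a file in a named encoding and falls back to an empty string on failure. It is not the library's implementation: the class `FullContentSketch` and its methods `readAs` and `roundTrip` are hypothetical, and it assumes that "returned in the given character encoding" means decoding the file's bytes with that charset.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not part of org.dlese.dpc.
class FullContentSketch {

    /** Decodes the whole file with the named charset; returns "" on any read or charset failure. */
    static String readAs(Path file, String characterEncoding) {
        try {
            // Charset.forName throws an IllegalArgumentException subclass for bad or unknown names.
            Charset cs = Charset.forName(characterEncoding);
            return new String(Files.readAllBytes(file), cs);
        } catch (IOException | IllegalArgumentException e) {
            return "";
        }
    }

    /** Demo helper: writes text to a temp file in the given encoding, then reads it back. */
    static String roundTrip(String text, String characterEncoding) {
        try {
            Path tmp = Files.createTempFile("fullcontent", ".xml");
            try {
                Files.write(tmp, text.getBytes(Charset.forName(characterEncoding)));
                return readAs(tmp, characterEncoding);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Tags are preserved because the content is returned as raw text.
        System.out.println(roundTrip("<record>caf\u00e9</record>", "UTF-8"));
    }
}
```

Returning an empty string rather than throwing keeps callers such as page templates simple, at the cost of hiding the reason for the failure.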
-
getDoctype
Gets the doctype associated with the Document, for example 'dlese_ims', 'adn', or 'html'. Note that to support wildcard searching, the doctype is indexed with a leading '0' prepended. This method strips the leading zero before returning.
- Returns:
- The doctype value.
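The prefix handling described for getDoctype can be sketched as follows. This is a minimal illustration rather than the library's code: the class `DoctypeSketch` and the methods `toIndexedForm` and `stripLeadingZero` are hypothetical names, and the sketch assumes exactly one '0' is added at index time.

```java
// Hypothetical sketch; not part of org.dlese.dpc.
class DoctypeSketch {

    /** Adds the leading '0' that the index is described as storing. */
    static String toIndexedForm(String doctype) {
        return "0" + doctype;
    }

    /** Removes a single leading '0' if present; null maps to "" (an assumption). */
    static String stripLeadingZero(String indexed) {
        if (indexed == null) {
            return "";
        }
        return indexed.startsWith("0") ? indexed.substring(1) : indexed;
    }

    public static void main(String[] args) {
        String indexed = toIndexedForm("adn");
        System.out.println(indexed + " -> " + stripLeadingZero(indexed));
    }
}
```

Only one character is removed, so a doctype that itself begins with '0' would survive the round trip.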
-
getDeleted
Determine whether the status of this Document is deleted, indicated by a return value of "true". This does not necessarily mean the file has been deleted.
- Returns:
- The String "true" if the status is deleted, else "false".
-
isDeleted
public boolean isDeleted()
Determine whether the status of this Document is deleted. This does not necessarily mean the file has been deleted.
Field: status [true]
- Returns:
- True if the status is deleted.
-
getFileExists
Determine whether the file associated with this Document exists, indicated by a return value of "true".
- Returns:
- The String "true" if the file exists, else "false".
-
fileExists
public boolean fileExists()
Determine whether the file associated with this Document exists.
- Returns:
- True if the file exists, else false.
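The class exposes each status twice, once as a boolean (isDeleted, fileExists) and once as the String "true" or "false" (getDeleted, getFileExists). The sketch below shows one way such paired accessors might be written. The class `StatusBeanSketch`, its constructor, and its fields are hypothetical stand-ins, and deriving the String form from the boolean is an assumption about the design.

```java
import java.io.File;

// Hypothetical stand-in for a reader bean; not part of org.dlese.dpc.
class StatusBeanSketch {
    private final File source;     // file the record was indexed from (assumed)
    private final boolean deleted; // index status flag, independent of the file on disk

    StatusBeanSketch(File source, boolean deleted) {
        this.source = source;
        this.deleted = deleted;
    }

    boolean fileExists() {
        return source != null && source.exists();
    }

    /** String form of fileExists(): "true" or "false". */
    String getFileExists() {
        return String.valueOf(fileExists());
    }

    boolean isDeleted() {
        return deleted;
    }

    /** String form of isDeleted(): "true" or "false". */
    String getDeleted() {
        return String.valueOf(deleted);
    }

    public static void main(String[] args) {
        StatusBeanSketch r = new StatusBeanSketch(new File("."), true);
        // A record can be flagged deleted while its file is still present.
        System.out.println("deleted=" + r.getDeleted() + ", fileExists=" + r.getFileExists());
    }
}
```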
-
getDateFileWasIndexedString
Gets the date and time this record was indexed, as a String.
- Returns:
- The date and time this record was indexed
-
getDateFileWasIndexed
Gets the date this record was indexed.
- Returns:
- The date this record was indexed
-
getLastModifiedString
Gets a String representation of the File modification time of the File used to index the Document. Note that while this represents the File modification time, this date stamp does not get updated until the File is re-indexed by the indexer.
- Returns:
- The File modification time.
-
getLastModifiedAsUTC
Gets the file modification date in UTC format for the given record.
- Returns:
- The file modification date value.
-
getLastModified
public long getLastModified()
Gets the File modification time of the File used to index the Document. Note that while this represents the File modification time, this date stamp does not get updated until the File is re-indexed by the indexer.
- Returns:
- The File modification time.
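To show how a long modification time relates to a UTC date string, here is a minimal sketch using java.time. The class `UtcStampSketch` and method `toUtcString` are hypothetical. The sketch assumes the long is milliseconds since the epoch (as with java.io.File.lastModified()) and that "UTC format" means an ISO 8601 instant; the exact format the library returns is not stated here.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch; the output format is an assumption, not the library's documented format.
class UtcStampSketch {

    /** Formats epoch milliseconds as an ISO 8601 instant in UTC, e.g. 0 -> 1970-01-01T00:00:00Z. */
    static String toUtcString(long lastModifiedMillis) {
        return DateTimeFormatter.ISO_INSTANT.format(Instant.ofEpochMilli(lastModifiedMillis));
    }

    public static void main(String[] args) {
        // In the reader, the value would come from the index, not from the live file,
        // so it stays unchanged until the file is re-indexed.
        long storedAtIndexTime = System.currentTimeMillis();
        System.out.println(toUtcString(storedAtIndexTime));
    }
}
```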
-
getFile
Gets the File that was used to index the Document.
- Returns:
- The source File.
-
getFileName
Gets the name of the File that was used to index the Document.
- Returns:
- The source File name.
-
getDocsource
Gets the absolute path of the file that was used to index the Document.
- Returns:
- The absolute path to the underlying file.
-
getDocsourceEncoded
Gets the absolute path of the file that was used to index the Document, encoded.
- Returns:
- The absolute path to the underlying file.
-
getDocDir
Gets the absolute path of the directory that contained the File used to index the Document.
- Returns:
- The docDir value.
-
getDateStamp
Return a string for the current time and date, suitable for display in log files and output to standard out.
- Returns:
- The dateStamp value
-
setDebug
protected static final void setDebug(boolean db)
Sets the debug attribute.
- Parameters:
db - The new debug value
-
prtlnErr
Output a line of text to standard error, with a datestamp.
- Parameters:
s - The text that will be output to standard error.
-
prtln
Output a line of text to standard out, with datestamp, if debug is set to true.
- Parameters:
s - The String that will be output.
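The interplay of getDateStamp, setDebug, prtln, and prtlnErr can be sketched as below. This is an illustration under assumptions, not the class's source: `DebugLogSketch` is a hypothetical class, the date pattern is invented for the example, and both parameters are assumed to be Strings.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical sketch of debug-gated logging; not part of org.dlese.dpc.
class DebugLogSketch {
    private static boolean debug = false;

    /** Current date and time as text; the pattern is an assumption made for this sketch. */
    static String getDateStamp() {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss zzz").format(new Date());
    }

    static void setDebug(boolean db) {
        debug = db;
    }

    /** Prints to standard out only when debug is enabled. */
    static void prtln(String s) {
        if (debug) {
            System.out.println(getDateStamp() + " " + s);
        }
    }

    /** Always prints to standard error, regardless of the debug flag. */
    static void prtlnErr(String s) {
        System.err.println(getDateStamp() + " " + s);
    }

    public static void main(String[] args) {
        prtln("suppressed: debug is off");
        setDebug(true);
        prtln("printed: debug is on");
        prtlnErr("errors are always printed");
    }
}
```

Because the flag is static, toggling it affects every reader in the same class loader, which suits ad hoc troubleshooting more than per-instance logging.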
-