Class FileIndexingServiceDocReader

java.lang.Object
org.dlese.dpc.index.reader.DocReader
org.dlese.dpc.index.reader.FileIndexingServiceDocReader
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
ErrorDocReader, SimpleFileIndexingServiceDocReader, XMLDocReader

public abstract class FileIndexingServiceDocReader extends DocReader implements Serializable
An abstract bean for accessing the data stored in a Lucene Document that was created by a FileIndexingServiceWriter. This class may be extended for each Document type that might be returned in a search.
Author:
John Weatherley
  • Constructor Details

    • FileIndexingServiceDocReader

      protected FileIndexingServiceDocReader(org.apache.lucene.document.Document doc)
      Constructor that may be used programmatically to wrap a reader around a Lucene Document that was created by a DocWriter.
      Parameters:
      doc - A Lucene Document.
    • FileIndexingServiceDocReader

      protected FileIndexingServiceDocReader()
      Constructor that initializes an empty DocReader.
  • Method Details

    • getFullContent

      public final String getFullContent()
      Gets the full content of the file that was used to index the Document. This includes all XML or HTML tags, etc.
      Returns:
      The full content as text, or empty string if unable to process.
    • getFullContentEncodedAs

      public final String getFullContentEncodedAs(String characterEncoding)
      Gets the full content of the file that was used to index the Document, returned in the given character encoding, for example UTF-8.
      Parameters:
      characterEncoding - The character encoding to return, for example 'UTF-8'
      Returns:
      The full content as text, or empty string if unable to process.
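The "empty string if unable to process" contract can be illustrated with a small stand-alone sketch. It is not the class's actual implementation: `EncodedContentSketch` and `decodeAs` are hypothetical names, and the sketch assumes that "returned in the given character encoding" means decoding the file's raw bytes with the named encoding.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical stand-alone sketch; not part of org.dlese.dpc.
class EncodedContentSketch {

    // Decodes raw file bytes using the named encoding.
    // Returns an empty string when the input is missing or the
    // encoding name is not supported, mirroring the documented fallback.
    public static String decodeAs(byte[] rawBytes, String characterEncoding) {
        if (rawBytes == null || characterEncoding == null) {
            return "";
        }
        try {
            return new String(rawBytes, characterEncoding);
        } catch (UnsupportedEncodingException e) {
            return "";
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {104, 105}; // ASCII bytes for "hi"
        System.out.println(decodeAs(bytes, "UTF-8"));
        System.out.println(decodeAs(bytes, "no-such-encoding").isEmpty());
    }
}
```

Returning an empty string rather than throwing keeps callers such as page templates free of exception handling, at the cost of hiding the reason for the failure.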
    • getDoctype

      public String getDoctype()
      Gets the doctype associated with the Document, for example 'dlese_ims', 'adn', or 'html'. Note that to support wildcard searching, the doctype is indexed with a leading '0' prepended. This method strips the leading zero before returning.
      Returns:
      The doctype value.
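The leading-'0' convention described for getDoctype can be sketched as a pair of helpers. The class and method names here (`DoctypeSketch`, `toIndexedForm`, `stripLeadingZero`) are hypothetical and only illustrate the round trip; the real indexing code may differ.

```java
// Hypothetical stand-alone sketch; not part of org.dlese.dpc.
class DoctypeSketch {

    // Builds the form assumed to be stored in the index: '0' + doctype.
    public static String toIndexedForm(String doctype) {
        return "0" + doctype;
    }

    // Strips a single leading '0' if present; other values pass through.
    public static String stripLeadingZero(String indexed) {
        if (indexed != null && indexed.startsWith("0")) {
            return indexed.substring(1);
        }
        return indexed;
    }

    public static void main(String[] args) {
        String indexed = toIndexedForm("adn");
        System.out.println(indexed);
        System.out.println(stripLeadingZero(indexed));
    }
}
```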
    • getDeleted

      public String getDeleted()
      Determine whether the status of this Document is deleted, indicated by a return value of "true". This does not necessarily mean the file has been deleted.
      Returns:
      The String "true" if the status is deleted, else "false".
    • isDeleted

      public boolean isDeleted()
      Determine whether the status of this Document is deleted. This does not necessarily mean the file has been deleted.

      Field: status [true]

      Returns:
      True if the status is deleted.
    • getFileExists

      public String getFileExists()
      Determine whether the file associated with this Document exists, indicated by a return value of "true".
      Returns:
      The String "true" if the file exists, else "false".
    • fileExists

      public boolean fileExists()
      Determine whether the file associated with this Document exists.
      Returns:
      True if the file exists, else false.
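The paired accessors above (getDeleted with isDeleted, getFileExists with fileExists) expose the same flag as both a String and a boolean. A minimal sketch of that pattern follows, assuming the flag is stored as the text "true"; `StatusFlagSketch` is a hypothetical class, not the reader itself.

```java
// Hypothetical stand-alone sketch; not part of org.dlese.dpc.
class StatusFlagSketch {

    // Raw stored value; assumed to be "true" when the flag is set, may be null.
    private final String storedValue;

    StatusFlagSketch(String storedValue) {
        this.storedValue = storedValue;
    }

    // Boolean view: anything other than the exact text "true" counts as false.
    public boolean isDeleted() {
        return "true".equals(storedValue);
    }

    // String view: always "true" or "false", never null.
    public String getDeleted() {
        return isDeleted() ? "true" : "false";
    }

    public static void main(String[] args) {
        System.out.println(new StatusFlagSketch("true").getDeleted());
        System.out.println(new StatusFlagSketch(null).getDeleted());
    }
}
```

Deriving the String form from the boolean form keeps the two accessors from disagreeing.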
    • getDateFileWasIndexedString

      public String getDateFileWasIndexedString()
      Gets the date and time this record was indexed, as a String.
      Returns:
      The date and time this record was indexed
    • getDateFileWasIndexed

      public Date getDateFileWasIndexed()
      Gets the date this record was indexed.
      Returns:
      The date this record was indexed
    • getLastModifiedString

      public String getLastModifiedString()
      Gets a String representation of the File modification time of the File used to index the Document. Note that while this represents the File modification time, this date stamp does not get updated until the File is re-indexed by the indexer.
      Returns:
      The File modification time.
    • getLastModifiedAsUTC

      public String getLastModifiedAsUTC()
      Gets the file modification date in UTC format for the given record.
      Returns:
      The file modification date value.
    • getLastModified

      public long getLastModified()
      Gets the File modification time of the File used to index the Document. Note that while this represents the File modification time, this date stamp does not get updated until the File is re-indexed by the indexer.
      Returns:
      The File modification time.
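The three last-modified accessors return one stored timestamp in different forms. As an illustration only, the sketch below converts a long modification time to a UTC string. It assumes the long is milliseconds since the epoch (as with java.io.File.lastModified()) and that "UTC format" means an ISO 8601 style string; `LastModifiedSketch` and `toUtcString` are hypothetical names.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-alone sketch; not part of org.dlese.dpc.
class LastModifiedSketch {

    // Assumed output pattern; the real reader may format differently.
    private static final DateTimeFormatter UTC_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'")
                    .withZone(ZoneOffset.UTC);

    // Formats an epoch-millisecond timestamp as a UTC date-time string.
    public static String toUtcString(long lastModifiedMillis) {
        return UTC_FORMAT.format(Instant.ofEpochMilli(lastModifiedMillis));
    }

    public static void main(String[] args) {
        System.out.println(toUtcString(0L)); // prints 1970-01-01T00:00:00Z
    }
}
```

Because the stored value is captured at index time, comparing it with the live file's modification time is one way a caller could detect that a file needs re-indexing.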
    • getFile

      public File getFile()
      Gets the File that was used to index the Document.
      Returns:
      The source File.
    • getFileName

      public String getFileName()
      Gets the name of the File that was used to index the Document.
      Returns:
      The source File name.
    • getDocsource

      public String getDocsource()
      Gets the absolute path of the file that was used to index the Document.
      Returns:
      The absolute path to the underlying file.
    • getDocsourceEncoded

      public String getDocsourceEncoded()
      Gets the absolute path of the file that was used to index the Document, encoded.
      Returns:
      The encoded absolute path to the underlying file.
    • getDocDir

      public String getDocDir()
      Gets the absolute path of the directory that contained the File used to index the Document.
      Returns:
      The docDir value.
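The file name, absolute path, and containing directory returned by the accessors above map onto standard java.io.File calls. The sketch below shows that relationship under the assumption that the reader derives these values from a File; `SourceFileSketch` and its helpers are hypothetical, and the encoding used by getDocsourceEncoded is not specified in this page, so it is left out.

```java
import java.io.File;

// Hypothetical stand-alone sketch; not part of org.dlese.dpc.
class SourceFileSketch {

    // Name of the file without any directory part.
    public static String fileName(File source) {
        return source.getName();
    }

    // Absolute path of the file itself.
    public static String docsource(File source) {
        return source.getAbsolutePath();
    }

    // Absolute path of the directory that contains the file.
    public static String docDir(File source) {
        return source.getAbsoluteFile().getParent();
    }

    public static void main(String[] args) {
        File source = new File("records", "example.xml"); // hypothetical path
        System.out.println(fileName(source));
        System.out.println(docsource(source));
        System.out.println(docDir(source));
    }
}
```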
    • getDateStamp

      protected static final String getDateStamp()
      Returns a string for the current time and date, suitable for display in log files and output to standard out.
      Returns:
      The dateStamp value
    • setDebug

      protected static final void setDebug(boolean db)
      Sets the debug attribute.
      Parameters:
      db - The new debug value
    • prtlnErr

      protected static void prtlnErr(String s)
      Outputs a line of text to standard error, with datestamp.
      Parameters:
      s - The text that will be output to standard error.
    • prtln

      protected static void prtln(String s)
      Output a line of text to standard out, with datestamp, if debug is set to true.
      Parameters:
      s - The String that will be output.
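The four static helpers above (getDateStamp, setDebug, prtlnErr, prtln) describe a small debug-gated logging pattern. A minimal sketch of that pattern is shown here in a hypothetical class, `DebugLogSketch`. The date pattern is an assumption, since this page does not state the actual date stamp format.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-alone sketch; not part of org.dlese.dpc.
class DebugLogSketch {

    private static boolean debug = false;

    // Current date and time as text; the pattern is an assumed format.
    public static String getDateStamp() {
        return new SimpleDateFormat("MMM d, yyyy h:mm:ss a zzz").format(new Date());
    }

    public static void setDebug(boolean db) {
        debug = db;
    }

    // Always written to standard error, prefixed with the date stamp.
    public static void prtlnErr(String s) {
        System.err.println(getDateStamp() + " " + s);
    }

    // Written to standard out only when debug is enabled.
    public static void prtln(String s) {
        if (debug) {
            System.out.println(getDateStamp() + " " + s);
        }
    }

    public static void main(String[] args) {
        prtln("not shown while debug is false");
        setDebug(true);
        prtln("shown once debug is true");
        prtlnErr("errors are always shown");
    }
}
```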