Class SimpleLuceneIndex

java.lang.Object
org.dlese.dpc.index.SimpleLuceneIndex

public final class SimpleLuceneIndex extends Object
A simple API for searching, reading and writing Lucene indexes.
Author:
John Weatherley, Dave Deniman
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final boolean
    BLOCK
    Indicates update operations will be blocked until the current one returns.
    static final int
    DEFAULT_AND
    Use to set the boolean search operator to AND.
    static final int
    DEFAULT_OR
    Use to set the boolean search operator to OR.
    static final boolean
    NO_BLOCK
    Indicates update operations will be allowed while others are still in progress.
  • Constructor Summary

    Constructors
    Constructor
    Description
    SimpleLuceneIndex(String indexDirPath)
    Initializes or creates an index at the given location using a default search field named "default" and a StandardAnalyzer for index searching and creation.
    SimpleLuceneIndex(String indexDirPath, String defaultField, org.apache.lucene.analysis.Analyzer analyzer)
    Initializes or creates an index at the given location using the given default search field and Analyzer.
    SimpleLuceneIndex(String indexDirPath, org.apache.lucene.analysis.Analyzer analyzer)
    Initializes or creates an index at the given location using a default search field named "default" and the given Analyzer.
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    addDoc(org.apache.lucene.document.Document doc)
    Adds a Document to the index.
    boolean
    addDoc(org.apache.lucene.document.Document doc, boolean block)
    Adds a Document to the index.
    boolean
    addDocs(org.apache.lucene.document.Document[] docs)
    Adds a group of Documents to the index.
    boolean
    addDocs(org.apache.lucene.document.Document[] docs, boolean block)
    Adds a group of Documents to the index.
    void
    Closes the writers and performs clean-up
    void
    deleteAndReinititlize()
    Deletes the index and re-initializes a new, empty one in its place.
    void
    doWithDocument(Callback cal, String field, String term)
    Calls the callback function of cal for each document matching the term in the given field
    void
    doWithDocument(Callback cal, String field, String[] terms)
    Calls the callback function of cal for each document matching the terms in the given field
    static final String
    Encodes a String to an appropriate format that can be indexed as a single term using a StandardAnalyzer.
    static final String
    encodeToTerm(String s, boolean encodeWildCards)
    Encodes a String to an appropriate format that can be indexed as a single term using a StandardAnalyzer.
    static final String
    encodeToTerm(String s, boolean encodeWildCards, boolean encodeSpace)
    Encodes a String to an appropriate format that can be indexed as a single term or terms using a StandardAnalyzer.
    static final String
    escape(String term)
    Escapes all Lucene QueryParser reserved characters with a preceding \.
    static final String
    escape(String term, String preserveChars)
    Escapes the Lucene QueryParser reserved characters with a preceding \ except those included in preserveChars.
    protected void
    Override finalize to ensure resources are released...
    final org.apache.lucene.analysis.Analyzer
    Gets the analyzer that has been configured for this index.
    Object
    getAttribute(String key)
    Gets an attribute from this SimpleLuceneIndex.
    static final String
    Gets a datestamp of the current time formatted for display with logs and output.
    String
    getDefaultSearchField()
    Gets the name of the field that is searched by default if no field is indicated.
    org.apache.lucene.document.Document
    getDocument(int n)
    Gets the nth document in the index.
    List
    getFields()
    Gets a list of all fields in the index listed alphabetically.
    String
    getIndexLocation()
    Gets the absolute path to the directory where the index resides.
    long
    Gets the version number of the last time the index was modified by adding, deleting or changing a document.
    org.apache.lucene.queryParser.QueryParser.Operator
    getLuceneOperator()
    Gets the Lucene boolean operator that is currently being used for searches.
    static org.apache.lucene.util.Version
    Gets the version of Lucene.
    int
    getNumDocs()
    Gets the total number of documents in the index.
    int
    getNumDocs(String query)
    Gets the number of documents that match the given query.
    int
    getNumDocs(org.apache.lucene.search.Query query)
    Gets the number of documents that match the given query.
    int
    getOperator()
    Gets the boolean operator that is currently being used for searches.
    String
    getOperatorString()
    Gets the boolean operator that is currently being used for searches as a String (AND or OR).
    final org.apache.lucene.queryParser.QueryParser
    getQueryParser()
    Gets a new instance of the QueryParser used by this SimpleLuceneIndex that uses its Analyzers, defaultField and boolean operator settings.
    final org.apache.lucene.queryParser.QueryParser
    getQueryParser(String defaultSearchField)
    Gets a new instance of the QueryParser used by this SimpleLuceneIndex that uses its Analyzers and boolean operator settings, allowing one to specify the default search field.
    org.apache.lucene.index.IndexReader
    getReader()
    Gets the IndexReader.
    final Map
    getTermAndDocCounts(String[] fields)
    Gets a Map of all terms that are in the index under the given fields.
    Map
    getTermCounts()
    Gets a Map of all terms that are in the index.
    Map
    getTermCounts(String field)
    Gets a Map of all terms that are in the index under the given field.
    final Map
    getTermCounts(String[] fields)
    Gets a Map of all terms that are in the index under the given fields.
    int
    getTermFrequency(String term)
    Gets the termFrequency across all fields in the index
    int
    getTermFrequency(String field, String term)
    Gets the termFrequency of terms in the given field.
    Map
    getTermLists()
    Gets a Map of Lists that contain the terms for each field in the index.
    List
    getTerms(String field)
    Gets a list of all terms that are in the index under the given field name.
    boolean
    Indicates whether the index is currently being updated or modified.
    List
    listDocs()
    Gets a list of all Documents in the index.
    List
    listDocs(String field, String term)
    Gets a list of all Documents in the index that match the given term in the given field.
    List
    listDocs(String field, String[] terms)
    Gets a list of all Documents in the index that match the given terms in the given field.
    List
    listTerms()
    Gets a list of all terms in the index.
    boolean
    removeDocs(String field, String value)
    Removes all Documents that match the given term within the given field.
    boolean
    removeDocs(String field, String[] values)
    Removes all documents that match the given terms within the given field.
    boolean
    removeDocs(String field, String value, boolean block)
    See removeDocs(String,String) for description.
    ResultDocList
    searchDocs(String query)
    Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, String defaultField)
    Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, String defaultField, org.apache.lucene.search.Filter filter, org.apache.lucene.search.Sort sortBy)
    Performs a search over the index using the given query String, default field and Filter, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, HashMap docReaderAttributes)
    Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, HashMap docReaderAttributes, org.apache.lucene.analysis.Analyzer analyzer)
    Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, org.apache.lucene.analysis.Analyzer analyzer)
    Performs a search over the index using the given query String and Analyzer, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, org.apache.lucene.search.Filter filter)
    Performs a search over the index using the given query String and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, org.apache.lucene.search.Filter filter, HashMap docReaderAttributes)
    Performs a search over the index using the given query String and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, org.apache.lucene.search.Filter filter, org.apache.lucene.search.Sort sortBy, HashMap docReaderAttributes, org.apache.lucene.analysis.Analyzer analyzer)
    Performs a search over the index using the given query String and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(String query, org.apache.lucene.search.Sort sortBy)
    Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(org.apache.lucene.search.Query query)
    Performs a search over the index using the given Query with the pre-defined default field, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(org.apache.lucene.search.Query query, HashMap docReaderAttributes)
    Performs a search over the index using the Query object, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(org.apache.lucene.search.Query query, org.apache.lucene.search.Filter filter)
    Performs a search over the index using the given Query and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
    ResultDocList
    searchDocs(org.apache.lucene.search.Query query, org.apache.lucene.search.Filter filter, org.apache.lucene.search.Sort sortBy, HashMap docReaderAttributes)
    Performs a search over the index using the given Query Object and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
    void
    setAttribute(String key, Object attribute)
    Sets an attribute that will be available for access in search results by calling DocReader.getAttribute(String) or ResultDoc.getAttribute(String).
    static void
    setDebug(boolean db)
    Sets the debug attribute of the SimpleLuceneIndex object
    void
    setOperator(int operator)
    Sets the boolean operator used during searches.
    void
    Instructs the indexer to stop processing updates.
    boolean
    update(String deleteField, String[] deleteValues, org.apache.lucene.document.Document[] addDocs)
    Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs.
    boolean
    update(String deleteField, String[] deleteValues, org.apache.lucene.document.Document[] addDocs, boolean block)
    Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs.
    boolean
    update(String deleteField, String deleteValue, ArrayList addDocs, boolean block)
    boolean
    update(String deleteField, String deleteValue, org.apache.lucene.document.Document[] addDocs, boolean block)
    boolean
    update(String deleteField, String deleteValue, org.apache.lucene.document.Document addDoc, boolean block)
    boolean
    update(String deleteField, ArrayList deleteValues, ArrayList addDocs)
    Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs.
    boolean
    update(String deleteField, ArrayList deleteValues, ArrayList addDocs, boolean block)
    Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs.

    Methods inherited from class java.lang.Object

    clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • BLOCK

      public static final boolean BLOCK
      Indicates update operations will be blocked until the current one returns. When this is passed into a method, the method will not return until the update operation has completed.
    • NO_BLOCK

      public static final boolean NO_BLOCK
      Indicates update operations will be allowed while others are still in progress. When this is passed into a method, the method will return immediately rather than waiting for the update operation to complete.
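      The difference between the two flags can be sketched with a small stand-in class. This is a minimal illustration under stated assumptions, not the SimpleLuceneIndex implementation: the class BlockingUpdateSketch, its update and shutdownAndWait methods, the single worker thread, and the flag values true/false are all hypothetical, since the documentation names the flags but does not state their values or how updates are scheduled.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for illustration; not the SimpleLuceneIndex implementation. */
final class BlockingUpdateSketch {
    // Assumed values: the documentation names these flags but does not state their values.
    static final boolean BLOCK = true;
    static final boolean NO_BLOCK = false;

    // A single worker thread applies updates one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues an update and waits for it to finish only when block is true. */
    boolean update(Runnable work, boolean block) {
        Future<?> pending = worker.submit(work);
        if (!block) {
            return true; // NO_BLOCK: return at once, the update finishes later
        }
        try {
            pending.get(); // BLOCK: do not return until the update has completed
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false; // the update itself threw an exception
        }
    }

    /** Lets queued updates finish, then stops the worker; true if it stopped in time. */
    boolean shutdownAndWait() {
        worker.shutdown();
        try {
            return worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        BlockingUpdateSketch index = new BlockingUpdateSketch();
        AtomicInteger added = new AtomicInteger();
        index.update(added::incrementAndGet, BLOCK);    // returns after the add has run
        index.update(added::incrementAndGet, NO_BLOCK); // may return before the add runs
        index.shutdownAndWait();
        System.out.println("documents added: " + added.get());
    }
}
```

      In this sketch a caller that needs to read its own write immediately would pass BLOCK, while a bulk loader that only cares about eventual completion could pass NO_BLOCK.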
    • DEFAULT_OR

      public static final int DEFAULT_OR
      Use to set the boolean search operator to OR.
    • DEFAULT_AND

      public static final int DEFAULT_AND
      Use to set the boolean search operator to AND.
  • Constructor Details

    • SimpleLuceneIndex

      public SimpleLuceneIndex(String indexDirPath)
      Initializes or creates an index at the given location using a default search field named "default" and a StandardAnalyzer for index searching and creation.
      Parameters:
      indexDirPath - The directory where the index is located or will be created.
    • SimpleLuceneIndex

      public SimpleLuceneIndex(String indexDirPath, org.apache.lucene.analysis.Analyzer analyzer)
      Initializes or creates an index at the given location using a default search field named "default" and the given Analyzer.
      Parameters:
      indexDirPath - The directory where the index is located or will be created.
      analyzer - The default Analyzer to use for searching and index creation
    • SimpleLuceneIndex

      public SimpleLuceneIndex(String indexDirPath, String defaultField, org.apache.lucene.analysis.Analyzer analyzer)
      Initializes or creates an index at the given location using the given default search field and Analyzer.
      Parameters:
      indexDirPath - The directory where the index is located or will be created.
      defaultField - The name of the field used for default searching, for example "default".
      analyzer - The default Analyzer to use for searching and index creation
  • Method Details

    • deleteAndReinititlize

      public void deleteAndReinititlize()
      Deletes the index and re-initializes a new, empty one in its place.
    • setAttribute

      public void setAttribute(String key, Object attribute)
      Sets an attribute that will be available for access in search results by calling DocReader.getAttribute(String) or ResultDoc.getAttribute(String).
      Parameters:
      key - The key used to reference the attribute.
      attribute - Any Java Object.
    • getAttribute

      public Object getAttribute(String key)
      Gets an attribute from this SimpleLuceneIndex. Note that these attributes are available for access in search results by calling DocReader.getAttribute(String) or ResultDoc.getAttribute(String). The key 'thisIndex' returns this index.
      Parameters:
      key - The key used to reference the attribute.
      Returns:
      The Java Object that is stored under the given key or null if none exists.
    • getIndexLocation

      public String getIndexLocation()
      Gets the absolute path to the directory where the index resides.
      Returns:
      The absolute path to the index.
    • searchDocs

      public ResultDocList searchDocs(String query)
      Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, org.apache.lucene.search.Sort sortBy)
      Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      sortBy - A Sort to apply to the results or null to use relevancy ranking
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, org.apache.lucene.analysis.Analyzer analyzer)
      Performs a search over the index using the given query String and Analyzer, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      analyzer - The Analyzer to use to determine the tokens in the query.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, HashMap docReaderAttributes)
      Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      docReaderAttributes - Attributes that are included for use in DocReaders via the ResultDocConfig.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(org.apache.lucene.search.Query query, HashMap docReaderAttributes)
      Performs a search over the index using the Query object, returning an ordered array of matching ranked results.
      Parameters:
      query - The Query to search over the index.
      docReaderAttributes - Attributes that are included for use in DocReaders via the ResultDocConfig.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, HashMap docReaderAttributes, org.apache.lucene.analysis.Analyzer analyzer)
      Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      docReaderAttributes - Attributes that are included for use in DocReaders via the ResultDocConfig.
      analyzer - The analyzer to use, or null to use the default
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, String defaultField)
      Performs a search over the index using the given query String, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      defaultField - The default field to search in.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, String defaultField, org.apache.lucene.search.Filter filter, org.apache.lucene.search.Sort sortBy)
      Performs a search over the index using the given query String, default field and Filter, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      defaultField - The default field to search in, or null to use the pre-defined default field.
      filter - A filter used for the search.
      sortBy - A Sort to apply to the results or null to use relevancy ranking
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, org.apache.lucene.search.Filter filter)
      Performs a search over the index using the given query String and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      filter - A filter used for the search.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(org.apache.lucene.search.Query query, org.apache.lucene.search.Filter filter)
      Performs a search over the index using the given Query and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
      Parameters:
      query - The Query to perform over the index.
      filter - A filter used for the search.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(org.apache.lucene.search.Query query)
      Performs a search over the index using the given Query with the pre-defined default field, returning an ordered array of matching ranked results.
      Parameters:
      query - The Query to perform over the index.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, org.apache.lucene.search.Filter filter, HashMap docReaderAttributes)
      Performs a search over the index using the given query String and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      filter - A filter used for the search.
      docReaderAttributes - Attributes that are included for use in DocReaders via the ResultDocConfig.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(org.apache.lucene.search.Query query, org.apache.lucene.search.Filter filter, org.apache.lucene.search.Sort sortBy, HashMap docReaderAttributes)
      Performs a search over the index using the given Query Object and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
      Parameters:
      query - The Query to perform over the index.
      filter - A filter used for the search.
      sortBy - A Sort to apply to the results or null to use relevancy ranking
      docReaderAttributes - Attributes that are included for use in DocReaders via the ResultDocConfig.
      Returns:
      An ordered array of ranked results matching the given query.
    • searchDocs

      public ResultDocList searchDocs(String query, org.apache.lucene.search.Filter filter, org.apache.lucene.search.Sort sortBy, HashMap docReaderAttributes, org.apache.lucene.analysis.Analyzer analyzer)
      Performs a search over the index using the given query String and Filter with the pre-defined default field, returning an ordered array of matching ranked results.
      Parameters:
      query - The query to perform over the index.
      filter - A Filter used for the search or null for none.
      sortBy - A Sort to apply to the results or null to use relevancy ranking
      docReaderAttributes - Attributes that are included for use in DocReaders via the ResultDocConfig or null for none
      analyzer - The analyzer to use, or null to use the default
      Returns:
      An ordered array of ranked results matching the given query.
    • setOperator

      public void setOperator(int operator)
      Sets the boolean operator used during searches. Once set, the given boolean operator will be used for all subsequent searches. If this method is never called the boolean operator defaults to OR.
      Parameters:
      operator - The new boolean operator value.
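      The effect of the default boolean operator on a multi-term query can be sketched with a toy matcher. This is an illustration only and not how Lucene evaluates queries: the class DefaultOperatorSketch, its matches method, and the constant values 0 and 1 are hypothetical, chosen because the documentation names DEFAULT_OR and DEFAULT_AND without stating their values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for illustration; not how Lucene evaluates queries. */
final class DefaultOperatorSketch {
    // Assumed values: the documentation names these constants but not their values.
    static final int DEFAULT_OR = 0;
    static final int DEFAULT_AND = 1;

    /** Returns true if the document terms satisfy a whitespace-separated query. */
    static boolean matches(String query, Set<String> docTerms, int operator) {
        String[] terms = query.trim().toLowerCase().split("\\s+");
        if (operator == DEFAULT_AND) {
            return docTerms.containsAll(Arrays.asList(terms)); // every term is required
        }
        for (String term : terms) {
            if (docTerms.contains(term)) {
                return true; // any single term is enough
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("ocean", "currents"));
        System.out.println(matches("ocean tides", doc, DEFAULT_OR));  // true
        System.out.println(matches("ocean tides", doc, DEFAULT_AND)); // false
    }
}
```

      With OR, a query such as "ocean tides" matches a document containing either word, which widens recall; with AND, both words must be present, which narrows the result set.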
    • getOperator

      public int getOperator()
      Gets the boolean operator that is currently being used for searches.
      Returns:
      The boolean operator value.
    • getLuceneOperator

      public org.apache.lucene.queryParser.QueryParser.Operator getLuceneOperator()
      Gets the Lucene boolean operator that is currently being used for searches.
      Returns:
      The boolean operator value.
    • getQueryParser

      public final org.apache.lucene.queryParser.QueryParser getQueryParser()
      Gets a new instance of the QueryParser used by this SimpleLuceneIndex that uses its Analyzers, defaultField and boolean operator settings.
      Returns:
      The QueryParser used by this SimpleLuceneIndex
    • getQueryParser

      public final org.apache.lucene.queryParser.QueryParser getQueryParser(String defaultSearchField)
      Gets a new instance of the QueryParser used by this SimpleLuceneIndex that uses its Analyzers and boolean operator settings, allowing one to specify the default search field.
      Parameters:
      defaultSearchField - The search field used as default when none is specified in the query
      Returns:
      The QueryParser used by this SimpleLuceneIndex with the given default search field
    • getOperatorString

      public String getOperatorString()
      Gets the boolean operator that is currently being used for searches as a String (AND or OR).
      Returns:
      The boolean operator value as a String (AND or OR).
    • getDefaultSearchField

      public String getDefaultSearchField()
      Gets the name of the field that is searched by default if no field is indicated.
      Returns:
      The name of the default search field.
    • getReader

      public org.apache.lucene.index.IndexReader getReader()
      Gets the IndexReader.
      Returns:
      The reader value
    • getNumDocs

      public int getNumDocs(String query)
      Gets the number of documents that match the given query.
      Parameters:
      query - The query to perform over the index.
      Returns:
      The number of matching documents.
    • getNumDocs

      public int getNumDocs(org.apache.lucene.search.Query query)
      Gets the number of documents that match the given query.
      Parameters:
      query - The query to perform over the index.
      Returns:
      The number of matching documents.
    • getNumDocs

      public int getNumDocs()
      Gets the total number of documents in the index.
      Returns:
      The number of documents in the index.
    • listDocs

      public List listDocs()
      Gets a list of all Documents in the index. Note: This method loads all Documents and requires a large amount of memory for large result sets (consider using search instead).
      Returns:
      A list of all documents in the index.
    • listDocs

      public List listDocs(String field, String term)
      Gets a list of all Documents in the index that match the given term in the given field. Note: This method loads all Documents and requires a large amount of memory for large result sets (consider using search instead).
      Parameters:
      field - The field searched.
      term - The term to match.
      Returns:
      A list of matching documents.
    • listDocs

      public List listDocs(String field, String[] terms)
      Gets a list of all Documents in the index that match the given terms in the given field. Note: This method loads all Documents and requires a large amount of memory for large result sets (consider using search instead).
      Parameters:
      field - The field searched.
      terms - The terms to match.
      Returns:
      A list of matching documents.
    • doWithDocument

      public void doWithDocument(Callback cal, String field, String[] terms)
      Calls the callback function of cal for each document matching the terms in the given field
      Parameters:
      cal - The Callback invoked for each matching document.
      field - The field searched.
      terms - The terms to match.
    • doWithDocument

      public void doWithDocument(Callback cal, String field, String term)
      Calls the callback function of cal for each document matching the term in the given field
      Parameters:
      cal - The Callback invoked for each matching document.
      field - The field searched.
      term - The term to match.
    • listTerms

      public List listTerms()
      Gets a list of all terms in the index.
      Returns:
      A list of all terms in the index.
    • getFields

      public List getFields()
      Gets a list of all fields in the index listed alphabetically. Depending on the state of the index, the list may contain fields that are empty, meaning all terms for the given field have been deleted and there are no possible matching queries within the field.
      Returns:
      A list of all fields in the index.
    • getTermLists

      public Map getTermLists()
      Gets a Map of Lists that contain the terms for each field in the index. The keys in the Map are Strings that represent all fields in the index. The List that is returned for each key contains all terms that are in the index for the given field.
      Returns:
      A Map of term Lists keyed by field Strings.
    • getTerms

      public List getTerms(String field)
      Gets a list of all terms that are in the index under the given field name. Implementation note: this method is not efficient. If you need to use this method frequently, consider caching the results and using getLastModifiedCount() to determine when to update the cache.
      Parameters:
      field - The indexed field name.
      Returns:
      List of terms in the index under the given field.
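      The caching approach suggested in the implementation note can be sketched as follows. This is one possible shape under stated assumptions, not code from the library: TermCacheSketch and its nested TermSource interface are hypothetical, the interface stands in for the two calls the note mentions, and the long return type of getLastModifiedCount() and the List<String> element type are assumed for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache wrapper for illustration; not part of the documented API. */
final class TermCacheSketch {
    /** Assumed view of the two calls the implementation note mentions. */
    interface TermSource {
        long getLastModifiedCount();          // assumed to change on every index update
        List<String> getTerms(String field);  // the expensive call being cached
    }

    private final TermSource source;
    private final Map<String, List<String>> cache = new HashMap<>();
    private long cachedVersion = Long.MIN_VALUE;

    TermCacheSketch(TermSource source) {
        this.source = source;
    }

    /** Returns cached terms, rescanning only after the modification count changes. */
    synchronized List<String> getTerms(String field) {
        long version = source.getLastModifiedCount();
        if (version != cachedVersion) {
            cache.clear(); // index changed: every cached list may be stale
            cachedVersion = version;
        }
        return cache.computeIfAbsent(field, source::getTerms);
    }

    public static void main(String[] args) {
        final long[] version = {1L}; // bumped whenever the pretend index changes
        TermSource source = new TermSource() {
            public long getLastModifiedCount() { return version[0]; }
            public List<String> getTerms(String field) {
                System.out.println("expensive scan of field " + field);
                return Arrays.asList("ocean", "tides");
            }
        };
        TermCacheSketch cache = new TermCacheSketch(source);
        cache.getTerms("title"); // scans the field
        cache.getTerms("title"); // served from the cache
        version[0]++;            // simulate an index update
        cache.getTerms("title"); // scans again
    }
}
```

      Comparing a single counter is cheap, so the check can run on every call; the same pattern would apply to the getTermCounts and getTermAndDocCounts results.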
    • getTermCounts

      public Map getTermCounts(String field)
      Gets a Map of all terms that are in the index under the given field. The keys in the map are Strings that list the terms. The values in the Map are Integers that hold the total count of the terms in the given field across all documents.

      Implementation note: this method is not efficient. If you need to use this method frequently, consider caching the results and using getLastModifiedCount() to determine when to update the cache.

      Parameters:
      field - The indexed field name.
      Returns:
      Map containing terms/counts for all terms in the index under the given field.
    • getTermCounts

      public Map getTermCounts()
      Gets a Map of all terms that are in the index. The keys in the map are Strings that list the terms. The values in the Map are Integers that hold the total count of the terms across all documents.

      Implementation note: this method is not efficient. If you need to use this method frequently, consider caching the results and using getLastModifiedCount() to determine when to update the cache.

      Returns:
      Map containing terms/counts for all terms in the index.
    • getTermCounts

      public final Map getTermCounts(String[] fields)
      Gets a Map of all terms that are in the index under the given fields. The keys in the map are Strings that list the terms. The values in the Map are Integers that hold the total count of the terms in the given fields across all documents.

      Implementation note: this method is not efficient. If you need to use this method frequently, consider caching the results and using getLastModifiedCount() to determine when to update the cache.

      Parameters:
      fields - The indexed field names.
      Returns:
      Map containing terms/counts for all terms in the index under the given fields.
    • getTermAndDocCounts

      public final Map getTermAndDocCounts(String[] fields)
      Gets a Map of all terms that are in the index under the given fields. The keys in the map are Strings that list the terms. The values in the Map are TermDocCount Objects, which contain the term count, the total number of documents containing the term in one or more of the given field(s), and a list of fields in which the term appears.

      Implementation note: this method is not efficient. If you need to use this method frequently, consider caching the results and using getLastModifiedCount() to determine when to update the cache. Also, this method is considerably slower when more than one field is requested, because an extra query is required for each term that is found.

      Parameters:
      fields - The indexed field names.
      Returns:
      Map containing a TermDocCount Object for all terms in the index under the given fields.
    • getTermFrequency

      public int getTermFrequency(String term)
      Gets the termFrequency across all fields in the index
      Parameters:
      term - The term.
      Returns:
      The termFrequency value.
    • getTermFrequency

      public int getTermFrequency(String field, String term)
      Gets the frequency of the given term in the given field.
      Parameters:
      field - The field.
      term - The term.
      Returns:
      The termFrequency.
    • addDoc

      public boolean addDoc(org.apache.lucene.document.Document doc)
      Adds a Document to the index. Blocks all other update operations until complete.
      Parameters:
      doc - The Document to add.
      Returns:
      True if successful.
    • addDoc

      public boolean addDoc(org.apache.lucene.document.Document doc, boolean block)
      Adds a Document to the index.
      Parameters:
      doc - The Document to add.
      block - Indicates whether to block other updates until complete.
      Returns:
      True if successful.
    • addDocs

      public boolean addDocs(org.apache.lucene.document.Document[] docs)
      Adds a group of Documents to the index. Blocks all other update operations until complete.
      Parameters:
      docs - The Documents to add.
      Returns:
      True if successful.
    • addDocs

      public boolean addDocs(org.apache.lucene.document.Document[] docs, boolean block)
      Adds a group of Documents to the index. The block parameter controls whether other update operations are blocked until this one completes.
      Parameters:
      docs - The Documents to add.
      block - Indicates whether to block other updates until complete.
      Returns:
      True if successful.
    • removeDocs

      public boolean removeDocs(String field, String value)
      Removes all Documents that match the given term within the given field. This is useful for removing a single document that is indexed with a unique ID field, or for removing a group of documents matching the same term for a given field. Blocks all other index update operations until this is complete.
      Parameters:
      field - The field that is searched.
      value - The term that is matched for deletes.
      Returns:
      True if the delete was successful.
    • removeDocs

      public boolean removeDocs(String field, String value, boolean block)
      See removeDocs(String,String) for description. Adds the ability to control whether blocking occurs during the update.
      Parameters:
      field - The field that is searched.
      value - The term that is matched for deletes.
      block - Indicates whether or not to block other update operations.
      Returns:
      True if the delete was successful.
    • removeDocs

      public boolean removeDocs(String field, String[] values)
      Removes all documents that match the given terms within the given field. This is useful for removing all individual documents that are indexed with a unique ID field. Blocks all other index update operations until this is complete.
      Parameters:
      field - The field that is searched.
      values - The terms that are matched for deletes.
      Returns:
      True if the delete was successful.
    • update

      public boolean update(String deleteField, String[] deleteValues, org.apache.lucene.document.Document[] addDocs, boolean block)
      Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs. Assuming the deleteField contains a unique ID for the Document, the Document may be removed by indicating the ID in the deleteValues list. To replace an entry in the index for a single item, supply the item's ID in the deleteValues list and supply the new Document for the item in the addDocs list.
      Parameters:
      deleteField - The field searched for deleteValues.
      deleteValues - The value matched in deleteField to indicate which document(s) to delete.
      addDocs - An array of Documents to add to the index
      block - Indicates whether or not to block other threads or JVMs from reading from or writing to the index during the delete/add operation.
      Returns:
      True if no errors, otherwise false.
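      The delete-then-add behavior described above can be illustrated with a small in-memory model. This is a sketch only: it does not use Lucene, documents are modeled as plain field/value maps, and the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory model of update(): delete every document whose deleteField value
// is listed in deleteValues, then append the new documents.
final class UpdateSemanticsSketch {

    private final List<Map<String, String>> docs = new ArrayList<>();

    synchronized boolean update(String deleteField, String[] deleteValues,
                                List<Map<String, String>> addDocs) {
        Set<String> toDelete = new HashSet<>(Arrays.asList(deleteValues));
        // Step 1: remove all documents matching any of the delete values.
        docs.removeIf(d -> toDelete.contains(d.get(deleteField)));
        // Step 2: add the replacement documents.
        docs.addAll(addDocs);
        return true;
    }

    synchronized List<Map<String, String>> snapshot() {
        return new ArrayList<>(docs);
    }

    // Hypothetical helper that builds a two-field document.
    static Map<String, String> doc(String id, String title) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        d.put("title", title);
        return d;
    }

    public static void main(String[] args) {
        UpdateSemanticsSketch index = new UpdateSemanticsSketch();
        index.update("id", new String[0], Arrays.asList(doc("42", "old"), doc("7", "other")));
        // Replace the entry for item 42: its ID goes in deleteValues, the new doc in addDocs.
        index.update("id", new String[] {"42"}, Arrays.asList(doc("42", "new")));
        for (Map<String, String> d : index.snapshot()) {
            System.out.println(d.get("title"));   // prints "other", then "new"
        }
    }
}
```

      Because the delete step runs first, supplying an ID that is not yet in the index simply adds the new document.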
    • update

      public boolean update(String deleteField, String[] deleteValues, org.apache.lucene.document.Document[] addDocs)
      Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs. See update(String, String[], Document[], boolean) for description. Performs an update with blocking on.
      Parameters:
      deleteField - The field searched for deleteValues.
      deleteValues - Array of Strings containing the value matched in deleteField to indicate which document(s) to delete
      addDocs - Array containing Documents to add to the index
      Returns:
      True if no errors, otherwise false.
    • update

      public boolean update(String deleteField, String deleteValue, org.apache.lucene.document.Document[] addDocs, boolean block)
      Parameters:
      deleteField - The field searched for deleteValue.
      deleteValue - Matching docs are deleted.
      addDocs - These Docs are added to the index
      block - Block or run in background.
      Returns:
      True if no errors.
    • update

      public boolean update(String deleteField, String deleteValue, org.apache.lucene.document.Document addDoc, boolean block)
      Parameters:
      deleteField - The field searched for deleteValue.
      deleteValue - Matching docs are deleted.
      addDoc - The Doc to be added to the index
      block - Block or run in background.
      Returns:
      True if no errors.
    • update

      public boolean update(String deleteField, String deleteValue, ArrayList addDocs, boolean block)
      Parameters:
      deleteField - The field searched for deleteValue.
      deleteValue - Matching docs are deleted.
      addDocs - These Docs are added to the index
      block - Block or run in background.
      Returns:
      True if no errors.
    • update

      public boolean update(String deleteField, ArrayList deleteValues, ArrayList addDocs, boolean block)
      Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs. See update(String, String[], Document[], boolean) for description.
      Parameters:
      deleteField - The field searched for deleteValues.
      deleteValues - ArrayList of Strings containing the value matched in deleteField to indicate which document(s) to delete
      addDocs - An ArrayList containing Documents to add to the index
      block - Indicates whether or not to block other threads or JVMs from reading from or writing to the index during the delete/add operation.
      Returns:
      True if no errors, otherwise false.
    • update

      public boolean update(String deleteField, ArrayList deleteValues, ArrayList addDocs)
      Updates the index by first deleting the documents that match the value(s) indicated in deleteValues in the field deleteField, then adding the documents in addDocs. See update(String, String[], Document[], boolean) for description. Performs an update with blocking on.
      Parameters:
      deleteField - The field searched for deleteValues.
      deleteValues - ArrayList of Strings containing the value matched in deleteField to indicate which document(s) to delete
      addDocs - An ArrayList containing Documents to add to the index
      Returns:
      True if no errors, otherwise false.
    • getLastModifiedCount

      public long getLastModifiedCount()
      Gets the version number of the index as of its last modification, that is, the last time a document was added, deleted or changed. The version number counts the number of times the index was modified. If the index is deleted and rebuilt, the count will continue to be incremented until the next time the JVM is restarted. After the JVM has been restarted, the count will resume with the count of the new current index.
      Returns:
      The lastModifiedCount value
    • getDocument

      public org.apache.lucene.document.Document getDocument(int n)
      Gets the nth document in the index.
      Parameters:
      n - The document number
      Returns:
      The document value
    • isIndexing

      public boolean isIndexing()
      Indicates whether the index is currently being updated or modified. This means documents are in the process of being added or removed from the index.
      Returns:
      True if the index is in the process of being updated.
    • stopIndexing

      public void stopIndexing()
      Instructs the indexer to stop processing updates. Once complete, the index will be ready for future updating and searching but any additions or deletions that had not been completed will be lost. This method may take several seconds to return.
    • getAnalyzer

      public final org.apache.lucene.analysis.Analyzer getAnalyzer()
      Gets the analyzer that has been configured for this index.
      Returns:
      The Analyzer
    • close

      public void close()
      Closes the writers and performs clean-up
    • finalize

      protected void finalize()
      Overrides finalize to ensure resources are released.
      Overrides:
      finalize in class Object
    • escape

      public static final String escape(String term)
      Escapes all Lucene QueryParser reserved characters with a preceding \. The resulting String will be interpreted by the QueryParser as a single term.
      Parameters:
      term - The original String
      Returns:
      The escaped term
      See Also:
      • QueryParser.escape(String)
    • escape

      public static final String escape(String term, String preserveChars)
      Escapes the Lucene QueryParser reserved characters with a preceding \, except those included in preserveChars.
      Parameters:
      term - The original String
      preserveChars - List of characters NOT to escape
      Returns:
      The escaped term
      See Also:
      • QueryParser.escape(String)
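      The escaping rule, including the preserveChars exception, can be sketched as below. This is an illustration rather than the library source: the RESERVED set is an assumption modeled on the characters that Lucene query syntax treats specially, and the exact set depends on the Lucene version in use.

```java
// Sketch: prefix each reserved character with a backslash unless it is preserved.
final class EscapeSketch {

    // Assumed reserved set; the real set is defined by the Lucene QueryParser version.
    private static final String RESERVED = "\\+-!():^[]\"{}~*?|&";

    static String escape(String term, String preserveChars) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            boolean reserved = RESERVED.indexOf(c) >= 0;
            boolean preserved = preserveChars != null && preserveChars.indexOf(c) >= 0;
            if (reserved && !preserved) {
                out.append('\\');   // escape with a preceding backslash
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("title:rock*", ""));    // title\:rock\*
        System.out.println(escape("title:rock*", "*"));   // title\:rock*
    }
}
```

      Preserving '*' in this way keeps a trailing wildcard usable while the rest of the input is still treated literally.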
    • encodeToTerm

      public static final String encodeToTerm(String s)
      Encodes a String to an appropriate format that can be indexed as a single term using a StandardAnalyzer. White-space is also encoded and incorporated into the single term. Note that the encoding cannot be reversed. Save the value of the term in a separate field if it needs to be retrieved for display.

      Specifically: each letter or number character is left unchanged. All other characters are encoded as the letter 'x' followed by the integer value of the character, for example '@' is encoded as 'x64'.

      Parameters:
      s - The string to encode.
      Returns:
      Encoded String that can be used as a single term.
    • encodeToTerm

      public static final String encodeToTerm(String s, boolean encodeWildCards)
      Encodes a String to an appropriate format that can be indexed as a single term using a StandardAnalyzer. White-space is also encoded and incorporated into the single term. Leaving the wild card '*' char un-encoded will produce a String that can be used to search encoded terms using wild cards. Note that the encoding cannot be reversed. Save the value of the term in a separate field if it needs to be retrieved for display.

      Specifically: each letter or number character is left unchanged. All other characters are encoded as the letter 'x' followed by the integer value of the character, for example '@' is encoded as 'x64'.

      Parameters:
      s - The string to encode.
      encodeWildCards - True to have the '*' char encoded, false to leave it un-encoded.
      Returns:
      Encoded String that can be used as a single term.
    • encodeToTerm

      public static final String encodeToTerm(String s, boolean encodeWildCards, boolean encodeSpace)
      Encodes a String to an appropriate format that can be indexed as a single term or terms using a StandardAnalyzer. Leaving the space char un-encoded will produce a String that will be tokenized by the space char into individual terms. Leaving the wild card '*' char un-encoded will produce a String that can be used to search encoded terms using wild cards. Note that the encoding cannot be reversed. Save the value of the term in a separate field if it needs to be retrieved for display.

      Specifically: each letter or number character is left unchanged. All other characters are encoded as the letter 'x' followed by the integer value of the character, for example '@' is encoded as 'x64'.

      Parameters:
      s - The string to encode.
      encodeWildCards - True to have the '*' char encoded, false to leave it un-encoded.
      encodeSpace - True to have the space ' ' char encoded, false to leave it un-encoded.
      Returns:
      Encoded String that can be used as a single term or terms.
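      The encoding rule shared by the three encodeToTerm variants can be sketched as follows. This is an illustrative re-implementation of the documented rule, not the library source; in particular, the use of Character.isLetterOrDigit to decide what counts as a "letter or number" is an assumption.

```java
// Sketch of the documented rule: letters and digits pass through, every other
// character becomes 'x' followed by its integer value.
final class TermEncodingSketch {

    static String encodeToTerm(String s, boolean encodeWildCards, boolean encodeSpace) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean keep = Character.isLetterOrDigit(c)
                    || (c == '*' && !encodeWildCards)   // leave wild card searchable
                    || (c == ' ' && !encodeSpace);      // leave space as a token separator
            if (keep) {
                out.append(c);
            } else {
                out.append('x').append((int) c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeToTerm("a@b", true, true));            // ax64b
        System.out.println(encodeToTerm("earth science", true, true));  // earthx32science
        System.out.println(encodeToTerm("geo*", false, true));          // geo*
        System.out.println(encodeToTerm("geo*", true, true));           // geox42
    }
}
```

      In this sketch the inputs "@" and "x64" both produce "x64", which shows why the encoding cannot be reversed and why the original value should be stored in a separate field.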
    • getLuceneVersion

      public static org.apache.lucene.util.Version getLuceneVersion()
      Gets the version of Lucene.
      Returns:
      The luceneVersion value
    • getDateStamp

      public static final String getDateStamp()
      Gets a datestamp of the current time formatted for display with logs and output.
      Returns:
      A datestamp for display purposes.
    • setDebug

      public static void setDebug(boolean db)
      Sets the debug attribute of the SimpleLuceneIndex object
      Parameters:
      db - The new debug value