  1. Alfresco One Platform
  2. ACE-4000

SOLR 4 - Auto phrase detection, re-rank and index size reduction



      When users type two words into the Share search, the search is performed for
      word1 AND word2; no weight is given to the phrase "word1 word2".

      For simple searches (a string of terms that would be implicitly or explicitly ANDed together), auto-generate a phrase for those terms and use it to weight the results.
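A minimal sketch of the auto-phrase idea, not the actual implementation: the function name `auto_phrase_query`, the default boost, and the rule for what counts as a "simple" search are all assumptions made for illustration. It emits Lucene classic query syntax in which every term is required and the generated phrase is an optional, boosted clause that only affects scoring.

```python
import re

# Assumption: any of these characters means the query is not "simple".
_SPECIAL = re.compile(r'["*?~^:()\[\]{}+\-!\\/]')
# Assumption: these operators also disqualify a query from auto-phrasing.
_OPERATORS = {"OR", "NOT"}


def auto_phrase_query(text, boost=2.0):
    """Return a query with required terms plus an optional boosted phrase.

    Queries that are not a plain run of (implicitly or explicitly) ANDed
    terms are returned unchanged.
    """
    # Drop explicit AND so "a AND b" and "a b" are treated the same way.
    tokens = [t for t in text.split() if t != "AND"]
    if len(tokens) < 2:
        return text
    if any(t in _OPERATORS or _SPECIAL.search(t) for t in tokens):
        return text
    required = " ".join("+" + t for t in tokens)
    phrase = '"' + " ".join(tokens) + '"'
    return f"{required} {phrase}^{boost}"


print(auto_phrase_query("word1 word2"))
print(auto_phrase_query("word1 OR word2"))
```

Under these assumptions, `word1 word2` becomes `+word1 +word2 "word1 word2"^2.0`, while a query containing `OR` is left as typed.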

      Executing a conjunction followed by re-ranking means the phrase is only evaluated against the top matches of the conjunction query, not all possible matches, so there may be some ranking issues. Configuration to support a single pass (adding the phrase to the query rather than replacing it) is also possible.
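The ranking issue mentioned above can be illustrated with a toy in-memory model. This is a sketch only: the scoring (raw term counts), the `phrase_bonus` value, and all function names are hypothetical and do not reflect how SOLR scores documents. The conjunction is evaluated over every document, but the phrase is checked only for the first `top_n` matches.

```python
def conjunction_score(doc_tokens, terms):
    """Toy score: summed term counts, or None if any term is missing."""
    if not all(t in doc_tokens for t in terms):
        return None
    return float(sum(doc_tokens.count(t) for t in terms))


def contains_phrase(doc_tokens, terms):
    """True if the terms occur adjacently and in order."""
    n = len(terms)
    return any(doc_tokens[i:i + n] == terms
               for i in range(len(doc_tokens) - n + 1))


def search_with_rerank(docs, terms, top_n=2, phrase_bonus=10.0):
    """Run the conjunction, then re-rank only the top_n matches by phrase."""
    scored = []
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        score = conjunction_score(tokens, terms)
        if score is not None:
            scored.append((doc_id, score, tokens))
    scored.sort(key=lambda item: (-item[1], item[0]))

    # Only the head of the list is ever checked for the phrase.
    head, tail = scored[:top_n], scored[top_n:]
    head = [(d, s + (phrase_bonus if contains_phrase(t, terms) else 0.0), t)
            for d, s, t in head]
    head.sort(key=lambda item: (-item[1], item[0]))
    return [d for d, _, _ in head + tail]


docs = {
    "a": "size matters for every index index index",
    "b": "index the size and index again",
    "c": "index size reduction",
    "d": "index only",
}
print(search_with_rerank(docs, ["index", "size"], top_n=2))
print(search_with_rerank(docs, ["index", "size"], top_n=3))
```

With `top_n=2`, document `c` contains the exact phrase but sits outside the re-ranked window, so it stays last. Widening the window to cover every match lets it rise to the top, which in this toy model behaves like the single-pass configuration.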

      Shingle-based solutions generate an index that is significantly larger, although they are fast for two-term queries. There is code to support shingle-based index evaluation, but it does not cover in-shingle wildcards. Auto-phrase is still useful for shingles.
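To show why shingles enlarge the index, here is a small sketch that builds two-word shingles from a token stream and compares the number of distinct index terms with and without them. The helper names are hypothetical and the counts apply only to this sample sentence, not to any real index.

```python
def shingles(tokens, size=2):
    """Return adjacent word n-grams (shingles) of the given size."""
    return [" ".join(tokens[i:i + size])
            for i in range(len(tokens) - size + 1)]


def index_terms(tokens, with_shingles=False):
    """Distinct terms a toy index would hold for one document."""
    terms = set(tokens)
    if with_shingles:
        terms |= set(shingles(tokens))
    return terms


tokens = "auto phrase detection and index size reduction".split()
print(shingles(tokens))
print(len(index_terms(tokens)), len(index_terms(tokens, with_shingles=True)))
```

A two-term query can then be answered by a single lookup of the shingle `"index size"`, which is where the speed comes from, at the cost of the extra terms.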

      As auto phrases are generated at query time, this should resolve the issue with multi-term frequencies. The parser would otherwise have split the terms, and the analyzer would not have seen them together.

      Multi-term synonyms should be tested as part of this feature.





              • Assignee:
                closedissues Closed Issues
                ahind Andrew Hind [X] (Inactive)

