  1. Search and Discovery
  2. SEARCH-2171

Index dates in a decomposed way (year, month, day, etc.)

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: Search Services 2.0
    • Labels:
      None

      Description

      Background

      To support SQL date functions, we would like to index the parts of a date (or datetime) in separate fields. This will make it significantly easier to run queries such as "WHERE DAY(cm_created) = 16". Since we already plan to require a reindex for the next release from master (due to the content store removal work), it is acceptable to add these extra fields to the index for all dates.

      However, there was concern that for large metadata-only customers these fields may significantly increase the index size, and that customers who are not using Insight Engine potentially will not see the benefits.

      Consequently, we would like to add a configuration property (defaulting to true) that switches the generation and indexing of those additional fields on or off.

      Given the one-off nature of the opportunity to reindex, it seems reasonable to index fields for the YEAR, MONTH, DAY, HOUR, MINUTE and SECOND.

      Specifically, if the property datatype is DATE, additional fields for YEAR, MONTH and DAY will be created; if the datatype is DATETIME, YEAR, MONTH, DAY, HOUR, MINUTE and SECOND fields will be created.

      Note that no time zone is managed; the string representations of dates are always expressed in Coordinated Universal Time (UTC). As a consequence, the generated parts will index the UTC value. For example, given a date like this:

      1972-09-16T17:33:18Z

      the separate fields will have the following values:

      • year: 1972
      • month: 9
      • day: 16
      • hour: 17
      • minute: 33
      • second: 18 
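
      The decomposition described above can be sketched in a few lines of Python. This is only an illustration, not the Search Services implementation: the function name destructure and the "DATE"/"DATETIME" string arguments are assumptions made for the example, and the input is assumed to always be a UTC string in the format shown above.

```python
from datetime import datetime, timezone

# Parts generated per datatype, as stated in the description.
DATE_PARTS = ("year", "month", "day")
DATETIME_PARTS = DATE_PARTS + ("hour", "minute", "second")


def destructure(value: str, datatype: str) -> dict:
    """Split a UTC date string into its parts (hypothetical helper)."""
    if datatype not in ("DATE", "DATETIME"):
        raise ValueError(f"unsupported datatype: {datatype}")
    # No time zone conversion is applied: the value is taken as UTC as-is.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    parts = DATETIME_PARTS if datatype == "DATETIME" else DATE_PARTS
    return {part: getattr(parsed, part) for part in parts}


print(destructure("1972-09-16T17:33:18Z", "DATETIME"))
print(destructure("1972-09-16T17:33:18Z", "DATE"))
```

      For a DATE property only the year, month and day entries are returned, which mirrors the rule given for the two datatypes.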

      Acceptance Criteria

      As an Insight Engine developer
      When I index a date or datetime property and the configuration property "alfresco.destructureDateFields" is set to true
      Then I can use Luke (or similar) to see that the year, month and day have been indexed as separate fields against the document.
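
      As a rough sketch of what this criterion checks, the Python below models how the flag could gate the derived fields. It is not the real indexing code: the function derived_fields, its destructure_date_fields parameter (standing in for "alfresco.destructureDateFields") and the "<field>_<part>" naming of the derived fields are all assumptions made for illustration; the actual field names in the index may differ.

```python
from datetime import datetime

# Parts generated per datatype, as stated in the description.
PARTS = {
    "DATE": ("year", "month", "day"),
    "DATETIME": ("year", "month", "day", "hour", "minute", "second"),
}


def derived_fields(name, value, datatype, destructure_date_fields=True):
    """Return the extra fields for one single-valued date property (hypothetical)."""
    if not destructure_date_fields:
        # Flag switched off: no additional fields are generated.
        return {}
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    # The "<name>_<part>" naming is an assumption, not the real schema.
    return {f"{name}_{part}": getattr(parsed, part) for part in PARTS[datatype]}


doc = derived_fields("cm_created", "1972-09-16T17:33:18Z", "DATETIME")
expected = ("cm_created_year", "cm_created_month", "cm_created_day")
missing = [field for field in expected if field not in doc]
print("missing fields:", missing)
```

      With the flag set to false the same call returns an empty mapping, so a tool such as Luke would show no derived fields for the document.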

      Notes

      • This includes a rough estimation of the additional disk space required for indexing the de-structured parts of date/datetime fields
      • This does not include multi-valued date/datetime fields: for each single-valued date / datetime field there will be one set of derived fields (day, month, year, etc)
      • This does not include any updates to UIs that can create or display models. We should raise any tickets needed for UI updates.
      • Need to check whether this approach is suitable for an SP release (e.g. ACS 6.2.1) or whether it would need to wait until ACS 7.

                People

                • Assignee: Unassigned
                • Reporter: Tom Page (tpage)
