-
Type:
Bug
-
Status: Done
-
Resolution: Done
-
Affects Version/s: None
-
Fix Version/s: Search Services 2.0
-
Labels:None
-
Bug Priority:
-
Epic Link:
-
Sprint:Team Ninja-King - S&I 36, Team Ninja-King - S&I 37, Team Ninja-King - S&I 38, Team Ninja-King - S&I 39
-
Story Points:13
Avg function doesn't work if you are grouping by multiple group by.
The issue is reproducible through an ad hoc integration test in:
org.alfresco.solr.query.stream.DistributedGroupBySqlIT
e.g.
sql = "select mf_freetext_underscore as underscoreField, count(mf_freetext_underscore) as numUnderscore, exif_manufacturer, finance_title" + "cm_created, " + "count(*) as ExposureCount, " + "sum(exif_exposureTime) as TotalExposure, " + "avg(exif_exposureTime) as AvgExposure, " + "min(exif_exposureTime) as MinExposure, " + "max(exif_exposureTime) as MaxExposure " + "from alfresco " + "where " + "exif_exposureTime > 0 and " + "cm_content='world' and " + "cm_created BETWEEN '2000-01-01T01:01:01Z' AND '2010-02-14T23:59:59Z'" + "group by mf_freetext_underscore, exif_manufacturer,finance_title";
The reason is that Calcite parses the avg function as avg when there's just one group by.
When there are multiple group by it is parsed as SUM(field)/COUNT(field).
This is noticeable here: org.alfresco.solr.sql.SolrTable.SolrQueryable#query
Taking a look to List<Pair<String, String>> metricPairs input for both the scenarios, the difference is clear (with the multiple group by you see the DIVISION in the String operations input parameter).
org.alfresco.solr.sql.SolrTable.SolrQueryable#query is kinda the Alfresco custom code entry point, so I suspect the bug may be Calcite side.
I was not able to find an elegant solution with a quick fix, addressing it may require some work (unless we opt for a brute workaround).
Acceptance criteria :
- avg function works when you group by multiple fields
- few new integration tests are added for that specific use case in org.alfresco.solr.query.stream.DistributedGroupBySqlIT
- avg(CAST(cm_sizeCurrent AS DOUBLE)) as AvgSize returns the double average even if cm_sizeCurrent was originally BIGINT
- Depended on by
-
SEARCH-2304 Support for CAST AS TIMESTAMP function for Date and Datetime fields
- Done