OFBiz › OFBiz - Dev

[jira] [Comment Edited] (OFBIZ-5115) wrong COUNT DISTINCT if there's GROUP BY clause

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

1 message

Nicolas Malin (Jira)

[jira] [Comment Edited] (OFBIZ-5115) wrong COUNT DISTINCT if there's GROUP BY clause

[ https://issues.apache.org/jira/browse/OFBIZ-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540723#comment-13540723 ]

Leon edited comment on OFBIZ-5115 at 12/29/12 1:57 AM:
-------------------------------------------------------

Jacques,

We use OFBiz as the back-end framework of our LIMS (Laboratory Information Management System). It requires very heavy data statistics. In this case, we defined a static view entity to do some basic summary job and use "select" with distinct options to get desired result data-set. The data itself is no problem, but we found the total count it returned is always a little bigger than expected.

After some research, we found the root cause: the SQL with group by implies to select all the fields to be grouped by, that's why "count" and /or "count(distinct)" does not work as usual. e.g.

SELECT COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID -> its count result is same as SELECT COUNTRY_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID. and,

SELECT COUNT(DISTINCT COUNTRY_GEO_ID) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID -> its data set result size equals that of SELECT COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID.

was (Author: utcb):
Jacques,

We use OFBiz as the back-end framework of our LIMS (Laboratory Information Management System). It requires very heavy data statistics. In this case, we defined a static view entity to get some basic summary job and use "select" with distinct options to get desired result data-set. The data itself is no problem, but we found the total count it returned is always a little bigger than expected.

After some research, we found the root cause: the SQL with group by implies to select all the fields to be grouped by, that's why "count" and /or "count(distinct)" does not work as usual. e.g.

SELECT COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID -> its count result is same as SELECT COUNTRY_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID. and,

SELECT COUNT(DISTINCT COUNTRY_GEO_ID) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID -> its data set result size equals that of SELECT COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID, COUNT(*) FROM POSTAL_ADDRESS GROUP BY COUNTRY_GEO_ID, STATE_PROVINCE_GEO_ID.

> wrong COUNT DISTINCT if there's GROUP BY clause
> -----------------------------------------------
>
> Key: OFBIZ-5115
> URL: https://issues.apache.org/jira/browse/OFBIZ-5115
> Project: OFBiz
> Issue Type: Bug
> Components: framework
> Affects Versions: SVN trunk
> Reporter: Leon
> Priority: Minor
> Fix For: SVN trunk
>
> Attachments: OFBIZ-5115.patch
>
>
> if there's a "group by" and distinct find options set, then the count of select returned is always the total number of non-distinct.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira