dbaspot
Tags Register FAQ Calendar Search Today's Posts Mark Forums Read

index scan vs. table scan - ibm-db2

This is a discussion on index scan vs. table scan - ibm-db2 ; I have an application that runs a query something like this: select resp_desc, count(*) from userid.survey_fact a, userid.client_service_dim b, userid.samp_group_fact c where b.client_id = 8877 and a.client_service_id = b.client_service_sid and a.survey_id = c.survey_id and mailing_date_id between 20080101 and 20090101 group ...


Home > Database Forum > Other Databases > ibm-db2 > index scan vs. table scan

Reply

 

LinkBack Thread Tools Display Modes
  #1  
Old 09-08-2009, 05:32 PM
usenet
Guest
 
Posts: n/a
Default index scan vs. table scan

I have an application that runs a query something like this:

select resp_desc, count(*) from userid.survey_fact a,
userid.client_service_dim b,
userid.samp_group_fact c
where b.client_id = 8877
and a.client_service_id = b.client_service_sid
and a.survey_id = c.survey_id
and mailing_date_id between 20080101 and 20090101
group by resp_desc;

This query does a full table scan on the samp_group_fact table and I
can't understand why. According to the explain plan (see below), the
optimizer is expecting about 330,000 rows to be fetched from
survey_fact. Since the HSJOIN expects to return only 103,013 rows and
that the samp_group_fact table actually has more than 48 million total
rows, I really can't understand why the optimizer would choose not to
utilize the index.

The stats are all up to date.

Thanks in advance for any help.
Greig

DB2 Universal Database Version 9.1, 5622-044 (c) Copyright IBM Corp.
1991, 2006
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool



******************** EXPLAIN INSTANCE ********************

DB2_VERSION: 09.01.7
SOURCE_NAME: SQLC2F0A
SOURCE_SCHEMA: NULLID
SOURCE_VERSION:
EXPLAIN_TIME: 2009-09-08-17.17.21.646137
EXPLAIN_REQUESTER: USERID

Database Context:
----------------
Parallelism: None
CPU Speed: 2.361721e-07
Comm Speed: 100
Buffer Pool size: 7000
Sort Heap size: 256
Database Heap size: 1200
Lock List size: 100
Maximum Lock List: 10
Average Applications: 1
Locks Available: 640

Package Context:
---------------
SQL Type: Dynamic
Optimization Level: 5
Blocking: Block All Cursors
Isolation Level: Cursor Stability



---------------- STATEMENT 1 SECTION 201 ----------------
QUERYNO: 1
QUERYTAG: CLP
Statement Type: Select
Updatable: No
Deletable: No
Query Degree: 1

Original Statement:
------------------
select resp_desc, count(*)
from userid.survey_fact a, userid.client_service_dim b,
userid.samp_group_fact c
where b.client_id = 8877 and a.client_service_id =
b.client_service_sid and
a.survey_id = c.survey_id and mailing_date_id between 20080101
and
20090101
group by resp_desc


Optimized Statement:
-------------------
SELECT Q5.$C0 AS "RESP_DESC", Q5.$C1
FROM
(SELECT Q4.$C0, COUNT(*)
FROM
(SELECT Q1.RESP_DESC
FROM USERID.SAMP_GROUP_FACT AS Q1, USERID.CLIENT_SERVICE_DIM AS
Q2,
USERID.SURVEY_FACT AS Q3
WHERE (Q3.MAILING_DATE_ID <= 20090101) AND (20080101 <=
Q3.MAILING_DATE_ID) AND (Q3.SURVEY_ID = Q1.SURVEY_ID)
AND
(Q3.CLIENT_SERVICE_ID = Q2.CLIENT_SERVICE_SID) AND
(Q2.CLIENT_ID = 8877)) AS Q4
GROUP BY Q4.$C0) AS Q5

Access Plan:
-----------
Total Cost: 1.08498e+06
Query Degree: 0

Rows
RETURN
( 1)
Cost
I/O
|
58880
GRPBY
( 2)
1.08498e+06
516914
|
58880
TBSCAN
( 3)
1.08497e+06
516914
|
58880
SORT
( 4)
1.08465e+06
516761
|
103013
HSJOIN
( 5)
1.0826e+06
516608
/--------+--------\
4.8195e+07 329934
TBSCAN NLJOIN
( 6) ( 7)
762852 180285
428492 71089.1
| /------+-------\
4.8195e+07 225 1466.37
TABLE: USERID TBSCAN FETCH
SAMP_GROUP_FACT ( 8) ( 14)
Q1 228.537 474.637
31.5519 186.843
| /---+----\
225 4638.77 1.62881e+08
SORT IXSCAN TABLE: USERID
( 9) ( 15) SURVEY_FACT
228.483 71.8377 Q3
31.5519 5.5801
| |
225 1.62881e+08
FETCH INDEX: USERID
( 10) SF_CLI_IDX
228.407 Q3
31.5519
/----+-----\
225 35113
RIDSCN TABLE: USERID
( 11) CLIENT_SERVICE_DIM
13.289 Q2
1.01885
|
225
SORT
( 12)
13.2807
1.01885
|
225
IXSCAN
( 13)
13.2046
1.01885
|
35113
INDEX: USERID
CLI_SRV_DIM_IDX_1
Q2



Extended Diagnostic Information:
--------------------------------

No extended Diagnostic Information for this statment.


Plan Details:
-------------


1) RETURN: (Return Result)
Cumulative Total Cost: 1.08498e+06
Cumulative CPU Cost: 4.30831e+10
Cumulative I/O Cost: 516914
Cumulative Re-Total Cost: 329.285
Cumulative Re-CPU Cost: 2.04703e+08
Cumulative Re-I/O Cost: 153
Cumulative First Row Cost: 1.0847e+06
Estimated Bufferpool Buffers: 175

Arguments:
---------
BLDLEVEL: (Build level)
DB2 v9.1.0.7 : s090308
ENVVAR : (Environment Variable)
DB2_ANTIJOIN = Y
ENVVAR : (Environment Variable)
DB2_EXTENDED_OPTIMIZATION = Y
HEAPUSE : (Maximum Statement Heap Usage)
112 Pages
PREPTIME: (Statement prepare time)
0 milliseconds
STMTHEAP: (Statement heap size)
4096

Input Streams:
-------------
19) From Operator #2

Estimated number of rows: 58880
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q6.$C1+Q6.RESP_DESC


2) GRPBY : (Group By)
Cumulative Total Cost: 1.08498e+06
Cumulative CPU Cost: 4.30569e+10
Cumulative I/O Cost: 516914
Cumulative Re-Total Cost: 323.097
Cumulative Re-CPU Cost: 1.78501e+08
Cumulative Re-I/O Cost: 153
Cumulative First Row Cost: 1.0847e+06
Estimated Bufferpool Buffers: 175

Arguments:
---------
AGGMODE : (Aggregration Mode)
FINAL
GROUPBYC: (Group By columns)
TRUE
GROUPBYN: (Number of Group By columns)
1
GROUPBYR: (Group By requirement)
1: Q4.RESP_DESC
ONEFETCH: (One Fetch flag)
FALSE

Input Streams:
-------------
18) From Operator #3

Estimated number of rows: 58880
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.RESP_DESC(A)


Output Streams:
--------------
19) To Operator #1

Estimated number of rows: 58880
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q6.$C1+Q6.RESP_DESC


3) TBSCAN: (Table Scan)
Cumulative Total Cost: 1.08497e+06
Cumulative CPU Cost: 4.30422e+10
Cumulative I/O Cost: 516914
Cumulative Re-Total Cost: 319.62
Cumulative Re-CPU Cost: 1.63781e+08
Cumulative Re-I/O Cost: 153
Cumulative First Row Cost: 1.0847e+06
Estimated Bufferpool Buffers: 175

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
ALL
PREFETCH: (Type of Prefetch)
SEQUENTIAL
SCANDIR : (Scan Direction)
FORWARD

Input Streams:
-------------
17) From Operator #4

Estimated number of rows: 58880
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.RESP_DESC(A)


Output Streams:
--------------
18) To Operator #2

Estimated number of rows: 58880
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.RESP_DESC(A)


4) SORT : (Sort)
Cumulative Total Cost: 1.08465e+06
Cumulative CPU Cost: 4.28784e+10
Cumulative I/O Cost: 516761
Cumulative Re-Total Cost: 0
Cumulative Re-CPU Cost: 0
Cumulative Re-I/O Cost: 153
Cumulative First Row Cost: 1.08465e+06
Estimated Bufferpool Buffers: 437302

Arguments:
---------
AGGMODE : (Aggregration Mode)
PARTIAL
DUPLWARN: (Duplicates Warning flag)
FALSE
NUMROWS : (Estimated number of rows)
58880
ROWWIDTH: (Estimated width of rows)
20
SORTKEY : (Sort Key column)
1: Q4.RESP_DESC(A)
SPILLED : (Pages spilled to bufferpool or disk)
175
TEMPSIZE: (Temporary Table Page Size)
8192
UNIQUE : (Uniqueness required flag)
FALSE

Input Streams:
-------------
16) From Operator #5

Estimated number of rows: 103013
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.RESP_DESC


Output Streams:
--------------
17) To Operator #3

Estimated number of rows: 58880
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.RESP_DESC(A)


5) HSJOIN: (Hash Join)
Cumulative Total Cost: 1.0826e+06
Cumulative CPU Cost: 4.25133e+10
Cumulative I/O Cost: 516608
Cumulative Re-Total Cost: 1.0826e+06
Cumulative Re-CPU Cost: 4.25133e+10
Cumulative Re-I/O Cost: 516608
Cumulative First Row Cost: 1.0826e+06
Estimated Bufferpool Buffers: 437127

Arguments:
---------
BITFLTR : (Hash Join Bit Filter used)
325900
EARLYOUT: (Early Out flag)
NONE
HASHCODE: (Hash Code Size)
24 BIT
HASHTBSZ: (Number of hash table entries)
329934
TEMPSIZE: (Temporary Table Page Size)
8192

Predicates:
----------
5) Predicate used in Join
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 6.4783e-09

Predicate Text:
--------------
(Q3.SURVEY_ID = Q1.SURVEY_ID)


Input Streams:
-------------
2) From Operator #6

Estimated number of rows: 4.8195e
+07
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.RESP_DESC+Q1.SURVEY_ID

15) From Operator #7

Estimated number of rows: 329934
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID+Q3.SURVEY_ID


Output Streams:
--------------
16) To Operator #4

Estimated number of rows: 103013
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q4.RESP_DESC


6) TBSCAN: (Table Scan)
Cumulative Total Cost: 762852
Cumulative CPU Cost: 3.00314e+10
Cumulative I/O Cost: 428492
Cumulative Re-Total Cost: 762852
Cumulative Re-CPU Cost: 3.00314e+10
Cumulative Re-I/O Cost: 428492
Cumulative First Row Cost: 12.8528
Estimated Bufferpool Buffers: 428492

Arguments:
---------
JN INPUT: (Join input leg)
OUTER
MAXPAGES: (Maximum pages for prefetch)
ALL
PREFETCH: (Type of Prefetch)
SEQUENTIAL
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Input Streams:
-------------
1) From Object USERID.SAMP_GROUP_FACT

Estimated number of rows: 4.8195e
+07
Number of columns: 3
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.$RID$+Q1.RESP_DESC+Q1.SURVEY_ID


Output Streams:
--------------
2) To Operator #5

Estimated number of rows: 4.8195e
+07
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q1.RESP_DESC+Q1.SURVEY_ID


7) NLJOIN: (Nested Loop Join)
Cumulative Total Cost: 180285
Cumulative CPU Cost: 2.72554e+09
Cumulative I/O Cost: 71089.1
Cumulative Re-Total Cost: 72420
Cumulative Re-CPU Cost: 2.60062e+09
Cumulative Re-I/O Cost: 34852.2
Cumulative First Row Cost: 267.254
Estimated Bufferpool Buffers: 42039.8

Arguments:
---------
EARLYOUT: (Early Out flag)
NONE
FETCHMAX: (Override for FETCH MAXPAGES)
IGNORE
ISCANMAX: (Override for ISCAN MAXPAGES)
IGNORE
JN INPUT: (Join input leg)
INNER

Predicates:
----------
6) Predicate used in Join
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 2.84795e-05

Predicate Text:
--------------
(Q3.CLIENT_SERVICE_ID = Q2.CLIENT_SERVICE_SID)


Input Streams:
-------------
10) From Operator #8

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID(A)

14) From Operator #14

Estimated number of rows:
1466.37
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.CLIENT_SERVICE_ID(A)+Q3.SURVEY_ID


Output Streams:
--------------
15) To Operator #5

Estimated number of rows: 329934
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID+Q3.SURVEY_ID


8) TBSCAN: (Table Scan)
Cumulative Total Cost: 228.537
Cumulative CPU Cost: 2.01104e+06
Cumulative I/O Cost: 31.5519
Cumulative Re-Total Cost: 0.165827
Cumulative Re-CPU Cost: 702144
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 228.484
Estimated Bufferpool Buffers: 0

Arguments:
---------
JN INPUT: (Join input leg)
OUTER
MAXPAGES: (Maximum pages for prefetch)
ALL
PREFETCH: (Type of Prefetch)
NONE
SCANDIR : (Scan Direction)
FORWARD

Input Streams:
-------------
9) From Operator #9

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID(A)


Output Streams:
--------------
10) To Operator #7

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID(A)


9) SORT : (Sort)
Cumulative Total Cost: 228.483
Cumulative CPU Cost: 1.7828e+06
Cumulative I/O Cost: 31.5519
Cumulative Re-Total Cost: 0.111924
Cumulative Re-CPU Cost: 473907
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 228.483
Estimated Bufferpool Buffers: 32.5519

Arguments:
---------
DUPLWARN: (Duplicates Warning flag)
FALSE
NUMROWS : (Estimated number of rows)
225
ROWWIDTH: (Estimated width of rows)
8
SORTKEY : (Sort Key column)
1: Q2.CLIENT_SERVICE_SID(A)
TEMPSIZE: (Temporary Table Page Size)
8192
UNIQUE : (Uniqueness required flag)
FALSE

Input Streams:
-------------
8) From Operator #10

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID


Output Streams:
--------------
9) To Operator #8

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID(A)


10) FETCH : (Fetch)
Cumulative Total Cost: 228.407
Cumulative CPU Cost: 1.46062e+06
Cumulative I/O Cost: 31.5519
Cumulative Re-Total Cost: 0.111924
Cumulative Re-CPU Cost: 473907
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 26.1332
Estimated Bufferpool Buffers: 32.5519

Arguments:
---------
MAX RIDS: (Maximum RIDs per list prefetch request)
512
PREFETCH: (Type of Prefetch)
LIST
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Predicates:
----------
7) Sargable Predicate
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.00640788

Predicate Text:
--------------
(Q2.CLIENT_ID = 8877)


Input Streams:
-------------
6) From Operator #11

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$RID$(A)

7) From Object USERID.CLIENT_SERVICE_DIM

Estimated number of rows: 35113
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID+Q2.CLIENT_ID


Output Streams:
--------------
8) To Operator #9

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.CLIENT_SERVICE_SID


11) RIDSCN: (Row Identifier Scan)
Cumulative Total Cost: 13.289
Cumulative CPU Cost: 832971
Cumulative I/O Cost: 1.01885
Cumulative Re-Total Cost: 0.016967
Cumulative Re-CPU Cost: 71841.7
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 13.2807
Estimated Bufferpool Buffers: 2.01885

Arguments:
---------
NUMROWS : (Estimated number of rows)
225

Input Streams:
-------------
5) From Operator #12

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$RID$(A)


Output Streams:
--------------
6) To Operator #10

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$RID$(A)


12) SORT : (Sort)
Cumulative Total Cost: 13.2807
Cumulative CPU Cost: 797784
Cumulative I/O Cost: 1.01885
Cumulative Re-Total Cost: 0.00865681
Cumulative Re-CPU Cost: 36654.7
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 13.2807
Estimated Bufferpool Buffers: 2.01885

Arguments:
---------
DUPLWARN: (Duplicates Warning flag)
TRUE
NUMROWS : (Estimated number of rows)
225
ROWWIDTH: (Estimated width of rows)
16
SORTKEY : (Sort Key column)
1: Q2.$RID$(A)
TEMPSIZE: (Temporary Table Page Size)
8192
UNIQUE : (Uniqueness required flag)
TRUE

Input Streams:
-------------
4) From Operator #13

Estimated number of rows: 225
Number of columns: 0
Subquery predicate ID: Not
Applicable


Output Streams:
--------------
5) To Operator #11

Estimated number of rows: 225
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$RID$(A)


13) IXSCAN: (Index Scan)
Cumulative Total Cost: 13.2046
Cumulative CPU Cost: 475600
Cumulative I/O Cost: 1.01885
Cumulative Re-Total Cost: 0.00865681
Cumulative Re-CPU Cost: 36654.7
Cumulative Re-I/O Cost: 0
Cumulative First Row Cost: 12.8612
Estimated Bufferpool Buffers: 2.01885

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
1
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NONE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT NONE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Predicates:
----------
7) Start Key Predicate
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.00640788

Predicate Text:
--------------
(Q2.CLIENT_ID = 8877)

7) Stop Key Predicate
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 0.00640788

Predicate Text:
--------------
(Q2.CLIENT_ID = 8877)


Input Streams:
-------------
3) From Object USERID.CLI_SRV_DIM_IDX_1

Estimated number of rows: 35113
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q2.$RID$+Q2.CLIENT_ID


Output Streams:
--------------
4) To Operator #12

Estimated number of rows: 225
Number of columns: 0
Subquery predicate ID: Not
Applicable


14) FETCH : (Fetch)
Cumulative Total Cost: 474.637
Cumulative CPU Cost: 1.21766e+07
Cumulative I/O Cost: 186.843
Cumulative Re-Total Cost: 385.646
Cumulative Re-CPU Cost: 1.16604e+07
Cumulative Re-I/O Cost: 185.843
Cumulative First Row Cost: 38.7704
Estimated Bufferpool Buffers: 1.53176e+06

Arguments:
---------
JN INPUT: (Join input leg)
INNER
MAXPAGES: (Maximum pages for prefetch)
181
PREFETCH: (Type of Prefetch)
SEQUENTIAL
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Predicates:
----------
3) Sargable Predicate
Comparison Operator: Less Than or
Equal (<=)
Subquery Input Required: No
Filter Factor: 0.947733

Predicate Text:
--------------
(Q3.MAILING_DATE_ID <= 20090101)

4) Sargable Predicate
Comparison Operator: Less Than or
Equal (<=)
Subquery Input Required: No
Filter Factor: 0.36838

Predicate Text:
--------------
(20080101 <= Q3.MAILING_DATE_ID)


Input Streams:
-------------
12) From Operator #15

Estimated number of rows:
4638.77
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.CLIENT_SERVICE_ID(A)

13) From Object USERID.SURVEY_FACT

Estimated number of rows:
1.62881e+08
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.MAILING_DATE_ID+Q3.SURVEY_ID


Output Streams:
--------------
14) To Operator #7

Estimated number of rows:
1466.37
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.CLIENT_SERVICE_ID(A)+Q3.SURVEY_ID


15) IXSCAN: (Index Scan)
Cumulative Total Cost: 71.8377
Cumulative CPU Cost: 564929
Cumulative I/O Cost: 5.5801
Cumulative Re-Total Cost: 58.8753
Cumulative Re-CPU Cost: 88952.1
Cumulative Re-I/O Cost: 4.5801
Cumulative First Row Cost: 25.7144
Estimated Bufferpool Buffers: 160823

Arguments:
---------
MAXPAGES: (Maximum pages for prefetch)
4
PREFETCH: (Type of Prefetch)
NONE
ROWLOCK : (Row Lock intent)
NEXT KEY SHARE
SCANDIR : (Scan Direction)
FORWARD
TABLOCK : (Table Lock intent)
INTENT SHARE
TBISOLVL: (Table access Isolation Level)
CURSOR STABILITY

Predicates:
----------
6) Start Key Predicate
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 2.84795e-05

Predicate Text:
--------------
(Q3.CLIENT_SERVICE_ID = Q2.CLIENT_SERVICE_SID)

6) Stop Key Predicate
Comparison Operator: Equal (=)
Subquery Input Required: No
Filter Factor: 2.84795e-05

Predicate Text:
--------------
(Q3.CLIENT_SERVICE_ID = Q2.CLIENT_SERVICE_SID)


Input Streams:
-------------
11) From Object USERID.SF_CLI_IDX

Estimated number of rows:
1.62881e+08
Number of columns: 2
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.CLIENT_SERVICE_ID(A)+Q3.$RID$


Output Streams:
--------------
12) To Operator #14

Estimated number of rows:
4638.77
Number of columns: 1
Subquery predicate ID: Not
Applicable

Column Names:
------------
+Q3.CLIENT_SERVICE_ID(A)


Objects Used in Access Plan:
---------------------------

Schema: USERID
Name: CLI_SRV_DIM_IDX_1
Type: Index
Time of creation:
2007-06-17-11.03.01.382438
Last statistics update:
2009-09-08-02.59.29.504485
Number of columns: 3
Number of rows: 35113
Width of rows: -1
Number of buffer pool pages: 197
Distinct row values: Yes
Tablespace name: DIMIT_TBSP1
Tablespace overhead: 12.670000
Tablespace transfer rate: 0.180000
Source for statistics: Single Node
Prefetch page count: 24
Container extent page count: 16
Index clustering statistic: 0.900742
Index leaf pages: 159
Index tree levels: 2
Index full key cardinality: 35113
Index first key cardinality: 7275
Index first 2 keys cardinality: 34626
Index first 3 keys cardinality: 35113
Index first 4 keys cardinality: -1
Index sequential pages: 59
Index page density: 47
Index avg sequential pages: 19
Index avg gap between sequences:31
Index avg random pages: 16
Fetch avg sequential pages: 31
Fetch avg gap between sequences:307
Fetch avg random pages: 1107
Index RID count: 35114
Index deleted RID count: 1
Index empty leaf pages: 0
Base Table Schema: USERID
Base Table Name:
CLIENT_SERVICE_DIM
Columns in index:
CLIENT_ID(A)
SERV_TYPE(A)
SITE_ID(A)

Schema: USERID
Name: SF_CLI_IDX
Type: Index
Time of creation:
2007-06-17-11.03.01.953638
Last statistics update:
2009-03-03-23.12.25.155122
Number of columns: 1
Number of rows: 162881175
Width of rows: -1
Number of buffer pool pages: 1372081
Distinct row values: No
Tablespace name: FACTIT_TBSP1
Tablespace overhead: 12.670000
Tablespace transfer rate: 0.180000
Source for statistics: Single Node
Prefetch page count: 24
Container extent page count: 16
Index clustering statistic: 0.914914
Index leaf pages: 160821
Index tree levels: 3
Index full key cardinality: 25547
Index first key cardinality: 25547
Index first 2 keys cardinality: -1
Index first 3 keys cardinality: -1
Index first 4 keys cardinality: -1
Index sequential pages: 556
Index page density: 19
Index avg sequential pages: 9
Index avg gap between sequences:2763
Index avg random pages: 3
Fetch avg sequential pages: 44
Fetch avg gap between sequences:112
Fetch avg random pages: 206
Index RID count: 162919863
Index deleted RID count: 38688
Index empty leaf pages: 0
Base Table Schema: USERID
Base Table Name: SURVEY_FACT
Columns in index:
CLIENT_SERVICE_ID(A)

Schema: USERID
Name: CLIENT_SERVICE_DIM
Type: Table
Time of creation:
2008-01-27-01.31.48.312712
Last statistics update:
2009-09-08-02.59.29.504485
Number of columns: 7
Number of rows: 35113
Width of rows: 24
Number of buffer pool pages: 197
Number of data partitions: 1
Distinct row values: No
Tablespace name: DIMIT_TBSP1
Tablespace overhead: 12.670000
Tablespace transfer rate: 0.180000
Source for statistics: Single Node
Prefetch page count: 24
Container extent page count: 16
Table overflow record count: 2268
Table Active Blocks: -1
Average Row Compression Ratio: 0
Percentage Rows Compressed: 0
Average Compressed Row Size: 0

Schema: USERID
Name: SAMP_GROUP_FACT
Type: Table
Time of creation:
2009-08-12-17.44.57.693881
Last statistics update:
2009-09-08-09.35.29.208003
Number of columns: 5
Number of rows: 48195029
Width of rows: 32
Number of buffer pool pages: 428492
Number of data partitions: 1
Distinct row values: No
Tablespace name: USERSPACE1
Tablespace overhead: 12.670000
Tablespace transfer rate: 0.180000
Source for statistics: Single Node
Prefetch page count: 192
Container extent page count: 32
Table overflow record count: 0
Table Active Blocks: -1
Average Row Compression Ratio: 0
Percentage Rows Compressed: 0
Average Compressed Row Size: 0

Schema: USERID
Name: SURVEY_FACT
Type: Table
Time of creation:
2008-01-27-01.31.48.312712
Last statistics update:
2009-03-03-23.12.25.155122
Number of columns: 15
Number of rows: 162881175
Width of rows: 29
Number of buffer pool pages: 1372081
Number of data partitions: 1
Distinct row values: No
Tablespace name: FACTIT_TBSP1
Tablespace overhead: 12.670000
Tablespace transfer rate: 0.180000
Source for statistics: Single Node
Prefetch page count: 24
Container extent page count: 16
Table overflow record count: 0
Table Active Blocks: -1
Average Row Compression Ratio: 0
Percentage Rows Compressed: 0
Average Compressed Row Size: 0
Reply With Quote
  #2  
Old 09-08-2009, 11:46 PM
usenet
Guest
 
Posts: n/a
Default Re: index scan vs. table scan


"Greig" wrote in message
news:aa047342-5e6a-4987-98f6-af8ceead0787@z30g2000yqz.googlegroups.com...
>I have an application that runs a query something like this:
>
> select resp_desc, count(*) from userid.survey_fact a,
> userid.client_service_dim b,
> userid.samp_group_fact c
> where b.client_id = 8877
> and a.client_service_id = b.client_service_sid
> and a.survey_id = c.survey_id
> and mailing_date_id between 20080101 and 20090101
> group by resp_desc;
>
> This query does a full table scan on the samp_group_fact table and I
> can't understand why. According to the explain plan (see below), the
> optimizer is expecting about 330,000 rows to be fetched from
> survey_fact. Since the HSJOIN expects to return only 103,013 rows and
> that the samp_group_fact table actually has more than 48 million total
> rows, I really can't understand why the optimizer would choose not to
> utilize the index.
>
> The stats are all up to date.
>
> Thanks in advance for any help.
> Greig


I have not looked at your query or explain in detail, but DB2 will generally
do a table scan if it guesses (using statistics) that it will have to read
at least one row on all the pages (or even a large percentage of the pages).
The reason for this is that it is usually faster to do the table scan in
such cases, even if some pages are needed (mostly because of page prefetch).

It might be in your particular case that a better access plan can be chosen,
I don't know. You could try to manually update some of the statistics (or
alter the tables to volatile) to force it to use an index and then see if it
really runs faster that way. Don't just rely on the relative cost of the
explain, run some tests with different stats (and access plans) to see what
the actual elapsed time results are.


Reply With Quote
  #3  
Old 09-09-2009, 01:47 PM
usenet
Guest
 
Posts: n/a
Default Re: index scan vs. table scan

On Sep 8, 11:46*pm, "Mark A" wrote:
> "Greig" wrote in message
>
> news:aa047342-5e6a-4987-98f6-af8ceead0787@z30g2000yqz.googlegroups.com...
>
>
>
>
>
> >I have an application that runs a query something like this:

>
> > select resp_desc, count(*) from userid.survey_fact a,
> > userid.client_service_dim b,
> > userid.samp_group_fact c
> > where b.client_id = 8877
> > and a.client_service_id = b.client_service_sid
> > and a.survey_id = c.survey_id
> > and mailing_date_id between 20080101 and 20090101
> > group by resp_desc;

>
> > This query does a full table scan on the samp_group_fact table and I
> > can't understand why. *According to the explain plan (see below), the
> > optimizer is expecting about 330,000 rows to be fetched from
> > survey_fact. Since the HSJOIN expects to return only 103,013 rows and
> > that the samp_group_fact table actually has more than 48 million total
> > rows, I really can't understand why the optimizer would choose not to
> > utilize the index.

>
> > The stats are all up to date.

>
> > Thanks in advance for any help.
> > Greig

>
> I have not looked at your query or explain in detail, but DB2 will generally
> do a table scan if it guesses (using statistics) that it will have to read
> at least one row on all the pages (or even a large percentage of the pages).
> The reason for this is that it is usually faster to do the table scan in
> such cases, even if some pages are needed (mostly because of page prefetch).
>
> It might be in your particular case that a better access plan can be chosen,
> I don't know. You could try to manually update some of the statistics (or
> alter the tables to volatile) to force it to use an index and then see ifit
> really runs faster that way. Don't just rely on the relative cost of the
> explain, run some tests with different stats (and access plans) to see what
> the actual elapsed time results are.- Hide quoted text -
>
> - Show quoted text -


Run a DB2 Advisor for that Query for Index Recomendations. Looks like
the Index Column order is not liked by DB2 Optimizer.
If you change that .....it will pick it up.

Cheers...
Shashi Mannepalli
Reply With Quote
  #4  
Old 09-09-2009, 01:50 PM
usenet
Guest
 
Posts: n/a
Default Re: index scan vs. table scan

WGreig wrote:
> I have an application that runs a query something like this:
>
> select resp_desc, count(*) from userid.survey_fact a,
> userid.client_service_dim b,
> userid.samp_group_fact c
> where b.client_id = 8877
> and a.client_service_id = b.client_service_sid
> and a.survey_id = c.survey_id
> and mailing_date_id between 20080101 and 20090101
> group by resp_desc;
>
> This query does a full table scan on the samp_group_fact table and I
> can't understand why. According to the explain plan (see below), the
> optimizer is expecting about 330,000 rows to be fetched from
> survey_fact. Since the HSJOIN expects to return only 103,013 rows and
> that the samp_group_fact table actually has more than 48 million total
> rows, I really can't understand why the optimizer would choose not to
> utilize the index.
>
> The stats are all up to date.


Without DDL (and some stats) for the samp_group_fact table and
associated objects, no one will be able to help answer the
question better than Mark A already did.
Reply With Quote
  #5  
Old 09-09-2009, 04:56 PM
usenet
Guest
 
Posts: n/a
Default Re: index scan vs. table scan

On Sep 9, 1:50*pm, Ian wrote:
> WGreig wrote:
> > I have an application that runs a query something like this:

>
> > select resp_desc, count(*) from userid.survey_fact a,
> > userid.client_service_dim b,
> > userid.samp_group_fact c
> > where b.client_id = 8877
> > and a.client_service_id = b.client_service_sid
> > and a.survey_id = c.survey_id
> > and mailing_date_id between 20080101 and 20090101
> > group by resp_desc;

>
> > This query does a full table scan on the samp_group_fact table and I
> > can't understand why. *According to the explain plan (see below), the
> > optimizer is expecting about 330,000 rows to be fetched from
> > survey_fact. Since the HSJOIN expects to return only 103,013 rows and
> > that the samp_group_fact table actually has more than 48 million total
> > rows, I really can't understand why the optimizer would choose not to
> > utilize the index.

>
> > The stats are all up to date.

>
> Without DDL (and some stats) for the samp_group_fact table and
> associated objects, no one will be able to help answer the
> question better than Mark A already did.- Hide quoted text -
>
> - Show quoted text -


Well, I can say that if I parameterize the query like this:

select resp_desc, count(*) from userid.survey_fact a,
userid.client_service_dim b,
userid.samp_group_fact c
where b.client_id = ?
and a.client_service_id = b.client_service_sid
and a.survey_id = c.survey_id
and mailing_date_id between ? and ?
group by resp_desc;

Then the optimizer does utilize the index and the query runs in
seconds rather than minutes (when I plug in matching parameters with
the first query). As for the DDL for SAMP_GROUP_FACT, it's pretty
simple. It's got a survey_id, resp_id and resp_desc. There is 1
index on survey_id. For every survey_id in survey_fact, there is at
most 1 matching record in samp_group_fact. What else would you like
to see in the way of stats?

Thanks again for the help.
Greig




Reply With Quote
Reply

Thread Tools
Display Modes



All times are GMT -4. The time now is 12:38 AM.