Simple one-table query always scans table - why? - Ingres Database
This is a discussion on Simple one-table query always scans table - why? - Ingres Database ; We have a one-table-query than _insists_ on a full table scan, regardless of any indexes, stats, optimizer options, etc. that we try. This is the query with sample where clause: select count(xref_reason) as num_links from case_xref where court='CREW' and court_no='72' ...
![]() |
| | LinkBack | Thread Tools | Display Modes |
|
#1
| |||
| |||
| regardless of any indexes, stats, optimizer options, etc. that we try. This is the query with sample where clause: select count(xref_reason) as num_links from case_xref where court='CREW' and court_no='72' and yr=2008 and insol_type='B' OR ( xref_court='CREW' and xref_court_no='72' and xref_yr=2008 and xref_insol_type='B' ) We have tried adding indexes of various structures, with w/o the xref_reason column, -zcpk compound key stats etc. When run individually, each 'side' of the OR uses its appropriate index, however when bolted together, the statement invariably scans the table. You will see that the table is quite small at ~8Mb, however this is a 1800 concurrent user application and the query is run a gazillion times per day. Unfortunately due to the lifespan of the application we can't easily make code changes. We are running Ingres II 2.6/0305 (rs4.us5/00), on AIX 5.2 Help! TIA Steve Table: 1> help table case_xref Name: case_xref Owner: ingres Created: 06/10/08 07:56:27 Location: custmerge Type: user table Version: II2.6 Page size: 4096 Cache priority: 0 Alter table version: 0 Alter table totwidth: 102 Row width: 102 Number of rows: 42072 Storage structure: hash Compression: none Duplicate Rows: allowed Number of pages: 2074 Overflow data pages: 24 Journaling: enabled after the next checkpoint Base table for view: no Permissions: yes, including ALL to ALL and SELECT to ALL Integrities: none Optimizer statistics: yes; see avg count below, more info in the iistats catalog Column Information: Key Avg Count Column Name Type Length Nulls Defaults Seq Per Value court varchar 7 no no 1 282.4 court_no varchar 7 no no 2 5.1 yr integer 4 no no 3 1618.2 insol_type char 1 no no 4 8414.4 xref_court varchar 7 yes null 278.6 xref_court_no varchar 7 yes null 4.5 xref_yr integer 4 yes null 689.7 xref_insol_type char 1 yes null 6010.3 xref_reason char 30 yes null from_subjrefno integer 4 no no to_subjrefno integer 4 no no from_caseid integer 4 no no to_caseid integer 4 no no link_code char 1 no no case_id integer 4 no yes Secondary indexes: Index Name Structure Keyed On idx_casexref_court2 hash xref_court, xref_court_no, xref_yr, xref_insol_type End of Request |
|
#2
| |||
| |||
|
This is just speculation but perhaps the query optimizer is making the decision that it would take longer to figure out the cost of the query than it would to just do a full table scan. Given that the table is so small, it may be faster just to keep the table in memory and do a full scan. I would try creating a much larger table with the same data/structure and see if the optimizer changes its QEP. --cnemelka On Thu, Nov 13, 2008 at 5:51 AM, Steve McElhinney > We have a one-table-query than _insists_ on a full table scan, > regardless of any indexes, stats, optimizer options, etc. that we try. > > This is the query with sample where clause: > > select count(xref_reason) as num_links from case_xref > where court='CREW' > and court_no='72' > and yr=2008 > and insol_type='B' > OR ( xref_court='CREW' > and xref_court_no='72' > and xref_yr=2008 > and xref_insol_type='B' ) > > > We have tried adding indexes of various structures, > with w/o the xref_reason column, -zcpk compound key stats etc. > > When run individually, each 'side' of the OR uses its appropriate > index, > however when bolted together, the statement invariably scans the > table. > > You will see that the table is quite small at ~8Mb, > however this is a 1800 concurrent user application > and the query is run a gazillion times per day. > > Unfortunately due to the lifespan of the application > we can't easily make code changes. > > We are running Ingres II 2.6/0305 (rs4.us5/00), on AIX 5.2 > > Help! > TIA Steve > > > Table: > > 1> help table case_xref > > > Name: case_xref > Owner: ingres > Created: 06/10/08 07:56:27 > Location: custmerge > Type: user table > Version: II2.6 > Page size: 4096 > Cache priority: 0 > Alter table version: 0 > Alter table totwidth: 102 > Row width: 102 > Number of rows: 42072 > Storage structure: hash > Compression: none > Duplicate Rows: allowed > Number of pages: 2074 > Overflow data pages: 24 > Journaling: enabled after the next checkpoint > Base table for view: no > Permissions: yes, including ALL to ALL and SELECT to ALL > Integrities: none > Optimizer statistics: yes; see avg count below, more info in the > iistats catalog > > Column Information: > Key > Avg Count > Column Name Type Length Nulls Defaults Seq > Per Value > court varchar 7 no no > 1 282.4 > court_no varchar 7 no no > 2 5.1 > yr integer 4 no no > 3 1618.2 > insol_type char 1 no no > 4 8414.4 > xref_court varchar 7 yes > null 278.6 > xref_court_no varchar 7 yes > null 4.5 > xref_yr integer 4 yes > null 689.7 > xref_insol_type char 1 yes > null 6010.3 > xref_reason char 30 yes null > from_subjrefno integer 4 no no > to_subjrefno integer 4 no no > from_caseid integer 4 no no > to_caseid integer 4 no no > link_code char 1 no no > case_id integer 4 no yes > > Secondary indexes: > Index Name Structure Keyed On > idx_casexref_court2 hash xref_court, xref_court_no, xref_yr, > xref_insol_type > > > End of Request > _______________________________________________ > Info-Ingres mailing list > Info-Ingres@kettleriverconsulting.com > http://www.kettleriverconsulting.com...fo/info-ingres > |
|
#3
| |||
| |||
| On Nov 13, 2008, at 7:51 AM, Steve McElhinney wrote: > We have a one-table-query than _insists_ on a full table scan, > regardless of any indexes, stats, optimizer options, etc. that we try. > > This is the query with sample where clause: > > select count(xref_reason) as num_links from case_xref > where court='CREW' > and court_no='72' > and yr=2008 > and insol_type='B' > OR ( xref_court='CREW' > and xref_court_no='72' > and xref_yr=2008 > and xref_insol_type='B' ) > This sort of thing has always been a problem for the optimizer, in my experience. There are a couple reasons that I can think of. First and most importantly, the query plan data structures are not well suited to the sort of plan you want. Forget about the primary index for a moment and assume that you have two secondaries. The plan you need is then something roughly like: select ... from base where tid in (select tid from index1 where ... union select tid from index2 where ....) but thanks in part to the way unions are structured internally, it's very difficult (if not impossible) for the optimizer to propose that plan to itself; the rewrite phase would have to invent the union subqueries out of thin air. Not that it's impossible, and it might even be able to do so at times (I really have no idea), but it wouldn't be simple or general. With a primary index lookup in the mix it's even harder to see exactly how one would execute this. I could imagine a rewrite into some sort of goofy tid full join, or perhaps a slightly different union form; the latter posing all the difficulties already mentioned. The second problem is that the optimizer insists on rewriting the query into Conjunctive Normal Form, which is a fancy-pants name for putting everything into the form of AND's of OR's. In other words, your nice looking query gets rewritten to a horrible mess that tends to lose the nice key-lookup-ness of your original query. Then, to produce actual key lookups, it has to rewrite the CNF back into DNF, OR's of AND's! I've often thought it should carry both, but whatever. I don't know if the complex CNF form of your query is a problem for the optimizer or not, but it might be. It is certainly going to make it more difficult to recognize that there are two key-lookup conditions OR'ed together. (See below) What you can do is try to rewrite the query as a union yourself, and see if things go any better. Karl PS I don't want to even try to write out the CNF for your query, but the CNF of this simpler one: select .. from .. where court='CREW' and court_no = 72 OR (xref_court = 'CREW' and xref_court_no=72) looks like: .... where (court='CREW' or xref_court='CREW') and (court='CREW' or xref_court_no = 72) and (court_no=72 or xref_court='CREW') and (court_no=72 or xref_court_no=72) and if you stare at the two for a long time, you'll see that they are the same thing, but the second certainly doesn't look like a pair of key lookups! (I hope I got that right!) |
![]() |
« Previous Thread
|
Next Thread »
| Thread Tools | |
| Display Modes | |
| |
All times are GMT -4. The time now is 02:25 AM.




Linear Mode