
OPTIMAL MODE:
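The optimal hash join is comparatively simple. In this example (optimal.doc) only 2 partitions are needed, which is also the minimum number of partitions a hash join allocates. As the work area grows the number of partitions grows too, and the partition count always follows a power of two. After the build table has been scanned, the hash table is in place: every row has been hashed into its hash bucket, and the corresponding bit in the bitmap vector has been set. The probe table scan then starts. Each join key is run through the same hash function and the matching bit in the bitmap vector is checked. If the bit is not set, the row cannot satisfy the join and is discarded immediately. If the bit is set, the value may match one stored in the hash bucket. Because hashing produces collisions, one bucket can hold several different values, so the probe value is compared with the values stored in the bucket and the row is returned to the client only when they are equal. Once the probe table scan finishes, all results have been returned to the client.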
optimal.doc (click to download the trace file)
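The build and probe phases described for the optimal mode can be sketched in a few lines of Python. This is only an illustration of the idea, not Oracle's implementation: the function name `optimal_hash_join`, the bucket count and the use of Python's built-in `hash` are all assumptions made for the example.

```python
from collections import defaultdict

def optimal_hash_join(build_rows, probe_rows, n_buckets=8):
    """In-memory hash join with a bitmap pre-filter (illustrative only)."""
    buckets = defaultdict(list)    # hash bucket number -> build rows
    bitmap = [False] * n_buckets   # one bit per hash bucket

    # Build phase: hash every build row into its bucket and set the bit.
    for key, payload in build_rows:
        b = hash(key) % n_buckets
        buckets[b].append((key, payload))
        bitmap[b] = True

    # Probe phase: same hash function, bitmap check, then exact comparison.
    result = []
    for key, payload in probe_rows:
        b = hash(key) % n_buckets
        if not bitmap[b]:
            continue               # bit not set: the row cannot match
        for build_key, build_payload in buckets[b]:
            if build_key == key:   # collisions: compare the real values
                result.append((key, build_payload, payload))
    return result

build = [(1, "a"), (2, "b"), (9, "c")]
probe = [(1, "x"), (9, "y"), (3, "z"), (17, "w")]
print(optimal_hash_join(build, probe))
```

Key 3 is rejected by the bitmap alone, while key 17 lands in an occupied bucket and is only rejected by the exact comparison, which is the collision case the paragraph describes.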
ONEPASS MODE:
After hash_area_size was changed to 2m, the hash join ran in onepass mode. The trace file (onepass.doc) shows that the number of partitions became 4:
### Partition Distribution ###
Partition:0 rows:49933 clusters:5 slots:1 kept=0
Partition:1 rows:49674 clusters:5 slots:1 kept=0
Partition:2 rows:49943 clusters:5 slots:4 kept=0
Partition:3 rows:50451 clusters:5 slots:5 kept=1
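Partition 3 fits in the work area; the other partitions inevitably have to be written out to disk. The probe table scan then starts. Rows are filtered through the bitmap vector during the scan and non-matching rows are discarded. The bitmap vector also records whether each hash bucket is in memory or on disk, so for each surviving row Oracle looks up where its hash bucket lives. If the bucket is in memory, the exact values are compared and the row is either discarded or returned to the client. (This step is not visible in the trace file, which looks as if every partition that does not fit in memory is dumped and the partition pairs are then read back in for the hash join.) If the bucket is on disk, the probe row is written to disk as well, forming a partition pair. Once the data for partition 3 has been joined, all of partition 2 is read into memory together with the corresponding probe table partition and hash joined, followed by partition 1 and then partition 0. Each probe table partition is read only once, which is why this is called a onepass hash join.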
onepass.doc (click to download the trace file)
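The onepass flow can be sketched the same way. In this rough Python model, plain lists stand in for the partitions written to disk, partition 3 is the one kept in memory, and `probe_reads` counts how many spilled probe partitions are read back. The function name and all parameters are hypothetical, and the model follows the description in the text rather than anything confirmed by the trace.

```python
def onepass_hash_join(build_rows, probe_rows, n_partitions=4, kept=3):
    """Partitioned hash join with one in-memory partition (illustrative)."""
    def part(key):
        return hash(key) % n_partitions

    def join(build, probe):
        table = {}
        for key, payload in build:
            table.setdefault(key, []).append(payload)
        return [(k, b, p) for k, p in probe for b in table.get(k, [])]

    # Build scan: split the build table into partitions.
    build_parts = [[] for _ in range(n_partitions)]
    for key, payload in build_rows:
        build_parts[part(key)].append((key, payload))

    # Probe scan: rows of the kept partition are joined right away,
    # the others are "spilled" so that they pair up with a build partition.
    probe_parts = [[] for _ in range(n_partitions)]
    in_memory_probe = []
    for key, payload in probe_rows:
        p = part(key)
        target = in_memory_probe if p == kept else probe_parts[p]
        target.append((key, payload))
    result = join(build_parts[kept], in_memory_probe)

    # Each spilled partition pair is read back exactly once: 2, 1, 0.
    probe_reads = 0
    for p in reversed(range(n_partitions)):
        if p == kept:
            continue
        result += join(build_parts[p], probe_parts[p])
        probe_reads += 1
    return result, probe_reads

rows, reads = onepass_hash_join([(i, f"b{i}") for i in range(8)],
                                [(i, f"p{i}") for i in range(4, 12)])
print(sorted(rows), reads)
```

With 4 partitions and one kept in memory, three probe partitions are read back, each exactly once, which is the property that gives the mode its name.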
MULTIPASS MODE:
This is the multipass hash join. The trace file is unusually long. Only 2 partitions were allocated at first, and neither of them fits completely in the work area:
### Partition Distribution ###
Partition:0 rows:99876 clusters:10 slots:1 kept=0
Partition:1 rows:100125 clusters:10 slots:4 kept=0
It starts with the build table scan. The work area holds 4 slots of partition 1 and 1 slot of partition 0, plus one more slot reserved for i/o. Whenever a slot (possibly one block) fills up it is written out to disk. When the build table scan is complete, the contents of both partitions are on disk and the bitmap vector has been built. The probe table scan then begins. The system allocates one slot for each probe table partition and one more slot as the read buffer for the probe table. Rows that pass the bitmap vector filter are placed in the probe partition's slot and compared for exact values against the build table partition's slot. After this step the work area should contain one slot of build table partition 0, one slot of build table partition 1, one write buffer slot, one read buffer slot, one slot of probe table partition 0 and one slot of probe table partition 1: 6 slots in total, exactly filling all allocatable slots. Rows that match in memory may be returned during this phase. The partition pairs are then read from disk and joined, which took 10 reads of the probe partition; partition 0 likewise took 10 reads of the probe partition. This generates a lot of i/o and makes hash join performance drop sharply.
During a multipass join Oracle applies a hash function twice: it reads the build table partition back from disk and hashes it again into subpartitions, so that only the matching probe table subpartition has to be read each time, instead of reading the probe partition repeatedly and incurring excessive i/o. I could not find this step in this trace file, however.
Another trace file shows ROLE REVERSAL. When joining, oracle evaluates the sizes of the build partition and the probe partition. If it finds that the probe partition is smaller than the build partition, it swaps their roles: the hash table is built on the original probe partition and the original build partition is used to probe it. This also reduces i/o to some extent and improves efficiency.
*** HASH JOIN GET FLUSHED PARTITIONS (PHASE 2) ***
Getting a pair of flushed partions.
BUILD PARTION: nrows:24948 size=(3 slots, 384K)
PROBE PARTION: nrows:12410 size=(2 slots, 256K)
ROLE REVERSAL OCCURRED
### Hash table overall statistics ###
Total buckets: 262144 Empty buckets: 250012 Non-empty buckets: 12132
Total number of rows: 12410
Maximum number of rows in a bucket: 3
Average number of rows in non-empty buckets: 1.022915
multipass.doc (click to download the trace file)
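The role reversal decision in the trace excerpt can be illustrated with a tiny Python sketch. Comparing row counts is an assumption made here for simplicity, since the trace reports both row counts and slot sizes and does not say which one drives the decision; `choose_roles` is a hypothetical helper.

```python
def choose_roles(build_partition, probe_partition):
    """Return (side to hash, side to scan, whether the roles were swapped)."""
    if len(probe_partition) < len(build_partition):
        # The probe side is smaller: hash it and scan the old build side.
        return probe_partition, build_partition, True
    return build_partition, probe_partition, False

# Row counts taken from the trace excerpt: 24948 build rows, 12410 probe rows.
hashed, scanned, swapped = choose_roles(range(24948), range(12410))
print(len(hashed), len(scanned), swapped)
```

The hashed side ends up with 12410 rows, which agrees with the "Total number of rows: 12410" line in the hash table statistics of the trace.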
The above is only my inference about how hash join operates, based on the trace files, and may not reflect what hash join really does, so discussion is welcome!
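As most of you probably know, there are two ways to upgrade an oracle database version:
1. dbua (database upgrade assistant)
2. exp/imp
An upgrade through dbua does not touch the data files, so it is fairly fast, but if dbua runs into a problem partway through the upgrade, the original database may be left unusable.
exp/imp does not affect the original database, but if the database is large the upgrade time becomes unacceptable, especially for 24*7 applications.
To address this, amadeus presented a very creative idea at 2006 oracle openworld: use dataguard and transport tablespace together to achieve a safe upgrade in the shortest possible time.
First, some background on amadeus.
Amadeus Global Travel Distribution SA is a world-leading technology and distribution provider for the travel industry. Amadeus was founded in 1987 with its headquarters in Madrid, Spain. It has marketing and development offices in Sophia Antipolis (near Nice, France) and in Boston, USA. The company's data center is in Erding, near Munich, Germany. The company provides advanced technology solutions for the travel industry and has become the fastest-growing and most widely used global distribution system (GDS).
As a technology partner, Amadeus brings advanced information technology to the travel industry, benefiting travel suppliers and leisure and business travel service providers. Through national marketing companies (NMCs) that serve local markets, Amadeus uses its large information technology resources to deliver technology solutions to 200 countries and regions worldwide.
Now some facts about their database.
Their business systems achieve 99.99% availability, with 300,000 database requests per second and 280 million transactions per day. This is a very large database system, and they could not accept the upgrade risk of either dbua or exp/imp, so their engineers came up with the idea of using dataguard and transport tablespace to upgrade safely in the shortest possible time.
The procedure is as follows:
1. Create a dataguard database for the primary (can be done online)
2. Install the 10g software on the dataguard host (can be done online)
3. Collect the things that transport tablespace cannot carry over, such as sequences, synonyms, grants......
4. Stop every application that writes to the primary and keep serving reads (writes stopped, queries available)
5. Force the primary to archive its redo log, ship it to the dataguard database and recover (writes stopped, queries available)
6. Use transport tablespace to convert the database version, and create the sequences, synonyms, grants and so on (writes stopped, queries available)
7. Validate the new environment. If a problem is found during validation, switch back to the original system (writes stopped, queries available)
8. Switch the applications to the 10g database (service available)
In a rehearsal amadeus completed steps 4, 5, 6 and 7 within 10 minutes and switched systems successfully. Considering how busy and how large their database is, that is a remarkable achievement. We can borrow their approach for future database version upgrades.
Now let's verify technically that transport tablespace can be used for a version upgrade.
Create a test tablespace on the 9i database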
create tablespace test
datafile '/opt/oracle/test.dbf' size 10m
extent management local autoallocate;
Create a table in the test tablespace
create table test1(a number) tablespace test;
insert into test1 values(1);
commit;
SQL 9i>select * from test1;
A
----------
1
Put the test tablespace in read only mode
alter tablespace test read only;
Export the metadata of the test tablespace
exp 'sys/sys as sysdba' transport_tablespace=y tablespaces=(TEST) file=test.dmp log=test.log
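Transfer the dmp file and the data file to the remote host (in the amadeus case the 10g and 9i databases were on the same machine, so the time to copy data files was avoided, which is one of the key points of the whole approach)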
scp test.dmp oracle@10.0.100.115:/opt/oracle/
scp /opt/oracle/test.dbf oracle@10.0.100.115:/opt/oracle/
Import the metadata on the target database
imp 'sys/sys as sysdba' transport_tablespace=y tablespaces=(TEST) file='/opt/oracle/test.dmp' datafiles=('/opt/oracle/test.dbf') tts_owners=test fromuser=test touser=test log=tts_i.log
Query the test1 table; the data is identical
SQL 10G>select * from test1;
A
----------
1
Put the test tablespace in read write mode
alter tablespace test read write;
insert into test1 values(2);
SQL 10G>select * from test1;
A
----------
1
2
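Everything works; the test is complete.
This test is a simple simulation showing that upgrading a database with transport tablespace is feasible. In a real upgrade we would of course have to check that the tablespace set is self-contained, whether sequences need to be created, and so on, but overall this approach offers a database version upgrade in the shortest possible time.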

create table fenbu as select 1 id,'Y' flag from dba_objects where rownum<100001;
insert into fenbu values(1,'N');
commit;
create index IDX_FENBU_FLAG on fenbu(flag);
analyze table fenbu compute statistics for table for all columns for all indexes;
var a varchar2(32);
exec :a:='N';
SQL 10G>set autotrace trace exp;
SQL 10G>alter session set events'10046 trace name context forever,level 12';
Session altered.
SQL 10G>select * from fenbu where flag=:a;
Execution Plan
----------------------------------------------------------
----------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50001 | 244K| 202 (80)|
|* 1 | TABLE ACCESS FULL| FENBU | 50001 | 244K| 202 (80)|
----------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("FLAG"=:A)
Note
-----
- 'PLAN_TABLE' is old version
SQL 10G>alter session set events'10046 trace name context off';
Session altered.
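Clearly the plan shown by set autotrace is wrong. That is because set autotrace, explain plan and similar
operations do not perform bind peeking: the value of the bind variable is not reflected in the execution plan
and the data distribution in the histogram is not consulted, so the plan they produce cannot be trusted. Let's look at the real plan from 10046.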
select *
from
fenbu where flag=:a
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 0 4 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.00 0.00 0 4 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 55
Rows Row Source Operation
------- ---------------------------------------------------
1 TABLE ACCESS BY INDEX ROWID FENBU (cr=4 pr=0 pw=0 time=102 us)
1 INDEX RANGE SCAN IDX_FENBU_FLAG (cr=3 pr=0 pw=0 time=78 us)(object id 83455)
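The previous article mentioned that updating a compressed table causes row migration, but did not go into detail.
This time let's look closely at what happens when a compressed table is updated, and also at how
structure changes on a compressed table are handled.
First create the test table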
create table test2(a varchar2(10),b varchar2(10),c varchar2(10));
begin
for i in 1000000000..1000100000 loop
insert into test2 values(i,'1',to_char(mod(i,100)));
commit;
end loop;
end;
/
SQL 10G>create table testcom4 compress as select * from test2 order by c;
Table created.
Add a column to the compressed table
SQL 10G>alter table testcom4 add d number;
Table altered.
Pick one record and find its file number, block number, row number and rowid
SQL 10G>select dbms_rowid.ROWID_RELATIVE_FNO(rowid) file#,
dbms_rowid.ROWID_BLOCK_NUMBER(rowid) block#,
dbms_rowid.ROWID_ROW_NUMBER(rowid) row# from
testcom4 where rownum<2;
FILE# BLOCK# ROW#
---------- ---------- ----------
12 61364 0
SQL 10G>select rowid from testcom4 where rownum<2;
ROWID
------------------
AAAT9AAAMAAAO+0AAA
Update this record
SQL 10G>update testcom4 set d=1 where rowid='AAAT9AAAMAAAO+0AAA';
1 row updated.
SQL 10G>commit;
Commit complete.
SQL 10G>select dbms_rowid.ROWID_RELATIVE_FNO(rowid) file#,
dbms_rowid.ROWID_BLOCK_NUMBER(rowid) block#,
dbms_rowid.ROWID_ROW_NUMBER(rowid) row# from testcom4
where rowid='AAAT9AAAMAAAO+0AAA';
FILE# BLOCK# ROW#
---------- ---------- ----------
12 61364 0
Dump this block to see how the row migration happens
SQL 10G>alter system dump datafile 12 block 61364;
System altered.
perm_9ir2[3]={ 2 0 1 }
...
block_row_dump:
tab 0, row 0, @0x1f79
tl: 7 fb: --H-FL-- lb: 0x0 cc: 2
col 0: [ 1] 31
col 1: [ 1] 30
bindmp: 01 bc 02 c9 31 c9 30
tab 1, row 0, @0x1f69
tl: 9 fb: --H----- lb: 0x2 cc: 0
nrid: 0x0300f085.0 <- this points to the new data block
bindmp: 20 02 00 03 00 f0 85 00 00
Locate the new block
SQL 10G>select dbms_utility.DATA_BLOCK_ADDRESS_FILE(to_number('300f085','xxxxxxxxxx'))
file#,dbms_utility.DATA_BLOCK_ADDRESS_BLOCK(to_number('300f085','xxxxxxxxxx'))
block# from dual;
FILE# BLOCK#
---------- ----------
12 61573
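The arithmetic behind the two dbms_utility calls can be reproduced outside the database. The sketch below assumes the usual layout of a data block address in a smallfile tablespace, where the top 10 bits hold the relative file number and the low 22 bits hold the block number; `decode_dba` is a hypothetical helper, not an Oracle API.

```python
def decode_dba(dba):
    """Split a 32-bit data block address into (relative file#, block#)."""
    file_no = dba >> 22          # top 10 bits
    block_no = dba & 0x3FFFFF    # low 22 bits
    return file_no, block_no

# The nrid from the block dump, without the ".0" row suffix.
print(decode_dba(0x0300f085))
```

For the nrid 0x0300f085 this gives file 12 and block 61573, the same values that the query against dual returns.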
Dump the new block
SQL 10G>alter system dump datafile 12 block 61573;
System altered.
block_row_dump:
tab 0, row 0, @0x1f65
tl: 27 fb: ----FL-- lb: 0x1 cc: 4
hrid: 0x0300efb4.0
col 0: [10] 31 30 30 30 30 39 33 30 30 30
col 1: [ 1] 31
col 2: [ 1] 30
col 3: [ 2] c1 02
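The new block already holds the row in uncompressed format, which shows that updating a compressed table
does indeed cause the updated row to lose its compression.
Can the newly added column be dropped, then? Let's try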
SQL 10G>alter table testcom4 drop column d;
alter table testcom4 drop column d
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables
It fails with "unsupported add/drop column operation on compressed tables".
metalink says this is an oracle bug that was fixed in 10g, but it still fails on my 10g r2.
9i is even worse: even add column does not work.
---------------------------
SQL 9I> alter table testcom4 add d number;
alter table testcom4 add d number
*
ERROR at line 1:
ORA-22856: cannot add columns to object tables
---------------------------
10g does allow the set unused operation
SQL 10G>alter table testcom4 set unused column d;
Table altered.
but drop unused columns still fails, which is again a bug
SQL 10G>alter table testcom4 drop unused columns;
alter table testcom4 drop unused columns
*
ERROR at line 1:
ORA-12996: cannot drop system-generated virtual column
Hopefully the next patch I download will fix these problems.



Compared with earlier versions, 10g r2 adjusts how db_file_multiblock_read_count enters the cbo cost formula. See the comparison tables from my experiment below.
My test environment
[oracle@csdba ~]$ uname -a
Linux csdba 2.6.9-11.ELsmp #1 SMP Fri May 20 18:26:27 EDT 2005 i686 i686 i386 GNU/Linux
SQL*Plus: Release 10.2.0.1.0 - Production on Mon Jul 24 17:07:42 2006
Copyright (c) 1982, 2005, Oracle. All rights reserved.
Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning and Data Mining Scoring Engine options
I ran six comparisons: two versions, each with the table statistics set to 1000, 10000 and 100000 blocks
exec dbms_stats.SET_TABLE_STATS(OWNNAME=>'TEST',TABNAME=>'T1',NUMBLKS=>1000);
exec dbms_stats.SET_TABLE_STATS(OWNNAME=>'TEST',TABNAME=>'T1',NUMBLKS=>10000);
exec dbms_stats.SET_TABLE_STATS(OWNNAME=>'TEST',TABNAME=>'T1',NUMBLKS=>100000);
10g r2
| MBRC | BLOCKS | COST | ADJUSTED_MBRC |
| 4 | 1000 | 377 | 2.652519894 |
| 8 | 1000 | 273 | 3.663003663 |
| 16 | 1000 | 221 | 4.524886878 |
| 32 | 1000 | 195 | 5.128205128 |
| 64 | 1000 | 182 | 5.494505495 |
| 128 | 1000 | 176 | 5.681818182 |
| MBRC | BLOCKS | COST | ADJUSTED_MBRC |
| 4 | 10000 | 3763 | 2.657454159 |
| 8 | 10000 | 2722 | 3.673769287 |
| 16 | 10000 | 2201 | 4.543389368 |
| 32 | 10000 | 1941 | 5.151983514 |
| 64 | 10000 | 1811 | 5.521811154 |
| 128 | 10000 | 1745 | 5.730659026 |
| MBRC | BLOCKS | COST | ADJUSTED_MBRC |
| 4 | 100000 | 37615 | 2.658513891 |
| 8 | 100000 | 27199 | 3.676605758 |
| 16 | 100000 | 21990 | 4.547521601 |
| 32 | 100000 | 19386 | 5.158361704 |
| 64 | 100000 | 18084 | 5.529750055 |
| 128 | 100000 | 17433 | 5.736247347 |
9i r2
| MBRC | BLOCKS | COST | ADJUSTED_MBRC |
| 4 | 1000 | 241 | 4.149377593 |
| 8 | 1000 | 153 | 6.535947712 |
| 16 | 1000 | 98 | 10.20408163 |
| 32 | 1000 | 63 | 15.87301587 |
| 64 | 1000 | 40 | 25 |
| 128 | 1000 | 26 | 38.46153846 |
| MBRC | BLOCKS | COST | ADJUSTED_MBRC |
| 4 | 10000 | 2397 | 4.171881519 |
| 8 | 10000 | 1519 | 6.583278473 |
| 16 | 10000 | 963 | 10.38421599 |
| 32 | 10000 | 611 | 16.36661211 |
| 64 | 10000 | 388 | 25.77319588 |
| 128 | 10000 | 246 | 40.6504065 |
| MBRC | BLOCKS | COST | ADJUSTED_MBRC |
| 4 | 100000 | 23953 | 4.1748424 |
| 8 | 100000 | 15179 | 6.588049279 |
| 16 | 100000 | 9619 | 10.39609107 |
| 32 | 100000 | 6096 | 16.40419948 |
| 64 | 100000 | 3863 | 25.88661662 |
| 128 | 100000 | 2449 | 40.83299306 |
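The conclusion from the tests is that in 10g r2 db_file_multiblock_read_count
has a noticeably smaller effect on the cost calculation. oracle appears to treat db_file_multiblock_read_count
more cautiously, so setting db_file_multiblock_read_count to a large value
no longer pushes the database toward full table scans.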
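The ADJUSTED_MBRC column in the tables appears to be simply BLOCKS divided by COST, since the published figures reproduce that way. A short Python check, using only the 1000-block rows from the tables, makes the difference between the two versions concrete. The function and variable names are hypothetical.

```python
def adjusted_mbrc(blocks, cost):
    """Effective multiblock read count implied by a full scan cost."""
    return blocks / cost

# MBRC -> COST for NUMBLKS = 1000, copied from the tables above.
cost_10g = {4: 377, 8: 273, 16: 221, 32: 195, 64: 182, 128: 176}
cost_9i = {4: 241, 8: 153, 16: 98, 32: 63, 64: 40, 128: 26}

for mbrc in cost_10g:
    print(mbrc,
          round(adjusted_mbrc(1000, cost_10g[mbrc]), 3),
          round(adjusted_mbrc(1000, cost_9i[mbrc]), 3))

# How much cheaper the scan becomes when MBRC goes from 4 to 128.
print(round(cost_10g[4] / cost_10g[128], 2), round(cost_9i[4] / cost_9i[128], 2))
```

Raising the parameter from 4 to 128 cuts the 10g r2 cost by a factor of about 2.1, against about 9.3 in 9i r2, which is the reduced sensitivity the conclusion refers to.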

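Lehmann saved Ayala's penalty earlier, and Cambiasso must know that only too well. Can he still face the man in front of him with a smile? What will his expression be 10 seconds from now?
The shot is saved! The match is over! Argentina have lost. Once again they have fallen before the fascist team. The tragic Argentina head coach! Pekerman falls today! Argentina, don't cry for me any more!
That penalty was, in absolute theoretical terms, a terrible kick. An absolutely terrible kick, and Argentina are knocked out short of the last four!
This defeat belongs to Argentina, to Ayala, to Franco, to Pekerman, to everyone who loves Argentine football!
Argentina will truly regret this. Leading by a goal in the second half, Pekerman played too conservatively, too cautiously. He lost the courage he showed in the group stage. Facing Germany's long history, he lost the merciless style of the group stage, and in the end he reaped what he sowed. It is time for Argentina to go home. Perhaps they need not travel all the way back to Argentina, they need not go home at all, since most of them live in Europe. Goodbye!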

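first_rows_n and all_rows are both settings of oracle's optimizer_mode. How do they differ, and how do they affect the optimizer? Let's work it out.
all_rows mode:
all_rows is the oracle optimizer's default mode in 10g. It chooses the execution plan that returns all of the rows in the shortest time, based on overall cost.
first_rows_n mode:
first_rows_n was introduced in 9i to replace the older first_rows mode. first_rows still exists, but oracle no longer recommends it, because it is implemented largely through rules hard-coded in the oracle executable. For example, it tries to avoid hash join or merge join altogether unless the inner table of the nested loop would be full scanned, and it favors indexes over full table scans, which backfires in some cases. So oracle introduced first_rows_n, which chooses the plan by cost rather than by hard-coded rules. n can be 1, 10, 100 or 1000, or any positive integer when specified through the first_rows(n) hint. n is the number of leading rows of the result set we want to fetch. For example, with n = 1 oracle picks the plan that returns the first row of the result set fastest, regardless of whether that plan is the cheapest for fetching the whole result set. Many pagination queries have this requirement.
So how does oracle cost first_rows_n and make its choice? The 10053 trace event gives us the answer.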
create table t as select * from dba_objects;
create table t1 as select * from t;
create index ind_object_id on t(object_id) compute statistics;
create index ind_t1_object_id on t1(object_id) compute statistics;
analyze table t compute statistics for table for all columns;
analyze table t1 compute statistics for table for all columns;
With the test tables and indexes ready, here are the test scripts
all_rows mode:
alter session set events'10053 trace name context forever,level 1';
alter session set optimizer_mode=all_rows;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events'10053 trace name context off';
first_rows_1 mode:
alter session set events'10053 trace name context forever,level 1';
alter session set optimizer_mode=first_rows_1;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events'10053 trace name context off';
first_rows_10 mode:
alter session set events'10053 trace name context forever,level 1';
alter session set optimizer_mode=first_rows_10;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events'10053 trace name context off';
first_rows_100 mode:
alter session set events'10053 trace name context forever,level 1';
alter session set optimizer_mode=first_rows_100;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events'10053 trace name context off';
The full output is too long, so the 10053 trace files are trimmed to the join section only, with the merge join parts removed.
The test environment is 10g r2.
all_rows:
**************************
GENERAL PLANS
**************************
Considering cardinality-based initial join order.
***********************
Join order[1]: T[T]#0 T1[T1]#1
***************
Now joining: T1[T1]#1
***************
NL Join
Outer table: Card: 51986.00 Cost: 164.59 Resp: 164.59 Degree: 1 Bytes: 9
Inner table: T1 Alias: T1
Access Path: TableScan
NL Join: Cost: 8493121.71 Resp: 8493121.71 Degree: 0
Cost_io: 8358538.00 Cost_cpu: 839658589661
Resp_io: 8358538.00 Resp_cpu: 839658589661
Access Path: index (index (FFS))
Index: IND_T1_OBJECT_ID
resc_io: 25.16 resc_cpu: 7056806
ix_sel: 0.0000e+00 ix_sel_with_filters: 1
Inner table: T1 Alias: T1
Access Path: index (FFS)
NL Join: Cost: 1366740.53 Resp: 1366740.53 Degree: 0
Cost_io: 1307937.00 Cost_cpu: 366871247240
Resp_io: 1307937.00 Resp_cpu: 366871247240
Access Path: index (AllEqJoinGuess)
Index: IND_T1_OBJECT_ID
resc_io: 1.00 resc_cpu: 8371
ix_sel: 1.9239e-05 ix_sel_with_filters: 1.9239e-05
NL Join: Cost: 52220.34 Resp: 52220.34 Degree: 1
Cost_io: 52148.00 Cost_cpu: 451348998
Resp_io: 52148.00 Resp_cpu: 451348998
Best NL cost: 52220.34
resc: 52220.34 resc_io: 52148.00 resc_cpu: 451348998
resp: 52220.34 resp_io: 52148.00 resp_cpu: 451348998
Join Card: 51982.00 = outer (51986.00) * inner (51986.00) * sel (1.9234e-05)
Join Card - Rounded: 51982 Computed: 51982.00
HA Join
Outer table:
resc: 164.59 card 51986.00 bytes: 9 deg: 1 resp: 164.59
Inner table: T1 Alias: T1
resc: 28.13 card: 51986.00 bytes: 4 deg: 1 resp: 28.13
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA Join (swap)
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 164.59 card: 51986.00 bytes: 9 deg: 1 resp: 164.59
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA cost: 195.30
resc: 195.30 resc_io: 189.00 resc_cpu: 39324090
resp: 195.30 resp_io: 189.00 resp_cpu: 39324090
Best:: JoinMethod: Hash
Cost: 195.30 Degree: 1 Resp: 195.30 Card: 51982.00 Bytes: 13
***********************
Best so far: Table#: 0 cost: 164.5888 card: 51986.0000 bytes: 467874
Table#: 1 cost: 195.3030 card: 51982.0000 bytes: 675766
This costs the first join order, with T as the driving table and T1 as the inner table:
Best:: JoinMethod: Hash
Cost: 195.30 Degree: 1 Resp: 195.30 Card: 51982.00 Bytes: 13
Here the best join method is hash join,
with a final cost of 195.30 and a result set of 51982 rows.
***********************
Join order[2]: T1[T1]#1 T[T]#0
***************
Now joining: T[T]#0
***************
NL Join
Outer table: Card: 51986.00 Cost: 28.13 Resp: 28.13 Degree: 1 Bytes: 4
Inner table: T Alias: T
Access Path: TableScan
NL Join: Cost: 8492985.25 Resp: 8492985.25 Degree: 0
Cost_io: 8358403.00 Cost_cpu: 839649495148
Resp_io: 8358403.00 Resp_cpu: 839649495148
Access Path: index (AllEqJoinGuess)
Index: IND_OBJECT_ID
resc_io: 2.00 resc_cpu: 15913
ix_sel: 1.9239e-05 ix_sel_with_filters: 1.9239e-05
NL Join (ordered): Cost: 104132.73 Resp: 104132.73 Degree: 1
Cost_io: 103999.00 Cost_cpu: 834303785
Resp_io: 103999.00 Resp_cpu: 834303785
Best NL cost: 104132.73
resc: 104132.73 resc_io: 103999.00 resc_cpu: 834303785
resp: 104132.73 resp_io: 103999.00 resp_cpu: 834303785
Join Card: 51982.00 = outer (51986.00) * inner (51986.00) * sel (1.9234e-05)
Join Card - Rounded: 51982 Computed: 51982.00
HA Join
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 164.59 card: 51986.00 bytes: 9 deg: 1 resp: 164.59
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA cost: 195.30
resc: 195.30 resc_io: 189.00 resc_cpu: 39324090
resp: 195.30 resp_io: 189.00 resp_cpu: 39324090
Join order aborted: cost > best plan cost
This costs the second join order, with T1 as the driving table and T as the inner table:
Join order aborted: cost > best plan cost
The second join order is abandoned because its cost exceeds the best cost already found for the first join order.
***********************
(newjo-stop-1) k:0, spcnt:0, perm:2, maxperm:2000
*********************************
Number of join permutations tried: 2
*********************************
(newjo-save) [1 0 ]
Final - All Rows Plan: Best join order: 1
Cost: 195.3030 Degree: 1 Card: 51982.0000 Bytes: 675766
Resc: 195.3030 Resc_io: 189.0000 Resc_cpu: 39324090
Resp: 195.3030 Resp_io: 189.0000 Resc_cpu: 39324090
In All Rows mode the optimizer finally chose Best join order: 1 with Cost: 195.3030,
after trying 2 join orders (Number of join permutations tried: 2).
first_rows_1 mode:
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]: T[T]#0 T1[T1]#1
***************
Now joining: T1[T1]#1
***************
NL Join
Outer table: Card: 51986.00 Cost: 164.59 Resp: 164.59 Degree: 1 Bytes: 9
Inner table: T1 Alias: T1
Access Path: TableScan
NL Join: Cost: 8493121.71 Resp: 8493121.71 Degree: 0
Cost_io: 8358538.00 Cost_cpu: 839658589661
Resp_io: 8358538.00 Resp_cpu: 839658589661
Access Path: index (index (FFS))
Index: IND_T1_OBJECT_ID
resc_io: 25.16 resc_cpu: 7056806
ix_sel: 0.0000e+00 ix_sel_with_filters: 1
Inner table: T1 Alias: T1
Access Path: index (FFS)
NL Join: Cost: 1366740.53 Resp: 1366740.53 Degree: 0
Cost_io: 1307937.00 Cost_cpu: 366871247240
Resp_io: 1307937.00 Resp_cpu: 366871247240
Access Path: index (AllEqJoinGuess)
Index: IND_T1_OBJECT_ID
resc_io: 1.00 resc_cpu: 8371
ix_sel: 1.9239e-05 ix_sel_with_filters: 1.9239e-05
NL Join: Cost: 52220.34 Resp: 52220.34 Degree: 1
Cost_io: 52148.00 Cost_cpu: 451348998
Resp_io: 52148.00 Resp_cpu: 451348998
Best NL cost: 52220.34
resc: 52220.34 resc_io: 52148.00 resc_cpu: 451348998
resp: 52220.34 resp_io: 52148.00 resp_cpu: 451348998
Join Card: 51982.00 = outer (51986.00) * inner (51986.00) * sel (1.9234e-05)
Join Card - Rounded: 51982 Computed: 51982.00
HA Join
Outer table:
resc: 164.59 card 51986.00 bytes: 9 deg: 1 resp: 164.59
Inner table: T1 Alias: T1
resc: 28.13 card: 51986.00 bytes: 4 deg: 1 resp: 28.13
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA Join (swap)
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 164.59 card: 51986.00 bytes: 9 deg: 1 resp: 164.59
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA cost: 195.30
resc: 195.30 resc_io: 189.00 resc_cpu: 39324090
resp: 195.30 resp_io: 189.00 resp_cpu: 39324090
Best:: JoinMethod: Hash
Cost: 195.30 Degree: 1 Resp: 195.30 Card: 51982.00 Bytes: 13
***********************
Best so far: Table#: 0 cost: 164.5888 card: 51986.0000 bytes: 467874
Table#: 1 cost: 195.3030 card: 51982.0000 bytes: 675766
*********************************
Number of join permutations tried: 1
*********************************
(newjo-save) [1 0 ]
Final - All Rows Plan: Best join order: 1
Cost: 195.3030 Degree: 1 Card: 51982.0000 Bytes: 675766
Resc: 195.3030 Resc_io: 189.0000 Resc_cpu: 39324090
Resp: 195.3030 Resp_io: 189.0000 Resc_cpu: 39324090
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ NO_STAR_TRANSFORMATION NO_EXPAND */ "T"."OWNER" "OWNER" FROM "TEST"."T" "T","TEST"."T1" "T1" WHERE "T"."OBJECT_ID"="T1"."OBJECT_ID"
kkoqbc-end
: call(in-use=32712, alloc=49112), compile(in-use=35284, alloc=36696)
First K Rows: K/N ratio = 0.000019237428341, qbc=0x905f2620
First K Rows: Setup end
***********************
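In FIRST_ROWS_1 mode oracle first costs one join order in ALL_ROWS mode (Number of join permutations tried: 1)
to obtain the size of the result set,
and from that computes the ratio of the 1 row requested by FIRST_ROWS_1 to the full result set: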
Join Card - Rounded: 51982 Computed: 51982.00
First K Rows: K/N ratio = 1/51982=0.000019237428341
Using this K/N ratio, oracle recomputes the join cost:
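The K/N ratio itself is plain division and can be checked quickly in Python. Only the value for K = 1 appears in the trace; the values for K = 10 and K = 100 printed below are what the same division would give and are not taken from any trace. `k_over_n` is a hypothetical helper.

```python
def k_over_n(k, join_card):
    """Fraction of the full result set that FIRST_ROWS_k asks for."""
    return k / join_card

JOIN_CARD = 51982  # "Join Card - Rounded" from the ALL_ROWS pass

for k in (1, 10, 100):
    print(k, f"{k_over_n(k, JOIN_CARD):.15f}")
```

For K = 1 the result agrees with the 0.000019237428341 reported on the "First K Rows: K/N ratio" line.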
SINGLE TABLE ACCESS PATH (First K Rows)
Table: T Alias: T
Card: Original: 2 Rounded: 2 Computed: 2.00 Non Adjusted: 2.00
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 7541
Resp_io: 2.00 Resp_cpu: 7541
Best:: AccessPath: TableScan
Cost: 2.00 Degree: 1 Resp: 2.00 Card: 2.00 Bytes: 9
***************************************
SINGLE TABLE ACCESS PATH (First K Rows)
Table: T1 Alias: T1
Card: Original: 25996 Rounded: 25996 Computed: 25996.00 Non Adjusted: 25996.00
Access Path: TableScan
Cost: 83.30 Resp: 83.30 Degree: 0
Cost_io: 82.00 Cost_cpu: 8079850
Resp_io: 82.00 Resp_cpu: 8079850
Access Path: index (index (FFS))
Index: IND_T1_OBJECT_ID
resc_io: 14.00 resc_cpu: 3532204
ix_sel: 0.0000e+00 ix_sel_with_filters: 1
Access Path: index (FFS)
Cost: 14.57 Resp: 14.57 Degree: 1
Cost_io: 14.00 Cost_cpu: 3532204
Resp_io: 14.00 Resp_cpu: 3532204
Access Path: index (FullScan)
Index: IND_T1_OBJECT_ID
resc_io: 59.00 resc_cpu: 5618765
ix_sel: 1 ix_sel_with_filters: 1
Cost: 59.90 Resp: 59.90 Degree: 1
Best:: AccessPath: IndexFFS Index: IND_T1_OBJECT_ID
Cost: 14.57 Degree: 1 Resp: 14.57 Card: 25996.00 Bytes: 4
First K Rows: unchanged join prefix len = 1
***********************
Join order[1]: T[T]#0 T1[T1]#1
***************
Now joining: T1[T1]#1
***************
NL Join
Outer table: Card: 2.00 Cost: 2.00 Resp: 2.00 Degree: 1 Bytes: 9
Inner table: T1 Alias: T1
Access Path: TableScan
NL Join: Cost: 166.59 Resp: 166.59 Degree: 0
Cost_io: 164.00 Cost_cpu: 16167241
Resp_io: 164.00 Resp_cpu: 16167241
Access Path: index (index (FFS))
Index: IND_T1_OBJECT_ID
resc_io: 13.50 resc_cpu: 3532204
ix_sel: 0.0000e+00 ix_sel_with_filters: 1
Inner table: T1 Alias: T1
Access Path: index (FFS)
NL Join: Cost: 30.13 Resp: 30.13 Degree: 0
Cost_io: 29.00 Cost_cpu: 7071948
Resp_io: 29.00 Resp_cpu: 7071948
Access Path: index (AllEqJoinGuess)
Index: IND_T1_OBJECT_ID
resc_io: 1.00 resc_cpu: 8371
ix_sel: 3.8475e-05 ix_sel_with_filters: 3.8475e-05
NL Join: Cost: 4.00 Resp: 4.00 Degree: 1
Cost_io: 4.00 Cost_cpu: 24284
Resp_io: 4.00 Resp_cpu: 24284
Best NL cost: 4.00
resc: 4.00 resc_io: 4.00 resc_cpu: 24284
resp: 4.00 resp_io: 4.00 resp_cpu: 24284
Join Card: 1.00 = outer (2.00) * inner (25996.00) * sel (1.9234e-05)
Join Card - Rounded: 1 Computed: 1.00
HA Join
Outer table:
resc: 164.59 card 51986.00 bytes: 9 deg: 1 resp: 164.59
Inner table: T1 Alias: T1
resc: 14.57 card: 25996.00 bytes: 4 deg: 1 resp: 14.57
using dmeth: 2 #groups: 1
Cost per ptn: 2.17 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 181.32 Resp: 181.32 [multiMatchCost=0.00]
HA Join (swap)
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 2.00 card: 2.00 bytes: 9 deg: 1 resp: 2.00
using dmeth: 2 #groups: 1
Cost per ptn: 1.75 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 31.88 Resp: 31.88 [multiMatchCost=0.00]
HA cost: 31.88
resc: 31.88 resc_io: 29.00 resc_cpu: 17981913
resp: 31.88 resp_io: 29.00 resp_cpu: 17981913
Best:: JoinMethod: NestedLoop
Cost: 4.00 Degree: 1 Resp: 4.00 Card: 1.00 Bytes: 13
***********************
Best so far: Table#: 0 cost: 2.0012 card: 2.0000 bytes: 18
Table#: 1 cost: 4.0039 card: 1.0000 bytes: 13
***********************
After the recalculation,
this costs the first join order, with T as the driving table and T1 as the inner table:
Best:: JoinMethod: NestedLoop
Cost: 4.00 Degree: 1 Resp: 4.00 Card: 1.00 Bytes: 13
Here the best join method is nested loop, which differs from the choice under ALL_ROWS.
The final cost is 4.00 and the result set is 1 row (Join Card - Rounded: 1).
***********************
Join order[2]: T1[T1]#1 T[T]#0
***************
Now joining: T[T]#0
***************
NL Join
Outer table: Card: 2.00 Cost: 2.00 Resp: 2.00 Degree: 1 Bytes: 4
Inner table: T Alias: T
Access Path: TableScan
NL Join: Cost: 166.59 Resp: 166.59 Degree: 0
Cost_io: 164.00 Cost_cpu: 16167061
Resp_io: 164.00 Resp_cpu: 16167061
Access Path: index (AllEqJoinGuess)
Index: IND_OBJECT_ID
resc_io: 2.00 resc_cpu: 15913
ix_sel: 3.8475e-05 ix_sel_with_filters: 3.8475e-05
NL Join (ordered): Cost: 5.01 Resp: 5.01 Degree: 1
Cost_io: 5.00 Cost_cpu: 31647
Resp_io: 5.00 Resp_cpu: 31647
Best NL cost: 5.01
resc: 5.01 resc_io: 5.00 resc_cpu: 31647
resp: 5.01 resp_io: 5.00 resp_cpu: 31647
Join Card: 1.00 = outer (2.00) * inner (25996.00) * sel (1.9234e-05)
Join Card - Rounded: 1 Computed: 1.00
HA Join
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 83.30 card: 25996.00 bytes: 9 deg: 1 resp: 83.30
using dmeth: 2 #groups: 1
Cost per ptn: 2.17 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 113.59 Resp: 113.59 [multiMatchCost=0.00]
HA Join (swap)
Outer table:
resc: 164.59 card 51986.00 bytes: 9 deg: 1 resp: 164.59
Inner table: T1 Alias: T1
resc: 2.00 card: 2.00 bytes: 4 deg: 1 resp: 2.00
using dmeth: 2 #groups: 1
Cost per ptn: 1.75 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 168.34 Resp: 168.34 [multiMatchCost=0.00]
HA cost: 168.34
resc: 168.34 resc_io: 164.00 resc_cpu: 27076246
resp: 168.34 resp_io: 164.00 resp_cpu: 27076246
Join order aborted: cost > best plan cost
This costs the second join order, with T1 as the driving table and T as the inner table:
Join order aborted: cost > best plan cost
The second join order is abandoned because its cost exceeds the best cost already found for the first join order.
***********************
(newjo-stop-1) k:0, spcnt:0, perm:2, maxperm:1000
*********************************
Number of join permutations tried: 2
*********************************
(newjo-save) [1 0 ]
Final - First K Rows Plan: Best join order: 1
Cost: 4.0039 Degree: 1 Card: 1.0000 Bytes: 13
Resc: 4.0039 Resc_io: 4.0000 Resc_cpu: 24284
Resp: 4.0039 Resp_io: 4.0000 Resc_cpu: 24284
kkoipt: Query block SEL$1 (#0)
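In FIRST_Rows_1 mode the optimizer finally chose Best join order: 1 with Cost: 4.0039,
after trying 2 join orders (Number of join permutations tried: 2).
In practice 3 were evaluated, counting the one computed in ALL_ROWS mode.
For comparison, here are the final plan choice and cost for FIRST_Rows_10 and FIRST_Rows_100.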
FIRST_Rows_10:
Final - First K Rows Plan: Best join order: 1
Cost: 13.0163 Degree: 1 Card: 10.0000 Bytes: 130
Resc: 13.0163 Resc_io: 13.0000 Resc_cpu: 101517
Resp: 13.0163 Resp_io: 13.0000 Resc_cpu: 101517
FIRST_Rows_100:
Final - First K Rows Plan: Best join order: 1
Cost: 31.8883 Degree: 1 Card: 51982.0000 Bytes: 1143604
Resc: 31.8883 Resc_io: 29.0000 Resc_cpu: 18019724
Resp: 31.8883 Resp_io: 29.0000 Resc_cpu: 18019724
Note that FIRST_ROWS_100 chose a hash join.
Now let's look at the execution plans.
ALL_ROWS:
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 51982 | 659K| 195 (4)|
|* 1 | HASH JOIN | | 51982 | 659K| 195 (4)|
| 2 | INDEX FAST FULL SCAN| IND_T1_OBJECT_ID | 51986 | 203K| 28 (4)|
| 3 | TABLE ACCESS FULL | T | 51986 | 456K| 165 (2)|
-------------------------------------------------------------------------------
FIRST_Rows_1:
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 4 (0)|
| 1 | NESTED LOOPS | | 1 | 13 | 4 (0)|
| 2 | TABLE ACCESS FULL| T | 25996 | 228K| 2 (0)|
|* 3 | INDEX RANGE SCAN | IND_T1_OBJECT_ID | 1 | 4 | 1 (0)|
----------------------------------------------------------------------------
FIRST_Rows_10:
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 130 | 13 (0)|
| 1 | NESTED LOOPS | | 10 | 130 | 13 (0)|
| 2 | TABLE ACCESS FULL| T | 47264 | 415K| 2 (0)|
|* 3 | INDEX RANGE SCAN | IND_T1_OBJECT_ID | 1 | 4 | 1 (0)|
----------------------------------------------------------------------------
FIRST_Rows_100:
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 51982 | 1116K| 32 (10)|
|* 1 | HASH JOIN | | 51982 | 1116K| 32 (10)|
| 2 | INDEX FAST FULL SCAN| IND_T1_OBJECT_ID | 51986 | 203K| 28 (4)|
| 3 | TABLE ACCESS FULL | T | 51986 | 456K| 2 (0)|
-------------------------------------------------------------------------------
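To sum up, first_rows_n is cost-based: it recomputes the access cost of each object according to the number of rows N to be returned first,
and so produces the execution plan that returns the first N rows fastest.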

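first_rows_n and all_rows are both settings of Oracle's optimizer_mode. How do they differ, and how do they affect the optimizer? Let's work it out.
all_rows mode:
all_rows is the Oracle optimizer's default mode (as of 10g). It chooses the execution plan that returns all rows in the shortest time, so its decision is based on the overall cost.
first_rows_n mode:
first_rows_n was introduced in 9i to replace the older first_rows mode. first_rows still exists, but Oracle no longer recommends it, because it is implemented largely through rules hard-coded in the Oracle executable. For example, it tries to avoid hash joins and merge joins altogether unless the non-driving table of the nested loop would be accessed by a full table scan, and it favors index access over full table scans, which can backfire in some cases. first_rows_n replaces it and chooses the plan by cost rather than by hard-coded rules. n can be 1, 10, 100 or 1000 as an optimizer_mode value, or any positive integer when given through the first_rows(n) hint. Here n is the number of rows from the front of the result set that we want returned first. For example, with n = 1 Oracle picks the plan that returns the first row of the result set fastest, regardless of whether that plan is the cheapest one for fetching the whole result set. This requirement comes up often in pagination queries.
So how does Oracle cost first_rows_n and make its choice? The 10053 trace event gives us the answer.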
create table t as select * from dba_objects;
create table t1 as select * from t;
create index ind_object_id on t(object_id) compute statistics;
create index ind_t1_object_id on t1(object_id) compute statistics;
analyze table t compute statistics for table for all columns;
analyze table t1 compute statistics for table for all columns;
With the test tables and indexes in place, here are the test scripts.
all_rows mode:
alter session set events '10053 trace name context forever,level 1';
alter session set optimizer_mode=all_rows;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events '10053 trace name context off';
first_rows_1 mode:
alter session set events '10053 trace name context forever,level 1';
alter session set optimizer_mode=first_rows_1;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events '10053 trace name context off';
first_rows_10 mode:
alter session set events '10053 trace name context forever,level 1';
alter session set optimizer_mode=first_rows_10;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events '10053 trace name context off';
first_rows_100 mode:
alter session set events '10053 trace name context forever,level 1';
alter session set optimizer_mode=first_rows_100;
select t.owner from t,t1 where t.object_id = t1.object_id;
alter session set events '10053 trace name context off';
The 10053 trace files are very long, so I have trimmed them down to the join section only and removed the merge join parts.
The test environment is 10g R2.
all_rows:
**************************
GENERAL PLANS
**************************
Considering cardinality-based initial join order.
***********************
Join order[1]: T[T]#0 T1[T1]#1
***************
Now joining: T1[T1]#1
***************
NL Join
Outer table: Card: 51986.00 Cost: 164.59 Resp: 164.59 Degree: 1 Bytes: 9
Inner table: T1 Alias: T1
Access Path: TableScan
NL Join: Cost: 8493121.71 Resp: 8493121.71 Degree: 0
Cost_io: 8358538.00 Cost_cpu: 839658589661
Resp_io: 8358538.00 Resp_cpu: 839658589661
Access Path: index (index (FFS))
Index: IND_T1_OBJECT_ID
resc_io: 25.16 resc_cpu: 7056806
ix_sel: 0.0000e+00 ix_sel_with_filters: 1
Inner table: T1 Alias: T1
Access Path: index (FFS)
NL Join: Cost: 1366740.53 Resp: 1366740.53 Degree: 0
Cost_io: 1307937.00 Cost_cpu: 366871247240
Resp_io: 1307937.00 Resp_cpu: 366871247240
Access Path: index (AllEqJoinGuess)
Index: IND_T1_OBJECT_ID
resc_io: 1.00 resc_cpu: 8371
ix_sel: 1.9239e-05 ix_sel_with_filters: 1.9239e-05
NL Join: Cost: 52220.34 Resp: 52220.34 Degree: 1
Cost_io: 52148.00 Cost_cpu: 451348998
Resp_io: 52148.00 Resp_cpu: 451348998
Best NL cost: 52220.34
resc: 52220.34 resc_io: 52148.00 resc_cpu: 451348998
resp: 52220.34 resp_io: 52148.00 resp_cpu: 451348998
Join Card: 51982.00 = outer (51986.00) * inner (51986.00) * sel (1.9234e-05)
Join Card - Rounded: 51982 Computed: 51982.00
HA Join
Outer table:
resc: 164.59 card 51986.00 bytes: 9 deg: 1 resp: 164.59
Inner table: T1 Alias: T1
resc: 28.13 card: 51986.00 bytes: 4 deg: 1 resp: 28.13
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA Join (swap)
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 164.59 card: 51986.00 bytes: 9 deg: 1 resp: 164.59
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA cost: 195.30
resc: 195.30 resc_io: 189.00 resc_cpu: 39324090
resp: 195.30 resp_io: 189.00 resp_cpu: 39324090
Best:: JoinMethod: Hash
Cost: 195.30 Degree: 1 Resp: 195.30 Card: 51982.00 Bytes: 13
***********************
Best so far: Table#: 0 cost: 164.5888 card: 51986.0000 bytes: 467874
Table#: 1 cost: 195.3030 card: 51982.0000 bytes: 675766
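Cost calculation for the first join order: T is the driving table and T1 is the inner table.
Best:: JoinMethod: Hash
Cost: 195.30 Degree: 1 Resp: 195.30 Card: 51982.00 Bytes: 13
Here we can see that the best join method is a hash join,
with a final cost of 195.30 and a result set of 51982 rows.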
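The figures for the first join order can be cross-checked with a little arithmetic. The join cardinality formula (outer * inner * sel) is printed in the trace itself. The hash join cost decomposition below (outer resc + inner resc + cost per partition * number of partitions) is an assumption: it happens to reproduce the numbers printed in this trace, but it is not taken from Oracle documentation. Function names are illustrative.

```python
def join_cardinality(outer_card: float, inner_card: float, sel: float) -> float:
    """Join Card = outer * inner * sel, as printed in the 10053 trace."""
    return outer_card * inner_card * sel


def hash_join_cost(outer_resc: float, inner_resc: float,
                   cost_per_ptn: float, ptns: int = 1) -> float:
    """Assumed decomposition of the HA Join cost; matches this trace only by observation."""
    return outer_resc + inner_resc + cost_per_ptn * ptns


# Values copied from the ALL_ROWS trace, join order 1 (T outer, T1 inner).
card = join_cardinality(51986.00, 51986.00, 1.9234e-05)
cost = hash_join_cost(outer_resc=164.59, inner_resc=28.13, cost_per_ptn=2.58, ptns=1)

# The trace prints sel with only five significant digits, so the recomputed
# cardinality lands close to, not exactly on, the traced 51982.
print(f"join card ~ {card:.2f} (trace: 51982.00)")
print(f"hash join cost ~ {cost:.2f} (trace: 195.30)")
```

The same decomposition also fits the swapped hash join in this join order, since the three components are simply added in a different order.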
***********************
Join order[2]: T1[T1]#1 T[T]#0
***************
Now joining: T[T]#0
***************
NL Join
Outer table: Card: 51986.00 Cost: 28.13 Resp: 28.13 Degree: 1 Bytes: 4
Inner table: T Alias: T
Access Path: TableScan
NL Join: Cost: 8492985.25 Resp: 8492985.25 Degree: 0
Cost_io: 8358403.00 Cost_cpu: 839649495148
Resp_io: 8358403.00 Resp_cpu: 839649495148
Access Path: index (AllEqJoinGuess)
Index: IND_OBJECT_ID
resc_io: 2.00 resc_cpu: 15913
ix_sel: 1.9239e-05 ix_sel_with_filters: 1.9239e-05
NL Join (ordered): Cost: 104132.73 Resp: 104132.73 Degree: 1
Cost_io: 103999.00 Cost_cpu: 834303785
Resp_io: 103999.00 Resp_cpu: 834303785
Best NL cost: 104132.73
resc: 104132.73 resc_io: 103999.00 resc_cpu: 834303785
resp: 104132.73 resp_io: 103999.00 resp_cpu: 834303785
Join Card: 51982.00 = outer (51986.00) * inner (51986.00) * sel (1.9234e-05)
Join Card - Rounded: 51982 Computed: 51982.00
HA Join
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 164.59 card: 51986.00 bytes: 9 deg: 1 resp: 164.59
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA cost: 195.30
resc: 195.30 resc_io: 189.00 resc_cpu: 39324090
resp: 195.30 resp_io: 189.00 resp_cpu: 39324090
Join order aborted: cost > best plan cost
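Cost calculation for the second join order: T1 is the driving table and T is the inner table.
Join order aborted: cost > best plan cost
The second join order is abandoned because its cost exceeds the best cost already found for the first join order.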
***********************
(newjo-stop-1) k:0, spcnt:0, perm:2, maxperm:2000
*********************************
Number of join permutations tried: 2
*********************************
(newjo-save) [1 0 ]
Final - All Rows Plan: Best join order: 1
Cost: 195.3030 Degree: 1 Card: 51982.0000 Bytes: 675766
Resc: 195.3030 Resc_io: 189.0000 Resc_cpu: 39324090
Resp: 195.3030 Resp_io: 189.0000 Resc_cpu: 39324090
In All Rows mode the optimizer finally chose Best join order: 1 with Cost: 195.3030,
after trying 2 join orders (Number of join permutations tried: 2).
first_rows_1 mode:
***************************************
GENERAL PLANS
***************************************
Considering cardinality-based initial join order.
***********************
Join order[1]: T[T]#0 T1[T1]#1
***************
Now joining: T1[T1]#1
***************
NL Join
Outer table: Card: 51986.00 Cost: 164.59 Resp: 164.59 Degree: 1 Bytes: 9
Inner table: T1 Alias: T1
Access Path: TableScan
NL Join: Cost: 8493121.71 Resp: 8493121.71 Degree: 0
Cost_io: 8358538.00 Cost_cpu: 839658589661
Resp_io: 8358538.00 Resp_cpu: 839658589661
Access Path: index (index (FFS))
Index: IND_T1_OBJECT_ID
resc_io: 25.16 resc_cpu: 7056806
ix_sel: 0.0000e+00 ix_sel_with_filters: 1
Inner table: T1 Alias: T1
Access Path: index (FFS)
NL Join: Cost: 1366740.53 Resp: 1366740.53 Degree: 0
Cost_io: 1307937.00 Cost_cpu: 366871247240
Resp_io: 1307937.00 Resp_cpu: 366871247240
Access Path: index (AllEqJoinGuess)
Index: IND_T1_OBJECT_ID
resc_io: 1.00 resc_cpu: 8371
ix_sel: 1.9239e-05 ix_sel_with_filters: 1.9239e-05
NL Join: Cost: 52220.34 Resp: 52220.34 Degree: 1
Cost_io: 52148.00 Cost_cpu: 451348998
Resp_io: 52148.00 Resp_cpu: 451348998
Best NL cost: 52220.34
resc: 52220.34 resc_io: 52148.00 resc_cpu: 451348998
resp: 52220.34 resp_io: 52148.00 resp_cpu: 451348998
Join Card: 51982.00 = outer (51986.00) * inner (51986.00) * sel (1.9234e-05)
Join Card - Rounded: 51982 Computed: 51982.00
HA Join
Outer table:
resc: 164.59 card 51986.00 bytes: 9 deg: 1 resp: 164.59
Inner table: T1 Alias: T1
resc: 28.13 card: 51986.00 bytes: 4 deg: 1 resp: 28.13
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA Join (swap)
Outer table:
resc: 28.13 card 51986.00 bytes: 4 deg: 1 resp: 28.13
Inner table: T Alias: T
resc: 164.59 card: 51986.00 bytes: 9 deg: 1 resp: 164.59
using dmeth: 2 #groups: 1
Cost per ptn: 2.58 #ptns: 1
hash_area: 0 (max=0) Hash join: Resc: 195.30 Resp: 195.30 [multiMatchCost=0.00]
HA cost: 195.30
resc: 195.30 resc_io: 189.00 resc_cpu: 39324090
resp: 195.30 resp_io: 189.00 resp_cpu: 39324090
Best:: JoinMethod: Hash
Cost: 195.30 Degree: 1 Resp: 195.30 Card: 51982.00 Bytes: 13
***********************
Best so far: Table#: 0 cost: 164.5888 card: 51986.0000 bytes: 467874
Table#: 1 cost: 195.3030 card: 51982.0000 bytes: 675766
*********************************
Number of join permutations tried: 1
*********************************
(newjo-save) [1 0 ]
Final - All Rows Plan: Best join order: 1
Cost: 195.3030 Degree: 1 Card: 51982.0000 Bytes: 675766
Resc: 195.3030 Resc_io: 189.0000 Resc_cpu: 39324090
Resp: 195.3030 Resp_io: 189.0000 Resc_cpu: 39324090
kkoipt: Query block SEL$1 (#0)
******* UNPARSED QUERY IS *******
SELECT /*+ NO_STAR_TRANSFORMATION NO_EXPAND */ "T"."OWNER" "OWNER" FROM "TEST"."T" "T","TEST"."T1" "T1" WHERE "T"."OBJECT_ID"="T1"."OBJECT_ID"
kkoqbc-end
: call(in-use=32712, alloc=49112), compile(in-use=35284, alloc=36696)
First K Rows: K/N ratio = 0.000019237428341, qbc=0x905f2620
First K Rows: Setup end
***********************
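In FIRST_ROWS_1 mode, Oracle first costs one join order under ALL_ROWS (Number of join permutations tried: 1)
to obtain the size of the full result set,
and from that derives the ratio between the 1 row requested by FIRST_ROWS_1 and the total number of rows in the result set:
Join Card - Rounded: 51982 Computed: 51982.00
First K Rows: K/N ratio = 1/51982=0.000019237428341
Using this K/N ratio, Oracle then recomputes the join cost: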
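As a quick check of the ratio above, here is a minimal sketch. The helper name `first_k_ratio` is illustrative, not an Oracle function. Only the K = 1 value is confirmed by the trace shown here; the values for 10 and 100 are plain arithmetic on the same join cardinality, not figures copied from a trace.

```python
def first_k_ratio(k: int, join_card: int) -> float:
    """K/N ratio: the requested first K rows over the full join cardinality N."""
    if k <= 0 or join_card <= 0:
        raise ValueError("k and join_card must be positive")
    return k / join_card


JOIN_CARD = 51982  # Join Card - Rounded, from the ALL_ROWS pass

for k in (1, 10, 100):
    print(f"first_rows_{k}: K/N ratio = {first_k_ratio(k, JOIN_CARD):.15f}")
```

For K = 1 this agrees with the traced value 0.000019237428341 to the digits shown.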
SINGLE TABLE ACCESS PATH (First K Rows)
Table: T Alias: T
Card: Original: 2 Rounded: 2 Computed: 2.00 Non Adjusted: 2.00
Access Path: TableScan
Cost: 2.00 Resp: 2.00 Degree: 0
Cost_io: 2.00 Cost_cpu: 7541
Resp_io: