函数的稳定性状态,简单地说就是相同的输入参数情况下,函数返回值是否相同。稳定性使得优化器可以判断不同函数的行为。为了得到最佳的优化结果,在创建函数时我们应该指定与函数功能相对应稳定性级别。
PostgreSQL 或 KingbaseES 函数在定义时有三种稳定态级别:volatile、stable 和 immutable。默认情况下,创建函数的稳定性为volatile。以下是这三种函数的区别:
KingbaseES 为了兼容Oracle,增加了 Deterministic稳定态,等价于 immutable 。
以下例子同时在PostgreSQL 与 KingbaseES 进行过验证。
结论:等式左边的 volatile 函数可能导致函数不必要的被重复执行。
create or replace function test_volatile(id integer)
returns bigint
volatile
language sql
as
$$ select count(*) from t1 $$ ;
/
create or replace function test_stable(id integer)
returns bigint
stable
language sql
as
$$ select count(*) from t1 $$ ;
/
create or replace function test_immutable(id integer)
returns bigint
immutable
language sql
as
$$ select count(*) from t1 $$ ;
/
例子数据如下:
create table test(id integer);
insert into test select 1 from generate_series(1,10);
可以看到三个函数调用时间基本没有差别,因为,传入的参数是id 变量值(虽然实际值是相同的)。
test=# select count(*) from test where test_volatile(id)=1;
0
(1 row)
Time: 39652.495 ms (00:39.652)
test=# select count(*) from test where test_stable(id)=1;
0
(1 row)
Time: 38789.952 ms (00:38.790)
test=# select count(*) from test where test_immutable(id)=1;
0
(1 row)
Time: 38591.957 ms (00:38.592)
test=# select count(*) from test where id=test_volatile(1);
0
(1 row)
Time: 40353.777 ms (00:40.354)
test=# select count(*) from test where id=test_stable(1);
0
(1 row)
Time: 40500.253 ms (00:40.500)
test=# select count(*) from test where id=test_immutable(1);
0
(1 row)
Time: 374.300 ms
可以看到,对于常量值,stable 和 immutable 类型的函数实际只需调用一次。
test=# select count(*) from test where test_volatile(1)=1;
0
(1 row)
Time: 41647.551 ms (00:41.648)
test=# select count(*) from test where test_stable(1)=1;
0
(1 row)
Time: 399.161 ms
test=# select count(*) from test where test_immutable(1)=1;
0
(1 row)
Time: 389.367 ms
可以看到,对于相同的输入参数, test_immutable 在同一query 只执行一次。
对于for 循环,实际结果相同。
test=# begin
test-# for i in 1..100 loop
test-# perform test_immutable(1);
test-# end loop;
test-# end;
test-# /
ANONYMOUS BLOCK
Time: 1001.184 ms (00:01.001)
来看以下例子:
test=# create table t2(id integer,name char(999));
CREATE TABLE ^
test=# insert into t2 select generate_series(1,100),repeat('a',999);
INSERT 0 100
test=# create index ind_t2 on t2(id);
CREATE INDEX
test=#
test=# analyze t2;
ANALYZE
test=# explain analyze select count(*) from t2 where id=test_volatile(1);
Aggregate (cost=41.25..41.26 rows=1 width=8) (actual time=38735.427..38735.428 rows=1 loops=1)
-> Seq Scan on t2 (cost=0.00..41.25 rows=1 width=0) (actual time=38735.424..38735.425 rows=0 loops=1)
Filter: (id = test_volatile(1))
Rows Removed by Filter: 100
Planning Time: 0.191 ms
Execution Time: 38735.447 ms
(6 rows)
test=# explain analyze select count(*) from t2 where id=test_stable(1);
Aggregate (cost=8.41..8.42 rows=1 width=8) (actual time=385.449..385.450 rows=1 loops=1)
-> Index Only Scan using ind_t2 on t2 (cost=0.39..8.41 rows=1 width=0) (actual time=385.446..385.446 rows=0 loops=1)
Index Cond: (id = test_stable(1))
Heap Fetches: 0
Planning Time: 398.499 ms
Execution Time: 385.487 ms
(6 rows)
test=# explain analyze select count(*) from t2 where id=test_immutable(1);
Aggregate (cost=8.16..8.17 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1)
-> Index Only Scan using ind_t2 on t2 (cost=0.14..8.16 rows=1 width=0) (actual time=0.011..0.011 rows=0 loops=1)
Index Cond: (id = '10000000'::bigint)
Heap Fetches: 0
Planning Time: 383.902 ms
Execution Time: 0.032 ms
(6 rows)
可以看到volatile 情况下,无法使用索引
create table tab1(id1 integer,name1 varchar(9),addr1 text);
create table tab2(id2 integer,name2 varchar(9),addr2 text);
insert into tab1 select generate_series(1,1000000),generate_series(1,1000000),'abc';
insert into tab2 select generate_series(1,1000000),generate_series(1,1000000),'abc';
create index ind_tab1 on tab1(id1);
create index ind_tab2 on tab2(id2);
对于 tab2 表的访问无法使用索引。
test=# \df+ replace
List of functions
Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description
--------+---------+-------------------+---------------------+------+------------+----------+--------+----------+-------------------+----------+---------------------------------+-------------
sys | replace | character varying | text, text, text | func | volatile | safe | system | invoker | | c | ora_replace_text |
test=# explain select id1,name1 from tab1 a ,(select id2,replace(id2,'b','B') name2 from tab2 ) b where a.id1=b.id2 and name1='123';
Hash Join (cost=17935.01..52120.02 rows=1 width=10)
Hash Cond: (tab2.id2 = a.id1)
-> Seq Scan on tab2 (cost=0.00..20435.00 rows=1000000 width=36)
-> Hash (cost=17935.00..17935.00 rows=1 width=10)
-> Seq Scan on tab1 a (cost=0.00..17935.00 rows=1 width=10)
Filter: ((name1)::text = '123'::text)
(6 rows)
不使用子查询情况下的,可以使用索引:
test=# explain select id1,name1,id2,replace(id2,'b','B') name2 from tab1 a,tab2 b where a.id1=b.id2 and name1='123';
Nested Loop (cost=0.42..17943.46 rows=1 width=46)
-> Seq Scan on tab1 a (cost=0.00..17935.00 rows=1 width=10)
Filter: ((name1)::text = '123'::text)
-> Index Only Scan using ind_tab2 on tab2 b (cost=0.42..8.44 rows=1 width=4)
Index Cond: (id2 = a.id1)
(5 rows)
改成immutable 函数后,子查询可以提升,从而能够使用索引。
test=# \df+ replace
List of functions
Schema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description
--------+---------+-------------------+---------------------+------+------------+----------+--------+----------+-------------------+----------+---------------------------------+-------------
sys | replace | character varying | text, text, text | func | immutable | safe | system | invoker | | c | ora_replace_text |
test=# explain select id1,name1 from tab1 a ,(select id2,replace(id2,'b','B') name2 from tab2 ) b where a.id1=b.id2 and name1='123';
Nested Loop (cost=0.42..17943.45 rows=1 width=10)
-> Seq Scan on tab1 a (cost=0.00..17935.00 rows=1 width=10)
Filter: ((name1)::text = '123'::text)
-> Index Only Scan using ind_tab2 on tab2 (cost=0.42..8.44 rows=1 width=4)
Index Cond: (id2 = a.id1)
(5 rows)
函数的稳定性特性不是PostgreSQL 独有的,Oracle 也有同样的机制(默认volatile , 可以加 deterministic),不同的稳定态会影响性能及数据的准确性。用户在创建函数时,必须对了解自己所创建的函数应归属的属性,错误的属性设置可能导致函数返回结果错误,以及函数执行的性能问题。
手机扫一扫
移动阅读更方便
你可能感兴趣的文章