这款存储引擎目前主要是由Cybertec公司推进,因为Postrgesql的表都是以heap形式存储的,MVCC机制中对删除或者更新的行使用的是根据事物号设置可见或者不可见,所以导致Postgresql中的表在频繁更新和删除的情况下,极易膨胀。而zheap则是类似引入了undo的技术,oracle和mysql就是使用此技术,所以不会有表膨胀的情况。有兴趣的同学可以安装试玩一下,生产千万不要用哦,bug很多。

github地址:
https://github.com/cybertecpostgresql/postgres/tree/zheap_undo_record_set

下载后,编译安装
./configure --prefix=/opt/pgsql_zheap --with-pgport=1928 --with-segsize=8 --with-perl --with-python --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --enable-thread-safety
make -j 2 all 
make install
make -C contrib install
测试实例
#启动户数据库后,查看进程,新增了discard worker进程
postgres: undo discard worker

#进入数据库
psql -U zheap -d  postgres
#修改参数并建测试表
postgres=# show default_table_access_method ;
 default_table_access_method 
-----------------------------
 heap
(1 row)

postgres=# SET default_table_access_method TO zheap;
SET

#这里我没有修改shared_buffers,是默认的128MB,可能是设置太小,回滚导致了报错
postgres=# CREATE TABLE t_large (id serial, name text);
CREATE TABLE
postgres=# begin;
BEGIN
postgres=*# INSERT INTO t_large (name) SELECT 'dummy' FROM generate_series(1, 10000000);
INSERT 0 10000000
postgres=*# \d+
                                          List of relations
 Schema |      Name      |   Type   | Owner | Persistence | Access method |    Size    | Description 
--------+----------------+----------+-------+-------------+---------------+------------+-------------
 public | t_large        | table    | zheap | permanent   | zheap         | 193 MB     | 
 public | t_large_id_seq | sequence | zheap | permanent   |               | 8192 bytes | 
(2 rows)

postgres=*# rollback ;
ERROR:  out of memory
DETAIL:  Failed on request of size 104 in memory context "TopTransactionContext".


#更新表
postgres=# begin;
BEGIN
postgres=*# UPDATE t_large SET id = id - 1;
UPDATE 10000000
postgres=*# commit ;
COMMIT
postgres=# \d+
                                          List of relations
 Schema |      Name      |   Type   | Owner | Persistence | Access method |    Size    | Description 
--------+----------------+----------+-------+-------------+---------------+------------+-------------
 public | t_large        | table    | zheap | permanent   | zheap         | 193 MB     | 
 public | t_large_id_seq | sequence | zheap | permanent   |               | 8192 bytes | 
(2 rows)

#可以看到在 base/undo目录产生了一些undo数据,另外还有一个目录pg_undo
zheap@melot-postgresql-csvlog-> du -sh pg_undo/
8.0K    pg_undo/
zheap@melot-postgresql-csvlog-> du -sh base/undo
2.6G    base/undo
#查看目录,base/undo目录在更新的时候,会逐渐增长,而且我这里测试undo使用的空间在更新事物提交后,并没有回收,所以回收undo的动作和提交事物没有关系,回收undo空间也是异步的,介绍说是discard worker进行会清理。
zheap@melot-postgresql-csvlog-> ll pg_undo/
total 8.0K
-rw------- 1 zheap zheap 52 Dec 30 16:19 0000000191AF0428
-rw------- 1 zheap zheap 52 Dec 30 16:24 00000001B481E058
zheap@melot-postgresql-csvlog-> ll base/undo/ | more
total 2.6G
-rw------- 1 zheap zheap 1.0M Dec 30 16:09 000000.0000000000
-rw------- 1 zheap zheap 1.0M Dec 30 16:08 000000.0000100000
-rw------- 1 zheap zheap 1.0M Dec 30 16:08 000000.0000200000
-rw------- 1 zheap zheap 1.0M Dec 30 16:08 000000.0000300000
-rw------- 1 zheap zheap 1.0M Dec 30 16:08 000000.0000400000
-rw------- 1 zheap zheap 1.0M Dec 30 16:08 000000.0000500000
#可以看到以下更新的表无膨胀
postgres=# UPDATE t_large SET id = id - 1;
postgres=# \d+
                                          List of relations
 Schema |      Name      |   Type   | Owner | Persistence | Access method |    Size    | Description 
--------+----------------+----------+-------+-------------+---------------+------------+-------------
 public | t_large        | table    | zheap | permanent   | zheap         | 193 MB     | 
 public | t_large_id_seq | sequence | zheap | permanent   |               | 8192 bytes | 
(2 rows)
postgres=#  UPDATE t_large SET id = id + 2;
UPDATE 10000000
postgres=# \dt+
                                   List of relations
 Schema |  Name   | Type  | Owner | Persistence | Access method |  Size  | Description 
--------+---------+-------+-------+-------------+---------------+--------+-------------
 public | t_large | table | zheap | permanent   | zheap         | 193 MB | 

#在做了vacuum full后,表反而表大了
postgres=#  vacuum full t_large ;
VACUUM
postgres=# \d+
                                          List of relations
 Schema |      Name      |   Type   | Owner | Persistence | Access method |    Size    | Description 
--------+----------------+----------+-------+-------------+---------------+------------+-------------
 public | t_large        | table    | zheap | permanent   | zheap         | 269 MB     | 
 public | t_large_id_seq | sequence | zheap | permanent   |               | 8192 bytes | 
(2 rows)


#增加主键,进行测试
postgres=# alter table t_large add primary key(id);
ALTER TABLE
postgres=# \di+ t_large_pkey
                                          List of relations
 Schema |     Name     | Type  | Owner |  Table  | Persistence | Access method |  Size  | Description 
--------+--------------+-------+-------+---------+-------------+---------------+--------+-------------
 public | t_large_pkey | index | zheap | t_large | permanent   | btree         | 214 MB | 
zheap@melot-postgresql-csvlog-> psql -U zheap -d postgres
psql: error: connection to server on socket "/tmp/.s.PGSQL.1928" failed: FATAL:  out of memory
DETAIL:  Failed on request of size 208 in memory context "CacheMemoryContext".
#更新直接报错内存溢出
postgres=#  UPDATE t_large SET id = id - 1;
ERROR:  out of memory
DETAIL:  Failed on request of size 16 in memory context "ExecutorState".
ERROR:  out of memory
DETAIL:  Failed on request of size 6 in memory context "TopTransactionContext".

#新开一个session,终止执行update的session
postgres=# select * from pg_stat_activity ;
-[ RECORD 1 ]----+-----------------------------------
datid            | 14904
datname          | postgres
pid              | 21824
leader_pid       | 
usesysid         | 10
usename          | zheap
application_name | psql
client_addr      | 
client_hostname  | 
client_port      | -1
backend_start    | 2021-12-30 16:56:39.124222+08
xact_start       | 2021-12-30 16:57:23.093982+08
query_start      | 2021-12-30 16:57:23.093982+08
state_change     | 2021-12-30 16:57:23.093985+08
wait_event_type  | 
wait_event       | 
state            | active
backend_xid      | 751
backend_xmin     | 751
query_id         | 
query            | UPDATE t_large SET id = id - 1;
backend_type     | client backend

postgres=# select pg_terminate_backend(21824);
 pg_terminate_backend 
----------------------
 t
 #终止无效
postgres=# select * from pg_stat_activity where pid=21824;
-[ RECORD 1 ]----+--------------------------------
datid            | 14904
datname          | postgres
pid              | 21824
leader_pid       | 
usesysid         | 10
usename          | zheap
application_name | psql
client_addr      | 
client_hostname  | 
client_port      | -1
backend_start    | 2021-12-30 16:56:39.124222+08
xact_start       | 2021-12-30 16:57:23.093982+08
query_start      | 2021-12-30 16:57:23.093982+08
state_change     | 2021-12-30 16:57:23.093985+08
wait_event_type  | 
wait_event       | 
state            | active
backend_xid      | 751
backend_xmin     | 751
query_id         | 
query            | UPDATE t_large SET id = id - 1;
backend_type     | client backend

从以上测试来看,存在的问题不少,zheap存储引擎还有一段路要走。

参考:
https://www.cybertec-postgresql.com/en/zheap-undo-logs-discarding-in-postgresql/
https://www.cybertec-postgresql.com/en/zheap-reinvented-postgresql-storage/

Logo

魔乐社区(Modelers.cn) 是一个中立、公益的人工智能社区,提供人工智能工具、模型、数据的托管、展示与应用协同服务,为人工智能开发及爱好者搭建开放的学习交流平台。社区通过理事会方式运作,由全产业链共同建设、共同运营、共同享有,推动国产AI生态繁荣发展。

更多推荐