博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen
阅读量:6464 次
发布时间:2019-06-23

本文共 38417 字,大约阅读时间需要 128 分钟。

标签

PostgreSQL , TPC-H , HTAP


背景

去年测的PostgreSQL 10 vs Deepgreen.

PG每年发一个大版本,此次PostgreSQL 11即将release,对PostgreSQL 11 beta3 vs Deepgreen v18 测试的TPC-H结果如下。

事实证明PG的AP能力又有了质的飞越,OLAP已无限接近MPP数据库deepgreen的性能。同时拥有单实例百万级QPS的OLTP能力,应该是目前覆盖业务场景可以做到最广泛的数据库(PG的TAG:全栈数据库 无疑)。

测试 环境

ECS , 32核 , 本地SSD 8TB.

PostgreSQL 11 beta3

Deepgreen v18 (16 segments)

SF=10

deepgreen部署quick guide

以os用户digoal为例, 部署deepgreen(6个segment)

# 部署目录mkdir -p /data01/dp/dbfast0mkdir -p /data01/dp/dbfast1mkdir -p /data01/dp/dbfast2mkdir -p /data02/dp/dbfast1mkdir -p /data02/dp/dbfast2mkdir -p /data03/dp/dbfast1mkdir -p /data03/dp/dbfast2chown -R digoal:digoal /data01/dpchown -R digoal:digoal /data02/dpchown -R digoal:digoal /data03/dp
su - digoalwget  https://s3.amazonaws.com/vitessedata/download/deepgreendb.18.08.rh7.x86_64.180823.bin   chmod  +x deepgreendb.18.08.rh7.x86_64.180823.bin./deepgreendb.18.08.rh7.x86_64.180823.binvi ./deepgreendb/greenplum_path.sh#追加# 使用实际目录export MASTER_DATA_DIRECTORY=/data01/dp/dbfast0/dg-1export PGDATA=$MASTER_DATA_DIRECTORYexport PGHOST=127.0.0.1export PGPORT=15432# 使用os用户名export PGUSER=digoalexport PGPASSWORD=123export PGDATABASE=postgres  . ./deepgreendb/greenplum_path.shecho "`hostname`" > hostfilegpssh-exkeys -h localhostgpssh-exkeys -h `hostname`  cat > cluster.conf <

测试过程

PG 11(未使用分区表), DP v18(使用分区表,1个月1分区).

1、对所有列加索引

do language plpgsql $$    declare      tables name[] := array['customer','lineitem','nation','orders','part','partsupp','region','supplier'];       n name;       x name;       i int;      sql text;      ts1 timestamp;      ts2 timestamp;    begin      set max_parallel_maintenance_workers=24;   -- 创建索引时的并行度            set max_parallel_workers=32;      set max_parallel_workers_per_gather=32;      set maintenance_work_mem='4GB';          set min_parallel_table_scan_size=0;      set min_parallel_index_scan_size=0;      set parallel_setup_cost=0;      set parallel_tuple_cost=0;          foreach n in array tables loop        i := 1;          for x in select attname from pg_attribute where attrelid=n::regclass and attnum>=1 and not attisdropped loop          sql := format('create index IF NOT EXISTS idx_%s__%s on %s (%s) tablespace tbs1', n, i, n, x);   -- 封装创建索引的SQL          ts1 := clock_timestamp();          raise notice '% start execute: %', ts1, sql;          execute sql;  -- 创建索引            commit;  -- pg 11已支持procedure, do里面开子事务        ts2 := clock_timestamp();          raise notice '% end execute: % , duration: %', ts2, sql, ts2-ts1;          i:=i+1;        end loop;        execute format('analyze %s', n);       end loop;    end;    $$;

多轮测试后得到实际使用的索引(包括pk, uk在内), 使用下面的CODE替换之前的所有索引。

select string_agg(pg_get_indexdef(indexrelname::regclass),';') from pg_stat_all_indexes  where relname = any(array['customer','lineitem','nation','orders','part','partsupp','region','supplier']) and idx_scan<>0;
do language plpgsql $$declare  tables name[] := array['customer','lineitem','nation','orders','part','partsupp','region','supplier'];  n name;  idx_sql text;  ts1 timestamp;  ts2 timestamp;  sql text;begin  set max_parallel_maintenance_workers=24;   -- 创建索引时的并行度  set max_parallel_workers=32;  set max_parallel_workers_per_gather=32;  set maintenance_work_mem='4GB';  set min_parallel_table_scan_size=0;  set min_parallel_index_scan_size=0;  set parallel_setup_cost=0;  set parallel_tuple_cost=0;  -- set default_tablespace='tbs1';  -- 是否使用其他表空间  idx_sql := 'CREATE UNIQUE INDEX IF NOT EXISTS customer_pkey ON public.customer USING btree (c_custkey);CREATE INDEX IF NOT EXISTS idx1 ON public.lineitem USING btree (l_orderkey) WHERE (l_commitdate < l_receiptdate);CREATE INDEX IF NOT EXISTS idx_lineitem__11 ON public.lineitem USING btree (l_shipdate);CREATE INDEX IF NOT EXISTS idx_lineitem__2 ON public.lineitem USING btree (l_partkey);CREATE INDEX IF NOT EXISTS idx_lineitem__3 ON public.lineitem USING btree (l_suppkey);CREATE INDEX IF NOT EXISTS idx_lineitem_orderkey ON public.lineitem USING btree (l_orderkey);CREATE INDEX IF NOT EXISTS idx_lineitem_shipdate ON public.lineitem USING btree (l_shipdate, l_discount, l_quantity);CREATE UNIQUE INDEX IF NOT EXISTS lineitem_pkey ON public.lineitem USING btree (l_orderkey, l_linenumber);CREATE INDEX IF NOT EXISTS idx6 ON public.nation USING btree (n_nationkey, n_regionkey);CREATE INDEX IF NOT EXISTS idx_nation_regionkey ON public.nation USING btree (n_regionkey);CREATE UNIQUE INDEX IF NOT EXISTS nation_pkey ON public.nation USING btree (n_nationkey);CREATE INDEX IF NOT EXISTS idx_orders__6 ON public.orders USING btree (o_orderpriority);CREATE INDEX IF NOT EXISTS idx_orders_custkey ON public.orders USING btree (o_custkey);CREATE INDEX IF NOT EXISTS idx_orders_orderdate ON public.orders USING btree (o_orderdate);CREATE UNIQUE INDEX IF NOT EXISTS orders_pkey ON public.orders USING btree (o_orderkey);CREATE INDEX IF NOT EXISTS idx_part_1 ON public.part USING btree (p_container, p_brand);CREATE INDEX IF NOT EXISTS idx_part__5 ON public.part USING btree (p_type);CREATE INDEX IF NOT EXISTS idx_part__6 ON public.part USING btree (p_size);CREATE UNIQUE INDEX IF NOT EXISTS part_pkey ON public.part USING btree (p_partkey);CREATE INDEX IF NOT EXISTS idx ON public.partsupp USING btree (ps_partkey, ps_suppkey, ps_supplycost);CREATE INDEX IF NOT EXISTS idx_partsupp__4 ON public.partsupp USING btree (ps_supplycost);CREATE INDEX IF NOT EXISTS idx_partsupp_partkey ON public.partsupp USING btree (ps_partkey);CREATE INDEX IF NOT EXISTS idx_partsupp_suppkey ON public.partsupp USING btree (ps_suppkey);CREATE UNIQUE INDEX IF NOT EXISTS partsupp_pkey ON public.partsupp USING btree (ps_partkey, ps_suppkey);CREATE INDEX IF NOT EXISTS idx5 ON public.region USING btree (r_name, r_regionkey);CREATE UNIQUE INDEX IF NOT EXISTS region_pkey ON public.region USING btree (r_regionkey);CREATE INDEX IF NOT EXISTS idx4 ON public.supplier USING btree (s_suppkey, s_nationkey);CREATE INDEX IF NOT EXISTS idx_supplier_nation_key ON public.supplier USING btree (s_nationkey);CREATE UNIQUE INDEX IF NOT EXISTS supplier_pkey ON public.supplier USING btree (s_suppkey);';  for sql in select regexp_split_to_table(idx_sql, ';')   loop    ts1 := clock_timestamp();    raise notice '% start execute: %', ts1, sql;    execute sql;     commit;  -- pg 11已支持procedure, do里面开子事务    ts2 := clock_timestamp();    raise notice '% end execute: % , duration: %', ts2, sql, ts2-ts1;  end loop;    foreach n in array tables loop    execute format('analyze %s', n);  end loop;end;$$;

2、强制测试过程中,使用并行查询

alter role postgres set enable_nestloop=off;  alter role postgres set enable_mergejoin=off;  alter role postgres set work_mem ='8GB';    alter role postgres set max_parallel_workers_per_gather=24;  alter role postgres set parallel_setup_cost=0;  alter role postgres set parallel_tuple_cost=0;  alter role postgres set min_parallel_table_scan_size=0;  alter role postgres set min_parallel_index_scan_size=0;  alter role postgres set geqo_threshold=100;  alter table customer set (parallel_workers=24);  alter table lineitem set (parallel_workers=24);  alter table nation set (parallel_workers=24);  alter table orders set (parallel_workers=24);  alter table part set (parallel_workers=24);  alter table partsupp set (parallel_workers=24);  alter table region set (parallel_workers=24);  alter table supplier set (parallel_workers=24);

3、Q17,加了一个索引

create index idx_part_1 on part (p_container,p_brand);

4、PG 11校对了成本因子

postgresql.auto.confcpu_tuple_cost=0.00018884145574257426  cpu_index_tuple_cost = 0.00433497085216479990  cpu_operator_cost = 0.00216748542608239995  seq_page_cost=0.014329  random_page_cost =0.016
postgresql.conflisten_addresses = '0.0.0.0'port = 1921max_connections = 2000superuser_reserved_connections = 3unix_socket_directories = '., /var/run/postgresql, /tmp'tcp_keepalives_idle = 60tcp_keepalives_interval = 10tcp_keepalives_count = 10shared_buffers = 48GBmax_prepared_transactions = 2000work_mem = 4GBmaintenance_work_mem = 2GBdynamic_shared_memory_type = posixvacuum_cost_delay = 0bgwriter_delay = 10msbgwriter_lru_maxpages = 1000bgwriter_lru_multiplier = 10.0effective_io_concurrency = 4max_worker_processes = 128max_parallel_maintenance_workers = 16max_parallel_workers_per_gather = 24parallel_leader_participation = onmax_parallel_workers = 32wal_level = minimal  synchronous_commit = offwal_writer_delay = 10mscheckpoint_timeout = 35minmax_wal_size = 96GBmin_wal_size = 24GBcheckpoint_completion_target = 0.1max_wal_senders = 0random_page_cost = 1.1effective_cache_size = 400GBdefault_statistics_target = 1000log_destination = 'csvlog'logging_collector = onlog_directory = 'log'log_filename = 'postgresql-%a.log'log_truncate_on_rotation = onlog_rotation_age = 1dlog_rotation_size = 0log_checkpoints = on log_connections = onlog_disconnections = onlog_error_verbosity = verbose log_line_prefix = '%m [%p] 'log_timezone = 'PRC'log_autovacuum_min_duration = 0autovacuum_freeze_max_age = 1200000000autovacuum_multixact_freeze_max_age = 1400000000autovacuum_vacuum_cost_delay = 0msvacuum_freeze_table_age = 1150000000vacuum_multixact_freeze_table_age = 1150000000datestyle = 'iso, mdy'timezone = 'PRC'lc_messages = 'C'lc_monetary = 'C'lc_numeric = 'C'lc_time = 'C'default_text_search_config = 'pg_catalog.english'

5、deepgreen v18

gp_vmem_protect_limit=16384;

TPC-H (SF=10) 性能对比

数据库 q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22
PostgreSQL 10 17 2 41 6 14 4 27 31 54 16 7 100 213 4 25 60 552 1317 11 82 95 143
PostgreSQL 11beta3 3 2 2 5 3 4 2 2 9 3 1 2 4 2 6 10 27 46 2 10 13 2
DeepGreen v16 6 4 10 4 7 1 27 10 283 5 2 5 14 1 2 5 31 53 6 10 246 7
DeepGreen v18 2 1 1 1 1 0 3 1 3 2 1 2 2 0 0 1 5 3 2 1 3 1

SF=200

SF=200 Deepgreen on ECS(64c, 512G, 8*2TB(ssd) lvm strip)(33 segments)

2018-08-25 15:03:59 [1535180639] : running TPC-H benchmark2018-08-25 15:03:59 [1535180639] : running queries defined in TPC-H benchmark2018-08-25 15:03:59 [1535180639] :   running query 12018-08-25 15:03:59 [1535180639] : run explain2018-08-25 15:03:59 [1535180639] : run the query on background2018-08-25 15:04:12 [1535180652] :     query 1 finished OK (12 seconds)2018-08-25 15:04:12 [1535180652] :   running query 22018-08-25 15:04:12 [1535180652] : run explain2018-08-25 15:04:12 [1535180652] : run the query on background2018-08-25 15:04:15 [1535180655] :     query 2 finished OK (3 seconds)2018-08-25 15:04:15 [1535180655] :   running query 32018-08-25 15:04:15 [1535180655] : run explain2018-08-25 15:04:15 [1535180655] : run the query on background2018-08-25 15:04:27 [1535180667] :     query 3 finished OK (11 seconds)2018-08-25 15:04:27 [1535180667] :   running query 42018-08-25 15:04:27 [1535180667] : run explain2018-08-25 15:04:27 [1535180667] : run the query on background2018-08-25 15:04:34 [1535180674] :     query 4 finished OK (7 seconds)2018-08-25 15:04:34 [1535180674] :   running query 52018-08-25 15:04:34 [1535180674] : run explain2018-08-25 15:04:34 [1535180674] : run the query on background2018-08-25 15:04:47 [1535180687] :     query 5 finished OK (12 seconds)2018-08-25 15:04:47 [1535180687] :   running query 62018-08-25 15:04:47 [1535180687] : run explain2018-08-25 15:04:47 [1535180687] : run the query on background2018-08-25 15:04:48 [1535180688] :     query 6 finished OK (1 seconds)2018-08-25 15:04:48 [1535180688] :   running query 72018-08-25 15:04:48 [1535180688] : run explain2018-08-25 15:04:48 [1535180688] : run the query on background2018-08-25 15:05:21 [1535180721] :     query 7 finished OK (32 seconds)2018-08-25 15:05:21 [1535180721] :   running query 82018-08-25 15:05:21 [1535180721] : run explain2018-08-25 15:05:21 [1535180721] : run the query on background2018-08-25 15:05:32 [1535180732] :     query 8 finished OK (11 seconds)2018-08-25 15:05:32 [1535180732] :   running query 92018-08-25 15:05:32 [1535180732] : run explain2018-08-25 15:05:32 [1535180732] : run the query on background2018-08-25 15:06:02 [1535180762] :     query 9 finished OK (29 seconds)2018-08-25 15:06:02 [1535180762] :   running query 102018-08-25 15:06:02 [1535180762] : run explain2018-08-25 15:06:02 [1535180762] : run the query on background2018-08-25 15:06:11 [1535180771] :     query 10 finished OK (9 seconds)2018-08-25 15:06:11 [1535180771] :   running query 112018-08-25 15:06:11 [1535180771] : run explain2018-08-25 15:06:11 [1535180771] : run the query on background2018-08-25 15:06:14 [1535180774] :     query 11 finished OK (2 seconds)2018-08-25 15:06:14 [1535180774] :   running query 122018-08-25 15:06:14 [1535180774] : run explain2018-08-25 15:06:14 [1535180774] : run the query on background2018-08-25 15:06:23 [1535180783] :     query 12 finished OK (9 seconds)2018-08-25 15:06:23 [1535180783] :   running query 132018-08-25 15:06:23 [1535180783] : run explain2018-08-25 15:06:23 [1535180783] : run the query on background2018-08-25 15:06:41 [1535180801] :     query 13 finished OK (17 seconds)2018-08-25 15:06:41 [1535180801] :   running query 142018-08-25 15:06:41 [1535180801] : run explain2018-08-25 15:06:41 [1535180801] : run the query on background2018-08-25 15:06:42 [1535180802] :     query 14 finished OK (1 seconds)2018-08-25 15:06:42 [1535180802] :   running query 152018-08-25 15:06:42 [1535180802] : run explain2018-08-25 15:06:42 [1535180802] : run the query on background2018-08-25 15:06:46 [1535180806] :     query 15 finished OK (4 seconds)2018-08-25 15:06:46 [1535180806] :   running query 162018-08-25 15:06:46 [1535180806] : run explain2018-08-25 15:06:46 [1535180806] : run the query on background2018-08-25 15:06:56 [1535180816] :     query 16 finished OK (9 seconds)2018-08-25 15:06:56 [1535180816] :   running query 172018-08-25 15:06:56 [1535180816] : run explain2018-08-25 15:06:56 [1535180816] : run the query on background2018-08-25 15:08:06 [1535180886] :     query 17 finished OK (69 seconds)2018-08-25 15:08:06 [1535180886] :   running query 182018-08-25 15:08:06 [1535180886] : run explain2018-08-25 15:08:06 [1535180886] : run the query on background2018-08-25 15:08:30 [1535180910] :     query 18 finished OK (24 seconds)2018-08-25 15:08:30 [1535180910] :   running query 192018-08-25 15:08:30 [1535180910] : run explain2018-08-25 15:08:31 [1535180911] : run the query on background2018-08-25 15:08:42 [1535180922] :     query 19 finished OK (11 seconds)2018-08-25 15:08:42 [1535180922] :   running query 202018-08-25 15:08:42 [1535180922] : run explain2018-08-25 15:08:42 [1535180922] : run the query on background2018-08-25 15:08:55 [1535180936] :     query 20 finished OK (13 seconds)2018-08-25 15:08:56 [1535180936] :   running query 212018-08-25 15:08:56 [1535180936] : run explain2018-08-25 15:08:56 [1535180936] : run the query on background2018-08-25 15:09:16 [1535180956] :     query 21 finished OK (20 seconds)2018-08-25 15:09:16 [1535180956] :   running query 222018-08-25 15:09:16 [1535180956] : run explain2018-08-25 15:09:16 [1535180956] : run the query on background2018-08-25 15:09:26 [1535180966] :     query 22 finished OK (9 seconds)2018-08-25 15:09:26 [1535180966] : finished TPC-H benchmark

SF=200 PG 11

使用HINT改几条慢SQL。

gp_tpch/dss/queries/2.sql17.sql18.sql

Q2

/*+ Leading((partsupp ((nation region) supplier))) Leading((part (((nation region) supplier) partsupp))) */         select        s_acctbal,        s_name,        n_name,        p_partkey,        p_mfgr,        s_address,        s_phone,        s_commentfrom        part,        supplier,        partsupp,        nation,        regionwhere        p_partkey = ps_partkey        and s_suppkey = ps_suppkey        and p_size = 47        and p_type like '%NICKEL'        and s_nationkey = n_nationkey        and n_regionkey = r_regionkey        and r_name = 'ASIA'        and ps_supplycost = (                select                        min(ps_supplycost)                from                        partsupp,                        supplier,                        nation,                        region                where                        p_partkey = ps_partkey                        and s_suppkey = ps_suppkey                        and s_nationkey = n_nationkey                        and n_regionkey = r_regionkey                        and r_name = 'ASIA'        )order by        s_acctbal desc,        n_name,        s_name,        p_partkeyLIMIT 100;
QUERY PLAN                                                                 ------------------------------------------------------------------------------------------------------------------------------------------- Limit  (cost=4589445377.60..4589445377.60 rows=1 width=198)   ->  Sort  (cost=4589445377.60..4589445377.60 rows=1 width=198)         Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey         ->  Hash Join  (cost=225752.90..4589445377.59 rows=1 width=198)               Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))               ->  Gather  (cost=0.00..18775.09 rows=158816 width=34)                     Workers Planned: 28                     ->  Parallel Seq Scan on part  (cost=0.00..18775.09 rows=5672 width=34)                           Filter: (((p_type)::text ~~ '%NICKEL'::text) AND (p_size = 47))               ->  Hash  (cost=69355.84..69355.84 rows=31998515 width=180)                     ->  Gather  (cost=69220.57..69355.84 rows=31998515 width=180)                           Workers Planned: 28                           ->  Parallel Hash Join  (cost=69220.57..69355.84 rows=1142804 width=180)                                 Hash Cond: (supplier.s_suppkey = partsupp.ps_suppkey)                                 ->  Parallel Hash Join  (cost=858.16..861.87 rows=14286 width=172)                                       Hash Cond: (nation.n_nationkey = supplier.s_nationkey)                                       ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=34)                                             Hash Cond: (nation.n_regionkey = region.r_regionkey)                                             ->  Parallel Seq Scan on nation  (cost=0.00..0.01 rows=1 width=42)                                             ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)                                                   ->  Parallel Seq Scan on region  (cost=0.00..0.01 rows=1 width=8)                                                         Filter: (r_name = 'ASIA'::bpchar)                                       ->  Parallel Hash  (cost=689.83..689.83 rows=71429 width=154)                                             ->  Parallel Seq Scan on supplier  (cost=0.00..689.83 rows=71429 width=154)                                 ->  Parallel Hash  (cost=54898.31..54898.31 rows=5714021 width=24)                                       ->  Parallel Seq Scan on partsupp  (cost=0.00..54898.31 rows=5714021 width=24)               SubPlan 1                 ->  Aggregate  (cost=1806.01..1806.01 rows=1 width=8)                       ->  Hash Join  (cost=1804.89..1806.00 rows=7 width=8)                             Hash Cond: (partsupp_1.ps_suppkey = supplier_1.s_suppkey)                             ->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1  (cost=0.49..1.48 rows=37 width=16)                                   Index Cond: (part.p_partkey = ps_partkey)                             ->  Hash  (cost=861.87..861.87 rows=400000 width=8)                                   ->  Gather  (cost=858.16..861.87 rows=400000 width=8)                                         Workers Planned: 28                                         ->  Parallel Hash Join  (cost=858.16..861.87 rows=14286 width=8)                                               Hash Cond: (nation_1.n_nationkey = supplier_1.s_nationkey)                                               ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=8)                                                     Hash Cond: (nation_1.n_regionkey = region_1.r_regionkey)                                                     ->  Parallel Seq Scan on nation nation_1  (cost=0.00..0.01 rows=1 width=16)                                                     ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)                                                           ->  Parallel Seq Scan on region region_1  (cost=0.00..0.01 rows=1 width=8)                                                                 Filter: (r_name = 'ASIA'::bpchar)                                               ->  Parallel Hash  (cost=689.83..689.83 rows=71429 width=16)                                                     ->  Parallel Seq Scan on supplier supplier_1  (cost=0.00..689.83 rows=71429 width=16) JIT:   Functions: 98   Inlining: true   Optimization: true(49 rows)

更新

Q2 建议加这个索引, 使用nestloop join

create index idx on partsupp (ps_partkey,ps_suppkey,ps_supplycost);set enable_nestloop=on;or/*+ Set(enable_nestloop on) */ select                                                                                                                   s_acctbal,        s_name,           n_name,        p_partkey,        p_mfgr,           s_address,        s_phone,          s_commentfrom                     part,        supplier,        partsupp,        nation,          region where                 p_partkey = ps_partkey        and s_suppkey = ps_suppkey        and p_size = 47                   and p_type like '%NICKEL'        and s_nationkey = n_nationkey        and n_regionkey = r_regionkey        and r_name = 'ASIA'                  and ps_supplycost = (                select                               min(ps_supplycost)                from                                              partsupp,                        supplier,                        nation,                          region                 where                                 p_partkey = ps_partkey                        and s_suppkey = ps_suppkey                        and s_nationkey = n_nationkey                        and n_regionkey = r_regionkey                        and r_name = 'ASIA'                  )                                  order by         s_acctbal desc,        n_name,                s_name,        p_partkeyLIMIT 100;   Time: 5345.093 ms (00:05.345)                                                                        QUERY PLAN                                                                        ---------------------------------------------------------------------------------------------------------------------------------------------------------- Limit  (cost=2807136.16..2807136.16 rows=1 width=198)   ->  Sort  (cost=2807136.16..2807136.16 rows=1 width=198)         Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey         ->  Nested Loop  (cost=17.90..2807136.15 rows=1 width=198)               Join Filter: (nation.n_nationkey = supplier.s_nationkey)               ->  Nested Loop  (cost=17.88..2807135.97 rows=6 width=180)                     ->  Nested Loop  (cost=17.51..2807133.70 rows=6 width=42)                           ->  Index Scan using idx_part__6 on part  (cost=0.49..15906.04 rows=160623 width=34)                                 Index Cond: (p_size = 47)                                 Filter: ((p_type)::text ~~ '%NICKEL'::text)                           ->  Index Only Scan using idx on partsupp  (cost=17.02..17.38 rows=1 width=24)                                 Index Cond: ((ps_partkey = part.p_partkey) AND (ps_supplycost = (SubPlan 1)))                                 SubPlan 1                                   ->  Aggregate  (cost=16.52..16.52 rows=1 width=8)                                         ->  Hash Join  (cost=0.91..16.51 rows=8 width=8)                                               Hash Cond: (supplier_1.s_nationkey = nation_1.n_nationkey)                                               ->  Nested Loop  (cost=0.86..16.34 rows=38 width=16)                                                     ->  Index Scan using idx_partsupp_partkey on partsupp partsupp_1  (cost=0.49..0.78 rows=38 width=16)                                                           Index Cond: (part.p_partkey = ps_partkey)                                                     ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.37..0.41 rows=1 width=16)                                                           Index Cond: (s_suppkey = partsupp_1.ps_suppkey)                                               ->  Hash  (cost=0.03..0.03 rows=5 width=8)                                                     ->  Gather  (cost=0.02..0.03 rows=5 width=8)                                                           Workers Planned: 28                                                           ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=8)                                                                 Hash Cond: (nation_1.n_regionkey = region_1.r_regionkey)                                                                 ->  Parallel Seq Scan on nation nation_1  (cost=0.00..0.01 rows=1 width=16)                                                                 ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)                                                                       ->  Parallel Seq Scan on region region_1  (cost=0.00..0.01 rows=1 width=8)                                                                             Filter: (r_name = 'ASIA'::bpchar)                     ->  Index Scan using supplier_pkey on supplier  (cost=0.37..0.38 rows=1 width=154)                           Index Cond: (s_suppkey = partsupp.ps_suppkey)               ->  Materialize  (cost=0.02..0.06 rows=5 width=34)                     ->  Gather  (cost=0.02..0.03 rows=5 width=34)                           Workers Planned: 28                           ->  Parallel Hash Join  (cost=0.02..0.03 rows=1 width=34)                                 Hash Cond: (nation.n_regionkey = region.r_regionkey)                                 ->  Parallel Seq Scan on nation  (cost=0.00..0.01 rows=1 width=42)                                 ->  Parallel Hash  (cost=0.01..0.01 rows=1 width=8)                                       ->  Parallel Seq Scan on region  (cost=0.00..0.01 rows=1 width=8)                                             Filter: (r_name = 'ASIA'::bpchar) JIT:   Functions: 67   Inlining: true   Optimization: true(45 rows)

Q4

create index idx1 on lineitem (l_orderkey) where l_commitdate < l_receiptdate;

Q17

/*+ Set(work_mem 12GB) Set(max_parallel_workers_per_gather 28) */ explain select        sum(l_extendedprice) / 7.0 as avg_yearlyfrom        lineitem,        part,        (SELECT l_partkey AS agg_partkey, 0.2 * avg(l_quantity) AS avg_quantity FROM lineitem GROUP BY l_partkey) part_aggwhere        p_partkey = l_partkey        and agg_partkey = l_partkey        and p_brand = 'Brand#11'        and p_container = 'SM PKG'        and l_quantity < avg_quantityLIMIT 1;
QUERY PLAN                                                             ----------------------------------------------------------------------------------------------------------------------------------- Limit  (cost=6130644.31..6130644.31 rows=1 width=8)   ->  Aggregate  (cost=6130644.31..6130644.31 rows=1 width=8)         ->  Hash Join  (cost=5647079.65..6129794.97 rows=391854 width=8)               Hash Cond: (part.p_partkey = lineitem_1.l_partkey)               Join Filter: (lineitem.l_quantity < (('0.2'::double precision * avg(lineitem_1.l_quantity))))               ->  Gather  (cost=11834.33..491832.73 rows=1193799 width=32)                     Workers Planned: 28                     ->  Parallel Hash Join  (cost=11834.33..491832.73 rows=42636 width=32)                           Hash Cond: (lineitem.l_partkey = part.p_partkey)                           ->  Parallel Seq Scan on lineitem  (cost=0.00..382461.00 rows=42857385 width=24)                           ->  Parallel Hash  (cost=11830.99..11830.99 rows=1421 width=8)                                 ->  Parallel Bitmap Heap Scan on part  (cost=427.06..11830.99 rows=1421 width=8)                                       Recheck Cond: ((p_container = 'SM PKG'::bpchar) AND (p_brand = 'Brand#11'::bpchar))                                       ->  Bitmap Index Scan on idx_part_1  (cost=0.00..418.44 rows=39794 width=0)                                             Index Cond: ((p_container = 'SM PKG'::bpchar) AND (p_brand = 'Brand#11'::bpchar))               ->  Hash  (cost=5542429.84..5542429.84 rows=39389900 width=16)                     ->  Finalize HashAggregate  (cost=5356798.88..5534991.39 rows=39389900 width=16)                           Group Key: lineitem_1.l_partkey                           ->  Gather  (cost=568246.52..575684.96 rows=1102917200 width=40)                                 Workers Planned: 28                                 ->  Partial HashAggregate  (cost=568246.52..575684.96 rows=39389900 width=40)                                       Group Key: lineitem_1.l_partkey                                       ->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..382461.00 rows=42857385 width=16) JIT:   Functions: 44   Inlining: true   Optimization: true(27 rows)

Q18

/*+ Leading( ((lineitem (customer orders)) ANY_subquery) ) */  explain select        c_name,        c_custkey,        o_orderkey,        o_orderdate,        o_totalprice,        sum(l_quantity)from        customer,        orders,        lineitemwhere        o_orderkey in (                select                        l_orderkey                from                        lineitem                group by                        l_orderkey having                                sum(l_quantity) > 312        )        and c_custkey = o_custkey        and o_orderkey = l_orderkeygroup by        c_name,        c_custkey,        o_orderkey,        o_orderdate,        o_totalpriceorder by        o_totalprice desc,        o_orderdateLIMIT 100;
QUERY PLAN                                                                ----------------------------------------------------------------------------------------------------------------------------------------- Limit  (cost=5094012.87..5094013.09 rows=100 width=55)   ->  Sort  (cost=5094012.87..5104543.55 rows=4858479 width=55)         Sort Key: orders.o_totalprice DESC, orders.o_orderdate         ->  HashAggregate  (cost=4932105.34..4933022.83 rows=4858479 width=55)               Group Key: customer.c_custkey, orders.o_orderkey               ->  Hash Join  (cost=1182913.25..4900513.30 rows=4858479 width=55)                     Hash Cond: (orders.o_orderkey = lineitem_1.l_orderkey)                     ->  Gather  (cost=160021.94..1146574.91 rows=1200006784 width=63)                           Workers Planned: 28                           ->  Parallel Hash Join  (cost=160021.94..1146574.91 rows=42857385 width=63)                                 Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)                                 ->  Parallel Seq Scan on lineitem  (cost=0.00..382461.00 rows=42857385 width=16)                                 ->  Parallel Hash  (cost=134775.97..134775.97 rows=10714120 width=47)                                       ->  Parallel Hash Join  (cost=108506.79..134775.97 rows=10714120 width=47)                                             Hash Cond: (customer.c_custkey = orders.o_custkey)                                             ->  Parallel Seq Scan on customer  (cost=0.00..11101.38 rows=1071468 width=27)                                             ->  Parallel Hash  (cost=83260.82..83260.82 rows=10714120 width=28)                                                   ->  Parallel Seq Scan on orders  (cost=0.00..83260.82 rows=10714120 width=28)                     ->  Hash  (cost=1020029.33..1020029.33 rows=1214594 width=8)                           ->  Finalize HashAggregate  (cost=1011214.02..1019799.96 rows=1214594 width=8)                                 Group Key: lineitem_1.l_orderkey                                 Filter: (sum(lineitem_1.l_quantity) > '312'::double precision)                                 ->  Gather  (cost=568246.52..568934.61 rows=102025924 width=16)                                       Workers Planned: 28                                       ->  Partial HashAggregate  (cost=568246.52..568934.61 rows=3643783 width=16)                                             Group Key: lineitem_1.l_orderkey                                             ->  Parallel Seq Scan on lineitem lineitem_1  (cost=0.00..382461.00 rows=42857385 width=16)(27 rows)

SF=200 pg 11 结果

2018-08-25 19:46:22 [1535197582] : running TPC-H benchmark2018-08-25 19:46:22 [1535197582] : running queries defined in TPC-H benchmark2018-08-25 19:46:22 [1535197582] :   running query 12018-08-25 19:46:22 [1535197582] : run explain2018-08-25 19:46:22 [1535197582] : run the query on background2018-08-25 19:46:40 [1535197600] :     query 1 finished OK (18 seconds)2018-08-25 19:46:40 [1535197600] :   running query 22018-08-25 19:46:40 [1535197600] : run explain2018-08-25 19:46:40 [1535197600] : run the query on background2018-08-25 19:47:19 [1535197639] :     query 2 finished OK (38 seconds)2018-08-25 19:47:19 [1535197639] :   running query 32018-08-25 19:47:19 [1535197639] : run explain2018-08-25 19:47:21 [1535197641] : run the query on background2018-08-25 19:47:46 [1535197666] :     query 3 finished OK (25 seconds)2018-08-25 19:47:46 [1535197666] :   running query 42018-08-25 19:47:46 [1535197666] : run explain2018-08-25 19:47:47 [1535197667] : run the query on background2018-08-25 19:48:19 [1535197699] :     query 4 finished OK (32 seconds)2018-08-25 19:48:19 [1535197699] :   running query 52018-08-25 19:48:19 [1535197699] : run explain2018-08-25 19:48:19 [1535197699] : run the query on background2018-08-25 19:49:16 [1535197756] :     query 5 finished OK (57 seconds)2018-08-25 19:49:17 [1535197757] :   running query 62018-08-25 19:49:17 [1535197757] : run explain2018-08-25 19:49:17 [1535197757] : run the query on background2018-08-25 19:49:25 [1535197765] :     query 6 finished OK (8 seconds)2018-08-25 19:49:25 [1535197765] :   running query 72018-08-25 19:49:25 [1535197765] : run explain2018-08-25 19:49:25 [1535197765] : run the query on background2018-08-25 19:50:17 [1535197817] :     query 7 finished OK (52 seconds)2018-08-25 19:50:17 [1535197817] :   running query 82018-08-25 19:50:18 [1535197818] : run explain2018-08-25 19:50:18 [1535197818] : run the query on background2018-08-25 19:50:42 [1535197842] :     query 8 finished OK (24 seconds)2018-08-25 19:50:42 [1535197842] :   running query 92018-08-25 19:50:42 [1535197842] : run explain2018-08-25 19:50:42 [1535197842] : run the query on background2018-08-25 19:51:49 [1535197909] :     query 9 finished OK (66 seconds)2018-08-25 19:51:49 [1535197909] :   running query 102018-08-25 19:51:49 [1535197909] : run explain2018-08-25 19:51:49 [1535197909] : run the query on background2018-08-25 19:52:27 [1535197947] :     query 10 finished OK (38 seconds)2018-08-25 19:52:27 [1535197947] :   running query 112018-08-25 19:52:27 [1535197947] : run explain2018-08-25 19:52:28 [1535197948] : run the query on background2018-08-25 19:52:52 [1535197972] :     query 11 finished OK (24 seconds)2018-08-25 19:52:52 [1535197972] :   running query 122018-08-25 19:52:52 [1535197972] : run explain2018-08-25 19:52:52 [1535197972] : run the query on background2018-08-25 19:53:18 [1535197998] :     query 12 finished OK (26 seconds)2018-08-25 19:53:18 [1535197998] :   running query 132018-08-25 19:53:18 [1535197998] : run explain2018-08-25 19:53:19 [1535197999] : run the query on background2018-08-25 19:54:58 [1535198098] :     query 13 finished OK (98 seconds)2018-08-25 19:54:58 [1535198098] :   running query 142018-08-25 19:54:58 [1535198098] : run explain2018-08-25 19:54:58 [1535198098] : run the query on background2018-08-25 19:55:11 [1535198111] :     query 14 finished OK (13 seconds)2018-08-25 19:55:11 [1535198111] :   running query 152018-08-25 19:55:11 [1535198111] : run explain2018-08-25 19:55:11 [1535198111] : run the query on background2018-08-25 19:56:10 [1535198170] :     query 15 finished OK (58 seconds)2018-08-25 19:56:10 [1535198170] :   running query 162018-08-25 19:56:10 [1535198170] : run explain2018-08-25 19:56:10 [1535198170] : run the query on background2018-08-25 19:58:05 [1535198285] :     query 16 finished OK (114 seconds)2018-08-25 19:58:05 [1535198285] :   running query 172018-08-25 19:58:05 [1535198285] : run explain2018-08-25 19:58:07 [1535198287] : run the query on background2018-08-25 20:10:21 [1535199021] :     query 17 finished OK (732 seconds)2018-08-25 20:10:21 [1535199021] :   running query 182018-08-25 20:10:21 [1535199021] : run explain2018-08-25 20:10:22 [1535199022] : run the query on background2018-08-25 20:20:19 [1535199619] :     query 18 finished OK (595 seconds)2018-08-25 20:20:19 [1535199619] :   running query 192018-08-25 20:20:19 [1535199619] : run explain2018-08-25 20:20:19 [1535199619] : run the query on background2018-08-25 20:20:31 [1535199631] :     query 19 finished OK (12 seconds)2018-08-25 20:20:31 [1535199631] :   running query 202018-08-25 20:20:31 [1535199631] : run explain2018-08-25 20:20:34 [1535199634] : run the query on background2018-08-25 20:24:08 [1535199848] :     query 20 finished OK (213 seconds)2018-08-25 20:24:08 [1535199848] :   running query 212018-08-25 20:24:08 [1535199848] : run explain2018-08-25 20:24:08 [1535199848] : run the query on background2018-08-25 20:26:13 [1535199973] :     query 21 finished OK (124 seconds)2018-08-25 20:26:13 [1535199973] :   running query 222018-08-25 20:26:13 [1535199973] : run explain2018-08-25 20:26:14 [1535199974] : run the query on background2018-08-25 20:26:28 [1535199988] :     query 22 finished OK (14 seconds)2018-08-25 20:26:28 [1535199988] : finished TPC-H benchmark

SF=200

数据库 q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 q19 q20 q21 q22
PostgreSQL 11beta3 18 38 25 32 57 8 52 24 66 38 24 26 98 13 58 114 732 595 12 213 124 14
DeepGreen v18 12 3 11 7 12 1 32 11 29 9 2 9 17 1 4 9 69 24 11 13 20 9

小结

PostgreSQL 11 parallel hash agg, parallel hash join的性能有了质的飞跃。

同时PostgreSQL 11在创建索引时,也支持了PARALLEL,创建索引的速度有了飞跃。

(SF=10)相比PostgreSQL 10,PostgreSQL 11版本AP能力得到了极大增强。在“没有使用列存储、没有使用分区表的”情况下,已接近Deepgreen(Greenplum原班人马)。ck, 田丰老师 .

PostgreSQL 11优化器还有可以加强的地方,在TPC-H的一些SQL中,执行计划并不是最好的,比如通过aqo提高优化器的能力。

PostgreSQL实现HTAP是必然的趋势。将来还可以结合citus插件实现sharding, (同时引入列存功能)。

参考

转载地址:http://njezo.baihongyu.com/

你可能感兴趣的文章
使用树莓派拍摄延时动画,制作GIF图
查看>>
js 效果
查看>>
19.Java5同步集合类的应用
查看>>
<c:forEach varStatus="status">中 varStatus的作用
查看>>
puppet来管理文件和软件包
查看>>
Python基础进阶之路(一)之运算符和输入输出
查看>>
阻塞非阻塞异步同步 io的关系
查看>>
ClickStat业务
查看>>
DMA32映射问题
查看>>
Android内存泄露之开篇
查看>>
提高效率—编程中的技巧
查看>>
导出excel——弹出框
查看>>
高并发程序设计
查看>>
ExtJs之组件(window)
查看>>
SoapUI中如何传递cookie
查看>>
静态成员变量的初始化
查看>>
POJ 1269 Intersecting Lines(判断两直线位置关系)
查看>>
MSSQL数据库跨表和跨数据库查询方法简(转)
查看>>
spring3.0.7中各个jar包的作用总结
查看>>
Windows 10 /win10 上使用GIT慢的问题,或者命令行反应慢的问题
查看>>