StarRocks3.3.1集群(存算一体)部署基准测试

测试准备

部署规划

服务器 FE BE MysqlClient
37 yes
38 yes
39 yes
40 yes yes

硬件环境

  • FE

    机器 物理机
    CPU Intel® Xeon® CPU E5-2630 v4 @ 2.20GHz
    内存 256G
    磁盘 HDD
    网卡 千兆网卡
  • BE

    机器 物理机
    CPU Intel® Xeon® CPU E5-2630 v4 @ 2.20GHz
    内存 256G
    磁盘 HDD
    网卡 千兆网卡

软件环境

  • 操作系统
    • FE(37)使用的是 CentOS release 6.5 (Final)
    • BE(38,39,40)使用CentOS Linux release 7.3.1611 (Core)
  • 内核版本
    • FE(37)使用的是Linux 2.6.32-504.23.4.el6.centos.plus.x86_64
    • BE(38,39,40) Linux 3.10.0-693.25.4.el7.x86_64
  • 软件版本:

测试数据与结果

测试数据

100G数据场景
行数
call_center 30
catalog_page 20400
catalog_returns 14404374
catalog_sales 143997065
customer_address 1000000
customer_demographics 1920800
customer 2000000
date_dim 73049
household_demographics 7200
income_band 20
inventory 399330000
item 204000
promotion 1000
reason 55
ship_mode 20
store 402
store_returns 28795080
store_sales 287997024
time_dim 86400
warehouse 15
web_page 2040
web_returns 7197670
web_sales 72001237
web_site 24
1000G数据场景
行数
call_center 72
catalog_page 50400
catalog_returns 158401130
catalog_sales 1583977481
customer_address 7000000
customer_demographics 3841600
customer 14000000
date_dim 146098
household_demographics 14400
income_band 40
inventory 1182330000
item 504000
promotion 2500
reason 120
ship_mode 40
store 1404
store_returns 316794844
store_sales 3167985023
time_dim 172800
warehouse 35
web_page 5040
web_returns 79195192
web_sales 792001613
web_site 78

测试结果

问题

一:Query58查询语句报错
with ss_items as
 (select i_item_id item_id
        ,sum(ss_ext_sales_price) ss_item_rev 
 from store_sales
     ,item
     ,date_dim
 where ss_item_sk = i_item_sk
   and d_date in (select d_date
                  from date_dim
                  where d_week_seq = (select d_week_seq 
                                      from date_dim
                                      where d_date = '2000-01-03'))
   and ss_sold_date_sk   = d_date_sk
 group by i_item_id),
 cs_items as
 (select i_item_id item_id
        ,sum(cs_ext_sales_price) cs_item_rev
  from catalog_sales
      ,item
      ,date_dim
 where cs_item_sk = i_item_sk
  and  d_date in (select d_date
                  from date_dim
                  where d_week_seq = (select d_week_seq 
                                      from date_dim
                                      where d_date = '2000-01-03'))
  and  cs_sold_date_sk = d_date_sk
 group by i_item_id),
 ws_items as
 (select i_item_id item_id
        ,sum(ws_ext_sales_price) ws_item_rev
  from web_sales
      ,item
      ,date_dim
 where ws_item_sk = i_item_sk
  and  d_date in (select d_date
                  from date_dim
                  where d_week_seq =(select d_week_seq 
                                     from date_dim
                                     where d_date = '2000-01-03'))
  and ws_sold_date_sk   = d_date_sk
 group by i_item_id)
  select  ss_items.item_id
       ,ss_item_rev
       ,ss_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3) * 100 ss_dev
       ,cs_item_rev
       ,cs_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3) * 100 cs_dev
       ,ws_item_rev
       ,ws_item_rev/((ss_item_rev+cs_item_rev+ws_item_rev)/3) * 100 ws_dev
       ,(ss_item_rev+cs_item_rev+ws_item_rev)/3 average
 from ss_items,cs_items,ws_items
 where ss_items.item_id=cs_items.item_id
   and ss_items.item_id=ws_items.item_id 
   and ss_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
   and ss_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and cs_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and cs_item_rev between 0.9 * ws_item_rev and 1.1 * ws_item_rev
   and ws_item_rev between 0.9 * ss_item_rev and 1.1 * ss_item_rev
   and ws_item_rev between 0.9 * cs_item_rev and 1.1 * cs_item_rev
 order by item_id
         ,ss_item_rev
 limit 100;

错误信息:ERROR : Expected LE 1 to be returned by expression

原因构造表date_dim中有两条数据重复了。删除掉即可。

delete 
from
	(
	SELECT
		ROW_NUMBER() OVER (PARTITION BY d_date_sk
	ORDER BY
		d_date_sk) AS row_num
	FROM
		date_dim
	where
		d_date = '2000-01-03' 
		) a
where
		row_num >1
二:Query64查询语句超时
with cs_ui as
 (select cs_item_sk
        ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund
  from catalog_sales
      ,catalog_returns
  where cs_item_sk = cr_item_sk
    and cs_order_number = cr_order_number
  group by cs_item_sk
  having sum(cs_ext_list_price)>2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)),
cross_sales as
 (select i_product_name product_name
     ,i_item_sk item_sk
     ,s_store_name store_name
     ,s_zip store_zip
     ,ad1.ca_street_number b_street_number
     ,ad1.ca_street_name b_street_name
     ,ad1.ca_city b_city
     ,ad1.ca_zip b_zip
     ,ad2.ca_street_number c_street_number
     ,ad2.ca_street_name c_street_name
     ,ad2.ca_city c_city
     ,ad2.ca_zip c_zip
     ,d1.d_year as syear
     ,d2.d_year as fsyear
     ,d3.d_year s2year
     ,count(*) cnt
     ,sum(ss_wholesale_cost) s1
     ,sum(ss_list_price) s2
     ,sum(ss_coupon_amt) s3
  FROM   store_sales
        ,store_returns
        ,cs_ui
        ,date_dim d1
        ,date_dim d2
        ,date_dim d3
        ,store
        ,customer
        ,customer_demographics cd1
        ,customer_demographics cd2
        ,promotion
        ,household_demographics hd1
        ,household_demographics hd2
        ,customer_address ad1
        ,customer_address ad2
        ,income_band ib1
        ,income_band ib2
        ,item
  WHERE  ss_store_sk = s_store_sk AND
         ss_sold_date_sk = d1.d_date_sk AND
         ss_customer_sk = c_customer_sk AND
         ss_cdemo_sk= cd1.cd_demo_sk AND
         ss_hdemo_sk = hd1.hd_demo_sk AND
         ss_addr_sk = ad1.ca_address_sk and
         ss_item_sk = i_item_sk and
         ss_item_sk = sr_item_sk and
         ss_ticket_number = sr_ticket_number and
         ss_item_sk = cs_ui.cs_item_sk and
         c_current_cdemo_sk = cd2.cd_demo_sk AND
         c_current_hdemo_sk = hd2.hd_demo_sk AND
         c_current_addr_sk = ad2.ca_address_sk and
         c_first_sales_date_sk = d2.d_date_sk and
         c_first_shipto_date_sk = d3.d_date_sk and
         ss_promo_sk = p_promo_sk and
         hd1.hd_income_band_sk = ib1.ib_income_band_sk and
         hd2.hd_income_band_sk = ib2.ib_income_band_sk and
         cd1.cd_marital_status <> cd2.cd_marital_status and
         i_color in ('purple','burlywood','indian','spring','floral','medium') and
         i_current_price between 64 and 64 + 10 and
         i_current_price between 64 + 1 and 64 + 15
group by i_product_name
       ,i_item_sk
       ,s_store_name
       ,s_zip
       ,ad1.ca_street_number
       ,ad1.ca_street_name
       ,ad1.ca_city
       ,ad1.ca_zip
       ,ad2.ca_street_number
       ,ad2.ca_street_name
       ,ad2.ca_city
       ,ad2.ca_zip
       ,d1.d_year
       ,d2.d_year
       ,d3.d_year
)
select cs1.product_name
     ,cs1.store_name
     ,cs1.store_zip
     ,cs1.b_street_number
     ,cs1.b_street_name
     ,cs1.b_city
     ,cs1.b_zip
     ,cs1.c_street_number
     ,cs1.c_street_name
     ,cs1.c_city
     ,cs1.c_zip
     ,cs1.syear
     ,cs1.cnt
     ,cs1.s1 as s11
     ,cs1.s2 as s21
     ,cs1.s3 as s31
     ,cs2.s1 as s12
     ,cs2.s2 as s22
     ,cs2.s3 as s32
     ,cs2.syear
     ,cs2.cnt
from cross_sales cs1,cross_sales cs2
where cs1.item_sk=cs2.item_sk and
     cs1.syear = 1999 and
     cs2.syear = 1999 + 1 and
     cs2.cnt <= cs1.cnt and
     cs1.store_name = cs2.store_name and
     cs1.store_zip = cs2.store_zip
order by cs1.product_name
       ,cs1.store_name
       ,cs2.cnt
       ,cs1.s1
       ,cs2.s1;

查询超时

默认超时时长300,5分钟。

TEST SQL

TPC-DS test SQL

tpcds-poc-1.0脚本操作记录

  • 脚本创建100G数据 耗时 3.5h左右
  • 脚本导入100G数据 耗时 40min左右
  • 脚本执行一次99个查询语句查询100G数据 耗时 30min左右
  • 脚本创建1000G数据 耗时 35h左右
  • 脚本导入1000G数据 耗时 6h左右
  • 脚本执行一次99个查询语句查询1000G数据 耗时 90min左右

与Doris测试对比

SQL 100G 100G 100G 1000G 1000G 1000G
- StarRocks-avg(ms) Doris-avg(ms) min StarRocks-avg(ms) Doris-avg(ms) min
Query01 410 612 StarRocks 3210 5424 StarRocks
Query02 240 4022 StarRocks 1997 35139 StarRocks
Query03 263 235 Doris 3162 1397 Doris
Query04 18821 7504 Doris 233482 70339 Doris
Query05 546 1328 Doris 10175 4138 Doris
Query06 302 272 Doris 3041 562 Doris
Query07 434 564 Doris 11611 1496 Doris
Query08 2075 425 Doris 3183 983 Doris
Query09 824 1983 StarRocks 12924 12806 Doris
Query10 344 432 StarRocks 5027 1578 Doris
Query11 10137 4893 Doris 132540 46074 Doris
Query12 215 195 Doris 1347 244 Doris
Query13 391 528 Doris 8174 1794 Doris
Query14 3127 8952 StarRocks 35346 55529 StarRocks
Query15 291 337 StarRocks 6973 1871 Doris
Query16 1591 1842 Doris 5492 2541 Doris
Query17 694 786 Doris 14939 4500 Doris
Query18 462 586 Doris 8684 2001 Doris
Query19 357 308 Doris 5718 779 Doris
Query20 201 182 Doris 1237 273 Doris
Query21 241 211 Doris 643 369 Doris
Query22 1286 6235 StarRocks 8719 6048 Doris
Query23 27994 20605 Doris 300932 273127 Doris
Query24 917 6596 StarRocks 148372 53072 Doris
Query25 651 478 Doris 10943 2180 Doris
Query26 247 353 Doris 5415 725 Doris
Query27 359 306 Doris 8425 1318 Doris
Query28 782 1587 StarRocks 12448 12260 Doris
Query29 633 1347 Doris 11828 2910 Doris
Query30 309 1603 Doris 2417 1406 Doris
Query31 1411 1304 Doris 32464 10024 Doris
Query32 129 99 Doris 1132 123 Doris
Query33 519 495 Doris 8474 1885 Doris
Query34 323 378 Doris 3485 2423 Doris
Query35 879 1108 StarRocks 17665 8160 Doris
Query36 308 387 StarRocks 6646 1242 Doris
Query37 195 160 Doris 814 188 Doris
Query38 2539 5411 StarRocks 39111 45072 StarRocks
Query39 633 1175 StarRocks 2799 1981 Doris
Query40 228 662 Doris 2229 826 Doris
Query41 68 81 StarRocks 75 98 StarRocks
Query42 267 297 Doris 3299 261 Doris
Query43 1323 400 Doris 15430 2412 Doris
Query44 239 813 StarRocks 3527 4656 StarRocks
Query45 328 329 Doris 12843 1881 Doris
Query46 475 731 StarRocks 8563 3365 Doris
Query47 1099 2917 StarRocks 21633 6427 Doris
Query48 344 293 Doris 6365 1278 Doris
Query49 444 455 Doris 8484 1331 Doris
Query50 443 298 Doris 7658 1978 Doris
Query51 3538 6216 StarRocks 19543 30309 StarRocks
Query52 265 118 Doris 3491 256 Doris
Query53 347 247 Doris 6099 1006 Doris
Query54 7893 736 Doris 14291 3054 Doris
Query55 265 116 Doris 3312 247 Doris
Query56 509 767 Doris 8437 831 Doris
Query57 1068 1647 StarRocks 8491 3965 Doris
Query58 1840 407 Doris - 640 Doris
Query59 6657 2279 Doris 124888 16880 Doris
Query60 553 507 Doris 9563 1348 Doris
Query61 561 196 Doris 7640 1105 Doris
Query62 241 311 StarRocks 11957 1809 Doris
Query63 332 229 Doris 6115 843 Doris
Query64 1932 1782 Doris - 7341 Doris
Query65 802 3356 StarRocks 13477 31614 StarRocks
Query66 399 900 Doris 7577 1093 Doris
Query67 6981 15050 StarRocks 200395 116469 Doris
Query68 393 496 StarRocks 8777 6074 Doris
Query69 378 491 StarRocks 4508 1978 Doris
Query70 2645 933 Doris 59192 6435 Doris
Query71 405 486 StarRocks 6731 1695 Doris
Query72 1392 14127 StarRocks 87554 36832 Doris
Query73 315 281 Doris 3668 1861 Doris
Query74 5580 4600 Doris 100042 42634 Doris
Query75 2421 1941 Doris 47009 16450 Doris
Query76 262 930 StarRocks 5631 4877 Doris
Query77 662 355 Doris 10095 727 Doris
Query78 5241 6332 StarRocks 72845 62186 Doris
Query79 1353 1791 StarRocks 17413 15120 Doris
Query80 628 456 Doris 15022 2375 Doris
Query81 409 510 StarRocks 4432 3359 Doris
Query82 229 204 Doris 1314 778 Doris
Query83 189 277 StarRocks 18251 491 Doris
Query84 167 178 Doris 1523 909 Doris
Query85 410 517 Doris 2073 1844 Doris
Query86 230 334 StarRocks 3361 2340 Doris
Query87 2533 5611 StarRocks 38576 46143 StarRocks
Query88 1183 1620 StarRocks 17564 12995 Doris
Query89 360 279 Doris 6492 1264 Doris
Query90 130 399 StarRocks 872 1720 Doris
Query91 176 206 StarRocks 452 269 Doris
Query92 150 76 Doris 768 116 Doris
Query93 316 42 Doris 7269 4128 Doris
Query94 911 260 Doris 3574 1556 Doris
Query95 2924 356 Doris 6192 5221 Doris
Query96 204 213 StarRocks 2582 1831 Doris
Query97 946 2071 StarRocks 18051 16439 Doris
Query98 369 445 StarRocks 2579 1084 Doris
Query99 384 417 StarRocks 13140 3560 Doris
数据量 StarRocks快速个数 Doris快速个数
100G 40 59
1000G 9 90

彩蛋

由于执行的时间太长,需要以后台执行的方式来执行脚本

后台执行

nohup sh bin/benchmark.sh -p -d tpcds > test.log 2>&1 &

查询后台直接脚本

ps -ef | grep benchmark.sh
Logo

魔乐社区(Modelers.cn) 是一个中立、公益的人工智能社区,提供人工智能工具、模型、数据的托管、展示与应用协同服务,为人工智能开发及爱好者搭建开放的学习交流平台。社区通过理事会方式运作,由全产业链共同建设、共同运营、共同享有,推动国产AI生态繁荣发展。

更多推荐