云数科技-云库CloudyBi
云数科技 技术

云数科技 > CloudyBI - 查询测试对比




测试环境
硬件: 10台PC SERVER (CPU AMD4核9250 3.0GHz主频, 内存4G)
软件: 操作系统 Windows XP
数据库: SQL Server 2000 个人版
(考虑到用户大规模部署的成本,我们在单节点使用XP操作系统,SQL Server 2000个人版,最大程度降低企业成本)

对比测试

测试用例:求各省多年拜访数合计
查询SQL:select "D_Time".Year as "year","D_Location".Region as "region","D_Location".Province as "province",Sum("F_Activity".Activity_Count) as "activity_count" from f_activity "F_Activity" join d_time "D_Time" on "F_Activity".Activity_Date_SK="D_Time".Date_SK join d_location "D_Location" on "F_Activity".Location_SK="D_Location".Location_SK group by "D_Time".Year,"D_Location".Region,"D_Location".Province
数据分割方案 第一次执行时间(S) 后续执行时间(S) 倍数
单机(传统数据库和BI) 18.6 17.7  
10台机器,9个节点 1 1 17.7
10台机器18个节点 0.5 0.5 35.4

测试用例:求各人员拜访数合计
查询SQL:select "D_Employee".USER_ID1 as "user_id1","D_Employee".USER_ID2 as "user_id2","D_Employee".Staff_Name1 as "staff_name1","D_Employee".Staff_Name2 as "staff_name2","D_Time".Year as "year",Sum("F_Activity".Activity_Count) as "activity_count" from f_activity "F_Activity" join d_employee "D_Employee" on "F_Activity".Employee_SK="D_Employee".Staff_Row_ID_SK join d_time "D_Time" on "F_Activity".Activity_Date_SK="D_Time".Date_SK group by "D_Employee".USER_ID1, "D_Employee".USER_ID2, "D_Employee".Staff_Name1, "D_Employee".Staff_Name2, "D_Time".Year
数据分割方案 第一次执行时间(S) 后续执行时间(S) 倍数
单机(传统数据库和BI) 6.18 6.01  
10台机器,9个节点 1 0.8 7.5
10台机器18个节点 0.5 0.6 12

测试用例:各地区拜访数和医生数(求count distinct)
查询SQL:select "D_Location".Region as "region","D_Location".Province as "province","D_Time".Year as "year",Sum("F_Activity".Activity_Count) as "activity_count",count(distinct "D_Doctor".Doctor_ID) as "Doctor_Count" from f_activity "F_Activity" join d_doctor "D_Doctor" on "F_Activity".Doctor_SK="D_Doctor".Doctor_SK join d_time "D_Time" on "F_Activity".Activity_Date_SK="D_Time".Date_SK join d_location "D_Location" on "F_Activity".Location_SK="D_Location".Location_SK group by "D_Location".Region,"D_Location".Province,"D_Time".Year
数据分割方案 第一次执行时间(S) 后续执行时间(S) 倍数
单机(传统数据库和BI) 23.4 23.1  
10台机器,9个节点 3.2 2.8 8.25
10台机器18个节点 1.6 1.5 15.4

测试用例:各人员拜访数和医生数(求count distinct)
查询SQL:select "D_Employee".USER_ID1 as "user_id1","D_Employee".USER_ID2 as "user_id2","D_Time".Year as "year",Sum("F_Activity".Activity_Count) as "activity_count",count(distinct "D_Doctor".Doctor_ID) as "doctor_Count" from f_activity "F_Activity" join d_employee "D_Employee" on "F_Activity".Employee_SK="D_Employee".Staff_Row_ID_SK join d_doctor "D_Doctor" on "F_Activity".Doctor_SK="D_Doctor".Doctor_SK join d_time "D_Time" on "F_Activity".Activity_Date_SK="D_Time".Date_SK group by "D_Employee".USER_ID1,"D_Employee".USER_ID2,"D_Time".Year
数据分割方案 第一次执行时间(S) 后续执行时间(S) 倍数
单机(传统数据库和BI) 214.8 182.4  
10台机器,9个节点 162.1 15.8 11.5
10台机器18个节点 5.0 5.0 36.5


查询速度测试

* 数据表结构


SQL:单表求和


SQL:多表连接求count distinct



测试总结

  对比测试证明CBI的并行架构能够极大的提高大数据量查询的速度和效率。 查询速度随节点数的增加呈接近线性的增长。
  10台机器,可以获得5-36倍左右速度的提高。
  数据查询越慢,查询越复杂,速度获得的提升越高。