Experimental Setup
We ran the experiments on a Hadoop cluster with the following configuration:
- 20 slave machines:
  - Each has 8GB RAM and a 4-core CPU.
  - Each has 2 map slots and 1 reduce slot.
- The HDFS block size is 128MB.
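The configuration above implies a fixed task capacity for the cluster, which the following minimal sketch works out. It assumes one map task per HDFS block (a common default, not something stated in this setup), and the 10GB input size in the usage note is purely hypothetical. The names `cluster_capacity` and `map_waves` are illustrative.

```python
import math

# Values taken from the cluster description above.
SLAVES = 20
MAP_SLOTS_PER_SLAVE = 2
REDUCE_SLOTS_PER_SLAVE = 1
BLOCK_SIZE_MB = 128


def cluster_capacity():
    """Return (concurrent map tasks, concurrent reduce tasks) for the cluster."""
    return SLAVES * MAP_SLOTS_PER_SLAVE, SLAVES * REDUCE_SLOTS_PER_SLAVE


def map_waves(input_size_mb):
    """Return (map tasks, map waves) for an input of the given size.

    Assumption: one map task per HDFS block, so the number of map tasks
    is the input size divided by the block size, rounded up.
    """
    map_tasks = math.ceil(input_size_mb / BLOCK_SIZE_MB)
    map_slots, _ = cluster_capacity()
    return map_tasks, math.ceil(map_tasks / map_slots)
```

Under these assumptions the cluster runs at most 40 map tasks and 20 reduce tasks at a time, and `map_waves(10 * 1024)` for a hypothetical 10GB input gives 80 map tasks executed in 2 waves.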
We ran the experiments on three input data sets:
- ish_is:
  - Contains 6 columns: year, month, day, hour, minute, payload.
  - The query computes a ROLLUP over (overall, year, month, day, hour, minute) with the SUM aggregate function.
- uniform_syn:
  - Contains 7 columns: year, month, day, hour, minute, second, payload.
  - The query computes a ROLLUP over (overall, year, month, day, hour, minute, second) with the SUM aggregate function.
- rdns:
  - Contains a single column, which holds the Unix time.
  - The query computes a ROLLUP over (overall, year, month, day, hour, minute, second) with the COUNT_STAR aggregate function.
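To make the grouping levels concrete, here is a minimal in-memory sketch of what a ROLLUP over a time hierarchy computes. It is not the implementation evaluated in these experiments. The function names are illustrative, and decomposing the rdns Unix time in UTC is an assumption, since the time zone is not stated here.

```python
from collections import defaultdict
from datetime import datetime, timezone


def rollup_levels(dims):
    """Grouping sets of ROLLUP(dims): every prefix, from () (overall) to all dims."""
    return [tuple(dims[:i]) for i in range(len(dims) + 1)]


def rollup_sum(rows, dims, measure):
    """Sum `measure` at every ROLLUP level.

    The result maps a tuple of dimension values to a total. The tuple length
    identifies the level: () is the overall total, (2012,) is a year total.
    """
    totals = defaultdict(int)
    for row in rows:
        for level in rollup_levels(dims):
            key = tuple(row[d] for d in level)
            totals[key] += row[measure]
    return dict(totals)


def split_unixtime(ts):
    """Split a Unix timestamp into time fields (assumption: UTC).

    The constant `count` field turns the SUM above into a COUNT_STAR.
    """
    t = datetime.fromtimestamp(ts, tz=timezone.utc)
    return {"year": t.year, "month": t.month, "day": t.day,
            "hour": t.hour, "minute": t.minute, "second": t.second,
            "count": 1}
```

For example, `rollup_levels(["year", "month", "day", "hour", "minute", "second"])` yields the 7 levels listed for uniform_syn and rdns, and `rollup_sum([split_unixtime(t) for t in timestamps], dims, "count")` counts records at each of those levels.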
We also ran an experiment on the uniform_syn data set with two ROLLUP operations in the CUBE clause: the first over (year, month, day) and the second over (hour, minute, second), using the SUM aggregate function.
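The sketch below enumerates the grouping sets such a query would produce. It assumes that multiple ROLLUP operations in one CUBE clause combine as a cross product of their levels, as concatenated groupings do in SQL. That semantics is an assumption, not something confirmed by this section. `combined_grouping_sets` is an illustrative name.

```python
from itertools import product


def rollup_levels(dims):
    """Grouping sets of ROLLUP(dims): every prefix, from () to all dims."""
    return [tuple(dims[:i]) for i in range(len(dims) + 1)]


def combined_grouping_sets(*rollups):
    """Cross product of the levels of several ROLLUPs (assumed semantics).

    Each combined grouping set is the concatenation of one level taken
    from each ROLLUP.
    """
    per_rollup = [rollup_levels(r) for r in rollups]
    return [sum(combo, ()) for combo in product(*per_rollup)]


sets = combined_grouping_sets(["year", "month", "day"],
                              ["hour", "minute", "second"])
```

Under this assumption the two three-column ROLLUPs yield 4 x 4 = 16 grouping sets, including combinations such as (year, hour) that a single ROLLUP over all six columns, with its 7 levels, never produces.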
For each input data set, we ran two experiments: one using the current ROLLUP implementation and one using our ROLLUP.
All experiments with our ROLLUP were executed with PIVOT=3.