Aiyara cluster

An Aiyara cluster is a low-power computer cluster designed specifically to process Big Data. The Aiyara cluster model can be considered a specialization of the Beowulf cluster in the sense that Aiyara is also built from commodity hardware, although it uses system-on-chip computer boards rather than inexpensive personal computers. Unlike Beowulf, an Aiyara cluster is scoped only to Big Data applications, not to scientific high-performance computing. Another important property of an Aiyara cluster is that it is low-power: it must be built from a class of processing units that produces less heat.

The name Aiyara originally referred to the first ARM-based cluster built by Wichai Srisuruk and Chanwit Kaewkasi at Suranaree University of Technology. "Aiyara" comes from a Thai word that literally means "elephant", chosen to reflect the underlying software stack, Apache Hadoop.

Like Beowulf, an Aiyara cluster does not define a particular software stack to run atop it. A cluster normally runs a variant of the Linux operating system. Commonly used Big Data software stacks are Apache Hadoop and Apache Spark.

Development

A report on the Aiyara hardware, which successfully processed a non-trivial amount of Big Data, was published in the Proceedings of ICSEC 2014.^[1] Aiyara Mk-I, the second Aiyara cluster, consists of 22 Cubieboards. It is the first known SoC-based ARM cluster able to process Big Data successfully using the Spark and HDFS stack.^[2]

The Aiyara cluster model, a technical description of how to build an Aiyara cluster, was later published by Chanwit Kaewkasi in DZone's 2014 Big Data Guide.^[3] Further results and cluster optimization techniques, which boost the cluster's processing rate to 0.9 GB/min while preserving low power consumption, were reported in the Proceedings of IEEE TENCON 2014.^[4]

That work studied and improved the whole architecture of the software stack, including the runtime, data integrity verification and data compression. It achieved a processing rate of almost 0.9 GB/min, completing the same benchmarks as the previous work in roughly 38 minutes.
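As a rough sanity check on these figures, the short Python sketch below multiplies the quoted rate by the quoted run time to estimate the data volume they imply, and converts the rate to MB/s. It assumes that the 38 minutes is the total run time at a sustained 0.9 GB/min and that 1 GB = 1024 MB; neither assumption is stated in the source, the function names are illustrative, and since both figures are quoted as approximate ("almost", "roughly") the results are estimates only.

```python
# Back-of-the-envelope check of the figures quoted in the text.
RATE_GB_PER_MIN = 0.9  # "almost 0.9 GB/min" (treated here as an upper bound)
RUNTIME_MIN = 38       # "roughly 38 minutes" (assumed to be the total run time)


def implied_volume_gb(rate_gb_per_min: float, runtime_min: float) -> float:
    """Data volume implied by a sustained processing rate over a run time."""
    return rate_gb_per_min * runtime_min


def rate_mb_per_sec(rate_gb_per_min: float, mb_per_gb: int = 1024) -> float:
    """Convert a rate in GB/min to MB/s (assumes 1 GB = 1024 MB)."""
    return rate_gb_per_min * mb_per_gb / 60


if __name__ == "__main__":
    volume = implied_volume_gb(RATE_GB_PER_MIN, RUNTIME_MIN)
    print(f"implied data volume: about {volume:.1f} GB")
    print(f"aggregate rate: about {rate_mb_per_sec(RATE_GB_PER_MIN):.2f} MB/s")
```

Under these assumptions the benchmark input would be on the order of a few tens of gigabytes, and the aggregate rate is for the cluster as a whole, not for a single board.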

References

[1] C. Kaewkasi and W. Srisuruk. A Study of Big Data Processing Constraints on a Low-Power Hadoop Cluster. Proceedings of the 18th ICSEC, 2014, pp. 308-313.
[2] The first Spark/Hadoop ARM cluster runs atop Cubieboards. Cubieboard.org, April 8, 2014.
[3] Chanwit Kaewkasi. The DIY Big Data Cluster. DZone Big Data Guide 2014, September 22, 2014, pp. 20-21.
[4] C. Kaewkasi and W. Srisuruk. Optimizing performance and power consumption for an ARM-based big data cluster. Proceedings of the 2014 IEEE Region 10 Conference, 2014, pp. 1-6.