Spark : A cluster computing system developed at the UC Berkeley AMPLab and used to run large-scale applications such as spam filtering and traffic prediction. Spark provides primitives for in-memory cluster computing, along with APIs in Scala, Java, and Python.