Using Apache Spark for AI Development
“Spark is a leading open-source cluster-computing framework used for distributed data processing and machine learning. Although not primarily designed for AI, Spark lets you take advantage of data parallelism and the large distributed systems used in AI development.
AI practitioners should recognize when to use Spark for a particular application. In this course, you'll explore advanced techniques for working with Apache Spark and identify the key advantages of using Spark over other platforms. You'll define resilient distributed datasets (RDDs) and explore several workflows related to them.
You'll move on to working with a Spark DataFrame, identifying its features and use cases. Finally, you'll learn how to create a machine learning pipeline using Spark ML Pipelines.”