Open source technology Apache Spark has become the analytics and machine learning platform of choice for many companies. While Spark appears in numerous parts of the Microsoft stack, including SQL Server 2019, Microsoft's go-to Spark service is Azure Databricks.
The service, from Microsoft and Databricks (the company founded by Spark's creators), is a versatile one, geared towards analytics, data engineering and data science. Azure Databricks lets developers work in notebooks offline, run them interactively against running clusters, or schedule them as production jobs that provision Spark clusters on demand.
This session will cover the concepts, service mechanics, and code necessary for you to do analytics and machine learning on Azure Databricks, and integrate it with other Microsoft cloud services and on-premises technologies.
You will learn:
- About the fundamentals of Apache Spark, Spark SQL and Spark MLlib
- How to use Databricks notebooks
- How to manage clusters and jobs
- How to integrate Azure Databricks with Azure Blob Storage and Azure Data Lake Store (ADLS)
- How to write Python code for both analytics and machine learning