Databricks on the Federal Science DataHub
Azure Databricks is a cloud-based analytics platform for big data. It gives data scientists, data engineers, and business analysts a collaborative environment for working together on shared projects. On the Federal Science DataHub (FSDH), Azure Databricks combines Apache Spark with a collaborative notebook interface, which makes it easier to build and deploy data pipelines, machine learning models, and analytics applications.
Azure Databricks is ideal for:
- processing large datasets
- building machine learning models
- running interactive queries
- collaborating on data projects
The FSDH lets you provision Azure Databricks for your research so that you can analyze your data at scale.
Learn how to:
- Provision Databricks on the FSDH: Requesting, configuring and removing tools in your workspace
- Get started with Databricks: Databricks 101
- Manage Databricks clusters: Databricks Cluster Policies
- Use Git or other version control with Databricks: Databricks Git Integration
- Use Visual Studio Code with Databricks: Databricks VS Code Extension
- Create dashboards in Databricks: Databricks Dashboarding Tool Comparison
- Compare Databricks with other dashboarding tools: Dashboarding Tool Comparison
- Use workflows in Databricks: Databricks Workflows
- Run AutoML experiments in Databricks: Databricks AutoML
- Add Conda, PyPI, or CRAN packages to Databricks clusters: Databricks Custom Libraries