Federal Science DataHub

Databricks on the Federal Science DataHub

Azure Databricks is a cloud-based big data analytics platform. It gives data scientists, data engineers, and business analysts a shared environment for working together on data projects. On the Federal Science DataHub (FSDH), Azure Databricks combines Apache Spark with a collaborative notebook interface, so you can build and deploy data pipelines, machine learning models, and analytics applications.

Azure Databricks is ideal for:

  • processing large datasets
  • building machine learning models
  • running interactive queries
  • collaborating on data projects

The FSDH lets you provision Azure Databricks for your research so that you can analyze your data at scale.

Learn how to:

  • Provision Databricks on the FSDH: Requesting, configuring and removing tools in your workspace
  • Get started with Databricks: Databricks 101
  • Manage Databricks clusters: Databricks Cluster Policies
  • Use Git or other version control with Databricks: Databricks Git Integration
  • Use Visual Studio Code with Databricks: Databricks VS Code Extension
  • Create dashboards in Databricks: How to Dashboard in Databricks
  • Compare Databricks with other dashboarding tools: Dashboarding Tool Comparison
  • Use workflows in Databricks: Databricks Workflows
  • Run AutoML experiments in Databricks: Databricks AutoML
  • Add Conda, PyPI, or CRAN packages to Databricks clusters: Databricks Custom Libraries
Last Updated: 2026-04-13, 11:39 a.m.