Federal Science DataHub

Git/GitHub Integration with Databricks

Pre-requisites

  • Familiarity with Git version control.
  • Access to a Git repository. Creating a repository and granting access to it are outside the scope of this guide.

Why use Git/GitHub

  • Better version control than the built-in change tracker.
  • Manage code and notebooks outside Databricks.
  • Collaboration across multiple workspaces.

Workbook vs Repository

  • Once you connect Git to Databricks, you can create and use notebooks as usual, and also push them to GitHub.
  • Workbook files in Git have a slightly different syntax than standard Jupyter Notebooks.
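To illustrate that difference: when a Python notebook is committed in Databricks' source format, it is stored as a plain .py file that starts with a "# Databricks notebook source" line, with cells separated by "# COMMAND ----------" lines and Markdown cells written as "# MAGIC" comments. The sketch below splits such a file into its cells. The sample notebook text, the table name and the split_cells helper are made up for illustration and are not part of Databricks.

```python
# Marker lines Databricks writes into source-format Python notebooks.
HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"

# Hypothetical notebook content, as it might appear in a Git repository.
SAMPLE_NOTEBOOK = """# Databricks notebook source
# MAGIC %md
# MAGIC # Load data

# COMMAND ----------

df = spark.read.table("example_table")

# COMMAND ----------

display(df)
"""


def split_cells(source: str) -> list[str]:
    """Return the notebook's non-empty cells as strings, without the header."""
    lines = source.splitlines()
    if lines and lines[0].strip() == HEADER:
        lines = lines[1:]

    cells, current = [], []
    for line in lines:
        if line.strip() == CELL_SEPARATOR:
            # A separator closes the cell collected so far.
            cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    cells.append("\n".join(current).strip())

    return [cell for cell in cells if cell]


if __name__ == "__main__":
    for number, cell in enumerate(split_cells(SAMPLE_NOTEBOOK), start=1):
        print(f"--- cell {number} ---")
        print(cell)
```

Because the file is ordinary Python with comment markers, diffs and code reviews in GitHub stay readable, which is harder to achieve with the JSON structure of a Jupyter .ipynb file.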

Part 1: Setup Git/GitHub with Databricks

Open Settings from the drop-down menu in the top-right corner.


Select Linked Accounts in the left-hand sidebar.

Select your preferred Git provider and follow the prompts to link your account. Typically, you will need to provide your username and a token generated by your Git provider.

For GitHub, access tokens can be created under Settings / Developer settings / Personal access tokens / Tokens (classic). On this page, click "Generate new token", then "Generate new token (classic)".

As a security best practice, we recommend setting an expiration date of less than a year. After a token expires, you can generate a new one by repeating these steps.

The required scope is repo. The workflow scope is optional and is only needed for GitHub Actions workflows.
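If you want to double-check a classic token before linking it, one option is to ask the GitHub REST API which scopes it carries. For classic tokens, GitHub reports the granted scopes in the X-OAuth-Scopes response header. The following is a minimal sketch under that assumption; the helper names and the User-Agent value are hypothetical, and the network call is kept separate so the scope logic can be checked offline.

```python
import urllib.request

# Scopes named in this guide: repo is required, workflow is optional.
REQUIRED_SCOPES = {"repo"}
OPTIONAL_SCOPES = {"workflow"}


def parse_scopes(header_value):
    """Turn an X-OAuth-Scopes header such as 'repo, workflow' into a set."""
    return {part.strip() for part in (header_value or "").split(",") if part.strip()}


def missing_scopes(granted, required=REQUIRED_SCOPES):
    """Return the required scopes that the token does not have."""
    return set(required) - set(granted)


def fetch_token_scopes(token):
    """Query GitHub with the token and return its scopes (needs network access)."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={
            "Authorization": f"token {token}",
            "User-Agent": "fsdh-scope-check",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request) as response:
        return parse_scopes(response.headers.get("X-OAuth-Scopes"))


# Example usage (not run here, because it needs a real token):
#   granted = fetch_token_scopes("<your token>")
#   print("Missing scopes:", missing_scopes(granted) or "none")
```

An empty result from missing_scopes means the token carries everything this guide requires. Avoid pasting a real token into a shared notebook; read it from a secret or an environment variable instead.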

If the token and access permissions are configured correctly, you should see a green mark on the settings page.

Part 2: Accessing and Modifying Repositories

To clone a repository, you need its HTTPS URL and the name of its branch. In your workspace, click "Create" and then "Git folder" in the drop-down menu.

Enter the Git repository URL, then click "Create Git folder".
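Since this step expects the HTTPS form of the repository URL, a link copied in SSH form (git@host:owner/repo.git) needs to be converted first. The helper below is a small sketch of that conversion; to_https_clone_url and the example repository name are hypothetical, and only the common scp-style SSH form is handled.

```python
import re

# Matches scp-style SSH URLs such as git@github.com:owner/repo.git
_SSH_PATTERN = re.compile(r"^git@(?P<host>[^:]+):(?P<path>.+)$")


def to_https_clone_url(url: str) -> str:
    """Return an HTTPS clone URL, converting from scp-style SSH if needed."""
    url = url.strip()
    if url.startswith("https://"):
        return url  # already in the expected form
    match = _SSH_PATTERN.match(url)
    if match:
        return f"https://{match.group('host')}/{match.group('path')}"
    raise ValueError(f"Unrecognized Git URL: {url!r}")


if __name__ == "__main__":
    # Hypothetical repository used only to demonstrate the conversion.
    print(to_https_clone_url("git@github.com:example-org/example-repo.git"))
```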

Once this is configured, you can see the folder in your workspace and navigate into it.

Changes made in Databricks can be pushed to the repository using the built-in Git menu. To open it, right-click in your repository files and click "Git...".

From this screen, you can commit and push your changes, as well as pull existing changes from the repository.

Resolving Conflicts

If multiple commits change the same code, a merge conflict can occur. In that case, Databricks displays a conflict message when you attempt to pull.

Select "Resolve conflict using PR". This opens a dialog that asks you to create a new branch where your changes will be committed. Enter a branch name and a commit message, then commit your changes again. If the commit succeeds, a confirmation message appears. You can follow the link in that message to make your changes.

Last Updated: 2026-04-13, 11:39 a.m.