Federal Science DataHub

Databricks VS Code Extension

Using the Databricks VS Code extension, you can connect to a Databricks workspace from within VS Code. This allows you to:

  • Write your code locally in VS Code, and then run it remotely on a Databricks cluster.
  • Run SQL queries on a Databricks cluster and see the results directly in VS Code.
  • Manage your Databricks clusters.

Why use it?

Visual Studio Code is an extremely popular code editor. As an open-source project, it has a large community of contributors and users. It is also highly extensible, allowing users to install a wide variety of extensions that add support for different programming languages, debugging tools, and more.

Prerequisites

  • Visual Studio Code.
  • A Databricks workspace on the Federal Science DataHub.

Install the extension

  1. Open Visual Studio Code.
  2. Click the Extensions icon in the left navigation bar.
  3. Search for Databricks.
  4. Click Install. The correct extension is shown in the screenshot below.

Connect to a Databricks workspace

  1. Click the Databricks icon in the left navigation bar.
  2. Click Configure.
  3. Enter the URL of your Databricks workspace in the space shown below, up to and including .net/. For example, if your workspace URL is https://sample.azuredatabricks.net/?o=111111111111#, enter https://sample.azuredatabricks.net/.
  4. On the next screen, select Edit Databricks profiles.
  5. On the screen that opens, complete the following and save the file:
[DEFAULT]
host = https://sample.azuredatabricks.net/
token = your_token
jobs-api-version = 2.1
  • [DEFAULT] is the name of the profile. You can change it to any name you want.
  • host is the same URL that you entered in the previous step.
  • token is your personal access token. To generate a token, see Databricks personal access token authentication.
  • jobs-api-version should remain unchanged.
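Because the profile name is arbitrary, the same file can hold several profiles, for example one per workspace. A minimal sketch, where the host names and tokens are placeholders for your own values:

```
[DEFAULT]
host = https://sample.azuredatabricks.net/
token = your_token
jobs-api-version = 2.1

[DEV]
host = https://sample-dev.azuredatabricks.net/
token = your_dev_token
jobs-api-version = 2.1
```

When more than one profile exists, the extension lets you pick which saved profile to use at connection time.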

After completing these steps, click Configure again and access your saved [DEFAULT] profile. Databricks will connect automatically.

Run Local Code

NOTE: You must open a folder to use this part of the extension. To do so, click File > Open Folder and select the folder where you keep your code.

Attach a Cluster

Before running code, you must attach a cluster.

  1. Open the Databricks icon in the left navigation bar.
  2. If no cluster is attached, hover over the Cluster bar and click Configure Cluster, as shown in the screenshot below.
  3. From the dropdown, select the cluster you want to attach, as shown in the screenshot below.
  4. You can now start the cluster from the extension.

NOTE: You cannot create a cluster in the extension. You must create it in Databricks itself.

Writing Your Code

Any .py file works, but you can format your files to leverage the notebook functionality of Databricks.

To use a notebook, put # Databricks notebook source at the top of your .py file. This tells Databricks to treat the file as a notebook. You can then use the following commands to control the notebook:

  • # COMMAND ---------- creates a new cell.
  • # MAGIC %md creates a markdown cell.
  • # MAGIC %sql creates a SQL cell.
  • # MAGIC %scala creates a Scala cell.
  • # MAGIC %r creates an R cell.
  • # MAGIC %python creates a Python cell.
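Putting these directives together, a minimal notebook-format file might look like the following sketch (the cell contents are illustrative only):

```python
# Databricks notebook source
# MAGIC %md
# MAGIC # Example notebook
# MAGIC This markdown cell renders as formatted text in Databricks.

# COMMAND ----------

# A plain Python cell: this code runs on the attached cluster.
values = [1, 2, 3, 4]
total = sum(values)
print(total)
```

Because every directive is an ordinary Python comment, the same file also runs as a plain script outside Databricks.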

NOTE: You should ensure that any libraries you import are installed on the cluster. You can do this by including the pip install command in your code or by installing the libraries on the cluster itself.

Run Local Code on a Cluster

  1. Open the Explorer menu in the left navigation bar.
  2. Navigate to the file you want to run. You can use any type of file that you can run in Databricks (R, Python, etc.).
  3. Ensure the cluster is started, then right-click the file and select Run File as Workflow in Databricks, as shown in the screenshot below.
  4. The file will run on the cluster. You can see the results in the Output window.

NOTE: Your code is copied to Databricks under the .ide folder in your workspace.

Last Updated: 2026-04-13, 11:39 a.m.