Google BigQuery

The recommended connector library for BigQuery is pybigquery.

Install BigQuery Driver

Follow the steps for installing new database drivers when setting up Ferris locally via docker-compose. For BigQuery, add pybigquery to the local requirements file:

echo "pybigquery" >> ./docker/requirements-local.txt

Connecting to BigQuery

When adding a new BigQuery connection in Ferris, you’ll need to add the GCP Service Account credentials file (as a JSON file).

  1. Create your Service Account via the Google Cloud Platform control panel, provide it access to the appropriate BigQuery datasets, and download the JSON configuration file for the service account.
  2. In Ferris you can either upload that JSON or add the JSON blob in the following format (this should be the content of your credential JSON file):
    {
        "type": "service_account",
        "project_id": "...",
        "private_key_id": "...",
        "private_key": "...",
        "client_email": "...",
        "client_id": "...",
        "auth_uri": "...",
        "token_uri": "...",
        "auth_provider_x509_cert_url": "...",
        "client_x509_cert_url": "..."
    }
  3. Alternatively, you can connect via a SQLAlchemy URI.

    The connection string for BigQuery looks like:

    bigquery://{project_id}
    

    Go to the Advanced tab and add a JSON blob to the Secure Extra field in the database configuration form, in the following format:

    {
    "credentials_info": <contents of credentials JSON file>
    }
    

    The resulting JSON blob should have this structure:

    {
        "credentials_info": {
            "type": "service_account",
            "project_id": "...",
            "private_key_id": "...",
            "private_key": "...",
            "client_email": "...",
            "client_id": "...",
            "auth_uri": "...",
            "token_uri": "...",
            "auth_provider_x509_cert_url": "...",
            "client_x509_cert_url": "..."
        }
    }
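
If you would rather generate the connection string and the Secure Extra blob than assemble them by hand, the steps above can be sketched in Python. This is a minimal sketch, not part of Ferris: the helper names `build_uri` and `build_secure_extra` are hypothetical, the key check only mirrors the fields shown in the sample credentials above, and the placeholder values stand in for the contents of your own credentials file.

```python
import json

# Keys shown in the sample service account JSON in this guide.
# Treating all of them as required is an assumption of this sketch.
EXPECTED_KEYS = {
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
}


def build_uri(credentials: dict) -> str:
    """Return the SQLAlchemy URI in the bigquery://{project_id} form."""
    return f"bigquery://{credentials['project_id']}"


def build_secure_extra(credentials: dict) -> str:
    """Wrap the credentials under "credentials_info" for the Secure Extra field."""
    missing = EXPECTED_KEYS - set(credentials)
    if missing:
        raise ValueError(f"credentials missing keys: {sorted(missing)}")
    return json.dumps({"credentials_info": credentials}, indent=4)


# Placeholder credentials; in practice, load your downloaded file with json.load().
sample = {key: "..." for key in EXPECTED_KEYS}
sample["type"] = "service_account"

print(build_uri(sample))           # paste into the SQLAlchemy URI field
print(build_secure_extra(sample))  # paste into the Secure Extra field
```

The printed JSON contains your private key, so avoid writing it to logs or shared terminals.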
    

You should then be able to connect to your BigQuery datasets.

To upload CSV or Excel files to BigQuery in Ferris, you’ll also need to add the pandas_gbq library.
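
To confirm that both libraries ended up in the environment, a quick check like the following can be run with the Python interpreter that Ferris uses. This is only a sketch: the helper `missing_packages` is hypothetical, and it assumes the import names match the library names used in this guide (`pybigquery` and `pandas_gbq`).

```python
from importlib.util import find_spec


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


# Import names are assumed to match the libraries named in this guide.
missing = missing_packages(["pybigquery", "pandas_gbq"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("BigQuery driver and upload dependencies found.")
```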


Last modified November 24, 2023: update (f4bc5ea)