This guide walks you through setting up the Gemini AI Column Description Generator in a new Google Cloud Platform (GCP) project. This tool automatically generates descriptions for database columns using Google's Gemini AI model integrated with BigQuery ML.
The following APIs need to be activated
The service account or user running the script needs the following roles:
roles/bigquery.dataEditor)roles/bigquery.jobUser)roles/bigquery.connectionUser)roles/aiplatform.user)roles/ml.developer)# Replace [USER_EMAIL] with the actual email
export USER_EMAIL="your-email@domain.com"
export PROJECT_ID="your-project-id"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="user:$USER_EMAIL" \
--role="roles/bigquery.dataEditor"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="user:$USER_EMAIL" \
--role="roles/bigquery.jobUser"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="user:$USER_EMAIL" \
--role="roles/bigquery.connectionUser"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="user:$USER_EMAIL" \
--role="roles/aiplatform.user"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="user:$USER_EMAIL" \
--role="roles/ml.developer"
Create a dataset to hold your source tables (if not already existing):
CREATE SCHEMA IF NOT EXISTS `your-project-id.DM`
OPTIONS (
description = "Data Mart - Source tables for analysis",
location = "US" -- Choose appropriate location
);
Create a dataset for storing the AI-generated descriptions:
CREATE SCHEMA IF NOT EXISTS `your-project-id.TEST_GEMINI`
OPTIONS (
description = "Gemini AI generated column descriptions",
location = "US" -- Must match source dataset location
);
Via BigQuery Console:
Click + Add data, and then use the search bar for data sources to search for Vertex AI. Click on the result for Vertex AI then click Vertex AI Models: BigQuery Federation.
Select Connection type as Vertex AI remote models, remote functions and BigLake (Cloud Resource) if not already selected and set Connection ID to vertex-ai-connection.
After creating the connection, note the service account email that's automatically created. You'll need to grant it permissions.
-- Get connection details
SELECT * FROM `your-project-id.us-central1.INFORMATION_SCHEMA.CONNECTIONS`
WHERE connection_name = 'vertex-ai-connection';
The connection service account needs access to Vertex AI:
# Get the service account email from the connection details above
export CONNECTION_SA="service-account-email@gcp-sa-bigquery-condel.iam.gserviceaccount.com"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$CONNECTION_SA" \
--role="roles/aiplatform.user"
CREATE OR REPLACE MODEL `your-project-id.YOUR_DATASET.gemini_remote_model`
REMOTE WITH CONNECTION `your-project-id.europe-west1.vertex-ai-connection`
OPTIONS (
ENDPOINT = 'gemini-2.0-flash'
);
-- Test the model with a simple query
SELECT ml_generate_text_llm_result
FROM ML.GENERATE_TEXT(
MODEL `your-project-id.TEST_GEMINI.gemini_remote_model`,
(SELECT "Test connection to Gemini AI" AS prompt),
STRUCT(TRUE AS flatten_json_output)
);
In the main script, update these variables with your project details:
-- Update these variables in the script
DECLARE dataset_name STRING DEFAULT 'your-project-id.DM';
DECLARE output_dataset STRING DEFAULT 'TEST_GEMINI';
DECLARE model_name STRING DEFAULT 'TEST_GEMINI.gemini_remote_model';
DECLARE sample_size INT64 DEFAULT 5;
-- Verify the output table was created
SELECT
table_name,
COUNT(*) as column_count
FROM `your-project-id.TEST_GEMINI.column_descriptions_with_ai`
GROUP BY table_name;
-- View sample results
SELECT * FROM `your-project-id.TEST_GEMINI.column_descriptions_with_ai` LIMIT 5;
"Permission denied" errors
"Model not found" errors
"No tables found" errors
Quota exceeded errors
Once installed, you can:
For issues specific to: