Deployment

1. Project Overview

Purpose and Scope:
This deployment setup provisions a containerized MQTT server and subscriber application for Freeport FMI across development, pre-production, and production environments. Docker and Supervisor orchestrate the Mosquitto MQTT broker and a Python-based subscriber, which bridges MQTT to Google Cloud Pub/Sub and BigQuery. The stack is deployed to the Google Cloud Platform (GCP) Flexible Environment using environment-specific configurations.

Primary Use Cases:

Reliable ingestion of sensor data via MQTT.
Forwarding of sensor data to cloud analytics and event processing pipelines.
Environment-specific deployments (DEV, PPRD, PRD) with tailored resources and credentials.

Explicitly Not Handled:

Automated scaling beyond a single instance per environment.
Secure, production-grade MQTT (no TLS/mTLS or authentication in current config).
Downstream analytics or visualization (handled elsewhere).

2. System Architecture

Core Components and Responsibilities:

Docker Container: Bundles Mosquitto broker, Python subscriber, and Supervisor for process management.
Mosquitto Broker: Handles MQTT connections and message routing.
Python Subscriber: Consumes MQTT messages, publishes to Pub/Sub, and writes to BigQuery.
Supervisor: Ensures both Mosquitto and the subscriber run reliably within the container.
GCP Flexible Environment: Hosts the container and manages networking and health checks; scaling is fixed at one instance per environment.

Data and Control Flow:

MQTT clients publish sensor data to Mosquitto broker (port 1883).
Python subscriber listens to MQTT topics, processes messages, and forwards them to Pub/Sub and BigQuery.
GCP manages container lifecycle, health checks, and network routing.
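
The forwarding step in this flow can be sketched as follows. This is a minimal illustration, not the contents of mqtt_subscriber.py: the function names, the record schema, and the assumption that payloads are JSON are all hypothetical, and the publish and insert_row callables stand in for the real Pub/Sub and BigQuery client calls.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Optional


def build_record(topic: str, payload: bytes,
                 received_at: Optional[datetime] = None) -> Dict[str, Any]:
    """Decode one MQTT payload into a flat record (hypothetical schema)."""
    received_at = received_at or datetime.now(timezone.utc)
    data = json.loads(payload.decode("utf-8"))  # assumption: payloads are JSON
    return {
        "topic": topic,
        "received_at": received_at.isoformat(),
        "payload": data,
    }


def forward(topic: str, payload: bytes,
            publish: Callable[[bytes], Any],
            insert_row: Callable[[Dict[str, Any]], Any]) -> bool:
    """Send one message to both sinks; return False if the payload is rejected."""
    try:
        record = build_record(topic, payload)
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON; dropped in this sketch.
        return False
    publish(json.dumps(record).encode("utf-8"))  # stand-in for a Pub/Sub publish
    insert_row(record)                           # stand-in for a BigQuery insert
    return True
```

Passing the two sinks in as callables keeps the message handling testable without GCP credentials; in a test they can simply be the append methods of two lists.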

External Services and Dependencies:

Google Cloud Pub/Sub (for event streaming).
Google BigQuery (for data storage).
GCP VPC and IAM (for secure networking and service account access).

3. Core Concepts & Domain Logic

Key Abstractions and Domain Terms:

Environment (DEV/PPRD/PRD): Drives configuration, credentials, and resource allocation.
Supervisor: Manages multiple processes inside the container.
Service Account: Used for secure access to GCP APIs.

Business/Technical Invariants:

Each environment uses its own GCP project, service account, and VPC connector.
Only one instance runs per environment (no horizontal scaling).

Mental Model:

Each deployment is a self-contained MQTT ingestion node, tightly integrated with GCP services and isolated per environment.

4. Codebase Structure

High-level Layout:

mqtt_subscriber.py: Python application for MQTT-to-cloud bridging.
mosquitto.conf: Mosquitto broker configuration.
supervisord.conf: Supervisor configuration for process management.
requirements.txt: Python dependencies.
Dockerfile: Container build instructions.
Environment-specific YAMLs (dev.yml, pprd.yml, prd.yml): GCP deployment descriptors.
CI/CD YAML (project-ci.yml): Pipeline and deployment automation.

Responsibility Boundaries:

Application logic (Python) vs. infrastructure (Docker, Mosquitto, Supervisor).
Environment-specific configuration is isolated in separate YAML files.

What Changes Together:

Application code and requirements.
Mosquitto and Supervisor configs.
Environment YAMLs and CI/CD pipeline.

5. Configuration & Environment

Environment Variables:

Set in the deployment YAMLs (e.g., APP_ENV, GCP_PROJECT_ID, MQTT_BROKER).
Control application behavior and cloud integration.
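
One way the subscriber could read these variables at startup is sketched below. The Settings class, the load_settings function, the MQTT_PORT variable, and the assumption that APP_ENV takes the values DEV, PPRD, or PRD are illustrative; only the names APP_ENV, GCP_PROJECT_ID, and MQTT_BROKER come from the deployment YAMLs.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_ENVS = ("DEV", "PPRD", "PRD")  # assumption: APP_ENV uses these values
REQUIRED = ("APP_ENV", "GCP_PROJECT_ID", "MQTT_BROKER")


@dataclass(frozen=True)
class Settings:
    app_env: str
    gcp_project_id: str
    mqtt_broker: str
    mqtt_port: int = 1883


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on gaps."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    app_env = environ["APP_ENV"].upper()
    if app_env not in VALID_ENVS:
        raise RuntimeError(f"APP_ENV must be one of {VALID_ENVS}, got {app_env!r}")
    return Settings(
        app_env=app_env,
        gcp_project_id=environ["GCP_PROJECT_ID"],
        mqtt_broker=environ["MQTT_BROKER"],
        # MQTT_PORT is a hypothetical optional variable; 1883 is the broker port.
        mqtt_port=int(environ.get("MQTT_PORT", "1883")),
    )
```

Failing at startup on a missing variable surfaces a misconfigured YAML immediately, rather than at the first message.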

Configuration Files:

mosquitto.conf: Broker settings.
supervisord.conf: Process management.
requirements.txt: Python dependencies.

Differences Between DEV, PPRD, and PRD:

Each environment uses a different GCP project, service account, VPC connector, and resource allocation.
Production (prd.yml) uses more CPU/memory than DEV/PPRD.

6. Runtime Behavior

Startup Sequence:

The container starts and Supervisor launches Mosquitto and the Python subscriber.
Mosquitto listens on port 1883.
Subscriber connects to broker and cloud services.
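
If the subscriber can start before Mosquitto is accepting connections, it may help to wait for the broker port before connecting. The helper below is a sketch under that assumption; wait_for_broker and its parameters are hypothetical and use only the standard library.

```python
import socket
import time


def wait_for_broker(host: str, port: int,
                    timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller could run wait_for_broker("localhost", 1883) before creating the MQTT client and exit with a non-zero status on False, leaving the restart to Supervisor.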

Normal Execution Flow:

MQTT messages are received, processed, and forwarded to cloud.
Supervisor restarts processes if they fail.

Error Handling and Logging Strategy:

Application logs to stdout (captured by GCP logging).
Mosquitto logs according to the settings in mosquitto.conf.
Liveness/readiness checks ensure unhealthy containers are restarted.
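
How the container answers these checks is not described here. As a sketch only, an HTTP endpoint for them could look like the following; the /liveness and /readiness paths, port 8080, and the make_health_server function are assumptions and must match whatever the environment YAMLs actually configure.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_health_server(is_ready: Callable[[], bool],
                       host: str = "0.0.0.0", port: int = 8080) -> HTTPServer:
    """Build an HTTP server exposing hypothetical liveness/readiness paths."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/liveness":
                status = 200  # the process is up and serving requests
            elif self.path == "/readiness":
                status = 200 if is_ready() else 503
            else:
                status = 404
            body = b"ok" if status == 200 else b"unavailable"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep health probes out of the application log

    return HTTPServer((host, port), HealthHandler)
```

The is_ready callable is where a real check would go, for example whether the subscriber currently holds a connection to the broker. The server would typically run in a background thread via serve_forever().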

7. Deployment & Operations

Build Process:

Docker image is built from the Dockerfile (Python, Mosquitto, Supervisor, app code).

Deployment Method:

CI/CD pipeline (project-ci.yml) deploys the container to GCP Flexible Environment using environment-specific YAMLs.
Each deployment uses a dedicated service account and VPC connector.

Runtime Dependencies:

GCP credentials (service account key or workload identity).
Network access to GCP APIs.

Scaling and Rollback Considerations:

Only one instance per environment (no auto-scaling).
Rollback by redeploying a previous image/tag.

8. Extending the System

Where and How to Add New Features:

Update Python code for new message handling or cloud integrations.
Adjust Mosquitto or Supervisor configs as needed.
Add environment variables to YAMLs for new configuration options.

Recommended Patterns:

Use environment variables for all configuration.
Keep environment YAMLs in sync with code/config changes.
Test changes in DEV before promoting to PPRD/PRD.

Anti-patterns and Risk Areas:

Hardcoding secrets or credentials.
Making breaking changes to topic structure or message schema without coordination.
Running multiple instances without unique MQTT client IDs.
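
On the last point: a broker treats a second connection with an already connected client ID as a takeover and drops the first, so each instance needs its own ID. A minimal sketch of generating one is shown below; make_client_id and the "subscriber" prefix are hypothetical.

```python
import socket
import uuid


def make_client_id(app_env: str, prefix: str = "subscriber") -> str:
    """Build a client ID that is unique per process start."""
    suffix = uuid.uuid4().hex[:8]  # random part, differs on every call
    return f"{prefix}-{app_env.lower()}-{socket.gethostname()}-{suffix}"
```

The trade-off is that a random suffix changes on every restart, so a persistent MQTT session tied to the previous client ID would not be resumed. If persistent sessions matter, a stable per-instance identifier is the better choice.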

Testing Strategy:

Unit/integration tests for Python code.
Deploy to DEV and validate end-to-end message flow before promoting.

9. Security & Compliance

Authentication and Authorization:

GCP access is secured via service accounts.
MQTT broker is not secured (no TLS or authentication); this must be addressed before production use.

Secrets Handling:

No secrets in code; all sensitive data should be managed via environment variables or GCP Secret Manager.

Data Sensitivity Considerations:

Sensor data may be sensitive; ensure GCP IAM and network policies are enforced.

10. Common Pitfalls & Gotchas

No MQTT Security: Broker is open on 1883 with no authentication/TLS; not suitable for production as-is.
Single Instance: No horizontal scaling; may be a bottleneck for high-throughput scenarios.
Resource Limits: Production is allocated more CPU and memory, but disk is fixed at 10 GB; monitor disk usage for growth.
Environment Drift: YAMLs must be kept in sync with code/config changes.
Supervisor: If Supervisor or one of its managed processes fails to start, the container will not function correctly.
Health Checks: Liveness/readiness checks are critical for auto-recovery but may need tuning for startup delays.
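
Environment drift in particular can be caught mechanically. The sketch below assumes the environment-variable blocks of dev.yml, pprd.yml, and prd.yml have already been loaded into dictionaries (for example with a YAML parser); find_drift and the NEW_FLAG variable in the usage note are hypothetical.

```python
from typing import Dict, Mapping, Set


def find_drift(env_vars: Mapping[str, Mapping[str, str]]) -> Dict[str, Set[str]]:
    """Return, per environment, the variable names defined elsewhere but missing here."""
    all_keys: Set[str] = set()
    for values in env_vars.values():
        all_keys |= set(values)
    drift: Dict[str, Set[str]] = {}
    for name, values in env_vars.items():
        missing = all_keys - set(values)
        if missing:
            drift[name] = missing
    return drift
```

For example, if dev defines APP_ENV and NEW_FLAG while pprd and prd define only APP_ENV, the function reports NEW_FLAG as missing from pprd and prd. Only variable names are compared, since values are expected to differ per environment. Such a check could run in the CI/CD pipeline before deployment.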
