Deployment

1. Project Overview

Purpose and Scope:
This deployment setup provisions a containerized MQTT server and subscriber application for Freeport FMI across development, pre-production, and production environments. Docker and Supervisor run the Mosquitto MQTT broker alongside a Python-based subscriber that bridges MQTT to Google Cloud Pub/Sub and BigQuery, and the stack is deployed to the GCP Flexible Environment using environment-specific configurations.

Primary Use Cases:

  • Reliable ingestion of sensor data via MQTT.
  • Forwarding of sensor data to cloud analytics and event processing pipelines.
  • Environment-specific deployments (DEV, PPRD, PRD) with tailored resources and credentials.

Explicitly Not Handled:

  • Automated scaling beyond a single instance per environment.
  • Secure, production-grade MQTT (no TLS/mTLS or authentication in current config).
  • Downstream analytics or visualization (handled elsewhere).

2. System Architecture

Core Components and Responsibilities:

  • Docker Container: Bundles Mosquitto broker, Python subscriber, and Supervisor for process management.
  • Mosquitto Broker: Handles MQTT connections and message routing.
  • Python Subscriber: Consumes MQTT messages, publishes to Pub/Sub, and writes to BigQuery.
  • Supervisor: Ensures both Mosquitto and the subscriber run reliably within the container.
  • GCP Flexible Environment: Hosts the container and manages networking, instance lifecycle, and health checks (each environment is limited to a single instance).

Data and Control Flow:

  1. MQTT clients publish sensor data to the Mosquitto broker (port 1883).
  2. Python subscriber listens to MQTT topics, processes messages, and forwards them to Pub/Sub and BigQuery.
  3. GCP manages container lifecycle, health checks, and network routing.
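
The flow above can be sketched in Python as follows. This is a minimal illustration, not the contents of mqtt_subscriber.py: the function name handle_message, the JSON payload format, the row fields, and the two injected callables (standing in for a Pub/Sub publish call and a BigQuery row insert) are all assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable, Dict


def handle_message(
    topic: str,
    payload: bytes,
    publish: Callable[[bytes], None],
    insert_row: Callable[[Dict[str, Any]], None],
) -> bool:
    """Forward one MQTT message to both sinks; return False if it is dropped.

    `publish` and `insert_row` are hypothetical stand-ins for the Pub/Sub and
    BigQuery clients, injected so the flow can be exercised without GCP.
    """
    try:
        # Assumption: sensors send UTF-8 encoded JSON objects.
        reading = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False  # malformed payloads are skipped, not forwarded
    if not isinstance(reading, dict):
        return False

    row = {
        "topic": topic,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "reading": reading,
    }
    publish(json.dumps(row).encode("utf-8"))  # forward to the event stream
    insert_row(row)                           # forward to storage
    return True
```

Passing the sinks in as callables keeps the processing step testable without network access; in a real deployment they would wrap the corresponding GCP client calls.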

External Services and Dependencies:

  • Google Cloud Pub/Sub (for event streaming).
  • Google BigQuery (for data storage).
  • GCP VPC and IAM (for secure networking and service account access).

3. Core Concepts & Domain Logic

Key Abstractions and Domain Terms:

  • Environment (DEV/PPRD/PRD): Drives configuration, credentials, and resource allocation.
  • Supervisor: Manages multiple processes inside the container.
  • Service Account: Used for secure access to GCP APIs.

Business/Technical Invariants:

  • Each environment uses its own GCP project, service account, and VPC connector.
  • Only one instance runs per environment (no horizontal scaling).
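
These invariants can be checked mechanically, for example in a pre-deployment step. The sketch below is an assumption about how such a check might look; the EnvConfig class, the check_invariants function, and every identifier in ENVS are placeholders, not values taken from dev.yml, pprd.yml, or prd.yml.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EnvConfig:
    project_id: str
    service_account: str
    vpc_connector: str
    max_instances: int = 1


def check_invariants(envs: Dict[str, EnvConfig]) -> List[str]:
    """Return a list of violations; an empty list means the invariants hold."""
    problems = []
    # Invariant 1: no project, service account, or connector is shared.
    for field in ("project_id", "service_account", "vpc_connector"):
        values = [getattr(cfg, field) for cfg in envs.values()]
        if len(set(values)) != len(values):
            problems.append(f"{field} is shared between environments")
    # Invariant 2: exactly one instance per environment.
    for name, cfg in envs.items():
        if cfg.max_instances != 1:
            problems.append(f"{name} must run exactly one instance")
    return problems


# Placeholder values only; the real identifiers live in the environment YAMLs.
ENVS = {
    "DEV": EnvConfig("example-dev", "sa-dev@example-dev.iam.gserviceaccount.com", "connector-dev"),
    "PPRD": EnvConfig("example-pprd", "sa-pprd@example-pprd.iam.gserviceaccount.com", "connector-pprd"),
    "PRD": EnvConfig("example-prd", "sa-prd@example-prd.iam.gserviceaccount.com", "connector-prd"),
}
```

Calling check_invariants(ENVS) on the placeholder data returns an empty list; reusing a project ID across two environments or raising max_instances would produce one message per violation.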

Mental Model:

  • Each deployment is a self-contained MQTT ingestion node, tightly integrated with GCP services and isolated per environment.

4. Codebase Structure

High-level Layout:

  • mqtt_subscriber.py: Python application for MQTT-to-cloud bridging.
  • mosquitto.conf: Mosquitto broker configuration.
  • supervisord.conf: Supervisor configuration for process management.
  • requirements.txt: Python dependencies.
  • Dockerfile: Container build instructions.
  • Environment-specific YAMLs (dev.yml, pprd.yml, prd.yml): GCP deployment descriptors.
  • CI/CD YAML (project-ci.yml): Pipeline and deployment automation.

Responsibility Boundaries:

  • Application logic (Python) vs. infrastructure (Docker, Mosquitto, Supervisor).
  • Environment-specific configuration is isolated in separate YAML files.

What Changes Together:

  • Application code and requirements.
  • Mosquitto and Supervisor configs.
  • Environment YAMLs and CI/CD pipeline.

5. Configuration & Environment

Environment Variables:

  • Set in deployment YAMLs (e.g., APP_ENV, GCP_PROJECT_ID, MQTT_BROKER, etc.).
  • Control application behavior and cloud integration.
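
A minimal sketch of how the subscriber might read these variables at startup is shown below. APP_ENV, GCP_PROJECT_ID, and MQTT_BROKER are named in this document; the Settings class, the load_settings function, the MQTT_PORT variable, and the assumption that APP_ENV takes the values DEV, PPRD, or PRD are illustrative only.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: APP_ENV carries one of the three environment names.
VALID_ENVS = ("DEV", "PPRD", "PRD")


@dataclass(frozen=True)
class Settings:
    app_env: str
    gcp_project_id: str
    mqtt_broker: str
    mqtt_port: int


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on bad input."""
    required = ("APP_ENV", "GCP_PROJECT_ID", "MQTT_BROKER")
    missing = [key for key in required if not environ.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))

    app_env = environ["APP_ENV"].upper()
    if app_env not in VALID_ENVS:
        raise RuntimeError(f"APP_ENV must be one of {VALID_ENVS}, got {app_env!r}")

    return Settings(
        app_env=app_env,
        gcp_project_id=environ["GCP_PROJECT_ID"],
        mqtt_broker=environ["MQTT_BROKER"],
        # MQTT_PORT is a hypothetical variable; 1883 matches the broker port.
        mqtt_port=int(environ.get("MQTT_PORT", "1883")),
    )
```

Failing at startup when a variable is missing makes a misconfigured deployment visible immediately instead of at the first message.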

Configuration Files:

  • mosquitto.conf: Broker settings.
  • supervisord.conf: Process management.
  • requirements.txt: Python dependencies.

Differences Between DEV, PPRD, and PRD:

  • Each environment uses a different GCP project, service account, VPC connector, and resource allocation.
  • Production (prd.yml) uses more CPU/memory than DEV/PPRD.

6. Runtime Behavior

Startup Sequence:

  • The container starts and Supervisor launches Mosquitto and the Python subscriber.
  • Mosquitto listens on port 1883.
  • Subscriber connects to broker and cloud services.
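
Because Supervisor launches both processes, the subscriber may come up before Mosquitto is accepting connections. One way to handle this, sketched here under the assumption that the subscriber does its own waiting, is to probe the broker port with exponential backoff before connecting. The function name wait_for_broker and its parameters are hypothetical.

```python
import socket
import time


def wait_for_broker(host: str, port: int = 1883, attempts: int = 5,
                    base_delay: float = 0.5) -> bool:
    """Return True once a TCP connection to the broker succeeds.

    Retries with exponential backoff (base_delay, then 2x, 4x, ...) and
    returns False if the port is still unreachable after `attempts` tries.
    """
    for attempt in range(attempts):
        try:
            # A successful connect only shows that the port is open; the
            # MQTT handshake itself is left to the MQTT client library.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    return False
```

A subscriber could call wait_for_broker("localhost") at startup and exit with a non-zero status on False, leaving the restart to Supervisor.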

Normal Execution Flow:

  • MQTT messages are received, processed, and forwarded to Pub/Sub and BigQuery.
  • Supervisor restarts processes if they fail.

Error Handling and Logging Strategy:

  • Application logs to stdout (captured by GCP logging).
  • Mosquitto logs according to the settings in mosquitto.conf.
  • Liveness/readiness checks ensure unhealthy containers are restarted.
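
Logging to stdout needs only the standard library. The sketch below shows one possible setup; the logger name "mqtt_subscriber", the configure_logging function, and the log format are assumptions rather than the application's actual configuration.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes one line per record to stdout."""
    logger = logging.getLogger("mqtt_subscriber")
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:   # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Typical use would be log = configure_logging() once at startup, followed by calls such as log.info("connected to broker") or log.exception("forwarding failed") inside the message handler.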

7. Deployment & Operations

Build Process:

  • Docker image is built from the Dockerfile (Python, Mosquitto, Supervisor, app code).

Deployment Method:

  • CI/CD pipeline (project-ci.yml) deploys the container to GCP Flexible Environment using environment-specific YAMLs.
  • Each deployment uses a dedicated service account and VPC connector.

Runtime Dependencies:

  • GCP credentials (service account key or workload identity).
  • Network access to GCP APIs.

Scaling and Rollback Considerations:

  • Only one instance per environment (no auto-scaling).
  • Rollback by redeploying a previous image/tag.

8. Extending the System

Where and How to Add New Features:

  • Update Python code for new message handling or cloud integrations.
  • Adjust Mosquitto or Supervisor configs as needed.
  • Add environment variables to YAMLs for new configuration options.

Recommended Patterns:

  • Use environment variables for all configuration.
  • Keep environment YAMLs in sync with code/config changes.
  • Test changes in DEV before promoting to PPRD/PRD.

Anti-patterns and Risk Areas:

  • Hardcoding secrets or credentials.
  • Making breaking changes to topic structure or message schema without coordination.
  • Running multiple instances without unique MQTT client IDs.
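
The client ID risk exists because an MQTT broker disconnects an existing session when a second client connects with the same ID. If more than one subscriber instance is ever run, each needs its own ID. A minimal sketch follows; the function name make_client_id and the "sub" prefix are hypothetical. The result is kept to 23 alphanumeric characters, the range that MQTT 3.1.1 requires every broker to accept.

```python
import uuid


def make_client_id(env: str, prefix: str = "sub") -> str:
    """Build a unique client ID of at most 23 ASCII alphanumeric characters."""
    # Readable head, e.g. "subdev", stripped of anything non-alphanumeric.
    raw = f"{prefix}{env.lower()}"
    head = "".join(ch for ch in raw if ch.isascii() and ch.isalnum())[:11]
    # 12 random hex characters make collisions between instances unlikely.
    suffix = uuid.uuid4().hex[:12]
    return head + suffix
```

Each call returns a different value, so the ID should be generated once per process and reused for reconnects.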

Testing Strategy:

  • Unit/integration tests for Python code.
  • Deploy to DEV and validate end-to-end message flow before promoting.
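
As an illustration of the unit-test level, the sketch below tests a small topic-parsing helper with the standard unittest module. The helper parse_topic and the topic layout site/<site>/sensor/<id> are invented for this example; the real topic structure is defined by the application and its publishers.

```python
import unittest


def parse_topic(topic: str) -> dict:
    """Split a topic of the hypothetical form 'site/<site>/sensor/<id>'."""
    parts = topic.split("/")
    valid = (
        len(parts) == 4
        and parts[0] == "site"
        and parts[2] == "sensor"
        and parts[1] != ""
        and parts[3] != ""
    )
    if not valid:
        raise ValueError(f"unexpected topic: {topic!r}")
    return {"site": parts[1], "sensor_id": parts[3]}


class ParseTopicTest(unittest.TestCase):
    def test_valid_topic(self):
        self.assertEqual(
            parse_topic("site/a1/sensor/t42"),
            {"site": "a1", "sensor_id": "t42"},
        )

    def test_rejects_unexpected_layout(self):
        with self.assertRaises(ValueError):
            parse_topic("sensors/t42")
```

Saved in a test module, this runs with python -m unittest. Tests of this kind also catch uncoordinated changes to the topic structure before they reach DEV.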

9. Security & Compliance

Authentication and Authorization:

  • GCP access is secured via service accounts.
  • MQTT broker is not secured (no TLS or authentication); this must be addressed before production use.

Secrets Handling:

  • No secrets in code; all sensitive data should be supplied through environment variables or GCP Secret Manager.

Data Sensitivity Considerations:

  • Sensor data may be sensitive; ensure GCP IAM and network policies are enforced.

10. Common Pitfalls & Gotchas

  • No MQTT Security: Broker is open on 1883 with no authentication/TLS; not suitable for production as-is.
  • Single Instance: No horizontal scaling; may be a bottleneck for high-throughput scenarios.
  • Resource Limits: Production uses more CPU and memory, but disk is fixed at 10 GB; monitor disk usage for growth.
  • Environment Drift: YAMLs must be kept in sync with code/config changes.
  • Supervisor: If Supervisor or one of its managed processes fails to start, the container will not function correctly.
  • Health Checks: Liveness/readiness checks are critical for auto-recovery but may need tuning for startup delays.