Deployment
1. Project Overview
Purpose and Scope:
This deployment setup provisions a containerized MQTT server and subscriber application for Freeport FMI, supporting development, pre-production, and production environments. It runs the Mosquitto MQTT broker and a Python-based subscriber (which bridges MQTT to Google Cloud Pub/Sub and BigQuery) in a single Docker container managed by Supervisor, and deploys that container to the GCP Flexible Environment with a separate configuration per environment.
Primary Use Cases:
- Reliable ingestion of sensor data via MQTT.
- Forwarding of sensor data to cloud analytics and event processing pipelines.
- Environment-specific deployments (DEV, PPRD, PRD) with tailored resources and credentials.
Explicitly Not Handled:
- Automated scaling beyond a single instance per environment.
- Secure, production-grade MQTT (no TLS/mTLS or authentication in current config).
- Downstream analytics or visualization (handled elsewhere).
2. System Architecture
Core Components and Responsibilities:
- Docker Container: Bundles Mosquitto broker, Python subscriber, and Supervisor for process management.
- Mosquitto Broker: Handles MQTT connections and message routing.
- Python Subscriber: Consumes MQTT messages, publishes to Pub/Sub, and writes to BigQuery.
- Supervisor: Ensures both Mosquitto and the subscriber run reliably within the container.
- GCP Flexible Environment: Hosts the container and manages networking, instance count (fixed at one per environment), and health checks.
Data and Control Flow:
- MQTT clients publish sensor data to the Mosquitto broker (port 1883).
- Python subscriber listens to MQTT topics, processes messages, and forwards them to Pub/Sub and BigQuery.
- GCP manages container lifecycle, health checks, and network routing.
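The forwarding step in this flow can be sketched as follows. This is a minimal illustration, not the actual subscriber code: it assumes payloads are UTF-8 JSON objects (the message format is not specified here), and `handle_message`, `publish`, and `insert_row` are hypothetical names. The two callables stand in for whatever Pub/Sub and BigQuery client calls the real subscriber uses.

```python
import json
from typing import Any, Callable, Mapping


def handle_message(
    topic: str,
    payload: bytes,
    publish: Callable[[bytes], None],
    insert_row: Callable[[Mapping[str, Any]], None],
) -> bool:
    """Forward one MQTT message to both sinks; return False for malformed payloads."""
    try:
        reading = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False  # drop malformed payloads instead of crashing the subscriber
    if not isinstance(reading, dict):
        return False  # assumption: every valid reading is a JSON object
    row = {"topic": topic, **reading}
    publish(json.dumps(row).encode("utf-8"))  # stand-in for a Pub/Sub publish call
    insert_row(row)  # stand-in for a BigQuery row insert
    return True
```

Passing the sinks in as callables keeps the routing logic testable without GCP credentials; for example, `handle_message("sensors/pump1", b'{"temp_c": 21.5}', published.append, rows.append)` collects the output in two plain lists.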
External Services and Dependencies:
- Google Cloud Pub/Sub (for event streaming).
- Google BigQuery (for data storage).
- GCP VPC and IAM (for secure networking and service account access).
3. Core Concepts & Domain Logic
Key Abstractions and Domain Terms:
- Environment (DEV/PPRD/PRD): Drives configuration, credentials, and resource allocation.
- Supervisor: Manages multiple processes inside the container.
- Service Account: Used for secure access to GCP APIs.
Business/Technical Invariants:
- Each environment uses its own GCP project, service account, and VPC connector.
- Only one instance runs per environment (no horizontal scaling).
Mental Model:
- Each deployment is a self-contained MQTT ingestion node, tightly integrated with GCP services and isolated per environment.
4. Codebase Structure
High-level Layout:
- mqtt_subscriber.py: Python application for MQTT-to-cloud bridging.
- mosquitto.conf: Mosquitto broker configuration.
- supervisord.conf: Supervisor configuration for process management.
- requirements.txt: Python dependencies.
- Dockerfile: Container build instructions.
- Environment-specific YAMLs (dev.yml, pprd.yml, prd.yml): GCP deployment descriptors.
- CI/CD YAML (project-ci.yml): Pipeline and deployment automation.
Responsibility Boundaries:
- Application logic (Python) vs. infrastructure (Docker, Mosquitto, Supervisor).
- Environment-specific configuration is isolated in separate YAML files.
What Changes Together:
- Application code and requirements.
- Mosquitto and Supervisor configs.
- Environment YAMLs and CI/CD pipeline.
5. Configuration & Environment
Environment Variables:
- Set in deployment YAMLs (e.g., APP_ENV, GCP_PROJECT_ID, MQTT_BROKER).
- Control application behavior and cloud integration.
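One way the subscriber might read these variables at startup is sketched below. Only `APP_ENV`, `GCP_PROJECT_ID`, and `MQTT_BROKER` come from the deployment YAMLs named above; the `Settings` class, the `load_settings` function, the `MQTT_PORT` variable, the uppercase `DEV`/`PPRD`/`PRD` values, and the `localhost` default are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

VALID_ENVS = ("DEV", "PPRD", "PRD")  # assumed spelling of the APP_ENV values


@dataclass(frozen=True)
class Settings:
    app_env: str
    gcp_project_id: str
    mqtt_broker: str
    mqtt_port: int


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on bad input."""
    app_env = environ.get("APP_ENV", "").upper()
    if app_env not in VALID_ENVS:
        raise ValueError(f"APP_ENV must be one of {VALID_ENVS}, got {app_env!r}")
    project = environ.get("GCP_PROJECT_ID")
    if not project:
        raise ValueError("GCP_PROJECT_ID is required")
    return Settings(
        app_env=app_env,
        gcp_project_id=project,
        # Assumed default: the broker runs in the same container.
        mqtt_broker=environ.get("MQTT_BROKER", "localhost"),
        # MQTT_PORT is a hypothetical variable; 1883 is the port Mosquitto listens on.
        mqtt_port=int(environ.get("MQTT_PORT", "1883")),
    )
```

Failing at startup on a missing or misspelled variable surfaces a bad deployment YAML immediately, rather than after the first message fails to forward.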
Configuration Files:
- mosquitto.conf: Broker settings.
- supervisord.conf: Process management.
- requirements.txt: Python dependencies.
Differences Between DEV, PPRD, and PRD:
- Each environment uses a different GCP project, service account, VPC connector, and resource allocation.
- Production (prd.yml) uses more CPU/memory than DEV/PPRD.
6. Runtime Behavior
Startup Sequence:
- The container starts and Supervisor launches Mosquitto and the Python subscriber.
- Mosquitto listens on port 1883.
- Subscriber connects to broker and cloud services.
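Because Supervisor launches both processes, the subscriber may start before Mosquitto is accepting connections. A minimal sketch of a startup wait is shown below; `wait_for_broker` and its parameters are hypothetical and not part of the existing subscriber, and it only checks that the TCP port is open, not that an MQTT session can be established.

```python
import socket
import time


def wait_for_broker(host: str = "localhost", port: int = 1883,
                    attempts: int = 10, delay_s: float = 1.0) -> bool:
    """Return True once a TCP connection to the broker succeeds, False if every attempt fails."""
    for attempt in range(attempts):
        try:
            # Open and immediately close a connection; success means the port is listening.
            with socket.create_connection((host, port), timeout=2.0):
                return True
        except OSError:
            if attempt < attempts - 1:
                time.sleep(delay_s)  # give Mosquitto time to finish starting
    return False
```

If the function returns False, the subscriber could exit with a non-zero status and leave the retry to Supervisor.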
Normal Execution Flow:
- MQTT messages are received, processed, and forwarded to Pub/Sub and BigQuery.
- Supervisor restarts processes if they fail.
Error Handling and Logging Strategy:
- Application logs to stdout (captured by GCP logging).
- Mosquitto logs according to the settings in mosquitto.conf.
- Liveness/readiness checks ensure unhealthy containers are restarted.
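The stdout logging described above could be set up along these lines. This is a sketch under assumptions: `build_logger`, the logger name, and the line format are illustrative and may differ from what mqtt_subscriber.py actually does.

```python
import logging
import sys
from typing import TextIO


def build_logger(name: str = "mqtt_subscriber", stream: TextIO = sys.stdout,
                 level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes one formatted line per record to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # avoid duplicate lines through the root logger
    if not logger.handlers:  # repeated calls must not stack handlers
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Writing to stdout rather than a file leaves log collection to the platform, so nothing accumulates on the container's disk.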
7. Deployment & Operations
Build Process:
- Docker image is built from the Dockerfile (Python, Mosquitto, Supervisor, app code).
Deployment Method:
- CI/CD pipeline (project-ci.yml) deploys the container to GCP Flexible Environment using environment-specific YAMLs.
- Each deployment uses a dedicated service account and VPC connector.
Runtime Dependencies:
- GCP credentials (service account key or workload identity).
- Network access to GCP APIs.
Scaling and Rollback Considerations:
- Only one instance per environment (no auto-scaling).
- Rollback by redeploying a previous image/tag.
8. Extending the System
Where and How to Add New Features:
- Update Python code for new message handling or cloud integrations.
- Adjust Mosquitto or Supervisor configs as needed.
- Add environment variables to YAMLs for new configuration options.
Recommended Patterns:
- Use environment variables for all configuration.
- Keep environment YAMLs in sync with code/config changes.
- Test changes in DEV before promoting to PPRD/PRD.
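Keeping the environment YAMLs in sync can be checked mechanically. The sketch below assumes the environment-variable section of each YAML has already been loaded into a dictionary (the loading step is not shown and depends on the YAML layout); `find_missing_keys` is a hypothetical helper, not part of the existing pipeline.

```python
from typing import Dict, Mapping, Set


def find_missing_keys(env_vars: Mapping[str, Mapping[str, str]]) -> Dict[str, Set[str]]:
    """Map each environment to the variable names it lacks but some other environment defines."""
    # Union of every variable name seen in any environment.
    all_keys: Set[str] = set().union(*env_vars.values())
    missing = {name: all_keys - set(variables) for name, variables in env_vars.items()}
    # Report only environments that are actually missing something.
    return {name: keys for name, keys in missing.items() if keys}
```

A CI step could fail the build whenever the result is non-empty, which would catch a variable added to dev.yml but forgotten in pprd.yml or prd.yml.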
Anti-patterns and Risk Areas:
- Hardcoding secrets or credentials.
- Making breaking changes to topic structure or message schema without coordination.
- Running multiple instances without unique MQTT client IDs.
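The client ID risk arises because an MQTT broker disconnects an existing session when a new client connects with the same ID, so two instances sharing an ID would repeatedly disconnect each other. A minimal sketch of generating a per-process ID follows; `make_client_id` and the `sub` prefix are hypothetical. The ID is kept alphanumeric and within 23 characters, the form that MQTT 3.1.1 requires every broker to accept.

```python
import uuid

MAX_CLIENT_ID_LEN = 23  # length every MQTT 3.1.1 broker must accept


def make_client_id(app_env: str, prefix: str = "sub") -> str:
    """Return an alphanumeric client ID built from prefix, environment, and a random suffix."""
    stem = f"{prefix}{app_env.lower()}"
    # Reserve at least 8 random hex characters so the suffix stays meaningful.
    if not stem.isalnum() or len(stem) > MAX_CLIENT_ID_LEN - 8:
        raise ValueError(
            "prefix and environment must be alphanumeric and at most 15 characters combined"
        )
    # Fill the remaining space with random hex so concurrent instances are very unlikely to collide.
    return stem + uuid.uuid4().hex[: MAX_CLIENT_ID_LEN - len(stem)]
```

Since the ID changes on every start, the broker would treat each restart as a new client; a stable per-instance ID would be needed if persistent sessions are ever used.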
Testing Strategy:
- Unit/integration tests for Python code.
- Deploy to DEV and validate end-to-end message flow before promoting.
9. Security & Compliance
Authentication and Authorization:
- GCP access is secured via service accounts.
- MQTT broker is not secured (no TLS or authentication) – must be addressed for production.
Secrets Handling:
- No secrets in code; all sensitive data should be managed via environment variables or GCP Secret Manager.
Data Sensitivity Considerations:
- Sensor data may be sensitive; ensure GCP IAM and network policies are enforced.
10. Common Pitfalls & Gotchas
- No MQTT Security: Broker is open on 1883 with no authentication/TLS; not suitable for production as-is.
- Single Instance: No horizontal scaling; may be a bottleneck for high-throughput scenarios.
- Resource Limits: Production uses more CPU/memory, but disk is fixed at 10 GB in every environment; monitor disk usage for growth.
- Environment Drift: YAMLs must be kept in sync with code/config changes.
- Supervisor: If Supervisor or one of its managed processes fails to start, the container will not function correctly.
- Health Checks: Liveness/readiness checks are critical for auto-recovery but may need tuning for startup delays.
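The Supervisor pitfall above can be partly guarded against by linting supervisord.conf before building the image. Supervisor's default `autorestart` value is `unexpected`, which restarts a program only when it exits with an unexpected code, so this sketch flags any program that does not set `autorestart=true` explicitly. The helper name, the program names, and the command paths in `EXAMPLE_CONF` are hypothetical and not taken from the real configuration.

```python
import configparser
from typing import List

# Hypothetical configuration used only to illustrate the check.
EXAMPLE_CONF = """
[supervisord]
nodaemon=true

[program:mosquitto]
command=/usr/sbin/mosquitto -c /etc/mosquitto/mosquitto.conf
autorestart=true

[program:subscriber]
command=python mqtt_subscriber.py
"""


def programs_without_autorestart(conf_text: str) -> List[str]:
    """Return names of [program:x] sections that do not set autorestart=true explicitly."""
    # interpolation=None leaves Supervisor's own %(...)s expansions untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    flagged = []
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        value = parser.get(section, "autorestart", fallback="").strip().lower()
        if value != "true":
            flagged.append(section.split(":", 1)[1])
    return flagged
```

With the example above, `programs_without_autorestart(EXAMPLE_CONF)` reports the `subscriber` program, since it relies on the default.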
