| Task Name | Description | Env | Responsibility |
| Obtain [SOURCE_NAME] Access & Technical Documentation |
Objective
Get all the access, credentials, and documentation needed so we can safely and reliably ingest data from the [SOURCE_NAME] system.
Description
This task covers everything required to connect to the [SOURCE_NAME] system from our environment. It includes setting up user/service access, collecting connection and authentication details, understanding the data structure, and documenting how often data will be updated. The outcome is that we can successfully test a connection from Dev and have all information stored securely and centrally.
Scope
• Collect API details.
• Collect database details.
• Collect SFTP details.
• Obtain the required authentication method and details.
• Clarify any token expiry, rotation, or renewal process.
• Obtain data dictionary or schema documentation describing:
o Tables/endpoints/files
o Fields, data types, allowed values
o Key relationships and important business rules
• Frequency and SLA
o Confirm the data refresh frequency and delivery SLA with the source owner.
Deliverables
• Credentials stored securely and centrally.
• A confirmation email/message from the [SOURCE_NAME] owner confirming the access that has been granted.
Definition of Done
• A connection to [SOURCE_NAME] is successfully tested from Dev (a minimal connection-test sketch follows this task).
• All connection, authentication, and schema details are documented and stored securely and centrally.
| Dev Test Prod | Data Engineer |
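The following is a minimal, illustrative sketch of the Dev connection test from the task above, assuming an HTTPS API secured with a bearer token held in Azure Key Vault; the vault URL, secret name, and endpoint are placeholders rather than the actual [SOURCE_NAME] details.

```python
# Minimal connectivity check for [SOURCE_NAME] from Dev.
# Assumes: an HTTPS API with a bearer token, and the token stored in Key Vault.
# The vault URL, secret name, and endpoint below are illustrative placeholders.
import requests
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

VAULT_URL = "https://<dev-key-vault>.vault.azure.net"    # placeholder
SECRET_NAME = "source-name-api-key"                      # placeholder
HEALTH_URL = "https://<source-host>/api/v1/health"       # placeholder

def test_connection() -> bool:
    credential = DefaultAzureCredential()
    secret = SecretClient(vault_url=VAULT_URL, credential=credential).get_secret(SECRET_NAME)
    resp = requests.get(HEALTH_URL, headers={"Authorization": f"Bearer {secret.value}"}, timeout=30)
    print(f"HTTP {resp.status_code} from {HEALTH_URL}")
    return resp.ok

if __name__ == "__main__":
    raise SystemExit(0 if test_connection() else 1)
```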
| [ENV]: Network Whitelisting for [SOURCE_NAME] |
Objective
Ensure network connectivity from the [ENV] environment to [SOURCE_NAME].
Scope
• Identify the endpoints and ports that [ENV] needs to reach, and request/apply the required whitelisting or firewall changes.
Deliverables
• Network access from [ENV] to [SOURCE_NAME] in place.
Definition of Done
• Connectivity from [ENV] to [SOURCE_NAME] is verified (a simple reachability check is sketched after this task).
| Dev Test Prod | DevOps |
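If the whitelisting needs a quick verification from inside the environment, a plain TCP reachability check is often enough; the host and port below are placeholders for the real [SOURCE_NAME] endpoint.

```python
# Quick TCP reachability check to verify whitelisting from inside the environment.
# Host and port are illustrative placeholders for the [SOURCE_NAME] endpoint.
import socket

HOST = "source.example.com"   # placeholder
PORT = 443                    # placeholder (HTTPS); use 1433/5432/22 for SQL Server/Postgres/SFTP

try:
    with socket.create_connection((HOST, PORT), timeout=10):
        print(f"Reachable: {HOST}:{PORT}")
except OSError as exc:
    print(f"NOT reachable: {HOST}:{PORT} ({exc})")
```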
| Collect [SOURCE_NAME] Schema and Table Metadata Details |
Objective
Identify all source tables, fields, and any dependencies needed for [PROJECT], and document how they will be used for the purpose of [PROJECT PURPOSE].
Note – Check whether the source (or parts of it) will become obsolete after some time; if so, plan for the rework/effort this may cause.
Note – Identify the complexity and priority of each data load.
Scope
• Document the tables/endpoints/files in scope, their fields and data types, and any dependencies between them (a metadata-extraction sketch follows this task).
Deliverables
• [SOURCE_NAME] schema and table metadata documented for [PROJECT].
Definition of Done
• The metadata documentation is reviewed and agreed with the source owner and the project team.
| Dev |
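A possible starting point for the metadata collection, assuming [SOURCE_NAME] is a relational database that exposes INFORMATION_SCHEMA; the connection string and output file name are placeholders.

```python
# Sketch: pull table/column metadata from a relational [SOURCE_NAME], assuming it
# exposes INFORMATION_SCHEMA (SQL Server/Postgres-style). Connection string is a placeholder.
import csv
import pyodbc

CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=<host>;Database=<db>;..."  # placeholder

QUERY = """
SELECT table_schema, table_name, column_name, data_type, is_nullable
FROM information_schema.columns
ORDER BY table_schema, table_name, ordinal_position
"""

with pyodbc.connect(CONN_STR) as conn, open("source_metadata.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["schema", "table", "column", "data_type", "is_nullable"])
    for row in conn.cursor().execute(QUERY):
        writer.writerow(list(row))
```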
| Create Azure Function – [SOURCE_NAME] |
Objective
Build an Azure Function in the Development environment to extract data from [SOURCE_NAME] for the [USE_CASE].
Scope
• Create the Function App and implement the function code that connects to [SOURCE_NAME] and extracts the required data (an illustrative sketch follows this task).
Deliverables
• Azure Function for [SOURCE_NAME] deployed and running in Dev.
Definition of Done
• The function runs successfully in Dev and extracts data from [SOURCE_NAME].
| Dev | DevOps |
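A rough sketch of what the Dev Azure Function could look like, using the Python v2 programming model with a timer trigger; the schedule, app setting names, and endpoint are assumptions, and the extraction logic will depend on the real source.

```python
# Sketch of a timer-triggered Azure Function (Python v2 programming model) that
# extracts data from [SOURCE_NAME]. Schedule, API URL, and app setting names are placeholders.
import logging
import os

import azure.functions as func
import requests

app = func.FunctionApp()

@app.timer_trigger(schedule="0 0 * * * *", arg_name="timer", run_on_startup=False)
def extract_source(timer: func.TimerRequest) -> None:
    base_url = os.environ["SOURCE_BASE_URL"]   # placeholder app setting
    api_key = os.environ["SOURCE_API_KEY"]     # placeholder; ideally a Key Vault reference
    resp = requests.get(f"{base_url}/records", headers={"Authorization": f"Bearer {api_key}"}, timeout=60)
    resp.raise_for_status()
    records = resp.json()
    logging.info("Extracted %d records from [SOURCE_NAME]", len(records))
    # TODO: publish records downstream (e.g. Kafka or the intermediate layer)
```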
| Implement Data Ingestion from [SOURCE_NAME] to Kafka |
Objective
Implement the ingestion pipeline that publishes data from [SOURCE_NAME] to Kafka.
Scope
• Implement the producer logic that publishes extracted [SOURCE_NAME] records to the agreed Kafka topic(s) (a producer sketch follows this task).
Deliverables
• Ingestion pipeline publishing [SOURCE_NAME] data to Kafka in Dev.
Definition of Done
• Records from [SOURCE_NAME] are visible on the target Kafka topic in Dev.
| Dev | Data Engineer |
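A hedged sketch of the Kafka publish step using confluent-kafka; the broker address, topic name, message key, and record shape are placeholders, and security settings (SASL/TLS) are omitted for brevity.

```python
# Sketch of the Kafka publish step using confluent-kafka. Broker, topic, and
# message shape are placeholders; SASL/TLS configuration is omitted.
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "broker1:9092"})  # placeholder

def delivery_report(err, msg):
    if err is not None:
        print(f"Delivery failed for key={msg.key()}: {err}")

def publish_records(records, topic="source-name-raw"):  # placeholder topic
    for rec in records:
        producer.produce(
            topic,
            key=str(rec.get("id")).encode(),
            value=json.dumps(rec).encode(),
            callback=delivery_report,
        )
    producer.flush()  # block until all outstanding messages are delivered

publish_records([{"id": 1, "value": "example"}])
```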
| Implement Delta/Incremental Logic for [SOURCE_NAME] |
Objective
Implement logic to ingest only new or updated records from the source.
Scope
• Agree the delta mechanism with the source owner (e.g. a last-modified timestamp or high-watermark column) and implement it in the ingestion pipeline (a watermark-based sketch follows this task).
Deliverables
• Incremental ingestion logic implemented and tested in Dev.
Definition of Done
• Repeated runs ingest only new or updated records, with no full reloads or duplicates.
| Dev | Data Engineer |
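One common way to implement the incremental logic is a high-watermark pattern; the sketch below assumes a `last_modified` column on the source table and stores the watermark in a local JSON file, both of which are illustrative choices rather than the agreed design.

```python
# Sketch of high-watermark incremental extraction: only rows modified since the
# last successful run are pulled. The watermark store (a local JSON file) and the
# `last_modified` column name are illustrative assumptions.
import json
import pathlib

import pyodbc

STATE_FILE = pathlib.Path("watermark.json")   # placeholder; could be a control table or blob
CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=<host>;Database=<db>;..."  # placeholder

def load_watermark() -> str:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())["last_modified"]
    return "1900-01-01T00:00:00"

def save_watermark(value: str) -> None:
    STATE_FILE.write_text(json.dumps({"last_modified": value}))

def extract_increment():
    watermark = load_watermark()
    query = ("SELECT id, payload, last_modified FROM source_table "
             "WHERE last_modified > ? ORDER BY last_modified")
    with pyodbc.connect(CONN_STR) as conn:
        rows = conn.cursor().execute(query, watermark).fetchall()
    if rows:
        save_watermark(rows[-1].last_modified.isoformat())
    return rows
```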
| Setup GitHub Repo and Create CI/CD Pipeline |
Objective
Set up an automated deployment pipeline for [SOURCE_NAME] so that code can be built, tested, and deployed to Dev automatically, and to Prod with manual approval.
Scope
1. Set up the GitHub repository for [SOURCE_NAME].
2. Define the branching strategy
• Use two main branches:
o main – for production-ready code
o dev – for development and testing
• Document how and when code is merged between dev and main.
3. Set up GitHub Actions pipeline
• Create a GitHub Actions workflow that runs on relevant events (e.g. pull request, push to dev or main).
4. Implement pipeline steps
• The pipeline should include at least:
a. Code linting – run static code analysis / style checks.
b. Unit tests – run the automated unit test suite and fail the build if tests fail.
c. Build validation – build/pack the application to ensure it compiles/builds successfully.
d. Dev deployment – automatically deploy successful builds from dev (or a chosen branch) to the Dev environment.
e. Prod deployment with approval – deploy to the Prod environment only after a manual approval step (e.g. environment protection rule or manual approval job).
5. Configure environment variables per environment
• Define and store configuration values separately for:
o Dev environment
o Prod environment
• Ensure secrets are stored securely (e.g. GitHub Secrets) and are correctly used by the pipeline.
Deliverables
• A fully working CI/CD pipeline in GitHub Actions for [SOURCE_NAME].
• Automatic deployment to Dev on successful pipeline runs (as defined in the branching strategy).
• Manual approval-based deployment to Prod with a clear approval step and responsible approvers defined.
Definition of Done
• A sample change merged to the dev branch is automatically built, tested, and successfully deployed to the Dev environment via the pipeline.
• A sample change promoted to the main/Prod branch is successfully deployed to the Prod environment via the pipeline after passing the manual approval step.
• Pipeline status is visible in GitHub Actions, and basic run instructions are documented in the repository (e.g. in README.md).
| Dev | Data Engineer |
| DEV: Setup Monitoring & Data Validation for [SOURCE_NAME] |
Objective
Implement monitoring and validation checks for the ingestion pipeline.
Scope
• Enable Application Insights.
• Create ingestion success/failure logs.
• Implement row count validation.
• Implement schema validation checks.
• Track ingestion duration.
• Validate Kafka message count vs source count (see the sketch following this task).
Deliverables
• Monitoring dashboard created.
• Validation queries implemented.
• Test failure scenario validated.
Definition of Done
• Metrics visible in the monitoring tool.
• Validation alerts triggered on failure.
| Dev Test Prod | Data Engineer |
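A sketch of the "Kafka message count vs source count" validation listed in the scope above, assuming confluent-kafka and a relational source; the broker, topic, connection string, and count query are placeholders.

```python
# Sketch: compare the source row count with the total message count on the Kafka topic.
# Broker, topic, connection string, and the COUNT(*) query are illustrative placeholders.
import pyodbc
from confluent_kafka import Consumer, TopicPartition

BROKER = "broker1:9092"            # placeholder
TOPIC = "source-name-raw"          # placeholder
CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=<host>;Database=<db>;..."  # placeholder

def kafka_message_count() -> int:
    consumer = Consumer({"bootstrap.servers": BROKER, "group.id": "validation-check"})
    metadata = consumer.list_topics(TOPIC, timeout=10)
    total = 0
    for partition_id in metadata.topics[TOPIC].partitions:
        low, high = consumer.get_watermark_offsets(TopicPartition(TOPIC, partition_id), timeout=10)
        total += high - low
    consumer.close()
    return total

def source_row_count() -> int:
    with pyodbc.connect(CONN_STR) as conn:
        return conn.cursor().execute("SELECT COUNT(*) FROM source_table").fetchone()[0]

if __name__ == "__main__":
    kafka_count, src_count = kafka_message_count(), source_row_count()
    print(f"source={src_count}, kafka={kafka_count}, match={kafka_count == src_count}")
```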
| Set Up Alerting and Logging |
Objective
Implement automated alerting for ingestion pipeline failures and performance degradation.
Scope
• Configure and validate alerts for the ingestion pipeline, including:
o Function execution failures
o Kafka publish failures
o Zero records ingested for a scheduled run
o Abnormally long execution time / SLA breach
• Integrate alerts with email and/or Microsoft Teams channels (see the sketch following this task).
• Define and configure appropriate severity levels (e.g. Critical, High, Medium, Low).
Deliverables
• Alerts tested.
• Alert documentation created.
Definition of Done
• Failure simulation triggers an alert.
• The alert reaches the responsible team.
| Dev Test Prod | Data Engineer |
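As one illustration of the Teams integration mentioned above, the sketch below posts a message to a Microsoft Teams incoming webhook when a run ingests zero records; the webhook URL, severity wording, and trigger condition are assumptions, not the agreed alerting design.

```python
# Sketch of one alert path: post to a Teams incoming webhook when a scheduled run
# ingests zero records. Webhook URL and severity label are illustrative placeholders.
import requests

TEAMS_WEBHOOK_URL = "https://<tenant>.webhook.office.com/webhookb2/<id>"  # placeholder

def alert_if_zero_records(ingested_count: int, run_id: str) -> None:
    if ingested_count > 0:
        return
    payload = {"text": f"[HIGH] [SOURCE_NAME] ingestion run {run_id} ingested 0 records."}
    resp = requests.post(TEAMS_WEBHOOK_URL, json=payload, timeout=30)
    resp.raise_for_status()

alert_if_zero_records(ingested_count=0, run_id="2024-01-01T00:00Z")
```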
| Deploy [SOURCE_NAME] Data to Production |
Objective
Deploy the finalized Python code for the [SOURCE_NAME] system to the Production environment, following deployment and security best practices.
Description
This task covers preparing, validating, and deploying the final Python code for [SOURCE_NAME] into Production. It ensures that the code is production-ready, uses proper configuration and secrets management, and that deployment is done in a controlled and auditable way.
Scope
• Set up a pull request and have it reviewed.
• Deployment readiness
o Confirm that the Python code has passed all required checks in lower environments (Dev / QA / UAT).
o Ensure all known defects for this release are either resolved or accepted.
• Configuration & secrets
o Verify that all Prod configuration (environment variables, connection strings, endpoints) is set correctly.
o Ensure all secrets (keys, passwords, tokens) are stored in a secure store (e.g. Key Vault, GitHub/Azure DevOps secrets) and not in code.
• Deployment process
o Use the approved CI/CD pipeline or standard deployment process to deploy the Python code to Production.
o Follow the agreed change management process (e.g. change ticket, approvals, CAB if required).
o Perform a controlled deployment (e.g. scheduled window, blue/green/canary if applicable).
• Post-deployment validation
o Run smoke tests or basic functional checks to confirm that the Python code runs correctly in Production (see the sketch following this task).
o Verify that logging and monitoring are working (logs, alerts, dashboards).
• Documentation & handover
o Update deployment notes / release documentation with:
 Deployed version / commit
 Deployment date and time
 Any known issues or follow-up items
o Inform relevant stakeholders that the deployment is complete.
Deliverables
• Python code for [SOURCE_NAME] successfully deployed to the Production environment.
• Updated configuration and secrets for Production stored in the approved secure store.
• Deployment / release notes documented in the project Confluence/SharePoint or release tracker.
Definition of Done
• The deployed Python code runs successfully in Production, post-deployment smoke tests pass, and stakeholders have been informed.
| Prod | Support Engineer |
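A minimal post-deployment smoke test along the lines described in the scope, assuming the deployed code exposes a health endpoint that reports its version; the URL, expected version, and response shape are placeholders.

```python
# Sketch of a post-deployment smoke test: call a health endpoint and confirm the
# deployed version. URL, expected version, and response shape are placeholders.
import requests

HEALTH_URL = "https://<prod-function-app>.azurewebsites.net/api/health"  # placeholder
EXPECTED_VERSION = "1.4.0"                                               # placeholder

def smoke_test() -> bool:
    resp = requests.get(HEALTH_URL, timeout=30)
    ok = resp.status_code == 200 and resp.json().get("version") == EXPECTED_VERSION
    print(f"status={resp.status_code}, body={resp.text[:200]}, passed={ok}")
    return ok

if __name__ == "__main__":
    raise SystemExit(0 if smoke_test() else 1)
```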
| Implement Security Policy | Dev Test Prod | Data Engineer |
| Ingest [SOURCE_NAME] Data to [INTERMEDIATE_LAYER_NAME] for Fabric Processing |
Objective
Extract data from [SOURCE_NAME] and store it in the [INTERMEDIATE_LAYER_NAME] (e.g. ADLS Gen2, Blob Storage, File Share). This intermediate layer will be the trusted source for downstream loading into Microsoft Fabric; direct ingestion into Fabric is not feasible for this source, so the intermediate layer acts as the controlled handover point.
Scope
• Build or configure a process to extract data from [SOURCE_NAME].
• Write the extracted data to the [INTERMEDIATE_LAYER_NAME] in the agreed folder structure and format (see the sketch following this task).
Deliverables
• Data written to the intermediate layer.
• Monitoring and logging set up for the ingestion into the intermediate layer.
Definition of Done
• Data from [SOURCE_NAME] lands in the [INTERMEDIATE_LAYER_NAME] on schedule and is available for downstream loading into Microsoft Fabric.
| Dev Test Prod | Data Engineer |
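A sketch of the write to the intermediate layer, assuming ADLS Gen2 and a date-partitioned folder layout; the storage account URL, container name, and path convention are illustrative assumptions.

```python
# Sketch of writing an extracted batch to ADLS Gen2 as the intermediate layer.
# Account URL, filesystem (container) name, and path convention are placeholders.
import json
from datetime import datetime, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storageaccount>.dfs.core.windows.net"  # placeholder
FILESYSTEM = "raw"                                             # placeholder container

def write_batch(records: list[dict], source: str = "source_name") -> str:
    service = DataLakeServiceClient(account_url=ACCOUNT_URL, credential=DefaultAzureCredential())
    fs = service.get_file_system_client(FILESYSTEM)
    now = datetime.now(timezone.utc)
    path = f"{source}/{now:%Y/%m/%d}/batch_{now:%H%M%S}.json"  # date-partitioned layout (assumption)
    fs.get_file_client(path).upload_data(json.dumps(records), overwrite=True)
    return path

print(write_batch([{"id": 1, "value": "example"}]))
```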
| Setup [INTERMEDIATE_LAYER_NAME] |
Objective
Define and set up the [INTERMEDIATE_LAYER_NAME] (e.g. ADLS Gen2, Blob Storage, File Share) for data coming from [SOURCE], so that data can be loaded into Microsoft Fabric in a controlled and reliable way. Direct ingestion into Fabric is not feasible, so a dedicated intermediate layer is required.
Scope of Work
• Agree the storage technology, naming conventions, folder structure, and access model for the [INTERMEDIATE_LAYER_NAME].
• Provision the storage in the required environments (a provisioning sketch follows this task).
Deliverables
• [INTERMEDIATE_LAYER_NAME] provisioned, secured, and documented.
Definition of Done
• The [INTERMEDIATE_LAYER_NAME] is available in the required environments and ready to receive data from [SOURCE].
| Dev Test Prod | Data Engineer |
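If ADLS Gen2 is chosen as the [INTERMEDIATE_LAYER_NAME], provisioning the container and folder layout can be scripted; the account URL, container name, and folder names below are placeholders for whatever layout is agreed.

```python
# Sketch of provisioning the intermediate-layer container and folder layout in ADLS Gen2.
# Account URL, container name, and folder names are illustrative placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storageaccount>.dfs.core.windows.net"  # placeholder
FILESYSTEM = "raw"                                             # placeholder
FOLDERS = ["source_name/landing", "source_name/rejected", "source_name/archive"]  # illustrative layout

service = DataLakeServiceClient(account_url=ACCOUNT_URL, credential=DefaultAzureCredential())
existing = {fs.name for fs in service.list_file_systems()}
filesystem = (service.get_file_system_client(FILESYSTEM) if FILESYSTEM in existing
              else service.create_file_system(FILESYSTEM))
for folder in FOLDERS:
    filesystem.create_directory(folder)
    print(f"Created {FILESYSTEM}/{folder}")
```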