| Status | WIP |
|---|---|
| Stakeholders | |
| Outcome | |
| Due Date | |
| Owner | |
| Solution/Domain/Data Architect | |
Most lab sites are equipped with a file server on top of an iOMega NAS. In some cases the NAS is malfunctioning, and it is also running at the limit for adding more hard disks, so it can no longer accommodate new storage demands while remaining compliant with data storage regulations.
[ TL;DR]... tbd
Use cases
- Lyon/RICL
  - For some techniques, OpenLab instruments generate files at the scale of hundreds of GB. Loading them in stand-alone applications for expert analysis demands local computing power. Files of this size are also impractical to upload to a cloud repository and make it difficult to comply with regulatory terms for storage.
- Shanghai
  - Waters instruments demand 12 TB of space to store application data. This volume reaches the local limit of the LabPC disk and of the iOMega NAS on the local network.
- Bristol
  - Waters instruments demand 12 TB of space to store application data. This volume reaches the local limit of the LabPC disk and of the iOMega NAS on the local network.
Questions and Concerns
Architecturally Significant Requirements
- Data rotation
- Retention
- Export Control/Cyber Sec
- Access Control
- Data transfer latency SLA
- Data Consumers (data structure/data model)
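To make the data transfer latency SLA requirement concrete, here is a minimal sketch of a back-of-the-envelope transfer-time estimate for the volumes named in the use cases (for example, 12 TB of Waters application data). The link speeds and the 70% efficiency factor are assumptions for illustration only, not measured values from any site, and `transfer_hours` is a hypothetical helper name.

```python
def transfer_hours(size_tb: float, link_gbit_s: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move `size_tb` terabytes over a network link.

    size_tb      -- data volume in decimal terabytes (1 TB = 1e12 bytes)
    link_gbit_s  -- nominal link speed in gigabits per second (assumed value)
    efficiency   -- fraction of nominal speed actually achieved (assumed value)
    """
    if size_tb < 0 or link_gbit_s <= 0 or not (0 < efficiency <= 1):
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    size_bits = size_tb * 1e12 * 8                      # bytes -> bits
    effective_bits_per_s = link_gbit_s * 1e9 * efficiency
    return size_bits / effective_bits_per_s / 3600      # seconds -> hours


if __name__ == "__main__":
    # Hypothetical link speeds; replace with the real site uplink figures.
    for gbit in (0.1, 1.0, 10.0):
        hours = transfer_hours(12, gbit)
        print(f"12 TB over {gbit} Gbit/s: about {hours:.1f} h")
```

Running the estimate per site with the real uplink figures would show whether a given latency SLA is realistic before any alternative is selected.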
Impediments and Blockers
Tradeoff analysis
- Alternatives
  - XYZ
  - YZX
- Sensitivity Points
- Risks
- Non Risks
- Architectural Approaches
Quick-Wins
The problem is critical enough to demand at least a temporary solution that mitigates the immediate risk of losing business-sensitive data because local storage is currently at its limit.
Some questions that can facilitate the analysis:
- What type of search criteria is used on historical data?
- What retrieval is done on this found data?
- What type of processing is done on the found and retrieved data?
- Is it possible to parse this data?
Possible alternatives to consider in advance:
- Promote historical data close to the (AWS Landing Zone) ACD Labs domain so that it can be ingested for later analysis in reports
- Promote historical data close to the (GCP) Lab-Booster domain so that it can be ingested for later analysis in reports
- Promote historical data to the Azure Fabric Lakehouse so that it can be ingested for later analysis in reports
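Whichever target is chosen, the historical data first has to be identified on the LabPC or file server. The following is a minimal sketch of such a scan, assuming that "historical" can be approximated by the file modification time; the age threshold, the example path and the function names are hypothetical, and the real criterion should follow the retention and data rotation requirements.

```python
import os
import time
from pathlib import Path


def find_archive_candidates(root, min_age_days, now=None):
    """Return (path, size_in_bytes) for files under `root` not modified
    within the last `min_age_days` days."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                stat = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.st_mtime < cutoff:
                candidates.append((path, stat.st_size))
    return candidates


def total_size_gb(candidates):
    """Sum the candidate sizes in decimal gigabytes."""
    return sum(size for _path, size in candidates) / 1e9


if __name__ == "__main__":
    # Hypothetical data directory and threshold; adjust per site.
    found = find_archive_candidates("D:/InstrumentData", min_age_days=365)
    print(f"{len(found)} files, {total_size_gb(found):.1f} GB could be promoted")
```

The scan only reports candidates and moves nothing, so it can be run safely to size each alternative before a decision is taken.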
Design Solution Proposal
Meeting Notes
[SoW] LabPC - Storage for lab-local data at scale (Olivier SAUSSOL, Mijajlovic, Julie, Tiago Oliveira)
- LabPC Storage Study
- SoW
- Assess the issue
- Scenarios (business needs?)
- *Pictures* of real-case situations (disks on the floor...)
- Worst case scenarios
- Application data, long-term storage, later data combination across apps
- Instrument categories (data volume, SLA,...)
- (inventory for storage) LABPC Storage needs consolidation https://docs.google.com/spreadsheets/d/1U-d6W2LEGV9mz4XK9hBmHg1FTZY_-oYxlkXKkukgSNQ/edit?gid=0#gid=0
- Total storage needed now, and the forecast?
- Impact
- Reaching limit of storage
- Users storing data on inappropriate devices (NAS, shadow IT)
- Risks
- 1. Data loss
- 2. Data theft
- 3. Shadow IT
- 4. Export data control/legal regulation concerns
- 5. Business continuity
- Cost
- Getting expensive to maintain/extend disks
- Impediments
- Hard to standardize instruments
- Blocked from onboarding new instruments
- User complaints
- Bolate: 77 PCs, storage not appropriate
- Shanghai:
- Mark's use case: large file generated
- Evaluate Alternatives
- Meeting Stakeholders
- Business strategy
- Skills demanded for the assessment
- Solution Vendor providers
- Short Term solution (quick-wins)
- Already some in place - *highlight quick-wins ongoing
- Distributed storage on LabPC (disks): data is stored on the LabPC for convenience, to avoid losing it
- Leveraging local File Server (*Shanghai - non standard storage)
- Cloud storage for historical LabPC data - it depends on application readiness for Virtualization deployment
- Recall AWS gateway for data - PoC (ask Khemaies)
- PoC for instrument application virtualization (pick the one with the largest data volume; ask Olivier)
- Consider ACD Lab (as Mark Kwasnik) as the interface to run analytics - in case the instrument application is not ready for virtualization
- Long Term solution
- Outcome
- Presentation
- (1st phase) structure scenario (deadline ) [Julie, Olivier, Tiago]
- weekly touch-point meeting (30 min), March (Wednesday 10:30)
- (2nd phase) engage an expert per domain for the solution alternatives
- Decision to be taken
- Business Case
- Project organization
- Solution Architecture Proposal
References
Presentation https://docs.google.com/presentation/d/1UE4-b8e0GxHoxlVHJomt54JpcWwCv5Uh48Tw3qLA41o/edit#slide=id.g33d7e03e36c_0_0
