This document provides an overview of the environment and the main tools and services.


GCP Project

BigQuery Datasets

Cloud Storage

Silica currently uses only two kinds of buckets, both shared between all labs:

[...]-files

These buckets hold the raw data files that serve as the source for the BigQuery tables. These files must be kept in case reprocessing is needed.

[...]working-files

Talend uses this kind of bucket to load the data into BigQuery. The files in it are temporary and are overwritten on every run.
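
The distinction between the two kinds of buckets can be sketched in a few lines of Python. This is only an illustration, assuming that buckets can be told apart by the name suffixes shown above; the helper names `bucket_kind` and `safe_to_delete` and the example bucket names are hypothetical and do not come from the Silica repositories.

```python
def bucket_kind(bucket_name: str) -> str:
    """Classify a bucket by its name suffix (hypothetical helper)."""
    # Check the more specific suffix first: a name ending in
    # "working-files" also ends in "-files".
    if bucket_name.endswith("working-files"):
        return "working"  # temporary Talend load files, overwritten every run
    if bucket_name.endswith("-files"):
        return "raw"  # raw source files, kept for reprocessing
    return "unknown"


def safe_to_delete(bucket_name: str) -> bool:
    """Only working files are temporary; raw files must be kept."""
    return bucket_kind(bucket_name) == "working"


if __name__ == "__main__":
    # Made-up bucket names, for illustration only.
    for name in ("example-lab-files", "example-lab-working-files"):
        print(name, bucket_kind(name), safe_to_delete(name))
```

The order of the two suffix checks matters, because every working-files bucket name also ends in "-files".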

Talend Projects

These are the three Talend projects used for Silica. After the migration to Talend Cloud version 8, "-mig" was added to each repo name.

Python Scripts - GitLab

All the scripts are available in a single Git repo:

https://gitlab.solvay.com/solvay-it-bda/talend/rni/silica/bda-talend-python-rni-silica