File location
The file is named archive.py and is located in the home directory of the Dataiku user on the design prod VM.
Scheduling
To set up the cron job, we run:
crontab -e
The job is set up to run every day at 9 AM French time / 3 PM Singapore time.
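The correspondence between the two times can be checked with a small sketch. It assumes fixed UTC offsets (UTC+2 for French summer time, UTC+1 for French winter time, UTC+8 for Singapore) instead of a time zone database, and `singapore_hour` is a hypothetical helper that is not part of archive.py. Which time zone the VM's cron daemon actually uses is not stated here, so this only illustrates the arithmetic.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets (assumption: used instead of a tz database to stay dependency-free).
CEST = timezone(timedelta(hours=2), "CEST")  # France, summer time
CET = timezone(timedelta(hours=1), "CET")    # France, winter time
SGT = timezone(timedelta(hours=8), "SGT")    # Singapore, all year


def singapore_hour(france_hour, france_tz):
    """Return the Singapore hour that matches a given hour in France."""
    # The date itself is arbitrary because the offsets above are fixed.
    local = datetime(2024, 1, 1, france_hour, tzinfo=france_tz)
    return local.astimezone(SGT).hour


print(singapore_hour(9, CEST))  # 15
print(singapore_hour(9, CET))   # 16
```

With these offsets, 9 AM in France matches 3 PM in Singapore during French summer time and 4 PM during French winter time.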
Logs location
Logs are located in the /dataiku/dss_data/run directory.
Execution
The program lists every file in the run directory, then zips all the backend.log.* files into an archivelogs.zip file. We use the Google Cloud Storage Python package, which can be installed with:
pip install google-cloud-storage
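The listing and zipping step described above could look roughly like the following. This is a minimal sketch using only the standard library, not the actual content of archive.py; the function name `zip_backend_logs` and its parameters are hypothetical.

```python
import fnmatch
import os
import zipfile


def zip_backend_logs(run_dir, zip_path, pattern="backend.log.*"):
    """Zip every file in run_dir whose name matches pattern; return the names added."""
    # List every entry in the run directory, then keep only the rotated backend logs.
    names = sorted(
        name
        for name in os.listdir(run_dir)
        if fnmatch.fnmatch(name, pattern)
        and os.path.isfile(os.path.join(run_dir, name))
    )
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            # arcname keeps the archive flat (no directory paths inside the zip).
            archive.write(os.path.join(run_dir, name), arcname=name)
    return names
```

Under these assumptions, the script would call `zip_backend_logs("/dataiku/dss_data/run", "archivelogs.zip")`. Note that the pattern matches rotated files such as backend.log.1 but not the live backend.log itself.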
Cloud Storage
The zip files are stored in the cs-bda-dss-ew-prod-design bucket, in the dataiku/logs folder. Files are organized in the following subfolder structure: Year => Month => Daily zip
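As an illustration of this layout, the sketch below builds a dated object name and uploads the archive with the google-cloud-storage client. The daily file name (`archivelogs-YYYY-MM-DD.zip`), the zero-padded numeric month folder, and the helper names `build_object_name` and `upload_archive` are assumptions; only the bucket, the dataiku/logs folder, and the Year => Month => Daily zip structure come from this page.

```python
from datetime import date

BUCKET_NAME = "cs-bda-dss-ew-prod-design"
LOGS_PREFIX = "dataiku/logs"


def build_object_name(day, prefix=LOGS_PREFIX):
    """Return the Year/Month/daily-zip object path for a given day."""
    # Assumption: the file name and month format are illustrative only.
    return f"{prefix}/{day:%Y}/{day:%m}/archivelogs-{day:%Y-%m-%d}.zip"


def upload_archive(zip_path, day, bucket_name=BUCKET_NAME):
    """Upload zip_path to the bucket under the dated object name."""
    # Imported here so build_object_name works without the package installed.
    from google.cloud import storage

    client = storage.Client()  # relies on Application Default Credentials
    blob = client.bucket(bucket_name).blob(build_object_name(day))
    blob.upload_from_filename(zip_path)
    return blob.name
```

For example, `build_object_name(date(2024, 3, 5))` returns `dataiku/logs/2024/03/archivelogs-2024-03-05.zip`, and `upload_archive("archivelogs.zip", date.today())` would send the day's archive to that location, provided the VM has credentials with write access to the bucket.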