• File location
    The script is named archive.py and lives in the home directory of the Dataiku user on the design prod VM.


  • Scheduling
    To set up or edit the cron job, run:

    crontab -e

    The job is scheduled to run every day at 9 AM French time (3 PM Singapore time).


  • Logs location
    Logs are located in the /dataiku/dss_data/run directory.


  • Execution
    The program lists every file in the run directory, then zips all the backend.log.* files into a single archivelogs.zip file.
    It uses the google-cloud-storage Python package, which can be installed with:

    pip install google-cloud-storage
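
    The listing and zipping step described above could look roughly like the sketch below. It is not the actual content of archive.py: the function name archive_backend_logs and the choice to write the archive inside the run directory are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def archive_backend_logs(run_dir, archive_name="archivelogs.zip"):
    """Zip every rotated backend.log.* file found in run_dir.

    Returns the archive path and the names of the files it contains.
    Hypothetical helper: the real archive.py may be organized differently.
    """
    run_dir = Path(run_dir)
    archive_path = run_dir / archive_name

    # List the run directory and keep only the rotated backend logs
    # (backend.log.1, backend.log.2, ...). The pattern does not match
    # the live backend.log file or the archive itself.
    log_files = sorted(p for p in run_dir.glob("backend.log.*") if p.is_file())

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for log_file in log_files:
            # Store each file under its bare name, without the directory prefix.
            zf.write(log_file, arcname=log_file.name)

    return archive_path, [p.name for p in log_files]
```

    Under these assumptions, calling archive_backend_logs("/dataiku/dss_data/run") would produce /dataiku/dss_data/run/archivelogs.zip.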




  • Cloud Storage

    The zip files are stored in the cs-bda-dss-ew-prod-design bucket, under the dataiku/logs folder.
    They are organized in subfolders: Year => Month => Daily zip
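
    As a rough sketch of how an upload into this Year => Month => Daily zip layout might be written with google-cloud-storage: the helper names, the zero-padded month folder and the archivelogs-YYYY-MM-DD.zip file name are assumptions for illustration, not the naming actually used in the bucket. Authentication is assumed to come from Application Default Credentials on the VM.

```python
from datetime import date

BUCKET_NAME = "cs-bda-dss-ew-prod-design"
LOGS_PREFIX = "dataiku/logs"


def build_object_name(day, prefix=LOGS_PREFIX):
    """Return a Year/Month/daily-zip object name for the given date.

    The exact folder and file naming here is an assumption.
    """
    return f"{prefix}/{day:%Y}/{day:%m}/archivelogs-{day:%Y-%m-%d}.zip"


def upload_archive(local_path, day=None, bucket_name=BUCKET_NAME):
    """Upload local_path to the bucket under the dated object name."""
    # Imported here so the naming helper works without the package installed.
    from google.cloud import storage

    day = day or date.today()
    client = storage.Client()  # uses Application Default Credentials
    blob = client.bucket(bucket_name).blob(build_object_name(day))
    blob.upload_from_filename(str(local_path))
    return blob.name
```

    With this naming, an archive created on 17 May 2024 would be stored as dataiku/logs/2024/05/archivelogs-2024-05-17.zip.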



