Steps to upgrade DSS Cloud (with Kubernetes and User Isolation) to the latest version.

For commands prefixed with sudo, run as your own user (e.g. cheryl_abundo) instead of as dataiku.

Run all other commands as dataiku.


  1. Check if anyone is logged in to DSS Cloud

    If no one is connected, stop DSS (run the commands as dataiku):
      sudo su - dataiku
      ./dss_data/bin/dss stop

    If someone is connected, inform them before stopping DSS.

  2. Clean up the DSS disk
      rm -rf /dataiku/dss_data/tmp/*
      rm -rf  /dataiku/dss_data/caches/*
      rm -rf  /dataiku/dss_data/exports/*
      rm -rf  /dataiku/dss_data/diagnosis/*
      rm -rf  /dataiku/dss_data/jobs/*
      rm -rf  /dataiku/dss_data/scenarios/*
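    Before deleting, you can sanity-check how much space the cleanup will reclaim. A minimal sketch, assuming the /dataiku/dss_data layout above (the purgeable_size_report helper name is illustrative, not a DSS tool):

    ```shell
    # Sketch: report the size of each purgeable DSS directory before the
    # rm -rf commands run, so you can see what the cleanup reclaims.
    # The directory list mirrors the cleanup step above.
    purgeable_size_report() {
      dss_data="$1"
      for d in tmp caches exports diagnosis jobs scenarios; do
        if [ -d "$dss_data/$d" ]; then
          du -sh "$dss_data/$d"
        fi
      done
    }

    purgeable_size_report /dataiku/dss_data
    ```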

  3. Backup data directory

    1. Check the size of the DSS data directory
      du -hcs  /dataiku/dss_data


    2. Check if there's enough disk space in the VM
      df -h

      If there is not enough space, increase the disk by following step 3.3 below; otherwise proceed to step 4.

    3. Increase disk attached to VM instance


      1. In the cloud console, open the VM's boot disk and click Edit. Specify a larger Size and click Save.


      2. SSH into the VM and identify the disk with the file system and the partition that you want to resize.
        sudo lsblk

      3. Resize the image partition identified above.
        sudo growpart /dev/sda 1

      4. Extend the file system on the disk/partition to use the added space.
        sudo xfs_growfs /dev/sda1

      5. Verify that the file system is resized
        df -h /dev/sda1

    4. Copy the DSS data directory to a backup directory (this can take a while; have coffee or work on something else in the meantime)
        cp -rv ./dss_data ./dss_data_backup
      Check that the backup directory is about the same size as (or slightly larger than) the original data directory
        du -hcs dss_data_backup/
        du -hcs dss_data/

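    If you want more than an eyeball comparison, here is a small sketch that flags a backup noticeably smaller than the source (the backup_size_ok helper and the 5% tolerance are illustrative, not part of DSS):

    ```shell
    # Sketch: compare source and backup sizes in KB and fail if the
    # backup is more than ~5% smaller than the original data directory.
    backup_size_ok() {
      src_kb=$(du -sk "$1" | awk '{print $1}')
      dst_kb=$(du -sk "$2" | awk '{print $1}')
      # A small shortfall is tolerated (tmp churn, sparse files).
      [ "$dst_kb" -ge $((src_kb * 95 / 100)) ]
    }
    ```

    Call it as `backup_size_ok /dataiku/dss_data /dataiku/dss_data_backup` after the copy finishes.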
  4. Download the latest Dataiku installation file
      wget https://downloads.dataiku.com/public/dss/8.0.1/dataiku-dss-8.0.1.tar.gz
    Unpack the file
      tar xzf dataiku-dss-8.0.1.tar.gz
    After successfully unpacking, delete the tar file
      rm dataiku-dss-8.0.1.tar.gz
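    The version string recurs in several later steps, so it helps to parameterize it once (as step 9 does with DSS_VERSION). A sketch; the dss_tarball_url helper is illustrative, not a Dataiku tool:

    ```shell
    # Sketch: build the installer tarball URL from a single DSS_VERSION
    # variable instead of repeating the version in each command.
    DSS_VERSION="8.0.1"
    dss_tarball_url() {
      echo "https://downloads.dataiku.com/public/dss/$1/dataiku-dss-$1.tar.gz"
    }

    echo "would download: $(dss_tarball_url "$DSS_VERSION")"
    ```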

  5. Perform the upgrade
      dataiku-dss-8.0.1/installer.sh -d dss_data -u


    If there is a missing dependency, run the following as your user (e.g. cheryl_abundo)
      sudo -i "/dataiku/dataiku-dss-8.0.1/scripts/install/install-deps.sh" -without-java -with-conda
    Then, as dataiku, rerun
      dataiku-dss-8.0.1/installer.sh -d dss_data -u

    A successful upgrade ends with the installer reporting that the installation completed; review the output for any errors before continuing.


  6. Edit env files
    vim /dataiku/dss_data/env-site.sh

    Make sure it includes
    # This file is sourced last by DSS startup scripts
    # You can add local customizations to it
    export PATH="/dataiku/anaconda3/condabin:/dataiku/anaconda3/bin:$PATH"
    export TEMP="/dataiku/dss_data/tmp"

    vim /dataiku/dss_data/env-default.sh
    Check the following options
    export DKU_BACKEND_JAVA_OPTS="-Xmx30g -XX:+UseG1GC -Xloggc:/dev/stderr -XX:+PrintGCTimeStamps -Djavax.net.debug=ssl -Djavax.net.ssl.keyStore=/dataiku/certificates/postgresql/client.p12 -Djavax.net.ssl.keyStoreType=pkcs12 -Djavax.net.ssl.keyStorePassword=changeme"
    export DKU_FEK_JAVA_OPTS="-Xmx4g -XX:+UseParallelGC -Xloggc:/dev/stderr -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Djavax.net.debug=ssl -Djavax.net.ssl.keyStore=/dataiku/certificates/postgresql/client.p12 -Djavax.net.ssl.keyStoreType=pkcs12 -Djavax.net.ssl.keyStorePassword=changeme"
    export DKU_HPROXY_JAVA_OPTS="-Xmx4g -XX:+UseParallelGC -Xloggc:/dev/stderr -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Djavax.net.debug=ssl -Djavax.net.ssl.keyStore=/dataiku/certificates/postgresql/client.p12 -Djavax.net.ssl.keyStoreType=pkcs12 -Djavax.net.ssl.keyStorePassword=changeme"
    export DKU_JEK_JAVA_OPTS="-Xmx4g -XX:+UseParallelGC -Xloggc:/dev/stderr -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Djavax.net.debug=ssl -Djavax.net.ssl.keyStore=/dataiku/certificates/postgresql/client.p12 -Djavax.net.ssl.keyStoreType=pkcs12 -Djavax.net.ssl.keyStorePassword=changeme"
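    A quick grep can confirm the env file still contains the expected entries after the upgrade. A sketch assuming the paths above (the env_site_ok helper is illustrative):

    ```shell
    # Sketch: check that env-site.sh still exports the conda PATH and
    # the TEMP directory expected by this runbook.
    env_site_ok() {
      f="$1"
      grep -q '^export PATH="/dataiku/anaconda3/condabin' "$f" &&
        grep -q '^export TEMP="/dataiku/dss_data/tmp"' "$f"
    }
    ```

    Call it as `env_site_ok /dataiku/dss_data/env-site.sh || echo "env-site.sh needs editing"`.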

  7. Update R installation
      ./dss_data/bin/dssadmin install-R-integration


  8. Reinstall graphics export
      ./dss_data/bin/dssadmin install-graphics-export
    If there is a missing dependency, run the following as your user (e.g. cheryl_abundo)
      sudo -i "/dataiku/dataiku-dss-8.0.1/scripts/install/install-deps.sh" -without-java -without-python -with-chrome
      Then as dataiku, rerun
    ./dss_data/bin/dssadmin install-graphics-export

  9. Reinstall standalone Hadoop and Spark

    1. Download required files
      export DSS_VERSION="8.0.1"

      wget https://downloads.dataiku.com/public/dss/"$DSS_VERSION"/dataiku-dss-spark-standalone-"$DSS_VERSION"-2.4.5-generic-hadoop3.tar.gz
      wget https://downloads.dataiku.com/public/dss/"$DSS_VERSION"/dataiku-dss-hadoop-standalone-libs-generic-hadoop3-"$DSS_VERSION".tar.gz
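      The two archive names are long and easy to mistype, so they can be derived from DSS_VERSION instead. A sketch (the standalone_urls helper is illustrative, not a Dataiku tool):

      ```shell
      # Sketch: print the Spark and Hadoop standalone archive URLs for a
      # given DSS version, matching the wget commands above.
      standalone_urls() {
        v="$1"
        echo "https://downloads.dataiku.com/public/dss/$v/dataiku-dss-spark-standalone-$v-2.4.5-generic-hadoop3.tar.gz"
        echo "https://downloads.dataiku.com/public/dss/$v/dataiku-dss-hadoop-standalone-libs-generic-hadoop3-$v.tar.gz"
      }
      ```

      One way to use it: `standalone_urls "$DSS_VERSION" | xargs -n1 wget`.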

    2. Run Hadoop integration
      ./dss_data/bin/dssadmin install-hadoop-integration -standaloneArchive /dataiku/dataiku-dss-hadoop-standalone-libs-generic-hadoop3-"$DSS_VERSION".tar.gz

    3. Run Spark integration
      ./dss_data/bin/dssadmin install-spark-integration -standaloneArchive /dataiku/dataiku-dss-spark-standalone-"$DSS_VERSION"-2.4.5-generic-hadoop3.tar.gz -forK8S

    4. Build container images

      # Download the prebuilt images archive
      export DSS_VERSION="8.0.1"
      wget https://downloads.dataiku.com/public/dss/"$DSS_VERSION"/container-images/dataiku-dss-ALL-base_dss-"$DSS_VERSION"-r-py3.6.tar.gz

      # Load the prebuilt images on your local docker repository (on the DSS vm)
      gunzip dataiku-dss-ALL-base_dss-"$DSS_VERSION"-r-py3.6.tar.gz
      docker image load < dataiku-dss-ALL-base_dss-"$DSS_VERSION"-r-py3.6.tar

      # Check the new dss-8.0.1 images are loaded
      docker images

      # The following commands will allow your DSS instance to use the DSS images just loaded (double check the name for each container image)
      export DSS_CONTAINER_EXEC_IMAGE="dataiku-dss-container-exec-base:dss-"$DSS_VERSION"-r-py3.6"
      export DSS_SPARK_IMAGE="dataiku-dss-spark-exec-base:dss-"$DSS_VERSION"-r-py3.6"
      export DSS_API_NODE_IMAGE="dataiku-dss-apideployer-base:dss-"$DSS_VERSION"-r-py3.6"

      ./dss_data/bin/dssadmin build-base-image --type container-exec --mode=use --source-image "$DSS_CONTAINER_EXEC_IMAGE"
      ./dss_data/bin/dssadmin build-base-image --type spark --mode=use --source-image "$DSS_SPARK_IMAGE"
      ./dss_data/bin/dssadmin build-base-image --type api-deployer --mode=use --source-image "$DSS_API_NODE_IMAGE"
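      The three image tags differ only in the repository name, so deriving them from the version avoids a subtle typo like a missing hyphen. A sketch (the image_tag helper is illustrative):

      ```shell
      # Sketch: build a base-image tag from a repository name and a DSS
      # version, matching the dss-<version>-r-py3.6 tag convention above.
      image_tag() {
        echo "$1:dss-$2-r-py3.6"
      }

      image_tag dataiku-dss-container-exec-base 8.0.1
      image_tag dataiku-dss-spark-exec-base 8.0.1
      image_tag dataiku-dss-apideployer-base 8.0.1
      ```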


    5. Build code env images
      ./dss_data/bin/dssadmin build-code-env-images --all

  10. Secure the new installation (user isolation framework)
    Run as your own user (e.g. cheryl_abundo)
      sudo /dataiku/dss_data/bin/dssadmin install-impersonation dataiku 

  11. Start DSS
      ./dss_data/bin/dss start

  12. Configure container execution

    In the DSS container execution settings, verify the connection by clicking TEST, then click PUSH IMAGES.

  13. Configure Spark

    In the DSS Spark settings, click PUSH IMAGES.