Summary

This API extracts the response time of each application at each site. However, not all Solvay sites have this program set up, so only some sites are visible.

The API URL is https://app.kadiska.com/api/v1/query (l_VAR_kadiska_url_query)

Example of query (Body)


Table: Kadiska-rum
Incremental Load By: Start time
Incremental Flag: KADISKA_RUM
No. Fields: 83
Talend Job: F070_Kadiska_rum_to_ODS
ODS Table: ODS_KDK_0000_F001_I_H_rum
DM Table: FACT_kdk_rum
DPL Table: V_FACT_kdk_rum

Tools: Talend

Detail job

  • J070_Kadiska_to_GCS

  1. Define the dates
    1. Time window to select: "Begin" to "End"
    2. Time to stop: "Stop"
    3. Time today: "Now"
  2. Log in to the Kadiska API at URL https://app.kadiska.com/api/v1/config/clients/client:5b77758db8/tokens (l_VAR_kadiska_url_token) and keep the token as a global variable
  3. Start looping
  4. Extract data from the Kadiska API at URL https://app.kadiska.com/api/v1/query (l_VAR_kadiska_url_query), controlled by the variables described below
  5. [no more data] If there is no data and the end time >= stop time, stop the loop
  6. [more data than the limit] If there is data and the number of lines = limit, increase the offset by the number of lines
  7. [no more data in the interval, go to next hour] If there is data and the number of lines < limit, or there is no data and the end time < stop time, set offset = 0 and move Begin and End forward by 1 hour (the interval)
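The looping logic of steps 3-7 can be sketched in Python. This is an illustrative model only, under the assumption that the API behaves as described above: `fetch_page` stands in for the authenticated call to l_VAR_kadiska_url_query and is not a real Talend component.

```python
# Sketch of detail-job steps 3-7 (pagination + 1-hour interval loop).
# fetch_page(begin_ms, end_ms, offset, limit) -> list of rows; it stands in
# for the real Kadiska /api/v1/query call, authenticated with the token
# obtained from l_VAR_kadiska_url_token.

INTERVAL_MS = 3_600_000  # l_VAR_kadiska_q_interval: fixed 1-hour gap


def extract(fetch_page, begin_ms, stop_ms, limit=9999, nloop=5):
    """limit = l_VAR_kadiska_q_limit, nloop = l_VAR_kadiska_q_nloop."""
    rows, offset, loops = [], 0, 0
    end_ms = begin_ms + INTERVAL_MS
    while loops < nloop:  # nloop caps the calls per load (see Known Error 1)
        loops += 1
        page = fetch_page(begin_ms, end_ms, offset, limit)
        rows.extend(page)
        if not page and end_ms >= stop_ms:
            break                      # step 5: no more data, stop the loop
        if len(page) == limit:
            offset += len(page)        # step 6: page full, advance the offset
        else:
            offset = 0                 # step 7: interval done, go to next hour
            begin_ms += INTERVAL_MS
            end_ms += INTERVAL_MS
    return rows
```

Note how step 7 resets the offset before advancing the hour, so each interval is paged from its first record.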

Flow job

  • F070_Kadiska_rum_to_ODS

  1. Define variables such as the filename
  2. Get meta_run_id
  3. Define meta_file_name for incremental load or reload, depending on l_VAR_kadiska_q_reload; the name is fixed to KADISKA_RUM[_RELOAD]
  4. Get the last successful load from the table incremental_loading
  5. Call the detail job
  6. Load to ODS
  7. Update the last successful date in the incremental_loading table
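The naming rule in step 3 can be sketched as a small helper (illustrative Python, not part of the Talend job itself):

```python
def meta_file_name(reload_setting):
    # l_VAR_kadiska_q_reload == "incremental" -> normal incremental load;
    # any other value (an epoch end time) -> reload, tracked separately
    # in incremental_loading under the _RELOAD suffix.
    if reload_setting == "incremental":
        return "KADISKA_RUM"
    return "KADISKA_RUM_RELOAD"
```

Keeping the two names apart means a reload never overwrites the bookmark used by the normal hourly schedule.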

Access rights

Access is requested through the Solvay network team: Mohammed IDRISSI and Priscila SANCHES DE BRITO (ext.). Kadiska team contact: https://kadiska.com/contact-us/

The username/password is encrypted in the variable g_CNX_KDK_Password.

Source

https://app.kadiska.com/api/v1/query

Format

JSON

Destination

Location

  • Bucket = cs-ew1-prj-data-dm-dt-[dev]-staging/KADISKA/
  • Data Ocean GCP project = prj-data-dm-dt-[env]
  • STG Table names =
    • prj-data-dm-dt-prod.STG.STG_KDK_0000_0000_F001_I_H_rum
  • ODS Table names =
    • prj-data-dm-dt-prod.ODS.ODS_KDK_0000_F001_I_H_rum
  • DPL View names =
    • prj-data-dm-dt-[env].DPL.V_FACT_kdk_rum

Format

columnar format

Sizing


Assessment

Loading

1.1 Incremental Load

It is controlled by the following variables:

  • l_VAR_kadiska_q_condition: normally blank, but it can add an additional filter such as ["=", "watcher_name", ["$","Google Calendar"]],
  • l_VAR_kadiska_q_interval: during project design we agreed on an interval of 1 hour = 3600000 milliseconds (should not be changed; the project agreed to keep the 1-hour gap)
  • l_VAR_kadiska_q_limit: max number of records to get from the API on each call (recommended 9999, which is the API's maximum)
  • l_VAR_kadiska_q_nloop: max number of loops (recommended 5; increase it when reloading, in order to avoid loading much duplicated data)
  • l_VAR_kadiska_q_offset: start index of the records in the API query
  • l_VAR_kadiska_q_reload: "incremental" for the incremental load; in case of a reload, change this variable to the end time in epoch format (the start time is taken from the table incremental_loading where meta_file_name = KADISKA_RUM_RELOAD)
  • l_VAR_kadiska_email_flag: "yes" when Talend should send an email to warn that the incremental date in the incremental_loading table is not increasing, which can cause duplicated data to be loaded
  • l_VAR_kadiska_email_recipient: list of email addresses that receive the notification
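The exact JSON body sent to l_VAR_kadiska_url_query is not documented on this page, so as a rough, unverified sketch, the variables above could combine into a request body along these lines. Every key name below is a placeholder assumption, not the real Kadiska schema; only the condition expression format is quoted from the list above.

```python
import json


def build_query_body(begin_ms, end_ms, offset, limit, condition=None):
    # Hypothetical request body: the key names ("begin", "end", "offset",
    # "limit", "where") are placeholders, NOT the documented Kadiska schema.
    body = {"begin": begin_ms, "end": end_ms, "offset": offset, "limit": limit}
    if condition:
        # l_VAR_kadiska_q_condition, e.g. ["=", "watcher_name", ["$", "Google Calendar"]]
        body["where"] = condition
    return json.dumps(body)
```

The point is only to show which variable lands where in the call; refer to the Kadiska API documentation for the real field names.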

1.2 Full load

N/A

1.3. Reloading data

It is controlled by the following variables:

  • l_VAR_kadiska_q_limit: the number of records to get from the API each time
  • l_VAR_kadiska_q_nloop: the number of loops for each load (limit x nloop is the max number of records in the file loaded to BQ each time)
  • l_VAR_kadiska_q_offset: the record index to start from in the API query. It starts at 0 and is changed only in specific reload cases
  • l_VAR_kadiska_q_reload: if it is NOT "incremental", the job runs a reload conditioned on this parameter, e.g. [1724371200000], which is the end time (epoch format). The start date is always taken from the incremental_loading table where meta_file_name = KADISKA_RUM_RELOAD

How to:

1. Stop the scheduled job

2. Set the start date from which to reload:

UPDATE STG.incremental_loading
SET meta_last_process_date = '2021-12-22 00:00:00'   -- the date to reload from
WHERE meta_file_name = 'KADISKA_RUM_RELOAD'   -- the object to reload; the name must end with _RELOAD (without _RELOAD it is used for the incremental load only)

3. Change the context variable l_VAR_kadiska_q_reload to the end date in epoch format

4. Recheck the variables: limit, nloop, offset

5. Run the job repeatedly until the KADISKA_RUM_RELOAD date reaches the end time entered in l_VAR_kadiska_q_reload, meaning the reload is complete

6. Change the context variable l_VAR_kadiska_q_reload back to "incremental"

7. Reschedule the job
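For step 3, the end date has to be expressed in epoch milliseconds. A small conversion helper (assuming the timestamps are interpreted as UTC, which this page does not state explicitly):

```python
from datetime import datetime, timezone


def to_epoch_ms(date_str):
    """Convert 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to epoch milliseconds,
    the format expected by l_VAR_kadiska_q_reload."""
    dt = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)
```

For example, to_epoch_ms("2024-08-23 00:00:00") yields 1724371200000, the sample value shown in section 1.3.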

1.4 Plan to schedule

Every hour

1.5 Timing

5 minutes

Criticality

Low

Logging

1. Check the last loading

select * from STG.incremental_loading
where meta_file_name like '%KADISKA_RUM'
order by meta_file_name

2. Check the loading records / error

select job.job_name, job.meta_start_date, logs.meta_run_id, logs.meta_source_system, logs.meta_step, logs.meta_status, logs.meta_num_lines, logs.meta_error_lines
from STG.log_tables logs
join STG.run_jobs job on logs.meta_run_id = job.meta_run_id
where logs.meta_run_id in (select meta_run_id from STG.run_jobs order by meta_start_date desc limit 100)
and job.job_name like '%Kadiska%'
and job.meta_start_date > DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
order by job.meta_start_date desc

3. Check duplicated data / missing data

select meta_business_date, meta_run_id, time_begin, count(*)
from `ODS.ODS_KDK_0000_F001_I_H_rum`
group by meta_business_date, meta_run_id, time_begin
order by meta_business_date desc, time_begin desc

Known Error

1. If the last nloop iterations do not cover all records at the given limit, the job moves on to the next hour (this happens only when loading more than a 1-hour gap). Check with Logging script 3: if the number of records equals the limit, there may be more data.

2. When no data is loaded and the API returns a response in Postman like the picture below

The log in TMC will be

Error tExtractJSONFields_2 - Missing property in path $['data']

This is a Kadiska-side error caused by the limit on the number of runs per day.

For unavailable data, contact the Solvay Network team: mohamed.idrissi@solvay.com, priscila.sanchesdebrito-ext@solvay.com, vinicius.cosendey@solvay.com

You can also contact the Kadiska team directly at https://kadiska.com/contact-us/ (needs an account created by Vinicius), adding the Network team (Mohammed, Priscila and Vinicius) in cc on the ticket.

  • to create a request/case > Process_  How to open Kadiska Cases/Incidents
  • Other contacts: email at support@netskope.com, or via telephone using one of our regional support numbers:  

    US: 1-800-685-2098

    UK: 44-8455280141

    Australia: 1-800-505-486

    Europe: 44-8455280141

    Singapore: 80-0130-2191

    India: 00080-0100-4400