Summary
This API extracts the response time of each application at each site. However, not all Solvay sites have this program set up, so data is available for some sites only.
The API URL is https://app.kadiska.com/api/v1/query (l_VAR_kadiska_url_query)
Example of query (Body)
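A minimal sketch of such a body, assuming the query takes an epoch-millisecond time window, paging parameters, and an optional filter in the format used by l_VAR_kadiska_q_condition (the field names begin/end/limit/offset/where are illustrative, not the confirmed Kadiska schema):

```python
# Hedged example of a query body (illustrative field names):
body = {
    "begin": 1724371200000,   # window start, epoch milliseconds
    "end": 1724374800000,     # begin + 1 hour (l_VAR_kadiska_q_interval)
    "limit": 9999,            # l_VAR_kadiska_q_limit
    "offset": 0,              # l_VAR_kadiska_q_offset
    # optional extra filter, l_VAR_kadiska_q_condition:
    "where": ["=", "watcher_name", ["$", "Google Calendar"]],
}
```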
| Table | Incremental Load By | Incremental Flag | No. Fields | Talend Job | ODS Table | DM | DPL Table |
|---|---|---|---|---|---|---|---|
| | Start time | KADISKA_RUM | 83 | F070_Kadiska_rum_to_ODS | ODS_KDK_0000_F001_I_H_rum | FACT_kdk_rum | V_FACT_kdk_rum |
Tools: Talend
Detail job
- J070_Kadiska_to_GCS
- Define the dates:
- Time window to select: "Begin" to "End"
- Time to stop: "Stop"
- Time today: "Now"
- Log in to the Kadiska API at URL https://app.kadiska.com/api/v1/config/clients/client:5b77758db8/tokens (l_VAR_kadiska_url_token) and keep the token as a global variable
- Start looping (see the sketch after this list)
- Extract data from the Kadiska API at URL https://app.kadiska.com/api/v1/query (l_VAR_kadiska_url_query), controlled by the variables described in 1.1 Incremental Load
- In case [no more data] (there is no data, or the End time >= the Stop time): stop the loop
- In case [data reaches the limit] (there is data and the number of lines = the limit): add the number of lines to the offset
- In case [no more data during the interval, go to the next hour] (there is data and the number of lines < the limit, or there is no data and the End time < the Stop time): set the offset to 0 and advance Begin and End by 1 hour (the interval)
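The three loop cases above can be summarized in the following sketch (Python for illustration only; the real job is Talend, and the request field names and auth header are assumptions, while the URLs, the 1-hour interval, the 9999 limit, the nloop cap, and the "data" response property come from this page):

```python
import requests

KADISKA_URL_QUERY = "https://app.kadiska.com/api/v1/query"  # l_VAR_kadiska_url_query
INTERVAL_MS = 3600000  # l_VAR_kadiska_q_interval: 1 hour
LIMIT = 9999           # l_VAR_kadiska_q_limit: max rows per call
NLOOP = 5              # l_VAR_kadiska_q_nloop: max API calls per run

def extract(token, begin_ms, stop_ms):
    """Fetch rows hour by hour until the Stop time is reached."""
    rows, offset = [], 0
    end_ms = begin_ms + INTERVAL_MS
    for _ in range(NLOOP):
        resp = requests.post(
            KADISKA_URL_QUERY,
            headers={"Authorization": "Bearer " + token},  # token kept from the login step
            json={"begin": begin_ms, "end": end_ms, "limit": LIMIT, "offset": offset},
        )
        data = resp.json().get("data", [])
        rows.extend(data)
        if not data and end_ms >= stop_ms:
            break                # [no more data]: stop the loop
        if len(data) == LIMIT:
            offset += len(data)  # [data reaches the limit]: next page of the same hour
        else:
            offset = 0           # [interval done]: go to the next hour
            begin_ms += INTERVAL_MS
            end_ms += INTERVAL_MS
    return rows
```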
Flow job
- F070_Kadiska_rum_to_ODS
- Define variables such as the filename
- Get meta_run_id
- Define meta_file_name for incremental or reload depending on l_VAR_kadiska_q_reload; the name is fixed to KADISKA_RUM[_RELOAD] (see the sketch after this list)
- Get the last successful load from the incremental_loading table
- Call the detail job
- Load to ODS
- Update the last successful date in the incremental_loading table
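A minimal sketch of the meta_file_name and time-window logic above (Python for illustration; the helper name is hypothetical, while the file names, the "incremental" flag value, and the epoch end-time convention come from this page):

```python
from datetime import datetime, timezone

def control_values(q_reload, last_process_date):
    """Hypothetical helper mirroring how F070 derives its control values."""
    incremental = (q_reload == "incremental")
    # meta_file_name is fixed to KADISKA_RUM, with _RELOAD appended for reloads
    meta_file_name = "KADISKA_RUM" if incremental else "KADISKA_RUM_RELOAD"
    # the start time always comes from incremental_loading for that meta_file_name
    begin_ms = int(last_process_date.replace(tzinfo=timezone.utc).timestamp() * 1000)
    # for a reload, l_VAR_kadiska_q_reload itself holds the end time in epoch ms
    stop_ms = (int(datetime.now(timezone.utc).timestamp() * 1000)
               if incremental else int(str(q_reload).strip("[]")))
    return meta_file_name, begin_ms, stop_ms
```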
Access rights
Access is requested from the network team: Mohammed IDRISSI and PRISCILA SANCHES DE BRITO (ext), or the Kadiska team: https://kadiska.com/contact-us/
The username/password is encrypted in the variable g_CNX_KDK_Password
Source
https://app.kadiska.com/api/v1/query
Format
JSON
Destination
Location
- Bucket = cs-ew1-prj-data-dm-dt-[env]-staging/KADISKA/
- DataOcean GCP = prj-data-dm-dt-[env]
- STG Table names =
- prj-data-dm-dt-prod.STG.STG_KDK_0000_0000_F001_I_H_rum
- ODS Table names =
- prj-data-dm-dt-prod.ODS.ODS_KDK_0000_F001_I_H_rum
- DPL View names =
- prj-data-dm-dt-[env].DPL.V_FACT_kdk_rum
Format
columnar format
Sizing
Assessment
Loading
1.1 Incremental Load
It is controlled by the following variables (example values follow the list):
- l_VAR_kadiska_q_condition: normally blank, but an additional filter can be added, such as ["=", "watcher_name", ["$","Google Calendar"]],
- l_VAR_kadiska_q_interval: agreed during project design to be 1 hour = 3600000 milliseconds (should not be changed; the project agreed to keep the 1-hour gap)
- l_VAR_kadiska_q_limit: max number of records to fetch from the API per call (9999 recommended, which is the maximum the API returns)
- l_VAR_kadiska_q_nloop: max number of loops (5 recommended; increase it when reloading in order to avoid loading many duplicated records)
- l_VAR_kadiska_q_offset: start index of the records in the API query
- l_VAR_kadiska_q_reload: "incremental" for the incremental load; for a reload, change this variable to the end time in epoch format (the start time is taken from the incremental_loading table where meta_file_name = KADISKA_RUM_RELOAD)
- l_VAR_kadiska_email_flag: "yes" when Talend should send an email alert when the incremental load is not advancing in the incremental_loading table, which can mean duplicated data is being loaded
- l_VAR_kadiska_email_recipient: list of email addresses that receive the alert
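For reference, typical context values for a normal incremental run (shown as a Python dict for readability; in Talend these are context variables, and the recipient address is a placeholder):

```python
context = {
    "l_VAR_kadiska_q_condition": "",           # blank unless an extra filter is needed
    "l_VAR_kadiska_q_interval": 3600000,       # 1 hour in ms; do not change
    "l_VAR_kadiska_q_limit": 9999,             # max the API returns per call
    "l_VAR_kadiska_q_nloop": 5,                # increase for reloads
    "l_VAR_kadiska_q_offset": 0,               # start index
    "l_VAR_kadiska_q_reload": "incremental",   # or an epoch-ms end time for a reload
    "l_VAR_kadiska_email_flag": "yes",         # alert when the load is not advancing
    "l_VAR_kadiska_email_recipient": "dm-team@example.com",  # placeholder address
}
```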
1.2 Full load
N/A
1.3 Reloading data
It is controlled by the following variables:
- l_VAR_kadiska_q_limit: the number of records to fetch from the API each call
- l_VAR_kadiska_q_nloop: the number of loops for each load (limit x nloop is the maximum number of records in the file loaded to BQ each time)
- l_VAR_kadiska_q_offset: the record index to start from in the API query; it starts at 0 and is changed only in specific reload cases
- l_VAR_kadiska_q_reload: if it is NOT "incremental", the job runs a full load bounded by this parameter, such as [1724371200000], which is the end time (epoch format); the start date is always taken from the incremental_loading table where meta_file_name = KADISKA_RUM_RELOAD
How to:
1. Stop the scheduled job
2. Change the start date from which to reload:
UPDATE STG.incremental_loading
SET meta_last_process_date = '2021-12-22 00:00:00' -- the date to reload from
WHERE meta_file_name = 'KADISKA_RUM_RELOAD' -- the object to reload; the name must end with _RELOAD (a name without _RELOAD is used for the incremental load only)
3. Change the context variable l_VAR_kadiska_q_reload to the end date in epoch format (a conversion sketch follows this list)
4. Recheck the variables: limit, nloop, offset
5. Run the job until KADISKA_RUM_RELOAD reaches the end time entered in l_VAR_kadiska_q_reload, meaning the reload is complete
6. Change the context variable l_VAR_kadiska_q_reload back to "incremental"
7. Reschedule the job
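Step 3 needs the end date in epoch-millisecond format; a quick way to compute it (the date below is only an example):

```python
from datetime import datetime, timezone

end = datetime(2021, 12, 23, 0, 0, 0, tzinfo=timezone.utc)  # example reload end date
print(int(end.timestamp() * 1000))  # epoch-ms value for l_VAR_kadiska_q_reload
```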
1.4 Plan to schedule
Every hour
1.5 Timing
5 minutes
Criticality
Low
Logging
1. Check the last load
select * from STG.incremental_loading
where meta_file_name like '%KADISKA_RUM'
order by meta_file_name
2. Check the loading records / errors
select job.job_name, job.meta_start_date, logs.meta_run_id, logs.meta_source_system,
       logs.meta_step, logs.meta_status, logs.meta_num_lines, logs.meta_error_lines
from STG.log_tables logs
join STG.run_jobs job on logs.meta_run_id = job.meta_run_id
where logs.meta_run_id in (select meta_run_id from STG.run_jobs order by meta_start_date desc limit 100)
and job_name like '%Kadiska%'
and meta_start_date > DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
order by meta_start_date desc
3. Check duplicated data / missing data
select meta_business_date, meta_run_id, time_begin, count(*)
from `ODS.ODS_KDK_0000_F001_I_H_rum`
group by meta_business_date, meta_run_id, time_begin
order by meta_business_date desc, time_begin desc
Known Errors
1. If the last nloop iterations do not cover all records within the limit, the job moves to the next hour (this happens only when loading more than a 1-hour gap). Check with Logging script 3: if the number of records = the limit, there may be more data.
2. When there is no data loading and the API returns in Postman as in the picture below, the log in TMC will be:
Error tExtractJSONFields_2 - Missing property in path $['data']
This is an error on the Kadiska side about the limit of run times per day.
For unavailable data, contact the Solvay network team: mohamed.idrissi@solvay.com, priscila.sanchesdebrito-ext@solvay.com, vinicius.cosendey@solvay.com
You can also contact the Kadiska team directly at https://kadiska.com/contact-us/ (requires an account created by Vinicius), adding the network team (Mohammed, Priscila and Vinicius) in CC on the ticket
- To create a request/case, see Process_ How to open Kadiska Cases/Incidents
Other contacts: email support@netskope.com, or call one of the regional support numbers:
US: 1-800-685-2098
UK: 44-8455280141
Australia: 1-800-505-486
Europe: 44-8455280141
Singapore: 80-0130-2191
India: 00080-0100-4400




