...

3.1 Find the "Web ID" by using this URL: https://pivision.eua.solvay.com/piwebapi/points/search?dataServerWebId=F1DSUZOarN2bd0yKBpfAHpVUXAQUNFVzFQU1RFS1BEQTAx&query=TOR.BIR.SILO.SILO.SILO.CBSY_ML2_BIR_Calcio_BIR_2COL_A&selectedFields=Items.WebId

The red text is the tag name in this file List of Tag (column N). For example, in this project for Torrelavega we will get only 5 tags, which are controlled by a variable: add the tags into this context in RDS:

l_VAR_eBatch_tag_TOR_tag = {'TOR.BIR.SILO.SILO.SILO.SY_L2_COL_A','TOR.BIR.SILO.SILO.SILO.SY_L2_COL_B','TOR.BIR.SILO.SILO.SILO.SY_L2_COL_L','TOR.BIR.SILO.SILO.SILO.SY_L2_PPM_CA_BIR'}

Remark: each site has different parameters. TOR is the site code for Torrelavega.

...

Example output of the link:

...

  1. This is the main URL to call in order to get another URL.
  2. Output from step 1 on the web (authorization is required to get the result).
  3. The selected link will be called to get the result.
  4. Call the selected link URL + the new parameter.
  5. Example of the full URL.
  6. The result of the second call. The fields can be selected with the parameter "selectedFields=Items.Timestamp;Items.Value".
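The two calls above can be sketched in Python. This is a minimal, hypothetical sketch, not the generated job code: the base URL is taken from the example above, and the `streams/{webId}/recorded` route is an assumption about which PI Web API read is used.

```python
from urllib.parse import urlencode

# Base URL from the example above (the DEV endpoint is pivision-dev.eua.solvay.com).
PIWEBAPI = "https://pivision.eua.solvay.com/piwebapi"

def build_search_url(data_server_web_id: str, tag: str) -> str:
    """First call: search for a tag name and ask only for its WebId."""
    params = {
        "dataServerWebId": data_server_web_id,
        "query": tag,
        "selectedFields": "Items.WebId",
    }
    return f"{PIWEBAPI}/points/search?{urlencode(params)}"

def build_values_url(point_web_id: str) -> str:
    """Second call: read values for the WebId returned by the first call,
    keeping only timestamp and value."""
    params = {"selectedFields": "Items.Timestamp;Items.Value"}
    return f"{PIWEBAPI}/streams/{point_web_id}/recorded?{urlencode(params)}"
```

The actual GET requests would be made with an authenticated session (see the authorization remarks in the Talend section below).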

Talend

Project IND_EBATCH

The API URLs are set up in the contexts below for all the sites:

l_VAR_eBatch_endpoint

l_VAR_eBatch_piwebapi_url

l_VAR_eBatch_tag_dataServer


Site-specific values are set up in contexts; for example, TOR is Torrelavega:

l_VAR_eBatch_eventFrame_TOR_assetDB for event frame

l_VAR_eBatch_tag_TOR_tag for tag value


1. Job Detail


  1. tFileDelete: deletes the Python log file DATA\[env]\IND\IND_EBATCH\py_log.txt. All the jobs use the same file name, and it is deleted beforehand by the EventFrame job only; the other jobs do not delete it, so we can see all the logs of each load.
  2. tJava: generates a .py file in DATA\[env]\IND\IND_EBATCH\pyCodeyyyyMMddHHmmss.py by reading the input parameters from the Flow job and the RDS parameters. Example of Python code
  3. tSystem: runs the .py file.
    1. The screen output is kept in DATA\[env]\IND\IND_EBATCH\py_log.txt.
    2. The extraction is generated in DATA\[env]\IND\IND_EBATCH\Output\[API interface]\file.csv.
  4. tGSPut: uploads the CSV file to GCS prj-data-sad-ebatch-[env]/cs-ew1-prj-data-sad-ebatch-[env]-staging/[API interface].
  5. tFileDelete: deletes the CSV files.
  6. tFileDelete: deletes the .py file when the context l_VAR_eBatch_py_debug = no.
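The generated .py file itself is not shown here; as a rough, hypothetical sketch of its shape (the function name and the row layout are illustrative, not the real generated code), it writes the extracted rows to Output\[API interface]\file.csv and appends to the shared py_log.txt:

```python
import csv
import datetime
import os

def write_extraction(rows, out_dir, log_path, interface="PI_TOR_TAG"):
    """Write extracted rows to Output/<interface>/file.csv and append a log line.
    `rows` is a list of {"Timestamp": ..., "Value": ...} dicts, as returned by a
    PI Web API call with selectedFields=Items.Timestamp;Items.Value.
    """
    target = os.path.join(out_dir, "Output", interface)
    os.makedirs(target, exist_ok=True)
    csv_path = os.path.join(target, "file.csv")
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Timestamp", "Value"])
        writer.writeheader()
        writer.writerows(rows)
    # Same py_log.txt shared by all jobs: append, never truncate.
    with open(log_path, "a") as log:
        log.write(f"{datetime.datetime.now().isoformat()} wrote {len(rows)} rows to {csv_path}\n")
    return csv_path
```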

2. Flow job 

2.1. Get the last load from table STG.incremental_loading

...

To access PI, it is required that david.genet@solvay.com grants authorization to access PI (Windows account SVC_TALEND_ADMIN_DEV) first in order to run the Python script (authenticated by Windows account). When promoting, the authorization has to be requested again.

In DEV and TEST, we can use the function HttpNegotiateAuth() to authenticate without providing a username and password; it automatically identifies the user SVC_TALEND_ADMIN_DEV. However, in PREPROD and PROD the configuration is different, so we need to use the function HttpNtlmAuth('EUA\\user', 'password') instead. This can be set up in the context l_VAR_eBatch_piwebapi_auth in RDS.
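A minimal sketch of how the Python side could pick the auth object per environment. The helper names are illustrative, and the package names (requests-negotiate-sspi and requests-ntlm are the usual homes of HttpNegotiateAuth and HttpNtlmAuth) are assumptions, not taken from the generated code:

```python
def pi_auth_kind(env: str) -> str:
    """Pick the auth scheme per environment: Negotiate in DEV/TEST, NTLM elsewhere."""
    return "negotiate" if env.upper() in ("DEV", "TEST") else "ntlm"

def make_pi_auth(env: str, user: str = "", password: str = ""):
    """Build the requests auth object for the PI Web API calls.
    DEV/TEST: HttpNegotiateAuth() identifies the logged-in Windows account
    (SVC_TALEND_ADMIN_DEV). PREPROD/PROD: explicit NTLM credentials on the EUA domain.
    """
    if pi_auth_kind(env) == "negotiate":
        from requests_negotiate_sspi import HttpNegotiateAuth  # Windows-only package
        return HttpNegotiateAuth()
    from requests_ntlm import HttpNtlmAuth
    return HttpNtlmAuth(f"EUA\\{user}", password)
```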

4. Source

https://pivision-dev.eua.solvay.com/piwebapi/

...

Event Frame with calculation
Event Frame with value in time series - Granulom
Tags data

Sizing

Less than 100 records / hour

...

In step 2.4 of the Flow job, job T101 is used for Tag and Event Frame, while T103_UPDATE_LAST_EXTRACT_FROM_TAGS is used to update the last extract for tags. This job needs a special select SQL because the extraction gets many tags at the same time and the start time is not the same for each tag; therefore, we need to select the minimum start time among the tags.
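Why the minimum is needed can be illustrated with a small Python sketch (tag names and timestamps are made up; the real job does this in SQL against STG.incremental_loading):

```python
from datetime import datetime

# Hypothetical last-extraction times per tag; the real values come from
# STG.incremental_loading.
last_extract = {
    "TAG_A": datetime(2024, 11, 1, 3, 0),
    "TAG_B": datetime(2024, 11, 1, 1, 0),  # the lagging tag
    "TAG_C": datetime(2024, 11, 1, 2, 0),
}

# One extraction covers all tags at once, so it must start from the oldest
# last-extraction time, otherwise the lagging tag would miss data.
start_time = min(last_extract.values())
```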

meta_file_name is set up by the context l_VAR_eBatch_tag_TOR_load. If this name changes, l_VAR_eBatch_tag_TOR_STG and l_VAR_eBatch_tag_TOR_ODS should be changed as well, in case you want to save the data into other tables with the new incremental load flag. Normally these 3 contexts should not be changed.

2. Reloading data

It is possible to reload flexibly by setting the start time in table STG.incremental_loading in project prj-data-sad-ebatch-dev:

update `prj-data-sad-ebatch-dev.STG.incremental_loading`
set meta_last_process_date = '2024-11-01 00:00:00'
where meta_file_name = 'PI_TOR_EFRAME' -- or PI_TOR_EFRAME_GRANULOM, PI_TOR_TAG,

It is also possible to specify the end time of the reload in the RDS context "l_VAR_eBatch_eventFrame_TOR_endtime", e.g. 2024-05-29%2003:00:00 (URL-encoded). Normally this context variable should be empty.

3. Scheduling

     PL_EBATCH_PI_TO_ODS runs every hour, every day including weekends.

4. Timing

    An event frame will take around 2 - 4 min

...

select job.job_name, job.meta_start_date, logs.meta_run_id, logs.meta_source_system, logs.meta_step, logs.meta_status, logs.meta_num_lines, logs.meta_error_lines
from STG.log_tables logs
join STG.run_jobs job on logs.meta_run_id = job.meta_run_id
where logs.meta_run_id in (select meta_run_id from STG.run_jobs order by meta_start_date desc limit 100)
and job_name like '%PI%'
and meta_step = 'Bucket to Staging'
and meta_start_date > DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
order by job.meta_start_date desc

          1.1 We should NOT have 0 records. If it is 0, it normally means the Python script failed (even when Python fails, the status will show OK in the Talend log); please check the Python log at \data\[env]\IND\IND_EBATCH\py_log.txt.

          1.2 If it is required to check the .py file, the context l_VAR_eBatch_py_debug in RDS has to be changed from no to yes. Then the .py file will not be deleted after the program runs.

    2. Monitor the last update date. The result should be the current date.

...