Search This Blog

Friday, 16 August 2013

Change Data Capture using Mapping variable

Informatica Parameters and Variables


Purpose

A mapping can utilize parameters and variables to store information during the execution. Each parameter and variable is defined with a specific data type and their main purpose is to provide increased development flexibility. 

Parameters are different from variables in the fact that:


  • Value of a parameter is fixed during the run of the mapping

  • Variables can change in value during run-time.


Both parameters and variables can be accessed from any component in the mapping which supports it. To create a parameter or variable, go to Mapping -> Parameters and Variables from within the Mapping Designer in the Designer client.


The format is $$VariableName or $$ParameterName .

What is Mapping Variable?


These are variables created in Power Center Designer, which you can use in any expression in a mapping, and you can also use the mapping variables in a source qualifier filter, user-defined join, or extract override, and in the Expression Editor of reusable transformations.

Mapping Variable Starting Value


Mapping variable can take the starting value from



  1. Parameter file


  2. Pre-session variable assignment


  3. Value saved in the repository


  4. Initial value


  5. Default Value


The Integration Service looks for the start value in the order mentioned above. Value of the mapping variable can be changed with in the session using an expression and the final value of the variable will be saved into the repository. The saved value from the repository is retrieved in the next session run and used as the session start value.


Changing values of Variables


To change the value of a variable, one of the following functions can be used within an expression: SETMAXVARIABLE($$Variable, value) , SETMINVARIABLE($$Variable, value), SETVARIABLE($$Variable, value) , SETCOUNTVARIABLE($$Variable), where:


  • SETVARIABLE sets the variable to a value that you specify (executes only if a row is marked as insert or update). At the end of a successful session, the Integration Service saves either the MAX or MIN of (start value.final value) to the repository, depending on the aggregate type of the variable. Unless overridden, it uses the saved value as the start value of the variable for the next session run.

  • SETCOUNTVARIABLE - increments a counter variable. If the Row Type is Insert increment +1, if Row Type is Delete increment -1. A value = 0 is used for Update and Reject.

  • SETMAXVARIABLE - compare current value to value passed into the function. Returns the higher value and sets the current value to the higher value.

  • SETMINVARIABLE - compare current value to the value passed into the function. Returns the lower value and sets the current value to the lower value.


At the end of a successful session, the values of variables are saved to the repository. The SETVARIABLE function writes the final value of a variable to the repository based on the Aggregation Type selected when the variable was defined.

Change Data Capture Implementation using Mapping variable.


We will implement Change Data Capture for CUSTOMER data load. We need to load any new/latest customer or changed customers data to a flat file. Since the column UPDATE_TS value changes for any new or updated customer record, we will be able to find the new or changed customer records using UPDATE_TS column.


As the first step lets start the mapping and create a mapping variable as shown in below image.



  • $$M_DATA_END_TIME as Date/Time





Now bring in the source and source qualified to the mapping designer workspace. Open the source qualifier and give the filter condition to get the latest data from the source as shown below.



  • STG_CUSTOMER_MASTER.UPDATE_TS > CONVERT(DATETIME,'$$M_DATA_END_TIME')





Note: This filter condition will make sure that, latest data is pulled from the source table each and every time. Latest value for the variable $M_DATA_END_TIME is retrieved from the repository every time the session is run.


Now map the column UPDATE_TS to an expression transformation and create a variable expression as below.



  • SETMAXVARIABLE($M_DATA_END_TIME,UPDATE_TS)





Note: This expression will make sure that, latest value from the column UPDATE_TS is stored into the repository after the successful completion of the session run.


Now you can map all the remaining columns to the downstream transformation and complete all other transformation required in the mapping.





That's all you need to configure Change Data Capture, Now create your workflow and run the workflow.


Once you look into the session log file you can see the mapping variable value is retrieved from the repository and used in the source SQL, just like shown in the image below.





You can look at the mapping variable value stored in the repository, from workflow manager. Choose the session from the workspace, right click and select 'View Persistent Value'. You get the mapping variable in a pop up window, like shown below.




Parameter files


Parameter file is an ASCII file which is used to set values of mapping paramteres and variables. Parameters can be set on workflow, worklet or session level. The physical location of a parameter file is set in Workflow manager in Workflows -> Edit. It can also be specified using the PMCMD command when starting a session task.


Parameter file structure


Parameters can be grouped into the following sections:



  • [Global]


  • [Service: service name]


  • [folder name.WF:workflow name]


  • [folder name.WF:workflow name.WT:worklet name]


  • [folder name.WF:workflow name.WT:worklet name.WT:worklet name...]


  • [folder name.WF:workflow name.ST:session name]


  • [folder name.session name]


  • [session name]


Examples / useful tips



  • The value is initialized by the specification that defines it, however it can be set to a different value in a parameter file, specified for the session task


  • Initialization priority of Parameters: Parameter file, Declared initial value, Default value


  • Initialization priority of Variables: Parameter file, Repository value, Declared initial value, Default value


  • Parameters and variables can only be utilized inside of the object that they are created in.


  • Parameters and variables can be used in pre and post-SQL


  • Sample parameter file:


  • [Service:IntegrationSvc_01]


  • $$SuccessEmail=dwhadmin@etl-tools.info


  • $$FailureEmail=helpdesk@etl-tools.info


  • [DWH_PROJECT.WF:wkf_daily_loading]


  • $$platform=hpux


  • $$DBC_ORA=oracle_dwh


  • [DWH_PROJECT.WF:wkf_daily_loading.ST:s_src_sa_sapbw]


  • $$DBC_SAP=sapbw.etl-tools.info


  • $$DBC_ORA=oracle_sap_staging



No comments:

Post a Comment