Tuesday, 2 September 2014

How to load all duplicate rows and unique rows into separate tables


Approach 1 using expression:
Add the source and target definitions to the mapping by dragging them into the workspace.
Here is the mapping snapshot:
Add a Sorter transformation next to the Source Qualifier to sort the employee details in ascending order by EMPNO. Drag the required ports from the Source Qualifier to the Sorter.


Add an Expression transformation next to the Sorter transformation to compare each record with the previous one and identify duplicate records. Drag the required ports from the Sorter to the Expression.

To compare the current and previous records, the ports must be in the sequence below:
input/current record, compare current and previous, output, and previous record.
EMPNO must be an input/output port
var_compare must be a variable port
out_flag must be an output port
pre_EMPNO must be a variable port
where pre_EMPNO = EMPNO

Below is the compare logic:

out_flag=var_compare
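
The exact var_compare expression is not reproduced in the text above, so the following Python sketch only illustrates one plausible version of it. It assumes var_compare restarts at 1 whenever EMPNO differs from pre_EMPNO and counts up otherwise; that rule and the function name flag_records are assumptions made for illustration. The port order matters in the same way as in the Expression transformation: pre_EMPNO is updated last, so the comparison always sees the previous row's value.

```python
def flag_records(sorted_empnos):
    """Mimic the Expression transformation on rows already sorted by EMPNO."""
    pre_empno = None   # variable port pre_EMPNO: EMPNO of the previous row
    var_compare = 0    # variable port var_compare
    flagged = []
    for empno in sorted_empnos:
        # Assumed compare logic: restart at 1 on a new EMPNO, else count up.
        var_compare = var_compare + 1 if empno == pre_empno else 1
        out_flag = var_compare            # out_flag = var_compare
        flagged.append((empno, out_flag))
        pre_empno = empno                 # pre_EMPNO = EMPNO, evaluated last
    return flagged


if __name__ == "__main__":
    for row in flag_records([7369, 7369, 7499, 7521, 7521, 7521]):
        print(row)
```

Under this assumption, the first row of every EMPNO gets out_flag 1 and each repeated row gets a value greater than 1.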
Add a Router transformation next to the Expression to separate all unique and all duplicate records. Drag the required ports from the Expression to the Router.

Select the Router, right-click and edit, then go to the Groups tab to create two groups, one for unique records and another for duplicate records, as below.
Add the group filter condition out_flag=1 to group Unique and, similarly, out_flag!=1 to group Duplicate, as below.
Map the required ports from group Unique to the target instance EMP_Unique and, similarly, from group Duplicate to the target instance EMP_Duplicate, as below.
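
As a rough illustration of the two Router groups, the Python sketch below sends each (EMPNO, out_flag) pair to one of two lists using the same conditions, out_flag=1 and out_flag!=1. The function name route and the list names are hypothetical stand-ins for the Router and the two target instances.

```python
def route(flagged_rows):
    """Split (empno, out_flag) pairs the way the two Router groups do."""
    emp_unique, emp_duplicate = [], []
    for empno, out_flag in flagged_rows:
        # Each group condition is tested on its own, as in a Router.
        if out_flag == 1:        # group Unique
            emp_unique.append(empno)
        if out_flag != 1:        # group Duplicate
            emp_duplicate.append(empno)
    return emp_unique, emp_duplicate


if __name__ == "__main__":
    unique, duplicate = route([(7369, 1), (7369, 2), (7499, 1)])
    print("EMP_Unique:", unique)
    print("EMP_Duplicate:", duplicate)
```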


Approach 2 using aggregator:
Below is the screenshot of the mapping using an Aggregator.

Import or create the source and target instances and drag them to the Designer workspace.

Add a Sorter transformation next to the Source Qualifier and drag the required ports/columns from it to the Sorter. Select the Sorter transformation, right-click and edit. Go to the Ports tab and, for EMPNO, set "Direction" to Ascending. This sorts the records by EMPNO, as below.

Add an Aggregator transformation next to the Sorter and drag the required ports/columns to it. Select the Aggregator, right-click and edit.
Go to the Ports tab: check Group By for EMPNO and add a new column named count to hold the employee count, which identifies any duplicate EMPNO.
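
The effect of grouping by EMPNO and counting can be sketched in Python as follows. This is only an illustration of the idea, not the Aggregator itself; the function name count_by_empno is hypothetical, and the input is assumed to be sorted by EMPNO already, as it is after the Sorter.

```python
from itertools import groupby


def count_by_empno(sorted_empnos):
    """Return one (empno, count) pair per EMPNO from input sorted by EMPNO."""
    # groupby only merges adjacent equal keys, hence the sorted input.
    return [(empno, sum(1 for _ in group))
            for empno, group in groupby(sorted_empnos)]


if __name__ == "__main__":
    print(count_by_empno([7369, 7369, 7499, 7521, 7521, 7521]))
```

Any EMPNO whose count is greater than 1 occurs more than once in the source.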

Add a Joiner transformation next to the Sorter and Aggregator transformations. Drag EMPNO and count from the Aggregator (highlighted in yellow) to the Joiner, and drag the required ports from the Sorter to it, as below. This Joiner adds the employee count to each corresponding record.

Select the Joiner transformation, right-click and edit. Go to the Ports tab and mark the ports coming from the Aggregator transformation as master, so that only rows matching the Aggregator output are returned.

Next, go to the Properties tab of the Joiner to define the join type.

Next, go to the Condition tab of the Joiner to define the join condition as below:
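
The join type and join condition are not reproduced in the text above. The Python sketch below therefore assumes a normal (inner) join on equal EMPNO values, which is one reading of the setup described; the function name join_counts and the two-column detail rows are hypothetical.

```python
def join_counts(detail_rows, master_counts):
    """Attach the per-EMPNO count (master side) to each detail row."""
    # The master rows from the Aggregator are small, so hold them in a lookup.
    count_for = dict(master_counts)
    return [(empno, ename, count_for[empno])
            for empno, ename in detail_rows
            if empno in count_for]   # assumed: keep matching rows only


if __name__ == "__main__":
    details = [(7369, "SMITH"), (7369, "SMITH"), (7499, "ALLEN")]
    counts = [(7369, 2), (7499, 1)]
    for row in join_counts(details, counts):
        print(row)
```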

Add another Aggregator next to the Joiner and drag the required ports from the Joiner to it, to get the count for each EMPNO.
Select the Aggregator, right-click and edit, then go to the Ports tab and select Group By for EMPNO, as shown below. This Aggregator is added because all the input records are pushed from the Source Qualifier through the Joiner to this point.

Add a Filter transformation next to the second Aggregator to pass only duplicate records, so that they can be loaded into the target file. Drag the required ports from the second Aggregator to this Filter.

Select the Filter, right-click and edit. Go to the Properties tab and add the filter condition as above. Then drag the required ports from the Filter to the target instance emp_duplicate.
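
The filter condition itself is not reproduced in the text above. As a hedged sketch, the Python below assumes the condition is count > 1 and that the second Aggregator passes on one row per EMPNO (here, the last row of each group). Both points are assumptions, and the function name duplicates_only is hypothetical.

```python
def duplicates_only(joined_rows):
    """Group (empno, ename, count) rows by EMPNO, then keep duplicated ones."""
    # Second Aggregator: one row per EMPNO (assumed: last row of the group).
    last_per_empno = {}
    for empno, ename, count in joined_rows:
        last_per_empno[empno] = (empno, ename, count)
    # Filter: assumed condition count > 1, so only duplicated EMPNOs remain.
    return [row for row in last_per_empno.values() if row[2] > 1]


if __name__ == "__main__":
    rows = [(7369, "SMITH", 2), (7369, "SMITH", 2), (7499, "ALLEN", 1)]
    print(duplicates_only(rows))
```

With these assumptions, emp_duplicate receives one row for each EMPNO that occurs more than once in the source.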

