Tuesday, July 14, 2015

Term Lookup Transformation in SSIS

Term Lookup is the one of the powerful Transformation in Sql Server Integration Service. This Transformation is basically helps to analyze the set of text records.
Term Lookup transformation will look for set of user predefined nouns or noun Phrases in transformation input text column and provide the count of occurrences as an output on row by row basis.

Example

A mobile customer support team has collected the product feedback from the end user for a newly released product. This Unstructured Data [Feedback] can be analyze for most frequently encountered problems and that can be resolved in the next release of the product.

How to use Term Lookup Transformation

Table structure - Here I have used three tables
  1. CustFeedback table [Given Input]
  2. Customer Feedback Table
  3. SearchKey Table [User predefined nouns or noun Phrases]
  4. SearchKey Table
  5. Data_Analyze Table [Excepted output]
  6. Data Analyze Table

Steps To Be Carried Out

Step 1: Create new project in BIDS with Integration Service Project Template.
New Project
Step 2: Drag the Dataflow task to Control flow.
Data Flow Task
Step 3: Create a New OLEDB Connection.
Right click on Connection Manager and select New OLEDB Connection and provide the necessary information to connect your SQL database.
New OLEDB Connection
Step 4: Add OLEDB source Task and configure the same.
Double click on Data Flow task and Drag the OLEDB source Task to Data flow.
New OLEDB Source
Double click on OLEDB source Task and select the connection which you have created in previous step and select the name of the Feedback Table.
OLEDB Source Config
Step 5: Now it’s a time to use Term lookup transformation to find out the terms in Feedback source column, but the Term Lookup transformation will except only Unicode String [DT_WSTR] or Unicode text stream [DT_NTXT] as its data type . To convert the data type of the column we use data Conversion transformation.
New Data Conversion
Convert Feedback column Data Type. Double click on Data Conversion transformation and select Feedback column in available Input Columns and select Unicode text stream [DT_NTXT] as Data Type as shown in the below screenshot.
Data Conversion Config
Step 6: Add Term lookup transformation and configure the same.
Drag Term Lookup Transformation to Data Flow.
New Term Lookup
Configure Term Lookup Transformation
When you double click on Term Lookup Transformation you will see three tabs in editor window.
  1. Reference Table: Select the Reference Table name from the dropdown
  2. Ref Table
  3. Term Lookup: Now you will see the Available Input Columns and Available Reference Columns.
  4. Available Input Columns
    Select the check box in Available Input Column list which you wanted to see as a output column.
    Output Columns
    The terms [or Records] which are available in the Keyword column of the Reference table is going to compare with Converted_Feedback column of the Input Column. Drag converted_Feedback column to Keyword column to relate both the column.
    Ref Columns
  5. Advanced: Select the checkbox “Use Case-sensitive Term lookup” for Case-sensitive term lookup.
  6. Advanced Tab
Step 7: Add OLEDB destination Task and configure the same [copy the result data into Data_Analyze ] Table.
New OLEDB Dest
Double click on OLEDB destination Task and provide the necessary information to connect Data_Analyze table.
OLEDB Dest Config
Provide the appropriate column name in Mapping tab
OLEDB Dest Config Map
Step 8: Run the application.
Run
After successful execution of this package we will get following output in Data_Analyze Table.
Data Analyze
  • ID – Auto generated number .
  • Terms – The Term or phrases which are found in the provided input column.
  • Frequency –Count of Term occurrences in output on row by row basis.
  • UserID – Used ID Information for the table CustFeedback.
  • Converted_Feedback- Feedback Information from the table CustFeedback.

0 comments:

Popular Posts

Recent Posts

Unordered List

Text Widget

Blog Archive