Task Factory Advanced Lookup Cache

 Task Factory users running version 2020.1.4 or older (released prior to May 27, 2020): There's an important Task Factory update. Please visit here for more details.

Advanced Lookup Cache Connection Manager

The Advanced Lookup Cache Connection Manager is used to setup the cache database and configure how data is retrieved from it. 

Available Columns Tab

The available columns tab allows you to define columns that are part of the cache, whether or not the column is output when a lookup is performed, and what type of cache database to use.

Task Factory Advanced Lookup Cache Connection Manager Available Columns

OptionDescription
Use In-Memory CacheThis tells the connection manager whether to use in-memory cache or on-disk cache when creating the cache database.
Define Customer Cache Location By default, the cache connection manager is created in a temporary file location defined by SSIS. You can choose to store the database in a custom location by checking this box and entering/selecting a path to the file.
Note:  The file should be defined with an .s3db extension.
Output ColumnsThe Output column tells the connection manager to return this value when a lookup is performed. The output columns can be seen on the Advanced Lookup Transform UI after a cache connection manager is selected.

Input Parameters Tab

Task Factory Advanced Lookup Cache Connection Manager Input Parameters tab

OptionDescription
Input ParameterUsed by the Advanced Lookup Transform to define the lookup values to find rows in the cache database. They are mapped from Input columns in the Advanced Lookup Transform.
Input Parameter ConditionThis is where you can easily create conditions for queries. By selecting the word And, you can select between an And and Or conditions. Selecting the green adds more conditions and selecting the redremoves them.

Task Factory Advanced Lookup Cache Connection Manager Input Parameter Conditions

Note:  The first condition requirement is the table key is equal to the input parameter.

Task Factory Advanced Lookup Cache Connection Manager Input ParametersTask Factory Advanced Lookup Cache Connection Manager Input ParametersTask Factory Advanced Lookup Cache Connection Manager Input Parameters

Select the green New Group Icon at the end of the condition to create a new And/Or group.

Properties Tab

Task Factory Advanced Lookup Cache Connection Manager Properties tab

OptionDescription
Multiple Row Match PolicyThis option lets you define what should happen if multiple rows are found during a lookup.
Make All Comparisons Case SensitiveUsing this option makes all string comparisons performed case sensitive (e.g. SolarWinds does not equal solarwinds).

Advanced Lookup Cache Transform

Transform IconTransform Description
Task Factory Advanced Lookup Cache Transform iconThe Advanced Lookup Cache Transform is used to store rows from a source into an in-memory or on-disk cache. See Advanced Lookup Cache Connection Manager to learn more about setting up the connection manager.

Task Factory Advanced Lookup Cache Transform
Advanced Lookup Cache Transform Unpopulated

Task Factory Advanced Lookup Cache Transform
Advanced Lookup Cache Transform Populated
OptionDescription
Choose Cache Connection ManagerYou can select an existing Advanced Lookup Cache Connection Manager or choose to create a new one.
Map Cache ColumnsAfter a Cache Connection Manager has been selected, the input columns and columns from the cache are automatically mapped if the names match. Otherwise you need to manually map the columns.

Advanced Lookup Transform

Transform IconTransform Description
Task Factory Advanced Lookup Transform Icon

The Advanced Lookup Transform allows you to store a temporary cache of rows and perform lookups on that cache to pull that data into a data flow. There are two modes for the Advanced Lookup Transform

  • Connected 
  • Disconnected 

See the Advanced Lookup Cache Connection Manager to learn more about setting up the connection manager.



Task Factory Advanced Lookup Transform
Advanced Lookup Transform Unpopulated

Task Factory Advanced Lookup Transform
Advanced Lookup Transform Populated
OptionDescription
Select Cache Connection ManagerSelect an Advanced Lookup Cache Connection Manager. If you have not created one yet, you will need to create one. Once you have one selected, it automatically populates the available columns.
Map Input Columns To Cache Input ParametersThis is where you map input columns to the input parameters defined in the Advanced Lookup Cache Connection Manager.
What Should Happen To Non-Matched RowsThis tells the component what should happen when a non-matched row is found in the lookup.
  • Fail Component - This option fails the component when a non-matched row is found.
  • Redirect Rows To Non-Matched Output - This option redirects the rows to the non-matched output.
  • Output Null Values To Output - This option outputs NULL values for each of the output columns defined in the advanced lookup.
Output Columns (Defined in Cache Connection Manager)This section lists all of the output columns defined in the Advanced Lookup Cache Connection Manager. You have a few choices on how to output the values of the output columns based on the column action.
  • Add New Output - This option creates a new output column based on the column alias
  • Replace Input Column - This option allows you to overwrite the current value of the selected input column with the value of the output column from the lookup cache.
Refresh Columns From Cache ButtonThis button refreshes the columns from the cache connection manager.

Setting up an Advanced Lookup Transform

Complete the following steps to setup an Advanced Lookup Transform:

1.  Create a data flow task, and then add an OLE DB Source. In this example we are using an OLE DB source connected to AdventureWorksDW. (download here).
OLE DB Source Editor select source 2. For this example select only the following columns from the columns tab.

  • ProductKey
  • Color
  • StandardCost
  • EndDate
  • ProductAlternateKey
  • EnglishProductName
  • StartDate
  • LargePhoto

OLE DB Source Editor select columns3. Now, create a lookup cache using the Advanced Lookup Cache Transform. Drag an instance of TF Advanced Lookup Cache Transform to the designer window. Attach the OLE DB Source Output to the TF Advanced Lookup Cache Transform. Open the transform by double clicking it, which opens the edit window.

Task Factory Advanced Cache Transform Cache Column

4. Select the Choose Cache Connection Manager drop down list and then choose Create New Lookup Cache Connection...

5. Setup the Advanced Lookup Cache Connection Manager. Then deselect all output except for Product Key.

Task Factory Advanced Lookup Connection Manager Available Columns6. Next, select the Input Parameters tab, and then create two input parameters named ProductAlternateKey and OrderDate.

Task Factory Advanced Lookup Cache Connection Manager Input Parameters tab7. Then, create the following conditions:  Note that Is Blank = Is Null.

Task Factory Advanced Lookup Cache Connection Manager Input Parameter Conditions8. Select OK and then your window should look like this:

Task Factory Advanced Cache Transform Example9. Add a new data flow task and a new OLE DB source, for this example we are connecting to AdventureWorks and creating a custom SQL query.

Task Factory OLE DB Source Editor10. Add a new TF Advanced Lookup Transform to the designer, drag the output from the OLE DB source to the TF Advanced Lookup Transform. Double click the transform to open it. Once the editor window is open, choose the previously created Cache Connection Manager.

11. In the Map Input Columns to Cache Input Parameters window, connect OrderDate from the Input column to the OrderDate from the Cache Column by dragging one to the other. Do the same for the remaining field, and then select OK.
Task Factory Advanced Lookup Form connect parameters

12.  Drag your Matched and Non-Matched outputs to your desired destination. For this example we used TF Terminator Destinations for both.
Task Factory Package Example

Disconnected Lookups

Disconnected Lookups allow you to use a cache to perform lookups without an Advanced Lookup Cache Transform in your data flow.

Functions

There are two functions that can be used to execute disconnected lookups:

FunctionDescription
 LookupData(connectionManager As AdvancedLookupCacheConnectionManager, ParamArray args As Object)Used to retrieve the FIRST column setup as an output column in the Advanced Lookup Cache Manager defined in the connectionManager parameter.

 Parameters:
  • connectionManager - The name of the Advanced Lookup Cache Connection Manager.
  • args - Parameter array of input parameters defined in the Advanced Lookup Cache Connection Manager. Args matches the number of input parameters setup in the Advanced Lookup Cache Connection manager defined in the connectionManager argument.
LookupDataByColumn(connectionManager As AdvancedLookupCacheConnectionManager, outputColumnName As String, ParamArray args As ObjectUsed to retrieve the value of the column defined in the outputColumnName parameter from the Advanced Lookup Cache Connection manager defined in the connectionManager parameter.

Parameters:
  • connectionManager - The name of the Advanced Lookup Cache Connection Manager
  • outputColumnName - The name of the column to retrieve from the output. The column defined here must be setup as an output column in the Advanced Lookup Cache Connection Manager.
  • args - Parameter array of input parameters defined in the Advanced Lookup Cache Connection Manager. Args matches the number of input parameters setup in the Advanced Lookup Cache Manager defined in the connectionManager argument.