Task Factory users running version 2020.1.4 or older (released prior to May 27, 2020): There's an important Task Factory update. Please visit here for more details.
Unsupported: The Hadoop component was deprecated in version 2019.4.1 and is no longer supported.
Important: If the connection test succeeds but you receive an unable to connect error at runtime, you may need to update your local hosts file. See the following help document for details.
The Hadoop WebHDFS Connection Manager is available for SQL Server versions 2012 and higher.
It is used with the Hadoop WebHDFS Source.
|Option||Description|
|WebHDFS Server Address||The fully qualified web address and port number where the HDFS (Hadoop Distributed File System) is located (example: http://192.168.1.10:50070).|
|Username||The username with permission to access HDFS files.|
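To illustrate how these two settings relate to an HDFS request, here is a minimal sketch that combines a server address and username into a URL following the Apache Hadoop WebHDFS REST convention (`/webhdfs/v1/<path>?op=...&user.name=...`). The function name `webhdfs_url` and the username `hdfs_user` are hypothetical, and it is an assumption that the connection manager builds its requests this way internally.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(server_address: str, path: str, op: str, username: str) -> str:
    """Build a WebHDFS REST URL (hypothetical helper, for illustration only)."""
    base = server_address.rstrip("/")        # e.g. http://192.168.1.10:50070
    hdfs_path = quote(path.lstrip("/"))      # percent-encode, keep "/" separators
    query = urlencode({"op": op, "user.name": username})
    return f"{base}/webhdfs/v1/{hdfs_path}?{query}"


# Example values: the address and path come from the option descriptions,
# the username is a placeholder.
url = webhdfs_url(
    "http://192.168.1.10:50070", "FolderName/DataFile.txt", "OPEN", "hdfs_user"
)
print(url)
```

Opening a URL like this in a browser or HTTP client can be a quick way to check that the address, port, and username are accepted by the Hadoop server outside of SSIS.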
Hadoop WebHDFS Source
|Source Description|
|The Hadoop WebHDFS Source streams large files stored in HDFS on a Hadoop server and converts them into rows of data within SSIS. Currently, the Hadoop WebHDFS Source supports only text and CSV files. See the Hadoop WebHDFS Connection Manager to learn more about setting up the connection manager.|
|Option||Description|
|File Name||The filename (if the file is in the root directory) or the path to the file stored within HDFS (example: FolderName/DataFile.txt).|
|Output Columns||Users can create, remove, and configure the name, index (zero-based), data type, length, precision, and scale of the columns being extracted from the text file.|
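The idea behind Output Columns, mapping a zero-based field index in each line of the text file to a named, typed column, can be sketched as follows. This is only an illustration under stated assumptions: the column names, the sample data, and the `parse_rows` helper are hypothetical, and the component's actual parsing logic is not documented here.

```python
import csv
import io
from decimal import Decimal

# Hypothetical column definitions: (name, zero-based index, converter).
COLUMNS = [
    ("Id", 0, int),
    ("Name", 1, str),
    ("Amount", 2, Decimal),
]


def parse_rows(text, columns, delimiter=","):
    """Yield one dict per non-empty line, converting each field by its index."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        yield {name: convert(fields[index]) for name, index, convert in columns}


# Made-up sample content standing in for a CSV file read from HDFS.
sample = "1,Alice,10.50\n2,Bob,7.25\n"
rows = list(parse_rows(sample, COLUMNS))
for row in rows:
    print(row)
```

If a column's index or data type does not match the file contents, a conversion like this fails on the affected row, so it helps to check the column definitions against a few sample lines of the file first.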