Friday, September 20, 2013

Pentaho Data Integration (PDI) - Overview

Pentaho Data Integration (PDI) is a flexible tool for collecting data from disparate sources such as databases, files, and applications, and converting that data into a unified format that is accessible and relevant to end users. It provides the extract, transform, and load (ETL) engine that captures the right data, cleanses it, and stores it in a uniform and consistent format.
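To make the capture, cleanse, and store sequence concrete, here is a minimal sketch in plain Python. It is not PDI's API and not how PDI works internally; the sample data, the `customer` table, and all function names are assumptions made up for illustration, and only the Python standard library is used.

```python
import csv
import io
import sqlite3

# Hypothetical raw input standing in for a source file.
RAW_CSV = """id,name,country
1, Alice ,us
2,Bob,DE
3,,fr
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, uppercase country codes, drop rows without a name."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # cleansing rule: a record without a name is rejected
        cleaned.append((int(row["id"]), name, row["country"].strip().upper()))
    return cleaned


def load(conn, records):
    """Load: store the cleansed records in one uniform target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run the three stages against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(RAW_CSV)))
    return conn.execute(
        "SELECT id, name, country FROM customer ORDER BY id"
    ).fetchall()
```

Calling `run_etl()` returns the cleansed rows from the target table. Keeping the three stages as separate functions mirrors the idea that each stage can be changed or replaced without touching the others.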

Common uses of Pentaho Data Integration include:

  • Data migration between different databases and applications.
  • Loading very large data sets into databases, taking full advantage of cloud, clustered, and massively parallel processing environments.
  • Data cleansing, with steps ranging from very simple to very complex transformations.
  • Data integration, including the ability to use real-time ETL as a data source for Pentaho Reporting.
  • Data warehouse population, with built-in support for slowly changing dimensions and surrogate key creation.
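The last item can be illustrated with a small sketch of a Type 2 slowly changing dimension, where each change to a tracked attribute creates a new row version with its own surrogate key. This is a plain Python illustration of the general technique, not PDI's implementation; the `DimRow` and `CustomerDimension` classes and the `city` attribute are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    surrogate_key: int               # warehouse-generated key, unique per version
    customer_id: str                 # natural (business) key from the source system
    city: str                        # attribute whose history is tracked
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


class CustomerDimension:
    """In-memory stand-in for a dimension table (illustrative only)."""

    def __init__(self):
        self.rows: List[DimRow] = []
        self._next_key = 1

    def current(self, customer_id):
        """Return the current version for a natural key, or None."""
        for row in self.rows:
            if row.customer_id == customer_id and row.valid_to is None:
                return row
        return None

    def upsert(self, customer_id, city, as_of):
        """Apply a source record: insert, ignore if unchanged, or add a new version."""
        existing = self.current(customer_id)
        if existing is not None:
            if existing.city == city:
                return existing        # nothing changed, keep the current version
            existing.valid_to = as_of  # close the old version
        new_row = DimRow(self._next_key, customer_id, city, as_of)
        self._next_key += 1            # surrogate keys are never reused
        self.rows.append(new_row)
        return new_row
```

A short usage example: loading the same customer three times, with the city changing only on the third load, leaves two row versions in the dimension.

```python
dim = CustomerDimension()
dim.upsert("C1", "Berlin", date(2013, 1, 1))
dim.upsert("C1", "Berlin", date(2013, 3, 1))  # unchanged, no new row
dim.upsert("C1", "Munich", date(2013, 6, 1))  # change, new version
```

Fact tables would reference the surrogate key rather than `customer_id`, so each fact stays linked to the version of the customer that was valid when it occurred.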
