Friday, October 2, 2015

Pentaho Data Integration : Aggregation using Group By step

This step can be used to perform various types of aggregation such as sum, average, min, max, etc. Input data always needs to be sorted on the grouping fields for this step to work properly.

This step supports the following aggregation methods.
  1. Sum
  2. Average or Mean
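
The sorted-input requirement can be illustrated outside PDI. The following is only a minimal Python sketch, not PDI code: `itertools.groupby` also groups consecutive rows only, so it behaves analogously. The sample rows and the `aggregate_sorted` helper are hypothetical and exist only for this illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows (group key, amount); not taken from a real transformation.
rows = [
    ("east", 10.0),
    ("west", 5.0),
    ("east", 30.0),
    ("west", 15.0),
]


def aggregate_sorted(sorted_rows):
    """Compute sum and average per key, assuming rows arrive sorted by key."""
    result = {}
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        amounts = [amount for _, amount in group]
        result[key] = {
            "sum": sum(amounts),
            "average": sum(amounts) / len(amounts),
        }
    return result


# Sorting on the grouping field first yields one group per key.
sorted_result = aggregate_sorted(sorted(rows, key=itemgetter(0)))

# Without sorting, the same key shows up as several separate groups.
unsorted_groups = [key for key, _ in groupby(rows, key=itemgetter(0))]

print(sorted_result)
print(unsorted_groups)
```

With unsorted input each key is split into several groups, which is why a sort on the grouping fields normally comes before the aggregation.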

Saturday, September 19, 2015

Pentaho Data Integration - Data Grid Input step

This step is generally used for testing, reference, or demo purposes. It lets us create static rows in a grid.

  • Meta tab : Enter field names and meta data info.
  • Data tab : Enter static data in a grid.
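
The split between the two tabs can be pictured as field metadata plus a grid of static values. The following Python sketch is only an analogy under that assumption, not PDI code; the field names, types, values, and the `build_rows` helper are all hypothetical.

```python
# "Meta tab" analogue: hypothetical field names paired with a type converter.
meta = [("id", int), ("name", str), ("active", lambda v: v == "Y")]

# "Data tab" analogue: static values typed in as plain text.
data = [
    ["1", "Alice", "Y"],
    ["2", "Bob", "N"],
]


def build_rows(meta, data):
    """Apply each field's converter to the matching column of every static row."""
    return [
        {name: convert(value) for (name, convert), value in zip(meta, row)}
        for row in data
    ]


static_rows = build_rows(meta, data)
print(static_rows)
```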

Here are stepwise illustrations of how to use the Data Grid step.

Pentaho Common Errors : Error converting data while looking up value

Error Message

Stream lookup.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected error
Stream lookup.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleStepException:
Stream lookup.0 - Error converting data while looking up value
Stream lookup.0 -

Pentaho Data Integration - CSV File Input with parallel execution enabled

CSV file input is a commonly used input step for reading delimited files. Its options are similar to those of the Text file input step. Here are the general configurable options.

  1. File name - Input file name.
  2. Delimiter - Supports common delimiters like comma, tab, pipe, etc.
  3. Enclosure - Optional enclosures like double quotes.
  4. NIO buffer size - Read buffer size.
  5. Lazy Conversion - Can significantly improve performance by avoiding data type conversions. Check this option only if the logic is a mere pass-through.
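
To see how a delimiter and an enclosure interact, here is a minimal Python sketch using the standard `csv` module. It is an analogy only and does not use PDI; the sample text and the `read_delimited` helper are hypothetical. Note that the values are left as strings, with no data type conversion applied.

```python
import csv
import io

# Hypothetical pipe-delimited sample; the quoted field contains the delimiter itself.
sample = 'id|name|city\n1|"Smith|John"|Boston\n2|Lee|Austin\n'


def read_delimited(text, delimiter="|", enclosure='"'):
    """Parse delimited text into dicts keyed by the header row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=enclosure)
    header = next(reader)
    # Values are kept as raw strings; nothing is converted to numbers or dates.
    return [dict(zip(header, row)) for row in reader]


records = read_delimited(sample)
print(records)
```

The enclosure keeps `Smith|John` together as one field even though it contains the delimiter.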

Sunday, August 30, 2015

Pentaho Data Integration - PDI 5.4 Installation for Windows 64 bit

Pentaho 5.4 is the latest Pentaho version with powerful features.
A stepwise illustration of how to install Pentaho Data Integration 5.4 is given below.

Here are some of the highlights of the new version.

Wednesday, March 12, 2014

Pentaho Data Integration : Google Analytics

The Google Analytics service provides details about a website's traffic. It tracks various statistics and can be integrated with AdWords to review online campaigns.

The Pentaho Google Analytics step allows you to extract Google Analytics data.
A stepwise illustration is given below.

Step 1

Enable Google Analytics and generate an API key.

Wednesday, February 19, 2014

Pentaho Common Errors : Driver class 'org.gjt.mm.mysql.Driver' could not be found

Error Message
Error connecting to database [MySQLDev] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database

Driver class 'org.gjt.mm.mysql.Driver' could not be found, make sure the 'MySQL' driver (jar file) is installed.
org.gjt.mm.mysql.Driver