Pentaho Data Integration - Configure Oracle JDBC Connection

A step-by-step illustration of how to configure native JDBC Oracle database connections for Pentaho Data Integration. JDBC is the easiest and most commonly used access protocol. Connections can be configured in Spoon and managed by the DI Server.

Step 1 :
 
Open Spoon.
Go to "Database Connection >> New Connection Wizard"
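
A native JDBC connection to Oracle is ultimately addressed by a thin-driver JDBC URL. The sketch below, written in Python purely for illustration, assembles that URL in the two forms the Oracle thin driver accepts (SID and service name). The function name oracle_thin_url and the sample host, port, and SID values are hypothetical and are not part of Pentaho.

```python
def oracle_thin_url(host, port=1521, sid=None, service_name=None):
    """Build an Oracle thin-driver JDBC URL (hypothetical helper).

    Exactly one of `sid` or `service_name` must be given.
    """
    if (sid is None) == (service_name is None):
        raise ValueError("provide exactly one of sid or service_name")
    if sid is not None:
        # SID form: jdbc:oracle:thin:@host:port:SID
        return f"jdbc:oracle:thin:@{host}:{port}:{sid}"
    # Service-name form: jdbc:oracle:thin:@//host:port/service
    return f"jdbc:oracle:thin:@//{host}:{port}/{service_name}"


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(oracle_thin_url("dbhost.example.com", 1521, sid="ORCL"))
```

Comparing the generated string with the URL shown when testing a connection can help spot a wrong port or a SID entered where a service name was expected.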

Pentaho Common Errors : Driver class 'oracle.jdbc.driver.OracleDriver' could not be found

Error Message

Error connecting to database [ORA_TEST_JDBC] : org.pentaho.di.core.exception.KettleDatabaseException:
Error occured while trying to connect to the database

Driver class 'oracle.jdbc.driver.OracleDriver' could not be found, make sure the 'Oracle' driver (jar file) is installed.
oracle.jdbc.driver.OracleDriver
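
This error generally means that no jar on the PDI classpath contains the Oracle driver class. As a rough diagnostic, the Python sketch below scans a directory of jar files and reports which ones contain a given class. The function name find_jars_with_class is hypothetical, and the directory to scan is an assumption: pass whichever folder your PDI version loads JDBC drivers from.

```python
import zipfile
from pathlib import Path


def find_jars_with_class(lib_dir, class_name="oracle.jdbc.driver.OracleDriver"):
    """Return the names of jars in lib_dir that contain class_name.

    Hypothetical helper: a jar is a zip archive, so the class
    a.b.C is stored as the entry a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that have a .jar extension but are not valid archives.
            continue
    return hits
```

An empty result suggests the Oracle driver jar still needs to be copied into that directory, after which Spoon must be restarted to pick it up.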



Pentaho Business Analytics Enterprise Edition 5.0.2 - Installation for Windows 64 bit

 
Step 1 :
Download the latest Pentaho Business Analytics version (5.0.2) from http://pentaho.com/download
 
 
Step 2 :
 
Save and run the file pentaho-business-analytics-5.0.2-x64.exe.
The installation wizard opens. Click Next.


Pentaho Data Integration - Configure DI Server for Windows

Step 1 :


Go to "Start > Pentaho Enterprise Edition > Server Management"
Start the DI and Tomcat servers using the "Start Data Integration Server" icon.


Pentaho Data Integration - PDI 5.0.2 Installation for Windows 64 bit

Step 1 : 

Download the latest PDI version (5.0.2) from http://pentaho.com/download



Pentaho Data Integration - PDI 5.0.1 Installation for Linux

Step 1 : 

Download the latest PDI version (5.0.1) from http://pentaho.com/download

Step 2 : 

Save the bin file to the Downloads directory.
Run the bin file pdi-5.0.1-x64.bin.




Pentaho Repository Queries

User Info

SELECT LOGIN, NAME, DESCRIPTION, ENABLED FROM R_USER

Job Info

SELECT NAME, DESCRIPTION, JOB_VERSION, JOB_STATUS, CREATED_USER, CREATED_DATE, MODIFIED_USER, MODIFIED_DATE FROM R_JOB

Transformation Info

SELECT NAME, DESCRIPTION, TRANS_VERSION, TRANS_STATUS, CREATED_USER, CREATED_DATE, MODIFIED_USER, MODIFIED_DATE FROM R_TRANSFORMATION
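
As a minimal sketch of running one of these queries from a script, the Python below executes the User Info query through the standard DB-API and returns each row as a dictionary keyed by column name. It uses an in-memory SQLite table as a stand-in for the real repository database; the demo_connection helper, the column types, and the sample row are made up for illustration, and only the column names come from the query above.

```python
import sqlite3

USER_INFO_SQL = "SELECT LOGIN, NAME, DESCRIPTION, ENABLED FROM R_USER"


def fetch_as_dicts(conn, sql):
    """Run sql on a DB-API connection and return rows as dicts."""
    cur = conn.cursor()
    cur.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, row)) for row in cur.fetchall()]
    cur.close()
    return rows


def demo_connection():
    """Stand-in for the repository database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE R_USER (LOGIN TEXT, NAME TEXT, DESCRIPTION TEXT, ENABLED TEXT)"
    )
    conn.execute(
        "INSERT INTO R_USER VALUES ('demo', 'Demo User', 'made-up sample row', 'Y')"
    )
    return conn


if __name__ == "__main__":
    for row in fetch_as_dicts(demo_connection(), USER_INFO_SQL):
        print(row)
```

Against a real repository, demo_connection would be replaced by a connection from whichever DB-API driver matches the repository database.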


Pentaho Data Integration - PDI Installation for Windows

Step 1 : 

Download the latest PDI version from http://pentaho.com/download.
Choose the 32-bit or 64-bit build to match your operating system.





Pentaho Data Integration ( PDI ) - Generic Design Guidelines

Design for Failure Handling

It is recommended to verify that the data source is available before a process is kicked off. One basic design principle is that the ETL job must be able to fail gracefully when a data availability test fails.

Kettle provides the following features for this:

  • Test a repository connection.
  • Ping a host to check whether it's available.
  • Wait for a SQL command to return success/failure based on a row count condition.
  • Check for empty folders.
  • Check for the existence of a file, table, or column.
  • Compare files or folders.
  • Set a timeout on FTP and SSH connections.
  • Create failure/success outputs on every available job step.
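
The idea behind these checks can be sketched outside Kettle as well. The Python below is an illustration only, not how Kettle implements its job entries: it runs a set of named availability checks (file existence and TCP reachability of a host) and returns the names of those that failed, so the caller can stop cleanly before any processing starts. All function names here are hypothetical.

```python
import os
import socket


def file_exists(path):
    """Return a check that passes when path is an existing file."""
    return lambda: os.path.isfile(path)


def host_reachable(host, port, timeout=3.0):
    """Return a check that passes when a TCP connection to host:port succeeds."""
    def check():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return check


def run_preflight(checks):
    """Run each named check and return the names of the ones that failed.

    checks maps a descriptive name to a zero-argument callable returning bool.
    A check that raises is treated as failed rather than crashing the run.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed
```

A caller would start the ETL process only when run_preflight returns an empty list, and otherwise log the failed names and exit with a non-zero status.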


Pentaho Data Integration ( PDI ) - Overview

Pentaho Data Integration is a flexible tool that allows collecting data from disparate sources such as databases, files, and applications, and turning the data into a unified format that is accessible and relevant to end users. Pentaho Data Integration provides the Extraction, Transformation, and Loading (ETL) engine that facilitates the process of capturing the right data, cleansing the data, and storing the data using a uniform and consistent format.

Common Uses of Pentaho Data Integration Include: