
Friday, September 20, 2013

Pentaho Data Integration (PDI) - Generic Design Guidelines

Design for Failure Handling

Verify that each data source is available before a process is kicked off. A basic design principle is that an ETL job must be able to fail gracefully when a data availability test fails, rather than stopping partway through a load.

Kettle provides the following features for this:

  • Test a repository connection.
  • Ping a host to check whether it's available.
  • Wait for a SQL command to return success/failure based on a row count condition.
  • Check for empty folders.
  • Check for the existence of a file, table, or column.
  • Compare files or folders.
  • Set a timeout on FTP and SSH connections.
  • Create failure/success outputs (hops) on every available job entry.
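
To make the idea concrete outside of Kettle, the following is a minimal Python sketch of the same pattern: run a set of availability checks first, collect the failures, and only start the load when all of them pass. It is not Kettle's API. The function names (host_reachable, file_exists, folder_is_empty, row_count_at_least, run_preflight) are hypothetical, SQLite stands in for whatever database the job actually reads, and a TCP connection attempt with a timeout stands in for a ping.

```python
import os
import socket
import sqlite3
from typing import Callable, List, Tuple


def host_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try a TCP connection with a timeout (a stand-in for a ping check)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def file_exists(path: str) -> bool:
    """Check that an expected input file is present."""
    return os.path.isfile(path)


def folder_is_empty(path: str) -> bool:
    """True only if the folder exists and contains no entries."""
    return os.path.isdir(path) and not os.listdir(path)


def row_count_at_least(conn: sqlite3.Connection, table: str, minimum: int) -> bool:
    """Row count condition; a missing table counts as a failed check.

    The table name is interpolated into the SQL, so pass trusted names only.
    """
    try:
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    except sqlite3.Error:
        return False
    return count >= minimum


def run_preflight(checks: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run every check and return the names of those that failed.

    An exception inside a check is treated as a failure, so the caller
    can stop cleanly instead of crashing mid-load.
    """
    failures = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures
```

A caller would build the list of checks, call run_preflight, and skip the load (logging the returned names) if the list is non-empty. This mirrors routing a job entry's failure output to a notification or clean-up entry instead of letting the job abort.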