| Area | Issue |
|------|-------|
| | Spoon GUI looks dated (Swing-based), high learning curve for orchestration (Jobs vs. Transformations confusing). |
| Big data performance | Spark translation is not complete – some steps force row-by-row, losing Spark's advantages. |
| Version control | Repository uses binary blob storage – diff/merge of ETL logic is impossible. Git integration is via external XML export (clunky). |
| Monitoring | Poor built-in operational monitoring (no native Prometheus metrics). Need external tools. |
| OLAP (Mondrian) | Development stalled; community recommended using Apache Druid or ClickHouse instead. |
| Documentation | Outdated, community-driven for CE; enterprise docs behind paywall. |
| Streaming | No native streaming (Kafka connector exists but not first-class). |

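
The version-control limitation is easier to live with if exported XML is normalized before it is compared. The following is a minimal sketch, assuming transformations have already been exported to XML text; the helper names `canonical_lines` and `diff_exports` and the sample element names are hypothetical, and nothing here is part of Pentaho itself. It uses only the Python standard library (3.9 or later) to canonicalize two exports so that attribute order and whitespace do not show up as changes.

```python
import difflib
from xml.etree import ElementTree as ET


def canonical_lines(xml_text: str) -> list[str]:
    """Return a normalized, one-element-per-line view of an XML document."""
    # C14N sorts attributes and, with strip_text=True, drops layout whitespace.
    canon = ET.canonicalize(xml_text, strip_text=True)
    root = ET.fromstring(canon)
    ET.indent(root)  # re-indent so each element lands on its own line
    return ET.tostring(root, encoding="unicode").splitlines()


def diff_exports(old_xml: str, new_xml: str) -> list[str]:
    """Unified diff of two XML exports after normalization."""
    return list(
        difflib.unified_diff(
            canonical_lines(old_xml),
            canonical_lines(new_xml),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Illustrative XML only; real exports contain many more elements.
    old = '<transformation><step b="2" a="1"><name>Read</name></step></transformation>'
    new = '<transformation><step a="1" b="2"><name>Load</name></step></transformation>'
    print("\n".join(diff_exports(old, new)))
```

Committing the normalized form rather than the raw export is one option for keeping Git diffs readable, at the cost of an extra step in the export workflow.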

| Competitor | Pentaho advantage | Pentaho disadvantage |
|------------|-------------------|----------------------|
| Talend | Lower cost (CE), unified analytics | Talend has better data quality & governance |
| Informatica | Simplicity, open source | Informatica scales better, has AI features |
| Apache NiFi | Stronger reporting & dashboards | NiFi is better for real-time dataflows |
| dbt | GUI + end-to-end (ingest to dashboard) | dbt is superior for transformation (SQL-first, version control) |
| Tableau Prep | More ETL connectors, jobs/orchestration | Tableau Prep is simpler for analysis prep only |
| Microsoft Fabric | Vendor-neutral, on-prem friendly | Fabric has deeper integration with Power BI |

Pentaho Software has a wide range of applications across various industries, including:
This feature highlights PDI’s capability to provide a , allowing organizations to blend data from any source into a unified view without requiring extensive manual coding.