Pentaho recently announced Pentaho 5.0, which represents a major advancement for this supplier of business analytics and data integration software as well as for the open source community that it contributes to and supports. In fact, with 250 new features and enhancements in the 5.0 release, it’s important not to lose the forest for the trees. Some of the highlights are a new user interface that caters to specific roles within the organization, tight integration with emerging databases such as MongoDB, and enhanced extensibility. With a funding round of $60 million coming less than a year ago and growing market momentum around big data and analytics, it appears that Pentaho has doubled down at the right time in its efforts to balance the needs of the enterprise with those of the end user.

The 5.0 products have been completely redesigned for discovery analytics, content creation, accessibility and simplified administration. One key area of change is around roles, or what I call personas. I recently discussed the different analytic personas that are emerging in today’s data-driven organization, and Pentaho has done a good job of addressing these at each level of the organization. In particular, the system addresses each part of the analytic value chain, from data integration through to analytic discovery and visualization.

I was surprised by the usability of the visual analytics tool, which offers a host of capabilities that enable easy data exploration and visual discovery. Features such as drag-and-drop conditional formatting, including color coding, are simple, intuitive and powerful. Drop-down charting reveals an impressive list of visualizations that can be changed with a single click. Users will need to understand the chart types by name, however, since no thumbnail visuals are revealed upon scrolling over and there is no chart recommendation engine. But overall, the release’s ease-of-use developments are a major improvement in an already usable system that our firm rated Hot in the 2012 Value Index on Business Intelligence, putting Pentaho on par with other best-in-class tools. According to our benchmark study of next-generation business intelligence systems, usability is becoming more important in business intelligence and is the key buying criterion 63 percent of the time.

Advances from an enterprise perspective include features that will help IT manage the large volumes of data being introduced into the environment, both by supporting big data sources and by streamlining the automation of data integration. Capabilities such as job restart, rollback and load balancing are all included. Administrators can more easily configure and manage the system, including security levels, licensing and servers. In addition, new REST services APIs simplify the embedding of analytics and reporting into SaaS implementations. This last advancement matters because, as I discussed in a recent piece, making analytics available anywhere is extremely important.

No discussion of big data integration and analytics is complete without the mention of Pentaho Data Integration (PDI), which I consider the crown jewel of the Pentaho portfolio. The value of PDI is derived from its ability to put big data integration and business analytics in the same workflow. Its user-friendly graphical approach to data integration helps a range of IT staff and analysts blend data from multiple platforms at the semantic layer rather than the user level. This enables centralized agreement around data definitions so companies can govern and secure their information environments. The Pentaho approach addresses the two biggest barriers to information management, as revealed in our benchmark research: data spread across too many systems (67%) and multiple versions of the truth (64%). While other tools on the market facilitate blending at the business-user level, there is an inherent danger in such an approach because each individual can create analysis according to the definition that best suits his or her argument. It is similar to the spreadsheet problem we have now, in which many analysts come together, each with a different understanding of the source data.

Pentaho’s data integration has considerable depth, and its support for big data has been expanding rapidly to cover both the sources in use today and those that our Big Data benchmark research found organizations plan to use, such as data warehouse appliances (35%), in-memory databases (34%), specialized DBMSs (33%) and Hadoop (32%). Beyond these big data and RDBMS sources that are supported today, it has also expanded to non-SQL sources. The open source and pluggable nature of the Pentaho architecture allows community-driven evolution beyond traditional JDBC and ODBC drivers and gives an increasingly important leverage point for using its platform. For example, the just-announced MongoDB Connector enables deep integration that includes replica sets, tag sets and read and write preferences, as well as first-of-its-kind reporting on the Mongo NoSQL database. MongoDB is a document database, which is a new class of database that allows a more flexible, object-oriented approach for accessing new sources of information. The emergence of MongoDB mirrors that of new, more flexible notation languages such as JavaScript Object Notation (JSON). While reporting is still basic, I expect the initial integration with MongoDB to be just a first step for the Pentaho community in optimizing information around this big data store. Additionally, Pentaho announced new integration with Splunk, Amazon Redshift and Cloudera Impala, as well as certifications including MongoDB, Cassandra, Cloudera, Intel, Hortonworks and MapR.
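To make the document model concrete for readers used to relational tables, here is a minimal sketch in Python of why reporting on a document store takes extra work. It uses only the standard json module and does not call any Pentaho or MongoDB API. The order document, its field names and the flatten_order helper are all hypothetical, invented purely for illustration. The point is that a single JSON document can nest a customer record and a list of line items, and a reporting tool has to flatten that structure into rows before it can chart or aggregate it.

```python
import json

# Hypothetical order document; every field name here is invented for illustration.
raw = """
{
  "order_id": 1001,
  "customer": {"name": "Acme Corp", "region": "EMEA"},
  "items": [
    {"sku": "A-1", "qty": 2, "price": 9.5},
    {"sku": "B-7", "qty": 1, "price": 20.0}
  ]
}
"""


def flatten_order(doc):
    """Return one flat row per line item, repeating the order-level fields."""
    rows = []
    for item in doc["items"]:
        rows.append({
            "order_id": doc["order_id"],
            "customer_name": doc["customer"]["name"],
            "region": doc["customer"]["region"],
            "sku": item["sku"],
            "line_total": item["qty"] * item["price"],
        })
    return rows


# Parse the JSON text into nested Python dicts and lists, then flatten it.
rows = flatten_order(json.loads(raw))
for row in rows:
    print(row)
```

Each nested line item becomes its own row, with the order and customer fields repeated, which is the tabular shape that report designers typically expect.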

Currently the analytics and BI market is bifurcated, with the so-called stack vendors occupying entrenched positions in many organizations and visual discovery vendors selling to business users through a viral bottom-up strategy. Both sides are moving to the middle in their development efforts and addressing their lack of the data integration that is built into the Pentaho approach. The challenge for the traditional enterprise BI vendors is to build flexible, user-friendly visual platforms, while for the newcomers it’s applying structure and governance to their visually oriented information environment. Arguably, Pentaho is building its platform from the middle out. The company has done a good job of balancing usability aspects with the governance and security models needed for a holistic approach that both IT and end users can support. Organizations that are looking for a unified data integration and business analytics approach for business and IT, including advanced analytics and embedded approaches to information-driven applications, should consider Pentaho.


Tony Cosentino

VP and Research Director