
Splunk’s annual gathering, this year called .conf 2015, in late September hosted almost 4,000 Splunk customers, partners and employees. It is one of the fastest-growing user conferences in the technology industry. The area dedicated to Splunk partners has grown from a handful of booths a few years ago to a vast showroom floor many times larger. While the conference’s main announcement was the release of Splunk Enterprise 6.3, its flagship platform, the progress the company is making in the related areas of machine learning and the Internet of Things (IoT) most caught my attention.

Splunk’s strength is its ability to index, normalize, correlate and query data throughout the technology stack, including applications, servers, networks and sensors. It uses distributed search that enables correlation and analysis of events across local- and wide-area networks without moving vast amounts of data. Its architectural approach unifies cloud and on-premises implementations and provides extensibility for developers building applications. Originally, Splunk provided an innovative way to troubleshoot complex technology issues, but over time new uses for Splunk-based data have emerged, including digital marketing analytics, cyber security, fraud prevention and connecting digital devices in the emerging Internet of Things. Ventana Research has covered Splunk since its establishment in the market, most recently in this analysis of mine.

Splunk’s experience in dealing directly with distributed, time-series data and processes on a large scale puts it in position to address the Internet of Things from an industrial perspective. This sort of data is at the heart of large-scale industrial control systems, but implementations rely on divergent formats and protocols. For instance, sensor technology and control systems invented 10 to 20 years ago use very different technology than modern systems. Furthermore, as with computer technology, there are multiple layers in stack models that have to communicate. Splunk’s tools help engineers and systems analysts cross-reference these disparate systems in the same way that they query computer system and network data; the underlying systems, however, can be vastly different. To address this challenge, Splunk turns to its partners and its extensible platform. For example, Kepware has developed plug-ins that use its more than 150 communication drivers so users can stream real-time industrial sensor and machine data directly into the Splunk platform. Currently, the primary value drivers for organizations in the industrial IoT are operational efficiency, predictive maintenance and asset management. At the conference, Splunk showcased projects in these areas, including one with Target that uses Splunk to improve operations in robotics and manufacturing.
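Streaming such sensor data into Splunk typically happens over an HTTP interface such as the HTTP Event Collector that arrived with the 6.3 release. Below is a minimal sketch of assembling one event payload in Python; the index name, sourcetype and sensor fields are illustrative assumptions, not taken from Splunk or Kepware documentation.

```python
import json
import time

def make_hec_event(sensor_id, metric, value, index="iot_sensors"):
    """Build a JSON payload for Splunk's HTTP Event Collector (HEC).

    The index name, sourcetype and event fields here are illustrative
    assumptions, not taken from vendor documentation.
    """
    return {
        "time": time.time(),           # epoch timestamp used for indexing
        "sourcetype": "sensor:metric",
        "index": index,
        "event": {"sensor_id": sensor_id, "metric": metric, "value": value},
    }

# One reading from a hypothetical machine sensor, serialized for POSTing to
# an HEC endpoint such as https://<splunk-host>:8088/services/collector.
payload = json.dumps(make_hec_event("press-07", "bearing_temp_c", 71.4))
```

In practice a forwarder or driver layer would batch many such events per request rather than sending them one at a time.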

For its part, Splunk is taking a multipronged approach by acquiring companies, investing in internal development and enabling its partner ecosystem to build new products. One key enabler of its approach to IoT is machine learning algorithms built on the Splunk platform. In machine learning a model can use new data to continuously learn and adapt its answers to queries. This differs from conventional predictive analytics, in which users build models and validate them based on a particular sample; the model does not adapt over time. With machine learning, for instance, if a piece of equipment or an automobile shows a certain optimal pattern of operation over time, an algorithm can identify that pattern and build a model for how that system should behave. When the equipment begins to act in a less optimal or anomalous way, the system can alert a human operator that there may be a problem, or in a machine-to-machine situation, it can invoke a process to solve the problem or recalibrate the machine.
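The learn-then-monitor pattern described above can be illustrated with a deliberately simple baseline model: derive the normal operating range from historical readings, then flag values that deviate sharply. Real systems use far richer models than this z-score sketch, and the vibration figures are invented.

```python
import statistics

def learn_baseline(readings):
    """Learn a simple operating baseline (mean and spread) from history."""
    mean = statistics.fmean(readings)
    stdev = statistics.stdev(readings)
    return mean, stdev

def is_anomalous(value, mean, stdev, threshold=3.0):
    """Flag readings more than `threshold` standard deviations off baseline."""
    return abs(value - mean) > threshold * stdev

# Vibration levels from a machine operating normally, then a sudden spike.
history = [0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 0.51, 0.49]
mean, stdev = learn_baseline(history)
print(is_anomalous(0.50, mean, stdev))  # False: within the learned pattern
print(is_anomalous(0.95, mean, stdev))  # True: alert an operator
```

In a machine-to-machine setup the `True` branch would invoke a remediation process instead of printing an alert.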

Machine learning algorithms allow event processes to be audited, analyzed and acted upon in real time. They enable predictive capabilities for maintenance, transportation and logistics, and asset management and can also be applied in more people-oriented domains such as fraud prevention, security, business process improvement and digital products. IoT potentially can have a major impact on business processes, but only if organizations can realign their systems toward discover-and-adapt rather than model-and-apply approaches. For instance, processes are often carried out in an uneven fashion different from the way the model was conceived and communicated through complex process documentation and systems. As more process flows are directly instrumented and more processes are carried out by machines, the ability to model directly based on the discovery of those event flows and to adapt to them (through human learning or machine learning) becomes key to improving organizational processes. Such realignment of business processes, however, often involves broad organizational transformation. Our benchmark research on operational intelligence shows that challenges associated with people and processes, rather than information and technology, most often hold back organizational improvement.

Two product announcements made at the conference illuminate the direction Splunk is taking with IoT and machine learning. The first is User Behavior Analytics (UBA), based on its acquisition of Caspida, which produces advanced algorithms that can detect anomalous behavior within a network. Such algorithms can model internal user behavior, and when behavior deviates from the specified norm, the system can generate an alert that can be addressed through investigative processes using Splunk Enterprise Security 4.0. Together, Splunk Enterprise Security 4.0 and UBA won the 2015 Ventana Research CIO Innovation Award. The acquisition of Caspida shows that Splunk is not afraid to acquire companies in niche areas where it can exploit its platform to deliver organizational value. I expect that we will see more such acquisitions of companies with high-value machine learning algorithms as Splunk carves out specific positions in these emerging markets.

The other product announced is IT Service Intelligence (ITSI), which highlights machine learning algorithms alongside Splunk’s core capabilities. The IT Service Intelligence App is an application in which end users deploy machine learning to see patterns in various IT service scenarios. ITSI can inform and enable multiple business uses such as predictive maintenance, churn analysis, service-level agreements and chargebacks. Similar to UBA, it uses anomaly detection to point out issues and enables managers to view highly distributed processes such as claims process data in insurance companies. At this point, however, use of ITSI (like other areas of IoT) may encounter cultural and political issues as organizations deal with changes in the roles of IT and operations management. Splunk’s direction with ITSI shows that the company is staying close to its IT operations knitting as it builds out application software, but such development also puts Splunk into new competitive scenarios where legacy technology and processes may still be considered good enough.

We note that ITSI is built using Splunk’s Machine Learning Toolkit and showcase, which currently is in preview mode. The platform is an important development for the company and fills one of the gaps that I pointed out in its portfolio last year. Addressing this gap enables Splunk and its partners to create services that apply advanced analytics to big data, a capability that almost half (45%) of organizations find important. I consider the use of predictive and advanced analytics on big data a killer application for big data; our benchmark research on big data analytics backs this claim: Predictive analytics is the type of analytics most (64%) organizations wish to pursue on big data.

Organizations currently looking at IoT use cases should consider Splunk’s strategy and tools in the context of the specific problems they need to address. Machine learning algorithms built for particular industries are key, so it is important to understand whether the problem can be addressed using prebuilt applications provided by Splunk or one of its partners, or whether the organization will need to build its own algorithms using the Splunk machine learning platform or alternatives. Evaluate both the platform capabilities and the instrumentation: the types of protocols and formats involved and how that data will be ingested into the system and related in a uniform manner. Most of all, be sure the skills and processes in the organization align with the technology from an end-user and business perspective.


Ventana Research

The concept and implementation of what is called big data are no longer new, and many organizations, especially larger ones, view it as a way to manage and understand the flood of data they receive. Our benchmark research on big data analytics shows that business intelligence (BI) is the most common type of system to which organizations deliver big data. However, BI systems aren’t a good fit for analyzing big data. They were built to provide interactive analysis of structured data sources using Structured Query Language (SQL). Big data includes large volumes of data that does not fit into rows and columns, such as sensor data, text data and Web log data. Such data must be transformed and modeled before it can fit into paradigms such as SQL.
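A small example of the transformation involved: parsing a semistructured Web log line into named columns that an SQL-based BI system could then query. The log format shown is the common Apache-style layout, and the sample line is invented.

```python
import re

# A common-format Web server log line is semistructured text that must be
# parsed into named columns before an SQL-oriented BI tool can query it.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<bytes>\d+)'
)

def to_row(log_line):
    """Transform one raw log line into a flat dict -- a 'row' of columns."""
    m = LOG_PATTERN.match(log_line)
    if m is None:
        return None
    # Cast numeric fields so downstream aggregation treats them as numbers.
    return {**m.groupdict(), "status": int(m.group("status")),
            "bytes": int(m.group("bytes"))}

line = '192.0.2.1 - - [10/Oct/2015:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326'
row = to_row(line)
print(row["path"], row["status"])  # /index.html 200
```

Only after this modeling step does such data fit the rows-and-columns paradigm that SQL-based BI systems assume.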

The result is that currently many organizations run separate systems for big data and business intelligence. On one system, conventional BI tools as well as new visual discovery tools act on structured data sources to do fast interactive analysis. In this area analytic databases can use column store approaches and visualization tools as a front end for fast interaction with the data. On other systems, big data is stored in distributed systems such as the Hadoop Distributed File System (HDFS). Tools that use it have been developed to access, process and analyze the data. Commercial distribution companies aligned with the open source Apache Software Foundation, such as Cloudera, Hortonworks and MapR, have built ecosystems around the MapReduce processing paradigm. MapReduce works well for search-based tasks but not so well for the interactive analytics for which business intelligence systems are known. This situation has created a divide between business technology users, who gravitate to visual discovery tools that provide easily accessible and interactive data exploration, and more technically skilled users of big data tools that require sophisticated access paradigms and elongated query cycles to explore data.

There are two challenges with the MapReduce approach. First, working with it is a highly technical endeavor that requires advanced skills. Our big data analytics research shows that lack of skills is the most widespread reason for dissatisfaction with big data analytics, mentioned by more than two-thirds of companies. To fill this gap, vendors of big data technologies should facilitate use of familiar interfaces, including query interfaces and programming language interfaces. For example, our research shows that standard SQL is the most important method for implementing analysis on Hadoop. To deal with this challenge, the distribution companies and others offer SQL abstraction layers on top of HDFS, such as Hive and Cloudera Impala. Companies that I have written about include Datameer and Platfora, whose systems help users interact with Hadoop data via interactive paradigms such as spreadsheets and multidimensional cubes. With such familiar interaction models these systems have helped increase adoption of Hadoop and enabled more than just a small circle of experts to access big data systems.
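The appeal of these abstraction layers is that analysts can keep writing standard SQL. The sketch below uses Python’s built-in sqlite3 purely as a stand-in for a Hive or Impala connection; their Python drivers expose a similar cursor-style workflow, so the analyst’s experience is ordinary SQL either way. The table and data are invented for the example.

```python
import sqlite3

# Standing in for a Hive/Impala connection: each exposes a DB-API-style
# cursor, so the analyst's workflow is ordinary SQL in both cases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (url TEXT, status INTEGER)")
conn.executemany("INSERT INTO page_views VALUES (?, ?)",
                 [("/home", 200), ("/home", 200), ("/cart", 500)])

# The same aggregate query an SQL-on-Hadoop layer would run over
# HDFS-resident data.
for url, hits in conn.execute(
        "SELECT url, COUNT(*) FROM page_views GROUP BY url ORDER BY url"):
    print(url, hits)
```

The value of the abstraction is precisely that this query text would not change if the engine underneath were a Hadoop cluster instead of a local database.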

The second challenge is latency. As a batch process, MapReduce must sort and aggregate all of the data before creating analytic output. Technologies such as Tez, developed by Hortonworks, and Cloudera Impala aim to address this speed limitation; the first leverages MapReduce, and the other circumvents MapReduce altogether. Adoption of these tools has moved the big data market forward, but challenges remain, such as the continuing fragmentation of the Hadoop ecosystem and a lack of standardization in approaches.
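The latency stems from the shape of the paradigm itself: a map phase emits key-value pairs, and a reduce phase must sort and aggregate the full set before any output appears. A toy single-machine rendition of that shape, counting status codes in log records:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(records):
    """Emit (key, 1) pairs -- here, one pair per log record's status code."""
    for rec in records:
        yield rec["status"], 1

def reduce_phase(pairs):
    """Sort, then aggregate by key: the full-pass batch step that makes
    MapReduce output wait until all input has been processed."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, sum(count for _, count in group)

logs = [{"status": 200}, {"status": 404}, {"status": 200}]
print(dict(reduce_phase(map_phase(logs))))  # {200: 2, 404: 1}
```

Interactive BI queries, by contrast, expect answers in seconds, which is why engines that avoid this full sort-and-aggregate pass have emerged.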

An emerging technology holds promise for bridging the gap between big data and BI in a way that can unify big data ecosystems rather than dividing them. Apache Spark, under development since 2010 at the University of California Berkeley’s AMPLab, addresses both usability and performance concerns for big data. It adds flexibility by running on multiple platforms in terms of both clustering (such as Hadoop YARN and Apache Mesos) and distributed storage (for example, HDFS, Cassandra, Amazon S3 and OpenStack’s Swift). Spark also expands the potential uses because the platform includes an SQL abstraction layer (Spark SQL), a machine learning library (MLlib), a graph library (GraphX) and a near-real-time engine (Spark Streaming). Furthermore, Spark can be programmed using modern languages such as Python and Scala. Having all of these components integrated is important because interactive business intelligence, advanced analytics and operational intelligence on big data can all run without the complexity of the separate proprietary systems previously required to do the same things.

Because of this potential, Spark is becoming a rallying point for providers of big data analytics. It has become the most active Apache project as key open source contributors moved their focus from other Hadoop projects to it. Out of the effort in Berkeley, Databricks was founded for commercial development of open source Apache Spark and has raised more than $46 million. Since the initial release in May 2014, the momentum for Spark has continued to build, and major companies have made announcements around Apache Spark. IBM said it will dedicate 3,500 researchers and engineers to develop the platform and help customers deploy it. This is the largest dedicated Spark effort in the industry, akin to the move IBM made in the late 1990s with the Linux open source operating system. Oracle has built Spark into its Big Data Appliance. Microsoft has Spark as an option on its HDInsight big data approach but has also announced Prajna, an alternative approach to Spark. SAP has announced integration with its SAP HANA platform, although it represents “coopetition” for SAP’s in-memory platform. In addition, all the major business intelligence players have built or are building connectors to run on Spark. In time, Spark likely will serve as a data ingestion engine for connecting devices in the Internet of Things (IoT). For instance, Spark can integrate with technologies such as Apache Kafka or Amazon Kinesis to instantly process and analyze IoT data so that immediate action can be taken. In this way, as it is envisioned by its creators, Spark can serve as the nexus of multiple systems.

Because it is a flexible in-memory technology for big data, Spark opens the door to many new opportunities, which in business use include interactive analysis, advanced customer analytics, fraud detection, and systems and network management. At the same time, it is not yet a mature technology, and for this reason organizations considering adoption should tread carefully. While Spark may offer better performance and usability, MapReduce is already widely deployed. For those users, it is likely best to maintain the current approach and not fix what is not broken. For future big data uses, however, Spark should be carefully compared with other big data technologies. In this case as in others, technical skills can still be a concern. Scala, for instance, one of the key languages used with Spark, has little adoption, according to our recent research on next-generation predictive analytics. Manageability is an issue, as with any nascent technology, and should be carefully addressed up front. While, as noted, vendor support for Spark is becoming apparent, frequent updates to the platform can mean disruption to systems and processes, so examine the processes for these updates. Be sure that vendor support is tied to meaningful business objectives and outcomes. Spark is an exciting new technology, and for early adopters that wish to move forward with it today, both big opportunities and challenges are in store.



As I discussed recently in the state of data and analytics in the cloud, usability is a top evaluation criterion for organizations selecting cloud-based analytics software. Access to data in both cloud and on-premises systems is an essential antecedent of usability. It can help business people perform analytic tasks themselves without having to rely on IT. Some tools allow data integration by business users on an ad hoc basis, but to provide an enterprise integration process and a governed information platform, IT involvement is often necessary. Once that is done, though, using cloud-based data for analytics can help, empowering business users and improving communication and processes.

To be able to make the best decisions, organizations need access to multiple integrated data sources. The research finds that the most common data sources are predictable: business applications (51%), business intelligence applications (51%), data warehouses or operational data stores (50%), relational databases (41%) and flat files (33%). Increasingly, though, organizations also are including less structured sources such as semistructured documents (33%), social media (27%) and nonrelational database systems (19%). In addition, there are important external data sources, including business applications (for 61%), social media data (48%), Internet information (42%), government sources (33%) and market data (29%). Whether stored in the cloud or locally, data must be normalized and combined into a single data set so that analytics can be performed.
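A minimal illustration of that normalization step, combining a CSV business-application export with a JSON Web-analytics feed into one joined data set. The sources, keys and field names are invented for the example.

```python
import csv
import io
import json

# Two sources in different formats: a CRM export (flat CSV file) and a
# Web-analytics feed (JSON), to be normalized into one analysis-ready set.
crm_csv = "customer_id,region\nC1,West\nC2,East\n"
web_json = '[{"customer_id": "C1", "visits": 14}, {"customer_id": "C2", "visits": 3}]'

# Index each source by its shared key.
crm = {r["customer_id"]: r for r in csv.DictReader(io.StringIO(crm_csv))}
web = {r["customer_id"]: r for r in json.loads(web_json)}

# Join on the shared key so analytics sees one record per customer.
combined = [{**crm[k], **web[k]} for k in crm.keys() & web.keys()]
print(sorted(combined, key=lambda r: r["customer_id"]))
```

Real integrations must also reconcile data types, units and duplicate keys across sources, which is where dedicated integration tools earn their keep.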

Given the distributed nature of data sources as well as the diversity of data types, information platforms and integration approaches are changing. While more than three in five companies (61%) still do integration primarily between on-premises systems, significant percentages are now doing integration from the cloud to on-premises (47%) and from on-premises to the cloud (39%). In the future, this trend will become more pronounced. According to our research, 85 percent of companies eventually will integrate cloud data with on-premises sources, and 84 percent will do the reverse. We expect that hybrid architectures, a mix of on-premises and cloud data infrastructures, will prevail in enterprise information architectures for years to come while slowly evolving toward parity in bidirectional data transfer between the two environments.

Further analysis shows that a focus on integrating data for cloud analytics can give organizations competitive advantage. Those who said it is very important to integrate data for cloud-based analytics (42% of participants) also said they are very confident in their ability to use the cloud for analytics (35%); that’s three times more often than those who said integrating data is important (10%) or somewhat important (9%). Those saying that integration is very important also said more often that cloud-based analytics helps their customers, partners and employees in an array of ways, including improved presentation of data and analytics (62% vs. 43% of those who said integration is important or somewhat important), gaining access to many different data sources (57% vs. 49%) and improved data quality and data management (59% vs. 53%). These numbers indicate that organizations that neglect the integration aspects of cloud analytics are likely to be at a disadvantage compared to their peers that make it a priority.

Integration for cloud analytics is typically a manual task. In particular, almost half (49%) of organizations in the research use spreadsheets to manage the integration and preparation of cloud-based data. Yet doing so poses serious challenges: 58 percent of those using spreadsheets said it hampers their ability to manage processes efficiently. While traditional methods may suffice for integrating relatively small and well-defined data sets in an on-premises environment, they have limits when dealing with the scale and complexity of cloud-based data. The research also finds that organizations utilizing newer integration tools are satisfied with them more often than those using older tools. More than three-fourths (78%) of those using tools provided by a cloud applications provider said they are satisfied or somewhat satisfied with them, as are even more (86%) of those using data integration tools designed for cloud computing; by comparison, fewer of those using spreadsheets (56%) or traditional enterprise data integration tools (71%) are satisfied.

This is not surprising. Modern cloud connectors are designed to connect via loosely coupled interfaces that allow cloud systems to share data in a flexible manner. The research thus suggests that for organizations needing to integrate data from cloud-based data sources, switching to modern integration tools can streamline the process.

Overall three-quarters of companies in our research said that it is important or very important to access data from cloud-based sources for analysis. Cloud-based analytics isn’t useful unless the right data can be fed into the analytic process. But without capable tools this is not easy to do. A substantial impediment is that analysts spend the majority of their time in accessing and preparing the data rather than in actual analysis. Complicating the task, each data source can represent a different, possibly complex, data model. Furthermore, the data sets may have varying data formats and interface requirements, which are not easily addressed with legacy integration tools.

Such complexity is the new reality, and new tools and approaches have come to market to address these complexities. For organizations looking to integrate their data for cloud-based analytics, we recommend exploring these new integration processes and technologies.



Our recently completed benchmark research on data and analytics in the cloud shows that analytics deployed in cloud-based systems is gaining widespread adoption. Almost half (48%) of participating organizations are using cloud-based analytics, another 19 percent said they plan to begin using it within 12 months, and 31 percent said they will begin to use cloud-based analytics but do not know when. Participants in various areas of the organization said they use cloud-based analytics, but front-office functions such as marketing and sales rated it important more often than did finance, accounting and human resources. This front-office focus is underscored by the finding that the categories of information for which cloud-based analytics is most often deemed important are forecasting (mentioned by 51%) and customer-related (47%) and sales-related (33%) information.

The research also shows that while adoption is high, organizations face challenges as they seek to realize full value from their cloud-based data and analytics initiatives. Our Performance Index analysis reveals that only one in seven organizations reach the highest Innovative level of the four levels of performance in their use of cloud-based analytics. Of the four dimensions we use to further analyze performance, organizations do better in Technology and Process than in Information and People. That is, the tools and analytic processes used for data and analytics in the cloud have advanced more rapidly than users’ abilities to work with their information. The weaker performance in People and Information is reflected in findings on the most common barriers to deployment of cloud-based analytics: lack of confidence about the security of data and analytics, mentioned by 56 percent of organizations, and not enough skills to use cloud-based analytics (42%).

Given the top barrier of perceived data security issues, it is not surprising that the research finds the largest percentage of organizations (66%) use a private cloud, which by its nature ostensibly is more secure, to deploy analytics; fewer use a public cloud (38%) or a hybrid cloud (30%), although many use more than one type today. We know from tracking analytics and business intelligence software providers that operate in the public cloud that this is changing quite rapidly. Comparing deployment by industry sector, the research analysis shows that private and hybrid clouds are more prevalent in the regulated areas of finance, insurance and real estate and government than in services and manufacturing. The research suggests that private and hybrid cloud deployments are used more often for analytics where data privacy is a concern.

Furthermore, organizations said that access to data for analytics is easier with private and hybrid clouds (29% for public cloud vs. 58% for private cloud and 67% for hybrid cloud). In addition, organizations using private and hybrid cloud more often said they have improved communication and information sharing (56% public vs. 72% private and 70% hybrid). Thus, the research data makes clear that organizations feel more comfortable implementing analytics in a private or hybrid cloud in many areas.

Private and hybrid cloud implementations of data and analytics often coincide with large data integration efforts, which are necessary at some point to benefit from such deployments. Those who said that integration is very important also said more often than those giving it less importance that cloud-based analytics helps their customers, partners and employees in an array of ways, including improved presentation of data and analytics (62% vs. 43% of those who said integration is important or somewhat important), gaining access to many different data sources (57% vs. 49%) and improved data quality and data management (59% vs. 53%). We note that the focus on data integration efforts correlates more with private and hybrid cloud approaches than with public cloud approaches, thus the benefits cannot be attributed directly to either the cloud approaches or the integration efforts.

Another key insight from the research is that data and analytics often are considered in conjunction with mobile and collaboration initiatives, which carry different priorities for business than for IT or consumer markets. Nine out of 10 organizations said they use or intend to use collaboration technology to support their cloud-based data and analytics, and 83 percent said they need to support data access and analytics on mobile devices. Two-thirds said they support both tablets and smartphones and multiple mobile operating systems, the most important of which are Apple iOS (ranked first by 60%), Google Android (ranked first by 26%) and Microsoft Windows Mobile (ranked first by 13%). We note that Microsoft has a higher percentage of importance here than its reported market share (approximately 2.5%) would suggest. Similarly, Google Android has greater penetration than Apple in the consumer market (51% vs. 41%). We expect that the influence of mobile operating systems related to data and analytics in the cloud will continue to evolve and be impacted by upcoming corporate technology refreshment cycles, the consolidation of PCs and mobile devices, and the “bring your own device” (BYOD) trend.

The research finds that usability (63%) and reliability (57%) are the top technology buying criteria, which is consistent with our business technology innovation research conducted last year. What has changed is that manageability is cited as very important as often as functionality, by approximately half of respondents, a stronger showing than in our previous research. We think it likely that manageability is gaining prominence as cloud providers and organizations sort out who manages deployments, usage and licensing, as well as who actually owns your data in the cloud, which my colleague Robert Kugel has discussed.

As the research shows, the importance of cloud data and analytics continues to grow. This makes me eager to discuss further the attitudes, requirements and future plans of organizations that use data and analytics in the cloud and to identify the best practices of those that are most proficient in it. To learn more about best practices for data and analytics in the cloud, download the executive summary of the report to improve your readiness.



Our benchmark research shows that analytics is the top business technology innovation priority; 39% of organizations rank it first. This is no surprise, as new information sources and new technologies in data processing, storage, networking, databases and analytic software are combining to offer capabilities for using information never before possible. For businesses, the analytic priority is heightened by intense competition on several fronts; they need to know as much as possible about pricing, strategies, customers and competitors. Within the organization, the IT department and the lines of business continue to debate issues around the analytic skills gap, information simplification, information governance and the rise of time-to-value metrics. Given this backdrop, I expect 2014 to be an exciting year for studying analytic technologies and how they apply to business.

Three key focus areas comprise my 2014 analytics research agenda. The first includes a specific focus on business analytics and methods such as discovery and exploratory analytics. This area will be covered in depth in our new research on next-generation business analytics commencing in the first half of 2014. At Ventana Research, we break discovery analytics into visual discovery, data discovery, event discovery and information discovery. The definitions and uses of each type appear in Mark Smith’s analysis of the four discovery technologies. As part of this research, we will examine these exploratory tools and techniques in the context of the analytic skills gap and the new analytic process flows in organizations. The people and process aspects of the research will include how governance and controls are being implemented alongside these innovations. The exploratory analytics space includes business intelligence, which our research shows is still the primary method of deploying information and analytics in organizations. Two upcoming Value Indexes, Mobile Business Intelligence, due out in the first quarter, and Business Intelligence, starting in the second, will provide up-to-date and in-depth evaluations and rankings of vendors in these categories.

My second agenda area is big data and predictive analytics. The first research on this topic will be released in the first quarter of the year as benchmark research on big data analytics. This fresh and comprehensive research maps to my analysis of the four pillars of big data analytics, a framework for thinking about big data and the associated analytic technologies. This research also has depth in the areas of predictive analytics and big data approaches in use today. In addition to that benchmark research, we will conduct a first-of-its-kind Big Data Analytics Value Index, which will assess the major players applying analytics to big data. Real-time and right-time big data also is called operational intelligence, an area Ventana Research has pioneered over the years. Our Operational Intelligence Value Index, which will be released in the first quarter, evaluates vendors of software that helps companies do real-time analytics against large streams of data; it builds on our benchmark research on the topic.

The third focus area is information simplification and cloud-based business analytics including business intelligence. In our recently released benchmark research on information optimization, nearly all (97%) organizations said it is important or very important to simplify information access for both their business and their customers. Paradoxically, the technology landscape is at the same time getting more fragmented and complex; to simplify, software design will need innovative uses of analytic technology that mask the underlying complexity through layers of abstraction. In particular, users need sourcing data and preparing data for analysis to be simplified and made more flexible so they can devote less time to these tasks and more to the actual analysis. Part of the challenge in information optimization and integration is to analyze data that originates in the cloud or has been moved there. This issue has important implications for debates around information presentation, the semantic web, where analytics are executed, and whether business intelligence will move to the cloud in any more than a piecemeal fashion. We’ll explore these topics in benchmark research on business intelligence and analytics in the cloud, which is planned for the second half of 2014. In 2013 we released research on location analytics, the use of geography in the presentation and processing of data.

Analytics as a business discipline is getting hotter as we move forward in the 21st century, and I am thrilled to be part of the analytics community. I welcome any feedback you have on my research agenda and look forward to continuing to provide research, collaboration and education in 2014.


Tony Cosentino

VP and Research Director

In his keynote speech at the sixth annual Tableau Customer Conference, company co-founder and CEO Christian Chabot borrowed from Steve Jobs’ famous quote that the computer “is the equivalent of a bicycle for our minds” to suggest that his company’s software is such a new bicycle. He went on to build an argument about the nature of invention and Tableau’s place in it. The people who make great discoveries, Chabot said, start with both intuition and logic. This approach allows them to look at ideas and information from different perspectives and to see things that others don’t see. In a similar vein, he went on, Tableau allows us to look at things differently, understand patterns and generate new ideas that might not arise using traditional tools. Chabot’s key point was profound: New technologies such as Tableau’s visual analytics software, which draws on new and big data sources of information, are enablers and accelerators of human understanding.

Tableau represents a new class of business intelligence (BI) software that is designed for business analytics, allowing users to visualize and interact with data in new ways without mandating that relationships in the data be predefined. This business analytics focus is critical: it is the top-ranked technology innovation in business today, identified by 39 percent of organizations in our research. In traditional BI systems, data is modeled in so-called cubes or other predefined structures that allow users to slice and dice data instantaneously and in a user-friendly fashion. The cube structure solves the problem of abstracting the complexity of the structured query language (SQL) of the database and the inordinate amount of time it can take to read data from a row-oriented database. However, with memory decreasing significantly in cost, the advent of new column-oriented databases, and approaches such as VizQL (Tableau’s proprietary query language that allows for direct visual query of a database), traditional BI approaches are now challenged by new ones. Tableau has effectively exploited the exploratory aspects of data through its technological approach, all the more so with the advent of many new big data sources that require more discovery-oriented methods.
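To make the row-versus-column point concrete, here is a toy sketch in Python (illustrative only, not Tableau’s or any vendor’s implementation): an aggregate such as a sum over one field needs to touch only a single array in a columnar layout, while a row store must walk every field of every record.

```python
# Toy illustration of row-oriented vs. column-oriented storage.
# An aggregation such as SUM(amount) needs only one column; a
# columnar layout scans just that array, while a row store must
# iterate over complete records to reach the field.

rows = [  # row-oriented: one record per row
    {"region": "East", "product": "A", "amount": 100},
    {"region": "West", "product": "B", "amount": 250},
    {"region": "East", "product": "B", "amount": 175},
]

columns = {  # column-oriented: one array per field
    "region": ["East", "West", "East"],
    "product": ["A", "B", "B"],
    "amount": [100, 250, 175],
}

# Row store: iterate over full records to reach the 'amount' field.
row_total = sum(r["amount"] for r in rows)

# Column store: scan a single contiguous array.
col_total = sum(columns["amount"])

print(row_total, col_total)  # both 525, but the column scan reads a third of the data
```

The same principle, reading only the columns a query touches, is what lets columnar analytic databases answer aggregate queries over large tables quickly.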

After Chabot’s speech, Chris Stolte, Tableau’s co-founder and chief development officer, took the audience through the company’s product vision, which is centered on the themes of seamless access to data, visual analytics for everyone, ease of use and beauty, storytelling, and enterprise analytics everywhere. This matters because what Tableau performs in its software amounts to discovery methods for business analytics; my colleague has outlined the four types of discovery, of which Tableau currently supports two, data and visual discovery. Most important, Tableau is working to address a broader set of user personas that I have outlined to the industry, expanding further to analysts, publishers and data geeks. As part of Stolte’s address, the product team took the stage to discuss innovations coming in the Tableau 8.1 release scheduled for this fall of 2013 and the 8.2 release due early in 2014, all of which were publicly disclosed in the Internet broadcast of the keynote.

One of those innovations is a new connection interface that enables a workflow for connecting to data. It provides light data integration capabilities with which users clean and reshape data in a number of ways. The software automatically detects inner-join keys with a single click, and the new table shows up automatically. Users can easily switch between left and right joins as well as change the join field altogether. Once data is extracted and imported, new tools such as a data parser let users specify a date format. While these data integration capabilities are admittedly lightweight compared with tools such as Informatica or Pentaho (which just released its 5.0 platform, integrated with its business analytics offering), they are a welcome development for users who still spend more of their time cleaning, preparing and reviewing data than analyzing it. Our benchmark research on information management shows that dirty data is a barrier to information management for 58 percent of organizations. Tableau’s developments and others in the area of information management should continue to erode the entrenchment of tools inappropriately used for analytics, especially spreadsheets, whose use and misuse my colleague Robert Kugel has researched. These advances in simpler access to data are critical, as 44 percent of organizations indicated that more time is spent on data-related activities than on analytic tasks.
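As a rough illustration of this kind of light data preparation, detecting a shared join key, performing a left join and parsing dates with an explicit format, here is a sketch in plain Python; the tables and field names are invented for the example and are not Tableau’s internals.

```python
from datetime import datetime

orders = [
    {"cust_id": 1, "order_date": "2013-09-09", "total": 120.0},
    {"cust_id": 2, "order_date": "2013-09-10", "total": 75.5},
    {"cust_id": 3, "order_date": "2013-09-11", "total": 42.0},
]
customers = [
    {"cust_id": 1, "name": "Acme"},
    {"cust_id": 2, "name": "Globex"},
]

# "Detect" the join key as the field name the two tables share.
join_key = (set(orders[0]) & set(customers[0])).pop()

# Left join: keep every order, attach customer fields when present.
by_key = {c[join_key]: c for c in customers}
joined = [{**o, **by_key.get(o[join_key], {"name": None})} for o in orders]

# Parse dates with an explicit format, as a data parser would.
for row in joined:
    row["order_date"] = datetime.strptime(row["order_date"], "%Y-%m-%d")

print(joined[2]["name"])  # None: unmatched rows survive a left join
```

The point of surfacing this workflow in a visual interface is that analysts get the same result without writing any of the above.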

A significant development in the 8.1 release of Tableau is the integration of R, the open source programming language for statistics and predictive analytics. Advances in the R language through the R community have been robust and continue to gain steam, as I discussed recently in an analysis of opportunities and barriers for R. Tableau users will still need to know the details of R, but now output can be natively visualized in Tableau. Depending on their needs, users can gain a more integrated and advanced statistical experience with Tableau partner tools such as Alteryx, which both my colleague Richard Snow and I have written about this year. Alteryx integrates with R at a higher level of abstraction and also integrates directly with Tableau output. While R integration is important for Tableau to provide new capabilities, it should be noted that this is a single-threaded approach limited to running in memory. This will be a concern for those trying to analyze truly large data sets, since a single-threaded approach limits the modeler to about a terabyte of data. For now, Tableau likely will serve mostly as an R sandbox for sample data; when users need to move algorithms into production for larger data, they probably will have to use a parallelized environment. Other BI vendors have already embedded R; Information Builders’ WebFOCUS, for example, embeds R in a product designed for analysts and hides R’s complexities.
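The “sandbox for sample data” pattern described above, fitting a model on an in-memory sample of a data set too large to process whole on one thread, might look like the following minimal ordinary-least-squares sketch, with pure Python standing in for the R call; the data is synthetic and noise-free for illustration.

```python
import random

def fit_ols(xs, ys):
    """Closed-form simple linear regression: y ≈ intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope  # (intercept, slope)

# Pretend the full data set is too big to fit on a single thread;
# draw a random in-memory sample and model that instead.
random.seed(7)
full_x = list(range(100_000))
full_y = [2.0 * x + 5.0 for x in full_x]  # exact line y = 2x + 5
sample = random.sample(list(zip(full_x, full_y)), 1_000)
xs, ys = zip(*sample)

intercept, slope = fit_ols(xs, ys)
print(round(slope, 4), round(intercept, 4))  # ≈ 2.0 and 5.0
```

In production at larger scale, the equivalent step would run in a parallelized environment rather than as a single in-memory fit.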

Beyond the R integration, Tableau showed useful descriptive analytic methods such as box plots and percentile aggregations. Forecast improvements facilitate changing prediction bands and adjusting seasonality factors. Different ranking methods can be used, and two-pass totals provide useful data views. While these analytic developments are nice, they are not groundbreaking. Tools like BIRT Analytics, Datawatch Panopticon and Tibco Spotfire are still ahead in their ability to visualize data models with methods such as decision trees and clustering. Meanwhile, SAP just acquired KXEN and will likely start to integrate predictive capabilities into SAP Lumira, its visual analytics platform. SAS is also integrating easy-to-use high-end analytics into its Visual Analytics tool, and IBM’s SPSS and Cognos Insight work together for advanced analytics. Our research on predictive analytics shows that classification trees (69%), followed by regression techniques and association rules (66% and 61%, respectively), are the statistical techniques most often used in organizations today. Tableau also indicated future investments in improving location and maps within visualization. This goal aligns with our location analytics research, which found that 48 percent of businesses report that using location analytics significantly improves their business processes. Tableau’s advances in visualizing analytics are great for analysts and data geeks but remain beyond the competencies of information consumers. At the same time, Tableau has done what Microsoft has not done with Excel: made preparing analytics for interactive visualization and discovery simple. In addition, Tableau is easily accessible from mobile technology such as tablets, which is definitely not a strong spot for Microsoft.
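For readers unfamiliar with classification trees, the most-used technique in our research, the core idea can be sketched as repeatedly finding the split that best separates the classes; here is a minimal one-level version (a decision stump) using Gini impurity, on invented churn data. Real tools such as those named above do far more, but the mechanics are the same.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1 - p1) ** 2

def best_split(xs, ys):
    """Find the threshold on a single numeric feature that minimizes
    the weighted Gini impurity of the two resulting branches."""
    best_score, best_t = float("inf"), None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_score, best_t = score, t
    return best_t

# Hypothetical churn data: tenure in months vs. churned (1) or not (0).
tenure = [1, 2, 3, 4, 10, 12, 15, 20]
churned = [1, 1, 1, 1, 0, 0, 0, 0]

threshold = best_split(tenure, churned)
print(threshold)  # 4: in this toy set, customers with tenure <= 4 months churn
```

A full classification tree simply applies this search recursively to each branch, which is why the resulting model is so easy to visualize and explain.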

Making life easier for analysts and knowledge workers, Tableau now offers two-click copy and paste of dashboards between workbooks, a significant development that allows users to organize visualizations expediently. Users store the collection of data and visualizations in folders from the data window, where they access all the components used for discovery, and search makes it easy to find any detail. Transparency features and quick filter formatting allow users to match brand logos and colors. Presentation mode allows full-screen center display, and calendar controls let users select dates in ways they are familiar with from other calendaring tools. What was surprising is that Tableau did not show how to present supporting information alongside a visualization, such as the free-form text analysts add when using Microsoft PowerPoint, or how to integrate content and documents beyond structured data. My colleague has pointed to the failures of business intelligence and what analysts need in order to provide more context and information around a visualization, and it appears Tableau is starting to address them.

Tableau developer Robert Kosara showed a Tableau 8.2 feature called Storypoints, which puts a navigator at the top of the screen and pulls together different visualizations to support a logical argument. This is important in addressing the visual anarchy my colleague has pointed out in today’s market. Storypoints are linked directly to data visualizations, across which users can navigate to see varying states of the visualization. Storytelling is of keen interest to analysts, because it is the primary way in which they prepare information and analytics for review and in support of decision-making in the organization. Encapsulating observations in a story, though, requires more than navigation across states of a visualization with a descriptive caption; it should support embedding of descriptive information related to the visualization, not just navigation across it. Tableau has more to offer with its canvas layout and embedding of other presentation components but did not spend much time outlining what is fully possible today. Storytelling and collaboration is a hot area of development, with multiple approaches coming to market including those from Datameer, Roambi, Yellowfin and QlikTech (with its acquisition of NcomVA, a Swedish visualization company). These approaches need to streamline the cumbersome process of copying data from Excel into PowerPoint, building charts and annotating slides. Tableau’s Storypoints, guided navigation on visualizations and dashboard copy and paste are good first steps that can be a superior alternative to a personal productivity approach with Microsoft Office, but Tableau will still need more depth to replicate the flexibility of PowerPoint in particular.

The last area of development, and perhaps the most important for Tableau, is making the platform and tools more enterprise-ready. Security, availability, scalability and manageability are the hallmarks of an enterprise-grade application, and Tableau is advancing in each of these areas. The Tableau 8.1 release includes external load-balancing support and more dynamic support for host names. Companies using the SAML standard can administer single sign-on that delegates authentication to an identity provider. IPv6 support for next-generation Internet apps and, perhaps most important from a scalability perspective, 64-bit architecture have been extended across the product line. (Relevant to this last development, Tableau Desktop in version 8.2 will run natively on the Apple Mac, which garnered perhaps the loudest cheer of the day from the mostly youthful attendees.) For proof of its scalability, Tableau pointed to Tableau Public, which fields more than 70,000 views each day. Furthermore, Tableau’s cloud implementation, Tableau Online, offers a multi-tenant architecture and a columnar database for scalability, performance and unified upgrades for cloud users. Providing a cloud deployment is critical according to our research, which found that a quarter of organizations prefer a cloud approach; however, cloud BI applications have seen slower adoption.

Enterprise implementations are the battleground on which Tableau wants to compete, and it is making inroads through rapid adoption by the analysts who are responsible for analytics across the organization. During the keynote, Chabot took direct aim at legacy business intelligence vendors, suggesting that using enterprise BI platform tools is akin to tying a brick to a pen and trying to draw. Enterprise BI platforms, he argued, are designed to work in opposition to great thinking; they are not iterative and exploratory in nature but developed in a predefined fashion. While this may be true in some cases, those same BI bricks are often the foundation of many organizations, and they are not always easy to remove. Last year, in analyzing Tableau 8.0, I argued that the company was not ready to compete for larger BI implementations. New developments coming this year address some of these shortcomings, but there is still the lingering question of the entrenchment of BI vendors and the metrics that are deployed broadly in organizations. Companies such as IBM, Oracle and SAP have a range of applications, including finance and planning, that reside at the heart of most organizations, and these applications can dictate the metrics and key indicators to which people and organizations are held accountable. For Tableau to replace them would require more integration with the established metrics and indicators managed within these tools and associated databases. For large rollouts driven by well-defined parameterized reporting needs and interaction with enterprise applications, Tableau still has work to do. Furthermore, every enterprise BI vendor has its own visual offering and is putting money into catching up to Tableau.

In sum, Tableau’s best-in-class ease of use does serve as a bicycle for the analytical mind, and with its IPO this year, Tableau is pedaling as fast as ever to continue its innovations. We research thousands of BI deployments and recently awarded Cisco our Business Technology Leadership Award in Analytics for 2013 for its use of Tableau Software. Cisco, which has many business intelligence (BI) tools, uses Tableau to design analytics and visualize data in multiple areas of its business. Tableau’s ability to capture the hearts and minds of the analysts responsible for analytics, and to demonstrate business value in a short period of time, what is called time to value (TTV) and, as I have pointed out, even more important for big data, is why the company is rapidly growing a community passionate about its products. Our research finds that business is driving adoption of business analytics, which helps Tableau avoid the politics of IT while addressing the top selection criterion of usability. In addition, the wave of business improvement initiatives is changing how 60 percent of organizations select technology, with buyers no longer simply accepting the IT standard or existing technology approach. Buyers in both IT and business should pay close attention to these trends, and organizations looking to compete with analytics through simple but elegant visualizations should consider Tableau’s offerings.


Tony Cosentino

VP and Research Director

As a new generation of business professionals embraces a new generation of technology, the line between people and their tools begins to blur. This shift comes as organizations become flatter and leaner and roles, context and responsibilities become intertwined. These changes have introduced faster and easier ways to bring information to users, in a context that makes it quicker to collaborate, assess and act. Today we see this in the prominent buying patterns for business intelligence and analytics software and an increased focus on the user experience. Almost two-thirds (63%) of participants in our benchmark research on next-generation business intelligence say that usability is the top purchase consideration for business intelligence software. In fact, usability is the driving factor in evaluating and selecting technology across all application and technology areas, according to our benchmark research.

In selecting and using technology, personas (that is, an idealized cohort of users) are particularly important, as they help business and IT assess where software will be used in the organization and define the role, responsibility and competency of users and the context of what they need and why. At the same time, personas help software companies understand the attitudinal, behavioral and demographic profile of target individuals and the specific experience that is not just attractive but essential to those users. For example, the mobile and collaborative intelligence capabilities needed by a field executive logging in from a tablet at a customer meeting are quite different from the analytic capabilities needed by an analyst trying to understand the causes of high customer churn rates and how to change that trend with a targeted marketing campaign.

Understanding this context-driven user experience is the first step toward defining the personas found in today’s range of analytics users. The key is to make the personas simple to understand but comprehensive enough to cover the diversity of needs for business analytic types within the organization. To help organizations be more effective in their analytic process and engagement of their resources and time, we recommend the following five analytical personas: (Note that in my years of segmentation work, I’ve found that the most important aspects are the number of segments and the names of those segments. To this end, I have chosen a simple number, five, and the most intuitive names I could find to represent each persona.)

Information Consumer: This persona is not technologically savvy and may even feel intimidated by technology. Information must be provided in a user-friendly fashion to minimize frustration. These users may rely on one or two tools that they use just well enough to do their jobs, which typically involves consuming information in presentations, reports, dashboards or other forms that are easy to read and interpret. They are oriented more to language than to numbers and in most cases would rather read or listen to information about the business. They can write a pertinent memo or email, make a convincing sales pitch or devise a brilliant strategy. Their typical role within the organization varies, but among this group is the high-ranking executive, including the CEO, for whom information is prepared. In the lines of business, this consumer may be a call center agent, a sales manager or a field service worker. In fact, in many companies, the information consumer makes up the majority of the organization. The information consumer usually can read Excel and PowerPoint documents but rarely works within them. This persona feels empowered by consumer-grade applications such as Google, Yelp and Facebook.

Knowledge Worker: Knowledge workers are business-, technology- and data-savvy and have domain knowledge. They interpret data in functional ways. These workers understand descriptive data but are not likely to take on data integration tasks or interpret advanced statistics (as in a regression analysis). In terms of tools, they can make sense of spreadsheets and, with minimal training, use the output of tools like business intelligence systems, pivot tables and visual discovery tools. They also actively participate in providing feedback and input to planning and business performance software. Typically, these individuals are in over their heads when asked to develop a pivot table or structure multidimensional data. In some instances, however, new discovery tools allow them to move beyond such limitations. The knowledge worker persona includes but is not limited to technology-savvy executives, line-of-business managers and directors, domain experts and operations managers. Since these workers focus on decision-making and business outcomes, analytics is an important part of their overall workflow but targeted at specific tasks. For analytical tools, this role may use applications with embedded analytics, analytic discovery and modeling approaches. Visual discovery tools and, in many instances, user-friendly SaaS applications are empowering knowledge workers to be more analytically driven without IT involvement.

Analyst: Well versed in data, this persona often knows the business intelligence and analytics tools that pertain to the position and applies analytics to various aspects of the business. These users are familiar with applications and systems and know how to retrieve and assemble data from them in many forms. They can also perform a range of data blending and data preparation tasks and create dashboards, data visualizations and pivot tables with minimal or no training. They can interpret many types of data, including correlation and in some cases regression. The analyst’s role involves modeling and analytics, either within specific analytic software or within software used for business planning and enterprise performance management. More senior analysts focus on more advanced analytics, such as predictive analytics and data mining, to understand current patterns in data and predict future outcomes. Analysts might be called a split persona in terms of where their skills and roles are deployed in the organization: they may reside in IT, but many more are found on the business side, as they are accountable for the outcomes of their analytics. Analysts on the business side may not be expert in SQL or computer programming but may be adept with languages such as R or SAS. Those on the IT side are more familiar with SQL and the building of data models used in databases. With respect to data preparation, the IT organization looks at integration through the lens of ETL and associated tool sets, whereas the business side looks at it from a data-merge perspective and the creation of analytical data sets in places like spreadsheets.

The roles that represent this persona are often explicitly called analysts, with a prefix that in most cases reflects the department they work in, such as finance, marketing, sales or operations, or that reflects corporate, customer, operational or other cross-departmental responsibilities. The analytical tools they use almost always include the spreadsheet, as well as complementary business intelligence tools and a range of analytical tools such as visual discovery and, in some cases, more advanced predictive analytics and statistical software. Visual discovery and commodity modeling approaches are empowering some analysts to move upstream from the role of data monger to a more interpretive decision-support position. For those already familiar with advanced modeling, today’s big data environments, including new sources of information and modern technology, provide the ability to build much more robust models and solve an entirely new class of business problems.

Publisher: Skilled in data and analytics, the publisher typically knows how to configure and operate business intelligence tools and publish information from them in dashboards or reports. They are typically skilled in the basics of spreadsheets and publishing information to Microsoft Word or PowerPoint tools. These users not only can interpret many types of analytics but can also build and validate the data for their organizations. Similar to the analyst, the publisher may be considered a split persona, as these individuals may be in a business unit or IT. The IT-based publisher is more familiar with the business intelligence processes and knows the data sources and how to get to data from the data warehouse or even big data sources. They may have basic configuration and scripting skills that enable them to produce outputs in several ways. They may also have basic SQL and relational data modeling skills that help them identify what can be published and determine how data can be combined through the BI tool or databases. The titles related to publisher may include business intelligence manager, data analyst, or manager or director of data or information management. The common tools used by the publisher include business intelligence authoring tools, various visualization and analytic tools, and office productivity tools like Microsoft Office and Adobe Acrobat.

Data Geek: A data geek, who may be a data analyst or someone as sophisticated as a data scientist, has expert data management skills and an interdisciplinary approach to data that melds the split personas discussed at the analyst and senior analyst levels. The primary difference between the data geek and the analyst is that the latter usually focuses on either the IT side or the business side. A senior analyst with a Ph.D. in computer science understands relational data models and programming languages but may not understand advanced statistical models and statistical programming languages. Similarly, a Ph.D. in statistics understands advanced predictive models and associated tools but may not be prepared to write computer code. The data scientist understands not only advanced statistics and modeling but also enough about computer programming and systems, along with domain knowledge. The titles for this role vary but include chief analytics officer, enterprise data architect, data analyst, head of information science and even data scientist.

To align analytics and the associated software to individuals in the organization, businesses should use personas to identify who needs what set of capabilities to be effective. Organizations should also assess competency levels in their personas to avoid adopting software that is too complicated or difficult to use. In some cases individuals will span multiple personas. Instead of wasting time, resources and financial capital, look to define what is needed and where training is needed to ensure business and IT work collaboratively on business analytics. While some business analytics software is getting easier to use, many of the offerings are still difficult because they are still designed for IT or more sophisticated analysts. While these individuals are an important group, they represent only a small portion of the users who need business analytics tools.

The next generation of business intelligence and business analytics will in part address the need to consume information more easily through smartphones and tablets but will not overcome one of the biggest barriers to big data analytics: the skills gap. Our benchmark research on big data shows staffing (79%) and training (77%) are the two biggest challenges organizations face in efforts to take advantage of big data through analytics. In addition, a language barrier still exists in some organizations, where IT speaks in terms of total cost of ownership and efficiency while the business speaks in terms of effectiveness and outcomes or time to value, which I have written about previously. While all of these goals are important, the organization needs to cater to the metrics needed by its various personas. Such understanding starts with better defining the different personas and building more effective communication among the groups to ensure that they work together more collaboratively to achieve their respective goals and get the most value from business analytics.


Tony Cosentino

VP and Research Director

Did you catch all the big data analogies people used in 2012? There were many, like the refinement of oil analogy, or the spinning straw into gold analogy, and less useful but more entertaining ones, like big data is like a box of chocolates, or big data is like The Matrix (because “there’s no way Keanu Reeves learns Kung Fu in five seconds without using big data”).  I tend to like the water analogy, which I’ll use here to have a little fun and to briefly describe how I see the business analytics market in 2013.

2013 is about standing lakes of information that will turn into numerous tributaries. These various tributaries of profile, behavioral and attitudinal data will flow into the digital river of institutional knowledge. Analytics, built out by people, process, information and technology, will be the only banks high enough to control this vast river and funnel it through the duct of organizational culture and into an ocean of market advantage.

With this river of information as a backdrop, I’m excited to introduce the Ventana Research Business Analytics Research Agenda for 2013, focused on three themes:

Answering the W’s (the what, the so what, the now what and the then what)

The first and perhaps most important theme of the 2013 research agenda builds on answering the W’s – the what, the so what, the now what and the then what – which was also the topic of one of my most widely read blog posts last year. In that piece I suggested that the shift from discussing the three V’s to the four W’s corresponds to the shift from a technologically oriented discussion to a business-oriented one. Volume, variety and velocity are well-known parameters of big data that help facilitate the technology discussion, but when we look at analytics and how it can drive success for an organization, we need to move to the so what, now what and then what of analytical insights, organizational decision-making and closed-loop processes. Our big data research found significant gaps between the business analytics capabilities organizations need and what is available to them today, fueling a new generation of technology for consuming big data.

Outcome-driven approaches are a good way of framing issues, given that business analytics, and in particular big data analytics, are such broad topics, yet the use cases are so specific. Our big data analytics benchmark research for 2013, which we will begin shortly, will therefore look at specific benefits and supported business cases across industries and lines of business in order to assess best practices for big data analytics. The research will investigate the opportunities and barriers that exist today and explore what needs to happen to move from an early adopter market to an early majority market. The benchmark research will feed weighting algorithms into our Big Data Analytics Value Index, which will assess the analytics vendors tackling the formidable challenge of providing software to analyze large and multistructured datasets.

Disseminating insights within the organization is a big part of moving from insights to action, and business intelligence is still a primary vehicle for driving insight into the organization. While there is a lot to be said about mobile BI, collaborative BI, visual discovery and predictive analytics, core business intelligence systems remain at the heart of many organizations. Therefore we will continue our in-depth coverage of core business intelligence systems with our Business Intelligence Value Index, in the context of our next-generation business analytics benchmark research that will begin in 2013.

Embracing next generation technology for business analytics

We’re beginning to see businesses embrace next-generation technology for business analytics. Collaborative business intelligence is a critical part of this conversation, both in terms of getting insights and in terms of making decisions. Last year’s next-generation business intelligence benchmark research showed us that the market is still undecided on how next-generation BI will be rolled out: business applications are the preferred delivery method, but only slightly more so than business intelligence tools or office productivity tools. In addition to collaboration, we will focus on mobile and location trends in our next-generation business analytics benchmark research and in our new location analytics benchmark research, which has already found that business analysts have specific needs in this area. We see mobile business intelligence as a particularly hot area in 2013, and we are therefore breaking out mobile business intelligence vendors in this year’s Mobile Business Intelligence Value Index.

Another hot area of next-generation technology revolves around right-time and real-time data and how they can be built into organizational workflows. As our operational intelligence benchmark research found, perceptions of OI and real-time data differ significantly between IT and business users. We will extend this discussion in the context of the big data analytics benchmark research, and specifically in the context of our Operational Intelligence Value Index, which we will conduct in 2013.

Using analytical best practices across business and IT

Our final theme relates to the use of analytical best practices across business and IT. We’ll be looking at best practices for companies as they evolve into analytics-driven organizations. In this context, we’ll look at exploratory analytics, visual discovery and even plain-English representation of the analytics themselves as approaches in our next-generation business analytics benchmark research, and at how these shape our assessment of BI in our Business Intelligence Value Index. We’ll look at how organizations exploit predictive analytics on big data as a competitive advantage, as found in our predictive analytics benchmark research, within the context of our big data analytics benchmark research. We’ll look at the hot areas of sales and customer analytics, including best practices and their intersection with cloud computing models. And we’ll look at previously untapped areas of analytics that are just now heating up, such as human capital analytics. In our human capital analytics benchmark research, which will begin shortly, we’ll look across the landscape to assess not just analytics associated with core HR but also analytics around talent management and workforce optimization.

I see a high level of innovation and change going on in the business analytics market in 2013. Whenever an industry undergoes such change, high-quality primary research acts as a lighthouse for both customers and suppliers. Companies can capitalize on all of the exciting developments in analytics, business intelligence and related areas of innovation to drive competitive advantage, but only if they understand the changes and their potential value.

I am looking forward to providing a practical perspective on using all forms of business analytics to create value in organizations and to helping our Ventana Research community and clients.

Come read and download the full research agenda.


Tony Cosentino
VP and Research Director
