
In many organizations, advanced analytics groups and IT are separate, and there often is a chasm of understanding between them, as I have noted. A key finding in our benchmark research on big data analytics is that communication and knowledge sharing is a top benefit of big data analytics initiatives, but often it is a latent benefit. That is, prior to deployment, communication and knowledge sharing is deemed a marginal benefit, but once the program is deployed it is deemed a top benefit. From a tactical viewpoint, organizations may not spend enough time defining a common vocabulary for big data analytics prior to starting the program; our research shows that fewer than half of organizations have agreement on the definition of big data analytics. It makes sense therefore that, along with a technical infrastructure and management processes, explicit communication processes at the beginning of a big data analytics program can increase the chance of success. We found these qualities in the Chorus platform of Alpine Data Labs, which received the Ventana Research Technology Innovation Award for Predictive Analytics in September 2014.

Alpine Chorus 5.0, the company’s flagship product, addresses the big data analytics communication challenge by providing a user-friendly platform for multiple roles in an organization to build and collaborate on analytic projects. Chorus helps organizations manage the analytic life cycle from discovery and data preparation through model development and model deployment. It brings together analytics professionals via activity streams for rapid collaboration and workspaces that encourage projects to be managed in a uniform manner. While activity streams enable group communication via short messages and file sharing, workspaces allow each analytic project to be managed separately with capabilities for project summary, tracking and data source mapping. These functions are particularly valuable as organizations embark on multiple analytic initiatives and need to track and share information about models as well as the multitude of data sources feeding the models.

The Alpine platform addresses the challenge of processing big data by parallelizing algorithms to run across big data platforms such as Hadoop and making them accessible to a wide audience of users. The platform supports most analytic databases and all major Hadoop distributions. Alpine was an early adopter of Apache Spark, an open source in-memory data processing framework that one day may replace the original MapReduce processing paradigm of Hadoop. Alpine Data Labs has been certified by Databricks, the primary contributor to the Spark project and the source of 75 percent of the code added in the past year. With Spark, Alpine’s analytic models such as logistic regression run in a fraction of the time previously possible, and new approaches become practical, such as one the company calls Sequoia Forest, a machine learning approach that is a more robust version of random forest analysis. Our big data analytics research shows that predictive analytics is a top priority for about two-thirds (64%) of organizations, but they often lack the skills to deploy a fully customized approach. This is likely a reason that companies now are looking for more packaged approaches to implementing big data analytics (44%) than custom approaches (36%), according to our research. Alpine taps into this trend by delivering advanced analytics directly in Hadoop and the HDFS file system with its in-cluster analytic capabilities that address the complex parallel processing tasks needed to run in distributed environments such as Hadoop.

A key differentiator for Alpine is usability. Its graphical user interface provides a visual analytic workflow experience built on popular algorithms to deliver transformation capabilities and predictive analytics on big data. The platform supports scripts in the R language, which can be cut and pasted into the workflow development studio; custom operators for more advanced users; and Predictive Model Markup Language (PMML), which enables extensible model sharing and scoring across different systems. The complexities of the underlying data stores and databases as well as the orchestration of the analytic workflow are abstracted from the user. Using it, an analyst or statistician does not need to know programming languages or the intricacies of the database technology to build analytic models and workflows.

It will be interesting to see what direction Alpine will take as the big data industry continues to evolve; currently there are many point tools, each strong in a specific area of the analytic process. For many of the analytic tools currently available in the market, co-opetition among vendors prevails, with partner ecosystems competing against stack-oriented approaches. The decisions vendors make in terms of partnering as well as research and development are often a function of these market dynamics, and buyers should be keenly aware of who aligns with whom. For example, Alpine currently partners with Qlik and Tableau for data visualization but also offers its own data visualization tool. Similarly, it offers data transformation capabilities, but its toolbox could be complemented by data preparation and master data solutions. This emerging area of self-service data preparation is important to line-of-business analysts, as my colleague Mark Smith recently discussed.

Alpine Labs is one of many companies that have been gaining traction in the booming analytics market. With a cadre of large clients and venture capital backing of US$23 million in series A and B, Alpine competes in an increasingly crowded and diverse big data analytics market. The management team includes industry veterans Joe Otto and Steve Hillion. Alpine seems to be particularly well suited for customers that have a clear understanding of the challenges of advanced analytics and are committed to using it with big data to gain a competitive advantage. Competitive advantage is the benefit cited most often, by more than two-thirds (68%) of organizations in our predictive analytics benchmark research. A key differentiator for Alpine Labs is the collaboration platform, which helps companies clear the communication hurdle discussed above and address the advanced analytics skills gap at the same time. The collaboration assets embedded into the application and the usability of the visual workflow process enable the product to meet a host of needs in predictive analytics. This platform approach to analytics is often missing in organizations grounded in individual processes and spreadsheet approaches. Companies seeking to use big data with advanced analytics tools should include Alpine Labs in their consideration.


Ventana Research

Qlik was an early pioneer in developing a substantial market for a visual discovery tool that enables end users to easily access and manipulate analytics and data. Its QlikView application uses an associative experience that takes an in-memory, correlation-based approach to present a simpler design and user experience for analytics than previous tools. Driven by sales of QlikView, the company’s revenue has grown to more than $0.5 billion, and although it originated in Sweden, it now has a global presence.

At its annual analyst event in New York the business intelligence and analytics vendor discussed recent product developments, in particular the release of Qlik Sense. It is a drag-and-drop visual analytics tool targeted at business users but scalable enough for enterprise use. Its aim is to give business users a simplified visual analytic experience that takes advantage of modern cloud technologies. Such a user experience is important; our benchmark research into next-generation business intelligence shows that usability is an important buying criterion for nearly two out of three (63%) companies. A couple of months ago, Qlik introduced Qlik Sense for desktop systems, and at the analyst event it announced general availability of the cloud and server editions.

According to our research into business technology innovation, analytics is the top initiative for new technology: 39 percent of organizations ranked it their number-one priority. Analytics includes exploratory and confirmatory approaches to analysis. Ventana Research refers to exploratory analytics as analytic discovery and segments it into four categories that my colleague Mark Smith has articulated. Qlik’s products belong in the analytic discovery category. Users can use the tool to investigate data sets in an intuitive and visual manner, often conducting root cause analysis and decision support functions. This software market is relatively young, and competing companies are evolving and redesigning their products to suit changing tastes. Tableau, one of Qlik’s primary competitors, which I wrote about recently, is adapting its current platform to developments in hardware and in-memory processing, focusing on usability and opening up its APIs. Others have recently made their first moves into the market for visual discovery applications, including Information Builders and MicroStrategy. Companies such as Actuate, IBM, SAP, SAS and Tibco are focused on incorporating more advanced analytics in their discovery tools. For buyers, this competitive and fragmented market creates a challenge when comparing offers in the analytic discovery market.

A key differentiator is Qlik Sense’s new modern architecture, which is designed for cloud-based deployment and embedding in other applications for specialized use. Its analytic engine plugs into a range of Web services. For instance, the Qlik Sense API enables the analytic engine to call to a data set on the fly and allow the application to manipulate data in the context of a business process. An entire table can be delivered to node.js, which extends the JavaScript API to offer server-side features and enables the Qlik Sense engine to take on an almost unlimited number of real-time connections by not blocking input and output. Previously developers could write PHP scripts and pipe SQL to get the data, and the resulting application was viable but complex to build and maintain. Now all they need is JavaScript and HTML. The Qlik Sense architecture abstracts the complexity and allows JavaScript developers to make use of complex constructs without intricate knowledge of the database. The new architecture can decouple the Qlik engine from the visualizations themselves, so Web developers can define expressions and dimensions without going into the complexities of the server-side architecture. Furthermore, by decoupling the services, developers gain access to open source visualization technologies such as d3.js. Cloud-based business intelligence and extensible analytics are becoming a hot topic. I have written about this, including a glimpse of our newly announced benchmark research on the next generation of data and analytics in the cloud. From a business user perspective, these types of architectural changes may not mean much, but for developers, OEMs and UX design teams, they allow much faster time to value through a simpler component-based approach to utilizing the Qlik analytic engine and building visualizations.
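To make the point about non-blocking input and output concrete, here is a minimal sketch. It is not Node.js and not the Qlik Sense API; it uses Python's standard `asyncio` library to illustrate the same general principle, and the function names `handle_request` and `serve` as well as the 0.1-second wait are hypothetical stand-ins for real network or disk delays.

```python
import asyncio
import time


async def handle_request(request_id: int) -> str:
    # Stand-in for waiting on a network or disk operation. While this
    # coroutine waits, the event loop is free to run other requests.
    await asyncio.sleep(0.1)
    return f"response {request_id}"


async def serve(n_requests: int) -> list:
    # Start every request at once and collect the results in order.
    tasks = (handle_request(i) for i in range(n_requests))
    return await asyncio.gather(*tasks)


start = time.perf_counter()
responses = asyncio.run(serve(100))
elapsed = time.perf_counter() - start

print(f"{len(responses)} responses in {elapsed:.2f}s")
```

Because no request holds the single thread while it waits, the 100 simulated requests overlap rather than queue up one after another, which is the property that lets an event-driven server keep many connections open at once.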

The modern architecture of Qlik Sense, together with the company’s ecosystem of more than 1,000 partners and a professional services organization that has completed more than 2,700 consulting engagements, gives Qlik a competitive position. The service partner relationships, including those with major systems integrators, are key to the company’s future since analytics is as much about change management as technology. Our research in analytics consistently shows that people and processes lag technology and information in performance with analytics. Furthermore, in our benchmark research into big data analytics, the benefits most often mentioned as achieved are better communication and knowledge sharing (24%), better management and alignment of business goals (18%), and gaining competitive advantage (17%).

As tested on my desktop, Qlik Sense shows an intuitive interface with drag-and-drop capabilities for building analysis. Formulas are easy to incorporate as new measures, and the palette offers a variety of visualization options which automatically fit to the screen. The integration with QlikView is straightforward in that a data model from QlikView can be saved seamlessly and opened intact in Qlik Sense. The storyboard function allows for multiple visualizations to build into narratives and for annotations to be added including linkages with data. For instance, annotations can be added to specific inflection points in a trend line or outliers that may need explanation. Since the approach is all HTML5-based, the visualizations are ready for deployment to mobile devices and responsive to various screen sizes including newer smartphones, tablets and the new class of so-called phablets. In the evaluation of vendors in our Mobile Business Intelligence Value Index, Qlik ranked fourth overall.

In the software business, of course, technology advances alone don’t guarantee success. Qlik has struggled to clarify the positioning of its next-generation product, which is not a replacement for QlikView. QlikView users are passionate about keeping their existing tool because they have already designed dashboards and calculations using this tool. Vendors should not underestimate user loyalty and adoption. Therefore Qlik now promises to support both products for as long as the market continues to demand them. The majority of R&D investment will go into Qlik Sense as developers focus on surpassing the capabilities of QlikView. For now, the company will follow a bifurcated strategy in which the tools work together to meet needs for various organizational personas. To me, this is the right strategy. There is no issue in being a two-product company, and the revised positioning of Qlik Sense complements QlikView both on the self-service side and the developer side. Qlik Sense is not yet as mature a product as QlikView, but from a business user’s perspective it is a simple and effective analysis tool for exploring data and building different data views. It is simpler because users do not need to script the data in order to create the specific views they deem necessary. As the product matures, I expect it to become more than an end user’s visual analysis tool since the capabilities of Qlik Sense lend themselves to web-scale approaches. Over time, it will be interesting to see how the company harmonizes the two products and how quickly customers will adopt Qlik Sense as a stand-alone tool.

For companies already using QlikView, Qlik Sense is an important addition to the portfolio. It will allow business users to become more engaged in exploring data and sharing ideas. Even for those not using QlikView, with its modern architecture and open approach to analytics, Qlik Sense can help future-proof an organization’s current business intelligence architecture. For those considering Qlik for the first time, the choice may be whether to bring in one or both products. Given the proven approach of QlikView, in the near term a combination approach may be a better solution in some organizations. Partners, content providers and ISVs should consider Qlik Branch, which provides resources for embedding Qlik Sense directly into applications. The site provides developer tools, community efforts such as d3.js integrations and synchronization with Github for sharing and branching of designs. For every class of user, Qlik Sense can be downloaded for free and tested directly on the desktop. Qlik has made significant strides with Qlik Sense, and it is worth a look for anybody interested in the cutting edge of analytics and business intelligence.


Ventana Research

Alteryx has released version 9.0 of Alteryx Analytics, which provides a range of capabilities from data to predictive analytics, in advance of its annual user conference, Inspire 2014. I have covered the company for several years as it has emerged as a key player in providing a range of business analytics from predictive to big data analytics. The importance of this category of analytics is revealed by our latest benchmark research on big data analytics, which finds that predictive analytics is the most important type of big data analytics, ranked first by nearly half (47%) of research participants. Version 9.0 includes new capabilities and integration with a range of new information sources, including read and write capability to IBM SPSS and SAS for a range of analytic needs.

After attending Inspire 2013 last year, I wrote about capabilities that are enabling an emerging business role, which Alteryx calls the data artisan. The label refers to analysts who combine both art and science in using analytics to help direct business outcomes. Alteryx uses an innovative and intuitive approach to analytic tasks, using workflow and linking various data sources through in-memory computation and processing. It takes a “no code” drag and drop approach to integrate data from files and databases, prepare data for analysis, and build and score predictive models to yield relevant results. Other vendors in the advanced analytics market are also applying this approach, but few mature tools are currently available. The output of the Alteryx analytic processes can be shared automatically in numerous data formats including direct export into visualization tools such as those from Qlik (new support) and Tableau. This can help users improve their predictive analytics capabilities and take action on the outcomes of analytics, which are the two capabilities most often cited in our research as needed to improve big data analytics.

Alteryx now works with Revolution Analytics to increase the scalability of its system to work with large data sets. The open source language R continues to gain popularity and is being embedded in many business intelligence tools, but it runs only on data that can be loaded into memory. Running only in memory does not address analytics on datasets that run into terabytes and hundreds of millions of values, and potentially requires use of a sub-sampling approach to advanced analytics. With its RevoScaleR, Revolution Analytics rewrites parts of the R algorithms so that the processing tasks can be parallelized and run in big data architectures such as Hadoop. Such capability is important for analytic problems including recommendation engines, unsupervised anomaly detection, some classification and regression problems, and some clustering problems. These analytic techniques are appropriate for some of the top business uses of big data analytics, which according to our research are cross-selling and up-selling (important to 38%), better understanding of individual customers (32%), analyzing all data rather than a sample (30%) and price optimization (28%). Alteryx Analytics automatically detects whether to use RevoScaleR or open source R algorithms. This approach simplifies the technical complexities of scaling R by providing a layer of abstraction for the analytic professional.
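The idea of rewriting an algorithm so that it works on pieces of the data rather than one in-memory data set can be sketched briefly. The example below is not RevoScaleR's implementation; it is a minimal Python illustration, assuming a least-squares fit with a single predictor, of how summaries can be computed independently for each chunk (and therefore in parallel) and then merged. The names `chunk_stats`, `merge` and `fit` are hypothetical.

```python
from functools import reduce
from typing import Iterable, Tuple

# Running totals for one chunk: (n, sum_x, sum_y, sum_xx, sum_xy)
Stats = Tuple[float, float, float, float, float]


def chunk_stats(rows: Iterable[Tuple[float, float]]) -> Stats:
    """Summarize one chunk; only the five totals are kept in memory."""
    n = sx = sy = sxx = sxy = 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    return (n, sx, sy, sxx, sxy)


def merge(a: Stats, b: Stats) -> Stats:
    """Combine two chunk summaries by adding the totals element-wise."""
    return tuple(p + q for p, q in zip(a, b))


def fit(stats: Stats) -> Tuple[float, float]:
    """Closed-form least-squares slope and intercept from the totals."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Two chunks of (x, y) pairs drawn from the line y = 2x + 1.
chunks = [[(1.0, 3.0), (2.0, 5.0)], [(3.0, 7.0), (4.0, 9.0)]]
slope, intercept = fit(reduce(merge, map(chunk_stats, chunks)))
print(slope, intercept)
```

Because `merge` only adds totals, the chunks can be summarized in any order or on different machines, and no step needs the full data set in memory.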

Scoring (the ability to input a data record and receive the probability of a particular outcome) is an important if not well understood aspect of predictive analytics. Our research shows that companies that score models on a timely basis according to their needs get better organizational results than those that score all models the same way. Working with Revolution Analytics, Alteryx has enhanced scoring scalability for R algorithms with new capabilities that chunk data in a parallelized fashion. This approach bypasses the memory-only approach to enable a theoretically unlimited number of scores to be processed. For large-scale implementations and consumer applications in industries such as retail, an important target market for Alteryx, these capabilities are becoming important.
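As a rough illustration of chunked scoring, the sketch below streams records through a toy logistic model a few at a time, so that the number of records scored is not limited by memory. It is not the Alteryx or Revolution Analytics implementation; the coefficients, the `visits` field and the function names are hypothetical, and the chunks are processed one after another here rather than in parallel.

```python
import math
from itertools import islice
from typing import Dict, Iterable, Iterator, List

# Hypothetical coefficients for a one-variable logistic model.
INTERCEPT = -1.0
VISITS_COEF = 0.5


def score(record: Dict[str, float]) -> float:
    """Return the predicted probability of the outcome for one record."""
    z = INTERCEPT + VISITS_COEF * record["visits"]
    return 1.0 / (1.0 + math.exp(-z))


def chunked(records: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def score_stream(records: Iterable[Dict[str, float]], size: int = 2) -> Iterator[float]:
    """Score records chunk by chunk; only one chunk is held at a time."""
    for chunk in chunked(records, size):
        for record in chunk:
            yield score(record)


# A generator stands in for a data source too large to load at once.
records = ({"visits": float(v)} for v in range(5))
scores = list(score_stream(records, size=2))
print(scores)
```

In a parallel setting each chunk could be handed to a separate worker, since scoring one record does not depend on any other.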

Alteryx 9.0 also improves on open source R’s default approach to scoring, which is “all or nothing.” That is, if data is missing (a null value) or a new level for a categorical variable is not included in the original model, R will not score the model until the issue is addressed. This process is a particular problem for analysts who want to score data in small batches or individually. In contrast, Alteryx’s new “best effort” approach scores the records that can be run without incident, and those that cannot be run are returned with an error message. This adjustment is particularly important as companies start to deploy predictive analytics into areas such as call centers or within Web applications such as automatic quotes for insurance.
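The difference between “all or nothing” and “best effort” scoring can be shown with a small sketch. This is an illustration of the general pattern only, not Alteryx's code; the model coefficients, the field names `age` and `region`, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical logistic model: one numeric and one categorical input.
INTERCEPT = -0.5
AGE_COEF = 0.02
REGION_COEF = {"north": 0.3, "south": -0.2}  # levels seen in training


def score(record: Dict) -> float:
    """Score one record; raise ValueError if it cannot be scored."""
    age = record.get("age")
    if age is None:
        raise ValueError("missing value for 'age'")
    region = record.get("region")
    if region not in REGION_COEF:
        raise ValueError(f"unseen level for 'region': {region!r}")
    z = INTERCEPT + AGE_COEF * age + REGION_COEF[region]
    return 1.0 / (1.0 + math.exp(-z))


def score_best_effort(records: List[Dict]) -> List[Tuple[Optional[float], Optional[str]]]:
    """Return (score, None) for good records and (None, error) for bad ones."""
    results = []
    for record in records:
        try:
            results.append((score(record), None))
        except ValueError as exc:
            results.append((None, str(exc)))
    return results


batch = [
    {"age": 40, "region": "north"},   # can be scored
    {"age": None, "region": "south"}, # null value
    {"age": 30, "region": "west"},    # level not in the original model
]
results = score_best_effort(batch)
for row in results:
    print(row)
```

An all-or-nothing scorer would stop at the second record; here the first record still receives a score and the other two come back with an explanation of why they were skipped.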

Alteryx 9.0 also has new predictive modeling tools and functionality. A spline model helps address regression and classification problems such as data reduction and nonlinear relationships and their interactions. It uses a clear-box approach to serve users with differing objectives and skill levels. The approach exposes the underpinnings of the model so that advanced users can modify a model, but at the same time less sophisticated users can use the model without necessarily understanding all of the intricacies of the model itself. Other capabilities include a Gamma regression tool that allows data matching to model the Gamma family of distributions using the generalized linear modeling (GLM) framework. Heat plot tools for visualizing joint probability distributions, such as between customer income level and customer advocacy, and more robust A/B testing tools, which are particularly important in digital marketing analytics, are also part of the release.

At the same time, Alteryx has expanded its base of information sources. According to our research, working with all sources of data, not just one, is the most common definition for big data analytics, as stated by three-quarters (76%) of organizations. While structured data from transaction systems and so-called systems of record is still the most important, new data sources including those coming from external sources are becoming important. Our research shows that the most widely used external data sources are cloud applications (54%) and social media data (46%); five additional data sources, including Internet, consumer, market and government sources, are virtually tied in third position (with 39% to 42% each). Alteryx will need to be mindful of best practices in big data analytics as I have outlined to ensure it can stay on top of a growing set of requirements to blend big data but also apply a range of advanced analytics.

New connectors to the social media data provider Gnip give access to social media websites through a single API, and a DataSift connector helps make social media more accessible and easier to analyze for any business need. Other new connectors in 9.0 include those for Foursquare, Google Analytics, Marketo, and Twitter. New data warehouse connectors include those for Amazon Redshift, HP Vertica, Microsoft SQL Server and Pivotal Greenplum. Access to SPSS and SAS data files also is introduced in this version; Alteryx hopes to break down the barriers to entry in accounts dominated by these advanced analytic stalwarts. With already existing connectors to major cloud and on-premises data sources, the company provides a robust integration platform for analytics.

Alteryx is on a solid growth curve as evidenced by the increasing number of inquiries and my conversations with company executives. It’s not surprising given the disruptive potential of the technology itself and its unique analytic workflow technology for data blending and advanced analytics. The data blending and workflow technology that Alteryx provides is not highlighted enough: it is one of the largest differentiators of its software, and it reduces the data-related tasks, such as preparing (47%) and reviewing (43%) data, that our customer analytics research finds get in the way of analysts performing analytics. Additionally, Alteryx’s ability to apply location analytics within its product is a key differentiator that our research found delivers exponentially more value from analytics than viewing traditional visualizations and tables of data alone. Location analytics such as Alteryx provides also helps rapidly identify areas where customer experience and satisfaction can be improved, which is the top benefit found in our research. The flexible platform resonates particularly well with line-of-business users, especially in fast-moving, lightly regulated industries such as travel, retail and consumer goods where speed of analytics is critical. The work the company is doing with Revolution Analytics and the ability to scale are important for advanced analytics that operate on big data. The ability to seamlessly connect and blend information sources is a critical capability for Alteryx, and it’s a wise move to invest further in this area, but Alteryx will need to examine where collaborative technology could be used to help business users work together on analytics within the software. Alteryx will need to continue to adapt to the market demand for analytics and keep focused on varying line-of-business areas so it can continue its growth.
Just about any company involved in analytics today should evaluate Alteryx and see how its distinctive approach can streamline analytics.


Tony Cosentino

VP and Research Director

Organizations should consider multiple aspects of deploying big data analytics. These include the type of analytics to be deployed, how the analytics will be deployed technologically and who must be involved both internally and externally to enable success. Our recent big data analytics benchmark research assesses each of these areas. How an organization views these deployment considerations may depend on the expected benefits of the big data analytics program and the particular business case to be made, which I discussed recently.

According to the research, the most important capability of big data analytics is predictive analytics (64%), but among companies that have deployed big data analytics, descriptive analytic approaches of query and reporting (74%) and data discovery (64%) are more readily available than predictive capabilities (57%). Such statistics may be a function of big data technologies such as Hadoop and its associated distributions having prioritized the ability to run descriptive statistics through standard SQL, which is the most common method for implementing analysis on Hadoop. Cloudera’s Impala, Hortonworks’ Stinger (an extension of Apache Hive), MapR’s Drill, IBM’s Big SQL, Pivotal’s HAWQ and Facebook’s open-source contribution of Presto SQL all focus on accessing data through an SQL paradigm. It is not surprising then that the technology research participants use most for big data analytics is business intelligence (75%) and that the most-used analytic methods, namely pivot tables (46%), classification (39%) and clustering (37%), are descriptive and exploratory in nature. Similarly, participants said that visualization of big data allows analysts to perform faster analysis (49%), understand context better (48%), perform root-cause analysis (40%) and display multiple result sets (40%), but visualization does not provide more advanced analytic capabilities. While various vendors now offer approaches to run advanced analytics on big data, the research shows that in terms of big data, organizational capabilities still revolve around more basic analytic access.

For companies that are implementing advanced analytic capabilities on big data, there are further analytic process considerations, and many have not yet tackled those. Model building and model deployment should be manageable and timely, involve specialized personnel, and integrate into the broader enterprise architecture. While our research provides an in-depth look at adoption of the different types of in-database analytics, deployment of advanced analytic sandboxes, data mining, model management, integration with business processes and overall model deployment, that is beyond the topic here.

Beyond analytic considerations, a host of technological decisions must be made around big data analytics initiatives. One of these is the degree of customization necessary. As technology advances, customization is giving way to more packaged approaches to big data analytics. According to our research, the majority (54%) of companies that have already implemented big data analytics did custom builds using big data-specific languages and interfaces. Most of those that have not yet deployed are likely to purchase a dedicated or packaged application (44%), followed by a custom build (36%). We think that this pre- and post-deployment comparison reflects a maturing market.

The move from custom approaches to standardized ones has important implications for the skill sets needed for a big data analytics initiative. In comparing the skills that organizations said they currently have to the skills they need to be successful with big data analytics, it is clear that companies should spend more time building employees’ statistical, mathematical and visualization skills. On the flip side, organizations should make sure their tools can support skill sets that they already have, such as use of spreadsheets and SQL. This is convergent with other findings about training needs, which include applying analytics to business problems (54%), training on big data analytics tools (53%), analytic concepts and techniques (46%) and visualizing big data (41%). The data shows that as approaches become more standardized and the market focus shifts toward them from customized implementations, skill needs are shifting as well. This is not to say that demand is moving away from the data scientist completely. According to our research, organizations that involve cross-functional teams or data scientists in the deployment process are realizing the most significant impact. It is clear that multiple approaches for personnel, departments and current vendors play a role in deployments and that some approaches will be more effective than others.

Cloud computing is another key consideration with respect to deploying analytics systems as well as sandbox modelling and testing environments. For deployment of big data analytics, 27 percent of companies currently use a cloud-based method, while 58 percent said they do not and 16 percent do not know what is used. Not surprisingly, far fewer IT professionals (19%) than business users (40%) said they use cloud-based deployments for big data analytics. The flexibility and capability that cloud resources provide is particularly attractive for sandbox environments and for organizations that lack big data analytic expertise. However, for big data model building, most organizations (42%) still utilize a dedicated internal sandbox environment to build models while fewer (19%) use a non-dedicated internal sandbox (that is, a container in a data warehouse used to build models) and others use a cloud-based sandbox either as a completely separate physical environment (9%) or as a hybrid approach (9%). From this last data we infer that business users are sometimes using cloud-based systems to do big data analytics without the knowledge of IT staff. Among organizations that are not using cloud-based systems for big data analytics, security (45%) is the primary reason that they do not.

Perhaps the most important consideration for big data analytics is choosing vendors to partner with to achieve organizational objectives. When we understand the move from custom technological approaches to more packaged ones and the types of analytics currently being implemented for big data, it is not surprising that a majority of research participants (52%) are looking to their business intelligence systems providers to supply their big data analytics solution. However, a significant number of companies (35%) said they will turn to a specialist analytics provider or their database provider (34%). When evaluating big data analytics, usability is the most important vendor consideration but not by as wide a margin as in categories such as business intelligence. A look at criteria rated important and very important by research participants reveals usability is the highest ranked (94%), but functionality (92%) and reliability (90%) follow closely. Among innovative new technologies, collaboration is important (78%) while mobile access (46%) is much less so. Coupled with the finding that communication and knowledge sharing combined is an important benefit of big data analytics, it is clear that organizations are cognizant of the collaborative imperative when choosing a big data analytics product.

Deployment of big data analytics starts with forethought and a well-defined business case that includes the expected benefits I discussed in my previous analysis. Once the outcome-driven framework is established, organizations should consider the types of analytics needed, the enabling technologies and the people and processes necessary for implementation. To learn more about our big data analytics research, download a copy of the executive summary here.


Tony Cosentino

VP & Research Director

We recently released our benchmark research on big data analytics, and it sheds light on many of the most important discussions occurring in business technology today. The study’s structure was based on the big data analytics framework that I laid out last year as well as the framework that my colleague Mark Smith put forth on the four types of discovery technology available. These frameworks view big data and analytics as part of a major change that includes a movement from designed data to organic data, the bringing together of analytics and data in a single system, and a corresponding move away from the technology-oriented three Vs of big data to the business-oriented three Ws of data. Our big data analytics research confirms these trends but also reveals some important subtleties and new findings with respect to this important emerging market. I want to share three of the most interesting and even surprising results and their implications for the big data analytics market.

First, we note that communication and knowledge sharing is a primary benefit of big data analytics initiatives, but it is a latent one. Among organizations planning to deploy big data analytics, the benefits most often anticipated are faster response to opportunities and threats (57%), improving efficiency (57%), improving the customer experience (48%) and gaining competitive advantage (43%). However, once a big data analytics system has moved into production, the benefits most often mentioned as achieved are better communication and knowledge sharing (51%), gaining competitive advantage (51%), improved efficiency in business processes (49%) and improved customer experience and satisfaction (46%). (The chart shows rankings of first choices as most important.) Although the last three of these benefits are predictable, it’s noteworthy that the benefit of communication and knowledge sharing, while not a priority before deployment, becomes one of the two most often cited later.

As for the implications, in our view, one reason why communication and knowledge sharing are more often seen as a key benefit after deployment rather than before is that agreement on big data analytics terminology is often lacking within organizations. Participants from fewer than half (44%) of organizations said that the people making business technology decisions mostly agree or completely agree on the meaning of big data analytics, while the same number said there are many different opinions about its meaning. To address this particular challenge, companies should pay more attention to setting up internal communication structures prior to the launch of a big data analytics project, and we expect collaborative technologies to play a larger role in these initiatives going forward.

A second finding of our research is that integration of distributed data is the most important enabler of big data analytics. Asked the meaning of big data analytics in terms of capabilities, the largest percentage (76%) of participants said it involves analyzing data from all sources rather than just one, while for 55 percent it means analyzing all of the data rather than just a sample of it. (We allowed multiple responses.) More than half (56%) told us they view big data as finding patterns in large and diverse data sets in Hadoop, which indicates the continuing influence of this original big data technology. A second tier of percentages emphasizes timeliness as an aspect of big data: doing real-time processing on streams of data (44%), visualizing large structured data sets in seconds (40%) and doing real-time scoring against a database record (36%).

The implication here is that the primary characteristic of big data analytics technology is the ability to analyze data from many data sources. This shows that companies today are focused first on bringing together multiple information sources and secondarily on being able to process all data rather than just a sample, as well as being able to do machine learning on especially large data sets. Fast processing and the ability to analyze streams of data are relegated to third position in these priorities. That suggests that the so-called three Vs of big data are confusing the discussion by prioritizing volume, velocity and variety all at once. For companies engaged in big data analytics today, sourcing and integration of various data sources in an expedient manner is the top priority, followed by the ideas of size and then speed of arrival of data.

Third, we found that usage is not limited to particular industries, certain types of companies or certain functional areas. Among the 25 uses for big data analytics that participants are personally involved with, three of the four most often mentioned involve customers and sales: enabling cross-selling and up-selling (38%), understanding the customer better (32%) and optimizing pricing (28%). Meanwhile, optimizing IT operations ranked fifth (24%) though it was most often chosen by those in IT roles (76%). What is particularly fascinating, however, is that 17 of the 25 use cases were named by more than 10 percent, which indicates many uses for big data analytics.

The primary implication of this finding is that big data analytics is not following the famous technology adoption curves outlined in books such as Geoffrey Moore’s seminal work, “Crossing the Chasm.” That is, companies are not following a narrowly defined path that solves only one particular problem. Instead, they are creatively deploying technological innovations en route to a diverse set of outcomes. And this is occurring across organizational functions and industries, including conservative ones, which conflicts with conventional wisdom. For this reason, companies are more often looking across industries and functional disciplines as part of their due diligence on big data analytics to come up with unique applications that may yield competitive advantage or organizational efficiencies.

In summary, it has been difficult for companies to define what big data analytics actually means and how to prioritize their investments accordingly. Research such as ours can help organizations address this issue. While the above discussion outlines a few of the interesting findings of this research, it also yields many more insights related to aspects as diverse as big data in the cloud, sandbox environments, embedded predictive analytics, the most important data sources in use, and the challenges of choosing an architecture and deploying big data analytic products. To get a copy of the executive summary, download it directly from the Ventana Research community.


Ventana Research

As a new generation of business professionals embraces a new generation of technology, the line between people and their tools begins to blur. This shift comes as organizations become flatter and leaner and roles, context and responsibilities become intertwined. These changes have introduced faster and easier ways to bring information to users, in a context that makes it quicker to collaborate, assess and act. Today we see this in the prominent buying patterns for business intelligence and analytics software and an increased focus on the user experience. Almost two-thirds (63%) of participants in our benchmark research on next-generation business intelligence say that usability is the top purchase consideration for business intelligence software. In fact, usability is the driving factor in evaluating and selecting technology across all application and technology areas, according to our benchmark research.

In selecting and using technology, personas (that is, an idealized cohort of users) are particularly important, as they help business and IT assess where software will be used in the organization and define the role, responsibility and competency of users and the context of what they need and why. At the same time, personas help software companies understand the attitudinal, behavioral and demographic profile of target individuals and the specific experience that is not just attractive but essential to those users. For example, the mobile and collaborative intelligence capabilities needed by a field executive logging in from a tablet at a customer meeting are quite different from the analytic capabilities needed by an analyst trying to understand the causes of high customer churn rates and how to change that trend with a targeted marketing campaign.

Understanding this context-driven user experience is the first step toward defining the personas found in today’s range of analytics users. The key is to make the personas simple to understand but comprehensive enough to cover the diversity of needs for business analytic types within the organization. To help organizations be more effective in their analytic process and engagement of their resources and time, we recommend the following five analytical personas: (Note that in my years of segmentation work, I’ve found that the most important aspects are the number of segments and the names of those segments. To this end, I have chosen a simple number, five, and the most intuitive names I could find to represent each persona.)

Information Consumer: This persona is not technologically savvy and may even feel intimidated by technology. Information must be provided in a user-friendly fashion to minimize frustration. These users may rely on one or two tools that they use just well enough to do their jobs, which typically involves consuming information in presentations, reports, dashboards or other forms that are easy to read and interpret. They are oriented more to language than to numbers and in most cases would rather read or listen to information about the business. They can write a pertinent memo or email, make a convincing sales pitch or devise a brilliant strategy. Their typical role within the organization varies, but among this group is the high-ranking executive, including the CEO, for whom information is prepared. In the lines of business, this consumer may be a call center agent, a sales manager or a field service worker. In fact, in many companies, the information consumer makes up the majority of the organization. The information consumer usually can read Excel and PowerPoint documents but rarely works within them. This persona feels empowered by consumer-grade applications such as Google, Yelp and Facebook.

Knowledge Worker: Knowledge workers are business, technologically and data savvy and have domain knowledge. They interpret data in functional ways. These workers understand descriptive data but are not likely to take on data integration tasks or interpret advanced statistics (as in a regression analysis). In terms of tools, they can make sense of spreadsheets and with minimal training use the output of tools like business intelligence systems, pivot tables and visual discovery tools. They also actively participate in providing feedback and input to planning and business performance software. Typically, these individuals are over their heads when they are asked to develop a pivot table or structure multi-dimensional data. In some instances, however, new discovery tools allow them to move beyond such limitations. The knowledge worker persona includes but is not limited to technology savvy executives, line of business managers to directors, domain experts and operations managers. Since these workers focus on decision-making and business outcomes, analytics is an important part of their overall workflow but targeted at specific tasks. For analytical tools this role may use applications with embedded analytics, analytic discovery and modeling approaches. Visual discovery tools and in many instances user friendly SaaS applications are empowering the knowledge worker to be more analytically driven without IT involvement.

Analyst: Well versed in data, this persona often knows business intelligence and analytics tools that pertain to the position and applies analytics to analyze various aspects of the business. These users are familiar with applications and systems and know how to retrieve and assemble data from them in many forms. They can also perform a range of data blending and data preparation tasks, and create dashboards and data visualizations along with pivot tables with minimal or no training. They can interpret many types of data, including correlation and in some cases regression. The analyst’s role involves modeling and analytics either within specific analytic software or within software used for business planning and enterprise performance management. More senior analysts focus on more advanced analytics, such as predictive analytics and data mining, to understand current patterns in data and predict future outcomes. These analysts might be called a split persona in terms of where their skills and roles are deployed in the organization. They may reside in IT, but many more are found on the business side, as they are accountable for the business outcomes of their analytics. Analysts on the business side may not be expert in SQL or computer programming but may be adept with languages such as R or SAS. Those on the IT side are more familiar with SQL and the building of data models used in databases. With respect to data preparation, the IT organization looks at integration through the lens of ETL and associated tool sets, whereas the business side looks at it from a data-merge perspective and the creation of analytical data sets in places like spreadsheets.

The roles that represent this persona often are explicitly called analysts with a prefix that in most cases is representative of the department they work from, such as finance, marketing, sales or operations but could have prefixes like corporate, customer, operational or other cross-departmental responsibilities. The analytical tools they use almost always include the spreadsheet, as well as complementary business intelligence tools and a range of analytical tools like visual discovery and in some cases more advanced predictive analytics and statistical software. Visual discovery and commodity modeling approaches are empowering some analyst workers to move upstream from a role of data monger to a more interpretive decision support position. For those already familiar with advanced modeling, today’s big data environments, including new sources of information and modern technology, are providing the ability to build much more robust models and solve an entirely new class of business problems.

Publisher: Skilled in data and analytics, the publisher typically knows how to configure and operate business intelligence tools and publish information from them in dashboards or reports. They are typically skilled in the basics of spreadsheets and publishing information to Microsoft Word or PowerPoint tools. These users not only can interpret many types of analytics but can also build and validate the data for their organizations. Similar to the analyst, the publisher may be considered a split persona, as these individuals may be in a business unit or IT. The IT-based publisher is more familiar with the business intelligence processes and knows the data sources and how to get to data from the data warehouse or even big data sources. They may have basic configuration and scripting skills that enable them to produce outputs in several ways. They may also have basic SQL and relational data modeling skills that help them identify what can be published and determine how data can be combined through the BI tool or databases. The titles related to publisher may include business intelligence manager, data analyst, or manager or director of data or information management. The common tools used by the publisher include business intelligence authoring tools, various visualization and analytic tools, and office productivity tools like Microsoft Office and Adobe Acrobat.

Data Geek: The data geek, who may be a data analyst or someone as sophisticated as a data scientist, has expert data management skills and an interdisciplinary approach to data that melds the split personas discussed at the analyst and senior analyst levels. The primary difference between the data geek and the analyst is that the latter usually focuses on either the IT side or the business side. A senior analyst with a Ph.D. in computer science understands relational data models and programming languages but may not understand advanced statistical models and statistical programming languages. Similarly, a Ph.D. in statistics understands advanced predictive models and associated tools but may not be prepared to write computer code. The data scientist not only understands both advanced statistics and modeling but also has domain knowledge and knows enough about computer programming and systems. The titles for this role vary but include chief analytics officer, enterprise data architect, data analyst, head of information science and even data scientist.

To align analytics and the associated software to individuals in the organization, businesses should use personas to best identify who needs what set of capabilities to be effective. Organizations should also assess competency levels in their personas to avoid adopting software that is too complicated or difficult to use. In some cases you will have individuals who can fill multiple personas. Instead of wasting time, resources and financial capital, look to define what is needed and where training is needed to ensure business and IT work collaboratively in business analytics. While some business analytics software is getting easier to use, many of the offerings remain difficult to use because they are still being designed for IT or more sophisticated analysts. While these individuals are an important group, they represent only a small portion of the users who need analytic business tools.

The next generation of business intelligence and business analytics will in part address the need to more easily consume information through smartphones and tablets but will not overcome one of the biggest barriers to big data analytics: the skills gap. Our benchmark research on big data shows staffing (79%) and training (77%) are the two biggest challenges organizations face in efforts to take advantage of big data through analytics. In addition, a language barrier still exists in some organizations, where IT speaks in terms of total cost of ownership (TCO) and efficiency while the business speaks in terms of effectiveness and outcomes or time to value, which I have written about previously. While all of these goals are important, the organization needs to cater to the metrics that are needed by its various personas. Such understanding starts with better defining the different personas and building more effective communication among the groups to ensure that they work together more collaboratively to achieve their respective goals and get the most value from business analytics.


Tony Cosentino

VP and Research Director

Users of big data analytics are finally going public. At the Hadoop Summit last June, many vendors were still speaking of a large retailer or a big bank as users but could not publicly disclose their partnerships. Companies experimenting with big data analytics felt that their proof of concept was so innovative that once it moved into production, it would yield a competitive advantage to the early mover. Now many companies are speaking openly about what they have been up to in their business laboratories. I look forward to attending the 2013 Hadoop Summit in San Jose to see how much things have changed in just a single year for Hadoop-centered big data analytics.

Our benchmark research into operational intelligence, which I argue is another name for real-time big data analytics, shows diversity in big data analytics use cases by industry. The goals of operational intelligence are an interesting mix as the research shows relative parity among managing performance (59%), detecting fraud and security (59%), complying with regulations (58%) and managing risk (58%), but when we drill down into different industries there are some interesting nuances. For instance, healthcare and banking are driven much more by risk and regulatory compliance, services such as retail are driven more by performance, and manufacturing is driven more by cost reduction. All of these make sense given the nature of the businesses. Let’s look at them in more detail.

The retail industry, driven by market forces and facing discontinuous change, is adopting big data analytics out of competitive necessity. The discontinuity comes in the form of online shopping and the need for traditional retailers to supplement their brick-and-mortar locations. JCPenney and Macy’s provide a sharp contrast in how two retailers approached this challenge. A few years ago, the two companies eyed a similar competitive space, but since that time, Macy’s has implemented systems based on big data analytics and is now sourcing locally for online transactions and can optimize pricing of its more than 70 million SKUs in just one hour using SAS High Performance Analytics. The Macy’s approach has, in Sun Tzu-like fashion, turned the “showroom floor” disadvantage into a customer experience advantage. JCPenney, on the other hand, used gut-feel management decisions based on classic brand merchandising strategies and ended up alienating its customers, generating lawsuits and issuing a well-publicized apology to its customers. Other companies including Sears are doing similarly innovative work with suppliers such as Teradata and innovative startups like Datameer in data hub architectures built around Hadoop.

Healthcare is another interesting market for big data, but the dynamics that drive it are less about market forces and more about government intervention and compliance issues. Laws around HIPAA, the recent Affordable Care Act, ICD-10 and the HITECH Act of 2009 all have implications for how these organizations implement technology and analytics. Our recent benchmark research on governance, risk and compliance indicates that many companies have significant concerns about compliance issues: 53 percent of participants said they are concerned about them, and 42 percent said they are very concerned. Electronic health records (EHRs) are moving these organizations to more patient-centric systems, and one goal of the Affordable Care Act is to use technology to produce better outcomes through what it calls meaningful use standards. Facing this tidal wave of change, companies including IBM analyze historical patterns and link them with real-time monitoring, helping hospitals save the lives of at-risk babies. This use case was made into a now-famous commercial by advertising firm Ogilvy about the so-called data babies. IBM has also shown how cognitive question-and-answer systems such as Watson assist doctors in diagnosis and treatment of patients.

Data blending, the ability to mash together different data sources without having to manipulate the underlying data models, is another analytical technique gaining significant traction. Kaiser Permanente is able to use tools from Alteryx, which I have assessed, to consolidate diverse data sources, including unstructured data, to streamline operations to improve customer service. The two organizations made a joint presentation similar to the one here at Alteryx’s user conference in March.
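To make the idea of data blending concrete, here is a minimal sketch in Python. It combines two differently shaped sources on a shared key at analysis time and leaves both sources untouched. The source names, fields and values are hypothetical, chosen only for illustration; this is not how Alteryx or Kaiser Permanente implement blending.

```python
# Hypothetical source 1: structured claim records (several rows per member).
claims = [
    {"member_id": 101, "claim_amount": 250.0},
    {"member_id": 102, "claim_amount": 80.0},
    {"member_id": 101, "claim_amount": 120.0},
]

# Hypothetical source 2: free-text call center notes keyed by member.
call_notes = {
    101: "Asked about billing",
    103: "Requested new card",
}


def blend(claims, call_notes):
    """Build one combined view per member without modifying either source."""
    blended = {}
    # Aggregate claim amounts per member.
    for row in claims:
        rec = blended.setdefault(
            row["member_id"], {"total_claims": 0.0, "note": None}
        )
        rec["total_claims"] += row["claim_amount"]
    # Attach notes, keeping members who appear only in the notes source.
    for member_id, note in call_notes.items():
        rec = blended.setdefault(
            member_id, {"total_claims": 0.0, "note": None}
        )
        rec["note"] = note
    return blended


result = blend(claims, call_notes)
```

The combined view keeps members that appear in only one source, which is usually the point of blending: the analyst sees the gaps as well as the overlaps, and neither underlying data model has to change.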

Financial services, which my colleague Robert Kugel covers, is being driven by a combination of regulatory forces and competitive market forces on the sales end. Regulations produce a lag in the adoption of certain big data technologies, such as cloud computing, but areas such as fraud and risk management are being revolutionized by the ability, provided through in-memory systems, to look at every transaction rather than only a sampling of transactions through traditional audit processes. Furthermore, the ability to pair advanced analytical algorithms with in-memory real-time rules engines helps detect fraud as it occurs, and thus criminal activity may be stopped at the point of transaction. On a broader scale, new risk management frameworks are becoming the strategic and operational backbone for decision-making in financial services.

On the retail banking side, copious amounts of historical customer data from multiple banking channels combined with government data and social media data are providing banks the opportunity to do microsegmentation and create unprecedented customer intimacy. Big data approaches to micro-targeting and pricing algorithms, which Rob recently discussed in his blog on Nomis, enable banks and retailers alike to target individuals and customize pricing based on an individual’s propensity to act. While partnerships in the financial services arena are still held close to the vest, the universal financial services providers (Bank of America, Citigroup, JPMorgan Chase and Wells Fargo) are making considerable investments in all of the above-mentioned areas of big data analytics.

Industries other than retail, healthcare and banking are also seeing tangible value in big data analytics. Governments are using it to provide proactive monitoring and responses to catastrophic events. Product and design companies are leveraging big data analytics for everything from advertising attribution to crowdsourcing of new product innovation. Manufacturers are preventing downtime by studying interactions within systems and predicting machine failures before they occur. Airlines are recalibrating their flight routing systems in real time to avoid bad weather. From hospitality to telecommunications to entertainment and gaming, companies are publicizing their big data-related success stories.

Our research shows that until now, big data analytics has primarily been the domain of larger, digitally advanced enterprises. However, as use cases make their way through business and their tangible value is accepted, I anticipate that the activity around big data analytics will increase with companies that reside in the small and midsize business market. At this point, just about any company that is not considering how big data analytics may impact its business faces an unknown and uneasy future. What a difference a year makes, indeed.


Tony Cosentino

VP and Research Director

Responding to the trend that businesses now ask less sophisticated users to perform analysis and rely on software to help them, Oracle recently announced a new release of its flagship Oracle BI Foundation Suite (OBIFS) as well as updates to Endeca, the discovery platform that Oracle bought in 2011. Endeca is part of a new class of tools that bring new capabilities in information discovery, self-service access and interactivity. Such approaches represent an important part of the evolution of business intelligence to business analytics, as I have noted in my agenda for 2013.

Oracle Business Intelligence Foundation Suite includes many components, including but not limited to Oracle Business Intelligence Enterprise Edition (OBIEE), Oracle Essbase and a scorecard and strategy application. OBIEE is the enabling foundation that federates queries across data sources and enables reporting across multiple platforms. Oracle Essbase is an in-memory OLAP tool that enables forecasting and planning, including what-if scenarios embedded in a range of Oracle BI Applications, which are sold separately. The suite, along with the Endeca software, is integrated with Exalytics, Oracle’s appliance for BI and analytics. Oracle’s appliance strategy, which I wrote about after Oracle World last year, invests heavily in the Sun Microsystems hardware acquired in 2010.

These updates are far-ranging and numerous (including more than 200 changes to the software). I’d like to point out some important pieces that advance Oracle’s position in the BI market. A visualization recommendations engine offers guidance on the type of visualization that may be appropriate for a user’s particular data. This feature, already sold by others in the market, may be considered a subset of the broader capability of guided analysis. Advanced visualization techniques have become more important for companies because they make it easier for users to understand data, and they are critical for Oracle to compete with the likes of Tableau, a player in this space that I wrote about last year.

Another user-focused update related to visualization is performance tiles, which enable important KPIs to be displayed prominently within the context of the screen surface area. Performance tiles are a great way to start improving the static dashboards that my colleague Mark Smith has critiqued. From what I have seen it is unclear to what degree the business user can define and change Oracle’s performance tile KPIs (for example, the red-flagged metrics assigned to the particular business user that appear within the scorecard function of the software) and how much the system can provide in a prescriptive analytic fashion. Other visualizations that have been added include waterfall charts, which enable dependency analysis; these are especially helpful for pricing analysis by showing users how changes in one dimension impact pricing on the whole. Another is MapViews for manipulation and design to support location analytics; our next-generation BI research finds that the capability to deploy geographic maps is most important to BI in 47 percent of organizations, followed by the capability to visualize metrics associated with locations in 41 percent. Stack charts now provide auto-weighting for 100-percent sum analysis, which can be helpful for analytics such as attribution models. Breadcrumbs let users see and drill back through their navigation path, which helps them understand how a person came to a particular analytical conclusion. Finally, Trellis View actions provide contextual functionality to help turn data into action in an operational environment.
These visualization advancements are critical for Oracle’s big data efforts: according to our big data research, visualization is a top-three big data capability not available in 37 percent of organizations, and our latest technology innovation research on business analytics found that presenting data visually is the second most important capability, cited by 48 percent of organizations.
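
To make the idea of 100-percent sum analysis concrete, here is a minimal Python sketch of the arithmetic behind a 100-percent stacked bar: each segment is rescaled so the bar sums to 100. This is only an illustration, not Oracle’s implementation; the function name and the channel figures are hypothetical.

```python
def to_percent_shares(segments):
    """Rescale the segment values of one stacked bar so they sum to 100."""
    total = sum(segments.values())
    if total == 0:
        # An empty bar has no meaningful shares; report zeros.
        return {name: 0.0 for name in segments}
    return {name: 100.0 * value / total for name, value in segments.items()}


# Hypothetical conversions credited to each marketing channel in one period.
conversions = {"email": 30, "search": 50, "social": 20}
shares = to_percent_shares(conversions)
print(shares)
```

In an attribution model, comparing these shares across periods shows how the mix of channels shifts even when the total volume changes.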

The update to Oracle Smart View for Office also puts more capability in the hands of users. It natively integrates Excel and other Microsoft Office applications with operational BI dashboards so users can perform analysis and prepare ad-hoc reports directly within these desktop environments. This is an important advance for Oracle since our benchmark research on the use of spreadsheets across the enterprise found that BI and spreadsheets are combined all the time or frequently in 74 percent of organizations. Collaboration around business intelligence is also essential, and tighter integration is a critical use case: our next-generation business intelligence research found that using Microsoft Office for collaboration with business intelligence is important to 36 percent of organizations.

Oracle’s social collaboration efforts, through what it calls Oracle Social Network, have advanced significantly, but integrating them and making them available through its business intelligence offering does not appear to be in the short-term plan. Our research finds that more than two-thirds (67%) of organizations rank social collaboration as important and that embedding it within BI is a top need in 38 percent of organizations. Much of what Oracle already provides could be easily integrated to meet business demand for a range of people-based interactions that most organizations are still struggling to manage through e-mail.

Oracle has extended the existing capabilities of OBIEE with Hadoop integration via a Hive connector that allows Oracle to pull data into OBIEE from big data sources, while an MDX search function enabled by integration with the Endeca discovery tool allows OBIEE to do full-text search and data discovery. Connections to new data sources are critically important in today’s environment; our technology innovation research shows that retaining and analyzing more data is the number-one ranked use for big data in 29 percent of organizations. Federated data discovery is particularly important because many companies are unaware of their information assets and therefore unknowingly limit their analysis.

Beyond the core BI product, Oracle made significant advances with Endeca 3.0. Users can now analyze Excel files. This is an existing capability for other vendors, so it was important for Oracle to gain parity here. Beyond that, Endeca now comes with a native JavaScript Object Notation (JSON) reader and support for authorization standards. This furthers its ability to do contextual analysis and sentiment analysis on data in text and social media. Endeca also now can pull data from the Oracle BI server to marry with the analysis. Overall the new version of Endeca enables new business-driven information discovery that is essential to relieve the stress on analysts and IT to create and publish information and insights to business.

Oracle continues to invest in BI applications that supply prebuilt analytics. These packaged analytics applications span from the front office (sales and marketing) to operations (procurement and supply chain) to the back office (finance and HR). Given the enterprise-wide support, Oracle’s BI can perform cross-functional analytics and deliver fast time to value since users do not have to spend time building the dashboards. Through interoperation with the company’s enterprise applications, customers can execute actions directly in applications such as PeopleSoft, JD Edwards or Oracle E-Business Suite. Oracle has begun to leverage more of its scorecarding function, which enables KPI relationships to be mapped and information to be aggregated and trended. Scorecards are important for analytic cultures because they are a common communication platform for executive decision-makers and allow ownership assignment of metrics.

I was surprised not to find much advancement in Oracle’s business intelligence efforts for smartphones and tablets. Our research finds that mobile business intelligence is important to 69 percent of organizations and that 78 percent of organizations report that none or only some of these capabilities are available in their current BI deployment. Of those that are using mobile business intelligence, only 28 percent are satisfied. For years, IT has not placed a priority on mobile support of BI while business has been clamoring for it; business is now more readily leading the efforts, with 52 percent planning new or expanded deployments on tablets and 32 percent on smartphones. In this highly competitive market, to capture more opportunity Oracle will need to significantly advance its efforts and make its capabilities freely available without passwords, as other BI providers have already done. It also will need to recognize that business is more interested in alerts and events delivered through notifications to mobile devices than in having the entire suite of BI capabilities replicated on them.

Oracle has foundational positions in enterprise applications and database technology and has used these positions to drive significant success in BI. The company’s proprietary “walled garden” approach worked well for years, but now technology changes, including movements toward open source and cloud computing, threaten that entrenched position. Surprisingly, the company has moved slowly off of its traditional messaging stance targeted at the CIO, IT and the data center. That position seems to focus the company too much on the technology-driven 3 V’s of big data and analytics, and not enough on the business-driven 3 W’s that I advocate. As the industry moves into the age of analytics, where information is looked upon as a critical commodity and usability is the key to adoption (our research finds usability to be the top evaluation consideration in 63 percent of organizations), CIOs will need to move further beyond an IT-centric approach to BI, as I have noted, and become more engaged with the requirements of business. Oracle’s business intelligence strategy, and how it addresses these business outcomes and use across all business users, is key to the company’s future. Organizations should examine these advancements to its BI offering closely to determine whether they can improve the value of information and big data in their own operations.


Tony Cosentino

VP and Research Director

SAS Institute held its 24th annual analyst summit last week in Steamboat Springs, Colorado. The 37-year-old privately held company is a key player in big data analytics, and company executives showed off their latest developments and product roadmaps. In particular, LASR Analytical Server and Visual Analytics 6.2, which is due to be released this summer, are critical to SAS’ ability to secure and expand its role as a preeminent analytics vendor in the big data era.

For SAS, the competitive advantage in Big Data rests in predictive analytics, and according to our benchmark research into predictive analytics, 55 percent of businesses say the challenge of architectural integration is a top obstacle to rolling out predictive analytics in the organization. Integration of analytics is particularly daunting in a big-data-driven world, since analytics processing has traditionally taken place on a platform separate from where the data is stored, but now the two must come together. How data is moved into parallelized systems and how analytics are consumed by business users are key questions in the market today that SAS is looking to address with LASR and Visual Analytics.

Jim Goodnight, the company’s founder and plainspoken CEO, says he saw the industry changing a few years ago. He speaks of a large bank doing a heavy analytical risk computation that took upwards of 18 hours, which meant that the results of the computation were not ready in time for the next trading day. To gain competitive advantage, the time window needed to be reduced, but running the analytics in a serialized fashion was a limiting factor. This led SAS to begin parallelizing the company’s workhorse procedures, some of which were first developed upwards of 30 years ago. Goodnight also noted that building these parallelized statistical models is no easy task. One of the biggest hurdles is getting the mathematicians and data scientists who are building these elaborate models to think in terms of the new parallelized architectural paradigm.

Visual Analytics software is a key component of the SAS Big Data Analytics strategy. Our latest business technology innovation benchmark research found that close to half (48%) of organizations present business analytics visually. Visual Analytics, which was introduced early last year, is a cloud-based offering running off of the LASR in-memory analytic engine and the Amazon Web Services infrastructure. This web-based approach allows SAS to iterate quickly without worrying a great deal about revision management while giving IT a simpler server management scenario. Furthermore, the web-based approach provides analysts with a sandbox environment for working with and visualizing big data analytics in the cloud; the analytic assets can then be moved into a production environment. This approach will also eventually allow SAS to combine data integration capabilities with the data analysis capabilities.

With descriptive statistics being the ante in today’s visual discovery world, SAS is focusing Visual Analytics to take advantage of the company’s predictive analytics history and capabilities. Visual Analytics 6.2 integrates predictive analytics and rapid predictive modeling (RPM) to do, among other things, segmentation, propensity modeling and forecasting. RPM makes it possible for models to be generated via sophisticated software that runs through multiple algorithms to find the best fit based on the data involved. This type of commodity modeling approach will likely gain significant traction as companies look to bring analytics into industrial processes and address the skills gap in advanced analytics. According to our BTI research, the skills gap is the biggest challenge facing big data analytics today, as participants identified staffing (79%) and training (77%) as the top two challenges.
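
The general idea of running through multiple algorithms to find the best fit can be sketched in a few lines of Python. This is an illustration of the concept only, not SAS’s RPM: the two candidate models, the helper names and the sample data are all assumptions made for the example. Each candidate is fitted on training data and scored on held-out data, and the one with the lowest error wins.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline candidate: always predict the average of the training targets."""
    average = mean(ys)
    return lambda x: average


def fit_line(xs, ys):
    """Second candidate: ordinary least-squares straight line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mean_squared_error(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def pick_best(candidates, train, holdout):
    """Fit every candidate on train, score on holdout, return the best name and all scores."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mean_squared_error(model, *holdout)
    return min(scores, key=scores.get), scores


# Hypothetical data with a roughly linear trend.
train = ([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 8.1, 9.8, 12.1])
holdout = ([7, 8], [14.0, 16.1])
candidates = {"baseline_mean": fit_mean, "linear": fit_line}

best, scores = pick_best(candidates, train, holdout)
print(best, scores)
```

Scoring on held-out data rather than on the training data is what keeps this kind of automated selection from simply rewarding the most flexible algorithm.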

Visual Analytics’ web-based approach is likely a good long-term bet for SAS, as it marries data integration and cloud strategies. These factors, coupled with the company’s installed base and army of loyal users, give SAS a head start in redefining the world of analytics. Its focus on integrating visual analytics for data discovery, integration and commodity modeling approaches also provides compelling time-to-value for big data analytics. In specific areas such as marketing analytics, the ability to bring analytics into the applications themselves and allow data-savvy marketers to conduct a segmentation and propensity analysis in the context of a specific campaign can be a real advantage. Many of SAS’ innovations cannibalize its own markets, but such is the dilemma of any major analytics company today.

The biggest threat to SAS today is the open source movement, which offers big data analytic approaches such as Mahout and R. For instance, the latest release of R includes facilities for building parallelized code. While academics working in R often still build their models in a non-parallelized, non-industrial fashion, the current and future releases of R promise more industrialization. As integration of Hadoop into today’s architectures becomes more common, staffing and skillsets are often a larger obstacle than the software budget. In this environment the large services companies loom larger because of their role in defining the direction of big data analytics. Currently, SAS partners with companies such as Accenture and Deloitte, but in many instances these companies have split loyalties. For this reason, the lack of a large in-house services and education arm may work against SAS.

At the same time, SAS possesses blueprints for major analytic processes across different industries as well as horizontal analytic deployments, and it is working to move these to a parallelized environment. This may prove to be a differentiator in the battle versus R, since it is unclear how quickly the open source R community, which is still primarily academic, will undertake the parallelization of R’s algorithms.

SAS partners closely with database appliance vendors such as Greenplum and Teradata, with which it has had longstanding development relationships. With Teradata, it integrates into the BYNET messaging system, allowing for optimized performance between Teradata’s relational database and the LASR Analytic Server. Hadoop is also supported in the SAS reference architecture. LASR accesses HDFS directly and can run as a thin memory layer on top of the Hadoop deployment. In this type of deployment, Hadoop takes care of everything outside the analytic processing, including memory management, job control and workload management.

These latest developments will be of keen interest to SAS customers. Non-SAS customers who are exploring advanced analytics in a big data environment should consider SAS LASR and its MPP approach. Visual Analytics follows the “freemium” model that is prevalent in the market, and since it is web-based, any instances downloaded today can be automatically upgraded when the new version arrives in the summer. For the price, the tool is certainly worth a test drive for analysts. It should be of particular interest to anyone who is looking into such tools and foresees the need to include predictive analytics.


Tony Cosentino
VP and Research Director

SAP just released solid preliminary quarterly and annual revenue numbers, which in many ways can be attributed to a strong strategic vision around the HANA in-memory platform and strong execution throughout the organization. Akin to flying an airplane while simultaneously fixing it, SAP’s bold move to HANA may at some point see the company continuing to fly when other companies are forced to ground parts of their fleets.

Stepping into 2012, the HANA strategy was still nascent, and SAP provided incentives both to customers and the channel to bring them along. At that point SAP also promoted John Schweitzer to senior vice president and general manager for business analytics.  Schweitzer, an industry veteran who spent his career with Hyperion and Oracle before moving to SAP in 2010, gave a compelling analytics talk at SAPPHIRE NOW in early 2012 and was at the helm for the significant product launches during the course of the year.

One of these products is SAP’s Visual Intelligence, which was released in the spring of 2012. The product takes Business Objects Business Explorer and moves it to the desktop so that analysts can work without the need to have IT involved. Because it runs on HANA, it allows real-time visual exploration of data akin to what we have been seeing with companies such as Tableau, which recently passed the $100 million mark in revenue. As with many of SAP’s new offers, running on top of HANA allows visual exploration and analysis of very large data sets. Data sets of the same size may quickly overwhelm certain “in-memory” competitor systems. In order to generate buzz and help develop “Visi,” as Visual Intelligence is also called, the company ran a Data Geek Challenge in which SAP encouraged users to develop big data visualizations on top of HANA.

Tools such as SAP’s Visual Intelligence begin to address the analytics usability challenge that I recently wrote about in The Brave New World of Business Intelligence, which focused on recent research we did on next-generation business intelligence. One key takeaway from that research is that usability expectations for business intelligence are being set on a consumer level, but that our information environments and our processes in the enterprise are too fragmented to achieve the same level of usability, resulting in relatively high dissatisfaction with next-generation business tools. Despite that dissatisfaction, the high switching costs for enterprise BI mean that customers are essentially captive, and this gives incumbents time to adapt their approaches. There is no doubt that SAP looks to capitalize on this opportunity with agile development approaches and frequent iterations of the Visual Intelligence software.

Another key release in 2012 was Predictive Analytics, which became generally available late in the year after SAP TechEd in Madrid. With this release, SAP moves away from its partnership with IBM SPSS. The move makes sense given that sticking with SPSS would have resulted in a critical dependency for SAP. According to our benchmark research on Predictive Analytics, 68% of organizations see predictive analytics as a source of competitive advantage. Once again, HANA may prove to be a strong differentiator for SAP in that the company will be able to leverage the in-memory system to visualize data and run predictive analytics on very large data sets and across different workloads. Furthermore, the SAP Predictive Analytics offering inherits data acquisition and data manipulation functionality from SAP Visual Intelligence. However, it cannot be installed on the same machine as Visi, according to a blog on the SAP Community Network.

SAP will need to work hard to build out predictive analytics capabilities to compete with the likes of SAS, IBM SPSS and other providers which have years of custom developed vertical and Line-of-Business solution sets. Beyond the customized analytical assets, IBM and SAS will likely promote the idea of commodity models as such non-optimized modeling approaches become more important. Commodity models are a breed of “good enough” models that allow for prediction and data reduction that is a step-function better than a purely random or uninformed decision. Where deep analytical skill sets are not available, sophisticated software can run through data and match it to the appropriate model.

In 2012, SAP also continued to develop targeted analytical solutions, which bundle SAP BI and the Sybase IQ server, a columnar database. Column-oriented databases such as Sybase IQ, Vertica, ParAccel and Infobright have gained momentum over the last few years by providing a platform that organizes data in a way that allows for much easier analytical access and data compression. Instead of writing to disk in a row-oriented fashion, columnar approaches write to disk in a column-oriented fashion, allowing for faster analytical cycles and reduced time to value.
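
The difference between the two layouts can be illustrated with a small in-memory Python sketch. It is a toy model only and does not reflect how Sybase IQ or any other product stores data on disk; the table, field names and helper functions are hypothetical. An aggregate over one field has to walk every full record in the row layout but reads a single contiguous list in the column layout, and a column of repeated values compresses well with a simple run-length encoding.

```python
from itertools import groupby

# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 80.0},
    {"order_id": 3, "region": "west", "amount": 200.0},
    {"order_id": 4, "region": "west", "amount": 50.0},
]

# Column-oriented layout: the values of each field are stored together.
columns = {field: [row[field] for row in rows] for field in rows[0]}


def total_amount_row_store(table):
    # Touches every record, including fields the query does not need.
    return sum(record["amount"] for record in table)


def total_amount_column_store(table):
    # Reads only the one column the query asks for.
    return sum(table["amount"])


def run_length_encode(values):
    # Consecutive repeats collapse into (value, count) pairs.
    return [(value, len(list(group))) for value, group in groupby(values)]


row_total = total_amount_row_store(rows)
column_total = total_amount_column_store(columns)
encoded_region = run_length_encode(columns["region"])
print(row_total, column_total, encoded_region)
```

Both layouts return the same answer; the columnar one simply does less work per analytical query and stores like values side by side, which is where the compression benefit comes from.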

The competitive advantage of columnar databases, however, may be mitigated as in-memory approaches gain favor in the marketplace. For this reason, continued development on the Sybase IQ platform may be an intermediate step until the HANA analytics portfolio is built out, after which HANA will likely cannibalize parts of SAP’s own stack. SAP’s dual approach to Big Data analytics with both “in-memory” and columnar provides good visibility into the classic Innovator’s Dilemma that faces many technology suppliers today and how SAP is dealing with this dilemma. It should be noted, however, that SAP is also working on an integrated portfolio approach and that Sybase IQ may actually be a better fit as datasets move to petabyte scale.

Another aspect of that same Innovator’s Dilemma is a fragmented choice environment as new technologies develop. Our research shows that the market is undecided on how it will roll out critical next-generation business intelligence capabilities such as collaboration. Just under two-fifths of our Next Generation Business Intelligence study participants (38%) prefer business intelligence applications as the primary access method for collaborative BI, but 36 percent prefer access through office productivity tools, and 34 percent prefer access through applications themselves. (Not surprisingly, IT leans more toward providing tools within the already existing landscape, while business users are more likely to want this capability within the context of the application.) This fragmented choice scenario carries over to analytics as well, where spreadsheets are still the dominant analytical tool in most organizations. Here at Ventana Research, we are fielding more inquiries on application-embedded analytics and how these will play out in the organizational landscape. I anticipate this debate will continue through 2013, with different parts of the market providing solid arguments for each of the three camps. Since HANA uniquely provides both transactional processing and analytic processing in one engine, it will be interesting to look closer at the HANA and Business Objects roadmap in 2013 to see how they are positioned with respect to this debate. Furthermore, as I discuss in my 2013 Research Agenda blog post, disseminating insights within the organization is a big part of moving from insights to action, and business intelligence is still the primary vehicle for moving insight into the organization. For this reason, the natural path for many organizations may indeed be through their Business Intelligence systems.

SAP, clearly separating its strategic position, looks to continue to innovate its entire portfolio, including both applications and analytics, based on the HANA database. In the most recent quarter, SAP took a big step forward in this regard by porting its entire business suite to run on HANA as my colleague Robert Kugel discussed in a recent blog.

While there are still some critical battles to be played out, one thing remains clear: SAP is one of the dominant players in business intelligence today. Our Value Index on Business Intelligence has assessed SAP as a Hot Vendor and ranked it at the top. SAP aims to stay that way by continuing to innovate around HANA and giving its customers and prospects a seamless transition to next-generation analytics and technologies.


Tony Cosentino
VP and Research Director
