
One of the key findings in our latest benchmark research into predictive analytics is that companies are incorporating predictive analytics into their operational systems more often than was the case three years ago. The research found that companies are less inclined to purchase stand-alone predictive analytics tools (29% vs 44% three years ago) and more inclined to purchase predictive analytics built into business intelligence systems (23% vs 20%), applications (12% vs 8%), databases (9% vs 7%) and middleware (9% vs 2%). This trend is not surprising since operationalizing predictive analytics – that is, building predictive analytics directly into business process workflows – improves companies’ ability to gain competitive advantage: those that deploy predictive analytics within business processes are more likely to say they gain competitive advantage and improve revenue through predictive analytics than those that don’t.

In order to understand the shift that is underway, it is important to understand how predictive analytics has historically been executed within organizations. The marketing organization provides a useful example since it is the functional area where organizations most often deploy predictive analytics today. In a typical organization, those doing statistical analysis will export data from various sources into a flat file. (Often IT is responsible for pulling the data from the relational databases and passing it over to the statistician in a flat file format.) Data is cleansed, transformed, and merged so that the analytic data set is in a normalized format. It then is modeled with stand-alone tools and the model is applied to records to yield probability scores. In the case of a churn model, such a probability score represents how likely someone is to defect. For a marketing campaign, a probability score tells the marketer how likely someone is to respond to an offer. These scores are produced for marketers on a periodic basis – usually monthly. Marketers then work on the campaigns informed by these static models and scores until the cycle repeats itself.
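The periodic scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients, field names and sample records are hypothetical, and a real model would be fitted with a statistical tool on historical data rather than written by hand.

```python
import math

# Hypothetical coefficients for an illustrative churn model; a real model
# would be estimated from historical data with a statistics package.
COEFFICIENTS = {"intercept": -2.0, "support_calls": 0.4, "months_on_contract": -0.05}


def churn_probability(record):
    """Apply the logistic function to a linear combination of the inputs."""
    z = COEFFICIENTS["intercept"]
    z += COEFFICIENTS["support_calls"] * record["support_calls"]
    z += COEFFICIENTS["months_on_contract"] * record["months_on_contract"]
    return 1.0 / (1.0 + math.exp(-z))


def score_batch(records):
    """Return (customer_id, score) pairs, most likely defectors first."""
    scored = [(r["customer_id"], churn_probability(r)) for r in records]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical monthly extract handed over as a flat file.
monthly_extract = [
    {"customer_id": "A", "support_calls": 0, "months_on_contract": 24},
    {"customer_id": "B", "support_calls": 5, "months_on_contract": 2},
]
scores = score_batch(monthly_extract)
```

In the traditional cycle, a list like `scores` is what marketers would work from until the next monthly run.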

The challenge presented by this traditional model is that a lot can happen in a month and the heavy reliance on process and people can hinder the organization’s ability to respond quickly to opportunities and threats. This is particularly true in fast-moving consumer categories such as telecommunications or retail. For instance, if a person visits the company’s cancelation policy web page the instant before he or she picks up the phone to cancel the contract, this customer’s churn score will change dramatically and the action that the call center agent should take will need to change as well. Perhaps, for example, that score change should mean that the person is now routed directly to an agent trained to deal with possible defections. But such operational integration requires that the analytic software be integrated with the call agent software and web tracking software in near-real time.
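A minimal Python sketch of the routing idea, assuming a simple additive adjustment to the last batch score: the event name, the uplift value and the threshold below are hypothetical, and a production system would more likely re-score the customer with the full model.

```python
# Hypothetical cut-off above which a caller goes to a retention specialist.
RETENTION_THRESHOLD = 0.5

# Hypothetical score increase per risk signal observed since the last batch run.
EVENT_UPLIFT = {"viewed_cancellation_policy": 0.4}


def updated_score(base_score, events):
    """Adjust the last batch score for events seen in near-real time."""
    score = base_score
    for event in events:
        score += EVENT_UPLIFT.get(event, 0.0)
    return min(score, 1.0)  # keep the result a valid probability


def route_call(base_score, events):
    """Pick a call queue from the adjusted churn score."""
    if updated_score(base_score, events) >= RETENTION_THRESHOLD:
        return "retention_specialist"
    return "general_queue"
```

With a batch score of 0.2, a caller with no recent events stays in the general queue, while the same caller who has just viewed the cancellation policy page is sent to the specialist. The sketch only works if the web tracking events reach the routing logic before the call is answered.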

Similarly, the models themselves need to be constantly updated to deal with the fast pace of change. For instance, if a telecommunications carrier competitor offers a large rebate to customers to switch service providers, an organization’s churn model can be rendered out of date and should be updated. Our research shows that organizations that constantly update their models gain competitive advantage more often than those that only update them periodically (86% vs 60% average), more often show significant improvement in organizational activities and processes (73% vs 44%), and are more often very satisfied with their predictive analytics (57% vs 23%).

Building predictive analytics into business processes is more easily discussed than done; complex business and technical challenges must be addressed. The skills gap that I recently wrote about is a significant barrier to implementing predictive analytics. Making predictive analytics operational requires not only statistical and business skills but technical skills as well. From a technical perspective, one of the biggest challenges for operationalizing predictive analytics is accessing and preparing data, which I also wrote about. Four out of ten companies say that this is the part of the predictive analytics process where they spend the most time. Choosing the right software is another challenge that I wrote about. Making that choice includes identifying the specific integration points with business intelligence systems, applications, database systems, and middleware. These decisions will depend on how people use the various systems and what areas of the organization are looking to operationalize predictive analytics processes.

For those that are willing to take on the challenges of operationalizing predictive analytics, the rewards can be significant, including significantly better competitive positioning and new revenue opportunities. Furthermore, once predictive analytics is initially deployed in the organization it snowballs, with more than nine in ten companies going on to increase their use of predictive analytics. Once companies reach that stage, one third of them (32%) say predictive analytics has had a transformational impact and another half (49%) say it provides significant positive benefits.


Ventana Research

Our benchmark research into predictive analytics shows that lack of resources, including budget and skills, is the number-one business barrier to the effective deployment and use of predictive analytics; awareness – that is, an understanding of how to apply predictive analytics to business problems – is second. In order to secure resources and address awareness problems a business case needs to be created and communicated clearly wherever appropriate across the organization. A business case presents the reasoning for initiating a project or task. A compelling business case communicates the nature of the proposed project and the arguments, both quantified and unquantifiable, for its deployment.

The first steps in creating a business case for predictive analytics are to understand the audience and to communicate with the experts who will be involved in leading the project. Predictive analytics can be transformational in nature and therefore the audience potentially is broad, including many disciplines within the organization. Understand who should be involved in business case creation, a list that may include business users, analytics users and IT. Those most often primarily responsible for designing and deploying predictive analytics are data scientists (in 31% of organizations), the business intelligence and data warehouse team (27%), those working in general IT (16%) and line of business analysts (13%), so be sure to involve these groups. Understand the specific value and challenges for each of the constituencies so the business case can represent the interests of these key stakeholders. I discuss the aspects of the business where these groups will see predictive analytics most adding value here and here.

For the business case for a predictive analytics deployment to be persuasive, executives also must understand how specifically the deployment will impact their areas of responsibility and what the return on investment will be. For these stakeholders, the argument should be multifaceted. At a high level, the business case should explain why predictive analytics is important and how it fits with and enhances the organization’s overall business plan. Industry benchmark research and relevant case studies can be used to paint a picture of what predictive analytics can do for marketing (48%), operations (44%) and IT (40%), the functions where predictive analytics is used most.

A business case should show how predictive analytics relates to other relevant innovation and analytic initiatives in the company. For instance, companies have been spending money on big data, cloud and visualization initiatives where software returns can be more difficult to quantify. Our research into big data analytics and data and analytics in the cloud shows that the top benefits of these initiatives are communication and knowledge sharing. Fortunately, the business case for predictive analytics can cite the tangible business benefits our research identified, the most often identified of which are achieving competitive advantage (57%), creating new revenue opportunities (50%), and increasing profitability (46%). But the business case can be made even stronger by noting that predictive analytics can have added value when it is used to leverage other current technology investments. For instance, our big data analytics research shows that the most valuable type of analytics to be applied to big data is predictive analytics.

To craft the specifics of the business case, concisely define the business issue that will be addressed. Assess the current environment and offer a gap analysis to show the difference between the current environment and the future environment. Offer a recommended solution, but also offer alternatives. Detail the specific value propositions associated with the change. Create a financial analysis summarizing costs and benefits. Support the analysis with a timeline including roles and responsibilities. Finally, detail the major risk factors and opportunity costs associated with the project.

For complex initiatives, break the overall project into a series of shorter projects. If the business case is for a project that will involve substantial work, consider providing separate timelines and deliverables for each phase. Doing so will keep stakeholders both informed and engaged during the time it takes to complete the full project. For large predictive analytics projects, it is important to break out the due-diligence phase and try not to make any hard commitments until that phase is completed. After all, it is difficult to establish defensible budgets and timelines until one knows the complete scope of the project.

Ensure that the project timeline is realistic and addresses all the key components needed for a successful deployment. In particular with predictive analytics projects, make certain that it reflects a thoughtful approach to data access, data quality and data preparation. We note that four in 10 organizations say that the most time spent in the predictive analytics process is in data preparation and another 22 percent say that they spend the most time accessing data sources. If data issues have not been well thought through, it is next to impossible for the predictive analytics initiative to be successful. Read my recent piece on operationalizing predictive analytics to see how predictive analytics will align with specific business processes.

If you are proposing the implementation of new predictive analytics software, highlight the multiple areas of return beyond competitive advantage and revenue benefits. Specifically, new software can have a lower total cost of ownership and generate direct cost savings from improved operating efficiencies. A software deployment also can yield benefits related to people (productivity, insight, fewer errors), management (creativity, speed of response), process (shorter time on task or time to complete) and information (easier access, more timely, accurate and consistent). Create a comprehensive list of the major benefits the software will provide compared to the existing approach, quantifying the impact wherever possible. Detail all major costs of ownership whether the implementation is on-premises or cloud-based: these will include licensing, maintenance, implementation consulting, internal deployment resources, training, hardware and other infrastructure costs. In other words, think broadly about both the costs and the sources of return in building the case for new technology. Also, read my recent piece on procuring predictive analytics software.
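The cost and benefit tally recommended here can be sketched as a small Python calculation. Every figure and category name below is hypothetical and chosen only to show the arithmetic; the sketch ignores discounting and assumes costs and benefits are constant from year to year.

```python
def summarize_case(upfront_cost, annual_costs, annual_benefits, years):
    """Summarize net benefit, return on investment and payback period.

    annual_costs and annual_benefits map a category name to a yearly amount.
    """
    yearly_cost = sum(annual_costs.values())
    yearly_benefit = sum(annual_benefits.values())
    total_cost = upfront_cost + yearly_cost * years
    total_benefit = yearly_benefit * years
    yearly_net = yearly_benefit - yearly_cost
    return {
        "net_benefit": total_benefit - total_cost,
        "roi": (total_benefit - total_cost) / total_cost,
        # Years needed to recover the upfront cost; None if never recovered.
        "payback_years": upfront_cost / yearly_net if yearly_net > 0 else None,
    }


# Hypothetical three-year case for a new software deployment.
summary = summarize_case(
    upfront_cost=200_000,  # implementation consulting, hardware
    annual_costs={"licensing": 50_000, "maintenance": 10_000, "training": 15_000},
    annual_benefits={"new_revenue": 150_000, "efficiency_savings": 50_000},
    years=3,
)
```

Listing each cost and benefit as its own named entry keeps the assumptions visible to the stakeholders who have to challenge them.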

Understanding the audience, painting the vision, crafting the specific case, outlining areas of return, specifying software, noting risk factors, and being as comprehensive as possible are all part of a successful business plan process. Sometimes, the initial phase is really just a pitch for project funding and there won’t be any dollar allocation until people are convinced that the program will get them what they need. In such situations multiple documents may be required, including a short one- to two-page document that outlines vision and makes a high-level argument for action from the organizational stakeholders. Once a cross-functional team and executive support are in place, a more formal assessment and design plan following the principles above will have to be built.

Predictive analytics offers significant returns for organizations willing to pursue it, but establishing a solid business case is the first step for any organization.



Our research into next-generation predictive analytics shows that along with not having enough skilled resources, which I discussed in my previous analysis, the inability to readily access and integrate data is a primary reason for dissatisfaction with predictive analytics (in 62% of participating organizations). Furthermore, this area consumes the most time in the predictive analytics process: The research finds that preparing data for analysis (40%) and accessing data (22%) are the parts of the predictive analysis process that create the most challenges for organizations. To allow more time for actual analysis, organizations must work to improve their data-related processes.

Organizations apply predictive analytics to many categories of information. Our research shows that the most common categories are customer (used by 50%), marketing (44%), product (43%), financial (40%) and sales (38%). Such information often has to be combined from various systems and enriched with information from new sources. Before users can apply predictive analytics to these blended data sets, the information must be put into a common form and represented as a normalized analytic data set. Unlike in data warehouse systems, which provide a single data source with a common format, today data is often located in a variety of systems that have different formats and data models. Much of the current challenge in accessing and integrating data comes from the need to include not only a variety of relational data sources but also less structured forms of data. Data that varies in both structures and sizes is commonly called big data.
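As a rough Python illustration of putting records from different systems into a common form, the sketch below renames source-specific fields and merges rows by customer. The two source systems, their field names and the sample rows are all hypothetical.

```python
# Hypothetical mappings from each system's field names to a common schema.
CRM_FIELDS = {"CustID": "customer_id", "Segment": "segment"}
BILLING_FIELDS = {"customer": "customer_id", "monthly_spend_usd": "monthly_spend"}


def normalize(record, mapping):
    """Rename a source record's fields to the common schema."""
    return {common: record[source] for source, common in mapping.items()}


def blend(crm_rows, billing_rows):
    """Merge both sources into one row per customer, ordered by customer_id."""
    merged = {}
    for rows, mapping in ((crm_rows, CRM_FIELDS), (billing_rows, BILLING_FIELDS)):
        for row in rows:
            clean = normalize(row, mapping)
            merged.setdefault(clean["customer_id"], {}).update(clean)
    return [merged[key] for key in sorted(merged)]


crm = [{"CustID": "2", "Segment": "retail"}, {"CustID": "1", "Segment": "telecom"}]
billing = [{"customer": "1", "monthly_spend_usd": 40.0}]
analytic_data_set = blend(crm, billing)
```

Customer "2" has no billing row, so its merged record lacks `monthly_spend`. Gaps like this are part of why data preparation takes so much of the time in the process.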

To deal with the challenge of storing and computing big data, organizations planning to use predictive analytics increasingly turn to big data technology. While flat files and relational databases on standard hardware, each cited by almost two-thirds (63%) of participants, are still the most commonly used tools for predictive analytics, more than half (52%) of organizations now use data warehouse appliances for predictive analytics, and 31 percent use in-memory databases, which the second-highest percentage (24%) plan to adopt in the next 12 to 24 months. Hadoop and NoSQL technologies lag in adoption, currently used by one in four organizations, but in the next 12 to 24 months an additional 29 percent intend to use Hadoop and 20 percent more will use other NoSQL approaches. Furthermore, more than one-quarter (26%) of organizations are evaluating Hadoop for use in predictive analytics, which is the most of any technology.

Some organizations are considering moving from on-premises to cloud-based storage of data for predictive analytics; the most common reasons for doing so are to improve accessing data (for 49%) and preparing data for analysis (43%). This trend speaks to the increasing importance of cloud-based data sources as well as cloud-based tools that provide access to many information sources and provide predictive analytics. As organizations accumulate more data and need to apply predictive analytics in a scalable manner, we expect the need to access and use big data and cloud-based systems to increase.

While big data systems can help handle the size and variety of data, they do not of themselves solve the challenges of data access and normalization. This is especially true for organizations that need to blend new data that resides in isolated systems. How to do this is critical for organizations to consider, especially in light of the people using predictive analytics systems and their skills. There are three key considerations here. One is the user interface, the most common of which are spreadsheets (used by 48%), graphical workflow modeling tools (44%), integrated development environments (37%) and menu-driven modeling tools (35%). Second is the number of data sources to deal with and which are supported by the system; our research shows that four out of five organizations need to access and integrate five or more data sources. The third consideration is which analytic languages and libraries to use and which are supported by the system; the research finds that Microsoft Excel, SQL, R, Java and Python are the most widely used for predictive analytics. Considering these three priorities in terms of the resident skills, processes, current technology and information sources that need to be accessed is crucial for delivering value to the organization with predictive analytics.

While there has been an exponential increase in data available to use in predictive analytics as well as advances in integration technology, our research shows that data access and preparation are still the most challenging and time-consuming tasks in the predictive analytics process. Although technology for these tasks has improved, complexity of the data has increased through the emergence of different data types, large-scale data and cloud-based data sources. Organizations must pay special attention to how they choose predictive analytics tools that can give easy access to multiple diverse data sources including big data stores and provide capabilities for data blending and provisioning of analytic data sets. Without these capabilities, predictive analytics tools will fall short of expectations.



The Performance Index analysis we performed as part of our next-generation predictive analytics benchmark research shows that only one in four organizations, those functioning at the highest Innovative level of performance, can use predictive analytics to compete effectively against others that use this technology less well. We analyze performance in detail in four dimensions (People, Process, Information and Technology), and for predictive analytics we find that organizations perform best in the Technology dimension, with 38 percent reaching the top Innovative level. This is often the case in our analyses, as organizations initially perform better in the details of selecting and managing new tools than in the other dimensions. Predictive analytics is not a new technology per se, but the difference is that it is becoming more common in business units, as I have written.

In contrast to organizations’ performance in the Technology dimension, only 10 percent reach the Innovative level in People and only 11 percent in Process. This disparity uncovered by the research analysis suggests there is value in focusing on the skills that are used to design and deploy predictive analytics. In particular, we found that one of the two most-often cited reasons why participants are not fully satisfied with the organization’s use of predictive analytics is that there are not enough skilled resources (cited by 62%). In addition, 29 percent said that the need for too much training or customized skills is a barrier to changing their predictive analytics.

The challenge for many organizations is to find the combination of domain knowledge, statistical and mathematical knowledge, and technical knowledge that they need to be able to integrate predictive analytics into other technology systems and into operations in the lines of business, which I also have discussed. The need for technical knowledge is evident in the research findings on the jobs held by individual participants: Three out of four require technical sophistication. More than one-third (35%) are data scientists who have a deep understanding of predictive analytics and its use as well as of data-related technology; one-fourth are data analysts who understand the organization’s data and systems but have limited knowledge of predictive analytics; and 16 percent described themselves as predictive analytics experts who have a deep understanding of this topic but not of technology in general. The research also finds that those most often primarily responsible for designing and deploying predictive analytics are data scientists (in 31% of organizations) or members of the business intelligence and data warehouse team (27%). This focus on business intelligence and data warehousing represents a shift toward integrating predictive analytics with other technologies and indicates a need to scale predictive analytics across the organization.

In only about half (52%) of organizations are the people who design and deploy predictive analytics the same people who utilize the output of these processes. The most common reasons cited by research participants that users of predictive analytics don’t produce their own analyses are that they don’t have enough skills training (79%) and don’t understand the mathematics involved (66%). The research also finds evidence that skills training pays off: Fully half of those who said they received adequate training in applying predictive analytics to business problems also said they are very satisfied with their predictive analytics; percentages dropped precipitously for those who said the training was somewhat adequate (8%) and inadequate (6%). It is clear that professionals trained in both business and technology are necessary for an organization to successfully understand, deploy and use predictive analytics.

To determine the technical skills and training necessary for predictive analytics, it is important to understand which languages and libraries are used. The research shows that the most common are SQL (used by 67% of organizations) and Microsoft Excel (64%), with which many people are familiar and which are relatively easy to use. The three next-most commonly used are much more sophisticated: the open source language R (by 58%), Java (42%) and Python (36%). Overall, many languages are in use: Three out of five organizations use four or more of them. This array reflects the diversity of approaches to predictive analytics. Organizations must assess what languages make sense for their uses, and vendors must support many languages for predictive analytics to meet the demands of all customers.

The research thus makes clear that organizations must pay attention to a variety of skills and how to combine them with technology to ensure success in using predictive analytics. Not all the skills necessary in an analytics-driven organization can be combined in one person, as I discussed in my analysis of analytic personas. We recommend that as organizations focus on the skills discussed above, they consider creating cross-functional teams from both business and technology groups.



To impact business success, Ventana Research recommends viewing predictive analytics as a business investment rather than an IT investment. Our recent benchmark research into next-generation predictive analytics reveals that since our previous research on the topic in 2012, funding has shifted from general business budgets (previously 44%) to line of business IT budgets (previously 19%). Now more than half of organizations fund such projects from business budgets: 29 percent from general business budgets and 27 percent from a line of business IT budget. This shift in buying reflects the mainstreaming of predictive analytics in organizations, which I recently wrote about.

This shift in funding of initiatives coincides with a change in the preferred format for predictive analytics. The research reveals that the share of organizations that prefer to purchase predictive analytics as stand-alone technology is 15 percentage points lower today than in the previous research (29% now vs. 44% then). Instead we find growing demand for predictive analytics tools that can be integrated with operational environments such as business intelligence or transaction applications. More than two in five (43%) organizations now prefer predictive analytics embedded in other technologies. This integration can help businesses respond faster to market opportunities and competitive threats without having to switch applications.

The features most often sought in predictive analytics products further confirm business interest. Usability (very important to 67%) and capability (59%) are the top buying criteria, followed by reliability (52%) and manageability (49%). This is consistent with the priorities of organizations three years ago with one important exception: Manageability was one of the two least important criteria then (33%) but today is nearly tied with reliability for third place. This change makes sense in light of a broader use of predictive analytics and the need to manage an increasing variety of models and input variables.

Further, as a business investment predictive analytics is most often used in front-office functions, but the research shows that IT and operations are closely associated with these functions. The top four areas of predictive analytics use are marketing (48%), operations (44%), IT (40%) and sales (38%). In the previous research operations ranked much lower on the list.

To select the most useful product, organizations must understand where IT and business buyers agree and disagree on what matters. The research shows that they agree closely on how to deploy the tools: Both expressed a greater preference to deploy on-premises (business 53%, IT 55%) but also agree in the number of those who prefer it on demand through cloud computing (business 22%, IT 23%). More than 90 percent on both sides said the organization plans to deploy more predictive analytics, and they also were in close agreement (business 32%, IT 33%) that doing so would have a transformational impact, enabling the organization to do things it couldn’t do before.

However, some distinctions are important to consider, especially when looking at the business case for predictive analytics. Business users more often focus on the benefit of achieving competitive advantage (60% vs. 50% of IT) and creating new revenue opportunities (55% vs. 41%), which are the two benefits most often cited overall. On the other hand, IT professionals more often focus on the benefits of increased upselling and cross-selling (53% vs. 32%), reduced risk (26% vs. 21%) and better compliance (26% vs. 19%); the last two reflect key responsibilities of the IT group.

Despite strong business involvement, when it comes to products, IT, technical and data experts are indispensable for the evaluation and use of predictive analytics. Data scientists or the head of data management are most often involved in recommending (52%) and evaluating (56%) predictive analytics technologies. Reflecting the need to deploy predictive analytics to business units, analysts and IT staff are the next-most influential roles for evaluating and recommending. This involvement of technically sophisticated individuals combined with the movement away from organizations buying stand-alone tools indicates an increasingly team-oriented approach.

Purchase of predictive analytics often requires approval from high up in the organization, which underscores the degree of enterprise-wide interest in this technology. The CEO or president is most likely to be involved in the final decision in small (87%) and midsize (76%) companies. In contrast, large companies rely most on IT management (40%), and very large companies rely most on the CIO or head of IT (60%). We again note the importance of IT in the predictive analytics decision-making process in larger organizations. In the previous research, in large companies IT management was involved in approval in 9 percent of them and the CIO was involved in only 40 percent.

As predictive analytics becomes more widely used, buyers should take a broad view of the design and deployment requirements of the organization and specific lines of business. They should consider which functional areas will use the tools and consider issues involving people, processes and information as well as technology when evaluating such systems. We urge business and IT buyers to work together during the buying process with the common goal of using predictive analytics to deliver value to the enterprise.



Our recently released benchmark research into next-generation predictive analytics shows that in this increasingly important area many organizations are moving forward in the dimensions of information and technology, but most are challenged to find people with the right skills and to align organizational processes to derive business value from predictive analytics.

For those that have done so, the rewards can be significant. One-third of organizations participating in the research said that using predictive analytics leads to transformational change – that is, it enables them to do things they couldn’t do before – and at least half said that it provides competitive advantage or creates new revenue opportunities. Reflecting the momentum behind predictive analytics today, virtually all participants (98%) that have engaged in predictive analytics said that they will be rolling out more of it.

Our research shows that predictive analytics is being used most often in the front offices of organizations, specifically in marketing (48%), operations (44%) and IT (40%). While operations and IT are not often considered front-office functions, we find that they are using predictive analytics in service to customers. For instance, the ability to manage and impact the customer experience by applying analytics to big data is an increasingly important approach that I recently wrote about. As conventional channels of communication give way to digital channels, the use of predictive analytics in operations and IT becomes more valuable for marketing and customer service.

However, the most widespread barrier to making changes in predictive analytics is lack of resources (cited by 52% of organizations), which includes finding the necessary skills to design and deploy programs. The research shows that currently consultants and data scientists are those most often needed. Half the time those designing the system are also the end users of it, which indicates that using predictive analytics still requires advanced skills. Lack of awareness (cited by 48%) is the second-most common barrier; many organizations fail to understand the value of predictive analytics in their business. Some of the reluctance to implement predictive analytics may be because doing so can require significant change. Predictive analytics often represents a new way of thinking and can necessitate revamping of key organizational processes.

From a technical perspective, the most common deployment challenge is difficulty in integrating predictive analytics into the information architecture, an issue cited by half of participants. This is not surprising given the diversity of tools and databases involved in big data. Problems with accessing source data (30%), inappropriate algorithms (26%) and inaccurate results (21%) also impede use. Accessing and normalizing data sources is a significant issue as many different types of data must be incorporated to use predictive analytics optimally. Blending this data and turning it into a clean analytic data set often takes significant effort. Confirming this is the finding that data preparation is the most challenging part of the analytic process for half of the organizations in the research.

Regarding interaction with other established systems, business intelligence is most often the integration point (for 56% of companies). However, predictive analytics also is increasingly embedded in databases and middleware. The ability to perform modeling in databases is important since it enables analysts to work with large data sets and do more timely model updates and scoring. Embedding into middleware has grown fourfold since our previous research on predictive analytics in 2012; this has implications for the emerging Internet of Things (IoT), through which people will interact with an increasing array of devices.

Another sign of the broader adoption of predictive analytics is how and where buying decisions are made. Budgets for predictive analytics are shifting. Since the previous research, funding sourced from general business budgets has declined 9 percent while funding from line-of-business IT budgets has increased 8 percent. This comports with a shift in the form in which organizations prefer to buy predictive analytics, which now is less as a stand-alone product and more embedded in other systems. Usability and functionality are still the top buying criteria, reflecting needs to simplify predictive analytics tools and address the skills gap while still being able to access a range of capabilities.

Overall the research shows that the application of predictive analytics to business processes sets high-performing organizations apart from others. Companies more often achieve competitive advantage with predictive analytics when they support the deployment of predictive analytics in business processes (66% vs. 57% overall), use business intelligence and data warehouse teams to design and deploy predictive analytics (71% vs. 58%) and fund predictive analytics as a shared service (73% vs. 58%). Similarly, those that train employees in the application of predictive analytics to business problems achieve more satisfaction and better outcomes.

Organizations looking to improve their business through predictive analytics should examine what others are doing. Since the time of our previous research, innovation has expanded and there are more peer organizations across industries and business functions that can be emulated. And the search for such innovation need not be limited to within one’s industry; cross-industry examples also can be enlightening. More concretely, the research finds that people and processes are where organizations can improve most in predictive analytics. We advise them to concentrate on streamlining processes, acquiring necessary skills and supporting both with technology available in the market. To begin, develop a practical predictive analytics strategy and enlist all stakeholders in the organization to support initiatives.


Ventana Research

Alteryx has released version 9.0 of Alteryx Analytics, which provides a range of capabilities from data preparation to predictive analytics, in advance of its annual user conference, Inspire 2014. I have covered the company for several years as it has emerged as a key player in providing a range of business analytics from predictive to big data analytics. The importance of this category of analytics is revealed by our latest benchmark research on big data analytics, which finds that predictive analytics is the most important type of big data analytics, ranked first by nearly half (47%) of research participants. Version 9.0 includes new capabilities and integration with a range of new information sources, including read and write capability for IBM SPSS and SAS for a range of analytic needs.

After attending Inspire 2013 last year, I wrote about capabilities that are enabling an emerging business role, which Alteryx calls the data artisan. The label refers to analysts who combine both art and science in using analytics to help direct business outcomes. Alteryx uses an innovative and intuitive approach to analytic tasks, using workflow and linking various data sources through in-memory computation and processing. It takes a “no code” drag-and-drop approach to integrate data from files and databases, prepare data for analysis, and build and score predictive models to yield relevant results. Other vendors in the advanced analytics market are also applying this approach, but few mature tools are currently available. The output of the Alteryx analytic processes can be shared automatically in numerous data formats including direct export into visualization tools such as those from Qlik (new support) and Tableau. This can help users improve their predictive analytics capabilities and take action on the outcomes of analytics, which are the two capabilities most often cited in our research as needed to improve big data analytics.

Alteryx now works with Revolution Analytics to increase the scalability of its system to work with large data sets. The open source language R continues to gain popularity and is being embedded in many business intelligence tools, but it runs only on data that can be loaded into memory. Running only in memory does not address analytics on data sets that run into terabytes and hundreds of millions of values, and potentially requires use of a sub-sampling approach to advanced analytics. With its RevoScaleR, Revolution Analytics rewrites parts of the R algorithms so that the processing tasks can be parallelized and run in big data architectures such as Hadoop. Such capability is important for analytic problems including recommendation engines, unsupervised anomaly detection, some classification and regression problems, and some clustering problems. These analytic techniques are appropriate for some of the top business uses of big data analytics, which according to our research are cross-selling and up-selling (important to 38%), better understanding of individual customers (32%), analyzing all data rather than a sample (30%) and price optimization (28%). Alteryx Analytics automatically detects whether to use RevoScaleR or open source R algorithms. This approach simplifies the technical complexities of scaling R by providing a layer of abstraction for the analytic professional.

Scoring – the ability to input a data record and receive the probability of a particular outcome – is an important if not well understood aspect of predictive analytics. Our research shows that companies that score models on a timely basis according to their needs get better organizational results than those that score all models the same way. Working with Revolution Analytics, Alteryx has enhanced scoring scalability for R algorithms with new capabilities that chunk data in a parallelized fashion. This approach bypasses the memory-only approach to enable a theoretically unlimited number of scores to be processed. For large-scale implementations and consumer applications in industries such as retail, an important target market for Alteryx, these capabilities are becoming important.
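
To make the idea of chunked scoring concrete, here is a minimal Python sketch. It is not Alteryx’s or RevoScaleR’s implementation; the logistic model, the `weights` dictionary and the function names are hypothetical and chosen only for illustration. The point is that records are read and scored one chunk at a time, so the full data set never has to fit in memory, and each chunk could in principle be handed to a separate worker.

```python
from itertools import islice
from math import exp


def chunks(records, size):
    """Yield successive lists of at most `size` records from any iterable."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def score(record, weights, intercept):
    """Score one record with a hypothetical logistic model."""
    z = intercept + sum(weights[k] * record[k] for k in weights)
    return 1.0 / (1.0 + exp(-z))


def score_stream(records, weights, intercept, chunk_size=1000):
    """Score records chunk by chunk; only one chunk is held in memory at a time."""
    for batch in chunks(records, chunk_size):
        # In a parallel setting, each batch could be dispatched to its own worker.
        for rec in batch:
            yield score(rec, weights, intercept)
```

Because `score_stream` is a generator, the input can be a file reader or database cursor of any length; memory use is bounded by `chunk_size` rather than by the size of the data set.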

Alteryx 9.0 also improves on open source R’s default approach to scoring, which is “all or nothing.” That is, if data is missing (a null value) or a new level for a categorical variable is not included in the original model, R will not score the model until the issue is addressed. This process is a particular problem for analysts who want to score data in small batches or individually. In contrast, Alteryx’s new “best effort” approach scores the records that can be run without incident, and those that cannot be run are returned with an error message. This adjustment is particularly important as companies start to deploy predictive analytics into areas such as call centers or within Web applications such as automatic quotes for insurance.
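
The contrast between “all or nothing” and “best effort” scoring can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Alteryx’s code: the model coefficients, the field names (`tenure`, `plan`) and the error messages are all hypothetical. Each record is scored independently, and a record with a missing value or an unseen categorical level comes back with an error message instead of stopping the whole batch.

```python
from math import exp

# Hypothetical fitted model: one numeric field and one categorical field.
COEF = {
    "intercept": -1.0,
    "tenure": 0.1,
    "plan": {"basic": 0.0, "premium": 0.8},  # levels seen when the model was built
}


def score_one(rec):
    """Score a single record; raise ValueError if it cannot be scored."""
    if rec.get("tenure") is None:
        raise ValueError("missing value for 'tenure'")
    plan = rec.get("plan")
    if plan not in COEF["plan"]:
        raise ValueError(f"unseen level for 'plan': {plan!r}")
    z = COEF["intercept"] + COEF["tenure"] * rec["tenure"] + COEF["plan"][plan]
    return 1.0 / (1.0 + exp(-z))


def score_best_effort(records):
    """Score what can be scored; return an error message for the rest."""
    results = []
    for rec in records:
        try:
            results.append({"score": score_one(rec), "error": None})
        except ValueError as err:
            results.append({"score": None, "error": str(err)})
    return results
```

An “all or nothing” scorer would be the same loop without the `try`/`except`: the first bad record would abort the run, which is the behavior that makes small-batch or single-record scoring fragile.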

Alteryx 9.0 also has new predictive modeling tools and functionality. A spline model helps address regression and classification problems such as data reduction and nonlinear relationships and their interactions. It uses a “clear box” approach to serve users with differing objectives and skill levels. The approach exposes the underpinnings of the model so that advanced users can modify a model, but at the same time less sophisticated users can use the model without necessarily understanding all of the intricacies of the model itself. Other capabilities include a Gamma regression tool that allows data to be fit to the Gamma family of distributions using the generalized linear modeling (GLM) framework. Heat plot tools for visualizing joint probability distributions, such as between customer income level and customer advocacy, and more robust A/B testing tools, which are particularly important in digital marketing analytics, are also part of the release.

At the same time, Alteryx has expanded its base of information sources. According to our research, working with all sources of data, not just one, is the most common definition for big data analytics, as stated by three-quarters (76%) of organizations. While structured data from transaction systems and so-called systems of record is still the most important, new data sources including those coming from external sources are becoming important. Our research shows that the most widely used external data sources are cloud applications (54%) and social media data (46%); five additional data sources, including Internet, consumer, market and government sources, are virtually tied in third position (with 39% to 42% each). Alteryx will need to be mindful of best practices in big data analytics as I have outlined to ensure it can stay on top of a growing set of requirements to blend big data but also apply a range of advanced analytics.

New connectors to the social media data provider Gnip give access to social media websites through a single API, and a DataSift connector helps make social media more accessible and easier to analyze for any business need. Other new connectors in 9.0 include those for Foursquare, Google Analytics, Marketo, and Twitter. New data warehouse connectors include those for Amazon Redshift, HP Vertica, Microsoft SQL Server and Pivotal Greenplum. Access to SPSS and SAS data files also is introduced in this version; Alteryx hopes to break down the barriers to entry in accounts dominated by these advanced analytic stalwarts. With already existing connectors to major cloud and on-premises data sources, the company provides a robust integration platform for analytics.

Alteryx is on a solid growth curve as evidenced by the increasing number of inquiries and my conversations with company executives. It’s not surprising given the disruptive potential of the technology itself and its unique analytic workflow technology for data blending and advanced analytics. This data blending and workflow technology is not highlighted enough: it is one of the largest differentiators of the software, and it reduces data-related tasks such as preparing (47%) and reviewing (43%) data that our customer analytics research finds get in the way of analysts performing analytics. Additionally, Alteryx’s ability to apply location analytics within its product is a key differentiator that our research found delivers far more value from analytics than just viewing traditional visualizations and tables of data. Location analytics like that which Alteryx provides also helps rapidly identify areas where customer experience and satisfaction can be improved, which is the top benefit found in our research. The flexible platform resonates particularly well with line-of-business users, especially in fast-moving, lightly regulated industries such as travel, retail and consumer goods where speed of analytics is critical. The work the company is doing with Revolution Analytics and the ability to scale are important for advanced analytics that operate on big data. The ability to seamlessly connect and blend information sources is a critical capability for Alteryx and it’s a wise move to invest further in this area, but Alteryx will need to examine where collaborative technology could be used to help business users work together on analytics within the software. Alteryx will need to continue to adapt to the market demand for analytics and keep focused on varying line-of-business areas so it can continue its growth.
Just about any company involved in analytics today should evaluate Alteryx and see how its unique approach can streamline analytics.


Tony Cosentino

VP and Research Director

Users of big data analytics are finally going public. At the Hadoop Summit last June, many vendors were still speaking of a large retailer or a big bank as users but could not publicly disclose their partnerships. Companies experimenting with big data analytics felt that their proof of concept was so innovative that once it moved into production, it would yield a competitive advantage to the early mover. Now many companies are speaking openly about what they have been up to in their business laboratories. I look forward to attending the 2013 Hadoop Summit in San Jose to see how much things have changed in just a single year for Hadoop-centered big data analytics.

Our benchmark research into operational intelligence, which I argue is another name for real-time big data analytics, shows diversity in big data analytics use cases by industry. The goals of operational intelligence are an interesting mix as the research shows relative parity among managing performance (59%), detecting fraud and security (59%), complying with regulations (58%) and managing risk (58%), but when we drill down into different industries there are some interesting nuances. For instance, healthcare and banking are driven much more by risk and regulatory compliance, services such as retail are driven more by performance, and manufacturing is driven more by cost reduction. All of these make sense given the nature of the businesses. Let’s look at them in more detail.

The retail industry, driven by market forces and facing discontinuous change, is adopting big data analytics out of competitive necessity. The discontinuity comes in the form of online shopping and the need for traditional retailers to supplement their brick-and-mortar locations. JCPenney and Macy’s provide a sharp contrast in how two retailers approached this challenge. A few years ago, the two companies eyed a similar competitive space, but since that time, Macy’s has implemented systems based on big data analytics and is now sourcing locally for online transactions and can optimize pricing of its more than 70 million SKUs in just one hour using SAS High Performance Analytics. The Macy’s approach has, in Sun Tzu-like fashion, made the “showroom floor” disadvantage into a customer experience advantage. JCPenney, on the other hand, used gut-feel management decisions based on classic brand merchandising strategies and ended up alienating its customers and generating lawsuits and a well-publicized apology to its customers. Other companies including Sears are doing similarly innovative work with suppliers such as Teradata and innovative startups like Datameer in data hub architectures built around Hadoop.

Healthcare is another interesting market for big data, but the dynamics that drive it are less about market forces and more about government intervention and compliance issues. Laws around HIPAA, the recent Affordable Care Act, ICD-10 and the HITECH Act of 2009 all have implications for how these organizations implement technology and analytics. Our recent benchmark research on governance, risk and compliance indicates that many companies have significant concerns about compliance issues: 53 percent of participants said they are concerned about them, and 42 percent said they are very concerned. Electronic health records (EHRs) are moving healthcare organizations to more patient-centric systems, and one goal of the Affordable Care Act is to use technology to produce better outcomes through what it calls meaningful use standards. Facing this tidal wave of change, companies including IBM analyze historical patterns and link them with real-time monitoring, helping hospitals save the lives of at-risk babies. This use case was made into a now-famous commercial by advertising firm Ogilvy about the so-called data babies. IBM has also shown how cognitive question-and-answer systems such as Watson assist doctors in diagnosis and treatment of patients.

Data blending, the ability to mash together different data sources without having to manipulate the underlying data models, is another analytical technique gaining significant traction. Kaiser Permanente is able to use tools from Alteryx, which I have assessed, to consolidate diverse data sources, including unstructured data, to streamline operations to improve customer service. The two organizations made a joint presentation similar to the one here at Alteryx’s user conference in March.

Financial services, which my colleague Robert Kugel covers, is being driven by a combination of regulatory forces and competitive market forces on the sales end. Regulations produce a lag in the adoption of certain big data technologies, such as cloud computing, but areas such as fraud and risk management are being revolutionized by the ability, provided through in-memory systems, to look at every transaction rather than only a sampling of transactions through traditional audit processes. Furthermore, the ability to pair advanced analytical algorithms with in-memory real-time rules engines helps detect fraud as it occurs, and thus criminal activity may be stopped at the point of transaction. On a broader scale, new risk management frameworks are becoming the strategic and operational backbone for decision-making in financial services.

On the retail banking side, copious amounts of historical customer data from multiple banking channels combined with government data and social media data are providing banks the opportunity to do microsegmentation and create unprecedented customer intimacy. Big data approaches to micro-targeting and pricing algorithms, which Rob recently discussed in his blog on Nomis, enable banks and retailers alike to target individuals and customize pricing based on an individual’s propensity to act. While partnerships in the financial services arena are still held close to the vest, the universal financial services providers – Bank of America, Citigroup, JPMorgan Chase and Wells Fargo – are making considerable investments into all of the above-mentioned areas of big data analytics.

Industries other than retail, healthcare and banking are also seeing tangible value in big data analytics. Governments are using it to provide proactive monitoring and responses to catastrophic events. Product and design companies are leveraging big data analytics for everything from advertising attribution to crowdsourcing of new product innovation. Manufacturers are preventing downtime by studying interactions within systems and predicting machine failures before they occur. Airlines are recalibrating their flight routing systems in real time to avoid bad weather. From hospitality to telecommunications to entertainment and gaming, companies are publicizing their big data-related success stories.

Our research shows that until now, big data analytics has primarily been the domain of larger, digitally advanced enterprises. However, as use cases make their way through business and their tangible value is accepted, I anticipate that the activity around big data analytics will increase with companies that reside in the small and midsize business market. At this point, just about any company that is not considering how big data analytics may impact its business faces an unknown and uneasy future. What a difference a year makes, indeed.


Tony Cosentino

VP and Research Director

Our benchmark research into business technology innovation found that participants consider analytics the most important new technology for improving their organization’s performance; they ranked big data only fifth out of six choices. This and other findings indicate that the best way for big data to contribute value to today’s organizations is to be paired with analytics. Recently, I wrote about what I call the four pillars of big data analytics on which the technology must be built. These areas are the foundation of big data and information optimization, predictive analytics, right-time analytics and the discovery and visualization of analytics. These components gave me a framework for looking at Teradata’s approach to big data analytics during the company’s analyst conference last week in La Jolla, Calif.

The essence of big data is to optimize the information used by the business for whatever type of need, which my colleague has identified as a key value of these investments. Data diversity presents a challenge to most enterprise data warehouse architectures. Teradata has been dealing with large, complex sets of data for years, but today’s different data types are forcing new modes of processing in enterprise data warehouses. Teradata is addressing this issue by focusing on a workload-specific architecture that aligns with MapReduce, statistics and SQL. Its Unified Data Architecture (UDA) incorporates the Hortonworks Hadoop distribution, the Aster Data platform and Teradata’s stalwart RDBMS EDW. The Big Data Analytics appliance that encompasses the UDA framework won our annual innovation award in 2012. The system is connected through InfiniBand and accesses Hadoop’s metadata layer directly through HCatalog. Bringing these pieces together represents the type of holistic thinking that is critical for handling big data analytics; at the same time there are some costs as the system includes two MapReduce processing environments. For more on the UDA architecture, read my previous post on Teradata as well as my colleague Mark Smith’s piece.

Predictive analytics is another foundational piece of big data analytics and one of the top priorities in organizations. However, according to our big data research, it is not available in 41 percent of organizations today. Teradata is addressing it in a number of ways, and at the conference Stephen Brobst, Teradata’s CTO, likened big data analytics to a high-school chemistry classroom that has a chemical closet from which you pull out the chemicals needed to perform an experiment in a separate work area. In this analogy, Hadoop and the RDBMS EDW are the chemical closet, and Aster Data provides the sandbox where the experiment is conducted. With multiple algorithms currently written into the platform and many more promised over the coming months, this sandbox provides a promising big data lab environment. The approach is SQL-centric and as such has its pros and cons. The obvious advantage is that SQL is a declarative language that is easier to learn than procedural languages, and an established skills base exists within most organizations. The disadvantage is that SQL is not the native tongue of many business analysts and statisticians. While it may be easy to call a function within the context of the SQL statement, the same person who can write the statement may not know when and where to call the function. One way for Teradata to expediently address this need is through its existing partnerships with companies like Alteryx, which I wrote about recently. Alteryx provides a user-friendly analytical workflow environment and is establishing a solid presence on the business side of the house. Teradata already works with predictive analytics providers like SAS but should further expand with companies like Revolution Analytics, which I have assessed, that are using R technology to support a new generation of tools.

Teradata is exploiting its advantage with algorithms such as nPath, which shows the path that a customer has taken to a particular outcome such as buying or not buying. According to our big data benchmark research, being able to conduct what-if analysis and predictive analytics are the two most desired capabilities not currently available with big data, as the chart shows. The algorithms that Teradata is building into Aster help address this challenge, but despite customer case studies shown at the conference, Teradata did not clearly demonstrate how this type of algorithm and others seamlessly integrate to address the overall customer experience or other business challenges. While presenters verbalized it in terms of improving churn and fraud models, and we can imagine how the handoffs might occur, the presentations were more technical in nature. As Teradata gains traction with these types of analytical approaches, it will behoove the company to show not just how the algorithms and SQL work but how they are used by business users and analysts who are not as technically savvy.

Another key principle behind big data analytics is timeliness of the analytics. Given the nature of business intelligence and traditional EDW architectures, until now timeliness of analytics has been associated with how quickly queries run. This has been a strength of the Teradata MPP shared-nothing architecture, but other appliance architectures, such as those of Netezza and Greenplum, now challenge Teradata’s hegemony in this area. Furthermore, trends in big data make the situation more complex. In particular, with very large data sets, many analytical environments have replaced the traditional row-level access with column access. Column access is a more natural way for data to be accessed for analytics since it does not have to read through an entire row of data that may not be relevant to the task at hand. At the same time, column-level access has downsides, such as the reduced speed at which you can write to the system; also, as the data set used in the analysis expands to a high number of columns, it can become less efficient than row-level access. Teradata addresses this challenge by providing both row and column access through innovative proprietary access and computation techniques.
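
The trade-off between row and column access can be illustrated with a toy Python sketch. It says nothing about how Teradata or any other vendor actually stores data; the table, field names and helper functions are hypothetical. The same three records are held once as a list of rows and once as a dictionary of columns, which shows why an aggregate over one field touches less data in the columnar layout while appending a record touches more structures.

```python
# Row layout: each record is stored together.
rows = [
    {"id": 1, "region": "west", "revenue": 120.0},
    {"id": 2, "region": "east", "revenue": 80.0},
    {"id": 3, "region": "west", "revenue": 50.0},
]

# Column layout: each field is stored together.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_from_rows(table, field):
    """Aggregate one field; every full row is visited to pick out one value."""
    return sum(record[field] for record in table)


def total_from_columns(table, field):
    """Aggregate one field; only that field's column is read."""
    return sum(table[field])


def append_row(table, record):
    """Row layout: a write is a single append."""
    table.append(record)


def append_columnar(table, record):
    """Column layout: a write touches every column."""
    for name, values in table.items():
        values.append(record[name])
```

Both totals agree, but the columnar version reads a single list, whereas `append_columnar` performs one append per column. That is the write-speed cost the paragraph above describes.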

Exploratory analytics on large, diverse data sets also has a timeliness imperative. Hadoop promises the ability to conduct iterative analysis on such data sets, which is the reason that companies store big data in the first place according to our big data benchmark research. Iterative analysis is akin to the way the human brain naturally functions, as one question naturally leads to another question. However, methods such as Hive, which allows an SQL-like method to access Hadoop data, can be very slow, sometimes taking hours to return a query. Aster enables much faster access and therefore provides a more dynamic interface for iterative analytics on big data.

Timeliness also has to do with incorporating big data in a stream-oriented environment and only 16 percent of organizations are very satisfied with timeliness of events according to our operational intelligence benchmark research. In a use case such as fraud and security, rule-based systems work with complex algorithmic functions to uncover criminal activity. While Teradata itself does not provide the streaming or complex event processing (CEP) engines, it can provide the big data analytical sandbox and algorithmic firepower necessary to supply the appropriate algorithms for these systems. Teradata partners with major players in this space already, but would be well served to further partner with CEP and other operational intelligence vendors to expand its footprint. By the way, these vendors will be covered in our upcoming Operational Intelligence Value Index, which is based on our operational intelligence benchmark research. This same research showed that analyzing business and IT events together was very important in 45 percent of organizations.

The visualization and discovery of analytics is the last foundational pillar, and here Teradata is still a work in progress. While some of the big data visualizations Aster generates show interesting charts, they lack a context to help people interpret the chart. Furthermore, the visualization is not as intuitive and requires the writing and customization of SQL statements. To be fair, most visual and discovery tools today are relationally oriented and Teradata is trying to visualize large and diverse sets of data. Furthermore, Teradata partners with companies including MicroStrategy and Tableau to provide more user-friendly interfaces. As Teradata pursues the big data analytics market, it will be important to demonstrate how it works with its partners to build a more robust and intuitive analytics workflow environment and visualization capability for the line-of-business user. Usability (63%) and functionality (49%) are the top two considerations when evaluating business intelligence systems according to our research on next-generation business intelligence.

Like other large industry technology players, Teradata is adjusting to the changes brought by business technology innovation in just the last few years. Given its highly scalable databases and data modeling – areas that still represent the heart of most companies’ information architectures – Teradata has the potential to pull everything together and leverage its current deployed base. Technologists looking at Teradata’s new and evolving capabilities will need to understand the business use cases and share these with the people in charge of such initiatives. For business users, it is important to realize that big data is more than just visualizing disparate data sets and that greater value lies in setting up an efficient back-end process that applies the right architecture and tools to the right business problem.


Tony Cosentino
VP and Research Director

This year’s Inspire, Alteryx’s annual user conference, featured new developments around the company’s analytics platform. Alteryx CEO Dean Stoecker kicked off the event by talking about the promise of big data, the dissemination of analytics throughout the organization, and the data artisan as the “new boss.” Alteryx coined the term “data artisan” to represent the persona at the center of the company’s development and marketing efforts. My colleague Mark Smith wrote about the rise of the data artisan in his analysis of last year’s event.

President and COO George Mathew keynoted day two, getting into more specifics on the upcoming 8.5 product release. Advancements revolve around improvement in the analytical design environment, embedded search capabilities, the addition of interactive mapping and direct model output into Tableau. The goal is to provide an easier, more intuitive user experience. Our benchmark research into next-generation business intelligence shows buyers consider usability the top buying criterion at 63 percent. The redesigned Alteryx interface boasts a new look for the icons and more standardization across different functional environments. Color coding of the toolbox groups tools according to functions, such as data preparation, analytics and reporting. A new favorites function is another good addition, given that users tend to rely on the same tools depending on their role within the analytics value chain. Users can now look at workflows horizontally and not just vertically, and easily change the orientation if, for example, they are working on an Apple iPad. Version 8.5 allows embedded search and more streamlined navigation, and continues its focus on a role-based application, which my colleague has been advocating for a while. According to the company, 94 percent of its user base demanded interactive mapping; that’s now part of the product, letting users draw a polygon around an area of interest, then integrate it into the analytical application for runtime execution.

The highlight of the talk was the announcement of integration with Tableau 8.0 and the ability to write directly to the software without having to follow the cumbersome process of exporting a file and then reopening it in another application. Alteryx was an alpha partner and worked directly with the code base for Tableau 8.0, which I wrote up a few months ago. The partnership exemplifies the coopetition environment that many companies find themselves in today. While Tableau does some basic prediction, and Alteryx does some basic visual reporting, bringing the companies’ core competencies together into one workflow is much more powerful for the user. Another interesting aspect is the juxtaposition of the two user groups. The visually oriented Tableau group in San Diego seemed much younger and was certainly much louder on the reveals, while the analytically oriented Alteryx group was much more subdued.

Alteryx has been around since 1997, when it was called SRC. It grew up focused on location analytics, which allowed it to establish foundational analytic use cases in vertical areas such as real estate and retail. After changing the company name and focusing more on horizontal analytics, Alteryx is growing fast with backing from, interestingly enough, SAP Ventures. Since the company was already profitable, it used a modest infusion of capital to grow its product marketing and sales functions. The move seems to have paid off. Companies such as Dunkin Brands and Redbox use Alteryx, and the company has made significant inroads with marketing services companies. A number of consulting companies, such as Absolute Data and Capgemini, are using Alteryx for customer and marketing analytics and other use cases. I had an interesting talk with the CEO of a small but important services firm who said that he is being asked to introduce innovative analytical approaches to much larger marketing services and market research firms. He told me that Alteryx is a key part of the solution he’ll be introducing to enable things such as big data analytics.

Alteryx provides value in a few innovative ways that are not new to this release, but that are foundational to the company’s business strategy. First, it marries data integration with analytics, which allows business users who have traditionally worked in a flat-file environment to pull from multiple data sources and integrate information within the context of the Alteryx application. Within that same environment, users can build analytic workflows and publish applications to a private or public cloud. This approach helps address the top obstacles identified in our research into big data analytics, staffing (79%) and training (77%), by giving business users more flexibility to engage in the analytic process.

Alteryx manages an analytics application store called the Analytics Gallery that crowdsources and shares user-created models. These analytical assets can be used internally within an organization or sold on the broader market. Proprietary algorithms can be secured through a black box approach, or made open to allow other users to tweak the analytic code. It’s similar to what companies like Datameer are doing on top of Hadoop, or Informatica in the cloud integration market. The store gives descriptions of what the applications do, such as fuzzy matching or target marketing. Because the store is crowdsourced, the number of applications should proliferate over time, tracking advancements in the R open source project, since R is at the heart of the Alteryx analytic strategy and what it calls clear box analytics. The underlying algorithm can be easily viewed and edited based on permissions established by the data artisan, similar to what we’ve seen with companies such as 1010data. Alteryx 8.5 works with R 3.0, the latest version. On the back end, Alteryx partners with enterprise data warehouse powerhouses such as Teradata, and works with the Hortonworks Hadoop distribution.

I encourage analysts of all stripes to take a look at the Alteryx portfolio. Perhaps start with the Analytics Gallery to get a flavor of what the company does and the type of analytics customers are building and using today. Alteryx can benefit analysts looking to move beyond the limitations of a flat-file analytics environment, and especially marketing analysts who want to marry third-party data from sources such as the US Census Bureau, Experian, TomTom or Salesforce, which Alteryx offers within its product. If you have not seen Alteryx, you should take a look and see how it is changing the way analytic processes are designed and managed.


Tony Cosentino

VP and Research Director
