
Splunk’s annual gathering, this year called .conf 2015, in late September hosted almost 4,000 Splunk customers, partners and employees. It is one of the fastest-growing user conferences in the technology industry. The area dedicated to Splunk partners has grown from a handful of booths a few years ago to a vast showroom floor many times larger. While the conference’s main announcement was the release of Splunk Enterprise 6.3, its flagship platform, the progress the company is making in the related areas of machine learning and the Internet of Things (IoT) most caught my attention.

Splunk’s strength is its ability to index, normalize, correlate and query data throughout the technology stack, including applications, servers, networks and sensors. It uses distributed search that enables correlation and analysis of events across local- and wide-area networks without moving vast amounts of data. Its architectural approach unifies cloud and on-premises implementations and provides extensibility for developers building applications. Originally, Splunk provided an innovative way to troubleshoot complex technology issues, but over time new uses for Splunk-based data have emerged, including digital marketing analytics, cyber security, fraud prevention and connecting digital devices in the emerging Internet of Things. Ventana Research has covered Splunk since its establishment in the market, most recently in this analysis of mine.

Splunk’s experience in dealing directly with distributed, time-series data and processes on a large scale puts it in position to address the Internet of Things from an industrial perspective. This sort of data is at the heart of large-scale industrial control systems, but it often arrives in different formats and over different protocols. For instance, sensor technology and control systems that were invented 10 to 20 years ago use very different technology than modern systems. Furthermore, as with computer technology, there are multiple layers in stack models that have to communicate. Splunk’s tools help engineers and systems analysts cross-reference these disparate systems in the same way that they query computer system and network data; however, the systems can be vastly different. To address this challenge, Splunk turns to its partners and its extensible platform. For example, Kepware has developed plug-ins that use its more than 150 communication drivers so users can stream real-time industrial sensor and machine data directly into the Splunk platform. Currently, the primary value drivers for organizations in this field of the industrial IoT are operational efficiency, predictive maintenance and asset management. At the conference, Splunk showcased projects in these areas including one with Target that uses Splunk to improve operations in robotics and manufacturing.

For its part, Splunk is taking a multipronged approach by acquiring companies, investing in internal development and enabling its partner ecosystem to build new products. One key enabler of its approach to IoT is machine learning algorithms built on the Splunk platform. In machine learning a model can use new data to continuously learn and adapt its answers to queries. This differs from conventional predictive analytics, in which users build models and validate them based on a particular sample; the model does not adapt over time. With machine learning, for instance, if a piece of equipment or an automobile shows a certain optimal pattern of operation over time, an algorithm can identify that pattern and build a model for how that system should behave. When the equipment begins to act in a less optimal or anomalous way, the system can alert a human operator that there may be a problem, or in a machine-to-machine situation, it can invoke a process to solve the problem or recalibrate the machine.
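The idea of learning a pattern of normal operation and flagging deviations can be illustrated with a minimal sketch. It is not Splunk’s algorithm; it assumes a single numeric sensor reading, a running mean and standard deviation (Welford’s method) as the "model," and a hypothetical threshold of three standard deviations. The names RunningBaseline and monitor are invented for this illustration.

```python
import math


class RunningBaseline:
    """Keeps a running mean and variance so the model adapts as new data arrives."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Welford's online update: no need to store past readings.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def is_anomalous(self, x, threshold=3.0):
        s = self.std()
        if self.n < 2 or s == 0.0:
            return False  # not enough history to judge
        return abs(x - self.mean) / s > threshold


def monitor(readings, warmup=5, threshold=3.0):
    """Return (index, value) pairs that deviate from the learned baseline."""
    baseline = RunningBaseline()
    alerts = []
    for i, x in enumerate(readings):
        if i >= warmup and baseline.is_anomalous(x, threshold):
            alerts.append((i, x))  # alert an operator; do not learn from it
        else:
            baseline.update(x)     # normal reading: keep adapting the model
    return alerts


# Hypothetical temperature readings with one out-of-pattern value.
readings = [70.1, 69.8, 70.4, 70.0, 69.9, 70.2, 75.0, 70.3, 69.7]
alerts = monitor(readings)
```

In this sketch the flagged reading is kept out of the baseline so that one fault does not shift the notion of normal; a machine-to-machine setup could invoke a recalibration routine at the point where the alert is appended.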

Machine learning algorithms allow event processes to be audited, analyzed and acted upon in real time. They enable predictive capabilities for maintenance, transportation and logistics, and asset management and can also be applied in more people-oriented domains such as fraud prevention, security, business process improvement, and digital products. IoT potentially can have a major impact on business processes, but only if organizations can realign systems to discover-and-adapt rather than model-and-apply approaches. For instance, processes are often carried out in an uneven fashion different from the way the model was conceived and communicated through complex process documentation and systems. As more process flows are directly instrumented and more processes carried out by machines, the ability to model directly based on the discovery of those event flows and to adapt to them (through human learning or machine learning) becomes key to improving organizational processes. Such realignment of business processes, however, often involves broad organizational transformation. Our benchmark research on operational intelligence shows that challenges associated with people and processes, rather than information and technology, most often hold back organizational improvement.

Two product announcements made at the conference illuminate the direction Splunk is taking with IoT and machine learning. The first is User Behavior Analytics (UBA), based on its acquisition of Caspida, which produces advanced algorithms that can detect anomalous behavior within a network. Such algorithms can model internal user behavior, and when behavior deviates from the specified norm, they can generate an alert that can be addressed through investigative processes using Splunk Enterprise Security 4.0. Together, Splunk Enterprise Security 4.0 and UBA won the 2015 Ventana Research CIO Innovation Award. The acquisition of Caspida shows that Splunk is not afraid to acquire companies in niche areas where it can exploit its platform to deliver organizational value. I expect that we will see more such acquisitions of companies with high-value ML algorithms as Splunk carves out specific positions in the emergent markets.

The other product announced is IT Service Intelligence (ITSI), which highlights machine learning algorithms alongside Splunk’s core capabilities. The IT Service Intelligence App is an application in which end users deploy machine learning to see patterns in various IT service scenarios. ITSI can inform and enable multiple business uses such as predictive maintenance, churn analysis, service level agreements and chargebacks. Similar to UBA, it uses anomaly detection to point out issues and enables managers to view highly distributed processes such as claims process data in insurance companies. At this point, however, use of ITSI (like other areas of IoT) may encounter cultural and political issues as organizations deal with changes in the roles of IT and operations management. Splunk’s direction with ITSI shows that the company is staying close to its IT operations knitting as it builds out application software, but such development also puts Splunk into new competitive scenarios where legacy technology and processes may still be considered good enough.

We note that ITSI is built using Splunk’s Machine Learning Toolkit and showcase, which currently is in preview mode. The platform is an important development for the company and fills one of the gaps that I pointed out in its portfolio last year. Addressing this gap enables Splunk and its partners to create services that apply advanced analytics to big data that almost half (45%) of organizations find important. The use of predictive and advanced analytics on big data I consider a killer application for big data; our benchmark research on big data analytics backs this claim: Predictive analytics is the type of analytics most (64%) organizations wish to pursue on big data.

Organizations currently looking at IoT use cases should consider Splunk’s strategy and tools in the context of specific problems they need to address. Machine learning algorithms built for particular industries are key so it is important to understand if the problem can be addressed using prebuilt applications provided by Splunk or one of its partners, or if the organization will need to build its own algorithms using the Splunk machine learning platform or alternatives. Evaluate both the platform capabilities and the instrumentation, the type of protocols and formats involved and how that data will be consumed into the system and related in a uniform manner. Most of all, be sure the skills and processes in the organization align with the technology from an end user and business perspective.


Ventana Research

The concept and implementation of what is called big data are no longer new, and many organizations, especially larger ones, view it as a way to manage and understand the flood of data they receive. Our benchmark research on big data analytics shows that business intelligence (BI) is the most common type of system to which organizations deliver big data. However, BI systems aren’t a good fit for analyzing big data. They were built to provide interactive analysis of structured data sources using Structured Query Language (SQL). Big data includes large volumes of data that does not fit into rows and columns, such as sensor data, text data and Web log data. Such data must be transformed and modeled before it can fit into paradigms such as SQL.

The result is that currently many organizations run separate systems for big data and business intelligence. On one system, conventional BI tools as well as new visual discovery tools act on structured data sources to do fast interactive analysis. In this area analytic databases can use column store approaches and visualization tools as a front end for fast interaction with the data. On other systems, big data is stored in distributed systems such as the Hadoop Distributed File System (HDFS). Tools that use it have been developed to access, process and analyze the data. Commercial distribution companies aligned with the open source Apache Foundation, such as Cloudera, Hortonworks and MapR, have built ecosystems around the MapReduce processing paradigm. MapReduce works well for search-based tasks but not so well for the interactive analytics for which business intelligence systems are known. This situation has created a divide between business technology users, who gravitate to visual discovery tools that provide easily accessible and interactive data exploration, and more technically skilled users of big data tools that require sophisticated access paradigms and elongated query cycles to explore data.

There are two challenges with the MapReduce approach. First, working with it is a highly technical endeavor that requires advanced skills. Our big data analytics research shows that lack of skills is the most widespread reason for dissatisfaction with big data analytics, mentioned by more than two-thirds of companies. To fill this gap, vendors of big data technologies should facilitate use of familiar interfaces including query interfaces and programming language interfaces. For example, our research shows that standard SQL is the most important method for implementing analysis on Hadoop. To deal with this challenge, the distribution companies and others offer SQL abstraction layers on top of HDFS, such as Hive and Cloudera Impala. Companies that I have written about include Datameer and Platfora, whose systems help users interact with Hadoop data via interactive systems such as spreadsheets and multidimensional cubes. With their familiar interaction paradigms such systems have helped increase adoption of Hadoop and enable more than a few experts to access big data systems.

The second challenge is latency. As a batch process MapReduce must sort and aggregate all of the data before creating analytic output. Technologies such as Tez, developed by Hortonworks, and Cloudera Impala aim to address such speed limitations; the first leverages MapReduce, and the other circumvents MapReduce altogether. Adoption of these tools has moved the big data market forward, but challenges remain such as the continuing fragmentation of the Hadoop ecosystem and a lack of standardization in approaches.
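To see why the batch model adds latency, consider a toy, single-machine sketch of the three MapReduce phases. It is an illustration only, not Hadoop code: the function names and the sample Web log lines are invented, and a real cluster distributes each phase across many nodes. The point is that the reduce step cannot emit anything until every mapped pair has been sorted and grouped.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(records):
    """Emit a (key, 1) pair for every token in every record."""
    for line in records:
        for token in line.split():
            yield token.lower(), 1


def shuffle_phase(pairs):
    """Sort ALL pairs by key, then group them; nothing downstream can start earlier."""
    ordered = sorted(pairs, key=itemgetter(0))
    return groupby(ordered, key=itemgetter(0))


def reduce_phase(grouped):
    """Aggregate the values for each key into a final count."""
    return {key: sum(value for _, value in group) for key, group in grouped}


# Hypothetical Web log fragments.
logs = ["GET /home", "GET /cart", "POST /cart"]
counts = reduce_phase(shuffle_phase(map_phase(logs)))
```

Because the sort in shuffle_phase consumes the full input before the first group is produced, an interactive user waits for the whole job, which is the behavior that in-memory and MapReduce-bypassing engines try to avoid.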

An emerging technology holds promise for bridging the gap between big data and BI in a way that can unify big data ecosystems rather than dividing them. Apache Spark, under development since 2010 at the University of California Berkeley’s AMPLab, addresses both usability and performance concerns for big data. It adds flexibility by running on multiple platforms in terms of both clustering (such as Hadoop YARN and Apache Mesos) and distributed storage (for example, HDFS, Cassandra, Amazon S3 and OpenStack’s Swift). Spark also expands the potential uses because the platform includes an SQL abstraction layer (Spark SQL), a machine learning library (MLlib), a graph library (GraphX) and a near-real-time engine (Spark Streaming). Furthermore, Spark can be programmed using modern languages such as Python and Scala. Having all of these components integrated is important because interactive business intelligence, advanced analytics and operational intelligence on big data all can work without dealing with the complexity of having individual proprietary systems that were necessary to do the same things previously.

Because of this potential Spark is becoming a rallying point for providers of big data analytics. It has become the most active Apache project as key open source contributors moved their focus from other Hadoop projects to it. Out of the effort in Berkeley, Databricks was founded for commercial development of open source Apache Spark and has raised more than $46 million. Since the initial release in May 2014 the momentum for Spark has continued to build; major companies have made announcements around Apache Spark. IBM said it will dedicate 3,500 researchers and engineers to develop the platform and help customers deploy it. This is the largest dedicated Spark effort in the industry, akin to the move IBM made in the late 1990s with the Linux open source operating system. Oracle has built Spark into its Big Data Appliance. Microsoft has Spark as an option on its HDInsight big data approach but has also announced Prajna, an alternative approach to Spark. SAP has announced integration with its SAP HANA platform, although it represents “coopetition” for SAP’s in-memory platform. In addition, all the major business intelligence players have built or are building connectors to run on Spark. In time, Spark likely will serve as a data ingestion engine for connecting devices in the Internet of Things (IoT). For instance, Spark can integrate with technologies such as Apache Kafka or Amazon Kinesis to instantly process and analyze IoT data so that immediate action can be taken. In this way, as it is envisioned by its creators, Spark can serve as the nexus of multiple systems.

Because it is a flexible in-memory technology for big data, Spark opens the door to many new opportunities, which in business use include interactive analysis, advanced customer analytics, fraud detection, and systems and network management. At the same time, it is not yet a mature technology and for this reason, organizations considering adoption should tread carefully. While Spark may offer better performance and usability, MapReduce is already widely deployed. For those users, it is likely best to maintain the current approach and not fix what is not broken. For future big data use, however, Spark should be carefully compared to other big data technologies. In this case as well as others, technical skills can still be a concern. Scala, for instance, one of the key languages used with Spark, has little adoption, according to our recent research on next-generation predictive analytics. Manageability is an issue as for any other nascent technology and should be carefully addressed up front. While, as noted, vendor support for Spark is becoming apparent, frequent updates to the platform can mean disruption to systems and processes, so examine the processes for these updates. Be sure that vendor support is tied to meaningful business objectives and outcomes. Spark is an exciting new technology, and for early adopters that wish to move forward with it today, both big opportunities and challenges are in store.


Ventana Research

As I discussed in the state of data and analytics in the cloud recently, usability is a top evaluation criterion for organizations in selecting cloud-based analytics software. Access to data in cloud and on-premises systems is an essential antecedent of usability. It can help business people perform analytic tasks themselves without having to rely on IT. Some tools allow data integration by business users on an ad hoc basis, but to provide an enterprise integration process and a governed information platform, IT involvement is often necessary. Once that is done, though, using cloud-based data for analytics can help by empowering business users and improving communication and processes.

To be able to make the best decisions, organizations need access to multiple integrated data sources. The research finds that the most common data sources are predictable: business applications (51%), business intelligence applications (51%), data warehouses or operational data stores (50%), relational databases (41%) and flat files (33%). Increasingly, though, organizations also are including less structured sources such as semistructured documents (33%), social media (27%) and nonrelational database systems (19%). In addition there are important external data sources, including business applications (for 61%), social media data (48%), Internet information (42%), government sources (33%) and market data (29%). Whether stored in the cloud or locally, data must be normalized and combined into a single data set so that analytics can be performed.

Given the distributed nature of data sources as well as the diversity of data types, information platforms and integration approaches are changing. While more than three in five companies (61%) still do integration primarily between on-premises systems, significant percentages are now doing integration from the cloud to on-premises (47%) and from on-premises to the cloud (39%). In the future, this trend will become more pronounced. According to our research, 85 percent of companies eventually will integrate cloud data with on-premises sources, and 84 percent will do the reverse. We expect that hybrid architectures, a mix of on-premises and cloud data infrastructures, will prevail in enterprise information architectures for years to come while slowly evolving to equality of bidirectional data transfer between the two types.

Further analysis shows that a focus on integrating data for cloud analytics can give organizations competitive advantage. Those who said it is very important to integrate data for cloud-based analytics (42% of participants) also said they are very confident in their ability to use the cloud for analytics (35%); that’s three times more often than those who said integrating data is important (10%) or somewhat important (9%). Those saying that integration is very important also said more often that cloud-based analytics helps their customers, partners and employees in an array of ways, including improved presentation of data and analytics (62% vs. 43% of those who said integration is important or somewhat important), gaining access to many different data sources (57% vs. 49%) and improved data quality and data management (59% vs. 53%). These numbers indicate that organizations that neglect the integration aspects of cloud analytics are likely to be at a disadvantage compared to their peers that make it a priority.

Integration for cloud analytics is typically a manual task. In particular, almost half (49%) of organizations in the research use spreadsheets to manage the integration and preparation of cloud-based data. Yet doing so poses serious challenges: 58 percent of those using spreadsheets said it hampers their ability to manage processes efficiently. While traditional methods may suffice for integrating relatively small and well-defined data sets in an on-premises environment, they have limits when dealing with the scale and complexity of cloud-based data. The research also finds that organizations utilizing newer integration tools are satisfied with them more often than those using older tools. More than three-fourths (78%) of those using tools provided by a cloud applications provider said they are satisfied or somewhat satisfied with them, as are even more (86%) of those using data integration tools designed for cloud computing; by comparison, fewer of those using spreadsheets (56%) or traditional enterprise data integration tools (71%) are satisfied.

This is not surprising. Modern cloud connectors are designed to connect via loosely coupled interfaces that allow cloud systems to share data in a flexible manner. The research thus suggests that for organizations needing to integrate data from cloud-based data sources, switching to modern integration tools can streamline the process.

Overall three-quarters of companies in our research said that it is important or very important to access data from cloud-based sources for analysis. Cloud-based analytics isn’t useful unless the right data can be fed into the analytic process. But without capable tools this is not easy to do. A substantial impediment is that analysts spend the majority of their time in accessing and preparing the data rather than in actual analysis. Complicating the task, each data source can represent a different, possibly complex, data model. Furthermore, the data sets may have varying data formats and interface requirements, which are not easily addressed with legacy integration tools.

Such complexity is the new reality, and new tools and approaches have come to market to address these complexities. For organizations looking to integrate their data for cloud-based analytics, we recommend exploring these new integration processes and technologies.


Ventana Research

PivotLink is a cloud-based provider of business intelligence and analytics that serves primarily retail companies. Its flagship product is Customer PerformanceMETRIX, which I covered in detail last year. Recently, the company released an important update to the product, adding attribution modeling, a type of advanced analytic that allows marketers to optimize spending across channels. For retailers these types of capabilities are particularly important. The explosion of purchase channels introduced by the Internet and competition from online retailers are forcing a more analytic approach to marketing as organizations try to decide where the marketing funds can be spent to best results. Our benchmark research into predictive analytics shows that achieving competitive advantage is the number-one reason for implementing predictive analytics, chosen by two-thirds (68%) of all companies and by even more retail organizations.

Attribution modeling applied to marketing enables users to assign relative monetary and/or unit values to different marketing channels. With so many channels for marketers to choose among to spend their limited resources, it is difficult for them to defend the marketing dollars they allot to channels if they cannot provide analysis of the return on the investment. While attribution modeling has been around for a long time, the explosion of channels to create what PivotLink calls omnichannel marketing is a relatively recent phenomenon. In the past, marketing spend focused on just a few channels such as television, radio, newspapers and billboards. Marketers modeled spending through a type of attribution called market mix models (MMM). These models are built around aggregate data, which is adequate when you have just a few channels to calibrate, but the approach breaks down in the face of a broader environment. Furthermore, the MMM approach does not allow for sequencing of events, which is important in understanding how to direct spending to impact different parts of the purchase funnel. Newer data sources combined with attribution approaches like the ones PivotLink employs increase visibility of consumer behavior on the individual level, which enables a more finely grained approach. While market mix models will persist when only aggregate data is available, the collection of data in multiple forms (as by using big data) will expand the use of individual level models.

PivotLink’s approach allows marketers and analysts to address an important part of attribution modeling: how credit is assigned across channels. Until now, the first click and the last click typically have been given greatest weight. The problem is that the first click can give undue weighting to the higher part of the funnel and the last click undue weighting to the lower end. For instance, customers may go to a display advertisement to become aware of an offer, but later do a search and buy shortly after. In this instance, the last-click model would likely give too much credit to the search and not give enough credit to the display advertisement. While PivotLink does enable assignment by first click and last click (and by equal weighting as well), the option of custom weighting is the most compelling. After choosing that option from the drop-down menu, the marketer sees a slider in which weights can be assigned manually. This is often the preferred method of attribution in today’s business environment because it provides more flexibility and often reflects better the reality of a particular category; however, domain expertise is necessary to apportion the weights wisely. To answer this particular challenge, the PivotLink software offers guidance based on industry best practices on how to weight the credit assignment.
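The four credit-assignment options described above can be compared in a short sketch. This is not PivotLink’s implementation; the function name attribute, the model labels and the example weights are assumptions chosen for illustration. It simply splits the revenue of one purchase across the ordered list of channels the customer touched.

```python
def attribute(touchpoints, revenue, model="equal", weights=None):
    """Split revenue across an ordered list of channel touchpoints.

    model is one of "first_click", "last_click", "equal" or "custom".
    For "custom", weights holds one non-negative number per touchpoint.
    """
    n = len(touchpoints)
    if n == 0:
        return {}
    if model == "first_click":
        shares = [1.0] + [0.0] * (n - 1)
    elif model == "last_click":
        shares = [0.0] * (n - 1) + [1.0]
    elif model == "equal":
        shares = [1.0 / n] * n
    elif model == "custom":
        if weights is None or len(weights) != n or sum(weights) <= 0:
            raise ValueError("custom model needs one positive-sum weight per touchpoint")
        total = sum(weights)
        shares = [w / total for w in weights]  # normalize so shares sum to 1
    else:
        raise ValueError(f"unknown model: {model}")

    credit = {}
    for channel, share in zip(touchpoints, shares):
        # A channel touched more than once accumulates its shares.
        credit[channel] = credit.get(channel, 0.0) + share * revenue
    return credit


# The display-then-search example from the text, with a hypothetical $100 sale.
path = ["display", "search"]
last_click = attribute(path, 100.0, model="last_click")
custom = attribute(path, 100.0, model="custom", weights=[40, 60])
```

Under last click the search channel receives all of the credit and display receives none, while the custom weights shift part of the credit back to the display advertisement; choosing those weights is where the domain expertise comes in.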

Being based in the cloud, PivotLink is able to achieve an aggressive release cycle. Rapid product development is important for the company because its competitive landscape is becoming crowded as on-premises analytics providers port their applications into the cloud and larger vendors look at the midmarket space for incremental growth. PivotLink can counter this by continuing to focus on usability and analytics applications for vertical industries. Attribution modeling is an important feature, and I expect to see PivotLink roll out other compelling analytics as well. Retailers looking for fast time-to-value in analytics and an intuitive system that does not need a statistician or IT involvement should consider PivotLink.


Tony Cosentino

VP and Research Director

Roambi, a supplier of mobile analytics and visualization software, announced the release of a cloud-based version of its product, which allows the company to move beyond the on-premises approach where it is established and into the hands of more business users. Roambi Business enables users to automate data import, create models and refresh data on demand. Furthermore, the company announced a North America Partner Program along with the cloud release. This will encourage ISVs and solution partners to develop for the new product. The move to the cloud is a big one for the company, giving access to a new market in which companies need to deliver business intelligence (BI) to their increasingly mobile workforces.

The challenge of mobility and operating across smartphones and tablets is coming to the forefront of the BI industry, as indicated by our business technology innovation benchmark research, in which mobile technology is ranked as the second-most important innovation (by 15%) in a virtual tie with collaboration; analytics is the only higher ranked innovation (at 39%). With tablet sales likely to surpass PC sales in just a couple of years, the trend toward mobile devices will continue to gather momentum, and vendors of BI applications will need to provide them for these platforms. One way companies and vendors alike are addressing this challenge is to move applications to the cloud.

Roambi was one of the first to embrace industry trends in data visualization and mobile BI, but until now it focused on larger corporations and intersecting to business and on-premises approaches to BI. The company has been successful with deployments in 10 of the Fortune 50 and in eight of the 10 largest pharmaceutical companies. This presence in industries such as pharmaceuticals makes sense in that many of the early uses of mobile BI have been in retail and sales specifically.

Roambi first caught our attention for having a user-friendly approach to BI that helps mobile workers improve productivity by accessing various forms of information through a handheld device. In particular, Microsoft Excel and BI applications can be ported onto mobile devices in the form of report visualizations and flipped through using the native swipe gestures on Apple iOS devices. The broad access of a cloud platform extends the firm’s focus on usability, which our benchmark research into next-generation BI finds to be the top evaluation criterion for 64 percent of potential customers. My colleague Mark Smith has written more on the user-focused nature of Roambi’s products.

Roambi Business is multitenant software as a service hosted on Amazon Web Services. It offers a no-integration API approach to data movement. That is, the API utilizes REST protocols for data exchange. When a request is sent from the Roambi application, responses are returned in JavaScript Object Notation (JSON). (The exception is that results for particularly large request sets are returned in IEEE754 format.) In this way, the API acts as a conduit to transfer data from any JSON-compliant system including Excel, Google spreadsheets and BI applications. Once data is transferred into the Roambi file system, the publishing tool allows users to quickly turn the data into a user-friendly form that is represented on the mobile device. As well, the product empowers an administrator to define user rights, and single sign-on is provided through industry standard SAML 2.0. Security, a big concern for mobile BI, is addressed in a number of ways including remote wipe, file exploration, application pass codes and file recall.

For the cloud version of the software, Roambi has redesigned the entire publishing engine to be HTML5-compliant but still iOS-oriented in that it takes advantage of native iOS gestures. The redesign of the publishing tool set extends to Roambi Flow, an application that enables power users to assemble and group information for presentations, publications or applications. (An example of such output is a briefing book or a digital publication.) This feature is particularly important since a specific data-driven storyline distributed to a group of users often is needed to produce a decision. Currently, the cumbersome cut-and-paste process revolves around data and content produced in Excel and Word and put into PowerPoint for ultimate dissemination through an organization.

A couple of features are not yet available in the cloud edition. Push notifications are important, and with the new architecture I expect to see them soon. According to our next-generation business intelligence benchmark research on mobile BI, alerts and notifications are the most important feature (important to 42% of organizations) and should be a big part of mobile BI. While some interactivity with visualizations is not available in the first release of the cloud edition, the flow of reviewing data is simple and makes metrics easy to examine.

Roambi will face strong competition from other BI vendors aggressively improving their own mobile BI offerings. Vendors of visual discovery software, of traditional BI and of integrated stacks each have a unique position that takes advantage of features like data mashup and broad integration capabilities. The battle for this market will be won only over time, but Roambi has a unique position of its own in terms of ease of use and time-to-value. In fact, the company’s strategic focus on design and the user experience is coinciding with current business discussions and top priorities in the buying trends in the market.

With Roambi’s flagship product now in the cloud, technology that was mostly configurable by teams in larger companies is easily available to anyone, including ISVs. Furthermore, the cloud computing approach allows easier access for the business, requires fewer technical resources and reduces the potential financial impact. For companies looking to deploy business intelligence and analytics quickly to mobile devices while providing ease of use and the ability to communicate not only graphically but with a storyline approach, Roambi is worth a look to see how simple business intelligence can be on mobile technology.


Tony Cosentino

VP and Research Director

ParAccel is a well-funded big data startup, with $64 million invested in the firm so far. Only a few companies can top this level of startup funding, and most of them are service-based rather than product-based companies. Amazon has a 20 percent stake in the company and is making a big bet on the company’s technology to run its Redshift data-warehouse-in-the-cloud initiative. MicroStrategy also uses ParAccel for its cloud offering, but holds no equity in the company.

ParAccel provides a software-based analytical platform that competes in the database appliance market, and as many in the space are increasingly trying to do, it is building analytic processes on top of the platform. On the base level, ParAccel is a massively parallel processing (MPP) database with columnar compression support, which allows for very fast query and analysis times. It is offered either as software or in an appliance configuration which, as we’ll discuss in a moment, is a different approach than many others in the space are taking. It connects with Teradata, Hadoop, Oracle and Microsoft SQL Server databases as well as financial market data such as semi-structured trading data and NYSE data through what the company calls On Demand Integration (ODI). This allows joint analysis through SQL of relational and non-relational data sources. In-database analytics offer more than 600 functions (though places on the company’s website and datasheets still say just over 500).

The company’s latest release, ParAccel 4.0, introduced product enhancements around performance as well as reliability and scalability. Performance enhancements include advanced query optimization that is said to improve aggregation performance 20X by doing “sort-aware” aggregations that track data properties up and down the processing pipeline. ParAccel’s own High Speed Interconnect protocol has been further optimized, reducing data distribution overhead and speeding query processing. Version 4.0 also introduces new algorithms that exploit I/O patterns to pre-fetch data and store it in memory, which again speeds query processing and reduces I/O overhead. The need for scalability is addressed in enhancements that enable the system to scale to 5,000 concurrent connections supporting up to 38,000 users on a single system. Its Hash Join algorithms allow for complex analytics by allowing the number of joins to fit the complexity of the analytic. Finally, interactive workload management introduces a class of persistent queries that allows short-running and long-running queries to run side by side without impacting performance. This is particularly important as the integration of on-demand data sources through the company’s ODI approach could otherwise interfere with more interactive user requirements.
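To make the "sort-aware" idea concrete, here is a minimal Python sketch of the general technique, not ParAccel's implementation, which the source does not describe. It assumes the rows are already known to be sorted by the grouping key, so each group can be summed as it streams past without building a hash table. The function name `streaming_sum` and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def streaming_sum(sorted_rows):
    """Sum values per key, assuming rows arrive already sorted by key.

    Because equal keys are adjacent, each group can be emitted as soon as
    the key changes; no hash table of partial sums is needed.
    """
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


# Hypothetical (region, amount) rows, pre-sorted by region.
rows = [("east", 5), ("east", 7), ("west", 3)]
totals = list(streaming_sum(rows))
```

If the planner could not rely on the sort order, it would have to fall back to a hash-based or sort-then-aggregate plan, which is why tracking that property through the pipeline can pay off.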

The company separates its semi-annual database release cycle from the more iterative analytics release cycle. The new analytic functions released just last month include a number of interesting developments for the company. Text analytics for various feeds supports a variety of use cases, such as social media, customer comment analysis, and insurance and warranty claims. In addition, functions such as sessionization and JSON parsing open a new dimension of analytics for ParAccel, as web data can now be analyzed. The new analytic capabilities allow the company to address a broad class of use cases such as “golden path analysis”, fraud detection, attribution modeling, segmentation and profiling. Interestingly, some of these use cases are of the same character as those seen in the Hadoop world.
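For readers unfamiliar with sessionization, the following Python sketch shows the usual idea: events from the same user are grouped into a session until a gap of inactivity exceeds a timeout. It is an illustration only; the function name `sessionize`, the 30-minute timeout and the sample click data are assumptions, and ParAccel's in-database function may work differently.

```python
from collections import defaultdict


def sessionize(events, timeout_seconds=1800):
    """Group (user_id, timestamp) events into per-user sessions.

    A new session starts whenever the gap since the user's previous event
    exceeds timeout_seconds (an assumed 30-minute default).
    """
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for ts in stamps[1:]:
            if ts - user_sessions[-1][-1] > timeout_seconds:
                user_sessions.append([ts])  # gap too long: open a new session
            else:
                user_sessions[-1].append(ts)
        sessions[user_id] = user_sessions
    return sessions


# Hypothetical clickstream: timestamps in seconds.
clicks = [("a", 0), ("a", 600), ("a", 4000), ("b", 100)]
result = sessionize(clicks)
```

Once events carry a session boundary, path-style analyses such as the "golden path" use case become ordinary grouping and ordering problems.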

So where does ParAccel fit in the broader appliance landscape? According to our benchmark research on big data, more than 35 percent of businesses plan to use appliance technology, but the market is still fragmented. The appliance landscape can be broken down into categories that include hardware and software that run together, software that can be deployed across commodity hardware, and non-relational parallel processing paradigms such as Hadoop. This landscape gets especially interesting when we look at Amazon’s Redshift and the idea of elastic scalability on a relational data warehouse. The lack of elastic scalability in the data warehouse has been a big limitation for business; it has traditionally taken significant money, time and energy to implement.

With its “Right to Deploy” pricing strategy, ParAccel promises the same elasticity as with its on-premises deployments. The new pricing policy removes the traditional per-node pricing obstacles by offering prices based on “unlimited data” and takes into consideration the types of analytics that a company wants to deploy. This strategy may play well against companies that sell their appliances only bundled with hardware. Such vendors will have a difficult time matching ParAccel’s pricing because of their hardware-driven business model. While the offer is likely to get ParAccel invited into more consideration sets, it remains to be seen whether the company wins more deals based on it.

Partnerships with Amazon and MicroStrategy to provide cloud infrastructure produce a halo effect for ParAccel, but the cloud approaches compete against ParAccel’s internal sales efforts. One of the key differentiators for ParAccel as the company competes against the cloud version of itself will be the analytics that are stacked on top of the platform. Since neither the Redshift nor the MicroStrategy cloud offering currently licenses the upper parts of this value stack, customers and prospects will likely hear quite a bit about the library of 600-plus functions and the ability to address advanced analytics for clients. The extensible approach and the fact that the company has built analytics as a first-class object in its database allow the architecture to address speed, scalability and analytic complexity. The one potential drawback, depending on how you look at it, is that the statistical libraries are based on user-defined functions (UDFs) written in a procedural language. While the library integration is seamless to end users and scales well, if a company needs to customize the algorithms, data scientists must go into the underlying procedural programming language to make the changes. The upside is that the broad library of analytics can be used based on the SQL paradigm.

While ParAccel aligns closely with the Hadoop ecosystem in order to source data, the company also seems to be welcoming opportunities to compete with Hadoop. Some of the use cases mentioned above, such as so-called “golden path analysis,” have been presented as key Hadoop analytic use cases. Furthermore, many Hadoop vendors are bringing the SQL access paradigm and traditional BI tools together with Hadoop to mitigate the skills gap in organizations. But if an MPP database like ParAccel that is built natively for relational data is also able to do big data analytics, and is able to deliver a more mature product with similar horizontal scalability and cost structure, the argument for standard SQL analytics on Hadoop becomes less compelling. If ParAccel is right, and SQL is the lingua franca for analytics, then it may be in a good position to fill the so-called skills gap. Our benchmark research on business technology innovations shows that the biggest challenge for organizations deploying big data today revolves around staffing and training, with more than 77 percent of companies claiming that they are challenged in both categories.

ParAccel offers a unique approach in a crowded market. The new pricing policy is a brilliant stroke, as it not only will get the company invited into more bid opportunities, but it moves client conversations away from the technology-oriented three Vs and more to analytics and the business-oriented three Ws. If the company puts pricing pressure on the integrated appliance vendors, it will be interesting to see if any of those vendors begin to separate out their own software and allow it to run on commodity hardware. That would be a hard decision for them, since their underlying business models often rely on an integrated hardware/software strategy. With companies such as MicroStrategy and Amazon choosing it for their underlying analytical platforms, the company is one to watch. Depending on the use case and the organization, ParAccel’s in-database analytics should be readily considered and contrasted with other approaches.


Tony Cosentino

VP and Research Director

Our benchmark research into retail analytics says that only 34 percent of retail companies are satisfied with the process they currently use to create analytics. That’s a 10 percent lower satisfaction score than we found for all industries combined. The dissatisfaction is being driven by underperforming technology that cannot keep up with the dramatic changes that are occurring in the retail industry. Retail analytics lag those in the broader business world, with 71 percent still using spreadsheets as their primary analysis tool. This is significantly higher than other industries and shows the immaturity in the field of retail analytics.

While in the past retailers did not need to be on the cutting edge of analytics, dramatic changes occurring in retail are driving a new analytics imperative:

Manufacturers are forming direct relationships with consumers through communities and e-commerce. These relationships can extend into the store and influence buyers at the point of purchase.  This “pull-through” strategy increases the power and brand equity of the supplier while decreasing the position strength of the retailer. This dynamic is evidenced by JC Penney, which positions itself as a storefront for an entire portfolio of supplier brands. Whereas before the retailer owned the relationship with the consumer, the relationship is now shared between the retailer and its suppliers.

What this means for retail analytics: Our benchmark research shows retail has lagged behind other businesses with respect to analytics. Given the new co-opetition environment with suppliers, retailers must use analytics to compete. Their decreasing brand equity means that they need analytics not just for brand strategy and planning, but also in tactical areas such as merchandising and promotional management. At the same time, retailers are working with ever-increasing amounts of data that are often shared throughout the supply chain to build business cases and to enrich customer experience, and that data is ripe for analysis in service to business goals.

E-commerce is driving a convergence of offline and online retail consumer behavior, forcing change to a historically inert retail analytics culture. As we’ve all heard by now, online retailers such as Amazon threaten the business models of showroom retailers. Some old-line companies are dealing with the change by taking an “if you can’t beat ’em, join ’em” approach. Traditional brick-and-mortar company Walgreens, for instance, acquired and put kiosks in its stores to let customers order out-of-stock items immediately at the same price. However, online retailers, instead of looking to move into a brick-and-mortar environment, are driving their business model back into the data center and forward onto mobile devices. Amazon, for instance, offers Amazon Web Services and Kindle tablet.

What this means for retail analytics: There has historically been a wall between the .com area of a company and the rest of the organization. Companies used mystery shopping to do price checks in physical trade areas and bots to do the same thing over the Internet. Now companies such as Sears are investing heavily to gain full digital transparency into the supply chain so that they can change pricing on the fly. For example, a retailer may choose to undercut a competitor on a specific SKU; then, when its system finds a lack of inventory among competitors for the item, it can automatically increase its price and its margin. Eventually the entire industry, including midtier retailers, will have to focus on how analytics can improve their business.

Retailers are moving the focus of their strategy away from customer acquisition and toward customer retention. We see this change of focus both on the brick-and-mortar side, where loyalty card programs are becoming ubiquitous, and online via key technology enablers such as Google, whose I/O 2012 conference focused on the shift from online customer acquisition to online customer retention.

What this means for retail analytics: As data proliferates, businesses gain the ability to look more closely at how individuals contribute to a company’s revenue and profit. Traditional RFM and attribution approaches are becoming more precise as we move away from aggregate models and begin to look at particular consumer behavior. Analytics can help pinpoint changes in behavior that matter, and more importantly, indicate what organizations can do to retain desired customers or expand share-of-wallet. In addition, software to improve the customer experience within the context of a site visit is becoming more important. This sort of analytics, which might be called a type of online ethnography, is a powerful tool for improving the customer experience and increasing the stickiness of a retailer’s site.
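As a small illustration of the customer-level view described above, the sketch below computes a basic RFM (recency, frequency, monetary) summary per customer from raw transactions. The function name `rfm`, the tuple layout and the sample orders are all hypothetical; real RFM models usually go on to bin these values into scores, which is omitted here.

```python
from datetime import date


def rfm(transactions, today):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount).
    Recency is the number of days since the most recent purchase.
    """
    summary = {}
    for customer_id, purchased_on, amount in transactions:
        days_ago = (today - purchased_on).days
        recency, frequency, monetary = summary.get(customer_id, (days_ago, 0, 0.0))
        summary[customer_id] = (
            min(recency, days_ago),  # keep the most recent purchase
            frequency + 1,           # count of purchases
            monetary + amount,       # total spend
        )
    return summary


# Hypothetical order history.
orders = [
    ("c1", date(2012, 9, 1), 40.0),
    ("c1", date(2012, 9, 20), 60.0),
    ("c2", date(2012, 6, 15), 25.0),
]
scores = rfm(orders, today=date(2012, 9, 30))
```

Here customer c1 bought recently and more than once, while c2 has been inactive for over three months, which is the kind of individual-level difference that aggregate models hide.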

In sum, our research on retail analytics shows that outdated technological and analytical approaches still dominate the retail industry. At the same time, changes in the industry are forcing companies to rethink their strategies, and many companies are addressing these challenges by leveraging analytics to attract and retain the most valued customers. For large firms, the stakes are extremely high, and the decisions around how to implement this strategy can determine not just profitability but potentially their future existence. Retail organizations need to consider investments in new approaches for getting access to analytics. For example, analytics provided via cloud computing and software as a service are becoming more pervasive and can help ensure that retailers meet the capabilities and needs of business roles. Such approaches are a step function above the Excel-based environments that many retailers are living in today.


Tony Cosentino

Vice President and Research Director

With more than 90,000 attendees registered and 100,000 more expected to watch via live stream on Facebook, Salesforce’s Dreamforce is the biggest technology event of this year. The conference kicked off yesterday morning with MC Hammer letting the packed house know that it was “Chatter time” and leaving little doubt about the theme of Marc Benioff’s keynote speech: Social. Citing numbers from McKinsey and IBM, Benioff suggested that social adds $1.3 trillion to the economy and that CEOs see social media as the second most important communication channel of the 21st century, just after the direct sales force. Our own sales benchmark research here at Ventana Research shows similar trends, with 63 percent finding that collaboration is a key trend in sales organizations.

The keynote pronouncements were put in context by a number of clients. Two clients in particular highlighted broad changes occurring in industry.

Rossignol uses Salesforce to exploit social and mobile areas for competitive advantage. Rossignol is a winter sports gear manufacturer, and its target consumers (as well as its dealers, such as REI) are often youthful and cutting-edge. Salesforce allows Rossignol’s sales representatives to adjust offers and sign deals at the point of purchase with mobile devices. It integrates customer social profile information, offers team interaction between the pricing departments and other managers, and lets salespeople revise proposals and sign deals on the spot. With impressive ground game capabilities like these, Rossignol will likely take market share until its competitors can deploy a similar approach.

Salesforce also allows the ski community to be involved with Rossignol’s brand at a visceral level through things like coaching camps and excursions, and provides informal interactions with like-minded skiers. Even more impressive is the involvement of the company’s communities in its two-year product development cycles. This sort of crowdsourcing was unheard-of just a few years ago.

Rossignol represents the impact this type of company and brand is having on the relationships between manufacturers and retailers. Retailers are losing leverage with customers as manufacturers build loyal followings and establish pull-through channel strategies among their end-user customer base. We see similar trends in other markets.

Salesforce also highlighted General Electric, my alma mater. When I was at GE, collaboration consisted of doing things like GE Boundaryless Sales, where reps would share leads across GE Capital as well as with the core divisions. So GE has always been somewhat collaborative, but I was surprised to see how aggressively it is pursuing things such as social collaboration with its sales teams. In particular, it is rolling out Salesforce in the GE Capital division, which in and of itself is a large diversified business. Deal sizes go from a few thousand dollars that may take a couple of weeks, to a seven-year, $100 million deal.

Two interesting things here apply to the broader GE organization, and illustrate how sales is changing as a result of social and mobile. First, the new systems give management a better picture of what is happening and how to manage its sales force. It’s difficult to do forecasting, communication and root-cause analysis with very different buying environments. Internally, these types of diversified organizations are often a Tower of Babel. Social tools provide the equivalent of a common language that can be instituted across the organization.

The second big impact is the inversion of the organizational pyramid. This is not just a GE phenomenon, but is beginning to occur across multiple industries.  The business process used to be about getting information from the sales force into the organization –  for instance, getting pipeline updates to management and rolling them up, or trying to get a salesperson to share his contact information. Now it’s just the opposite; organizations are pushing information down into the hands of the sales folks and empowering them to make deals. This is changing the nature of sales. It used to be that a lone wolf was the ideal salesperson. Now it’s a social collaborator who can act like an orchestra conductor.

As these two companies demonstrate, seismic changes are occurring in organizations and across industries. I strongly encourage companies with medium-sized or large sales forces that haven’t yet moved forward with sales force automation (SFA), or consumer brands that are not actively engaged in community building, to start doing so.

For additional in-depth analysis of the social aspect at Dreamforce as well as the key announcements, read my colleague Mark Smith’s post. Salesforce is helping companies change the way they operate through the use of social and mobile technology in conjunction with cloud computing, and it is definitely worth looking at more closely.


Tony Cosentino

Vice President and Research Director

Our benchmark research on business analytics suggests that it is counterproductive to take a general approach. A better approach is to focus on particular use cases and lines of business (LOB). For this reason, in a series of upcoming articles, I will look at our business analytics research in the context of different industries and different functional areas of an organization, and illustrate how analytics are being applied to solve real business problems.

Our benchmark research on business analytics reveals that 89 percent of organizations find that it is important or very important to make it simpler to provide analytics and metrics. To me, this says that today’s analytic environments are a Tower of Babel. We need more user-friendly tools, collaboration and most of all a common vernacular.

With this last point in mind, let’s start by defining business analytics. Here at Ventana Research, business analytics refers to the application of mathematical computation and models to generate relevant historical and predictive insights that can be used to optimize business- and IT-related processes and decisions. This definition helps us to focus on the technological underpinning of analytics, but more importantly, it focuses us on the outcomes of business and IT processes and decisions.

To provide more context, we might think of the what, the so what and the now what when it comes to information, analytics and decision-making. The what is data or information in its base form. In order to derive meaning, we apply different types of analytics and go through analytical processes. This addresses the so what, or the why should I care about the data. The now what involves decision-making and actions taken on the data; this is where ideas such as operational intelligence and predictive analytics play a big role. I will look to our benchmark research in these areas to help guide the discussion.

It’s important not to think about business analytics in a technological silo removed from the people, process, information and tools that make up the Ventana Maturity Index. In this broader sense, business analytics helps internal teams derive meaning from data and guides their decisions. Our next-generation business intelligence research focuses on collaboration and mobile application of analytics, two key components for making analytics actionable within the organization.

In addition, our research shows a lot of confusion about the terms surrounding analytics. Many users don’t understand scorecards and dashboards, and find discovery, iterative analysis, key performance metrics, root-cause analysis and predictive analytics to be ambiguous terms. We’ll be discussing all of these ideas in the context of business technology innovation, our 2012 business intelligence research agenda and of course our large body of research on technology and business analytics.

Organizations must care about analytics because analytics provides companies with a competitive advantage by showing what their customers want, when they want it and how they want it delivered. It can help reduce inventory carrying costs in manufacturing and retail, fraud in insurance and finance, churn in telecommunications and even violent crime on our streets. The better an organization can integrate data and utilize both internal and external information in a coherent fashion, the greater the value of its analytics.

I hope you enjoy this series and find it useful as you define your own analytics agenda within your organization.


Tony Cosentino

VP & Research Director
