You are currently browsing the monthly archive for January 2014.

Paxata, a new data and analytics software provider, says it wants to address one of the most pressing challenges facing today’s analysts: simplifying data preparation. This focus on simplification is well aligned with the market’s desire for improved usability, which our benchmark research into Next-Generation Business Intelligence shows is a primary buying consideration in nearly two-thirds (64%) of companies. This trend is driving significant adoption of business-friendly front-end visual and data discovery tools and is part of my research agenda for 2014.

On the back end, however, there is still considerable complexity. Non-traditional relational database systems such as Hadoop and big data appliances address the need to store and to some degree query massive amounts of structured and unstructured data. But the ability to efficiently and effectively blend these data sources and any third-party cloud-based data is still a challenge.

To address this challenge, the front-end analytics tools that are being adopted by analysts and the multitude of back-end database systems must be integrated to deliver high-quality analytic data sets. Today, this is no easy task. My recently released benchmark research into Information Optimization finds that when companies create and deploy information, the largest portions of time are spent on preparing data for analysis (49%) and reviewing data for quality and consistency issues (47%). In fact, our research shows that analysts consistently spend anywhere from 40 percent to 60 percent of their time in the data preparation phase that precedes actual analysis of the data.

Paxata and its Adaptive Data Preparation platform aim to solve the challenge of data preparation by improving the data aggregation, enrichment, quality and governance processes. The platform does this using a spreadsheet paradigm, an approach that should resonate well with business analysts; our research into spreadsheet use in today’s enterprises finds that the majority of them (56%) are resistant to a move away from spreadsheets.

In Paxata’s design, once the data is loaded the software displays the combined dataset in a spreadsheet format and the user then manipulates the rows and columns to accomplish the various data preparation tasks. For instance, to profile the data, the analyst can first use a search box and an autocomplete query to find the data of interest and then use color-coded cells and visualization techniques to highlight patterns in the data. For data that may include multiple duplicate records such as addresses, the company includes services that help to sort through these records and make suggestions on what records to combine. This last task may be of particular interest for marketers attempting to combine multiple third-party data sources that list several addresses and names for the same individual.
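To make the duplicate-record idea concrete, here is a minimal Python sketch of how merge suggestions for address records could be generated. It is not Paxata’s method, which the company has not detailed here; the abbreviation table, the field names (id, name, address) and the functions normalize and suggest_merges are all hypothetical and exist only for illustration. The sketch simply groups records whose names and addresses match after basic cleanup.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table, used only for this illustration.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "suite": "ste"}


def normalize(text):
    """Lowercase, drop punctuation, and shorten common address words."""
    words = re.sub(r"[^a-z0-9\s]", "", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def suggest_merges(records):
    """Group records whose normalized name and address match exactly."""
    groups = defaultdict(list)
    for record in records:
        key = (normalize(record["name"]), normalize(record["address"]))
        groups[key].append(record["id"])
    # Only groups holding more than one record are merge candidates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample records, as might come from two third-party lists.
records = [
    {"id": 1, "name": "Jane Doe", "address": "12 Main Street"},
    {"id": 2, "name": "jane doe", "address": "12 Main St."},
    {"id": 3, "name": "John Roe", "address": "40 Oak Avenue"},
]
merge_suggestions = suggest_merges(records)
print(merge_suggestions)
```

In this sample, records 1 and 2 are suggested as a pair because they differ only in capitalization and a street abbreviation. A production tool would presumably go further, for example with fuzzy matching for misspellings, and would leave the final decision to the analyst.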

Another key aspect of Paxata’s software is a history function that allows users to return to any step in the data preparation process and make changes on the fly. This ability to explore the lineage of the data enables another interesting function: “Paxata Share.” This capability lets multiple users collaboratively evaluate the differences between data sets by examining the different assumptions that went into processing the data. It is particularly interesting because it has the potential to solve the challenge of “battling boardroom facts,” the situation in which people come to a meeting with different versions of the truth based on the same data sources but different data preparation assumptions.

Under the covers, Paxata’s offering boasts a cloud-based multi-tenant architecture hosted on Rackspace and leveraging the OpenStack platform. The company says its product can comfortably handle big data, processing millions of rows (or about a terabyte) of data in real time. If data sets are larger than this, a batch process can replace the real-time analysis.

In my view, the main value of Paxata’s technology lies in the data analyst time it potentially can save. Much of the functionality it offers involves data discovery driven by the kinds of machine learning algorithms that my colleague Mark Smith discussed in Four Types of Discovery Technology. For instance, the Paxata software will recommend data and metric definitions based on the business context in which the analyst is working (a customer versus a supply chain context, for example), and these recommendations will sharpen as more data runs through the system.

Paxata is off to a great start, though the data connectors its product offers currently are limited; this will improve as it builds out connectors for more data sources. The company will also need to sort through a very noisy marketplace of companies that provide similar services, on-premises or in the cloud, and that all are adapting their messages to address the data preparation challenge. On its website, Paxata lists Cloudera, Qlik Technologies and Tableau as technology partners. The company also lists dozens of information enrichment partners including government organizations and data companies such as Acxiom, DataSift, and Esri. The list of information partners is extensive, which reflects a thoughtful focus on the value of third-party data sources.

Using efficient cloud computing technology, Paxata is able to come out of the gate with aggressive pricing: the company site lists a price of about $300 per month, a fairly small amount given the time saved on a daily, weekly and monthly basis. Such pricing should help adoption, especially among the business analysts that the company targets. Organizations that are struggling with the time they put into the data preparation phase of analytics and those that are looking to leverage outside data sources in new and innovative ways should look into Paxata.


Tony Cosentino

VP and Research Director

Our benchmark research shows that analytics is the top business technology innovation priority; 39% of organizations rank it first. This is no surprise as new information sources and new technologies in data processing, storage, networking, databases and analytic software are combining to offer capabilities for using information never before possible. For businesses, the analytic priority is heightened by intense competition on several fronts; they need to know as much as possible about pricing, strategies, customers and competitors. Within the organization, the IT department and the lines of business continue to debate issues around the analytic skills gap, information simplification, information governance and the rise of time-to-value metrics. Given this backdrop, I expect 2014 to be an exciting year for studying analytic technologies and how they apply to business.

Three key focus areas comprise my 2014 analytics research agenda. The first is business analytics and methods such as discovery and exploratory analytics. This area will be covered in depth in our new research on next-generation business analytics commencing in the first half of 2014. At Ventana Research, we break discovery analytics into visual discovery, data discovery, event discovery and information discovery. The definitions and uses of each type appear in Mark Smith’s analysis of the four discovery technologies. As part of this research, we will examine these exploratory tools and techniques in the context of the analytic skills gap and the new analytic process flows in organizations. The people and process aspects of the research will include how governance and controls are being implemented alongside these innovations. The exploratory analytics space includes business intelligence, which our research shows is still the primary method of deploying information and analytics in organizations. Two upcoming Value Indexes, Mobile Business Intelligence, due out in the first quarter, and Business Intelligence, starting in the second, will provide up-to-date and in-depth evaluations and rankings of vendors in these categories.

My second agenda area is big data and predictive analytics. The first research on this topic will be released in the first quarter of the year as benchmark research on big data analytics. This fresh and comprehensive research maps to my analysis of the four pillars of big data analytics, a framework for thinking about big data and the associated analytic technologies. This research also has depth in the areas of predictive analytics and big data approaches in use today. In addition to that benchmark research, we will conduct the first-of-its-kind Big Data Analytics Value Index, which will assess the major players applying analytics to big data. Real-time and right-time big data also is called operational intelligence, an area Ventana Research has pioneered over the years. Our Operational Intelligence Value Index, which will be released in the first quarter, builds on our benchmark research on the topic and evaluates vendors of software that helps companies do real-time analytics against large streams of data.

The third focus area is information simplification and cloud-based business analytics including business intelligence. In our recently released benchmark research on information optimization, nearly all (97%) organizations said it is important or very important to simplify information access for both their business and their customers. Paradoxically, at the same time the technology landscape is getting more fragmented and complex; in order to simplify, software design will need innovative uses of analytic technology to mask the underlying complexity through layers of abstraction. In particular, users need the areas of sourcing data and preparing data for analysis to be simplified and made more flexible so they can devote less time to these tasks and more to the actual analysis. Part of the challenge in information optimization and integration is to analyze data that originates in the cloud or has been moved there. This issue has important implications for debates around information presentation, the semantic web, where analytics are executed, and whether business intelligence will move to the cloud in any more than a piecemeal fashion. We’ll explore these topics in benchmark research on business intelligence and analytics in the cloud, which is planned for the second half of 2014. In 2013 we released research on the use of geography for the presentation and processing of data, which we refer to as location analytics.

Analytics as a business discipline is getting hotter as we move forward in the 21st century, and I am thrilled to be part of the analytics community. I welcome any feedback you have on my research agenda and look forward to continuing to provide research, education and collaboration in 2014.


Tony Cosentino

VP and Research Director
