You are currently browsing the monthly archive for March 2013.

ParAccel is a well-funded big data startup, with $64 million invested in the firm so far. Only a few companies can top this level of startup funding, and most of them are service-based rather than product-based companies. Amazon has a 20 percent stake in the company and is making a big bet on its technology to power Redshift, Amazon’s cloud data warehouse initiative. MicroStrategy also uses ParAccel for its cloud offering, but holds no equity in the company.

ParAccel provides a software-based analytical platform that competes in the database appliance market, and like many in the space, it is increasingly building analytic processes on top of the platform. At the base level, ParAccel is a massively parallel processing (MPP) database with columnar compression support, which allows for very fast query and analysis times. It is offered either as software or in an appliance configuration, which, as we’ll discuss in a moment, is a different approach from the one many others in the space are taking. It connects with Teradata, Hadoop, Oracle and Microsoft SQL Server databases, as well as financial market data such as semi-structured trading data and NYSE data, through what the company calls On Demand Integration (ODI). This allows joint analysis through SQL of relational and non-relational data sources. In-database analytics offer more than 600 functions (though places on the company’s website and datasheets still say just over 500).

The company’s latest release, ParAccel 4.0, introduced product enhancements around performance as well as reliability and scalability. Performance enhancements include advanced query optimization that is said to improve aggregation performance 20x through “sort-aware” aggregations that track data properties up and down the processing pipeline. ParAccel’s own High Speed Interconnect protocol has been further optimized, reducing data distribution overhead and speeding query processing. Version 4.0 also introduces new algorithms that exploit I/O patterns to pre-fetch data and store it in memory, which again speeds query processing and reduces I/O overhead. Scalability is addressed by enhancements that enable the system to handle 5,000 concurrent connections, supporting up to 38,000 users on a single system. Its hash join algorithms allow for complex analytics by letting the number of joins fit the complexity of the analytic. Finally, interactive workload management introduces a class of persistent queries that allows short-running queries and long-running queries to run side by side without impacting performance. This is particularly important because the integration of on-demand data sources through the company’s ODI approach could otherwise interfere with more interactive user requirements.
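The hash join mentioned above is a standard building block of MPP query engines. As a rough illustration of the general technique (a generic sketch, not ParAccel’s implementation), a hash join builds an in-memory table on the smaller input and then probes it with each row of the larger input:

```python
# Minimal illustration of the classic hash-join technique: build a hash table
# on the smaller relation, then probe it with the larger one. Table and column
# names here are made up for illustration.
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    # Build phase: index the smaller relation by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Probe phase: stream the larger relation and emit merged matches.
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            yield {**match, **row}

customers = [{"cust_id": 1, "name": "Acme"}, {"cust_id": 2, "name": "Globex"}]
orders = [{"order_id": 10, "cust_id": 1}, {"order_id": 11, "cust_id": 1},
          {"order_id": 12, "cust_id": 3}]

# Two orders match customer 1; order 12 has no matching customer.
joined = list(hash_join(customers, orders, "cust_id", "cust_id"))
```

In an MPP system this same build-and-probe step runs on each node after rows are redistributed by join key, which is why interconnect and data distribution overhead matter so much to join performance.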

The company separates its semi-annual database release cycle from the more iterative analytics release cycle. The analytic functions released just last month include a number of interesting developments for the company. Text analytics for various feeds supports a variety of use cases, such as social media, customer comment analysis, and insurance and warranty claims. In addition, functions such as sessionization and JSON parsing open a new dimension of analytics for ParAccel, as web data can now be analyzed. The new analytic capabilities allow the company to address a broad class of use cases such as “golden path analysis,” fraud detection, attribution modeling, segmentation and profiling. Interestingly, some of these use cases are of the same character as those seen in the Hadoop world.
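Sessionization, for example, assigns each user’s clickstream events to visits separated by periods of inactivity. A minimal sketch of the idea, with an assumed 30-minute timeout (ParAccel’s actual function and parameters may differ):

```python
# Hedged sketch of sessionization: assign each user's time-ordered events to
# sessions, starting a new session whenever the gap since the previous event
# exceeds a timeout. The 30-minute threshold is an illustrative assumption.
from datetime import datetime, timedelta

def sessionize(events, timeout=timedelta(minutes=30)):
    """events: list of (user, timestamp); returns (user, timestamp, session_id)."""
    events = sorted(events)                 # order by user, then time
    out, session_id, prev = [], 0, None
    for user, ts in events:
        if prev is None or user != prev[0] or ts - prev[1] > timeout:
            session_id += 1                 # gap or new user starts a session
        out.append((user, ts, session_id))
        prev = (user, ts)
    return out

t = datetime(2013, 3, 1, 9, 0)
log = [("alice", t), ("alice", t + timedelta(minutes=10)),
       ("alice", t + timedelta(hours=2)), ("bob", t)]

# alice's first two events share a session; the 2-hour gap opens a new one,
# and bob's events are always sessionized separately.
labeled = sessionize(log)
```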

So where does ParAccel fit in the broader appliance landscape? According to our benchmark research on big data, more than 35 percent of businesses plan to use appliance technology, but the market is still fragmented. The appliance landscape can be broken down into categories that include hardware and software that run together, software that can be deployed across commodity hardware, and non-relational parallel processing paradigms such as Hadoop. This landscape gets especially interesting when we look at Amazon’s Redshift and the idea of elastic scalability on a relational data warehouse. The lack of elastic scalability in the data warehouse has been a big limitation for business; it has traditionally taken significant money, time and energy to implement.

With its “Right to Deploy” pricing strategy, ParAccel promises the same elasticity with its on-premises deployments. The new pricing policy removes the traditional per-node pricing obstacles by offering prices based on “unlimited data” and takes into consideration the types of analytics a company wants to deploy. This strategy may play well against companies that only sell their appliances bundled with hardware. Such vendors will have a difficult time matching ParAccel’s pricing because of their hardware-driven business model. While the offer is likely to get ParAccel invited into more consideration sets, it remains to be seen whether it wins more deals based on it.

Partnerships with Amazon and MicroStrategy to provide cloud infrastructure produce a halo effect for ParAccel, but the cloud approaches compete against ParAccel’s internal sales efforts. One of the key differentiators for ParAccel as it competes against the cloud version of itself will be the analytics stacked on top of the platform. Since neither Redshift nor MicroStrategy’s cloud offering currently licenses the upper parts of this value stack, customers and prospects will likely hear quite a bit about the library of 600-plus functions and the ability to address advanced analytics for clients. The extensible approach and the fact that the company has built analytics as a first-class object in its database allow the architecture to address speed, scalability and analytic complexity. The one potential drawback, depending on how you look at it, is that the statistical libraries are based on user-defined functions (UDFs) written in a procedural language. While the library integration is seamless to end users and scales well, if a company needs to customize the algorithms, data scientists must go into the underlying procedural programming language to make the changes. The upside is that the broad library of analytics can be used through the SQL paradigm.

While ParAccel aligns closely with the Hadoop ecosystem in order to source data, the company also seems to be welcoming opportunities to compete with Hadoop. Some of the use cases mentioned above, such as so-called “golden path analysis,” have been promoted as key Hadoop analytic use cases. Furthermore, many Hadoop vendors are bringing the SQL access paradigm and traditional BI tools together with Hadoop to mitigate the skills gap in organizations. But if an MPP database like ParAccel that is built natively for relational data can also do big data analytics, and can deliver a more mature product with similar horizontal scalability and cost structure, the argument for standard SQL analytics on Hadoop becomes less compelling. If ParAccel is right, and SQL is the lingua franca for analytics, then it may be in a good position to fill the so-called skills gap. Our benchmark research on business technology innovations shows that the biggest challenge for organizations deploying big data today revolves around staffing and training, with more than 77 percent of companies claiming that they are challenged in both categories.

ParAccel offers a unique approach in a crowded market. The new pricing policy is a brilliant stroke, as it not only will get the company invited into more bid opportunities, but it moves client conversations away from the technology-oriented three Vs and more to analytics and the business-oriented three Ws. If the company puts pricing pressure on the integrated appliance vendors, it will be interesting to see if any of those vendors begin to separate out their own software and allow it to run on commodity hardware. That would be a hard decision for them, since their underlying business models often rely on an integrated hardware/software strategy. With companies such as MicroStrategy and Amazon choosing it for their underlying analytical platforms, the company is one to watch. Depending on the use case and the organization, ParAccel’s in-database analytics should be readily considered and contrasted with other approaches.


Tony Cosentino

VP and Research Director

This year’s Inspire, Alteryx’s annual user conference, featured new developments around the company’s analytics platform. Alteryx CEO Dean Stoecker kicked off the event by talking about the promise of big data, the dissemination of analytics throughout the organization, and the data artisan as the “new boss.” Alteryx coined the term “data artisan” to represent the persona at the center of the company’s development and marketing efforts. My colleague Mark Smith wrote about the rise of the data artisan in his analysis of last year’s event.

President and COO George Mathew keynoted day two, getting into more specifics on the upcoming 8.5 product release. Advancements revolve around improvements in the analytical design environment, embedded search capabilities, the addition of interactive mapping and direct model output into Tableau. The goal is to provide an easier, more intuitive user experience. Our benchmark research into next-generation business intelligence shows buyers consider usability the top buying criterion, cited by 63 percent. The redesigned Alteryx interface boasts a new look for the icons and more standardization across different functional environments. Color coding of the toolbox groups tools according to function, such as data preparation, analytics and reporting. A new favorites function is another good addition, given that users tend to rely on the same tools depending on their role within the analytics value chain. Users can now look at workflows horizontally and not just vertically, and easily change the orientation if, for example, they are working on an Apple iPad. Version 8.5 allows embedded search and more streamlined navigation, and continues its focus on a role-based application, which my colleague has been advocating for a while. According to the company, 94 percent of its user base demanded interactive mapping; that’s now part of the product, letting users draw a polygon around an area of interest, then integrate it into the analytical application for runtime execution.

The highlight of the talk was the announcement of integration with Tableau 8.0 and the ability to write directly to the software without having to follow the cumbersome process of exporting a file and then reopening it in another application. Alteryx was an alpha partner and worked directly with the code base for Tableau 8.0, which I wrote up a few months ago. The partnership exemplifies the coopetition environment that many companies find themselves in today. While Tableau does some basic prediction and Alteryx does some basic visual reporting, bringing the companies’ core competencies together into one workflow is much more powerful for the user. Another interesting aspect is the juxtaposition of the two user groups. The visually oriented Tableau group in San Diego seemed much younger and was certainly much louder on the reveals, while the analytically oriented Alteryx group was much more subdued.

Alteryx has been around since 1997, when it was called SRC. It grew up focused around location analytics, which allowed it to establish foundational analytic use cases in vertical areas such as real estate and retail. After changing the company name and focusing more on horizontal analytics, Alteryx is growing fast with backing from, interestingly enough, SAP Ventures. Since the company was already profitable, it used a modest infusion of capital to grow its product marketing and sales functions. The move seems to have paid off. Companies such as Dunkin Brands and Redbox use Alteryx and the company has made significant inroads with marketing services companies.  A number of consulting companies, such as Absolute Data and Capgemini, are using Alteryx for customer and marketing analytics and other use cases. I had an interesting talk with the CEO of a small but important services firm who said that he is being asked to introduce innovative analytical approaches to much larger marketing services and market research firms. He told me that Alteryx is a key part of the solution he’ll be introducing to enable things such as big data analytics.

Alteryx provides value in a few innovative ways that are not new to this release, but that are foundational to the company’s business strategy. First, it marries data integration with analytics, which allows business users who have traditionally worked in a flat-file environment to pull from multiple data sources and integrate information within the context of the Alteryx application. Within that same environment, users can build analytic workflows and publish applications to a private or public cloud. This approach helps address the top obstacles found in our research on big data analytics, staffing (79%) and training (77%), by giving business users more flexibility to engage in the analytic process.

Alteryx manages an analytics application store called the Analytics Gallery that crowdsources and shares user-created models. These analytical assets can be used internally within an organization or sold on the broader market. Proprietary algorithms can be secured through a black box approach, or made open to allow other users to tweak the analytic code. It’s similar to what companies like Datameer are doing on top of Hadoop, or Informatica in the cloud integration market. The store gives descriptions of what the applications do, such as fuzzy matching or target marketing. Because the gallery is crowdsourced, the number of applications should proliferate over time, tracking advancements in the open source R project, since R is at the heart of the Alteryx analytic strategy and what it calls clear box analytics. The underlying algorithm is easily viewed and edited based on permissions established by the data artisan, similar to what we’ve seen with companies such as 1010data. Alteryx 8.5 works with R 3.0, the latest version. On the back end, Alteryx partners with enterprise data warehouse powerhouses such as Teradata, and works with the Hortonworks Hadoop distribution.
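Fuzzy matching, one of the application types listed in the gallery, scores how similar two strings are so that near-duplicate records (for instance, variant spellings of the same customer name) can be linked. A minimal sketch of the general technique using Python’s standard library; this is an illustration, not Alteryx’s algorithm, and the 0.8 threshold is an assumption:

```python
# Score candidate strings against a query name and keep those above a
# similarity threshold. SequenceMatcher.ratio() returns a value in [0, 1].
from difflib import SequenceMatcher

def fuzzy_match(name, candidates, threshold=0.8):
    """Return (candidate, score) pairs whose similarity meets the threshold."""
    scored = [(c, SequenceMatcher(None, name.lower(), c.lower()).ratio())
              for c in candidates]
    return [(c, round(score, 2)) for c, score in scored if score >= threshold]

customers = ["Acme Corporation", "Acme Corp", "Globex Inc", "ACME CORP."]

# "Acme Corp" and its close variant match; "Globex Inc" does not, and the
# longer "Acme Corporation" falls below the 0.8 cutoff at this threshold.
matches = fuzzy_match("Acme Corp", customers)
```

In practice a tool would combine such scores with blocking and normalization steps; the point here is only the scoring-and-thresholding pattern.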

I encourage analysts of all stripes to take a look at the Alteryx portfolio. Perhaps start with the Analytics Gallery to get a flavor of what the company does and the type of analytics customers are building and using today.  Alteryx can benefit analysts looking to move beyond the limitations of a flat-file analytics environment, and especially marketing analysts who want to marry third-party data from sources such as the US Census Bureau, Experian, TomTom or Salesforce, which Alteryx offers within its product. If you have not seen Alteryx, you should take a look and see how they are changing the way analytic processes are designed and managed.


Tony Cosentino

VP and Research Director

SAS Institute held its 24th annual analyst summit last week in Steamboat Springs, Colorado. The 37-year-old privately held company is a key player in big data analytics, and company executives showed off their latest developments and product roadmaps. In particular, LASR Analytical Server and Visual Analytics 6.2, which is due to be released this summer, are critical to SAS’ ability to secure and expand its role as a preeminent analytics vendor in the big data era.

For SAS, the competitive advantage in big data rests in predictive analytics. According to our benchmark research into predictive analytics, 55 percent of businesses say the challenge of architectural integration is a top obstacle to rolling out predictive analytics in the organization. Integration of analytics is particularly daunting in a big-data-driven world, since analytics processing has traditionally taken place on a platform separate from where the data is stored, but now the two must come together. How data is moved into parallelized systems and how analytics are consumed by business users are key questions in the market today that SAS is looking to address with LASR and Visual Analytics.

Jim Goodnight, the company’s founder and plainspoken CEO, says he saw the industry changing a few years ago. He speaks of a large bank doing a heavy analytical risk computation that took upwards of 18 hours, which meant that the results were not ready in time for the next trading day. To gain competitive advantage, the time window needed to be reduced, but running the analytics in a serialized fashion was a limiting factor. This led SAS to begin parallelizing the company’s workhorse procedures, some of which were first developed upwards of 30 years ago. Goodnight also noted that building these parallelized statistical models is no easy task. One of the biggest hurdles is getting the mathematicians and data scientists who build these elaborate models to think in terms of the new parallelized architectural paradigm.

Its Visual Analytics software is a key component of the SAS big data analytics strategy. Our latest business technology innovation benchmark research found that close to half (48%) of organizations present business analytics visually. Visual Analytics, which was introduced early last year, is a cloud-based offering running on the LASR in-memory analytic engine and the Amazon Web Services infrastructure. This web-based approach allows SAS to iterate quickly without worrying a great deal about revision management, while giving IT a simpler server management scenario. Furthermore, the web-based approach provides analysts with a sandbox environment for working with and visualizing big data analytics in the cloud; the analytic assets can then be moved into a production environment. This approach will also eventually allow SAS to combine data integration capabilities with data analysis capabilities.

With descriptive statistics being the ante in today’s visual discovery world, SAS is focusing Visual Analytics to take advantage of the company’s predictive analytics history and capabilities. Visual Analytics 6.2 integrates predictive analytics and rapid predictive modeling (RPM) to do, among other things, segmentation, propensity modeling and forecasting. RPM makes it possible for models to be generated via sophisticated software that runs through multiple algorithms to find the best fit based on the data involved. This type of commodity modeling approach will likely gain significant traction as companies look to bring analytics into industrial processes and address the skills gap in advanced analytics. According to our BTI research, the skills gap is the biggest challenge facing big data analytics today, as participants identified staffing (79%) and training (77%) as the top two challenges.
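Stripped to its essentials, the run-through-multiple-algorithms idea is a bake-off: fit several candidate models and keep the one with the lowest holdout error. The sketch below uses deliberately toy candidates (mean, last value, linear trend) as stand-ins for a real algorithm library such as SAS’s:

```python
# Fit several candidate forecasters on a training slice and pick the one with
# the smallest squared error on a holdout slice. Candidates are illustrative.
def fit_mean(train):
    mu = sum(train) / len(train)
    return lambda i: mu                      # always predict the mean

def fit_last(train):
    last = train[-1]
    return lambda i: last                    # naive last-value forecast

def fit_trend(train):
    # Ordinary least squares of value against index.
    n = len(train)
    xbar, ybar = (n - 1) / 2, sum(train) / n
    num = sum((i - xbar) * (y - ybar) for i, y in enumerate(train))
    den = sum((i - xbar) ** 2 for i in range(n))
    slope = num / den
    return lambda i: ybar + slope * (i - xbar)

def best_model(series, holdout=2):
    train, test = series[:-holdout], series[-holdout:]
    fitted = {name: fit(train)
              for name, fit in {"mean": fit_mean, "last": fit_last,
                                "trend": fit_trend}.items()}
    def sse(model):
        return sum((model(len(train) + j) - y) ** 2
                   for j, y in enumerate(test))
    return min(fitted, key=lambda name: sse(fitted[name]))

# A steadily rising series: the trend model should win the bake-off.
winner = best_model([10, 12, 14, 16, 18, 20, 22, 24])
```

Production RPM systems add cross-validation, richer model families and governance around the chosen model, but the select-by-holdout-error loop is the core of the approach.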

Visual Analytics’ web-based approach is likely a good long-term bet for SAS, as it marries data integration and cloud strategies. These factors, coupled with the company’s installed base and army of loyal users, give SAS a head start in redefining the world of analytics. Its focus on integrating visual analytics for data discovery, integration and commodity modeling approaches also provides compelling time-to-value for big data analytics. In specific areas such as marketing analytics, the ability to bring analytics into the applications themselves and allow data-savvy marketers to conduct a segmentation and propensity analysis in the context of a specific campaign can be a real advantage. Many of SAS’ innovations cannibalize its own markets, but such is the dilemma of any major analytics company today.

The biggest threat to SAS today is the open source movement, which offers big data analytic approaches such as Mahout and R. For instance, the latest release of R includes facilities for building parallelized code. While academics working in R often still build their models in a non-parallelized, non-industrial fashion, the current and future releases of R promise more industrialization. As integration of Hadoop into today’s architectures becomes more common, staffing and skillsets are often a larger obstacle than the software budget. In this environment the large services companies loom larger because of their role in defining the direction of big data analytics. Currently, SAS partners with companies such as Accenture and Deloitte, but in many instances these companies have split loyalties. For this reason, the lack of a large in-house services and education arm may work against SAS.

At the same time, SAS possesses blueprints for major analytic processes across different industries as well as horizontal analytic deployments, and it is working to move these to a parallelized environment. This may prove to be a differentiator in the battle versus R, since it is unclear how quickly the open source R community, which is still primarily academic, will undertake the parallelization of R’s algorithms.

SAS partners closely with database appliance vendors such as Greenplum and Teradata, with which it has had longstanding development relationships. With Teradata, it integrates into the BYNET messaging system, allowing for optimized performance between Teradata’s relational database and the LASR Analytic Server. Hadoop is also supported in the SAS reference architecture. LASR accesses HDFS directly and can run as a thin memory layer on top of the Hadoop deployment. In this type of deployment, Hadoop takes care of everything outside the analytic processing, including memory management, job control and workload management.

These latest developments will be of keen interest to SAS customers. Non-SAS customers who are exploring advanced analytics in a big data environment should consider SAS LASR and its MPP approach. Visual Analytics follows the “freemium” model that is prevalent in the market, and since it is web-based, any instances downloaded today can be automatically upgraded when the new version arrives in the summer. For the price, the tool is certainly worth a test drive for analysts. For anyone looking into such tools who foresees the need to include predictive analytics, it should be of particular interest.


Tony Cosentino
VP and Research Director

Big data analytics is being offered as the key to addressing a wide array of management and operational needs across business and IT. But the label “big data analytics” is used in a variety of ways, confusing people about its usefulness and value and about how best to implement it to drive business value. The uncertainty this causes poses a challenge for organizations that want to take advantage of big data in order to gain competitive advantage, comply with regulations, manage risk and improve profitability.

Recently, I discussed a high-level framework for thinking about big data analytics that aligns with former Census Director Robert Groves’ ideas of designed data on the one hand and organic data on the other. This second article completes that picture by looking at four specific areas that constitute the practical aspects of big data analytics – topics that must be brought into any holistic discussion of big data analytics strategy. Today, these often represent point-oriented approaches, but architectures are now coming to market that promise more unified solutions.

Big Data and Information Optimization: the intersection of big data analytics and traditional approaches to analytics. Analytics performed by database professionals often differs significantly from analytics delivered by line-of-business staffers who work in more flat-file-oriented environments. Today, advancements in in-memory systems, in-database analytics and workload-specific appliances provide scalable architectures that bring processing to the data source and allow organizations to push analytics out to a broader audience, but how to bridge the divide between the two kinds of analytics is still a key question. Given the relative immaturity of new technologies and the dominance of relational databases for information delivery, it is critical to examine how all analytical assets will interact with core database systems. As we move to operationalizing analytics on an industrial scale, current advanced analytical approaches break down because they require pulling data into a separate analytic environment and do not leverage advances in parallel computing. Furthermore, organizations need to determine how they can apply existing skill sets and analytical access paradigms, such as business intelligence tools, SQL, spreadsheets and visual analysis, to big data analytics. Our recent big data benchmark research shows that the skills gap is the biggest issue facing analytics initiatives, with staffing and training cited as obstacles in over three quarters of organizations.

Visual analytics and data discovery: Visualizing data is a hot topic, especially in big data analytics. Much of big data analysis is about finding patterns in data and visualizing them so that people can tell a story and give context to large and diverse sets of data. Exploratory analytics allows us to develop and investigate hypotheses, reduce data, do root-cause analysis and suggest modeling approaches for our predictive analytics. Until now the focus of these tools has been on descriptive statistics related to SQL or flat file environments, but now visual analytics vendors are bringing predictive capabilities into the market to drive usability, especially at the business user level. This is a difficult challenge because the inherent simplicity of these descriptive visual tools clashes with the inherent complexity that defines predictive analytics. In addition, companies are looking to apply visualization to the output of predictive models as well. Visual discovery players are opening up their APIs in order to directly export predictive model output.

New tools and techniques in visualization along with the proliferation of in-memory systems allow companies the means of sorting through and making sense of big data, but exactly how these tools work, the types of visualizations that are important to big data analytics and how they integrate into our current big data analytics architecture are still key questions, as is the issue of how search-based data discovery approaches fit into the architectural landscape.

Predictive analytics: Visual exploration of data cannot surface all patterns, especially the most complex ones. To make sense of enormous data sets, data mining and statistical techniques can find patterns, relationships and anomalies in the data and use them to predict future outcomes for individual cases. Companies need to investigate the use of advanced analytic approaches and algorithmic methods that can transform and analyze organic data for uses such as predicting security threats, uncovering fraud or targeting product offers to particular customers.

Commodity models (a.k.a. good-enough models) are allowing business users to drive the modeling process. How these models can be built and consumed at the front line of the organization with only basic oversight by a statistician or data scientist is a key area of focus as organizations endeavor to bring analytics into the fabric of the organization. The increased load on back-end systems is another key consideration if the modeling is a dynamic, software-driven approach. How these models are managed and tracked is yet another consideration. Our research on predictive analytics shows that companies that update their models more frequently have much higher satisfaction ratings than those that update less frequently. The research further shows that in over half of organizations, competitive advantage and revenue growth are the primary reasons that predictive analytics are deployed.

Right-time and real-time analytics: It’s important to investigate the intersection of big data analytics with right-time and real-time systems and learn how participants are using big data analytics in production on an industrial scale. This usage guides the decisions that we make today around how to begin the task of big data analytics. Another choice organizations must make is whether to capture and store all of their data and analyze it on the back end, attempt to process it on the fly, or do both. In this context, event processing and decision management technologies represent a big part of big data analytics since they can help examine data streams for value and deliver information to the front lines of the organization immediately. How traditionally batch-oriented big data technologies such as Hadoop fit into the broader picture of right-time consumption still needs to be answered as well. Ultimately, as happens with many aspects of big data analytics, the discussion will need to center on the use case and how to address the time to value (TTV) equation.
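The “process it on the fly” option typically relies on windowed computations over the event stream rather than storing everything for batch analysis. A minimal sketch of the sliding-window pattern an event processing engine might maintain; the 60-second window and the event names are illustrative assumptions:

```python
# Count events per key over a trailing time window, evicting events as they
# age out. This is the core pattern behind stream-based monitoring rules
# such as "alert if N failures occur within a minute."
from collections import deque

class SlidingWindowCounter:
    """Counts events per key over the trailing `window` seconds."""
    def __init__(self, window=60):
        self.window = window
        self.events = deque()        # (timestamp, key), oldest first
        self.counts = {}

    def add(self, ts, key):
        # Evict events that have fallen out of the window, then record the
        # new one and return the current count for its key.
        while self.events and ts - self.events[0][0] > self.window:
            old_ts, old_key = self.events.popleft()
            self.counts[old_key] -= 1
        self.events.append((ts, key))
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]

monitor = SlidingWindowCounter(window=60)
monitor.add(0, "login_failure")
monitor.add(30, "login_failure")
recent = monitor.add(120, "login_failure")   # earlier events have aged out
```

A decision management layer would attach rules to these running counts (for example, flag a key whose count crosses a threshold) so that value is extracted from the stream before anything lands in back-end storage.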

Organizations embarking on a big data strategy must not fail to consider the four areas above. Furthermore, their discussions cannot cover just the technological approaches, but must include people, processes and the entire information landscape. Often, this endeavor requires a fundamental rethinking of organizational processes and questioning of the status quo.  Only then can companies see the forest for the trees.


Tony Cosentino
VP and Research Director

Platfora has gained a lot of buzz in the big data analytics market, primarily through word of mouth. Late last year the company took the covers off some impressive and potentially disruptive technology that takes aim at the broad BI and business analytics ecosystem, including the very foundation on which the industry is built. It recently demonstrated its software at the Strata Conference, in front of an audience fixated on big data.

Platfora looks to provide the underlying architecture of tomorrow’s BI systems and address the challenge of big data analytics. Our benchmark research shows that one of the biggest hurdles facing next-generation BI systems is usability, which was the top evaluation category in 63 percent of organizations. This is of specific concern when it comes to big data analytics and today’s Hadoop ecosystem, where many companies are taking a Field of Dreams approach – if you build it, they will come. That is, many companies are setting up Hadoop clusters, but users have no access to the underlying data and need data scientists to come in and painstakingly extract nuggets of value. Simply connecting Hadoop to applications via connectors does not work well, since there is no good way to sort through the Hadoop data to decide what to move into a more production-oriented system.

Platfora promises to solve this problem by bypassing both traditional architectures and newer hybrid architectures and putting everything in Hadoop, from data capture to data preparation to analysis and visualization.

The challenge with traditional architectures, Platfora argues, is that they organize data in a predetermined manner, but today’s big data analytics environment dictates that organizations cannot determine in advance what they will need to explore in the future. If a user gets to a level of analysis that is not part of the current schema, someone in the organization must undertake a herculean effort to recreate the entire data model. It’s the “I don’t know what I don’t know” challenge. In my blog post Big Data Analytics Faces a Chasm of Understanding, I discuss the distinction between exploratory and confirmatory analytics that marks the difference between 20th and 21st century approaches to business analytics. Businesses need both, but the nature of big data demands that the exploratory approach be given more weight.

Platfora stores data in Hadoop and works with all of the open source stacks, including those from Cloudera, Hortonworks and MapR, as well as EMC’s proprietary Pivotal HD distribution, announced just this week and assessed by my colleague. The secret sauce for Platfora is its ability to provide visibility into the underlying file system and a shopping-basket metaphor, where an analyst can choose the different dimensions that are of interest. Through what the company calls Fractal Cache technology, a distributed query engine, it takes the data and creates the relationships on the fly in memory. This essentially provides an ad hoc data mart, which an analyst can then access to do slice-and-dice analysis with sub-second response times and solve exploratory analytics challenges. If an analyst drills down and finds that an interesting piece of information is not included in the model, he can have the software recreate the model on an ad hoc basis, which generally takes from minutes up to an hour, according to the company.
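Platfora’s Fractal Cache is proprietary, but the underlying idea – letting an analyst pick a “basket” of dimensions from raw, schema-less records and materializing an in-memory aggregate for fast slicing – can be sketched in a few lines. This is purely an illustrative sketch; the function name `build_mart` and the sample records are my own, not Platfora’s API.

```python
from collections import defaultdict

# Hypothetical raw records, as might be pulled from files in Hadoop;
# no schema is defined in advance.
raw_events = [
    {"region": "west", "product": "a", "revenue": 120.0},
    {"region": "west", "product": "b", "revenue": 80.0},
    {"region": "east", "product": "a", "revenue": 200.0},
]

def build_mart(records, dimensions, measure):
    """Aggregate `measure` over the chosen `dimensions` (the 'shopping basket')."""
    mart = defaultdict(float)
    for row in records:
        key = tuple(row.get(d) for d in dimensions)
        mart[key] += row.get(measure, 0.0)
    return dict(mart)

# The analyst picks dimensions of interest; re-running with a different
# dimension list "recreates the model" on an ad hoc basis.
mart = build_mart(raw_events, ["region"], "revenue")
print(mart)  # {('west',): 200.0, ('east',): 200.0}
```

Rebuilding the aggregate with `["region", "product"]` instead of `["region"]` is the toy equivalent of drilling into a dimension that was not part of the original model.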

The software’s power and ease of use allow business analysts to expand the breadth of questions they can ask of the data without having to go back to IT. According to the company, it takes only a few hours of training to get up and running on the system. This is especially important given that our big data benchmark research, which assessed the challenges of Hadoop, found staffing and training to be among the biggest obstacles organizations face today, cited in over three-quarters of organizations. If Platfora can solve this conundrum and implement it within the enterprise, it will indeed start to move organizations beyond the technologically oriented three V’s discussion about big data into the business-oriented discussion around the three Ws.

The biggest challenge the company may face is institutional. Companies have spent billions of dollars implementing their current architectures, and relationships with software providers often run deep. Furthermore, the idea of such a system largely replacing traditional data warehouses threatens not only the competition, but perhaps the departments the company aims to sell into. Simply put, such a business-usable system obviates the need for an entire area of IT. Many firms, especially large ones, are inherently risk-averse, and this may be the biggest challenge facing Platfora. Other software providers have started with similar messaging out of the gate, but then shifted to more of a coexistence-messaging approach to gain traction in organizations. It will be interesting to see whether Platfora yields to these same headwinds as it moves through its beta phase and into GA.

In sum, Platfora is an exciting new company, and companies that are adopting Hadoop should look into the way it can drive big data analytics and perhaps change the culture of their organizations.


Tony Cosentino

VP and Research Director

SiSense gained a lot of traction last week at the Strata conference in San Jose as it broke records in the so-called 10x10x10 Challenge – analyzing 10 terabytes of data in 10 seconds on a $10,000 commodity machine – and earned the company’s Prism product the Audience Choice Award. The Israel-based company, founded in 2005, has venture capital backing and is currently running at a profit, with customers in more than 50 countries and marquee customers such as Target and Merck. Prism, its primary product, provides the entire business analytics stack, from ETL capabilities through data analysis and visualization. From the demonstrations I’ve seen, the toolset appears relatively user-friendly, which is important because customers say usability is the top purchase criterion in 63 percent of organizations, according to our next-generation business intelligence benchmark research.

Prism comprises three main components: ElastiCube Manager, Prism BI Studio and Prism Web. ElastiCube Manager provides a visual environment in which users can ingest data from multiple sources, define relationships between data and perform transforms via SQL functions. Prism BI Studio provides a visual environment that lets customers build dashboards that link data and provide users with dynamic charts and interactivity. Finally, Prism Web provides web-based functionality for interacting, sharing dashboards and managing user profiles and accounts.

At the heart of Prism is the ElastiCube technology, which can query large volumes of data quickly. ElastiCube uses a columnar approach, which allows for high compression. With SiSense, queries are optimized at the CPU level; that is, the system decides in an ad hoc manner the most efficient way to use both disk and memory. Most other approaches on the market lean either toward a pure-play in-memory system or toward a columnar approach.
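Why a columnar layout compresses so well is easy to see in miniature: storing a table column by column groups repetitive values together, so even a simple scheme such as run-length encoding shrinks them dramatically. The sketch below is a generic illustration of that principle, not SiSense’s actual ElastiCube format, which is proprietary.

```python
def rle_encode(column):
    """Run-length encode a column of values.

    Low-cardinality or sorted columns (countries, years, status codes)
    collapse into a handful of (value, run_length) pairs.
    """
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return runs

# A row-oriented table flipped into columns; compression then works
# on each homogeneous column independently.
rows = [("US", 2013, 10), ("US", 2013, 12), ("US", 2012, 9), ("IL", 2012, 7)]
country_col = [r[0] for r in rows]

print(rle_encode(country_col))  # [['US', 3], ['IL', 1]]
```

In a row-oriented layout the same values are interleaved with other fields, so runs like this never form, which is one reason columnar engines achieve both better compression and faster scans of individual columns.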

The company’s approach to big data analytics reveals the chasm of understanding that exists between descriptive analytics and more advanced analytics such as we see with R, SAS and SPSS. When SiSense speaks of big data analytics, it means the ability to consume and explore very large data sets without predefining schemas. By doing away with schemas, the software does away with the need for a statistician, a data mining engineer or an IT person, for that matter. Instead, organizations need analysts with a good understanding of the business, the data sources with which they are working and the basic characteristics of those data sets. SiSense does not do sophisticated predictive modeling or data mining, but rather root-cause and contextual analysis across diverse and potentially very large data sets.

SiSense today has a relatively small footprint and faces an uphill battle against entrenched BI and analytics players for enterprise deployments, but its easy-to-download-and-try approach will help it gain traction with analysts who are less loyal to the BI tools their IT departments have purchased. SiSense Vice President of Marketing Bruno Aziza, formerly with Microsoft’s Business Intelligence group, and CEO Amit Bendov have been in the industry for a fair amount of time and understand this challenge. Their platform’s road into the organization is more likely through business groups than through IT. For this reason, SiSense’s competitors on the discovery side are Tableau and QlikView, while products like SAP HANA and Actuate’s BIRT Analytics are likely its closest competitors in terms of the in-memory-plus-columnar technological approach to accessing and visualizing large data sets. This ability to access large data sets in a timely manner without the need for data scientists can help overcome the top challenges BI users have in the areas of staffing and real-time access, which we uncovered in our recent business technology innovation research.

SiSense has impressive technology and is getting some good traction. It bears consideration by departmental and mid-market organizations that need to perform analytics across growing volumes of data without an IT department to support their needs.


Tony Cosentino
VP and Research Director
