
SAS Institute, a long-established provider of analytics software, showed off its latest technology innovations and product road maps at its recent analyst conference. In a very competitive market, SAS is not standing still, and executives showed progress on the goals introduced at last year's conference, which I covered. SAS's Visual Analytics software, integrated with an in-memory analytics engine called LASR, remains the company's flagship product in its modernized portfolio. CEO Jim Goodnight demonstrated Visual Analytics' sophisticated integration with statistical capabilities, which the company sees as a differentiator going forward. The product already provides automated charting capabilities, forecasting and scenario analysis, and SAS appears to have been doing user-experience testing, since the visual interactivity is better than what I saw last year. SAS has put Visual Analytics on a six-month release cadence, a fast pace but one necessary to keep up with the industry.

Visual discovery alone is becoming an ante in the analytics market, since just about every vendor has some sort of discovery product in its portfolio. For SAS to gain on its competitors, it must make advanced analytic capabilities part of the product. In this regard, Dr. Goodnight demonstrated the software's visual statistics capabilities, which let users switch quickly from visual discovery into regression analysis, run multiple models simultaneously and then optimize the best model. The statistical product is scheduled for availability in the second half of this year. With the ability to automatically create multiple models and output summary statistics and model parameters, users can create and optimize models in a more timely fashion, so the information can become actionable sooner. In our research on predictive analytics, most participants (68%) cited competitive advantage as a benefit of predictive analytics, and our research also shows that companies able to update their models daily or more often are very satisfied with their predictive analytics tools more often than others are. The ability to create models in an agile and timely manner is valuable for various uses in a range of industries.
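
To make the multi-model idea concrete, here is a minimal sketch in Python with scikit-learn (my own illustration, not SAS Visual Statistics): fit several candidate regression models on the same data and keep the one with the best cross-validated score. The data and the list of candidate models are hypothetical.

```python
# Sketch: fit several candidate models and keep the best one by cross-validated R^2.
# Illustrative only -- not SAS Visual Statistics; data and model list are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=500)

candidates = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Summary statistics (here, mean cross-validated R^2) drive the model choice.
scores = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
          for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> best model:", best)
```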

SAS enables high-performance computing in three ways. The first is the more traditional grid approach, which distributes processing across multiple nodes. The second is the in-database approach, which allows SAS to run as a process inside the database. The third is extracting data and running it in-memory. The system has the flexibility to run on different large-scale database types such as MPP databases as well as Hadoop infrastructure through Pig and Hive. This is important because for 64 percent of organizations, the ability to run predictive analytics on big data is a priority, according to our recently released research on big data analytics. SAS can run via MapReduce or directly access the underlying Hadoop Distributed File System and pull the data into LASR, the SAS in-memory system. SAS works with almost all commercial Hadoop implementations, including Cloudera, Hortonworks, EMC's Pivotal and IBM's InfoSphere BigInsights. The ability to put analytical processes into the MapReduce paradigm is compelling as it enables predictive analytics on big data sets in Hadoop, though the immaturity of initiatives such as YARN may relegate the jobs to batch processing for the time being. The flexibility of LASR and the associated portfolio can help organizations overcome the challenge of architectural integration, which is the most widespread technological barrier to predictive analytics (for 55% of participants in that research). Of note is that the SAS approach provides a purely analytical engine; since there is no SQL involved in the algorithms, it carries no SQL overhead and runs directly on the supporting system's resources.
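
For readers less familiar with the third pattern, extracting data from Hadoop and analyzing it in memory, here is a generic sketch using pyarrow to read Parquet files out of HDFS into an in-memory table. It illustrates the general extract-and-analyze pattern only, not SAS LASR; the host, port, path and column names are placeholders.

```python
# Sketch of the generic extract-and-analyze-in-memory pattern (not SAS LASR).
# Host, port, path and column names are hypothetical; requires a local Hadoop client (libhdfs).
import pyarrow.fs as pafs
import pyarrow.parquet as pq

hdfs = pafs.HadoopFileSystem(host="namenode.example.com", port=8020)

# Read the data set directly from HDFS into an in-memory columnar table...
table = pq.read_table("/data/transactions/2014/", filesystem=hdfs)

# ...then hand it to an in-memory analytics layer (here, simply pandas).
df = table.to_pandas()
print(df.groupby("region")["amount"].sum())
```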

As well as innovating with Visual Analytics and Hadoop, SAS has a clear direction in its road map, intending to integrate the data integration and data quality aspects of the portfolio in a single workflow with the Visual Analytics product. Indeed, data preparation is still a key sticking point for organizations. According to our benchmark research on information optimization, time spent in analytic tasks is still consumed most by data preparation (for 47%) and data quality and consistency (45%). The most valuable task, interpretation of the data, ranks fourth at 33 percent of analytics time. This is a big area of opportunity in the market, as reflected by the flurry of funding for data preparation software companies in the fourth quarter of 2013. For further analysis of SAS's data management and big data efforts, please read my colleague Mark Smith's analysis.

Established relationships with companies like Teradata and a reinvigorated relationship with SAP position SAS to remain at the heart of enterprise analytic architectures. In particular, the co-development effort that allows the SAS predictive analytics workbench to run on top of SAP HANA is promising, though it raises the question of how aggressive SAP will be in advancing its own advanced analytic capabilities on HANA. One area where SAS could learn from SAP is its developer ecosystem. While SAP has thousands of developers building applications for HANA, SAS could do a better job of providing the tools developers need to extend the SAS platform. SAS has been able to prosper with a walled-garden approach, but the breadth and depth of innovation across the technology and analytics industry puts this type of strategy under pressure.

Overall, SAS impressed me with what it has accomplished in the past year and the direction it is heading in. The broad-based development efforts raise a final question of where the company should focus its resources. Based on its progress in the past year, it seems that a lot has gone into visual analytics, visual statistics, LASR and alignment with the Hadoop ecosystem. In 2014, the company will continue horizontal development, but there is a renewed focus on specific analytic solutions as well. At a minimum, the company has good momentum in retail, fraud and risk management, and manufacturing. I’m encouraged by this industry-centric direction because I think that the industry needs to move away from the technology-oriented V’s toward the business-oriented W’s.

For customers already using SAS, the company’s road map is designed to capture market advantage with minimal disruption to existing environments. In particular, focusing on solutions as well as technological depth and breadth is a viable strategy. While it still may make sense for customers to look around at the innovation occurring in analytics, moving to a new system will often incur high switching costs in productivity as well as money. For companies just starting out with visual discovery or predictive analytics, SAS Visual Analytics provides a good point of entry, and SAS has a vision for more advanced analytics down the road.

Regards,

Tony Cosentino

VP and Research Director

Ventana Research recently completed the most comprehensive evaluation of mobile business intelligence products and vendors available anywhere today. The evaluation includes 16 technology vendors' offerings on smartphones and tablets across Apple, Google Android, Microsoft Surface and RIM BlackBerry devices, assessed in seven key categories: usability, manageability, reliability, capability, adaptability, vendor validation, and TCO and ROI. The result is our Value Index for Mobile Business Intelligence in 2014. The analysis shows that the top supplier is MicroStrategy, which qualifies as a Hot vendor and is followed by 10 other Hot vendors: IBM, SAP, QlikTech, Information Builders, Yellowfin, Tableau Software, Roambi, SAS, Oracle and arcplan.

Our expertise, hands-on experience and the buyer research from our benchmark research on next-generation business intelligence and on information optimization informed our product evaluations in this new Value Index. The research examined business intelligence on mobile technology to determine organizations' current and planned use and the capabilities required for successful deployment.

What we found was wide interest in mobile business intelligence and a desire to improve the use of information in 40 percent of organizations, though adoption is less pervasive than interest. Fewer than half of organizations currently access BI capabilities on mobile devices, but nearly three-quarters (71%) expect their mobile workforce to be able to access BI capabilities in the next 12 months. The research also shows strong executive support: Nearly half of executives said that mobility is very important to their BI processes.

Ease of access and use are important criteria in this Value Index because the largest percentage of organizations identified usability as an important factor in evaluations of mobile business intelligence applications. This is an emphasis that we find in most of our research, and in this case it also may reflect users' experience with first-generation business intelligence on mobile devices; not all those applications were optimized for touch-screen interfaces and designed to support gestures. It is clear that today's mobile workforce requires the ability to access and analyze data simply and in a straightforward manner, using an intuitive interface.

The top five companies' products in our 2014 Mobile Business Intelligence Value Index all provide strong user experiences and functionality. MicroStrategy stood out across the board, finishing first in five categories and most notably in the areas of user experience, mobile application development and presentation of information. IBM, the second-place finisher, has made significant progress in mobile BI with six releases in the past year, adding support for Android, advanced security features and an extensible visualization library. SAP's steady support for mobile access to the SAP BusinessObjects platform and to SAP Lumira, along with its integrated mobile device management software, helped produce high scores in various categories and put it in third place. QlikTech's flexible offline deployment capabilities for the iPad and its high ranking in the assurance-related category of TCO and ROI secured it the fourth spot. Information Builders' latest release of WebFOCUS renders content directly in HTML5, and with its Active Technologies and Mobile Faves the company delivers strong mobile capabilities, rounding out the top five companies. Other noteworthy innovations in mobile BI include Yellowfin's collaboration technology and Roambi's use of storyboarding in its Flow application.

Although there is some commonality in how vendors provide mobile access to data, there are many differences among their offerings that can make one a better fit than another for an organization’s particular needs. For example, companies that want their mobile workforce to be able to engage in root-cause discovery analysis may prefer tools from Tableau and QlikTech. For large companies looking for a custom application approach, MicroStrategy or Roambi may be good choices, while others looking for streamlined collaboration on mobile devices may prefer Yellowfin. Many companies may base the decision on mobile business intelligence on which vendor they currently have installed. Customers with large implementations from IBM, SAP or Information Builders will be reassured to find that these companies have made mobility a critical focus.

To learn more about this research and to download a free executive summary, please visit http://www.ventanaresearch.com/bivalueindex/.

Regards,

Tony Cosentino

Vice President and Research Director

Like every large technology corporation today, IBM faces an innovator’s dilemma in at least some of its business. That phrase comes from Clayton Christensen’s seminal work, The Innovator’s Dilemma, originally published in 1997, which documents the dynamics of disruptive markets and their impacts on organizations. Christensen makes the key point that an innovative company can succeed or fail depending on what it does with the cash generated by continuing operations. In the case of IBM, it puts around US$6 billion a year into research and development; in recent years much of this investment has gone into research on big data and analytics, two of the hottest areas in 21st century business technology. At the company’s recent Information On Demand (IOD) conference in Las Vegas, presenters showed off much of this innovative portfolio.

At the top of the list is Project Neo, which will go into beta release early in 2014. Its purpose is to fill the skills gap related to big data analytics, which our benchmark research into big data shows is held back most by lack of knowledgeable staff (79%) and lack of training (77%). The skills situation can be characterized as a three-legged stool of domain knowledge (that is, line-of-business knowledge), statistical knowledge and technological knowledge. With Project Neo, IBM aims to reduce the technological and statistical demands on the domain expert and empower that person to use big data analytics in service of a particular outcome, such as reducing customer churn or presenting the next best offer. In particular, Neo focuses on multiple areas of discovery, which my colleague Mark Smith outlined. Most of the industry discussion about simplifying analytics has revolved around visualization rather than data discovery, which applies analytics that go beyond visualization, or information discovery, which addresses how we find and access information in a highly distributed environment. These areas are the next logical steps after visualization for software vendors to address, and IBM takes them seriously with Neo.

At the heart of Neo are the same capabilities found in IBM's SPSS Analytic Catalyst, which won the 2013 Ventana Research Innovation Award for analytics and which I wrote about. It also includes IBM's BLU Acceleration for the DB2 database, an in-memory optimization technique I have also discussed, which speeds analysis of large data sets. The company's Vivisimo acquisition, which is now called InfoSphere Data Explorer, adds information discovery capabilities. Finally, the Rapid Adaptive Visualization Engine (RAVE), which is IBM's visualization approach across its portfolio, is layered on top for fast, extensible visualizations. Neo itself is a work in progress, currently offered only over the cloud and back-ended by the DB2 database. However, following IBM's acquisition earlier this year of SoftLayer, which provides a cloud infrastructure platform, I would expect IBM to extend Neo to access more sources than just data loaded into DB2.

IBM also recently started shipping SPSS Modeler 16.0. IBM bought SPSS in 2009 and has invested in Modeler heavily. Modeler (formerly SPSS Clementine) is an analytic workflow tool akin to others in the market such as SAS Enterprise Miner, Alteryx and more recent entries such as SAP Lumira. SPSS Modeler enables analysts at multiple levels to interact on analytics and do both data exploration and predictive analytics. Analysts can move data from multiple sources and integrate it into one analytic workflow. These are critical capabilities as our predictive analytics benchmark research shows: The biggest challenges to predictive analytics are architectural integration (for 55% of organizations) and lack of access to necessary source data (35%).

IBM has made SPSS the centerpiece of its analytic portfolio and offers it at three levels: Professional, Premium and Gold. With the top-level Gold edition, Modeler 16.0 includes capabilities that are ahead of the market: run-time integration with InfoSphere Streams (IBM's complex event processing product), IBM's Analytical Decision Management (ADM) and the information optimization capabilities of G2, a skunkworks project led by Jeff Jonas, chief scientist of IBM's Entity Analytics Group.

Integration with InfoSphere Streams, which won a Ventana Research Technology Innovation Award in 2013, enables event processing to occur in an analytic workflow within Modeler. This is a particularly compelling capability as the so-called "Internet of things" begins to evolve and the ability to correlate multiple events in real time becomes crucial. In such real-time environments, often measured in milliseconds, events cannot be pushed back into a database to wait to be analyzed.

Decision management is another part of SPSS Modeler. Once models are built, users need to deploy them, which often entails steps such as integrating with rules and optimizing parameters. In a next-best-offer situation in a retail banking environment, for instance, a potential customer may score highly on propensity to take out a mortgage and buy a house, but other information shows that the person would not qualify for the loan. In this case, the model itself would suggest telling the customer about mortgage offers, but the rules engine would override it and find another offer to discuss. In addition, there are times when optimization exercises are needed, such as Monte Carlo simulations that help figure out parameters such as risk using "what-if" modeling. In many situations, to gain competitive advantage, all of these capabilities must be rolled into a production environment where individual records are scored in real time against the organization's database and integrated with the front-end system such as a call center application. The net capability that IBM's ADM brings is the ability to deploy analytical models into the business without consuming significant resources.
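
As a simple illustration of how a model score and a business rule interact in this kind of deployment, here is a hypothetical Python sketch (not IBM ADM; the scores, thresholds and rules are invented): the propensity model favors a mortgage offer, but an eligibility rule overrides it and the next-best eligible offer is returned.

```python
# Hypothetical sketch of decision management: combine model scores with business rules.
# Not IBM Analytical Decision Management -- scores, thresholds and rules are invented.

def propensity_scores(customer):
    # In practice these would come from deployed predictive models.
    return {"mortgage": 0.82, "credit_card": 0.55, "savings": 0.40}

def eligible(customer, offer):
    # Rules-engine step: a high propensity score can be overridden by an eligibility rule.
    if offer == "mortgage" and customer["debt_to_income"] > 0.45:
        return False
    return True

def next_best_offer(customer):
    scores = propensity_scores(customer)
    for offer in sorted(scores, key=scores.get, reverse=True):
        if eligible(customer, offer):
            return offer
    return None

customer = {"id": 123, "debt_to_income": 0.52}
print(next_best_offer(customer))  # mortgage is skipped; credit_card is offered instead
```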

G2 is a part of Modeler developed in IBM's Entity Analytics Group. The group is garnering a lot of attention both internally and externally for its work around "entity analytics" – the idea that each information entity has characteristics that are revealed only in contextual information – charting innovative methods in the areas of data integration and privacy. In the context of Modeler this has important implications for bringing together disparate data sources that naturally link together but otherwise would be treated separately. A core example is that an individual may have multiple email addresses in different databases, have changed addresses or have changed names, perhaps due to a new marital status. Through machine-learning processes and analysis of the surrounding data, G2 can match records and attach them with some certainty to one individual. The system also strips out personally identifiable information (PII) to meet privacy and compliance standards. Such capabilities are critical for business, as our latest benchmark research on information optimization shows that two in five organizations have more than 10 different data sources that they need to integrate and that the ability to simplify access to these systems is important to virtually all organizations (97%).
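
To show the basic shape of the record-matching problem described here, consider a toy Python sketch (this is not IBM's G2; the matching logic, weights and fields are invented): fuzzy-match records from two hypothetical sources on name and email similarity, then drop PII fields before passing the linked records along.

```python
# Toy entity-resolution sketch -- not IBM G2. Matching logic and fields are hypothetical.
from difflib import SequenceMatcher

crm = [{"id": "A1", "name": "Jane Smith", "email": "jane.smith@mail.com", "ssn": "123-45-6789"}]
web = [{"id": "W9", "name": "Jane Smyth", "email": "jsmith@mail.com", "last_page": "/mortgages"}]

def similar(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match(r1, r2, threshold=0.8):
    # Combine evidence from several attributes; real systems learn these weights.
    score = 0.6 * similar(r1["name"], r2["name"]) + 0.4 * similar(r1["email"], r2["email"])
    return score >= threshold

PII_FIELDS = {"ssn", "email", "name"}

def strip_pii(record):
    # Remove personally identifiable fields before the linked data moves downstream.
    return {k: v for k, v in record.items() if k not in PII_FIELDS}

links = [(strip_pii(a), strip_pii(b)) for a in crm for b in web if match(a, b)]
print(links)
```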

With the above capabilities, SPSS Modeler Gold edition achieves  market differentiation, but IBM still needs to show the advantage of base editions such as Modeler Professional. The marketing issue for SPSS Modeler is that it is considered a luxury car in a market being infiltrated by compacts and kit cars. In the latter case there is the R programming language, which is open-source and ostensibly free, but the challenge here is that companies need R programmers to run everything. SPSS Modeler and other such visually oriented tools (many of which integrate with open source R) allow easier collaboration on analytics, and ultimately the path to value is shorter. Even at its base level Modeler is an easy-to-use and capable statistical analysis tool that allows for collaborative workgroups and is more mature than many others in the market.

Companies must consider predictive analytics capabilities or risk being left behind. Our research into predictive analytics shows that two-thirds of companies see predictive analytics as providing competitive advantage (68%) and as particularly important in revenue-generating functions such as marketing (for 70%) and forecasting (72%). Companies currently looking into discovery analytics may want to try Neo, which will be available in beta in early 2014. Those interested in predictive analytics should consider the different levels of SPSS Modeler 16.0 as well as IBM's flagship Signature Solutions, which I have covered. IBM has documented use cases that can give users guidance on leading-edge deployment patterns and on leveraging analytics for competitive advantage. If you have not looked at the depth of IBM's analytic technology portfolio, I suggest doing so; you might otherwise miss some fundamental advancements in the processing of data and analytics that provide the insights required to operate effectively in the global marketplace.

Regards,

Tony Cosentino

VP and Research Director

A few months ago, I wrote an article on the four pillars of big data analytics. One of those pillars is discovery analytics, where visual analytics and data discovery combine to meet the needs of the business and the analyst. My colleague Mark Smith subsequently clarified the four types of discovery analytics: visual discovery, data discovery, information discovery and event discovery. Now I want to follow up with a discussion of three trends that our research has uncovered in this space. (To reference how I'm using these four discovery terms, please refer to Mark's post.)

The most prominent of these trends is that conversations about visual discovery are beginning to include data discovery, and vendors are developing and delivering such tool sets today. It is well known that while big data profiling and the ability to visualize data give us a broader capacity for understanding, there are limitations that can be addressed only through data mining and techniques such as clustering and anomaly detection. Such approaches are needed to overcome statistical interpretation challenges such as Simpson's paradox. In this context, we see a number of tools with different architectural approaches tackling this obstacle. For example, Information Builders, Datameer, BIRT Analytics and IBM's new SPSS Analytic Catalyst tool all incorporate user-driven data mining directly with visual analysis. That is, they combine data mining technology with visual discovery for enhanced capability and more usability. Our research on predictive analytics shows that integrating predictive analytics into the existing architecture is the most pressing challenge (for 55% of organizations). Integrating data mining directly into the visual discovery process is one way to overcome this challenge.
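
Simpson's paradox, mentioned above, is worth seeing once with real numbers: an option can look better in every subgroup yet worse in the aggregate when group sizes are unbalanced. Here is a small pandas sketch using the classic kidney-stone treatment figures (textbook numbers, not data from our research):

```python
# Simpson's paradox with the classic kidney-stone numbers: treatment A wins in
# every subgroup but loses in the aggregate because group sizes are unbalanced.
import pandas as pd

df = pd.DataFrame({
    "treatment": ["A", "A", "B", "B"],
    "stone_size": ["small", "large", "small", "large"],
    "successes": [81, 192, 234, 55],
    "patients":  [87, 263, 270, 80],
})

by_group = df.assign(rate=df.successes / df.patients)
overall = df.groupby("treatment")[["successes", "patients"]].sum()
overall["rate"] = overall.successes / overall.patients

print(by_group[["treatment", "stone_size", "rate"]])  # A beats B in both subgroups
print(overall["rate"])                                # ...but B beats A overall
```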

The second trend is renewed focus on information discovery (i.e., search), especially among large enterprises with widely distributed systems as well as the big data vendors serving this market. IBM acquired Vivisimo and has incorporated the technology into its PureSystems and big data platform. Microsoft recently previewed its big data information discovery tool, Data Explorer. Oracle acquired Endeca and has made it a key component of its big data strategy. SAP added search to its latest Lumira platform. LucidWorks, an independent information discovery vendor that provides enterprise support for open source Lucene/Solr, adds search as an API and has received significant adoption. There are different levels of search, from documents to social media data to machine data,  but I won’t drill into these here. Regardless of the type of search, in today’s era of distributed computing, in which there’s a need to explore a variety of data sources, information discovery is increasingly important.

The third trend in discovery analytics is a move to more embeddable system architectures. In parallel with the move to the cloud, architectures are becoming more service-oriented, and the interfaces are hardened in such a way that they can integrate more readily with other systems. For example, the visual discovery market was born on the client desktop with Qlik and Tableau, quickly moved to server-based apps and is now moving to the cloud. Embeddable technologies such as the D3 JavaScript visualization library allow vendors such as Datameer to include an open source library of visualizations in their products. Lucene/Solr represents a similar embedded technology in the information discovery space. The broad trend we're seeing is toward RESTful architectures that promote a looser coupling of applications and therefore require less custom integration. This move runs in parallel with the decline of Internet Explorer, the rise of new browsers and the ability to render content using JavaScript Object Notation (JSON). This trend suggests a future for discovery analysis embedded in application tools (including, but not limited to, business intelligence). The environment is still fragmented and in its early stage. Instead of one cloud, we have a lot of little clouds. For the vendor community, which is building more platform-oriented applications that can work in an embeddable manner, a tough question is whether to go after the on-premises market or the cloud market. I think each vendor will have to make its own decision on how to support customer needs within its own business model constraints.
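
As a minimal illustration of the loosely coupled, JSON-over-REST pattern described above (a hypothetical service of my own construction, not any vendor's API), here is a small Flask endpoint that returns chart-ready JSON a client-side library such as D3 could render:

```python
# Minimal sketch of a REST endpoint that serves chart-ready JSON for an embeddable
# visualization layer. Hypothetical service and data -- not any specific vendor's API.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/v1/revenue-by-region")
def revenue_by_region():
    # In practice this would query a warehouse or Hadoop-backed store.
    data = [
        {"region": "Americas", "revenue": 4.2},
        {"region": "EMEA", "revenue": 3.1},
        {"region": "APAC", "revenue": 2.7},
    ]
    # A browser-side library (D3, for example) can fetch and render this JSON.
    return jsonify({"series": data})

if __name__ == "__main__":
    app.run(port=5000)
```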

Regards,

Tony Cosentino

VP and Research Director

IBM’s SPSS Analytic Catalyst enables business users to conduct the kind of advanced analysis that has been reserved for expert users of statistical software. As analytic modeling becomes more important to businesses and models proliferate in organizations, the ability to give domain experts advanced analytic capabilities can condense the analytic process and make the results available sooner for business use. Benefiting from IBM’s research and development in natural-language processing and its statistical modeling expertise, IBM SPSS Analytic Catalyst can automatically choose an appropriate model, execute the model, test it and explain it in plain English.

Information about the skills gap in analytics and the need for more user-friendly tools indicates pent-up demand for this type of tool. Our benchmark research into big data shows that big data analytics is held back most by lack of knowledgeable staff (79%) and lack of training (77%).

In the case of SPSS Analytic Catalyst, the focus is on driver analysis. In its simplest form, a driver analysis aims to understand cause and effect among multiple variables. One challenge with driver analysis is to determine the method to use in each situation (choosing among, for example, linear or logistic regression, CART, CHAID or structural equation models). This is a complex decision which most organizations leave to the resident statistician or outsource to a professional analyst. Analytic Catalyst automates the task. It does not consider every method available, but that is not necessary. By examining the underlying data characteristics, it can address data sets, including what may be considered big data, with an appropriate algorithm. The benefit for nontechnical users is that Analytic Catalyst makes the decision on selecting the algorithm.
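
To give a rough sense of the kind of decision Analytic Catalyst automates, here is a deliberately simplified Python sketch (this is not IBM's selection logic; the rules and thresholds are invented for illustration): inspect the target variable's characteristics and pick a regression family or tree method accordingly.

```python
# Simplified illustration of choosing a modeling method from data characteristics.
# Not IBM's selection logic -- the rules here are invented for illustration.
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def choose_model(y: pd.Series):
    """Pick a candidate method based on the target variable's characteristics."""
    if pd.api.types.is_numeric_dtype(y) and y.nunique() > 10:
        return LinearRegression()                 # continuous target -> linear regression
    if y.nunique() == 2:
        return LogisticRegression(max_iter=1000)  # binary target -> logistic regression
    return DecisionTreeClassifier(max_depth=4)    # multi-class target -> decision tree

y = pd.Series(["churn", "stay", "stay", "churn", "stay"] * 20)
print(type(choose_model(y)).__name__)  # LogisticRegression for this binary outcome
```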

The tool condenses the analytic process into three steps: data upload, selection of the target variable (also called the dependent variable or outcome variable) and data exploration. Once the data is uploaded, the system selects target variables and automatically correlates and associates the data. Based on characteristics of the data, Analytic Catalyst chooses the appropriate method and returns summary data rather than statistical data. On the initial screen, it communicates so-called “top insights” in plain text and presents visuals, such as a decision tree in a churn analysis. Once the user has absorbed the top-level information, he or she can drill down into top key drivers. This enables users to see interactivity between attributes. Understanding this interactivity is an important part of driver analysis since causal variables often move together (a challenge known as multicollinearity) and it is sometimes hard to distinguish what is actually causing a particular outcome. For instance, analysis may blame the customer service department for a product defect and point to it as the primary driver of customer defection. Accepting this result, a company may mistakenly try to fix customer service when it is a product issue that needs to be addressed. This approach also overcomes the challenge of Simpson’s paradox, which is a hindrance for some visualization tools in the market. On subsequent navigations, Analytic Catalyst goes even further into how different independent variables move together, even if they do not directly explain the outcome variable.

Beyond the ability to automate modeling and enable exploration of data, I like that this new tool is suitable both for statistically inclined users (who can use it to get r-scores, model parameters or other data) and for business users (whom the visualizations and natural language walk through what the results mean). Thus it enables cross-functional conversations and allows the domain expert to own the overall analysis.

I also like the second column of the “top key driver” screen, through which users can drill down into different questions regarding the data. Having a complete question set, the analyst can simply back out of one question and dive into another. The iterative process aligns naturally with the concept of data exploration.

IBM seems to be positioning the tool to help with early-stage analysis. From the examples I’ve seen, however, I think Analytic Catalyst would work well also as a back-end tool for marketers trying to increase wallet share through specific campaigns or for efforts by operations personnel to reduce churn by creating predefined actions at the point of service for particular at-risk customer populations.

IBM will need to continue to work with Analytic Catalyst to get it integrated with other tools and to ensure that it keeps the user experience in mind. Usability is the key buying criterion for nearly two-thirds (64%) of companies, according to our benchmark research into next-generation business intelligence.

It is important that the data models align with other models in the organization, such as customer value models, so that the right populations are targeted. Otherwise a marketer or operations person would likely need to figure this out in a different system, such as a BI tool. That user would also have to put the analytical output into another system, such as a campaign management or business process tool, to make it actionable. Toward this end, I expect that IBM is working to integrate this product within its own portfolio and those of its partners.

SPSS Analytic Catalyst has leaped over the competition in putting sophisticated driver analytics into natural language that can guide almost any user through complex analytic scenarios. However, competitors are not standing still. Some are working on similar tools that apply natural language to sophisticated commodity modeling approaches, and many of the visual discovery vendors have similar but less optimized approaches. With the less sophisticated approaches, the question comes down to optimizing vs. satisfying. Other tools in the market satisfy the basic need for driver analysis (usually approached through simple correlation or one type of decision tree), but a more dynamic approach to driver analysis such as offered by IBM can reveal deeper understanding of the data. The answer will depend on an organization and its user group, but in fast-moving markets and scenarios where analytics is a key differentiator, this is a critical question to consider.

Regards,

Tony Cosentino

VP and Research Director

Users of big data analytics are finally going public. At the Hadoop Summit last June, many vendors were still speaking of a large retailer or a big bank as users but could not publicly disclose their partnerships. Companies experimenting with big data analytics felt that their proof of concept was so innovative that once it moved into production, it would yield a competitive advantage to the early mover. Now many companies are speaking openly about what they have been up to in their business laboratories. I look forward to attending the 2013 Hadoop Summit in San Jose to see how much things have changed in just a single year for Hadoop-centered big data analytics.

Our benchmark research into operational intelligence, which I argue is another name for real-time big data analytics, shows diversity in big data analytics use cases by industry. The goals of operational intelligence are an interesting mix as the research shows relative parity among managing performance (59%), detecting fraud and security (59%), complying with regulations (58%) and managing risk (58%), but when we drill down into different industries there are some interesting nuances. For instance, healthcare and banking are driven much more by risk and regulatory compliance, services such as retail are driven more by performance, and manufacturing is driven more by cost reduction. All of these make sense given the nature of the businesses. Let’s look at them in more detail.

The retail industry, driven by market forces and facing discontinuous change, is adopting big data analytics out of competitive necessity. The discontinuity comes in the form of online shopping and the need for traditional retailers to supplement their brick-and-mortar locations. JCPenney and Macy's provide a sharp contrast in how two retailers approached this challenge. A few years ago, the two companies eyed a similar competitive space, but since that time, Macy's has implemented systems based on big data analytics; it now sources locally for online transactions and can optimize pricing of its more than 70 million SKUs in just one hour using SAS High Performance Analytics. The Macy's approach has, in Sun Tzu-like fashion, turned the "showroom floor" disadvantage into a customer experience advantage. JCPenney, on the other hand, relied on gut-feel management decisions based on classic brand merchandising strategies and ended up alienating its customers, generating lawsuits and a well-publicized apology to its customers. Other companies including Sears are doing similarly innovative work with suppliers such as Teradata and innovative startups like Datameer in data hub architectures built around Hadoop.

Healthcare is another interesting market for big data, but the dynamics that drive it are less about market forces and more about government intervention and compliance issues. Laws such as HIPAA, the recent Affordable Care Act, OC-10 and the HITECH Act of 2009 all have implications for how these organizations implement technology and analytics. Our recent benchmark research on governance, risk and compliance indicates that many companies have significant concerns about compliance issues: 53 percent of participants said they are concerned about them, and 42 percent said they are very concerned. Electronic health records (EHRs) are moving them to more patient-centric systems, and one goal of the Affordable Care Act is to use technology to produce better outcomes through what it calls meaningful use standards. Facing this tidal wave of change, companies including IBM analyze historical patterns and link them with real-time monitoring, helping hospitals save the lives of at-risk babies. This use case was made into a now-famous commercial by advertising firm Ogilvy about the so-called data babies. IBM has also shown how cognitive question-and-answer systems such as Watson assist doctors in diagnosis and treatment of patients.

Data blending, the ability to mash together different data sources without having to manipulate the underlying data models, is another analytical technique gaining significant traction. Kaiser Permanente is able to use tools from Alteryx, which I have assessed, to consolidate diverse data sources, including unstructured data, to streamline operations to improve customer service. The two organizations made a joint presentation similar to the one here at Alteryx’s user conference in March.

Financial services, which my colleague Robert Kugel covers, is being driven by a combination of regulatory forces and competitive market forces on the sales end. Regulations produce a lag in the adoption of certain big data technologies, such as cloud computing, but areas such as fraud and risk management are being revolutionized by the ability, provided through in-memory systems, to look at every transaction rather than only a sampling of transactions through traditional audit processes. Furthermore, the ability to pair advanced analytical algorithms with in-memory real-time rules engines helps detect fraud as it occurs, and thus criminal activity may be stopped at the point of transaction. On a broader scale, new risk management frameworks are becoming the strategic and operational backbone for decision-making in financial services.

On the retail banking side, copious amounts of historical customer data from multiple banking channels combined with government data and social media data are providing banks the opportunity to do microsegmentation and create unprecedented customer intimacy. Big data approaches to micro-targeting and pricing algorithms, which Rob recently discussed in his blog on Nomis, enable banks and retailers alike to target individuals and customize pricing based on an individual's propensity to act. While partnerships in the financial services arena are still held close to the vest, the universal financial services providers – Bank of America, Citigroup, JPMorgan Chase and Wells Fargo – are making considerable investments in all of the above-mentioned areas of big data analytics.

Industries other than retail, healthcare and banking are also seeing tangible value in big data analytics. Governments are using it to provide proactive monitoring and responses to catastrophic events. Product and design companies are leveraging big data analytics for everything from advertising attribution to crowdsourcing of new product innovation. Manufacturers are preventing downtime by studying interactions within systems and predicting machine failures before they occur. Airlines are recalibrating their flight routing systems in real time to avoid bad weather. From hospitality to telecommunications to entertainment and gaming, companies are publicizing their big data-related success stories.

Our research shows that until now, big data analytics has primarily been the domain of larger, digitally advanced enterprises. However, as use cases make their way through business and their tangible value is accepted, I anticipate that the activity around big data analytics will increase with companies that reside in the small and midsize business market. At this point, just about any company that is not considering how big data analytics may impact its business faces an unknown and uneasy future. What a difference a year makes, indeed.

Regards,

Tony Cosentino

VP and Research Director

Organizations today must manage and understand a flood of information that continues to increase in volume and turn it into competitive advantage through better decision making. To do that organizations need new tools, but more importantly, the analytical process knowledge to use them well. Our benchmark research into big data and business analytics found that skills and training are substantial obstacles to using big data (for 79%) and analytics (77%) in organizations.

But proficiency around technology and even statistical knowledge are not the only capabilities needed to optimize an organization’s use of analytics. A framework that complements the traditional analytical modeling process helps ensure that analytics are used correctly and will deliver the best results. I propose the following five principles that are concerned less with technology than with people and processes. (For more detail on the final two, see my earlier perspective on business analytics.)

Ask the right questions. Without a process for getting to the right question, the one that is asked often is the wrong one, yielding results that cannot be used as intended. Getting to the right question is a matter of defining goals and terms; when this is done, the “noise” of differing meanings is reduced and people can work together efficiently. Companies talk about strategic alignment, brand loyalty, big data and analytics, to name a few, yet these terms can mean different things to different people. Take time to discuss what people really want to know; describing something in detail ensures that everyone is on the same page. Strategic listening is a critical skill, and done right it will enable the analyst to identify, craft and focus the questions that the organization needs answered through the analytic process.

Take a Bayesian perspective. Bayesian analysis, which centers on posterior probability, combines prior beliefs with new evidence to arrive at updated probabilities. In a practical sense, it's about updating a hypothesis when given new information; it's about taking all available information and seeing where it is convergent. Of course, the more you know about the category you're dealing with, the easier it is to separate the wheat from the chaff in terms of valuable information. Category knowledge allows you to look at the data from a different perspective and add complex existing knowledge. This, in and of itself, is a Bayesian approach, and it allows the analyst to iteratively take the investigation in the right direction. Bayesian analysis has not only had a great impact on statistics and market insights in recent years; it has also influenced how we view important historical events. For those interested in how the Bayesian philosophy is taking hold in many different disciplines, there is an interesting book entitled The Theory That Would Not Die.
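
For readers who want to see the mechanics of updating a belief with new evidence, here is a minimal Beta-Binomial example in Python (the prior and the trial numbers are invented for illustration): a weakly held prior about a conversion rate is combined with observed data to produce a posterior estimate.

```python
# Minimal Bayesian updating sketch (Beta-Binomial conjugate pair); numbers are invented.
from scipy import stats

# Prior belief about a campaign's conversion rate: roughly 10%, held weakly.
prior_alpha, prior_beta = 2, 18          # prior mean = 2 / (2 + 18) = 0.10

# New evidence: 300 offers, 45 conversions.
conversions, trials = 45, 300

# Posterior combines the prior with the data.
post_alpha = prior_alpha + conversions
post_beta = prior_beta + (trials - conversions)
posterior = stats.beta(post_alpha, post_beta)

print(f"posterior mean = {posterior.mean():.3f}")      # pulled toward the data (~0.147)
print("95% credible interval:", posterior.interval(0.95))
```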

Don't try to prove what you already know. Let the data guide the analysis rather than allowing predetermined beliefs to guide it. Physicist Enrico Fermi pointed out that measurement is the reduction of uncertainty. Analysts should start with a hypothesis and try to disprove it rather than prove it; from there, iteration is needed to come as close to the truth as possible. If we start with a gut feeling and try to prove that gut feeling, we are taking the wrong approach. The point is that when an analysis sets out to prove what we already believe to be true, the results are rarely surprising and the analysis is likely to add nothing new.

Think in terms of "so what." Moving beyond the "what" (i.e., measurement) to the "so what" (i.e., insights) should be a goal of any analysis, yet many are still turning out analyses that do nothing more than state the facts. Maybe 54 percent of people in a study prefer white houses, but why does anyone care that 54 percent of people prefer white houses? Analyses must move beyond findings to answer critical business questions and provide informed insights, implications and even full recommendations.

Be sure to address the "now what." The analytics professional should make sure that the findings, implications and recommendations of the analysis are heard. This is the final step in the analytic process, the "now what" – the actual business planning and implementation decisions that are driven by the analytic insights. If those insights do not lead to decision-making or action, then the effort has no value. There are a number of things the analyst can do to help ensure that the information is heard. A compelling story line that incorporates animation and dynamic presentation is a good start. Depending on the size of the initiative, professional videography, implementation of learning systems and change management tools should also be involved.

Just because our business technology innovation research finds analytics to be the top priority, ranked first in 39 percent of organizations, does not mean that adopting it will bring immediate success. In order to implement a successful framework such as the one described above, organizations should build this or a similar approach into their training programs and analytical processes. The benefits will be wide-ranging, including more targeted analysis, greater analytical depth and analytical initiatives that have a real impact on decision making in the organization.

Regards,

Tony Cosentino

VP and Research Director

Our benchmark research into business technology innovation found that organizations consider analytics the most important new technology for improving their performance; they ranked big data only fifth out of six choices. This and other findings indicate that the best way for big data to contribute value to today's organizations is to be paired with analytics. Recently, I wrote about what I call the four pillars of big data analytics on which the technology must be built. These pillars are big data and information optimization, predictive analytics, right-time analytics and the discovery and visualization of analytics. These components gave me a framework for looking at Teradata's approach to big data analytics during the company's analyst conference last week in La Jolla, Calif.

The essence of big data is to optimize the information the business uses for whatever the need, which my colleague has identified as a key value of these investments. Data diversity presents a challenge to most enterprise data warehouse architectures. Teradata has been dealing with large, complex sets of data for years, but today's different data types are forcing new modes of processing in enterprise data warehouses. Teradata is addressing this issue by focusing on a workload-specific architecture that aligns with MapReduce, statistics and SQL. Its Unified Data Architecture (UDA) incorporates the Hortonworks Hadoop distribution, the Aster Data platform and Teradata's stalwart RDBMS EDW. The Big Data Analytics appliance that encompasses the UDA framework won our annual innovation award in 2012. The system is connected through InfiniBand and accesses Hadoop's metadata layer directly through HCatalog. Bringing these pieces together represents the type of holistic thinking that is critical for handling big data analytics; at the same time there are some costs, as the system includes two MapReduce processing environments. For more on the UDA architecture, read my previous post on Teradata as well as my colleague Mark Smith's piece.

Predictive analytics is another foundational piece of big data analytics and one of the top priorities in organizations. However, according to our big data research, it is not available in 41 percent of organizations today. Teradata is addressing it in a number of ways, and at the conference Stephen Brobst, Teradata's CTO, likened big data analytics to a high-school chemistry classroom that has a chemical closet from which you pull out the chemicals needed to perform an experiment in a separate work area. In this analogy, Hadoop and the RDBMS EDW are the chemical closet, and Aster Data provides the sandbox where the experiment is conducted. With multiple algorithms currently written into the platform and many more promised over the coming months, this sandbox provides a promising big data lab environment. The approach is SQL-centric and as such has its pros and cons. The obvious advantage is that SQL is a declarative language that is easier to learn than procedural languages, and an established skills base exists within most organizations. The disadvantage is that SQL is not the native tongue of many business analysts and statisticians. While it may be easy to call a function within the context of a SQL statement, the same person who can write the statement may not know when and where to call the function. One way for Teradata to expediently address this need is through its existing partnerships with companies like Alteryx, which I wrote about recently. Alteryx provides a user-friendly analytical workflow environment and is establishing a solid presence on the business side of the house. Teradata already works with predictive analytics providers like SAS but should expand further with companies like Revolution Analytics, which I have assessed and which uses R technology to support a new generation of tools.

Teradata is exploiting its advantage with algorithms such as nPath, which shows the path that a customer has taken to a particular outcome such as buying or not buying. According to our big data benchmark research, being able to conduct what-if analysis and predictive analytics are the two most desired capabilities not currently available with big data, as the chart shows. The algorithms that Teradata is building into Aster help address this challenge, but despite customer case studies shown at the conference, Teradata did not clearly demonstrate how this type of algorithm and others integrate seamlessly to address the overall customer experience or other business challenges. While presenters described it in terms of improving churn and fraud models, and we can imagine how the handoffs might occur, the presentations were more technical in nature. As Teradata gains traction with these types of analytical approaches, it will behoove the company to show not just how the algorithms and SQL work but how they are used by business people and analysts who are not as technically savvy.
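
To give non-technical readers a feel for what path-to-outcome analysis produces, here is a small pandas sketch (this is not Aster's nPath SQL, and the event data is invented): order each customer's events by time, concatenate them into a path, and tabulate which paths end in a purchase.

```python
# Sketch of path-to-outcome analysis in pandas (not Aster nPath SQL); data is invented.
import pandas as pd

events = pd.DataFrame({
    "customer": [1, 1, 1, 2, 2, 3, 3, 3],
    "ts":       [1, 2, 3, 1, 2, 1, 2, 3],
    "event":    ["search", "product_page", "buy",
                 "search", "abandon",
                 "email_click", "product_page", "buy"],
})

# Build each customer's ordered event path.
paths = (events.sort_values(["customer", "ts"])
               .groupby("customer")["event"]
               .agg(" > ".join))

summary = paths.to_frame("path")
summary["converted"] = summary["path"].str.endswith("buy")
print(summary["path"].value_counts())            # which paths occur, and how often
print("conversion paths:\n", summary[summary.converted])
```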

Another key principle behind big data analytics is timeliness of the analytics. Given the nature of business intelligence and traditional EDW architectures, until now timeliness of analytics has been associated with how quickly queries run. This has been a strength of the Teradata MPP shared-nothing architecture, but other appliance architectures, such as those of Netezza and Greenplum, now challenge Teradata's hegemony in this area. Furthermore, trends in big data make the situation more complex. In particular, with very large data sets, many analytical environments have replaced traditional row-level access with column access. Column access is a more natural way for data to be accessed for analytics since it does not have to read through an entire row of data that may not be relevant to the task at hand. At the same time, column-level access has downsides, such as the reduced speed at which you can write to the system; also, as the data set used in the analysis expands to a high number of columns, it can become less efficient than row-level access. Teradata addresses this challenge by providing both row and column access through innovative proprietary access and computation techniques.

Exploratory analytics on large, diverse data sets also has a timeliness imperative. Hadoop promises the ability to conduct iterative analysis on such data sets, which is the reason that companies store big data in the first place according to our big data benchmark research. Iterative analysis is akin to the way the human brain naturally functions, as one question naturally leads to another question. However, methods such as Hive, which allows an SQL-like method to access Hadoop data, can be very slow, sometimes taking hours to return a query. Aster enables much faster access and therefore provides a more dynamic interface for iterative analytics on big data.

Timeliness also has to do with incorporating big data in a stream-oriented environment, and only 16 percent of organizations are very satisfied with the timeliness of events, according to our operational intelligence benchmark research. In a use case such as fraud and security, rule-based systems work with complex algorithmic functions to uncover criminal activity. While Teradata itself does not provide the streaming or complex event processing (CEP) engines, it can provide the big data analytical sandbox and algorithmic firepower necessary to supply the appropriate algorithms for these systems. Teradata partners with major players in this space already, but it would be well served to further partner with CEP and other operational intelligence vendors to expand its footprint. By the way, these vendors will be covered in our upcoming Operational Intelligence Value Index, which is based on our operational intelligence benchmark research. This same research showed that analyzing business and IT events together is very important in 45 percent of organizations.

The visualization and discovery of analytics is the last foundational pillar, and here Teradata is still a work in progress. While some of the big data visualizations Aster generates show interesting charts, they lack a context to help people interpret the chart. Furthermore, the visualization is not as intuitive as it could be and requires the writing and customization of SQL statements. To be fair, most visual and discovery tools today are relationally oriented, and Teradata is trying to visualize large and diverse sets of data. Furthermore, Teradata partners with companies including MicroStrategy and Tableau to provide more user-friendly interfaces. As Teradata pursues the big data analytics market, it will be important to demonstrate how it works with its partners to build a more robust and intuitive analytics workflow environment and visualization capability for the line-of-business user. Usability (63%) and functionality (49%) are the top two considerations when evaluating business intelligence systems, according to our research on next-generation business intelligence.

Like other large industry technology players, Teradata is adjusting to the changes brought by business technology innovation in just the last few years. Given its highly scalable databases and data modeling – areas that still represent the heart of most companies' information architectures – Teradata has the potential to pull everything together and leverage its currently deployed base. Technologists looking at Teradata's new and evolving capabilities will need to understand the business use cases and share these with the people in charge of such initiatives. For business users, it is important to realize that big data is more than just visualizing disparate data sets and that greater value lies in setting up an efficient back-end process that applies the right architecture and tools to the right business problem.

Regards,

Tony Cosentino
VP and Research Director

Last week, IBM brought industry analysts to its famed Almaden Research Center, where the company outlined its big data analytics strategy and introduced a number of new innovations. Big data is no new topic to IBM, which has for decades helped organizations store and use data. But technology has changed over those decades, and IBM is working hard to ensure it is part of the future and not just the past. Our latest business technology innovation research into big data technology finds that retaining and analyzing more data is the first-ranked priority in 29 percent of organizations. From both an IT and a business perspective, big data is critical to IBM’s future success.

On the strategy side, there was much discussion at the event around use cases and the different patterns of deployment for big data analytics. Inhi Cho Suh, vice president of strategy, outlined five compelling use cases for big data analytics:

  1. Discovery and visualization. These types of exploratory analytics in a federated environment are a big part of big data analytics, since they can unlock patterns that can be useful in areas as diverse as determining a root cause of an airline issue or understanding relationships among buyers. IBM is working hard to ensure that products such as IBM Cognos Insight can evolve to support a new generation of visual discovery for big data.
  2. 360-degree view of the customer. By bringing together data sources and applying analytics to increase such things as customer loyalty and share of wallet, companies can gain more revenue and market share with fewer resources. IBM needs to ensure it can actually support a broad array of information about customers – not just transactional or social media data but also voice and text-based mobile interactions.
  3. Security and intelligence. This use case covers fraud and real-time cyber security, where companies leverage big data to predict anomalies and contain risk. IBM has been enhancing its ability to process real-time streams and transactions across any network, an important area for the company as it works to drive competitive advantage.
  4. Operational analysis. This is the ability to leverage networks of instrumented data sources to enable proactive monitoring through baseline analysis and real-time feedback mechanisms. The need for better operational analytics continues to increase. Our latest research on operational intelligence finds that organizations that use dedicated tools to handle this need will be more satisfied and gain better outcomes than those that do not.
  5. Data warehouse augmentation. Big data stores can replace some traditional data stores and archival systems, allowing larger sets of data to be analyzed, providing better information and leading to more precise decision-making. It should be no surprise that IBM has customers with some of the largest data warehouse deployments, and the company can help them evaluate their technology and improve or replace existing investments.

Prior to Inhi taking the stage, Dave Laverty, vice president of marketing, walked through the new technologies being introduced. The first announcement was the BLU Accelerator – dynamic in-memory technology that promises to improve both performance and manageability on DB2 10.5. In tests, IBM says it achieved better than 10,000x performance gains on queries. The secret sauce lies in the ability to retrieve data from a column store, maximize CPU processing and skip data that is not needed for the particular analysis at hand. The benefits to the user are much faster performance across very large data sets and a reduction in manual SQL optimization. Our latest research into business technology innovation finds that in-memory technology is the technology most planned for use with big data in the next two years (22%), ahead of Hadoop (20%), data warehouse appliances (19%), specialized databases (19%) and RDBMSes (10%).
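
To make the data-skipping idea concrete, here is a minimal Python sketch of the general technique – keeping min/max synopsis metadata per block of a column so that whole blocks can be ignored when they cannot satisfy a query. It illustrates the concept only, not IBM's BLU implementation; all names and data are invented.

```python
# Conceptual sketch of columnar data skipping (zone-map style), not BLU itself.
from dataclasses import dataclass
from typing import List

@dataclass
class ColumnBlock:
    values: List[float]
    min_val: float
    max_val: float

def build_blocks(column: List[float], block_size: int = 4) -> List[ColumnBlock]:
    """Split a column into blocks and record min/max synopsis metadata."""
    blocks = []
    for i in range(0, len(column), block_size):
        chunk = column[i:i + block_size]
        blocks.append(ColumnBlock(chunk, min(chunk), max(chunk)))
    return blocks

def sum_where_greater(blocks: List[ColumnBlock], threshold: float) -> float:
    """Evaluate SUM(x) WHERE x > threshold, skipping blocks the metadata rules out."""
    total = 0.0
    for block in blocks:
        if block.max_val <= threshold:
            continue  # the whole block is skipped without reading its values
        total += sum(v for v in block.values if v > threshold)
    return total

if __name__ == "__main__":
    sales = [10, 12, 11, 9, 250, 300, 280, 260, 15, 14, 13, 12]
    print(sum_where_greater(build_blocks(sales), 100))  # only the middle block is scanned
```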

An intriguing comment from one of IBM's customers was, "What is bad SQL in a world with BLU?" An important extension of that question might be, "What is the future role of database administrators, given new advancements in databases, and how do we leverage that skill set to fill the big data analytics gap?" According to our business technology innovation research, staffing (79%) and training (77%) are the two biggest challenges to implementing big data analytics.

One of IBM's answers to the skills gap comes in the form of BigSQL. A newly announced feature of InfoSphere BigInsights 2.1, BigSQL layers on top of BigInsights to provide access through industry-standard SQL and SQL-based applications. Access to Hadoop has been a sticking point for organizations, since they have traditionally needed to write procedural code to get at Hadoop data. BigSQL is similar in function to Greenplum's Pivotal, Teradata Aster and Cloudera's Impala, all of which use SQL to mine data out of Hadoop and aim to provide access for SQL-trained users and for SQL-based applications, which represent the majority of BI tools currently deployed in industry. The challenge for IBM, with a product portfolio that includes BigInsights and Cognos Insight, is to offer a clear message about which products meet which analytic needs for which types of business and IT professionals. IBM should also clarify when to use big data analytics software from partners such as Datameer, which participated in an industry panel at the event and is part of the IBM global educational tour that I have also analyzed.
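
The practical difference a SQL layer makes can be seen in a small sketch: the same aggregation written as hand-coded map and reduce steps versus a single declarative SQL statement handed off to the engine. This is a conceptual illustration in Python, not BigSQL's actual API; the record fields and query are invented.

```python
# Procedural versus declarative access to the same aggregation.
from itertools import groupby
from operator import itemgetter

records = [
    {"region": "east", "amount": 100.0},
    {"region": "west", "amount": 40.0},
    {"region": "east", "amount": 60.0},
]

# Procedural route: hand-written map and reduce steps over the raw records.
def map_phase(record):
    yield record["region"], record["amount"]

def reduce_phase(key, values):
    return key, sum(values)

mapped = sorted((kv for r in records for kv in map_phase(r)), key=itemgetter(0))
reduced = [reduce_phase(k, (v for _, v in grp))
           for k, grp in groupby(mapped, key=itemgetter(0))]
print(reduced)  # [('east', 160.0), ('west', 40.0)]

# Declarative route: what a SQL-trained user or BI tool would submit instead,
# leaving the execution plan to the SQL-on-Hadoop engine.
QUERY = "SELECT region, SUM(amount) FROM sales GROUP BY region"
```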

Another IBM announcement was the PureData System for Hadoop. This appliance approach to Hadoop provides a turnkey solution that can be up and running in a matter of hours, and as you would expect from an appliance, it allows for consistent administration, workflow, provisioning and security with BigInsights. It also provides access to Hadoop through BigSheets, which presents summary information about the unstructured data in Hadoop and was already part of the BigInsights platform. Phil Francisco, vice president of big data product management and strategy, pointed out use cases around archiving, cold-storage analysis and bringing many unstructured sources together. The PureData System for Hadoop, due out in the second half of the year, adds a third version to the BigInsights lineup, which also includes the free web-based version and the Enterprise version. Expanding its appliances to support Hadoop is critical as more organizations look to exploit the processing power of Hadoop technology for their database and information management needs.

Other announcements included new versions of InfoSphere Streams and Informix TimeSeries for reporting and analytics using smart meter and sensor data. Both help with real-time analytics on big data, depending on the business and architectural needs of an organization. The integration of database and streaming analytics is a key area in which IBM differentiates itself in the market.

Late in the day, Les Rechan, general manager for business analytics, told the crowd that he and Bob Picciano, general manager for information management, had recently promised the company $20 billion in revenue. That statement is important because in the age of big data, information management and analytics must be considered together, and the company needs a strong relationship between these two leaders to meet such an ambitious objective. In an interview, Rechan told me that the teams realize this and are working hand in glove across strategy, product development and marketing. The camaraderie between the two was clear during the event and bodes well for the organization. Ultimately, IBM will need to articulate why it should be considered for big data, since our technology innovation research finds that organizations today are less concerned with validating a vendor by its size (23%) than with the usability of the technology (64%).

IBM's big data platform seems to be less a specific offering than an ethos of how to think about big data and big data analytics in a common-sense way. The focus on five well-thought-out use cases gives customers a frame for thinking through the benefits of big data analytics and a head start on their business cases. Given the confusion in the market around big data, that common-sense approach serves the market well, and it is very much aligned with our own philosophy of focusing on what we call the business-oriented Ws rather than the technology-oriented Vs.

Big data analytics, and in particular predictive analytics, is complex and difficult to integrate into current architectures. Our benchmark research into predictive analytics shows that architectural integration is the biggest inhibitor, cited by 55 percent of companies – a message IBM should take to heart as it integrates its predictive analytics tools with its big data technology options. Predictive analytics is the most important capability (49%) for business analytics, according to our technology innovation research, and IBM needs to show more solutions that integrate predictive analytics with big data.

H.L. Mencken once said, “For every complex problem there is an answer that is clear, simple and wrong.” Big data analytics is a complex problem, and the market is still early. The latent benefit of IBM’s big data analytics strategy is that it allows IBM to continue to innovate and deliver without playing all of its chips at one time. In today’s environment, many supplier companies don’t have the same luxury.

As I pointed out in my blog post on the four pillars of big data analytics, our research and clients are moving toward addressing big data and analytics in a more holistic and integrated manner. The shift in focus is less about how organizations store or process information than about how they use it. Some may argue that IBM's cadence reflects the company's size and is actually a competitive disadvantage, but I would argue that size and innovation leadership are not mutually exclusive. As companies grapple with the onslaught of big data and analytics, no one should underestimate IBM's outcomes-based and services-driven approach, but to succeed IBM also needs to ensure it can meet the needs of organizations at a price they can afford.

Regards,

Tony Cosentino

VP and Research Director

Big data analytics is being offered as the key to addressing a wide array of management and operational needs across business and IT. But the label "big data analytics" is used in a variety of ways, confusing people about its usefulness and value and about how best to implement it to drive business value. The uncertainty this causes poses a challenge for organizations that want to take advantage of big data in order to gain competitive advantage, comply with regulations, manage risk and improve profitability.

Recently, I discussed a high-level framework for thinking about big data analytics that aligns with former Census Director Robert Groves’ ideas of designed data on the one hand and organic data on the other. This second article completes that picture by looking at four specific areas that constitute the practical aspects of big data analytics – topics that must be brought into any holistic discussion of big data analytics strategy. Today, these often represent point-oriented approaches, but architectures are now coming to market that promise more unified solutions.

Big data and information optimization: the intersection of big data analytics and traditional approaches to analytics. Analytics performed by database professionals often differs significantly from analytics delivered by line-of-business staffers who work in more flat-file-oriented environments. Today, advancements in in-memory systems, in-database analytics and workload-specific appliances provide scalable architectures that bring processing to the data source and allow organizations to push analytics out to a broader audience, but how to bridge the divide between the two kinds of analytics is still a key question. Given the relative immaturity of new technologies and the dominance of relational databases for information delivery, it is critical to examine how all analytical assets will interact with core database systems. As we move toward operationalizing analytics on an industrial scale, many current advanced analytical approaches break down because they require pulling data into a separate analytic environment and do not leverage advances in parallel computing. Furthermore, organizations need to determine how they can apply existing skill sets and analytical access paradigms – such as business intelligence tools, SQL, spreadsheets and visual analysis – to big data analytics. Our recent big data benchmark research shows that the skills gap is the biggest issue facing analytics initiatives, with staffing and training cited as obstacles in more than three-quarters of organizations.
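
A small sketch helps clarify the pull-versus-push-down distinction discussed above. It uses Python's built-in sqlite3 purely as a stand-in for a scalable analytic database; the table and data are invented for illustration.

```python
# Pulling data into a separate analytic environment versus pushing the
# computation down to the data source (what in-database analytics generalizes).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 300.0), (2, 45.0), (3, 10.0)],
)

# Pull-based approach: extract every row, then aggregate in the analytic tool.
rows = conn.execute("SELECT customer_id, amount FROM transactions").fetchall()
pulled = {}
for customer_id, amount in rows:
    pulled[customer_id] = pulled.get(customer_id, 0.0) + amount

# Push-down approach: the aggregation runs where the data lives and only the
# small result set moves across the wire.
pushed = dict(conn.execute(
    "SELECT customer_id, SUM(amount) FROM transactions GROUP BY customer_id"
))

assert pulled == pushed
print(pushed)
```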

Visual analytics and data discovery: Visualizing data is a hot topic, especially in big data analytics. Much of big data analysis is about finding patterns in data and visualizing them so that people can tell a story and give context to large and diverse sets of data. Exploratory analytics allows us to develop and investigate hypotheses, reduce data, do root-cause analysis and suggest modeling approaches for our predictive analytics. Until now the focus of these tools has been on descriptive statistics in SQL or flat-file environments, but visual analytics vendors are now bringing predictive capabilities to market to improve usability, especially for business users. This is a difficult challenge because the inherent simplicity of descriptive visual tools clashes with the inherent complexity of predictive analytics. In addition, companies are looking to apply visualization to the output of predictive models, and visual discovery players are opening up their APIs in order to export predictive model output directly.

New tools and techniques in visualization, along with the proliferation of in-memory systems, give companies the means to sort through and make sense of big data. But exactly how these tools work, which types of visualization matter most for big data analytics and how they integrate into current big data analytics architectures are still key questions, as is the issue of how search-based data discovery approaches fit into the architectural landscape.

Predictive analytics: Visual exploration of data cannot surface all patterns, especially the most complex ones. To make sense of enormous data sets, data mining and statistical techniques can find patterns, relationships and anomalies in the data and use them to predict future outcomes for individual cases. Companies need to investigate the use of advanced analytic approaches and algorithmic methods that can transform and analyze organic data for uses such as predicting security threats, uncovering fraud or targeting product offers to particular customers.
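
As a minimal illustration of case-level prediction, the following Python sketch fits a statistical model on a handful of synthetic historical cases and then scores a new one. It assumes scikit-learn is available, and the features, data and decision threshold are invented.

```python
# Minimal sketch of predicting an outcome for an individual case (synthetic data).
from sklearn.linear_model import LogisticRegression

# Historical cases: [transactions_per_day, avg_amount] -> 1 = fraud, 0 = normal
X = [[2, 35.0], [3, 40.0], [1, 25.0], [40, 900.0], [35, 750.0], [50, 1200.0]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)

# Score an individual new case and act on the predicted probability.
new_case = [[30, 800.0]]
fraud_probability = model.predict_proba(new_case)[0][1]
if fraud_probability > 0.5:
    print(f"flag for review (p={fraud_probability:.2f})")
else:
    print(f"approve (p={fraud_probability:.2f})")
```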

Commodity models (a.k.a. good-enough models) are allowing business users to drive the modeling process. How these models can be built and consumed at the front line of the organization, with only basic oversight from a statistician or data scientist, is a key area of focus as organizations endeavor to weave analytics into their fabric. The increased load on back-end systems is another key consideration if modeling follows a dynamic, software-driven approach, as is how these models are managed and tracked. Our research on predictive analytics shows that companies that update their models more frequently have much higher satisfaction ratings than those that update less often. The research further shows that in more than half of organizations, competitive advantage and revenue growth are the primary reasons predictive analytics is deployed.
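
One way to picture the management and tracking concern is a simple retrain-and-register loop: each model version is recorded with when it was trained, on how much data, and how well it scored. This is a simplified Python placeholder, not a production model-management system; the modeling routine and metric are stand-ins.

```python
# Simplified sketch of versioned model retraining and tracking.
import datetime

model_registry = []  # each entry: version, trained_at, training_rows, score

def train_model(training_data):
    """Stand-in for whatever modeling routine the business user drives;
    the training data is ignored and the score is a placeholder metric."""
    score = 0.80 + 0.01 * len(model_registry)
    return {"coefficients": [0.1, -0.2], "score": score}

def retrain_and_register(training_data):
    model = train_model(training_data)
    model_registry.append({
        "version": len(model_registry) + 1,
        "trained_at": datetime.datetime.utcnow().isoformat(),
        "training_rows": len(training_data),
        "score": model["score"],
    })
    return model

# A daily (or more frequent) cadence simply means calling this on a schedule.
for day in range(3):
    retrain_and_register(training_data=[{"x": day}] * (100 + day))

for entry in model_registry:
    print(entry)
```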

Right-time and real-time analytics: It's important to investigate the intersection of big data analytics with right-time and real-time systems and to learn how organizations are using big data analytics in production on an industrial scale. This usage guides the decisions we make today about how to begin the task of big data analytics. Another choice organizations must make is whether to capture and store all of their data and analyze it on the back end, attempt to process it on the fly, or do both. In this context, event processing and decision management technologies represent a big part of big data analytics, since they can examine data streams for value and deliver information to the front lines of the organization immediately. How traditionally batch-oriented big data technologies such as Hadoop fit into the broader picture of right-time consumption still needs to be answered as well. Ultimately, as with many aspects of big data analytics, the discussion will need to center on the use case and how to address the time-to-value (TTV) equation.
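
A minimal Python sketch of the event-processing idea: score each reading against a sliding-window baseline as it arrives and surface anomalies immediately, rather than in a later batch job. The sensor feed and thresholds are invented for illustration.

```python
# Sliding-window check over a stream of readings, flagging outliers in-stream.
from collections import deque

def stream_monitor(readings, window_size=5, threshold=5.0):
    """Yield (index, value) for readings far from the recent moving average."""
    window = deque(maxlen=window_size)
    for i, value in enumerate(readings):
        if len(window) == window_size:
            baseline = sum(window) / window_size
            if abs(value - baseline) > threshold:
                yield i, value  # deliver the alert immediately, in-stream
        window.append(value)

if __name__ == "__main__":
    sensor_feed = [10.1, 10.3, 9.9, 10.0, 10.2, 10.1, 25.7, 10.0, 10.2]
    for index, value in stream_monitor(sensor_feed):
        print(f"anomaly at position {index}: {value}")
```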

Organizations embarking on a big data strategy must not fail to consider the four areas above. Furthermore, their discussions cannot cover just the technological approaches but must include people, processes and the entire information landscape. Often this endeavor requires a fundamental rethinking of organizational processes and a questioning of the status quo. Only then can companies see the forest for the trees.

Regards,

Tony Cosentino
VP and Research Director
