
At its annual industry analyst summit last month and in a more recent announcement of enterprise support for parallelizing the R language on its Aster Discovery Platform, Teradata showed that it is adapting to changes in database and analytics technologies. The presentations at the conference revealed a unified approach to data architectures and value propositions in a variety of uses including the Internet of Things, digital marketing and ETL offloading. In particular, the company provided updates on the state of its business as well as how the latest version of its database platform, Teradata 15.0, is addressing customers’ needs for big data. My colleague Mark Smith covered these announcements in depth. The introduction of scalable R support was discussed at the conference but not announced publicly until late last month.

Teradata now has a beta release of parallelized support for R, an open source programming language widely used in universities and growing rapidly in enterprise settings. One challenge is that R relies on a single-threaded, in-memory approach to analytics. Parallelizing R allows algorithms to run on much larger data sets because processing is no longer limited to data that fits in memory. For a broader discussion of the pros and cons of R and its evolution, see my analysis. Our benchmark research shows that organizations are counting on companies such as Teradata to provide a layer of abstraction that can simplify analytics on big data architectures. More than half (54%) of advanced analytics implementations are custom built today, but organizations expect that share to fall to about one in three (36%).
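To make the constraint concrete, here is a minimal sketch in plain R, with invented data: the data set must fit in memory and the model fit runs on a single thread, and even local parallelism with the parallel package still works on in-memory copies.

```r
# Plain R: the whole data set sits in memory and the fit runs single-threaded.
x <- rnorm(1e6)
y <- 2 * x + rnorm(1e6)
fit <- lm(y ~ x)                  # limited by the RAM of the local machine

# Local parallelism helps with cores, not with data size: each worker
# still operates on an in-memory chunk (mclapply forks on Unix-alikes).
library(parallel)
chunks <- split(y, cut(seq_along(y), 4))
partial_means <- mclapply(chunks, mean, mc.cores = 2)
```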

Teradata’s R project has three parts. The first is a Teradata Aster R library, which supplies more than 100 prebuilt R functions that hide the complexity of the in-database implementation. The algorithms cover the most common big data analytic approaches in use today, which according to our big data analytics benchmark research are classification (used by 39% of organizations), clustering (37%), regression (35%), time series (32%) and affinity analysis (29%). Some use innovative approaches available in Aster, such as Teradata’s patented nPath algorithm, which is useful in areas such as digital marketing. All of these functions will receive enterprise support from Teradata, likely through its professional services team.
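As a purely hypothetical sketch of what calling such a prebuilt function might look like from an analyst's R session: the package, connection and function names below (TeradataAsterR, ta.connect, ta.data.frame, ta.kmeans) are assumptions for illustration, not the documented Aster R API.

```r
# Hypothetical usage; all names are illustrative, not confirmed API.
library(TeradataAsterR)                        # assumed package name
conn <- ta.connect(dsn = "aster_dsn")          # assumed connection helper

clicks   <- ta.data.frame("web_clickstream")   # a reference to the table, not a pull into R memory
segments <- ta.kmeans(clicks, centers = 5)     # clustering would execute in-database
head(segments)
```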

The second part of the project involves the R parallel constructor. This component gives analysts and data scientists tools to build their own parallel algorithms based on the entire library of open source R algorithms. The framework follows the “split, apply and combine” paradigm, which is popular among the R community. While Teradata won’t support the algorithms themselves, this tool set is a key innovation that I have not yet seen from others in the market.
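The paradigm itself is easy to see in plain, single-machine R; the parallel constructor's contribution is distributing the "apply" step across the database rather than a single R session. A minimal, self-contained illustration with made-up data:

```r
# Split, apply, combine in ordinary R. The constructor described above would
# run the "apply" step in parallel inside the database instead of locally.
sales <- data.frame(region  = rep(c("east", "west", "south"), each = 100),
                    revenue = runif(300, 10, 500))

pieces   <- split(sales, sales$region)                     # split by key
fits     <- lapply(pieces, function(d) mean(d$revenue))    # apply any R function
combined <- do.call(rbind, fits)                           # combine partial results
combined
```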

Finally, the R engine has been integrated with Teradata’s SNAP integration framework. The framework provides unified access to multiple workload-specific engines such as relational (SQL), graph (SQL-GR), MapReduce (SQL-MR) and statistics. This is critical since the ultimate value of analytics rests in the information itself. By tying together multiple systems, Teradata enables a variety of analytic approaches. More importantly, the data sources that can be merged into the analysis can deliver competitive advantages. For example, the recently announced JSON integration delivers information from connected devices and detailed Web data.

Teradata is participating in industry discussions about both data management and analytics. As Mark Smith discussed, its unified approach to data architecture addresses challenges brought on by competing big data platforms such as Hadoop and other NoSQL approaches, like the recently announced integration with MongoDB that supports JSON. These platforms access new information sources and help companies use analytics to indirectly increase revenues, reduce costs and improve operational efficiency. Analytics applied to big data serves a variety of uses, most often cross-selling and up-selling (for 38% of organizations), better understanding of individual customers (32%), optimizing price (30%) and optimizing IT operations (24%). Teradata is active in these areas and is working in multiple industries such as financial services, retail, healthcare, communications, government, energy and utilities.

Current Teradata customers should evaluate the company’s broader analytic and platform portfolio, not just the database appliances. In the fragmented and diverse big data market, Teradata is sorting through the chaos to provide a roadmap for organizations from the largest to the midsized. The Aster Discovery Platform can put power into the hands of analysts and statisticians who need not be data scientists. Business users from various departments, but especially high-level marketing groups that need to integrate multiple data sources for operational use, should take a close look at the Teradata Aster approach.

Regards,

Tony Cosentino

VP & Research Director

At its Partners conference in Dallas, Teradata clearly articulated a broader vision for big data and analytics. Its pitch centered on three areas – data warehousing, big data analytics and integrated marketing – that to some degree reflect Teradata’s core market and its acquisitions in the last few years of companies such as Aprimo, which provides integrated marketing technology, and Aster, in big data analytics. The keynote showcased the company’s leadership position in the increasingly complex world of open source database software, cloud computing and business analytics.

As I discussed in writing about the 2013 Hadoop Summit, Teradata has embraced technologies such as Hadoop that can be seen as both a threat and an opportunity to its status as a dominant database provider over the past 20 years. Its holistic architectural approach, appropriately named Unified Data Architecture (UDA), reflects an enlightened vision, but it relies on the ideas that separate database workloads will drive a unified logical architecture and that companies will continue to rely on today’s major database vendors to provide leadership for the new integrated approach. Our big data benchmark research finds support for this overall position, since most big data strategies still rely on a blend of approaches including data warehouse appliances (35%), in-memory databases (34%), specialized databases (33%) and Hadoop (32%).

Teradata is one of the few companies with the capability to produce a truly integrated platform, and we see evidence of this in its advances in UDA, the Seamless Network Analytics Processing (SNAP) Framework and the Teradata Aster 6 Discovery platform. The premise behind UDA is that the complexities of the different big data approaches are abstracted from users, who may access the data through Aster or other BI and visualization tools. This is important because it means that organizations and their users do not need to understand the complexities of the various emerging database approaches before using them for competitive advantage.

The developments in Aster 6 show the power of the platform to take on new and different workloads for new analytic solutions. Teradata announced three key developments for the Aster platform just before the Partners conference. A graph engine has been added to complement the existing SQL and MapReduce engines. Graph analytics has not had as much exposure as other NoSQL technologies such as Hadoop or document databases, but it is beginning to gain traction for use cases in which relationships are difficult to analyze with traditional analytics. For instance, any relationship network, including those in social media, telecommunications or healthcare, can use graph engines, and they are also being applied to basket analysis in retail and behind the scenes in areas such as master data management. The graph approach can be considered better in these situations because it is more efficient. For example, it is easier to look at a graph of a social network and understand the relationships and what is occurring than to make sense of the same data presented as rows and columns. Similarly, by working with nodes and edges, a graph database helps discover complex patterns in the data that may not be obvious otherwise.
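A small, single-machine illustration of the nodes-and-edges view, using the open source igraph package rather than Teradata's SQL-GR engine; the tiny relationship network is invented for the example.

```r
# Relationships as nodes and edges, using open source igraph (not SQL-GR).
library(igraph)

edges <- data.frame(from = c("ann", "ann", "bob", "cara", "dan"),
                    to   = c("bob", "cara", "cara", "dan",  "ann"))
g <- graph_from_data_frame(edges, directed = FALSE)

degree(g)                                          # who is most connected
shortest_paths(g, from = "bob", to = "dan")$vpath  # how two people are linked
```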

An integrated storage architecture compatible with Apache Hadoop’s HDFS file system is another important development in Aster 6. It accommodates fast ingestion and preprocessing of multistructured data. Perhaps the most important development in Aster 6 is the SNAP Framework, which integrates and optimizes execution of SQL queries across the different analytic engines. In other words, Teradata has provided a layer of abstraction that removes the need for expertise in different flavors of NoSQL and puts it into SQL, a language that many data-oriented professionals understand.

Our big data benchmark research shows that staffing and training are major challenges to big data analytics for three-fourths of organizations today, and the advances in Aster 6 address the multiple analytical access points needed in today’s big data environment. The three access points on which Teradata focuses are the data scientist, the analyst and the knowledge worker, as described in my recent post on the analytic personas that matter. For data scientists, there are open APIs and an integrated development environment (IDE) in which they can develop services directly against the unified data architecture. Analysts, who typically are less familiar with procedural programming approaches, can use the declarative paradigm of SQL to access data and call up functions within the unified data architecture. Some advanced algorithms are now included within Aster, as are a few big data visualizations such as Sankey diagrams; on that topic, I think the best interactive Sankey visualization for Aster is from Qlik Technologies and was showcased at the company’s booth at the Teradata conference. The third persona and access point is the knowledge worker, who accesses big data through BI and visualization tools. Ultimately, the Aster 6 platform brings an impressively integrated access approach to big data analytics; we have not yet seen its equal elsewhere in the market.

A key challenge Teradata faces as it repositions itself from a best-in-class database provider for data warehousing to a provider of big data and big data analytics is to articulate clearly how everything fits together to serve the business analyst. For instance, Teradata relies on partners’ tools, such as visual discovery tools and analytical workflow tools like Alteryx, to tap into the power of its database, but it is hard to see how all of these tools use Aster 6. We saw Aster 6 nPath analysis nicely displayed in an interactive Sankey diagram by QlikTech, which I recently assessed, and as an nPath node within the Alteryx advanced analytics workflow, which I also analyzed, but it is unclear how an analyst without specific SQL skills can do more than that. Furthermore, Teradata announced that its database integrates with the full library of advanced analytics algorithms from Fuzzy Logix and that, through a partnership with Revolution Analytics, R algorithms can run directly in the parallelized Teradata environment, but again it is unclear how this plays with the Aster 6 approach. This is not to downplay the integrations with Fuzzy Logix and Revolution Analytics; these are major announcements and should not be underestimated, especially for big data analytics. However, how these advances align with the Aster approach and the usability of advanced analytics is still unclear. Our research shows that usability is becoming the most important buying criterion across categories of software and types of purchasers. In the case of next-generation business intelligence, usability is the number-one buying criterion for nearly two out of three (64%) organizations. Nevertheless, Aster 6 provides an agile, powerful and multifaceted data discovery platform that addresses the skills gap, especially at the upper end of the analyst skills curve.

In an extension of this exploratory analytics position, Teradata also introduced cloud services. While vendors of BI and analytics have been laggards in the cloud, it is increasingly difficult for them to ignore. Particular use cases are analytic sandboxes and exploratory analytics, where users can add or remove cloud resources as the analytic needs of the organization change. Teradata positioned its cloud approach as TCO neutral, meaning that once all of the associated expenses of running the service are included, it will be no more or less expensive than running it on-premises. This runs counter to much industry talk about the inexpensive nature of Amazon’s Redshift platform (based on the ParAccel MPP database that I wrote about). However, IT professionals who actually run databases are sophisticated enough to understand the cost drivers and to know that a purely cost-based argument is a red herring. Network infrastructure costs, data governance, security and compliance all come into play, and these issues apply in the cloud much as they do on-premises. TCO neutral is a reasonable position for Teradata, since it shows the company knows what it takes to deploy and run big data and analytics in the cloud. Although cloud players market themselves as less expensive, there are still plenty of expenses associated with the cloud; the big differences are in the elasticity of the resources and in costs taking the form of operational rather than capital expenditures. Buyers should consider all factors before making the data warehouse cloud decision, but overall cloud strategy and use case are two critical criteria.

Its cloud computing direction is emblematic of the analytics market position Teradata aspires to occupy. For years it has under-promised and over-delivered. The company doesn’t introduce products with a lot of hype and bugs and then ask the market to help fix them. Its reputation has earned it some of the biggest clients in the world and built a high level of trust, especially within IT departments. As companies become frustrated with the lack of governance and security and the proliferation of data silos that today’s business-driven use of analytics spawns, I expect the pendulum of power to swing back toward IT. It’s hard to predict when this may happen, but Teradata will be particularly well positioned when it does. Until then, on the business side it will continue to compete with systems integration consulting firms and other giants vying for the trusted-advisor position in today’s enterprise. In this effort, Teradata has both industry and technical expertise and has established a center of excellence populated by some of the smartest minds in the big data world, including Scott Nau, Tasso Argyros and Bill Franks. I recommend Bill Franks’ Taming the Big Data Tidal Wave as one of the most comprehensive and readable books on big data and analytics.

For large and midsize companies that are already Teradata customers, midsize companies with a cloud-first charter and any established organization rethinking its big data architecture, Teradata should be on the list of vendors to consider.

Regards,

Tony Cosentino

VP and Research Director

Users of big data analytics are finally going public. At the Hadoop Summit last June, many vendors were still speaking of a large retailer or a big bank as users but could not publicly disclose the partnerships. Companies experimenting with big data analytics felt that their proofs of concept were so innovative that, once moved into production, they would yield a competitive advantage to the early mover. Now many companies are speaking openly about what they have been up to in their business laboratories. I look forward to attending the 2013 Hadoop Summit in San Jose to see how much things have changed in just a year for Hadoop-centered big data analytics.

Our benchmark research into operational intelligence, which I argue is another name for real-time big data analytics, shows diversity in big data analytics use cases by industry. The goals of operational intelligence are an interesting mix: the research shows relative parity among managing performance (59%), detecting fraud and ensuring security (59%), complying with regulations (58%) and managing risk (58%), but when we drill down into different industries some interesting nuances emerge. For instance, healthcare and banking are driven much more by risk and regulatory compliance, services such as retail are driven more by performance, and manufacturing is driven more by cost reduction. All of these make sense given the nature of the businesses. Let’s look at them in more detail.

The retail industry, driven by market forces and facing discontinuous change, is adopting big data analytics out of competitive necessity. The discontinuity comes in the form of online shopping and the need for traditional retailers to supplement their brick-and-mortar locations. JCPenney and Macy’s provide a sharp contrast in how two retailers approached this challenge. A few years ago the two companies eyed a similar competitive space, but since then Macy’s has implemented systems based on big data analytics; it now sources locally for online transactions and can optimize pricing of its more than 70 million SKUs in just one hour using SAS High Performance Analytics. The Macy’s approach has, in Sun Tzu-like fashion, turned the “showroom floor” disadvantage into a customer experience advantage. JCPenney, on the other hand, relied on gut-feel management decisions based on classic brand merchandising strategies and ended up alienating its customers, generating lawsuits and issuing a well-publicized apology to its customers. Other companies, including Sears, are doing similarly innovative work with suppliers such as Teradata and startups like Datameer in data hub architectures built around Hadoop.

Healthcare is another interesting market for big data, but the dynamics that drive it are less about market forces and more about government intervention and compliance. Laws around HIPAA, the Affordable Care Act, ICD-10 and the HITECH Act of 2009 all have implications for how these organizations implement technology and analytics. Our recent benchmark research on governance, risk and compliance indicates that many companies have significant concerns about compliance issues: 53 percent of participants said they are concerned about them, and 42 percent said they are very concerned. Electronic health records (EHRs) are moving providers to more patient-centric systems, and one goal of the Affordable Care Act is to use technology to produce better outcomes through what it calls meaningful use standards. Facing this tidal wave of change, companies including IBM analyze historical patterns and link them with real-time monitoring, helping hospitals save the lives of at-risk babies. This use case was made into a now-famous commercial by advertising firm Ogilvy about the so-called data babies. IBM has also shown how cognitive question-and-answer systems such as Watson assist doctors in diagnosing and treating patients.

Data blending, the ability to mash together different data sources without having to manipulate the underlying data models, is another analytical technique gaining significant traction. Kaiser Permanente uses tools from Alteryx, which I have assessed, to consolidate diverse data sources, including unstructured data, to streamline operations and improve customer service. The two organizations gave a joint presentation, similar to the one here, at Alteryx’s user conference in March.
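In its simplest form, blending is a join across sources on a shared key, leaving each source's model untouched; a minimal sketch in R, with invented data and column names:

```r
# Two sources blended on a shared key; neither underlying model is altered.
claims  <- data.frame(member_id = 1:4, claim_cost = c(120, 80, 450, 60))
surveys <- data.frame(member_id = c(2, 3, 4), satisfaction = c(7, 4, 9))

blended <- merge(claims, surveys, by = "member_id", all.x = TRUE)  # left join
blended
```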

Financial services, which my colleague Robert Kugel covers, is being driven by a combination of regulatory forces and competitive market forces on the sales end. Regulations produce a lag in the adoption of certain big data technologies, such as cloud computing, but areas such as fraud and risk management are being revolutionized by the ability, provided through in-memory systems, to look at every transaction rather than only a sampling of transactions through traditional audit processes. Furthermore, the ability to pair advanced analytical algorithms with in-memory, real-time rules engines helps detect fraud as it occurs, so criminal activity can be stopped at the point of transaction. On a broader scale, new risk management frameworks are becoming the strategic and operational backbone for decision-making in financial services.

On the retail banking side, copious amounts of historical customer data from multiple banking channels, combined with government data and social media data, are giving banks the opportunity to do microsegmentation and create unprecedented customer intimacy. Big data approaches to micro-targeting and pricing algorithms, which Rob recently discussed in his blog on Nomis, enable banks and retailers alike to target individuals and customize pricing based on an individual’s propensity to act. While partnerships in the financial services arena are still held close to the vest, the universal financial services providers – Bank of America, Citigroup, JPMorgan Chase and Wells Fargo – are making considerable investments in all of the above-mentioned areas of big data analytics.

Industries other than retail, healthcare and banking are also seeing tangible value in big data analytics. Governments are using it to provide proactive monitoring and responses to catastrophic events. Product and design companies are leveraging big data analytics for everything from advertising attribution to crowdsourcing of new product innovation. Manufacturers are preventing downtime by studying interactions within systems and predicting machine failures before they occur. Airlines are recalibrating their flight routing systems in real time to avoid bad weather. From hospitality to telecommunications to entertainment and gaming, companies are publicizing their big data-related success stories.

Our research shows that until now, big data analytics has primarily been the domain of larger, digitally advanced enterprises. However, as use cases make their way through business and their tangible value is accepted, I anticipate that the activity around big data analytics will increase with companies that reside in the small and midsize business market. At this point, just about any company that is not considering how big data analytics may impact its business faces an unknown and uneasy future. What a difference a year makes, indeed.

Regards,

Tony Cosentino

VP and Research Director

Our benchmark research into business technology innovation found that organizations rank analytics as the most important new technology for improving their performance; they ranked big data only fifth out of six choices. This and other findings indicate that the best way for big data to contribute value to today’s organizations is to be paired with analytics. Recently, I wrote about what I call the four pillars of big data analytics on which the technology must be built: information optimization, predictive analytics, right-time analytics, and the discovery and visualization of analytics. These components gave me a framework for looking at Teradata’s approach to big data analytics during the company’s analyst conference last week in La Jolla, Calif.

The essence of big data is to optimize the information the business uses for whatever need arises, which my colleague has identified as a key value of these investments. Data diversity presents a challenge to most enterprise data warehouse architectures. Teradata has been dealing with large, complex sets of data for years, but today’s different data types are forcing new modes of processing in enterprise data warehouses. Teradata is addressing this issue by focusing on a workload-specific architecture that aligns with MapReduce, statistics and SQL. Its Unified Data Architecture (UDA) incorporates the Hortonworks Hadoop distribution, the Aster Data platform and Teradata’s stalwart RDBMS EDW. The Big Data Analytics appliance that encompasses the UDA framework won our annual innovation award in 2012. The system is connected through InfiniBand and accesses Hadoop’s metadata layer directly through HCatalog. Bringing these pieces together represents the type of holistic thinking that is critical for handling big data analytics; at the same time there are costs, as the system includes two MapReduce processing environments. For more on the UDA architecture, read my previous post on Teradata as well as my colleague Mark Smith’s piece.

Predictive analytics is another foundational piece of big data analytics and one of the top priorities in organizations. However, according to our big data benchmark research, it is not available in 41 percent of organizations today. Teradata is addressing it in a number of ways; at the conference Stephen Brobst, Teradata’s CTO, likened big data analytics to a high-school chemistry classroom that has a chemical closet from which you pull out the chemicals needed to perform an experiment in a separate work area. In this analogy, Hadoop and the RDBMS EDW are the chemical closet, and Aster Data provides the sandbox where the experiment is conducted. With multiple algorithms already written into the platform and many more promised over the coming months, this sandbox provides a promising big data lab environment. The approach is SQL-centric and as such has its pros and cons. The obvious advantage is that SQL is a declarative language that is easier to learn than procedural languages, and an established skills base exists within most organizations. The disadvantage is that SQL is not the native tongue of many business analysts and statisticians. While it may be easy to call a function within the context of a SQL statement, the same person who can write the statement may not know when and where to call the function. One way for Teradata to expediently address this need is through its existing partnerships with companies like Alteryx, which I wrote about recently. Alteryx provides a user-friendly analytical workflow environment and is establishing a solid presence on the business side of the house. Teradata already works with predictive analytics providers like SAS but should expand further with companies like Revolution Analytics, which I assessed and which uses R technology to support a new generation of tools.

Teradata is exploiting its advantage with algorithms such as nPath, which shows the path a customer has taken to a particular outcome, such as buying or not buying. According to our big data benchmark research, the ability to conduct what-if analysis and predictive analytics are the two most desired capabilities not currently available with big data, as the chart shows. The algorithms that Teradata is building into Aster help address this challenge, but despite customer case studies shown at the conference, Teradata did not clearly demonstrate how this type of algorithm and others integrate seamlessly to address the overall customer experience or other business challenges. Presenters described it in terms of improving churn and fraud models, and we can imagine how the handoffs might occur, but the presentations were more technical in nature. As Teradata gains traction with these analytical approaches, it will behoove the company to show not just how the algorithm and the SQL work but how they work for business users and analysts who are not as technically savvy.
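To make the path-to-outcome idea concrete, here is a rough single-machine analogue in R, not the Aster SQL-MR nPath syntax itself: each customer's ordered events are collapsed into a path string so converting and non-converting journeys can be compared. The event data is invented for the sketch.

```r
# A rough R analogue of path-to-outcome analysis (illustrative only).
events <- data.frame(
  customer = c(1, 1, 1, 2, 2, 3, 3, 3),
  step     = c("ad", "product", "buy", "ad", "exit", "search", "product", "exit"),
  ts       = 1:8)

events <- events[order(events$customer, events$ts), ]               # order each journey in time
paths  <- aggregate(step ~ customer, data = events,
                    FUN = function(s) paste(s, collapse = " > "))    # collapse into a path string
paths$converted <- grepl("buy", paths$step)                          # tag the outcome
paths
```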

Another key principle behind big data analytics is the timeliness of the analytics. Given the nature of business intelligence and traditional EDW architectures, until now timeliness has been associated with how quickly queries run. This has been a strength of Teradata’s MPP shared-nothing architecture, but other appliance architectures, such as those of Netezza and Greenplum, now challenge Teradata’s dominance in this area. Furthermore, trends in big data make the situation more complex. In particular, with very large data sets, many analytical environments have replaced traditional row-level access with column access. Column access is a more natural way to access data for analytics, since a query does not have to read through entire rows of data that may not be relevant to the task at hand. At the same time, column-level access has downsides, such as slower writes to the system; also, as the analysis expands to a large number of columns, it can become less efficient than row-level access. Teradata addresses this challenge by providing both row and column access through innovative proprietary access and computation techniques.
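A loose, single-machine illustration of the trade-off: an R data.frame is stored column-wise, so an aggregate over one column touches only that vector, while iterating row by row forces every field to be read. The data here is synthetic.

```r
# Column-wise vs. row-wise access over the same synthetic data.
n  <- 2e5
df <- data.frame(amount = runif(n), quantity = rpois(n, 3))

col_sum <- sum(df$amount)                                 # scans one column vector only
row_sum <- sum(apply(df, 1, function(r) r[["amount"]]))   # touches every field of every row
all.equal(col_sum, row_sum)                               # same answer, very different work
```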

Exploratory analytics on large, diverse data sets also has a timeliness imperative. Hadoop promises the ability to conduct iterative analysis on such data sets, which, according to our big data benchmark research, is the reason companies store big data in the first place. Iterative analysis is akin to the way the human brain naturally functions, as one question naturally leads to another. However, methods such as Hive, which provides SQL-like access to Hadoop data, can be very slow, sometimes taking hours to return a query. Aster enables much faster access and therefore provides a more dynamic interface for iterative analytics on big data.

Timeliness also involves incorporating big data in a stream-oriented environment; only 16 percent of organizations are very satisfied with the timeliness of events, according to our operational intelligence benchmark research. In a use case such as fraud and security, rule-based systems work with complex algorithmic functions to uncover criminal activity. While Teradata itself does not provide streaming or complex event processing (CEP) engines, it can provide the big data analytical sandbox and the algorithmic firepower these systems need. Teradata already partners with major players in this space but would be well served to partner further with CEP and other operational intelligence vendors to expand its footprint. These vendors will be covered in our upcoming Operational Intelligence Value Index, which is based on our operational intelligence benchmark research. That same research showed that analyzing business and IT events together is very important in 45 percent of organizations.

The visualization and discovery of analytics is the last foundational pillar, and here Teradata is still a work in progress. While some of the big data visualizations Aster generates show interesting charts, they lack the context to help people interpret them. Furthermore, the visualization is not as intuitive as it could be and requires writing and customizing SQL statements. To be fair, most visual discovery tools today are relationally oriented, and Teradata is trying to visualize large and diverse sets of data. Teradata also partners with companies including MicroStrategy and Tableau to provide more user-friendly interfaces. As Teradata pursues the big data analytics market, it will be important to demonstrate how it works with its partners to build a more robust and intuitive analytics workflow environment and visualization capability for the line-of-business user. Usability (63%) and functionality (49%) are the top two considerations when evaluating business intelligence systems, according to our research on next-generation business intelligence.

Like other large industry technology players, Teradata is adjusting to the changes brought by business technology innovation in just the last few years. Given its highly scalable databases and data modeling – areas that still represent the heart of most companies’ information architectures – Teradata has the potential to pull everything together and leverage its current deployed base. Technologists looking at Teradata’s new and evolving capabilities will need to understand the business use cases and share them with the people in charge of such initiatives. For business users, it is important to realize that big data is more than just visualizing disparate data sets and that greater value lies in setting up an efficient back-end process that applies the right architecture and tools to the right business problem.

Regards,

Tony Cosentino
VP and Research Director
