Wednesday, 4 October 2017

What is Big Data Analytics?

The term "big data" refers to digital stores of information that have a high volume, velocity and variety. Big data analytics is the process of using software to uncover trends, patterns, correlations or other useful insights in those large stores of data.
Advantage: Big data analytics enables companies to increase revenues, decrease costs and become more competitive within their industries. Many firms are investing heavily in Big Data analytics.
Big data analytics is quickly gaining adoption. Enterprises have awakened to the reality that their big data stores represent a largely untapped gold mine that could help them lower costs, increase revenue and become more competitive. They don't just want to store their vast quantities of data; they want to convert that data into valuable insights that can help improve their companies.
As a result, investment in big data analytics tools is seeing remarkable gains. According to IDC, worldwide sales of big data and business analytics tools are likely to reach $150.8 billion in 2017, which is 12.4 percent higher than in 2016. And the market research firm doesn't see that trend stopping anytime soon. It forecasts 11.9 percent annual growth through 2020 when revenues will top $210 billion.
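The IDC figures above can be sanity-checked with simple compound-growth arithmetic. The sketch below is illustrative only: the function name project is ours, not IDC's, and it assumes the 11.9 percent rate applies uniformly in each of the three years from 2017 to 2020.

```python
def project(base: float, annual_rate: float, years: int) -> float:
    """Compound a starting value forward by a fixed annual growth rate."""
    return base * (1 + annual_rate) ** years


# Figures in billions of US dollars, taken from the IDC forecast quoted above.
revenue_2017 = 150.8
revenue_2020 = project(revenue_2017, 0.119, 2020 - 2017)

# 2017 is stated to be 12.4 percent above 2016, so divide to recover 2016.
implied_2016 = revenue_2017 / 1.124

print(f"Projected 2020 revenue: ${revenue_2020:.1f} billion")
print(f"Implied 2016 revenue: ${implied_2016:.1f} billion")
```

Under that assumption the 2020 projection comes out at a little over $211 billion, which is consistent with the forecast that revenues will top $210 billion.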

Data analytics isn't new. It has been around for decades in the form of business intelligence and data mining software. Over the years, that software has improved dramatically so that it can handle much larger data volumes, run queries more quickly and run more advanced algorithms.
The market research firm Gartner sorts big data analytics tools into four categories:

  1. Descriptive Analytics: These tools tell companies what happened. They create simple reports and visualizations that show what occurred at a particular point in time or over a period of time. These are the least advanced analytics tools.
  2. Diagnostic Analytics: Diagnostic tools explain why something happened. More advanced than descriptive reporting tools, they allow analysts to dive deep into the data and determine root causes for a given situation.
  3. Predictive Analytics: Among the most popular big data analytics tools available today, predictive analytics tools use highly advanced algorithms to forecast what might happen next. Often these tools make use of artificial intelligence and machine learning technology.
  4. Prescriptive Analytics: A step above predictive analytics, prescriptive analytics tools tell organizations what they should do in order to achieve a desired result. These tools require very advanced machine learning capabilities, and few solutions on the market today offer true prescriptive capabilities.
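To make the first three categories concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the sales records are invented, and the straight-line forecast is a deliberately simple stand-in for the far more advanced algorithms that real predictive tools use. Prescriptive analytics is left out because it would need a model of which actions are available and what they cost.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly sales records: (month index, region, units sold).
sales = [
    (1, "north", 100), (1, "south", 80),
    (2, "north", 110), (2, "south", 60),
    (3, "north", 120), (3, "south", 40),
]

# Descriptive: what happened? Total units sold per month.
monthly_totals = defaultdict(int)
for month, _region, units in sales:
    monthly_totals[month] += units


def change_by_region(first_month, last_month):
    """Diagnostic: why did it happen? Split the change in units by region."""
    units = {(m, r): u for m, r, u in sales}
    regions = sorted({r for _m, r, _u in sales})
    return {r: units[(last_month, r)] - units[(first_month, r)] for r in regions}


def forecast_next(totals):
    """Predictive: what might happen next? Extend a least-squares line."""
    xs, ys = list(totals.keys()), list(totals.values())
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar + slope * (max(xs) + 1 - x_bar)


print(dict(monthly_totals))      # descriptive summary
print(change_by_region(1, 3))    # diagnostic breakdown
print(forecast_next(monthly_totals))  # predictive estimate for month 4
```

In this toy data set the descriptive view shows total sales falling, the diagnostic view shows that the decline comes entirely from the south region, and the predictive view extends the downward trend by one month.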

Benefits of Big Data Analytics

Organizations decide to deploy big data analytics for a wide variety of reasons, including the following:
  • Business Transformation – In general, executives believe that big data analytics offers tremendous potential to revolutionize their organizations. In the 2016 Data & Analytics Survey from IDGE, 78 percent of people surveyed agreed that over the next one to three years the collection and analysis of big data could fundamentally change the way their companies do business.
  • Competitive Advantage – In the MIT Sloan Management Review Research Report Analytics as a Source of Business Innovation, sponsored by SAS, 57 percent of enterprises surveyed said their use of analytics was helping them achieve competitive advantage, up from 51 percent who said the same thing in 2015.
  • Innovation – Big data analytics can help companies develop products and services that appeal to their customers, and it can help them identify new opportunities for revenue generation. Also in the MIT Sloan Management survey, 68 percent of respondents agreed that analytics has helped their company innovate. That's an increase from 52 percent in 2015.
  • Lower Costs – In the NewVantage Partners Big Data Executive Survey 2017, 49.2 percent of companies surveyed said that they had successfully decreased expenses as a result of a big data project.
  • Improved Customer Service – Organizations often use big data analytics to examine social media, customer service, sales and marketing data. This can help them better gauge customer sentiment and respond to customers in real time.
  • Increased Security – Another key area for big data analytics is IT security. Security software creates an enormous amount of log data. By applying big data analytics techniques to this data, organizations can sometimes identify and thwart cyberattacks that would otherwise have gone unnoticed.
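The security use case in the last item can be illustrated with a very small example. The sketch below assumes a made-up log format (timestamp, event name, then key=value pairs) and a made-up function name, suspicious_ips; real security products emit their own formats and apply far more sophisticated detection than a simple count.

```python
from collections import Counter

# Hypothetical log lines; the format is invented for this example.
log_lines = [
    "2017-10-04T09:00:01 LOGIN_FAILED user=alice ip=203.0.113.7",
    "2017-10-04T09:00:02 LOGIN_FAILED user=admin ip=203.0.113.7",
    "2017-10-04T09:00:03 LOGIN_OK user=bob ip=198.51.100.2",
    "2017-10-04T09:00:04 LOGIN_FAILED user=root ip=203.0.113.7",
    "2017-10-04T09:00:05 LOGIN_FAILED user=bob ip=198.51.100.2",
]


def suspicious_ips(lines, threshold=3):
    """Return source IPs with at least `threshold` failed logins."""
    failures = Counter()
    for line in lines:
        fields = line.split()
        # Skip malformed lines and any event other than a failed login.
        if len(fields) < 2 or fields[1] != "LOGIN_FAILED":
            continue
        attrs = dict(f.split("=", 1) for f in fields[2:] if "=" in f)
        if "ip" in attrs:
            failures[attrs["ip"]] += 1
    return sorted(ip for ip, count in failures.items() if count >= threshold)


print(suspicious_ips(log_lines))
```

A single address that fails to log in under several different user names is the kind of pattern that is easy to miss when reading logs by hand but trivial to surface once the data is aggregated.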

Big Data Analytics Tools

Big data analytics has become so trendy that nearly every major technology company sells a product with the "big data analytics" label on it, and a huge crop of startups also offers similar tools. Cloud-based big data analytics tools have become particularly popular. In fact, the 2016 Big Data Maturity Survey conducted by AtScale found that 53 percent of those surveyed were already using cloud-based big data solutions, and 72 percent planned to do so in the future. Open source tools like Hadoop are also very important, often providing the backbone of commercial solutions.
The lists below are not exhaustive, but they do include a sampling of some of the better-known big data analytics solutions.

Open Source Big Data Analytics Tools

Big Data Analytics Vendors

How to Select a Big Data Application

Choosing big data software is a complicated process that requires a careful evaluation of your goals and the solutions available from vendors.

To be sure, big data solutions are in great demand. Today, enterprise leaders know that their big data is one of their most valuable resources — and one they can't afford to ignore. As a result, they are looking for hardware and software that can help them store, manage and analyze their big data.
According to IDC, enterprises will likely spend $150.8 billion on big data and analytics in 2017, 12.4 percent more than they spent last year. And that spending is likely to increase at 11.9 percent per year through 2020, when revenues will likely top $210 billion.
Much of that revenue is going toward big data applications. IDC forecasts that spending on software alone could exceed $70 billion in 2020. Spending is increasing particularly rapidly on non-relational analytic data stores (like NoSQL databases), which will likely grow 38.6 percent per year, and cognitive software platforms (like analytics tools with artificial intelligence and machine learning capabilities), which will likely grow 23.3 percent per year.
In order to capitalize on all that big data spending, vendors have slapped the "big data" label on a dizzying array of different products and services. That product proliferation can make it difficult for organizations to find the right big data applications to meet their needs. Experts suggest that a good way to start the process of selecting a big data application is to determine exactly what kind of application (or applications) you need.

Types of Big Data Applications

Enterprise software vendors offer a wide array of different types of big data applications. The kind of big data application that is right for you will depend on your goals.
For example, if you just want to expand your existing financial reporting capabilities with greater detail and depth, a data warehouse and business intelligence solution might be sufficient for your needs. If your sales and marketing teams want to use your big data to uncover new opportunities for increasing your revenue and margins, you might consider creating a data lake and/or investing in a data mining solution. If you want to create a data-driven culture where everyone in your organization is using data to guide their decision-making, you might want a data lake, predictive analytics, an in-memory database and possibly streaming analytics as well.
Things can get a little more complicated because the lines between the different types of tools can be a little fuzzy. Some business intelligence tools have data mining and predictive analytics capabilities. Some predictive analytics tools include streaming capabilities.
Your best approach is to define your goals clearly at the outset and then go looking for products that will help you reach those goals. The chart below offers an overview of some of the most common types of big data applications and how they can be useful in the enterprise.



Key Decisions When Selecting a Big Data Application

No matter which type of big data application you select, you'll need to make some key decisions that will help you narrow down your options. Here are a few of the most important of these considerations:

On-premise vs cloud-based big data applications

The first big decision you'll need to make is whether you want to host your big data software in your own data center or if you want to use a cloud-based solution.
Currently, more organizations seem to be opting for the cloud. “Global spending on big data solutions via cloud subscriptions will grow almost 7.5 times faster than on-premise subscriptions,” Brian Hopkins, Forrester vice president and principal analyst, wrote in an August 2017 blog post. “Furthermore, public cloud was the number one technology priority for big data according to our 2016 and 2017 surveys of data analytics professionals.”
Cloud-based big data applications are popular for several reasons, including scalability and ease of management. The major cloud vendors are also leading the way with artificial intelligence and machine learning research, which is allowing them to add advanced features to their solutions.
However, cloud isn't always the best option. Organizations with high compliance or security requirements sometimes find that they need to keep sensitive data on premises. In addition, some organizations already have investments in existing on-premises data solutions, and they find it more cost effective to continue running their big data applications locally or to use a hybrid approach.

Proprietary vs open source big data applications

Some of the most popular big data tools available, including the Hadoop ecosystem, are available under open source licenses. Forrester has estimated, “Firms will spend $800 million in Hadoop software and related services in 2017.”
One of the big appeals of Hadoop and other open source software is the low total cost of ownership. While proprietary solutions have hefty license fees and may require expensive specialized hardware, Hadoop has no licensing fees and can run on industry-standard hardware.
However, enterprises sometimes find it difficult to get the open source solutions up and running and configured for their needs. They may need to purchase support or consulting services, and organizations need to consider those expenses when figuring out total cost of ownership.
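The trade-off described above is easy to lay out as arithmetic. In the sketch below, every dollar figure is an invented placeholder and the cost categories are a simplification; the point is only that a zero license fee does not by itself settle the comparison once support, consulting and hardware are added in.

```python
def total_cost_of_ownership(license_per_year, support_per_year,
                            hardware_upfront, setup_consulting, years):
    """Sum one-time costs and recurring costs over the evaluation period."""
    recurring = (license_per_year + support_per_year) * years
    return hardware_upfront + setup_consulting + recurring


# Placeholder figures in US dollars, chosen only to illustrate the calculation.
proprietary = total_cost_of_ownership(
    license_per_year=200_000, support_per_year=0,
    hardware_upfront=300_000, setup_consulting=50_000, years=3,
)
open_source = total_cost_of_ownership(
    license_per_year=0, support_per_year=120_000,
    hardware_upfront=100_000, setup_consulting=150_000, years=3,
)

print(f"Proprietary, 3 years: ${proprietary:,}")
print(f"Open source, 3 years: ${open_source:,}")
```

With these particular placeholders the open source option is cheaper over three years, but a higher support or consulting bill could reverse the result, which is why those expenses belong in the calculation.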

Batch vs streaming big data applications

The earliest big data solutions, like Hadoop, processed batch data only, but enterprises increasingly find that they want to analyze data in real time. That has generated more interest in streaming solutions such as Spark, Storm, Samza and others.
Many analysts say that even if organizations don't think they need to process streaming data today, streaming capabilities are likely to become standard operating procedure in the not-too-distant future. For that reason, many organizations are moving toward Lambda architecture, a data processing architecture that can handle both real-time and batch data.
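A Lambda architecture is usually described as three parts: a batch layer that periodically recomputes results from the full data set, a speed layer that keeps up with events that arrived since the last batch run, and a serving layer that merges the two at query time. The toy class below sketches that idea for a simple event counter. The class and method names are our own, and real deployments would use distributed systems rather than in-memory Python objects.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda architecture that counts events per key."""

    def __init__(self):
        self.master_log = []            # append-only record of every event
        self.batch_view = Counter()     # batch layer output, rebuilt periodically
        self.realtime_view = Counter()  # speed layer: events since last batch run

    def ingest(self, key):
        """Record a new event in the master log and the speed layer."""
        self.master_log.append(key)
        self.realtime_view[key] += 1

    def run_batch(self):
        """Recompute the batch view from the full log, then reset the speed layer."""
        self.batch_view = Counter(self.master_log)
        self.realtime_view.clear()

    def query(self, key):
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view[key] + self.realtime_view[key]


counter = LambdaCounter()
counter.ingest("click")
counter.ingest("click")
counter.run_batch()       # the batch view now covers the first two events
counter.ingest("click")   # these arrive after the batch run
counter.ingest("view")
print(counter.query("click"), counter.query("view"))
```

Queries stay accurate between batch runs because the speed layer covers exactly the events the batch view has not yet seen.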

Characteristics to Look for in a Big Data Application

Once you have narrowed down your options, you'll need to evaluate the big data applications you are considering. The criteria below include some of the most important factors to examine.
  • Integration with Legacy Technology – Most organizations already have existing investments in data management and analytics technology. Replacing that technology completely can be expensive and disruptive, so organizations often choose to look for solutions that can be used alongside their current tools or that can augment their existing software.
  • Performance – A 2017 Talend study found that real-time analytics capabilities were one of business leaders' top IT priorities. Executives and managers need to be able to access insights in a timely manner if they are going to profit from those insights. That means investing in technology that can provide the speed they need.
  • Scalability – Big data stores get larger every day. Organizations not only need big data applications that perform quickly right now; they also need big data applications that can continue to perform quickly as data stores grow exponentially. This need for scalability is one of the key reasons why cloud-based big data applications have become very popular.
  • Usability – Organizations should also consider the "learning curve" for any big data applications that they intend to purchase. Tools with easy deployment, easy configuration, intuitive interfaces and/or similarity or integration with tools the organization already uses can provide tremendous value.
  • Visualization – According to BI-Survey.com, "Visualization and explorative data analysis for business users (known as data discovery) have evolved into the hottest business intelligence and analytics topic in today’s market." Presenting data in charts and graphs makes it easier for human brains to spot trends and outliers, speeding up the process of identifying actionable insights.
  • Flexibility – The big data needs you have today are likely very different from the needs you will have in another year or two. That's why many enterprises choose to look for tools with the capacity to serve a variety of different goals rather than performing a single function very well.
  • Security – Much of the data included in those big data stores is sensitive information that would be highly valuable to competitors, nation-states or hackers. Organizations need to ensure that their big data has adequate protection to prevent the sorts of large data breaches that have recently been dominating headlines. That means looking either for tools that have security features like encryption and strong authentication built in or tools that integrate with your existing security solutions.
  • Support – Even experienced IT professionals sometimes find it difficult to deploy, maintain and use complex big data applications. Don't forget to consider the quality and cost of the support available from the various vendors.
  • Ecosystem – Most organizations need a number of different applications to meet all of their big data needs. That means looking for a big data platform that integrates with a lot of other popular tools and a vendor with strong partnerships with other providers.
  • Self-Service Capabilities – The Harvey Nash KPMG CIO Survey 2017 found that 60 percent of CIOs consistently report talent shortages, with big data and analytics being the most in-demand skillset. Because there aren't enough qualified data scientists to go around, organizations are looking for tools that other business professionals can use on their own. A recent Gartner blog post noted that in an average organization, about 32 percent of employees are using BI and analytics.
  • Total Cost of Ownership – The upfront costs of a big data application are only a small part of the picture. Organizations need to make sure they consider related hardware costs, ongoing license or subscription fees, employee time, support costs and any expenses related to the physical space for on-premises applications. Also keep in mind that cloud computing costs have generally decreased over time.
  • Estimated Time to Value – Another important financial consideration is how quickly you'll be able to get up and running with a particular solution. Most companies would prefer to see benefit from their big data projects within days or weeks rather than months or years.
  • Artificial Intelligence and Machine Learning – Finally, consider how innovative the various big data application vendors are. AI and machine learning research are advancing at an incredible rate and becoming a mainstream part of big data analytics. Forrester has predicted, “In 2017, investments in AI will triple as firms work to convert customer data into personalized experiences.” If you choose a vendor that isn't on the cutting edge of this research, you may find yourself falling behind the competition.
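One common way to compare candidates against a list of criteria like the one above is a weighted scoring matrix. The sketch below is only an illustration: the tool names, the 1-to-5 ratings and the weights are all invented, and only four of the criteria are shown to keep it short.

```python
def weighted_score(ratings, weights):
    """Return the weighted average of a candidate's ratings."""
    total_weight = sum(weights.values())
    return sum(ratings[c] * w for c, w in weights.items()) / total_weight


# Hypothetical weights: how much each criterion matters to this organization.
weights = {"performance": 3, "scalability": 3, "usability": 2, "security": 2}

# Hypothetical ratings on a 1-to-5 scale for two made-up candidates.
candidates = {
    "Tool A": {"performance": 4, "scalability": 5, "usability": 2, "security": 3},
    "Tool B": {"performance": 3, "scalability": 3, "usability": 5, "security": 4},
}

# Rank candidates from highest to lowest weighted score.
ranking = sorted(
    candidates,
    key=lambda name: weighted_score(candidates[name], weights),
    reverse=True,
)

for name in ranking:
    print(name, round(weighted_score(candidates[name], weights), 2))
```

Changing the weights changes the ranking, which is the point of the exercise: it forces the team to state which criteria matter most before comparing products.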

Tips for Selecting a Big Data Application

Clearly, choosing the right big data application is a complicated process that involves a myriad of factors. Experts and organizations that have successfully deployed big data software offer the following advice:
  • Understand your goals — As previously mentioned, knowing what you want to accomplish is of paramount importance when choosing a big data application. If you aren't sure why you are investing in a particular technology, your project is unlikely to succeed.
  • Start small — If you can demonstrate success with a small-scale big data analytics project, that will generate interest in using the tool throughout the company.
  • Take a holistic approach — While a small-scale project can help you gain experience and expertise with your technology, it's important to choose an application that can ultimately be used throughout the business. Gartner advises, “To support a ‘data and analytics everywhere’ world, IT professionals need to create a new end-to-end architecture built for agility, scale and experimentation. Today, disciplines are merging and approaches to data and analytics are becoming more holistic and encompassing the entire business.”
  • Work together — That same blog post also notes, “Gartner recommends data and analytics leaders work proactively to spread analytics throughout their organization, to get the largest possible benefit from enabling data to drive business actions.” Many organizations are attempting to build a data-driven culture, and that requires a great deal of cooperation among business and IT leaders.
  • Go viral — Those previously mentioned self-service capabilities can also help with the creation of data-driven culture. Gartner advises, “Enable analytics to truly go viral, within and outside the enterprise. Empower more business users to perform analytics by fostering a pragmatic approach to self-service and by embedding analytic capabilities at the point of data ingestion within interactions and processes.”