The Big Promise of Big Data | Features | ChannelWorld.in


The Big Promise of Big Data

By Joab Jackson, IDG News Service on Mar 15, 2012

For Twitter, making sense of its mountains of user data was a big enough problem that it purchased another company just to help get the job done. Twitter's success depends entirely on how well it exploits the data its users generate. And it has a lot of data to work with: It hosts over 200 million accounts, which generate 230 million Twitter messages a day.

Last July, the social networking giant purchased BackType, a company with software, called Storm, that could parse live data streams such as millions of Twitter feeds. After the acquisition, Twitter released the source code of Storm, having no interest in commercializing the product itself.

Storm is valuable to Twitter's own operations because it can identify emerging topics as they unfold, in real time, on the company's service. For instance, Twitter uses the software to calculate, in real time, how widely Web addresses are shared across multiple Twitter users.

Such a job "is a really intense computation, which could involve thousands of database calls and millions of follower records," said Nathan Marz, Twitter lead engineer for Storm, who explained the technology in December at a New York conference held by Big Data software vendor DataStax.

Using a single machine, computing the reach of a Web address could take up to 10 minutes. But spread across 10 machines, Marz explained, it could execute in as little as a few seconds. For a company that makes money selling ads against emerging trends, the faster operation can be crucial.
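To make the reach computation concrete, here is a minimal Python sketch. It is not Storm code and not Twitter's implementation: the in-memory tables `TWEETERS` and `FOLLOWERS` and both function names are hypothetical stand-ins for the database calls Marz describes. Reach is assumed here to mean the number of distinct followers of everyone who shared the address, and the per-user lookups are fanned out over a worker pool much as a cluster would fan them out over machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the database lookups (illustrative data only).
TWEETERS = {"http://example.com/a": ["alice", "bob"]}
FOLLOWERS = {
    "alice": ["carol", "dave"],
    "bob": ["dave", "erin"],
}

def followers_of(user):
    """Simulate one database call returning a user's followers."""
    return set(FOLLOWERS.get(user, []))

def reach(url, workers=4):
    """Count the distinct followers of every user who shared `url`."""
    users = TWEETERS.get(url, [])
    # Each lookup is independent, so the lookups can run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        follower_sets = list(pool.map(followers_of, users))
    # The union drops people who follow more than one sharer.
    return len(set().union(*follower_sets))

print(reach("http://example.com/a"))  # prints 3
```

Because each follower lookup is independent of the others, adding workers (or machines) shortens the wall-clock time without changing the result, which is the property the 10-machine figure relies on.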

Like Twitter, many organizations are finding that they have a great deal of data on hand, and that the data could potentially be used to maximize profits and improve efficiencies -- if they can organize and analyze it quickly enough. This pursuit, made possible by a number of new technologies that are mostly open source, is often referred to as big data.

"It absolutely gives us a competitive advantage if we can better understand what people care about and better use the data we have to create more relevant experiences," said Aaron Batalion, chief technology officer for online shopping service LivingSocial, which uses technologies such as the Apache Hadoop data processing platform to glean more information about what its users want.

"The days are over when you build a product once and it just works," Batalion said. "You have to take ideas, test them, iterate them, use data and analytics to understand what works and what doesn't in order to be successful. And that's how we use our big data infrastructure."

Big data getting bigger

Last May, consulting firm McKinsey and Company issued a report that anticipated how organizations would be deluged with data in the years to come. The report also predicted that a number of industries -- including health care, the public sector, retail, and manufacturing -- would benefit by analyzing their rapidly growing mounds of data.

Collecting and analyzing transactional data will give organizations more insight into their customers' preferences. It can be used to better inform the creation of products and services, and allow organizations to remedy emerging problems more quickly.

"The use of big data will become a key basis of competition and growth for individual firms," the report concluded. "The use of big data will underpin new waves of productivity growth and consumer surplus."

Of course, Teradata, IBM and Oracle, among many others, have been offering terabyte-scale data warehouses for more than a decade. These days, however, data tends to be collected and stored in a wider variety of formats and can be processed in parallel across multiple servers, which is a necessity given the amounts of information being analyzed. In addition to exhaustively maintained transactional data from databases and carefully culled data residing in data warehouses, organizations are also reaping untold amounts of log data from servers, other forms of machine-generated data, customer comments from internal and external social networks, and other sources of loose, unstructured data.

"Traditional data systems simply don't handle big data very well, either because they can't handle the variety of data -- today's data is much less structured because it evolves very quickly, and because [such systems] just cannot scale at the rate at which they must ingest data," said Eric Baldeschwieler, chief technology officer of Hortonworks, a Yahoo spinoff company that offers a Hadoop distribution.

Such data is growing at an exponential rate, thanks to Moore's Law, pointed out Curt Monash, of Monash Research. Moore's Law holds that the number of transistors that can be placed on a chip doubles approximately every two years, a pace often quoted as every 18 months. Each new generation of processors is roughly twice as powerful as its most recent predecessor. And, not surprisingly, the power of new servers also doubles every 18 months, which means their activities will generate correspondingly larger datasets as well.
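A quick calculation shows how fast a fixed doubling period compounds. The sketch below takes the 18-month figure at face value purely for illustration; the function name and the sample time spans are assumptions, not figures from Monash.

```python
def growth_factor(years, doubling_period_years=1.5):
    """Multiplier after `years` if capacity doubles every period."""
    return 2 ** (years / doubling_period_years)

for years in (1.5, 3, 6, 9):
    print(f"after {years} years: x{growth_factor(years):.0f}")
# after 1.5 years: x2
# after 3 years: x4
# after 6 years: x16
# after 9 years: x64
```

Under that assumption, a server fleet that produces a given volume of logs today would produce 16 times as much in six years, before any growth in the number of servers is counted.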

The big data approach represents a major shift in how data is handled, said Jack Norris, vice president of marketing for MapR. Before, carefully culled data was piped through the network to a data warehouse, where it could be further examined. With increasing amounts of data, however, "the network becomes the bottleneck," he said. Distributed systems such as Hadoop allow the analysis to occur where the data resides.

Instead of creating a clean subset of user data to place in a data warehouse, where it can be queried in a limited number of predetermined ways, big data software simply collects all the data an organization generates, and lets administrators and analysts worry about how to use it later. In this sense, such systems are more scalable than traditional databases and data warehouses.

How the Internet spurred big data

In many ways, the giant online service providers such as Google, Amazon, Yahoo, Facebook and Twitter have been on the cutting edge of learning how to make the most of such large data sets. Google and Yahoo, among others, had a hand in developing Hadoop. Facebook engineers first developed the Apache Cassandra distributed database, also open source.

Hadoop got its start from a 2004 Google white paper that described MapReduce, the framework Google built to analyze data across many different servers. Google kept its implementation for internal use, but Doug Cutting, a developer who had already created the Lucene open source search engine, created an open source version, naming the technology after his son's stuffed elephant.

One early adopter of Hadoop was Yahoo. The company hired Cutting and started dedicating large amounts of engineering work to refining the technology around 2006. "Yahoo had lots of interesting data across the company that could be correlated in various ways, but it existed in separated systems," said Cutting, who now works for Hadoop distribution provider Cloudera.

Yahoo is now one of Hadoop's biggest users, deploying it on more than 40,000 servers. The company uses the technology in a variety of ways. Hadoop clusters hold massive log files of what stories and sections users click on. Advertisement activity is also stored on Hadoop clusters, as are listings of all the content and articles Yahoo publishes.

"Hadoop is a great tool for organizing and condensing large amounts of data before it is put into a relational database," Monash said. The technology is particularly well suited for searching for patterns across large sets of text.
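Hadoop's processing model, known as MapReduce, splits such condensing jobs into a map step that emits key-value pairs and a reduce step that combines all the values for each key. The following single-process Python sketch mimics that flow for a click log; it does not use any Hadoop API, and the log format, sample lines and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical log lines in the form "user section"; real logs would differ.
LOG_LINES = [
    "u1 sports",
    "u2 finance",
    "u1 finance",
    "u3 sports",
    "u2 sports",
]

def map_phase(line):
    """Emit a (section, 1) pair for each click record."""
    _user, section = line.split()
    yield section, 1

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)

def count_clicks(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

print(count_clicks(LOG_LINES))  # {'sports': 3, 'finance': 2}
```

In a real cluster the map calls would run on the machines that already hold each slice of the log, and only the much smaller per-key totals would cross the network.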

Another big data technology that got its start at an online service provider was the Cassandra database. Cassandra is able to store up to 2 billion columns in a single row, making it handy for appending more data onto existing user accounts, without knowing ahead of time how the data should be formatted.
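The idea of a wide row can be pictured as a map of maps: each row key owns its own set of named columns, and different rows need not share the same ones. The snippet below is only an in-memory illustration of that data model, not Cassandra client code; the row keys, column names and the `insert` helper are made up for the example.

```python
from collections import defaultdict

# Each row key maps to its own set of named columns; no shared schema.
rows = defaultdict(dict)

def insert(row_key, column, value):
    """Add or overwrite one column in one row."""
    rows[row_key][column] = value

insert("user:42", "name", "Ada")
insert("user:42", "last_login", "2012-03-15")
insert("user:7", "name", "Bob")
# A brand-new column is appended to one row without altering any other row.
insert("user:42", "inbox:msg:1001", "Hello")

print(sorted(rows["user:42"]))  # ['inbox:msg:1001', 'last_login', 'name']
print(sorted(rows["user:7"]))   # ['name']
```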

Using a Cassandra database can also be advantageous in that it can spread across multiple servers, which helps organizations scale their databases easily beyond a single server, or even a small cluster of servers.

Cassandra was developed by social-networking giant Facebook, which needed a massive distributed database to power the service's inbox search, said Jonathan Ellis, the Apache Cassandra project chairman and cofounder of DataStax, a company that now offers professional support for Cassandra.

Like Yahoo, Facebook looked to a Google design, in this case the Bigtable architecture, which could provide a column-and-row-oriented database structure that could be spread across a large number of nodes. The limitation of Bigtable was its master-node-oriented design: the whole operation depended on a single node to coordinate read and write activities across all the other nodes. In other words, if the head node went down, the whole system would be useless.

"That's not the best design. You want one where if one machine goes down, the others keep going," Ellis said.

So Ellis and his peers built Cassandra using a distributed architecture developed by Amazon, called Dynamo, which Amazon engineers described in a 2007 paper. Amazon first developed Dynamo to keep track of what its millions of online customers were putting in their shopping carts.

The Dynamo design is not dependent on any one master node. Any node can accept data for the whole system, as well as answer queries. Data is replicated across multiple hosts.
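One common way to get this masterless behavior is a hash ring, the placement scheme described in the Dynamo paper: every node and every key is hashed onto the same circle, and a key is stored on the next few nodes clockwise from its position. The Python sketch below is a heavily simplified illustration under that assumption. The `Ring` class, node names and replica count are hypothetical, and real systems add versioning, failure detection and repair that are omitted here.

```python
import bisect
import hashlib

def ring_position(key):
    """Map a string to a position on the hash ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, node_names, replicas=3):
        self.replicas = min(replicas, len(node_names))
        self.ring = sorted((ring_position(n), n) for n in node_names)
        self.positions = [pos for pos, _ in self.ring]
        self.stores = {n: {} for n in node_names}  # per-node local storage
        self.down = set()                          # nodes currently offline

    def owners(self, key):
        """Return the `replicas` nodes found walking clockwise from the key."""
        start = bisect.bisect(self.positions, ring_position(key))
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replicas)]

    def put(self, key, value):
        # Any node could coordinate this write; there is no master.
        for node in self.owners(key):
            if node not in self.down:
                self.stores[node][key] = value

    def get(self, key):
        # Read from the first live replica that holds the key.
        for node in self.owners(key):
            if node not in self.down and key in self.stores[node]:
                return self.stores[node][key]
        return None

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
ring.put("cart:alice", ["book", "lamp"])
ring.down.add(ring.owners("cart:alice")[0])  # lose one replica
print(ring.get("cart:alice"))  # ['book', 'lamp']
```

Because the owners of a key are computed from the hash alone, any node can work out where data lives, and losing one replica leaves the shopping cart readable from the others.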

To the enterprise

The good news is that many of the tools first developed by these online service providers are becoming more available to enterprises as open source software. These days, big data tools are being tested by a wider range of organizations, beyond the large online service providers. Financial institutions, telecommunications companies, government agencies, utilities, retailers and energy companies are all testing big data systems, Baldeschwieler noted.

"There is an air of inevitability" with Hadoop and big data implementations, he said. "It's applicable to a huge variety of customers."

So how does an organization start to use its heaps of machine-generated and social networking data?

Perhaps surprisingly, setting up the infrastructure will not be the biggest challenge for the CIO. Vendors such as Cloudera, Hortonworks, MapR and others are commercializing big data technologies, in effect making them easier to deploy and manage.

Rather, finding the right talent to analyze the data will be the biggest hurdle, according to Forrester Research analyst James Kobielus.

Organizations will "have to focus on data science," Kobielus said. "They have to hire statistical modelers, text mining professionals, people who specialize in sentiment analysis."

Big data relies on solid data modeling, Kobielus said. "Statistical predictive models and test analytic models will be the core applications you will need to do big data," he said.

Many are predicting that big data will bring about an entirely new sort of professional, the data scientist. This would be someone with a deep understanding of mathematics and statistics who also knows how to work with big data technologies.

These people may be in short supply. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions, McKinsey and Company estimated.

Despite these limitations, organizations need to forge ahead just to stay competitive and efficient, said MapR's Norris. As an example, he pointed to Google, which entered the field of Internet search years after the competition did, only to dominate the market within two years.

"A lot of that was due to the advantages of Google's back-end architecture," Norris said. Big data "is a big paradigm shift that has the potential to change industries."

