

Internet of Things – Definitions & Predictions


Remember the days when we were debating the definition of “Cloud”? That was as recent as 2009. Fast forward to 2014 and we’re facing the same ambiguities with the Internet of Things (or IoT).

It's a market that is as big as or bigger than Cloud. IDC expects the overall market for IoT to grow at a 12.5% CAGR, from $1.3 trillion in 2013 to $3.0 trillion in 2020. IDC also forecasts approximately 30 billion autonomous things attached to the Internet in 2020, which serve as the catalyst driving this significant revenue opportunity. IDC believes that services and connectivity will make up the majority of the IoT market outside of intelligent systems; together, they are estimated to account for just over half of the worldwide IoT market in 2013. IDC expects that by 2020 this percentage will inch up to 66% of the worldwide IoT market (outside of intelligent/embedded systems), but will give way to the increasingly valuable platforms, applications, and analytic services, which together are forecast to equal 30% of total market revenue.

It's a big opportunity…IoT. So what is it, exactly?

Wikipedia IoT Definition


Wikipedia defines the Internet of Things (IoT) as the interconnection of uniquely identifiable embedded computing devices within the existing Internet infrastructure. Typically, IoT is expected to offer advanced connectivity of devices, systems, and services that goes beyond machine-to-machine communications (M2M) and covers a variety of protocols, domains, and applications. The interconnection of these embedded devices (including smart objects), is expected to usher in automation in nearly all fields, while also enabling advanced applications like a Smart Grid.

IDC IoT Definition


IDC defines the Internet of Things as a network of networks of uniquely identifiable endpoints (or “things”) that communicate without human interaction using IP connectivity — whether “locally” or globally. IDC has identified the IoT ecosystem as having the following piece parts — intelligent systems, connectivity, platforms, analytics, applications, security, and services. While the overall market opportunity is represented in terms of billions of connected things and trillions of dollars in revenue opportunity — the question continuously asked is where the revenue opportunity lies across these different technology layers.

Other key aspects of IoT:

  • The IoT brings meaning to the concept of ubiquitous connectivity for businesses, governments, and consumers with its innate management, monitoring, and analytics.
  • With uniquely identifiable endpoints integrated throughout networks, operational, location, and other such data is managed and monitored by the intelligent or traditional embedded systems that have been enhanced and made part of IoT solutions and applications for businesses, governments, and consumers.
  • IoT is composed of technology-based connected solutions that allow businesses and governments to gain insights that help transform how they engage with customers, deliver products/services, and run operations.

GigaOm IoT Definition

GigaOm defines the internet of things as an ultra-connected environment of capabilities and services, enabling interaction with and among physical objects and their virtual representations, based on supporting technologies such as sensors, controllers, or low-powered wireless, as well as services available from the wider internet.

Internet-connected objects, devices, and other “things” are proliferating in every domain.

  • Farmers’ gates can be fitted with SIM cards to monitor whether they have been left open or to let farmers close them remotely. Cows are being equipped with pay-as-you-go devices that can send SMS texts when they are in heat.
  • Beer barrels now have radio tags so that they can be tracked from brewery to bar and back. Indeed, few supply chains exist today without some kind of automated product tracking. Many major supermarkets now offer bar-code readers to self-scanning shoppers, for example.
  • Startups such as Supermechanical and Electric Imp are creating monitoring devices that can be connected to light bulbs or other electrical devices, garage doors, or windows or simply left in the basement to check for water leaks.
  • “Things” don’t necessarily have to be small: buses, trains, and cars can be fitted with monitoring devices so they can provide accurate information to both control rooms and customers.

Gartner IoT Definition


Gartner defines the Internet of Things (IoT) as the network of dedicated physical objects (things) that contain embedded technology to sense or interact with their internal state or external environment.

The IoT comprises an ecosystem that includes things (e.g., a baby monitor), communications (e.g., home broadband), and applications and data analysis (e.g., obstructive sleep apnea prevention).

Machine-to-machine (M2M) communication services refer to connectivity services that link IoT “things” to central or back-end systems, without human input. Operational technology (OT) is enterprise technology used to monitor and/or control physical devices, assets and processes.


Industrial IoT vs. Human IoT


I like how Moor Insights & Strategy defines IIoT vs. HIoT. Designing for IIoT requires a deep understanding of solution spaces and an ability to connect systems manufactured many decades apart. IIoT favors solutions vendors such as Digi, Echelon, and Freescale, who have solid roots in the industrial control world. HIoT favors fast-moving prototyping driven by leaps of faith in user experience (UX) and device design, exemplified by the Maker community in particular and led by vendors such as Apple and Microsoft.

IoT Predictions

Gartner predicts that, by 2020, the installed base of the IoT will exceed 26 billion units worldwide and will consist of a very diverse range of smart objects and equipment. Between 2014 and 2020, the number of connected objects will grow explosively, and few organizations will escape the need to deliver applications that link smart objects and equipment to their corporate IT systems. However, systems involving the IoT will be very different from conventional IT applications, so IT leaders must act now to prepare for this future.

IDC estimates that as of the end of 2013, there were 10 billion IoT units installed — with IP connectivity and communicating autonomously. IDC predicts that the installed base of IoT units will grow at a 16.8% CAGR over the forecast period, to 29.5 billion in 2020. Enablers for the impressive growth rate over the forecast period include, but are not limited to, the pervasiveness of wireless connectivity, ubiquitous access to the Internet regardless of location, IoT standard protocols, government support for efficient technologies and services, innovation happening in a new segment of the tech market, and business process efficiencies and consumer realities around a connected lifestyle.
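A quick sanity check on that math, in Python, simply compounding the published CAGR from the 2013 base:

```python
# IDC's installed-base math: 10 billion IoT units (2013) compounded at a 16.8% CAGR.
base_2013 = 10.0  # billions of autonomous, IP-connected units
cagr = 0.168
years = 2020 - 2013
print(f"{base_2013 * (1 + cagr) ** years:.1f}B units in 2020")  # ~29.7B, in line with IDC's 29.5B
```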

IDC’s 2014 predictions for IoT are as follows:

  1.  IoT Partnerships Will Emerge Among Disparate Vendor Ecosystems
  2. Leaps of Faith in 2014 Will Create End-to-End IoT Solutions
  3. Open Source-minded China Will Be a Key Player in the IoT
  4. “Plumbing” of the IoT Will Attract Significant Activity in 2014
  5. IoT Will Come to Healthcare in 2014
  6. Mobility Software Vendors Will Continue to Show a Lack of Interest in IoT
  7. Worldwide “Smart City” Spending on the IoT Will Be $265 Billion in 2014
  8. A Smart Wearable Will Launch & Sell More Units than Apple or Samsung Wearables on the Market in 2014
  9. IoT Security Is a Hot Topic, But There Will Be No Heat Until There Is a Fire
  10. Professional Services Will Open Up the IoT Competitive Landscape
  11. Big Data Will Drive Value Creation from IoT

Areas of Opportunity?


Gartner believes that opportunities by revenue potential include:

  • Indoor Lighting
  • Media Consumption
  • Connected Cars
  • Smart Pills
  • Patient Monitoring
  • Smart Meters
  • Smart Computer Accessories / Home networking devices
  • Alarms & Security
  • Street & Area Lamps
  • Digital Cameras
  • Fitness/Sports
  • Thermostats
  • Toys
  • Other (Parking, Digital Signage, Trash, ATMs, home appliances)

Areas of particular interest based on GigaOM research include:

  • Health care is already making use of telehealth systems and services, an area likely to grow substantially over the coming years both inside hospitals and across community service delivery.
  • Agriculture is looking to combine sensor data (such as soil analysis) with environmental data, satellite imaging, and so on.
  • Physical retail is known to be struggling, particularly in light of lower-margin ecommerce. The future of physical retail lies in delivering improved experiences to customers, enabled by the internet of things.
  • Public safety and defense can benefit from the increased use of sensors and monitoring, combined with information from broader sources (environmental, geospatial, and so on).

Technology Business Research (TBR) believes lucrative returns will not come from sales of Internet-ready fashion, appliances or connected home electronics. TBR expects the IoT market over the next two years to be incredibly turbulent as customer adoption of new devices will be hit or miss, and device companies will invest hundreds of millions of dollars in product development with little return other than learning which devices do not appeal to customers. TBR believes health and fitness presents the largest opportunity, followed by personal safety. Ezra Gottheil believes that consumer IoT will complement and often lead the commercial IoT, and together, they will fuel a wave of innovation and expansion in all segments of IT. Jack Narcotta believes that the biggest upside will come from the platforms device vendors will construct. New IoT-ready platforms will enable vendors to embrace the first generation of IoT devices and allow the devices to intercommunicate with vendors’ current respective ecosystems. New platforms will also effectively future-proof vendor IoT strategies from the effects of fickle customers, peaks and valleys in demand for specific form factors, and the introduction of new protocols and technical standards.

IDC believes that there is no one vendor that will emerge as a winner. This market will involve a collection of vendors, service providers, and systems integrators that need to coexist and integrate products and solutions to meet the needs of customers — enterprises, governments, and consumers.


IDC believes that the opportunity is being led by the IT hardware vendors, followed by software. This is probably because hardware spending in general, about 40% of total IT spending, drives downstream spending in software and services.


So where does one focus? B2C or B2B markets? Both? IDC sees almost an even split.

B2C?:

  1. Security
  2. Children/Pet safety
  3. Energy
  4. Health & Fitness
  5. Smart Appliances

B2B?:

  1. Inventory Management
  2. Fleet Tracking/Diagnostics
  3. Shipment Monitoring
  4. Security
  5. City Systems (Parking, Street Lamps)

But $3T by 2020? That's actually down from the $7.1T 2020 revenue forecast in IDC's Worldwide and Regional Internet of Things (IoT) 2014–2020 Forecast: A Virtuous Circle of Proven Value and Demand (IDC #248451), back in May 2014. The change in forecast was due to how IDC weighted which use cases grow faster. The high-end IoT scenario involves systems that have advanced monitoring and analytics. Mid-high-end scenarios have rich “track and trace” capabilities with exception-based reporting plus prediction. Mid-low-end scenarios involve exception reporting, but without prediction. Low-end scenarios are only simple “track and trace”…no exception handling and no prediction.

A “thing” categorized within the “low-end scenario” could generate $2 per month with a temperature sensor in a carton/container, generating $24 per year (e.g. tallying the goods moved from Asia to the US via boat freight). The “high-end scenario” could generate $100 per month with a monitor in a critical care unit, generating $1,200 per year (e.g. providing prescriptive capabilities around sepsis – which is the most expensive condition treated in hospitals, accounting for over $20 billion in annual costs to the U.S. healthcare system).
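To make that per-thing math concrete, here is a minimal sketch in Python. The low-end and high-end monthly rates come from the examples above; the two mid-tier rates are hypothetical placeholders, since IDC doesn't publish per-thing pricing for those scenarios.

```python
# Illustrative per-thing revenue by IoT scenario tier.
# Low- and high-end rates come from the examples above ($2/mo, $100/mo);
# the two mid-tier rates are assumed placeholders for illustration only.
MONTHLY_RATE_USD = {
    "low-end (track & trace only)": 2,
    "mid-low (plus exception reporting)": 10,      # assumed
    "mid-high (plus prediction)": 40,              # assumed
    "high-end (advanced monitoring + analytics)": 100,
}

def annual_revenue(monthly_rate: float, units: int = 1) -> float:
    """Annual revenue for a fleet of connected 'things' at a flat monthly rate."""
    return monthly_rate * 12 * units

for tier, rate in MONTHLY_RATE_USD.items():
    print(f"{tier}: ${annual_revenue(rate):,.0f}/year per thing, "
          f"${annual_revenue(rate, 1_000_000):,.0f}/year per million things")
```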


At Directions 2014, IDC's Carrie MacGillivray talked about which industries were leading the charge with IoT, based on the value vs. volume of the devices in their connected device networks. Insurance, Retail, and Transportation are considered the established IoT sectors, while Manufacturing, Consumer, and Utilities are the most promising in terms of growth.


Vernon Turner's view is that Intelligent Systems (30%) will be the largest segment in the stack. Vernon also believes that companies like Bosch and GE will lead on Industrial IoT, while companies like Apple and Microsoft will lead on Human IoT.

Posted in Big Data, IoT.



IoT is the new Big Data?


My favorite writer, Gil Press, sums it up in “It's Official: The Internet Of Things Takes Over Big Data As The Most Hyped Technology,” where he talks about how Gartner released its latest Hype Cycle for Emerging Technologies, and how big data has moved down into the “trough of disillusionment,” replaced by the Internet of Things at the top of the hype cycle.

The term Internet of Things was coined by the British technologist Kevin Ashton in 1999, to describe a system where the Internet is connected to the physical world via ubiquitous sensors.

Today, the huge amounts of data we are producing and the advances in mobile technologies are bringing the idea of Internet connected devices into our homes and daily lives.

Definition of IoT Has Expanded

The Internet of Things was a popular concept dating back to articles like the 2004 Scientific American piece describing RFID and sensor technology that enables computers to observe, identify, and understand the world—without the limitations of human-entered data.

However, I think people took it beyond the capture of “physical” events/data. I think Kevin Ashton envisioned a network of things that was originally wholly dependent on human beings for information, and that then expanded to involve anything that touched a person (physical or not), passing from machine to machine.

Capturing the behavior of people will require a much broader collection of data than sensor technology alone…beyond the “physical” – whether that is web server clickstream data, e-commerce transaction data, customer service call logs, search logs, video surveillance, documents, etc. There is much more than “physical” or sensor-only data that involves the customer.

To truly begin understanding the behavior of people, you need to capture data from every touch point, gaining a holistic view of that person. Gaining a 360-degree view of your customers, or a 360-degree view of your business, means leveraging an environment of structured and unstructured data that can be analyzed…M2M (Machine to Machine) and/or IoT (Internet of Things) involving physical devices becomes a subset of the data sources available to such a project.

Is IoT a Subset of Big Data or Vice Versa?

I was talking to David Parker, the head of Big Data & Analytics at SAP (a peer to CSC's Big Data & Analytics), regarding IoT vs. Big Data. SAP's management has established a new IoT business unit, which, I guaranteed David, will end up addressing the same business use cases as his Big Data team.


Last year, Mukul Krishna of Frost & Sullivan presented a simple incremental view of how IoT feeds Big Data, which then feeds a broader analytic platform. Think of IoT as a bunch of customized data sources (typically machines and sensors) leveraging customized collectors that feed a comprehensive platform (e.g. Hadoop vendors like Cloudera and Hortonworks), which, in turn, allows us to feed downstream analytic, BI, and visualization platforms.

Are Sensors the Core of IoT?

A sensor is technically any device that converts one form of energy into another, the end result usually being an electrical signal used for measurement, control, or monitoring purposes.

Take a typical temperature sensor, like a gas-pressure-based tube sensor, which expands or contracts to convert temperature into a mechanical motion that can be displayed, recorded, or used for control as required. Translation…I just described the thermostat in a refrigerator.

The raw electrical signal from a physical sensor is usually in analog form, and can be processed further, displayed on a meter or other suitable indicator, or recorded on paper, on media such as magnetic tape, or in more advanced digital systems as required.

Sensors are typically classified by application, and there are many different types, each with its own inherent advantages and disadvantages for a particular use. Put simply, the sensor generates an output that can be conveniently displayed, recorded, or used to control or monitor the application at the point where the sensor is installed.

What's so special about sensors? They translate the analog physical world into the digital computer world. We convert a sensor's analog signals into digital signals so that a computer can read them, and then we feed those, along with other digital signals, into a Big Data platform.
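As a minimal sketch of that translation, assume a hypothetical 10-bit ADC and a linear temperature sensor (both invented for illustration); the digitization step looks roughly like this:

```python
import json
import random
import time

ADC_BITS = 10              # hypothetical 10-bit analog-to-digital converter
ADC_MAX = 2**ADC_BITS - 1  # 1023 counts at full scale
V_REF = 3.3                # assumed ADC reference voltage

def read_adc() -> int:
    """Stand-in for a driver call that samples the sensor's analog voltage."""
    return random.randint(0, ADC_MAX)

def counts_to_celsius(counts: int) -> float:
    """Assumed linear sensor: 0 V maps to -40 C, V_REF maps to +125 C."""
    voltage = counts / ADC_MAX * V_REF
    return -40.0 + (voltage / V_REF) * 165.0

# Each reading becomes a digital event that a Big Data platform can ingest.
event = {
    "sensor_id": "fridge-thermostat-01",  # hypothetical device name
    "ts": time.time(),
    "temp_c": round(counts_to_celsius(read_adc()), 2),
}
print(json.dumps(event))
```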

As Gartner describes it, “Technologies that operate upon the physical world are a key component of the digital business opportunity.” Many of these physical sensor technologies may be new to IT, but they are expected to be high impact and even transformational.

I think IoT requires a lot of talent around the many types of physical sensors and how their signals are ultimately converted into a form that the emerging Big Data platforms can consume and analyze.

IoT needs a Big Data Platform

Getting your plants or your fridge to talk to you through sensors is one thing. Getting your plants to talk to your heating system is quite another. As we map the spread of IoT, it gets more complicated, and barriers appear; the lack of a centralized big data analytics platform is likely to halt progress.

Jeff Hagins, founder and CTO of SmartThings, described the data platform he has been working on, which should help expand IoT and help product designers work out new ways of connecting machines and people.

He believes that the Internet of Things has to be built on a platform that is easy, intelligent, and open. I argue that the evolving Big Data platforms introduced by the web-scale giants like Google, Yahoo!, LinkedIn, Twitter, Facebook, and the like will become a standard for IoT-based applications…and IoT is just that: a set of specialized sensor connectors or sources coupled with a Big Data platform that enables a new generation of applications.

The blurring of the physical and virtual worlds is the strong concept here. Physical assets become digitalized and become equal factors in understanding and managing the business value chain, alongside already-digital entities such as enterprise data warehouses, emerging big data systems, and next-generation applications.

What do you think?


Posted in Big Data.



Avoid the Spiral


I ventured out of the “big company” environment back in 1998. It was 15 years later when I found myself back in the big company environment – a $13B revenue company after my startup was acquired.

As part of an executive team of 18 at a publicly traded company, you might think the environment is a lot different from any of the eight startups I was involved with before. In reality, it is not.

A good company environment is made up of the same factors, regardless of company size.

Even though I find it challenging these days to make time to reflect on the changes I need to make to improve my own leadership….I know that if I don’t, I will fall into a trap…a spiral.

Creating a “good” company environment, or in my case, a good business unit environment, may not be that important when things are going well.

When things are going well, the staff is excited to be working at the company because:

  • Their career paths are wide open, with lots of interesting jobs naturally opening up.
  • Their peers, family, and even friends all think they are lucky for choosing and being part of such a success.
  • Their resumes are getting stronger by working at a company during its boom period.
  • It's most likely lucrative, with variable compensation plans paying off, bonuses being given, and equity growing in value.

However, when things are challenging…and your business is struggling…all those reasons become reasons to leave.

The only thing that keeps an employee at a company when things are challenging is that people actually like their job.

Having worked and led staff through the toughest times of company life and death – including things like working for free, working long days and nights, working on weekends – I know what can be asked of a team when times are tough. But no team is going to respond to your requests for sacrifice for long if they are working in a bad company environment.

In bad company environments, good employees disappear. In highly competitive and quick to change technology companies, disappearing talent starts the spiral.

When your company’s most important assets leave (your top performers), the company struggles to hit its numbers; it tries to backfill its core talent but can’t recruit it fast enough; it misses its milestones; declines in value; loses more of its key employees.

Spirals are extremely difficult to reverse.

So, yes, creating a “good” company environment isn't that important when things are going well…but it sure as hell IS important when things go wrong.

…and things always go wrong.

I personally come to work every day because of the people first…then the adrenaline fix I get from the business sector I'm in…and, finally, the technologies and products we can produce to disrupt the market…in that order.

Staying away from the Spiral

In great organizations, people focus on their work and they have almost a tribal confidence that they can get their job done. Good things happen for both the company and them personally.

You come to work each day knowing that your work can make a difference for the organization as well as for yourself…motivating you and fulfilling your needs enough to support the sacrifice: the long hours, the missed kids' birthday parties, and the canceled date nights with your spouse.

In poor organizations, people spend much of their time fighting organizational boundaries and broken processes. They are not clear on what their jobs are, so there is no way to know if they are getting the job done properly or not.

In some cases, because of pure will, your star performers will work ridiculous hours and deliver on their promises…but they will have little idea what it actually means for the company or for their careers.

To make it worse, when your star performers voice how screwed up the company situation is, management still denies the problem, defends the status quo, and, frankly, ignores the fact that they are dealing with people…not just quarterly goals, revenue targets, and operating income…again, only something you see in a poor company environment.

So how does one create a “good company environment”? For me, it's as simple as breathing air…it comes down to “telling it like it is” with: 1) transparency, and 2) strong communication…and I don't mean detailing quarterly results with internal WebEx all-hands.

I like to personally “go out on a limb” by exposing the truths….by being personally vulnerable….ultimately, leading the team in a way that establishes a level of trust. I do this with a healthy dose of transparency and communication.

And in order to get the level of trust I need, one can't overemphasize the level to which you have to provide transparency and communication…as a leader, you'll feel uncomfortable as you approach the level that truly earns trust.

I have many techniques that I use to empower not only my senior team but my entire organization with the proper communication patterns found in any good company environment…patterns needed to weather personal and professional storms.

Here are a few examples as I reflect over the past 15 years:

  • 1998: 50% marketshare erosion in one year due to changes in the customer ecosystem
  • 1999: A dysfunctional leadership team that can’t run the business
  • 2000: Dot com bust leading to a 2x increase in sales cycles
  • 2001: 9/11 requiring a 50% staff reduction
  • 2002: Your lead customer cancels their largest product line, on which you have bet the whole company
  • 2003: You enter into a patent war with your largest customer
  • 2004: Your leading acquisition transaction falls through with only months of capital left
  • 2005: Your co-founder and CTO loses his 1- and 3-year-old sons in a car accident
  • 2006: A disruptive player enters the market, fundamentally changing the landscape
  • 2007: Your services go offline, crippling your top ten customers and impacting their businesses significantly
  • 2008: An acquisition candidate does an “end-around” going after your key engineering talent
  • 2009: Your co-founder and CTO gets “cold feet” and can’t commit prior to securing your next critical round of financing
  • 2010: The board of directors pulls funding right after you hired the “A-Team” and began to ramp sales
  • 2011: Your product launch is significantly delayed (6 months) due to a fatal flaw in the technology
  • 2012: You realize that your original product that was 3 years in the making has to be thrown away
  • 2013: Investors back out with only 4 weeks of cash flow left

During this time, my father had heart surgery, my mother was diagnosed with leukemia, my wife had postpartum depression, my oldest son was diagnosed with dyslexia, I was diagnosed with a blockage in my left coronary artery, we had to sell our perfect home in order to keep the startup funds coming…and the personal list goes on.

My question for you:

Do you have the type of company environment where you have the support needed to weather any business or personal storm?

Posted in Leadership.



Leadership Means Sacrificing


What do Special Forces, Army Rangers, Navy SEALs, and Marines all have in common?

Teams like these go through what is considered by some to be the toughest military training in the world.

They also encounter obstacles that develop and test their stamina, leadership and ability to work as a team like no other.

I was talking recently to a colleague of mine about some of our own leadership at work. Emotions were strong. Deep sighs punctuated every other sentence.

We're going through a business transformation, and as with most company turnarounds, there is a strong conflict between the “old” and the “new”. This means old vs. new target markets, old vs. new business processes, old vs. new people…and, at the core of most issues, old vs. new culture.

This colleague is part of the “new team”, chartered to help create change.

“I struggle with some of the leadership,” he said, reflecting a general theme throughout the conversation.

This reminded me of the book Fearless, the story of Adam Brown, a Navy SEAL who sacrificed his life during the hunt for Osama bin Laden.

It's strange to think about the military when talking about business, since these two worlds couldn't be further apart…or could they?

What Kind of Leadership Would You Prefer?

When Navy SEAL Adam Brown woke up on March 17, 2010, he didn’t know that he would die that night in the Hindu Kush Mountains of Afghanistan.

Who risks their lives for others so that they may survive? Heroes like Adam Brown do. Military personnel are trained to risk their lives for others so that those others may survive.

Would you want to be a part of a team with people who are willing to sacrifice themselves so that others like you may gain? Who wouldn’t?

In business, unfortunately, we give bonuses to employees who are willing to sacrifice others so that the business may gain.

I don't know about you, but most people I know want to work for an organization in which you have ABSOLUTE CONFIDENCE that others in the organization would sacrifice…so that YOU can gain…not them, not the business.

And guess what? The leadership and the business end up gaining in the end….because they have a workforce that doesn’t waste its time always looking over its shoulder, wondering what is going to happen next.

A Winning Culture

In my work to create high-performing teams, I look for business colleagues who are more like Adam Brown…the ones who sacrifice for the good of the team, not for themselves. We want people who value this. This isn't negotiable.

I want the team to know that I will GO OUT OF MY WAY to improve their well-being….that I care more about their success than my own. It’s not bullshit. Just ask anyone who has been part of a high-performing team….and you’ll probably hear the same.

“I care more about their success than my own.”

Why? Because their success is our success. It’s that simple.

A winning culture is one where you have a team of people who are interested in improving each other…sacrificing their own interests in order to help the other.

In the end, you are NEVER looking over your shoulder…you are NEVER wasting energy trying to understand the mission. You’re focused, and you execute.

That’s a winning culture…a winning team…that’s leadership.

My colleague and I regained our enthusiasm as we reflected on our similar views. His last words are still echoing in my head…

“One team one fight. Unity is what brings the necessary efficiencies to fight effectively. Lack of unity creates unnecessary distractions from the objective at hand.”

Posted in Leadership.



Big Data Top Ten

 


What do you get when you combine Big Data technologies….like Pig and Hive? A flying pig?

No, you get a “Logical Data Warehouse”.

My general prediction is that Cloudera and Hortonworks are both aggressively moving to fulfill a vision that looks a lot like Gartner's “Logical Data Warehouse”…namely, “the next-generation data warehouse that improves agility, enables innovation and responds more efficiently to changing business requirements.”

In 2012, Infochimps (now CSC) leveraged its early use of stream processing, NoSQLs, and Hadoop to create a design pattern which combined real-time, ad hoc, and batch analytics. This concept of combining best-of-breed Big Data technologies will continue to advance across the industry until the entire legacy (and proprietary) data infrastructure stack is replaced with a new (and open) one.
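As a toy illustration of that design pattern (a sketch of the idea, not Infochimps' actual implementation), every incoming event can feed a replayable batch store, an incrementally updated real-time view, and a recent window for ad hoc exploration:

```python
from collections import defaultdict, deque

class CombinedAnalytics:
    """Toy sketch: each event feeds batch, real-time, and ad hoc paths at once."""
    def __init__(self, window: int = 1000):
        self.batch_store = []                     # stand-in for HDFS / Hadoop
        self.realtime_counts = defaultdict(int)   # stand-in for in-stream analytics
        self.recent = deque(maxlen=window)        # stand-in for ad hoc exploration

    def ingest(self, event: dict) -> None:
        self.batch_store.append(event)            # batch path (replayable history)
        self.realtime_counts[event["type"]] += 1  # real-time path (incremental)
        self.recent.append(event)                 # ad hoc path (recent slice)

pipeline = CombinedAnalytics()
pipeline.ingest({"type": "click", "user": "u1"})
pipeline.ingest({"type": "purchase", "user": "u1"})
print(pipeline.realtime_counts)  # defaultdict(..., {'click': 1, 'purchase': 1})
```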

As this is happening, I predict that the following 10 Big Data events will occur in 2014.

1. Consolidation of NoSQLs begins

A few projects have strong commercialization companies backing them. These are companies that have reached “critical mass,” including DataStax with Cassandra, 10gen with MongoDB, and Couchbase with Couchbase Server. Leading open source projects like these will pull further and further away from the pack of 150+ other NoSQLs, which are either fighting for the same value propositions (with a lot less traction) or solving small niche use cases (and markets).

2. The Hadoop Clone wars end

The industry will begin standardizing on two distributions. Everyone else will become less relevant. (It's Intel vs. AMD. Let's not forget the other x86 vendors like IBM, UMC, NEC, NexGen, National, Cyrix, IDT, Rise, and Transmeta.) If you are a Hadoop vendor, you're either the Intel or the AMD. Otherwise, you'd better be acquired or get out of the business by the end of 2014.

3. Open source business model is acknowledged by Wall Street

The open source, scale-out, commodity approach is fundamental to the new breed of Big Data technologies, making open source a clear antithesis of the proprietary, scale-up, our-hardware-only, take-it-or-leave-it solutions. Unfortunately, the promises of international expansion, improved traction from sales force expansion, and new products and alliances will all fall on the deaf ears of Wall Street analysts. Time to short the platform RDBMS and Enterprise Data Warehouse stocks.

4. Big Data and Cloud really means private cloud

Many claimed that 2013 was the “year of Big Data in the Cloud.” However, what really happened is that the Global 2000 immediately began their bare-metal projects under tight control. Now that those projects are underway, 2014 will exhibit the next phase of Big Data on virtualized platforms. Open source projects like Serengeti for vSphere; Savanna for OpenStack; and Ironfan for AWS, OpenStack, and VMware combined, or venture-backed proprietary solutions like BlueData, will enable virtualized Big Data private clouds.

5. 2014 starts the era of analytic applications

Enterprises are becoming savvy to the new reference architecture that combines legacy and new-generation IT data infrastructure. Now it's time to develop a new generation of applications that take advantage of both to solve business problems. System integrators will shift resources, hire data scientists, and guide enterprises in their development of data-driven applications. This, of course, realizes concepts like the 360-degree view, the Internet of Things, and marketing to a segment of one.

6. Search-based business intelligence tools will become the norm with Big Data

Having a “Google-like” interface that allows users to explore structured and unstructured data with little formal training is where the new generation is going. Just look at Splunk for searching machine data. Imagine a marketer being able to simply “Google search” for insights on their customers.

7. Real-time in-memory analytics, complex event processing, and ETL combine

The days of ETL in its pure form are numbered. It's either ‘E’, then ‘L’, then ‘T’ with Hadoop, or it's EAL (extract, apply analytics, and load) with the new real-time stream-processing frameworks. Now that high-speed social data streams are the norm, so are processing frameworks that combine streaming data with micro-batch and batch data, performing complex processing on that data and feeding applications with sub-second response times.
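A minimal sketch of the EAL idea, assuming a simple numeric stream and micro-batches of three events (all names here are illustrative, not any particular framework's API):

```python
import statistics
from typing import Iterable, Iterator

def micro_batches(events: Iterable[float], size: int = 3) -> Iterator[list]:
    """Group a stream into micro-batches (the 'E': extract from the stream)."""
    batch = []
    for e in events:
        batch.append(e)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def load(record: dict) -> None:
    print("loading:", record)  # stand-in for a SQL/NoSQL/Hadoop write (the 'L')

def eal(stream: Iterable[float]) -> None:
    for batch in micro_batches(stream):
        enriched = {              # the 'A': analytics applied in-stream
            "count": len(batch),
            "mean": statistics.mean(batch),
            "max": max(batch),
        }
        load(enriched)

eal([101.2, 99.8, 100.5, 103.1, 98.7])
```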

8. Prescriptive analytics become more mainstream

After descriptive and predictive comes prescriptive. Prescriptive analytics automatically synthesizes big data, multiple disciplines of mathematical and computational sciences, and business rules to make predictions, and then suggests decision options to take advantage of those predictions. We will begin seeing powerful use cases of this in 2014. Business users want to be recommended specific courses of action and to be shown the likely outcome of each decision.
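A toy example of the predict-then-prescribe step, with entirely made-up numbers: given a predicted churn probability, score a few candidate retention actions by expected value and recommend the best one.

```python
# Toy prescriptive step. All rates, costs, and action names are hypothetical.
def prescribe(churn_prob: float, customer_value: float) -> tuple:
    actions = {
        "do nothing":        (0.00, 0.0),                    # (churn reduction, cost)
        "10% discount":      (0.15, 0.10 * customer_value),
        "concierge support": (0.25, 50.0),
    }
    def expected_gain(reduction: float, cost: float) -> float:
        # Expected retained value minus the cost of the action.
        return churn_prob * reduction * customer_value - cost
    best = max(actions, key=lambda a: expected_gain(*actions[a]))
    return best, expected_gain(*actions[best])

print(prescribe(churn_prob=0.4, customer_value=1200.0))  # ('concierge support', 70.0)
```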

9. MDM will provide the dimensions for big data facts

With Big Data, master data management will now cover both internal data that the organization has been managing over years (like customer, product and supplier data) as well as Big Data that is flowing into the organization from external sources (like social media, third party data, web-log data) and from internal data sources (such as unstructured content in documents and email). MDM will support polyglot persistence.

10. Security in Big Data won't be a big issue

Peter Sondergaard, Gartner's senior vice president of research, says that when it comes to big data and security, “You should anticipate events and headlines that continuously raise public awareness and create fear.” I'm not dismissing the fact that with MORE data come more responsibilities, and perhaps liabilities, for those who harbor the data. However, in terms of the infrastructure security itself, I believe 2014 will end with a clear understanding of how to apply familiar best practices to your new Big Data platform, including trusted Kerberos, LDAP integration, Active Directory integration, encryption, and overall policy administration.

Posted in Big Data.



SAP & Big Data


SAP customers are confused about the positioning between SAP Sybase IQ and SAP Hana as it applies to data warehousing. Go figure, so is SAP. You want to learn about their data warehousing offering, and all you hear is “Hana this” and “Hana that”.

It reminds me of the time after I left Teradata when the BI appliances came on the scene. First Netezza, then Greenplum, then Vertica and Aster Data, then ParAccel. Everyone was confused about what the BI appliance was in relation to the EDW. Do I need an EDW, a BI appliance, an EDW + BI appliance?

With SAP, Sybase IQ is supposed to be the data warehouse, and Hana is the BI or analytic appliance that sits off to its side. OK. SAP has a few customers on Sybase IQ, but are they the larger well-known brands? Let's face it…since its acquisition of Sybase in 2010, SAP has struggled with positioning it against incumbents like Teradata, IBM, and even Oracle.

SAP Roadmap


SAP's move from exploiting its leadership position in enterprise ERP to exploring the new BI appliance and Big Data markets has been impressive, IMHO. Acquiring EDW and RDBMS company Sybase in 2010, after the earlier acquisition of BI leader Business Objects in 2007, was necessary to stay relevant in the race to provide an end-to-end data infrastructure story. This was, however, a period of “catch-up”, a late entry to the race.

The beginning of its true exploration began with SAP Hana and now a strategic partnership with Hadoop commercialization company Hortonworks. Rising ahead of the data warehouse and database management system leaders will require defining a new Gartner quadrant: the Big Data quadrant.

SAP Product Positioning

Let's look back in time at SAP's early positioning. We have the core ERP business, the new “business warehouse” business, and the soon-to-be-launched Hana business. The SAP data warehouse equation is essentially Business Objects + Sybase IQ + Hana. Positioning Hana, as with most data warehouse vendors, is a struggle, since it can be positioned as a data mart within larger footprints or as THE EDW database altogether in smaller accounts. One would think that with proper guidelines this positioning would be straightforward. But platform choice involves more than database size and query complexity; customer organizational requirements and politics are a very challenging variable. You can tell that SAP struggled with simplifying its message for its sales teams early on.

SAP Hana – More than a BI Appliance

SAP released the first version of its in-memory platform, SAP HANA 1.0 SP02, to the market on June 21, 2011. It was (and is) based on technology acquired from Transact In Memory, a company that had developed a memory-centric relational database positioned for “real-time acquisition and analysis of update-intensive stream workloads such as sensor data streams in manufacturing, intelligence and defense; market data streams in financial services; call detail record streams in Telco; and item-level RFID tracking.” Sound familiar to today's Big Data use cases?

As with most BI appliances back then, customers spent about $150K for a basic 1TB configuration (SAP partnered with Dell) for the hardware only; add software and installation services and we were looking at $300K, minimally, as the entry point. SAP started off with either a BI appliance (HANA 1.0) or a BW data warehouse appliance (HANA 1.0 SP03), both using the SAP IMDB Database Technology (SAP HANA Database) as their underlying RDBMS.

BI Appliances come with analytics, of course


When SAP first started marketing its Hana analytics, you were promised a suite of sophisticated analytics as part of the Predictive Analysis Library (PAL), which can be called directly via an “L” wrapper within an SQLScript procedure. The inputs and outputs are all tables. PAL includes eight well-known predictive analysis algorithms across several data mining categories (a small, plain-Python illustration follows the list):

  • Cluster analysis (K-means)
  • Classification analysis (C4.5 Decision Tree, K-nearest Neighbor, Multiple Linear Regression, ABC Classification)
  • Association analysis (Apriori)
  • Time Series (Moving Average)
  • Other (Weighted Score Table Calculation)
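PAL itself runs inside HANA and is invoked from SQLScript, but as a language-neutral illustration of the first category, here is a tiny K-means in plain Python (toy one-dimensional data; this is not the PAL API):

```python
import random

def kmeans(points, k=2, iters=10):
    """Tiny K-means, illustrating the kind of cluster analysis PAL provides."""
    centers = random.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each point to its nearest center
            i = min(range(k), key=lambda c: (p - centers[c]) ** 2)
            clusters[i].append(p)
        # move each center to its cluster mean; keep it if the cluster is empty
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

data = [1.0, 1.2, 0.8, 9.8, 10.1, 10.3]
print(sorted(kmeans(data)))  # two centers, near 1.0 and 10.0
```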

HANA’s main use case started with a focus around its installed base with a real-time in-memory data mart for analyzing data from SAP ERP systems. For example, profitability analysis (CO-PA) is one of the most commonly used capabilities within SAP ERP. The CO-PA Accelerator allows significantly faster processing of complex allocations and basically instantaneous ad hoc profitability queries. It belongs to accelerator-type usage scenarios in which SAP HANA becomes a secondary database for SAP products such as SAP ERP. This means SAP ERP data is replicated from SAP ERP into SAP HANA in real time for secondary storage.

BI Appliances are only as good as the application suite

Other use-cases for Hana include:

  • Profitability reporting and forecasting,
  • Retail merchandizing and supply-chain optimization,
  • Security and fraud detection,
  • Energy use monitoring and optimization, and,
  • Telecommunications network monitoring and optimization.

Applications developed on the platform include:

  • SAP COPA Accelerator
  • SAP Smart Meter Analytics
  • SAP Business Objects Strategic Workforce Planning
  • SAP SCM Sales and Operations Planning
  • SAP SCM Demand Signal Management

Most opportunities were initially “accelerators” leveraging its in-memory performance improvements.

Aggregate real-time data sources

There are two main mechanisms that HANA supports for near-real-time data loads. First is the Sybase Replication Server (SRS), which works with SAP or non-SAP source systems running on Microsoft, IBM or Oracle databases. This was expected to be the most common mechanism for SAP data sources. There used to be some license challenges around replicating data out of Microsoft and Oracle databases, depending on how you license the database layer of SAP. I’ve been out of touch on whether these have been fully addressed.

SAP has a second choice of replication mechanism called System Landscape Transformation (SLT). SLT is also near-real-time and works from a trigger from within the SAP Business Suite products. This is both database-independent and pretty clever, because it allows for application-layer transformations and therefore greater flexibility than the SRS model. Note that SLT may only work with SAP source systems.

High-performance in-memory processing

HANA stores information in electronic memory, which is 50x faster (depending on how you calculate) than disk. HANA stores a copy on magnetic disk, in case of power failure or the like. In addition, most SAP systems have the database on one system and a calculation engine on another, and they pass information between them. With HANA, this all happens within the same machine.

Why Hadoop?

SAP HANA is not a platform for loading, processing, and analyzing huge volumes – petabytes or more – of unstructured data, commonly referred to as big data. Therefore, HANA is not suited for social networking and social media data analytics. For such use cases, enterprises are better off looking to open-source big-data approaches such as Apache Hadoop, or even MPP-based next-generation data warehousing appliances like Pivotal Greenplum.

SAP's partnership with Hortonworks enables data migration between the HANA and Hadoop platforms. The basic idea is to treat Hadoop systems as an inexpensive repository of tier-2 and tier-3 data that can, in turn, be processed and analyzed at high speed on the HANA platform. This is a typical design pattern between Hadoop and any BI appliance (SMP or MPP).
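A toy sketch of that tiering decision, with an assumed 90-day hot window (the threshold and the tier names are invented for illustration):

```python
from datetime import datetime, timedelta
from typing import Optional

HOT_WINDOW = timedelta(days=90)  # assumed retention for the hot, in-memory tier

def route_query(oldest_data_needed: datetime, now: Optional[datetime] = None) -> str:
    """Send a query to the hot (HANA-style) tier if it only touches recent data,
    otherwise to the cold (Hadoop-style) tier. Purely illustrative."""
    now = now or datetime.utcnow()
    return "hot-tier" if now - oldest_data_needed <= HOT_WINDOW else "cold-tier"

print(route_query(datetime.utcnow() - timedelta(days=10)))   # hot-tier
print(route_query(datetime.utcnow() - timedelta(days=400)))  # cold-tier
```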


SAP “Big Data White Space”?

Where do SAP customers need support? Where is the “Big Data white space”? SAP seems to think that persuading customers to run core ERP applications on HANA is all that matters. Are customers responding? Answer: not really.

Customers are saying they're not planning to use it, with most of them citing high costs and a lack of clear benefit (aka a use case) behind their decision. Even analysts are advising against it: Forrester Research said the HANA strategy is “understandable but not appealing”.

“If it's about speeding up reporting of what's just happened, I've got you, that's all cool, but it's not helping me process more widgets faster,” said one SAP customer.

SAP is betting its future on HANA + SaaS. However, what is working in SAP's favor for the moment is the high level of commitment among existing (European) customers to on-premise software.

This is where the “white space” comes in. Bundling a core suite of well-designed business discovery services around the SAP solution-set will allow customers to feel like they are being listened to first, and sold technology second.

Understanding how to increase REVENUE with new greenfield applications around unstructured data that leverage the structured data from ERP systems can be a powerful opportunity. This means architecting a balance of the historic “what happened”, the real-time “what is currently happening”, and a combined “what will happen IF”, all together in a single data symphony. Hana can be leveraged for more ad hoc analytics on the combined historic and real-time data for business analysts to explore, rather than just being a report accelerator.

This will require:

  • Sophisticated business consulting services: to support uncovering the true revenue upside
  • Advanced data science services: to support building a new suite of algorithms on a combined real-time and historic analytics framework
  • Platform architecture services: to support the combination of open source ecosystem technologies with SAP legacy infrastructure

This isn't rocket science. It just takes focused tactical execution, leading with business cases first. The SAP-enabled Big Data system can then be further optimized with cloud delivery as a cost reducer and time-to-value enhancer, along with a further focus on application development. Therefore, other white space includes:

  • Cloud delivery
  • Big Data application development

SAP must keep its traditional customers and SI partners (like CSC) engaged with “add-ons” to its core business applications with incentives for investing in HANA, while at the same time evolving its offerings for line of business buyers.

Some think that SAP can change the game by reaching/selling to marketers with new analytics offerings (e.g. see SAP & KXEN), enhanced mobile capabilities, ecosystem of start-ups, and a potential to incorporate its social/collaboration and e-commerce capabilities into one integrated offering for digital marketers and merchandisers.

Is there a path to define a stronger CRM vision for marketers? SAP won't be able to do it without credible SI partners who have experience with new media, digital agencies, and the specialty service providers who are defining the next wave of content- and data-driven campaigns and customer experiences.

Do you agree?

Posted in Big Data.



Infochimps, a CSC Company = Big Data Made Better


What’s a $15B powerhouse in information technology (IT) and professional services doing with an open source based Big Data startup?

It starts with “Generation-OS”. We’re not talking about Gen-Y or Gen-Z. We’re talking Generation ‘Open Source’.

Massive disruption is occurring in information technology as businesses are building upon and around recent advances in analytics, cloud computing and storage, and an omni-channel experience across all connected devices. However, traditional paradigms in software development are not supporting the accelerating rate of change in mobile, web, and social experiences. This is where open source is fueling the most disruptive period in information technology since the move from the mainframe to client-server – Generation Open Source.

Infochimps = Open Standards based Big Data

Infochimps delivers Big Data systems with unprecedented speed, scale, and flexibility to enterprise companies. (And when we say “enterprise companies,” we mean the Global 2000 – a market in which CSC has proven its success.) By joining forces with CSC, we together will deliver one of the most powerful analytic platforms to the enterprise in an unprecedented amount of time.

At the core of Infochimps’ DNA is our unique, open source-based Big Data and cloud expertise. Infochimps was founded by data scientists, cloud computing, and open source experts, who have built three critical analytic services required by virtually all next-generation enterprise applications: real-time data processing and analytics, batch analytics, and ad hoc analytics – all for actionable insights, and all powered by open-standards.

CSC = IT Delivery and Professional Services

When CSC begins to insert the Infochimps DNA into its global staff of 90,000 employees, focused on bringing Big Data to a broad enterprise customer base, powerful things are bound to happen. Infochimps Inc., with offices in both Austin, TX, and Silicon Valley, becomes a wholly owned subsidiary, reporting into CSC's Big Data and Analytics business unit.

The Infochimps’ Big Data team and culture will remain intact, as CSC leverages our bold, nimble approach as a force multiplier in driving new client experiences and thought leadership. Infochimps will remain under its existing leadership, with a focus on continuous and collaborative innovation across CSC offerings.

I regularly coach F2K executives on the important topic of “splicing Big Data DNA” into their organizations. We now have the opportunity to practice what we’ve been preaching, by splicing the Infochimps DNA into the CSC organization, acting as a change agent, and ultimately accelerating CSC’s development of its data services platform.

Infochimps + CSC = Big Data Made Better

I laugh many times when we’re knocking on the doors of Fortune 100 CEOs.

“There’s a ‘monkey company’ at the door.”

The Big Data industry seems to be built on animal-based brands like the Hadoop Elephant. So to keep running with the animal theme, I’ve been asking C-levels the following question when they inquire about how to create their own Big Data expertise internally:

“If you want to create a creature that can breathe underwater and fly, would it be more feasible to insert the genes for gills into a seagull, or splice the genes for wings into a herring?”

In other words, do you insert Big Data DNA into the business-savvy side with simplified Big Data tools, or insert business DNA into your Big Data-savvy IT organization? In the case of CSC and Infochimps, I doubt that Mike Lawrie, CSC's CEO, wants to be associated with either a seagull or a herring, but I do know he and his senior team are executing on a key strategy to become the thought leader in next-generation technology, starting with Big Data and cloud.

Regardless of your preference for animals (chimpanzees, elephants, birds, or fish), the CSC and Infochimps combination speaks very well to CSC’s strategy for future growth with Big Data, cloud, and open source. Infochimps can now leverage CSC’s enterprise client base, industrialized sales and marketing, solutions development and production resources to scale our value proposition in the marketplace.

“Infochimps, a CSC company, is at the door.”

 Jim Kaskade
CEO
Infochimps, a CSC Company

Posted in Big Data, Cloud Computing.



Real-time Big Data or Small Data?


Have you heard of products like IBM's InfoSphere Streams, TIBCO's event processing product, or Oracle's CEP product? All are good examples of commercially available stream processing technologies that help you process events in real time.

I’ve been asked what I consider as “Big Data” versus “Small Data” in this domain. Here’s my view.

Real-Time Analytics: Small Data vs. Big Data

  • Data Volume – Small Data: none. Big Data: none (the data is in motion, not at rest).
  • Data Velocity – Small Data: 100K events/day (<<1K events/second). Big Data: billion+ events/day (>>1K events/second).
  • Data Variety – Small Data: 1-6 structured sources AND a single destination (an output file, a SQL database, a BI tool). Big Data: 6+ structured and 6+ unstructured sources AND many destinations (a custom application, a BI tool, several SQL databases, NoSQL databases, Hadoop).
  • Data Models – Small Data: used for “transport” mainly; little to no ETL, in-stream analytics, or complex event processing performed. Big Data: transport is the foundation, but distributed ETL and linearly scalable in-memory and in-stream analytics are applied, and complex event processing is the norm.
  • Business Functions – Small Data: one line of business (e.g. financial trading). Big Data: several lines of business, up to a 360-degree view.
  • Business Intelligence – Small Data: no queries are performed against the data in motion; this is simply a mechanism for transporting a transaction or event from the source to a database. Transport times are <1 second. Example: connect to desktop trading applications and transport trade events to an Oracle database. Big Data: ETL, sophisticated algorithms, complex business logic, and even queries can be applied to the stream of events while they are in motion; analytics span all data sources and, thus, all business functions. Transport and analytics occur in <1 second. Example: connect to desktop trading applications, market data feeds, and social media, and provide instantaneous trending reports; allow traders to subscribe to information pertinent to their trades and have analytics applied in real time for personalized reporting.
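The per-second thresholds in the velocity row follow directly from the daily volumes; a few lines of arithmetic make the gap obvious:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

for label, events_per_day in [("small data", 100_000), ("big data", 1_000_000_000)]:
    rate = events_per_day / SECONDS_PER_DAY
    print(f"{label}: {events_per_day:,} events/day = {rate:,.1f} events/second")
# small data: 100,000 events/day = 1.2 events/second        (well under 1K/s)
# big data: 1,000,000,000 events/day = 11,574.1 events/second (well over 1K/s)
```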

Want to see my view of Batch Analytics? Go Here.

Want to see my view of Ad Hoc Analytics? Go Here.


Posted in Big Data.



Ad Hoc Queries with Big Data or Small Data?


Do you think that you’re working with “Big Data”? or is it “Small Data”? If you’re asking ad hoc questions of your data, you’ll probably need something that supports “query-response” performance or, in other words, “near real-time”. We’re not talking about batch analytics, but more interactive / iterative analytics. Think NoSQL, or “near real-time Hadoop” with technologies like Impala. Here’s my view of Big versus Small with ad hoc analytics in either case.

Ad Hoc Analytics: Small Data vs. Big Data

  • Data Volume – Small Data: megabytes to gigabytes. Big Data: terabytes (1-100TB).
  • Data Velocity – Small Data: updated in near real-time (seconds). Big Data: updated in real-time (milliseconds).
  • Data Variety – Small Data: 1-6 structured data sources. Big Data: 6+ structured AND 6+ unstructured data sources.
  • Data Models – Small Data: aggregations with tens of tables. Big Data: aggregations with up to 100s-1000s of tables.
  • Business Functions – Small Data: one line of business (e.g. sales). Big Data: several lines of business, up to a 360-degree view.
  • Business Intelligence – Small Data: queries are simple, read-only requests for basic transactional summaries/reports; response times are in seconds across a handful of business analysts. Example: retrieve a customer's profile and summarize their overall standing based on current market values for all assets. This is representative of the work performed when the business asks, “What is my customer worth today?” Questions vary based on what the business analyst needs to know interactively. Big Data: queries can be as complex as with batch analytics, but generally are still read-only and processed against aggregates, and they span business functions; response times are in seconds across large numbers of business analysts. Example: retrieve a customer profile and summarize activities across all customer touch points, calculating “Life-Time Value” based on past and current activities. This is representative of the work performed when the business asks, “Who are my most profitable customers?” Questions vary based on what the business analyst needs to know interactively.
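A minimal pandas sketch of the Big Data example above, computing lifetime value across touch points on toy data (the column names and figures are invented for illustration):

```python
import pandas as pd

# Toy events from several customer touch points (web, store, support).
events = pd.DataFrame({
    "customer": ["ann", "ann", "bob", "ann", "bob"],
    "touchpoint": ["web", "store", "web", "support", "store"],
    "revenue": [120.0, 80.0, 40.0, 0.0, 260.0],
})

# Ad hoc, read-only aggregation against pre-joined data:
ltv = (events.groupby("customer")["revenue"]
       .sum()
       .sort_values(ascending=False)
       .rename("lifetime_value"))
print(ltv)  # answers "Who are my most profitable customers?"
```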

Want my view on Batch Analytics? Look here.

Want my view on Real-time analytics? Look here.


Posted in Big Data.



Batch with Big Data versus Small Data


How do you know whether you are dealing with Big Data or Small Data? I’m constantly asked for my definition of “Big Data”. Well, here it is…for batch analytics, now addressed by technologies such as Hadoop.

Batch Analytics

  • Data Volume – Small Data: gigabytes. Big Data: terabytes to petabytes.
  • Data Velocity – Small Data: updated periodically at non-real-time intervals. Big Data: updated both in real time and through bulk timed intervals.
  • Data Variety – Small Data: 1-6 structured sources. Big Data: 6+ structured AND 6+ unstructured sources.
  • Data Models – Small Data: store data after cleaning, transforming, and normalizing it into a fixed schema. Big Data: store data without cleaning, transforming, or normalizing; then apply schemas based on application needs.
  • Business Functions – Small Data: one line of business (e.g. sales). Big Data: several lines of business, up to a 360-degree view.
  • Business Intelligence – Small Data: queries are complex, requiring many concurrent data modifications, a rich breadth of operators, and many selectivity constraints, but they are applied to a simpler data structure; response times are in minutes to hours, issued by one or maybe two experts. Example: determine how much profit is made on a given line of parts, broken out by supplier, by geography, by year. Big Data: queries are just as complex and span business functions; response times are in minutes to hours, issued by a small group of experts. Example: determine how much profit is made on a given line of parts, broken out by supplier, by geography, by year; then determine which customers purchased the higher-profit parts, by geography, by year; determine the profile of those high-profit customers; and find out which products purchased by high-profit customers were NOT purchased by other similar customers, in order to cross-sell / up-sell.
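A minimal pandas sketch of the first batch example, profit on a line of parts broken out by supplier, geography, and year (toy data with invented column names):

```python
import pandas as pd

orders = pd.DataFrame({
    "part_line": ["brakes", "brakes", "engines", "engines"],
    "supplier":  ["acme",   "globex", "acme",    "globex"],
    "geo":       ["NA",     "EU",     "NA",      "EU"],
    "year":      [2013,     2013,     2013,      2013],
    "revenue":   [1000.0,   1500.0,   5000.0,    4200.0],
    "cost":      [700.0,    1100.0,   3600.0,    3500.0],
})

orders["profit"] = orders["revenue"] - orders["cost"]
profit = orders.groupby(["part_line", "supplier", "geo", "year"])["profit"].sum()
print(profit)  # profit broken out by supplier, geography, and year
```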

Want to see my view on Ad Hoc and Interactive Analytics? Go here.

Want to see my view on Real-Time Analytics? Go here.

Here are a few other products in this space:

  • ICS Hadoop
  • Cloudera
  • MapR
  • Hortonworks
  • Pivotal
  • Intel
  • IBM
  • WANdisco

Posted in Big Data.




