Aggregated Data Dilemma

Valuable performance and behavioral nuances can be buried in the aggregated data

Okay, I am weird (“tell me something that I don’t know,” say most of my friends).  For Christmas I wanted a Nike Apple Watch to go with my existing Fitbit and Garmin fitness trackers (I look sort of like a cyborg in the photo below…which is always cool).

While I was intrigued by the ability to do all sorts of cool things on the Apple Watch (like take a phone call and talk into my wrist watch like Dick Tracy), the thing that most intrigued me was the ability to buy third-party apps that could yield detailed exercise and health data.  I was hoping that this detailed exercise and health data could help me understand what effect particular behaviors or activities (or lack of particular behaviors and activities) were having on my overall health.

Why is this important to me?  You can thank articles like “Unexpected Heart Attack Triggers” for my health and exercise anxiety.  The article highlighted several things that can trigger a heart attack including:

  • Lack of sleep (definitely an issue, especially when I’m traveling so much)
  • Migraine Headaches (how can you work in technology and not have headaches)
  • Cold Weather (need to find more clients in warmer weather)
  • Big, Heavy Meals (with the exception of Chipotle, right?)
  • Getting Out of Bed in the Morning (see, I knew that was a big danger!!)
  • Alcohol (just like to drink a beer now and then)
  • Coffee (I drink Chai Tea Lattes, that’s technically not coffee, and I know that I shouldn’t admit that I drink Chai Tea Lattes)

Many items on the list above could trigger a heart attack, and I enjoy many of the things on that list (like sleeping and eating and the occasional beer).  Consequently, I thought I’d put my data science experience to work to monitor my exercise and diet behaviors and predict potential health outcomes.

Personal Fitness Analytics
I tested the downloadable data from each of the three devices. The Fitbit offered the easiest way to download my fitness data (and I have TONS of useful fitness and diet tracking suggestions if anyone at Fitbit, Garmin or Apple ever reads this blog!!). The problem with the fitness data is that I can only get daily-level data (see Table 1).

Table 1:  Daily Fitness Tracking Data

I can add more external data to the aggregated fitness data (e.g., days of the week, days when I travel, how much I travel on those travel days) to come up with some simple plots.

For example, Figure 2 shows a visual correlation between the calories that I burn per step and the days that I travel.  My assumption is that I burn more calories per step when I am doing something that requires more exertion (like running or climbing steps), so it makes sense that on days when I am traveling, I have fewer opportunities for highly exertive activities.

Figure 2:  How Many Calories I Burn Per Step When Traveling
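
A comparison like the one in Figure 2 can be sketched in a few lines of Python. This is only a rough illustration: the records below are made-up placeholder values rather than real tracker data, and the field names (steps, calories, traveling) are assumptions, not the actual column names in the Fitbit export.

```python
from statistics import mean

# Hypothetical daily records; placeholder values, not real tracker data.
daily = [
    {"steps": 10000, "calories": 3000, "traveling": False},
    {"steps": 8000, "calories": 2000, "traveling": True},
    {"steps": 12000, "calories": 3600, "traveling": False},
    {"steps": 6000, "calories": 1500, "traveling": True},
]


def calories_per_step(record):
    """Calories burned per step for one day (0.0 if no steps were logged)."""
    return record["calories"] / record["steps"] if record["steps"] else 0.0


def average_by_travel(records):
    """Average calories per step, split into travel and non-travel days."""
    groups = {True: [], False: []}
    for record in records:
        groups[record["traveling"]].append(calories_per_step(record))
    return {flag: mean(values) for flag, values in groups.items() if values}


averages = average_by_travel(daily)
print("travel days:    ", round(averages[True], 3))
print("non-travel days:", round(averages[False], 3))
```

With only one row per day, this kind of group average is about as far as the analysis can go, which is the limitation discussed next.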

While this information is “interesting,” unfortunately, data at the aggregated daily level is not actionable.  If I had more detailed or granular fitness data, I’d like to chart what happens to my heart rate (and related stress levels):

  • During an airplane flight
  • When racing through an airport to catch a connecting flight
  • When waking up very early in the morning while traveling
  • Immediately after eating a large meal
  • While I’m doing my taxes (I hate doing my taxes)

The problem is that the data provided by my fitness band is aggregated to a level that is not actionable.  If I had my fitness data at 5- or 10-minute intervals, then I could more easily spot unusual health outcomes and determine (and eventually predict?) what behaviors (e.g., flying in an airplane, eating large meals, heavy exercise exertion, waking up extremely early) might be causing health concerns.

Power of Granular Data
Big Data and data science are all about granular data because valuable performance and behavioral nuances can be buried in the aggregated data.  For example, the chart in Figure 3 shows how additional performance nuances are being uncovered as we transition from a 5-minute to a 1-minute and finally to a 5-second interval in the capture of the performance data.

Figure 3:  Performance Nuances Uncovered in Granular Data

As the data gets more granular, the behavioral and performance nuances buried in the data start to surface. Data at the 5-minute and 1-minute intervals in Figure 3 tell you very little. Aggregated data is the anti-data science. Data at the 5-second interval highlights some potential performance concerns.  In this example, data at the 5-second interval starts to become actionable.
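
The effect is easy to reproduce with a toy example. The sketch below uses a synthetic heart-rate trace (placeholder values, not the data behind Figure 3) sampled every 5 seconds, with one short spike, and averages it into 1-minute and 5-minute buckets. The function name downsample and the numbers are assumptions made for illustration.

```python
from statistics import mean


def downsample(samples, window):
    """Average consecutive, non-overlapping groups of `window` samples."""
    return [mean(samples[i:i + window]) for i in range(0, len(samples), window)]


# Synthetic trace: 10 minutes at 70 bpm, sampled every 5 seconds,
# with a single 15-second spike to 150 bpm. Placeholder values only.
SAMPLES_PER_MINUTE = 12  # one sample every 5 seconds
trace = [70] * (10 * SAMPLES_PER_MINUTE)
trace[30:33] = [150, 150, 150]

one_minute = downsample(trace, SAMPLES_PER_MINUTE)
five_minute = downsample(trace, 5 * SAMPLES_PER_MINUTE)

print("peak at 5-second resolution:", max(trace))
print("peak at 1-minute resolution:", max(one_minute))
print("peak at 5-minute resolution:", max(five_minute))
```

In this toy trace the 150 bpm spike is averaged down to 90 in the 1-minute buckets and to 74 in the 5-minute buckets, where it is barely distinguishable from the 70 bpm baseline.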

For example, I might notice an overly sedentary heart rate whenever I sit too long on a cross-country flight, or my stress level jumping whenever I get another “flight delayed” message while trying to catch a connecting flight. I might then learn to perform some in-seat exercises and walk around during those long flights, or to practice controlled breathing and some simple yoga when enduring yet another flight delay (SFO airport does have a yoga room, and now I know why).

Preparing for an IoT World of Granular Data
Understanding the challenges of capturing and analyzing real-time granular machine and device-generated data will become even more critical as we move into the Internet of Things (IoT), where hundreds of sensors are kicking off tens, hundreds or even thousands of data points per minute.  This will force two specific challenges upon those of us coming from the more traditional human-generated big data world:

  • Real-time data capture and compression
  • Real-time analytics at the edge
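
To make the first challenge concrete, here is a minimal sketch of one simple way a device could reduce what it sends: a deadband filter, which forwards a reading only when it differs from the last forwarded value by more than a threshold. This technique is chosen here purely for illustration; it is not a method named in this post, and the function name, threshold and readings are all assumptions.

```python
def deadband(readings, threshold):
    """Yield (index, value) only when a reading moves more than
    `threshold` away from the last value that was sent."""
    last_sent = None
    for index, value in enumerate(readings):
        if last_sent is None or abs(value - last_sent) > threshold:
            last_sent = value
            yield index, value


# Placeholder sensor readings, not real device data.
readings = [70, 71, 70, 72, 85, 86, 84, 70, 71]
sent = list(deadband(readings, threshold=5))

print(f"sent {len(sent)} of {len(readings)} readings: {sent}")
```

The trade-off is that small fluctuations inside the threshold are discarded at the device, so the threshold has to be set with the downstream analysis in mind.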

For my fitness focus, I might need to expand my Personal Fitness Analysis to capture and analyze more of this detailed data in (near) real-time so that I can become aware of behaviors that are hurting or improving my health and fitness.  Ultimately, my goal is to change my behaviors, but I need to understand (and quantify?) what behaviors lead to desirable health and fitness outcomes (e.g., improved blood pressure, lower weight, less stress).

The post Aggregated Data Dilemma appeared first on InFocus Blog | Dell EMC Services.

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger, and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
