Comparing Step Counts: Apple Watch, Fitbit Charge HR, and iOS Withings App

For the past several weeks, I have been wearing an Apple Watch (nerd bling!), a Fitbit Charge HR, and not exactly wearing but still carrying my mobile phone like all the other technology addicted almost-middle aged people out there. The phone has the Withings app on it because I track my weight so that I can cycle back and forth from exuberance (Alright! I lost 5 pounds!) to bummed out (And I just gained 7 pounds. PBBBBBBT!!!!). The Withings App tracks steps based on the M7 chip inside of my iPhone 5S (why yes, I give Apple a lot of my money, thanks for asking). So I wanted to see how the three compared to each other. This is a casual comparison - I'm not running statistical tests or anything this time around.

The ground rules

After getting my Apple Watch, I decided to do this comparison thing. I was going to wear both the Fitbit and the Watch each day as I normally would. What's normal? When I wake up in the morning, I put on each device. The Watch goes on my non-dominant (left) arm. The Fitbit goes on my dominant (right) arm. I wasn't picky about which went on first - usually, it was whichever I could fumble and grab first. I decided some time ago (about when I discovered my wrist was getting sort of evil-smelling from the Fitbit Charge) that I was not going to wear any devices at night nor track my sleep. (Yep, bad self tracker, I know.) I set the recording of each device to the correct arm according to their respective apps. I carried my phone with me as I would normally - which means it goes with me to breakfast and goes with me when I walk the dog in the morning. When I get to work, it stays with me for the most part, but I will forget and leave it on my desk. At the gym, it sometimes goes with me onto a treadmill but it might also linger in the locker. If I am going to be going into water, like the pool or the beach or into the shower, I take off all devices and keep the phone away from water. For the most part, I went about my life.

Normal life involves walking the above mentioned dog, going on outings with the family, doing simple social activities with friends, and working at a job that involves a lot of sitting near a computer and swearing at the photocopier. I occasionally forget my phone for some reason, panic, and then gradually accept that I am phoneless for several hours. That is normal, and I did not track what days I left my phone somewhere. That is just life. I was trying to run an experiment that was fairly true to my life.

For days tracked, I picked the window between 6/3/2015 and 7/21/2015. Are these special days? An anniversary or obscure holiday? No. I got the Watch on the 2nd, but it was late in the day so I didn't have a full day of data on that whereas the other devices/app had more time to be attached to my person. I actually wanted to go all the way through July, but my phone had been having issues and I had to get it replaced (hooray AppleCare! and yes, more money to Apple for AppleCare), so I lost Withings step data after the 21st. That's why we end then.

Getting Data Out

QS Labs was kind enough to release an iOS app called QS Access that let me get a .csv file of my Apple Health data. (Note: when I peek at the data in the Health app, it seems that some of the data points, like a span of a minute or two, are from the phone rather than the Watch? But I didn't care enough to dig into it and assume this is Apple being smart about getting a more thorough picture. I'm sure I can read some message board and get a lot more details, but I'm just calling data from the Health app "watch data" even though it isn't 100% true.)

Fitbit data I grabbed from the dashboard of Fitbit. I know there are hacks to grab data - in fact, I'm associated with one of them that will grab it in minute increments - but I just wanted to compare daily totals. Maybe one day in the future I will look at minute by minute or hour by hour or some other increment to see if there is something cool going on.

Withings lets you just export .csv files from their web interface, so that was easy enough. Just a few clicks here and there. Then stick it all into a spreadsheet and make a few plots. Again, I'm being lazy about this and am not running any serious statistics. I have had the most experience with Fitbit devices (I once tested the Zip against the Flex and saw the Flex undercounted quite a bit relative to the Zip), so I figured I'd use the Fitbit devices as a baseline. Anyway, here are the results.
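For anyone who would rather script the "stick it all into a spreadsheet" step, here is a minimal sketch of lining up daily totals by date. The column names and the numbers are made up for illustration; the real exports from QS Access, Fitbit, and Withings will have their own headers and date formats, so treat this as an assumption-laden starting point and not what I actually did.

```python
import csv
import io


def daily_totals(csv_text, date_col, steps_col):
    """Read one export into a {date: steps} dict.

    date_col and steps_col are hypothetical header names; check the
    actual export before relying on them.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[date_col]: int(float(row[steps_col])) for row in reader}


def align_by_date(*sources):
    """Keep only the days that every source has, one row per day."""
    shared = set(sources[0]).intersection(*sources[1:])
    return [(day, *(src[day] for src in sources)) for day in sorted(shared)]


# Invented sample exports (not my real data).
fitbit_csv = "Date,Steps\n2015-06-03,10000\n2015-06-04,8000\n"
watch_csv = "Start,Steps (count)\n2015-06-03,9500\n2015-06-04,8400\n2015-06-05,7000\n"

rows = align_by_date(
    daily_totals(fitbit_csv, "Date", "Steps"),
    daily_totals(watch_csv, "Start", "Steps (count)"),
)
```

Dropping days that are missing from any source matters here, since the Withings data stops on the 21st and the Watch data starts a day late.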

Result 1: Fitbit Charge HR tends to count more steps

If you take into account my previous experience showing that the Flex seemed to undercount relative to the hip based clip-on Zip (which is more similar to the research grade pedometers that science people use), then we might assume that the Charge HR undercounts relative to whatever is my true number of steps. (Also, published research suggests as much.) No matter though - I'd rather have an undercount than an overcount so that I push myself a little more. But what is interesting is that the Charge HR, assuming it was undercounting, was still counting more steps than the other two. See the poorly labeled graph that has not been cropped below.

You would think I could keep a legend, but no. The Green is Fitbit Charge HR, Blue is Apple Watch, and Yellow is the Withings app.


The Apple Watch and the Charge track pretty close, but the green always seems to be a little higher. If you want to see how much of a difference there was each day - in decimal approximations - then just keep scrolling.

This feller shows how much the Apple Watch count deviated from what the Charge HR recorded. The decimals correspond to percentages, but I just didn't feel like actually making it show as a percentage. Anyway, the shorter the bar, the closer the numbers were. Bars that point downward are undercounts. Bars that point up are overcounts. You can see that it's pretty close, with 4 days that undercounted by more than 20%. The overcounts were on 6 days, and only one of those was over 20% off.
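If you want to reproduce that kind of bar chart, the number behind each bar is just a signed fraction relative to the Fitbit baseline. A minimal sketch, with invented step counts (these are not my actual daily totals):

```python
def relative_deviation(device_steps, baseline_steps):
    """Signed fraction by which a device differs from the baseline.

    Negative means the device undercounted relative to the baseline,
    positive means it overcounted. 0.05 corresponds to 5%.
    """
    return (device_steps - baseline_steps) / baseline_steps


# Hypothetical daily totals, made up for illustration only.
fitbit = [10000, 8000, 12000, 5000]
watch = [9500, 8400, 9000, 4900]

deviations = [relative_deviation(w, f) for w, f in zip(watch, fitbit)]

# Count the days where the device undercounted by more than 20%.
big_undercounts = sum(1 for d in deviations if d < -0.20)
```

Dividing by the baseline instead of the device's own count is a choice; it keeps every bar on the same "relative to Fitbit" scale.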

So the conclusion here is that a Fitbit Charge HR on the dominant hand seems to count more steps than an Apple Watch on a non-dominant hand. Presumably, the algorithms that they use in each would account for dominant/non-dominant. And for the days that the counting was way off, I'm willing to believe something dumb happened that I don't remember, like a battery dying. (It happens.)

If you want to see it as being like a correlation, here's that plot.

The numbers on the axes are step counts. Probably should have mentioned that sooner, but you probably figured it out, right? If not, sorry. The very first graph will make more sense now. Like I said, poorly labeled!


If you ever took a statistics course, you probably covered correlation. You probably saw the pictures of a magical correlation of +1 or -1 (everything lies perfectly on a line sloping up or down) and a correlation of zero (which, if memory serves, was always like a perfect dot filled circle in the textbooks. But let's not start on that - there is a cottage industry of ranting against textbooks and I even did a chapter of my dissertation on that which eventually became a journal article.) Anyway, this has an upward line that looks +1ish, although there are some points hovering above the line (Apple undercounts relative to Fitbit). I didn't want to run any statistics, and a correlation barely counts because it's like a click, drag, and button click in your favorite spreadsheet program - so here it is. For these two, r = 0.81. That seems like a pretty high correlation. It's not 1, but getting a 1 is pretty darn hard.
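For the curious, this is roughly what the spreadsheet does behind that button click. It is a plain Pearson correlation written out by hand, a sketch and not the spreadsheet's actual code, and the sample numbers are invented:

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily step totals, made up for illustration only.
fitbit = [10000, 8000, 12000, 5000, 9000]
watch = [9500, 8400, 9000, 4900, 8800]

r = pearson_r(fitbit, watch)
```

Note that r only says how tightly the points hug some straight line. A device that always counted exactly half of what the Fitbit did would still get r = 1, which is why the difference plots are worth looking at too.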

Okay, how about the Withings? Again, relative to Fitbit's thing, we get the following difference plot.

The bars are red because they are like tears of blood. Sometimes it overcounts (like 9 days?) and the rest of the time it undercounts. But when it undercounts, IT REALLY UNDERCOUNTS. There are a lot more days that are more than 20% off. And not like a sale price 20% off. It's like "WHOOPS!"

So it turns out that forgetting the phone at home or in the car or on my desk makes it a bad step counter. Who would have thunk?! But still, assuming I'm good about carrying my phone with me on the days that don't have the super sad droopy red bars, it's still a pretty big difference.

Some time ago, there was a hubbub about some study about smart phones being reasonably good compared to a wearable tracker. Of course, the news media went to town with it and said wearable trackers are the worst thing ever so take that you annoying techno-posers! But the study actually said that smart phones are a reasonable approximation when you have people doing something like....walking on a treadmill for a science experiment. If you don't have or don't want a wearable device but want to track, then by all means, use your phone! (Just don't forget it at home or in the car or on your desk.) But if you want numbers that are a little...um, higher? Then a wearable that you don't have to think about leaving in the car or on your desk is fine. At least it works with this guy who has two thumbs and devices on each arm.

Correlation now? Here it is for the Fitbit vs. the phone app.

Just to confuse you, the Fitbit Charge HR steps are on the x-axis instead of the y-axis.


Well, that plot is sort of line like - it looks more like a line than a dotted circle. Heck, if I was a sociologist and got this as my plot, I'd probably start doing the Mypos Dance of Joy! (I'd also try to hide the fact that my N is so small.) But doing that click, drag, button trick gives me this: r = 0.60.

That's not terrible. It is certainly a good approximation, and I am saying this without actually looking at any data, but I imagine this is probably close to the correlation of height and weight of American adults of a certain age. So it isn't bad - you can get reasonably close (if you remember to carry the darn smartphone everywhere, which we think we are good at, but I can't even tell you the number of times the words "Do you know where I put my phone?" are uttered each day in my house).

What's that, you want one more correlation? For fun? Okay. The remaining pairwise correlation.

I moved the Apple Watch counts back to the x-axis just to be annoying.


This last scatter plot looks sort of like a dragon? Probably because it is green, but I can kind of make out a neck and a tail. It seems the least pretty to me, but since Game of Thrones became a thing, my relationship to dragons has changed. I'd still be delighted if I were a sociologist (and ashamed of the N - let's not forget the shame). r = 0.82. So that's not bad if we use the Watch as our baseline? But like I said before, it seemed like the Watch might get some of its step counts from phone accelerometer movement, so that could be part of what's going on.

Parting thoughts

Well, I'm happy to use all three device options (especially because the Watch is useless without the phone) even though I'd get double the wristband tan. However, I've scaled down to one for now (the Watch - Apple nerd bling makes me care more about things that do not relate to data, like overpriced beauty). My hunch is that the Fitbit does a better job of getting closer to "Real" steps, but we are a long way from any wrist-based or handheld device being able to get to that level of accuracy. I could elaborate on that hunch, but I won't right now.

And I know there are a bunch of different analyses I could and should do - average counts, significance tests (okay fine, paired t-tests are all significant at the 0.01 level), weekends vs. weekdays, excluding some outlier days, etc. But if you want my tl;dr (which you wouldn't know about because I put it at the bottom and if you dr'd it, you wouldn't know it's here), it is that phone apps don't seem as reliable or consistent compared to a wearable device when situated in the actual world of human use, as determined from this human's use. Beyond that, pick your poison. More steps can come out of the Fitbit wrist-worn device, which makes it an attractive option. That is, until you make and sell something with the Apple logo on it. Then all bets are off.
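Since paired t-tests got a drive-by mention, here is a minimal sketch of the statistic itself. A paired test fits because each day gives one count per device, so you test whether the day-by-day differences average out to zero. The numbers below are invented, and the function only returns t and the degrees of freedom; to get a significance level you would compare t against a t-table or hand it to a stats package.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    # Standard error of the mean difference (stdev is the sample std dev).
    std_err = stdev(diffs) / sqrt(n)
    return mean(diffs) / std_err, n - 1


# Hypothetical matched daily values, made up for illustration only.
device_a = [10, 12, 14, 16]
device_b = [9, 10, 13, 13]

t, df = paired_t(device_a, device_b)
```

A large positive t here would mean device_a counts consistently higher than device_b across days, not just higher on average.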

Oh, and my team published a case study of a tween girl who came to a similar conclusion that you can read here. And the whole question of accuracy is something that a bunch of elementary school kids explored too. We published that also, and one day it will be freely available.