Get fast, smart


Strava, Garmin, MapMyRun, Runkeeper, Endomondo, Runtastic, Nike+, MiCoach, … you’ve pioneered the exercise app, but it’s time to Get Smart. Leverage the great data we help each other collect.

First remove the bad data. Here’s what not to do. On July 7, 1999, Hicham El Guerrouj of Morocco ran the world’s all-time fastest mile, 3:43.13, in Rome. Go Hicham!


But according to your records, on April 1, 2012, Tina Harrison ran a 1:58 mile. Go Tina!!


Let’s stop embarrassing each other. Don’t publish that I broke Usain Bolt’s world record time of 9.58 seconds in the 100-meter dash, when actually all I did was …

  1. Finish my run
  2. Forget to turn off my watch
  3. Drive off at 60 MPH

So if my GPS reports that I’m running 100 meters in 9 seconds, don’t mess up our* records; throw out the bad data.

* Why do I say our records, and not my records? My stats feed the global data pool. My bad data messes up everyone’s records.
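A first-pass filter could be as simple as the sketch below. It is only an illustration, not anyone's actual pipeline: the function names are made up, and it uses just the two records quoted above (the 100 meters and the mile) as speed ceilings.

```python
# World-record average speeds in meters per second (distance / time).
# 100 m in 9.58 s and the mile (1609.344 m) in 3:43.13 are the records quoted above.
WORLD_RECORDS = [
    (100.0, 100.0 / 9.58),          # about 10.4 m/s
    (1609.344, 1609.344 / 223.13),  # about 7.2 m/s
]

def max_plausible_speed(distance_m):
    """Fastest believable average speed over a segment of this length."""
    # Use the record for the longest record distance not exceeding the segment;
    # segments shorter than 100 m fall back to the 100 m record.
    limit = WORLD_RECORDS[0][1]
    for record_distance, record_speed in WORLD_RECORDS:
        if distance_m >= record_distance:
            limit = record_speed
    return limit

def is_plausible(distance_m, seconds):
    """Return False for segments faster than any human has ever run."""
    if seconds <= 0:
        return False
    return distance_m / seconds <= max_plausible_speed(distance_m)
```

With this check, a 1:58 mile (`is_plausible(1609.344, 118)`) and 100 meters in 9 seconds (`is_plausible(100, 9)`) are both rejected, while an 8-minute mile passes. A real filter would want more record distances and a margin well below world-record pace for ordinary runners.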

After you throw out our false world records, use our heart rate monitors to double-check our (less reliable) GPS-measured speeds. It’s not that hard to determine how fast a runner should be able to run, given heart rate, distance, slope and prior running history. If I’m not wearing a heart rate monitor (HRM), then sanity-check my speed against segments from HRM double-checked runners who normally run like me.

Check the wind and weather data too. We both know it’s not a personal record if there is a 20 MPH tailwind. Correct for these weather effects. Your hill adjustments are also off. Uphill you should use heart rate (HR) as a proxy for effort, showing our normal speed at that HR, not some abstract calculation based on GPS speed and slope.

What we want is a RealSpeed™. RealSpeed would correct for all external effects: hills, wind, temperature (too hot or too cold = hard), even path roughness. The key is to use your millions of runs to generate real data, not to use simple (but false) calculations. The heart rate monitor (HRM) is the key (except for downhill runs). If you run 7.5 MPH (8-minute miles) at a HR of 150 beats per minute (bpm) with perfect temperature, no wind, no hills, and a flat road, then whenever you see a HR of 150 you know the RealSpeed is 7.5 (at least until your running form and conditioning improve, and RealSpeed is the key to measuring this improvement).
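Here is one rough way the heart-rate lookup could work. It is a sketch under big assumptions: the function names and the 5-beat bucket size are invented for illustration, and it assumes you can already pick out baseline samples recorded on flat ground with no wind and mild temperature.

```python
from statistics import median

def build_hr_speed_table(baseline_samples, bucket_bpm=5):
    """Map heart-rate buckets to the median speed seen in ideal conditions.

    baseline_samples: iterable of (heart_rate_bpm, speed_mph) pairs taken only
    from flat, windless, mild-temperature running (an assumption of this sketch).
    """
    buckets = {}
    for hr, speed in baseline_samples:
        key = int(hr // bucket_bpm) * bucket_bpm
        buckets.setdefault(key, []).append(speed)
    return {key: median(speeds) for key, speeds in buckets.items()}

def real_speed(table, hr, bucket_bpm=5):
    """Look up RealSpeed for a heart rate, using the nearest known bucket."""
    if not table:
        raise ValueError("no baseline data for this runner")
    key = int(hr // bucket_bpm) * bucket_bpm
    nearest = min(table, key=lambda known: abs(known - key))
    return table[nearest]
```

Build the table once per runner from baseline data, then call `real_speed(table, 150)` for any moment of any run, uphill or into the wind. Rebuilding the table every few months is what would show conditioning improving. Downhill running would need separate handling, since heart rate stops tracking effort there.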

Once you’ve got my RealSpeed you can start comparing every run I do, regardless of slope, wind, course, … Give me a cool chart that shows me, at a glance, my progress over time. We love charts – especially when we’re training and improving. Find a clear, visual way that lets me compare my last 6 months of runs versus today’s run.

Veteran runners might have a problem with RealSpeed. We eventually get slower. We might want an age-adjusted RealSpeed. Otherwise we’ll train and train and get slower. That isn’t fun. In fact any runner who works hard will be tired the next day and slow down a bit. Maybe we need a rest-adjusted RealSpeed too.

Next, stop asking us to manually create courses. There are algorithms to identify whether any two paths match, that is, whether I (or anyone else) have run this path before. Use these algorithms to instantly acquire 100% ‘course’ coverage of every step I run, without any manual input, duplicates or gaps. A good path-matching algorithm is as important to managing outdoor activity as a good search algorithm is to managing Internet activity, and we need Google, not Alta Vista. For route names, default to the name on the map, or allow the course regulars to propose better names, which all those who have completed the course can vote on.
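To make path matching concrete, here is a deliberately naive sketch. It is not a claim about which algorithm any app uses: it treats two GPS tracks as the same path when every point of each lies within a tolerance of some point on the other (a symmetric nearest-point check), and the 25-meter tolerance is an arbitrary assumption.

```python
import math

EARTH_RADIUS_M = 6371000.0

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

def directed_distance_m(track_a, track_b):
    """Largest distance from any point of track_a to its nearest point of track_b."""
    return max(min(haversine_m(p, q) for q in track_b) for p in track_a)

def same_path(track_a, track_b, tolerance_m=25.0):
    """True if each track stays within tolerance_m of the other (both directions)."""
    return (directed_distance_m(track_a, track_b) <= tolerance_m
            and directed_distance_m(track_b, track_a) <= tolerance_m)
```

This version ignores running direction, compares every point against every other point (slow for long tracks), and assumes the tracks are sampled densely enough that nearest recorded points stand in for the path itself. A production matcher would need a spatial index and a way to match partial overlaps, so that shared segments become courses automatically.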

When I do run the same paths I’d like to see how I’m doing versus my previous steps on that path. This may sound simple, but it isn’t. Why? Look at the GPS plot below. I don’t think most GPS data is this bad, but …


This GPS data is so bad you might not even realize I’m actually just doing laps in the bottom lane of the pool (the bright blue rectangle in the middle). Before you can compare two identical runs, you’ll need some tricky DSP (digital signal processing) work to clean the GPS data and extract a simple, easy-to-understand feel for speed and effort.

Right now my speed graphs show a lot of random spikes, even when I’m running smoothly.

Without this clever DSP work, when you compare two or twenty identical runs you’ll get a barrage of spikes and noise, hiding the signal or ‘true’ speed.  Digest our noisy data so we can understand it.

You might be amazed how nicely a top-notch signal-processing expert can clean up noisy data, given enough data. Here’s some non-expert signal processing.


This graph looks pretty, but its smoothness is a lie. Bad signal processing, in this case smoothing data over 60+ seconds, created a false smoothness. The terrain was not smooth, and I was not trying to run smoothly. Here’s the same run, without the fake smoothing.


  • The first, biggest spike is where I walked down some stairs.
  • The second spike is where I stopped after my speed mile.
  • The third, small but sharp spike is where I stopped to drink from a fountain.
  • The fourth spike is where I encountered a flight of (very) steep stairs (and stopped for a breath at the top).

[Image: Great Highway]

When I got to the (obvious) very flat spot on Great Highway (alongside the very flat Pacific Ocean), I increased my pace from about a 9-minute mile to a 7-minute mile, just to see if I could run this fast, starting at the moment when I pushed my lap button (which should have been noted). The smoothed graph entirely misses this 7-minute mile, and that’s not helping me analyze my run.

While the little spikes may be noise, OK to smooth, the big spikes are not OK to smooth. When I ‘change gear’ from fast to slow (or slow to fast) the graph should jump, not connect, and certainly not blend in speeds from 30 or 60 seconds before my abrupt pace change. It might be more meaningful to skip or drop non-running spots from my running graph, since I’m actually not running.
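One small illustration of smoothing that keeps the jumps: a running median instead of a running average. This is just one textbook option, swapped in here to show the idea, and certainly not the whole of the "clever DSP work" a real expert would do. The function names and window size are assumptions.

```python
from statistics import median

def moving_average(values, window):
    """Centered moving average; blends speeds across abrupt pace changes."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def median_filter(values, window):
    """Centered running median; removes isolated spikes but keeps sharp steps."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(median(chunk))
    return out

# Steady 6 MPH with one noise spike, then an abrupt change of gear to 9 MPH.
speeds = [6.0, 6.0, 6.0, 15.0, 6.0, 6.0, 6.0, 9.0, 9.0, 9.0, 9.0, 9.0]
smeared = moving_average(speeds, 3)
cleaned = median_filter(speeds, 3)
```

On this toy data the median filter removes the 15 MPH spike completely and leaves the jump from 6 to 9 MPH exactly where it happened. The moving average turns the spike into a 9 MPH bump and reports 7 and 8 MPH around the gear change, speeds I never actually ran.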

More important, you need a meters per second graph or MPH graph. The current Seconds per Mile display goes up to infinity as I slow down. This makes ugly graphs, with the ‘slow’ spikes so high they take up all the vertical space. It’s more than a cosmetic problem. My fast sections are critical for training; the slow parts are literally just junk.

Our fast sections must stick up and be emphasized. 10 MPH should be higher than 8 MPH. Anything less than a running pace (5 MPH) should be just noise at the bottom of the graph. This is exactly what an MPH graph does.
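The arithmetic behind the problem is short enough to show. Pace in seconds per mile is 3600 divided by speed in MPH, so it blows up as speed approaches zero. The helper name below is made up for this illustration.

```python
def pace_seconds_per_mile(speed_mph):
    """Convert speed in MPH to pace in seconds per mile."""
    if speed_mph <= 0:
        return float("inf")  # standing still: pace is unbounded
    return 3600.0 / speed_mph

# Compare how tall each speed would be drawn on a pace graph.
for mph in (10.0, 7.5, 5.0, 1.0, 0.1):
    print(f"{mph:5.1f} MPH -> {pace_seconds_per_mile(mph):8.0f} s/mile")
```

A 10 MPH sprint is 360 seconds per mile and a shuffle at 1 MPH is 3600, so on a pace graph the shuffle is drawn ten times taller than the sprint, and a full stop is off the chart entirely.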


Unfortunately, the popular Minutes:Seconds per Mile graph highlights the meaningless, easy sections of our workouts and squashes the important, hard sections. Stops become huge spikes; sprints are practically invisible.


The blue and red graphs show the exact same speeds, but the red, Seconds Per Mile graph emphasizes all the wrong information.

Since many runners stop at the same places (a turn around point at a fence or wall, a water fountain, bathroom or traffic light, …), looking at everyone’s data will help separate what is a ‘stop’ from what is data noise. The clean up process will need to be fed, not only my GPS data, but also my lap button presses, terrain data, and places we take breaks. Big data combined with good DSP work should clean up my noise without wiping out my signal.
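As a hedged sketch of the "everyone stops here" idea: bucket each detected pause into a small map grid cell and keep the cells where several different runners have paused. The input format, the grid size, and the three-runner threshold are all assumptions made for this example.

```python
from collections import defaultdict

def common_stop_cells(stops, cell_deg=0.0002, min_runners=3):
    """Find grid cells where several different runners have paused.

    stops: iterable of (runner_id, lat, lon) for each detected pause.
    cell_deg: grid size in degrees (0.0002 degrees of latitude is roughly 22 m).
    """
    runners_by_cell = defaultdict(set)
    for runner_id, lat, lon in stops:
        cell = (round(lat / cell_deg), round(lon / cell_deg))
        runners_by_cell[cell].add(runner_id)
    return {cell for cell, runners in runners_by_cell.items()
            if len(runners) >= min_runners}
```

A zero-speed reading inside one of these cells is probably a real stop at the fountain or the traffic light; one in the middle of an empty stretch is more likely noise. Counting distinct runners, rather than total pauses, keeps one person's habit from defining a stop for everybody. A fixed grid will split a stop that straddles a cell boundary, so a real system would want proper clustering.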

The most valuable thing you could do for me is to give me a Big Data-driven virtual coach. I want my big-data coach to: Tell me if I’m not warming up enough before pushing hard. Pester me when I over-train and tell me I need to take a break. Coach me when I need intervals, both how much to rest, and how much to push. Motivate me with real data on how much faster I’ll be if I run 5 miles more a week, or lose 5 pounds, or can push harder on my hardest hill. Have an expert, like Owen Anderson, PhD., author of the new book, Running Science, or another famous scientific data-driven coach co-design and endorse your big-data coach.

Maybe I’m biased after working on games, but can I get a score? How much have I improved over the last four years? Am I the 10,000th fastest half marathoner in the world? If I’m not, how much faster do I need to go? How much help was my last training run? Coach – encourage me! If I’m going to live an extra 100 minutes thanks to my 40 minute workout, let me know my sweat just bought me an extra hour of life.

For bonus credit, if more experienced runners or riders take variant courses, we’d love to see them (on your map, of course). I’ve been pounding down the steep Point Lobos Avenue hill for years.

[Image: Point Lobos Ave]

I just found out there is a nice, soft (dirt) footpath hiding just across the street.

[Image: Sutro Heights stairs]

Ideally less experienced runners of the Point Lobos hill could now benefit from my experience. If the system were paying attention it would see I’ve changed my routine (say after 5 repeats of the new route instead of the old route) and then ask me if I’d recommend this change to others.

I don’t always want to run alone. I want something like a game matchmaking service, to let me discover potential training partners: people who run where I run, who aren’t too fast or too slow.  If I run by them, send me a sign. A simple text will do. Tell us both how much faster we’ll become if we start running together. Tell me which of my coworkers (my address book), neighbors (your service), contacts (LinkedIn), or friends of friends (Facebook) might want to run with me. Let running clubs and running stores promote their weekly runs and sign up their friends. Allow coaches and trainers to monitor and message their athletes. Encourage race organizers to set up their exact courses and give out virtual prizes.

Get super smart! You have more data than any coach or exercise study in history. Look for patterns in your millions of runs to predict how we’re most likely to get hurt, and then warn us. How do you detect an injury? Look for regular runners that stop in the middle of a workout and then don’t run again for weeks. Take me as an example. If you look at my March 28th log, you’ll see me running as fast as I can down a steep hill, then stopping about two miles from home, then no running for the next ten weeks. Once you can see where we get injured, help us not get injured. If you see us running too fast down a steep hill, tell us how risky that is, and how to mitigate that risk. I’m not the only one who would pay a virtual coach $10/month to skip one injury. That’s a lot less cost (and pain) than physical therapy.
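The injury pattern described above (a run cut short, then weeks of silence) is easy to sketch. Everything here is illustrative: the log format, the `usual_miles` field, and the thresholds are hypothetical, not anything your service is known to store.

```python
from datetime import date, timedelta

def possible_injuries(runs, min_gap_days=21, cut_short_ratio=0.6):
    """Flag runs that ended early and were followed by a long layoff.

    runs: list of (run_date, actual_miles, usual_miles) sorted by date, where
    usual_miles is the runner's typical distance for that route (hypothetical field).
    """
    flagged = []
    for (day, miles, usual), (next_day, _, _) in zip(runs, runs[1:]):
        cut_short = miles < cut_short_ratio * usual
        long_layoff = (next_day - day) >= timedelta(days=min_gap_days)
        if cut_short and long_layoff:
            flagged.append(day)
    return flagged

# Made-up log: regular 6-mile runs, one run abandoned early, then ten weeks off.
log = [
    (date(2000, 3, 21), 6.0, 6.0),
    (date(2000, 3, 25), 6.0, 6.0),
    (date(2000, 3, 28), 3.5, 6.0),
    (date(2000, 6, 6), 2.0, 6.0),
    (date(2000, 6, 9), 6.0, 6.0),
]
suspects = possible_injuries(log)
```

On this invented log only the March 28 run is flagged. The short comeback run on June 6 is not, because the next run follows three days later. The most recent run in a log can never be flagged, since there is no following run to measure the layoff against. Once the suspect runs are found, the interesting work begins: looking at what the runner was doing (speed, slope, mileage ramp) just before each one.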

Want to spawn a blizzard of sparkling publicity? Enlist us in mass experiments to replace old coaches’ tales and marketing puffery with hard-data backed science. Maybe start with our shoes. Do we run the same speed at a lower heart rate when we put on fancy new shoes? Do minimalist shoes actually reduce our injury rate? Find out everything: Does stretching before running increase or decrease injuries, what’s the impact of a cool down run, how much warm up do we need, do long, slow runs increase fitness, how fast is it safe to ramp up distance (particularly after certain kinds of injuries), how much faster do we get if we join a club or get a coach, does cross training help, and if so, what helps the most, …

Finally, once you gather for me all this great information, I want it nicely organized on a single page, my race readiness fitness poster. Most runners have weekly schedules; it’s fun to compare run to run, but a real coach watches progress week-to-week and race-to-race. You don’t make this easy; my progress information is scattered on bunches of pages hiding somewhere.

In conclusion: Make yourself indispensable. Look at runners who run real races. Find out which training programs actually work best, and teach them to us.

Get fast, smart, together!