Imran Haque — Mon 04 January 2021

TL;DR - How to implement a cycling speed+cadence+power meter over Bluetooth LE, and some other small software tricks on the PeloMon.

Fourth in a series. See the project GitHub, to be updated through the series.

The code
3,418 pages of specs and not a single datatype
How to do it: incomplete documentation, multiple BLE services, incompatible data formats, BLE “advertising”, poorly written specs
User interface
Fun little code tricks

The code

The moment you’ve been waiting for: the source code for the PeloMon is now present in the project GitHub repo. This post walks through some of the process of writing it, with a special focus on the hell that is dealing with Bluetooth.

3,418 pages of specs and not a single datatype

Bluetooth is a complicated protocol. The Core protocol spec v5.2 is 3,256 pages long. The Core Specification Supplement (I guess there just wasn’t room for another appendix) is 37 pages. There are two cycling-related sub-specifications relevant to the PeloMon: Cycling Power and Cycling Speed and Cadence, and each has a “profile” and a “service” spec (47, 37, 32, and 19 pages, if you care. You don’t.).

The short version that will let you read other documentation on the Internet: Bluetooth LE (aka BLE) has a feature called “GATT”, the “Generic Attribute Profile”, which various BLE sensor types use to expose the data they want to show off. The device must implement one or more “services”, each of which consists of one or more “characteristics”, where the characteristic contains the data you actually want to share. Additionally, your device must advertise which services it offers in a special advertisement broadcast packet. There are other subtleties tailored for power-saving. For example, the ways to access a characteristic include not just “read” and “write”, but also “notify”, which allows the sensor to actively push a new value to a connected device when it changes. This requires something called a CCCD and ohmygodmakeitstop.

And after reading all of those specs, you’ll realize that in all those hundreds of pages of PDFs you’ve read, NOT ONCE did any of them tell you what the data format is that you’re supposed to expose. 16 bits? 32 bits? Signed, unsigned? Fixed point or integer or decimal or binary? That’s not important, why would you care about that? And then you realize that there’s a whole parallel world of XML-based specification documents that summarize all of that low-level stuff that you actually cared about.

How to do it with only a little hair-tearing

That’s the bad news. The good news is that I have stared into the abyss for you and come back with what you need to know. If you just want a PeloMon that works, clone the code from the repo, build it, and move on with life. But if you’re looking for something a bit more detailed, read on.

Gotcha 0: The Adafruit documentation

Adafruit provides an AT command set to interface with the BLE chip, plus a C++ library that can both issue low-level AT commands and provide a higher-level interface. While the AT command set has reasonable if somewhat sparse documentation, the C++ library is basically undocumented besides a few example projects. It’s also definitely incomplete in terms of its functionality. For example, at the outset of the project, while the C++ library supported issuing a command with an integer response and returning that response to caller code, it did not have functionality to handle string responses.

Fortunately on the latter point, the library code is straightforward and the maintainers are quick to take pull requests with enhancements. On the former point…well, you just gotta read and experiment. For example, it looks like you can pass data formats for defining BLE characteristics, like INTEGER or BYTEARRAY. Don’t bother. Just use BLE_DATATYPE_AUTO. That one actually works. Unfortunately the Nordic nRF BLE chip’s firmware isn’t open source, so it’s not possible to see all the details of what works without trying.

Gotcha 1: Cycling Speed and Cadence vs Cycling Power

There are two BLE services (aka, sensor types) that are relevant:

Cycling Speed and Cadence Service (CSCS, UUID 0x1816) is capable of reporting… speed and cadence. And that’s all.
Cycling Power Service (CPS, UUID 0x1818) can report total power output, power output per pedal, speed and cadence, torque, etc.

Even though CPS supports a superset of CSCS’s data, it may not be enough to only implement CPS — in particular, although Wahoo on my phone could read power, speed, and cadence from a PeloMon only implementing CPS, my Garmin watch could not detect it at all. Garmin watches (at least, my Venu) require a sensor supporting CSCS. So for maximum support, you will have to implement both CSCS and CPS.

Each of these services comprises characteristics, which are what actually contain the data you want to report. The main characteristics we care about are “cycling power measurement” and “cycling speed and cadence measurement”. However, we also need to implement a couple of ancillary characteristics reporting metadata about the services: CP and CSC “feature” characteristics, which report which measurements the sensor actually supports; CP/CSC “sensor location”, which on a real bike would indicate where the sensor is located and for the PeloMon is just set to “left crank”; and (foreshadowing gotcha number 4), “SC Control Point” for speed and cadence.

In BLE, predefined services and characteristics are given 16-bit UUIDs, defined in the BLE specs. The XML specs are the most convenient reference point for both the service and characteristic UUIDs as well as the definition of the data formats they use. The ones we care about for the PeloMon are:

UUID	Description
Cycling Power Service
0x1818	CP Service
0x2A63	CP Measurement Characteristic
0x2A65	CP Feature Characteristic
Cycling Speed and Cadence Service
0x1816	CSC Service
0x2A5B	CSC Measurement Characteristic
0x2A5C	CSC Feature Characteristic
0x2A55	SC Control Point Characteristic
Both
0x2A5D	Sensor Location Characteristic

(There are two other relevant ones that weren’t needed for the PeloMon: CP Control Point Characteristic and CP Vector Characteristic.)

The PeloMon file ble_constants.h contains human-readable constants for the UUIDs and feature flags used in these characteristics.
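
As a rough idea of what such a header can look like, here is a minimal sketch built from the table above. The identifier names are hypothetical and not copied from ble_constants.h. The value 5 for “left crank” is an assumption based on the Bluetooth Sensor Location enumeration and is worth checking against the assigned numbers.

```cpp
#include <stdint.h>

// Hypothetical names; the values are the 16-bit UUIDs from the table above.
namespace ble_uuid {
constexpr uint16_t CP_SERVICE       = 0x1818;
constexpr uint16_t CP_MEASUREMENT   = 0x2A63;
constexpr uint16_t CP_FEATURE       = 0x2A65;
constexpr uint16_t CSC_SERVICE      = 0x1816;
constexpr uint16_t CSC_MEASUREMENT  = 0x2A5B;
constexpr uint16_t CSC_FEATURE      = 0x2A5C;
constexpr uint16_t SC_CONTROL_POINT = 0x2A55;
constexpr uint16_t SENSOR_LOCATION  = 0x2A5D;
}  // namespace ble_uuid

// Assumed Sensor Location value for "left crank".
constexpr uint8_t SENSOR_LOCATION_LEFT_CRANK = 5;
```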

Gotcha 2: Data formats (and CPS/CSCS incompatibility)

The Peloton reports power directly in deciwatts and crank cadence directly in rpm (and speed in mph can be computed from power). While the BLE Cycling Power Service reports power directly in watts, cadence and speed are not reported in normal units. Instead, they are reported as a pair of values: the number of total crank or wheel revolutions, and a timestamp when the last revolution was completed. (If you think about a physical bicycle sensor, this makes a ton of sense: all the sensor tracks is when the magnet on the wheel or crank crosses its sensor, and how many such crossings have occurred; this is much cheaper than constantly computing the current speed.)

So, to implement speed and cadence, we 1) have to integrate the velocities coming from the Peloton into total revolutions and 2) back-calculate the timestamp when the last rev-completion took place. The first is easy and mostly obvious. (Though note that for speed, we need to convert linear speed to wheel revolutions, which requires an assumption on wheel circumference. The PeloMon uses the canonical value for a 700c x 25 wheel: 2105mm.) It’s very easy to overlook the latter problem: it’s not enough to simply take the timestamp when you performed the update (e.g., when the RPM message came from the bike); you need to figure out when the crossing to the next integral value took place, or else your speeds will come out all wrong.
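
To make the back-calculation concrete, here is a minimal sketch of one way to do it. It is not the PeloMon's actual code: RevCounter and update_revs are hypothetical names, and it assumes the rate is non-negative and constant over the interval since the previous update.

```cpp
#include <math.h>
#include <stdint.h>

// Hypothetical accumulator for one rotating part (crank or wheel).
struct RevCounter {
    uint32_t revs = 0;           // completed revolutions so far
    float frac = 0.0f;           // part of the next revolution already covered, in [0, 1)
    uint32_t last_event_ms = 0;  // estimated time the latest revolution completed
};

// Advance the counter by dt_ms milliseconds at a constant rate of rpm
// (assumed >= 0), with the interval ending at time now_ms.
inline void update_revs(RevCounter &c, float rpm, uint32_t now_ms, uint32_t dt_ms) {
    const float revs_per_ms = rpm / 60000.0f;
    const float total = c.frac + revs_per_ms * (float)dt_ms;
    const float whole = floorf(total);
    if (whole >= 1.0f) {
        c.revs += (uint32_t)whole;
        // The leftover fraction was covered after the last crossing, so
        // dividing it by the rate gives how long ago that crossing happened.
        const float ms_since_crossing = (total - whole) / revs_per_ms;
        c.last_event_ms = now_ms - (uint32_t)lroundf(ms_since_crossing);
    }
    c.frac = total - whole;
}
```

For example, 1.5 seconds at 60 rpm completes one revolution, and the crossing is placed at 1000 ms rather than at the 1500 ms update time. The same routine works for wheel revolutions once linear speed has been converted to wheel rpm using the assumed circumference.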

While the units for revolutions are obvious (integer number of revolutions), timestamp is less obvious. Rather than using a decimal fraction of a second, both CPS and CSCS use binary fractions of a second…but CPS and CSCS have different timestamp resolutions. Specifically, CSCS takes the last wheel revolution timestamp in units of 1/1024 second, whereas CPS has a higher resolution of 1/2048 sec. This leads to a subtle incompatibility (at least with certain software): if you implement speed and cadence in both CPS and CSCS, you may see incorrect/unstable values reported. For example, when I implemented speed and cadence support in both CPS and CSCS, Wahoo on my phone reported crazy values. Using the specified different time resolutions for each led to Wahoo freaking out and reporting bad data; setting both to 1/2048 led to it reporting 2x the correct speed. Evidently, it was using the timestamp resolution from one service for both and producing nonsensical results. The easy way around this was to report only power and accumulated energy (total kJ) in CPS, and use CSCS to report speed and cadence. Wahoo recognized both services and read the data from each correctly.
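
The effect of a resolution mix-up is easy to see in a small sketch. The function names here are hypothetical, and the collector-side calculation is only my assumption of how a receiving app might derive a rate from two consecutive samples; it is not a claim about what Wahoo does internally.

```cpp
#include <stdint.h>

constexpr uint32_t CSC_TICKS_PER_SEC       = 1024;  // CSCS event-time resolution
constexpr uint32_t CPS_WHEEL_TICKS_PER_SEC = 2048;  // CPS wheel event-time resolution

// Convert a millisecond timestamp to a wrapping 16-bit tick count.
inline uint16_t ms_to_ticks(uint32_t ms, uint32_t ticks_per_sec) {
    return (uint16_t)(((uint64_t)ms * ticks_per_sec) / 1000u);
}

// Collector-side view: revolutions per minute from two consecutive samples.
// Unsigned 16-bit subtraction gives the right delta even across a wraparound.
inline float rpm_from_samples(uint16_t revs0, uint16_t ticks0,
                              uint16_t revs1, uint16_t ticks1,
                              uint32_t ticks_per_sec) {
    const uint16_t drevs  = (uint16_t)(revs1 - revs0);
    const uint16_t dticks = (uint16_t)(ticks1 - ticks0);
    if (dticks == 0) return 0.0f;  // no new event since the last sample
    return 60.0f * drevs * ticks_per_sec / dticks;
}
```

In this sketch, timestamps written in 1/2048 sec units but decoded as 1/1024 sec produce half the true rate, and the reverse combination produces double. Either way, the collector is off by a factor of two whenever the two sides disagree about the resolution.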

At a binary level, both the CPS and CSCS measurement characteristics follow a similar data format: flags, followed by mandatory data, followed by optional data indicated by the flags. The PeloMon’s CPS measurement data follows the format [flags uint16] [power sint16] [energy uint16]; one flag bit is set indicating the presence of accumulated energy, power is a 16-bit integer in watts (signed in the spec, which makes no difference for non-negative values), and total energy is a uint16 in kilojoules. All values are little-endian. PeloMon’s CSC measurement data looks like [flags uint8] [wheel revs uint32] [wheel rev timestamp uint16] [crank revs uint16] [crank rev timestamp uint16]. Two bits are set in flags, indicating both wheel and crank revolution data present. Per the CSC Measurement definition, the wheel rev count is an unsigned 32-bit integer and the crank rev count is an unsigned 16-bit integer that is allowed to freely wrap around. Timestamps, as mentioned above, are 16-bit integers representing the time, in units of 1/1024 sec, at which the wheel or crank last completed a revolution; they wrap around too.
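
Here is a minimal sketch of packing those two payloads byte by byte. The helper names are hypothetical, and the field widths and flag bits are my reading of the Bluetooth CP Measurement and CSC Measurement definitions (accumulated energy as flag bit 11, cumulative wheel revolutions as a uint32), so verify them against the XML specs before relying on them.

```cpp
#include <stddef.h>
#include <stdint.h>

// Flag bits, assumed from the CP and CSC measurement definitions.
constexpr uint16_t CP_FLAG_ACCUMULATED_ENERGY = 0x0800;  // bit 11
constexpr uint8_t  CSC_FLAG_WHEEL_REV_DATA    = 0x01;    // bit 0
constexpr uint8_t  CSC_FLAG_CRANK_REV_DATA    = 0x02;    // bit 1

// Append a little-endian integer at offset off; returns the new offset.
inline size_t put_u16(uint8_t *buf, size_t off, uint16_t v) {
    buf[off] = (uint8_t)(v & 0xFF);
    buf[off + 1] = (uint8_t)(v >> 8);
    return off + 2;
}
inline size_t put_u32(uint8_t *buf, size_t off, uint32_t v) {
    off = put_u16(buf, off, (uint16_t)(v & 0xFFFF));
    return put_u16(buf, off, (uint16_t)(v >> 16));
}

// CP Measurement: flags, instantaneous power (W), accumulated energy (kJ).
// buf must hold at least 6 bytes; returns the number of bytes written.
inline size_t pack_cp_measurement(uint8_t *buf, int16_t watts, uint16_t kilojoules) {
    size_t off = put_u16(buf, 0, CP_FLAG_ACCUMULATED_ENERGY);
    off = put_u16(buf, off, (uint16_t)watts);
    return put_u16(buf, off, kilojoules);
}

// CSC Measurement: flags, wheel revs, wheel event time, crank revs,
// crank event time (times in 1/1024 sec). buf must hold at least 11 bytes.
inline size_t pack_csc_measurement(uint8_t *buf, uint32_t wheel_revs, uint16_t wheel_ticks,
                                   uint16_t crank_revs, uint16_t crank_ticks) {
    buf[0] = CSC_FLAG_WHEEL_REV_DATA | CSC_FLAG_CRANK_REV_DATA;
    size_t off = put_u32(buf, 1, wheel_revs);
    off = put_u16(buf, off, wheel_ticks);
    off = put_u16(buf, off, crank_revs);
    return put_u16(buf, off, crank_ticks);
}
```

Writing the bytes out explicitly, rather than copying a struct into the buffer, keeps the layout independent of compiler padding and host byte order.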

Gotcha 3: Advertising

Implementing the services alone isn’t enough. While unpaired, BLE devices broadcast an “advertising” packet listing what services they support, so that scanning devices can filter down which ones they show to the relevant set. (For example, when scanning for sensors, my watch will not show the TV nearby, because the TV doesn’t support any services the watch cares about.)

Bluetooth Core Specification v5.2, Volume 3 (Host), Part C (Generic Access Profile), chapter 11 defines the format of the advertising packet. It is a sequence of “AD” structures:

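    struct AD {
            uint8_t length;   // number of bytes that follow: ad_type plus data
            uint8_t ad_type;  // which kind of data this structure carries
            uint8_t data[];   // length - 1 bytes of payload (C99 flexible array member)
    };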

padded out to the rest of the max size of 62 (?) bytes with zeros.

ad_type is defined as one of the BT generic access profile assigned numbers. With that, we can decode what the PeloMon was advertising by default (well, with one non-default item: I had already set the device name using the firmware-supported AT+GAPDEVNAME command):

Raw bytes	Description
`02 01 06`	Type 0x01, Flags: 0x06 = LE General Discoverable Mode (bit 1) and BR/EDR Not Supported (bit 2) (defined in Bluetooth Core Specification Supplement (CSS) v9 Part A (data types specification), section 1.3)
`02 0A 00`	Type 0x0A, Tx power level: 0dBm (defined in CSSv9 Part A Section 1.5 as a signed int8_t representing power level from -128 to 127dBm)
`11 06 9E CA DC 24 0E E5 A9 E0 93 F3 A3 B5 01 00 60 6E`	Type 0x06, incomplete list of 128-bit service class UUIDs: the BLE UART service predefined in the firmware
`08 09 50 65 6C 6F 4D 6F 6E`	Type 0x09, complete local name: PeloMon
(29 bytes of 0x00)	Padding
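
The decoding in the table can be reproduced with a short walk over the AD structures. This is a minimal sketch with hypothetical names (AdView, parse_ad); it treats a zero length byte as the start of the padding.

```cpp
#include <stddef.h>
#include <stdint.h>

// A view of one AD structure inside a larger advertising buffer.
struct AdView {
    uint8_t type;         // AD type (GAP assigned number)
    const uint8_t *data;  // points into the original buffer
    uint8_t data_len;     // payload length, excluding the type byte
};

// Walk buf and store up to max_out AD structures; returns how many were found.
inline size_t parse_ad(const uint8_t *buf, size_t len, AdView *out, size_t max_out) {
    size_t n = 0;
    size_t i = 0;
    while (i < len && n < max_out) {
        const uint8_t l = buf[i];
        // Stop at padding (length 0) or at a structure that runs off the end.
        if (l == 0 || i + 1 + (size_t)l > len) break;
        out[n].type = buf[i + 1];
        out[n].data = buf + i + 2;
        out[n].data_len = (uint8_t)(l - 1);
        n++;
        i += 1 + (size_t)l;
    }
    return n;
}
```

Fed the 62 bytes from the table, this yields four structures with types 0x01, 0x0A, 0x06, and 0x09, the last of which carries the seven bytes of “PeloMon”.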

So we need to add the service UUIDs for CSCS and CPS to those last spare bytes available. Service UUIDs need to be put into advertising data in little endian format (though CPS is 0x1818 so it doesn’t matter, conveniently; it does matter for CSCS with UUID 0x1816.) Following the example code, we want to add two more services as “incomplete list of 16-bit service class UUIDs” to what is already there: 0x05 0x02 0x18 0x18 0x16 0x18:

0x05: five bytes in clause
0x02: clause is of type “incomplete list of 16-bit UUIDs”
0x18 0x18: little-endian CPS UUID
0x16 0x18: little-endian CSCS UUID
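
The same clause can be generated rather than hand-assembled. A minimal sketch, with a hypothetical helper name, that writes an “incomplete list of 16-bit UUIDs” structure for any set of services:

```cpp
#include <stddef.h>
#include <stdint.h>

constexpr uint8_t AD_TYPE_INCOMPLETE_UUID16_LIST = 0x02;

// Write the AD structure for the given 16-bit service UUIDs into out.
// Returns the number of bytes written, or 0 if it does not fit.
inline size_t build_uuid16_ad(const uint16_t *uuids, size_t count,
                              uint8_t *out, size_t cap) {
    const size_t total = 2 + 2 * count;  // length byte + type byte + UUIDs
    if (total > cap || total - 1 > 255) return 0;
    out[0] = (uint8_t)(total - 1);  // the length byte does not count itself
    out[1] = AD_TYPE_INCOMPLETE_UUID16_LIST;
    for (size_t i = 0; i < count; i++) {
        out[2 + 2 * i] = (uint8_t)(uuids[i] & 0xFF);  // low byte first
        out[3 + 2 * i] = (uint8_t)(uuids[i] >> 8);
    }
    return total;
}
```

Called with the CPS and CSCS UUIDs (0x1818, 0x1816), it produces the six bytes 05 02 18 18 16 18 listed above.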

Note that with the Adafruit firmware, when issuing the AT+GAPSETADVDATA command, you must not include the device name (that’s set with AT+GAPDEVNAME) and don’t need to add the zero-byte padding.

This is enough to get the PeloMon showing up in the device scan, and to see the data in Wahoo, but not quite enough to see speed on the Garmin, bringing us to…

Gotcha 4: Ancillary Services and Subtle Specs

Implementing CPS and CSCS as above was not enough to get the sensor working on my Garmin watch. It would be detected, but it wouldn’t actually show the speed from the ride. Remember how I said the PDF specs are worthless? Turns out, there are some important details really buried in there. Here are a couple tidbits from the CSC Profile spec (which defines the spec from the perspective of the “collector” device trying to read from the sensor):

3.1.1.4: CSC Sensor should include the value of the Appearance characteristic defined in its Advertising data
4.7.2.1: If the Wheel Revolution Data Supported bit of the CSC Feature characteristic is set to 1, then this procedure (SC Control Point Set Cumulative Value Procedure) is supported by the CSC Sensor.

Addressing the first didn’t do anything; in fact, trying to set the advertising data to include this threw an error in the BLE firmware. However, exposing a third BLE characteristic, “SC Control Point”, with the right permissions fixed the issue. The presence of the characteristic with even a no-op handler for “Set Cumulative Value” allowed the Garmin to properly report both speed and cadence from CSCS.
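
For anyone who wants to go one step beyond a pure no-op, here is a sketch of what a minimal SC Control Point responder could look like. It is not what the PeloMon does (a no-op handler was enough there), the function name is hypothetical, and the op code and response values are written from my recollection of the CSC Service spec, so treat them as assumptions to verify.

```cpp
#include <stddef.h>
#include <stdint.h>

// Assumed SC Control Point op codes and response values.
constexpr uint8_t SC_OP_SET_CUMULATIVE_VALUE = 0x01;
constexpr uint8_t SC_OP_RESPONSE_CODE        = 0x10;
constexpr uint8_t SC_RESP_SUCCESS            = 0x01;
constexpr uint8_t SC_RESP_OP_NOT_SUPPORTED   = 0x02;

// Build a 3-byte response for a control point write:
// [response op code] [request op code] [result].
// "Set Cumulative Value" is acknowledged without changing any state;
// every other request is reported as unsupported.
// Returns the response length, or 0 for an empty request.
inline size_t sc_control_point_response(const uint8_t *req, size_t req_len,
                                        uint8_t resp[3]) {
    if (req_len == 0) return 0;
    resp[0] = SC_OP_RESPONSE_CODE;
    resp[1] = req[0];
    resp[2] = (req[0] == SC_OP_SET_CUMULATIVE_VALUE) ? SC_RESP_SUCCESS
                                                     : SC_RESP_OP_NOT_SUPPORTED;
    return 3;
}
```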

Creating a user interface

“But wait”, you say, “you plug this guy into the Peloton, pair it with your watch, and off you go, right? What interface?”

For debugging and fun times purposes, though, it’s nice to be able to interact with the device with something beyond the USB serial monitor in the Arduino tooling. Conveniently, the Adafruit firmware includes a serial interface emulator called the BLEUart. The documentation for this is…nearly nonexistent. If you look at the doc pages for the 32u4 Bluefruit, there’s nothing useful at all. A different Bluefruit has a little more info, but really not a whole lot. But that’s OK! You can more or less treat it like a Serial interface, with some caveats. For example, available() has an implicit timeout; if there’s nothing already in a receive buffer, it will wait this long to see if anything shows up. Resetting the timeout with the (undocumented!) setTimeout method works — but is unreliable with timeouts below 3ms. Experimentation!

Using the Adafruit Bluefruit Connect app to connect to the PeloMon, a debug console is available on the UART tab. By default, the PeloMon logs the current cadence, power, and speed to the console; this can be disabled by sending the command nolog, or additional debug output requested with debug. More commands are available too!

Command	Description
`help`	list available commands
`sim`	reboot PeloMon and switch to [simulator](/posts/2020/12/26/pelomon-part-ii-emulating-peloton/)
`reboot`	reboot
`freset`	factory reset (resets BLE module state and resistance lookup table)
`nolog`	disable console logging
`info`	set log level to INFO (default)
`debug`	set log level to DEBUG
`rlut`	dump resistance LUT
`ble`	dump BLE module state
`ride`	dump ride state

Little Tricks

There are a handful of little things in the code that I wanted to write up but don’t merit entire posts of their own, so here are some quick hits.

`<avr/pgmspace.h>`

The ATMega AVR is a Harvard architecture with code and data in separate memory address spaces. This has a couple consequences:

While constant data can be stored in the program address space, it must be copied into RAM using dedicated routines before it can be worked on.
C-level constant arrays (strings, etc.) by default take up working RAM, rather than just code space.

Saving RAM by moving strings to the program memory space and reading them back a byte at a time is familiar to Arduino programmers through the F() macro. It tells the compiler to store the string argument in program memory and casts the pointer to the special __FlashStringHelper* type, which signals a downstream overloaded function to treat the pointer as an address in program memory rather than RAM.

Going beyond Arduino, this is a technique generally usable on AVR micros, with the PROGMEM modifier and the PSTR macro. Notably, <avr/pgmspace.h> has versions of many useful C library routines suffixed with _P that take arguments from program space rather than RAM. For example, snprintf_P takes a constant format string from program memory and writes its output to RAM, allowing the constant string to be moved out of RAM.

Interrupt timers

In the original design of the PeloMon, the main loop() function drove a state machine that would process a single byte at a time. However, adding the BLE debug interface — which checked for input right after the state machine, before loop() returned — broke the functionality, because as noted above, the BLEUart’s available() function needed a timeout of 3ms, meaning that it never took less than 3ms to return. However, the Peloton bike can respond to a request from the HU in as little as 200 microseconds, so we can’t afford to wait 3ms after seeing the last byte from the HU — it’s critical once that byte has been seen to immediately switch over to listening to the Bike’s software serial interface to avoid missing a message.

However, it’s always possible that a glitch causes a message to be missed, or that a ride gets cancelled between the HU sending a request and the bike responding, and it is undesirable to have the PeloMon be unresponsive for an unbounded period of time while it waits for a bike message that may never come. One option would be to check millis() inside the inner loop waiting for the bike’s messages, but I chose an arguably more elegant, yet slightly more complicated option: using a built-in timer interrupt.

The AVR’s timer 0 is by default set up (by the Arduino runtime) to overflow about once per millisecond. The AVR allows attaching multiple interrupts to a single timer. Arduino uses the overflow interrupt on timer 0 to trigger the counter for millis, leaving two “exact value” comparator interrupts available. The PeloMon takes over the first one, setting it up to trigger an interrupt once per timer cycle (comparing to an arbitrary exact value) to decrement a “wait time remaining” value, which is checked in the inner loop of the wait.

Defining and setting up the timer is easy, despite the magic variable names:

// Set up an ISR on an arbitrary point in timer0 which ticks
// over at about 1KHz. Use this to time-limit our wait for
// bike responses and ensure user responsiveness.
// This variable is modified in an ISR so needs to be volatile.
volatile uint8_t bike_wait_ms_remaining;
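ISR(TIMER0_COMPA_vect) {
    // Stop at zero so a late check cannot miss the timeout by wrapping to 255.
    if (bike_wait_ms_remaining > 0) {
        bike_wait_ms_remaining--;
    }
}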
inline void enable_bike_timeout(void) {
    // Arbitrary value. Just need interrupt to fire once per timer cycle.
    OCR0A = 0xB0;
    TIMSK0 |= _BV(OCIE0A);
}
inline void disable_bike_timeout(void) {
    TIMSK0 &= ~(_BV(OCIE0A));
}

Then when we wait for the bike to respond, we set up this timer to make sure we don’t wait indefinitely:

enable_bike_timeout();
while (!bike_message_complete) {
    // Wait a max of 10-11ms at each byte
    // (might be 10 if the timer ticks over immediately after we assign)
    bike_wait_ms_remaining = 11;
    while (!peloton.bike_available()) {
        // If we have waited too long for the bike to respond, bail.
        if (bike_wait_ms_remaining == 0) {
            disable_bike_timeout();
            return false;
        }
    }
    uint8_t next_byte = peloton.bike_read();
...
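
For comparison, the millis() alternative mentioned earlier could be sketched roughly like this. It is not code from the PeloMon: the function-pointer hooks are hypothetical stand-ins so the snippet runs off-target, where on an Arduino they would be millis() and a check such as peloton.bike_available().

```cpp
#include <stdint.h>

// Hypothetical hooks standing in for millis() and the bike serial check.
typedef uint32_t (*clock_fn)(void);
typedef bool (*available_fn)(void);

// Spin until data is available or timeout_ms has elapsed.
// Returns true if data arrived in time, false on timeout.
inline bool wait_for_byte(clock_fn now_ms, available_fn available,
                          uint32_t timeout_ms) {
    const uint32_t start = now_ms();
    while (!available()) {
        // Unsigned subtraction stays correct when the millisecond
        // counter wraps around.
        if ((uint32_t)(now_ms() - start) >= timeout_ms) {
            return false;
        }
    }
    return true;
}
```

The trade-off is an extra clock read on every pass through the inner loop, whereas the interrupt version only compares a single byte.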

Horner’s Method

Earlier in the series I derived a two-piece polynomial regression to compute speed from power on the Peloton. Horner’s method allows efficiently evaluating a polynomial without having to explicitly evaluate powers of x; in this case, the third-order polynomial can be evaluated in only 4 adds and 3 multiplies:

const float rtpower = sqrt(power);
const float coefs_low[4] = {-0.07605, 0.74063, -0.14023, 0.04660};
const float coefs_high[4] = {0.00087, -0.05685, 2.23594, -1.31158};
const float* const coefs = power < 27.0f ? coefs_low : coefs_high;
float mph = 0;
for (uint8_t i=0; i < 3; i++) {
    mph += coefs[i];
    mph *= rtpower;
}
mph += coefs[3];
return mph;

(Yes, the coefficients are a little different than those shown in the earlier post. It doesn’t matter too much.)
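
Written out, with x = sqrt(power) and c_i = coefs[i] (highest-order coefficient first), the loop above evaluates the nested form on the right instead of the expanded form on the left:

```latex
\[
\mathrm{mph}(x) = c_0 x^3 + c_1 x^2 + c_2 x + c_3
               = \bigl((c_0 x + c_1)\,x + c_2\bigr)\,x + c_3,
\qquad x = \sqrt{P}
\]
```

Each pass of the loop performs one add and one multiply, and the final add of c_3 happens after the loop. The first add lands on the initial zero, which is why the count comes out to four adds rather than three.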

Conclusion

It works! It was a fun journey from physical layer signaling and mucking about with voltages all the way through writing high-level code to handle Bluetooth. Hope you found it interesting too.

As always, drop questions or comments on Twitter @ImranSHaque and tag them #pelomon!

PeloMon: The Code (Part IV)

Friends don't let friends write Bluetooth code.
