Why Auto-Rotating a Football Heat Map Is Harder Than It Looks

A few weeks ago I added a GPS heat map to Atmos Football. Players upload a TCX file from their fitness watch after a 5-a-side match, and the app draws a Strava-style heat map showing where they spent their time on the pitch.

It worked. The colours behaved. The blending was smooth. There was just one problem: the pitch was almost never aligned to the screen.

A football pitch is a rectangle. A laptop screen is a rectangle. There is one obvious thing to do: rotate the map so that the pitch's long edge runs along the screen's long edge. On a wide PC screen, the pitch should be horizontal. On a phone in portrait, the pitch should be vertical. The whole pitch ends up filling the screen, every player position gets pixels, and you don't waste a third of the viewport on the grass outside.

I tried to build this. I tried four different algorithms. Each one was more sophisticated than the last. And each one was eventually defeated by the same problem — a problem that I now suspect has no clean algorithmic answer.

This post is the story of those four attempts, what each one was supposed to fix, and the conclusion I eventually arrived at: when you cannot solve a problem with maths, you let the user solve it once and you remember the answer.

What "auto-rotate" needs to do

Before getting into the algorithms, the goal:

1. Take a list of GPS points (lat/lon pairs) from a single player's match.
2. Figure out the long axis of the pitch they played on.
3. Return a rotation angle that will align that long axis with the long edge of whatever screen the player is viewing on.

If the player is on a laptop (landscape), rotate so the long axis is horizontal. If they're on a phone (portrait), rotate so the long axis is vertical.

The maths-y part is step 2. There are no labelled "this is a pitch corner" markers in the GPS data. All you have is a cloud of 4,000-ish points and a hope that the cloud's shape gives away the underlying geometry.

Attempt 1: PCA (Principal Component Analysis)

The textbook answer to "what is the principal direction of a cloud of points?" is PCA.

You compute the 2×2 covariance matrix of the (latitude, longitude) coordinates. The eigenvector corresponding to the larger eigenvalue points along the direction of maximum spread. For a long, thin rectangle of GPS pings, that direction should be the rectangle's long axis. Job done.
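
For a 2×2 covariance matrix the eigen-decomposition has a closed form, so the whole calculation fits in a few lines. What follows is a minimal Python sketch, not the app's actual code. The function names are illustrative, and it assumes the points are first projected from degrees to local metres, because a degree of longitude covers less ground than a degree of latitude and raw degrees would skew the axis. Angles are measured counter-clockwise from east, so a north-south pitch comes out near 90°.

```python
import math

M_PER_DEG_LAT = 111_320.0  # rough metres per degree of latitude (assumed constant)


def to_local_metres(latlon):
    """Project (lat, lon) degrees to (east, north) metres about the centroid.

    Equirectangular approximation: adequate over the ~100 m span of a pitch.
    """
    lat0 = sum(lat for lat, _ in latlon) / len(latlon)
    lon0 = sum(lon for _, lon in latlon) / len(latlon)
    shrink = math.cos(math.radians(lat0))  # longitude degrees are shorter
    return [((lon - lon0) * M_PER_DEG_LAT * shrink,
             (lat - lat0) * M_PER_DEG_LAT) for lat, lon in latlon]


def principal_axis(points):
    """Return (angle_deg, major_var, minor_var) for 2-D points in metres.

    angle_deg is measured counter-clockwise from east and wrapped modulo 180.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigen-decomposition of [[sxx, sxy], [sxy, syy]].
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    mean = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    return math.degrees(angle) % 180.0, mean + spread, mean - spread
```

The two variances come back alongside the angle because their ratio says how elongated the cloud is, which matters as soon as the data stops being a clean rectangle.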

I implemented this in a few dozen lines. It worked beautifully on test data: purely synthetic N-S, E-W, and 45° diagonal "pitches" all returned the right angle to within a degree.

Then I tried it on a real game.

The heat map cluster looked horizontal-ish on screen, but the auto-rotate had picked an angle that made it look diagonal. I'd open another game and it would rotate sideways. The whole feature was a coin flip.

The first failure mode turned out to be GPS drift.

Why GPS data isn't a clean rectangle

When a player kicks the ball into a hedge and runs to fetch it, the GPS unit on their watch doesn't stop recording. You get a long line of points trailing off the pitch in whatever direction they wandered.

This kind of "drift" (sustained off-pitch excursions of 30–60 seconds) is murder for PCA. The covariance calculation squares each point's distance from the centroid, which means a single point 50 metres from the centroid contributes 100× as much to the variance as a point 5 metres from it. A short trail of 30 wayward points along a single bearing can pull the principal axis dramatically toward that bearing.

In my first attempt I had added a basic guard against this: clip the top and bottom 2% of points on each axis before running PCA. That removes the worst single-point GPS glitches.
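
The clip itself is only a few lines. A sketch in Python, with an illustrative function name and points given as (x, y) tuples in metres:

```python
def percentile_clip(points, fraction=0.02):
    """Keep points whose x and y both lie inside the central band.

    The cut-offs sit `fraction` of the way in from each end of the sorted
    x values and of the sorted y values.
    """
    n = len(points)
    k = int(n * fraction)  # how many values to skip at each end
    if k == 0:
        return list(points)
    xs = sorted(x for x, _ in points)
    ys = sorted(y for _, y in points)
    x_lo, x_hi = xs[k], xs[n - 1 - k]
    y_lo, y_hi = ys[k], ys[n - 1 - k]
    return [(x, y) for x, y in points
            if x_lo <= x <= x_hi and y_lo <= y <= y_hi]
```

With 4,000 points and a fraction of 0.02, the cut-offs sit 80 values in from each end of each axis. The clip knows nothing about trails; it only trims the most extreme coordinates.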

It was not enough. A 2% clip on a 4,000-point recording removes 80 points from each tail, which can be fewer than a single ball-fetching excursion produces. The remaining hundreds of drift points stayed in the PCA, and the principal axis kept locking onto the drift direction instead of the pitch direction.

Attempt 2: density-based filtering

The fix seemed obvious once I named the failure: I needed to throw out everything that isn't part of the dense play area.

A football match is, geometrically, a dense rectangle of frequently-visited cells (the pitch) plus occasional sparse trails leading off it (the drift). If I could detect and keep only the dense cells, the trails would vanish before PCA ever saw them.

The algorithm:

Lay down a 20×20 grid over the bounding box of all the GPS points.
Count how many points fall in each cell.
Throw away every cell whose count is below half the median count of the non-empty cells.
Run PCA on the points that remain.
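
Those steps could be sketched in Python like this. The function name and parameter names are illustrative; the 20×20 grid and the half-of-median threshold come from the description above, and the returned points are what the PCA stage would then see.

```python
from statistics import median


def density_filter(points, grid=20, keep_fraction=0.5):
    """Drop points that fall in sparsely visited cells of a grid x grid lattice."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x0, y0 = min(xs), min(ys)
    width = (max(xs) - x0) or 1.0   # avoid dividing by zero on a degenerate box
    height = (max(ys) - y0) or 1.0

    def cell(p):
        # Points on the far edge of the bounding box go in the last cell.
        cx = min(int((p[0] - x0) / width * grid), grid - 1)
        cy = min(int((p[1] - y0) / height * grid), grid - 1)
        return cx, cy

    counts = {}
    for p in points:
        c = cell(p)
        counts[c] = counts.get(c, 0) + 1

    # Only non-empty cells are in `counts`, so this is the median non-empty count.
    threshold = keep_fraction * median(counts.values())
    return [p for p in points if counts[cell(p)] >= threshold]
```

One property worth noticing is that the grid is stretched over the bounding box of all the points, so a very long trail makes every cell larger and the filter coarser.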

On synthetic data with drift, this worked brilliantly. I wrote a test that deliberately injected a long off-pitch trail into a clean N-S pitch, and the algorithm calmly returned a near-N-S bearing.

Confidence high, I shipped it.

Why the density filter wasn't the whole answer

The next game I looked at, the rotation was still wrong.

I dug in. The density filter had done exactly what it was meant to do: it had thrown away the sparse off-pitch cells and produced a clean cloud of dense play points. PCA on those clean points had then confidently picked an axis.

The axis was wrong by about 90°.

This was the moment I realised that the density filter, by itself, doesn't solve the problem. It just changes the failure mode. Once the drift was gone, PCA was operating on a clean cloud that still had a principal direction, and that direction wasn't necessarily the pitch's.

Attempt 3: iterative refinement

I added a third stage. After the initial PCA, I would compute each point's perpendicular distance from the principal axis, drop any point further than 2σ from the axis, and re-run PCA on the survivors. The idea was that any residual drift that survived the density filter would sit far off to the side of the axis, and a second pass would clean it up.
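
One pass of that refinement might look like the following Python sketch. It assumes σ is the root-mean-square perpendicular distance from the axis, which the description above leaves open, and the function names are illustrative.

```python
import math


def principal_angle(points):
    """Direction of maximum variance, in degrees counter-clockwise from east."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy)) % 180.0


def refine(points, n_sigma=2.0, passes=1):
    """Fit an axis, drop points more than n_sigma * sigma from it, and refit."""
    angle = principal_angle(points)
    for _ in range(passes):
        n = len(points)
        mx = sum(x for x, _ in points) / n
        my = sum(y for _, y in points) / n
        t = math.radians(angle)
        nx, ny = -math.sin(t), math.cos(t)  # unit normal to the axis
        # Signed perpendicular distance of each point from the axis line.
        dist = [(x - mx) * nx + (y - my) * ny for x, y in points]
        sigma = math.sqrt(sum(d * d for d in dist) / n)
        points = [p for p, d in zip(points, dist) if abs(d) <= n_sigma * sigma]
        angle = principal_angle(points)
    return angle, points
```

The weakness is visible in the code: the points that get dropped are chosen relative to the first axis, so if that axis is wrong, the pass removes exactly the evidence that would have contradicted it.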

For some games this helped. For others it made things subtly worse — particularly on a roughly-square pitch where the "principal axis" was already a coin flip between two near-equal-variance directions, and the refinement just tilted the result more confidently in the wrong direction.

I kept all three stages — the percentile clip, the density filter, and the iterative refinement — because they each prevent a real failure mode that I have test cases for. But together, they still didn't reliably produce a sensible rotation on the real game I kept coming back to.

I tried a fourth approach.

Attempt 4: oriented bounding box

If PCA is asking "which direction is the spread biggest", the alternative geometric question is "which rotation makes the smallest enclosing rectangle?". The minimum-area-rectangle algorithm gives you a rotation such that the data fits into a tight bounding box, and the long edge of that box is, intuitively, the pitch's long edge.
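
A common way to find the minimum-area rectangle relies on the fact that one of its sides always lies along an edge of the convex hull, so it is enough to try each hull edge in turn. The Python sketch below does that with a monotone-chain hull. It is an illustration with made-up function names, not the implementation used in the app, and it is quadratic in the number of hull points, which is fine for hulls this small.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (long_edge_angle_deg, long_side, short_side) for non-empty points.

    The angle is counter-clockwise from east, wrapped modulo 180.
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        t = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(t), math.sin(t)
        # Extent of the hull along this edge (u) and perpendicular to it (v).
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        du, dv = max(us) - min(us), max(vs) - min(vs)
        if best is None or du * dv < best[0]:
            long_angle = t if du >= dv else t + math.pi / 2.0
            best = (du * dv, math.degrees(long_angle) % 180.0,
                    max(du, dv), min(du, dv))
    return best[1], best[2], best[3]
```

When the two sides come back nearly equal, the reported angle says very little, because a small change in the data can flip it by 90°.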

I implemented it. The minimum-area rectangle for my problem game came out as 48 metres by 50 metres.

That is essentially square.

And here, finally, was the real problem.

What none of the algorithms could see

The pitch in the satellite image had a clearly visible long axis — about 25° off vertical, running roughly north-west to south-east. Anyone looking at the rectangle on the map would tell you which way it pointed in five seconds.

But the player whose GPS data I was feeding the algorithm hadn't really used the long axis. His movement pattern was concentrated near the centre, with slightly more spread along the short axis of the pitch than the long one. (5-a-side games tend to be like this — the action collapses around the ball and most of the running is lateral, not lengthwise.)

So:

PCA saw the actual data and correctly identified its principal axis — which was perpendicular to the pitch's long axis.
The oriented bounding box (OBB) saw a nearly-square cloud of points and correctly identified that there was no clear long edge.
Iterative refinement confidently sharpened the wrong answer.
Density filtering removed the drift but couldn't conjure a long axis where the player didn't make one.

Every algorithm was mathematically correct about the data it was looking at. The problem was that the data didn't represent the pitch. It represented the player.

There are three reasons the GPS data and the pitch geometry diverge:

Players don't always run along the pitch edges. Especially in 5-a-side, the natural axis of play depends on the ball, not the goal lines. A defender holding their line will spread more side-to-side than goal-to-goal. A keeper barely moves at all.
Some pitches are squarer than others. Five-a-side outdoor pitches range from a 2:1 long rectangle to nearly square. On a square pitch, no algorithm reading the GPS can pick a long axis — there isn't one.
GPS drift and out-of-bounds fetching add their own direction. This is the noise the first three filtering stages were designed to remove. It is solvable; it just isn't the only problem.

The first two of these are not solvable by looking at one game's GPS data. They might not be solvable by looking at any number of games' GPS data, because some players really do play more across the pitch than along it.

What I shipped instead

After a few rounds of progressively cleverer algorithms each failing in a different way, I changed the question.

Instead of "how do I auto-detect the pitch direction from this game?", I asked "what is actually changing between games?"

The pitch isn't moving. The same group plays at the same venue every week. The pitch direction is a fixed geometric fact, and it is wrong for the algorithm to keep rediscovering it from scratch.

So I added a button: Save for pitch. When the organiser rotates the map to the correct angle and clicks it, the rotation is converted into an orientation-independent axis, a single angle in [0°, 180°), and stored on the group document in Firestore. On every subsequent heat map open, the app reads the saved axis and converts it into a screen rotation appropriate for whatever viewport the viewer is on: horizontal on a desktop, vertical on a phone, automatically adapting on resize.
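
The conversion in both directions is small. The Python sketch below assumes a map whose rotation is expressed as a bearing, meaning the compass direction shown at the top of the screen, and treats any viewport at least as wide as it is tall as landscape. Both function names are hypothetical, and the Firestore read and write are left out.

```python
def bearing_to_axis(bearing_deg, viewport_w, viewport_h):
    """Compass direction of the pitch's long axis, in [0, 180).

    Assumes the organiser has lined the pitch up with the long edge of
    their own viewport before saving.
    """
    landscape = viewport_w >= viewport_h
    # In landscape the long axis runs across the screen, 90 degrees from "up".
    return (bearing_deg + (90.0 if landscape else 0.0)) % 180.0


def axis_to_bearing(axis_deg, viewport_w, viewport_h):
    """Map bearing that lays the saved axis along this viewport's long edge."""
    landscape = viewport_w >= viewport_h
    return (axis_deg - (90.0 if landscape else 0.0)) % 180.0


saved = bearing_to_axis(65.0, 1920, 1080)      # organiser on a laptop
on_phone = axis_to_bearing(saved, 390, 844)    # viewer in portrait
on_laptop = axis_to_bearing(saved, 1920, 1080)
```

Here a rotation of 65° saved on a laptop is stored as an axis of 155°. A phone in portrait reads that back as a bearing of 155°, and a laptop gets 65° again. Because the axis is only defined modulo 180°, the bearing that comes back can be the original turned by half a circle, which shows the same pitch with its ends swapped.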

The auto-detection from GPS is still in place as a fallback for first-time use and for groups that haven't saved a direction. But the algorithmic answer is now the prior — not the final word.

Lessons

A few takeaways from this:

Don't fight the data when there's a cheaper signal. The pitch direction is information that the human already knows perfectly. Trying to recover it from the consequence-of-the-pitch (the GPS data) instead of asking the human directly is a lot of work for less reliability.

Each filtering stage solves one failure mode. The percentile clip, density filter, and iterative refinement still ship — they each fix a real, reproducible bug that the previous algorithms had. They're now the fallback, not the primary. That's the right place for an algorithm that's correct 70% of the time.

A bad confident answer is worse than no answer. A small confidence guard ("if the eigenvalue ratio is too low, don't auto-rotate, just leave it north-up") would have made this less frustrating much earlier. PCA on a square cloud confidently picks an axis that's basically random, and the user has no idea the algorithm was guessing.
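
Such a guard could be as small as the Python sketch below. The threshold of 1.5 on the ratio of the two eigenvalues is a placeholder that would need tuning against real games, and the function name is illustrative.

```python
import math


def guarded_axis(points, min_ratio=1.5):
    """Principal-axis angle (degrees counter-clockwise from east, modulo 180),
    or None when the cloud is too round for the axis to mean anything."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n

    # Eigenvalues of the 2x2 covariance matrix in closed form.
    mean = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major, minor = mean + spread, mean - spread

    if major <= 0.0:
        return None  # every point is in the same place
    if minor > 0.0 and major / minor < min_ratio:
        return None  # near-equal variances: the axis is close to arbitrary
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy)) % 180.0
```

A caller that gets None back would leave the map north-up instead of rotating it.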

Orientation should be independent of viewport. Saving the literal rotation (e.g. "65°") would have broken the moment the user switched from a laptop to a phone. Saving the underlying axis (a property of the pitch, not the screen) means the same value works in any aspect ratio.

The heat map looks great now. It rotates to the right angle, every time, on every device. The expensive maths runs once, in the background, only when there isn't a saved answer yet.

I should probably stop trying to be clever about it.