Okay, so today I’m gonna spill the beans on a project I tackled recently: reliable people counting in busy areas. It was a headache, I won’t lie, but we got there in the end. Let me walk you through it.

It all started with the client – a big retail chain. They wanted to know how many people were actually coming into their stores, not just guessing based on sales figures. The goal? Optimize staffing, measure marketing campaign effectiveness, and generally just get a better handle on customer flow. Sounds simple, right? Nope.

My first thought was, “Cameras! Obvious!” So I started digging into different types of cameras. We looked at basic IP cameras, more sophisticated thermal cameras, and even considered some fancy 3D depth-sensing ones. We even tried some cheapo ones from online stores just to see if they’d work. What I quickly found out is that just having a camera isn’t enough. You need brains behind it.

The problems started piling up immediately.

  • Lighting changes: Sunlight streaming in through the windows? Shadows playing tricks? Regular cameras struggled big time.
  • Overlapping bodies: People walking close together would often be counted as one person. Imagine lunchtime in a busy store – a complete mess!
  • Occlusion: Shelves, displays, pillars… anything that blocked the camera’s view would throw off the count.
  • Calibration: Getting the camera angles and detection zones just right was a pain. It felt like every day something would drift out of alignment.

So, hardware wasn’t the complete answer. I needed some serious software to make sense of the video feed. This is where things got interesting.

I started messing around with OpenCV – an open-source computer vision library. It’s powerful, but has a steep learning curve. I spent a good week just trying to get basic object detection working reliably. We even tried some pre-trained models for pedestrian detection, but they weren’t accurate enough in our specific environment.

Then, I stumbled upon some research papers on background subtraction and blob analysis. The idea is to create a model of the “empty” scene and then detect anything that moves differently. Sounds simple enough, but tuning the parameters was an art form. Too sensitive, and you’d get false positives from shadows or moving displays. Not sensitive enough, and you’d miss people.

We also started playing with some machine learning. I collected hours of video footage from the store entrances and used it to train a custom object detection model in TensorFlow. This helped a lot with the occlusion and overlapping issues: the model learned to recognize people even when they were partially hidden or standing close together.

But the biggest hurdle was still accuracy. We needed something like 95% accuracy to make the data useful. So, I started layering techniques. We used background subtraction to detect moving objects, then used the trained ML model to classify them as people. We also added some simple tracking algorithms to follow people as they moved through the frame.
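In simplified Python, the layering looked roughly like this. Everything here is a stand-in for the real thing: `classify_person` stubs out the trained TensorFlow model with a crude aspect-ratio check, and the nearest-neighbour `Tracker` is a toy version of the tracking we actually ran:

```python
def classify_person(blob):
    """Stub for the trained ML model; returns True if the blob is a person."""
    x, y, w, h = blob
    return h > w  # crude placeholder: people are usually taller than wide

class Tracker:
    """Minimal nearest-neighbour tracker that assigns stable IDs to blobs."""
    def __init__(self, max_dist=50):
        self.next_id = 0
        self.tracks = {}  # track id -> last known centroid (cx, cy)
        self.max_dist = max_dist

    def update(self, detections):
        assigned = {}
        for (x, y, w, h) in detections:
            cx, cy = x + w / 2, y + h / 2
            # Match to the closest existing track, if one is close enough.
            best = min(
                self.tracks.items(),
                key=lambda kv: (kv[1][0] - cx) ** 2 + (kv[1][1] - cy) ** 2,
                default=None,
            )
            if best is not None and (
                (best[1][0] - cx) ** 2 + (best[1][1] - cy) ** 2
            ) ** 0.5 < self.max_dist:
                tid = best[0]        # same person, keep the same ID
            else:
                tid = self.next_id   # a new person entered the frame
                self.next_id += 1
            assigned[tid] = (cx, cy)
        self.tracks = assigned
        return assigned

def count_people(frames_of_blobs):
    """Run the layered pipeline over pre-extracted blobs, one list per frame."""
    tracker = Tracker()
    for blobs in frames_of_blobs:
        people = [b for b in blobs if classify_person(b)]
        tracker.update(people)
    return tracker.next_id  # each new track ID = one counted person
```

The point of the layering is that each stage covers another's weakness: background subtraction finds motion cheaply, the classifier rejects non-person motion, and the tracker stops one person from being counted once per frame.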

To further improve accuracy, I incorporated a Kalman filter. Basically, it predicts where a person will be next based on their past motion, then blends that prediction with the new detection, trusting each according to how noisy it is. This smooths out jittery detections and prevents people from being counted multiple times as they move through the doorway.

Finally, we deployed the system. This involved setting up small computers at each store location, running the software, and sending the data to a central server. We had to deal with network issues, power outages, and all sorts of fun stuff. We wrote some scripts to automatically restart the system if anything crashed, and set up alerts to notify us of any problems.
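The restart logic can be as simple as a small supervisor loop. This is a simplified sketch — `supervise` and the alert hook are placeholders for whatever your deployment actually runs, and in production you'd more likely lean on something like systemd for the same job:

```python
import subprocess
import time

def supervise(cmd, max_restarts=None, alert=print, backoff=5):
    """Restart cmd each time it exits; fire an alert on every crash.

    max_restarts is for testing; leave it as None to supervise forever.
    """
    restarts = 0
    while max_restarts is None or restarts < max_restarts:
        proc = subprocess.Popen(cmd)
        proc.wait()
        alert(f"{cmd!r} exited with code {proc.returncode}; restarting")
        restarts += 1
        time.sleep(backoff)  # avoid a tight crash loop hammering the box
    return restarts
```

Point the `alert` hook at whatever notification channel you use (email, Slack, pager) and you get crash visibility for free along with the restarts.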

The results?

After weeks of tweaking and testing, we finally got the system working with an accuracy of around 92-95% in most locations. The client was thrilled. They started using the data to optimize staffing levels, adjust store layouts, and measure the impact of their marketing campaigns.

Lessons learned?

  • People counting is harder than it looks. Don’t underestimate the challenges of lighting, occlusion, and overlapping bodies.
  • A combination of techniques is usually needed. Don’t rely on a single algorithm or piece of hardware.
  • Data is king. The more training data you have, the better your machine learning models will perform.
  • Don’t forget about the practicalities. Deployment, maintenance, and reliability are just as important as accuracy.

It was a tough project, but I learned a ton. And now, hopefully, you’ve learned something too!