I created a web application where users upload images for prediction. The app is used mainly in mobile browsers, where people take photos with their phones and submit them for analysis.
Right now I’m using this setup:
- Server side: TensorFlow model with Django and Apache on a machine with RTX 3070 GPUs
I’m wondering about performance, though. Someone mentioned that uploading images over HTTP can be really slow, especially on mobile connections.
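For context, the client side of my current setup is just a multipart POST, roughly like this (the `/api/predict/` endpoint name is made up for illustration; mine differs):

```ts
// Send the user's photo to the Django backend and wait for the prediction.
async function uploadForPrediction(file: File): Promise<unknown> {
  const form = new FormData();
  form.append('image', file, file.name);
  const resp = await fetch('/api/predict/', { method: 'POST', body: form });
  if (!resp.ok) throw new Error(`Upload failed: ${resp.status}`);
  return resp.json();
}
```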
I’m thinking about switching to TensorFlow.js and running the model directly in the browser instead. That would eliminate the upload entirely, but inference would then run on each user’s phone and my GPU hardware wouldn’t be utilized.
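The browser-side code itself looks pretty small. Here’s a rough sketch of what I’d be running (the model URL, input size, and normalization are placeholders, not my actual model):

```ts
import * as tf from '@tensorflow/tfjs';

// Load a converted model once at startup (URL is a placeholder).
const modelPromise = tf.loadGraphModel('/models/classifier/model.json');

async function predictFromImage(img: HTMLImageElement): Promise<Float32Array> {
  const model = await modelPromise;
  const output = tf.tidy(() => {
    const input = tf.image
      .resizeBilinear(tf.browser.fromPixels(img).toFloat(), [224, 224]) // assumed input size
      .div(255)       // assumed [0, 1] normalization
      .expandDims(0); // add batch dimension
    return model.predict(input) as tf.Tensor;
  });
  const scores = (await output.data()) as Float32Array;
  output.dispose(); // free WebGL memory held by the output tensor
  return scores;
}
```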
Has anyone compared these two approaches? Which one typically performs better for mobile web apps that process images?
I’ve run into this exact problem in production. It really comes down to your model size and the devices you’re targeting. TensorFlow.js works great for lightweight models (roughly under 10 MB), but anything bigger risks crashing older phones and kills your page load times, since the model has to be downloaded before the first prediction. Your RTX 3070 is definitely going to crush mobile CPUs, usually by enough to absorb the network delay.

Either way, compress your images before upload and use WebP to speed up transfers. I’d actually go with a hybrid approach: preprocess images on the client to shrink file sizes, then send them to your GPU backend for the actual inference (see the sketch below). You get the best performance without breaking on weaker devices.
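Here’s a minimal sketch of that client-side preprocessing step using a canvas re-encode (the 1024 px cap and 0.8 quality are just starting points; tune them against your model’s accuracy):

```ts
// Downscale a camera photo and re-encode it before upload.
async function compressImage(file: File, maxDim = 1024, quality = 0.8): Promise<Blob> {
  const bitmap = await createImageBitmap(file);
  const scale = Math.min(1, maxDim / Math.max(bitmap.width, bitmap.height));
  const canvas = document.createElement('canvas');
  canvas.width = Math.round(bitmap.width * scale);
  canvas.height = Math.round(bitmap.height * scale);
  canvas.getContext('2d')!.drawImage(bitmap, 0, 0, canvas.width, canvas.height);
  bitmap.close(); // release the decoded bitmap
  return new Promise((resolve, reject) =>
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error('encoding failed'))),
      'image/webp', // browsers without WebP encoding fall back to PNG
      quality,
    ),
  );
}
```

In my experience this turns a multi-megabyte phone JPEG into a couple hundred kilobytes, which completely changes the upload math on a mobile connection.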
Interesting problem! What’s the typical file size of your users’ images? Have you measured upload times on mobile vs GPU processing times? I’m curious if the bottleneck is actually network speed or something else.