Platform and SoM Knowledge Pool

What AI models can the EP-200Q support?

Written by Satoru Kumashiro | Nov 24, 2025 5:59:59 PM

AI inference on the EP-200Q

The EP-200Q can run AI inference on the CPU, the GPU, or the NPU. Here is a brief summary of the pros and cons of each processor. In general, we recommend the NPU, since it can be dedicated to AI inference, whereas the CPU and GPU share their resources with other tasks.

              CPU                         GPU                         NPU
Model format  FP32, FP16, Int16, Int8     FP32, FP16, Int16, Int8     Int16, Int8
Runtime       SNPE, QNN, TensorFlow Lite  SNPE, QNN, TensorFlow Lite  SNPE, QNN, TensorFlow Lite
Accuracy      High with FP32              High with FP32              Lower than FP32
Latency       Slow                        Slow                        Fast
Memory usage  Same for all three: number of AI model parameters × 4 bytes (FP32), × 2 bytes (FP16/Int16), or × 1 byte (Int8)
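To make the memory-usage row concrete, here is a minimal sketch of the estimate; the 25-million-parameter model size is only an illustrative assumption.

```python
# Rough weight-memory estimate following the "Memory usage" rule in the table:
# bytes per parameter depend on the model's numeric format.
BYTES_PER_PARAM = {"FP32": 4, "FP16": 2, "Int16": 2, "Int8": 1}

def weight_memory_mb(num_params: int, fmt: str) -> float:
    """Approximate weight memory in MB for a given parameter format."""
    return num_params * BYTES_PER_PARAM[fmt] / (1024 * 1024)

num_params = 25_000_000  # hypothetical 25-million-parameter model
for fmt in ("FP32", "FP16", "Int8"):
    print(f"{fmt}: ~{weight_memory_mb(num_params, fmt):.0f} MB")
# Prints roughly: FP32 ~95 MB, FP16 ~48 MB, Int8 ~24 MB
```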

 
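If you use the TensorFlow Lite runtime, offloading inference to the NPU is typically done through a delegate. The sketch below assumes a Qualcomm QNN TensorFlow Lite delegate library named libQnnTFLiteDelegate.so and a placeholder model path; the exact delegate name, options, and setup depend on the SDK shipped with the EP-200Q, so treat this only as an outline.

```python
import numpy as np
import tensorflow as tf

# Minimal sketch: run an Int8 TFLite model through a Qualcomm delegate so the
# work is offloaded to the NPU instead of the CPU.
# NOTE: the delegate library name and the empty option set are assumptions;
# check the EP-200Q SDK documentation for the exact delegate configuration.
npu_delegate = tf.lite.experimental.load_delegate("libQnnTFLiteDelegate.so")

interpreter = tf.lite.Interpreter(
    model_path="model_int8.tflite",        # placeholder path
    experimental_delegates=[npu_delegate],
)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Feed a dummy input just to exercise the pipeline; replace with real data.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()
result = interpreter.get_tensor(output_details[0]["index"])
print(result.shape)
```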

What if the available AI models are in a different framework?

We provide a Dockerfile that builds a development environment for converting AI models from ONNX, TensorFlow, and PyTorch into one of the usable runtimes (SNPE, QNN, or TensorFlow Lite).
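As one example of what such a conversion step can look like inside that environment, the sketch below quantizes a TensorFlow SavedModel to an Int8 TensorFlow Lite model, which matches the NPU's Int8 requirement from the table above. The SavedModel path, input shape, and calibration data are placeholders.

```python
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Yield a few calibration samples so the converter can pick quantization
    # ranges; replace the random data with real preprocessed inputs.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")  # placeholder path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```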

This Docker-based environment is one workflow for converting ONNX, TensorFlow, and PyTorch models and developing AI applications. Another option is to use Qualcomm AI Hub, which requires the end user to obtain a Qualcomm ID.

Please note that there are some constraints when converting ONNX models.
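The exact constraints depend on the converter and SDK version, but a common first step is to validate the ONNX model and check which opset and operators it uses before attempting conversion; a minimal sketch:

```python
import onnx

# Load the ONNX model and confirm it is structurally valid.
model = onnx.load("model.onnx")  # placeholder path
onnx.checker.check_model(model)

# Converters usually support only certain opset versions and operators,
# so it helps to inspect them up front.
print("Opsets:", [(imp.domain or "ai.onnx", imp.version) for imp in model.opset_import])
print("Operators used:", sorted({node.op_type for node in model.graph.node}))
```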


If you're interested in learning more about the development environment for the EP-200Q, please contact us.
