Inference is the process of generating output from a machine learning (ML) model. With Viam, you can run inference to generate outputs such as classifications and detections (bounding boxes) from your machines' data.
You can run inference locally on a Viam machine, or remotely in the Viam cloud.
You can use viam-server to deploy and run ML models directly on your machines.
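For example, you might configure an ML model service that runs a TFLite model on-device. The following is a minimal sketch assuming the TFLite CPU module (viam:mlmodel:tflite_cpu); the service name and file paths are placeholders you would replace with your own:

```json
{
  "services": [
    {
      "name": "my_mlmodel_service",
      "namespace": "rdk",
      "type": "mlmodel",
      "model": "viam:mlmodel:tflite_cpu",
      "attributes": {
        "model_path": "/path/to/model.tflite",
        "label_path": "/path/to/labels.txt",
        "num_threads": 1
      }
    }
  ]
}
```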
You can run inference on your machine in two ways: by configuring a vision service, or by calling the ML model service API directly from your own code.
Entry-level devices such as the Raspberry Pi 4 can run small ML models, such as TensorFlow Lite (TFLite). More powerful hardware, including the Jetson Xavier or Raspberry Pi 5 with an AI HAT+, can process larger AI models, including TensorFlow and ONNX.
Vision services apply an ML model to a stream of images from a camera to generate bounding boxes or classifications.
For configuration information, see the documentation for the specific vision service model you choose.
Some vision services include their own ML models, and thus do not require a deployed ML model. If your vision service does not include an ML model, you must deploy an ML model to your machine to use that service.
To use a vision service:

1. Visit the CONFIGURE page of the Viam app.
2. Click the + icon next to your main machine part and select Component or service.
3. Type in the name of the service and select a vision service.
4. If your vision service does not include an ML model, deploy an ML model to your machine to use that service.
5. Configure the service based on your use case (see the example config after these steps).
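For instance, a vision service that wraps a deployed ML model could be configured like the sketch below. The names are hypothetical; the mlmodel_name attribute is assumed to reference an ML model service (such as the one sketched above) already configured on the machine:

```json
{
  "name": "my_detector",
  "namespace": "rdk",
  "type": "vision",
  "model": "viam:vision:mlmodel",
  "attributes": {
    "mlmodel_name": "my_mlmodel_service"
  }
}
```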
To view the deployed vision service, use the live detection feed in the Viam app. The feed shows an overlay of detected objects or classifications on top of a live camera feed. On the CONFIGURE or CONTROL pages for your machine, expand the Test area of the service panel to view the feed.
For instance, you could use viam:vision:mlmodel with the EfficientDet-COCO ML model to detect a variety of objects, including people, bicycles, and apples, in a camera feed.
Alternatively, you could use viam-soleng:vision:openalpr to detect license plates in images. Since this service includes its own ML model, there is no need to configure a separate ML model.
After adding a vision service, you can use a vision service API method with a classifier or a detector to get inferences programmatically. For more information, see the APIs for ML Model and Vision.
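For example, the following sketch uses the Python SDK to get detections from a camera. It assumes an established machine connection named machine, a configured vision detector named "my_detector", and a camera named "my_camera":

```python
from viam.services.vision import VisionClient

# Get the vision service from an established machine connection
detector = VisionClient.from_robot(machine, "my_detector")

# Run the detector on the camera's current image
detections = await detector.get_detections_from_camera("my_camera")

for detection in detections:
    # Each detection has a label, a confidence score,
    # and pixel bounding box coordinates
    print(detection.class_name, detection.confidence,
          detection.x_min, detection.y_min,
          detection.x_max, detection.y_max)
```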
With the Viam SDK, you can pass image data to an ML model service, read the output annotations, and react to output in your own code.
Use the Infer method of the ML Model API to make inferences. For example:
from viam.services.mlmodel import MLModelClient
import numpy as np

# Get the ML model service from an established machine connection
my_mlmodel = MLModelClient.from_robot(robot=machine, name="my_mlmodel_service")

# Create dummy image data matching the model's expected input shape
image_data = np.zeros((1, 384, 384, 3), dtype=np.uint8)

# Create the input tensors dictionary
input_tensors = {
    "image": image_data
}

output_tensors = await my_mlmodel.infer(input_tensors)
Or, using the Go SDK:

// Assumes imports of go.viam.com/rdk/ml and gorgonia.org/tensor,
// and that myMLModel is an ML model service client
input_tensors := ml.Tensors{"0": tensor.New(tensor.WithShape(1, 2, 3), tensor.WithBacking([]int{1, 2, 3, 4, 5, 6}))}
output_tensors, err := myMLModel.Infer(context.Background(), input_tensors)
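In Python, the returned output tensors are a dictionary mapping tensor names to NumPy arrays. The names and shapes depend on the model you deployed; a minimal sketch of inspecting them:

```python
# Inspect the returned tensors; names and shapes vary by model
for name, tensor in output_tensors.items():
    print(f"{name}: shape={tensor.shape}, dtype={tensor.dtype}")
```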
Cloud inference enables you to run machine learning models in the Viam cloud instead of on a local machine. The cloud often provides more computing power than edge devices, allowing you to run larger models and get faster inference times.
You can run cloud inference using any TensorFlow model in the Viam registry, including private models owned by or shared with your organization.
To run cloud inference, you must pass the binary data ID of the data you want to run inference on, the name, version, and organization ID of the ML model, and the ID of the organization executing the inference. The viam infer CLI command runs inference in the cloud on a piece of data using the specified ML model:
viam infer --binary-data-id <binary-data-id> --model-name <model-name> --model-org-id <org-id-that-owns-model> --model-version "2025-04-14T16-38-25" --org-id <org-id-that-executes-inference>
Inference Response:
Output Tensors:
Tensor Name: num_detections
Shape: [1]
Values: [1.0000]
Tensor Name: classes
Shape: [32 1]
Values: [...]
Tensor Name: boxes
Shape: [32 1 4]
Values: [...]
Tensor Name: confidence
Shape: [32 1]
Values: [...]
Annotations:
Bounding Box Format: [x_min, y_min, x_max, y_max]
No annotations.
infer returns a list of detected classes or bounding boxes, depending on the output of the ML model you specified, as well as a list of confidence values for those classes or boxes.
This method returns bounding box output using proportional coordinates between 0 and 1, with the origin (0, 0) in the top left of the image and (1, 1) in the bottom right.
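For example, to convert a proportional bounding box to pixel coordinates, multiply each coordinate by the corresponding image dimension. The box values and image size below are hypothetical:

```python
# Proportional box [x_min, y_min, x_max, y_max] returned by the model
box = [0.25, 0.40, 0.75, 0.90]
img_width, img_height = 640, 480  # hypothetical image size

# Scale x coordinates by width and y coordinates by height
x_min_px = int(box[0] * img_width)   # 160
y_min_px = int(box[1] * img_height)  # 192
x_max_px = int(box[2] * img_width)   # 480
y_max_px = int(box[3] * img_height)  # 432
```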