Model Inference

After the model is trained and stored in an S3 bucket, the next step is to use it for inference.

This chapter explains how to use the previously trained model and run inference using TensorFlow and Keras on Amazon EKS.

Run inference pod

The model from training was stored in the S3 bucket in the previous section. Make sure the S3_BUCKET and AWS_REGION environment variables are set correctly.

curl -LO
envsubst <mnist-inference.yaml | kubectl apply -f -
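The `envsubst` step replaces shell-style variables in the manifest with values from your environment before the file reaches `kubectl`. As a rough illustration of what that substitution does (the manifest fragment and values below are made up, not the real mnist-inference.yaml), Python's `string.Template` behaves similarly:

```python
from string import Template

# Hypothetical values standing in for your real environment variables.
env = {"S3_BUCKET": "my-mnist-bucket", "AWS_REGION": "us-west-2"}

# A made-up manifest fragment; the real mnist-inference.yaml differs.
manifest = Template("""\
env:
  - name: S3_BUCKET
    value: $S3_BUCKET
  - name: AWS_REGION
    value: $AWS_REGION
""")

# Substitute each $VAR with its value, like envsubst does.
rendered = manifest.substitute(env)
print(rendered)
```

If a referenced variable is unset, `envsubst` silently substitutes an empty string, so it is worth double-checking both variables before applying the manifest.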

Wait for the container to start, then run the following command to check its status:

kubectl get pods -l app=mnist,type=inference

You should see output similar to the following:

NAME                    READY   STATUS      RESTARTS   AGE
mnist-96fb6f577-k8pm6   1/1     Running     0          116s

Now, we are going to use Kubernetes port forwarding to expose the inference endpoint for local testing:

kubectl port-forward `kubectl get pods -l=app=mnist,type=inference -o jsonpath='{.items[0].metadata.name}' --field-selector=status.phase=Running` 8500:8500

Leave the current terminal running and open a new terminal to install TensorFlow.

Install packages

Install the tensorflow package:

curl -O
python3 --user
pip3 install tensorflow --user

Run inference

Use the script to make a prediction request. It will randomly pick one image from the test dataset and make a prediction.

curl -LO
python --endpoint http://localhost:8500/v1/models/mnist:predict
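The script POSTs a JSON body to TensorFlow Serving's REST predict endpoint; the request has the shape shown in the output below. A minimal sketch of building such a payload by hand, using a blank 28x28 grayscale image as a stand-in for a real Fashion-MNIST test sample:

```python
import json

# A blank 28x28 grayscale image: 28 rows of 28 single-channel pixels,
# standing in for a real Fashion-MNIST test image.
image = [[[0.0] for _ in range(28)] for _ in range(28)]

payload = {
    "instances": [image],            # batch of one image
    "signature_name": "serving_default",
}

# This body would be POSTed to
# http://localhost:8500/v1/models/mnist:predict
body = json.dumps(payload)
print(body[:80])
```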


Data: {"instances": [[[[0.0], [0.0], [0.0], [0.0], [0.0] ... 0.0], [0.0]]]], "signature_name": "serving_default"}
The model thought this was a Ankle boot (class 9), and it was actually a Ankle boot (class 9)
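TensorFlow Serving replies with a JSON body containing a `predictions` array of ten class scores, and the script maps the highest-scoring index to the Fashion-MNIST label names. A sketch of that decoding step, using a made-up response in which class 9 scores highest:

```python
import json

# Fashion-MNIST class names; index 9 is "Ankle boot".
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up response body; a real TensorFlow Serving reply has the same shape.
response_body = json.dumps({
    "predictions": [[0.0, 0.0, 0.01, 0.0, 0.0, 0.0, 0.02, 0.0, 0.0, 0.97]]
})

scores = json.loads(response_body)["predictions"][0]
predicted = scores.index(max(scores))  # index of the highest score
print(f"The model thought this was a {CLASS_NAMES[predicted]} (class {predicted})")
# prints: The model thought this was a Ankle boot (class 9)
```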


Now that we have seen how to run a training job and inference, let’s terminate these pods to free up resources:

kubectl delete -f mnist-training.yaml

kubectl delete -f mnist-inference.yaml