https://github.com/tensorflow/serving/tree/master/tensorflow_serving/g3doc
1. TF Serving example
# Pull the image
# docker (CPU)
docker pull tensorflow/serving
# nvidia-docker (GPU)
docker pull tensorflow/serving:latest-gpu
# Clone the serving repo (it includes the demo models)
git clone https://github.com/tensorflow/serving
# Location of demo models
TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"
# Start TensorFlow Serving container and open the REST API port
docker run -t --rm -p 8501:8501 \
-v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
-e MODEL_NAME=half_plus_two \
tensorflow/serving &
# Query the model using the predict API
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
# Returns => { "predictions": [2.5, 3.0, 4.5] }
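The demo model `saved_model_half_plus_two_cpu` computes y = 0.5x + 2, so the predictions returned above can be sanity-checked locally without the server:

```shell
# saved_model_half_plus_two computes y = 0.5 * x + 2.
# Reproduce the expected predictions for the instances sent above.
for x in 1.0 2.0 5.0; do
  awk -v x="$x" 'BEGIN { printf "%.1f\n", 0.5 * x + 2 }'
done
# Matches the REST response: 2.5, 3.0, 4.5
```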
2. TF Serving parameters
Example TF Serving invocation:
# Note: -e is a docker option and must come BEFORE the image name;
# flags after the image name are passed to tensorflow_model_server itself.
docker run -p 8501:8501 \
    --mount type=bind,source=/path/to/local/saved_model/,target=/models/model_name \
    -e MODEL_NAME=model_name \
    -t tensorflow/serving \
    --model_base_path=/models/model_name/ --per_process_gpu_memory_fraction=0.5
# or
docker run -t --rm -p 8501:8501 \
-v /path/to/local/saved_model/:/models/model_name \
-e MODEL_NAME=model_name tensorflow/serving:1.15.0 &
# (MODEL_NAME must match the mount target under /models/, here model_name)
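Besides the REST API on port 8501, the model server also exposes a gRPC endpoint on port 8500. A sketch publishing both ports (the host path and model_name are placeholders):

```shell
# Publish both the gRPC (8500) and REST (8501) ports of the model server.
# /path/to/local/saved_model/ and model_name are placeholders.
docker run -t --rm -p 8500:8500 -p 8501:8501 \
    -v /path/to/local/saved_model/:/models/model_name \
    -e MODEL_NAME=model_name \
    tensorflow/serving &
```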
[1] -p 8501:8501
- Maps the container's REST API port 8501 to the host.
[2] -t
- Allocates a pseudo-TTY. (It does not select the image; the image name simply follows the docker options.)
[3] --mount
- Mounts a host path into the container: type=bind selects bind-mount mode (note: no spaces are allowed between the key=value pairs), source is the absolute path of the model on the host, and target is the path inside the container.
[4] --per_process_gpu_memory_fraction
- Caps the fraction of GPU memory the serving process is allowed to allocate (e.g. 0.5 means at most half); only meaningful with the GPU image.
[5] MODEL_NAME
- The name under which the model is served; requests are addressed to /v1/models/<MODEL_NAME>.
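Note that the directory given to --model_base_path (or mounted under /models/<MODEL_NAME>) must contain numeric version subdirectories; TF Serving loads the model from the highest version number. A minimal sketch of the expected layout, using /tmp as an illustrative path with empty placeholder files:

```shell
# TF Serving expects versioned subdirectories under the model base path:
#   <base>/1/saved_model.pb and <base>/1/variables/
# Sketch the layout under /tmp (illustrative path, placeholder files only).
base=/tmp/model_name
mkdir -p "$base/1/variables"
touch "$base/1/saved_model.pb"
ls "$base"
ls "$base/1"
```

A SavedModel exported by TensorFlow already has this structure inside each version directory; the common mistake is mounting the version directory itself instead of its parent.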