Is this able to support more than 50 requests per second? Are there any benchmarks on the performance overhead of the underlying web server/routing that handles the requests?
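To make the question concrete, this is roughly the load test I'd run against it. The endpoint URL and payload here are made up, just a sketch:

```python
# Rough client-side throughput check (hypothetical endpoint and payload).
import time
import concurrent.futures
import urllib.request

URL = "https://example.com/models/my-model/predict"  # hypothetical
PAYLOAD = b'{"features": [1.0, 2.0, 3.0]}'           # hypothetical

def one_request(_):
    req = urllib.request.Request(
        URL, data=PAYLOAD, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()

N = 500  # total requests to send
start = time.time()
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
    list(pool.map(one_request, range(N)))
elapsed = time.time() - start
print(f"{N / elapsed:.1f} requests/second")
```

Even numbers from a crude script like this would help, since it includes the routing/serialization overhead and not just raw model latency.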
Looks interesting! How about models that require dictionaries - e.g. tf-idf, which needs a learned vocabulary to convert text into a feature vector? Does it allow for some preprocessing?
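For concreteness, the kind of model I have in mind is something like this sklearn pipeline, where the tf-idf vocabulary learned at training time has to ship alongside the classifier (sketch only; the data is illustrative):

```python
# The tf-idf vectorizer's vocabulary must travel with the classifier,
# e.g. by bundling both into a single sklearn Pipeline.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great product", "terrible service", "love it", "awful experience"]
labels = [1, 0, 1, 0]

model = Pipeline([
    ("tfidf", TfidfVectorizer()),   # learns the term -> index dictionary
    ("clf", LogisticRegression()),
])
model.fit(texts, labels)

# At serving time the raw string must pass through that same vocabulary:
print(model.predict(["great service"]))
```

If the service only accepts a bare classifier that takes numeric vectors, the client would have to reimplement the vectorization step, which is where this usually breaks down.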
I see that you accept models up to 1 GB. Inference time for models of that size is likely to be high on CPUs. Do you use GPUs to speed up inference for deep learning models?
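To illustrate the latency concern, a rough PyTorch sketch with a hypothetical model near that size; the layer dimensions are made up:

```python
# Illustrative only: timing a single forward pass on CPU vs. GPU
# for a model with roughly 0.5 GB of weights.
import time
import torch
import torch.nn as nn

model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(8)])  # ~0.5 GB fp32
x = torch.randn(1, 4096)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()
x = x.to(device)

with torch.no_grad():
    start = time.time()
    model(x)
    if device == "cuda":
        torch.cuda.synchronize()  # wait for GPU kernels to actually finish
    print(f"single inference on {device}: {(time.time() - start) * 1000:.1f} ms")
```

On CPU, per-request latency for models in this range can easily dominate whatever the web layer adds, so whether GPUs are in the serving path matters a lot.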