Releases · pytorch/serve · GitHub

November 5, 2024 at 12:00 AM

Release Notes
Highlights Include

- GenAI updates
  - No code LLM deployments with TorchServe + vLLM & TensorRT-LLM using ts.llm_launcher script
  - OpenAI API support for TorchServe + vLLM
  - Integration of TensorRT-LLM engine
  - Stateful Inference on AWS Sagemaker (see blog)
- Support for linux-aarch64
  - CI & nightly regression added
  - Publish docker & KServe images
- PyTorch updates
  - Support for PyTorch 2.4
  - Deprecation of TorchText

PyTorch Updates

- upgrade to PyTorch 2.4 & deprecation of TorchText by @agunapal in #3289
- Resnet152 batch inference torch.compile example by @andrius-meta in #3259
- squeezenet torch.compile example by @wdvr in #3277

GenAI

- Implement stateful inference session timeout by @namannandan in #3263
- Use Case: Enhancing LLM Serving with Torch Compiled RAG on AWS Graviton by @agunapal in #3276
- Feature add openai api for vllm integration by @mreso in #3287
- Set vllm multiproc method to spawn by @mreso in #3310
- TRT LLM Integration with LORA by @agunapal in #3305
- Bump vllm from 0.5.0 to 0.5.5 in /examples/large_models/vllm by @dependabot in #3321
- Use startup time in async worker thread instead of worker timeout by @mreso in #3315
- Rename vllm dockerfile by @mreso in #3330

Support for linux-aarch64

- Adding Graviton Regression test CI by @udaij12 in #3273
- adding graviton docker image release by @udaij12 in #3313
- Fixing kserve nightly for arm64 by @udaij12 in #3319
- Docker aarch by @udaij12 in #3323

Documentation

- Security doc update by @udaij12 in #3256
- Remove compile note for hpu by @RafLit in #3271
- doc update of the rag usecase blog by @agunapal in #3280
- Add some hints for java devs by @mreso in #3282
- add TorchServe with Intel® Extension for PyTorch* guidance by @jingxu10 in #3285
- Update quickstart llm docker in serve/readme; added ts.llm_launcher example by @mreso in #3300
- typo fixes in HF Transformers example by @EFord36 in #3307
- docs: update WaveGlow links by @emmanuel-ferdman in #3317
- Fix typo: "a asynchronous" -> "an asynchronous" by @tadayosi in #3314
- Fix typo: vesion -> version, succsesful
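
Example: calling the OpenAI-style completions endpoint

The highlights mention OpenAI API support for TorchServe + vLLM. The sketch below shows roughly what a client request could look like, assuming a TorchServe instance has already been started (for example through the ts.llm_launcher script) and is listening locally. The endpoint URL, the port, the registered model name in the path, and the helper names `build_completion_request` and `send` are assumptions made for illustration; they are not taken from these release notes, so check them against your own deployment.

```python
import json
import urllib.request

# Assumed endpoint: default inference port 8080 and a model registered under
# the name "model". Both are illustrative and may differ in your setup.
ASSUMED_URL = "http://localhost:8080/predictions/model/1.0/v1/completions"


def build_completion_request(model_id, prompt, max_tokens=50, url=ASSUMED_URL):
    """Build (but do not send) an OpenAI-style completions request."""
    payload = {"model": model_id, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request to a running server and decode the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the request; call send(req) against a live server.
    req = build_completion_request(
        "meta-llama/Meta-Llama-3-8B-Instruct", "Hello, my name is"
    )
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it keeps the payload inspectable without a running server; pass the result to `send` once the endpoint is reachable.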

Published At

Tuesday, November 5, 2024

12:00:00 AM