# vllm
7 articles tagged "vllm"
- **AI Infrastructure Exploitation**: Methodology for exploiting GPU clusters, model serving frameworks (Triton, vLLM, Ollama), Kubernetes ML platforms, cloud AI services, and cost amplification attacks.
- **Security Comparison of Model Serving Frameworks**: In-depth security analysis of TorchServe, TensorFlow Serving, Triton Inference Server, and vLLM for production AI deployments.
- **vLLM Security Configuration**: Security hardening for vLLM serving deployments, including API authentication, resource limits, and input validation.
- **Lab: Inference Server Exploitation**: Attack vLLM, TGI, and Triton inference servers to discover information disclosure vulnerabilities, denial-of-service vectors, and configuration weaknesses in model serving infrastructure (a minimal probe sketch follows this list).
- **Model Serving Security**: Security hardening for model serving infrastructure, covering vLLM, TGI, and Triton Inference Server configuration, API security, resource isolation, and deployment best practices.
- **Lab Setup: Ollama, vLLM & Docker Compose**: Complete lab setup guide for AI red teaming, covering local model serving with Ollama and vLLM, GPU configuration, and Docker Compose for multi-service testing environments.
- **Testing vLLM Inference Deployments**: Red team testing guide for models served via vLLM, covering batching, KV cache, and speculative decoding.
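
As a taste of what the inference-server labs cover, here is a minimal sketch of an unauthenticated-endpoint probe. It assumes a vLLM OpenAI-compatible server reachable at `localhost:8000` (a hypothetical target); the `/v1/models` and `/metrics` paths are standard for vLLM's server, and a `200` response without a bearer token indicates the deployment was started without an API key.

```python
import requests  # assumes the 'requests' package is installed

BASE = "http://localhost:8000"  # hypothetical vLLM endpoint under test


def probe(path: str, token: str | None = None) -> None:
    """Request an endpoint and report whether it answers without auth."""
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    resp = requests.get(f"{BASE}{path}", headers=headers, timeout=5)
    print(f"{path}: HTTP {resp.status_code}")
    if resp.ok:
        # A 200 with no token means the endpoint is unauthenticated.
        print(resp.text[:200])


# /v1/models leaks the served model name; /metrics exposes Prometheus stats.
for path in ("/v1/models", "/metrics"):
    probe(path)
```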