# vllm

7 articles tagged with “vllm”

AI Infrastructure Exploitation

Methodology for exploiting GPU clusters, model-serving frameworks (Triton, vLLM, Ollama), Kubernetes ML platforms, and cloud AI services, and for cost-amplification attacks.

infrastructure, gpu, triton, vllm, ollama, kubernetes, cloud-ai, cost-amplification

Expert

Security Comparison of Model Serving Frameworks

In-depth security analysis of TorchServe, TensorFlow Serving, Triton Inference Server, and vLLM for production AI deployments

infrastructure, model-serving, torchserve, triton, vllm, vulnerability-analysis

Intermediate

vLLM Security Configuration

Security hardening for vLLM serving deployments including API authentication, resource limits, and input validation.

infrastructure, vllm, llm-serving, inference

Beginner

Lab: Exploiting Inference Servers

Attack vLLM, TGI, and Triton inference servers to discover information disclosure vulnerabilities, denial-of-service vectors, and configuration weaknesses in model serving infrastructure.

lab, inference-server, infrastructure, vllm, triton

Advanced

Model Serving Security

Security hardening for model serving infrastructure, covering vLLM, TGI, and Triton Inference Server configuration, API security, resource isolation, and deployment best practices.

model-serving, vllm, tgi, triton, inference, hardening

Intermediate

Lab Setup: Ollama, vLLM, and Docker Compose

Complete lab setup guide for AI red teaming: local model serving with Ollama and vLLM, GPU configuration, Docker Compose for multi-service testing environments.

lab-setup, ollama, vllm, docker

Intermediate

Testing vLLM Inference Deployments

Red team testing guide for models served via vLLM including batching, KV cache, and speculative decoding.

walkthroughs, platforms, vllm, inference

Advanced