About World Labs
At World Labs, our mission is to revolutionize artificial intelligence by developing Large World Models, taking AI beyond language and 2D visuals into the realm of complex 3D environments, both virtual and real. We're the team envisioning a future where AI doesn't just process information but truly understands and interacts with the world around it.
We're looking for the overachievers, the visionaries, and the relentless innovators who aren't satisfied with the status quo. You know that person who's always dreaming up the next big breakthrough? That's us. And we want you to be part of it.
Model Optimization Engineer
World Labs
On-site
San Francisco, CA, United States
Full-time
$150,000 - $250,000
About the Role
We're seeking an experienced engineer to bridge the gap between our research team's state-of-the-art models and production-ready inference systems. You'll take PyTorch research code and transform it into highly optimized, low-latency inference solutions.
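For flavor, here is a minimal sketch of that hand-off, not World Labs code: the `ResearchModel` stand-in, its dimensions, and the `prepare_for_inference` helper are all hypothetical, but the pattern of taking an eager-mode research module and readying it for low-latency serving with FP16 weights and `torch.compile` is representative of the work.

```python
import torch
import torch.nn as nn

class ResearchModel(nn.Module):
    """Hypothetical stand-in for a model handed off by a research team."""
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def prepare_for_inference(model: nn.Module) -> nn.Module:
    model.eval()                          # freeze dropout/batchnorm behavior
    if torch.cuda.is_available():
        model = model.half().cuda()       # FP16 weights for faster GPU inference
    # Fuse ops and generate kernels; "reduce-overhead" targets small-batch latency.
    return torch.compile(model, mode="reduce-overhead")

model = prepare_for_inference(ResearchModel())
x = torch.randn(8, 1024)
if torch.cuda.is_available():
    x = x.half().cuda()
with torch.inference_mode():              # skip autograd bookkeeping at serve time
    y = model(x)
```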
Qualifications
3+ years optimizing deep learning models for production inference.
Expert-level PyTorch and CUDA programming experience.
Hands-on experience with model quantization (INT8/FP16) and inference frameworks (TensorRT, ONNX Runtime); a brief quantization sketch follows this list.
Proficiency in GPU profiling tools and performance analysis.
Experience with multi-GPU inference and model serving at scale.
Strong understanding of transformer architectures and modern ML model optimization techniques.
Preferred:
Custom CUDA kernel development experience.
Experience with Triton, vLLM, or similar high-performance serving frameworks.
Background in both research and production ML environments.
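To make the quantization requirement concrete, below is a hedged sketch using stock PyTorch's post-training dynamic INT8 quantization. The toy model and tensor shapes are made up, and a production deployment would more likely go through TensorRT or ONNX Runtime as noted above; the point is the accuracy-versus-latency trade-off this kind of work manages.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

# Dynamic quantization: weights are stored as INT8 and activations are
# quantized on the fly at runtime. CPU-oriented, but it illustrates the
# workflow: quantize, then verify accuracy against the FP32 baseline.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.inference_mode():
    baseline = model(x)
    approx = quantized(x)
# Check how far the quantized outputs drift from the FP32 reference.
print("max abs error:", (baseline - approx).abs().max().item())
```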
Responsibilities
Optimize neural network models for inference through quantization, pruning, and architectural modifications while maintaining accuracy.
Profile and benchmark model performance to identify computational bottlenecks; see the profiling sketch after this list.
Implement optimizations using torch.compile, custom CUDA kernels, and specialized inference frameworks.
Deploy multi-GPU inference solutions with efficient model parallelism and serving architectures.
Collaborate with research teams to ensure optimization techniques integrate smoothly with model development workflows.
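As a concrete illustration of the profiling work described above, here is a minimal `torch.profiler` loop (the toy model and shapes are hypothetical) that ranks operators by self time, which is typically the first step in deciding where optimization effort will pay off.

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

model = nn.Sequential(nn.Linear(2048, 2048), nn.GELU(), nn.Linear(2048, 2048)).eval()
x = torch.randn(32, 2048)

activities = [ProfilerActivity.CPU]
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()
    activities.append(ProfilerActivity.CUDA)  # also capture GPU kernel timings

with torch.inference_mode():
    for _ in range(3):                        # warm up so lazy init isn't profiled
        model(x)
    with profile(activities=activities, record_shapes=True) as prof:
        model(x)

# Rank operators by self time to surface the computational bottlenecks.
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```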