Luma AI

πŸ‘₯ 1-10 · πŸ’° $43,000,000 Series B about 1 year ago · Virtual Reality · Augmented Reality · Artificial Intelligence (AI) · Computer Vision · Video Games · 3D Technology · πŸ’Ό Private Company
Website LinkedIn Twitter

Luma AI is at the forefront of generative AI, enabling users to transform text into stunning 3D models. We're pioneering the future of mixed reality, helping people capture, share, and experience the world in immersive 3D. Our mission is to fundamentally change how we interact with memories, products, and online spaces.

Our engineering team is focused on building scalable and efficient systems, often using Python & PyTorch, CUDA, and distributed systems. We encourage an environment of innovation, collaboration, and continuous learning in a remote-first culture. Currently, we are expanding our applied research team and are looking for Senior Research Engineers and Senior Machine Learning Engineers with significant experience in areas like training efficiency, hardware abstractions, and performance optimization. If you thrive on solving complex problems, enjoy contributing to groundbreaking technology, and are passionate about the intersection of AI and 3D, we encourage you to apply. Join our team and help us shape the future of immersive experiences. Luma AI has secured significant funding, including a recent Series B round.

Jobs at this company:

🧭 Full-Time

πŸ” Machine Learning

  • Experience optimizing for memory, latency, and throughput in PyTorch.
  • Experience using torch.compile / torch.XLA.
  • Experience benchmarking and profiling GPU & CPU code in PyTorch for optimal device utilization (examples: torch profiler, memory profilers, trace viewers, custom tooling).
  • Experience building tools & abstractions to ensure models run optimally on different hardware and software stacks.
  • Experience working with transformer models and attention implementations.
  • Experience with parallel inference, particularly tensor parallelism and pipeline parallelism.
  • Ensure efficient implementation of models & systems with a focus on designing, maintaining, and writing abstractions that scale beyond NVIDIA/CUDA hardware.
  • Identify and remedy efficiency bottlenecks (memory, speed, utilization, communication) by profiling and implementing high-performance PyTorch code, deferring to Triton or similar kernel-level languages as necessary.
  • Benchmark our products across a variety of hardware & software to help the product team understand the optimal tradeoffs between latency, throughput, and cost at various degrees of parallelism.
  • Work together with our partners to help them identify bottlenecks and push forward new iterations of hardware and software.
  • Work closely with the rest of the research team to ensure systems are planned to be as efficient as possible from start to finish, and raise potential hardware-integration issues early.
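The profiling experience described above typically centers on `torch.profiler`. As a minimal CPU-only sketch (the model and shapes here are illustrative, not from the posting), the same pattern extends to GPUs by adding `ProfilerActivity.CUDA`:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Hypothetical toy model -- stands in for a real transformer block.
model = torch.nn.Sequential(
    torch.nn.Linear(256, 1024),
    torch.nn.GELU(),
    torch.nn.Linear(1024, 256),
)
x = torch.randn(32, 256)

# Profile CPU operators; on a GPU machine you would also pass
# ProfilerActivity.CUDA and inspect device-side kernel times.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with torch.no_grad():
        model(x)

# Aggregate per-operator statistics, sorted by total CPU time, to
# spot efficiency bottlenecks before reaching for Triton kernels.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```

The resulting table (and the Chrome trace exportable via `prof.export_chrome_trace`) is the usual starting point for the memory/latency/utilization work the bullets describe.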
Posted 21 days ago
🧭 Full-Time

πŸ” Software Development

  • Experience training large models using Python & PyTorch, including practical experience with the full development pipeline from data processing, preparation & dataloading to training and inference.
  • Experience profiling GPU & CPU code in PyTorch for optimal device utilization (examples: torch profiler, NVIDIA Nsight Systems/Compute, memory profilers, trace viewers, custom tooling).
  • Experience writing & improving highly parallel & distributed PyTorch code for large generative models, with familiarity with FSDP, Tensor Parallel, Sequence/Context Parallel, Pipeline Parallel, etc.
  • Experience working with transformer models and attention implementations.
  • Ensure efficient implementation of models & systems with a focus on large-scale training.
  • Identify and implement optimization techniques for massively parallel and distributed systems, including the underlying communication layer.
  • Identify and remedy efficiency bottlenecks (memory, speed, utilization, communication) by profiling and implementing high-performance PyTorch code, deferring to Triton, CUDA, and lower levels as necessary.
  • Work closely with the rest of the research team to ensure systems are planned to be as efficient as possible from start to finish.
  • Conduct research & experiments on state-of-the-art large-scale generative AI models with the goal of improving latency & throughput for training and inference.
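The tensor-parallel familiarity asked for above boils down to sharding weight matrices across ranks. A minimal single-process sketch (world size and shapes are illustrative assumptions) of column-parallel sharding of a linear layer, where each "rank" computes its shard and the concatenation plays the role of the all-gather:

```python
import torch

torch.manual_seed(0)
world_size = 2                 # hypothetical number of tensor-parallel ranks
x = torch.randn(4, 8)          # (batch, in_features)
w = torch.randn(16, 8)         # (out_features, in_features)

# Reference: unsharded forward pass of a linear layer (no bias).
full = x @ w.t()

# Column parallelism: each rank holds out_features // world_size rows of w
# and computes only its slice of the output.
shards = w.chunk(world_size, dim=0)
partials = [x @ s.t() for s in shards]   # per-rank local matmuls
gathered = torch.cat(partials, dim=1)    # the all-gather step in a real setup

print(torch.allclose(full, gathered))    # prints True: sharding is exact
```

In a real multi-GPU setup the shards live on different devices and `torch.distributed` collectives replace the `cat`, but the arithmetic identity is the same one that FSDP and Tensor Parallel rely on.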
Posted 21 days ago