Senior AI Systems Engineer - Edge & Inference
Brazil · Full-Time · Senior
Salary not disclosed
Job Details
- Required Skills: Docker, Python, PyTorch, C++, TensorFlow, Linux, NLP, Computer Vision
Requirements
- Strong experience deploying and optimizing machine learning models in production environments
- Solid expertise with PyTorch and TensorFlow, particularly their model export workflows
- Deep understanding of ONNX and ONNX Runtime
- Hands-on experience with inference servers such as Triton Inference Server
- Strong knowledge of inference optimization techniques including INT8, FP16, and mixed precision
- Advanced Python programming skills
- Intermediate to advanced C++ skills with a focus on performance optimization
- Experience working in Linux environments, Docker, and containerized infrastructures
- Background in performance profiling, benchmarking, and system optimization
- Familiarity with LLMs, vision models, NLP architectures, and multimodal AI systems
- Experience in at least one of the following domains: GenAI at scale, real-time computer vision, edge AI systems, or large-scale NLP applications
- Strong analytical thinking, autonomy, and ability to work in complex technical environments
- Clear communication skills and ability to collaborate with cross-functional teams
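The INT8 quantization mentioned in the requirements rests on a simple idea: map floating-point tensor values onto an 8-bit integer range via a scale factor, trading a little accuracy for smaller, faster inference. As a minimal illustrative sketch (not taken from this posting; the function names are hypothetical), symmetric per-tensor INT8 quantization looks like this:

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: map floats to [-127, 127].

    scale is chosen so the largest-magnitude value lands at +/-127;
    the `or 1.0` guards against an all-zero input.
    """
    scale = (max(abs(v) for v in values) or 1.0) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats; error is bounded by ~scale/2 per value."""
    return [v * scale for v in q]
```

Production toolchains (PyTorch, TensorRT, ONNX Runtime) apply the same principle per-channel and with calibrated activation ranges, but the scale-and-round core is unchanged.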
Responsibilities
- Lead the deployment and productionization of AI models in enterprise-grade environments, ensuring stability, scalability, and performance
- Optimize inference pipelines through quantization, pruning, and tuning techniques to balance accuracy, latency, throughput, and energy consumption
- Design, implement, and maintain inference services using tools such as Triton Inference Server and ONNX Runtime
- Integrate AI models with specialized hardware accelerators to maximize execution efficiency
- Develop monitoring, telemetry, and health-check systems for AI workloads in production
- Perform profiling, benchmarking, and performance analysis to continuously improve system efficiency
- Architect and support advanced AI use cases such as LLM serving, retrieval-augmented generation systems, copilots, and real-time video analytics
- Build APIs and inference services using Python and C++, collaborating closely with data, ML, and engineering teams
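The profiling and benchmarking work described above typically starts with per-call latency measurement. A minimal stdlib-only sketch, with illustrative percentile choices and a hypothetical `benchmark` helper (not from the posting):

```python
import statistics
import time

def benchmark(fn, warmup=10, iters=100):
    """Measure per-call wall-clock latency of fn.

    Runs warmup calls first (to amortize caches/JIT effects), then
    records `iters` timed calls and reports p50/p95 latency in ms
    plus an estimated throughput in requests per second.
    """
    for _ in range(warmup):
        fn()
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    times_ms.sort()
    return {
        "p50_ms": statistics.median(times_ms),
        "p95_ms": times_ms[int(0.95 * len(times_ms)) - 1],
        "throughput_rps": 1000.0 / statistics.mean(times_ms),
    }
```

Real inference benchmarking adds batching, concurrency, and GPU-side timing (e.g. Triton's `perf_analyzer`), but tail-latency percentiles like p95 remain the key metric.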