AI Research Engineer - Reinforcement Learning

India | Full-Time | Mid-level

Salary not disclosed

Job Details

Requirements

Degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field; PhD preferred.
Strong research background in reinforcement learning, machine learning, NLP, or a related AI discipline, with demonstrated contributions to advanced AI research initiatives.
Hands-on experience conducting large-scale reinforcement learning experiments, including online RL methods such as Group Relative Policy Optimization (GRPO).
Deep understanding of reinforcement learning concepts including policy gradients, actor-critic methods, GRPO, exploration-exploitation tradeoffs, and policy optimization techniques.
Strong expertise in PyTorch and reinforcement learning frameworks, including experience building end-to-end RL pipelines.
Experience developing, training, evaluating, and deploying reinforcement learning systems in production or large-scale research environments.
Proven ability to solve complex RL challenges such as sample inefficiency, training instability, reward optimization, and convergence issues.
Experience with multi-modal AI systems and resource-efficient model architectures is a strong advantage.
Strong analytical, problem-solving, and experimentation skills with a research-driven mindset.
Excellent communication and collaboration abilities within distributed and cross-functional teams.

Responsibilities

Design, develop, and implement advanced reinforcement learning algorithms to optimize decision-making processes across simulated and real-world environments.
Build, execute, monitor, and evaluate large-scale reinforcement learning experiments while tracking key performance indicators and benchmark results.
Develop and curate high-quality simulation environments and training datasets tailored to domain-specific reinforcement learning challenges.
Optimize reinforcement learning pipelines by identifying and resolving issues related to exploration strategies, policy divergence, reward signal instability, and computational efficiency.
Improve policy performance, convergence stability, and sample efficiency through advanced optimization techniques and iterative experimentation.
Collaborate with engineering and research teams to integrate reinforcement learning agents into production systems and real-world applications.
Define measurable success metrics and continuously monitor deployed RL systems to ensure robustness, scalability, and sustained performance improvements.
Contribute to ongoing AI research initiatives by exploring innovative RL methodologies, model architectures, and training frameworks.
Document experimental findings, technical approaches, and research outcomes to support knowledge sharing and continuous innovation.
