Senior Machine Learning Engineer, Ads Training Platform
Reddit
About the Role
Join our Ads Training Platform team as a Senior Machine Learning Engineer to design and scale the distributed training and data processing infrastructure that powers Reddit's critical Ads machine learning models. This role offers a unique opportunity to significantly impact ad targeting, conversion prediction, and overall advertiser value.
About the Team
The Ads Training Platform team is responsible for building and maintaining the core distributed training and data processing infrastructure that powers Reddit’s Ads machine learning models. Our mission is to enable fast, reliable, and scalable model training across large datasets, directly supporting ML teams in enhancing ad targeting, conversion prediction, and advertiser value. This role is open for remote work from the Netherlands.
Responsibilities
- Design, build, and maintain large-scale distributed training infrastructure for Ads ML models.
- Develop robust tools and frameworks, particularly leveraging the Ray platform.
- Create tools for debugging, profiling, and tuning distributed training jobs to optimize performance and reliability.
- Integrate with object storage systems and enhance data access patterns for efficiency.
- Collaborate closely with ML engineers to reduce model training time, improve efficiency, and manage GPU training costs.
- Drive continuous improvements in scheduling, state management, and fault tolerance within the training platform.
Requirements
- Deep experience in designing, building, and maintaining large-scale distributed systems and infrastructure.
- Proven experience with ML platform operations and distributed machine learning training.
- Hands-on experience with platforms like Ray for distributed computing.
- Strong ability to debug, profile, and optimize complex distributed training jobs.
- Experience with object storage systems and optimizing data access patterns.
- Excellent collaboration skills, with the ability to work effectively with machine learning engineers.
- This is a remote position open to candidates based in the Netherlands.
About Reddit
Reddit is an online platform that enables users to submit links, create content, and have discussions about the topics of their interest.