Where we Work
Udemy is a global company headquartered in San Francisco, with additional U.S. offices in Denver and Austin, and international hubs in Australia, India, Ireland, Mexico, and Türkiye. This is an in-office position, requiring three days a week in the office (Tuesday, Wednesday, Thursday) and flexibility on Mondays and Fridays.
About your Skills
Strong verbal and written communication skills in English.
Troubleshooting and problem-solving abilities with attention to detail.
Ability to work effectively in a team, share ideas, and adapt to feedback.
About this role
Engineering teams at Udemy build and manage several microservices that power the Udemy B2B and D2C products. As a Reliability Engineer Intern on the Datastore Infrastructure (DSI) team, you will have the opportunity to work closely with Senior Staff Database Reliability Engineers on a high-impact, focused project. This 12-week program is designed to give you hands-on experience in managing critical data infrastructure and to let you contribute to the reliability and scalability of our core services. Your project will be scoped to provide significant learning while directly impacting the performance, uptime, or security of a key datastore, streaming, or caching component.
What you'll be doing (12-Week Focus):
Learn and Contribute to Core Systems: Spend time learning the team's technology stack, including our use of databases (e.g., MySQL, PostgreSQL), streaming platforms (Kafka), caching systems (Redis), and infrastructure-as-code tools (Terraform).
Scoped Project Implementation: Design, specify, and implement a feature or improvement for a core infrastructure component, focusing on automation, monitoring, or performance. This project will be the central focus of your 12-week internship.
Infrastructure Automation: Assist in developing immutable infrastructure patterns and writing automation code in languages like Python or Golang to improve developer efficiency and service reliability.
Observability and Monitoring: Contribute to enhancing the observability of our systems by improving monitoring and documentation for datastore health and performance.
Code Review and Quality: Participate in code reviews to learn best practices for quality, security, and performance in a production environment.
Agile Collaboration: Collaborate with the wider Platform team and cross-functional partners to understand engineering requirements and integrate your project seamlessly.