About BlaBlaCar
BlaBlaCar is the world’s leading community-based travel app enabling 27 million members a year to carpool or travel by bus in 21 countries. Our team of 800 employees counts over 50 nationalities and is spread across our 5 global offices, 30% working fully remotely.
By joining our Foundations department, you will work alongside talented individuals grouped in small agile teams, each with strong ownership of its share of the department's goals. Foundations is composed of seven teams which “provide consistent, easy to use, infrastructures, services, and expertise to support BlaBlaCar’s growth and evolution”.
The Site Reliability Engineering (SRE) team is responsible for providing best-in-class observability, alerting and incident management tools and processes to service teams. As an enabling team, we help BlaBlaCar engineers efficiently improve the reliability of their services. Empowering developers and sharing our reliability expertise with them are at the core of our daily work.
Core Infrastructure: Kubernetes, Google Cloud Platform
GitOps/Delivery: GitHub, Terraform, Flux, Helm, Jenkins
Observability/Incident Management: Datadog, OpenTelemetry, Grafana IRM
In-house synthetic tests platform: Playwright, Qualcium, Sauce Labs
Languages: Go / Python for tooling, TypeScript/JS for the testing platform
Support software engineers by creating, maintaining, and improving observability and alerting tools and frameworks. You embrace the use of AI, leveraging agentic tooling to eliminate toil and streamline your daily tasks.
Own the Service Level Objectives (SLO) framework and assist in the design and maintenance of Service Level Indicators (SLIs) and objectives to ensure service reliability.
Own the incident management process by defining best practices and standards, and by ensuring continuous improvement through post-mortems and chaos engineering. While developers handle incidents within their scope, you could step in as Incident Commander during high-severity incidents, leading coordination efforts.
Develop and maintain tools, such as Terraform modules or Go apps, to help automate and enhance reliability across services.
Build and promote reporting on operational metrics and incidents to drive distributed and continuous improvement.
Working in a multidisciplinary environment requires strong communication skills: you'll need to adapt your communication to other teams' level of expertise and be able to understand their needs.
Strong knowledge of observability tools (e.g., Datadog) and understanding of metrics, logging, and tracing.
Troubleshooting and on-call experience in production environments, diagnosing and resolving technical issues effectively (experience with Kubernetes is a plus).
Full working proficiency in English
Fit with our BlaBlaPrinciples
Thriving in a collaborative, fast-growing and innovative environment
Ability to take ownership, align with business priorities and navigate different contexts
Nice to have:
Familiarity with incident management platforms (e.g., Grafana IRM)
Experience working with Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
Exposure to programming in Go or a strong interest in learning it.
Experience integrating OpenTelemetry
Backend services are built using multiple programming languages: while development skills aren't required, familiarity with object-oriented programming and scripting languages is an advantage.
Familiarity with web/mobile testing tools or a strong curiosity to understand how software is tested at scale.
Hybrid status for this role: 2-3 days at the office
4 additional weeks on top of legal maternity/paternity leaves
50% healthcare coverage (Alan)
Financial support for home office equipment
Minimum 25 days holiday per year
Local meal plan policy (Swile card)
50% transportation paid (Forfait Mobilité Durable)
Free unlimited carpooling & bus rides
Personal growth via training, mentorship, and internal mobility opportunities
Employee stock ownership plan
Regular team building events
1 day off per year to test our product
a 45-min video-call with Maxime, Talent Acquisition Manager, to get to know you, understand your career expectations and answer your questions
a 60-min video-call with Damien Bertau, Hiring Manager, to discuss your experience and share more details about the team
a 90-min system design interview with 2 team members to discuss your technical expertise
a 45-min video-call with Maxime Fouilleul, Head of Foundations, to get a wider vision of the department and its strategy
Our hiring process lasts 25-30 days on average; offers usually come within 48 hours.
Please note that one of these interviews will be onsite.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.