Site Reliability Engineer - MPOS Air
Purpose of your new department
M.POS Air is Metro’s new core checkout solution: user-friendly, secure, innovative, lightweight, and always available. It has to compute the amount to pay, tailored to every type of customer, smoothly and accurately, and provide a performant invoicing and payment solution as well as user management functionality, always with the best possible user experience in mind.
Our Site Reliability Engineers (SREs) help find an optimal balance between high reliability, maintainability, scalability, resilience, and velocity of our software products. SREs are involved in the software delivery lifecycle and run large-scale applications in production. In doing so, SREs focus on production readiness by developing continuous monitoring and self-healing techniques, mitigating common issues, and providing transparency about alerts and incident root causes, all as part of instilling a blameless postmortem culture.
Your new challenges:
- Holistic approach: As part of an agile development team, you will be involved in all phases of software engineering, from inception to coding, testing, delivery, and operation. You enable a high degree of automation on every level.
- Performance mindset: Your mission will be to increase the productivity of development teams by focusing on cross-functional aspects such as observability, scalability, availability, maintainability, and reliability, while always keeping the customer in mind. In doing so, you will also lead postmortems after major incidents.
- Continuous learning: You are willing to continuously develop yourself, eager to share your knowledge and take on responsibility, and have fun doing so.
- Customer feedback: As part of the team, you constantly seek out feedback from users while developing new features and always keep the code base deployable to production.
You are a great fit:
- You have experience with cloud platforms (e.g. Google Cloud Platform, Amazon Web Services, Microsoft Azure)
- Docker, Kubernetes, Elasticsearch, Cassandra, Postgres, Kafka, Jenkins, and GitLab are no strangers to you
- Basic knowledge of monitoring and experience with Datadog, Prometheus, Grafana, the ELK Stack, or similar tools
- Expertise in web security and networking
- Education: You have a degree in computer science, physics, mathematics or equivalent
- Strong communicator: You have exceptional verbal communication skills, collaborate openly, and pay strong attention to detail
- Mindset / Ways of working: You have a systematic problem-solving approach, coupled with a strong sense of ownership and drive. You have experience in dealing with difficult situations and, when necessary, you can make decisions with a sense of urgency
That's why it's worth getting started:
- You’ll be working with motivated people who love creating products that our customers love.
- You’ll be continuously growing both as an individual and as a professional, by learning on the job, learning from others, or taking part in our training programs.
- You’ll have the chance both to solve reliability issues and to build reliability into the product, from inception through to running it in production. As we strive to automate toil away, you’ll work on increasingly complex topics.
- You’ll have the opportunity to participate in various initiatives or working groups with the goal of enriching and streamlining our landscape.
Our general benefits:
- Work-life balance: flexible working time, work from anywhere inside Romania, a free day to celebrate your birthday;
- Personal growth: training in the area of soft, technical and business skills, free Bookster account, the opportunity to learn and work with a variety of technologies;
- Well-being: online sports activities, fitness center discounts, health and life insurance, private pension, lunch tickets;
- Working mode: multicultural, self-organizing teams, agile environment.