Site Reliability Engineer (3-5 years of experience)

Employer: WORLDLINE
Domain: IT Software
Job type: full-time
Job level: 1 - 5 years of experience
Location: BUCURESTI
Updated at: 11-12-2025
Remote work: Hybrid

    Short company description

    We are a global leader in technology and payments.
    #1 payment services provider in Europe and #4 worldwide.
    Company industry: Digital payments.

    Requirements

    OS: Knowledge of and experience with Linux
    DB: Knowledge of and experience with Oracle and Postgres databases
    Knowledge of and experience with Korn Shell scripting, Python, Kubernetes
    Knowledge of the ServiceNow ticketing tool
    Knowledge of and experience with monitoring tools such as Prometheus, Grafana, PagerDuty
    Experience with CI/CD and network management (debugging network issues)
    Cloud oriented: GCP knowledge (knowledge of or experience with other providers, e.g. AWS, is a plus)
    Cloud management tools: Terraform, Puppet, GitLab
    Cloud container solutions: Docker, Nginx, HAProxy, etc.
    Analytical skills (able to dive deep into analysis to find solutions), teamwork, full autonomy, accountability, strong proactivity
    On-call duty availability

    Responsibilities

    Availability and Reliability:

    Monitor the performance, availability, and reliability of our systems and applications, ensuring adherence to SLAs.
    Ensure the correct running and maintenance of the application platform.
    Implement and maintain monitoring and alerting systems to proactively identify issues.

    Release/Deploy:

    Have a holistic end-to-end view of the service: application, underlying infrastructure, and other dependencies.
    Install releases, scripts, packages, and libraries in production.
    Guarantee that the service you are in charge of can be installed, configured, changed, and operated in compliance with regulation, security, and the expected service levels.
    Improve, through automation, the pipeline from development to operations (CI/CD) and all operations on the service.
    Manage change requests.
    Participate in and validate service designs and changes, with the mission of ensuring that the quality of operations remains at the proper level.

    Incident Management:

    Lead incident response efforts during outages or major incidents, coordinating with cross-functional teams to ensure timely resolution.
    Conduct post-mortem analyses to identify root causes and implement corrective actions.
    Be available for on-call (stand-by) activities outside business hours.

    Run and Operations:

    Provide second-line support for incident and problem management.
    Identify, gather, and analyze metrics and events from both infrastructure and application to support capacity planning, performance improvements, and incident analysis.
    Perform technical and business monitoring of application platforms through alerting tools (e.g. Zabbix, Grafana, PagerDuty, Prometheus).

    Stakeholder Management:

    Propose application-related evolutions to improve the service and delivery quality.
    Partner with the infrastructure provider as the voice of service delivery.

    Benefits

    • Performance bonus
    • Special events bonus (e.g. Easter, Christmas)
    • Medical subscription
    • Life insurance
    • Trainings
    • Courses
    • Bookster subscription
    • Flexible work schedule
    • Meal vouchers
