Data Engineer (Python & Spark) (remote work)

Employer: Luxoft Romania
  • IT Software
  • Job type: full-time
    Job level: over 5 years of experience
  • Other city
  • Updated on: 27.09.2021
    Short company description

    Luxoft, a DXC Technology Company (NYSE: DXC), is a digital strategy and software engineering firm providing bespoke technology solutions that drive business change for customers the world over. Luxoft uses technology to enable business transformation, enhance customer experiences, and boost operational efficiency through its strategy, consulting, and engineering services. Luxoft combines a unique blend of engineering excellence and deep industry expertise, specializing in automotive, financial services, travel and hospitality, healthcare, life sciences, media, and telecommunications. Luxoft is well known for its consistently high level of delivery and complex project management, its premier digital engineering talent, exceptional client focus, and agility, creativity, and remarkable problem-solving capabilities.


    Mandatory Skills:

    - Experience working with Python
    - Experience working with data frameworks and storage such as Spark, Hadoop, and HDFS
    - At least 3 years of experience in building data pipelines in an industrial/enterprise environment
    - Knowledge of Apache Airflow
    - Hands-on experience with different data formats (JSON, CSV, XML, Avro, TXT, etc.)
    - Some experience with AWS

    Nice-to-Have Skills:

    - Experience working in the IoT or predictive maintenance domain
    - Experience with OOP software engineering practices


    Project Description:

    Data collected from smart devices is read from cloud (AWS) storage, translated from device-specific schemas and file formats, and transformed (for example, by selecting the relevant data and features) before being fed to an ML model training subsystem. Qualified models are then pushed to the production environment for prediction/execution. Data handling uses scalable, Spark-based access. The entire processing workflow is kept in sync via pipelines defined in Airflow. The state of the entire data engineering flow (as well as ML models, training, and execution) is available via a dashboard UI.
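
    To give candidates a feel for the translation and feature-selection step described above, here is a minimal sketch. It is not the project's code: the device types, field names, and feature list are hypothetical, and plain Python is used only to keep the example self-contained (the project itself is described as using Spark-based access).

```python
import csv
import io
import json

# Hypothetical mapping from device-specific field names to a common schema.
FIELD_MAPS = {
    "vendor_a": {"ts": "timestamp", "temp_c": "temperature", "vib": "vibration"},
    "vendor_b": {"time": "timestamp", "temperature": "temperature",
                 "vibration_rms": "vibration"},
}

# Hypothetical features passed on to model training.
FEATURES = ("temperature", "vibration")


def parse_payload(payload, file_format):
    """Parse a raw payload into a list of dict records."""
    if file_format == "json":
        return json.loads(payload)
    if file_format == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError("unsupported format: " + file_format)


def translate(record, device_type):
    """Rename device-specific fields to the common schema, dropping the rest."""
    mapping = FIELD_MAPS[device_type]
    return {common: record[raw] for raw, common in mapping.items() if raw in record}


def select_features(record):
    """Keep only the selected features, converted to float."""
    return {name: float(record[name]) for name in FEATURES if name in record}


def prepare(payload, file_format, device_type):
    """Parse, translate, and select features for every record in a payload."""
    records = parse_payload(payload, file_format)
    return [select_features(translate(r, device_type)) for r in records]
```

    For example, `prepare('[{"ts": 1, "temp_c": 21.5, "vib": 0.02}]', "json", "vendor_a")` yields one record with `temperature` and `vibration` keys, regardless of how the device originally named them.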


    Responsibilities:

    - Bring data engineering expertise to a local Scrum team, providing qualified deliverables and services on schedule
    - Participate in progress reviews
    - Work with the Scrum master(s) and tech lead(s) to analyze and understand the user stories in each sprint
    - Complete coding and unit testing for the allotted stories
    - Create design documents or make changes to existing ones
    - Complete code reviews