Big Data Engineer

Employer: Vauban
Domain: Telecommunication
Job type: full-time
Job level: 1 - 5 years of experience
Location: Bucharest
Updated at: 25.01.2020
Short company description

Vauban joined the GFI Group at the beginning of January 2019. With more than 450 consultants in Romania, Vauban is a leading provider of IT services and innovative applications. Created in 2007, the company has experienced strong growth and has established itself as a reference partner for key accounts in the Banking, Telecom, Industry and Energy sectors.

Requirements

Mid/senior level: at least 3 years of experience with the Cloudera data platform
Programming experience, ideally in Python, Spark, Kafka or Java, and a willingness to learn new programming languages to meet goals and objectives (a minimal Python/Spark/Kafka sketch follows this list)
Knowledge of data cleaning, wrangling, visualization and reporting, with an understanding of the most efficient use of the associated tools and applications for these tasks
Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources
Experience configuring and troubleshooting the components of the Hadoop ecosystem: Cloudera (incl. Cloudera Manager), HDFS, Hive, Impala, Oozie, YARN, Sqoop, ZooKeeper, Flume, Spark (incl. Spark standalone), Kafka (incl. Kafka Connect), Apache Kudu, Cassandra, HBase
Develop and maintain documentation for Hadoop administration tasks (upgrades, patching, service installation and maintenance)
Understand Hadoop's security mechanisms and implement Hadoop security (Apache Sentry, Kerberos, Active Directory, TLS/SSL)
Work on and continuously improve the DevOps pipeline and tooling to actively manage the CI/CD processes
Experience with scripting for automation (e.g. Shell, Python, Groovy)
Understanding of networking principles and the ability to troubleshoot DNS, TCP/IP and HTTP issues.
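
For illustration only, a minimal sketch of the Python/Spark/Kafka work referenced above: it reads a Kafka topic with Spark Structured Streaming, parses JSON events and counts them per type. The broker address, topic name and event schema are hypothetical, and it assumes the spark-sql-kafka connector is on the Spark classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StringType, StructField, StructType

    spark = SparkSession.builder.appName("kafka-event-counts").getOrCreate()

    # Hypothetical event schema: a type tag plus an opaque payload.
    schema = StructType([
        StructField("event_type", StringType()),
        StructField("payload", StringType()),
    ])

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
        .option("subscribe", "events")                      # hypothetical topic
        .load()
        # Kafka delivers raw bytes; decode and parse the JSON value.
        .select(from_json(col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    # Running count of events per type, printed to the console sink.
    counts = events.groupBy("event_type").count()
    query = counts.writeStream.outputMode("complete").format("console").start()
    query.awaitTermination()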

Responsibilities

As an engineer, you will be responsible for developing, deploying, maintaining/operating, testing and evaluating big data solutions within the organization
Collect, store, process, and analyze huge data sets
Choose optimal solutions for these purposes, then implement, maintain, and monitor them
Integrate solutions with the architecture used across the company.

Other info

Nice to have / a plus:
Knowledge of one or more of the following is highly appreciated: Elasticsearch, Kibana, Grafana, Git/SCM, the Atlassian suite (Confluence, Jira, Bitbucket), Jenkins/TeamCity, Docker and Kubernetes
NoSQL databases such as MongoDB
Integration with RDBMSs; lambda architectures; integration of data warehouses, data lakes and data hubs
Cloud implementations of Big Data stacks
Experience with Big Data ML toolkits, such as Mahout, Spark ML, or H2O (a brief Spark ML sketch follows this list).
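
As a rough illustration of the Spark ML point above (toy data and hypothetical feature names, not a definitive implementation), a minimal pipeline that assembles features and fits a logistic regression:

    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    spark = SparkSession.builder.appName("sparkml-sketch").getOrCreate()

    # Toy rows standing in for real telecom records; columns are hypothetical.
    df = spark.createDataFrame(
        [(0.0, 1.2, 0.7), (1.0, 3.4, 0.1), (0.0, 0.5, 0.9), (1.0, 2.8, 0.2)],
        ["label", "minutes_used", "on_net_ratio"],
    )

    # Combine the raw columns into the single vector column Spark ML expects.
    assembler = VectorAssembler(
        inputCols=["minutes_used", "on_net_ratio"], outputCol="features"
    )
    model = Pipeline(stages=[assembler, LogisticRegression(maxIter=10)]).fit(df)
    model.transform(df).select("label", "prediction").show()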

Benefits:
Attractive salary package
Open-ended employment contract
Career plan (professional, academic and financial)
Medical insurance
Meal vouchers
Professional and friendly working environment.