Big Data Engineer
Vauban joined the GFI Group at the beginning of January 2019. With more than 450 consultants in Romania, Vauban is a leading provider of IT services and innovative applications. Created in 2007, the company has experienced strong growth and has established itself as a reference partner for key accounts in the Banking, Telecom, Industry and Energy sectors.

Requirements
Mid/senior level, with at least 3 years of experience working with the Cloudera data platform
Programming experience, ideally in Python, Spark, Kafka or Java, and a willingness to learn new programming languages to meet goals and objectives
Knowledge of data cleaning, wrangling, visualization and reporting, with an understanding of the best, most efficient use of associated tools and applications to complete these tasks
Experience processing large amounts of structured and unstructured data, including integrating data from multiple sources
Knowledge of configuring and troubleshooting all components of the Hadoop ecosystem, such as Cloudera, Cloudera Manager, HDFS, Hive, Impala, Oozie, YARN, Sqoop, ZooKeeper, Flume, Spark (incl. Spark standalone), Kafka (incl. Kafka Connect), Apache Kudu, Cassandra and HBase
Develop and maintain documentation relating to Hadoop Administration tasks (upgrades, patching, service installation and maintenance)
Understand Hadoop’s security mechanisms and implement Hadoop security (Apache Sentry, Kerberos, Active Directory, TLS/SSL)
Work on and continuously improve the DevOps pipeline and tooling to actively manage the CI/CD processes
Experience in scripting for automation requirements (e.g. Shell, Python, Groovy)
Understanding of networking principles and ability to troubleshoot (DNS, TCP/IP, HTTP).
Responsibilities
As an Engineer, you will be responsible for developing, deploying, maintaining, operating, testing and evaluating big data solutions within the organization
Collect, store, process and analyze huge sets of data
Choose optimal solutions for the above purposes, then implement, maintain and monitor them
Integrate solutions with the architecture used across the company.
Nice to have / a plus:
Knowledge of one or more of the following is highly appreciated: Elasticsearch, Kibana, Grafana, Git/SCM, Atlassian Suite (Confluence, Jira, Bitbucket), Jenkins/TeamCity, Docker and Kubernetes
NoSQL databases (e.g. MongoDB)
Integration with RDBMSs and lambda architectures; integration of Data Warehouses, Data Lakes and Data Hubs
Cloud implementations of Big Data stacks
Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O.
Benefits
Attractive salary conditions
Open-ended (indefinite-period) employment contract
Career plan (professional, academic and financial)
Professional and friendly working environment.