Early on, Orange defined its role to address some of today's social challenges linked to our industry: promoting responsible mobile phone use, raising awareness of screen time, recycling mobile phones, and fighting cyberbullying.
We are looking for a creative, talented, and motivated Sr. Cloud Data Engineer to work in our growing India Development Center. In this role, you will be instrumental in architecting, building, and managing our data infrastructure on Google Cloud Platform (GCP) and our on-premises Big Data setup. The ideal candidate will be well-versed in data engineering and cloud computing and excited to use data to drive business insights and decisions.
Your Role
In the Data Referential domain, we are hiring a Big Data Engineer in charge of Data/Big Data and BI projects on public (mainly GCP) or hybrid cloud environments.
Your main missions will be to:
Support Business Unit and subsidiary teams in defining the ingestion pipelines for their Data and/or Business Intelligence projects:
understand the needs and issues
propose solutions and coordinate with the relevant organizations
Industrialize the data movement between sources and data lakes
Develop data preparation pipelines for downstream use cases
Participate in various technical project activities to develop and maintain your operational skills (data work, integration, platform administration, etc.)
You will be part of an international team (India and France) that leverages collective intelligence and expertise to create valuable solutions.
You will be involved in the different stages of the projects (study, development, production) and you will work on:
Gathering the needs of operational entities and defining solutions that take full advantage of public cloud services and internal cloud capabilities.
Developing appropriate strategies to help entities better manage the infrastructure of their data projects.
Supporting the implementation of operational projects using the target infrastructures.
Evaluating new technologies and their impact on data stakeholders.
Key Responsibilities
Build scalable data architectures and pipelines on GCP and on the on-premises Big Data platform.
Apply strong knowledge of data modeling with Data/Big Data tools and of data lake organization.
Design and maintain ETL processes to bring data from different sources into the data platform.
Ensure data quality, accuracy, and consistency across all data systems.
Establish and manage cloud infrastructure via GCP services such as BigQuery, Dataflow, Pub/Sub, Dataproc, and Bigtable (a minimal pipeline sketch follows this list).
Apply good knowledge of serverless solutions (Cloud Functions, AWS Lambda) and PaaS solutions related to Big Data (BigQuery, Azure DWH, etc.).
Apply strong knowledge of data manipulation and processing tools (Apache Beam, Spark, Spark Streaming, Hive, Pig, Presto, etc.).
Optimise the efficiency, scalability, and cost of data processing and storage solutions.
Work with data scientists, analysts, and business stakeholders to define data needs and automate delivery in a timely fashion.
Work on dashboards, reports, and data visualization tools to help everyone at the company make data-driven decisions.
Apply data security best practices to keep sensitive data safe.
Comply with relevant data protection regulations and data privacy policies.
Automation and Monitoring
Automate data workflows and processes to increase efficiency and minimise manual effort.
Monitor data systems for performance, reliability, and alerts, and act promptly to mitigate any incidents.
Identify opportunities for process improvements and implement best practices.
Coordinate data mapping requirements and guide the team in data ingestion activities.
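As an illustration of the kind of ingestion pipeline described above, here is a minimal sketch of a streaming pipeline that reads events from Pub/Sub and writes them to BigQuery with Apache Beam (runnable on Dataflow). The project, topic, table, and schema names are illustrative assumptions, not actual Orange resources.

# Minimal sketch (illustrative only): streaming ingestion from Pub/Sub to BigQuery with Apache Beam.
# Project, topic, table, and schema names below are assumptions for the example.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message):
    # Decode a Pub/Sub payload (JSON bytes) into a flat row for BigQuery.
    event = json.loads(message.decode("utf-8"))
    return {
        "user_id": event.get("user_id"),
        "event_type": event.get("event_type"),
        "event_ts": event.get("timestamp"),
    }


def run():
    options = PipelineOptions(streaming=True)  # add --runner=DataflowRunner to execute on Dataflow
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(topic="projects/example-project/topics/events")
            | "ParseJSON" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="user_id:STRING,event_type:STRING,event_ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()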
About You
Technical Skills:
Proficiency in GCP services such as BigQuery, Dataflow, Pub/Sub, Dataproc, and Bigtable.
Strong programming skills in Python and Java.
Experience with SQL and database management.
Familiarity with data warehousing concepts and tools.
Knowledge of big data tools and frameworks like Apache Hadoop and Spark.
Knowledge of CI/CD processes and agile practices in general.
Good knowledge of data and big data issues, particularly regarding performance.
Certifications: GCP certifications such as Professional Data Engineer or Professional Cloud Architect are a plus.
Development skill set in the Hadoop and Spark environment: Python or Java, shell scripting, ETL, Hive, Pig, HDFS, Kafka, NiFi, Oozie, and HDFS shell commands (a short Spark sketch follows this list).
Good understanding of API concepts: REST, SOAP, GraphQL.
Basic knowledge of data management principles.
Understanding of cloud architecture is preferable; GCP certification is a plus.
Understanding of DevOps concepts and technologies such as Kubernetes, Docker, and containers.
Good understanding of JIRA.
Understanding of Open Digital Architecture principles is preferable.
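For the Hadoop/Spark development skill set mentioned above, a minimal PySpark sketch of a typical data preparation step is shown below. The paths and column names are illustrative assumptions, not an actual Orange dataset.

# Minimal sketch (illustrative only): clean a raw CSV extract and publish it as partitioned Parquet with PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("prepare_orders").getOrCreate()

# Read a raw extract (example path; the GCS connector or a local path can be used instead).
raw = spark.read.option("header", True).csv("gs://example-bucket/raw/orders/")

cleaned = (
    raw.dropDuplicates(["order_id"])                       # de-duplicate on the business key
       .withColumn("order_date", F.to_date("order_date"))  # normalise the date type
       .filter(F.col("amount").cast("double") > 0)         # drop rows with invalid amounts
)

# Write the curated dataset, partitioned for downstream consumption.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet("gs://example-bucket/curated/orders/")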