We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in ETL, Data Modelling, and Data Architecture. Proficiency in ETL optimization and in designing, coding, and tuning big data processes using Python is essential. Experience with Scala is a plus.

The candidate should also have extensive experience building data platforms using a variety of technologies, including Python, PostgreSQL, Spark, Parquet/ORC, Data Modelling (Relational, Dimensional, E-R Modelling), ETL, Splunk, DataDog, Airflow, Git, CI/CD (Jenkins), JIRA, Confluence, IntelliJ IDEA, Agile (Scrum/Kanban), Code Review, RCP Framework, Query book, Build and Deployment (CI/CD release process), Backstage, PagerDuty, and Spinnaker.

Hands-on experience developing a data platform and its components: data lake, cloud data warehouse, APIs, and batch and streaming data pipelines.
Experience building data pipelines and applications that stream and process large datasets at low latency.
Develop and maintain batch and stream processing data solutions using Apache Spark and Apache Hive, leveraging the RCP Framework to create robust, modular applications.
Work on orchestration using Airflow to automate and manage data workflows.
Use project management tools like JIRA and Confluence to track progress and collaborate with the team.
Develop data processing workflows using Spark, SQL, PL/SQL, and Python to transform and cleanse raw data into a usable format, and implement data storage solutions using Parquet/ORC formats.
Develop and manage scalable data pipelines and applications using containerization with Docker and orchestration with Kubernetes.
Optimize data storage and retrieval performance through efficient data modelling techniques, including Relational, Dimensional, and E-R modelling.
Maintain data integrity and quality by implementing robust validation and error handling mechanisms within ETL processes.
Automate deployment processes using CI/CD tools like Jenkins and Spinnaker to ensure reliable and consistent releases.
Monitor and troubleshoot data pipelines using monitoring tools like DataDog and Splunk to identify performance bottlenecks and ensure system reliability.
Participate in Agile development methodologies such as Scrum/Kanban, including sprint planning, daily stand-ups, and retrospective meetings.
Conduct code reviews to ensure adherence to coding standards, best practices, and scalability considerations.
Manage and maintain documentation using tools like Confluence to ensure clear and up-to-date documentation of data pipelines, schemas, and processes.
Provide on-call support for production data pipelines, responding to incidents and resolving issues in a timely manner.
Collaborate with cross-functional teams including developers, data scientists, and operations teams to address complex data engineering challenges.
Stay updated on emerging technologies and industry trends to continuously improve data engineering processes and tools.
Contribute to the development of reusable components and frameworks to streamline data engineering tasks across projects.
Utilize version control systems like Git to manage codebase and collaborate effectively with team members.
Leverage IDEs like IntelliJ IDEA for efficient development and debugging of data engineering code.
Adhere to security best practices in handling sensitive data and implementing access controls within the data lake environment.

Programming Languages: Python, Bash/Unix/Linux
Big Data Technologies: Apache Spark, Apache Hive
Cloud Services: EC2, ECS, S3, SNS, CloudWatch
Databases: Postgres
Application development: RCP Framework
Containerization and Orchestration: Docker, Kubernetes
CI/CD Tools: GitHub, Spinnaker, Jenkins
Additional Skills: Scala, Maven