Data Engineer Full-time Job
IT & Telecoms, Dubai. Reference: 34087
Job Details
Role & Responsibilities:
Design and Develop PySpark-Based Data Pipelines:
You will design, develop, and maintain PySpark-based data pipelines for processing large datasets. You will write optimized PySpark code for data transformations and manipulations, and you will ensure that the pipelines are scalable, reliable, and efficient.
Collaborate with Data Scientists and Analysts:
You will work closely with data scientists and analysts to understand their data needs and develop solutions that enable them to gain insights from data. You will also support senior and lead data engineers in extracting, transforming, and loading data from various sources and in performing exploratory data analysis to identify patterns and insights.
Perform Exploratory Data Analysis:
You will participate in and perform exploratory data analysis to identify patterns and insights that support informed decisions. As needed, you will use statistical and visualization techniques to identify trends, correlations, and anomalies in the data.
Develop and Maintain Data Quality Checks:
You will develop and maintain data quality checks to ensure the accuracy and completeness of data. You will monitor the health of the data pipelines and ensure that they are running smoothly.
Integrate Data Pipelines into Production Systems:
You will work with software engineers to integrate data pipelines into production systems. You will ensure that the data pipelines are integrated seamlessly and that they meet the performance and scalability requirements of the production systems.
Apply Knowledge of Distributed Computing:
You will apply basic knowledge of distributed computing to design and optimize data processing solutions that can handle large volumes of data. You will be expected to understand the concepts of distributed computing and their implications for data processing.
Understand and Follow Software Lifecycle:
You will understand and follow the software lifecycle to develop, test, and deploy data processing solutions. You will follow best practices in software engineering to ensure that the solutions are reliable, maintainable, and scalable.