TECHNICAL SKILLS:
● Excellent coding skills in Python and SQL; PySpark is a must.
● Experience working on public clouds (AWS / Azure / GCP).
● Experience writing and maintaining a Spark codebase in a production environment.
● Big data experience across relational databases (Postgres, MySQL, etc.) and NoSQL stores (MongoDB, Cassandra).
● Well versed in orchestration frameworks such as Airflow and Kubeflow (a minimal Airflow sketch follows this list).
● Build and support scalable data engineering pipelines (ETLs, etc.). This entails extracting, loading, and transforming ‘big data’ from a wide variety of sources, both batch and streaming, using current data frameworks and technologies, along with real-time monitoring dashboards and alerting (see the PySpark sketch after this list).
● Essential experience with distributed systems software development.
● Demonstrated production experience in big data infrastructure and data modelling.
● Critical experience in performance optimization for both data loading and data ingestion.
● Know-how of critical tools in a developer’s toolkit, such as GitHub, Docker, etc.
● Ability to work in a fast-paced, deadline-driven environment.
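
To make the PySpark and pipeline expectations above concrete, here is a minimal batch ETL sketch (referenced in the bullets). The paths, column names (user_id, amount), and output location are illustrative assumptions, not specifics of this role.

# Minimal PySpark batch ETL sketch: extract JSON, transform, load Parquet.
# Paths and column names (user_id, amount) are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw events from a hypothetical landing zone.
raw = spark.read.json("s3a://landing-zone/events/")

# Transform: drop malformed rows, normalize types, aggregate spend per user.
clean = (
    raw.filter(F.col("user_id").isNotNull())
       .withColumn("amount", F.col("amount").cast("double"))
)
per_user = clean.groupBy("user_id").agg(F.sum("amount").alias("total_spend"))

# Load: write the aggregate as Parquet for downstream dashboards and alerting.
per_user.write.mode("overwrite").parquet("s3a://warehouse/user_spend/")

spark.stop()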
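
Similarly, a minimal Airflow sketch of the orchestration skill listed above, assuming Airflow 2.4+ (for the schedule argument); the DAG id, cadence, and run_etl callable are illustrative assumptions.

# Minimal Airflow DAG sketch (assumes Airflow 2.4+): one daily task that
# would wrap the ETL step above; dag_id and the callable are illustrative.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def run_etl():
    # Placeholder for launching the PySpark job (e.g. via spark-submit).
    print("running ETL step")

with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # hypothetical daily cadence
    catchup=False,
):
    PythonOperator(task_id="run_etl", python_callable=run_etl)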