Saas Talent

Data Engineer 3

Job Description

MongoDB’s mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data. We enable organizations of all sizes to easily build, scale, and run modern applications by helping them modernize legacy workloads, embrace innovation, and unleash AI. Our industry-leading developer data platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available in more than 115 regions across AWS, Google Cloud, and Microsoft Azure. Atlas allows customers to build anywhere—on the edge, on premises, or across cloud providers. With offices worldwide and over 175,000 developers joining MongoDB every month, it’s no wonder that leading organizations, like Samsung and Toyota, trust MongoDB to build next-generation, AI-powered applications.

Headquartered in New York, with offices across North America, Europe, and Asia-Pacific, MongoDB has more than 29,000 customers, which include some of the largest and most sophisticated businesses in nearly every vertical industry, in over 100 countries.

MongoDB is growing rapidly and seeking a Data Engineer to be a key contributor to the overall internal data platform at MongoDB. You will build data-driven solutions to help drive MongoDB's growth as a product and as a company. You will take on complex data-related problems using very diverse data sets.

We are looking to speak to candidates who are based in Gurugram for our hybrid working model.

Our ideal candidate has experience with

Several programming languages (Python, Scala, Java, etc.)
Data processing frameworks like Spark
Streaming data processing frameworks like Kafka, KSQL, and Spark Streaming
A diverse set of databases like MongoDB, Cassandra, Redshift, Postgres, etc.
Different storage formats like Parquet, Avro, Arrow, and JSON
AWS services such as EMR, Lambda, S3, Athena, Glue, IAM, RDS, etc.
Orchestration tools such as Airflow, Luigi, Azkaban, Cask, etc.
Git and GitHub
CI/CD Pipelines

You might be an especially great fit if you

Enjoy wrangling huge amounts of data and exploring new data sets
Value code simplicity and performance
Obsess over data: everything needs to be accounted for and be thoroughly tested
Plan effective data storage, security, sharing and publishing within an organization
Constantly think of ways to squeeze better performance out of data pipelines

Nice to haves

You are deeply familiar with Spark and/or Hive
You have expert experience with Airflow
You understand the differences between different storage formats like Parquet, Avro, Arrow, and JSON
You understand the tradeoffs between different schema designs like normalization vs. denormalization
In addition to data pipelines, you’re also quite good with Kubernetes, Drone, and Terraform
You’ve built an end-to-end production-grade data solution that runs on AWS

Responsibilities
As a Data Engineer, you will

Build large-scale batch and real-time data pipelines with data processing frameworks like Spark on AWS
Help drive best practices in continuous integration and delivery
Help drive optimization, testing, and tooling to improve data quality
Collaborate with other software engineers, machine learning experts, and stakeholders, taking up the learning and leadership opportunities that arise every day

Skills

Python
Java
CI/CD
Database
Spark Streaming
AWS Lambda

Education

Master's Degree
Bachelor's Degree

Job Information

Job Posted Date

Dec 17, 2024

Experience

4 to 7 Years

Compensation (Annual in Lacs)

Best in the Industry

Work Type

Permanent

Type Of Work

8-hour shift