Java Tech Lead
About the Role
We are seeking a highly skilled Java Tech Lead with expertise in Apache Spark and AWS EMR to join our team. This role focuses on enhancing the stability, scalability, and cost-efficiency of UserSession, one of the largest and most critical datasets at Indeed. The ideal candidate will lead efforts to optimize data processing, integrate new datasets, and ensure seamless data availability for key reports.
Key Responsibilities
- Lead the development and optimization of the UserSession data pipeline to improve stability and reduce operational costs.
- Design and implement solutions to integrate new datasets that align with evolving business and analytical needs.
- Ensure high performance and scalability of Apache Spark workloads on AWS EMR.
- Work closely with data engineers, analysts, and business stakeholders to understand requirements and translate them into efficient technical solutions.
- Identify bottlenecks and drive performance improvements in data processing and storage.
- Advocate for best practices in coding, testing, and deployment to ensure high-quality deliverables.
- Mentor junior developers and contribute to team knowledge-sharing initiatives.
Requirements
- Solid experience in Java development, with a strong understanding of multi-threading, concurrency, and performance tuning.
- Proven expertise with Apache Spark, including performance optimization and large-scale data processing.
- Hands-on experience with AWS EMR and a deep understanding of distributed computing.
- Strong knowledge of big data processing architectures and data lake best practices.
- Experience with SQL and NoSQL databases for handling large datasets.
- Familiarity with CI/CD pipelines, version control (Git), and modern deployment strategies.
- Excellent problem-solving skills and a proactive approach to troubleshooting.
- Ability to work in a collaborative environment and communicate technical concepts effectively.
Nice to Have
- Experience with other AWS services such as S3, Glue, Redshift, and Lambda.
- Knowledge of Scala or Python for data engineering tasks.
- Exposure to machine learning pipelines or analytics-driven environments.