Job Description:
- In this role, the candidate will be a hands-on leader across data engineering functions including schema design, data movement, data transformation, encryption, and monitoring: all the activities needed to build, sustain, and govern big data pipelines.
Responsibilities:
- Own development of a large-scale data platform, including an operational data store, a real-time metrics store and attribution platform, and data warehouses and data marts for advertising planning, operations, reporting, and optimization.
- Collaborate with the wider team and maintain system documentation.
- Maintain next-gen, cloud-based big data infrastructure for batch and streaming data applications, and continuously improve its performance, scalability, and availability.
- Advocate engineering best practices, including the use of design patterns, CI/CD, code review, and automated integration testing.
Required Education, Experience, Skills and Training:
- Bachelor's degree or above in Computer Science or Electrical Engineering.
- 5+ years of professional programming experience in Scala, Java, and SQL.
- 5+ years of experience developing with AWS technologies, including S3, Glue, EC2, and Kinesis.
- 5+ years of big data design experience with technical stacks such as Spark, Flink, Druid, ClickHouse, SingleStore, Snowflake, Kafka, NiFi, and AWS big data technologies.
- Proven track record with cloud infrastructure technologies, including at least two of Terraform, Kubernetes, Spinnaker, IAM, and ALB.
- Experience building highly available and scalable services for public consumption.
- Experience processing data at petabyte scale.
- Strong knowledge of system design, application design, and architecture.