Site Reliability Engineer | Manhattan, NY, USA | In-Office - Selby Jennings : Job Details

Selby Jennings

Job Location : New York, NY, USA

Posted on : 2025-01-30T11:24:25Z

Job Description :


Posted 1 day ago | In-Office | Permanent | USD 200,000 - USD 400,000 per year

We are seeking a Site Reliability Engineer to join our Infrastructure team. You'll manage a diverse technology stack, including Kubernetes, virtualization, and CI/CD. Proficiency in automation frameworks and Infrastructure as Code is essential. Responsibilities will include designing secure platforms, supporting engineers, and enhancing infrastructure reliability.

Responsibilities:

Design and maintain a robust and secure Kubernetes and GitOps CI/CD platform capable of handling large data volumes and diverse workloads.
Assist engineers using the platform by providing clear communication, advice, troubleshooting, maintaining documentation, and implementing feedback-driven improvements.
Promote and implement Infrastructure as Code principles and best practices.
Lead projects from design through to implementation, testing, monitoring, documentation, and support.
Automate processes to reduce manual work in large, distributed systems.
Work independently and with teams to enhance the reliability, availability, and performance of the infrastructure.

Skills:

Proficient in writing and maintaining applications and APIs in languages such as Python, Go, or Shell.
Extensive experience with cloud-native and containerization technologies like Kubernetes and Docker.
Strong knowledge of Linux systems.
Experience with Infrastructure as Code and configuration management tools such as Terraform, Puppet, or Ansible.
Understanding of network technologies, server virtualization, and storage.
Familiarity with observability systems like Prometheus, Grafana, ELK, or Jaeger.
Experience with distributed data platforms such as Kafka, Flink, or Airflow.
Self-starter with the ability to quickly grasp concepts, implement new ideas, and think creatively.
Focused on enhancing system availability, security, and resilience through testing, monitoring, standardization, and automation.
Ability to clearly explain the rationale behind best practices.
Capable of building positive and collaborative relationships with colleagues across teams and locations.
