Senior Site Reliability Engineer at Talent Groups

Senior Site Reliability Engineer – Linux Focused

Location: Hybrid - McKinney, TX 75070
Type: Contract to Hire - client will not sponsor. Please, no C2C.

Our client is actively hiring a Senior Site Reliability Engineer (SRE) to join their growing team. This is a fantastic opportunity for a seasoned Linux Systems Administrator to transition into a formal SRE role and contribute directly to product development and platform scalability.

What You’ll Do

Own the reliability, availability, and performance of production infrastructure.
Collaborate closely with developers to design, build, and maintain scalable systems.
Automate infrastructure provisioning and management using tools like Terraform, Ansible, and Kubernetes.
Build and improve system observability using monitoring and logging tools.
Participate in incident response and post-mortem analysis.
Improve system reliability through rigorous testing, automation, and continuous delivery practices.

Ideal Background

Strong Linux system administration experience, ideally with hundreds or thousands of servers in production.
Solid experience with one or more of the following:
- Kubernetes
- Terraform
- Ansible
- Monitoring/observability stacks (e.g., Prometheus, Grafana, ELK)
Comfortable troubleshooting complex issues in a distributed environment.
Familiarity with on-prem hardware management, particularly systems deployed at customer sites, is a major plus.
