Senior Site Reliability Engineer – Linux-Focused
Location: Hybrid - McKinney, TX 75070
Type: Contract to Hire. The client will not provide visa sponsorship. No C2C, please.
Our client is actively hiring a Senior Site Reliability Engineer (SRE) to join their growing team. This is a fantastic opportunity for a seasoned Linux Systems Administrator to transition into a formal SRE role and contribute directly to product development and platform scalability.
What You’ll Do
- Own the reliability, availability, and performance of production infrastructure.
- Collaborate closely with developers to design, build, and maintain scalable systems.
- Automate infrastructure provisioning and management using tools like Terraform, Ansible, and Kubernetes.
- Build observability into systems and improve it using monitoring and logging tools.
- Participate in incident response and post-mortem analysis.
- Improve system reliability through rigorous testing, automation, and continuous delivery practices.
Ideal Background
- Strong Linux system administration experience, ideally with hundreds or thousands of servers in production.
- Solid experience with one or more of the following:
  - Kubernetes
  - Terraform
  - Ansible
  - Monitoring/observability stacks (e.g., Prometheus, Grafana, ELK)
- Comfortable troubleshooting complex issues in distributed environments.
- Familiarity with on-prem hardware management, particularly systems deployed at customer sites, is a major plus.