Site Reliability Engineer

Employment Type: Full-Time

Industry: Miscellaneous








Our client is looking to hire Site Reliability Engineers at their Dania Beach, FL and Boston, MA locations.
Site Reliability Engineers are a cross between system and software engineers who are responsible for all operational aspects of the ecommerce platform.
The team is responsible for designing, building, monitoring, and maintaining the infrastructure of our internet-facing and internal services.
We're looking for engineers who want to be a part of developing infrastructure software, maintaining it, and scaling our technology stack.
Come help us build a bigger and better company as a Site Reliability Engineer.
You will be part of a small family within the company that has a huge impact on its incredible growth.
Ideal candidates will possess the ability to discuss complex technical concepts with a diverse audience across all areas of the organization.
They will remain calm under pressure and always strive to add structure to high-pressure, fast-paced tasks or projects.
What you'll do:
Focus on service stability and reliability by working with application owners to set SLOs, error budgets, and backup and DR strategies
Define application monitoring and alerting strategy
Perform capacity planning and production readiness assessment
Embed with product teams during the design and requirements phase of new product development through to initial production launch
Identify requirements for other operational teams (release engineering, automation, etc.) during application development phase
Be a technology and DevOps evangelist for the rest of the company
Participate in the on-call rotation for level 3 support escalations
What you'll need:
At least 5 years of experience working in an SRE role or similar.
Hands-on experience with orchestration and system configuration tools such as Ansible, Puppet, Chef, Terraform, etc.
Expertise in building and maintaining highly available applications, including redundancy, failover, scalability, monitoring, and performance.
Strong experience with virtualization, monitoring and automation.
Software development experience (both scripting and programming languages).
Experience working with the open source community (troubleshooting, patch submission, etc.).
5+ years of demonstrated Linux system administration experience.
Experience with CI tools such as Bamboo, Jenkins, Hudson.
Ability to organize, troubleshoot and continuously learn.
Previous experience working within controls such as SOX, PCI, etc.
This position requires travel.
- provided by Dice
