Site Reliability Engineer III
ACV
Who we are looking for:
You will work alongside the quality, infrastructure, and analytics teams to build and ship new features related to our data products. We value practical software experience in addition to computer science fundamentals and training. The technologies you are familiar with are less important to us than your ability to solve complex software problems and apply software engineering best practices. As a Site Reliability Engineer at ACV Auctions, you will develop, write, and modify code. You will work alongside software and production engineers to build and ship new features that optimize operational efficiency and drive growth.
What you will do:
- Operate, maintain, and administer solutions that contribute to the operational efficiency, availability, and visibility of customer infrastructure.
- Plan maintenance activities, design documentation, and standard procedures.
- Provide Root Cause Analysis reports for outages and incidents (ITIL - Problem Management).
- Observe and provide feedback on the current state of the client's infrastructure, and identify opportunities to improve resiliency, reduce the occurrence of incidents, and automate repetitive administrative and operational tasks.
- Contribute to, improve, and maintain team documentation about client systems and infrastructure, procedures, policies, and schedules.
- Gather and document information about client environments through audit activities and analyze the information to identify opportunities for improvement and application of best practices.
- Work collaboratively with teammates to contribute to the continuous improvement of our working culture.
- Act as a technology leader for clients, as well as drive client discussions on technology road maps.
- Participate in an on-call rotation in an escalation capacity.
- Perform additional duties as assigned.
What you will need:
- BS degree in Computer Science OR a related technical discipline OR equivalent practical experience.
- Minimum 4 years of experience programming in at least two of the following: Python, Java, C#, or Go.
- Minimum 4 years of experience working with continuous integration and build tools such as Jenkins, building deployment pipelines, etc.
- Experience building/managing infrastructure deployments on Amazon Web Services and/or Google Cloud Platform.
- Deep knowledge of day-to-day tools and how they work, including deployments, k8s, monitoring systems, and testing tools.
- Highly proficient in version control systems, including trunk-based development, multiple release planning, cherry-picking, and rebasing.
- Self-sufficient debugger who can identify and solve complex problems in code.
- Deep understanding of major data structures (arrays, dictionaries, strings).
- Strong problem-solving skills, including the ability to independently diagnose and resolve issues. This involves approaching challenges with confidence and resourcefulness, and using your expertise to explore solutions thoroughly.
- Familiarity with IaC tools such as Terraform and deployment tools such as Helm.
- Experience building, maintaining, and scaling Kubernetes clusters for production workloads.
Compensation: $119,000.00 - $149,000.00 annually. Please note that final compensation will be determined based upon the applicant's relevant experience, skillset, location, business needs, market demands, and other factors as permitted by law.
No immigration or work visa sponsorship will be provided for this position.