Site Reliability Engineer

about 2 years ago
Full time role
Broomfield, CO, US

Job Description

AMP Robotics is a pioneer and industry leader in artificial intelligence (AI), robotics, and infrastructure for the waste and recycling industry. We’re working to reimagine and actively modernize recycling by applying AI and automation to increase recycling rates and economically recover recyclables reclaimed as raw materials for the global supply chain. We build and deploy cutting-edge technology solutions that solve many of the central challenges of recycling to make it more efficient, cost-effective, scalable, and sustainable. 

AMP is headquartered in Louisville, Colorado, and the Denver Post named it one of the 2021 Top Workplaces in Colorado. We’re fostering an environment where talented, driven individuals can grow and create impact. We seek unconventional thinkers to join our mission to enable a world without waste; at AMP, your contributions have meaning and can spur change. With backing from top-tier investors and national recognition including Fortune’s Impact 20, Fast Company’s Most Innovative Companies, and Forbes’ most promising AI companies, we’re always seeking ways to better our operations, raise the bar on innovation, and collaborate and improve each day in what we do. Learn more at AMPRobotics.com.

AMP Robotics is hiring a Site Reliability Engineer reporting to the Site Reliability Engineering Manager. The role focuses on turning raw field data into actionable and transparent intelligence, allowing efficient scaling and improved reliability of AMP fleet devices and facilities. As a Site Reliability Engineer, you will build automated tooling to proactively monitor device health and technical process health, and lead efforts in the commissioning and startup of new AMP facilities.

The Site Reliability Engineering team will work closely with other engineering groups (DevOps, Software Engineering, Facility Ops, QA Engineering, etc.) to define acceptable error rates and a common performance language. As an Ops-facing and reliability-focused group, Site Reliability Engineers will focus heavily on reducing product downtime, predictive modeling and proactive alerts for device-level failure, standardized root cause analysis, and process engineering best practices.

Site Reliability Engineers function as the primary liaison between Ops and Sustaining Engineering efforts, increasing interdisciplinary knowledge of product designers, sales teams, project managers, service teams, and production groups with respect to how AMP devices and software actually function in the wild. The Site Reliability Engineer will be responsible for answering the question “Is the fleet operating within defined performance metrics, without interruption… and can we prove it?”

As a member of the Site Reliability Engineering team, you will:

  • Lead efforts to commission and start up new AMP facilities across the US
    • Develop and execute both Software Acceptance Testing and Functional Acceptance Testing during plant startups.
    • Drive lessons learned from each startup back to design engineering and asset deployment groups.
    • Facilitate the transition from startup to typical operations through training plant operators and facility staff on preferred process conditions and equipment operation.
  • Work with Software and DevOps groups to automate and execute software rollouts to 200+ physical devices operating in production facilities around the world.
    • Perform before-and-after rollout analysis - validating and verifying application-specific performance parity or improvement.
    • Minimize eyes-on-glass time during fleet-wide rollouts.
    • Stay hyper-focused on three-nines (99.9%) customer uptime.
    • Sustain AMP’s competitive advantage by increasing Software Team feature velocity - become the most efficient portion of the product update process.
  • Develop internally facing proactive alerting, system state dashboards, and historic reporting capabilities.
    • Create and maintain a proactive alerts library - fully defining edge cases and using historic data to develop appropriate severity and priority thresholds.
    • Automate injection of alerts to appropriate reporting platforms (Salesforce, Slack, Jupyter notebooks, etc.) - allowing appropriate teams to act with urgency to resolve issues.
    • Manage real-time Grafana dashboards for company-wide access to device-specific performance metrics.
  • Act as the connection point between Production and Engineering teams, ensuring production personnel can effectively deploy software onto AMP devices.
    • Automate warehouse commissioning processes where possible.
    • Maintain documentation associated with device-level commissioning.
    • Provide regular reports to production groups on commissioning deviations and areas for improvement in the commissioning process.
  • Provide standardized new application and new product feedback
    • Join the Asset Deployment Team for boots on the ground installations of new applications or new products - documenting anomalies, product improvements, or likely points of failure.
    • Act as a product team resource early in the design process - participating in design reviews and building reliability monitoring processes in parallel with new product design.
    • Perform auditing of startup and calibration parameters - providing the Asset Deployment Team with automated feedback about the correctness of new installations.
  • Examples of associated KPIs
    • Cost to Serve
    • Customer NPS Score
    • Rollout Execution Metrics
    • ECO execution speed

The successful candidate will have:

Required:

  • 3+ years of experience with industrial controls & automation and/or process engineering
    • Startup experience, including SAT (Software Acceptance Testing) and FAT (Functional Acceptance Testing).
  • 2+ years of experience programming with a scripting language like Python (hobby-level projects considered).
  • 2+ years of experience with Linux system administration
  • Automated build program experience (Ansible Tower, Jenkins, GitLab CI, etc.)
  • Good working familiarity with Docker and Docker Compose
  • Equally comfortable behind a keyboard or holding a wrench

Preferred: 

  • Experience in the GCP ecosystem
  • Examples of technical writing
  • Grafana experience
  • Use of multiple database technologies - MySQL, Postgres, Timescale, InfluxDB
  • Networking exposure - TCP/IP networking (routing & switching, VPNs, managing VPCs, running and interpreting packet captures)
  • Experience with a wide variety of both batch and continuous process controls

Bonus:

  • Interest in/experience with Machine Learning/Artificial Intelligence and/or robotics
  • Ability to read and/or write any adjacent modern programming language
  • Facility operation experience

Education:

  • Bachelor’s degree in any engineering discipline

Experience: 

  • 2+ years of experience in Process Engineering, Controls and Automation Engineering, DevOps, TechOps, Sustaining Engineering, Reliability Engineering, and/or Site Reliability Engineering

Working Conditions/Physical Demands: 

The physical demands described here are representative of those that must be met by an employee to successfully perform the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

  • Prolonged periods of sitting at a desk and working on a computer.
  • Must be able to lift up to 15 pounds at times.

Working Location(s): 

  • Louisville, Colorado

Travel Requirements: 

  • Up to 40%

Affirmative Action/EEO Statement: 

AMP Robotics is an equal opportunity employer. In order to provide equal employment and advancement opportunities to all individuals, employment decisions at the Company will be based on job openings, merit, qualifications, and abilities as required by the position. The Company does not discriminate, and does not permit its employees to discriminate against other employees, applicants, customers, or independent contractors because of: 

  • Race 
  • Color 
  • Religion 
  • Sex 
  • Sexual orientation (including gender identity or expression, including a person's orientation toward heterosexuality, homosexuality, bisexuality, or transgender status, or PeopleCare’s perception thereof) 
  • Pregnancy, childbirth, and related conditions 
  • Marital status 
  • National origin 
  • Citizenship 
  • Military or veteran status 
  • Ancestry 
  • Age (40 or over) 
  • Disability (including genetic information) 
  • Or, any other consideration made unlawful by applicable laws. 

Equal employment opportunity will be extended to all persons in all aspects of the employer-employee relationship, including recruitment, hiring, upgrading, training, promotion, transfer, compensation, benefits, discipline, layoff, recall, and termination. 

Other duties:

Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities, and activities may change at any time with or without notice. 

We recognize that there is more to work than the day-to-day responsibilities. In addition to a collaborative, high-performing team environment, we’re pleased to offer competitive base salaries; medical, dental and vision insurance; a 401(k) plan; paid time off and sick time; flexible work hours; and the opportunity to quickly accelerate your learning and growth. 

Salary & compensation information: $85,000 - $105,000

Benefits information: 

Full-Time / Salaried Employees 

  • Medical – The company covers up to 85% of the premium of UHC Gold Choice Plus POS 1250 BP9K. Employees pay the difference in premium if they select a more expensive plan. Up to 25% for dependents.
  • Group Life, AD&D – 100% paid. 
  • Long Term Disability – 100% paid. 
  • Dental Insurance – 75% paid. 
  • Vision Insurance* – 75% paid. 
  • Employee Assistance Program – Provided through United Healthcare.
  • Paid Vacation Leave – Accrues at a rate of ~4.31 hours (0.54 days) per pay period (2 weeks) starting day 1. Unused PTO carries over each year with a 1-year limit.
  • Paid Sick Leave – 64 hours per year, given in full on start date, refreshes on anniversary.
  • 401(k) retirement plan (non-matching). 
  • Seven (7) paid holidays – 7 company-designated and 2 floating holidays.
  • Referral bonuses for staff positions. 

 
