Posted October 01, 2021
Web3 Foundation

Site Reliability Engineer

Zug, Switzerland / Remote EU | Remote | Full Time

This position is based in Zug, Switzerland or Berlin, Germany, but for exceptional candidates, we may consider remote work in Europe.

Responsibilities

  • Participate in on-call rotation, failure resolution, post-mortem analysis and prevention through automation.
  • Maintain the 1000 validator program for Polkadot and Kusama.
  • Assist teams in making the platform components production-ready and provide support on IT-related issues.
  • Take ownership of the infrastructure-as-code components that make up our platform, adapting them as requirements evolve.
  • Define automated tools, including CI/CD pipelines, that help products adapt promptly to end-user requirements.
  • Continuously improve the observability of the system and the feedback loops for getting information about potential problems before they happen.
  • Design automated disaster-recovery mechanisms.

Requirements

  • Experience designing and maintaining scalable, resilient, performant and observable systems.
  • On-call experience: participation in ops-duty rotations, incident response and post-mortem analysis and prevention.
  • A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
  • Solid background in software development and experience with one or more of the following: Node.js, TypeScript, Rust, Python and Bash scripting.
  • Experience with, or willingness to learn about, cloud-native technologies: Kubernetes as the platform, Helm as the package manager, and Prometheus and related technologies as the monitoring stack, all running on different providers such as Google Cloud Platform, AWS, DigitalOcean and Azure.

A plus

  • Continuous Delivery experience.
  • Familiarity with database administration and a good understanding of networking.
  • Prometheus, Alertmanager and Grafana: create service monitors, write alert rules, create dashboards and panels.
  • Experience using Terraform and Ansible with a test-driven infrastructure approach.
  • Interest and background in decentralized technologies, especially blockchain.

Benefits

  • Competitive compensation and employee benefits.
  • Regular company retreats at unique locations around Europe.
  • Opportunity to work in a multinational, high-performance team with diverse backgrounds (e.g. physics, computer science, machine-learning algorithm design, legal, financial products, management consulting, marketing and advertising).

To apply for this position, please answer a few questions in the application form and submit your CV and a cover letter telling us a bit about yourself and your motivation to join us.

For more information about us, visit us on

This listing expired on Nov 15. Applications are no longer accepted.
