Software Engineer, Triage
Site Reliability Engineering is a newly created engineering discipline in the company. Its charter is to:
- Triage all production issues and devise remedies that stabilize the system until engineering can roll out a proper remediation
- Devise metrics that provide baselines engineering can work against to ensure that our platform achieves 95% uptime
- Build tools and infrastructure that promote early detection of production failures, leading to a stellar customer experience
Our work is to drive the safety, health, and uptime of our platform, and the ability to remedy unforeseen problems. By taking on some of the complexity of scaling and maintaining uptime in distributed systems, SRE allows development teams to focus on feature development instead of the nuances of achieving and maintaining service level commitments.
About the Opportunity
We’re looking for a creative and driven individual who can spearhead our effort to push “outside the box” infrastructure implementations that will have a tremendous impact on our platform’s stability and scalability.
What You’ll Be Doing:
- Work closely with our stakeholders to triage and resolve our production issues quickly
- Partner with stakeholders to devise metrics that identify areas for improvement and reduce our mean time to issue resolution
- Suggest architectural changes that improve the reliability of our platform and maintain our market leadership
- Collaborate with Product, Engineering, and DevOps to propose, develop, and maintain application performance monitoring tools that forecast the next problem spot and keep us within our service level commitments
- Automate internal processes and develop bots to keep our Product and Engineering focused on delivering the next technological breakthrough for our customers
What We Look For In You:
- 3+ years of full-stack development experience, with proficiency in Java and/or JavaScript (React.js)
- 3+ years of experience integrating and deploying application performance monitoring tools
- 3+ years of experience with large-scale distributed environments and high traffic volume (100 million user level)
- Demonstrated scripting skills in Python or similar
- Good command of Linux environment and cloud platform services (AWS, GCP or other)
- Understanding of DNS, SSL/TLS, and how traffic on IP networks establishes end-to-end security and trust
- Open to pager duty
- Self-motivated & self-organized, with the ability to “think outside the box”
Nice to Haves:
- Experience administering Linux systems and working with configuration management (Ansible, Terraform, Docker)
- Working understanding of the TCP/IP network stack
- Familiarity with AliCloud
- Strong passion for Bitcoin and other cryptocurrencies
Highlights of Perks and Benefits:
- Market competitive total compensation package
- Comprehensive insurance package including medical, dental, vision, disability & life insurance (Company pays 100% for employee/80% for dependents)
- 401(k) with company contribution
- Flexible PTO policy, company paid holidays, and flexible hours
- UberEats Program
- Paid Parental Leave
- Employee Referral Bonus Program paid in BTC
- Company Donation Match
- More surprises when you join!
Who We Are
Okcoin is one of the world’s largest and fastest growing cryptocurrency exchanges. We help millions of people buy and sell bitcoin and over 30 other crypto assets every day, but our work is a whole lot more than that. We’re building an inclusive future of finance, one that opens new opportunities for everyone to build financial literacy, store value, and build wealth.
Ready to help the next billion people experience the future of finance with us? Come on board. We have offices in San Francisco, Miami, Malta, Hong Kong, Singapore and Japan. Even though this role is listed in San Francisco, we are remote friendly and believe in you working wherever you work best.
Okcoin Statement:
Okcoin is committed to equal employment opportunities regardless of race, color, genetic information, creed, religion, sex, sexual orientation, gender identity, lawful alien status, national origin, age, marital status, and non-job related physical or mental disability, or protected veteran status. Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

