Thank you for your interest in joining our Web Operations team at Creative Market, where you'll help independent creators make a living doing what they love! Every day, thousands of customers use Creative Market to buy ready-to-use design content (fonts, graphics, templates, website themes, 3D models, etc.) directly from top designers around the world.
Join the team to help us continue our meteoric growth, and your work will be used by millions of people every month.
As an SRE, you'll help maintain our fantastic uptime, wave the Infrastructure as Code flag, work with our product engineering teams to deliver great new features, and work closely with our engineering leadership on special projects.
This is a full-time position.
Remote opportunities are available, but you must be able to work in the United States.
About our Web Operations team
We're deployed entirely in AWS, so there's no hardware to manage.
We auto-scale our app servers and have many user-driven deploys per day.
We use Chef and an in-house tool (written in Go) for deploying the app, and Scalr for orchestration.
We use Linux, AWS, GitHub, New Relic, Datadog, Pingdom, PagerDuty, Loggly, PackageCloud, and Distil.
We use CircleCI for CI/CD of our Chef code.
Things you like: automation, stability, security, high availability, resiliency, simplicity, and performance.
You have experience deploying an application in a cloud environment.
You have experience monitoring as many aspects of the application and infrastructure as possible, and alerting when anomalies arise.
You're a programmer, proficient in a language such as C, C++, Python, Perl, PHP, or Ruby.
You're familiar with Chef, and have a good understanding of the LAMP stack, Elasticsearch, networking, and databases.
You communicate clearly, quickly, and often.
You speak up when something doesn't look right, and jump in to help others when needed.
You also break down and explain complex technical concepts to others.
You get buy-in from your team when presenting a new idea.
You're curious and like to play with new tech, but know when to stick with the tried-and-true.
You like to know why something succeeded or failed, and generally want to make things better than they were before.
You're a champion for good security practices and you understand the balance between security, performance, and ease of use.
You write tests to ensure that your code doesn't suffer from regressions.
You jump into action when outages occur and work hard to get services restored.
You stay organized and prioritize your daily tasks.
Your soft skills are as honed as your hard skills.
You're a team player who likes to work toward a common goal.
You strive for simple solutions to problems.
The fewer moving parts the better.
Participate in our daily stand-up calls.
Support application developers by adding features and functionality to our application servers (PHP), such as new modules and config changes.
Respond to, remedy, and prevent site performance incidents and outages.
Think through ways to prevent or lessen the impact of performance degradations and outages.
Take part in the on-call rotation (we rarely get paged) and escalate to other team members when needed.
Participate in rare off-hours cutovers and migrations.
Build tools to automate parts of your day-to-day to make your job easier and tasks less prone to failure.
Document processes and the reasoning behind the design decisions you make.
Build, support, and maintain our internal API infrastructure.
Track application performance and propose ways to improve key areas of the stack / traffic flow.
Troubleshoot issues across the infrastructure.
Set engineering standards, contribute to runbooks, and hold us to a higher standard.
Put our infrastructure to the test: run fire drills and disaster recovery tests, where we manually trigger infrastructure failures to test our overall resiliency.
Lots of AWS experience.
You know its quirks, best practices, and dos & don'ts.
Familiar with the AWS product family.
Medical, dental, and vision benefits
Generous paid time off