Computer Systems Engineer Job Description Template
Our company is looking for a Computer Systems Engineer to join our team.
Responsibilities:
- Oversee capacity planning for our clusters;
- Participate in a 24×7 on-call rotation;
- Build CPU and GPU clustered compute systems;
- Build cool stuff;
- Work directly with application developers to help investigate upgrades, system tweaks, and next-generation hardware;
- Track key metrics and logs;
- Design, implement, and support our internal and cloud systems.
Requirements:
- Familiar with container clustering (Kubernetes/K8s, Swarm, etc.);
- Familiar with CUDA and TensorFlow workloads;
- Experience with larger HPC clusters (10,000 cores);
- Familiar with job and resource scheduling managers (Slurm preferred, LSF, etc.);
- Expert-level knowledge of virtualization platforms (vSphere, Xen, Docker, or KVM);
- 10+ years of experience and ability to work with little or no supervision;
- Ability to script in any of the following: Perl, Python, Ruby, or Bash;
- Familiar with GPU usage in compute clusters.