RESPONSIBILITIES
- Maintain smooth operations of multi-user computer systems consisting of Linux-based application and license servers, virtual machines, and GP/GPU cluster-based systems.
- Coordinate with network administrators.
- Set up administrator and service accounts, maintain system documentation, tune system performance, install system-wide software, and allocate mass storage space.
- Develop and monitor policies and standards for allocating computing resources.
- Participate in the installation, integration, acceptance testing, and ongoing maintenance of HPC systems and their software environment.
- Assist with forecasting resource limitations and recommend ways to increase resource efficiency through proper scheduling and load-balancing techniques.
QUALIFICATIONS/EXPERIENCE
- Bachelor's degree and at least 3 years of experience (additional experience may be substituted for the degree)
- DoD 8570 certification: Security+ and an OS certification (Linux+, Red Hat, etc.). A 90-day waiver may be provided to obtain the OS certification.
- General knowledge of Linux and Unix operating system concepts, as well as systems administration experience
- Experience with compiling, installing, and porting software.
- Experience with storage architectures: SAN, SAS, FC, SATA, bandwidth, performance
- Familiarity with multi-factor authentication platforms and solutions, and with identity management technologies such as OpenID, LDAP, and Kerberos.
- Experience with security implementations using multi-factor authentication, PKI, or Kerberos, and with Unix OS hardening to DoD STIG standards.
- Experience programming or troubleshooting Python code
- Experience with supporting Apache Web Server
- Experience with Zabbix
- Experience with Red Hat Satellite or Red Hat Identity Management
- Experience with BIND DNS
- Experience with Ansible
- Experience with GitLab
ABILITIES REQUIRED
- Strong problem-solving skills; capable of analyzing and resolving hardware and operating system problems.
- Ability to develop and maintain documentation on system administration procedures for routine and complex tasks
- Ability to create and maintain Information Assurance (IA) compliance documentation of the Information System
- Ability to understand application scaling issues related to problem resolution and algorithm choice.
- Ability to manage development of appropriate application benchmarks, analyze results and determine optimal configurations for processor type/speed, size of memory/cache, and memory interconnect fabric for customer problem domains.
- Excellent verbal and written communication skills for working with all levels of management to plan and organize site management projects.
- Strong detail-oriented skills; able to multi-task and change priorities quickly.
- Able to meet precise standards.
- Strong team player.
- Ability to work a designated schedule and maintain attendance and punctuality.
- Ability to occasionally work after hours and/or respond to emergency situations for problem resolution.
- Ability to obtain and maintain a top-secret security clearance.
Sentrillion is an EEO Employer / Protected Vet / Disabled