Monitoring Solution Architect
• Experience with various Linux distributions (Red Hat/Debian/CentOS), Windows Server (2003+), and ESXi 5.5+
• File system concepts (inode, paging, swapping, clustering, logical partitions, etc.)
• Experience with various monitoring solutions (Nagios, Tivoli, SolarWinds)
• Experience providing support in medium to large scale infrastructure environments (change management processes, etc.)
• Experience using ticketing systems.
• Experience with backup software and strategies.
• Familiarity with RAID disk technologies & SAN infrastructure
• Familiarity with network infrastructure.
• Ability to work well with the team and on various projects (shared team environment)
• Ability to work under pressure and remain decisive.
• Ability to conceptualize problems
• Willingness to learn new things & technologies (automation tools, OS related, storage related)
• Communicative competence (pro-active)
• Should be able to mentor junior members of the team.
• Should act as a role model
• Flexible when workflows change
• Fluent English (spoken and written)
• Team player
• A very positive attitude towards life
• Highly motivated
Nice to have:
• ITIL Certified
• Experience with Linux automation tools (e.g. UrbanCode Deploy, Puppet, Chef, Docker, etc.)
• Proficient in at least one programming/scripting language (Bash, Python, Ruby, etc.)
• Experience with backup tools such as Tivoli Storage Manager (TSM)
• Experience with system monitoring tools and analyzing results
• Bachelor's degree with a technical major such as Engineering or Computer Science
As a Shared Services Monitoring Solution Architect, you will be responsible for the design and implementation of various monitoring tools (IBM Tivoli suite, Nagios, Icinga, SolarWinds, etc.).
Your tasks may include but are not limited to:
• Monitoring solutions design & implementation
• Responsible for uptime, performance, reliability, scalability, security and high availability of infrastructure machines
• Troubleshoot performance issues, OS configuration problems and hardware failures, and apply fixes
• Create and implement change requests for incidents/problems as needed
• Support incidents and perform root cause analysis for service interruptions. Fix the root cause, not the effect.
• Ensure compliance with security guidelines
• Create scripts in order to automate certain tasks
• Structure, draft and review technical documentation, document technical requirements and processes