Capital One Network Problem Manager in Plano, Texas

West Creek 1 (12071), United States of America, Richmond, Virginia

At Capital One, we’re building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.

Network Problem Manager

The Network Problem Manager is responsible for driving high-priority post-incident root cause analysis and problem remediation, ensuring that permanent fixes are implemented in the production environment to improve the availability of critical systems. The individual will lead various technical teams in determining root cause and in working through the problem management phases of the service management lifecycle, applying best practices from the ITIL framework.

We are seeking a candidate with strong technical and organizational skills to work on a Network Operations Problem Management team. The ideal candidate will have a strong understanding of core networking technologies, including LAN, WAN, routing, and switching.

Job Description:

  • Analyze incidents and problems and drive process or standards changes to proactively prevent the occurrence of further incidents and problems.

  • Drive problem resolution and provide status updates to key stakeholders and senior leadership.

  • Prepare, conduct, and report on major problem post-mortems and present findings to the customer.

  • Research network circuits, switches, routers, and other network components, and report any issues to the manager.

  • Maintain the customer’s knowledge base of problem resolutions (KEDB) and communicate with the service desk team and other support teams.

  • Drive efforts to improve overall application stability and availability for applications and supporting infrastructure by ensuring problem resolution.

  • Monitor metrics and drive continuous infrastructure and application improvement efforts across teams to achieve customer performance goals.

  • Facilitate and coordinate technical problem review meetings, including leading investigations of critical incidents and managing root cause analysis across technical teams.

  • Work with process owners and stakeholders to re-engineer processes to be simple, nimble, repeatable, measurable, achievable and continuously improved.

  • Suggest comprehensive, actionable metrics that promote positive behavioral changes; interact with IT development, operations, quality, and product management groups to obtain and exchange information.

  • Communicate and manage expectations during problem resolution and act as a point of escalation.

  • Possess strong knowledge of network analysis and packet capture tools such as Wireshark, OPNET, Sniffer, and Compuware.

  • A strong knowledge of Wireless, Firewalls, Proxies, and Cloud networking is highly desired.

  • Candidates must also be able to interface with multiple teams from business areas, network engineering, and management as a team representative on major incident bridges.

  • The candidate will be part of a team that performs network monitoring and troubleshooting and represents the team on major incident bridges.

  • Perform system and network analysis using well-defined toolsets for proactive monitoring, troubleshooting, and incident response (packet capture, application performance management, Syslog, NetFlow, etc.).

Job Responsibilities:

  • Identifies trends and potential Problem sources (by reviewing Incident and Problem analysis).

  • Prevents the replication of Problems across multiple systems.

  • Reviews the efficiency and effectiveness of the Problem control process.

  • Monitors the effectiveness of error control and makes recommendations for improvements.

  • Maintains inventory of problems under analysis and their current progress and status.

  • Follows up on issues and progress with problem owners where necessary.

  • Updates the KEDB periodically after review with Network Operations and Problem Managers.

  • Creates periodic reports on the problem management process and problem trending.

  • Prevents recurrence of issues by identifying the root cause and implementing fix actions.

  • Works with all internal technical teams, incident management, and the various lines of business.

  • Drives all problems toward root cause identification and a permanent fix.

  • Takes an innovative approach, since each problem is unique and may call for a different RCA technique.

  • Good interpersonal and organizational skills are required.

  • Experience in scripting in Python or other languages is highly valued.

Basic Qualifications:

  • High School Degree or GED

  • At least 5 years’ experience in Incident Management/Problem Management, Change Management, Configuration Management, Knowledge Management, Release and Deployment Management, Service Catalogue Management, Availability Management, Continuous Improvement, Service Reporting, or KPI processes

  • At least 3 years’ experience working in Incident/Problem Management in a corporate environment, having been directly involved in two or more phases of the SDLC

Preferred Qualifications:

  • Bachelor’s Degree in an IT-related field

  • Broad understanding of all aspects of applications and infrastructure components. Proficient in ServiceNow modules including Incident, Change, Problem, Knowledge Management, and reports. Knowledge of Risk Management, Compliance, Audit, Information Security and Technical Privacy

  • Highly qualified candidates will have certifications in:

  • ITIL Foundations

  • PMP

  • Cloud, including but not limited to: Azure/AWS/Google Cloud Solutions Architect, DevOps, Delivery

  • Networking, including but not limited to: CCNA, CCNP, Wireless, Network+, Load Balancing, DNS

  • Security, including but not limited to: CISSP, CISA, CISM, Security+

  • At least 2 years’ experience with a scripting language, especially Python

At this time, Capital One will not sponsor a new applicant for employment authorization for this position.