Capital One Principal Associate, Data Scientist - Data and Model Operations in McLean, Virginia

McLean 2 (19052), United States of America, McLean, Virginia

At Capital One, we’re building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.

Principal Associate, Data Scientist - Data and Model Operations

At Capital One, data is at the center of everything we do. When we launched as a startup, we disrupted the credit card industry by individually personalizing every credit card offer using statistical modeling and the relational database, cutting-edge technology in 1988! Fast-forward a few years, and this little innovation and our passion for data have skyrocketed us to a Fortune 100 company and a leader in the world of data-driven decision-making.

As a Data Scientist at the Retail and Direct Bank, you’ll be part of a high-performing team defining the next generation of banking. The Bank Data Science team has a relentless focus on the craft of data science and innovation, with the goal of continually improving customer experience and delivering value to the business. Using the latest in machine learning and distributed computing technologies, you will build the next generation of data products that enable automation and aim for the right decision at the right time for in-the-moment action.

This role within the Bank Data Sciences team is for individuals who are passionate about engineering excellence in modeling pipelines. The Deposit Forecasting and Pricing team is looking for an experienced feature pipeline and model implementation expert to support the next generation of models in the Bank.

More about the role:

  • Productionizing feature pipelines

      • Set feature pipeline standards in batch, stream, and real-time settings

      • Refactor feature pipelines for faster feature changes and updates

      • Implement a data validation framework and quality tests

  • Productionizing models

      • Implement model objects in cloud-based platforms

      • Develop low-latency model scoring

      • Establish model promotion and configuration management mechanisms

      • Enable enhanced model and feature versioning

      • Enable system and integration testing for model deployment

  • Productionizing model monitoring

      • Integrate with alerts and notification mechanisms

      • Collaboratively develop model monitoring tools and reporting

  • You should have experience with modern distributed computing tools such as Hadoop, Spark, and H2O

  • You should know Python or Scala and be comfortable working with multiple languages

Twenty years after Capital One was started, it’s still led by its founder. Be ready to join a community of the smartest people you’ve ever met, who put the customer first and want to use their data skills to make a difference.

Basic Qualifications:

-Bachelor’s Degree plus 2 years of experience in data analytics, or Master’s Degree plus 1 year of experience in data analytics, or PhD

-At least 1 year of experience in open source programming languages for large scale data analysis

-At least 1 year of experience with machine learning

-At least 1 year of experience with relational databases

Preferred Qualifications:

-Master’s Degree or PhD

-At least 1 year of experience working with AWS services (EMR, Lambda, EC2)

-At least 3 years’ experience in Python or Scala

-At least 3 years’ experience with machine learning (scikit-learn, TensorFlow)

-At least 3 years’ experience with SQL

-At least 1 year of experience with streaming frameworks (Spark Streaming or Apache Flink)

-At least 3 years’ experience with Apache Spark

-At least 1 year of experience with containerization and cluster management methods (e.g. Docker, Kubernetes)

Capital One will consider sponsoring a new qualified applicant for employment authorization for this position.