Sr. Data Scientist
Duration: 6 months
Remote
Job Summary:
The Digital Technology Services (DTS) Principal Data Scientist will join a collaborative team of extremely talented Analysts, Engineers, Designers, and Managers across the Health network. This role requires using data analysis, applied mathematics, machine learning (ML), and large language models to build next-generation predictive solutions and artificial intelligence (AI) tools that enable clinicians and other end-users within the organization to perform their work more efficiently and effectively. In this role, there is a strong emphasis on using the agile development methodology to rapidly iterate and deploy impactful solutions within the healthcare organization.
Essential Job Functions:
- Work with stakeholders to document and understand their objectives and ideate how AI/ML solutions might achieve those objectives.
- Perform exploratory data analysis to understand the data available related to a particular business objective and whether that data lends itself to the creation of an AI/ML solution that might achieve that business objective.
- Optimize Large Language Model (LLM) output with prompt engineering; build retrieval-augmented generation (RAG) pipelines.
- Build AI/ML models that attempt to address a business objective given the data available; explore the model space to understand the optimal model for the particular use case and what the performance characteristics are.
- Create repeatable, interpretable, and scalable models that can be seamlessly incorporated into analytic data products.
- Engineer features by using business acumen to find new ways to combine data sources.
- Write production-quality pipeline code used to load data, execute the AI/ML model, and store the results.
- Perform analyses of AI/ML models and systems, such as fairness, bias, and equity audits; performance analyses; impact assessments; and interpretability reports.
- Develop and commit Python code in such a way that it works in harmony with the code and systems being developed by the team's data engineers, software engineers, and data analysts.
- Proficient in writing production code for complex models. Modifies or creates custom monitoring solutions as needed.
- Able to deploy custom models to a cloud platform. Deep expertise in more than one area of ML Ops.
- Seen as a thought leader and may be invited to speak at conferences or publish articles. Able to communicate complex ML/AI methodology and concepts to a wide range of audiences including non-technical audiences and the general public.
- Make strategic recommendations about organizing work teams.
- Lead and manage large-scale, complex projects with significant impact on the organization.
- Coach executives on ML/AI and how it can be used for decision-making. Facilitate stakeholder adoption of ML/AI solutions for data-driven decision-making.
- Other duties and/or projects as assigned.
- Adheres to Organizational competencies and standards of behavior.
Education, Knowledge, Skills, and Abilities Required:
- Bachelor's degree in STEM or another related/relevant field of study, and a Ph.D. in a data science-related area.
- Minimum of 10 years of experience working in a data science role.
- Expert-level Python development experience.
- Expert-level SQL experience.
- Proficient in developing and interpreting complex statistical models. Deep expertise in one or more aspects or areas of data acquisition, cleaning, and curation; EDA; or statistical analysis.
- Excellent understanding of most aspects of clinical data; very proficient with querying and using clinical data; deep expertise in more than one aspect or area of clinical data.
- Excellent understanding of a wide range of ML models. Able to improve model performance with complex feature engineering and hyperparameter tuning. Ability to design and implement complex machine learning pipelines.
- Deep expertise in one or more aspects of machine learning.
- Expertise in prompt engineering methodology; deep expertise in creating and optimizing RAG pipelines. Ability to create LLM human-in-the-loop or automated learning systems that solve business use cases; ability to apply model fine-tuning to improve overall system performance.
- Can create custom monitoring solutions as needed. Deep expertise in one or more areas of ML Ops.
- Work effectively with team members and technical and non-technical stakeholders.
- Excellent written and verbal communication skills.
- Proficient computer skills including but not limited to Microsoft Office and Google Suite platforms.
Education, Knowledge, Skills, and Abilities Preferred:
- Minimum of 8 years of data analysis experience in healthcare.
- Proficiency in Epic Clarity clinical data models.
- Experience with Google Cloud Platform (BigQuery, Vertex AI, etc.).
- Experience with Git and GitHub.
- Experience with Docker.
Licenses and Certifications Required:
Licenses and Certifications Preferred:
- Epic Clarity Data Model.
- Epic Clarity Clinical Data Model.
- Google Machine Learning Engineer Certification.