
Top Data Science and Analytics Jobs

🚀 AI/ML, Data Engineer, MLOps, and Clinical Analyst Roles (Entry to Senior)

The landscape of data careers is rapidly evolving, driven by the practical application of Machine Learning Operations (MLOps), the proliferation of Generative AI (GenAI), and the increasing demand for specialized data infrastructure. The jobs listed here represent the cutting edge of the industry, spanning from highly technical, platform-focused AI/ML Data Engineering to critical, domain-specific Clinical Data Analysis and Machine Learning Engineering in high-impact fields like robotics and manufacturing.

1) The Pinnacle of Data Engineering: AI/ML Data Engineer (Mid-Level to Senior)

The role of the AI/ML Data Engineer is arguably the most specialized and high-value position in modern data infrastructure, requiring a deep hybrid of traditional data engineering, MLOps, and foundational Generative AI knowledge, especially around Retrieval Augmented Generation (RAG).

Detailed Analysis: AI/ML Data Engineer – College Board (Remote)

This is not a traditional ETL role; it is an ML Data Platform Engineering role with a strong focus on MLOps and LLM DataOps. The job description is a masterclass in modern, high-complexity data requirements.

A. Core Responsibilities and High-Value Skills

The required expertise is segmented into distinct, high-value skill areas:

  1. ML Data Platform & Pipelines (40% Focus):

    • Distributed Compute & ETL: Mastery of batch and streaming pipelines using Kinesis/Kafka (streaming ingestion) leading into Spark/Glue (distributed transformation) orchestrated by Step Functions/Airflow. This is the core data movement and processing layer.

    • Feature Stores & Embeddings: Building reproducible pipelines for offline/online feature stores and embedding pipelines (e.g., S3/Parquet/Iceberg + vector index). This is critical for model training and low-latency inference.

    • Data Governance & Quality: Implementing Data Contracts & Validation using tools like Great Expectations/Deequ and capturing metadata/lineage via standards like OpenLineage/DataHub/Amundsen. This guarantees data quality and compliance, particularly crucial when handling sensitive student data (FERPA).

    • Lakehouse Optimization: Expertise in optimizing Lakehouse/Warehouse layouts (Redshift/Athena/Iceberg) for scalable ML queries and analytics.

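The data-contract idea above can be made concrete with a plain-Python sketch. The field names are hypothetical, chosen only for illustration; the posting itself points to Great Expectations/Deequ for production-grade validation.

```python
# Minimal data-contract check: every record must match the declared
# schema before it is allowed into the lakehouse. Field names are
# hypothetical, for illustration only.
CONTRACT = {
    "student_id": str,   # pseudonymized ID (FERPA: never raw PII)
    "event_type": str,
    "score": float,
}

def validate(record: dict) -> list[str]:
    """Return a list of contract violations (empty list = valid)."""
    errors = []
    for field, expected_type in CONTRACT.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}, "
                          f"got {type(record[field]).__name__}")
    return errors

good = {"student_id": "s-123", "event_type": "practice_test", "score": 0.87}
bad = {"student_id": "s-456", "score": "high"}

print(validate(good))  # []
print(validate(bad))   # two violations: missing field, wrong type
```

In a real pipeline this check would run as a quality gate inside the Spark/Glue job, with violations routed to a dead-letter location rather than silently dropped.
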
  2. Model Enablement & LLM DataOps (30% Focus):

    • ML Experimentation & Versioning: Productionizing datasets with versioning (DVC/LakeFS) and experiment tracking (MLflow). This moves ML from notebooks to production-grade systems.

    • RAG Foundation Building: This is the GenAI frontier. It requires building the entire RAG pipeline: document ingestion, chunking, embeddings, retrieval indexing, and, crucially, quality evaluation (precision@k, faithfulness, latency, and cost).

    • Model Serving: Collaborating with Data Scientists to ship models to serving environments (SageMaker/EKS/ECS) and setting up automated feature backfills.

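Among the RAG evaluation metrics named above, precision@k is the easiest to make concrete. A minimal sketch with hypothetical chunk IDs and a hand-labeled relevance set:

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved chunks that are actually relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)

retrieved = ["c1", "c7", "c3", "c9", "c2"]  # ranked retriever output
relevant = {"c1", "c3", "c4"}               # human-labeled gold set

print(precision_at_k(retrieved, relevant, k=3))  # 2/3: c1 and c3 hit, c7 misses
```

Faithfulness, latency, and cost require heavier tooling (judge models, tracing), but retrieval precision like this is typically the first number a RAG team puts on a dashboard.
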
  3. Reliability, Security & Compliance (15% Focus):

    • ML Observability: Defining SLOs and instrumenting monitoring across data and model services for freshness, drift/skew, lineage, cost, and performance. This is the Ops in MLOps.

    • Security by Design: Embedding security and privacy, including PII minimization/redaction and strict FERPA compliance.

    • CI/CD for ML Systems: Building automated testing, quality gates, and safe rollouts (shadow/canary) for both data and models. This is the DevOps applied to Machine Learning.

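The drift/skew monitoring described above is often scored with the Population Stability Index (PSI). A stdlib-only sketch; the 0.2 alert threshold is a common rule of thumb, not something the posting specifies.

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a training-time (expected) and a
    live (actual) feature distribution; a common drift score.
    Rule of thumb (an assumption, tuned per team): > 0.2 means investigate."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def histogram(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # Smooth empty bins so the log below is always defined
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = histogram(expected), histogram(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [0.1 * i for i in range(100)]       # training-time feature values
shifted = [0.1 * i + 3.0 for i in range(100)]  # live values, drifted upward

print(psi(baseline, baseline) < 0.01)  # identical distributions: near zero
print(psi(baseline, shifted) > 0.2)    # shifted distribution: over threshold
```

In practice a score like this is computed per feature on a schedule and exported to the monitoring stack, which is exactly the "drift/skew" instrumentation the role describes.
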
B. Essential Technical Stack for High-Value Data Engineering

The required technical depth in AWS services alone indicates a high-seniority role, despite the Mid-level label, demonstrating the competitive nature of the field:

  • Languages & Compute: Python (strong requirement), Spark/Glue/Dask (distributed computing).

  • Cloud Platform (AWS): S3, Glue, Lambda, Athena, Bedrock (for LLMs), OpenSearch, API Gateway, DynamoDB, SageMaker, Step Functions, Redshift, Kinesis.

  • Orchestration & MLOps: Airflow/Step Functions, Docker, SageMaker/EKS/ECS, MLflow, DVC/LakeFS.

  • Data Quality/Governance: Great Expectations/Deequ, OpenLineage/DataHub/Amundsen.

  • GenAI/LLMOps (Preferred but High-Value): RAG, vector search (OpenSearch KNN/pgvector/FAISS), prompt/eval frameworks, real-time feature engineering (Kinesis/Kafka).

This job description is an excellent blueprint for an ambitious mid-level engineer targeting the Principal Data Engineer or ML Platform Architect career track. The focus on compliance (FERPA) and impact (educational and career opportunities) highlights the non-technical value demanded by mission-driven organizations.


2) Data Engineering Focused on Applications: Data Engineer – OneDigital (Remote)

In contrast to the specialized MLOps role, the Data Engineer position at OneDigital is a classic, highly practical Full-Stack Data Application Engineer role, bridging the gap between core data infrastructure and end-user applications.

Detailed Analysis: Data Engineer – OneDigital (Remote)

This role requires a strong frontend-data synthesis skillset, a combination increasingly sought after by companies needing custom data tools and dashboards delivered with a modern user experience.

A. Hybrid Full-Stack & Data Skills

The job highlights a unique blend of competencies:

  • Frontend/Data Visualization: Expertise in React and TypeScript for building and optimizing applications. This directly translates complex data into accessible dashboards and tools.

  • Backend & Data APIs: Supporting backend development, specifically around data APIs and transformation logic. A strong preference for Golang indicates a focus on high-performance, concurrent backend services, a sign of a high-load, distributed system environment.

  • Data Integration & Source Expertise: Collaboration with core data engineers for clean integration and a specific focus on working with Salesforce data and integrations. Industry-specific data experience (e.g., Insurance industry familiarity) is a valuable differentiator.

B. System Reliability and Observability

A key part of the mid-level role is ensuring the deployed systems are robust:

  • Observability & Performance: Ensuring observability and performance across distributed systems using tools like Datadog, Grafana, and OpenTelemetry. This is a core DevOps skill set applied to the data platform.

  • Architecture Knowledge: Familiarity with distributed architectures, service communication, microservices, and event-driven architectures is highly valued.

  • CI/CD: Experience building CI/CD pipelines for automated deployment and testing.

The required 4–6 years of experience aligns perfectly with a mid-level engineer who has successfully moved past simple scripting and into architecting reliable, user-facing data applications. This role is a direct pipeline to Senior Full-Stack Engineer or Data Platform Product Manager positions.


3) The Research-to-Production Pipeline: Machine Learning Engineer – General Motors (Detroit, MI)

This position is a specialized Early-Career Research and Applied ML role, requiring a high academic pedigree (PhD/Masters with contributions) focused on translating state-of-the-art AI into real-world applications within a critical industry: robotics and manufacturing.

Detailed Analysis: Machine Learning Engineer – General Motors (Detroit, MI)

The job focuses on high-stakes, physically grounded problems, distinguishing it from typical consumer-AI roles.

A. Advanced ML & Application Domains

The core of the work is in adapting and deploying cutting-edge ML techniques:

  • Industrial ML Applications: Computer vision, robotic manipulation, predictive maintenance, and process optimization. This requires a deep understanding of domain constraints (latency, safety, hardware integration).

  • Deep Learning Pipelines: Building end-to-end pipelines for multi-modal sensor data (vision, force/torque, proprioception, environmental sensors). This involves handling complex, high-velocity data unique to industrial IoT and robotics.

  • Foundation Models & Transfer Learning: Contributing to the development of foundation models that generalize across diverse industrial scenarios—a highly coveted and technically challenging area in ML research.

B. Academic and Technical Requirements

The qualifications emphasize a strong theoretical and research background:

  • Academic Credential: PhD in CS/ML or a Master’s with significant ongoing AI/ML contributions is the minimum bar.

  • Deep Learning Expertise: In-depth knowledge of modern architectures: Transformers, Diffusion Models, CNNs, and training techniques at scale.

  • Frameworks & Languages: Strong hands-on experience with PyTorch, TensorFlow, Keras, or JAX, coupled with robust programming skills in Python and familiarity with systems languages like C++/Java (essential for high-performance robotics/embedded systems).

  • Competitive Edge: Experience in anomaly detection, predictive maintenance, reinforcement learning (RL) for robotic control, and a track record of publications in top-tier conferences (NeurIPS, ICML, CVPR).

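The anomaly-detection experience listed above often starts with simple statistical baselines before any deep learning is involved. A stdlib z-score sketch over hypothetical robot-joint torque readings; the 2.5-sigma threshold is an illustrative assumption, not from the posting.

```python
import statistics

def zscore_anomalies(readings: list[float], threshold: float = 2.5) -> list[int]:
    """Flag indices whose reading sits more than `threshold` standard
    deviations from the mean: a classic first-pass detector for sensor
    data in predictive maintenance. Threshold is a tunable assumption."""
    mean = statistics.fmean(readings)
    std = statistics.pstdev(readings)
    if std == 0:
        return []  # constant signal: nothing to flag
    return [i for i, v in enumerate(readings) if abs(v - mean) / std > threshold]

# Hypothetical torque readings from a robot joint; index 5 is a spike.
torque = [1.0, 1.1, 0.9, 1.05, 0.95, 9.0, 1.0, 1.02, 0.98, 1.01]
print(zscore_anomalies(torque))  # [5]
```

Production systems replace this with learned models over multi-modal sensor streams, but the framing (establish a baseline, score deviations, alert) is the same.
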
The estimated salary range of $130,000 – $170,000 reflects the high value placed on this advanced, research-focused engineering skillset, particularly when paired with a PhD. This role is a springboard to Principal Applied Scientist or AI Research Lead positions.


4) Foundation and Domain Expertise: Data Analyst Roles (Entry-Level to Early-Career)

While the engineering roles focus on building systems, the Data Analyst positions focus on extracting value and insight from the data, often within a highly regulated or specialized business domain.

I) Detailed Analysis: HIM Clinical Data Analyst – Baptist Health South Florida (Remote)

This is a domain-specific analyst role centered on Health Information Management (HIM), highlighting the demand for data professionals with compliance and domain knowledge.

A. Domain-Specific Responsibilities

The focus is less on coding and more on regulation, compliance, and workflow:

  • Physician Delinquency & Compliance: Tracking and trending Physician Delinquency Reports, sending notifications, and enforcing the Suspension List in coordination with Medical Staff Leadership. This is a direct measure of data driving clinical and administrative compliance.

  • Accreditation & Reporting: Preparing reports and graphs for Medical Record Committee meetings and the Joint Commission (TJC). This ensures the hospital maintains its accreditation status.

  • Core Analyst Skills: Knowledge of statistics, data collection, analysis, and data presentation.

B. Qualifications and Differentiators

  • Preferred Credentials: Bachelor’s Degree in HIM or Health Services Administration, and preferred certification as a Registered Health Information Technician (RHIT) and/or Registered Health Information Administrator (RHIA). This shows the importance of industry certification in specialized analyst roles.

  • Technical Skills: Proficiency in data collection, analysis, and presentation, with an emphasis on attention to detail, organization, and problem-solving under time constraints. Basic office proficiency (Word, Excel) and data skills are the foundation.

This early-career role offers a crucial path for data professionals interested in Healthcare Analytics and Data Governance/Compliance.

II) Detailed Analysis: Data Analyst I – University of Rochester (Rochester, NY)

This is a classic Entry-Level Data Analyst role within a highly specific operational domain: Pharmaceutical Procurement and Inventory Management within a hospital system.

A. Operational Data Focus

The responsibilities are heavily focused on leveraging data for operational efficiency and cost control:

  • Procurement & Negotiation: Independently managing the purchase of medications (wholesaler/direct), running reports to track high-cost/non-formulary items, and negotiating with vendors for optimum service and price. This is data analysis driving direct financial impact.

  • Inventory & Logistics: Monitoring and analyzing pricing against contracts, ensuring inventory supply, and managing logistics for specialty and limited access drugs.

  • System Integration: Managing NDC settings (National Drug Code), identifying new electronic health record (EHR) build requests in collaboration with informatics, and handling the ticketing database for vendor management.

B. Foundational Skill Requirements

  • Minimum Experience: Bachelor’s degree and 1 year of Pharmacy and/or 340B experience (a critical federal drug pricing program) or equivalent. Domain knowledge is paramount.

  • Technical Tools: Proficiency with MS Office applications, specifically Excel for data management, reporting, and analysis. This confirms that for many operational analyst roles, SQL and Python are “nice-to-haves,” but Advanced Excel and Domain Expertise (e.g., 340B program knowledge) are the “must-haves.”

This role serves as a foundational step for careers in Supply Chain Analytics, Health Economics, and Pharmacy Informatics.

Application links:

  1. AI/ML Data Engineer – College Board (Remote)
    Apply: https://collegeboard.wd1.myworkdayjobs.com/Careers/job/AI-ML-Data-Engineer_REQ002307
  2. Data Engineer – OneDigital (Remote)
    Apply: https://onedigital.wd5.myworkdayjobs.com/Centro/job/Data-Engineer—Remote_R8028
  3. Machine Learning Engineer – General Motors (Detroit, MI)
    Apply: https://search-careers.gm.com/jobs/jr-202519114
  4. HIM Clinical Data Analyst – Baptist Health (Remote)
    Apply: https://careers.baptisthealth.net/job/152308
  5. Data Analyst I – University of Rochester (On-site)
    Apply: https://universityofrochester.jobs/job/752BB6619DE4428E92091B6A302E93E6

♻️ Please repost to help people who are in a job search
➡️ Like, comment, or share for better visibility
🔔 Follow: https://thenewfueldata.in/category/software-jobs-in-india-and-abroad/

Note: Not affiliated with these companies. Apply directly via official links.

| Core Cross-Cutting Skill | AI/ML DE (College Board) | Data Engineer (OneDigital) | ML Engineer (GM) | HIM Analyst (Baptist) | Data Analyst I (UoR) | Career Impact & Trajectory |
|---|---|---|---|---|---|---|
| Python & Distributed Compute | Essential (Spark/Glue/Dask) | Preferred (Backend Services) | Essential (ML Frameworks) | Not Required | Not Required | Platform Architecture, MLOps, Algorithm Development |
| Cloud (AWS/GCP/Azure) | Expert Level (S3, Redshift, SageMaker) | Experience Required (Cloud Platforms) | Experience Required (Deployment/Infra) | Not Specified (Implied) | Not Specified (Implied) | Scalability, System Design, Principal Engineer Tracks |
| MLOps/LLMOps | Core Responsibility (CI/CD, RAG, Feature Stores) | Nice to Have (Observability) | Core Responsibility (Deployment, Monitoring) | Not Applicable | Not Applicable | High-Value Infrastructure Leadership (ML Platform Lead) |
| Data Quality & Governance | Essential (Data Contracts, FERPA) | Experience with Clean Integration | Data Collection/Annotation Strategy | Essential (TJC/CMS Compliance) | Quality Management (NDC/EHR) | Data Governance Officer, Compliance Analyst, Data Architect |
| Communication & Collaboration | Essential (DS, Peers, Tech Talks) | Essential (Technical/Non-Technical) | Essential (Partner Teams, Research Community) | Essential (Medical Staff, Leadership) | Essential (Vendors, Faculty, Staff) | Management and Leadership Roles |
| Frontend/Visualization | BI Tools (Tableau, Quicksight) | Expert (React, TypeScript, Dashboards) | Not Primary Focus | Reports/Graphs | Runs Reports/Analysis | Data Product Management, Analytics Engineering |

Written by Pasupuleti

Empowering Aspirations: Your Ultimate Guide to Career and Academic Excellence.
