What 3,215 Job Postings Reveal About the Skills That Define Each Role

Ask five people the difference between a Data Scientist, ML Engineer, and AI Engineer, and you’ll get five different answers. I decided to let the data speak for itself.

I built a data pipeline that ingests 10,000 tech job postings and 50,000 learning resources per day, enriches each one with structured skill extraction, and indexes them for a variety of use cases, from labor market analytics and job searching to skill gap analysis and personalized learning paths. The resulting data products are exposed through an AI-first API designed to be operated by Claude Code, with endpoints for semantic search, graph analytics, and SQL.

This article is my first deep dive into the kinds of insights this data product can produce. I used my skills-by-role endpoint to pull 3,215 job postings across three roles and asked a simple question: what does the data actually say about how Data Scientists, ML Engineers, and AI Engineers differ?
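
The percentages reported below are shares of each role’s postings that mention a given skill. Here is a minimal sketch of that calculation, assuming each enriched posting is a record with a role label and a list of extracted skills. The field names `role` and `skills`, the function `skill_share_by_role`, and the toy records are hypothetical illustrations, not the actual endpoint or its schema.

```python
from collections import Counter, defaultdict


def skill_share_by_role(postings):
    """Return {role: {skill: fraction of that role's postings mentioning the skill}}."""
    totals = Counter()
    mentions = defaultdict(Counter)
    for posting in postings:
        role = posting["role"]
        totals[role] += 1
        # Count each skill once per posting, even if it was extracted more than once.
        for skill in set(posting["skills"]):
            mentions[role][skill] += 1
    return {
        role: {skill: count / totals[role] for skill, count in skills.items()}
        for role, skills in mentions.items()
    }


# Toy records for illustration only; these are not real postings.
sample = [
    {"role": "Data Scientist", "skills": ["Python", "SQL"]},
    {"role": "Data Scientist", "skills": ["SQL", "R", "SQL"]},
    {"role": "AI Engineer", "skills": ["Python", "RAG"]},
]
shares = skill_share_by_role(sample)
print(shares["Data Scientist"]["SQL"])     # 1.0
print(shares["Data Scientist"]["Python"])  # 0.5
```

De-duplicating skills within a posting matters: the metric is "what fraction of postings ask for this skill", not "how often the word appears".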

The Shared Foundation

Before diving into what separates these roles, it’s worth noting what unites them. Three skills appear in more than 10% of job postings for all three roles, at broadly similar frequencies:

| Skill | Data Scientist | ML Engineer | AI Engineer |
|---|---|---|---|
| Python | 72% | 57% | 61% |
| AWS | 13% | 19% | 23% |
| Data Pipelines | 18% | 18% | 12% |

Python is universal, the lingua franca of all three roles (though a second language is emerging for AI Engineers, a surprise covered below). Cloud infrastructure and data pipeline skills round out the shared baseline. Beyond this narrow common ground, the roles diverge sharply. The interesting story is how they diverge.
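
The "shared foundation" rule is easy to state in code: keep a skill only if its share clears the threshold in every role. This is a small sketch using the figures quoted in this article; the function name `shared_skills` is my own illustration, and SQL is included only to show a skill that fails the test.

```python
# Percent of postings per role, in the order (Data Scientist, ML Engineer, AI Engineer).
shares = {
    "Python": (72, 57, 61),
    "AWS": (13, 19, 23),
    "Data Pipelines": (18, 18, 12),
    # SQL figures are the ones quoted in the Data Scientist section.
    "SQL": (57, 12, 10),
}


def shared_skills(shares, threshold=10):
    """Skills whose share is strictly above `threshold` percent in every role."""
    return [skill for skill, pcts in shares.items() if min(pcts) > threshold]


print(shared_skills(shares))  # ['Python', 'AWS', 'Data Pipelines']
```

Using the minimum across roles is what makes the test strict: SQL is common overall, but at 10% for AI Engineers it does not exceed the cutoff.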

Three Roles, Three Layers

The cleanest insight from the data: these roles occupy distinct layers of the stack.

  • Data Scientist operates at the data and insight layer — SQL, stats, visualization, experimentation
  • ML Engineer operates at the model and training layer — PyTorch, deep learning, model lifecycle
  • AI Engineer operates at the application and orchestration layer — LLM APIs, RAG, vector DBs, full-stack delivery

This stratification isn’t just a convenient taxonomy; it emerges clearly from the skills data.

The Data Scientist: Analyst-Scientist

Seven skills are distinctive to Data Scientists:

| Distinctive DS Skill | % of DS Postings |
|---|---|
| SQL | 57% |
| A/B Testing | 26% |
| R | 21% |
| Data Visualization | 19% |
| Predictive Modeling | 16% |
| Causal Inference | 15% |
| Data Modeling | 15% |

The DS skill profile is statistical, analytical, and business-facing. R, causal inference, and predictive modeling reflect a role grounded in hypothesis testing and decision science. Data visualization and data modeling show that DS professionals are expected to communicate findings to non-technical stakeholders.

And then there’s SQL, the single largest differentiator in the entire dataset: 57% for DS vs 12% for MLE vs 10% for AIE. Among these three roles, a job posting that requires SQL fluency is far more likely to be a Data Scientist role than either of the others.
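
How strong is that signal? A rough way to check is Bayes’ rule on posting counts. The sketch below uses the 599 AI Engineer postings reported in this article and assumes the remaining 2,616 postings split evenly between the other two roles (1,308 each); that even split is my assumption, since the article gives only an approximate figure. The function name `role_given_skill` is illustrative.

```python
# 599 AI Engineer postings is the reported figure; the other two counts are
# ASSUMED to split the remaining 3,215 - 599 = 2,616 postings evenly.
counts = {"Data Scientist": 1308, "ML Engineer": 1308, "AI Engineer": 599}
sql_share = {"Data Scientist": 0.57, "ML Engineer": 0.12, "AI Engineer": 0.10}


def role_given_skill(counts, skill_share):
    """P(role | posting mentions the skill), via Bayes' rule on posting counts."""
    # Expected number of postings per role that mention the skill.
    expected = {role: counts[role] * skill_share[role] for role in counts}
    total = sum(expected.values())
    return {role: n / total for role, n in expected.items()}


posterior = role_given_skill(counts, sql_share)
for role, p in posterior.items():
    print(f"{role}: {p:.0%}")
```

Under these assumptions, roughly three out of four SQL-requiring postings in this sample are Data Scientist roles. That holds only within the three roles studied here; it says nothing about other SQL-heavy roles outside the sample.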

The ML Engineer: Model Builder

Six skills lean heavily toward ML Engineers:

| Distinctive MLE Skill | % of MLE Postings |
|---|---|
| PyTorch | 41% |
| TensorFlow | 28% |
| Deep Learning | 22% |
| Model Deployment | 19% |
| Model Training | 18% |
| Computer Vision | 12% |

The MLE skill profile is deep ML expertise paired with production delivery. PyTorch at 41% and TensorFlow at 28% show that ML Engineers are expected to work directly with training frameworks — not just call APIs. Deep learning at 22% and computer vision at 12% reflect the role’s connection to specialized ML domains that require hands-on model work. Model training (18%) and model deployment (19%) together capture the core MLE responsibility: owning the full path from training to production.

MLE occupies the bridge between DS and AIE. It shares feature engineering and Spark with Data Science. It shares Docker, Kubernetes, and LLMs with AI Engineering. But where Data Scientists stop at the notebook and AI Engineers start at the API, ML Engineers own the middle — building, training, and shipping the models that both sides depend on.

The AI Engineer: LLM Application Builder

AI Engineer is the newest role, with the fewest postings (599 vs roughly 1,300 each for the other two), and the most sharply differentiated:

| Distinctive AIE Skill | % of AIE Postings |
|---|---|
| Prompt Engineering | 26% |
| LLMs | 22% |
| Fine-tuning | 21% |
| Observability | 19% |
| LangChain | 19% |
| RAG | 17% |
| TypeScript | 17% |
| Vector Databases | 15% |

The AIE skill profile is LLM-native and application-layer. LangChain, RAG, and vector databases signal a role built around orchestrating foundation models, not training them.

But the most surprising finding is TypeScript at 17%. This makes AI Engineer the most full-stack of the three roles: expected to build user-facing applications, not just backend model pipelines. Meanwhile, observability (19%) and CI/CD (17%, not listed in the table above) show that AIE is also the most DevOps-aware of the three, reflecting the operational complexity of keeping LLM-powered applications running in production.

What This Means for Your Career

If you’re navigating a career in AI/ML, here’s how to read this data:

  • Moving from DS to MLE: Double down on PyTorch, deep learning, and model deployment. Drop R, pick up Docker. Your ML fundamentals transfer; the gap is in production infrastructure.
  • Moving from MLE to AIE: You’re already halfway there. Add LangChain/RAG, learn prompt engineering, and — this is the surprising one — brush up on TypeScript. The shift is from training models to orchestrating them.
  • Moving from DS to AIE: This is the biggest leap. You’ll need to rebuild around the LLM stack (prompt engineering, RAG, vector DBs) and move from SQL/notebooks to APIs/TypeScript. Your analytical instincts transfer, but the toolchain is almost entirely different.
  • Hiring for these roles: SQL on a job description signals DS. Deep learning signals MLE. Prompt engineering signals AIE. Misaligning these will attract the wrong candidates.
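
The transitions above amount to a set difference: the target role’s distinctive skills minus what you already have. This is a minimal sketch using the distinctive-skill lists from this article’s tables; the `skill_gap` helper and the example person are hypothetical, and a real gap analysis would also weigh how often each skill appears.

```python
# Distinctive skills per role, copied from the three tables in this article.
DISTINCTIVE = {
    "DS": {"SQL", "A/B Testing", "R", "Data Visualization",
           "Predictive Modeling", "Causal Inference", "Data Modeling"},
    "MLE": {"PyTorch", "TensorFlow", "Deep Learning", "Model Deployment",
            "Model Training", "Computer Vision"},
    "AIE": {"Prompt Engineering", "LLMs", "Fine-tuning", "Observability",
            "LangChain", "RAG", "TypeScript", "Vector Databases"},
}


def skill_gap(current_skills, target_role):
    """Distinctive skills of the target role that are missing from current_skills."""
    return sorted(DISTINCTIVE[target_role] - set(current_skills))


# Hypothetical ML Engineer who already works with LLMs and fine-tuning.
my_skills = DISTINCTIVE["MLE"] | {"LLMs", "Fine-tuning"}
print(skill_gap(my_skills, "AIE"))
# ['LangChain', 'Observability', 'Prompt Engineering', 'RAG', 'TypeScript', 'Vector Databases']
```

For this example the remaining gap is the orchestration and application stack, which matches the "training models to orchestrating them" shift described for the MLE to AIE move.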

Dig Deeper

Want to explore this data for yourself? Sign up for our API and start using our Claude Code skills to analyze the tech job market, search for jobs, identify skill gaps, and develop personalized learning paths.

By Jared Rand

Jared Rand is a data scientist specializing in natural language processing. He also has an MBA and is a serial entrepreneur. He is a Principal NLP Data Scientist at Everstream Analytics and founder of Skillenai. Connect with Jared on LinkedIn.
