If you’re a data scientist in San Francisco, you’re doing a fundamentally different job than a data scientist in New York, London, or Bangalore. Not a slightly different job. A fundamentally different one.

We analyzed 77,735 tech job postings across three regions — the Bay Area (80km radius around San Francisco), the rest of the US, and the rest of the world — and ran chi-square tests of homogeneity with Bonferroni correction to separate real geographic differences from noise. The results surprised us.

Read the full report here: https://github.com/chiefastro/skillenai-notebooks/tree/master/bay-area-vs-world.

The Big Finding: It’s Bay Area vs Everywhere Else

For Data Scientists and AI Engineers, the rest-of-US and non-US skill profiles are statistically indistinguishable (p = 0.18 for both). The Bay Area is the outlier. A Data Scientist in Austin has more in common with one in Berlin than with one in Palo Alto.

ML Engineers are the exception — all three regions differ significantly, driven by non-US emphasis on infrastructure skills like MLOps, Kubernetes, and Docker.

| Role | Omnibus χ² | p-value | Bay vs Rest-US | Bay vs Non-US | Rest-US vs Non-US |
| --- | --- | --- | --- | --- | --- |
| Data Scientist | 116.0 | 7.9e-10 | Significant | Significant | Not significant |
| ML Engineer | 164.1 | 1.3e-17 | Significant | Significant | Significant |
| AI Engineer | 58.7 | 0.017 | Not significant | Significant | Not significant |
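
For readers who want to see the mechanics behind a table like this, here is a minimal sketch of an omnibus chi-square test of homogeneity followed by Bonferroni-corrected pairwise tests. It is not the notebook's code. The counts are hypothetical, the function names (`chi2_sf`, `chi_square`) are illustrative, and the table layout (regions as rows, skill mentions as columns) is an assumption.

```python
import math
from itertools import combinations

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed-form series)."""
    if x <= 0:
        return 1.0
    pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)  # normal pdf at sqrt(x)
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.sqrt(2 * math.pi) * pdf * total
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = math.sqrt(x) if r == 1 else term * x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(x / 2)) + 2 * pdf * total

def chi_square(table):
    """Return (statistic, degrees of freedom, p-value) for an r x c table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(cols) - 1)
    return stat, df, chi2_sf(stat, df)

# HYPOTHETICAL skill-mention counts per region (columns = four skills).
counts = {
    "Bay Area":   [60, 55, 40, 45],
    "Rest of US": [30, 70, 65, 85],
    "Non-US":     [28, 72, 66, 84],
}

stat, df, p = chi_square(list(counts.values()))
print(f"omnibus: chi2={stat:.1f}, df={df}, p={p:.2g}")

pairs = list(combinations(counts, 2))
alpha = 0.05 / len(pairs)  # Bonferroni: split alpha across the 3 comparisons
pairwise = {}
for a, b in pairs:
    stat_ab, _, p_ab = chi_square([counts[a], counts[b]])
    pairwise[(a, b)] = p_ab < alpha
    print(f"{a} vs {b}: p={p_ab:.2g}, significant={pairwise[(a, b)]}")
```

With these made-up counts the two non-Bay regions are nearly identical, so only the comparisons involving the Bay Area clear the corrected threshold, which mirrors the Data Scientist row in shape only.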

The Bay Area Data Scientist Is an Experiment Designer

Only four skills survive Bonferroni correction when comparing Bay Area Data Scientists to the rest of the US. They tell a clear story: the Bay Area DS is a product experimentation role.

| Skill | Bay Area | Rest of US | Residual |
| --- | --- | --- | --- |
| Experimentation | 25% | 8% | +5.3 |
| Causal inference | 28% | 15% | +4.0 |
| A/B testing | 30% | 19% | +3.0 |
| SQL | 76% | 61% | +2.3 |
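
The residual column can be read as "how many standard errors away from what we would expect if region did not matter." A rough sketch of that calculation follows. It assumes adjusted standardized residuals (the post does not say which variant was used), and the posting counts are hypothetical, so the output will not reproduce the figures in the table.

```python
import math

def adjusted_residuals(table):
    """Adjusted standardized residual for every cell of an r x c table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    result = []
    for i, row in enumerate(table):
        result_row = []
        for j, observed in enumerate(row):
            expected = rows[i] * cols[j] / n
            variance = expected * (1 - rows[i] / n) * (1 - cols[j] / n)
            result_row.append((observed - expected) / math.sqrt(variance))
        result.append(result_row)
    return result

# HYPOTHETICAL counts: [postings listing the skill, postings not listing it]
table = [
    [50, 150],   # Bay Area: 25% of 200 postings
    [80, 920],   # Rest of US: 8% of 1,000 postings
]
res = adjusted_residuals(table)
print(f"Bay Area residual for the skill: {res[0][0]:+.1f}")
```

A cell with |residual| above 2.0 is flagged as a driver of the difference; in a 2x2 table all four cells share the same magnitude and differ only in sign.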

Meanwhile, traditional modeling skills — predictive modeling, statistical modeling, scikit-learn, pandas — all trend lower in the Bay Area. Even “machine learning” as a listed skill is less common for Bay Area data scientists (42% vs 47%). This isn’t a data scientist who builds models. This is someone who designs experiments, measures causal effects, and ships insights to product teams.

Bay Area MLEs Train Models. Everyone Else Operates Them.

ML Engineer is the most geographically differentiated role in our dataset (Cramér’s V = 0.115, all pairwise comparisons significant). Eight of twenty skills survive Bonferroni correction, the most of any role.
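
Cramér’s V rescales the chi-square statistic to a 0 to 1 effect size, so roles with different sample sizes can be compared. A minimal sketch, assuming a 3-region by 20-skill table; the total `n = 6200` is a hypothetical placeholder, since the post does not report the count behind this statistic.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))

# chi2 = 164.1 is the ML Engineer omnibus statistic from the table above.
# n = 6200 is HYPOTHETICAL; the 3 x 20 table shape is an assumption.
v = cramers_v(chi2=164.1, n=6200, n_rows=3, n_cols=20)
print(round(v, 3))
```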

The Bay Area MLE focuses on the training loop:

| Bay Area HIGH | Residual | Bay Area LOW | Residual |
| --- | --- | --- | --- |
| Data pipelines | +3.1 | MLOps | -3.8 |
| Fine-tuning | +2.1 | AWS | -2.9 |
|  |  | SQL | -2.5 |

The non-US MLE is the infrastructure specialist: MLOps (+3.7), Kubernetes (+3.2), Docker (+2.8), and AWS (+2.6) are all significantly overrepresented outside the US. Bay Area MLEs build and fine-tune models. Everyone else deploys and monitors them.

AI Engineer: Too New for Confident Conclusions

With only 134 Bay Area AI Engineers in our dataset, most individual skill differences land in the noise. The Bay-vs-rest-US comparison doesn’t even reach significance (p = 0.097).

The one robust finding: LangChain is significantly more prevalent outside the US (22% vs 10%, residual +2.4). Non-US AI Engineers are more framework-dependent. The Bay Area’s apparent LLM and TypeScript advantages are suggestive but don’t survive multiple comparison correction — we’ll need more data as this role matures.

The Bay Area Pays 18% More (and It’s Not Just Cost of Living)

A Mann-Whitney U test (p = 1.1e-07, medium effect size r = 0.31) confirms the salary gap. The bootstrap 95% confidence interval for the median difference is $15,000 to $39,500.

| Metric | Bay Area | Rest of US |
| --- | --- | --- |
| Salary floor (P25) | $140K | $109K |
| Midpoint median | $203K | $172K |
| Salary ceiling (P75) | $277K | $246K |
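
The Mann-Whitney comparison mentioned above can be sketched in a few lines. This version uses the normal approximation without a tie correction and reports the effect size as r = |Z| / sqrt(N); whether the report computed r this way is an assumption, and the salary lists are hypothetical values in $K.

```python
import math
from statistics import NormalDist

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U (normal approximation, no tie correction).

    Returns (U for sample a, z, p-value, effect size r).
    """
    pooled = sorted((value, group) for group, sample in enumerate((a, b))
                    for value in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    r = abs(z) / math.sqrt(n1 + n2)
    return u, z, p, r

# HYPOTHETICAL salary midpoints in $K, for illustration only.
bay = [150, 180, 203, 220, 260, 300]
rest = [110, 140, 172, 190, 215, 250]
u, z, p, r = mann_whitney_u(bay, rest)
print(f"U={u:.1f}, z={z:.2f}, p={p:.3f}, r={r:.2f}")
```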

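The premium narrows in percentage terms as pay rises (28% at P25, 18% at the median, 13% at P75), although the dollar gap in the table above is the same $31K at each point, so top-of-market compensation is converging in relative terms only. UK salaries run about 44% below the Bay Area in USD terms; EUR salaries about 62% below.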
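
The percentages quoted here follow directly from the salary table. A quick check, using only the rounded $K figures from the table (the variable names are illustrative):

```python
# Salary figures in $K, copied from the table above.
bay_area = {"P25": 140, "median": 203, "P75": 277}
rest_of_us = {"P25": 109, "median": 172, "P75": 246}

def premium_pct(bay, rest):
    """Bay Area premium over the rest of the US, in percent."""
    return (bay - rest) / rest * 100

premiums = {k: round(premium_pct(bay_area[k], rest_of_us[k])) for k in bay_area}
gaps = {k: bay_area[k] - rest_of_us[k] for k in bay_area}
print(premiums)  # {'P25': 28, 'median': 18, 'P75': 13}
print(gaps)      # {'P25': 31, 'median': 31, 'P75': 31}
```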

The Bay Area Skips “Senior” and Goes Straight to “Principal”

The largest residual in the seniority analysis is +18.3 for Principal-level roles in the Bay Area. At 11.9%, the Bay Area’s principal rate is nearly 2x the global average (6.3%).

But the Bay Area simultaneously underrepresents Senior (-9.6) and Junior (-6.0). It skips the middle rungs. This is a market that hires experienced ICs who ship — not a market that grows talent through traditional leveling.

Formal internship programs are a US phenomenon entirely: both US regions are overrepresented (+11.5, +10.1) while non-US is sharply underrepresented (-13.4).

Remote Work Runs on a Global Gradient

The work model split produced the largest residual spread of any analysis: 36 points between Bay Area remote (-19.4) and non-US remote (+16.8).

| Model | Bay Area | Rest of US | Non-US |
| --- | --- | --- | --- |
| Onsite | 54% | 51% | 41% |
| Remote | 26% | 34% | 42% |
| Hybrid | 20% | 15% | 17% |

The Bay Area is the hybrid capital (+9.8 residual). Onsite dominance is US-wide, not Bay Area-specific. And international markets split almost perfectly between onsite and remote — a coin flip.

The Topic Map: Where AI Lives

Machine Learning as a topic tag produced the highest residual of any variable in the entire study: +32.2 for the Bay Area. LLM (+31.9), Agents (+25.9), Generative AI (+25.0), and MLOps (+21.0) complete the cluster.

The inverse is equally stark. Testing/QA (-17.7), DevOps (-16.7), and Backend (-15.2) are all sharply underrepresented in the Bay Area. These functions exist — they’re just abstracted behind platform engineering and SRE rather than named explicitly.

Three topics show no significant geographic variation: product management, ethics/governance, and product design. These are universally distributed concerns regardless of where a company sits.

What This Means for Your Career

If you’re a Data Scientist considering a Bay Area move: brush up on experimentation frameworks, causal inference, and A/B testing. Your scikit-learn and predictive modeling skills matter less there. The role is closer to product analytics than to ML research.

If you’re an ML Engineer outside the US: you’re likely an infrastructure specialist. A Bay Area move would shift your focus toward model training, fine-tuning, and data pipelines — and away from MLOps, Kubernetes, and AWS.

If you’re an AI Engineer anywhere: the role is still being defined. Geographic differences are mostly noise at current sample sizes, except that non-US AI Engineers rely more heavily on frameworks like LangChain.

If you’re hiring remotely: rest-of-US and non-US candidates have nearly identical skill profiles for DS and AIE roles. That talent pool is deeper than the Bay Area’s, and you won’t sacrifice skill fit.

Methodology

This analysis draws from Skillenai’s job index of 77,735 deduplicated postings (after removing spam and fixing geocoding errors). “Bay Area” is defined as an 80km radius around San Francisco. Skills are extracted via NER and entity resolution from job descriptions.
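
The 80km rule is a simple great-circle distance check. Here is a sketch of how a posting's coordinates could be assigned to the Bay Area bucket; the centre point (37.7749, -122.4194) and the function names are assumptions, since the post does not state the exact centre or how the geocoded points were tested.

```python
import math

EARTH_RADIUS_KM = 6371.0088
SF_LAT, SF_LON = 37.7749, -122.4194  # ASSUMED centre of San Francisco

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def in_bay_area(lat, lon, radius_km=80):
    """True if the point lies within radius_km of the assumed centre."""
    return haversine_km(SF_LAT, SF_LON, lat, lon) <= radius_km

print(in_bay_area(37.3382, -121.8863))  # San Jose
print(in_bay_area(38.5816, -121.4944))  # Sacramento
```

Under this definition San Jose falls inside the radius and Sacramento falls outside it.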

Statistical significance was assessed via chi-square tests of homogeneity with Bonferroni correction for multiple comparisons. Standardized residuals (|r| > 2.0) identify which specific categories drive each difference. Salary differences were validated with Mann-Whitney U tests and bootstrap confidence intervals. Monte Carlo bootstrap distributions (1,000 resamples of 100 jobs) provide visual confirmation via violin plots.
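
The resampling step can be illustrated as follows: draw 1,000 resamples of 100 jobs, with replacement, and record the share of postings listing a skill in each. This is a sketch rather than the notebook's code; the population is hypothetical and `bootstrap_prevalence` is an illustrative name. The resulting list is the kind of distribution a violin plot would draw.

```python
import random
from statistics import quantiles

def bootstrap_prevalence(has_skill, n_resamples=1000, sample_size=100, seed=0):
    """Share of postings listing a skill in each resample (with replacement)."""
    rng = random.Random(seed)
    return [sum(rng.choices(has_skill, k=sample_size)) / sample_size
            for _ in range(n_resamples)]

# HYPOTHETICAL population: 1 = posting lists the skill, 0 = it does not.
postings = [1] * 250 + [0] * 750
dist = bootstrap_prevalence(postings)

cuts = quantiles(dist, n=40)   # cut points at 2.5%, 5%, ..., 97.5%
low, high = cuts[0], cuts[-1]  # central 95% of the resampled shares
print(f"mean={sum(dist) / len(dist):.3f}, 95% range=({low:.2f}, {high:.2f})")
```

If the ranges for two regions barely overlap, the difference is visible without any test statistic, which is what the violin plots are meant to show.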

Full data, code, and violin plots are available in the Skillenai data products repository.
