About this project

Measuring the health of every major body of water

A data-driven index that scores 30 oceans, seas, and lakes on pollution and shows where they're headed by 2100 if nothing changes.

Explore the dashboard →

The Project

Why this exists

Pollution data for the world's oceans exists, but it's scattered across a dozen agencies, in different formats, at different scales. Nobody had stitched it together into a single comparable score per water body.

This project does exactly that. Six public datasets are cleaned, merged, weighted by scientific relevance, and turned into an index you can actually compare across regions and forecast.

30
Water bodies scored
6
Data sources merged
2100
Forecast horizon

The index covers major oceans split by hemisphere, regional seas (Mediterranean, Black Sea, Red Sea, South China Sea), and large lakes (Great Lakes, Caspian Sea, Lake Victoria, Lake Baikal). Every region gets a score from 0 to 100 built from real measurements, not estimates.


Data Sources

What goes into the score

Every dataset is publicly available and free to use. Where regional data was missing from a source, values were filled using published peer-reviewed literature, with the source documented per region in the codebase.

Marine Microplastics
NOAA NCEI
22,530 measurements globally: the most direct measure of plastic pollution in the water column
Weight: 30%
River Plastic Input
Our World in Data / Meijer et al.
Plastic entering the ocean via rivers, by country: the primary land-to-ocean pollution pathway
Weight: 25%
World Port Index
NGIA / Kaggle
3,824 ports worldwide with size classifications: a proxy for industrial coastal pressure and shipping activity
Weight: 20%
Global Coastal Characteristics
Copernicus / Zenodo
Population within 10 km of shore: a proxy for waste generation and runoff pressure
Weight: 15%
Ocean Biogeochemistry
Copernicus Marine Service
Ocean pH, dissolved inorganic carbon, and alkalinity: acidification signals from CO₂ absorption
Weight: 10%
IUU Fishing Risk
Global Fishing Watch
Illegal, unreported, and unregulated (IUU) fishing vessel activity, 2012–2025: ecological pressure beyond plastic pollution
Weight: none (contextual only, not part of the score)

Methodology

How the index is calculated

Each of the five components is normalised to a 0–100 scale using min-max scaling across all 30 regions. The components are then combined using a weighted average:

Microplastic concentration: 30%
River plastic input: 25%
Port pressure score: 20%
Coastal population (10 km): 15%
Ocean pH deviation: 10%
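
To make the calculation concrete, here is a minimal sketch of min-max scaling followed by the weighted average, using the weights listed above. The function names, the component keys, and the three toy regions are illustrative assumptions, not the project's actual code or data.

```python
# Weights as published in the methodology; the key names are hypothetical.
WEIGHTS = {
    "microplastics": 0.30,
    "river_plastic": 0.25,
    "port_pressure": 0.20,
    "coastal_population": 0.15,
    "ph_deviation": 0.10,
}


def min_max_scale(values):
    """Rescale raw values to 0-100 across all regions (min -> 0, max -> 100)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: if every region has the same value, score the component as 0.
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def pollution_index(raw):
    """raw maps component name -> {region: raw value}; returns {region: score}."""
    regions = list(next(iter(raw.values())))
    scores = {region: 0.0 for region in regions}
    for component, weight in WEIGHTS.items():
        scaled = min_max_scale([raw[component][r] for r in regions])
        for region, value in zip(regions, scaled):
            scores[region] += weight * value
    return scores


if __name__ == "__main__":
    # Toy numbers only, invented for illustration.
    toy = {name: {"A": 0.0, "B": 5.0, "C": 10.0} for name in WEIGHTS}
    print(pollution_index(toy))
```

Because the weights sum to 1, a region that is worst on every component lands at 100 and a region that is best on every component lands at 0.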

Forecasts (2030, 2040, 2050, 2100) apply a 3% annual compound growth rate, consistent with observed global plastic production trends over the past two decades. Scores exceeding 100 in the 2100 forecast are intentional: they represent trajectories that exceed the current worst-case benchmark, making the unsustainability of current trends legible.
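
As a rough illustration of the compound growth step, the sketch below projects a score forward at 3% per year. The base year of 2025 and the choice to grow the composite score (rather than each component separately) are assumptions made for the example; the function name is hypothetical.

```python
ANNUAL_GROWTH = 0.03  # 3% per year, as stated in the methodology
BASE_YEAR = 2025      # assumption: the page does not state the baseline year


def forecast_score(score, target_year, base_year=BASE_YEAR, rate=ANNUAL_GROWTH):
    """Project a score forward with compound growth; deliberately not capped at 100."""
    years = target_year - base_year
    return score * (1.0 + rate) ** years


if __name__ == "__main__":
    for year in (2030, 2040, 2050, 2100):
        print(year, round(forecast_score(50.0, year), 1))
```

Under these assumptions, 75 years of 3% growth multiplies a score roughly ninefold, which is why 2100 values can land well above 100.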

Missing data was handled in two ways: where published peer-reviewed measurements existed for a region, those values were used directly. Where no literature existed, values were imputed from comparable regions with documented reasoning. Every imputation is flagged in the source code.
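
The fallback order described above could be sketched like this. The function name, the provenance labels, and the use of a simple mean over comparable regions are assumptions for illustration; the page does not say how imputed values are derived from comparable regions.

```python
from statistics import mean


def fill_value(region, measured, literature, comparable):
    """Return (value, provenance) for one region and one component.

    measured:   {region: value or None} from the primary dataset
    literature: {region: value} taken from published peer-reviewed sources
    comparable: {region: [similar regions]} used only as a last resort
    """
    if measured.get(region) is not None:
        return measured[region], "measured"
    if region in literature:
        return literature[region], "literature"
    donors = [
        measured[r] for r in comparable.get(region, [])
        if measured.get(r) is not None
    ]
    if not donors:
        raise ValueError(f"no data or comparable regions for {region!r}")
    # Assumption: impute with the mean of the comparable regions.
    return mean(donors), "imputed"


if __name__ == "__main__":
    # Toy values only, invented for illustration.
    measured = {"A": 10.0, "B": 20.0, "C": None, "D": None}
    literature = {"D": 12.5}
    comparable = {"C": ["A", "B"]}
    for region in measured:
        print(region, fill_value(region, measured, literature, comparable))
```

Returning the provenance next to the value keeps every filled number flagged, so literature-based and imputed entries stay distinguishable downstream.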

Limitations:


The Builder

Who made this

Sercan Emiroglu
BSc Computer Science · City St George's, University of London
I'm a Computer Science graduate based in London with a focus on data science, machine learning, and building things that are actually useful. This project started as a portfolio piece and turned into something I genuinely care about: there's real data here telling a real story about where the world's oceans are headed.

I'm currently looking for roles in data analytics and data science. If you're working on something interesting, I'd like to hear about it.

Get in Touch

Questions or feedback?

Whether it's about the methodology, the data, or just to say something looks wrong, I want to hear it.

Your message goes directly to my inbox. I'll reply within a few days.