IMDb Database Interfaces
Machine-readable access (plain text data files) to the Internet Movie Database, covering movies, TV, ratings, and people, widely used as a benchmark dataset in recommendation and ML research and listed in awesome dataset directories.
About this tool
URL: https://www.imdb.com/interfaces
Category: Themed Directories
Tags: datasets, media, recommendation
Overview
IMDb Database Interfaces provide machine-readable, plain-text access to subsets of the Internet Movie Database for personal and non-commercial use. The data covers movies, TV, ratings, and people, and is widely used as a benchmark dataset in recommendation systems and machine learning research.
Data is distributed as gzipped, UTF-8, tab-separated values (TSV) files, refreshed daily and downloadable from:
- https://datasets.imdbws.com/
Use of the data is governed by IMDb’s Non-Commercial Licensing terms and its Copyright/License policy.
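As a rough illustration of the download step, the sketch below fetches one file from the host listed above. The helper names `dataset_url` and `download_dataset` are hypothetical, chosen only for this example, and the URL pattern assumes each file sits directly under the base URL as `<name>.tsv.gz`. Any use of the downloaded data remains subject to the licensing terms above.

```python
import shutil
import urllib.request
from pathlib import Path

# Base URL as given on this page.
BASE_URL = "https://datasets.imdbws.com/"


def dataset_url(name: str) -> str:
    """Build the URL for a dataset name such as 'title.ratings' (hypothetical helper)."""
    return f"{BASE_URL}{name}.tsv.gz"


def download_dataset(name: str, dest_dir: str = ".") -> Path:
    """Stream one gzipped TSV file to disk and return the local path (hypothetical helper)."""
    target = Path(dest_dir) / f"{name}.tsv.gz"
    with urllib.request.urlopen(dataset_url(name)) as response, open(target, "wb") as out:
        # Copy in chunks so the whole file is never held in memory.
        shutil.copyfileobj(response, out)
    return target
```

Calling `download_dataset("title.ratings")` would save `title.ratings.tsv.gz` in the current directory. Because the files are refreshed daily, one download per day should be enough for most local copies.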
Features
- Non-commercial IMDb data access
  - Subsets of IMDb data available for personal and non-commercial use.
  - Local copies may be held subject to IMDb terms and conditions.
  - Must comply with IMDb non-commercial licensing and copyright policies.
- Data refresh and availability
  - Dataset files hosted at: https://datasets.imdbws.com/
  - Data is refreshed daily.
- File format and encoding
  - Each dataset is a separate file: *.tsv.gz (gzipped TSV).
  - UTF-8 character set.
  - First line contains column headers describing each field.
  - Missing or null values represented by \N.
- Available datasets
  - title.akas.tsv.gz - Alternate titles and related metadata for titles (e.g., localized titles / AKA information).
  - title.basics.tsv.gz - Core information about titles (e.g., movies, TV shows, etc.).
  - title.crew.tsv.gz - Crew information by title (e.g., directors, writers).
  - title.episode.tsv.gz - Episode-level information for episodic content.
  - title.principals.tsv.gz - Principal cast and crew for each title.
  - title.ratings.tsv.gz - User ratings for titles.
  - name.basics.tsv.gz - Person-level data (e.g., actors, directors, etc.).
- Schema stability notice
  - As of March 18, 2024, datasets are backed by a new data source.
  - No change in file location or schema.
  - Issues after this date can be reported to: imdb-data-interest@imdb.com.
- Typical use cases (implied by description)
  - Benchmark datasets for recommendation systems.
  - Research and experiments in machine learning and information retrieval.
  - Analytics on film/TV content, people, and ratings.
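The file format described above (gzipped, UTF-8, tab-separated, header row, \N for missing values) can be read with the Python standard library alone. The following is a minimal sketch, not an official loader: the function name `read_imdb_tsv` is hypothetical, the sample rows are made up for illustration, and the column names (`tconst`, `primaryTitle`, `startYear`, `averageRating`, `numVotes`) are assumptions that should be checked against the header line of the actual files.

```python
import csv
import gzip
import io


def read_imdb_tsv(source):
    """Yield one dict per row from a gzipped, UTF-8, tab-separated file.

    `source` may be a file path or a binary file object. The first line
    supplies the column names, and the literal \\N marker becomes None.
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as text:
        # QUOTE_NONE treats quote characters as ordinary text.
        reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield {key: (None if value == r"\N" else value) for key, value in row.items()}


def _gz(text):
    """Compress sample text in memory so the demo needs no download."""
    return io.BytesIO(gzip.compress(text.encode("utf-8")))


# Made-up sample rows; column names are assumed, not taken from this page.
basics = _gz(
    "tconst\tprimaryTitle\tstartYear\n"
    "tt0000001\tExample One\t1999\n"
    "tt0000002\tExample Two\t\\N\n"
)
ratings = _gz(
    "tconst\taverageRating\tnumVotes\n"
    "tt0000001\t8.1\t2500\n"
)

# Index titles by identifier, then attach ratings where a match exists.
titles = {row["tconst"]: row for row in read_imdb_tsv(basics)}
joined = [
    {
        **titles[r["tconst"]],
        "averageRating": float(r["averageRating"]),
        "numVotes": int(r["numVotes"]),
    }
    for r in read_imdb_tsv(ratings)
    if r["tconst"] in titles
]
```

Passing a local path such as `"title.basics.tsv.gz"` to `read_imdb_tsv` works the same way as the in-memory sample. Rows are yielded one at a time, which helps with the larger files, although the dictionary index used for the join still holds one dataset in memory.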
Licensing & Usage
- Intended for personal and non-commercial use.
- Users must review and comply with:
  - Non-Commercial Licensing: (linked from page)
  - Copyright/License: (linked from page)
Pricing
- No explicit pricing information is provided in the content. The described access is for non-commercial use under IMDb’s licensing terms; commercial licensing is not detailed on this page.
Similar Products
6 result(s)
A set of context-aware recommendation datasets across five domains, distributed with CARSKit, for research in context-aware recommender systems and machine learning. Part of an awesome public datasets listing.
A curated awesome-style list of open 3D semantic city and region models (e.g., CityGML datasets), providing a centralized directory of high-quality 3D urban data sources.
A curated Awesome-style collection of biological and genomics datasets, including ENCODE, EMPIAR, Ensembl Genomes, GEO, Gene Ontology, GloBI, LINCS, HGDP, HMP, ICOS PSP Benchmark, HapMap, JCB DataViewer (via BioStudies), and KEGG. Each entry links out to the primary dataset resource along with a corresponding YAML metadata file in the awesomedata/apd-core GitHub repository, making this part of a larger meta collection of Awesome data directories.
A curated awesome-style collection of image processing and computer vision datasets, hosted under the Awesome Data (apd-core) project. The listed datasets (e.g., ImageNet, KITTI, Danbooru, DukeMTMC) are part of this meta awesome directory of specialized data resources.
A curated subset of the Awesome Public Datasets meta-collection, focusing on economics-related data sources such as macroeconomic indicators, trade statistics, productivity, corporate registries, and long-run historical series. This portion of the awesome list aggregates high‑quality, openly accessible economics datasets useful for research, data science, and policy analysis.
A curated Awesome-style subdirectory under the Awesome Public Datasets project focusing on Energy-related datasets (e.g., AMPds, BLUEd, COMBED, DBFC, ECO, Global Power Plant Database). It aggregates and links to high-quality, structured energy datasets useful for research and data science.