Chapter 2 Data sources
Our main data source is obtained from https://www.imdb.com/interfaces/, a daily updated movie dataset collected by IMDB. They made these subsets of IMDb data available for access to customers for personal and non-commercial use, but didn’t reveal the way they collected.
Compared to other options like TMDB Database API, the one in IMDB can be directly download, up-to-date, and follows standard database format with id to each person, movie, and even episode.
This dataset contains massive information about 258k+ films:
Basics
: genres, runtimes, starting year, is for adult or not, language, region ···Crews
: directors, writers ···Principal cast
: category, role, characters as well as their birth year, profession ···Rating
: average rating on IMDB, number of votes ···
Other than that, we also use an external data sources numbers to combine commercial statistics of 6,000+ movies released to theaters: