Box Office Stats

This dataset contains information about movie producers, their movies, and the corresponding box office performance.

Tutorials are the best documentation: see the Box Office Movie Analysis Tutorial.

Description

This dataset provides detailed box office performance data for movies, including daily revenue, theater counts, and distributor information.

It links movies to their producer companies via ticker symbols, enabling analysis of box office success across different production studios and distributors over time.
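As a rough illustration of the per-producer analysis this linkage allows, the sketch below sums daily gross by ticker using plain Python records. The values are copied from the example rows on this page. The helper name `gross_by_ticker` is hypothetical and is not part of the sovai package.

```python
from collections import defaultdict

# Records shaped like rows of the dataset (values from the example rows on this page).
rows = [
    {"ticker": "600579", "date": "2011-02-11", "title": "Raymond Did It", "gross": 2999},
    {"ticker": "600579", "date": "2011-02-12", "title": "Raymond Did It", "gross": 193},
    {"ticker": "ZEEL", "date": "2022-03-18", "title": "The Kashmir Files", "gross": 413000},
]


def gross_by_ticker(records):
    """Sum daily gross per producer ticker (hypothetical helper, not a sovai function)."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["ticker"]] += rec["gross"]
    return dict(totals)


totals = gross_by_ticker(rows)
print(totals)
```

If the retrieved data is a pandas DataFrame with `ticker` as a regular column (an assumption; it may instead be an index level), the equivalent would be `df_movies.groupby("ticker")["gross"].sum()`.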

Data Access

Retrieving Data

from sovai import sov
# Load the box office dataset
df_movies = sov.data("movies/boxoffice")

Data Dictionary

| Column Name | Data Type | Description | Example |
|---|---|---|---|
| ticker | string | Ticker symbol of the movie producer company | "ZEEL" |
| date | date | Date of the movie's box office performance | 2022-03-18 |
| title | string | Title of the movie | "The Kashmir Files" |
| distributor | string | Distributor of the movie | "Zee Studios" |
| gross | integer | Gross box office revenue for the movie on the specified date | 413000 |
| percent_yd | float | Percentage change in gross revenue compared to the previous day | 0.0 |
| percent_lw | float | Percentage change in gross revenue compared to the previous week | 0.2 |
| theaters | integer | Number of theaters screening the movie on the specified date | 230 |
| per_theater | float | Average gross revenue per theater on the specified date | 1796.0 |
| total_gross | integer | Cumulative gross box office revenue for the movie up to the specified date | 413000 |
| days_in_release | integer | Number of days the movie has been in release as of the specified date | 1 |
| parent_company | string | Parent company of the movie producer | "Zee Entertainment Enterprises Limited" |
| distributor_address | string | Address of the movie distributor | "Laxmi Industrial Estate, Off New Link Road, An..." |
| distributor_website | string | Website of the movie distributor | "https://www.zee.com/" |
| release_date | date | Initial release date of the movie | 2022-03-17 |
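The derived columns can be checked by hand against the example rows on this page. The sketch below is a minimal illustration: it assumes `percent_yd` is stored as a fraction (so -0.94 means a 94% drop) and that `per_theater` is rounded to a whole number. Both conventions are inferred from the sample values, not stated by the data provider, and the function names are hypothetical.

```python
def pct_change(current, previous):
    """Fractional change from previous to current, rounded to two decimals."""
    return round((current - previous) / previous, 2)


def per_theater(gross, theaters):
    """Average gross per theater, rounded to a whole number and returned as a float."""
    return float(round(gross / theaters))


# "Raymond Did It": gross of 2999 on the first listed day, 193 on the next.
day2_change = pct_change(193, 2999)

# "The Kashmir Files": gross of 413000 across 230 theaters.
kashmir_avg = per_theater(413000, 230)

print(day2_change, kashmir_avg)
```

Both results agree with the `percent_yd` and `per_theater` values shown for those rows. Note that the first listed day of each title shows `percent_yd` as 0.0, since there is no previous day to compare against.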

Dataset Structure

The dataset is organized as a table with 228,484 rows and 15 columns. Each row represents a specific movie's box office performance on a particular date.

Data Types

The dataset contains the following data types:

  • String: ticker, title, distributor, parent_company, distributor_address, distributor_website

  • Date: date, release_date

  • Integer: gross, theaters, total_gross, days_in_release

  • Float: percent_yd, percent_lw, per_theater
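If the data is exported to text (CSV, for instance) and read back as strings, the columns need to be converted to the types listed above. The following is a small sketch of that conversion using only the standard library. The `coerce_row` helper is hypothetical, and the ISO `YYYY-MM-DD` date format is assumed from the examples on this page.

```python
from datetime import date

# Documented type for each group of columns; all remaining columns stay strings.
DATE_COLS = ("date", "release_date")
INT_COLS = ("gross", "theaters", "total_gross", "days_in_release")
FLOAT_COLS = ("percent_yd", "percent_lw", "per_theater")


def coerce_row(raw):
    """Convert string values in a raw record to the documented column types."""
    typed = dict(raw)
    for col in DATE_COLS:
        typed[col] = date.fromisoformat(raw[col])
    for col in INT_COLS:
        typed[col] = int(float(raw[col]))  # tolerate values such as "230.0"
    for col in FLOAT_COLS:
        typed[col] = float(raw[col])
    return typed


# String version of "The Kashmir Files" example row (address columns omitted).
raw = {
    "ticker": "ZEEL", "date": "2022-03-18", "title": "The Kashmir Files",
    "gross": "413000", "percent_yd": "0.0", "percent_lw": "0.0",
    "theaters": "230.0", "per_theater": "1796.0", "total_gross": "413000",
    "days_in_release": "1", "release_date": "2022-03-17",
}
typed = coerce_row(raw)
print(typed["date"], typed["theaters"], typed["per_theater"])
```

Going through `float` before `int` is a deliberate choice here: the example rows show `theaters` rendered as `230.0`, which `int("230.0")` would reject.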

Missing Values

If no data is available for a particular column of a movie on a specific date, the corresponding cell may be empty (a missing value).
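Aggregations should account for such gaps. The sketch below skips records whose `gross` is missing before summing. How missing cells are encoded is not specified here, so the code assumes either `None` or a float NaN. The two incomplete records are invented purely for illustration, and both helper names are hypothetical.

```python
import math


def is_missing(value):
    """True for None or a float NaN, two common encodings of a missing cell."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def total_known_gross(records):
    """Sum gross over records, skipping those where gross is missing."""
    return sum(rec["gross"] for rec in records if not is_missing(rec.get("gross")))


records = [
    {"date": "2011-02-11", "gross": 2999},
    {"date": "2011-02-12", "gross": 193},
    {"date": "2011-02-13", "gross": None},          # illustrative missing cell
    {"date": "2011-02-14", "gross": float("nan")},  # illustrative missing cell
]
known_total = total_known_gross(records)
print(known_total)
```

If `df_movies` is a pandas DataFrame (an assumption), `df_movies.dropna(subset=["gross"])` removes such rows before further analysis.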

Example Rows

Here are a few example rows from the dataset:

| ticker | date | title | distributor | gross | percent_yd | percent_lw | theaters | per_theater | total_gross | days_in_release | parent_company | distributor_address | distributor_website | release_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 600579 | 2011-02-11 | Raymond Did It | Plastic Age … | 2999 | 0.0 | 0.0 | 1.0 | 2999.0 | 2999 | 1 | KraussMaffei Group | 7295 Tellier St, Montreal, Quebec H1N 3S9, CA | https://plastic-age.com/en/ | 2011-02-10 |
| 600579 | 2011-02-12 | Raymond Did It | Plastic Age … | 193 | -0.94 | 0.0 | 1.0 | 193.0 | 3192 | 2 | KraussMaffei Group | 7295 Tellier St, Montreal, Quebec H1N 3S9, CA | https://plastic-age.com/en/ | 2011-02-10 |
| ZEEL | 2022-03-18 | The Kashmir Files | Zee Studios | 413000 | 0.0 | 0.0 | 230.0 | 1796.0 | 413000 | 1 | Zee Entertainment Enterprises Limited | Laxmi Industrial Estate, Off New Link Road, An... | https://www.zee.com/ | 2022-03-17 |
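The two "Raymond Did It" rows illustrate how the cumulative columns relate to the daily ones: `total_gross` is the running sum of `gross`, and `days_in_release` is the number of days between `date` and `release_date`. The sketch below checks both relationships for a single title. It assumes the rows begin at the first reported day and have no gaps, and the `check_title_rows` helper is hypothetical.

```python
from datetime import date

# The two "Raymond Did It" example rows, reduced to the relevant columns.
rows = [
    {"date": date(2011, 2, 11), "release_date": date(2011, 2, 10),
     "gross": 2999, "total_gross": 2999, "days_in_release": 1},
    {"date": date(2011, 2, 12), "release_date": date(2011, 2, 10),
     "gross": 193, "total_gross": 3192, "days_in_release": 2},
]


def check_title_rows(records):
    """Return inconsistency messages for one title's rows, processed in date order."""
    problems = []
    running_total = 0
    for rec in sorted(records, key=lambda r: r["date"]):
        running_total += rec["gross"]
        if running_total != rec["total_gross"]:
            problems.append(
                f"{rec['date']}: total_gross {rec['total_gross']} != running sum {running_total}"
            )
        elapsed = (rec["date"] - rec["release_date"]).days
        if elapsed != rec["days_in_release"]:
            problems.append(
                f"{rec['date']}: days_in_release {rec['days_in_release']} != {elapsed}"
            )
    return problems


problems = check_title_rows(rows)
print(problems)
```

An empty list means the rows are internally consistent under these assumptions. A title whose history starts mid-release would fail the running-sum check without the data being wrong.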

This page provides an overview of the box office dataset, including column descriptions, data types, examples, and sample rows.
