Media Memorability Prediction

Introduction

The Predicting Media Memorability task is proposed as part of the MediaEval 2018 Benchmarking Initiative for Multimedia Evaluation. The purpose of this task is the automatic prediction of multimedia content memorability. Understanding what makes content memorable has a broad range of applications. Efficient memorability prediction models will also advance the semantic understanding of multimedia content by placing human perception, expressed through memorability, at the center of scene understanding.

For the task, participants will be provided with extensive datasets of multimedia content associated with memorability annotations, and with pre-extracted state-of-the-art audio-visual features. The corresponding ground truth consists of objective measures of memory performance and was collected through recognition tests. Participants will be required to train computational models capable of inferring multimedia content memorability from features of their choice. Models will be evaluated with standard evaluation metrics.

Dataset

The full development set consists of 8000 videos, together with their corresponding short-term and long-term memorability scores.

The data folder contains only a few sample files and videos used by the project.

Videos

Videos were extracted from raw footage used by professionals when creating content. They are varied and contain different scene types. All videos are released in .webm format, with a bit rate of 3000 kbps. They are provided as individual files, named:

video<Nb>.webm

<Nb> is the number of a video.

Ground Truth

The corresponding ground truth for the development set can be found in the file dev-set/ground-truth.csv. It contains one line per video, which consists of:

The video's name.
Its short-term memorability score.
The number of annotations used to calculate the short-term memorability score.
Its long-term memorability score.
The number of annotations used to calculate the long-term memorability score.

Example:

video10.webm, 0.950, 34, 0.900, 10
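The five-column layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not the project's own loader: the names `GroundTruthRow` and `parse_ground_truth` are hypothetical, and it assumes the file has no header line and uses commas (optionally followed by a space) as in the example row.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class GroundTruthRow(NamedTuple):
    """One line of the ground-truth file (hypothetical container)."""
    video: str
    short_term: float
    short_term_annotations: int
    long_term: float
    long_term_annotations: int


def parse_ground_truth(lines: Iterable[str]) -> List[GroundTruthRow]:
    """Parse rows in the five-column layout listed above.

    Assumption: the input has no header line; adapt this if the real file has one.
    """
    rows = []
    # skipinitialspace tolerates the space after each comma seen in the example.
    for record in csv.reader(lines, skipinitialspace=True):
        if not record:  # ignore blank lines
            continue
        name, st, st_n, lt, lt_n = record
        rows.append(GroundTruthRow(name, float(st), int(st_n), float(lt), int(lt_n)))
    return rows


# The example row from this section, wrapped in a file-like object.
sample = io.StringIO("video10.webm, 0.950, 34, 0.900, 10\n")
rows = parse_ground_truth(sample)
```

For the real file, an open handle such as `open("dev-set/ground-truth.csv", newline="")` could be passed in place of `sample`.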

Precomputed Features

A set of pre-extracted visual features is provided to task participants. Precomputed features are organized in different folders, one per feature. For the following two features, participants will find one file per video.

C3D

Item	Description
Outputs	The output of the final classification layer of a 3D Convolutional Neural Network.
File Format	`.txt` file.
Feature	A single list of numbers on one line; the dimension is 101.
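A C3D file as described in the table could be read along these lines. This is a sketch under one assumption that the source does not confirm: the numbers are separated by whitespace. The function name `parse_c3d_line` is hypothetical.

```python
from typing import List

C3D_DIM = 101  # dimension stated in the table above


def parse_c3d_line(line: str) -> List[float]:
    """Parse one C3D feature line into a list of floats.

    Assumption: values are separated by whitespace.
    """
    values = [float(token) for token in line.split()]
    if len(values) != C3D_DIM:
        raise ValueError(f"expected {C3D_DIM} values, got {len(values)}")
    return values


# Synthetic line for illustration only, not real feature data.
example = " ".join(["0.01"] * C3D_DIM)
vector = parse_c3d_line(example)
```

The length check is there to catch truncated or mis-split files early, before they reach a model.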

HMP

Item	Description
Outputs	Histogram of Motion Patterns.
File Format	`.txt` file.
Feature	A single list of pairs of numbers with format `bin:number` on one line; the dimension is 6075.
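Because HMP values are stored as `bin:number` pairs, a loader usually expands them into a fixed-length vector. The sketch below makes several assumptions that the source does not state: pairs are separated by whitespace, bin indices start at 1, and bins that do not appear have the value 0. The name `parse_hmp_line` and the `index_base` parameter are hypothetical.

```python
from typing import List

HMP_DIM = 6075  # dimension stated in the table above


def parse_hmp_line(line: str, index_base: int = 1) -> List[float]:
    """Expand `bin:number` pairs into a dense vector of length HMP_DIM.

    Assumptions: whitespace-separated pairs, bins numbered from `index_base`,
    and a value of 0.0 for any bin that is not listed.
    """
    dense = [0.0] * HMP_DIM
    for pair in line.split():
        bin_text, value_text = pair.split(":", 1)
        position = int(bin_text) - index_base
        if not 0 <= position < HMP_DIM:
            raise ValueError(f"bin {bin_text} is outside the expected range")
        dense[position] = float(value_text)
    return dense


# Synthetic line for illustration only, not real feature data.
example = "1:0.5 3:0.25 6075:0.125"
vector = parse_hmp_line(example)
```

If the real files number bins from 0, calling `parse_hmp_line(line, index_base=0)` would be the corresponding adjustment.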

The next feature was extracted from three key frames of each video: the first frame (0), the frame at one third (56) and the frame at two thirds (112). There are therefore three files for each video, named video<Nb>-0.txt, video<Nb>-56.txt and video<Nb>-112.txt.

LBP

Item	Description
Outputs	Local Binary Patterns, calculated for patches of 8 × 15 pixels.
File Format	`.txt` file.
Feature	A single list of numbers on one line; the dimension depends on the image size.
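Since LBP comes as three files per video, a loader first has to collect the three key-frame files. The sketch below builds the file names from the naming scheme given above and shows one possible way to combine the per-frame vectors, namely plain concatenation. That combination strategy is an assumption, not something the source prescribes, and both function names are hypothetical.

```python
from typing import List, Sequence

KEY_FRAMES = (0, 56, 112)  # key-frame indices named in the text above


def lbp_filenames(number: int) -> List[str]:
    """Return the three LBP file names for a given video number."""
    return [f"video{number}-{frame}.txt" for frame in KEY_FRAMES]


def concatenate_features(per_frame: Sequence[Sequence[float]]) -> List[float]:
    """Join per-frame vectors into one vector (one possible design choice)."""
    joined: List[float] = []
    for vector in per_frame:
        joined.extend(vector)
    return joined


names = lbp_filenames(10)
# Tiny synthetic vectors for illustration only, not real LBP data.
combined = concatenate_features([[1.0], [2.0, 3.0], [4.0]])
```

Averaging the three vectors would be an alternative that keeps the dimension fixed, provided all key frames yield vectors of the same length.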

Models

These models have been tested in the project:

Linear Regression
K-Nearest Neighbors Regression
Support Vector Regression
Random Forest Regression
AdaBoost Regression
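To illustrate how a model from this list maps a feature vector to a memorability score, here is a dependency-free sketch of K-nearest neighbors regression. It is not the project's implementation: the function name `knn_predict`, the choice of Euclidean distance and the toy data are all assumptions made for the example.

```python
import math
from typing import Sequence


def knn_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[float],
    query: Sequence[float],
    k: int = 3,
) -> float:
    """Predict a score as the mean target of the k nearest training points.

    Assumption: plain Euclidean distance with uniform weights.
    """
    if len(train_x) != len(train_y):
        raise ValueError("train_x and train_y must have the same length")
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by distance to the query and keep the k closest.
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / k


# Toy data for illustration only: 2-D "features" and made-up scores.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
train_y = [0.9, 0.8, 0.7, 0.2]
# The three nearest neighbours of the query have targets 0.9, 0.8 and 0.7.
prediction = knn_predict(train_x, train_y, [0.1, 0.1], k=3)
```

In practice a library such as scikit-learn offers ready-made estimators for all five model families; the sketch only shows the idea behind one of them.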

Evaluation

The official evaluation metric will be Spearman's rank correlation between the predicted memorability scores and the ground-truth memorability scores, computed over all test videos.
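Spearman's rank correlation is the Pearson correlation of the two rank sequences, with tied values sharing the mean of their ranks. The following is a minimal standard-library sketch of that definition; the function names `average_ranks` and `spearman` are hypothetical and this is not the official evaluation script.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(predicted: Sequence[float], truth: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the rank sequences."""
    if len(predicted) != len(truth) or len(predicted) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rp, rt = average_ranks(predicted), average_ranks(truth)
    n = len(rp)
    mean_p, mean_t = sum(rp) / n, sum(rt) / n
    cov = sum((a - mean_p) * (b - mean_t) for a, b in zip(rp, rt))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_t = sum((b - mean_t) ** 2 for b in rt)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_p * var_t)


# Made-up scores for illustration only.
rho = spearman([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

Because only the ordering matters, a model can score well on this metric even if its predicted values are on a different scale from the ground truth. `scipy.stats.spearmanr` computes the same statistic for those who prefer a library call.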

Dependencies

License

Distributed under the MIT License. See LICENSE for more information.
