
Releases: Wesleyan-Media-Project/public_data

2020 Facebook Aggregate Weekly Spend of Page IDs Caught through Keyword Search of Federal General Election Advertisers (1/26/20 - 11/7/20)

12 Aug 21:52
1b5bf6c

Authors: Markus Neumann, Jielu Yao, Pavel Oleinikov, Laura Baum, Colleen Bogucki, Travis Ridout, Mike Franz and Erika Franklin Fowler

The file weekly_adds_for_page_id_disclaimer_041322.csv contains all relevant data. Scroll down to 'Assets' and click on the filename to download it. Make sure to click on the file itself: the source code files contain only compressed versions of the GitHub repo readme, not the data.


Cite as

Markus Neumann, Jielu Yao, Pavel Oleinikov, Laura Baum, Colleen Bogucki, Travis Ridout, Mike Franz and Erika Franklin Fowler. (2022). 2020 Facebook Aggregate Weekly Spend of Page IDs Caught through Keyword Search of Federal General Election Advertisers (1/26/20 - 11/7/20). Wesleyan Media Project. Retrieved from https://github.com/Wesleyan-Media-Project/public_data/releases/tag/fb2020_weekly_agg_v1.


Release summary

This file contains aggregate weekly spending, compiled from the Facebook Aggregate Report, for the period 1/26/2020 to 11/7/2020. It covers all page name and disclaimer combinations that the Wesleyan Media Project identified as associated with a page ID caught through keyword searches of the API for all federal candidate names during the general election period (9/1/2020 through Election Day).


Details

Explanation of the columns and some cleanup that was done:

  • from_date and to_date. These dates indicate the reporting week, starting on a Sunday and ending on a Saturday. The first reporting week is from 01/26/2020 to 02/01/2020. The last reporting week is from 11/01/2020 to 11/07/2020.
  • pd_id - the pd_id that was assigned to the entity by the WMP process
  • disclaimer - the paid-for-by string
  • page_name - the latest value of the page name. The "latest" means that of all page names that were associated with a pd_id (in the case that a page name changed over time), the file keeps the one that occurred closest to Election Day (chronologically it was the most recent). This is necessary as a precaution: when a page gets deleted, the name field in the lifelong report becomes empty. Essentially, the table contains the "last known name of the page."
  • new_spend - the difference between the values of the "amt_spent" column that FB reported in the daily aggregate report on the "to_date" and on the "from_date".
  • num_of_new_ads - the difference of the values of the "num_of_ads" column reported by FB between the "to_date" and "from_date"
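
As a rough illustration of working with the columns listed above, the following sketch totals new_spend and num_of_new_ads per pd_id using only the Python standard library. The sample rows, the pd_id values, the date format, and the column order are all made up for the example and are assumptions, not a description of the actual file; the helper name total_by_pd_id is likewise hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only. The real data lives in
# weekly_adds_for_page_id_disclaimer_041322.csv; its column order,
# date format, and pd_id format are assumptions here.
SAMPLE = """from_date,to_date,pd_id,disclaimer,page_name,new_spend,num_of_new_ads
2020-07-05,2020-07-11,pd-example-1,Example Committee,Example Page,650,12
2020-07-12,2020-07-18,pd-example-1,Example Committee,Example Page,300,4
2020-07-05,2020-07-11,pd-example-2,Another Group,Another Page,100,1
"""


def total_by_pd_id(handle):
    """Sum new_spend and num_of_new_ads for each pd_id (hypothetical helper)."""
    totals = defaultdict(lambda: {"new_spend": 0.0, "num_of_new_ads": 0})
    for row in csv.DictReader(handle):
        entry = totals[row["pd_id"]]
        entry["new_spend"] += float(row["new_spend"])
        # Go through float first in case counts are stored as "12.0".
        entry["num_of_new_ads"] += int(float(row["num_of_new_ads"]))
    return dict(totals)


totals = total_by_pd_id(io.StringIO(SAMPLE))
```

To try this on the downloaded file, the in-memory sample could be replaced with open("weekly_adds_for_page_id_disclaimer_041322.csv", newline=""), provided the header names match the column list above.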

All reporting weeks in the range are included for every entry, even when a page did not yet exist early in the period. For example, a page that did not exist in February may show its first spend in July.
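
The Sunday-to-Saturday reporting weeks described here can be enumerated with a short sketch. The start and end dates come from the column notes; the helper names reporting_weeks and fill_missing_weeks are hypothetical, and treating weeks with no record as zero spend is an assumption made for illustration, not something the release notes state.

```python
from datetime import date, timedelta

FIRST_SUNDAY = date(2020, 1, 26)   # start of the first reporting week
LAST_SATURDAY = date(2020, 11, 7)  # end of the last reporting week


def reporting_weeks(start=FIRST_SUNDAY, end=LAST_SATURDAY):
    """Return (from_date, to_date) pairs, each Sunday through Saturday."""
    weeks = []
    from_date = start
    while from_date + timedelta(days=6) <= end:
        weeks.append((from_date, from_date + timedelta(days=6)))
        from_date += timedelta(days=7)
    return weeks


def fill_missing_weeks(spend_by_from_date, weeks):
    """Pad a page's weekly spend to the full range.

    Assumption: weeks without a record are treated as zero spend.
    """
    return [(f, t, spend_by_from_date.get(f, 0)) for f, t in weeks]


weeks = reporting_weeks()
# Hypothetical page whose first spend appears in July.
padded = fill_missing_weeks({date(2020, 7, 5): 250}, weeks)
```

Under these assumptions the range yields 41 weekly rows per entry, whether or not the page existed for the whole period.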

Because FB rounds up the spend when it is below $100, the very first "new_spend" value for a newly created page will contain the value $100, even though the real spend may be lower. For example, if a pd_id had the following real (and not reported) cumulative values of spend at the end of its first four weeks - 37, 45, 87, 99 - the reported values will be 100, 100, 100, 100, and as a result the "new_spend" column will contain the values 100, 0, 0, 0. Unfortunately, there is nothing we can do about this. This round-up rule applies to all reporting done by Facebook. Fortunately, once the spend goes above $100, it is reported precisely down to the dollar. In this regard, deriving the numbers from the lifelong report is better because it involves the fewest round-ups.
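
The round-up example above can be reproduced with a small sketch. It is only an illustration of the arithmetic, not the WMP pipeline: the function names are hypothetical, a baseline of 0 before the page's first week is assumed, and a real spend of exactly 0 is assumed to be reported as 0.

```python
def reported_cumulative(real_cumulative):
    """Apply the round-up rule described in the text.

    Assumption: a real spend of 0 is reported as 0; any positive
    amount below 100 is reported as 100; larger amounts are exact.
    """
    if 0 < real_cumulative < 100:
        return 100
    return real_cumulative


def weekly_new_spend(reported_values):
    """Difference consecutive week-end cumulative values.

    Assumption: the baseline before the first week is 0.
    """
    new_spend = []
    previous = 0
    for value in reported_values:
        new_spend.append(value - previous)
        previous = value
    return new_spend


# The example from the text: real cumulative spend over four weeks.
real = [37, 45, 87, 99]
reported = [reported_cumulative(v) for v in real]
new_spend = weekly_new_spend(reported)

# Made-up continuation: a fifth week takes the real total to 140.
real_longer = [37, 45, 87, 99, 140]
new_spend_longer = weekly_new_spend([reported_cumulative(v) for v in real_longer])
```

In the made-up continuation, the weekly values are 100, 0, 0, 0, 40. The individual weeks are distorted, but their sum (140) matches the real cumulative spend once the total has passed $100.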