High-level picture:
- We set up an HTTP request listener, `onBeforeRequest`, in `background.js`.
- In `analyze.js`, we have a callback for this listener. We take the information we analyze from the callback data, and then pass the result to `resolveBuffer`, where we call our analysis functions on that request.
- These functions can be found in `searchFunctions.js`.
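The listener wiring described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: `registerRequestListener` and `makeCallback` are hypothetical helper names, the `api` parameter stands in for the WebExtensions `browser` object, and the fields copied out of `details` are an assumption about what the analysis needs.

```javascript
// Hypothetical helper: `api` is the WebExtensions `browser` object inside the
// extension, or a stub when running outside the browser.
function registerRequestListener(api, callback) {
  api.webRequest.onBeforeRequest.addListener(
    callback,
    { urls: ["<all_urls>"] }, // observe requests to every URL
    ["requestBody"] // include the request body in the callback data
  );
}

// Hypothetical callback factory: keeps only the fields we analyze and hands
// them to resolveBuffer (passed in so the sketch stays self-contained).
function makeCallback(resolveBuffer) {
  return function onBeforeRequestCallback(details) {
    const request = {
      id: details.requestId,
      reqUrl: details.url,
      originUrl: details.originUrl,
      requestBody: details.requestBody,
    };
    resolveBuffer(request);
  };
}
```

In the extension this would be called as `registerRequestListener(browser, makeCallback(resolveBuffer))`; passing the API object in as a parameter is only there so the sketch can be exercised with a stub.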
- The search functions are:
  - `locationKeywordSearch` looks for user-entered location elements.
  - `urlSearch` takes Firefox's URL classifications from the disconnect.me list.
  - `coordinateSearch` looks for longitudes and latitudes.
  - `regexSearch` matches regular expression patterns passed to it.
  - `fingerprintSearch` looks for text that matches the list that we have compiled (`src/assets/services.json`).
  - `pixelSearch` looks for URLs on our pixel list (`src/assets/keywords.json`).
  - `dynamicPixelSearch` looks for common properties of a pixel.
  - `ipSearch` looks for user-entered IP addresses sent to third parties.
  - `encodedEmailSearch` looks for emails, encoded in a few ways, being sent to third parties.
- Some of these functions, like `locationKeywordSearch`, `regexSearch`, and `ipSearch`, get passed user data from the watchlistKeyval and from the IP address-based dynamic data generated via ipinfo.io.
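To make the idea of a search function receiving user data concrete, here is a minimal sketch of a `regexSearch`-style routine. The names `regexSearchSketch` and `escapeRegExp`, the return shape, and the watchlist entries are all assumptions made for illustration; the real signatures in `searchFunctions.js` may differ.

```javascript
// Hypothetical helper: escape a user-entered keyword so it is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical stand-in for a regexSearch-style routine: returns every match
// of every pattern found in the request text.
function regexSearchSketch(requestText, patterns) {
  const hits = [];
  for (const pattern of patterns) {
    // matchAll requires the global flag, so add it if the caller left it off.
    const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
    const re = new RegExp(pattern.source, flags);
    for (const match of requestText.matchAll(re)) {
      hits.push({ pattern: pattern.source, text: match[0], index: match.index });
    }
  }
  return hits;
}

// Made-up watchlist entries turned into case-insensitive patterns.
const watchlist = ["jane.doe@example.com", "Middletown"];
const patterns = watchlist.map((kw) => new RegExp(escapeRegExp(kw), "i"));
const hits = regexSearchSketch("city=middletown&mail=jane.doe@example.com", patterns);
console.log(hits.map((h) => h.text));
```

Escaping the keywords first matters because user-entered strings such as email addresses contain characters (like `.`) that would otherwise be treated as regex syntax.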
- We format this user data as a list, `[locCoords, networkKeywords, services, currIpInfo]`, returned by the `importData()` function in `importSearchData.js`.
  - `locCoords` uses the `getLocationData.js` file, which uses the navigator API to ask for the user's location. So, the location popup should come up here.
  - `networkKeywords` grabs all the user data in the watchlistKeyval, runs the appropriate structured routines in `structuredRoutines.js` (for example, reformatting phone numbers with Google's API), and then returns everything as a dictionary.
  - `services` is the JSON from `src/assets/services.json`. This contains important websites that regularly perform invasive procedures, and we alert users to instances of their use.
  - `currIpInfo` is the user's current location and IP address as provided by ipinfo.io.
- This function is called on load (`importData().then((data) => {`) in `background.js`. Then, we pass `data` to the listeners. If the user changes a value, either a setting or a watchlist item, `importData()` will be called again.
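The load-and-reload flow above could look something like the sketch below. Everything here is hypothetical: `importDataSketch`, `makeDataHolder`, and the four loader names are invented for illustration, and loading the four pieces concurrently is an assumption rather than a description of `importSearchData.js`.

```javascript
// Hypothetical sketch: each loader is injected so the example runs anywhere.
async function importDataSketch(loaders) {
  // Assumption: the four pieces are independent, so load them concurrently.
  const [locCoords, networkKeywords, services, currIpInfo] = await Promise.all([
    loaders.getLocation(),
    loaders.getWatchlistKeywords(),
    loaders.getServices(),
    loaders.getIpInfo(),
  ]);
  return [locCoords, networkKeywords, services, currIpInfo];
}

// Hypothetical holder for the latest data; listeners would read `current`.
function makeDataHolder(loaders) {
  const holder = { current: null };
  holder.refresh = async () => {
    holder.current = await importDataSketch(loaders);
    return holder.current;
  };
  return holder;
}
```

Under this sketch, `holder.refresh()` would run once on load and again whenever a setting or watchlist item changes, so the listeners always see the most recent list.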
- For each HTTP request, we call all analysis routines. Once the analysis is complete, we pass its result to `addToEvidenceStore`, found in `addEvidence.js`.
- `addToEvidenceStore` takes the passed evidence and interacts with the database. We do not add duplicate evidence. Duplicate evidence is defined as evidence with the same root/reqUrl and the same permission/type. Example: the exact same request, tagged as `monetization`, being handled twice.
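The duplicate rule can be illustrated with a small in-memory sketch. This is not the code in `addEvidence.js`: `addEvidenceSketch` is a hypothetical name, a `Map` stands in for the IndexedDB store, and the `advertising` type and example URL are made up.

```javascript
// In-memory stand-in for the evidence store (assumption: the real one is IndexedDB).
// Shape used here: rootUrl -> permission -> type -> reqUrl -> evidence object.
function addEvidenceSketch(store, evidence) {
  const { rootUrl, permission, type, reqUrl } = evidence;
  const byPermission = store.get(rootUrl) || {};
  const byType = byPermission[permission] || (byPermission[permission] = {});
  const byUrl = byType[type] || (byType[type] = {});

  // Same root/reqUrl and same permission/type: treat as a duplicate and skip.
  if (Object.prototype.hasOwnProperty.call(byUrl, reqUrl)) {
    return false;
  }
  byUrl[reqUrl] = evidence;
  store.set(rootUrl, byPermission);
  return true;
}

const store = new Map();
const evidence = {
  rootUrl: "nytimes.com",
  permission: "monetization",
  type: "advertising", // made-up type for illustration
  reqUrl: "https://ads.example/pixel",
};
console.log(addEvidenceSketch(store, evidence)); // true: first time this evidence is seen
console.log(addEvidenceSketch(store, { ...evidence })); // false: duplicate, not stored again
```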
- There are three IndexedDB databases: evidenceKeyval, watchlistKeyval, and settingsKeyval. evidenceKeyval is defined in `analysis/interactDB/openDB.js`; watchlistKeyval and settingsKeyval are defined in `libs/indexed-db/openDB.js`. Both use async/await and have data persist across sessions. We use this library to wrap the DB.
- watchlistKeyval/settingsKeyval are populated when a user interacts with the front-end.
- In evidenceKeyval, we keep all of the evidence we have collected in our analysis. The keys to this store are rootUrls (e.g. nytimes.com or facebook.com) and the values are the evidence we have at these rootUrls.
  - So, if you call `e = await evidenceKeyval.get("nytimes.com")`, `e` will point to a dictionary of evidence. This evidence is structured: permission -> type -> requestUrl -> evidence object.
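As a rough illustration of the permission -> type -> requestUrl -> evidence object nesting, the sketch below walks such a dictionary and flattens it into a list. `listEvidence` is a hypothetical helper, and the sample permissions, types, URLs, and evidence fields are invented.

```javascript
// Hypothetical helper: flatten the nested evidence dictionary for one rootUrl.
function listEvidence(evidenceForRoot) {
  const rows = [];
  for (const [permission, byType] of Object.entries(evidenceForRoot)) {
    for (const [type, byUrl] of Object.entries(byType)) {
      for (const [requestUrl, evidence] of Object.entries(byUrl)) {
        rows.push({ permission, type, requestUrl, evidence });
      }
    }
  }
  return rows;
}

// Made-up value standing in for what `evidenceKeyval.get(...)` might return.
const sampleEvidence = {
  monetization: {
    advertising: {
      "https://ads.example/pixel": { snippet: "..." },
    },
  },
  location: {
    coarseLocation: {
      "https://geo.example/lookup": { snippet: "..." },
    },
  },
};

for (const row of listEvidence(sampleEvidence)) {
  console.log(`${row.permission} / ${row.type} / ${row.requestUrl}`);
}
```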
- The structure for permission, type, and evidence objects can be found in `classModels.js`. Throughout the backend, we always use `permissionEnum` or `typeEnum` to reference permissions and types. If a new function is written that calls `addToEvidenceStore`, it should pass permission and type parameters from these enums.
- In general, any strings that will be used throughout the codebase should be declared as an object in the `classModels.js` file.
- The front end uses the `privacyLabels` enum. The `permissionEnum` and `typeEnum` should have exactly the same naming conventions as the `privacyLabels` enum. If evidence is added with a permission or type that is not in `privacyLabels`, it will not display on the front end, which iterates through evidence and looks for evidence that has permissions and types from `privacyLabels`.
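The reason for always going through the enums can be shown with a small sketch. The values and the shape of `privacyLabels` below are assumptions made for illustration only; the real definitions are in `classModels.js` and may be structured differently. `isDisplayable` is a hypothetical helper.

```javascript
// Illustrative values only; the real enums are defined in classModels.js.
const permissionEnum = Object.freeze({
  monetization: "monetization",
  location: "location",
});
const typeEnum = Object.freeze({
  advertising: "advertising",
  coarseLocation: "coarseLocation",
});

// Assumed shape for this sketch: permission -> list of types shown under it.
const privacyLabels = Object.freeze({
  monetization: ["advertising"],
  location: ["coarseLocation"],
});

// Hypothetical check mirroring the front end's filter: evidence only shows up
// if both its permission and its type are known to privacyLabels.
function isDisplayable(labels, permission, type) {
  return (
    Object.prototype.hasOwnProperty.call(labels, permission) &&
    labels[permission].includes(type)
  );
}

console.log(isDisplayable(privacyLabels, permissionEnum.monetization, typeEnum.advertising));
console.log(isDisplayable(privacyLabels, "monetizaiton", "advertising")); // misspelled string literal
```

A hand-typed string with a typo would silently fail the check, whereas a misspelled enum property yields `undefined` and is easier to catch.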
- The frontend gets data from the backend (pulls from the evidenceKeyval) in the `libs/indexed-db/index.js` file.
- More documentation of the frontend should be added. At a high level, the popup displays the data for the website the user is currently on, and the home page shows all of the labels we have in the evidenceKeyval.