Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hidden renderer for fetching backup search results #27285

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

DJAndries
Copy link
Collaborator

Resolves brave/brave-browser#43399

Submitter Checklist:

  • I confirm that no security/privacy review is needed and no other type of reviews are needed, or that I have requested them
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally:
    • npm run test -- brave_browser_tests, npm run test -- brave_unit_tests wiki
    • npm run presubmit wiki, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

request->timeout_timer.Start(
FROM_HERE, kTimeout,
base::BindOnce(&BackupResultsServiceImpl::CleanupAndDispatchResult,
base::Unretained(this), request, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

content::GetUIThreadTaskRunner({})->PostTask(
FROM_HERE,
base::BindOnce(&BackupResultsServiceImpl::CleanupAndDispatchResult,
base::Unretained(this), pending_request, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

pending_request->simple_url_loader->DownloadToString(
pending_request->shared_url_loader_factory.get(),
base::BindOnce(&BackupResultsServiceImpl::HandleURLLoaderResponse,
base::Unretained(this), pending_request),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

GURL(*url_str), std::nullopt,
base::BindOnce(
&WebDiscoveryRetrieveBackupResultsFunction::HandleBackupResults,
this));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] BindOnce/BindRepeating may allow callers to access objects which may already be freed in the C++ lifecycle.
Verify the occurrences manually.


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

@diracdeltas diracdeltas requested a review from bridiver January 21, 2025 17:07
public:
void DidFinishLoad(content::RenderFrameHost* render_frame_host,
const GURL& validated_url) override {
if (!validated_url.SchemeIsHTTPOrHTTPS()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can limit to only HTTPS here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually we can limit it to only google? the API should not be able to initiate a load to any other origin.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can limit to only HTTPS here?

done

actually we can limit it to only google? the API should not be able to initiate a load to any other origin.

We may invoke requests using various TLDs depending on the user's locale, which makes this less straightforward. In HandleWebContentsStartRequest, we do ensure that the redirected requests have the same origin as the original URL.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we do the same thing as WDP extension here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DJAndries can you point to the logic that decides on the TLD based on locale? it would be really bad if we assumed google.$TLD was owned by google and allowed that to go through but it actually wasn't. i think we need to be stricter than the original WDP logic @ShivanKaul given the risks are higher here.

request->timeout_timer.Start(
FROM_HERE, kTimeout,
base::BindOnce(&BackupResultsServiceImpl::CleanupAndDispatchResult,
base::Unretained(this), request, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

content::GetUIThreadTaskRunner({})->PostTask(
FROM_HERE,
base::BindOnce(&BackupResultsServiceImpl::CleanupAndDispatchResult,
base::Unretained(this), pending_request, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

pending_request->simple_url_loader->DownloadToString(
pending_request->shared_url_loader_factory.get(),
base::BindOnce(&BackupResultsServiceImpl::HandleURLLoaderResponse,
base::Unretained(this), pending_request),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

GURL(*url_str), std::nullopt,
base::BindOnce(
&WebDiscoveryRetrieveBackupResultsFunction::HandleBackupResults,
this));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] BindOnce/BindRepeating may allow callers to access objects which may already be freed in the C++ lifecycle.
Verify the occurrences manually.


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

@DJAndries DJAndries requested a review from a team as a code owner January 22, 2025 03:15
request->timeout_timer.Start(
FROM_HERE, kTimeout,
base::BindOnce(&BackupResultsServiceImpl::CleanupAndDispatchResult,
base::Unretained(this), request, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

content::GetUIThreadTaskRunner({})->PostTask(
FROM_HERE,
base::BindOnce(&BackupResultsServiceImpl::CleanupAndDispatchResult,
base::Unretained(this), pending_request, std::nullopt));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

pending_request->simple_url_loader->DownloadToString(
pending_request->shared_url_loader_factory.get(),
base::BindOnce(&BackupResultsServiceImpl::HandleURLLoaderResponse,
base::Unretained(this), pending_request),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reported by reviewdog 🐶
[semgrep] base::Unretained is most of the time unrequited, and a weak reference is better suited for secure coding.
Consider swapping Unretained for a weak reference.
base::Unretained usage may be acceptable when a callback owner is guaranteed
to be destroyed with the object base::Unretained is pointing to, for example:

- PrefChangeRegistrar
- base::*Timer
- mojo::Receiver
- any other class member destroyed when the class is deallocated


Source: https://github.com/brave/security-action/blob/main/assets/semgrep_rules/client/chromium-uaf.yaml


Cc @thypon @goodov @iefremov

Copy link
Contributor

[puLL-Merge] - brave/brave-core@27285

Description

This PR implements a backup results service for Brave Search, allowing it to fetch and process search results from backup providers. It includes functionality for both direct URL fetching and full page rendering modes, with support for cookie handling and request throttling. This is used to support features like fallback mixing (GMix) and Web Discovery Project.

Security Hotspots

  1. Cookie Handling: The service processes cookies from backup search providers in an OTR profile to isolate cookie storage. The service needs to ensure cookies are not leaked between sessions.

    • Location: backup_results_service_impl.cc, backup_results_navigation_throttle.cc
  2. URL Loading: The service makes HTTP requests to external search providers. User input validation and URL sanitization are critical.

    • Location: brave_search_fallback_host.cc
  3. Resource Limits: The service limits request size and timeout periods to prevent resource exhaustion attacks.

    • Location: backup_results_service_impl.cc
Changes

Changes

  1. Browser Service
  • Added BackupResultsService interface and implementation
  • Created factory for service instantiation
  • Implemented URL loading and page rendering functionality
  1. Navigation Control
  • Added BackupResultsNavigationThrottle to control navigation in backup result web contents
  • Implemented request filtering and validation
  1. Extension API
  • Added webDiscovery extension API endpoint
  • Implemented retrieveBackupResults function for extension access
  1. Tests
  • Added browser tests for backup results service
  • Added render and navigation tests
  • Updated existing search tests
sequenceDiagram
    participant Extension
    participant BackupResultsService
    participant OTRProfile
    participant WebContents
    participant BackupProvider

    Extension->>BackupResultsService: retrieveBackupResults(url)
    BackupResultsService->>OTRProfile: Create
    
    alt Full Render Mode
        BackupResultsService->>WebContents: Create
        WebContents->>BackupProvider: Initial Request
        BackupProvider->>WebContents: Response with Redirect
        WebContents->>BackupProvider: Follow Redirect
        BackupProvider->>WebContents: Final Results
        WebContents->>BackupResultsService: Extract HTML
    else Direct Mode
        BackupResultsService->>BackupProvider: Direct Request
        BackupProvider->>BackupResultsService: Results
    end
    
    BackupResultsService->>Extension: Return Results
    BackupResultsService->>OTRProfile: Cleanup
Loading

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add hidden renderer for fetching backup search results
7 participants