@tuplo/fletcher

HTTP request library, focused on web scraping.

Install

$ npm install @tuplo/fletcher

# or with yarn
$ yarn add @tuplo/fletcher

Usage

Fetch a HTML page and parse it using cheerio.

import fetch from '@tuplo/fletcher';

const $page = await fetch.html('https://foo.com/page.html');
const heading = $page.find('body > h1');

Fetch a JSON file and parse it.

const { foo } = await fetch.json('https://foo.com/page.html');

Find a script on a page and evaluate it.

const { foo } = await fetch.script('https://foo.com/page.html', {
  scriptPath: 'script:nth-of-type(3)',
});

Find JSON-LD metadata on a page.

const [jsonld] = await fetch.jsonld('https://foo.com/page.html');

Work with the raw Response.

const res = await fetch.response('https://foo.com');
console.log(res.headers);
console.log(await res.text());

Work with Puppeteer for headless browser automation.

const client = fetch.create({
  browserWSEndpoint: 'ws://localhost:3000',
});
const $page = await client.browser.html('https://foo.com');
const { foo } = await client.browser.json('https://foo.com', /ajax-list/);

Options

Option	Description	Default
`browserWSEndpoint`	Puppeteer web socket address
`cache`	Caches requests in memory	`false`
`delay`	Introduce a delay before the request (ms)	1_000
`formData`	Object with key/value pairs to send as form data
`encoding`	The encoding used by the source page, will be converted to UTF8
`headers`	A simple multi-map of names to values
`jsonData`	Object with key/value pairs to send as json data
`log`	Should log all request URLS to stderr	false
`onAfterRequest`	Callback to be called right after request is resolved
`proxy`	Proxy configuration (`host`, `port`, `username`, `password`)
`retry`	Retries failed responses	`async-retry`
`scriptFindFn`	A function to find a `script` element on the page, execute and return it
`scriptPath`	A CSS selector to pick a `script` element on the page, execute and return it
`scriptSandbox`	An object to use as base on an execution of a piece of code found on the page
`urlSearchParams`	A key-value object listing what parameters to add to the query string of `url`
`userAgent`	Set a custom user agent
`validateStatus`	A function to decide if the response status is an error and should be thrown

API

`fletcher(url: string, options?: FletcherOptions) => http.Response`

Generic utility to return a HTTP Response

`fletcher.html(url: string, options?: FletcherOptions) => Cheerio<AnyNode>`

Requests a HTTP resource, parses it using Cheerio and returns its

const $page = await fletcher.html('https://foo.com/page.html');
const heading = $page.find('body > h1');

`fletcher.script<T>(url: string, options?: FletcherOptions) => T`

Requests a HTTP resource, finds a script on it, executes and returns its global context.

const { foo } = await fletcher.script('https://foo.com/page.html', {
	scriptPath: 'script:nth-of-type(3)',
});

`fletcher.text(url: string, options?: FletcherOptions) => string`

Requests a HTTP resource, returning it as a string

`fletcher.cookies(url: string, options?: FletcherOptions) => CookieJar`

Requests a HTTP resources, returning the cookies returned with it.

`fletcher.json<T>(url: string, options?: FletcherOptions) => T`

Requests a HTTP resource, returning it as a JSON object

`fletcher.jsonld(url: string, options?: FletcherOptions) => unknown[]`

Requests a HTTP resource, retrieving all the JSON-LD blocks found on the document

`fletcher.response(url: string, options?: FletcherOptions) => Response`

Requests a HTTP resource, returning the full HTTP Response object

`fletcher.browser.html(url: string) => Cheerio<AnyNode>`

Requests a HTTP resource using Puppeteer/Chrome, parses it using Cheerio and returns its.

`fletcher.browser.json<T>(url: string, requestUrl: string | RegExp) => T`

Requests a HTTP resource using Puppeteer/Chrome, intercepts a request made by that page and returns it as a JSON object

`fletcher.create(options: FletcherOptions) => Object`

Creates a new instance of fletcher with a custom config

const instance = fletcher.create({ headers: { foo: 'bar' } });
await instance.json('http://foo.com');

Name		Name	Last commit message	Last commit date
Latest commit History 177 Commits
.github/workflows		.github/workflows
sh		sh
src		src
.eslintignore		.eslintignore
.gitattributes		.gitattributes
.gitignore		.gitignore
.node-version		.node-version
.npmrc		.npmrc
.prettierrc		.prettierrc
.releaserc		.releaserc
eslint.config.mjs		eslint.config.mjs
license		license
package-lock.json		package-lock.json
package.json		package.json
readme.md		readme.md
tsconfig.build.json		tsconfig.build.json
tsconfig.json		tsconfig.json
vite.config.mts		vite.config.mts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

@tuplo/fletcher

Install

Usage

Options

API

`fletcher(url: string, options?: FletcherOptions) => http.Response`

`fletcher.html(url: string, options?: FletcherOptions) => Cheerio<AnyNode>`

`fletcher.script<T>(url: string, options?: FletcherOptions) => T`

`fletcher.text(url: string, options?: FletcherOptions) => string`

`fletcher.cookies(url: string, options?: FletcherOptions) => CookieJar`

`fletcher.json<T>(url: string, options?: FletcherOptions) => T`

`fletcher.jsonld(url: string, options?: FletcherOptions) => unknown[]`

`fletcher.response(url: string, options?: FletcherOptions) => Response`

`fletcher.browser.html(url: string) => Cheerio<AnyNode>`

`fletcher.browser.json<T>(url: string, requestUrl: string | RegExp) => T`

`fletcher.create(options: FletcherOptions) => Object`

About

Releases 81

Packages

Contributors 2

Languages

License

tuplo/fletcher

Folders and files

Latest commit

History

Repository files navigation

@tuplo/fletcher

Install

Usage

Options

API

fletcher(url: string, options?: FletcherOptions) => http.Response

fletcher.html(url: string, options?: FletcherOptions) => Cheerio<AnyNode>

fletcher.script<T>(url: string, options?: FletcherOptions) => T

fletcher.text(url: string, options?: FletcherOptions) => string

fletcher.cookies(url: string, options?: FletcherOptions) => CookieJar

fletcher.json<T>(url: string, options?: FletcherOptions) => T

fletcher.jsonld(url: string, options?: FletcherOptions) => unknown[]

fletcher.response(url: string, options?: FletcherOptions) => Response

fletcher.browser.html(url: string) => Cheerio<AnyNode>

fletcher.browser.json<T>(url: string, requestUrl: string | RegExp) => T

fletcher.create(options: FletcherOptions) => Object

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 81

Packages 0

Contributors 2

Languages

`fletcher(url: string, options?: FletcherOptions) => http.Response`

`fletcher.html(url: string, options?: FletcherOptions) => Cheerio<AnyNode>`

`fletcher.script<T>(url: string, options?: FletcherOptions) => T`

`fletcher.text(url: string, options?: FletcherOptions) => string`

`fletcher.cookies(url: string, options?: FletcherOptions) => CookieJar`

`fletcher.json<T>(url: string, options?: FletcherOptions) => T`

`fletcher.jsonld(url: string, options?: FletcherOptions) => unknown[]`

`fletcher.response(url: string, options?: FletcherOptions) => Response`

`fletcher.browser.html(url: string) => Cheerio<AnyNode>`

`fletcher.browser.json<T>(url: string, requestUrl: string | RegExp) => T`

`fletcher.create(options: FletcherOptions) => Object`

Packages