getHtml Function

The getHtml function is the core of the Waterfall-Fetch library. It fetches HTML content from a specified URL using various strategies.

Syntax

async function getHtml(
	url: string,
	options?: {
		set: "cheap" | "js" | null;
		evalFunction?: OnPageEvaluationFunction;
	}
): Promise<HtmlResponse>;

Parameters

  • url (string): The URL to fetch HTML content from.
  • options (object, optional):
    • set (“cheap” | “js” | null): The strategy set to use. Default is null (uses all strategies).
    • evalFunction (function, optional): A custom evaluation function for use with Puppeteer.

Return Value

Returns a Promise that resolves to an HtmlResponse object:

type HtmlResponse = {
	success: boolean;
	html: string | null;
	root?: NodeHTMLElement;
	evaluation_result?: any;
	strategy: FetchStrategy;
	error?: Error | null | string | unknown;
	status?: number | string | null;
};

Example Usage

import getHtml from "waterfall-fetch";

const url = "https://example.com";

// Basic usage
const result = await getHtml(url);

// Using the 'cheap' strategy set
const { html, error, success } = await getHtml(url, { set: "cheap" });

if (success) {
	console.log(html);
} else {
	console.error(error);
}

You can customize the strategy set used by Waterfall-Fetch. Check out the Advanced Usage section to learn how to create and use custom strategy sets.