Note
The Scrapegraphai PHP API Library is currently in beta and we're excited for you to experiment with it!
This library has not yet been exhaustively tested in production environments and may be missing some features you'd expect in a stable release. As we continue development, there may be breaking changes that require updates to your code.
We'd love your feedback! Please share any suggestions, bug reports, feature requests, or general thoughts by filing an issue.
The Scrapegraphai PHP library provides convenient access to the Scrapegraphai REST API from any PHP 8.1.0+ application.
It is generated with Stainless.
The REST API documentation can be found on scrapegraphai.com.
To use this package, install via Composer by adding the following to your application's composer.json:
{
"repositories": [
{
"type": "vcs",
"url": "git@github.com:stainless-sdks/scrapegraphai-php.git"
}
],
"require": {
"org-placeholder/scrapegraphai": "dev-main"
}
}
<?php
use Scrapegraphai\Client;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;
$client = new Client(
apiKey: getenv("SCRAPEGRAPHAI_API_KEY") ?: "My API Key",
environment: "environment_1",
);
$params = SmartscraperCreateParams::from(
userPrompt: "Extract the product name, price, and description"
);
$completedSmartscraper = $client->smartscraper->create($params);
var_dump($completedSmartscraper->request_id);
When the library is unable to connect to the API, or if the API returns a non-success status code (i.e., 4xx or 5xx response), a subclass of Scrapegraphai\Errors\APIError will be thrown:
<?php
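// Import every error class named in the catch blocks below; an unimported
// class name in a catch clause never matches.
use Scrapegraphai\Errors\APIConnectionError;
use Scrapegraphai\Errors\APIStatusError;
use Scrapegraphai\Errors\RateLimitError;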
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;
try {
$params = SmartscraperCreateParams::from(
userPrompt: "Extract the product name, price, and description"
);
$Smartscraper = $client->smartscraper->create($params);
} catch (APIConnectionError $e) {
echo "The server could not be reached", PHP_EOL;
var_dump($e->getPrevious());
} catch (RateLimitError $_) {
echo "A 429 status code was received; we should back off a bit.", PHP_EOL;
} catch (APIStatusError $e) {
echo "Another non-200-range status code was received", PHP_EOL;
var_dump($e->status);
}
Error codes are as follows:
Cause | Error Type |
---|---|
HTTP 400 | BadRequestError |
HTTP 401 | AuthenticationError |
HTTP 403 | PermissionDeniedError |
HTTP 404 | NotFoundError |
HTTP 409 | ConflictError |
HTTP 422 | UnprocessableEntityError |
HTTP 429 | RateLimitError |
HTTP >= 500 | InternalServerError |
Other HTTP error | APIStatusError |
Timeout | APITimeoutError |
Network error | APIConnectionError |
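The HTTP rows of the table above can be read as a simple lookup. The following is a minimal sketch of that mapping, not the library's own code: errorTypeForStatus is a hypothetical helper introduced here for illustration, and it returns only the error type names from the table rather than constructing real exception objects.

```php
<?php
// Hypothetical helper (not part of the library): returns the error type
// name that the table above associates with an HTTP status code.
function errorTypeForStatus(int $status): string
{
    return match (true) {
        $status === 400 => 'BadRequestError',
        $status === 401 => 'AuthenticationError',
        $status === 403 => 'PermissionDeniedError',
        $status === 404 => 'NotFoundError',
        $status === 409 => 'ConflictError',
        $status === 422 => 'UnprocessableEntityError',
        $status === 429 => 'RateLimitError',
        $status >= 500 => 'InternalServerError',
        // Any other error status falls back to the generic type.
        default => 'APIStatusError',
    };
}

echo errorTypeForStatus(429), PHP_EOL; // RateLimitError
echo errorTypeForStatus(503), PHP_EOL; // InternalServerError
echo errorTypeForStatus(418), PHP_EOL; // APIStatusError
```

Because the specific types are listed before the generic fallback, catch blocks in your own code should be ordered the same way, from most specific to least specific.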
Certain errors will be automatically retried 2 times by default, with a short exponential backoff.
Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict, 429 Rate Limit, >=500 Internal errors, and timeouts will all be retried by default.
You can use the maxRetries option to configure or disable this:
<?php
use Scrapegraphai\Client;
use Scrapegraphai\RequestOptions;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;
// Configure the default for all requests:
$client = new Client(maxRetries: 0);
$params = SmartscraperCreateParams::from(
userPrompt: "Extract the product name, price, and description"
);
// Or, configure per-request:
$result = $client
->smartscraper
->create($params, new RequestOptions(maxRetries: 5));
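To make the retry rules described above more concrete, here is a rough sketch of how a retry policy with exponential backoff can be expressed. It is an illustration under stated assumptions, not the library's implementation: isRetryableStatus and backoffDelaySeconds are hypothetical helpers, and the 0.5 second base delay and 8 second cap are assumed values that this document does not specify.

```php
<?php
// Hypothetical helper: true for the status codes the text says are
// retried by default (408, 409, 429, and anything >= 500).
function isRetryableStatus(int $status): bool
{
    return in_array($status, [408, 409, 429], true) || $status >= 500;
}

// Hypothetical helper: delay before retry number $attempt (1-based).
// The delay doubles on each attempt and is capped at $cap seconds.
// The default $base and $cap values are assumptions for illustration.
function backoffDelaySeconds(int $attempt, float $base = 0.5, float $cap = 8.0): float
{
    return min($cap, $base * (2 ** ($attempt - 1)));
}

// With the default of 2 retries, the waits under these assumptions are:
foreach ([1, 2] as $attempt) {
    echo "retry {$attempt}: ", backoffDelaySeconds($attempt), "s", PHP_EOL;
}
```

Real clients often add random jitter to each delay so that many callers do not retry at the same moment; that is omitted here to keep the sketch deterministic.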
You can send undocumented parameters to any endpoint and read undocumented response properties. Note that an extra parameter with the same name as a documented parameter overrides it. For example:
<?php
use Scrapegraphai\RequestOptions;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;
$params = SmartscraperCreateParams::from(
userPrompt: "Extract the product name, price, and description"
);
$completedSmartscraper = $client
->smartscraper
->create(
$params,
new RequestOptions(
extraQueryParams: ["my_query_parameter" => "value"],
extraBodyParams: ["my_body_parameter" => "value"],
extraHeaders: ["my-header" => "value"],
),
);
var_dump($completedSmartscraper["my_undocumented_property"]);
If you want to explicitly send an extra parameter, you can do so with the extraQueryParams, extraBodyParams, and extraHeaders options of RequestOptions when making a request, as seen in the example above.
To make requests to undocumented endpoints while retaining the benefit of auth, retries, and so on, you can make requests using $client->request, like so:
<?php
$response = $client->request(
method: "post",
path: '/undocumented/endpoint',
query: ['dog' => 'woof'],
headers: ['useful-header' => 'interesting-value'],
body: ['hello' => 'world']
);
The examples/ directory contains comprehensive examples demonstrating various use cases:
- SmartScraper - Extract data from web pages using natural language prompts
- Markdownify - Convert web pages to clean Markdown format
- SearchScraper - Search and scrape data from multiple websites
- Crawl - Systematically crawl entire websites
- Generate Schema - Generate JSON schemas from natural language
- Credits - Check your API credit balance
- Validate - Validate your API key
- Advanced SmartScraper - Complex schemas, JavaScript rendering, pagination
- Error Handling - Comprehensive error handling strategies
- E-commerce Scraper - Product monitoring, price comparison, review analysis
- News Aggregator - Multi-source news monitoring, sentiment analysis
- Job Listings - Job search aggregation, salary benchmarking, skills analysis
<?php
require_once 'vendor/autoload.php';
use Scrapegraphai\Client;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;
// Initialize client
$client = new Client(
apiKey: getenv('SCRAPEGRAPHAI_API_KEY')
);
// Extract product information
$params = SmartscraperCreateParams::from(
userPrompt: "Extract the product name, price, and rating",
websiteURL: "https://example-store.com/product"
);
$result = $client->smartscraper->create($params);
echo json_encode($result->data, JSON_PRETTY_PRINT);
For more examples and detailed documentation, see the examples directory.
This package follows SemVer conventions. As the library is in initial development and has a major version of 0, APIs may change at any time.
This package considers improvements to the (non-runtime) PHPDoc type definitions to be non-breaking changes.
PHP 8.1.0 or higher.