ScrapeGraphAI/scrapegraphai-php

Scrapegraphai PHP API library

Note

The Scrapegraphai PHP API Library is currently in beta and we're excited for you to experiment with it!

This library has not yet been exhaustively tested in production environments and may be missing some features you'd expect in a stable release. As we continue development, there may be breaking changes that require updates to your code.

We'd love your feedback! Please share any suggestions, bug reports, feature requests, or general thoughts by filing an issue.

The Scrapegraphai PHP library provides convenient access to the Scrapegraphai REST API from any PHP 8.1.0+ application.

It is generated with Stainless.

Documentation

The REST API documentation can be found on scrapegraphai.com.

Installation

To use this package, install via Composer by adding the following to your application's composer.json:

{
  "repositories": [
    {
      "type": "vcs",
      "url": "git@github.com:stainless-sdks/scrapegraphai-php.git"
    }
  ],
  "require": {
    "org-placeholder/scrapegraphai": "dev-main"
  }
}

Usage

<?php

use Scrapegraphai\Client;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;

$client = new Client(
  apiKey: getenv("SCRAPEGRAPHAI_API_KEY") ?: "My API Key",
  environment: "environment_1",
);

$params = SmartscraperCreateParams::from(
  userPrompt: "Extract the product name, price, and description"
);
$completedSmartscraper = $client->smartscraper->create($params);

var_dump($completedSmartscraper->request_id);

Handling errors

When the library is unable to connect to the API, or if the API returns a non-success status code (i.e., 4xx or 5xx response), a subclass of Scrapegraphai\Errors\APIError will be thrown:

<?php

use Scrapegraphai\Errors\APIConnectionError;
use Scrapegraphai\Errors\APIStatusError;
use Scrapegraphai\Errors\RateLimitError;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;

try {
    $params = SmartscraperCreateParams::from(
        userPrompt: "Extract the product name, price, and description"
    );
    $smartscraper = $client->smartscraper->create($params);
} catch (APIConnectionError $e) {
    echo "The server could not be reached", PHP_EOL;
    var_dump($e->getPrevious());
} catch (RateLimitError $_) {
    echo "A 429 status code was received; we should back off a bit.", PHP_EOL;
} catch (APIStatusError $e) {
    echo "Another non-200-range status code was received", PHP_EOL;
    var_dump($e->status);
}

Error codes are as follows:

Cause              Error Type
HTTP 400           BadRequestError
HTTP 401           AuthenticationError
HTTP 403           PermissionDeniedError
HTTP 404           NotFoundError
HTTP 409           ConflictError
HTTP 422           UnprocessableEntityError
HTTP 429           RateLimitError
HTTP >= 500        InternalServerError
Other HTTP error   APIStatusError
Timeout            APITimeoutError
Network error      APIConnectionError
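
To make the table above easier to reason about, here is a minimal, self-contained sketch of the status-to-error mapping it describes. The function name errorTypeForStatus is hypothetical and is not part of the library; the sketch only mirrors the table and makes no claim about how the library resolves errors internally.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: returns the error type name that the table above
 * lists for a given HTTP error status. Hypothetical helper, not library code.
 */
function errorTypeForStatus(int $status): string
{
    // Status codes that have a dedicated error type in the table.
    $specific = [
        400 => 'BadRequestError',
        401 => 'AuthenticationError',
        403 => 'PermissionDeniedError',
        404 => 'NotFoundError',
        409 => 'ConflictError',
        422 => 'UnprocessableEntityError',
        429 => 'RateLimitError',
    ];

    if (isset($specific[$status])) {
        return $specific[$status];
    }

    // The table groups every status >= 500 under InternalServerError.
    if ($status >= 500) {
        return 'InternalServerError';
    }

    // Any other HTTP error falls back to the generic status error.
    return 'APIStatusError';
}

foreach ([401, 429, 503, 418] as $status) {
    echo $status, ' => ', errorTypeForStatus($status), PHP_EOL;
}
```

Because the more specific types are listed first, a catch block for RateLimitError should come before one for APIStatusError, as in the example earlier in this section.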

Retries

Certain errors will be automatically retried 2 times by default, with a short exponential backoff.

Connection errors (for example, due to a network connectivity problem), 408 Request Timeout, 409 Conflict, 429 Rate Limit, >=500 Internal errors, and timeouts will all be retried by default.

You can use the maxRetries option to configure or disable this:

<?php

use Scrapegraphai\Client;
use Scrapegraphai\RequestOptions;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;

// Configure the default for all requests:
$client = new Client(maxRetries: 0);
$params = SmartscraperCreateParams::from(
  userPrompt: "Extract the product name, price, and description"
);

// Or, configure per-request:
$result = $client
  ->smartscraper
  ->create($params, new RequestOptions(maxRetries: 5));
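
For readers who want a mental model of the retry behaviour described above, the following is a rough sketch of a retry loop with exponential backoff. It is not the library's implementation: the helper names (isRetryable, backoffDelay, sendWithRetries), the 0.5 second base delay, and the 8 second cap are all assumptions chosen for illustration. A null status stands in for a connection error or timeout.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: is this outcome retried by default, per the list above?
 * A null status represents a connection error or timeout (no HTTP response).
 */
function isRetryable(?int $status): bool
{
    if ($status === null) {
        return true;
    }

    return in_array($status, [408, 409, 429], true) || $status >= 500;
}

/**
 * Hypothetical helper: delay in seconds before retry number $retry (1-based).
 * The base and cap are assumed values, not the library's.
 */
function backoffDelay(int $retry, float $base = 0.5, float $cap = 8.0): float
{
    return min($cap, $base * (2 ** ($retry - 1)));
}

/**
 * Calls $send (which returns an HTTP status or null) until the outcome is
 * not retryable or $maxRetries retries have been used.
 */
function sendWithRetries(callable $send, int $maxRetries = 2, ?callable $sleep = null): array
{
    $sleep ??= static function (float $seconds): void {
        usleep((int) round($seconds * 1_000_000));
    };

    $attempts = 0;
    while (true) {
        $attempts++;
        $status = $send();

        if (!isRetryable($status) || $attempts > $maxRetries) {
            return ['status' => $status, 'attempts' => $attempts];
        }

        // Wait longer after each failed attempt before trying again.
        $sleep(backoffDelay($attempts));
    }
}

// Demo with a fake sender that fails twice, then succeeds. The sleep
// callback only records the delays so the demo runs instantly.
$outcomes = [null, 503, 200];
$delays = [];
$result = sendWithRetries(
    static function () use (&$outcomes) {
        return array_shift($outcomes);
    },
    2,
    static function (float $seconds) use (&$delays): void {
        $delays[] = $seconds;
    }
);

var_dump($result, $delays);
```

With maxRetries set to 0, the loop returns after the first attempt, which matches the idea of disabling retries.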

Advanced concepts

Making custom or undocumented requests

Undocumented properties

You can send undocumented parameters to any endpoint, and read undocumented response properties, like so:

Note: extra parameters with the same name override the documented parameters.

<?php

use Scrapegraphai\RequestOptions;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;

$params = SmartscraperCreateParams::from(
  userPrompt: "Extract the product name, price, and description"
);
$completedSmartscraper = $client
  ->smartscraper
  ->create(
    $params,
    new RequestOptions(
      extraQueryParams: ["my_query_parameter" => "value"],
      extraBodyParams: ["my_body_parameter" => "value"],
      extraHeaders: ["my-header" => "value"],
    ),
  );

var_dump($completedSmartscraper["my_undocumented_property"]);

Undocumented request params

If you want to explicitly send an extra param, you can do so by passing extraQueryParams, extraBodyParams, and extraHeaders to the RequestOptions object when making a request, as seen in the examples above.

Undocumented endpoints

To make requests to undocumented endpoints while retaining the benefit of auth, retries, and so on, you can make requests using $client->request, like so:

<?php

$response = $client->request(
  method: "post",
  path: '/undocumented/endpoint',
  query: ['dog' => 'woof'],
  headers: ['useful-header' => 'interesting-value'],
  body: ['hello' => 'world']
);

Examples

The examples/ directory contains comprehensive examples demonstrating various use cases:

Basic Examples

  • SmartScraper - Extract data from web pages using natural language prompts
  • Markdownify - Convert web pages to clean Markdown format
  • SearchScraper - Search and scrape data from multiple websites
  • Crawl - Systematically crawl entire websites
  • Generate Schema - Generate JSON schemas from natural language
  • Credits - Check your API credit balance
  • Validate - Validate your API key

Advanced Examples

Real-World Use Cases

Quick Start Example

<?php
require_once 'vendor/autoload.php';

use Scrapegraphai\Client;
use Scrapegraphai\Smartscraper\SmartscraperCreateParams;

// Initialize client
$client = new Client(
    apiKey: getenv('SCRAPEGRAPHAI_API_KEY')
);

// Extract product information
$params = SmartscraperCreateParams::from(
    userPrompt: "Extract the product name, price, and rating",
    websiteURL: "https://example-store.com/product"
);

$result = $client->smartscraper->create($params);
echo json_encode($result->data, JSON_PRETTY_PRINT);

For more examples and detailed documentation, see the examples directory.
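
One detail worth noting about the quick start: PHP's getenv returns false when the variable is not set. A small guard that fails early can make a missing key easier to diagnose. The sketch below uses a hypothetical helper, requireEnv, which is not part of the library, and a made-up variable name for the demo.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): returns the value of an
 * environment variable, or throws if it is unset or empty.
 */
function requireEnv(string $name): string
{
    $value = getenv($name);

    // getenv() returns false when the variable does not exist.
    if ($value === false || $value === '') {
        throw new RuntimeException("Environment variable {$name} is not set");
    }

    return $value;
}

// Demo with a made-up variable name so the snippet runs on its own.
putenv('EXAMPLE_API_KEY=sk-demo');
echo requireEnv('EXAMPLE_API_KEY'), PHP_EOL;

// In the quick start, the call would look like:
// $client = new Client(apiKey: requireEnv('SCRAPEGRAPHAI_API_KEY'));
```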

Versioning

This package follows SemVer conventions. As the library is in initial development and has a major version of 0, APIs may change at any time.

This package considers improvements to the (non-runtime) PHPDoc type definitions to be non-breaking changes.

Requirements

PHP 8.1.0 or higher.

Contributing

See the contributing documentation.
