ScrapingBytes API - Documentation

Welcome to the official ScrapingBytes documentation. Our goal is to keep integration as simple as possible: web scraping shouldn't be hard, and neither should using our API. To get started, you'll need to know a few things:

  • Your API key
  • The URL you want to scrape
  • Whether you need to render JavaScript
  • Whether you want to use a premium/residential proxy

Finding Your API Key

To generate an API key, first head to the dashboard. The key is shown only once, so store it in a secure place and keep it private.

[Image: generating an API key on the ScrapingBytes dashboard]

The API key will be displayed as in the following example.

[Image: example API key]

Copy the key and integrate it into your application. Keep your API key private and secure. How to do this in full is beyond the scope of this documentation, but a brief example from Twilio that uses environment variables can be watched here.
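As a minimal sketch of the environment-variable approach, the key can be read at startup rather than written into source code. The variable name SCRAPINGBYTES_API_KEY and the helper load_api_key are hypothetical names chosen for this illustration; they are not defined by ScrapingBytes.

```python
import os

# Hypothetical variable name; pick whatever suits your deployment.
API_KEY_ENV_VAR = "SCRAPINGBYTES_API_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, or raise if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {API_KEY_ENV_VAR} environment variable first")
    return key
```

Failing early when the variable is missing tends to be easier to debug than a rejected request later on.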


Sending a Request

Now that you have your API key, you can send a POST request to our API, passing the key as a Bearer token in the Authorization header:

https://scrapingbytes.com/api/v1/scrape
Authorization: Bearer YOUR-API-KEY

Build a request body and send it as JSON. At a minimum, you must provide the target URL.

{
    "url": "https://books.toscrape.com/",
    "render_js": true,
    "premium_proxy": true
}

If you do not want to use the headless browser or premium proxies, set the corresponding values to false. Not sure how to do this? Try the Request Builder on the dashboard.
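The request described above could be assembled in Python roughly as follows. This is a sketch using only the standard library. The function names are illustrative, and two details are assumptions: the Content-Type header (the documentation only says the body is sent as JSON) and the 140-second client-side timeout.

```python
import json
import urllib.request

API_URL = "https://scrapingbytes.com/api/v1/scrape"


def build_scrape_request(api_key, url, render_js=True, premium_proxy=False):
    """Build (but do not send) a POST request for the scrape endpoint."""
    body = json.dumps(
        {"url": url, "render_js": render_js, "premium_proxy": premium_proxy}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: JSON bodies are labelled with this content type.
            "Content-Type": "application/json",
        },
    )


def scrape(api_key, url, **options):
    """Send the request and return (html, response_headers)."""
    request = build_scrape_request(api_key, url, **options)
    # Assumption: wait slightly longer than the API's 130 s maximum timeout.
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=140) as response:
        html = response.read().decode("utf-8", errors="replace")
        return html, response.headers
```

A call might look like scrape("YOUR-API-KEY", "https://books.toscrape.com/", render_js=False).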

The API will return the raw HTML from the target URL:

<html>
<head>
...
</head>
<body>
...
</body>
</html>

To check whether a request succeeded and how many credits it cost, inspect the response headers:

SB-Credit-Cost  A number giving the total credits used by the request.
SB-Success      A boolean indicating whether the request was successful.

In this case the response headers would be:

Key             Value
SB-Credit-Cost  25
SB-Success      true
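Reading these two headers in client code might look like the sketch below. The helper name read_sb_headers is illustrative. The code assumes the boolean arrives as the text "true" or "false", as in the example above, and treats a missing cost header as zero credits; neither behaviour is confirmed by this documentation.

```python
def read_sb_headers(headers):
    """Return (success, credit_cost) from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: the boolean is serialised as the text "true" or "false".
    success = lowered.get("sb-success", "").strip().lower() == "true"
    # Assumption: a missing cost header means no credits were charged.
    credit_cost = int(lowered.get("sb-credit-cost", "0"))
    return success, credit_cost
```

The function accepts any object with an items() method, such as a plain dict or the headers object returned by urllib.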

Controls

These are the JSON parameters you can provide in the request body to control the browser. Please note that if render_js is disabled, we send a plain HTTP request instead of using a headless browser.

Name             Type    Default   Description
url              string  required  The URL of the page you want to scrape.
render_js        bool    true      Fetch the website with a headless browser and render JavaScript.
premium_proxy    bool    false     Use a premium proxy to bypass difficult websites. This uses a pool of residential proxies.
screenshot       bool    false     Return a screenshot of the page you want to scrape. This attempts to capture the full page and is returned as base64.
mobile           bool    false     Control whether a mobile device is used to send the request. Defaults to desktop.
block_resources  bool    true      Block resources such as images.
block_ads        bool    true      Block ads on the page you want to scrape.
window_height    int     1080      Height in pixels of the browser window and viewport used to scrape the page.
window_width     int     1920      Width in pixels of the browser window and viewport used to scrape the page.
timeout          int     130       How long to wait, in seconds, before timing out your request. Maximum is 130.

Want to build the required JSON quickly? Try the Request Builder on the dashboard.
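For building the JSON in code instead, a small helper can catch misspelled parameters and out-of-range timeouts before a request is sent. This is a client-side sketch only. The name build_payload is illustrative, and the lower bound on timeout is an assumption; the API may validate differently.

```python
# Documented defaults, used here only to check names and value types.
DEFAULTS = {
    "render_js": True,
    "premium_proxy": False,
    "screenshot": False,
    "mobile": False,
    "block_resources": True,
    "block_ads": True,
    "window_height": 1080,
    "window_width": 1920,
    "timeout": 130,
}
MAX_TIMEOUT = 130  # seconds, the documented maximum


def build_payload(url, **controls):
    """Return a request body containing the URL plus any overridden controls."""
    unknown = sorted(set(controls) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown parameters: {unknown}")
    for name, value in controls.items():
        # bool is a subclass of int in Python, so compare exact types.
        if type(value) is not type(DEFAULTS[name]):
            raise TypeError(f"{name} must be {type(DEFAULTS[name]).__name__}")
    # Assumption: a timeout must be at least 1 second.
    if not 0 < controls.get("timeout", MAX_TIMEOUT) <= MAX_TIMEOUT:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT} seconds")
    return {"url": url, **controls}
```

Only overridden values are included in the body, which relies on the API applying the documented defaults to anything omitted.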


Credit Costs

Every ScrapingBytes plan includes a set number of credits to use per month.

The credits used per request depend on the parameters provided with your API call. A single request costs between 1 and 25 credits.

A breakdown of credit costs:

Feature                                      Credit Cost
Rotating Proxy without JavaScript rendering  1
Rotating Proxy with JavaScript rendering     5
Premium Proxy without JavaScript rendering   10
Premium Proxy with JavaScript rendering      25
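The breakdown above can be encoded as a lookup to estimate the cost of a batch before sending it. The function names are illustrative, and the sketch covers only the two parameters listed in the table; whether other parameters affect the cost is not stated here.

```python
# Credits keyed by (premium_proxy, render_js), copied from the breakdown above.
CREDIT_COSTS = {
    (False, False): 1,
    (False, True): 5,
    (True, False): 10,
    (True, True): 25,
}


def credit_cost(render_js=True, premium_proxy=False):
    """Return the credits one request costs for the given settings."""
    return CREDIT_COSTS[(premium_proxy, render_js)]


def estimate_credits(requests):
    """Sum the cost of an iterable of (render_js, premium_proxy) pairs."""
    return sum(credit_cost(render_js, premium) for render_js, premium in requests)
```

The default arguments mirror the documented parameter defaults (render_js true, premium_proxy false), so credit_cost() with no arguments gives the cost of a default request.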

Response Status Codes

Every response from the ScrapingBytes API includes an HTTP status code with a corresponding reason. All status codes, whether the request is charged, and the suggested solutions are listed below:

Code  Charged  Meaning                          Solution
200   Yes      Successful request
204   No       No content                       There was a problem fetching the content. Please try again.
400   No       Bad request                      Invalid parameters provided. Check the response message for further details.
402   No       Not enough credits               Please upgrade your plan, or contact support/sales for assistance.
404   Yes      Invalid URL                      You provided an invalid URL or we couldn't find the requested URL.
429   No       Too many concurrent connections  Please upgrade your plan, or contact support/sales for assistance.
500   No       Error                            An unknown error has occurred. Please try again.
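Client code might turn this table into a lookup that reports whether a response was charged and whether a retry is worthwhile. The sketch below marks only 204 and 500 as retryable, because those are the rows whose solution says to try again. That retry policy and the name classify are illustrative choices, not part of the API.

```python
# (charged, meaning) per status code, copied from the table above.
STATUS_CODES = {
    200: (True, "Successful request"),
    204: (False, "No content"),
    400: (False, "Bad request"),
    402: (False, "Not enough credits"),
    404: (True, "Invalid URL"),
    429: (False, "Too many concurrent connections"),
    500: (False, "Error"),
}

# Only these rows suggest trying the request again.
RETRYABLE = frozenset({204, 500})


def classify(status_code):
    """Return (charged, meaning, retry) for a documented status code."""
    if status_code not in STATUS_CODES:
        raise ValueError(f"Undocumented status code: {status_code}")
    charged, meaning = STATUS_CODES[status_code]
    return charged, meaning, status_code in RETRYABLE
```

Raising on undocumented codes keeps unexpected responses visible instead of silently treating them as failures.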