Web Scraper API documentation
Learn how the Web Scraper API works and how to integrate it into your app. Examples are provided in Curl, JavaScript and Python.
Getting started​
Data scraping and parsing endpoint.
Sync API​
The Sync API returns the result immediately, fetching the page through our proxy.
To keep the same IP address across multiple requests, pass the session_number (integer) parameter. A session lasts 60 seconds.
To set the geolocation of your session, pass the country_code (string) parameter with a single country code when the session is created.
Example: 'us', 'gb', 'fr', 'de', 'jp', 'cn', 'ru'
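A minimal sketch of how a session payload could be assembled, assuming the JSON body format used by the examples in this document. The helper name `build_session_payload` and its validation rules are illustrative and not part of the API.

```python
import re


def build_session_payload(url, api_key, session_number, country_code=None):
    """Build a JSON-serializable payload that pins requests to one session.

    Hypothetical helper: the field names come from the parameter table,
    the validation is an assumption made for illustration.
    """
    if not isinstance(session_number, int) or isinstance(session_number, bool):
        raise TypeError("session_number must be an integer")
    payload = {"url": url, "api_key": api_key, "session_number": session_number}
    if country_code is not None:
        # Country codes in the table are two lowercase letters, e.g. 'us'.
        if not re.fullmatch(r"[a-z]{2}", country_code):
            raise ValueError("country_code must be two lowercase letters")
        payload["country_code"] = country_code
    return payload


# Two requests that share a session number should reuse the same IP
# for as long as the 60-second session lasts.
first = build_session_payload("https://httpbin.org/ip", "API_KEY", 55, "us")
second = build_session_payload("https://httpbin.org/headers", "API_KEY", 55)
```

Each payload can then be serialized with `json.dumps` and posted as in the Python example in this document.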
Supported country codes are listed in the table below:
Code | Country name | Code | Country name | Code | Country name |
---|---|---|---|---|---|
ad | Andorra | gl | Greenland | no | Norway |
ae | United Arab Emirates | gm | Gambia | np | Nepal |
af | Afghanistan | gn | Guinea | nr | Nauru |
ag | Antigua and Barbuda | gp | Guadeloupe | nu | Niue |
ai | Anguilla | gq | Equatorial Guinea | nz | New Zealand |
al | Albania | gr | Greece | om | Oman |
am | Armenia | gs | South Georgia and the South Sandwich Islands | pa | Panama |
ao | Angola | gt | Guatemala | pe | Peru |
aq | Antarctica | gu | Guam | pf | French Polynesia |
ar | Argentina | gw | Guinea-Bissau | pg | Papua New Guinea |
as | American Samoa | gy | Guyana | ph | Philippines |
at | Austria | hk | Hong Kong | pk | Pakistan |
au | Australia | hm | Heard Island and McDonald Islands | pl | Poland |
aw | Aruba | hn | Honduras | pm | Saint Pierre and Miquelon |
ax | Åland Islands | hr | Croatia | pn | Pitcairn |
az | Azerbaijan | ht | Haiti | pr | Puerto Rico |
ba | Bosnia and Herzegovina | hu | Hungary | ps | Palestine, State of |
bb | Barbados | id | Indonesia | pt | Portugal |
bd | Bangladesh | ie | Ireland | pw | Palau |
be | Belgium | il | Israel | py | Paraguay |
bf | Burkina Faso | im | Isle of Man | qa | Qatar |
bg | Bulgaria | in | India | re | Réunion |
bh | Bahrain | io | British Indian Ocean Territory | ro | Romania |
bi | Burundi | iq | Iraq | rs | Serbia |
bj | Benin | ir | Iran (Islamic Republic of) | ru | Russian Federation |
bl | Saint Barthélemy | is | Iceland | rw | Rwanda |
bm | Bermuda | it | Italy | sa | Saudi Arabia |
bn | Brunei Darussalam | je | Jersey | sb | Solomon Islands |
bo | Bolivia (Plurinational State of) | jm | Jamaica | sc | Seychelles |
bq | Bonaire, Sint Eustatius and Saba | jo | Jordan | sd | Sudan |
br | Brazil | jp | Japan | se | Sweden |
bs | Bahamas | ke | Kenya | sg | Singapore |
bt | Bhutan | kg | Kyrgyzstan | sh | Saint Helena, Ascension and Tristan da Cunha |
bv | Bouvet Island | kh | Cambodia | si | Slovenia |
bw | Botswana | ki | Kiribati | sj | Svalbard and Jan Mayen |
by | Belarus | km | Comoros | sk | Slovakia |
bz | Belize | kn | Saint Kitts and Nevis | sl | Sierra Leone |
ca | Canada | kp | Korea (Democratic People's Republic of) | sm | San Marino |
cc | Cocos (Keeling) Islands | kr | Korea, Republic of | sn | Senegal |
cd | Congo, Democratic Republic of the | kw | Kuwait | so | Somalia |
cf | Central African Republic | ky | Cayman Islands | sr | Suriname |
cg | Congo | kz | Kazakhstan | ss | South Sudan |
ch | Switzerland | la | Lao People's Democratic Republic | st | Sao Tome and Principe |
ci | Côte d'Ivoire | lb | Lebanon | sv | El Salvador |
ck | Cook Islands | lc | Saint Lucia | sx | Sint Maarten (Dutch part) |
cl | Chile | li | Liechtenstein | sy | Syrian Arab Republic |
cm | Cameroon | lk | Sri Lanka | sz | Eswatini |
cn | China | lr | Liberia | tc | Turks and Caicos Islands |
co | Colombia | ls | Lesotho | td | Chad |
cr | Costa Rica | lt | Lithuania | tf | French Southern Territories |
cu | Cuba | lu | Luxembourg | tg | Togo |
cv | Cabo Verde | lv | Latvia | th | Thailand |
cw | Curaçao | ly | Libya | tj | Tajikistan |
cx | Christmas Island | ma | Morocco | tk | Tokelau |
cy | Cyprus | mc | Monaco | tl | Timor-Leste |
cz | Czechia | md | Moldova, Republic of | tm | Turkmenistan |
de | Germany | me | Montenegro | tn | Tunisia |
dj | Djibouti | mf | Saint Martin (French part) | to | Tonga |
dk | Denmark | mg | Madagascar | tr | Türkiye |
dm | Dominica | mh | Marshall Islands | tt | Trinidad and Tobago |
do | Dominican Republic | mk | North Macedonia | tv | Tuvalu |
dz | Algeria | ml | Mali | tw | Taiwan, Province of China |
ec | Ecuador | mm | Myanmar | tz | Tanzania, United Republic of |
ee | Estonia | mn | Mongolia | ua | Ukraine |
eg | Egypt | mo | Macao | ug | Uganda |
eh | Western Sahara | mp | Northern Mariana Islands | um | United States Minor Outlying Islands |
er | Eritrea | mq | Martinique | us | United States of America |
es | Spain | mr | Mauritania | uy | Uruguay |
et | Ethiopia | ms | Montserrat | uz | Uzbekistan |
fi | Finland | mt | Malta | va | Holy See |
fj | Fiji | mu | Mauritius | vc | Saint Vincent and the Grenadines |
fk | Falkland Islands (Malvinas) | mv | Maldives | ve | Venezuela (Bolivarian Republic of) |
fm | Micronesia (Federated States of) | mw | Malawi | vg | Virgin Islands (British) |
fo | Faroe Islands | mx | Mexico | vi | Virgin Islands (U.S.) |
fr | France | my | Malaysia | vn | Viet Nam |
ga | Gabon | mz | Mozambique | vu | Vanuatu |
gb | United Kingdom of Great Britain and Northern Ireland | na | Namibia | wf | Wallis and Futuna |
gd | Grenada | nc | New Caledonia | ws | Samoa |
ge | Georgia | ne | Niger | ye | Yemen |
gf | French Guiana | nf | Norfolk Island | yt | Mayotte |
gg | Guernsey | ng | Nigeria | za | South Africa |
gh | Ghana | ni | Nicaragua | zm | Zambia |
gi | Gibraltar | nl | Netherlands | zw | Zimbabwe |
Set the mobile flag (boolean, true) to switch the User-Agent to mobile mode.
You can also follow redirects (301 code) with the follow_redirect parameter and retry URL-not-found (404) results with the retry_404 parameter.
You are also free to pass your own set of headers with the request.
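The optional flags and custom headers described above can be combined in a single request. The sketch below prepares such a request with the standard library only and does not send it. The helper name `build_request` is hypothetical, and it assumes that custom headers travel as ordinary HTTP headers, as in the Python example in this document.

```python
import json
import urllib.request

API_URL = "https://scrape.infatica.io/"


def build_request(url, api_key, mobile=False, follow_redirect=False,
                  retry_404=False, extra_headers=None):
    """Prepare (but do not send) a POST request with the optional flags."""
    payload = {
        "url": url,
        "api_key": api_key,
        "mobile": mobile,
        "follow_redirect": follow_redirect,
        "retry_404": retry_404,
    }
    headers = {"Content-Type": "application/json"}
    # Assumption: custom headers are sent as regular HTTP headers.
    headers.update(extra_headers or {})
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_request(
    "https://example.com",
    "API_KEY",
    mobile=True,
    follow_redirect=True,
    extra_headers={"user_header_1": "header1_value"},
)
# To send it: urllib.request.urlopen(request)
```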
Query Parameters​
Name | Description | Example | Options |
---|---|---|---|
url (string, required) | Destination url to retrieve (url-encoded) | {"url":"google.com"} | default=null |
api_key (string, required) | Web Scraper API key | {"api_key":"0de32912321"} | default=null |
mobile (bool, optional) | User-Agent type (true for mobile, false for desktop) | {"mobile":true} | default=false |
follow_redirect (bool, optional) | Allow request to follow redirects (301 code) | {"follow_redirect":true} | default=false |
retry_404 (bool, optional) | Retry with another proxy if 404 message returned | {"retry_404":true} | default=false |
country_code (str, optional) | Proxy country code (geolocation) | {"country_code":"fr"} | default=null, options - us, gb, de, fr, cn, jp, ru |
session_number (int, optional) | Proxy session number | {"session_number":31} | default=0 |
render_js (bool, optional) | Render JS on page | {"render_js":true} | default=false |
Returns​
Status code | Description | Example |
---|---|---|
200 (Success) | Request successful. Returns JSON with headers and html fields | {"headers":{}, "html":""} |
401 (Unauthorized) | API key is missing or wrong | {"error":"API key is missing or wrong"} |
422 (Unprocessable Entity) | Error in query parameters | {"error":"Wrong query"} |
504 (Timeout) | Target site timed out after 3 attempts to reach it | {"error":"Timeout"} |
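A caller typically needs to branch on the status codes listed above. The following is a minimal sketch of that handling, assuming the response body has already been decoded from JSON. `ScraperError` and `extract_html` are hypothetical names, not part of the API.

```python
class ScraperError(Exception):
    """Raised when the API answers with a non-200 status (hypothetical name)."""


# Status codes and meanings as listed in the Returns table.
ERROR_MEANINGS = {
    401: "API key is missing or wrong",
    422: "Error in query parameters",
    504: "Target site timed out",
}


def extract_html(status_code, body):
    """Return the html field on success, raise ScraperError otherwise."""
    if status_code == 200:
        return body.get("html", "")
    # Prefer the error message sent by the API, fall back to the table.
    reason = body.get("error") or ERROR_MEANINGS.get(status_code, "Unexpected status")
    raise ScraperError(f"{status_code}: {reason}")


html = extract_html(200, {"headers": {}, "html": "<html></html>"})
```

With the `requests` library, this could be called as `extract_html(req.status_code, req.json())`.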
In the examples below, replace API_KEY in 'api_key': 'API_KEY' with your Infatica API key.
Curl​
curl -X POST "https://scrape.infatica.io/" -H "Content-Type: application/json" -d '{"api_key": "API_KEY", "url": "https://www.google.com"}'
Python​
import requests
import json
# Send the scraping parameters as a JSON body; custom headers go in `headers`.
req = requests.post('https://scrape.infatica.io/', data = json.dumps({
    'url': 'TARGET_URL',
    'api_key': 'API_KEY',
    'mobile': True,
    'country_code': 'us',
    'session_number': 55
}), headers = {
    'user_header_1': 'header1_value',
    'user_header_2': 'header2_value'
})
# Decode the JSON response (headers and html fields on success).
content = json.loads(req.content)
print(content)
Javascript / NodeJS​
const axios = require('axios')
// The scraping parameters go in the JSON body (`data`).
const options = {
  method: 'POST',
  responseType: 'json',
  data: {
    url: 'TARGET_URL',
    api_key: 'API_KEY',
    mobile: true,
    country_code: 'gb',
    session_number: 55
  },
  url: 'https://scrape.infatica.io'
}
axios(options)
  .then((result) => {
    // result.data holds the decoded JSON response
    console.log(result.data)
  })
  .catch((err) => {
    console.error(err)
  })
Async API​
The Async API lets you queue multiple time-consuming requests and receive each result as soon as it is ready.
Payload parameters​
Name | Description | Example | Options |
---|---|---|---|
url (string, required) | Destination url to retrieve (url-encoded) | {"url":"google.com"} | default=null |
api_key (string, required) | Web Scraper API key | {"api_key":"0de32912321"} | default=null |
mobile (bool, optional) | User-Agent type (true for mobile, false for desktop) | {"mobile":true} | default=false |
follow_redirect (bool, optional) | Allow request to follow redirects (301 code) | {"follow_redirect":true} | default=false |
retry_404 (bool, optional) | Retry with another proxy if 404 message returned | {"retry_404":true} | default=false |
country_code (str, optional) | Proxy country code (geolocation) | {"country_code":"fr"} | default=null, options - us, gb, de, fr, cn, jp, ru |
session_number (int, optional) | Proxy session number | {"session_number":31} | default=0 |
render_js (bool, optional) | Render JS on page | {"render_js":true} | default=false |
Returns​
Status code | Description | Example |
---|---|---|
200 (Success) | Request accepted. Returns JSON with the id of the queued job | {"id":"result_id"} |
401 (Unauthorized) | API key is missing or wrong | {"error":"API key is missing or wrong"} |
422 (Unprocessable Entity) | Error in query parameters | {"error":"Wrong query"} |
504 (Timeout) | Target site timed out after 3 attempts to reach it | {"error":"Timeout"} |
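Since the Async API is meant for queuing several requests, a small helper can prepare one payload per target URL and turn each returned id into the URL used to fetch the result. This is a sketch under the assumption that the submit response has the `{"id": ...}` shape shown in the table. `job_payloads` and `result_url` are hypothetical names.

```python
JOB_ENDPOINT = "https://scrape.infatica.io/job"


def job_payloads(urls, api_key, **options):
    """Build one job payload per target URL, sharing the same options."""
    return [{"url": u, "api_key": api_key, **options} for u in urls]


def result_url(submit_response):
    """Turn the {"id": ...} response of a job submission into its result URL."""
    return f"{JOB_ENDPOINT}/{submit_response['id']}"


payloads = job_payloads(
    ["http://httpbin.org/ip", "http://httpbin.org/headers"],
    "API_KEY",
    country_code="gb",
)
```

Each payload would be posted to `JOB_ENDPOINT`, and the id in each response passed to `result_url`.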
Curl​
curl -X POST -H "Content-Type: application/json" -d '{"api_key": "API_KEY", "url": "http://httpbin.org/ip"}' "https://scrape.infatica.io/job"
Python​
import requests
import json
# Queue a scraping job; the response carries the job id.
req = requests.post('https://scrape.infatica.io/job', data = json.dumps({
    'url': 'TARGET_URL',
    'api_key': 'API_KEY',
    'mobile': True,
    'country_code': 'gb',
    'session_number': 55
}), headers = {
    'user_header_1': 'header1_value',
    'user_header_2': 'header2_value'
})
# Decode the JSON response.
content = json.loads(req.content)
print(content)
Javascript / NodeJS​
const axios = require('axios')
// Queue a scraping job; the parameters go in the JSON body (`data`).
const options = {
  method: 'POST',
  responseType: 'json',
  data: {
    url: 'TARGET_URL',
    api_key: 'API_KEY',
    mobile: true,
    country_code: 'gb',
    session_number: 55
  },
  url: 'https://scrape.infatica.io/job'
}
axios(options)
  .then((result) => {
    // result.data holds the decoded JSON response with the job id
    console.log(result.data)
  })
  .catch((err) => {
    console.error(err)
  })
Async API - receiving results​
POST https://scrape.infatica.io/job/<job_id>
Path Parameters​
Name | Description | Example | Options |
---|---|---|---|
job_id (string, required) | Job ID | https://scrape.infatica.io/job/0de32912321 | default=null |
Payload Parameters​
Name | Description | Example | Options |
---|---|---|---|
api_key (string, required) | Web Scraper API key | {"api_key":"0de32912321"} | default=null |
Returns​
Status code | Description | Example |
---|---|---|
200 (Success) | Request successful. Returns JSON with the job status; the example shows a job that is still running | { "status":"running", "statusUrl":"https://scrape.infatica.io/job/0962a8a0-5f1a-4e14-bf8c-5efcc18f1953", "url":"http://httpbin.org/ip" } |
401 (Unauthorized) | API key is missing or wrong | {"error":"API key is missing or wrong"} |
422 (Unprocessable Entity) | Error in query parameters | {"error":"Wrong query"} |
504 (Timeout) | Target site timed out after 3 attempts to reach it | {"error":"Timeout"} |
Curl​
curl -X POST -H "Content-Type: application/json" -d '{"api_key": "API_KEY"}' "https://scrape.infatica.io/job/<job_id>"
Python​
import requests
import json
# Replace <job_id> with the id returned when the job was queued.
req = requests.post('https://scrape.infatica.io/job/<job_id>', data = json.dumps({
    'api_key': 'API_KEY'}))
# Decode the JSON response.
content = json.loads(req.content)
print(content)
Javascript / NodeJS​
const axios = require('axios')
// Replace <job_id> with the id returned when the job was queued.
const options = {
  method: 'POST',
  responseType: 'json',
  data: {
    api_key: 'API_KEY',
  },
  url: 'https://scrape.infatica.io/job/<job_id>'
}
axios(options)
  .then((result) => {
    // result.data holds the decoded JSON response
    console.log(result.data)
  })
  .catch((err) => {
    console.error(err)
  })
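Because a job may still be running when its result is first requested, clients usually poll the job endpoint. The sketch below shows one way to do that. It assumes that any status other than "running" means the job is over, which this document does not state explicitly. `wait_for_job` is a hypothetical helper, and the polling interval and attempt limit are arbitrary defaults.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll a job until its status is no longer "running".

    fetch_status is any callable that takes a job id and returns the decoded
    JSON of POST /job/<job_id>; injecting it keeps this sketch free of
    network code.
    """
    for attempt in range(max_attempts):
        result = fetch_status(job_id)
        # Assumption: any status other than "running" means the job is over.
        if result.get("status") != "running":
            return result
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_attempts} polls")
```

In practice `fetch_status` could wrap the `requests.post` call from the Python example above and return `req.json()`.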
Credits and Requests​
Your plan determines how many credits you can use. Each request you make costs some credits. The number of credits you use varies based on the domain and parameters of your request. Geotargeting is included in these credit costs.
Domains​
We have built special scrapers for some sites. These scrapers will run when you scrape those domains, changing the credit cost. Scraping other domains costs 1 credit (without additional parameters).
Category | Normal | E-commerce | SERP |
---|---|---|---|
credit cost | 1 | 10 | 10 |
render_js | 10 | 20 | 20 |
Normal - any other website, if no additional parameters are added;
SERP - Google;
E-commerce - Amazon.
This list will be updated as functionality is added.
Paid query parameters​
These parameters provide additional features for parsing.
{"render_js":"true"}: requests cost 10 credits (20 for E-commerce and SERP domains, per the table above)
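To estimate how many credits a batch of requests will consume, the costs from the table above can be put into a lookup. This sketch assumes the render_js row gives the total cost of a request (not a surcharge added to the base cost). The category keys and function names are illustrative.

```python
# Credit costs per request, copied from the table above.
BASE_COST = {"normal": 1, "ecommerce": 10, "serp": 10}
RENDER_JS_COST = {"normal": 10, "ecommerce": 20, "serp": 20}


def credit_cost(category="normal", render_js=False):
    """Return the credits one request costs for a category."""
    table = RENDER_JS_COST if render_js else BASE_COST
    try:
        return table[category]
    except KeyError:
        raise ValueError(f"unknown category: {category!r}") from None


def batch_cost(requests_plan):
    """Sum the cost of (category, render_js) pairs."""
    return sum(credit_cost(cat, js) for cat, js in requests_plan)


plan = [("normal", False)] * 100 + [("serp", False)] * 20 + [("ecommerce", True)] * 5
print(batch_cost(plan))  # 100*1 + 20*10 + 5*20 = 400
```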