Python Parser API documentation

Learn how the Python Parser API works and how to integrate it into your app. Examples are provided in Curl, Javascript and Python.

Getting started

Data scraping and parsing endpoint.

http://scrape.infatica.io:9000

Sync API

The Sync API returns the result immediately, fetching the page through our proxy.

To keep the same IP address across multiple requests, pass the session_number (integer) parameter. A session lasts 60 seconds.

To set the geolocation of a session, pass the country_code (string) parameter with a single country code when the session is created. Allowed country codes are 'us', 'uk', 'fr', 'de', 'jp', 'cn', 'ru'.

You can use the mobile flag (boolean) to switch the user-agent to mobile mode.

You can also follow redirects (301 code) with the follow_redirect parameter and retry URLs that return not found (404) with the retry_404 parameter.

You can also pass your own set of headers with the request.

Query Parameters

Name | Description | Example | Options
url (string, required) | Destination url to retrieve (url-encoded) | {"url":"google.com"} | default=null
api_key (string, required) | Python Parser API key | {"api_key":"0de32912321"} | default=null
mobile (bool, optional) | User-Agent type (true for mobile, false for desktop) | {"mobile":"true"} | default=false
follow_redirect (bool, optional) | Allow request to follow redirects (301 code) | {"follow_redirect":"true"} | default=false
retry_404 (bool, optional) | Retry with another proxy if 404 message returned | {"retry_404":"true"} | default=false
country_code (str, optional) | Proxy country code (geolocation) | {"country_code":"fr"} | default=null; options - us, uk, de, fr, cn, jp, ru
session_number (int, optional) | Proxy session number | {"session_number":"31"} | default=0
render_js (bool, optional) | Render JS on page | {"render_js":"true"} | default=false
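
The parameters above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: build_payload is a hypothetical helper, and it assumes that booleans and integers are sent as native JSON types and that an unset country_code can simply be left out of the payload.

```python
import json

# Country codes listed in the parameter table above.
ALLOWED_COUNTRY_CODES = {"us", "uk", "de", "fr", "cn", "jp", "ru"}


def build_payload(url, api_key, mobile=False, follow_redirect=False,
                  retry_404=False, country_code=None, session_number=0,
                  render_js=False):
    """Hypothetical helper: validate parameters and return a JSON string."""
    if not url or not api_key:
        raise ValueError("url and api_key are required")
    if country_code is not None and country_code not in ALLOWED_COUNTRY_CODES:
        raise ValueError(f"unsupported country_code: {country_code!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(session_number, bool) or not isinstance(session_number, int):
        raise ValueError("session_number must be an integer")

    payload = {
        "url": url,
        "api_key": api_key,
        "mobile": bool(mobile),
        "follow_redirect": bool(follow_redirect),
        "retry_404": bool(retry_404),
        "session_number": session_number,
        "render_js": bool(render_js),
    }
    # Assumption: an unset country_code is omitted rather than sent as null.
    if country_code is not None:
        payload["country_code"] = country_code
    return json.dumps(payload)
```

The returned string can be passed as the data argument of requests.get or requests.post in the Python examples on this page.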

Returns

Status code | Description | Example
200 (Success) | Request successful. Returns JSON with headers and html fields | {"headers":{}, "html":""}
401 (Unauthorized) | API key is missing or wrong | {"error":"API key is missing or wrong"}
422 (Unprocessable Entity) | Error in query parameters | {"error":"Wrong query"}
504 (Timeout) | Site returned timeout after 3 attempts to reach it | {"error":"Timeout"}

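
One way to act on the status codes above is to turn error responses into exceptions. This is a sketch only: ScrapeError and parse_sync_response are hypothetical names, and the code assumes the body has already been decoded from JSON into a dict.

```python
class ScrapeError(Exception):
    """Hypothetical exception type carrying the HTTP status code."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def parse_sync_response(status_code, body):
    """Return (headers, html) on 200, raise ScrapeError otherwise.

    `body` is the already-decoded JSON object (a dict).
    """
    if status_code == 200:
        return body.get("headers", {}), body.get("html", "")
    # 401, 422 and 504 are documented to carry an "error" field.
    if isinstance(body, dict):
        message = body.get("error", "unknown error")
    else:
        message = "unknown error"
    raise ScrapeError(status_code, message)
```

With the requests library this could be called as parse_sync_response(req.status_code, req.json()).
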
API key:

'api_key': 'xxxxxx', where xxxxxx is your Infatica API key

Curl

curl -X GET "http://scrape.infatica.io:9000/" -H "Content-Type: application/json" -d '{"api_key": "xxxxxx", "url": "https://www.google.com"}'

Python

import requests
import json

# Send the scraping parameters as a JSON body; custom headers are optional.
req = requests.get('http://scrape.infatica.io:9000/', data = json.dumps({
    'url': 'TARGET_URL',
    'api_key': 'xxxxxx',
    'mobile': True,
    'country_code': 'uk',
    'session_number': 55
}), headers = {
    'user_header_1': 'header1_value',
    'user_header_2': 'header2_value'
})

# The response body is JSON with "headers" and "html" fields.
content = json.loads(req.content)
print(content)

Javascript / NodeJS

const axios = require('axios')
const options = {
  method: 'GET',
  responseType: 'json',
  data: {
    url: 'TARGET_URL',
    api_key: 'xxxxxx',
    mobile: true,
    country_code: 'uk',
    session_number: 55
  },
  url: 'http://scrape.infatica.io:9000'
}

axios(options)
  .then((result) => {
    console.log(result)
  })
  .catch((err) => {
    console.error(err)
  })

Async API

The Async API lets you put multiple time-consuming requests in a queue and receive each result as soon as it is ready.

POST http://scrape.infatica.io:9000/job

Payload parameters

Name | Description | Example | Options
url (string, required) | Destination url to retrieve (url-encoded) | {"url":"google.com"} | default=null
api_key (string, required) | Python Parser API key | {"api_key":"0de32912321"} | default=null
mobile (bool, optional) | User-Agent type (true for mobile, false for desktop) | {"mobile":"true"} | default=false
follow_redirect (bool, optional) | Allow request to follow redirects (301 code) | {"follow_redirect":"true"} | default=false
retry_404 (bool, optional) | Retry with another proxy if 404 message returned | {"retry_404":"true"} | default=false
country_code (str, optional) | Proxy country code (geolocation) | {"country_code":"fr"} | default=null; options - us, uk, de, fr, cn, jp, ru
session_number (int, optional) | Proxy session number | {"session_number":"31"} | default=0
render_js (bool, optional) | Render JS on page | {"render_js":"true"} | default=false

Returns

Status code | Description | Example
200 (Success) | Request successful. Returns JSON with the id of the queued job | {"id":"result_id"}
401 (Unauthorized) | API key is missing or wrong | {"error":"API key is missing or wrong"}
422 (Unprocessable Entity) | Error in query parameters | {"error":"Wrong query"}
504 (Timeout) | Site returned timeout after 3 attempts to reach it | {"error":"Timeout"}
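
Since the queue is meant for several requests at once, a client will typically submit one job per URL and keep the returned ids. The sketch below illustrates that under stated assumptions: submit_jobs is a hypothetical helper, the post argument is any callable shaped like requests.post, and failed submissions are simply skipped.

```python
import json

JOB_ENDPOINT = "http://scrape.infatica.io:9000/job"


def submit_jobs(urls, api_key, post):
    """Queue one job per URL and return a {url: job_id} mapping.

    `post` is any callable with the shape of requests.post(url, data=...)
    that returns an object with .status_code and .json().
    """
    job_ids = {}
    for target in urls:
        body = json.dumps({"url": target, "api_key": api_key})
        response = post(JOB_ENDPOINT, data=body)
        if response.status_code != 200:
            # Skip failed submissions; a real client might retry or log here.
            continue
        job_ids[target] = response.json()["id"]
    return job_ids
```

With the requests library installed, this could be called as submit_jobs(["http://httpbin.org/ip"], "xxxxxx", requests.post).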

Curl

curl -X POST -H "Content-Type: application/json" -d '{"api_key": "xxxxxx", "url": "http://httpbin.org/ip"}' "http://scrape.infatica.io:9000/job"

Python

import requests
import json

# Queue a job; the parameters are sent as a JSON body.
req = requests.post('http://scrape.infatica.io:9000/job', data = json.dumps({
    'url': 'TARGET_URL',
    'api_key': 'xxxxxx',
    'mobile': True,
    'country_code': 'uk',
    'session_number': 55
}), headers = {
    'user_header_1': 'header1_value',
    'user_header_2': 'header2_value'
})

# The response body is JSON with the job id.
content = json.loads(req.content)
print(content)

Javascript / NodeJS

const axios = require('axios')
const options = {
  method: 'POST',
  responseType: 'json',
  data: {
    url: 'TARGET_URL',
    api_key: 'xxxxxx',
    mobile: true,
    country_code: 'uk',
    session_number: 55
  },
  url: 'http://scrape.infatica.io:9000/job'
}

axios(options)
  .then((result) => {
    console.log(result)
  })
  .catch((err) => {
    console.error(err)
  })

Async API - receiving results

GET http://scrape.infatica.io:9000/job/<job_id>

Path Parameters

Name | Description | Example | Options
job_id (string, required) | Job ID | http://scrape.infatica.io:9000/job/0de32912321 | default=null

Payload Parameters

Name | Description | Example | Options
api_key (string, required) | Python Parser API key | {"api_key":"0de32912321"} | default=null

Returns

Status code | Description | Example
200 (Success) | Request successful. Returns JSON describing the job (status, statusUrl and url in this example) | {"status":"running", "statusUrl":"http://scrape.infatica.io:9000/job/0962a8a0-5f1a-4e14-bf8c-5efcc18f1953", "url":"http://httpbin.org/ip"}
401 (Unauthorized) | API key is missing or wrong | {"error":"API key is missing or wrong"}
422 (Unprocessable Entity) | Error in query parameters | {"error":"Wrong query"}
504 (Timeout) | Site returned timeout after 3 attempts to reach it | {"error":"Timeout"}
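
Because a job can still be running when it is first queried, a client usually polls this endpoint until the status changes. The following is a rough sketch of such a loop: wait_for_job is a hypothetical helper, and since "running" is the only status value documented here, it assumes that any other status means the job has finished.

```python
import time


def wait_for_job(fetch_status, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the job leaves the "running" state or attempts run out.

    `fetch_status` is a zero-argument callable returning the decoded JSON
    body of GET /job/<job_id>.
    """
    for attempt in range(max_attempts):
        body = fetch_status()
        # Assumption: any status other than "running" means the job is done.
        if body.get("status") != "running":
            return body
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("job still running after max_attempts polls")
```

Here fetch_status could wrap the GET request from the Python example below and return its decoded JSON body. The sleep argument exists only so the wait can be replaced in tests.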

Curl

curl -X GET -H "Content-Type: application/json" -d '{"api_key": "xxxxxx"}' "http://scrape.infatica.io:9000/job/<job_id>"

Python

import requests
import json

# Replace <job_id> with the id returned when the job was queued.
req = requests.get('http://scrape.infatica.io:9000/job/<job_id>', data = json.dumps({
    'api_key': 'xxxxxx'
}))
content = json.loads(req.content)
print(content)

Javascript / NodeJS

const axios = require('axios')
const options = {
  method: 'GET',
  responseType: 'json',
  data: {
    api_key: 'xxxxxx',
  },
  url: 'http://scrape.infatica.io:9000/job/<job_id>'
}

axios(options)
  .then((result) => {
    console.log(result)
  })
  .catch((err) => {
    console.error(err)
  })