
Jobfeed API Reference

Version 3 (last update: November 2018)

Introduction

The Jobfeed API provides developers with real-time access to the Jobfeed data. With this API, Textkernel aims to promote new uses of Jobfeed's big data capabilities through the development of third party applications.

The Jobfeed API is intended for real-time job search and analytics. If you would like to import a large portion of the data into your own system, please request a data feed from Textkernel.

The API is accessible via HTTP GET or POST requests and allows you to filter and aggregate jobs from Jobfeed.

The POST requests currently only support URL-encoded parameters (Content-Type: application/x-www-form-urlencoded).

Accounts are limited to 4 requests/second and 30 requests/minute. If you exceed any of those limits you will start receiving 429 Too Many Requests HTTP responses.
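The limits above can be respected on the client side before a 429 is ever returned. The following is a minimal sketch only: the class name RequestThrottle is hypothetical, and it assumes the server counts requests over sliding windows, which this document does not specify.

```python
from collections import deque


class RequestThrottle:
    """Client-side throttle for the documented limits (4 requests/second, 30 requests/minute).

    Assumption: the limits are treated as sliding windows; the server may count differently.
    """

    LIMITS = ((1.0, 4), (60.0, 30))  # (window length in seconds, max requests in that window)

    def __init__(self):
        self._sent = deque()  # timestamps of recorded requests, oldest first

    def delay(self, now):
        """Return the number of seconds to wait before sending a request at time `now`."""
        # Forget requests that are older than the longest window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        wait = 0.0
        for window, max_requests in self.LIMITS:
            recent = [t for t in self._sent if now - t < window]
            if len(recent) >= max_requests:
                # Wait until the oldest request that still blocks this window expires.
                oldest_blocking = recent[-max_requests]
                wait = max(wait, oldest_blocking + window - now)
        return wait

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self._sent.append(now)
```

In an application you would call delay(time.monotonic()), sleep for the returned duration, send the request, and then call record(time.monotonic()).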


Authentication

The Jobfeed API uses Basic Authentication. With each request, you need to send a Jobfeed username and password. The account you use should have the API feature enabled.

Example
$ curl --user 'username:password' https://www.jobfeed.nl/api/v3/search
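The same request can be prepared from Python using only the standard library. This is a sketch, not an official client: the helper names basic_auth_header and build_request are made up for illustration, and the credentials are placeholders.

```python
import base64
import urllib.request

API_ROOT = "https://www.jobfeed.nl/api/v3"


def basic_auth_header(username, password):
    """Build the HTTP Basic Authentication header for the given credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def build_request(endpoint, username, password):
    """Prepare (but do not send) an authenticated GET request for an API endpoint."""
    return urllib.request.Request(
        f"{API_ROOT}/{endpoint}",
        headers=basic_auth_header(username, password),
    )
```

Passing the prepared request to urllib.request.urlopen sends it; the account must have the API feature enabled for the call to succeed.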

Data model

The Jobfeed data is a collection of jobs. A job generally corresponds to a job opening that is advertised online. In our data model, each job contains a list of postings: specific web pages that Jobfeed detects as different advertisements of this job. Often, a job contains more than one posting, e.g., advertisements posted on different websites.

Each posting contains a set of fields, such as job title, source website, source url, job description, etc. Each job, in addition to a list of postings, also contains a set of (job-level) fields: the total number of postings of the job, the number of different websites where the job was posted, the date when the first posting of the job was published, etc.

The Jobfeed API provides access to both job-level and posting-level information. When searching, you can filter on job fields and on posting fields. When aggregating, you can request job counts or posting counts.


Fields

The /fields endpoint provides information about all available fields for the current country: field type, English-language description, available levels (job/posting) and the list of possible values (where applicable). This endpoint only supports HTTP GET requests.

The available fields differ per country. You can download a complete overview of all fields per country here.

Request

Endpoint

Method URL
GET https://www.jobfeed.nl/api/v3/fields

Query parameters

Parameter Type Default Description
_get_all bool 0 Include information about fields that are not available for your account.
_field str None Return information for a specific field only.
_values str None Filter possible_values object. Input is a comma separated list of keys. It should be used in combination with _field parameter, otherwise ignored.
_language str Country-dependent Translate labels for possible field values into a given language (a two-letter ISO 639-1 language code). Supported languages: en, de, fr, nl.
_pretty bool 0 If 1, the JSON output is pretty-printed. Do not use this in production.

Response

Status Content type Content
200 (OK) application/json A JSON object with metadata for every field: type, description, level and possible_values (where applicable). The level is a list containing the strings job and/or posting; the first value in the list is the default level, applied when filtering on the field without an explicit job: or posting: prefix.
Example https://www.jobfeed.nl/api/v3/fields?_pretty=1&_language=en

Searching

The /search endpoint returns a list of jobs matching your criteria.

Request

Endpoint

Method URL BODY
GET https://www.jobfeed.nl/api/v3/search?<query>
POST https://www.jobfeed.nl/api/v3/search <query>

Query parameters

Parameter Type Default Description
_labels bool 1 Include (1) or omit (0) human-readable labels for field values in the result list
_language str Country-dependent Translate labels into a given language (a two-letter ISO 639-1 language code). Supported languages: en, de, fr, nl
_limit int 10 Limit the size of the result list.
_offset int 0 Skip a number of jobs from the top of the result list.
_sort str None Name of the field(s) on which to sort the result.
_sortdir str None Direction to sort the result list: asc or desc.
_fields str All fields Specify the fields of jobs to include in the result list. The value is a comma-separated list of fields. The list may contain fields prefixed with posting: or job:, or a special value posting; see details below.
_snippets bool 0 When searching in the full_text field, return highlighted portions of the text that matched the search.
_matching_postings bool 0 If you request a list of postings to be included for every job using _fields parameter, by default all postings of a job will be returned. With _matching_postings set to 1, only postings that match your posting-level filters will be returned.
_pretty bool 0 If 1, the JSON output is pretty-printed. Do not use this in production.
_facets str None Comma-separated list of fields for which to compute facets. For each field, the facet will contain the most frequent field values with job counts.
_facets_limit int 10 Comma-separated limits, one for each of the fields specified in _facets. Each limit must be less than or equal to 20.
<field> Mixed None Query the API on a specific field. See Filtering below.

You can specify multiple fields for _sort using a comma-separated sequence, in which case the _sortdir will correspond to the fields in the same order as specified in _sort.

The values of _limit and _offset should not add up to more than 10000.

Example /search?_sort=date,organization_name&_sortdir=desc,asc
Description This will order the results by date in descending order, and results on the same date by organization_name in ascending order.
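Because _limit and _offset together may not exceed 10000, a client that pages through results has to stop at that boundary. The sketch below generates the parameter pairs for each page; the function name page_params and the page size are illustrative choices, and any separate upper bound on _limit itself is not covered here.

```python
def page_params(total_count, page_size=10, max_window=10000):
    """Yield _limit/_offset pairs that never violate _limit + _offset <= max_window.

    `total_count` would normally come from the total_count property of a first response.
    """
    reachable = min(total_count, max_window)  # results beyond the window cannot be paged to
    offset = 0
    while offset < reachable:
        limit = min(page_size, reachable - offset)
        yield {"_limit": limit, "_offset": offset}
        offset += limit
```

For result sets larger than the window, narrowing the filters (for example by date range) is one way to reach the remaining jobs.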

Response

Status Content type Content description
200 (OK) application/json A JSON object containing:
  • total_count: the total number of results in the set
  • results: An array of job objects

By default, each job object in the result will contain all available fields. For fields with the default level posting (such as source_website or job_title; see Fields), the returned values will come from one of the postings of the job that match your posting-level filters. You can override this behavior and request values indexed at the job level by specifying prefix job: (note that this will only work for fields that are also available at the job level, such as profession or location).

If the _fields parameter is provided and includes a special value posting, for every job an extra field posting will be added, containing the list of all postings of this job; by default, each posting object will contain all posting-level fields. If you only need specific fields from the postings, you can specify the fields using posting:FIELDNAME syntax in the _fields parameter:

Example /search?_fields=date,job_title
Description For every job object in the result, only return the job title and the posting date
Example /search?_fields=date,job_title,posting
Description For every job object, return the job title, the posting date and the list of all postings, each with all available fields
Example /search?_fields=date,job_title,posting:date,posting:source_url
Description For every job object, return the job title, the posting date and the list of all postings, but only include posting date and source URL for every posting object
Example /search?_fields=date,job_title,posting&_matching_postings=1&source_website=example.com
Description For every job object, return the job title, the posting date and the list of postings from the website example.com (not all postings)

You can check which fields are available at the job level and at the posting level via the /fields endpoint (see Fields).

In case of error, the API returns a 4xx/5xx status code and a JSON document with the error code and description.

Filtering

Both /search and /aggregate endpoints accept additional query parameters for filtering your result set. You can specify multiple filters as <field name>=<field value>. The values you can specify depend on the field type:

  1. For boolean fields (e.g. via_intermediary) you can use:
    • 0 or false or no for "false"
    • 1 or true or yes for "true"
  2. For integer fields, values can be integers. Additionally the location field can also be a string (e.g. city name) that will be automatically mapped to the best matching location code.
  3. For date fields, values should be in format YYYY-MM-DD
  4. For text fields, values are interpreted as full text search query (see Full text search for details).

Filtering for fields is applied either on the job or on the posting level. You can explicitly specify the level using job: or posting: prefix. If a field is used without a prefix, the field's default level is assumed (see Fields). Note that some fields are only available on one level (job or posting).

Example /search?job:profession=696
Description Return jobs with job-level profession code 696 (Java developer)
Example /search?posting:profession=696
Description Return jobs that have at least one posting with profession code 696 (Java developer)
Example /search?profession=696
Description Same as posting:profession=696, because posting is the default level for the field profession

You can use the endpoint /fields to get information about available fields, their types, levels and possible values (see Fields).

NB: In order to improve readability we didn't URL-encode the parameters in the examples below. You must do it in your application (or build the URLs using a library that does it automatically). Otherwise the API will interpret values like now+1h as now 1h and return an error.
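One way to get the encoding right is to let a library build the query string. The sketch below uses Python's urllib.parse.urlencode; the helper name build_url is hypothetical, and the filter values are taken from examples elsewhere in this document.

```python
from urllib.parse import urlencode

API_ROOT = "https://www.jobfeed.nl/api/v3"


def build_url(endpoint, params):
    """Return a request URL with every parameter name and value URL-encoded."""
    return f"{API_ROOT}/{endpoint}?{urlencode(params)}"


# The "+" and "|" characters and the double quotes are percent-encoded,
# so the API receives them literally instead of as spaces or separators.
url = build_url(
    "search",
    {"date__range": "2016-03-01||+1w__now", "full_text": '"diesel technician"'},
)
```

Here the plus sign is sent as %2B and each pipe as %7C, while the space inside the phrase is sent as +, which form decoding turns back into a space.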

Multiple values for one field

You can specify multiple values for a field by joining the values with | or OR:

Example /search?profession=227|4980 or /search?profession=227 OR 4980
Description Jobs for which the profession code is either 227 or 4980
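When the values come from a list in your application, they can be joined before encoding. A small sketch, assuming a hypothetical helper named any_of:

```python
from urllib.parse import urlencode


def any_of(values):
    """Join several values for one field with "|" (logical OR)."""
    return "|".join(str(value) for value in values)


# The pipe is percent-encoded as %7C by urlencode.
query = urlencode({"profession": any_of([227, 4980])})
```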

The use of [] to specify multiple values is deprecated and should not be used:

Example /search?profession[]=227&profession[]=4980
Description Deprecated way to filter for profession codes 227 or 4980

In some cases, it may be useful to apply logical AND to multiple filters for one field, e.g., to find intersection of intervals. This can be done with the __and field modifier (notice the double underscore):

Example /search?date__range=2017-01-01__2017-06-30&date__range__and=2017-06-26__2017-07-02
Description Date is in the 2017-01-01 to 2017-06-30 range AND at the same time in the 2017-06-26 to 2017-07-02 range

Filtering by prefix

For string fields, you can filter results by prefix by appending __prefix to the field name.

Example /search?source_website__prefix=example
Description Jobs for which the source website begins with example, e.g. examplejobboards.com or examplevacancy.com

Filtering by (non) existence

You can filter the results by presence/absence of a field using either _exists or _not_exists as a value for that field. E.g., in order to exclude expired jobs (i.e. select only active jobs), you can use expiration_date=_not_exists.

Filtering by location radius

When filtering results by geographic location, you can append __radius to the location field (note the two underscores). As value, you should provide either a city name, a Jobfeed-specific municipality id, or geographical coordinates (in format latitude,longitude in degrees), and append the radius in kilometers. Examples:

  • /search?location__radius=Amsterdam__25
  • /search?location__radius=1000__25
  • /search?location__radius=52.08,4.32__10

You cannot use __radius together with __range for the same field.
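The three forms of the radius value can be produced by one small helper. This is an illustrative sketch; the function name radius_value is not part of the API, and the result still needs to be URL-encoded when placed in a request.

```python
def radius_value(center, radius_km):
    """Format a value for location__radius.

    `center` may be a city name, a Jobfeed municipality id,
    or a (latitude, longitude) tuple in degrees.
    """
    if isinstance(center, tuple):
        latitude, longitude = center
        center = f"{latitude},{longitude}"
    return f"{center}__{radius_km}"
```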

Filtering on a range

The suffix __range allows you to filter a field on a range of values. If using this suffix, the value format should be from__to. To make the range open-ended, you can omit either from or to.

Example /search?education_level__range=3__8
Description Jobs with education levels between 3 and 8 (inclusive)

For date ranges, as values for from and to you can use:

  • specific dates, such as 2016-03-01
  • a special value now: the current date
  • relative expressions that consist of an anchor date and an interval. An anchor date can be either now or a specific date suffixed with ||. For example:
    • now-1M: one month ago
    • now-1y-6M: one year and six months ago
    • 2016-03-01||+1w: one week after March 1st, 2016.

Supported units for date intervals are (case-sensitive):

  • y (year)
  • M (month)
  • w (week)
  • d (day)

Example /search?date__range=now-1y__now
Description Jobs posted in the past year (between 1 year ago and now)

You cannot use __range together with __radius for the same field.
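A range value can be assembled from its two ends as sketched below. The helper name range_value is hypothetical, and the open-ended forms ("3__" and "__now") are an assumption about what omitting from or to looks like in practice.

```python
def range_value(start=None, end=None):
    """Format a from__to value for a __range filter; None leaves that side open."""
    if start is None and end is None:
        raise ValueError("at least one end of the range is required")
    left = "" if start is None else str(start)
    right = "" if end is None else str(end)
    return f"{left}__{right}"
```

For example, range_value("now-1y", "now") gives the value used in the date__range example above.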

Filtering on a calendar period

The suffix __calendar allows you to filter a date field on complete calendar periods. The value will be the amount of periods suffixed by a unit.

Supported units:

  • m (month)
  • q (quarter)
  • y (year)

Example /search?date__calendar=2m
Description Jobs posted within the last 2 complete calendar months

Assuming today is the 11th of January 2017, 2m denotes the interval between the 1st of November 2016 and the 31st of December 2016.
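The interval for month periods can be reproduced locally, for instance to label a chart. The sketch below mirrors the worked example above for the m unit only; the function name calendar_months is hypothetical and the code is not the API's own implementation.

```python
from datetime import date, timedelta


def calendar_months(today, count):
    """Return (first_day, last_day) of the last `count` complete calendar months before `today`."""
    # The interval ends on the day before the current month starts.
    current_month_start = today.replace(day=1)
    last_day = current_month_start - timedelta(days=1)
    # Count months since year 0, step back `count` months, and convert back to a date.
    months = current_month_start.year * 12 + (current_month_start.month - 1) - count
    first_day = date(months // 12, months % 12 + 1, 1)
    return first_day, last_day
```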

Filtering on custom categories

If you've defined custom categories in Jobfeed you can search by custom category IDs by suffixing the field name with __custom.

Example /search?profession__custom=123
Description Search for jobs with the profession in custom category 123

You can get the IDs of the custom categories for the fields that have support for them from the /fields endpoint.

Negative filters

You can negate any filter for a field by appending __not to the field name. E.g.: /search?education_level__not=5.

You can also negate __range, __radius, __calendar and __custom filters in the same way, e.g.: /search?education_level__range__not=3__8

Full text search

For fields of type 'text' (such as job_title, organization_name, full_text, etc.; see Fields), you can use full text search operators, as described below.

Boolean expressions

You can filter on specific terms in field values:

Example /search?full_text=(transport OR auto) AND diesel AND NOT java
Example /search?full_text=(transport|auto) diesel -java

Note: The dash symbol - is sensitive to spaces; it should be placed after a space or at the beginning of the query, and should not be followed by any spaces.

Phrase search

You can filter on occurrence of specific phrases by enclosing them in double quotes ":

Example /search?full_text="diesel technician"

Fuzzy searches

You can search for similar words: the following will match documents containing 'color' as well as 'colour'.

Example /search?full_text=colour~

You can specify an acceptable edit distance as integer after the tilde (~).

Proximity searches

You can search for words that occur close to each other. The integer after the tilde specifies the maximum number of words between the two terms:

Example /search?full_text="junior manager"~3

Wildcard searches

You can use * as wildcard character. The example below will return documents containing words starting with "auto":

Example /search?full_text=auto*

Using * can make queries significantly slower (up to 5 times), especially if * does not follow a prefix of at least 2 word characters. For example, query dev* will be fast, but queries *ologist or p*diatric can be up to 5 times slower.

Exact match on text fields

Two text fields (job_title and organization_name) also allow for exact string match on top of the regular full-text search:

Example /search?organization_name__exact=Globen

The example above will match jobs that have the exact string "Globen" as the organization name. It will not match (for instance) "Globen Apeldoorn", as the full-text search would.

Snippets

When searching in the full_text field you can request snippets of the text that matched your search by using the _snippets parameter with a value of 1.

Example /search?full_text=java developer&_snippets=1

In the result the job objects will have an extra _snippets field containing an array of highlighted text snippets. For example:

"_snippets": [
    "  The Lead <em>Java<\/em> Application <em>Developer<\/em> works on the Architecture team is responsible for designing",
    " experience in at least one software specialization. The <em>Developer<\/em> acts as resource for colleagues with less",
    " experienceMinimum of 4 years <em>Java<\/em>\/ J2EE (including Servlets\/JSPs) development experienceMinimum of 4 years JDBC"
]
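Snippets arrive as JSON strings with matches wrapped in em tags. The sketch below shows one way to pull out the highlighted terms or strip the markup; the helper names are illustrative, and the sample is a shortened copy of the first snippet above.

```python
import json
import re

# A shortened job object as it might appear in the response (raw JSON text).
raw = r'{"_snippets": ["  The Lead <em>Java<\/em> Application <em>Developer<\/em> works on the Architecture team"]}'
job = json.loads(raw)  # the JSON parser turns the escaped "\/" into "/"


def highlighted_terms(snippet):
    """Return the terms wrapped in <em>...</em> in one snippet."""
    return re.findall(r"<em>(.*?)</em>", snippet)


def plain_text(snippet):
    """Return the snippet without highlighting tags or surrounding whitespace."""
    return re.sub(r"</?em>", "", snippet).strip()
```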

Facets

You can use the _facets parameter to request the most frequent values with job counts for one or more fields.

You can limit the number of values for each field using the _facets_limit parameter. If omitted, the default limit per field will be 10. The maximum limit you can set per field is 20.

Example /search?_facets=profession,education_level&_facets_limit=,5

The response will contain a facets object with separate properties for each requested field.

The job counts will be affected by filtering. Keep in mind that in order to get consistent counts you should also filter at job level (i.e., using job: prefixes for the fields).

Note that if you request facets for a field that is only available at the posting level (see Fields), the job counts for the values may add up to more than the total number of jobs, because each job can be associated with multiple values of the field.


Aggregations

The /aggregate endpoint allows you to aggregate information about jobs across specific fields into buckets and compute various metrics over buckets.

Request

Endpoint

Method URL BODY
GET https://www.jobfeed.nl/api/v3/aggregate?<query>
POST https://www.jobfeed.nl/api/v3/aggregate <query>

Query parameters

Parameter Type Default Description
_group str None Field to aggregate on. You can specify a comma separated list to group on multiple fields (see Multi-field aggregation). It may be missing, in which case the metric will be computed on the whole result set. When the value of the parameter _metric (see below) is count, you can prefix field names with posting: to get posting counts instead of job counts (the default); additionally, a special value posting will return the number of postings instead of jobs.
_limit int 10 Limit the size of the result list. This parameter can also be used with multi-field aggregation. When grouping by date intervals there is no default value (i.e., all buckets are returned by default), but if you do specify a _limit it will be observed.
_offset int 0 Skip a number of rows from the top of the result list. Note: _offset for aggregations is an approximation and it becomes less accurate for high limits.
_sort str _value Sort the resulting list of aggregation buckets:
  • _group: sort by aggregated field values
  • _value: sort by values of the aggregation metric (e.g., counts)
_sortdir str desc Direction to sort the result list: asc or desc.
_labels bool 1 Include (1) or omit (0) human-readable labels for field values in the result list.
_language str Country-dependent Translate labels into a given language (a two-letter ISO 639-1 language code). Supported languages: en, de, fr, nl.
_metric str count Aggregation metric to compute for buckets in the result list: count, min__FLD, max__FLD, sum__FLD, avg__FLD, median__FLD, where FLD is the name of a numeric field. By default, the number of jobs or postings in each bucket is returned. See Using metrics below.
_totals bool 0 Return total job counts from the intermediate levels of aggregation. For fields prefixed with posting: in the _group, we return the count of postings instead.
_pretty bool 0 If 1, the JSON output is pretty-printed. Do not use this in production.
<field> Mixed None Filter jobs on specific fields in the same way as for the /search endpoint: see Filtering.

Response

Status Content type Content
200 (OK) application/json A JSON list of buckets. Each bucket consists of a list of values of all aggregation fields (_group parameter) followed by the values of the metrics for this bucket.

In case of error, the API returns a 4xx/5xx status code and a JSON document with the error code and description.

Note that if you are grouping on a field that is only available at the posting level (see Fields), the job counts in the aggregation buckets may add up to more than the total number of jobs, because each job can be associated with multiple values of the field. Each of these values will contribute to a bucket count.

Using metrics

By default, the API returns the number of documents in every bucket (_metric=count). If the metric is specified as OPERATION__FIELD (note the two underscores), the corresponding aggregation operation is applied to the values of the field for all jobs in every bucket. E.g., _metric=median__salary will compute median salary for jobs in each bucket.

It is not possible to compute multiple metrics in one aggregation request, with one exception: if you specify the metric as OPERATION__FIELD,count, the result will contain both the outcome of the aggregation operation and the number of documents in every bucket.

Deprecation notice: The value _metric=count_postings is deprecated and should not be used. To get the number of postings instead of the number of jobs, you can use _group=posting or _group=posting:FIELD.

Grouping by date intervals

When grouping by a date-typed field (date or expiration_date), you can append an aggregation interval to the field name:

  1. __week
  2. __month
  3. __quarter
  4. __year

Example /aggregate?_group=date__quarter
Description Compute the number of jobs per quarter.

Grouping on ranges

When grouping by an integer field you can define buckets to group jobs into using this syntax:

fieldname__buckets_<min>_<max>_<step>

Example /aggregate?_group=salary__buckets_1000_5000_500
Description Group results by salary on buckets like 1000 → 1499, 1500 → 1999, 2000 → 2499 and so on up to 5000.
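The bucket boundaries implied by this syntax can be computed locally, for example to label the results. This is a sketch with hypothetical helper names; it follows the example above, and how the API treats a value exactly equal to the maximum is not specified here.

```python
def bucket_group(field, minimum, maximum, step):
    """Build the _group value for range buckets on an integer field."""
    return f"{field}__buckets_{minimum}_{maximum}_{step}"


def bucket_bounds(minimum, maximum, step):
    """List the (low, high) bounds of each bucket, both ends inclusive.

    Assumption: buckets start at `minimum` and stop before `maximum`.
    """
    return [(low, low + step - 1) for low in range(minimum, maximum, step)]
```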

Grouping by custom categories

If you have defined custom categories in Jobfeed for a field that supports it (e.g. profession, education level or industry), you can group jobs by custom categories, adding suffix __custom to the field name.

Example /aggregate?_group=profession__custom
Description Group the results by custom profession categories.

You can find the list of the custom categories currently defined for your account via the /fields endpoint or on the Search form settings page in the Jobfeed UI.

Limitations:

  • When grouping by custom categories, you have to specify _totals=1.
  • Sorting by the value of the aggregation metric (e.g., by counts) is not supported for custom categories: parameter _sort=_value has no effect.

Levels of custom categories

If you have defined two-level custom categories, you can aggregate categories at a specific level using suffix __custom_1 or __custom_2.

Example /aggregate?_group=profession__custom_1 or /aggregate?_group=profession__custom
Description Group the results by the first (highest) level of custom profession categories.
Example /aggregate?_group=profession__custom_2
Description Group the results by the second (lowest) level of custom profession categories.

Multi-field aggregation

You can perform multi-dimensional aggregations by specifying a comma-separated list of fields in the _group parameter. Each bucket will be further split into smaller buckets for every next field in the field list.

Example /aggregate?_group=organization_industry,education_level
Description Group results by organization_industry, and then further split by education_level.

You can specify separate limits on the number of results for each level of aggregation by setting _limit to a comma-separated list of integers (in the same order as the list of fields in _group). For example, _group=organization_industry,education_level&_limit=5,3 will return the top 5 most frequent industries and, for each industry, the top 3 education levels.

You can also omit limits for some levels, in which case a default of 10 is applied.

Example /aggregate?_group=organization_industry,education_level,region&_limit=5,,3
Description Group results by organization_industry, then by education_level and finally by region; use default limit 10 for education level.

Similar to multiple limits, you can specify multiple values for _offset, _sort and _sortdir parameters when doing multi-field aggregations. If using _metric, sorting by _value will only apply to the last aggregated field; for other fields, sorting by _value will always sort by the number of jobs in buckets, not by the actual value of _metric.

Example /aggregate?_group=region,education_level&_metric=avg__salary&_sort=_value,_value&_sortdir=asc,desc
Description This will group the jobs by region, it will order those groups ascending by size (number of jobs in each), it will split them further by education level and will sort each education level bucket within its parent region bucket descending by average salary.
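Keeping the per-level lists aligned with _group is easy to get wrong by hand. The sketch below builds the parameters from Python lists and checks their lengths; the function names are hypothetical, and leaving an entry empty is only documented for _limit, so using None for other parameters is an assumption.

```python
def per_level(values, levels):
    """Join per-level settings with commas; None leaves a level empty (API default)."""
    if len(values) != levels:
        raise ValueError(f"expected {levels} values, got {len(values)}")
    return ",".join("" if value is None else str(value) for value in values)


def aggregate_params(groups, metric=None, limits=None, sorts=None, sortdirs=None):
    """Build query parameters for a multi-field /aggregate request."""
    params = {"_group": ",".join(groups)}
    if metric is not None:
        params["_metric"] = metric
    for name, values in (("_limit", limits), ("_sort", sorts), ("_sortdir", sortdirs)):
        if values is not None:
            params[name] = per_level(values, len(groups))
    return params
```

The resulting dictionary can be passed to a URL-encoding function such as urllib.parse.urlencode before sending the request.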

When doing multi-field aggregations you can request the total counts from the intermediate buckets by adding the _totals=1 parameter to the request:

Example /aggregate?_labels=0&_group=profession,education_level&_totals=1
Description Group results by profession, then by education_level; also return the total count for each bucket defined by profession.

The result may look something like:

[["profession", "education_level", "count"],
 [0, 0, 18428],
 [0, 11, 8933],
 [0, 19, 8485],
 ...
 [0, 3, 592],
 [0, "_all", 49431],
 [4108, 11, 12160],
 [4108, 9, 7030],
 ...
 [4108, 7, 178],
 [4108, "_all", 37780]
 ...
]

Note the rows containing _all placeholders. They will show the total number of jobs for that profession. If you prefix fields in the _group with posting:, you will get the count of postings instead.
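When processing such a response, the total rows usually need to be separated from the regular buckets. The following sketch does that for a table shaped like the example above; the function name split_totals is hypothetical and the sample rows are copied from the example, with the elided rows left out.

```python
def split_totals(table):
    """Split an aggregation result into totals and regular bucket rows.

    Returns (totals, details): totals maps the non-placeholder group values
    to the total count, details is a list of dicts keyed by column name.
    """
    header, *rows = table
    totals = {}
    details = []
    for row in rows:
        *keys, value = row
        if "_all" in keys:
            # Key the total by the group values that are not placeholders.
            totals[tuple(k for k in keys if k != "_all")] = value
        else:
            details.append(dict(zip(header, row)))
    return totals, details


sample = [
    ["profession", "education_level", "count"],
    [0, 0, 18428],
    [0, 11, 8933],
    [0, "_all", 49431],
    [4108, 11, 12160],
    [4108, "_all", 37780],
]
totals, details = split_totals(sample)
```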

Counts

The /count endpoint returns the count of jobs or postings matching a set of filters, or the number of distinct values of a certain field.

Request

Endpoint

Method URL BODY
GET https://www.jobfeed.nl/api/v3/count?<query>
POST https://www.jobfeed.nl/api/v3/count <query>

Query parameters

Parameter Type Default Description
_group str None Field to count. If omitted, the call returns the number of jobs matching your filters. With _group=posting, it returns the number of matching postings. Otherwise, the value should be the name of a field, and the call returns the number of distinct values of the field in the matching jobs or postings.
<field> Mixed None Filter jobs on specific fields in the same way as for the /search endpoint: see Filtering.

Response

Status Content type Content
200 (OK) application/json A JSON object with a count property

Change log

November 2018:

  • Introduced the _facets parameter for the /search endpoint

June 2018:

  • Introduced job- and posting-level fields
  • Introduced simpler syntax for filtering on multiple values using | and OR. Deprecated the [] syntax.
  • Introduced _matching_postings parameter
  • Deprecated _metric=count_postings