Use `parse_query` to convert a search query from the Trove web interface into a set of parameters that the API will understand.

Functions

format_date

format_date(date, start=False)

The web interface uses YYYY-MM-DD dates, but the API expects YYYY-MM-DDT00:00:00Z. This function reformats dates accordingly.

The start date in an API query also needs to be set to the day before the first day you want. If start is True, the date is moved back by one day.
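
The behaviour described above can be sketched as follows. This is a minimal illustration, not the library's own source: the name format_date_sketch is hypothetical, and it assumes the input is always a valid YYYY-MM-DD string.

```python
from datetime import datetime, timedelta


def format_date_sketch(date: str, start: bool = False) -> str:
    """Convert a YYYY-MM-DD date into the YYYY-MM-DDT00:00:00Z form."""
    parsed = datetime.strptime(date, "%Y-%m-%d").date()
    if start:
        # Start dates are moved back by one day, as described above.
        parsed -= timedelta(days=1)
    return parsed.isoformat() + "T00:00:00Z"


print(format_date_sketch("1900-01-01", start=True))  # 1899-12-31T00:00:00Z
print(format_date_sketch("1900-02-04"))              # 1900-02-04T00:00:00Z
```

The two printed values match the date range used in the advanced search tests on this page.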

parse_query

parse_query(query, api_version=2)

Converts the parameters of a search using the Trove web interface into a form the API will understand.

Parameters:

  • query – the URL of a search in the Trove newspapers & gazettes category
  • api_version – Trove API version (default is 2)

Returns:

  • a dict containing the parameters (multiple values will be in a list)
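
To see why multiple values end up in a list, it may help to look at how a search URL's query string can be split with the standard library. This is only a sketch of the first step of such a parser and makes no claim about how parse_query does it internally.

```python
from urllib.parse import parse_qs, urlparse

url = (
    "https://trove.nla.gov.au/search/advanced/category/newspapers"
    "?keyword=wragge&l-advArtType=newspapers&l-advtitle=16&l-advtitle=1055"
)

# parse_qs collects every value for a key into a list,
# so repeated parameters such as l-advtitle are preserved.
raw_params = parse_qs(urlparse(url).query)

print(raw_params["keyword"])     # ['wragge']
print(raw_params["l-advtitle"])  # ['16', '1055']
```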

Basic usage

Here's the URL of a search in Trove's newspapers & gazettes category: https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-state=Queensland&l-category=Article&l-illustrationType=Cartoon

Passing this URL to parse_query() returns a dict with the query parameters translated into a form the Trove API understands.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-state=Queensland&l-category=Article&l-illustrationType=Cartoon', 3)
params

To use these parameters to get data back from the Trove API, you'll need to provide your Trove API key, either as a query parameter (version 2) or in the request headers (version 3). You might also want to set the encoding of the results to 'json'. You can then pass the dict to requests as the params argument. For example:

import requests

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-state=Queensland&l-category=Article&l-illustrationType=Cartoon', 3)
# Version 3 of the API takes the key in the request headers.
headers = {'X-API-KEY': 'mYApiKEY'}
params['encoding'] = 'json'
# Ask for two results, matching the sample output below.
params['n'] = 2
response = requests.get('https://api.trove.nla.gov.au/v3/result', params=params, headers=headers)
data = response.json()

Assuming your API key is valid, this returns results like the following:

{'query': 'wragge',
 'category': [{'code': 'newspaper',
   'name': 'Newspapers & Gazettes',
   'records': {'s': '*',
    'n': 2,
    'total': 510,
    'next': 'https://api.trove.nla.gov.au/v3/result?q=wragge&l-artType=newspapers&l-state=Queensland&l-category=Article&l-illustrated=true&l-illtype=Cartoon&category=newspaper&encoding=json&n=2&s=AoIIQzWFoig4MjM0NjM1NA%3D%3D',
    'nextStart': 'AoIIQzWFoig4MjM0NjM1NA==',
    'article': [{'id': '21765046',
      'url': 'https://api.trove.nla.gov.au/v3/newspaper/21765046',
      'heading': 'Mrs. Adelaide Wragge.',
      'category': 'Article',
      'title': {'id': '16',
       'title': 'The Brisbane Courier (Qld. : 1864 - 1933)'},
      'date': '1931-12-16',
      'page': '13',
      'pageSequence': '13',
      'relevance': {'score': 215.65185546875, 'value': 'very relevant'},
      'snippet': 'Formerly of Victoria, and in 1864 Mayoress of Melbourne, the late Mrs. Wragge, who died recently, had been',
      'troveUrl': 'https://.nla.gov.au/nla.news-article21765046?searchTerm=wragge'},
     {'id': '82346354',
      'url': 'https://api.trove.nla.gov.au/v3/newspaper/82346354',
      'heading': 'MR WRAGGE ON WEATHER CANNONS.',
      'category': 'Article',
      'title': {'id': '269',
       'title': 'The North Queensland Register (Townsville, Qld. : 1892 - 1905)'},
      'date': '1901-03-11',
      'page': '10',
      'pageSequence': '10',
      'relevance': {'score': 181.52200317382812, 'value': 'very relevant'},
      'snippet': 'I have been to Styria, have seen the cannons made in the forges, have witnessed the experiments, have visited Herr Stiger, the inventor of the',
      'troveUrl': 'https://.nla.gov.au/nla.news-article82346354?searchTerm=wragge'}]}}]}

Note that the API includes some additional parameters such as reclevel and include. Have a look at the Trove API Console for examples.
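
As a rough illustration of adding such parameters, the sketch below extends a dict shaped like the output of parse_query and encodes it into a request URL using only the standard library. The reclevel value 'full' is an assumption chosen for illustration; check the API documentation for the values your API version accepts.

```python
from urllib.parse import urlencode

# A dict shaped like the output of parse_query (values taken from the tests on this page).
params = {"q": "wragge", "category": "newspaper", "l-title": ["16", "1055"]}

# Extra API-only parameters; the values here are illustrative assumptions.
params["encoding"] = "json"
params["reclevel"] = "full"

# doseq=True expands list values into repeated key=value pairs.
query_string = urlencode(params, doseq=True)
print("https://api.trove.nla.gov.au/v3/result?" + query_string)
```

requests performs the same expansion of list values when the dict is passed as params, so this step is only needed if you build the URL yourself.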

Version 2 tests

Simple search with facets

Multiple keywords are just passed along as is and are combined with a boolean AND. This is the same in both the Simple and Advanced search.

assert {'q': 'wragge weather', 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge%20weather')

Multiple keywords with OR are passed along as is.

assert {'q': 'wragge OR weather', 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge%20OR%20weather')

Phrase searches are passed along as is.

assert {'q': '"inclement wragge"', 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=%22inclement%20wragge%22')

More complex queries, such as date ranges, are passed along as is.

parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge%20date%3A%5B1901%20TO%201903%5D&l-artType=newspapers')

Limit to gazettes using facets.

assert {'q': 'wragge', 'zone': 'gazette'} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=gazette')
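
The version 2 tests so far suggest a simple relationship between the l-artType facet and the zone parameter. The sketch below captures that relationship as inferred from the assertions on this page; zone_from_art_type is a hypothetical helper, not part of the library.

```python
def zone_from_art_type(art_type=None):
    """Map an l-artType facet value to a version 2 zone value."""
    zones = {"newspapers": "newspaper", "gazette": "gazette"}
    # With no (or an unrecognised) facet value, search both zones.
    return zones.get(art_type, "newspaper,gazette")


print(zone_from_art_type())              # newspaper,gazette
print(zone_from_art_type("gazette"))     # gazette
print(zone_from_art_type("newspapers"))  # newspaper
```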

Limit state to NSW using facets.

assert {'q': 'wragge', 'l-state': ['New South Wales'], 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-state=New%20South%20Wales')

Limit newspaper to SMH using facets.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-title': ['35']} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-title=35')

Limit to 'Article' category using facets.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-category': ['Article']} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-category=Article')

Limit to specific decade using facets.

parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-decade=190')

Limit to specific year using facets.

parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-decade=190&l-year=1903')

Limit to articles with illustration type of 'Photo' with facets.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-illustrated': 'true', 'l-illtype': ['Photo']} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-illustrationType=Photo')
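
In the test above, a single web interface facet becomes two version 2 API parameters. A minimal sketch of that translation, using a hypothetical helper name and assuming the facet values arrive as a list:

```python
def translate_illustration_facet(values):
    """Turn l-illustrationType values into version 2 API parameters."""
    if not values:
        return {}
    # The expected output pairs the illustration type with l-illustrated=true.
    return {"l-illustrated": "true", "l-illtype": list(values)}


print(translate_illustration_facet(["Photo"]))
```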

Limit to articles containing more than 1,000 words using facets.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-word': '1000+ Words'} == parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-word=1000%2B%20Words')

Advanced search

Multiple keywords in 'Any of these words' box.

assert {'q': '(wragge OR weather)', 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword.any=wragge%20weather')

Multiple keywords in 'The phrase' box.

assert {'q': '"inclement wragge"', 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword.phrase=inclement%20wragge')

Keywords in 'All of these words' and 'Without these words' boxes.

assert {'q': 'wragge AND NOT (weather)', 'zone': 'newspaper,gazette'} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword.not=weather&keyword=wragge')
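
The three keyword tests above can be reproduced with a small query builder. This is a sketch under assumptions: build_query and its argument names are hypothetical, and joining the parts with AND (and multiple excluded words with OR) is a guess that happens to match the expected values shown here.

```python
def build_query(all_words="", any_words="", phrase="", not_words=""):
    """Combine the advanced search keyword boxes into a single q value."""
    parts = []
    if all_words:
        parts.append(all_words)
    if any_words:
        # 'Any of these words' becomes a bracketed OR group.
        parts.append("(" + " OR ".join(any_words.split()) + ")")
    if phrase:
        # 'The phrase' is wrapped in double quotes.
        parts.append('"' + phrase + '"')
    if not_words:
        # 'Without these words' becomes a NOT group.
        parts.append("NOT (" + " OR ".join(not_words.split()) + ")")
    return " AND ".join(parts)


print(build_query(any_words="wragge weather"))               # (wragge OR weather)
print(build_query(phrase="inclement wragge"))                # "inclement wragge"
print(build_query(all_words="wragge", not_words="weather"))  # wragge AND NOT (weather)
```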

Limit to a specific date range.

assert {'q': 'wragge date:[1899-12-31T00:00:00Z TO 1900-02-04T00:00:00Z]', 'zone': 'newspaper'} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&date.from=1900-01-01&date.to=1900-02-04&l-advArtType=newspapers')
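
The date clause in the expected q value can be assembled from the date.from and date.to values in the URL. The helper below is a hypothetical sketch of that step, assuming both dates are present and valid; note how the start date is moved back by one day.

```python
from datetime import date, timedelta


def date_range_clause(date_from: str, date_to: str) -> str:
    """Build a date:[... TO ...] clause from two YYYY-MM-DD strings."""
    # The start of the range is set to the day before the first day wanted.
    start = date.fromisoformat(date_from) - timedelta(days=1)
    end = date.fromisoformat(date_to)
    return f"date:[{start.isoformat()}T00:00:00Z TO {end.isoformat()}T00:00:00Z]"


query = "wragge " + date_range_clause("1900-01-01", "1900-02-04")
print(query)  # wragge date:[1899-12-31T00:00:00Z TO 1900-02-04T00:00:00Z]
```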

Limit to a specific state.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-state': ['Queensland']} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advstate=Queensland')

Limit to specific newspapers.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-title': ['16', '1055']} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advtitle=16&l-advtitle=1055')
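
The advanced search form uses facet names prefixed with l-adv (l-advstate, l-advtitle, l-advArtType), while the simple search uses l-state, l-title, and l-artType. One way to bring the two into line is sketched below; normalise_facet_name is a hypothetical helper based only on the parameter names that appear in the URLs on this page.

```python
def normalise_facet_name(name: str) -> str:
    """Convert an advanced search facet name to its simple search form."""
    prefix = "l-adv"
    if name.startswith(prefix) and len(name) > len(prefix):
        rest = name[len(prefix):]
        # Lower-case the first letter so that ArtType becomes artType.
        return "l-" + rest[0].lower() + rest[1:]
    return name


print(normalise_facet_name("l-advtitle"))    # l-title
print(normalise_facet_name("l-advArtType"))  # l-artType
print(normalise_facet_name("keyword"))       # keyword
```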

Limit to a specific category.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-category': ['Family Notices']} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advcategory=Family%20Notices')

Limit to a specific illustration type.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-illustrated': 'true', 'l-illtype': ['Photo']} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advIllustrationType=Photo')

Limit to a specific number of words.

assert {'q': 'wragge', 'zone': 'newspaper', 'l-word': '100 - 1000 Words'} == parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advWord=100%20-%201000%20Words')

Version 3 tests

Simple search with facets

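import os
import requests

def query_api(params):
    # Send the parameters to the API and return the HTTP status code.
    api_key = os.getenv("TROVE_API_KEY")
    # n = 0 keeps the response small; only the status code is checked.
    params["n"] = 0
    response = requests.get("https://api.trove.nla.gov.au/v3/result", params=params, headers={"X-API-KEY": api_key})
    return response.status_code

Multiple keywords are just passed along as is and are combined with a boolean AND. This is the same in both the Simple and Advanced search.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge%20weather', 3)
assert {'q': 'wragge weather', 'category': 'newspaper'} == params
assert query_api(params) == 200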

Multiple keywords with OR are passed along as is.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge%20OR%20weather', 3)
assert {'q': 'wragge OR weather', 'category': 'newspaper'} == params
assert query_api(params) == 200

Phrase searches are passed along as is.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=%22inclement%20wragge%22', 3)
assert {'q': '"inclement wragge"', 'category': 'newspaper'} == params
assert query_api(params) == 200

More complex queries, such as date ranges, are passed along as is.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge%20date%3A%5B1901%20TO%201903%5D&l-artType=newspapers', 3)
assert {'q': 'wragge date:[1901 TO 1903]', 'category': 'newspaper', 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to gazettes using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=gazette', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-artType': 'gazette'} == params
assert query_api(params) == 200

Limit state to NSW using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-state=New%20South%20Wales', 3)
assert {'q': 'wragge', 'l-state': ['New South Wales'], 'category': 'newspaper'} == params
assert query_api(params) == 200

Limit newspaper to SMH using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-title=35', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-title': ['35'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to 'Article' category using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-category=Article', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-category': ['Article'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to specific decade using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-decade=190', 3)
assert {'q': 'wragge', 'l-artType': 'newspapers', 'l-decade': ['190'], 'category': 'newspaper'} == params
assert query_api(params) == 200

Limit to specific year using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-decade=190&l-year=1903', 3)
assert {'q': 'wragge', 'l-artType': 'newspapers', 'l-decade': ['190'], 'l-year': ['1903'], 'category': 'newspaper'} == params
assert query_api(params) == 200

Limit to articles with illustration type of 'Photo' with facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-illustrationType=Photo', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-illustrated': 'true', 'l-illustrationType': ['Photo'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to articles containing more than 1,000 words using facets.

params = parse_query('https://trove.nla.gov.au/search/category/newspapers?keyword=wragge&l-artType=newspapers&l-word=1000%2B%20Words', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-wordCount': '1000+ Words', 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Advanced search

Multiple keywords in 'Any of these words' box.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword.any=wragge%20weather', 3)
assert {'q': '(wragge OR weather)', 'category': 'newspaper'} == params
assert query_api(params) == 200

Multiple keywords in 'The phrase' box.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword.phrase=inclement%20wragge', 3)
assert {'q': '"inclement wragge"', 'category': 'newspaper'} == params
assert query_api(params) == 200

Keywords in 'All of these words' and 'Without these words' boxes.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword.not=weather&keyword=wragge', 3)
assert {'q': 'wragge AND NOT (weather)', 'category': 'newspaper'} == params
assert query_api(params) == 200

Limit to a specific date range.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&date.from=1900-01-01&date.to=1900-02-04&l-advArtType=newspapers', 3)
assert {'q': 'wragge date:[1899-12-31T00:00:00Z TO 1900-02-04T00:00:00Z]', 'category': 'newspaper', 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to a specific state.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advstate=Queensland', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-state': ['Queensland'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to specific newspapers.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advtitle=16&l-advtitle=1055', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-title': ['16', '1055'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to a specific category.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advcategory=Family%20Notices', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-category': ['Family Notices'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to a specific illustration type.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advIllustrationType=Photo', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-illustrated': 'true', 'l-illustrationType': ['Photo'], 'l-artType': 'newspapers'} == params
assert query_api(params) == 200

Limit to a specific number of words.

params = parse_query('https://trove.nla.gov.au/search/advanced/category/newspapers?keyword=wragge&l-advArtType=newspapers&l-advWord=100%20-%201000%20Words', 3)
assert {'q': 'wragge', 'category': 'newspaper', 'l-wordCount': '100 - 1000 Words', 'l-artType': 'newspapers'} == params
assert query_api(params) == 200
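
Comparing the version 2 and version 3 assertions on this page shows a few consistent differences in parameter names: zone is replaced by category plus l-artType, l-illtype becomes l-illustrationType, and l-word becomes l-wordCount. The sketch below expresses those differences as a conversion between the two dict shapes. It is inferred from the asserted tests only, v2_to_v3 is a hypothetical helper, and parameters not covered by those tests are passed through unchanged.

```python
RENAMED_KEYS = {"l-illtype": "l-illustrationType", "l-word": "l-wordCount"}
ART_TYPES = {"newspaper": "newspapers", "gazette": "gazette"}


def v2_to_v3(params: dict) -> dict:
    """Convert a version 2 style parameter dict to the version 3 style."""
    converted = {}
    for key, value in params.items():
        if key == "zone":
            # Version 3 always uses the newspaper category...
            converted["category"] = "newspaper"
            # ...and keeps the art type as a facet when only one zone was selected.
            if value in ART_TYPES:
                converted["l-artType"] = ART_TYPES[value]
        else:
            converted[RENAMED_KEYS.get(key, key)] = value
    return converted


v2_params = {"q": "wragge", "zone": "newspaper", "l-word": "1000+ Words"}
print(v2_to_v3(v2_params))
```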