API documentation

DirectAccessV1

DirectAccessV2

class directaccess.DirectAccessV2(client_id, client_secret, api_key, retries=5, backoff_factor=1, links=None, access_token=None, **kwargs)

Client for Enverus Drillinginfo Developer API Version 2

count(dataset, **options)

Get the number of records in a dataset that match the given query options.

Parameters:
  • dataset – a valid dataset name. See the Direct Access documentation for valid values
  • options – query parameters as keyword arguments
Returns:

record count as int

ddl(dataset, database)

Get the DDL statement for a dataset. The database argument must be exactly one of mssql (Microsoft SQL Server) or pg (PostgreSQL).

Parameters:
  • dataset – a valid dataset name. See the Direct Access documentation for valid values
  • database – one of mssql or pg.
Returns:

a DDL statement from the Direct Access service as str

docs(dataset)

Get docs for dataset

Parameters:
  • dataset – a valid dataset name. See the Direct Access documentation for valid values
Returns:

docs response for dataset as list[dict] or None if ?docs is not supported on the dataset

get_access_token()

Get an access token from /tokens endpoint. Automatically sets the Authorization header on the class instance’s session. Raises DAAuthException on error.

Returns:

token response as dict

static in_(items)

Helper method for providing values to the API’s in() filter function.

The API currently supports GET requests to dataset endpoints. When providing a large list of values to the API’s in() filter function, the values must be split into chunks to avoid URLs longer than 2048 characters. The query method of this class handles the chunking transparently; this helper method simply stringifies the input items into the correct syntax.
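
The chunking idea described above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the in(a,b,c) string format, the base URL, and the chunk_for_url and build_url helpers are all assumptions made for illustration.

```python
from urllib.parse import quote

MAX_URL_LENGTH = 2048  # limit quoted in the text above


def in_(items):
    """Stringify values into an assumed in(a,b,c) filter syntax."""
    return "in({})".format(",".join(str(x) for x in items))


def build_url(base_url, field, items):
    """Build a GET URL with a single percent-encoded in() filter (hypothetical helper)."""
    return "{}?{}={}".format(base_url, field, quote(in_(items)))


def chunk_for_url(base_url, field, items, max_length=MAX_URL_LENGTH):
    """Yield lists of items whose in() filter keeps the URL within max_length."""
    chunk = []
    for item in items:
        candidate = chunk + [item]
        if chunk and len(build_url(base_url, field, candidate)) > max_length:
            yield chunk
            chunk = [item]
        else:
            chunk = candidate
    if chunk:
        yield chunk


# Hypothetical endpoint and values, used only to exercise the sketch.
base = "https://api.example.com/v2/wellbores"
ids = list(range(100000, 101000))
chunks = list(chunk_for_url(base, "uidparent", ids))
urls = [build_url(base, "uidparent", c) for c in chunks]
```

Each chunk would then be sent as its own request and the results concatenated, which is what the query method is described as doing on the caller's behalf.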

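from directaccess import DirectAccessV2

# client_id, client_secret and api_key are your Direct Access credentials
d2 = DirectAccessV2(client_id, client_secret, api_key)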
# Query well-origins
well_origins_query = d2.query(
    dataset='well-origins',
    deleteddate='null',
    pagesize=100000
)
# Get all UIDs for well-origins
uid_parent_ids = [x['UID'] for x in well_origins_query]
# Provide the UIDs to wellbores endpoint
wellbores_query = d2.query(
    dataset='wellbores',
    deleteddate='null',
    pagesize=100000,
    uidparent=d2.in_(uid_parent_ids)
)
Parameters:
  • items (list) – list or generator of values to provide to in() filter function
Returns:

str to provide to DirectAccessV2 query method

query(dataset, **options)

Query Direct Access V2 dataset

Accepts a dataset name and a variable number of keyword arguments that correspond to the fields specified in the ‘Request Parameters’ section for each dataset in the Direct Access documentation.

This method supports only the JSON output provided by the API and yields one dict per record.

Parameters:
  • dataset – a valid dataset name. See the Direct Access documentation for valid values
  • options – query parameters as keyword arguments
Returns:

query response as generator
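
Because the return value is a generator, records are produced lazily and can be consumed only once. The sketch below illustrates that behaviour with an in-memory stand-in. The fake_fetch_page function and the page-number scheme are assumptions for illustration; the real client's pagination mechanism is not described here.

```python
def fake_fetch_page(dataset, page, pagesize):
    """Hypothetical stand-in for one HTTP GET returning parsed JSON records."""
    records = [{"UID": i, "dataset": dataset} for i in range(7)]
    start = page * pagesize
    return records[start:start + pagesize]


def query(dataset, fetch_page=fake_fetch_page, pagesize=3):
    """Yield one dict per record, fetching pages until one comes back empty."""
    page = 0
    while True:
        records = fetch_page(dataset, page, pagesize)
        if not records:
            return
        for record in records:
            yield record
        page += 1


rows = query("rigs")
first = next(rows)      # triggers the first page fetch only
remaining = list(rows)  # consumes the rest of the generator
leftover = list(rows)   # the generator is now exhausted
```

If the same records are needed twice (for example, to collect UIDs and then write a CSV), materialize them into a list first or issue the query again.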

to_csv(query, path, log_progress=True, **kwargs)

Write query results to CSV. Optional keyword arguments are passed to the csv writer object, allowing control over delimiters, quoting, etc. The default is comma-separated with csv.QUOTE_MINIMAL.

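from directaccess import DirectAccessV2

# client_id, client_secret and api_key are your Direct Access credentials
d2 = DirectAccessV2(client_id, client_secret, api_key)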
query = d2.query('rigs', deleteddate='null', pagesize=1500)
# Write tab-separated file
d2.to_csv(query, '/path/to/rigs.csv', delimiter='\t')
Parameters:
  • query – DirectAccessV1 or DirectAccessV2 query object
  • path (str) – relative or absolute filesystem path for created CSV
  • log_progress (bool) – whether to log progress. If True, log a message with the current written count
Returns:

the newly created CSV file path
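
How keyword arguments reach the csv writer can be sketched with the standard library alone. This is a simplified stand-in, not the library's code: taking the header from the first record's keys is an assumption, and the field names are invented for the example.

```python
import csv
import os
import tempfile


def to_csv(query, path, **kwargs):
    """Write an iterable of dicts to path, forwarding kwargs to csv.DictWriter."""
    writer = None
    with open(path, "w", newline="", encoding="utf-8") as f:
        for record in query:
            if writer is None:
                # Assumption: the header is taken from the first record's keys.
                writer = csv.DictWriter(f, fieldnames=list(record.keys()), **kwargs)
                writer.writeheader()
            writer.writerow(record)
    return path


# Hypothetical records standing in for a query generator.
records = iter([
    {"RigID": 1, "RigName": "Rig, Alpha"},
    {"RigID": 2, "RigName": "Beta"},
])
out_path = to_csv(records, os.path.join(tempfile.mkdtemp(), "rigs.tsv"), delimiter="\t")
with open(out_path, newline="", encoding="utf-8") as f:
    lines = f.read().splitlines()
```

With a tab delimiter and the default csv.QUOTE_MINIMAL, a value containing a comma is written unquoted, because the comma is no longer a special character.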

to_dataframe(dataset, converters=None, log_progress=True, **options)

Write query results to a pandas DataFrame with properly set dtypes and index columns.

This works by requesting the DDL for dataset and parsing the text to build a list of dtypes, date columns and the index column(s). It then makes a query request for dataset to determine the exact fields to expect (for example, when fields is provided as a query parameter, the result will have fewer fields than the DDL).

For endpoints with composite primary keys, a pandas MultiIndex is created.

This method is potentially fragile. The API’s docs feature is preferable but not yet available on all endpoints.

Query results are written to a temporary CSV file and then read into the dataframe. The CSV is removed afterwards.
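
The temporary-file round trip can be sketched as follows. This is an illustration only: csv.DictReader stands in for the pandas CSV reader so that the example needs only the standard library, the roundtrip function is hypothetical, and the converters dict mimics the role of the converters parameter.

```python
import csv
import os
import tempfile


def roundtrip(records, converters=None):
    """Write records to a temporary CSV, read them back, then delete the file."""
    converters = converters or {}
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
            writer.writeheader()
            writer.writerows(records)
        with open(path, newline="", encoding="utf-8") as f:
            # Apply a per-column converter where one is given, else keep the string.
            rows = [
                {k: converters.get(k, str)(v) for k, v in row.items()}
                for row in csv.DictReader(f)
            ]
    finally:
        os.remove(path)  # the temporary file never outlives the call
    return rows, path


rows, tmp_path = roundtrip(
    [{"StateProvince": "TX", "Survey": "SMITH, J"}],
    converters={"Survey": lambda x: x.replace(",", "")},
)
```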

pandas version 0.24.0 or higher is required for use of the Int64 dtype allowing integers with NaN values. It is not possible to coerce missing values for columns of dtype bool and so these are set to object dtype.

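from directaccess import DirectAccessV2

# client_id, client_secret and api_key are your Direct Access credentials
d2 = DirectAccessV2(client_id, client_secret, api_key)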
# Create a Texas permits dataframe, removing commas from Survey names and replacing the state
# abbreviation with the complete name.
df = d2.to_dataframe(
    dataset='permits',
    deleteddate='null',
    pagesize=100000,
    stateprovince='TX',
    converters={
        'StateProvince': lambda x: 'TEXAS',
        'Survey': lambda x: x.replace(',', '')
    }
)
df.head(10)
Parameters:
  • dataset (str) – a valid dataset name. See the Direct Access documentation for valid values
  • converters (dict) – Dict of functions for converting values in certain columns. Keys can either be integers or column labels.
  • log_progress (bool) – whether to log progress. If True, log a message with the current written count
  • options – query parameters as keyword arguments
Returns:

pandas DataFrame
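
The DDL-parsing step that to_dataframe relies on can be sketched roughly as below. Everything here is an assumption for illustration: the sample DDL text, the column names, the SQL-to-dtype mapping and the parse_ddl helper are invented, and the real service's DDL may be laid out differently. The mapping follows the notes above (integers to Int64, booleans to object).

```python
import re

# Hypothetical DDL text; the real service's output may differ.
SAMPLE_DDL = """CREATE TABLE permits (
    PermitID BIGINT NOT NULL,
    StateProvince VARCHAR(50) NOT NULL,
    ApprovedDate DATETIME,
    IsHorizontal BIT,
    PRIMARY KEY (PermitID, StateProvince)
);"""

# Assumed mapping from SQL type names to pandas dtype names.
SQL_TO_DTYPE = {
    "BIGINT": "Int64",
    "INT": "Int64",
    "VARCHAR": "object",
    "TEXT": "object",
    "FLOAT": "float64",
    "NUMERIC": "float64",
    "BIT": "object",  # booleans with missing values are kept as object
}
DATE_TYPES = ("DATETIME", "DATE", "TIMESTAMP")


def parse_ddl(ddl):
    """Return (dtypes, date_columns, index_columns) extracted from DDL text."""
    dtypes, date_columns, index_columns = {}, [], []
    for line in ddl.splitlines():
        line = line.strip().rstrip(",")
        pk = re.match(r"PRIMARY KEY\s*\((.+)\)", line, re.IGNORECASE)
        if pk:
            index_columns = [c.strip() for c in pk.group(1).split(",")]
            continue
        col = re.match(r"(\w+)\s+([A-Za-z]+)", line)
        if not col:
            continue
        name, sql_type = col.group(1), col.group(2).upper()
        if sql_type in DATE_TYPES:
            date_columns.append(name)
        elif sql_type in SQL_TO_DTYPE:
            dtypes[name] = SQL_TO_DTYPE[sql_type]
        # Any other line (such as CREATE TABLE) is ignored.
    return dtypes, date_columns, index_columns


dtypes, date_columns, index_columns = parse_ddl(SAMPLE_DDL)
```

A composite primary key, as in this sample, yields more than one index column, which corresponds to the MultiIndex case mentioned above. Text parsing of this kind breaks easily when the DDL layout changes, which is the fragility the description warns about.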