Dataset

class tamr_client.Dataset(url, name, key_attribute_names, description=None)[source]

A Tamr dataset

See https://docs.tamr.com/reference/dataset-models

Parameters
  • url (URL) – The canonical dataset-based URL for this dataset, e.g. /datasets/1

  • name (str) – Dataset name

  • key_attribute_names (Tuple[str, …]) – Dataset primary key attribute names

  • description (Optional[str]) – Dataset description

tamr_client.dataset.by_resource_id(session, instance, id)[source]

Get dataset by resource ID

Fetches dataset from Tamr server

Parameters
  • instance (Instance) – Tamr instance containing this dataset

  • id (str) – Dataset ID

Raises

Return type

Dataset
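As a hedged usage sketch, the lookup can be wrapped so that a missing dataset yields None instead of an exception. The helper name dataset_or_none is hypothetical. The sketch assumes that by_resource_id signals a missing dataset with tamr_client.dataset.NotFound, which is inferred from that exception's description rather than from a Raises entry for this function. The session and instance arguments are assumed to have been created elsewhere.

```python
def dataset_or_none(session, instance, dataset_id):
    """Return the dataset with the given resource ID, or None if it is missing."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    try:
        return tc.dataset.by_resource_id(session, instance, dataset_id)
    except tc.dataset.NotFound:
        # Assumption: a missing dataset surfaces as dataset.NotFound.
        return None
```

Any other error, such as an unrelated HTTP failure, is deliberately left to propagate to the caller.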

tamr_client.dataset.by_name(session, instance, name)[source]

Get dataset by name

Fetches dataset from Tamr server

Parameters
  • instance (Instance) – Tamr instance containing this dataset

  • name (str) – Dataset name

Raises

Return type

Dataset
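A name lookup can fail in two different ways, so a caller may want to tell them apart. The following is a minimal sketch, not the library's own pattern: the helper name find_by_name and the status strings are hypothetical, and it assumes that by_name raises tamr_client.dataset.NotFound when no dataset has the name and tamr_client.dataset.Ambiguous when several do.

```python
def find_by_name(session, instance, name):
    """Look up a dataset by name and return a (status, dataset) pair."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    try:
        return "found", tc.dataset.by_name(session, instance, name)
    except tc.dataset.NotFound:
        # Assumption: no dataset on the server has this name.
        return "missing", None
    except tc.dataset.Ambiguous:
        # Assumption: more than one dataset matched this name.
        return "ambiguous", None
```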

tamr_client.dataset.attributes(session, dataset)[source]

Get all attributes from a dataset

Parameters

dataset (Dataset) – Dataset containing the desired attributes

Return type

Tuple[Attribute, …]

Returns

The attributes for the specified dataset

Raises

requests.HTTPError – If an HTTP error is encountered.
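For illustration, the returned tuple can be reduced to attribute names. This sketch assumes that each Attribute exposes a name field, which is not stated in this section; the helper name attribute_names is hypothetical.

```python
def attribute_names(session, dataset):
    """Return the names of all attributes of a dataset as a tuple of strings."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    attrs = tc.dataset.attributes(session, dataset)
    # Assumption: every Attribute has a `name` field.
    return tuple(attr.name for attr in attrs)
```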

tamr_client.dataset.materialize(session, dataset)[source]

Materialize a dataset and wait for the operation to complete.

Materializing consists of updating the dataset (including records) in persistent storage (HBase) based on upstream changes to data.

Parameters

dataset (Dataset) – A Tamr dataset which will be materialized

Return type

Operation
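Because materialize waits for each operation to complete, several datasets can be refreshed one after another in a plain loop. The sketch below is one possible way to do that; the helper name materialize_all is hypothetical, and the caller is assumed to pass the datasets in a sensible order (for example, upstream datasets first).

```python
def materialize_all(session, datasets):
    """Materialize datasets in the given order and return their operations."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    operations = []
    for dataset in datasets:
        # Each call returns only after the operation has completed,
        # so the datasets are processed strictly in sequence.
        operations.append(tc.dataset.materialize(session, dataset))
    return operations
```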

tamr_client.dataset.delete(session, dataset, *, cascade=False)[source]

Delete an existing dataset

Sends a deletion request to the Tamr server

Parameters
  • dataset (Dataset) – Existing dataset to delete

  • cascade (bool) – Whether to delete all derived datasets as well
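A hedged sketch of an idempotent delete follows. It assumes that deleting a dataset that is no longer on the server raises tamr_client.dataset.NotFound, as that exception's description suggests; the helper name delete_if_exists is hypothetical.

```python
def delete_if_exists(session, dataset, cascade=False):
    """Delete a dataset; return True if deleted, False if it was already gone."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    try:
        # cascade is keyword-only in the documented signature.
        tc.dataset.delete(session, dataset, cascade=cascade)
    except tc.dataset.NotFound:
        # Assumption: the dataset no longer exists on the server.
        return False
    return True
```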

Raises

tamr_client.dataset.get_all(session, instance, *, filter=None)[source]

Get all datasets from an instance

Parameters
  • instance (Instance) – Tamr instance from which to get datasets

  • filter (Union[str, List[str], None]) – Filter expression, e.g. “externalId==wobbly”. Multiple expressions can be passed as a list.

Return type

Tuple[Dataset, …]

Returns

The datasets retrieved from the instance

Raises

requests.HTTPError – If an HTTP error is encountered.
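As a small usage sketch, a filter expression of the form shown above can be built from a variable and passed through the keyword-only filter parameter. The helper name datasets_by_external_id is hypothetical, and the sketch passes a single expression only, since how multiple expressions are combined is not specified here.

```python
def datasets_by_external_id(session, instance, external_id):
    """Return the datasets whose external ID equals the given value."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    # Same expression shape as the documented example "externalId==wobbly".
    expression = f"externalId=={external_id}"
    return tc.dataset.get_all(session, instance, filter=expression)
```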

tamr_client.dataset.create(session, instance, *, name, key_attribute_names, description=None, external_id=None)[source]

Create a dataset in Tamr.

Parameters
  • instance (Instance) – Tamr instance

  • name (str) – Dataset name

  • key_attribute_names (Tuple[str, …]) – Dataset primary key attribute names

  • description (Optional[str]) – Dataset description

  • external_id (Optional[str]) – External ID of the dataset

Return type

Dataset

Returns

Dataset created in Tamr
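One common pattern is to create a dataset only if it is not there yet. The following is a sketch under stated assumptions, not a documented recipe: it assumes that create raises tamr_client.dataset.AlreadyExists when a dataset with the same name exists, and that the existing dataset can then be fetched with by_name. The helper name ensure_dataset is hypothetical.

```python
def ensure_dataset(session, instance, name, key_attribute_names, description=None):
    """Create a dataset, or return the existing one if creation reports a clash."""
    # Imported inside the function so this sketch can be loaded even
    # where tamr_client is not installed.
    import tamr_client as tc

    try:
        return tc.dataset.create(
            session,
            instance,
            name=name,
            key_attribute_names=key_attribute_names,
            description=description,
        )
    except tc.dataset.AlreadyExists:
        # Assumption: the clash is on the dataset name, so a lookup by
        # name returns the dataset that already exists.
        return tc.dataset.by_name(session, instance, name)
```

Note that the existing dataset is returned as it is on the server; this sketch does not check that its key attributes or description match the arguments.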

Raises

Exceptions

class tamr_client.dataset.NotFound[source]

Raised when referencing (e.g. updating or deleting) a dataset that does not exist on the server.

class tamr_client.dataset.Ambiguous[source]

Raised when referencing a dataset by name that matches multiple possible targets.

class tamr_client.dataset.AlreadyExists[source]

Raised when a dataset with these specifications already exists.