Butler
A high level object that provides read access to the Datasets in a single Collection and write access to a single Run.
Butler is a concrete, final Python class in the current design; all extensibility is provided by the Registry and Datastore instances it holds.
Transition
The new Butler plays essentially the same role as the v14 Butler.
Python API

class Butler

config
A ButlerConfiguration instance.

get(label, parameters=None)
Load a Dataset or a slice thereof from the Butler’s Collection.
Parameters:
- label (DatasetLabel) – a DatasetLabel that identifies the Dataset to retrieve.
- parameters (dict) – a dictionary of StorageClass-specific parameters that can be used to obtain a slice of the Dataset.
Returns: an InMemoryDataset.
Implemented as:
    handle = self.registry.find(self.run.collection, label)
    return self.getDirect(handle, parameters)
Todo
- Implementation requires all components to be able to handle (typically pass-through) parameters passed for the composite. Could we instead get away with only passing those when getting the parent from the Datastore?
- Recursive composites were broken by a minor update. Would probably not be hard to add back in if we decide we need them, but they’d make the logic a bit harder to follow so not worth doing now.
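
The Collection-scoped lookup in get() can be illustrated with toy stand-ins. This is only a sketch: ToyRegistry, ToyDatastore, ToyButler, the "slice" parameter key, and the mem:// URIs are hypothetical, and the toy Datastore omits the StorageClass argument for brevity. Only the find-then-getDirect call pattern is taken from the implementation above.

```python
from collections import namedtuple

# Hypothetical stand-ins; the field names are assumptions for illustration.
Handle = namedtuple("Handle", ["uri"])
Run = namedtuple("Run", ["collection"])


class ToyRegistry:
    """Maps (collection, label) pairs to handles."""

    def __init__(self, entries):
        self._entries = entries

    def find(self, collection, label):
        return self._entries[(collection, label)]


class ToyDatastore:
    """Maps URIs to in-memory objects; 'parameters' may select a slice."""

    def __init__(self, blobs):
        self._blobs = blobs

    def get(self, uri, parameters=None):
        data = self._blobs[uri]
        if parameters and "slice" in parameters:
            return data[parameters["slice"]]
        return data


class ToyButler:
    def __init__(self, registry, datastore, run):
        self.registry = registry
        self.datastore = datastore
        self.run = run

    def get(self, label, parameters=None):
        # Resolve the label within this Butler's own Collection only.
        handle = self.registry.find(self.run.collection, label)
        return self.getDirect(handle, parameters)

    def getDirect(self, handle, parameters=None):
        # No Collection lookup here: any handle can be read.
        return self.datastore.get(handle.uri, parameters)


registry = ToyRegistry({("nightly", "calexp-1"): Handle("mem://a")})
datastore = ToyDatastore({"mem://a": [10, 20, 30, 40]})
butler = ToyButler(registry, datastore, Run("nightly"))

full = butler.get("calexp-1")
part = butler.get("calexp-1", parameters={"slice": slice(1, 3)})
```

In this toy, a label that is not registered in the Butler’s Collection raises KeyError from the registry, which mirrors the idea that get() cannot reach Datasets outside that Collection.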

getDirect(handle, parameters=None)
Load a Dataset or a slice thereof from a DatasetHandle.
Unlike Butler.get(), this method allows Datasets outside the Butler’s Collection to be read as long as the DatasetHandle that identifies them can be obtained separately. This is needed to support the Comparison SuperTasks use case.
Parameters:
- handle (DatasetHandle) – a pointer to the Dataset to load.
- parameters (dict) – a dictionary of StorageClass-specific parameters that can be used to obtain a slice of the Dataset.
Returns: an InMemoryDataset.
Implemented as:
    parent = self.datastore.get(handle.uri, handle.type.storageClass, parameters) if handle.uri else None
    children = {name: self.datastore.get(childHandle.uri, childHandle.type.storageClass, parameters)
                for name, childHandle in handle.components.items()}
    return handle.type.storageClass.assemble(parent, children)
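
The parent-plus-components assembly can be sketched as follows. The classes DictStorageClass, ScalarStorageClass, and ToyDatastore, the get_direct function name, and the mem:// URIs are hypothetical, and the sketch assumes each component is fetched by its own URI and StorageClass. Only the overall flow (optional parent, named children, then assemble) follows the description above.

```python
from collections import namedtuple

# Hypothetical stand-ins for the handle and dataset-type objects.
DatasetType = namedtuple("DatasetType", ["storageClass"])
Handle = namedtuple("Handle", ["uri", "type", "components"])


class DictStorageClass:
    """Toy StorageClass whose in-memory form is a plain dict."""

    @staticmethod
    def assemble(parent, children):
        # Start from the parent (if any) and overlay each named component.
        result = dict(parent) if parent is not None else {}
        result.update(children)
        return result


class ScalarStorageClass:
    """Toy StorageClass for leaf Datasets; there is nothing to assemble."""

    @staticmethod
    def assemble(parent, children):
        return parent


class ToyDatastore:
    def __init__(self, blobs):
        self._blobs = blobs

    def get(self, uri, storageClass, parameters=None):
        # The toy ignores storageClass and parameters.
        return self._blobs[uri]


def get_direct(datastore, handle, parameters=None):
    storage_class = handle.type.storageClass
    # A composite may have no URI of its own, only components.
    parent = (datastore.get(handle.uri, storage_class, parameters)
              if handle.uri else None)
    children = {
        name: datastore.get(child.uri, child.type.storageClass, parameters)
        for name, child in handle.components.items()
    }
    return storage_class.assemble(parent, children)


scalar = DatasetType(ScalarStorageClass)
wcs = Handle("mem://wcs", scalar, {})
psf = Handle("mem://psf", scalar, {})
exposure = Handle(None, DatasetType(DictStorageClass), {"wcs": wcs, "psf": psf})

datastore = ToyDatastore({"mem://wcs": "tan-sip", "mem://psf": "gaussian"})
assembled = get_direct(datastore, exposure)
leaf = get_direct(datastore, wcs)
```

Because get_direct takes a handle rather than a label, nothing in it consults a Collection, which is what lets it read Datasets the Butler’s own Collection does not contain.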

put(label, dataset, producer=None)
Write a Dataset.
Parameters:
- label (DatasetLabel) – a DatasetLabel that will identify the Dataset being stored.
- dataset – the InMemoryDataset to store.
- producer (Quantum) – the Quantum instance that produced the Dataset. May be None for some Registries. producer.run must match self.run.
Returns: the result of self.registry.addDataset().
Implemented as:
    ref = self.registry.expand(label)
    run = self.run
    assert producer is None or run == producer.run
    storageHint = ref.makeStorageHint(run)
    uri, components = self.datastore.put(dataset, ref.type.storageClass, storageHint, ref.type.name)
    return self.registry.addDataset(ref, uri, components, producer=producer, run=run)
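
The write path can be sketched in the same toy style. ToyRef, ToyRegistry, ToyDatastore, the storage-hint format, and the dict returned by addDataset are all hypothetical; the sketch also raises ValueError where the implementation above uses an assertion. Only the expand, makeStorageHint, datastore.put, and addDataset sequence comes from the text.

```python
from collections import namedtuple

# Hypothetical stand-ins; the field names are assumptions for illustration.
Run = namedtuple("Run", ["collection"])
Quantum = namedtuple("Quantum", ["run"])
DatasetType = namedtuple("DatasetType", ["name", "storageClass"])


class ToyRef:
    def __init__(self, label, dataset_type):
        self.label = label
        self.type = dataset_type

    def makeStorageHint(self, run):
        # Assumed hint format; the real format is not specified here.
        return f"{run.collection}/{self.type.name}/{self.label}"


class ToyRegistry:
    def __init__(self, types):
        self._types = types  # label -> DatasetType
        self.records = []

    def expand(self, label):
        return ToyRef(label, self._types[label])

    def addDataset(self, ref, uri, components, producer=None, run=None):
        record = {"label": ref.label, "uri": uri, "components": components,
                  "producer": producer, "run": run}
        self.records.append(record)
        return record


class ToyDatastore:
    def __init__(self):
        self.blobs = {}

    def put(self, dataset, storageClass, storageHint, typeName):
        uri = f"mem://{storageHint}"
        self.blobs[uri] = dataset
        return uri, {}  # this toy never splits a Dataset into components


def put(registry, datastore, run, label, dataset, producer=None):
    ref = registry.expand(label)
    # The producing Quantum, if given, must belong to the Butler's Run.
    if producer is not None and producer.run != run:
        raise ValueError("producer.run must match the Butler's run")
    hint = ref.makeStorageHint(run)
    uri, components = datastore.put(dataset, ref.type.storageClass, hint,
                                    ref.type.name)
    return registry.addDataset(ref, uri, components, producer=producer, run=run)


run = Run("nightly")
registry = ToyRegistry({"calexp-1": DatasetType("calexp", "Exposure")})
datastore = ToyDatastore()
record = put(registry, datastore, run, "calexp-1", [1, 2, 3],
             producer=Quantum(run))
```

The Datastore writes first and the Registry records the resulting URI afterwards, so in this sketch a failed write leaves no Registry entry behind.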

markInputUsed(quantum, ref)
Mark a Dataset as having been “actually” (not just predicted-to-be) used by a Quantum.
Parameters:
- quantum (Quantum) – the dependent Quantum.
- ref (DatasetRef) – the Dataset that is a true dependency of quantum.
Implemented as:
    handle = self.registry.find(self.run.collection, ref)
    self.registry.markInputUsed(handle, quantum)

unlink(*labels)
Remove the Datasets associated with the given DatasetLabels from the Butler’s Collection, and signal that they may be deleted from storage if they are not referenced by any other Collection.
Implemented as:
    handles = [self.registry.find(self.run.collection, label) for label in labels]
    for handle in self.registry.disassociate(self.run.collection, handles, remove=True):
        self.datastore.remove(handle.uri)
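
The "delete only if no other Collection references it" behaviour can be sketched as below. ToyRegistry, ToyDatastore, and the associate helper are hypothetical, and the sketch assumes that disassociate(..., remove=True) returns exactly those handles that no Collection references any longer; the text above does not spell out that return value.

```python
from collections import namedtuple

# Hypothetical stand-in for a DatasetHandle.
Handle = namedtuple("Handle", ["uri"])


class ToyRegistry:
    def __init__(self):
        self._collections = {}  # collection name -> {label: Handle}

    def associate(self, collection, label, handle):
        self._collections.setdefault(collection, {})[label] = handle

    def find(self, collection, label):
        return self._collections[collection][label]

    def disassociate(self, collection, handles, remove=True):
        members = self._collections[collection]
        for label in [k for k, v in members.items() if v in handles]:
            del members[label]
        if not remove:
            return []
        # Assumed semantics: report handles that are now orphaned everywhere.
        still_used = {h for m in self._collections.values() for h in m.values()}
        return [h for h in handles if h not in still_used]


class ToyDatastore:
    def __init__(self, blobs):
        self.blobs = blobs

    def remove(self, uri):
        del self.blobs[uri]


def unlink(registry, datastore, collection, *labels):
    handles = [registry.find(collection, label) for label in labels]
    for handle in registry.disassociate(collection, handles, remove=True):
        datastore.remove(handle.uri)


shared = Handle("mem://shared")
private = Handle("mem://private")
registry = ToyRegistry()
registry.associate("a", "x", shared)
registry.associate("b", "x", shared)   # a second Collection keeps this alive
registry.associate("a", "y", private)
datastore = ToyDatastore({"mem://shared": 1, "mem://private": 2})

unlink(registry, datastore, "a", "x", "y")
remaining = dict(datastore.blobs)
```

Here the Dataset shared with Collection "b" survives in storage, while the one referenced only by "a" is removed.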
Todo
How much more of Registry’s API should Butler forward?

class ButlerConfiguration

Note
This is currently a class that maps directly onto a YAML file.
- Configuration options are accessed through dictionary keys separated by dots (e.g. config['datastore.root']).
- Configuration for Datastore and Registry, including which classes to instantiate, is nested under config['datastore'] and config['registry'] respectively.
But this is an implementation detail that is likely to change significantly.
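
Dotted-key access over nested configuration can be sketched with a small wrapper. DottedConfig is a hypothetical name, the nested dict stands in for the parsed YAML file, and the "cls" keys and their values are placeholders rather than real option names.

```python
class DottedConfig:
    """Hypothetical sketch of dotted-key lookup over a nested dict."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        # Walk one nesting level per dot-separated part of the key.
        node = self._data
        for part in key.split("."):
            node = node[part]
        return node


# Stand-in for the content of a YAML file; all values are placeholders.
config = DottedConfig({
    "datastore": {"cls": "example.ToyDatastore", "root": "/data/repo"},
    "registry": {"cls": "example.ToyRegistry"},
})

root = config["datastore.root"]
datastore_section = config["datastore"]
```

A key without dots returns the whole nested section, which is how a Butler could hand config['datastore'] and config['registry'] to the classes named inside them.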