document autopopulate.make logic #1241

Merged
merged 1 commit into from
Jun 6, 2025
76 changes: 70 additions & 6 deletions datajoint/autopopulate.py
@@ -93,13 +93,75 @@ def _rename_attributes(table, props):

     def make(self, key):
         """
-        Derived classes must implement method `make` that fetches data from tables
-        above them in the dependency hierarchy, restricting by the given key,
-        computes secondary attributes, and inserts the new tuples into self.
+        This method must be implemented by derived classes to perform automated computation.
+        The method must implement the following three steps:
+
+        1. Fetch data from tables above in the dependency hierarchy, restricted by the given key.
+        2. Compute secondary attributes based on the fetched data.
+        3. Insert the new tuples into the current table.
+
+        The method can be implemented either as:
+        (a) Regular method: All three steps are performed in a single database transaction.
+            The method must return None.
+        (b) Generator method:
+            The make method is split into three functions:
+            - `make_fetch`: Fetches data from the parent tables.
+            - `make_compute`: Computes secondary attributes based on the fetched data.
+            - `make_insert`: Inserts the computed data into the current table.
+
+            The populate logic then executes as follows:
+
+            <pseudocode>
+            fetched_data1 = self.make_fetch(key)
+            computed_result = self.make_compute(key, *fetched_data1)
+            begin transaction:
+                fetched_data2 = self.make_fetch(key)
+                if fetched_data1 != fetched_data2:
+                    cancel transaction
+                else:
+                    self.make_insert(key, *computed_result)
+                    commit transaction
+            </pseudocode>
+
+        Importantly, the output of `make_fetch` is a tuple that serves as the input into `make_compute`.
+        The output of `make_compute` is a tuple that serves as the input into `make_insert`.
+
+        The functionality must be strictly divided between these three methods:
+        - All database queries must be completed in `make_fetch`.
+        - All computation must be completed in `make_compute`.
+        - All database inserts must be completed in `make_insert`.
+
+        DataJoint may programmatically enforce this separation in the future.
+
+        :param key: The primary key value used to restrict the data fetching.
+        :raises NotImplementedError: If the derived class does not implement the required methods.
         """
-        raise NotImplementedError(
-            "Subclasses of AutoPopulate must implement the method `make`"
-        )
+
+        if not (
+            hasattr(self, "make_fetch")
+            and hasattr(self, "make_insert")
+            and hasattr(self, "make_compute")
+        ):
+            # user must implement `make`
+            raise NotImplementedError(
+                "Subclasses of AutoPopulate must implement the method `make` or (`make_fetch` + `make_compute` + `make_insert`)"
+            )
+
+        # User has implemented `make_fetch`, `make_compute`, and `make_insert` methods instead
+
+        # Step 1: Fetch data from parent tables
+        fetched_data = self.make_fetch(key)  # fetched_data is a tuple
+        computed_result = yield fetched_data  # passed as input into make_compute
+
+        # Step 2: If computed result is not passed in, compute the result
+        if computed_result is None:
+            # this is only executed in the first invocation
+            computed_result = self.make_compute(key, *fetched_data)
+            yield computed_result  # this is passed to the second invocation of make
+
+        # Step 3: Insert the computed result into the current table.
+        self.make_insert(key, *computed_result)
+        yield

     @property
     def target(self):
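
For readers of this hunk, here is a minimal sketch of the three-method split that the new docstring describes. It deliberately avoids importing DataJoint so that it runs without a database: `MeanSignal`, `source`, and `rows` are hypothetical in-memory stand-ins for a computed table, its parent table, and its own storage, and are not part of the library.

```python
# Hypothetical stand-in for a computed table; NOT a DataJoint class.
class MeanSignal:
    def __init__(self, source):
        self.source = source  # assumed parent data: {session_id: [samples]}
        self.rows = []        # assumed stand-in for rows inserted into this table

    def make_fetch(self, key):
        # All reads happen here. The return value is a tuple.
        samples = tuple(self.source[key["session_id"]])
        return (samples,)

    def make_compute(self, key, samples):
        # Pure computation with no reads or writes. The return value is a tuple.
        mean = sum(samples) / len(samples)
        return (mean, len(samples))

    def make_insert(self, key, mean, n_samples):
        # All writes happen here.
        self.rows.append({**key, "mean": mean, "n_samples": n_samples})


table = MeanSignal({1: [2.0, 4.0, 6.0]})
key = {"session_id": 1}

# The tuple returned by each step is unpacked into the next step's arguments.
fetched = table.make_fetch(key)
computed = table.make_compute(key, *fetched)
table.make_insert(key, *computed)
```

The point of the split is that `make_compute` receives everything it needs as arguments, so it can run outside a transaction while the fetch and insert steps stay short.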
@@ -347,6 +409,8 @@ def _populate1(
                     ]
                 ):  # rollback due to referential integrity fail
                     self.connection.cancel_transaction()
+                    logger.warning(
+                        f"Referential integrity failed for {key} -> {self.target.full_table_name}")
                     return False
                 gen.send(computed_result)  # insert
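
The two-pass protocol that this hunk belongs to can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in `_populate1`: `Table`, `populate1`, and the `between_passes` hook are hypothetical, the transaction calls are reduced to comments, and the fetched tuples are compared with plain `!=` for simplicity.

```python
class Table:
    """Hypothetical in-memory stand-in; NOT a DataJoint class."""

    def __init__(self, source):
        self.source = source  # assumed stand-in for parent tables
        self.rows = []        # assumed stand-in for the current table

    def make_fetch(self, key):
        return (tuple(self.source[key["id"]]),)

    def make_compute(self, key, samples):
        return (sum(samples),)

    def make_insert(self, key, total):
        self.rows.append({**key, "total": total})

    def make(self, key):
        # Same control flow as the generator body added in this PR.
        fetched_data = self.make_fetch(key)
        computed_result = yield fetched_data
        if computed_result is None:
            # First invocation: nothing was sent in, so compute here.
            computed_result = self.make_compute(key, *fetched_data)
            yield computed_result
        self.make_insert(key, *computed_result)
        yield


def populate1(table, key, between_passes=None):
    # Pass 1: fetch and compute outside any transaction.
    gen = table.make(key)
    fetched_first = next(gen)
    computed_result = next(gen)

    if between_passes is not None:
        between_passes()  # hypothetical hook simulating a concurrent upstream change

    # Pass 2: a real driver would start a transaction here.
    gen = table.make(key)
    fetched_second = next(gen)
    if fetched_first != fetched_second:
        return False  # a real driver would cancel the transaction and log a warning
    gen.send(computed_result)  # skips make_compute and runs make_insert
    return True  # a real driver would commit the transaction


source = {1: [1, 2, 3]}
table = Table(source)
ok = populate1(table, {"id": 1})
stale = populate1(table, {"id": 1}, between_passes=lambda: source[1].append(4))
```

In the second call the upstream data changes between the two fetches, so the insert is skipped and `populate1` returns `False`, which is the case the new `logger.warning` reports.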
