feat(records): ingest endpoint #2652

Open
andersfylling wants to merge 1 commit into master from andersf/records/ingest2

@andersfylling commented May 27, 2026

https://cognitedata.atlassian.net/browse/HVD-1261

Created a new PR (again) due to GitHub issues.

@andersfylling force-pushed the andersf/records/ingest2 branch from 1636624 to 4287519 May 27, 2026 19:56
@andersfylling marked this pull request as ready for review May 27, 2026 19:57
@andersfylling requested review from a team as code owners May 27, 2026 19:57
@andersfylling changed the title from "andersf/records/ingest2" to "feat(records): ingest endpoint" May 27, 2026

@gemini-code-assist (bot) left a comment

Code Review

This pull request adds support for ingesting records into data modeling streams by introducing the ingest method to both the sync and async records API, along with supporting data classes (RecordWrite, RecordSource, RecordSourceReference, and RecordWriteList). Feedback on the changes focuses on adding missing type hints to class attributes across the new data classes to comply with the repository style guide, and safely handling potential null values for sources in RecordWrite._load to prevent a TypeError.

Comment thread cognite/client/data_classes/data_modeling/records.py
Comment on lines +107 to +113
    @classmethod
    def _load(cls, resource: dict[str, Any]) -> Self:
        return cls(
            space=resource["space"],
            external_id=resource["externalId"],
            sources=[RecordSource._load(s) for s in resource.get("sources", [])],
        )

high

If the API response contains sources as null (which is represented as None in Python), resource.get("sources", []) will return None. This will cause a TypeError when attempting to iterate over it. Use resource.get("sources") or [] to safely handle this case.

Suggested change
-     @classmethod
-     def _load(cls, resource: dict[str, Any]) -> Self:
-         return cls(
-             space=resource["space"],
-             external_id=resource["externalId"],
-             sources=[RecordSource._load(s) for s in resource.get("sources", [])],
-         )
+     @classmethod
+     def _load(cls, resource: dict[str, Any]) -> Self:
+         return cls(
+             space=resource["space"],
+             external_id=resource["externalId"],
+             sources=[RecordSource._load(s) for s in resource.get("sources") or []],
+         )

Comment on lines +126 to +129
class RecordWriteList(CogniteResourceList[RecordWrite]):
    """A list of :class:`RecordWrite` objects."""

    _RESOURCE = RecordWrite

high

Please add a type hint for the class attribute _RESOURCE to comply with the repository style guide requirement that all class attributes must have type hints.

Suggested change
- class RecordWriteList(CogniteResourceList[RecordWrite]):
-     """A list of :class:`RecordWrite` objects."""
-     _RESOURCE = RecordWrite
+ class RecordWriteList(CogniteResourceList[RecordWrite]):
+     """A list of :class:`RecordWrite` objects."""
+     _RESOURCE: type[RecordWrite] = RecordWrite
References
  1. All functions, methods, and class attributes must have type hints.

resource_path=self._records_url(stream_id),
)

async def ingest(self, stream_id: str, items: RecordWrite | Sequence[RecordWrite]) -> None:

I suggest we adopt the "items as posarg, stream_id as kwarg" pattern here too:

Suggested change
-     async def ingest(self, stream_id: str, items: RecordWrite | Sequence[RecordWrite]) -> None:
+     async def ingest(
+         self,
+         items: RecordWrite | Sequence[RecordWrite],
+         *,
+         stream_id: str,
+     ) -> None:
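
To illustrate what the suggested signature means for callers, here is a minimal sketch. `RecordsAPISketch` and the two-field `RecordWrite` are hypothetical stand-ins that only record calls; they are not the SDK classes and make no HTTP requests.

```python
from __future__ import annotations

import asyncio
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class RecordWrite:
    # Stand-in with only the two identifier fields visible in the diff.
    space: str
    external_id: str


class RecordsAPISketch:
    """Hypothetical stand-in for the async records API; it only records calls."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, int]] = []

    async def ingest(self, items: RecordWrite | Sequence[RecordWrite], *, stream_id: str) -> None:
        # Everything after the bare `*` must be passed by keyword.
        count = 1 if isinstance(items, RecordWrite) else len(items)
        self.calls.append((stream_id, count))


api = RecordsAPISketch()
record = RecordWrite(space="my_space", external_id="rec-1")
asyncio.run(api.ingest(record, stream_id="my_stream"))
asyncio.run(api.ingest([record, record], stream_id="my_stream"))
```

With this shape, `api.ingest(record, "my_stream")` fails with a `TypeError` at call time, so the two arguments cannot be swapped silently.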

Comment on lines +815 to +817
assert list_cls is not None
assert resource_cls is not None
assert input_resource_cls is not None

mypy you silly 😆

Comment on lines +157 to +158
from cognite.client.utils._identifier import RecordId


Suggested change
- from cognite.client.utils._identifier import RecordId

Comment on lines +165 to +166
from cognite.client.utils._identifier import RecordId


Suggested change
- from cognite.client.utils._identifier import RecordId

}


class RecordWrite(WriteableCogniteResource["RecordWrite"]):

We have used the word "Apply" in Data Modeling, e.g. NodeApply and EdgeApply, but honestly, I like Write better.

return self


class RecordWriteList(CogniteResourceList[RecordWrite]):

This is missing an as_ids (not that write-list-type classes ever see any real use; I don't even know why we have them 😆)
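
A minimal sketch of what such an `as_ids` could look like. The classes below are simplified stand-ins: the frozen `RecordId` dataclass is an assumption and not the SDK's `RecordId`, `as_id` is a hypothetical helper, and `UserList` stands in for `CogniteResourceList`.

```python
from __future__ import annotations

from collections import UserList
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordId:
    # Assumed shape: a record is identified by space + external_id.
    space: str
    external_id: str


@dataclass
class RecordWrite:
    space: str
    external_id: str

    def as_id(self) -> RecordId:
        # Hypothetical helper mirroring the identifier fields.
        return RecordId(space=self.space, external_id=self.external_id)


class RecordWriteList(UserList):
    """Stand-in list type; the real class derives from CogniteResourceList."""

    def as_ids(self) -> list[RecordId]:
        # One identifier per item, in list order.
        return [record.as_id() for record in self.data]


records = RecordWriteList([RecordWrite("my_space", "a"), RecordWrite("my_space", "b")])
ids = records.as_ids()
```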

Let's remove all Record* classes from here (cognite/client/data_classes/__init__.py) and keep them in cognite/client/data_classes/data_modeling/__init__.py.

... )
"""
self._warning.warn()
item_list: list[RecordWrite] = [items] if isinstance(items, RecordWrite) else list(items)

Does mypy complain about this? Would be nice to avoid making a full copy

Suggested change
-         item_list: list[RecordWrite] = [items] if isinstance(items, RecordWrite) else list(items)
+         item_list: list[RecordWrite] = [items] if isinstance(items, RecordWrite) else items
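
One possible middle ground, sketched here with a stand-in `RecordWrite` and a hypothetical helper name `as_item_list`: copy only when the input is not already a list. A type checker would likely reject assigning a `Sequence[RecordWrite]` to a `list[RecordWrite]` variable as in the suggestion, whereas the `isinstance(items, list)` branch narrows the type and avoids the copy in the common case.

```python
from __future__ import annotations

from collections.abc import Sequence
from dataclasses import dataclass


@dataclass
class RecordWrite:
    # Stand-in with only the identifier fields.
    space: str
    external_id: str


def as_item_list(items: RecordWrite | Sequence[RecordWrite]) -> list[RecordWrite]:
    """Normalize the input to a list, copying only when needed."""
    if isinstance(items, RecordWrite):
        return [items]
    if isinstance(items, list):
        # Already a list: reuse it as-is, no copy.
        return items
    # Tuples and other sequences are converted once.
    return list(items)


record = RecordWrite("my_space", "rec-1")
existing = [record]
same_object = as_item_list(existing) is existing
```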

return len(self) == len({(r.space, r.external_id) for r in self._identifiers})


class RecordSourceReference(CogniteResource):

I believe we can remove this class entirely and use the existing ContainerId

Or make a shallow subclass RecordContainerId

}


class RecordSource(CogniteResource):

If RecordSourceReference is replaced with ContainerId, then RecordSource becomes nearly identical to NodeOrEdgeData, but I think we should keep it. The node-or-edge thingy is very bloated due to support for TypedInstance, which for the foreseeable future should not make it into Records.

So I just suggest updating source to ContainerId here.
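
A rough sketch of the resulting shape, under stated assumptions: the `ContainerId` below is a simplified stand-in for the SDK class, `properties` is an assumed field name, and the dumped camelCase layout is a guess at the wire format rather than something shown in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ContainerId:
    # Simplified stand-in for the existing data modeling ContainerId.
    space: str
    external_id: str

    def dump(self) -> dict[str, str]:
        return {"space": self.space, "externalId": self.external_id, "type": "container"}


@dataclass
class RecordSource:
    # `source` is a ContainerId instead of a dedicated RecordSourceReference.
    source: ContainerId
    properties: dict[str, Any] = field(default_factory=dict)

    def dump(self) -> dict[str, Any]:
        # Assumed payload layout, for illustration only.
        return {"source": self.source.dump(), "properties": dict(self.properties)}


record_source = RecordSource(ContainerId("my_space", "my_container"), {"temperature": 21.5})
dumped = record_source.dump()
```

This keeps `RecordSource` small and free of the typed-instance machinery mentioned above.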

@haakonvt

One last question, the API has both ingest and upsert; what are your thoughts on keeping just ingest / both? 🤔
