[DCP - Ingestion Helper] Major code refactor for modularization and cleanliness#2036
gmechali wants to merge 3 commits into
…were getting cluttered. This way they are grouped. The core logic of embeddings and aggregations remains completely unchanged.
Code Review
This pull request refactors the ingestion-helper by modularizing the action logic into separate handler files and introduces a new action to clear the Redis cache. It also implements lazy initialization for the Spanner and Storage clients and adds comprehensive unit tests for the new handlers. A review comment identifies that the AggregationUtils constructor in the new aggregation handler is missing the location and is_base_dc parameters, which are necessary for environment-specific configurations.
aggregation = AggregationUtils(
    connection_id=FLAGS.spanner_connection_id,
    project_id=FLAGS.spanner_project_id,
    instance_id=FLAGS.spanner_instance_id,
    database_id=FLAGS.spanner_graph_database_id,
)
The AggregationUtils constructor should be updated to include the location and is_base_dc parameters. This ensures that environment-specific configurations for the BigQueryExecutor and LinkedEdgeGenerator are handled within the class constructor, promoting modularity and preventing incorrect defaults.
aggregation = AggregationUtils(
    connection_id=FLAGS.spanner_connection_id,
    project_id=FLAGS.spanner_project_id,
    instance_id=FLAGS.spanner_instance_id,
    database_id=FLAGS.spanner_graph_database_id,
    location=FLAGS.location,
    is_base_dc=FLAGS.is_base_dc,
)
References
- Refactor duplicated logic and environment-specific constants into class constructors to improve maintainability and modularity.
- Avoid hardcoding environment-specific values; instead, pass these values as parameters to ensure the code remains maintainable and portable.
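To illustrate the two guidelines above, here is a minimal sketch of a constructor that requires its environment-specific values instead of defaulting them. `AggregationUtilsSketch` and `ExecutorConfig` are hypothetical stand-ins written for this example; they are not the real `AggregationUtils` or `BigQueryExecutor`, whose internals are not shown in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorConfig:
    """Hypothetical settings object; stands in for whatever the executor needs."""
    project_id: str
    location: str


class AggregationUtilsSketch:
    """Hypothetical stand-in for AggregationUtils, for illustration only."""

    def __init__(self, *, connection_id, project_id, instance_id, database_id,
                 location, is_base_dc):
        # location and is_base_dc are keyword-only and have no defaults, so a
        # caller that forgets them fails immediately with a TypeError instead
        # of silently running with a value meant for another environment.
        self.connection_id = connection_id
        self.instance_id = instance_id
        self.database_id = database_id
        self.is_base_dc = is_base_dc
        self.executor_config = ExecutorConfig(project_id=project_id,
                                              location=location)

    def describe(self) -> str:
        """Summarize the environment this instance was configured for."""
        mode = "base" if self.is_base_dc else "custom"
        return f"{mode} DC aggregation in {self.executor_config.location}"
```

With this shape, the call flagged in the review (one that omits `location` and `is_base_dc`) would raise at construction time, which makes the omission visible in the handler's unit tests.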
The ingestion helper code has steadily grown more complicated and less structured. This refactor should clean that up and give a clear path forward for adding new responsibilities without cluttering the code.
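The modular handler layout and lazy client initialization mentioned in the review summary can be sketched roughly as follows. This is an assumption about the general pattern, not the PR's actual code: the `Clients` class, the `register` decorator, the `"action"` request key, and the action names are all hypothetical, and the client factories stand in for real Spanner and Storage client construction.

```python
from functools import cached_property
from typing import Callable, Dict


class Clients:
    """Builds each client on first access (hypothetical factories)."""

    def __init__(self, spanner_factory: Callable[[], object],
                 storage_factory: Callable[[], object]):
        self._spanner_factory = spanner_factory
        self._storage_factory = storage_factory

    @cached_property
    def spanner(self):
        # Runs once; the result is cached on the instance afterwards.
        return self._spanner_factory()

    @cached_property
    def storage(self):
        return self._storage_factory()


Handler = Callable[[Clients, dict], dict]
HANDLERS: Dict[str, Handler] = {}


def register(action: str):
    """Decorator that maps an action name to its handler function."""
    def decorator(func: Handler) -> Handler:
        HANDLERS[action] = func
        return func
    return decorator


@register("clear_cache")
def handle_clear_cache(clients: Clients, request: dict) -> dict:
    # Touches neither client, so neither is constructed.
    return {"status": "ok", "action": "clear_cache"}


@register("run_aggregation")
def handle_run_aggregation(clients: Clients, request: dict) -> dict:
    _ = clients.spanner  # first access triggers construction
    return {"status": "ok", "action": "run_aggregation"}


def dispatch(clients: Clients, request: dict) -> dict:
    """Route a request to its handler, or report an unknown action."""
    action = request.get("action")
    handler = HANDLERS.get(action)
    if handler is None:
        return {"status": "error", "message": f"unknown action: {action}"}
    return handler(clients, request)
```

Under this layout, adding a responsibility means adding one decorated function in its own file, and an action that needs no Spanner or Storage access never pays for client startup.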