add solutions for problems 1, 2, and 3 by krishnadheerajkrovi · Pull Request #29 · super30admin/Pandas10

krishnadheerajkrovi · 2026-04-05T19:01:08Z

Completed Pandas10

super30admin · 2026-04-07T18:45:03Z

Let's evaluate each solution one by one.

For problem1.py:

The problem requires grouping by 'sell_date' and aggregating the number of unique products and a sorted, comma-separated list of unique product names.
The student uses groupby and agg with 'nunique' for counting unique products and a lambda function to create the sorted comma-separated string. This is correct.
The lambda function uses sorted(x.unique()) which first gets unique products and then sorts them. This is efficient and meets the requirement.
The code then sorts by 'sell_date' and returns the result. The problem states the result should be sorted by sell_date, so this is correct.
Time complexity: O(n log n) in the worst case. The hash-based groupby is O(n) on average; sorting the unique products adds O(m log m) per group, where m is the number of unique products in that group. Since the number of distinct products per date is usually small compared to n, the sorting cost is minor in practice.
Space complexity: O(n) for the grouped data and the resulting DataFrame.
Code quality: Good. The code is concise and readable. A lambda passed to agg runs in Python once per group, so it is slower than a built-in aggregation such as 'nunique', but pandas has no built-in for a sorted string join, so it is a reasonable choice here. An equivalent alternative is lambda x: ','.join(sorted(set(x))); the student's x.unique() likewise removes duplicates before sorting.
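
For reference, the approach described above can be sketched as follows. This is a minimal sketch, not the student's actual code: the column name `product`, the output names `num_sold` and `products`, the function name, and the sample rows are all assumptions made for illustration.

```python
import pandas as pd


def group_sold_products(activities: pd.DataFrame) -> pd.DataFrame:
    # Column names 'product', 'num_sold', 'products' are assumed for this sketch.
    return (
        activities.groupby("sell_date", as_index=False)
        .agg(
            # Built-in aggregation: number of distinct products per date.
            num_sold=("product", "nunique"),
            # Custom aggregation: deduplicate first, then sort and join.
            products=("product", lambda x: ",".join(sorted(x.unique()))),
        )
        .sort_values("sell_date")
    )


# Hypothetical sample data, only to show the shape of the result.
activities = pd.DataFrame(
    {
        "sell_date": ["2020-06-02", "2020-05-30", "2020-05-30", "2020-05-30", "2020-06-02"],
        "product": ["Mask", "Headphone", "Basketball", "Headphone", "Pencil"],
    }
)
result = group_sold_products(activities)
print(result)
```

Named aggregation keeps the output column names explicit, so no renaming step is needed afterwards.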

For problem2.py:

The problem requires grouping by 'date_id' and 'make_name', then counting unique leads and unique partners.
The student uses groupby and agg with 'nunique' for both columns. This is correct and efficient.
The code returns the result directly without an explicit sort. The problem does not specify an order, so this is acceptable. Note that groupby sorts by the group keys by default (sort=True), so the output will already be ordered by date_id and make_name.
Time complexity: O(n) for the groupby operation.
Space complexity: O(n) for the grouped data.
Code quality: Good. The code is very concise and clear.
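
A sketch of the groupby-plus-nunique pattern described above, under stated assumptions: the columns `lead_id` and `partner_id`, the output names `unique_leads` and `unique_partners`, the function name, and the sample rows are hypothetical and may differ from the student's code.

```python
import pandas as pd


def daily_leads_and_partners(daily_sales: pd.DataFrame) -> pd.DataFrame:
    # 'lead_id' and 'partner_id' are assumed column names for this sketch.
    return daily_sales.groupby(["date_id", "make_name"], as_index=False).agg(
        unique_leads=("lead_id", "nunique"),
        unique_partners=("partner_id", "nunique"),
    )


# Hypothetical sample data.
daily_sales = pd.DataFrame(
    {
        "date_id": ["2020-12-8", "2020-12-8", "2020-12-7"],
        "make_name": ["toyota", "toyota", "honda"],
        "lead_id": [0, 1, 0],
        "partner_id": [1, 1, 2],
    }
)
result = daily_leads_and_partners(daily_sales)
print(result)
```

Because groupby sorts the keys by default, the honda row for 2020-12-7 comes before the toyota row for 2020-12-8 even though no sort is requested.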

For problem3.py:

The problem requires finding actor-director pairs who have collaborated at least 3 times.
The student groups by ['actor_id', 'director_id'] and uses size() to count the number of collaborations. Then filters for counts >=3 and returns the required columns.
This is correct and efficient.
Time complexity: O(n) for the groupby and O(n) for the filtering.
Space complexity: O(n) for the intermediate DataFrame.
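Code quality: Good. The code is clear. Note that .size() counts every row in a group, including rows with NaN in other columns, whereas .count() skips NaN values. Rows with NaN in the grouping keys themselves are dropped by groupby by default (dropna=True). Since each row here represents a collaboration, the key columns should not contain NaN, so .size() is safe.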
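
The size-then-filter approach could look roughly like this. It is an illustrative sketch rather than the student's code: the `timestamp` column, the helper column `count`, the function name, and the sample rows are assumptions. One timestamp is deliberately left missing to show that `.size()` still counts that row.

```python
import pandas as pd


def actors_and_directors(actor_director: pd.DataFrame) -> pd.DataFrame:
    # size() counts rows per (actor_id, director_id) pair, regardless of
    # NaN values in the other columns.
    counts = (
        actor_director.groupby(["actor_id", "director_id"])
        .size()
        .reset_index(name="count")  # 'count' is a helper column for this sketch
    )
    # Keep only pairs with at least 3 collaborations.
    return counts.loc[counts["count"] >= 3, ["actor_id", "director_id"]]


# Hypothetical sample data; 'timestamp' is an assumed column name.
actor_director = pd.DataFrame(
    {
        "actor_id": [1, 1, 1, 1, 1, 2],
        "director_id": [1, 1, 1, 2, 2, 1],
        "timestamp": [0, 1, None, 3, 4, 5],
    }
)
result = actors_and_directors(actor_director)
print(result)
```

Using `.count()` on `timestamp` instead would report only 2 for the pair (1, 1) in this sample, so the pair would be missed; that is the practical difference between the two methods.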

General feedback:

The student's solutions are correct and efficient for all three problems.
The code is concise and uses appropriate pandas functions.
The student includes comments explaining the approach, which is good for readability.
There are no major issues. The solutions follow standard pandas best practices.

add solutions for problems 1, 2, and 3

1aae1c0
