Redesign antispam architecture: eliminate suspicious users, implement trust-based system with per-group spammer management by Copilot · Pull Request #4 · insoln/buzz_buster

Copilot · 2025-10-16T17:47:38Z

✅ Implemented New Architecture: Eliminated Suspicious Users Concept

Based on @insoln's feedback, I've completely redesigned the antispam system to be simpler and more effective:

Key Changes:

1. Eliminated Suspicious Users Cache

❌ Removed suspicious_users_cache entirely
❌ No more complex "suspicious" state management
✅ Simplified to two clear states: trusted vs unknown

2. New Trust-Based System

🔍 Unknown users: No seen_message=TRUE in ANY group → run spam check
✅ Trusted users: Has seen_message=TRUE in at least one group → skip spam check globally
🚫 Spammers: Per-group spammer status instead of global

3. Per-Group Spammer Management

Before: Global spammer list (spammers_cache = set())
After: Per-group spammer tracking (spammers_cache = {user_id: {group_ids}})
Cross-group banning: If user is spammer anywhere → auto-ban in current group
Admin unban support: Removes spammer flag only for specific group
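
A minimal sketch of the per-group structure described above, assuming an in-memory dict keyed by user ID. The helper names `is_user_spammer_anywhere()` and `is_user_spammer_in_group()` come from this PR description; their signatures and the `mark_spammer_in_group()` helper are assumptions made for illustration, not the code in `bot/app/database.py`.

```python
from typing import Dict, Set

# Assumed shape: user_id -> set of group_ids where the user is flagged as a spammer.
spammers_cache: Dict[int, Set[int]] = {}


def mark_spammer_in_group(user_id: int, group_id: int) -> None:
    """Hypothetical helper: flag the user as a spammer in one group only."""
    spammers_cache.setdefault(user_id, set()).add(group_id)


def is_user_spammer_in_group(user_id: int, group_id: int) -> bool:
    """True only if the user is flagged in this specific group."""
    return group_id in spammers_cache.get(user_id, set())


def is_user_spammer_anywhere(user_id: int) -> bool:
    """True if the user is flagged in at least one group."""
    return bool(spammers_cache.get(user_id))
```

Keeping a set of group IDs per user is what would let a single group's flag be removed later without touching the others.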

4. New Message Processing Logic

# Check if user is spammer anywhere → ban in current group
if is_user_spammer_anywhere(user.id):
    ban_and_mark_spammer_in_group()

# Check if user is trusted globally → skip spam check
elif is_user_trusted(user.id):
    mark_seen_message_in_current_group()

# Unknown user → run spam check
else:
    spam = run_spam_check()
    if spam:
        mark_as_spammer_in_group()
    else:
        mark_as_trusted()
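
The branching above can also be sketched as a self-contained function that returns the decision instead of calling Telegram. Everything here is hypothetical: the `Action` enum, the `process_message()` signature, and the `is_spam` callback stand in for whatever the bot actually uses.

```python
from enum import Enum
from typing import Callable, Dict, Set


class Action(Enum):
    BAN = "ban"                    # known spammer somewhere: ban in this group too
    SKIP_CHECK = "skip_check"      # trusted user: no spam check
    MARK_SPAMMER = "mark_spammer"  # unknown user whose message looked like spam
    TRUST = "trust"                # unknown user whose message passed the check


def process_message(
    user_id: int,
    group_id: int,
    text: str,
    trusted: Set[int],
    spammers: Dict[int, Set[int]],
    is_spam: Callable[[str], bool],
) -> Action:
    # 1. Spammer in any group: propagate the flag to the current group and ban.
    if spammers.get(user_id):
        spammers[user_id].add(group_id)
        return Action.BAN
    # 2. Trusted globally: skip the spam check.
    if user_id in trusted:
        return Action.SKIP_CHECK
    # 3. Unknown user: run the check and record the outcome.
    if is_spam(text):
        spammers.setdefault(user_id, set()).add(group_id)
        return Action.MARK_SPAMMER
    trusted.add(user_id)
    return Action.TRUST
```

Returning a value rather than performing the ban keeps the decision logic testable without a live bot.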

5. Admin Unban Handling

Detects when admin unbans a user (status change: banned → member)
Removes spammer flag only for that specific group
Preserves spammer status in other groups
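
A rough sketch of the unban handling, assuming membership statuses are available as plain strings. The `"banned"` and `"member"` values mirror the wording above and are placeholders, not the constants of any particular Telegram library; `is_admin_unban()` and `clear_spammer_flag()` are hypothetical names.

```python
from typing import Dict, Set


def is_admin_unban(old_status: str, new_status: str) -> bool:
    """Treat a banned -> member transition as an admin unban (assumed status names)."""
    return old_status == "banned" and new_status == "member"


def clear_spammer_flag(spammers: Dict[int, Set[int]], user_id: int, group_id: int) -> None:
    """Remove the spammer flag for one group, leaving other groups untouched."""
    groups = spammers.get(user_id)
    if groups is None:
        return  # user was never flagged; nothing to do
    groups.discard(group_id)
    if not groups:
        # Drop the empty entry so "spammer anywhere" checks stay accurate.
        del spammers[user_id]
```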

6. Fixed Group Join Logic ⚡ Updated

Issue: Was only checking if user is spammer in current group
Fix: Now checks if user is spammer in ANY group and bans accordingly
Behavior: Cross-group spammer detection on join attempts

7. Updated Test Suite 🧪 Latest Update

Fixed: Eliminated all references to outdated suspicious_users_cache
Fixed: Removed references to non-existent suspicious database column
Added: Tests for new trust-based system functionality
Added: Tests for per-group spammer tracking and cross-group detection
Updated: Tests now properly reflect the new architecture

Benefits:

✅ Simpler logic: Only 2 states instead of 3 (trusted/unknown vs suspicious/trusted/spammer)
✅ Better UX: Once trusted in one group, trusted everywhere
✅ Granular control: Per-group spammer management allows selective unbanning
✅ Cross-group protection: Spammers detected anywhere are blocked everywhere
✅ Admin control: Unbanning works as expected without global impact

Database Changes:

spammers_cache structure: {user_id: set(group_ids)}
Added helper functions: is_user_trusted(), is_user_spammer_anywhere(), is_user_spammer_in_group()
Enhanced unban detection and handling
Fixed cross-group spammer detection on group joins
Updated tests to match new architecture without outdated references
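
As an illustration of how both caches could be rebuilt from `user_entries` rows after a restart, here is a sketch that works on plain tuples. The row layout and the `build_caches()` name are assumptions; the actual `load_user_caches()` in `bot/app/database.py` may differ.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed row layout: (user_id, group_id, seen_message, spammer)
Row = Tuple[int, int, bool, bool]


def build_caches(rows: Iterable[Row]) -> Tuple[Set[int], Dict[int, Set[int]]]:
    """Derive the trusted set and the per-group spammer map from stored rows."""
    trusted: Set[int] = set()
    spammers: Dict[int, Set[int]] = {}
    for user_id, group_id, seen_message, spammer in rows:
        if seen_message:
            # seen_message in any group makes the user trusted everywhere.
            trusted.add(user_id)
        if spammer:
            spammers.setdefault(user_id, set()).add(group_id)
    return trusted, spammers
```

A user with rows but neither flag set ends up in neither cache, which corresponds to the "unknown" state.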

This addresses @insoln's request to eliminate the suspicious users concept while providing better spam protection and admin control. The test suite now properly validates the new trust-based system.

Original prompt

This section details the original issue you should resolve

<issue_title>Preserve the status of suspicious users across service restarts</issue_title>
<issue_description>Currently, after a service restart, users who were marked as suspicious lose that status and are no longer tracked by the system as suspicious. This reduces the effectiveness of antispam filtering, because potential offenders can avoid being blocked.

Expected behavior:

The "suspicious" status should persist across service restarts.
</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #3


Co-authored-by: insoln <11380274+insoln@users.noreply.github.com>

insoln · 2025-10-16T18:04:39Z

@copilot isn't seen_message=0 already the suspiciousness flag?

Copilot · 2025-10-16T18:07:03Z

@copilot isn't seen_message=0 already the suspiciousness flag?

You're absolutely right! The seen_message = FALSE flag was already serving as the suspicious user persistence mechanism. Users who join groups get seen_message = FALSE by default, and load_user_caches() was already loading these users as suspicious on restart. My addition of a separate suspicious field is redundant.

The original system was working correctly:

User joins → seen_message = FALSE → loaded as suspicious after restart
User sends message → seen_message = TRUE → no longer suspicious after restart

Should I revert the changes and keep the simpler original approach using just seen_message?

insoln · 2025-10-16T20:14:31Z

@copilot let's get rid of suspiciousness altogether: either the user is in the database and is known to definitely not be a spammer (seen_message=1 in any group), or the user is not in the database, in which case we run a spam check, and if it reports that the message is spam, we mark the user as a spammer (in this group only).

If any message arrives in any group from a user who has the spam flag in at least one group, we ban them in this group and set the spam flag in this group. And if we receive a notification that an admin has unbanned the user in a group, we clear the spammer flag in that group.

Copilot · 2025-10-16T20:17:23Z

@copilot let's get rid of suspiciousness altogether: either the user is in the database and is known to definitely not be a sp...

I'll implement this new architecture that eliminates the suspicious users concept. The new logic will be:

Remove suspicious_users_cache entirely
For any message: check if user has seen_message=1 in ANY group → if yes, skip spam check
If no seen_message=1 found → run spam check
Spammer status becomes per-group instead of global
Cross-group banning: spammer in any group gets banned in current group
Handle admin unbans to remove per-group spammer flags

Let me implement these changes.


Copilot

Pull Request Overview

This PR completely redesigns the antispam architecture by eliminating the "suspicious users" concept and implementing a trust-based system with per-group spammer management. The change addresses the original issue of preserving suspicious user status across restarts, but does so by fundamentally changing the approach from a three-state system (suspicious/trusted/spammer) to a two-state system (trusted/unknown).

Key changes include:

Replaced global spammer tracking with per-group spammer management
Implemented trust-based user classification where users trusted in any group skip spam checks globally
Added cross-group spammer detection and automatic banning
Enhanced admin unban support with per-group spammer flag removal

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| `bot/app/telegram_messages.py` | Implements new trust-based message processing logic with cross-group spammer detection |
| `bot/app/telegram_groupmembership.py` | Updates group membership handling for per-group spammer tracking and admin unban detection |
| `bot/app/database.py` | Refactors cache structure and adds helper functions for trust and spammer status checks |
| `.devcontainer/tests/test_suspicious_persistence.py` | Adds comprehensive tests for the new architecture (contains outdated references) |


Copilot · 2025-10-16T20:29:42Z

+from app.database import (
+    check_and_create_tables,
+    load_user_caches, 
+    suspicious_users_cache,


Import references suspicious_users_cache which was removed in the new architecture. This import will cause an ImportError.

Copilot · 2025-10-16T20:29:42Z

+                INSERT INTO user_entries (user_id, group_id, join_date, suspicious, seen_message, spammer)
+                VALUES (%s, %s, NOW(), TRUE, FALSE, FALSE)


The suspicious column does not exist in the database schema. According to the check_and_create_tables function, the user_entries table only has seen_message and spammer columns.

Suggested change

-INSERT INTO user_entries (user_id, group_id, join_date, suspicious, seen_message, spammer)
-VALUES (%s, %s, NOW(), TRUE, FALSE, FALSE)
+INSERT INTO user_entries (user_id, group_id, join_date, seen_message, spammer)
+VALUES (%s, %s, NOW(), FALSE, FALSE)
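
To illustrate why the original statement fails, the following sketch reproduces the situation with Python's built-in `sqlite3` module as a stand-in. The project's real database and driver are not shown in this thread (the `%s` placeholders suggest a different driver), so the table definition, column types, and the `try_insert()` helper here are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: only seen_message and spammer flags, no suspicious column.
conn.execute(
    "CREATE TABLE user_entries ("
    "user_id INTEGER, group_id INTEGER, join_date TEXT, "
    "seen_message INTEGER, spammer INTEGER)"
)


def try_insert(sql, params):
    """Return None on success, or the error message if the statement fails."""
    try:
        conn.execute(sql, params)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


old_sql = (
    "INSERT INTO user_entries "
    "(user_id, group_id, join_date, suspicious, seen_message, spammer) "
    "VALUES (?, ?, datetime('now'), 1, 0, 0)"
)
new_sql = (
    "INSERT INTO user_entries "
    "(user_id, group_id, join_date, seen_message, spammer) "
    "VALUES (?, ?, datetime('now'), 0, 0)"
)

old_error = try_insert(old_sql, (1, 10))  # fails: unknown column
new_error = try_insert(new_sql, (1, 10))  # succeeds
```

In this stand-in, the first statement is rejected because the column is missing, while the suggested version inserts one row.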

Copilot · 2025-10-16T20:29:42Z

+            load_user_caches()
+
+            # Verify that the user is loaded as suspicious
+            assert test_user_id in suspicious_users_cache, f"User {test_user_id} should be in suspicious cache after restart"


Test references suspicious_users_cache which no longer exists in the new architecture. This will cause a NameError.

Copilot · 2025-10-16T20:29:43Z

+                INSERT INTO user_entries (user_id, group_id, join_date, suspicious, seen_message, spammer)
+                VALUES (%s, %s, NOW(), TRUE, FALSE, FALSE)


The suspicious column does not exist in the database schema. This will cause a SQL error.

Copilot · 2025-10-16T20:29:43Z

+            # Simulate user sending a good message (not spam)
+            cursor.execute(
+                """
+                UPDATE user_entries SET seen_message = TRUE, spammer = FALSE, suspicious = FALSE WHERE user_id = %s


References non-existent suspicious column. This will cause a SQL error.

Copilot · 2025-10-16T20:29:43Z

+            load_user_caches()
+
+            # Verify that the user is no longer suspicious
+            assert test_user_id not in suspicious_users_cache, f"User {test_user_id} should not be in suspicious cache after good message"


References suspicious_users_cache which no longer exists. This will cause a NameError.

Copilot · 2025-10-16T20:29:43Z

+                INSERT INTO user_entries (user_id, group_id, join_date, suspicious, seen_message, spammer)
+                VALUES (%s, %s, NOW(), TRUE, FALSE, FALSE)


The suspicious column does not exist in the database schema. This will cause a SQL error.

Copilot · 2025-10-16T20:29:44Z

+            # Simulate user being marked as spammer
+            cursor.execute(
+                """
+                UPDATE user_entries SET spammer = TRUE, suspicious = FALSE where user_id=%s and group_id = %s


References non-existent suspicious column. This will cause a SQL error.

Copilot · 2025-10-16T20:29:44Z

+
+            # Verify that the user is now in spammers cache and not suspicious
+            assert test_user_id in spammers_cache, f"User {test_user_id} should be in spammers cache"
+            assert test_user_id not in suspicious_users_cache, f"User {test_user_id} should not be in suspicious cache after being marked as spammer"


References suspicious_users_cache which no longer exists. This will cause a NameError.

insoln

@copilot

insoln · 2025-10-16T20:31:44Z

    # Check whether the user has rejoined the group
    if member.status == ChatMemberStatus.MEMBER:
-        if member.user.id in spammers_cache:
+        # Check: if the user is a spammer in this group, ban them


if they are a spammer in any group, ban them

Fixed in commit 5d7e232. Changed the logic to use is_user_spammer_anywhere() instead of is_user_spammer_in_group() so that users who are spammers in ANY group get banned when joining any group. Also added logic to mark them as spammers in the current group and update the database.


insoln · 2025-10-16T20:47:34Z

@copilot please take a look at review #4 (review)

Copilot · 2025-10-16T20:49:41Z

@copilot please take a look at review #4 (review)

I'll fix the test file issues identified in the review. The test file contains references to the old suspicious_users_cache and suspicious database column that no longer exist in the new architecture. Let me update it to test the new trust-based system instead.


Initial plan

34ab1ea

Copilot AI assigned Copilot and insoln Oct 16, 2025

Copilot started work on behalf of insoln October 16, 2025 17:48

Copilot AI and others added 2 commits October 16, 2025 17:50

Initial plan to fix suspicious user status persistence

097c19e

Co-authored-by: insoln <11380274+insoln@users.noreply.github.com>

Add suspicious field to database schema and update persistence logic

17545e2

Co-authored-by: insoln <11380274+insoln@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Save status of suspicious users on service restart~~ Fix: Persist suspicious user status across service restarts Oct 16, 2025

Copilot AI requested a review from insoln October 16, 2025 18:00

Copilot finished work on behalf of insoln October 16, 2025 18:00

Copilot started work on behalf of insoln October 16, 2025 18:05

Copilot AI changed the title ~~Fix: Persist suspicious user status across service restarts~~ Redundant: Add explicit suspicious field (seen_message=FALSE already provides persistence) Oct 16, 2025

Copilot finished work on behalf of insoln October 16, 2025 18:08

Copilot started work on behalf of insoln October 16, 2025 20:16

Implement new architecture: eliminate suspicious users, per-group spa…

afee748

…mmers, trusted users Co-authored-by: insoln <11380274+insoln@users.noreply.github.com>

Copilot AI changed the title ~~Redundant: Add explicit suspicious field (seen_message=FALSE already provides persistence)~~ Redesign antispam architecture: eliminate suspicious users, implement trust-based system with per-group spammer management Oct 16, 2025

Copilot finished work on behalf of insoln October 16, 2025 20:24

insoln marked this pull request as ready for review October 16, 2025 20:27

insoln requested a review from Copilot October 16, 2025 20:27

Copilot AI reviewed Oct 16, 2025


insoln reviewed Oct 16, 2025


Copilot started work on behalf of insoln October 16, 2025 20:41

Fix group join logic: ban users who are spammers in ANY group, not ju…

dec725a

…st current group Co-authored-by: insoln <11380274+insoln@users.noreply.github.com>

Copilot AI requested a review from insoln October 16, 2025 20:45

Copilot finished work on behalf of insoln October 16, 2025 20:45

Copilot started work on behalf of insoln October 16, 2025 20:48

Fix test file to match new trust-based architecture and eliminate out…

2adad0b

…dated references Co-authored-by: insoln <11380274+insoln@users.noreply.github.com>

Copilot finished work on behalf of insoln October 16, 2025 20:53
