Feature: Lazy Loading Architecture for Large Repository Performance #73
vovarbv wants to merge 2 commits into
This commit introduces a lazy loading architecture to significantly improve performance and UI responsiveness when handling large codebases. The previous implementation loaded all file contents and tokenized them upfront, causing severe slowdowns and freezing on large folders. This new architecture addresses these issues by:
- Performing a lightweight initial scan that only gathers file metadata and provides token estimations based on file type and size.
- Deferring the expensive work of reading file content and performing accurate tokenization until a file is explicitly selected by the user.
- Introducing UI components to handle large folder warnings (LargeFolderModal, LargeSubfolderModal) and provide clear user feedback during on-demand processing (ProcessingOverlay).
- Refactoring App.tsx by extracting workspace logic into a dedicated useWorkspaces hook to improve state management and readability.
- Updating documentation to reflect the new architecture (ARCHITECTURE.md, lazy-loading.md).
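The size-based token estimation described in this commit message could look roughly like the sketch below. It is not the PR's actual code: the per-extension ratios, the `estimateTokens` and `formatTokenLabel` names, and the `FileMeta` shape are all assumptions made for illustration. Only the `isTokenEstimate` flag and the `~123 est` label style come from the PR description.

```typescript
// Hypothetical bytes-per-token ratios by extension.
// These numbers are illustrative assumptions, not values taken from the PR.
const BYTES_PER_TOKEN = new Map<string, number>([
  [".ts", 3.5],
  [".json", 3.0],
  [".md", 4.0],
]);
const DEFAULT_BYTES_PER_TOKEN = 4.0;

// Metadata gathered by the lightweight scan (no file content is read).
interface FileMeta {
  path: string;
  sizeBytes: number;
}

interface TokenInfo {
  tokenCount: number;
  isTokenEstimate: boolean; // true until the file has been tokenized for real
}

// Returns the lowercased extension including the dot, or "" if there is none.
function extensionOf(path: string): string {
  const name = path.split(/[\\/]/).pop() ?? "";
  const dot = name.lastIndexOf(".");
  return dot > 0 ? name.slice(dot).toLowerCase() : "";
}

// Estimates a token count from file size and type alone.
function estimateTokens(meta: FileMeta): TokenInfo {
  const ratio =
    BYTES_PER_TOKEN.get(extensionOf(meta.path)) ?? DEFAULT_BYTES_PER_TOKEN;
  return {
    tokenCount: Math.ceil(meta.sizeBytes / ratio),
    isTokenEstimate: true,
  };
}

// Formats the count for display, marking estimates as such.
function formatTokenLabel(info: TokenInfo): string {
  return info.isTokenEstimate
    ? `~${info.tokenCount} est`
    : `${info.tokenCount}`;
}
```

Because the estimate needs only a path and a size, it can be computed for every file during the initial scan without touching file contents.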
Hey @vovarbv, this looks pretty good. I've done some manual testing both on my Linux system and my Windows system, and there seem to be some issues.
Issues With Sidebar and File Tree Handling:
Issues with loading Small Repository:
Issues With Binary Files Handling:
What I recommend:
These issues seem to happen on both systems, so I suspect they are not platform-specific.
Tested on:
Hi @haikalllp - You’re absolutely right, that was my oversight. I added it for my own convenience and didn’t have the chance to test it thoroughly. I’ll put together a more robust solution shortly. It’s tough to nail every detail on the first pass, so please forgive any slip-ups. I really enjoy this project and use it often!
This is excellent, because I'm getting issues with a project with 400+ folders and 2000+ files. It just locks up and then reverts to an empty result, so it never finishes its analysis. Just some immediate feedback, but I can try the changes as soon as I have time. Thanks sirs.
@RagingKore For now, just try adding folders to the ignore filter. Some binary files and large build files might be getting processed, which causes the errors and empty results.
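As a rough illustration of what such filtering does, the sketch below skips paths that sit inside ignored folders or that carry a binary extension. It is not PasteMax's ignore filter: the `shouldSkip` name and both lists are hypothetical and chosen only for this example.

```typescript
// Hypothetical ignore lists; a real project would configure its own.
const IGNORED_DIRS = new Set(["node_modules", "dist", "build", ".git"]);
const BINARY_EXTENSIONS = new Set([".png", ".jpg", ".zip", ".exe", ".pdf"]);

// Returns true when a relative path should be left out of processing.
function shouldSkip(relativePath: string): boolean {
  // Accept both "/" and "\" so Linux and Windows paths behave the same.
  const segments = relativePath.split(/[\\/]/);
  const fileName = segments[segments.length - 1] ?? "";

  // Skip anything nested under an ignored directory.
  if (segments.slice(0, -1).some((s) => IGNORED_DIRS.has(s))) {
    return true;
  }

  // Skip files whose extension marks them as binary.
  const dot = fileName.lastIndexOf(".");
  const ext = dot > 0 ? fileName.slice(dot).toLowerCase() : "";
  return BINARY_EXTENSIONS.has(ext);
}
```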
Description
This pull request introduces a fundamental shift in how PasteMax handles file systems by implementing a lazy loading architecture. The primary goal is to eliminate UI freezing and dramatically improve initial load times when working with large repositories, making the application scalable and responsive.
Previously, the app would read and tokenize every file in a selected folder upfront, which was not feasible for large codebases. Now, it performs a quick metadata scan and only processes files on demand.
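The on-demand step can be pictured with the minimal sketch below, which reads and tokenizes a file only when it is first selected and reuses the result afterwards. The `createLazyLoader` name, the `FileEntry` shape, and the injected `readFile` and `countTokens` functions are assumptions for illustration, not the PR's implementation; in practice `countTokens` would be backed by a real tokenizer.

```typescript
type ReadFile = (path: string) => Promise<string>;
type CountTokens = (text: string) => number;

// One entry per scanned file; content stays undefined until selection.
interface FileEntry {
  path: string;
  sizeBytes: number;
  isTokenEstimate: boolean;
  content?: string;
  tokenCount?: number;
}

// Builds a select() function that loads file content lazily.
function createLazyLoader(
  entries: FileEntry[],
  readFile: ReadFile,
  countTokens: CountTokens,
): (path: string) => Promise<FileEntry> {
  const byPath = new Map(entries.map((e) => [e.path, e] as const));
  // Tracks loads in progress so concurrent selections share one read.
  const inFlight = new Map<string, Promise<FileEntry>>();

  async function load(entry: FileEntry): Promise<FileEntry> {
    const content = await readFile(entry.path);
    entry.content = content;
    entry.tokenCount = countTokens(content);
    entry.isTokenEstimate = false; // the count is now exact
    return entry;
  }

  return function select(path: string): Promise<FileEntry> {
    const entry = byPath.get(path);
    if (!entry) return Promise.reject(new Error(`Unknown file: ${path}`));
    // Already processed: return the cached entry without re-reading.
    if (entry.content !== undefined) return Promise.resolve(entry);

    let pending = inFlight.get(path);
    if (!pending) {
      pending = load(entry).then(
        (loaded) => {
          inFlight.delete(path);
          return loaded;
        },
        (err) => {
          inFlight.delete(path); // allow a retry after a failed read
          throw err;
        },
      );
      inFlight.set(path, pending);
    }
    return pending;
  };
}
```

Sharing the in-flight promise means that selecting the same file twice in quick succession triggers a single read, which matters when a whole folder is selected at once.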
Key Changes
🚀 Lazy Loading & Token Estimation:
~123 est) and a loading spinner for files being processed.
Modals for Large Folders:
- `LargeFolderModal` now warns the user if a selected directory is very large, giving them options to proceed, load with files deselected, or cancel.
- `LargeSubfolderModal` provides a similar warning when selecting a large subfolder from the file tree.
✨ UI & UX Improvements:
- `ProcessingOverlay` now appears during batch operations to provide clear feedback.
- The file tree (`TreeItem.tsx`) and file list (`FileCard.tsx`, `FileList.tsx`) have been updated to handle the new `isTokenEstimate` state and display loading indicators.
- `CopyButton.tsx` logic has been refactored to be more flexible.
🏗️ Architectural Refactoring:
- Extracted workspace logic from `App.tsx` into a new `useWorkspaces.ts` hook for better separation of concerns.
- gpt-3-encoder).
📚 Documentation:
- `ARCHITECTURE.md` to provide a high-level overview of the application structure.
- `docs/features/lazy-loading.md` to detail the new performance architecture.
- `README.md` and `CONTRIBUTING.md` with the latest development workflow and project info.
How to Test
Test with a small repository:
Test with a very large repository (e.g., >10,000 files):
`LargeSubfolderModal` appears.
Verify Documentation:
This PR resolves major performance bottlenecks and sets a strong foundation for future scalability. Looking forward to your feedback!
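For reviewers, the large-folder safeguard listed under Key Changes can be sketched as a small decision function. This is an illustration under stated assumptions, not the PR's code: the threshold value, the `needsLargeFolderWarning` and `applyChoice` names, and the choice that "proceed" keeps the current selection unchanged are all hypothetical. Only the three options (proceed, load with files deselected, cancel) come from the description.

```typescript
// The three options offered by the large-folder warning.
type LargeFolderChoice = "proceed" | "loadDeselected" | "cancel";

// Hypothetical cutoff; the PR description does not state the real value.
const LARGE_FOLDER_FILE_THRESHOLD = 1000;

interface ScannedFile {
  path: string;
  selected: boolean;
}

// Decides whether the warning should be shown for a scanned folder.
function needsLargeFolderWarning(
  fileCount: number,
  threshold: number = LARGE_FOLDER_FILE_THRESHOLD,
): boolean {
  return fileCount > threshold;
}

// Applies the user's choice; null means the load was cancelled.
function applyChoice(
  files: ScannedFile[],
  choice: LargeFolderChoice,
): ScannedFile[] | null {
  switch (choice) {
    case "proceed":
      // Assumption: proceeding keeps whatever selection the scan produced.
      return files;
    case "loadDeselected":
      // Show the tree, but start with nothing selected.
      return files.map((f) => ({ ...f, selected: false }));
    case "cancel":
      return null;
  }
}
```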