Add built-in Python source analyzer (sgraph.analyzers)#157
Open
villelaitila wants to merge 1 commit into
Introduce sgraph.analyzers, a built-in analysis layer that turns a Python
project directory into an SGraph model directly, without an external analyzer.
Public API:

    from sgraph.analyzers import analyze_python
    result = analyze_python("./src")
    result.graph.to_xml("model.xml")

Implementation technique
- Pure standard-library parsing via the `ast` module, with no third-party
  Python parser. Each file is parsed with `ast.parse` and walked by an
  `ast.NodeVisitor` subclass.
- Name/decorator/annotation extraction uses structural pattern matching
  (`match`/`case`) over AST expression nodes.
- Two-phase pipeline: (1) discover + parse every file, build the element tree
  and register modules; (2) resolve collected import statements into
  dependency edges once the whole module registry is known. Deferring import
  resolution to phase 2 lets forward references and cross-module imports
  resolve regardless of file processing order.
Architecture
- base.py: shared, framework-agnostic types, namely AnalyzerConfig (root path,
  include/exclude globs, external-import + stdlib toggles), AnalysisResult
  (graph + errors + stats + summary), AnalysisError, SourceLocation, and the
  AnalysisLevel enum (PACKAGES_ONLY < FILES < CLASSES < FUNCTIONS < FULL) that
  controls how deep the model is built.
- code/base.py: language-agnostic source-file plumbing, covering glob-based
  discovery with exclude filtering, file path -> module path mapping, and
  encoding-tolerant reading (utf-8 with latin-1 fallback).
- code/python/: the Python implementation.
  - python_analyzer.py orchestrates the two phases and maps the filesystem
    into SElements (packages from __init__.py, modules from files, nested
    classes/functions/methods), then builds deduplicated import/from_import
    SElementAssociations.
  - ast_visitor.py creates SElements per scope according to AnalysisLevel
    and records decorators, parameters and return-type annotations at FULL.
  - import_resolver.py resolves both absolute and relative imports
    (dot-level aware) against the module registry with parent-module
    fallback; external/stdlib targets are skipped unless configured.
- database/ and infrastructure/ are reserved namespaces for future analyzers.
Levels let callers trade detail for speed, from a coarse package/file graph up
to a full model with classes, functions, parameters and decorators.
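
One way the ordered levels could gate model depth is sketched below. The use of `IntEnum`, the numeric values and the `emitted_kinds` helper are assumptions; only the member names and their ordering come from the description above.

```python
from enum import IntEnum


class AnalysisLevel(IntEnum):
    """Depth levels in increasing order of detail (values are assumed)."""
    PACKAGES_ONLY = 1
    FILES = 2
    CLASSES = 3
    FUNCTIONS = 4
    FULL = 5


def emitted_kinds(level: AnalysisLevel) -> list[str]:
    """Hypothetical helper: which element kinds a given level would emit."""
    kinds = ["package"]
    if level >= AnalysisLevel.FILES:
        kinds.append("module")
    if level >= AnalysisLevel.CLASSES:
        kinds.append("class")
    if level >= AnalysisLevel.FUNCTIONS:
        kinds.append("function")
    if level >= AnalysisLevel.FULL:
        kinds.append("signature details")  # parameters, return types, decorators
    return kinds
```

Because the members compare by value, a visitor can guard each scope with a single `>=` check instead of enumerating levels.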
Tests: 27 tests covering source discovery, module-path mapping, level
gating, class/function extraction, and absolute/relative import resolution.
Softagram Impact Report for pull/157 (head commit: f5e847e)
TL;DR: Changed code files: 15 | Directly impacted code files: 0


## Summary

Adds `sgraph.analyzers`, a built-in analysis layer that turns a Python project directory into an `SGraph` model directly, with no external analyzer required.

## Implementation technique

- Standard-library `ast` only, no third-party Python parser. Each file is parsed with `ast.parse` and walked by an `ast.NodeVisitor` subclass.
- Structural pattern matching (`match`/`case`) drives name, decorator and type-annotation extraction over AST expression nodes.
- Two-phase pipeline:
  1. Discover and parse every file, build the element tree and register modules.
  2. Resolve the collected import statements into dependency edges once the whole module registry is known.

Deferring import resolution to phase 2 means forward references and cross-module imports resolve regardless of file processing order. Duplicate edges are suppressed and self-references skipped.

## Architecture

| Module | Responsibility |
| --- | --- |
| `analyzers/base.py` | Shared, framework-agnostic types: `AnalyzerConfig` (root path, include/exclude globs, external-import + stdlib toggles), `AnalysisResult` (graph + errors + stats + `summary()`), `AnalysisError`, `SourceLocation`, and the `AnalysisLevel` enum. |
| `analyzers/code/base.py` | Language-agnostic source-file plumbing: glob-based discovery with exclude filtering, file path -> module path mapping, and encoding-tolerant reading (utf-8 with latin-1 fallback). |
| `analyzers/code/python/python_analyzer.py` | Orchestrates the two phases, maps the filesystem into `SElement`s (packages from `__init__.py`, modules from files, nested classes/functions/methods) and builds the `import`/`from_import` `SElementAssociation`s. |
| `analyzers/code/python/ast_visitor.py` | Creates `SElement`s per scope according to the configured level; records decorators, parameters and return-type annotations at `FULL`. |
| `analyzers/code/python/import_resolver.py` | Resolves absolute and relative imports (dot-level aware) against the module registry with parent-module fallback; external/stdlib targets are skipped unless configured. |
| `analyzers/database/`, `analyzers/infrastructure/` | Reserved namespaces for future analyzers. |

## Analysis levels

`AnalysisLevel` lets callers trade detail for speed:

- `PACKAGES_ONLY` / `FILES`: coarse structural graph
- `CLASSES` / `FUNCTIONS`: nested classes, functions and methods
- `FULL`: additionally captures parameters, return types and decorators

## Tests

27 tests covering source discovery, module-path mapping, level gating, class/function extraction, and absolute/relative import resolution. All passing locally (`pytest tests/analyzers/`).

## Notes for reviewers

- `database/` and `infrastructure/` ship as empty reserved packages (placeholders for upcoming analyzers).