feat(analyzer): Add third-party + stdlib taint passThrough propagators#154
Open
misonijnik wants to merge 8 commits into
Open
feat(analyzer): Add third-party + stdlib taint passThrough propagators#154misonijnik wants to merge 8 commits into
misonijnik wants to merge 8 commits into
Conversation
a6295e0 to
050c869
Compare
Extract the passThrough (propagator) rules from misonijnik/codeql-rules that the rule-test suite depends on but the main-built analyzer jar does not yet bundle. Adds 13 per-jar propagator files under config/jar-split and extends stdlib.yaml, so a future analyzer release ships them by default and the rules CI no longer needs to overlay them.
…overage Refine the propagators added in 309492b using the engine's actual propagation semantics (no implicit this->result for unmodeled library calls; whole-object taint subsumes field reads but virtual-field taint does not satisfy a whole-object sink). Fix broken fluent chains by modeling every intermediate link, not just the endpoints: - okhttp Request.Builder: cover all builder methods, not only url/build. - spring-web RequestEntity: cover all static factories and the HeadersBuilder/BodyBuilder chain, not only get/build. - java.net.http HttpRequest.Builder: cover header/POST/PUT/method/etc., not only uri/GET/build. Broaden per-package coverage: - commons-io IOUtils: toByteArray/toCharArray/readLines. - commons-codec Base64: URL-safe and chunked encode variants. - java.util.Base64.Decoder: decode/wrap (decode direction). - spring-jdbc: substituteNamedParameters/parseSqlStatementIntoString. - spring-ldap: LdapQueryBuilder config carriers + gte/lte/not (iface+impl). - velocity: Context interface put. - groovy CompilationUnit.addSource: arg(*) to cover File/URL/SourceUnit overloads where the source is the first argument. Wrapper constructors keep whole-object arg(*)->this on purpose: their sinks consume the whole object, so a virtual-field-only taint would be a false negative.
… fields Make the new propagators precise and type-correct. A bare `arg -> this` (or `arg -> result`) makes the whole object abstract-tainted, so every later field read of it matches via abstract refinement — over-tainting — and copies a value of one type onto an object of another. Instead, store the differently-typed argument into a `<rule-storage>` virtual field of the target: from: arg(i) to: - this # or result - .java.lang.Object#<rule-storage>#java.lang.Object This is precise (reading unrelated real fields no longer matches, since the node is no longer abstract) and type-safe (a field accessor owned by java.lang.Object is always type-compatible, so the slot survives every chain step and the Builder -> Product type change of build()/body() under a plain this->result carry). A sink that consumes the whole object still matches the slot by prefix. Applied to: StringEntity, hudson.FilePath, unboundid SearchRequest, JMXServiceURL/JMXConnectorFactory, StreamSource, java.net.URL(String), VelocityContext.put, CompilationUnit.addSource, ant FileSet set*, and the okhttp Request.Builder / java.net.http HttpRequest.Builder / spring-web RequestEntity / spring-ldap query-builder chains. Builder data methods store into both result (chained `a().b()`) and this (non-chained `x.a(); x.b()`) and carry this->result; every chain link keeps this->result so the chain survives. Pure transforms that *produce* the tainted value (toString/decode/getBytes/URI.create/MVEL.compile/ substituteNamedParameters/iterator.next) stay bare, since the result is itself the tainted value with no surrounding state to over-taint.
Several new propagators repeated rules already present in other bundled config files (all .yaml under /config are concatenated at load time, so a second copy of the same function is a real runtime duplicate). Removed: - commons-codec: Base64#encodeBase64 / #decodeBase64 — already in config.yaml (identical arg(0)->result). Kept the String / URL-safe / chunked variants, which are not. - stdlib: dropped the catch-all (no-signature) entries that existing signature-specific rules already cover better: String#getBytes, URI#create, Enumeration#nextElement (existing () entry already has this->result), URL#<init>(String) (existing same-signature entry), Base64$Decoder#decode (existing typed overloads), and StreamSource#<init> (jmod.yaml has all typed overloads). Also dropped Collection#iterator (covered by the kept Iterable#iterator via override). Kept Iterable#iterator / Iterator#next generics — the existing entries only model element-level (.Element) flow, so these add the whole-collection this->result carry. - spring-web-5.3.39: dropped the RequestEntity HeadersBuilder/BodyBuilder chain methods — spring-web-7.0.2.yaml already provides them. Kept only the static factories (get/post/method/...), because v7 models those as this->result, a no-op for static methods; the arg-based form here is what actually carries the URI. Note: spring-web-7.0.2.yaml's RequestEntity static factories are dead (this-based on static methods); they should be fixed or removed there separately.
…3.39 file spring-web-7.0.2.yaml modeled the RequestEntity.get/post/head/put/patch/ delete/options/method static factories as `this -> result`, which is a no-op for static methods, so they never carried the URI argument. Replace them with the arg-based form that stores the URI into a java.lang.Object- owned <rule-storage> slot of the returned builder; the slot survives the builder -> RequestEntity type change of .build() under the existing this->result builder carries, and the RestTemplate sink consuming the whole RequestEntity matches it by prefix. With the factories fixed, spring-web-5.3.39.yaml (which only existed to provide working arg-based factories) is fully redundant — RequestEntity is the same class in both jars and all config now lives in spring-web-7.0.2.yaml — so delete it.
…at terminals Follow-up to the static-factory fix: instead of a java.lang.Object-owned <rule-storage> slot, store the factory's URI argument into the builder's own type slot (RequestEntity$HeadersBuilder#<rule-storage>, which also covers BodyBuilder since it extends HeadersBuilder), and re-key it onto the RequestEntity#<rule-storage> slot at the terminals .build() and .body(). This is more type-honest and unifies with the existing <init> / getBody modeling (all RequestEntity construction paths now land taint in the same RequestEntity#<rule-storage> slot), and a builder-typed slot is type-pruned if taint ever reaches a non-builder object. The cost is that every terminal returning a RequestEntity must re-key (HeadersBuilder is not a subtype of RequestEntity, so a plain this->result drops the builder-owned slot) — both build() and body() do.
…key at transitions Replace the imprecise `java.lang.Object` <rule-storage> slot owners in the new passthrough config with precise domain types, mirroring the spring-web RequestEntity reference (precise owner, java.lang.Object kept only as the field value type, re-key where the static type changes). - No type change (wrapper -> this, factory -> result): owner = the object's own type — ant FileSet, jenkins FilePath, httpcore5 StringEntity, unboundid SearchRequest, groovy CompilationUnit, JMX JMXServiceURL/JMXConnector; Velocity put() uses the common Context interface for all three receivers. - Single transition: builder-typed owner along the chain + one re-key at build() — okhttp Request.Builder -> Request, java.net.http HttpRequest.Builder -> HttpRequest. - Multi-transition Spring LDAP chain: owner tracks each stage (LdapQueryBuilder -> ConditionCriteria -> ContainerCriteria) with a re-key at every type change (where/is/like/gte/lte/whitespaceWildcardsLike/and/or/ filter), mirrored on the Default* impl entries. Re-key acceptance follows JIRFactTypeChecker.typeMayHaveSubtypeOf(resultType, owner); every transition is covered, so the slot reaches the terminal in exactly the cases the Object-owner baseline did (behavior-equivalent).
…n passthroughs Rework the new jar-split/stdlib passThrough rules for precision: - Replace arg(*) with per-overload `signature:` + exact arg indices (RequestEntity, okhttp/java.net.http builders, StringEntity, FilePath, CompilationUnit.addSource, JMXServiceURL, SearchRequest). Keep arg(*) only for genuine varargs (e.g. UriTemplate.expand). Use `*` return to avoid generic-erasure mismatch; exclude structurally non-taint args (e.g. RequestEntity.method's HttpMethod enum). - Rename branch-new <rule-storage> slots to specific names: url / content / path / source / value / filter; SearchRequest split into baseDN / filter / attributes. Pre-existing <rule-storage> slots left untouched. Note: named fields are no longer excluded from .* (AnyAccessor) unrolling — only the literal "<rule-storage>" is special-cased (JIRTaintAnalyzer UnrollStrategy); ContainsMarkOnAnyField sinks still see the named sub-fields. - HttpHeaders: enumerate all 61 setter overloads + 24 read-back getters with exact signatures and per-header named vfields (Location, ContentType, ...; dynamic-name set/add/getFirst/get share DynamicHeader), keeping whole-object taint for header-injection sinks. - RequestEntity: fix the static factories (get/post/head/delete/options/put/ patch/method carry the URI/template args into the builder slot; this->result was a no-op for statics) and add the getUrl read-back accessor. - Drop explanatory comments from the passThrough YAML.
1359bfe to
0b087fd
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.