Unsoundness in run_ctags #101

@lwz23

Hello, thank you for your contribution to this project. I am testing our static analysis tool on Rust projects on GitHub, and I noticed the following code:

fn run_ctags(opt: &Opt, files: &Vec<String>) -> Vec<ClassInfo> {
    let outputs = CmdCtags::call(&opt, &files).unwrap();
    let mut iters = Vec::new();
    for o in &outputs {
        let iter = if opt.validate_utf8 {
            str::from_utf8(&o.stdout).unwrap().lines()
        } else {
            unsafe { str::from_utf8_unchecked(&o.stdout).lines() }
        };
        iters.push(iter);
    }
    let parser = CtagsParser::parse_str(iters);
    let classes = parser.classes();
    classes
}

The issue is in the run_ctags function where it uses str::from_utf8_unchecked on external command output:

let iter = if opt.validate_utf8 {
    str::from_utf8(&o.stdout).unwrap().lines()
} else {
    unsafe { str::from_utf8_unchecked(&o.stdout).lines() }
};

Since o.stdout comes from an external program (ctags), there is no guarantee that the data is valid UTF-8. If opt.validate_utf8 is false (which can be set through the configuration, since it is a pub field), the program calls str::from_utf8_unchecked on potentially invalid UTF-8 data, which is undefined behavior. Because run_ctags is itself a safe function, this makes it unsound.
A valid call path to this function: pub fn execute -> fn run_ctags
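
One possible direction for a fix, sketched here as a suggestion only: decode the output without `unsafe`, and fall back to lossy decoding when validation is turned off. The helper name `decode_stdout` is hypothetical and not part of this project; the sketch assumes that replacing invalid bytes with U+FFFD is acceptable in the non-validating mode.

```rust
use std::borrow::Cow;
use std::str;

/// Hypothetical helper (not part of the project): decode raw command
/// output without relying on `str::from_utf8_unchecked`.
fn decode_stdout(bytes: &[u8], validate_utf8: bool) -> Result<Cow<'_, str>, str::Utf8Error> {
    if validate_utf8 {
        // Strict mode: invalid UTF-8 is reported as an error to the caller.
        str::from_utf8(bytes).map(Cow::Borrowed)
    } else {
        // Lenient mode: invalid sequences are replaced with U+FFFD.
        // Valid input is borrowed, so no allocation happens in that case.
        Ok(String::from_utf8_lossy(bytes))
    }
}

fn main() {
    // Made-up stand-ins for ctags output; the second contains invalid UTF-8.
    let valid: &[u8] = b"Foo\tsrc/lib.rs\ts\n";
    let invalid: &[u8] = b"Foo\t\xff\xfe\ts\n";

    // Strict mode accepts valid input and rejects invalid input.
    assert!(decode_stdout(valid, true).is_ok());
    assert!(decode_stdout(invalid, true).is_err());

    // Lenient mode never fails and never produces an invalid `str`.
    let text = decode_stdout(invalid, false).unwrap();
    assert!(text.contains('\u{FFFD}'));

    for line in text.lines() {
        println!("{line}");
    }
}
```

Unlike the unchecked call, the lenient branch still scans the bytes once, so it gives up whatever speedup the `validate_utf8 = false` option was meant to provide. If that cost matters, another option would be to keep the unchecked path but mark the function that reaches it as `unsafe` and document the requirement on the ctags output.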

POC

fn main() {
    // Create a configuration with a repo that points to an existing directory
    let config = CocoConfig {
        repos: vec![RepoConfig {
            url: String::from("/tmp/test_repo"),  // Point to any directory
            languages: Some(vec![String::from("rust")]),
            // Other fields initialized as needed
        }],
        plugins: vec![PluginConfig {
            name: String::from("struct_analysis"),
            configs: vec![
                KeyValue {
                    key: String::from("ctags"),
                    value: String::from("/usr/bin/ctags"),  // Path to ctags binary
                }
            ]
        }]
    };
    
    // This will eventually call run_ctags, which uses str::from_utf8_unchecked
    // on the output of an external program (ctags) when validate_utf8 is false
    execute(config);
}
