CMPE 275 mini-project for NYC 311 query processing with three execution styles:
- AoS serial baseline (QueryBase)
- AoS + OpenMP (QueryOMP)
- SoA layout + OpenMP queries (QuerySOA, DatasetSOA)
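
To make the AoS/SoA distinction concrete, here is a minimal sketch of the two layouts and a serial date-range count over each. It is not the project's code: the struct names (RecordAoS, TableSoA), the field names, and the helper functions are hypothetical, and timestamps are assumed to be stored as integer epoch seconds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// AoS (array of structures): one struct per 311 record.
// Field names are illustrative assumptions, not the project's schema.
struct RecordAoS {
    std::int64_t created_epoch;   // creation time, assumed epoch seconds
    std::string borough;
    std::string complaint_type;
};

// SoA (structure of arrays): one contiguous column per field.
struct TableSoA {
    std::vector<std::int64_t> created_epoch;
    std::vector<std::string> borough;
    std::vector<std::string> complaint_type;
};

// Convert row-oriented records into column-oriented storage.
TableSoA to_soa(const std::vector<RecordAoS>& rows) {
    TableSoA t;
    t.created_epoch.reserve(rows.size());
    t.borough.reserve(rows.size());
    t.complaint_type.reserve(rows.size());
    for (const RecordAoS& r : rows) {
        t.created_epoch.push_back(r.created_epoch);
        t.borough.push_back(r.borough);
        t.complaint_type.push_back(r.complaint_type);
    }
    return t;
}

// AoS scan: every iteration steps over a whole record, strings included.
std::size_t count_in_range_aos(const std::vector<RecordAoS>& rows,
                               std::int64_t lo, std::int64_t hi) {
    assert(lo <= hi);
    std::size_t count = 0;
    for (const RecordAoS& r : rows) {
        if (r.created_epoch >= lo && r.created_epoch <= hi) ++count;
    }
    return count;
}

// SoA scan: only the timestamp column is read, so memory access is dense.
std::size_t count_in_range_soa(const TableSoA& t,
                               std::int64_t lo, std::int64_t hi) {
    assert(lo <= hi);
    std::size_t count = 0;
    for (std::int64_t ts : t.created_epoch) {
        if (ts >= lo && ts <= hi) ++count;
    }
    return count;
}
```

Both functions return the same count for the same data; the difference is only in how much memory each scan has to walk through.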
Project layout:
ParallelCompMini1/
├── CMakeLists.txt
├── README.md
├── LICENSE
├── .gitignore
├── mini1-memory-nyc.md
├── NYC_dataset/
│   └── 311_2020_present.csv
├── benchmarks/
│   └── workloads/
│       ├── README.md
│       ├── workload_10.csv
│       ├── workload_100k.csv
│       └── workload_100k_old.csv
├── include/
│   ├── aggregations.hpp
│   ├── csv_parser.hpp
│   ├── dataset.hpp
│   ├── dataset_SOA.hpp
│   ├── dataset_utils.hpp
│   ├── iQuery.hpp
│   ├── query_base.hpp
│   ├── query_omp.hpp
│   ├── query_SOA.hpp
│   └── timer.hpp
├── src/
│   ├── csv_parser.cpp
│   ├── dataset.cpp
│   ├── dataset_SOA.cpp
│   ├── query_base.cpp
│   ├── query_omp.cpp
│   └── query_SOA.cpp
├── tests/
│   ├── CMakeLists.txt
│   ├── test_record.cpp
│   ├── test_csv_parser.cpp
│   ├── test_dataset.cpp
│   ├── test_query.cpp
│   ├── benchmark_dataset_load.cpp
│   ├── benchmark_query.cpp
│   ├── benchmark_query_omp_date_range.cpp
│   └── benchmark_query_SOA_date_range.cpp
└── scripts/
    ├── check_borough_cardinality.py
    ├── find_complaint_freq.py
    ├── validate_query.py
    └── requirements.txt

From project root:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

Optional debug prints for query aggregation paths:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_QUERY_DEBUG_PRINT=ON
cmake --build build -j

Run tests:
ctest --test-dir build --output-on-failure

Built under build/tests/:
- benchmark_dataset_load
- benchmark_query
- benchmark_query_omp_date_range
- benchmark_query_SOA_date_range

Example (SoA benchmark):
# serial SoA load
./build/tests/benchmark_query_SOA_date_range \
benchmarks/workloads/workload_100k.csv 100000 serial 0
# parallel SoA load (8 threads)
./build/tests/benchmark_query_SOA_date_range \
benchmarks/workloads/workload_100k.csv 100000 parallel 8

Notes:
- OpenMP is required (find_package(OpenMP REQUIRED) in tests/CMakeLists.txt).
- The SoA query implementation uses OpenMP parallel loops; explicit SIMD intrinsics are not currently used.
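
As an illustration of what an OpenMP parallel loop over an SoA column can look like, the sketch below counts timestamps in a closed range using a `parallel for` with a sum reduction. The function name count_in_range_omp and the epoch-seconds column are assumptions for this example, not the project's actual API. No SIMD intrinsics are involved; any vectorization is left to the compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: count entries of a timestamp column within [lo, hi].
// Each thread accumulates a private partial count, and OpenMP sums the
// partials when the loop ends (reduction(+:count)).
std::int64_t count_in_range_omp(const std::vector<std::int64_t>& created,
                                std::int64_t lo, std::int64_t hi) {
    assert(lo <= hi);
    // A signed loop index keeps the loop valid for older OpenMP runtimes.
    const std::int64_t n = static_cast<std::int64_t>(created.size());
    std::int64_t count = 0;

    #pragma omp parallel for reduction(+:count) schedule(static)
    for (std::int64_t i = 0; i < n; ++i) {
        if (created[i] >= lo && created[i] <= hi) {
            ++count;
        }
    }
    return count;
}
```

If the file is compiled without OpenMP enabled, the pragma is ignored and the loop runs serially with the same result. With OpenMP enabled, the standard OMP_NUM_THREADS environment variable controls how many threads the loop uses.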