Merged
2 changes: 1 addition & 1 deletion .github/workflows/python-package.yml
Original file line number Diff line number Diff line change
Expand Up @@ -53,4 +53,4 @@ jobs:
poetry run flake8 . --count --exit-zero --max-complexity=10 --max-line-length=120 --statistics
- name: Test with pytest
run: |
- poetry run pytest --verbose -p no:warnings -m "not heavy" ./tests/test_drawing.py ./tests/test_geo.py ./tests/test_map.py ./tests/test_issues.py
+ poetry run pytest --verbose -p no:warnings -m "not heavy" ./tests/test_drawing.py ./tests/test_geo.py ./tests/test_map.py ./tests/test_issues.py ./tests/test_cli.py
25 changes: 25 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,31 @@ cnmaps export ./henan.shp --province 河南省 --level 省 --record first

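By default the output format is inferred from the file suffix: `.geojson` / `.json` map to GeoJSON, and `.shp` maps to ESRI Shapefile.

### Check and read custom boundary files

If you have your own `GeoJSON` or `Shapefile`, you can first reshape it to conform to the `cnmaps boundary spec`, then hand it to `cnmaps` to check and read.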

```bash
cnmaps check-boundary ./my-boundary.geojson
cnmaps check-boundary ./my-boundary.shp --json # print the check result as JSON
```

Once the check passes, you can read the file into a `MapPolygon` in Python code and go on to use it with `make_mask_array(...)`, `maskout(...)`, or `clip_*`:

```python
from cnmaps import read_boundary_file

boundary = read_boundary_file("./my-boundary.geojson")
```

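If your original `shp` / `geojson` does not yet conform to this spec, we recommend first installing the AI Skill bundled with `cnmaps`, then sending the AI a prompt like the following:

```text
Help me convert <path/filename.shp> into a shapefile/geojson file in a format cnmaps can recognize, and make sure it passes the cnmaps check-boundary check.
```

This makes it easier for the AI to arrange the file structure according to the `cnmaps boundary spec`; once that is done, run `cnmaps check-boundary ...` to verify.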

## User Guide

For more on how to use this project, we also have a more detailed document: [cnmaps User Guide](https://cnmaps.readthedocs.io/zh_CN/latest/index.html)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,22 @@ Good for:
- resolving a user's abbreviated or ambiguous region name before calling `get_adm_maps`
- batch-filtering names after the user already knows several exact region names

### `validate_boundary_file(fp, *, allow_multi_feature=True)`

Use when the user has an external GeoJSON or Shapefile and wants to know whether it matches the cnmaps boundary spec before using it for masking or clipping.

Current boundary spec:

- file suffix must be `.geojson`, `.json`, or `.shp`
- CRS must be WGS84 / `EPSG:4326`
- geometries must all be `Polygon` or `MultiPolygon`
- empty or invalid geometries are rejected
- multiple features are allowed, but they are treated as one combined boundary when read
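
The rules above can be approximated with a rough, dependency-free pre-check for GeoJSON input. This is only a sketch under stated assumptions: `precheck_geojson` is a hypothetical helper and not part of cnmaps. It covers the suffix, geometry-type, and null/empty-geometry rules only, and it does not check CRS or geometry validity, which `validate_boundary_file` itself covers.

```python
import json
from pathlib import Path

ALLOWED_SUFFIXES = {".geojson", ".json", ".shp"}
ALLOWED_GEOMETRY_TYPES = {"Polygon", "MultiPolygon"}


def precheck_geojson(path):
    """Hypothetical helper: return a list of problems found by a rough pre-check."""
    path = Path(path)
    problems = []
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported suffix: {path.suffix}")
        return problems
    if path.suffix.lower() == ".shp":
        # Shapefiles are binary; this sketch only inspects GeoJSON text.
        return problems

    data = json.loads(path.read_text(encoding="utf-8"))
    # Accept a FeatureCollection, a single Feature, or a bare geometry object.
    features = data.get("features", []) if data.get("type") == "FeatureCollection" else [data]
    if not features:
        problems.append("no features")
    for index, feature in enumerate(features):
        geometry = feature.get("geometry") if feature.get("type") == "Feature" else feature
        if geometry is None:
            problems.append(f"feature {index}: null geometry")
        elif geometry.get("type") not in ALLOWED_GEOMETRY_TYPES:
            problems.append(f"feature {index}: unsupported geometry type {geometry.get('type')}")
        elif not geometry.get("coordinates"):
            problems.append(f"feature {index}: empty geometry")
    return problems
```

An empty list only means the file passed this rough screen; the authoritative answer still comes from `validate_boundary_file(...)` or `cnmaps check-boundary`.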

### `read_boundary_file(fp, *, dissolve=True)`

Use when the user already has an external boundary file that matches the cnmaps boundary spec and wants a `MapPolygon` for `make_mask_array(...)`, `maskout(...)`, or `clip_*`.
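
When a file might not match the spec, the read call can be wrapped so that a failure comes back as a list of messages instead of an exception. The sketch below is an illustration, not cnmaps API: `load_boundary` is a hypothetical helper, and the reader is passed in as an argument so the snippet runs without cnmaps installed. It relies on two details visible in this PR: `BoundarySpecError` subclasses `ValueError`, and `read_boundary_file` joins the individual validation errors with `" ; "`.

```python
def load_boundary(path, reader):
    """Hypothetical helper: call reader(path) and report spec failures as a list."""
    try:
        return reader(path), []
    except ValueError as exc:
        # BoundarySpecError is a ValueError subclass; note that this also
        # catches any other ValueError the reader might raise.
        return None, str(exc).split(" ; ")


# Intended usage (assumes cnmaps with read_boundary_file is installed):
# from cnmaps import read_boundary_file
# boundary, problems = load_boundary("./my-boundary.geojson", read_boundary_file)
```

If `problems` is non-empty, the file needs to be rewritten to match the spec before it can be used for masking or clipping.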

## Drawing APIs

### `draw_map(map_polygon, ax=None, **kwargs)`
Expand Down Expand Up @@ -211,6 +227,15 @@ Rules:
- Output format is inferred from the destination suffix unless `--engine` is provided explicitly.
- Default coordinates are WGS84; use `--gcj02` only when the user explicitly wants GCJ02 export.

### `cnmaps check-boundary <path> [--json]`

Use when the user has an external GeoJSON or Shapefile and wants a direct terminal check before reading it with `read_boundary_file(...)`.

Rules:

- Prefer this command when the user is unsure whether an external file already matches the cnmaps boundary spec.
- `--json` is useful when AI or another script will consume the result and decide how to rewrite the file; it changes the check result output format, not the input file format.
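
A script that consumes the `--json` result might look roughly like the sketch below. `parse_check_result` is a hypothetical helper. It assumes the payload keys `passed`, `errors`, and `warnings` and the exit statuses 0 (pass) and 1 (fail) shown in this PR. Because the CLI emits its output through argparse's `parser.exit(...)`, which writes its message to stderr, the sketch looks at both streams rather than assuming stdout.

```python
import json


def parse_check_result(returncode, stdout, stderr):
    """Hypothetical helper: decode the JSON payload of `cnmaps check-boundary --json`."""
    # The payload may arrive on either stream, so try stdout first, then stderr.
    for text in (stdout, stderr):
        text = text.strip()
        if text.startswith("{"):
            payload = json.loads(text)
            break
    else:
        raise ValueError("no JSON payload found in command output")
    # Exit status 0 means the check passed; the `passed` field should agree with it.
    return returncode == 0 and payload["passed"], payload["errors"], payload["warnings"]


# Intended usage (assumes the cnmaps CLI is on PATH):
# import subprocess
# proc = subprocess.run(
#     ["cnmaps", "check-boundary", "./my-boundary.geojson", "--json"],
#     capture_output=True, text=True,
# )
# ok, errors, warnings = parse_check_result(proc.returncode, proc.stdout, proc.stderr)
```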

## Rules Of Thumb

- If the user wants China only: always write `country="中国", level="国"` explicitly.
Expand All @@ -222,5 +247,6 @@ Rules:
- If the user wants clipped scientific plots specifically for EPS/PS export, consider `simplify=True` on the clipping boundary to reduce path complexity.
- If the user wants a raster mask array rather than a plotted figure: use `MapPolygon.make_mask_array(...)` or `MapPolygon.maskout(...)`.
- If the user wants exported vector output: query `only_polygon=True` and use `map_polygon.to_file(...)`.
- If the user wants to use a custom GeoJSON or Shapefile for masking, do not assume arbitrary files will work directly; first validate them against the cnmaps boundary spec, then read them with `read_boundary_file(...)`.
- If the user wants global country boundaries: `get_adm_maps(level="国")` is now the correct broad query.
- If the user asks about seams, gaps, or disputed-border behavior in world maps, explain the source-semantic caveat instead of blaming plotting code first.
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
- Important: be explicit about what `cnmaps` handles directly versus what remains a downstream raster-processing step.
- If the user wants a boolean-like mask array, use `MapPolygon.make_mask_array(lons, lats)`.
- If the user wants masked data back, use `MapPolygon.maskout(lons, lats, data)`.
- If the user wants to use a custom GeoJSON or Shapefile instead of built-in administrative boundaries, first validate it against the cnmaps boundary spec, then load it with `read_boundary_file(...)` and continue with the same `MapPolygon`-based masking workflow.

## Matplotlib Artist Clipping

Expand Down
170 changes: 170 additions & 0 deletions cnmaps/maps.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,10 @@
import re
import warnings
from collections.abc import Iterable
from dataclasses import dataclass
from functools import lru_cache
from pathlib import Path
from typing import Optional

import numpy as np
import shapely.geometry as sgeom
Expand All @@ -30,6 +33,30 @@ class MapNotFoundError(Exception):
    pass


class BoundarySpecError(ValueError):
    """Raised when an external boundary file does not conform to the cnmaps boundary spec."""

    pass


@dataclass(frozen=True)
class BoundaryCheckResult:
    """Structured result for validating an external boundary file."""

    path: str
    passed: bool
    driver: Optional[str]
    feature_count: int
    geometry_types: tuple[str, ...]
    crs: Optional[str]
    errors: tuple[str, ...] = ()
    warnings: tuple[str, ...] = ()

    @property
    def ok(self) -> bool:
        # Alias for `passed`.
        return self.passed

class MapRecord(dict):
    """Map record object that supports dot (attribute) access.

Expand Down Expand Up @@ -208,6 +235,149 @@ def _clone_geometry(geom):
    return wkb.loads(geom.wkb)


def _is_supported_boundary_suffix(path: Path) -> bool:
    return path.suffix.lower() in {".geojson", ".json", ".shp"}


def validate_boundary_file(fp, *, allow_multi_feature=True) -> BoundaryCheckResult:
    """
    Check whether an external GeoJSON / Shapefile conforms to the cnmaps boundary spec.

    The current spec requires:
    - the file format is GeoJSON / Shapefile
    - the CRS is explicitly defined and equivalent to WGS84 (EPSG:4326)
    - every geometry is a Polygon / MultiPolygon
    - there are no empty geometries
    - every geometry is valid
    """
    import geopandas as gpd
    from pyproj import CRS

    path = Path(fp).expanduser().resolve()
    errors = []
    warnings_list = []

    if not path.exists():
        return BoundaryCheckResult(
            path=str(path),
            passed=False,
            driver=None,
            feature_count=0,
            geometry_types=(),
            crs=None,
            errors=(f"文件不存在: {path}",),  # "File does not exist"
        )

    if not _is_supported_boundary_suffix(path):
        # "Only .geojson/.json or .shp files conforming to the cnmaps boundary spec are supported"
        errors.append("仅支持符合 cnmaps boundary spec 的 .geojson/.json 或 .shp 文件")

    try:
        gdf = gpd.read_file(path)
    except Exception as exc:
        return BoundaryCheckResult(
            path=str(path),
            passed=False,
            driver=None,
            feature_count=0,
            geometry_types=(),
            crs=None,
            errors=(f"无法读取边界文件: {exc}",),  # "Cannot read boundary file"
        )

    driver = getattr(gdf, "_driver", None)
    feature_count = len(gdf)
    if feature_count == 0:
        errors.append("文件中不包含任何 feature")  # "The file contains no features"

    if not allow_multi_feature and feature_count > 1:
        # "Multiple features found; only a single feature is allowed in this mode"
        errors.append("文件包含多个 feature;当前模式下只允许单个 feature")
    elif feature_count > 1:
        # "Multiple features found; they are merged into one boundary when read"
        warnings_list.append("文件包含多个 feature;读取时会先合并为一个统一边界")

    if gdf.crs is None:
        # "Missing CRS definition; the spec requires an explicit WGS84 (EPSG:4326) declaration"
        errors.append("文件缺少 CRS 定义;cnmaps boundary spec 要求显式声明 WGS84 (EPSG:4326)")
        crs_text = None
    else:
        crs_text = str(gdf.crs)
        try:
            if not CRS.from_user_input(gdf.crs).equals(CRS.from_epsg(4326)):
                errors.append("文件 CRS 不是 WGS84 (EPSG:4326)")  # "CRS is not WGS84"
        except Exception as exc:
            errors.append(f"无法解析 CRS: {exc}")  # "Cannot parse CRS"

    if "geometry" not in gdf:
        errors.append("文件中缺少 geometry 列")  # "The geometry column is missing"
        geometry_types = ()
    else:
        geometries = gdf.geometry
        if geometries.isna().any():
            errors.append("文件包含空几何")  # "The file contains null geometries"

        non_null_geometries = [geom for geom in geometries if geom is not None]
        geometry_types = tuple(sorted({geom.geom_type for geom in non_null_geometries}))

        unsupported = sorted(
            {
                geom_type
                for geom_type in geometry_types
                if geom_type not in {"Polygon", "MultiPolygon"}
            }
        )
        if unsupported:
            # "Only Polygon / MultiPolygon geometries are supported; detected: ..."
            errors.append(
                "仅支持 Polygon / MultiPolygon 面几何,当前检测到: " + ", ".join(unsupported)
            )

        invalid_count = sum(1 for geom in non_null_geometries if not geom.is_valid)
        if invalid_count:
            errors.append(f"文件中包含 {invalid_count} 个无效几何")  # "N invalid geometries"

        empty_count = sum(1 for geom in non_null_geometries if geom.is_empty)
        if empty_count:
            errors.append(f"文件中包含 {empty_count} 个空几何")  # "N empty geometries"

    return BoundaryCheckResult(
        path=str(path),
        passed=not errors,
        driver=driver,
        feature_count=feature_count,
        geometry_types=geometry_types,
        crs=crs_text,
        errors=tuple(errors),
        warnings=tuple(warnings_list),
    )


def read_boundary_file(fp, *, dissolve=True) -> "MapPolygon":
    """
    Read an external boundary file that conforms to the cnmaps boundary spec and return a MapPolygon.

    GeoJSON / Shapefile are currently supported. The input file must:
    - use WGS84 (EPSG:4326) as its CRS
    - contain only Polygon / MultiPolygon geometries
    - contain no empty or invalid geometries
    """
    import geopandas as gpd
    from shapely.ops import unary_union

    # Validate first so that spec violations surface as a single BoundarySpecError.
    result = validate_boundary_file(fp)
    if not result.ok:
        raise BoundarySpecError(" ; ".join(result.errors))

    gdf = gpd.read_file(Path(fp).expanduser().resolve())
    geometries = [geom for geom in gdf.geometry if geom is not None and not geom.is_empty]

    if dissolve:
        # Merge all features into one combined boundary.
        merged = unary_union(geometries)
        return _as_mappolygon_result(merged)

    # Keep each feature's polygons as separate parts without merging them.
    polygons = []
    for geom in geometries:
        normalized = _as_mappolygon_result(geom)
        polygons.extend(list(normalized.geom.geoms))
    return MapPolygon(polygons)


def _as_mappolygon_result(geom):
    """Normalize Shapely set-operation results to MapPolygon."""
    if geom is None or geom.is_empty:
Expand Down
57 changes: 54 additions & 3 deletions cnmaps_cli/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,10 @@
from __future__ import annotations

import argparse
import json
import shutil
from pathlib import Path
from typing import Optional


SKILL_NAME = "cnmaps-python-assistant"
Expand Down Expand Up @@ -281,7 +283,7 @@ def install_claudecode_skill(workspace: Path, force: bool = False, scope: str =
)


- def _normalize_export_engine(engine: str | None, output_path: Path) -> str:
+ def _normalize_export_engine(engine: Optional[str], output_path: Path) -> str:
    if engine is None:
        suffix = output_path.suffix.lower()
        if suffix in {".geojson", ".json"}:
Expand Down Expand Up @@ -314,7 +316,7 @@ def export_adm_maps(
    record: str = "all",
    wgs84: bool = True,
    simplify: bool = False,
-     engine: str | None = None,
+     engine: Optional[str] = None,
    encoding: str = "utf-8",
) -> Path:
    from cnmaps import get_adm_maps
Expand Down Expand Up @@ -349,6 +351,23 @@ def export_adm_maps(
    return output


def check_boundary_file(path: Path) -> tuple[bool, dict]:
    from cnmaps import validate_boundary_file

    result = validate_boundary_file(path)
    payload = {
        "path": result.path,
        "passed": result.passed,
        "driver": result.driver,
        "feature_count": result.feature_count,
        "geometry_types": list(result.geometry_types),
        "crs": result.crs,
        "errors": list(result.errors),
        "warnings": list(result.warnings),
    }
    return result.passed, payload


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cnmaps")
    subparsers = parser.add_subparsers(dest="command")
Expand Down Expand Up @@ -430,10 +449,21 @@ def build_parser() -> argparse.ArgumentParser:
        help="Simplify geometries before export.",
    )

    check_parser = subparsers.add_parser(
        "check-boundary",
        help="Validate whether an external GeoJSON or Shapefile matches the cnmaps boundary spec.",
    )
    check_parser.add_argument("path", type=Path, help="Boundary file path to validate.")
    check_parser.add_argument(
        "--json",
        action="store_true",
        help="Emit the validation result as JSON.",
    )

    return parser


- def main(argv: list[str] | None = None) -> int:
+ def main(argv=None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)

Expand Down Expand Up @@ -478,5 +508,26 @@ def _unwrap(values):

        parser.exit(0, f"Exported administrative boundaries to {output}\n")

    if args.command == "check-boundary":
        passed, payload = check_boundary_file(args.path.expanduser().resolve())
        # Exit status is 0 on PASS and 1 on FAIL.
        # Note: argparse's parser.exit() writes its message to stderr, not stdout.
        if args.json:
            parser.exit(0 if passed else 1, json.dumps(payload, ensure_ascii=False, indent=2) + "\n")

        lines = [
            f"Boundary spec check: {'PASS' if passed else 'FAIL'}",
            f"- path: {payload['path']}",
            f"- feature_count: {payload['feature_count']}",
            f"- geometry_types: {', '.join(payload['geometry_types']) if payload['geometry_types'] else '(none)'}",
            f"- crs: {payload['crs'] or '(missing)'}",
        ]
        if payload["warnings"]:
            lines.append("- warnings:")
            lines.extend(f" - {item}" for item in payload["warnings"])
        if payload["errors"]:
            lines.append("- errors:")
            lines.extend(f" - {item}" for item in payload["errors"])

        parser.exit(0 if passed else 1, "\n".join(lines) + "\n")

    parser.print_help()
    return 1