feat: Cross-region internal download acceleration via inter-cloud connections #32

Open
solofox wants to merge 2 commits into modelscope:main from solofox:feat/inter-regions

@solofox solofox commented Jul 2, 2026

When the ModelScope server is asked to serve a file whose OSS bucket region matches the client's region, it returns an OSS internal pre-signed URL (-internal.aliyuncs.com). Otherwise, it returns a public CDN URL.
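
A minimal sketch of how a client could tell the two kinds of URL apart, assuming the hostname suffix is the only signal. The function name `is_oss_internal_url` and the example hostnames are illustrative and not taken from this PR.

```python
from urllib.parse import urlsplit

# Assumption: an internal OSS endpoint is recognised purely by its hostname suffix.
OSS_INTERNAL_SUFFIX = "-internal.aliyuncs.com"


def is_oss_internal_url(url: str) -> bool:
    """Return True if the URL's host ends with the OSS internal suffix."""
    host = urlsplit(url).hostname or ""
    return host.lower().endswith(OSS_INTERNAL_SUFFIX)


# Illustrative hostnames only.
print(is_oss_internal_url("https://bucket.oss-cn-hangzhou-internal.aliyuncs.com/a/b?sig=x"))
print(is_oss_internal_url("https://cdn.example.com/a/b"))
```

Checking the parsed hostname rather than the raw string avoids a false positive when the suffix happens to appear in a query parameter.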

The existing MODELSCOPE_DOWNLOAD_INTRA_CLOUD_REGION/INTRA_CLOUD_ACCELERATION_REGION env variable is a single-value global override; forcing it to a remote region breaks the optimal same-region path.

This PR introduces a per-file opportunistic cross-region acceleration mechanism: a new env variable MODELSCOPE_DOWNLOAD_INTER_CLOUD_REGIONS (comma-separated list of peer regions reachable via CEN/VPC peering) and a --inter-regions CLI flag.
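
As an illustration of the comma-separated format, the list could be parsed roughly as follows. The helper names `parse_peer_regions` and `get_peer_regions` are hypothetical, and the rule that an explicit `--inter-regions` value takes precedence over the env variable is an assumption, not something this PR states.

```python
import os

ENV_VAR = "MODELSCOPE_DOWNLOAD_INTER_CLOUD_REGIONS"


def parse_peer_regions(raw):
    """Split a comma-separated region list, dropping blanks and duplicates, keeping order."""
    regions = []
    for item in (raw or "").split(","):
        region = item.strip()
        if region and region not in regions:
            regions.append(region)
    return regions


def get_peer_regions(cli_value=None, environ=None):
    """Return the peer regions from the CLI value if given, else from the env variable."""
    env = os.environ if environ is None else environ
    # Assumption: the CLI flag wins over the environment variable.
    raw = cli_value if cli_value else env.get(ENV_VAR, "")
    return parse_peer_regions(raw)


print(parse_peer_regions("cn-hangzhou, cn-beijing,,cn-hangzhou "))
```

Order is preserved because the peers are probed in the order given.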

For each file, the SDK does the following:

  1. First, it probes with the local region; if that yields an OSS internal URL, it uses it as-is.
  2. Otherwise it probes each peer region in order; the first one yielding an OSS internal URL wins.
  3. If all probes miss, the download falls back to the default CDN path.

All probing happens via a single HEAD request per region (no data transfer), and each file is probed independently.
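
The three steps above can be sketched as below. This is not the PR's `_resolve_inter_region_headers`, whose body is not shown here. The `probe` callable is a hypothetical stand-in for "send one HEAD request on behalf of a region and return the URL the server points to", and `resolve_source` is an illustrative name.

```python
from typing import Callable, Iterable, Optional, Tuple
from urllib.parse import urlsplit

OSS_INTERNAL_SUFFIX = "-internal.aliyuncs.com"


def _is_internal(url: str) -> bool:
    host = urlsplit(url).hostname or ""
    return host.endswith(OSS_INTERNAL_SUFFIX)


def resolve_source(
    probe: Callable[[Optional[str]], str],
    local_region: Optional[str],
    peer_regions: Iterable[str],
) -> Tuple[str, Optional[str]]:
    """Return (source, region); source is 'local', 'peer' or 'default'."""
    # Step 1: the local region wins if it already yields an internal URL.
    if _is_internal(probe(local_region)):
        return "local", local_region
    # Step 2: try the peers in the configured order; the first hit wins.
    for region in peer_regions:
        if _is_internal(probe(region)):
            return "peer", region
    # Step 3: every probe missed, so fall back to the default CDN path.
    return "default", None


def fake_probe(region):
    # Hypothetical probe: only cn-beijing is treated as hosting the bucket.
    if region == "cn-beijing":
        return "https://bucket.oss-cn-beijing-internal.aliyuncs.com/file"
    return "https://cdn.example.com/file"


print(resolve_source(fake_probe, "cn-hangzhou", ["cn-shanghai", "cn-beijing"]))
```

Passing the probe in as a callable keeps the ordering logic testable without any network access.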

In addition, a marker is shown on the progress bar when --inter-regions or the env variable is specified: "⚡ " = local OSS, "⇄ " = peer OSS, " " = CDN:

⇄ 252787_255587.tar.gz:  13%|███████████████▌   | 4.99G/37.3G [00:07<5:54:41, 1.52MB/s]
⚡ 250434_252697.tar.gz:   1%|| 347M/43.4G [00:14<7:35:10, 1.58MB/s]
  304773_306558.tar.gz:   1%|| 263M/37.7G [00:14<7:24:28, 1.40MB/s]
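
A small sketch of how such a marker could be prepended to the progress bar description. The helper `progress_desc` is hypothetical, and the two-space CDN prefix is an assumption chosen so that all three prefixes have the same number of characters.

```python
# Assumption: the CDN prefix is two spaces, matching the length of the other markers.
SOURCE_PREFIX = {
    "local": "\u26a1 ",    # local OSS
    "peer": "\u21c4 ",     # peer OSS
    "default": "  ",       # CDN
}


def progress_desc(file_name, source=None):
    """Build the progress bar description; source=None means the feature is not configured."""
    prefix = "" if source is None else SOURCE_PREFIX[source]
    return f"{prefix}{file_name}"


print(progress_desc("252787_255587.tar.gz", "peer"))
print(progress_desc("304773_306558.tar.gz", "default"))
```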

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces inter-region acceleration for downloads by probing peer regions to locate an OSS internal endpoint, including a new CLI argument --inter-regions and a corresponding environment variable. The review feedback highlights a significant performance bottleneck where sequential probing is redundantly performed for every single file in a repository. To resolve this, it is recommended to cache the resolved region per repository using a thread-safe global lock, which avoids excessive network requests and potential rate limiting.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

import io
import os
import time
import re

high

Import threading and define a global lock to guard the cache of resolved repository regions.

Suggested change
-import re
+import re
+import threading
+_RESOLVED_REGIONS_LOCK = threading.Lock()

Comment on lines +862 to +883
        # Inter-region acceleration: probe peer regions for OSS internal URL
        # Progress bar prefix: "⚡ " = local OSS, "⇄ " = peer OSS, " " = CDN, "" = not configured
        source_prefix = ""
        peer_regions = self._get_inter_cloud_regions()
        if peer_regions:
            try:
                probe_url = self._client.get_download_url(
                    repo_id, repo_type, file_path, revision,
                )
                cookies = None
                if self._client.token:
                    cookies = {"m_session_id": self._client.token}
                download_headers, source = self._resolve_inter_region_headers(
                    probe_url, download_headers, cookies,
                    peer_regions=peer_regions,
                )
                source_prefix = {
                    "local": "\u26a1 ", "peer": "\u21c4 ", "default": " ",
                }[source]
            except Exception as exc:
                logger.warning("Failed to resolve inter-region acceleration: %s. Falling back to default.", exc)

high

Performance Bottleneck: Missing Cache for Resolved Regions

When downloading a repository with many files (e.g., via download_repo), the files are downloaded in parallel. Currently, the SDK performs sequential HEAD requests (probing) for every single file in the repository.

Since all files in a repository are stored in the same OSS bucket and thus share the same optimal region, probing independently for each file is extremely redundant. It adds massive latency to the download start time of each file and can easily trigger rate limits or 5xx errors on the ModelScope server.

Solution

Cache the resolved region per (repo_id, repo_type) using a thread-safe global lock. In the worst case, where every probe misses, this reduces the number of probes from N * (1 + P) to 1 + P (where N is the number of files and P is the number of peer regions).

        # Inter-region acceleration: probe peer regions for OSS internal URL
        # Progress bar prefix: "⚡ " = local OSS, "⇄ " = peer OSS, "  " = CDN, "" = not configured
        source_prefix = ""
        peer_regions = self._get_inter_cloud_regions()
        if peer_regions:
            if not hasattr(self, "_resolved_repo_regions"):
                with _RESOLVED_REGIONS_LOCK:
                    if not hasattr(self, "_resolved_repo_regions"):
                        self._resolved_repo_regions = {}

            cache_key = (repo_id, repo_type)
            with _RESOLVED_REGIONS_LOCK:
                in_cache = cache_key in self._resolved_repo_regions
                if in_cache:
                    cached_region, source = self._resolved_repo_regions[cache_key]

            if in_cache:
                if cached_region is not None:
                    download_headers["x-aliyun-region-id"] = cached_region
                else:
                    download_headers.pop("x-aliyun-region-id", None)
                source_prefix = {
                    "local": "⚡ ", "peer": "⇄ ", "default": "  ",
                }[source]
            else:
                try:
                    probe_url = self._client.get_download_url(
                        repo_id, repo_type, file_path, revision,
                    )
                    cookies = None
                    if self._client.token:
                        cookies = {"m_session_id": self._client.token}
                    download_headers, source = self._resolve_inter_region_headers(
                        probe_url, download_headers, cookies,
                        peer_regions=peer_regions,
                    )
                    resolved_region = download_headers.get("x-aliyun-region-id")
                    with _RESOLVED_REGIONS_LOCK:
                        self._resolved_repo_regions[cache_key] = (resolved_region, source)
                    source_prefix = {
                        "local": "⚡ ", "peer": "⇄ ", "default": "  ",
                    }[source]
                except Exception as exc:
                    logger.warning("Failed to resolve inter-region acceleration: %s. Falling back to default.", exc)
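
In the suggestion above, the probe runs outside the lock, so several parallel downloads that miss the cache at the same moment may each probe before the first result is stored. One possible refinement, sketched here under the assumption that a single probe per repository is wanted, is a small cache with a per-key lock. The class `RegionCache` and its method are hypothetical and not part of the PR.

```python
import threading


class RegionCache:
    """Caches one resolution per key; concurrent callers for the same key wait for the first."""

    def __init__(self):
        self._lock = threading.Lock()   # guards both dictionaries
        self._entries = {}              # key -> cached value
        self._key_locks = {}            # key -> lock serialising resolution of that key

    def get_or_resolve(self, key, resolve):
        with self._lock:
            if key in self._entries:
                return self._entries[key]
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        # Only one thread per key runs resolve(); other keys are not blocked.
        with key_lock:
            with self._lock:
                if key in self._entries:
                    return self._entries[key]
            value = resolve()  # if this raises, nothing is cached and a later call retries
            with self._lock:
                self._entries[key] = value
            return value


cache = RegionCache()
# The resolver here is a stand-in for the real probing step.
print(cache.get_or_resolve(("org/repo", "model"), lambda: ("cn-beijing", "peer")))
```

With N = 100 files and P = 3 peer regions, uncached probing costs up to 100 * (1 + 3) = 400 HEAD requests, while a cache of this shape costs up to 1 + 3 = 4 per repository.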

@wangxingjun778 wangxingjun778 added the enhancement New feature or request label Jul 2, 2026