{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "intro",
   "metadata": {},
   "source": [
    "# Qwen3.5 msModelSlim Quantization Verification\n",
    "\n",
    "This notebook follows the official Ascend `msmodelslim` Qwen3.5 example and verifies that the image has the required pieces in place for model quantization.\n",
    "The image is validated against the working stack used for a successful `Qwen3.5-27B` run: `msmodelslim 26.0.0a2`, `transformers 5.2.0`, `torchvision 0.24.0`, `mistral-common 1.11.0`, `easydict 1.13`, and `wcmatch 10.1`.\n",
    "\n",
    "Official reference:\n",
    "- https://raw.gitcode.com/Ascend/msmodelslim/raw/master/example/Qwen3_5/README.md\n",
    "\n",
    "This notebook does not start a large quantization job by default. It checks the runtime, prepares the official `msmodelslim quant` command, and only runs it when `RUN_QUANT = True`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "official-notes",
   "metadata": {},
   "source": [
    "## Official Notes\n",
    "\n",
    "The official Qwen3.5 guide states:\n",
    "- `msmodelslim` must be installed.\n",
    "- `transformers==5.2.0` is required.\n",
    "- For the verified `Qwen3.5-27B` multimodal path in this image, `torchvision==0.24.0`, `mistral-common==1.11.0`, `easydict==1.13`, and `wcmatch==10.1` are also required at runtime.\n",
    "- Supported devices are Atlas A2 and Atlas A3 training/inference products.\n",
    "- Example command format:\n",
    "\n",
    "```bash\n",
    "msmodelslim quant --model_path ${MODEL_PATH} --save_path ${SAVE_PATH} --device npu --model_type Qwen3.5-27B --quant_type w8a8 --trust_remote_code True\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "config",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "# Update these paths before running a real quantization job.\n",
    "MODEL_PATH = Path('/opt/app-root/src/models/Qwen3.5-27B')\n",
    "SAVE_PATH = Path('/opt/app-root/src/output/qwen35-27b-w8a8')\n",
    "\n",
    "# Officially documented model types in the Qwen3.5 README.\n",
    "MODEL_TYPE = 'Qwen3.5-27B'\n",
    "QUANT_TYPE = 'w8a8'\n",
    "DEVICE = 'npu'\n",
    "TRUST_REMOTE_CODE = True\n",
    "\n",
    "# Safety switch: keep this False for environment verification.\n",
    "RUN_QUANT = False\n",
    "\n",
    "CANN_ENV = '/usr/local/Ascend/cann/set_env.sh'\n",
    "ATB_ENV = '/usr/local/Ascend/nnal/atb/set_env.sh'\n",
    "\n",
    "SUPPORTED_MODEL_TYPES = {\n",
    "    'Qwen3.5-397B-A17B': {'w8a8', 'w4a8'},\n",
    "    'Qwen3.5-122B-A10B': {'w8a8'},\n",
    "    'Qwen3.5-35B-A3B': {'w8a8'},\n",
    "    'Qwen3.5-27B': {'w8a8'},\n",
    "}\n",
    "\n",
    "print('MODEL_PATH =', MODEL_PATH)\n",
    "print('SAVE_PATH  =', SAVE_PATH)\n",
    "print('MODEL_TYPE =', MODEL_TYPE)\n",
    "print('QUANT_TYPE =', QUANT_TYPE)\n",
    "print('RUN_QUANT  =', RUN_QUANT)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "helpers",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import shlex\n",
    "import stat\n",
    "import subprocess\n",
    "\n",
    "\n",
    "def run_cmd(cmd: str, check: bool = True) -> subprocess.CompletedProcess:\n",
    "    env_prefix = f'source {CANN_ENV} && source {ATB_ENV}'\n",
    "    full_cmd = f'{env_prefix} && {cmd}'\n",
    "    print(f'$ {cmd}')\n",
    "    result = subprocess.run(\n",
    "        ['bash', '-lc', full_cmd],\n",
    "        text=True,\n",
    "        capture_output=True,\n",
    "        env=os.environ.copy(),\n",
    "    )\n",
    "    if result.stdout:\n",
    "        print(result.stdout)\n",
    "    if result.stderr:\n",
    "        print(result.stderr)\n",
    "    if check and result.returncode != 0:\n",
    "        raise RuntimeError(f'command failed with exit code {result.returncode}')\n",
    "    return result\n",
    "\n",
    "\n",
    "def shell_quote(value: str) -> str:\n",
    "    return shlex.quote(value)\n",
    "\n",
    "\n",
    "def find_writable_paths(root: Path, recursive: bool, limit: int = 20) -> list[str]:\n",
    "    if not root.exists():\n",
    "        return []\n",
    "\n",
    "    offenders = []\n",
    "\n",
    "    def inspect(candidate: Path) -> bool:\n",
    "        try:\n",
    "            mode = candidate.stat().st_mode\n",
    "        except FileNotFoundError:\n",
    "            return False\n",
    "        if mode & (stat.S_IWGRP | stat.S_IWOTH):\n",
    "            offenders.append(f'{candidate} mode={oct(mode & 0o777)}')\n",
    "            return len(offenders) >= limit\n",
    "        return False\n",
    "\n",
    "    if inspect(root):\n",
    "        return offenders\n",
    "    if recursive and root.is_dir():\n",
    "        for candidate in root.rglob('*'):\n",
    "            if inspect(candidate):\n",
    "                break\n",
    "    return offenders\n",
    "\n",
    "\n",
    "def prepare_msmodelslim_permissions(model_path: Path, save_path: Path) -> None:\n",
    "    run_cmd(f'mkdir -p {shell_quote(str(save_path))}')\n",
    "    for path in (model_path.parent, save_path.parent):\n",
    "        if path.exists():\n",
    "            run_cmd(f'chmod go-w {shell_quote(str(path))}', check=False)\n",
    "    for path in (model_path, save_path):\n",
    "        if path.exists():\n",
    "            run_cmd(f'chmod -R go-w {shell_quote(str(path))}', check=False)\n",
    "\n",
    "\n",
    "def print_msmodelslim_permission_report(model_path: Path, save_path: Path) -> bool:\n",
    "    checks = [\n",
    "        ('model parent', find_writable_paths(model_path.parent, recursive=False)),\n",
    "        ('model tree', find_writable_paths(model_path, recursive=True)),\n",
    "        ('save parent', find_writable_paths(save_path.parent, recursive=False)),\n",
    "        ('save tree', find_writable_paths(save_path, recursive=True)),\n",
    "    ]\n",
    "\n",
    "    print('Permission preflight:')\n",
    "    for label, offenders in checks:\n",
    "        if offenders:\n",
    "            print(f'  [{label}] writable path(s) still present, showing up to {len(offenders)}:')\n",
    "            for offender in offenders:\n",
    "                print(f'    {offender}')\n",
    "        else:\n",
    "            print(f'  [{label}] OK')\n",
    "\n",
    "    return not any(offenders for label, offenders in checks if label.startswith('model'))\n",
    "\n",
    "\n",
    "print('Helper functions loaded')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "verify-env",
   "metadata": {},
   "outputs": [],
   "source": [
    "print('Python version:')\n",
    "run_cmd('python3 --version')\n",
    "\n",
    "print('Checking verified runtime stack:')\n",
    "run_cmd(\"python3 - <<'PY'\\nimport importlib.metadata as m\\nfor name in ['bracex', 'easydict', 'msmodelslim', 'transformers', 'huggingface-hub', 'torchvision', 'mistral-common', 'wcmatch']:\\n    print(name, m.version(name))\\nfrom huggingface_hub import is_offline_mode\\nprint('huggingface-hub is_offline_mode ok', is_offline_mode())\\nimport bracex\\nimport easydict\\nimport mistral_common\\nimport msmodelslim\\nimport torchvision\\nimport transformers\\nimport wcmatch\\nprint('runtime imports ok')\\nPY\")\n",
    "\n",
    "print('Checking transformers version:')\n",
    "run_cmd(\"python3 -c \\\"import transformers; print(transformers.__version__)\\\"\")\n",
    "\n",
    "print('Checking msmodelslim CLI:')\n",
    "run_cmd('msmodelslim --help | head -n 20')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "prepare-command",
   "metadata": {},
   "outputs": [],
   "source": [
    "if MODEL_TYPE not in SUPPORTED_MODEL_TYPES:\n",
    "    raise ValueError(f'Unsupported MODEL_TYPE: {MODEL_TYPE}')\n",
    "\n",
    "if QUANT_TYPE not in SUPPORTED_MODEL_TYPES[MODEL_TYPE]:\n",
    "    raise ValueError(f'{MODEL_TYPE} does not support quant type {QUANT_TYPE} in the official README')\n",
    "\n",
    "SAVE_PATH.mkdir(parents=True, exist_ok=True)\n",
    "\n",
    "quant_cmd = ' '.join([\n",
    "    'msmodelslim', 'quant',\n",
    "    '--model_path', shell_quote(str(MODEL_PATH)),\n",
    "    '--save_path', shell_quote(str(SAVE_PATH)),\n",
    "    '--device', DEVICE,\n",
    "    '--model_type', MODEL_TYPE,\n",
    "    '--quant_type', QUANT_TYPE,\n",
    "    '--trust_remote_code', str(TRUST_REMOTE_CODE),\n",
    "])\n",
    "\n",
    "print('Official quant command:')\n",
    "print(quant_cmd)\n",
    "print('Permission prep that will run when RUN_QUANT = True:')\n",
    "print(f'  chmod go-w {MODEL_PATH.parent}')\n",
    "print(f'  chmod -R go-w {MODEL_PATH}')\n",
    "print(f'  mkdir -p {SAVE_PATH}')\n",
    "print(f'  chmod go-w {SAVE_PATH.parent}')\n",
    "print(f'  chmod -R go-w {SAVE_PATH}')\n",
    "\n",
    "if MODEL_PATH.exists():\n",
    "    print(f'Model path exists: {MODEL_PATH}')\n",
    "    if not print_msmodelslim_permission_report(MODEL_PATH, SAVE_PATH):\n",
    "        print('MODEL_PATH is not ready yet. RUN_QUANT = True will try to fix permissions and re-check before quantization.')\n",
    "else:\n",
    "    print(f'Model path does not exist yet: {MODEL_PATH}')\n",
    "    print('Update MODEL_PATH before setting RUN_QUANT = True.')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "supported-matrix",
   "metadata": {},
   "source": [
    "## Supported Official Qwen3.5 Variants\n",
    "\n",
    "- `Qwen3.5-397B-A17B`: `w8a8`, `w4a8`\n",
    "- `Qwen3.5-122B-A10B`: `w8a8`\n",
    "- `Qwen3.5-35B-A3B`: `w8a8`\n",
    "- `Qwen3.5-27B`: `w8a8`\n",
    "\n",
    "If you want to verify a different official model, change `MODEL_TYPE`, `MODEL_PATH`, `SAVE_PATH`, and `QUANT_TYPE` above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "run-quant",
   "metadata": {},
   "outputs": [],
   "source": [
    "if RUN_QUANT:\n",
    "    if not MODEL_PATH.exists():\n",
    "        raise FileNotFoundError(f'MODEL_PATH does not exist: {MODEL_PATH}')\n",
    "    prepare_msmodelslim_permissions(MODEL_PATH, SAVE_PATH)\n",
    "    if not print_msmodelslim_permission_report(MODEL_PATH, SAVE_PATH):\n",
    "        raise RuntimeError(\n",
    "            'MODEL_PATH still has group/other writable bits after permission prep. '\n",
    "            'This usually means the mounted model directory was created with permissive modes '\n",
    "            'or the storage backend is not honoring chmod.'\n",
    "        )\n",
    "    run_cmd(quant_cmd)\n",
    "else:\n",
    "    print('RUN_QUANT is False; skipping the real quantization job.')\n",
    "    print('Set RUN_QUANT = True after you prepare the model weights on disk.')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
