
fix: specify UTF-8 encoding for subprocess in init_database #606

Open
JasonOA888 wants to merge 1 commit into 666ghj:main from JasonOA888:fix/issue-605-gbk-encoding

Conversation

@JasonOA888

Summary

Fixes #605 - fixes the GBK encoding error raised during database initialization on Windows.

Problem

When running python main.py --init-db on Windows, users hit:

UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 7223: illegal multibyte sequence

Root Cause

With text=True, subprocess.run decodes using the system default encoding on Windows (GBK) rather than UTF-8. When the SQL file contains UTF-8 characters, decoding fails.
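
As a rough illustration of the failure mode (not code from this PR), the sketch below prints the locale encoding that text mode typically falls back to, then checks whether some bytes decode under a given codec. The helper name `decodes_with` and the sample byte strings are hypothetical.

```python
import locale


def decodes_with(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes cleanly under `encoding` (hypothetical helper)."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The encoding text mode usually falls back to when none is given.
# On a Chinese-locale Windows machine this is commonly a GBK code page.
print("locale default:", locale.getpreferredencoding(False))

# Hypothetical UTF-8 content, standing in for text from a SQL file.
utf8_bytes = "-- 数据库初始化".encode("utf-8")
print("UTF-8 bytes as utf-8:", decodes_with(utf8_bytes, "utf-8"))

# A byte sequence starting with 0x80, the byte named in the reported traceback.
stray_bytes = b"\x80 tail"
print("0x80 bytes as gbk:", decodes_with(stray_bytes, "gbk"))
```

The point is only that the same bytes can be valid under one codec and invalid under another, so the result depends on which codec the platform picks by default.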

Solution

Explicitly pass the encoding='utf-8' parameter.
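
A minimal sketch of this kind of call, assuming a child process that writes UTF-8 bytes to stdout. It is not the actual code in MindSpider/main.py; `CHILD_CODE` and `run_child_utf8` are hypothetical names, and the child is just a Python one-liner used for demonstration.

```python
import subprocess
import sys

# Hypothetical child program: writes the UTF-8 bytes for three CJK characters.
# The raw string keeps the \u escapes intact so the child interpreter expands them.
CHILD_CODE = r"import sys; sys.stdout.buffer.write('\u6570\u636e\u5e93'.encode('utf-8'))"


def run_child_utf8() -> subprocess.CompletedProcess:
    """Run the child and decode its output as UTF-8, regardless of locale."""
    return subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        encoding="utf-8",  # overrides the locale default (e.g. GBK on Windows)
    )


if __name__ == "__main__":
    result = run_child_utf8()
    print(result.returncode)
    print(result.stdout == "\u6570\u636e\u5e93")
```

Passing encoding makes the decoding independent of the machine's locale, which is why the same call behaves identically on Windows and on UTF-8 systems.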

Changes

  • MindSpider/main.py: add encoding='utf-8' to the subprocess.run call in initialize_database()

Fixes #605

- subprocess.run with text=True uses system encoding (GBK on Windows)
- Causes UnicodeDecodeError when reading SQL files with UTF-8 content
- Explicitly set encoding='utf-8' to fix the issue

Fixes 666ghj#605
@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Mar 8, 2026

Labels

size:XS This PR changes 0-9 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Database initialization failed

1 participant