Compare commits


36 Commits
v0.9.6 ... main

Author SHA1 Message Date
yihong0618
130ddf4e90 Merge branch 'main' of https://github.com/yihong0618/bilingual_book_maker 2025-04-22 17:41:12 +08:00
yihong0618
f4ac49651c fix: black it
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-22 17:41:00 +08:00
yihong
433e208925
Merge pull request #456 from leslieo2/main
fix: Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements
2025-04-22 17:36:19 +08:00
yihong0618
2b40f0872a fix: add gitignore
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-22 17:36:05 +08:00
yihong
4ac29a7d63
Merge pull request #458 from yihong0618/hy/add_gemini_model
fix: drop .pdm-python add new gemini model
2025-04-22 17:17:40 +08:00
yihong0618
e45326f7a8 fix: drop .pdm-python add new gemini model
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-04-22 17:17:16 +08:00
leslie
c780f7c516 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-21 16:11:33 +08:00
leslie
cc4f4c4dae fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-21 15:05:21 +08:00
leslie
57ca4da847 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-20 16:48:11 +08:00
leslie
09589c626d fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-19 23:28:02 +08:00
leslie
750ecd7d93 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-19 22:57:47 +08:00
leslie
70a1962804 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-19 22:18:47 +08:00
leslie
83303d1dd8 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-19 20:31:22 +08:00
leslie
b0dbed8826 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-19 20:29:41 +08:00
leslie
6685b23993 fix:Fix translation paragraph count mismatch by explicitly instructing LLM about paragraph requirements 2025-04-19 19:59:32 +08:00
yihong
a1f0185043
Merge pull request #449 from lytt925/main
fix(docker): update Dockerfile to copy only requirements.txt
2025-03-12 11:04:47 +08:00
Yen-Ting Li
68f21744f5 fix(docker): update Dockerfile to copy only requirements.txt 2025-03-12 10:54:00 +08:00
yihong
81f9b5280b
Merge pull request #448 from cangming/support_openai_o_series_model
feat(model): support openai o series model
2025-03-11 18:02:54 +08:00
Cangming H
bf0a0b8ad5
upgrade ci python version to 3.10 2025-03-11 13:02:46 +08:00
Cangming H
b83ac10e88
reformat python script 2025-03-11 12:47:12 +08:00
Cangming H
cf992aef70
Fix github CI failed due to actions/upload-artifact deprecated 2025-03-11 12:19:16 +08:00
cangming
8bfd1b146d
feat(model): support openai o series model (o1-preview, o1, o1-mini, o3-mini) 2025-03-09 23:04:46 +08:00
Thibaut
e6b0de14db
Add promptdown support (markdown user, system, developer messages (#446)
* Add promptdown support (markdown user, system, developer messages

* Revert extra changes to cli

* Add to gitignore
2025-03-05 20:58:10 +08:00
Deftera
b7674d734d
Use dynamic language in Google translator (#445) 2025-03-02 13:38:57 +08:00
yihong0618
f0927404fe fix: lint
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-22 19:00:40 +08:00
ZHU PEIYAN
4e7cbb5e32
feat(translator): optimize the Gemini translation template (Andrew Ng's three-step translation method) and support new models (#439)
* feat(translator): optimize the Gemini translation template and support new models

Update the translation prompt template to produce more professional Chinese translation output. Main changes:
- Add support for the new experimental Gemini 2.0 Flash model
- Rework the translation prompt template to use a three-step translation flow for better quality
- Add tag extraction so that only the final, refined translation is returned
- Remove the mandatory check for the {language} parameter

The optimized flow has three steps: initial translation, reflective improvement, and final polishing,
which noticeably improves the accuracy and readability of the results.

* support md file type

---------

Co-authored-by: zhonghua.zhu <zhonghua.zhu@riveretech.com>
2024-12-22 18:59:42 +08:00
cce
b80f1ba785
Add context and system message support for Claude (#438)
* option to select claude model

* add context_flag and context_paragraph_limit option for claude

* reformat with black

* remove nonexistent model
2024-12-08 07:06:38 +08:00
cce
daea974d68
Pass --context_paragraph_limit option from CLI (#437) 2024-12-07 12:28:30 +08:00
cce
a36b488ca7
set claude temperature, update claude default model (#434) 2024-11-28 10:06:40 +08:00
umm
b42a33d9a8
Update README (#433) 2024-11-09 12:46:22 +08:00
umm
a82255a8d7
Update README (#432)
* feat: support Tencent TranSmart

* feat: support groq translator

* update README
2024-11-09 09:08:12 +08:00
anenin
e962a08f35
fix: Fix parameter mismatch in EPUBBookLoaderHelper.translate_with_ba… (#429)
* fix: Fix parameter mismatch in EPUBBookLoaderHelper.translate_with_backoff

- Fix TypeError when calling translate_with_backoff with multiple arguments
- Add proper parameter handling in the decorated method
- Add jitter=None to prevent extra parameters from backoff decorator
- Improve code readability and error handling

* style: format code with black

---------

Co-authored-by: wenping <angenpn@gmail.com>
2024-11-06 18:09:09 +08:00
Xie Yanbo
546fbd8e37
Avoid crash if not provide book_name (#431) 2024-11-06 17:57:38 +08:00
Xie Yanbo
78fc7985d5
support xAI (#430) 2024-11-06 17:56:59 +08:00
risin42
9261d92e20
Gemini Enhancements (#428)
* chore: Bump google-generativeai and related dependencies

* feat: add support for --temperature option to gemini

* feat: add support for --interval option to gemini

* feat: add support for --model_list option to gemini

* feat: add support for --prompt option to gemini

* modify: model settings

* feat: add support for --use_context option to gemini

* feat: add support for rotate_key to gemini

* feat: add exponential backoff to gemini

* Update README.md

* fix: typos and apply black formatting

* Update make_test_ebook.yaml

* fix: cli

* fix: interval option implementation

* fix: interval for geminipro

* fix: recreate convo after rotating key
2024-10-21 13:42:33 +08:00
mkXultra
6912206cb1
support: gpt4o (#425)
* support: gpt4o

* fix: remove wrong code

* fix: fix ci error
2024-08-24 19:20:15 +08:00
28 changed files with 1504 additions and 314 deletions

View File

@ -11,6 +11,6 @@ jobs:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
python-version: '3.9'
python-version: '3.10'
- run: pip install mkdocs mkdocs-material
- run: mkdocs gh-deploy --force
- run: mkdocs gh-deploy --force

View File

@ -27,10 +27,10 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: install python 3.9
- name: install python 3.10
uses: actions/setup-python@v4
with:
python-version: '3.9'
python-version: '3.10'
cache: 'pip' # caching pip dependencies
- name: Check formatting (black)
run: |
@ -71,7 +71,7 @@ jobs:
- name: Rename and Upload ePub
if: env.OPENAI_API_KEY != null
uses: actions/upload-artifact@v2
uses: actions/upload-artifact@v4
with:
name: epub_output
path: "test_books/lemo_bilingual.epub"

.gitignore (vendored, 7 changed lines)
View File

@ -138,3 +138,10 @@ log/
*.srt
*.txt
*.bin
*.epub
# For markdown files in user directories
.cursorrules
books/
prompts/
.pdm-python

View File

@ -1 +0,0 @@
/home/yihong/use_now/bilingual_book_maker/.venv/bin/python

View File

@ -4,10 +4,10 @@ RUN apt-get update
WORKDIR /app
COPY requirements.txt setup.py .
COPY requirements.txt .
RUN pip install -r /app/requirements.txt
COPY . .
ENTRYPOINT ["python3", "make_book.py"]
ENTRYPOINT ["python3", "make_book.py"]

View File

@ -4,7 +4,6 @@ bilingual_book_maker is an AI translation tool that uses ChatGPT to help users create
![image](https://user-images.githubusercontent.com/15976103/222317531-a05317c5-4eee-49de-95cd-04063d9539d9.png)
## Preparation
1. ChatGPT or OpenAI token [^token]
@ -12,43 +11,203 @@ bilingual_book_maker is an AI translation tool that uses ChatGPT to help users create
3. An environment with working internet access, or a proxy
4. python3.8+
## Quick Start
## Usage
A sample book, `test_books/animal_farm.epub`, is provided locally for testing
```shell
pip install -r requirements.txt
python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --test
pip install -U bbook_maker
bbook --book_name test_books/animal_farm.epub --openai_key ${openai_key} --test
```
## Translation Services
- `pip install -r requirements.txt` or `pip install -U bbook_maker`
- Use `--openai_key` to specify the OpenAI API key. If you have several keys, separate them with commas (xxx,xxx,xxx) to reduce errors caused by API rate limits.
Alternatively, set the environment variable `BBM_OPENAI_API_KEY` to skip this option.
- A sample book, `test_books/animal_farm.epub`, is provided locally for testing
Alternatively, set the environment variable `BBM_OPENAI_API_KEY` to skip this option.
- The default model is [GPT-3.5-turbo](https://openai.com/blog/introducing-chatgpt-and-whisper-apis), the model ChatGPT currently uses.
- You can translate with the wrapped DeepL API (paid); get a token from [DeepL Translator](https://rapidapi.com/splintPRO/api/dpl-translator): `--model deepl --deepl_key ${deepl_key}`
- You can use free DeepL: `--model deeplfree`
- You can translate with the [Claude](https://console.anthropic.com/docs) model: `--model claude --claude_key ${claude_key}`
- You can translate with Google: `--model google`
- You can translate with Caiyun: `--model caiyun --caiyun_key ${caiyun_key}`
- You can translate with Gemini: `--model gemini --gemini_key ${gemini_key}`
- You can translate with Tencent TranSmart (free): `--model tencentransmart`
- You can translate with [Ollama](https://github.com/ollama/ollama) self-hosted models: `--ollama_model ${ollama_model_name}`
- If the ollama server is not running locally, use `--api_base http://x.x.x.x:port/v1` to point to the ollama server address
- Use `--test` if you have not paid yet, to preview the result first (there is a rate limit, so it is a bit slow)
- Use `--language` to specify the target language, e.g. `--language "Simplified Chinese"`; the default is `"Simplified Chinese"`.
Read the help message to find the available target languages: `python make_book.py --help`
- Use `--proxy` so users in mainland China can go through a proxy during local testing; pass a string like `http://127.0.0.1:7890`
- Use `--resume` to continue after a manual interruption.
- An epub consists of html files. By default, only the contents of `<p>` are translated.
Use `--translate-tags` to specify the tags to translate. Separate multiple tags with commas. For example:
`--translate-tags h1,h2,h3,p,div`
- Use the `--book_from` option to specify the e-reader type (currently only kobo is available), and `--device_path` to specify the mount point.
- If you are behind a firewall and need Cloudflare Workers to replace the api_base, pass `--api_base ${url}`.
**Note: the api you enter here should look like '`https://xxxx/v1`', and the domain must be wrapped in quotes**
- Once translation completes, a bilingual book named ${book_name}_bilingual.epub is generated
- If an error occurs, or you interrupt the run with `CTRL+C` and do not want to continue, a book named ${book_name}_bilingual_temp.epub is generated; just rename it to whatever you like
- To translate untagged strings in the e-book, use `--allow_navigable_strings`, which adds navigable strings to the translation queue. **Note: when possible, look for a better-formatted e-book instead**
- To tweak the prompt, use the `--prompt` parameter. Valid placeholders include `{text}` and `{language}`. You can configure the prompt in the following ways:
If you do not need to set the `system` role: `--prompt "Translate {text} to {language}"` or `--prompt prompt_template_sample.txt` (a sample text file is at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
If you need to set the `system` role: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (a sample JSON file is at [./prompt_template_sample.json](./prompt_template_sample.json)).
You can also configure the `system` and `user` role prompts via the environment variables `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
The parameter can be a prompt template string or the path to a template `.txt` file.
- Use `--batch_size` to specify the number of lines per translation batch (default 10; currently only effective for txt files)
* DeepL
Translate with the wrapped DeepL API (paid); get a token from [DeepL Translator](https://rapidapi.com/splintPRO/api/dpl-translator)
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model deepl --deepl_key ${deepl_key}
```
* DeepL free
Use free DeepL
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model deeplfree
```
* Claude
Translate with the [Claude](https://console.anthropic.com/docs) model
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model claude --claude_key ${claude_key}
```
* Google Translate
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model google
```
* Caiyun
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model caiyun --caiyun_key ${caiyun_key}
```
* Gemini
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model gemini --gemini_key ${gemini_key}
```
* Tencent TranSmart
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model tencentransmart
```
* [xAI](https://x.ai)
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model xai --xai_key ${xai_key}
```
* [Ollama](https://github.com/ollama/ollama)
Translate with [Ollama](https://github.com/ollama/ollama) self-hosted models.
If the ollama server is not running locally, use `--api_base http://x.x.x.x:port/v1` to point to the ollama server address
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --ollama_model ${ollama_model_name}
```
* [Groq](https://console.groq.com/keys)
See [Supported Models](https://console.groq.com/docs/models) for the models GroqCloud currently supports
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --groq_key [your_key] --model groq --model_list llama3-8b-8192
```
## Usage Notes
- Once translation completes, a bilingual book named `{book_name}_bilingual.epub` is generated
- If an error occurs, or you interrupt the run with `CTRL+C` and do not want to continue, a book named `{book_name}_bilingual_temp.epub` is generated; just rename it to whatever you like
## Parameters
- `--test`:
If you have not paid yet, add this to preview the result first (there is a rate limit, so it is a bit slow)
- `--language`: specify the target language
- e.g. `--language "Simplified Chinese"`; the default is `"Simplified Chinese"`.
- Read the help message to find the available target languages: `python make_book.py --help`
- `--proxy`
Lets users in mainland China use a proxy during local testing; pass a string like `http://127.0.0.1:7890`
- `--resume`
After a manual interruption, add this option to resume from where the run stopped.
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model google --resume
```
- `--translate-tags`
Specify the tags to translate, separated by commas. An epub consists of html files; by default only the contents of `<p>` are translated. For example: `--translate-tags h1,h2,h3,p,div`
- `--book_from`
Specifies the e-reader type (currently only kobo is available); use `--device_path` to specify the mount point.
- `--api_base ${url}`
If you are behind a firewall and need Cloudflare Workers to replace the api_base, pass `--api_base ${url}`.
**Note: the api you enter here should look like '`https://xxxx/v1`', and the domain must be wrapped in quotes**
- `--allow_navigable_strings`
To translate untagged strings in the e-book, use `--allow_navigable_strings`, which adds navigable strings to the translation queue. **Note: when possible, look for a better-formatted e-book instead**
- `--prompt`
To tweak the prompt, use the `--prompt` parameter. Valid placeholders include `{text}` and `{language}`. You can configure the prompt in the following ways (a parsing sketch follows this list):
- If you do not need to set the `system` role: `--prompt "Translate {text} to {language}"` or `--prompt prompt_template_sample.txt` (a sample text file is at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
- If you need to set the `system` role: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (a sample JSON file is at [./prompt_template_sample.json](./prompt_template_sample.json)).
- You can also configure the `system` and `user` role prompts via the environment variables `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
The parameter can be a prompt template string or the path to a template `.txt` file.
- `--batch_size`
Specify the number of lines per translation batch (default 10; currently only effective for txt files)
- `--accumulated_num`:
Start translating once this many tokens have accumulated. gpt-3.5 limits total_token to 4090.
For example, with `--accumulated_num 1600` the model may output about 2200 tokens, plus roughly 200 tokens for the system_message and user_message: 1600 + 2200 + 200 = 4000, so the token count is close to the limit. You have to choose a value that suits you; there is no way to know before sending whether the limit will be reached
- `--use_context`:
prompts the model to create a three-paragraph summary. If it's the beginning of the translation, it will summarize the entire passage sent (the size depending on `--accumulated_num`).
For subsequent passages, it will amend the summary to include details from the most recent passage, creating a running one-paragraph context payload of the important details of the entire translated work. This improves consistency of flow and tone throughout the translation. This option is available for all ChatGPT-compatible models and Gemini models.
- `--context_paragraph_limit`:
When using `--use_context`, use `--context_paragraph_limit` to cap the number of context paragraphs.
- `--temperature`:
Use `--temperature` to set the temperature for the `chatgptapi`/`gpt4`/`claude` models,
e.g. `--temperature 0.7`.
- `--block_size`:
Use `--block_size` to merge multiple paragraphs into one block. This may improve accuracy and speed things up, but can disturb the original formatting. Must be used with `--single_translate`.
For example: `--block_size 5 --single_translate`.
- `--single_translate`:
Use `--single_translate` to output only the translated book, without creating a bilingual version.
- `--translation_style`:
e.g. `--translation_style "color: #808080; font-style: italic;"`
- `--retranslate "$translated_filepath" "file_name_in_epub" "start_str" "end_str"(optional)`:
- Retranslate the tags from start_str to end_str:
```shell
python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'
```
- Retranslate starting from the start_str tag:
```shell
python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'
```
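As referenced in the `--prompt` entry above, here is a minimal sketch of how a JSON-style `--prompt` value can be parsed. It mirrors the spirit of the CLI's `parse_prompt_arg`, but the function shown here is illustrative, not the repository's actual implementation.

```python
import json

def parse_prompt_string(prompt_arg):
    """Parse a --prompt value that may be a JSON string, e.g.
    '{"user": "Translate {text} to {language}", "system": "..."}'."""
    try:
        prompt = json.loads(prompt_arg)
    except json.JSONDecodeError:
        # Not JSON: treat the whole string as the user template.
        prompt = {"user": prompt_arg}
    if "user" not in prompt or "{text}" not in prompt["user"]:
        raise ValueError("prompt must contain `{text}` in the user template")
    return prompt

print(parse_prompt_string("Translate {text} to {language}"))
# -> {'user': 'Translate {text} to {language}'}
```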
### Example Usage
@ -98,10 +257,10 @@ python3 make_book.py --book_name test_books/the_little_prince.txt --test --batch
python3 make_book.py --model caiyun --caiyun_key 3975l6lr5pcbvidl6jl2 --book_name test_books/animal_farm.epub
# You can set BBM_CAIYUN_API_KEY in the environment to skip --openai_key
export BBM_CAIYUN_API_KEY=${your_api_key}
```
A more beginner-friendly example
```shell
python3 make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1'
@ -109,12 +268,11 @@ python3 make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_
python make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1'
```
[Demo video](https://www.bilibili.com/video/BV1XX4y1d75D/?t=0h07m08s)
[Demo video 2](https://www.bilibili.com/video/BV1T8411c7iU/)
[Demo video 2](https://www.bilibili.com/video/BV1T8411c7iU/)
Using the Azure OpenAI service
```shell
python3 make_book.py --book_name 'animal_farm.epub' --openai_key XXXXX --api_base 'https://example-endpoint.openai.azure.com' --deployment_id 'deployment-name'
@ -122,14 +280,11 @@ python3 make_book.py --book_name 'animal_farm.epub' --openai_key XXXXX --api_bas
python make_book.py --book_name 'animal_farm.epub' --openai_key XXXXX --api_base 'https://example-endpoint.openai.azure.com' --deployment_id 'deployment-name'
```
## Notes
1. Free-trial API tokens are limited; if you want more speed, consider a paid plan
2. PRs are welcome
# Thanks
- @[yetone](https://github.com/yetone)

README.md (264 changed lines)
View File

@ -2,11 +2,13 @@
[![litellm](https://img.shields.io/badge/%20%F0%9F%9A%85%20liteLLM-OpenAI%7CAzure%7CAnthropic%7CPalm%7CCohere%7CReplicate%7CHugging%20Face-blue?color=green)](https://github.com/BerriAI/litellm)
# bilingual_book_maker
The bilingual_book_maker is an AI translation tool that uses ChatGPT to assist users in creating multi-language versions of epub/txt/srt files and books. This tool is exclusively designed for translating epub books that have entered the public domain and is not intended for copyrighted works. Before using this tool, please review the project's **[disclaimer](./disclaimer.md)**.
![image](https://user-images.githubusercontent.com/15976103/222317531-a05317c5-4eee-49de-95cd-04063d9539d9.png)
## Supported Models
gpt-4, gpt-3.5-turbo, claude-2, palm, llama-2, azure-openai, command-nightly, gemini
To use non-OpenAI models, use the `liteLLM()` class; liteLLM supports all of the models above.
Find more info on using liteLLM here: https://github.com/BerriAI/litellm/blob/main/setup.py
@ -18,56 +20,221 @@ Find more info here for using liteLLM: https://github.com/BerriAI/litellm/blob/m
3. Environment with internet access or proxy
4. Python 3.8+
## Use
## Quick Start
A sample book, `test_books/animal_farm.epub`, is provided for testing purposes.
```shell
pip install -r requirements.txt
python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --test
OR
pip install -U bbook_maker
bbook --book_name test_books/animal_farm.epub --openai_key ${openai_key} --test
```
## Translate Service
- Install with `pip install -r requirements.txt` or `pip install -U bbook_maker`.
- Use `--openai_key` option to specify OpenAI API key. If you have multiple keys, separate them by commas (xxx,xxx,xxx) to reduce errors caused by API call limits.
Or, just set environment variable `BBM_OPENAI_API_KEY` instead.
Or, just set environment variable `BBM_OPENAI_API_KEY` instead.
- A sample book, `test_books/animal_farm.epub`, is provided for testing purposes.
- The default underlying model is [GPT-3.5-turbo](https://openai.com/blog/introducing-chatgpt-and-whisper-apis), which is used by ChatGPT currently. Use `--model gpt4` to change the underlying model to `GPT4`. You can also use `GPT4omini`.
- It is important to note that `gpt-4` is significantly more expensive than `gpt-4-turbo`; to avoid bumping into rate limits, we automatically balance queries across `gpt-4-1106-preview`, `gpt-4`, `gpt-4-32k`, `gpt-4-0613`, `gpt-4-32k-0613`.
- If you want to use a specific model alias with OpenAI (eg `gpt-4-1106-preview` or `gpt-3.5-turbo-0125`), you can use `--model openai --model_list gpt-4-1106-preview,gpt-3.5-turbo-0125`. `--model_list` takes a comma-separated list of model aliases.
- If using chatgptapi, you can add `--use_context` to add a context paragraph to each passage sent to the model for translation (see below).
- Supports the DeepL model via [DeepL Translator](https://rapidapi.com/splintPRO/api/dpl-translator) (you need to pay to get a token): `--model deepl --deepl_key ${deepl_key}`
- Supports the free DeepL model: `--model deeplfree`
- Supports the Google [Gemini](https://makersuite.google.com/app/apikey) model: `--model gemini --gemini_key ${gemini_key}`
- Supports the [Claude](https://console.anthropic.com/docs) model: `--model claude --claude_key ${claude_key}`
- Supports the [Tencent TranSmart](https://transmart.qq.com) model (free): `--model tencentransmart`
- Supports [Ollama](https://github.com/ollama/ollama) self-hosted models: `--ollama_model ${ollama_model_name}`
- If the ollama server is not running on localhost, use `--api_base http://x.x.x.x:port/v1` to point to the ollama server address
- Use `--test` option to preview the result if you haven't paid for the service. Note that there is a limit and it may take some time.
- Set the target language like `--language "Simplified Chinese"`. Default target language is `"Simplified Chinese"`.
Read available languages by helper message: `python make_book.py --help`
- Use `--proxy` option to specify proxy server for internet access. Enter a string such as `http://127.0.0.1:7890`.
- Use `--resume` option to manually resume the process after an interruption.
- epub is made of html files. By default, we only translate contents in `<p>`.
Use `--translate-tags` to specify the tags that need translation. Use commas to separate multiple tags. For example:
`--translate-tags h1,h2,h3,p,div`
- Use `--book_from` option to specify e-reader type (Now only `kobo` is available), and use `--device_path` to specify the mounting point.
- If you want to change api_base like using Cloudflare Workers, use `--api_base <URL>` to support it.
**Note: the api url should be '`https://xxxx/v1`'. Quotation marks are required.**
- It is important to note that `gpt-4` is significantly more expensive than `gpt-4-turbo`; to avoid bumping into rate limits, we automatically balance queries across `gpt-4-1106-preview`, `gpt-4`, `gpt-4-32k`, `gpt-4-0613`, `gpt-4-32k-0613`.
- If you want to use a specific model alias with OpenAI (eg `gpt-4-1106-preview` or `gpt-3.5-turbo-0125`), you can use `--model openai --model_list gpt-4-1106-preview,gpt-3.5-turbo-0125`. `--model_list` takes a comma-separated list of model aliases.
- If using chatgptapi, you can add `--use_context` to add a context paragraph to each passage sent to the model for translation (see below).
* DeepL
Supports the DeepL model via [DeepL Translator](https://rapidapi.com/splintPRO/api/dpl-translator); you need to pay to get a token
```
python3 make_book.py --book_name test_books/animal_farm.epub --model deepl --deepl_key ${deepl_key}
```
* DeepL free
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model deeplfree
```
* [Claude](https://console.anthropic.com/docs)
Use the [Claude](https://console.anthropic.com/docs) model to translate
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model claude --claude_key ${claude_key}
```
* Google Translate
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model google
```
* Caiyun Translate
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model caiyun --caiyun_key ${caiyun_key}
```
* Gemini
Support Google [Gemini](https://aistudio.google.com/app/apikey) model, use `--model gemini` for Gemini Flash or `--model geminipro` for Gemini Pro.
If you want to use a specific model alias with Gemini (eg `gemini-1.5-flash-002` or `gemini-1.5-flash-8b-exp-0924`), you can use `--model gemini --model_list gemini-1.5-flash-002,gemini-1.5-flash-8b-exp-0924`. `--model_list` takes a comma-separated list of model aliases.
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model gemini --gemini_key ${gemini_key}
```
* [Tencent TranSmart](https://transmart.qq.com)
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model tencentransmart
```
* [xAI](https://x.ai)
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model xai --xai_key ${xai_key}
```
* [Ollama](https://github.com/ollama/ollama)
Supports [Ollama](https://github.com/ollama/ollama) self-hosted models.
If ollama server is not running on localhost, use `--api_base http://x.x.x.x:port/v1` to point to the ollama server address
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --ollama_model ${ollama_model_name}
```
* [Groq](https://console.groq.com/keys)
GroqCloud currently supports a number of models; see [Supported Models](https://console.groq.com/docs/models).
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --groq_key [your_key] --model groq --model_list llama3-8b-8192
```
## Use
- Once the translation is complete, a bilingual book named `${book_name}_bilingual.epub` will be generated.
- If an error occurs, or you interrupt the translation by pressing `CTRL+C`, a book named `${book_name}_bilingual_temp.epub` will be generated. You can simply rename it to any name you like.
- If you want to translate strings in an e-book that aren't labeled with any tags, you can use the `--allow_navigable_strings` parameter. This will add the strings to the translation queue. **Note that it's best to look for e-books that are more standardized if possible.**
- To tweak the prompt, use the `--prompt` parameter. Valid placeholders for the `user` role template include `{text}` and `{language}`. It supports a few ways to configure the prompt:
If you don't need to set the `system` role content, you can simply set it up like this: `--prompt "Translate {text} to {language}."` or `--prompt prompt_template_sample.txt` (example of a text file can be found at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
If you need to set the `system` role content, you can use the following format: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (example of a JSON file can be found at [./prompt_template_sample.json](./prompt_template_sample.json)).
You can also set the `user` and `system` role prompt by setting environment variables: `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
- Use the `--batch_size` parameter to specify the number of lines for batch translation (default is 10, currently only effective for txt files).
- `--accumulated_num` Wait for this many tokens to accumulate before starting the translation. gpt-3.5 limits total_token to 4090. For example, if you use `--accumulated_num 1600`, OpenAI may output about 2200 tokens, plus roughly 200 tokens for the other system and user messages: 1600 + 2200 + 200 = 4000, so you are close to the limit. You have to choose a value that suits you; there is no way to know before sending whether the limit will be reached (a token-counting sketch follows this list).
- `--use_context` prompts the model to create a three-paragraph summary. If it's the beginning of the translation, it will summarize the entire passage sent (the size depending on `--accumulated_num`). For subsequent passages, it will amend the summary to include details from the most recent passage, creating a running one-paragraph context payload of the important details of the entire translated work. This improves consistency of flow and tone throughout the translation. This option is available for all ChatGPT-compatible models.
- Use `--context_paragraph_limit` to set a limit on the number of context paragraphs when using the `--use_context` option.
- Use `--temperature` to set the temperature parameter for `chatgptapi`/`gpt4`/`claude` models. For example: `--temperature 0.7`.
- Use `--block_size` to merge multiple paragraphs into one block. This may increase accuracy and speed up the process but can disturb the original format. Must be used with `--single_translate`. For example: `--block_size 5`.
- Use `--single_translate` to output only the translated book without creating a bilingual version.
- `--translation_style` example: `--translation_style "color: #808080; font-style: italic;"`
- `--retranslate "$translated_filepath" "file_name_in_epub" "start_str" "end_str"(optional)`<br>
Retranslate the tags from start_str to end_str:
`python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'`<br>
Retranslate starting from start_str's tag:
`python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'`
- If an error occurs, or you interrupt the translation by pressing `CTRL+C`, a book named `{book_name}_bilingual_temp.epub` will be generated. You can simply rename it to any name you like.
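The token-counting sketch referenced in the `--accumulated_num` item above, using the real `tiktoken` tokenizer; the batching function itself is illustrative, not this repository's implementation.

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer family used by gpt-3.5-turbo

def accumulate(paragraphs, accumulated_num=1600):
    """Yield batches whose combined token count stays under accumulated_num."""
    batch, tokens = [], 0
    for p in paragraphs:
        n = len(enc.encode(p))
        if batch and tokens + n > accumulated_num:
            yield batch
            batch, tokens = [], 0
        batch.append(p)
        tokens += n
    if batch:
        yield batch

# 1600 (input) + ~2200 (output) + ~200 (system/user messages) ≈ 4000 < 4090
for batch in accumulate(["Some paragraph text."] * 500):
    print(len(batch), "paragraphs in this request")
```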
## Params
- `--test`:
Use the `--test` option to preview the result if you haven't paid for the service. Note that there is a rate limit, so it may take some time.
- `--language`:
Set the target language like `--language "Simplified Chinese"`. Default target language is `"Simplified Chinese"`.
Read available languages by helper message: `python make_book.py --help`
- `--proxy`:
Use the `--proxy` option to specify a proxy server for internet access. Pass a string such as `http://127.0.0.1:7890`.
- `--resume`:
Use the `--resume` option to manually resume the process after an interruption.
```shell
python3 make_book.py --book_name test_books/animal_farm.epub --model google --resume
```
- `--translate-tags`:
epub is made of html files. By default, we only translate contents in `<p>`.
Use `--translate-tags` to specify the tags that need translation. Use commas to separate multiple tags.
For example: `--translate-tags h1,h2,h3,p,div`
- `--book_from`:
Use the `--book_from` option to specify the e-reader type (currently only `kobo` is available), and use `--device_path` to specify the mounting point.
- `--api_base`:
If you need to change the api_base, for example to use Cloudflare Workers, pass `--api_base <URL>`.
**Note: the api url should be '`https://xxxx/v1`'. Quotation marks are required.**
- `--allow_navigable_strings`:
If you want to translate strings in an e-book that aren't labeled with any tags, you can use the `--allow_navigable_strings` parameter. This will add the strings to the translation queue. **Note that it's best to look for e-books that are more standardized if possible.**
- `--prompt`:
To tweak the prompt, use the `--prompt` parameter. Valid placeholders for the `user` role template include `{text}` and `{language}`. It supports a few ways to configure the prompt:
- If you don't need to set the `system` role content, you can simply set it up like this: `--prompt "Translate {text} to {language}."` or `--prompt prompt_template_sample.txt` (example of a text file can be found at [./prompt_template_sample.txt](./prompt_template_sample.txt)).
- If you need to set the `system` role content, you can use the following format: `--prompt '{"user":"Translate {text} to {language}", "system": "You are a professional translator."}'` or `--prompt prompt_template_sample.json` (example of a JSON file can be found at [./prompt_template_sample.json](./prompt_template_sample.json)).
- You can now use [PromptDown](https://github.com/btfranklin/promptdown) format (`.md` files) for more structured prompts: `--prompt prompt_md.prompt.md`. PromptDown supports both traditional system messages and developer messages (used by newer AI models). Example:
```markdown
# Translation Prompt
## Developer Message
You are a professional translator who specializes in accurate translations.
## Conversation
| Role | Content |
|-------|---------------------------------------------------|
| User | Please translate the following text into {language}:\n\n{text} |
```
- You can also set the `user` and `system` role prompt by setting environment variables: `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
- `--batch_size`:
Use the `--batch_size` parameter to specify the number of lines for batch translation (default is 10, currently only effective for txt files).
- `--accumulated_num`:
Wait for this many tokens to accumulate before starting the translation. gpt-3.5 limits total_token to 4090. For example, if you use `--accumulated_num 1600`, OpenAI may output about 2200 tokens, plus roughly 200 tokens for the system and user messages: 1600 + 2200 + 200 = 4000, so you are close to the limit. You have to choose a value
that suits you; there is no way to know before sending whether the limit will be reached.
- `--use_context`:
prompts the model to create a three-paragraph summary. If it's the beginning of the translation, it will summarize the entire passage sent (the size depending on `--accumulated_num`).
For subsequent passages, it will amend the summary to include details from the most recent passage, creating a running one-paragraph context payload of the important details of the entire translated work. This improves consistency of flow and tone throughout the translation. This option is available for all ChatGPT-compatible models and Gemini models. (A sketch of this running-summary approach follows this section.)
- `--context_paragraph_limit`:
Use `--context_paragraph_limit` to set a limit on the number of context paragraphs when using the `--use_context` option.
- `--temperature`:
Use `--temperature` to set the temperature parameter for `chatgptapi`/`gpt4`/`claude` models.
For example: `--temperature 0.7`.
- `--block_size`:
Use `--block_size` to merge multiple paragraphs into one block. This may increase accuracy and speed up the process but can disturb the original format. Must be used with `--single_translate`.
For example: `--block_size 5 --single_translate`.
- `--single_translate`:
Use `--single_translate` to output only the translated book without creating a bilingual version.
- `--translation_style`:
example: `--translation_style "color: #808080; font-style: italic;"`
- `--retranslate "$translated_filepath" "file_name_in_epub" "start_str" "end_str"(optional)`:
Retranslate from start_str to end_str's tag:
```shell
python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'
```
Retranslate start_str's tag:
```shell
python3 "make_book.py" --book_name "test_books/animal_farm.epub" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'
```
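A sketch of the running-summary behaviour described under `--use_context` above. The `chat` callable stands in for an LLM request; nothing here is the repository's actual implementation.

```python
def translate_with_context(chat, passages, context_paragraph_limit=3):
    """Translate passages while carrying a rolling summary as context."""
    summary, results = "", []
    for passage in passages:
        results.append(chat(
            f"Context from earlier passages: {summary}\n\n"
            f"Translate the following passage, keeping tone consistent:\n{passage}"
        ))
        # Amend the summary with details from the newest passage, capped
        # at roughly context_paragraph_limit paragraphs.
        summary = chat(
            f"Update this summary with the new passage, "
            f"max {context_paragraph_limit} paragraphs.\n"
            f"Summary: {summary}\nPassage: {passage}"
        )
    return results
```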
### Examples
**Note: if you installed with `pip install bbook_maker`, all of the commands can be run as `bbook_maker args`**
@ -82,9 +249,12 @@ python3 make_book.py --book_name test_books/Lex_Fridman_episode_322.srt --openai
# Or translate the whole book
python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --language zh-hans
# Or translate the whole book using Gemini
# Or translate the whole book using Gemini flash
python3 make_book.py --book_name test_books/animal_farm.epub --gemini_key ${gemini_key} --model gemini
# Use a specific list of Gemini model aliases
python3 make_book.py --book_name test_books/animal_farm.epub --gemini_key ${gemini_key} --model gemini --model_list gemini-1.5-flash-002,gemini-1.5-flash-8b-exp-0924
# Set env OPENAI_API_KEY to ignore option --openai_key
export OPENAI_API_KEY=${your_api_key}
@ -140,6 +310,7 @@ export BBM_CAIYUN_API_KEY=${your_api_key}
```
More understandable example
```shell
python3 make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1'
@ -148,6 +319,7 @@ python make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_b
```
Microsoft Azure Endpoints
```shell
python3 make_book.py --book_name 'animal_farm.epub' --openai_key XXXXX --api_base 'https://example-endpoint.openai.azure.com' --deployment_id 'deployment-name'

View File

@ -13,7 +13,60 @@ def parse_prompt_arg(prompt_arg):
if prompt_arg is None:
return prompt
if not any(prompt_arg.endswith(ext) for ext in [".json", ".txt"]):
# Check if it's a path to a markdown file (PromptDown format)
if prompt_arg.endswith(".md") and os.path.exists(prompt_arg):
try:
from promptdown import StructuredPrompt
structured_prompt = StructuredPrompt.from_promptdown_file(prompt_arg)
# Initialize our prompt structure
prompt = {}
# Handle developer_message or system_message
# Developer message takes precedence if both are present
if (
hasattr(structured_prompt, "developer_message")
and structured_prompt.developer_message
):
prompt["system"] = structured_prompt.developer_message
elif (
hasattr(structured_prompt, "system_message")
and structured_prompt.system_message
):
prompt["system"] = structured_prompt.system_message
# Extract user message from conversation
if (
hasattr(structured_prompt, "conversation")
and structured_prompt.conversation
):
for message in structured_prompt.conversation:
if message.role.lower() == "user":
prompt["user"] = message.content
break
# Ensure we found a user message
if "user" not in prompt or not prompt["user"]:
raise ValueError(
"PromptDown file must contain at least one user message"
)
print(f"Successfully loaded PromptDown file: {prompt_arg}")
# Validate required placeholders
if any(c not in prompt["user"] for c in ["{text}"]):
raise ValueError(
"User message in PromptDown must contain `{text}` placeholder"
)
return prompt
except Exception as e:
print(f"Error parsing PromptDown file: {e}")
# Fall through to other parsing methods
# Existing parsing logic for JSON strings and other formats
if not any(prompt_arg.endswith(ext) for ext in [".json", ".txt", ".md"]):
try:
# user can define prompt by passing a json string
# eg: --prompt '{"system": "You are a professional translator who translates computer technology books", "user": "Translate \`{text}\` to {language}"}'
@ -35,8 +88,9 @@ def parse_prompt_arg(prompt_arg):
else:
raise FileNotFoundError(f"{prompt_arg} not found")
if prompt is None or any(c not in prompt["user"] for c in ["{text}", "{language}"]):
raise ValueError("prompt must contain `{text}` and `{language}`")
# if prompt is None or any(c not in prompt["user"] for c in ["{text}", "{language}"]):
if prompt is None or any(c not in prompt["user"] for c in ["{text}"]):
raise ValueError("prompt must contain `{text}`")
if "user" not in prompt:
raise ValueError("prompt must contain the key of `user`")
@ -122,6 +176,14 @@ def main():
help="You can get Groq Key from https://console.groq.com/keys",
)
# for xAI
parser.add_argument(
"--xai_key",
dest="xai_key",
type=str,
help="You can get xAI Key from https://console.x.ai/",
)
parser.add_argument(
"--test",
dest="test",
@ -290,7 +352,7 @@ So you are close to reaching the limit. You have to choose your own value, there
"--temperature",
type=float,
default=1.0,
help="temperature parameter for `chatgptapi`/`gpt4`/`claude`",
help="temperature parameter for `chatgptapi`/`gpt4`/`claude`/`gemini`",
)
parser.add_argument(
"--block_size",
@ -316,11 +378,20 @@ So you are close to reaching the limit. You have to choose your own value, there
action="store_true",
help="Use pre-generated batch translations to create files. Run with --batch first before using this option",
)
parser.add_argument(
"--interval",
type=float,
default=0.01,
help="Request interval in seconds (e.g., 0.1 for 100ms). Currently only supported for Gemini models. Default: 0.01",
)
options = parser.parse_args()
if not options.book_name:
print(f"Error: please provide the path of your book using --book_name <path>")
exit(1)
if not os.path.isfile(options.book_name):
print(f"Error: {options.book_name} does not exist.")
print(f"Error: the book {options.book_name!r} does not exist.")
exit(1)
PROXY = options.proxy
@ -331,7 +402,17 @@ So you are close to reaching the limit. You have to choose your own value, there
translate_model = MODEL_DICT.get(options.model)
assert translate_model is not None, "unsupported model"
API_KEY = ""
if options.model in ["openai", "chatgptapi", "gpt4", "gpt4omini"]:
if options.model in [
"openai",
"chatgptapi",
"gpt4",
"gpt4omini",
"gpt4o",
"o1preview",
"o1",
"o1mini",
"o3mini",
]:
if OPENAI_API_KEY := (
options.openai_key
or env.get(
@ -358,7 +439,7 @@ So you are close to reaching the limit. You have to choose your own value, there
API_KEY = options.deepl_key or env.get("BBM_DEEPL_API_KEY")
if not API_KEY:
raise Exception("Please provide deepl key")
elif options.model == "claude":
elif options.model.startswith("claude"):
API_KEY = options.claude_key or env.get("BBM_CLAUDE_API_KEY")
if not API_KEY:
raise Exception("Please provide claude key")
@ -366,10 +447,12 @@ So you are close to reaching the limit. You have to choose your own value, there
API_KEY = options.custom_api or env.get("BBM_CUSTOM_API")
if not API_KEY:
raise Exception("Please provide custom translate api")
elif options.model == "gemini":
elif options.model in ["gemini", "geminipro"]:
API_KEY = options.gemini_key or env.get("BBM_GOOGLE_GEMINI_KEY")
elif options.model == "groq":
API_KEY = options.groq_key or env.get("BBM_GROQ_API_KEY")
elif options.model == "xai":
API_KEY = options.xai_key or env.get("BBM_XAI_API_KEY")
else:
API_KEY = ""
@ -421,6 +504,7 @@ So you are close to reaching the limit. You have to choose your own value, there
prompt_config=parse_prompt_arg(options.prompt_arg),
single_translate=options.single_translate,
context_flag=options.context_flag,
context_paragraph_limit=options.context_paragraph_limit,
temperature=options.temperature,
)
# other options
@ -449,6 +533,11 @@ So you are close to reaching the limit. You have to choose your own value, there
"chatgptapi",
"gpt4",
"gpt4omini",
"gpt4o",
"o1",
"o1preview",
"o1mini",
"o3mini",
], "only support chatgptapi for deployment_id"
if not options.api_base:
raise ValueError("`api_base` must be provided when using `deployment_id`")
@ -471,6 +560,18 @@ So you are close to reaching the limit. You have to choose your own value, there
e.translate_model.set_gpt4_models()
if options.model == "gpt4omini":
e.translate_model.set_gpt4omini_models()
if options.model == "gpt4o":
e.translate_model.set_gpt4o_models()
if options.model == "o1preview":
e.translate_model.set_o1preview_models()
if options.model == "o1":
e.translate_model.set_o1_models()
if options.model == "o1mini":
e.translate_model.set_o1mini_models()
if options.model == "o3mini":
e.translate_model.set_o3mini_models()
if options.model.startswith("claude-"):
e.translate_model.set_claude_model(options.model)
if options.block_size > 0:
e.block_size = options.block_size
if options.batch_flag:
@ -478,6 +579,16 @@ So you are close to reaching the limit. You have to choose your own value, there
if options.batch_use_flag:
e.batch_use_flag = options.batch_use_flag
if options.model in ("gemini", "geminipro"):
e.translate_model.set_interval(options.interval)
if options.model == "gemini":
if options.model_list:
e.translate_model.set_model_list(options.model_list.split(","))
else:
e.translate_model.set_geminiflash_models()
if options.model == "geminipro":
e.translate_model.set_geminipro_models()
e.make_bilingual_book()
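For reference, a condensed sketch of what the new PromptDown branch above does, using only the `promptdown` calls that appear in the diff (the file name is the sample one from the README):

```python
from promptdown import StructuredPrompt

sp = StructuredPrompt.from_promptdown_file("prompt_md.prompt.md")
# Developer message takes precedence over system message if both exist.
system = getattr(sp, "developer_message", None) or getattr(sp, "system_message", None)
user = next(m.content for m in sp.conversation if m.role.lower() == "user")
assert "{text}" in user, "user message must contain the {text} placeholder"
```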

View File

@ -1,10 +1,12 @@
from book_maker.loader.epub_loader import EPUBBookLoader
from book_maker.loader.txt_loader import TXTBookLoader
from book_maker.loader.srt_loader import SRTBookLoader
from book_maker.loader.md_loader import MarkdownBookLoader
BOOK_LOADER_DICT = {
"epub": EPUBBookLoader,
"txt": TXTBookLoader,
"srt": SRTBookLoader,
"md": MarkdownBookLoader,
# TODO add more here
}

View File

@ -33,8 +33,8 @@ class EPUBBookLoader(BaseBookLoader):
prompt_config=None,
single_translate=False,
context_flag=False,
temperature=1.0,
context_paragraph_limit=0,
temperature=1.0,
):
self.epub_name = epub_name
self.new_epub = epub.EpubBook()

View File

@ -1,7 +1,7 @@
import re
from copy import copy
import backoff
import logging
from copy import copy
logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(__name__)
@ -37,9 +37,10 @@ class EPUBBookLoaderHelper:
Exception,
on_backoff=lambda details: logger.warning(f"retry backoff: {details}"),
on_giveup=lambda details: logger.warning(f"retry abort: {details}"),
jitter=None,
)
def translate_with_backoff(self, **kwargs):
return self.translate_model.translate(**kwargs)
def translate_with_backoff(self, text, context_flag=False):
return self.translate_model.translate(text, context_flag)
def deal_new(self, p, wait_p_list, single_translate=False):
self.deal_old(wait_p_list, single_translate, self.context_flag)
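For context, a self-contained sketch of the `backoff.on_exception` pattern used in `translate_with_backoff`; `max_tries` and the demo function are illustrative, not values from this repository:

```python
import backoff

@backoff.on_exception(
    backoff.expo,  # exponential wait between retries
    Exception,
    max_tries=4,   # illustrative cap
    jitter=None,   # disable randomized waits, as in the fix above
    on_backoff=lambda details: print(f"retry backoff: {details}"),
)
def flaky_translate(text, context_flag=False):
    # Stand-in for self.translate_model.translate(text, context_flag)
    raise RuntimeError("upstream API error")
```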

View File

@ -0,0 +1,176 @@
import sys
from pathlib import Path
from book_maker.utils import prompt_config_to_kwargs
from .base_loader import BaseBookLoader
class MarkdownBookLoader(BaseBookLoader):
def __init__(
self,
md_name,
model,
key,
resume,
language,
model_api_base=None,
is_test=False,
test_num=5,
prompt_config=None,
single_translate=False,
context_flag=False,
context_paragraph_limit=0,
temperature=1.0,
) -> None:
self.md_name = md_name
self.translate_model = model(
key,
language,
api_base=model_api_base,
temperature=temperature,
**prompt_config_to_kwargs(prompt_config),
)
self.is_test = is_test
self.p_to_save = []
self.bilingual_result = []
self.bilingual_temp_result = []
self.test_num = test_num
self.batch_size = 10
self.single_translate = single_translate
self.md_paragraphs = []
try:
with open(f"{md_name}", encoding="utf-8") as f:
self.origin_book = f.read().splitlines()
except Exception as e:
raise Exception("can not load file") from e
self.resume = resume
self.bin_path = f"{Path(md_name).parent}/.{Path(md_name).stem}.temp.bin"
if self.resume:
self.load_state()
self.process_markdown_content()
def process_markdown_content(self):
"""将原始内容处理成 markdown 段落"""
current_paragraph = []
for line in self.origin_book:
# A blank line ends the current paragraph, if any
if not line.strip() and current_paragraph:
self.md_paragraphs.append("\n".join(current_paragraph))
current_paragraph = []
# A heading line becomes a paragraph of its own
elif line.strip().startswith("#"):
if current_paragraph:
self.md_paragraphs.append("\n".join(current_paragraph))
current_paragraph = []
self.md_paragraphs.append(line)
# Otherwise, append to the current paragraph
else:
current_paragraph.append(line)
# Handle the final paragraph
if current_paragraph:
self.md_paragraphs.append("\n".join(current_paragraph))
@staticmethod
def _is_special_text(text):
return text.isdigit() or text.isspace() or len(text) == 0
def _make_new_book(self, book):
pass
def make_bilingual_book(self):
index = 0
p_to_save_len = len(self.p_to_save)
try:
sliced_list = [
self.md_paragraphs[i : i + self.batch_size]
for i in range(0, len(self.md_paragraphs), self.batch_size)
]
for paragraphs in sliced_list:
batch_text = "\n\n".join(paragraphs)
if self._is_special_text(batch_text):
continue
if not self.resume or index >= p_to_save_len:
try:
max_retries = 3
retry_count = 0
while retry_count < max_retries:
try:
temp = self.translate_model.translate(batch_text)
break
except AttributeError as ae:
print(f"翻译出错: {ae}")
retry_count += 1
if retry_count == max_retries:
raise Exception("翻译模型初始化失败") from ae
except Exception as e:
print(f"翻译过程中出错: {e}")
raise Exception("翻译过程中出现错误") from e
self.p_to_save.append(temp)
if not self.single_translate:
self.bilingual_result.append(batch_text)
self.bilingual_result.append(temp)
index += self.batch_size
if self.is_test and index > self.test_num:
break
self.save_file(
f"{Path(self.md_name).parent}/{Path(self.md_name).stem}_bilingual.md",
self.bilingual_result,
)
except (KeyboardInterrupt, Exception) as e:
print(f"发生错误: {e}")
print("程序将保存进度,您可以稍后继续")
self._save_progress()
self._save_temp_book()
sys.exit(1)  # non-zero exit code indicates an error
def _save_temp_book(self):
index = 0
sliced_list = [
self.origin_book[i : i + self.batch_size]
for i in range(0, len(self.origin_book), self.batch_size)
]
for i in range(len(sliced_list)):
batch_text = "".join(sliced_list[i])
self.bilingual_temp_result.append(batch_text)
if self._is_special_text(self.origin_book[i]):
continue
if index < len(self.p_to_save):
self.bilingual_temp_result.append(self.p_to_save[index])
index += 1
self.save_file(
f"{Path(self.md_name).parent}/{Path(self.md_name).stem}_bilingual_temp.txt",
self.bilingual_temp_result,
)
def _save_progress(self):
try:
with open(self.bin_path, "w", encoding="utf-8") as f:
f.write("\n".join(self.p_to_save))
except:
raise Exception("can not save resume file")
def load_state(self):
try:
with open(self.bin_path, encoding="utf-8") as f:
self.p_to_save = f.read().splitlines()
except Exception as e:
raise Exception("can not load resume file") from e
def save_file(self, book_path, content):
try:
with open(book_path, "w", encoding="utf-8") as f:
f.write("\n".join(content))
except:
raise Exception("can not save file")

View File

@ -25,6 +25,7 @@ class SRTBookLoader(BaseBookLoader):
prompt_config=None,
single_translate=False,
context_flag=False,
context_paragraph_limit=0,
temperature=1.0,
) -> None:
self.srt_name = srt_name

View File

@ -20,6 +20,7 @@ class TXTBookLoader(BaseBookLoader):
prompt_config=None,
single_translate=False,
context_flag=False,
context_paragraph_limit=0,
temperature=1.0,
) -> None:
self.txt_name = txt_name

View File

@ -8,20 +8,33 @@ from book_maker.translator.gemini_translator import Gemini
from book_maker.translator.groq_translator import GroqClient
from book_maker.translator.tencent_transmart_translator import TencentTranSmart
from book_maker.translator.custom_api_translator import CustomAPI
from book_maker.translator.xai_translator import XAIClient
MODEL_DICT = {
"openai": ChatGPTAPI,
"chatgptapi": ChatGPTAPI,
"gpt4": ChatGPTAPI,
"gpt4omini": ChatGPTAPI,
"gpt4o": ChatGPTAPI,
"o1preview": ChatGPTAPI,
"o1": ChatGPTAPI,
"o1mini": ChatGPTAPI,
"o3mini": ChatGPTAPI,
"google": Google,
"caiyun": Caiyun,
"deepl": DeepL,
"deeplfree": DeepLFree,
"claude": Claude,
"claude-3-5-sonnet-latest": Claude,
"claude-3-5-sonnet-20241022": Claude,
"claude-3-5-sonnet-20240620": Claude,
"claude-3-5-haiku-latest": Claude,
"claude-3-5-haiku-20241022": Claude,
"gemini": Gemini,
"geminipro": Gemini,
"groq": GroqClient,
"tencentransmart": TencentTranSmart,
"customapi": CustomAPI,
"xai": XAIClient,
# add more here
}
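The CLI resolves `--model` through this mapping (cf. `translate_model = MODEL_DICT.get(options.model)` earlier in the diff). A minimal sketch, assuming this table lives in `book_maker/translator/__init__.py`:

```python
from book_maker.translator import MODEL_DICT

translate_model = MODEL_DICT.get("o3mini")  # -> ChatGPTAPI
assert translate_model is not None, "unsupported model"
```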

View File

@ -42,6 +42,27 @@ GPT4oMINI_MODEL_LIST = [
"gpt-4o-mini",
"gpt-4o-mini-2024-07-18",
]
GPT4o_MODEL_LIST = [
"gpt-4o",
"gpt-4o-2024-05-13",
"gpt-4o-2024-08-06",
"chatgpt-4o-latest",
]
O1PREVIEW_MODEL_LIST = [
"o1-preview",
"o1-preview-2024-09-12",
]
O1_MODEL_LIST = [
"o1",
"o1-2024-12-17",
]
O1MINI_MODEL_LIST = [
"o1-mini",
"o1-mini-2024-09-12",
]
O3MINI_MODEL_LIST = [
"o3-mini",
]
class ChatGPTAPI(Base):
@ -209,40 +230,6 @@ class ChatGPTAPI(Base):
lines = [line.strip() for line in lines if line.strip() != ""]
return lines
def get_best_result_list(
self,
plist_len,
new_str,
sleep_dur,
result_list,
max_retries=15,
):
if len(result_list) == plist_len:
return result_list, 0
best_result_list = result_list
retry_count = 0
while retry_count < max_retries and len(result_list) != plist_len:
print(
f"bug: {plist_len} -> {len(result_list)} : Number of paragraphs before and after translation",
)
print(f"sleep for {sleep_dur}s and retry {retry_count+1} ...")
time.sleep(sleep_dur)
retry_count += 1
result_list = self.translate_and_split_lines(new_str)
if (
len(result_list) == plist_len
or len(best_result_list) < len(result_list) <= plist_len
or (
len(result_list) < len(best_result_list)
and len(best_result_list) > plist_len
)
):
best_result_list = result_list
return best_result_list, retry_count
def log_retry(self, state, retry_count, elapsed_time, log_path="log/buglog.txt"):
if retry_count == 0:
return
@ -312,48 +299,131 @@ class ChatGPTAPI(Base):
return new_text
def translate_list(self, plist):
sep = "\n\n\n\n\n"
# new_str = sep.join([item.text for item in plist])
plist_len = len(plist)
new_str = ""
i = 1
for p in plist:
# Create a list of original texts and add clear numbering markers to each paragraph
formatted_text = ""
for i, p in enumerate(plist, 1):
temp_p = copy(p)
for sup in temp_p.find_all("sup"):
sup.extract()
new_str += f"({i}) {temp_p.get_text().strip()}{sep}"
i = i + 1
para_text = temp_p.get_text().strip()
# Using special delimiters and clear numbering
formatted_text += f"PARAGRAPH {i}:\n{para_text}\n\n"
if new_str.endswith(sep):
new_str = new_str[: -len(sep)]
print(f"plist len = {plist_len}")
new_str = self.join_lines(new_str)
original_prompt_template = self.prompt_template
plist_len = len(plist)
print(f"plist len = {len(plist)}")
result_list = self.translate_and_split_lines(new_str)
start_time = time.time()
result_list, retry_count = self.get_best_result_list(
plist_len,
new_str,
6, # WTF this magic number here?
result_list,
structured_prompt = (
f"Translate the following {plist_len} paragraphs to {{language}}. "
f"CRUCIAL INSTRUCTION: Format your response using EXACTLY this structure:\n\n"
f"TRANSLATION OF PARAGRAPH 1:\n[Your translation of paragraph 1 here]\n\n"
f"TRANSLATION OF PARAGRAPH 2:\n[Your translation of paragraph 2 here]\n\n"
f"... and so on for all {plist_len} paragraphs.\n\n"
f"You MUST provide EXACTLY {plist_len} translated paragraphs. "
f"Do not merge, split, or rearrange paragraphs. "
f"Translate each paragraph independently but consistently. "
f"Keep all numbers and special formatting in your translation. "
f"Each original paragraph must correspond to exactly one translated paragraph."
)
end_time = time.time()
self.prompt_template = structured_prompt + " ```{text}```"
state = "fail" if len(result_list) != plist_len else "success"
log_path = "log/buglog.txt"
translated_text = self.translate(formatted_text, False)
self.log_retry(state, retry_count, end_time - start_time, log_path)
self.log_translation_mismatch(plist_len, result_list, new_str, sep, log_path)
# Extract translations from structured output
translated_paragraphs = []
for i in range(1, plist_len + 1):
pattern = (
r"TRANSLATION OF PARAGRAPH "
+ str(i)
+ r":(.*?)(?=TRANSLATION OF PARAGRAPH \d+:|\Z)"
)
matches = re.findall(pattern, translated_text, re.DOTALL)
if matches:
translated_paragraph = matches[0].strip()
translated_paragraphs.append(translated_paragraph)
else:
print(f"Warning: Could not find translation for paragraph {i}")
loose_pattern = (
r"(?:TRANSLATION|PARAGRAPH|PARA).*?"
+ str(i)
+ r".*?:(.*?)(?=(?:TRANSLATION|PARAGRAPH|PARA).*?\d+.*?:|\Z)"
)
loose_matches = re.findall(loose_pattern, translated_text, re.DOTALL)
if loose_matches:
translated_paragraphs.append(loose_matches[0].strip())
else:
translated_paragraphs.append("")
self.prompt_template = original_prompt_template
# If the number of extracted paragraphs is incorrect, try the alternative extraction method.
if len(translated_paragraphs) != plist_len:
print(
f"Warning: Extracted {len(translated_paragraphs)}/{plist_len} paragraphs. Using fallback extraction."
)
all_para_pattern = r"(?:TRANSLATION|PARAGRAPH|PARA).*?(\d+).*?:(.*?)(?=(?:TRANSLATION|PARAGRAPH|PARA).*?\d+.*?:|\Z)"
all_matches = re.findall(all_para_pattern, translated_text, re.DOTALL)
if all_matches:
# Create a dictionary to map translation content based on paragraph numbers
para_dict = {}
for num_str, content in all_matches:
try:
num = int(num_str)
if 1 <= num <= plist_len:
para_dict[num] = content.strip()
except ValueError:
continue
# Rebuild the translation list in the original order
new_translated_paragraphs = []
for i in range(1, plist_len + 1):
if i in para_dict:
new_translated_paragraphs.append(para_dict[i])
else:
new_translated_paragraphs.append("")
if len(new_translated_paragraphs) == plist_len:
translated_paragraphs = new_translated_paragraphs
if len(translated_paragraphs) < plist_len:
translated_paragraphs.extend(
[""] * (plist_len - len(translated_paragraphs))
)
elif len(translated_paragraphs) > plist_len:
translated_paragraphs = translated_paragraphs[:plist_len]
return translated_paragraphs
def extract_paragraphs(self, text, paragraph_count):
"""Extract paragraphs from translated text, ensuring paragraph count is preserved."""
# First try to extract by paragraph numbers (1), (2), etc.
result_list = []
for i in range(1, paragraph_count + 1):
pattern = rf"\({i}\)\s*(.*?)(?=\s*\({i + 1}\)|\Z)"
match = re.search(pattern, text, re.DOTALL)
if match:
result_list.append(match.group(1).strip())
# If exact pattern matching failed, try another approach
if len(result_list) != paragraph_count:
pattern = r"\((\d+)\)\s*(.*?)(?=\s*\(\d+\)|\Z)"
matches = re.findall(pattern, text, re.DOTALL)
if matches:
# Sort by paragraph number
matches.sort(key=lambda x: int(x[0]))
result_list = [match[1].strip() for match in matches]
# Fallback to original line-splitting approach
if len(result_list) != paragraph_count:
lines = text.splitlines()
result_list = [line.strip() for line in lines if line.strip() != ""]
# Strip leading "(n)" or "n." markers; sometimes "(n)" gets translated as "n."
result_list = [re.sub(r"^(\(\d+\)|\d+\.|(\d+))\s*", "", s) for s in result_list]
return result_list
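For instance, a response numbered with (1)/(2) markers is recovered in order, and any leading markers that survive translation are stripped. A small sketch of that behavior (the sample text is invented):

```python
import re

text = "(1) Hello there.\n(2) 2. Goodbye."  # second paragraph kept a stray "2."
paragraph_count = 2

result_list = []
for i in range(1, paragraph_count + 1):
    pattern = rf"\({i}\)\s*(.*?)(?=\s*\({i + 1}\)|\Z)"
    match = re.search(pattern, text, re.DOTALL)
    if match:
        result_list.append(match.group(1).strip())

# Strip residual "(n)" or "n." prefixes, as the method above does
result_list = [re.sub(r"^(\(\d+\)|\d+\.)\s*", "", s) for s in result_list]
print(result_list)  # ['Hello there.', 'Goodbye.']
```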
def set_deployment_id(self, deployment_id):
@ -404,6 +474,66 @@ class ChatGPTAPI(Base):
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
def set_gpt4o_models(self):
# for issue #375 azure can not use model list
if self.deployment_id:
self.model_list = cycle(["gpt-4o"])
else:
my_model_list = [
i["id"] for i in self.openai_client.models.list().model_dump()["data"]
]
model_list = list(set(my_model_list) & set(GPT4o_MODEL_LIST))
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
def set_o1preview_models(self):
# for issue #375 azure can not use model list
if self.deployment_id:
self.model_list = cycle(["o1-preview"])
else:
my_model_list = [
i["id"] for i in self.openai_client.models.list().model_dump()["data"]
]
model_list = list(set(my_model_list) & set(O1PREVIEW_MODEL_LIST))
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
def set_o1_models(self):
# for issue #375 azure can not use model list
if self.deployment_id:
self.model_list = cycle(["o1"])
else:
my_model_list = [
i["id"] for i in self.openai_client.models.list().model_dump()["data"]
]
model_list = list(set(my_model_list) & set(O1_MODEL_LIST))
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
def set_o1mini_models(self):
# for issue #375 azure can not use model list
if self.deployment_id:
self.model_list = cycle(["o1-mini"])
else:
my_model_list = [
i["id"] for i in self.openai_client.models.list().model_dump()["data"]
]
model_list = list(set(my_model_list) & set(O1MINI_MODEL_LIST))
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
def set_o3mini_models(self):
# for issue #375 azure can not use model list
if self.deployment_id:
self.model_list = cycle(["o3-mini"])
else:
my_model_list = [
i["id"] for i in self.openai_client.models.list().model_dump()["data"]
]
model_list = list(set(my_model_list) & set(O3MINI_MODEL_LIST))
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
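All five setters above repeat the same shape: pin a single model when running against an Azure deployment (issue #375), otherwise intersect the account's available models with an allow-list and cycle through the result. One way the duplication could be factored out, shown as a sketch rather than the repository's actual code:

```python
from itertools import cycle

def _set_models(self, default_model, allowed_models):
    # For issue #375: Azure deployments cannot list models, so pin one default
    if self.deployment_id:
        self.model_list = cycle([default_model])
        return
    available = [
        i["id"] for i in self.openai_client.models.list().model_dump()["data"]
    ]
    model_list = list(set(available) & set(allowed_models))
    print(f"Using model list {model_list}")
    self.model_list = cycle(model_list)

# set_o3mini_models() would then reduce to:
#     self._set_models("o3-mini", O3MINI_MODEL_LIST)
```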
def set_model_list(self, model_list):
model_list = list(set(model_list))
print(f"Using model list {model_list}")


@ -13,38 +13,99 @@ class Claude(Base):
language,
api_base=None,
prompt_template=None,
prompt_sys_msg=None,
temperature=1.0,
context_flag=False,
context_paragraph_limit=5,
**kwargs,
) -> None:
super().__init__(key, language)
self.api_url = f"{api_base}" if api_base else "https://api.anthropic.com"
self.api_url = api_base or "https://api.anthropic.com"
self.client = Anthropic(base_url=api_base, api_key=key, timeout=20)
self.model = "claude-3-5-sonnet-20241022" # default it for now
self.language = language
self.prompt_template = (
prompt_template
or "\n\nHuman: Help me translate the text within triple backticks into {language} and provide only the translated result.\n```{text}```\n\nAssistant: "
or "Help me translate the text within triple backticks into {language} and provide only the translated result.\n```{text}```"
)
self.prompt_sys_msg = prompt_sys_msg or ""
self.temperature = temperature
self.context_flag = context_flag
self.context_list = []
self.context_translated_list = []
self.context_paragraph_limit = context_paragraph_limit
def rotate_key(self):
pass
def set_claude_model(self, model_name):
self.model = model_name
def create_messages(self, text, intermediate_messages=None):
"""Create messages for the current translation request"""
current_msg = {
"role": "user",
"content": self.prompt_template.format(
text=text,
language=self.language,
),
}
messages = []
if intermediate_messages:
messages.extend(intermediate_messages)
messages.append(current_msg)
return messages
def create_context_messages(self):
"""Create a message pair containing all context paragraphs"""
if not self.context_flag or not self.context_list:
return []
# Create a single message pair for all previous context
return [
{
"role": "user",
"content": self.prompt_template.format(
text="\n\n".join(self.context_list),
language=self.language,
),
},
{"role": "assistant", "content": "\n\n".join(self.context_translated_list)},
]
def save_context(self, text, t_text):
"""Save the current translation pair to context"""
if not self.context_flag:
return
self.context_list.append(text)
self.context_translated_list.append(t_text)
# Keep only the most recent paragraphs within the limit
if len(self.context_list) > self.context_paragraph_limit:
self.context_list.pop(0)
self.context_translated_list.pop(0)
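The context window behaves like a bounded FIFO: once the limit is reached, the oldest source/translation pair falls out, so the extra prompt tokens stay capped. A minimal sketch of that sliding behavior (paragraph texts invented):

```python
context_list, context_translated_list, limit = [], [], 3

for n in range(1, 6):
    context_list.append(f"source paragraph {n}")
    context_translated_list.append(f"translation {n}")
    if len(context_list) > limit:
        # Drop the oldest pair so both lists stay within the limit
        context_list.pop(0)
        context_translated_list.pop(0)

print(context_list)
# ['source paragraph 3', 'source paragraph 4', 'source paragraph 5']
```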
def translate(self, text):
print(text)
self.rotate_key()
prompt = self.prompt_template.format(
text=text,
language=self.language,
)
message = [{"role": "user", "content": prompt}]
# Create messages with context
messages = self.create_messages(text, self.create_context_messages())
r = self.client.messages.create(
max_tokens=4096,
messages=message,
model="claude-3-haiku-20240307", # default it for now
messages=messages,
system=self.prompt_sys_msg,
temperature=self.temperature,
model=self.model,
)
t_text = r.content[0].text
# api limit rate and spider rule
time.sleep(1)
if self.context_flag:
self.save_context(text, t_text)
print("[bold green]" + re.sub("\n{3,}", "\n\n", t_text) + "[/bold green]")
return t_text


@ -1,5 +1,7 @@
import re
import time
from os import environ
from itertools import cycle
import google.generativeai as genai
from google.generativeai.types.generation_types import (
@ -11,23 +13,38 @@ from rich import print
from .base_translator import Base
generation_config = {
"temperature": 0.7,
"temperature": 1.0,
"top_p": 1,
"top_k": 1,
"max_output_tokens": 2048,
"max_output_tokens": 8192,
}
safety_settings = [
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"threshold": "BLOCK_MEDIUM_AND_ABOVE",
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"threshold": "BLOCK_MEDIUM_AND_ABOVE",
},
safety_settings = {
"HATE": "BLOCK_NONE",
"HARASSMENT": "BLOCK_NONE",
"SEXUAL": "BLOCK_NONE",
"DANGEROUS": "BLOCK_NONE",
}
PROMPT_ENV_MAP = {
"user": "BBM_GEMINIAPI_USER_MSG_TEMPLATE",
"system": "BBM_GEMINIAPI_SYS_MSG",
}
GEMINIPRO_MODEL_LIST = [
"gemini-1.5-pro",
"gemini-1.5-pro-latest",
"gemini-1.5-pro-001",
"gemini-1.5-pro-002",
]
GEMINIFLASH_MODEL_LIST = [
"gemini-1.5-flash",
"gemini-1.5-flash-latest",
"gemini-1.5-flash-001",
"gemini-1.5-flash-002",
"gemini-2.0-flash-exp",
"gemini-2.5-flash-preview-04-17",
]
@ -38,20 +55,57 @@ class Gemini(Base):
DEFAULT_PROMPT = "Please help me to translate,`{text}` to {language}, please return only translated content not include the origin text"
def __init__(self, key, language, **kwargs) -> None:
genai.configure(api_key=key)
def __init__(
self,
key,
language,
prompt_template=None,
prompt_sys_msg=None,
context_flag=False,
temperature=1.0,
**kwargs,
) -> None:
super().__init__(key, language)
self.context_flag = context_flag
self.prompt = (
prompt_template
or environ.get(PROMPT_ENV_MAP["user"])
or self.DEFAULT_PROMPT
)
self.prompt_sys_msg = (
prompt_sys_msg
or environ.get(PROMPT_ENV_MAP["system"])
or None # Allow None, but not empty string
)
self.interval = 3
genai.configure(api_key=next(self.keys))
generation_config["temperature"] = temperature
def create_convo(self):
model = genai.GenerativeModel(
model_name="gemini-pro",
model_name=self.model,
generation_config=generation_config,
safety_settings=safety_settings,
system_instruction=self.prompt_sys_msg,
)
self.convo = model.start_chat()
# print(model) # Uncomment to debug and inspect the model details.
def rotate_model(self):
self.model = next(self.model_list)
self.create_convo()
print(f"Using model {self.model}")
def rotate_key(self):
pass
genai.configure(api_key=next(self.keys))
self.create_convo()
def translate(self, text):
delay = 1
exponential_base = 2
attempt_count = 0
max_attempts = 7
t_text = ""
print(text)
# Same numeric-prefix handling as the Caiyun translator (issue #279); the Gemini case is issue #374
@ -60,32 +114,91 @@ class Gemini(Base):
if len(text_list) > 1:
if text_list[0].isdigit():
num = text_list[0]
try:
self.convo.send_message(
self.DEFAULT_PROMPT.format(text=text, language=self.language)
)
print(text)
t_text = self.convo.last.text.strip()
except StopCandidateException as e:
match = re.search(r'content\s*{\s*parts\s*{\s*text:\s*"([^"]+)"', str(e))
if match:
t_text = match.group(1)
t_text = re.sub(r"\\n", "\n", t_text)
else:
t_text = "Can not translate"
except BlockedPromptException as e:
print(str(e))
t_text = "Can not translate by SAFETY reason.(因安全问题不能翻译)"
except Exception as e:
print(str(e))
t_text = "Can not translate by other reason.(因安全问题不能翻译)"
if len(self.convo.history) > 10:
self.convo.history = self.convo.history[2:]
while attempt_count < max_attempts:
try:
self.convo.send_message(
self.prompt.format(text=text, language=self.language)
)
t_text = self.convo.last.text.strip()
# If the response contains the step-3 tag, return only the content inside it
tag_pattern = (
r"<step3_refined_translation>(.*?)</step3_refined_translation>"
)
tag_match = re.search(tag_pattern, t_text, re.DOTALL)
if tag_match:
print(
"[bold green]"
+ re.sub("\n{3,}", "\n\n", t_text)
+ "[/bold green]"
)
t_text = tag_match.group(1).strip()
# print("[bold green]" + re.sub("\n{3,}", "\n\n", t_text) + "[/bold green]")
break
except StopCandidateException as e:
print(
f"Translation failed due to StopCandidateException: {e} Attempting to switch model..."
)
self.rotate_model()
except BlockedPromptException as e:
print(
f"Translation failed due to BlockedPromptException: {e} Attempting to switch model..."
)
self.rotate_model()
except Exception as e:
print(
f"Translation failed due to {type(e).__name__}: {e} Will sleep {delay} seconds"
)
time.sleep(delay)
delay *= exponential_base
self.rotate_key()
if attempt_count >= 1:
self.rotate_model()
attempt_count += 1
if attempt_count == max_attempts:
print(f"Translation failed after {max_attempts} attempts.")
return
if self.context_flag:
if len(self.convo.history) > 10:
self.convo.history = self.convo.history[2:]
else:
self.convo.history = []
print("[bold green]" + re.sub("\n{3,}", "\n\n", t_text) + "[/bold green]")
# for limit
time.sleep(0.5)
# for rate limit(RPM)
time.sleep(self.interval)
if num:
t_text = str(num) + "\n" + t_text
return t_text
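The retry loop above combines exponential backoff with key and model rotation. Stripped to its core, the backoff logic looks like this (the flaky call is an invented stand-in for `convo.send_message`, and the delays are shortened for the demo):

```python
import random
import time

def flaky_call():
    # Stand-in for self.convo.send_message(...); fails transiently at random
    if random.random() < 0.5:
        raise RuntimeError("transient failure")
    return "translated text"

delay, exponential_base, max_attempts = 0.1, 2, 7
result = None
for attempt_count in range(max_attempts):
    try:
        result = flaky_call()
        break
    except RuntimeError as e:
        print(f"attempt {attempt_count + 1} failed: {e}; sleeping {delay}s")
        time.sleep(delay)
        delay *= exponential_base
if result is None:
    print(f"Translation failed after {max_attempts} attempts.")
```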
def set_interval(self, interval):
self.interval = interval
def set_geminipro_models(self):
self.set_models(GEMINIPRO_MODEL_LIST)
def set_geminiflash_models(self):
self.set_models(GEMINIFLASH_MODEL_LIST)
def set_models(self, allowed_models):
available_models = [
re.sub(r"^models/", "", i.name) for i in genai.list_models()
]
model_list = sorted(
list(set(available_models) & set(allowed_models)),
key=allowed_models.index,
)
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
self.rotate_model()
def set_model_list(self, model_list):
# keep the order of input
model_list = sorted(list(set(model_list)), key=model_list.index)
print(f"Using model list {model_list}")
self.model_list = cycle(model_list)
self.rotate_model()
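Note that both setters sort the intersection by the allow-list's own index, so the preference order of GEMINIPRO_MODEL_LIST / GEMINIFLASH_MODEL_LIST is preserved no matter what order the API returns models in. A small sketch:

```python
from itertools import cycle

allowed = ["gemini-1.5-pro", "gemini-1.5-pro-latest", "gemini-1.5-pro-002"]
available = ["gemini-1.5-pro-002", "gemini-1.5-pro", "some-other-model"]

# Intersect, then re-sort to the allow-list's preference order
model_list = sorted(set(available) & set(allowed), key=allowed.index)
print(model_list)  # ['gemini-1.5-pro', 'gemini-1.5-pro-002']
model_cycle = cycle(model_list)  # rotate_model() walks this cycle
```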


@ -2,7 +2,7 @@ import re
import requests
from rich import print
from book_maker.utils import TO_LANGUAGE_CODE, LANGUAGES
from .base_translator import Base
@ -13,7 +13,14 @@ class Google(Base):
def __init__(self, key, language, **kwargs) -> None:
super().__init__(key, language)
self.api_url = "https://translate.google.com/translate_a/single?client=it&dt=qca&dt=t&dt=rmt&dt=bd&dt=rms&dt=sos&dt=md&dt=gt&dt=ld&dt=ss&dt=ex&otf=2&dj=1&hl=en&ie=UTF-8&oe=UTF-8&sl=auto&tl=zh-CN"
# Convert language name to code if needed, otherwise use as-is
if language.lower() in TO_LANGUAGE_CODE:
language_code = TO_LANGUAGE_CODE[language.lower()]
else:
language_code = language
self.api_url = f"https://translate.google.com/translate_a/single?client=it&dt=qca&dt=t&dt=rmt&dt=bd&dt=rms&dt=sos&dt=md&dt=gt&dt=ld&dt=ss&dt=ex&otf=2&dj=1&hl=en&ie=UTF-8&oe=UTF-8&sl=auto&tl={language_code}"
self.headers = {
"Content-Type": "application/x-www-form-urlencoded",
"User-Agent": "GoogleTranslate/6.29.59279 (iPhone; iOS 15.4; en; iPhone14,2)",

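The constructor now resolves a human-readable language name to a Google language code via TO_LANGUAGE_CODE, falling back to the raw value so that codes already in the right form pass through unchanged. A sketch of the lookup (the two mapping entries here are invented; the real table lives in book_maker.utils):

```python
# Invented excerpt; the real mapping is book_maker.utils.TO_LANGUAGE_CODE
TO_LANGUAGE_CODE = {"simplified chinese": "zh-CN", "japanese": "ja"}

def resolve_language_code(language: str) -> str:
    return TO_LANGUAGE_CODE.get(language.lower(), language)

print(resolve_language_code("Japanese"))  # ja
print(resolve_language_code("fr"))        # fr (already a code, passed through)
```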

@ -0,0 +1,20 @@
from openai import OpenAI
from .chatgptapi_translator import ChatGPTAPI
from os import linesep
from itertools import cycle
XAI_MODEL_LIST = [
"grok-beta",
]
class XAIClient(ChatGPTAPI):
def __init__(self, key, language, api_base=None, **kwargs) -> None:
super().__init__(key, language)
self.model_list = XAI_MODEL_LIST
self.api_url = str(api_base) if api_base else "https://api.x.ai/v1"
self.openai_client = OpenAI(api_key=key, base_url=self.api_url)
def rotate_model(self):
self.model = self.model_list[0]
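A hedged usage sketch for the new client (assumes the XAIClient class above is importable; the key value is a placeholder):

```python
client = XAIClient(key="xai-placeholder-key", language="Simplified Chinese")
client.rotate_model()    # pins self.model to "grok-beta"
print(client.api_url)    # https://api.x.ai/v1
```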


@ -2,7 +2,7 @@
## Models
`-m, --model <Model>` <br>
Currently `bbook_maker` supports these models: `chatgptapi` , `gpt3` , `google` , `caiyun` , `deepl` , `deeplfree` , `gpt4` , `gpt4omini` , `claude` , `customapi`.
Currently `bbook_maker` supports these models: `chatgptapi` , `gpt3` , `google` , `caiyun` , `deepl` , `deeplfree` , `gpt4` , `gpt4omini` , `o1-preview` , `o1` , `o1-mini` , `o3-mini` , `claude` , `customapi`.
Default model is `chatgptapi` .
### OPENAI models


@ -19,6 +19,32 @@ To tweak the prompt, use the `--prompt` parameter. Valid placeholders for the `u
You can also set the `user` and `system` role prompt by setting environment variables: `BBM_CHATGPTAPI_USER_MSG_TEMPLATE` and `BBM_CHATGPTAPI_SYS_MSG`.
- You can now use PromptDown format (`.md` files) for more structured prompts: `--prompt prompt_md.prompt.md`
# Translation Prompt
## System Message
You are a professional translator who specializes in accurate translations.
## Conversation
| Role | Content |
|-------|------------------------------------------|
| User | Please translate the following text into {language}:\n\n{text} |
# OR using Developer Message (for newer AI models)
# Translation Prompt
## Developer Message
You are a professional translator who specializes in accurate translations.
## Conversation
| Role | Content |
|-------|------------------------------------------|
| User | Please translate the following text into {language}:\n\n{text} |
## Examples
```sh
python3 make_book.py --book_name test_books/animal_farm.epub --prompt prompt_template_sample.txt
```

pdm.lock (generated)

@ -4,8 +4,11 @@
[metadata]
groups = ["default"]
strategy = ["cross_platform", "inherit_metadata"]
lock_version = "4.4.1"
content_hash = "sha256:7792e48118ca2396a823aeef510a3bbec973033b44e04c7d81368e0285275716"
lock_version = "4.5.0"
content_hash = "sha256:2fb0dba3fe80797eb75ebae386f9bac949e07192d24d6f60f8b5f0f89bc284bc"
[[metadata.targets]]
requires_python = ">=3.10"
[[package]]
name = "aiohttp"
@ -105,6 +108,9 @@ version = "0.6.0"
requires_python = ">=3.8"
summary = "Reusable constraint types to use with typing.Annotated"
groups = ["default"]
dependencies = [
"typing-extensions>=4.0.0; python_version < \"3.9\"",
]
files = [
{file = "annotated_types-0.6.0-py3-none-any.whl", hash = "sha256:0641064de18ba7a25dee8f96403ebc39113d0cb953a01429249d5c7564666a43"},
{file = "annotated_types-0.6.0.tar.gz", hash = "sha256:563339e807e53ffd9c267e99fc6d9ea23eb8443c08f112651963e24e22f84a5d"},
@ -112,23 +118,22 @@ files = [
[[package]]
name = "anthropic"
version = "0.26.1"
requires_python = ">=3.7"
version = "0.49.0"
requires_python = ">=3.8"
summary = "The official Python library for the anthropic API"
groups = ["default"]
dependencies = [
"anyio<5,>=3.5.0",
"distro<2,>=1.7.0",
"httpx<1,>=0.23.0",
"jiter<1,>=0.1.0",
"jiter<1,>=0.4.0",
"pydantic<3,>=1.9.0",
"sniffio",
"tokenizers>=0.13.0",
"typing-extensions<5,>=4.7",
"typing-extensions<5,>=4.10",
]
files = [
{file = "anthropic-0.26.1-py3-none-any.whl", hash = "sha256:2812b9b250b551ed8a1f0a7e6ae3f005654098994f45ebca5b5808bd154c9628"},
{file = "anthropic-0.26.1.tar.gz", hash = "sha256:26680ff781a6f678a30a1dccd0743631e602b23a47719439ffdef5335fa167d8"},
{file = "anthropic-0.49.0-py3-none-any.whl", hash = "sha256:bbc17ad4e7094988d2fa86b87753ded8dce12498f4b85fe5810f208f454a8375"},
{file = "anthropic-0.49.0.tar.gz", hash = "sha256:c09e885b0f674b9119b4f296d8508907f6cff0009bc20d5cf6b35936c40b4398"},
]
[[package]]
@ -155,6 +160,9 @@ requires_python = ">=3.7"
summary = "Timeout context manager for asyncio programs"
groups = ["default"]
marker = "python_version < \"3.11\""
dependencies = [
"typing-extensions>=3.6.5; python_version < \"3.8\"",
]
files = [
{file = "async-timeout-4.0.3.tar.gz", hash = "sha256:4640d96be84d82d02ed59ea2b7105a0f7b33abe8703703cd0ab0bf87c427522f"},
{file = "async_timeout-4.0.3-py3-none-any.whl", hash = "sha256:7405140ff1230c310e51dc27b3145b9092d659ce68ff733fb0cefe3ee42be028"},
@ -166,6 +174,9 @@ version = "23.2.0"
requires_python = ">=3.7"
summary = "Classes Without Boilerplate"
groups = ["default"]
dependencies = [
"importlib-metadata; python_version < \"3.8\"",
]
files = [
{file = "attrs-23.2.0-py3-none-any.whl", hash = "sha256:99b87a485a5820b23b879f04c2305b44b951b502fd64be915879d77a7e8fc6f1"},
{file = "attrs-23.2.0.tar.gz", hash = "sha256:935dc3b529c262f6cf76e50877d35a4bd3c1de194fd41f47a2b7ae8f19971f30"},
@ -465,6 +476,7 @@ summary = "Composable command line interface toolkit"
groups = ["default"]
dependencies = [
"colorama; platform_system == \"Windows\"",
"importlib-metadata; python_version < \"3.8\"",
]
files = [
{file = "click-8.1.7-py3-none-any.whl", hash = "sha256:ae74fb96c20a0277a1d615f1e4d73c8414f5a98db8b799a7931d1582f3390c28"},
@ -614,7 +626,7 @@ files = [
[[package]]
name = "google-ai-generativelanguage"
version = "0.6.4"
version = "0.6.15"
requires_python = ">=3.7"
summary = "Google Ai Generativelanguage API client library"
groups = ["default"]
@ -622,11 +634,12 @@ dependencies = [
"google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1",
"google-auth!=2.24.0,!=2.25.0,<3.0.0dev,>=2.14.1",
"proto-plus<2.0.0dev,>=1.22.3",
"protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<5.0.0dev,>=3.19.5",
"proto-plus<2.0.0dev,>=1.25.0; python_version >= \"3.13\"",
"protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<6.0.0dev,>=3.20.2",
]
files = [
{file = "google-ai-generativelanguage-0.6.4.tar.gz", hash = "sha256:1750848c12af96cb24ae1c3dd05e4bfe24867dc4577009ed03e1042d8421e874"},
{file = "google_ai_generativelanguage-0.6.4-py3-none-any.whl", hash = "sha256:730e471aa549797118fb1c88421ba1957741433ada575cf5dd08d3aebf903ab1"},
{file = "google_ai_generativelanguage-0.6.15-py3-none-any.whl", hash = "sha256:5a03ef86377aa184ffef3662ca28f19eeee158733e45d7947982eb953c6ebb6c"},
{file = "google_ai_generativelanguage-0.6.15.tar.gz", hash = "sha256:8f6d9dc4c12b065fe2d0289026171acea5183ebf2d0b11cefe12f3821e159ec3"},
]
[[package]]
@ -716,12 +729,12 @@ files = [
[[package]]
name = "google-generativeai"
version = "0.5.4"
version = "0.8.5"
requires_python = ">=3.9"
summary = "Google Generative AI High level API client library and tools."
groups = ["default"]
dependencies = [
"google-ai-generativelanguage==0.6.4",
"google-ai-generativelanguage==0.6.15",
"google-api-core",
"google-api-python-client",
"google-auth>=2.15.0",
@ -731,7 +744,7 @@ dependencies = [
"typing-extensions",
]
files = [
{file = "google_generativeai-0.5.4-py3-none-any.whl", hash = "sha256:036d63ee35e7c8aedceda4f81c390a5102808af09ff3a6e57e27ed0be0708f3c"},
{file = "google_generativeai-0.8.5-py3-none-any.whl", hash = "sha256:22b420817fb263f8ed520b33285f45976d5b21e904da32b80d4fd20c055123a2"},
]
[[package]]
@ -750,8 +763,8 @@ files = [
[[package]]
name = "groq"
version = "0.8.0"
requires_python = ">=3.7"
version = "0.22.0"
requires_python = ">=3.8"
summary = "The official Python library for the groq API"
groups = ["default"]
dependencies = [
@ -760,11 +773,11 @@ dependencies = [
"httpx<1,>=0.23.0",
"pydantic<3,>=1.9.0",
"sniffio",
"typing-extensions<5,>=4.7",
"typing-extensions<5,>=4.10",
]
files = [
{file = "groq-0.8.0-py3-none-any.whl", hash = "sha256:f5e4e892d45001241a930db451e633ca1f0007e3f749deaa5d7360062fcd61e3"},
{file = "groq-0.8.0.tar.gz", hash = "sha256:37ceb2f706bd516d0bfcac8e89048a24b375172987a0d6bd9efb521c54f6deff"},
{file = "groq-0.22.0-py3-none-any.whl", hash = "sha256:f53d3966dff713aaa635671c2d075ebb932b0d48e3c4031ede9b84a2a6694c79"},
{file = "groq-0.22.0.tar.gz", hash = "sha256:9d090fbe4a051655faff649890d18aaacb3121393ad9d55399171fe081f1057b"},
]
[[package]]
@ -835,6 +848,9 @@ version = "0.14.0"
requires_python = ">=3.7"
summary = "A pure-Python, bring-your-own-I/O implementation of HTTP/1.1"
groups = ["default"]
dependencies = [
"typing-extensions; python_version < \"3.8\"",
]
files = [
{file = "h11-0.14.0-py3-none-any.whl", hash = "sha256:e3fe4ac4b851c468cc8363d500db52c2ead036020723024a109d37346efaa761"},
{file = "h11-0.14.0.tar.gz", hash = "sha256:8f19fbbe99e72420ff35c00b27a34cb9937e902a8b810e2c88300c6f0a3b699d"},
@ -863,6 +879,7 @@ summary = "A comprehensive HTTP client library."
groups = ["default"]
dependencies = [
"pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2; python_version > \"3.0\"",
"pyparsing<3,>=2.4.2; python_version < \"3.0\"",
]
files = [
{file = "httplib2-0.22.0-py3-none-any.whl", hash = "sha256:14ae0a53c1ba8f3d37e9e27cf37eabb0fb9980f435ba405d546948b009dd64dc"},
@ -943,6 +960,7 @@ requires_python = ">=3.8"
summary = "Read metadata from Python packages"
groups = ["default"]
dependencies = [
"typing-extensions>=3.6.4; python_version < \"3.8\"",
"zipp>=0.5",
]
files = [
@ -1022,6 +1040,39 @@ files = [
{file = "jiter-0.4.0.tar.gz", hash = "sha256:68203e02e0419bc3eca717c580c2d8f615aeee1150e2a1fb68d6600a7e52a37c"},
]
[[package]]
name = "jsonschema"
version = "4.23.0"
requires_python = ">=3.8"
summary = "An implementation of JSON Schema validation for Python"
groups = ["default"]
dependencies = [
"attrs>=22.2.0",
"importlib-resources>=1.4.0; python_version < \"3.9\"",
"jsonschema-specifications>=2023.03.6",
"pkgutil-resolve-name>=1.3.10; python_version < \"3.9\"",
"referencing>=0.28.4",
"rpds-py>=0.7.1",
]
files = [
{file = "jsonschema-4.23.0-py3-none-any.whl", hash = "sha256:fbadb6f8b144a8f8cf9f0b89ba94501d143e50411a1278633f56a7acf7fd5566"},
{file = "jsonschema-4.23.0.tar.gz", hash = "sha256:d71497fef26351a33265337fa77ffeb82423f3ea21283cd9467bb03999266bc4"},
]
[[package]]
name = "jsonschema-specifications"
version = "2024.10.1"
requires_python = ">=3.9"
summary = "The JSON Schema meta-schemas and vocabularies, exposed as a Registry"
groups = ["default"]
dependencies = [
"referencing>=0.31.0",
]
files = [
{file = "jsonschema_specifications-2024.10.1-py3-none-any.whl", hash = "sha256:a09a0680616357d9a0ecf05c12ad234479f549239d0f5b55f3deea67475da9bf"},
{file = "jsonschema_specifications-2024.10.1.tar.gz", hash = "sha256:0f38b83639958ce1152d02a7f062902c41c8fd20d558b0c34344292d417ae272"},
]
[[package]]
name = "langdetect"
version = "1.0.9"
@ -1036,24 +1087,26 @@ files = [
[[package]]
name = "litellm"
version = "1.38.10"
version = "1.67.0.post1"
requires_python = "!=2.7.*,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,!=3.7.*,>=3.8"
summary = "Library to easily interface with LLM API providers"
groups = ["default"]
dependencies = [
"aiohttp",
"click",
"httpx>=0.23.0",
"importlib-metadata>=6.8.0",
"jinja2<4.0.0,>=3.1.2",
"openai>=1.27.0",
"jsonschema<5.0.0,>=4.22.0",
"openai>=1.68.2",
"pydantic<3.0.0,>=2.0.0",
"python-dotenv>=0.2.0",
"requests<3.0.0,>=2.31.0",
"tiktoken>=0.4.0",
"tiktoken>=0.7.0",
"tokenizers",
]
files = [
{file = "litellm-1.38.10-py3-none-any.whl", hash = "sha256:4d33465eacde566832b9d7aa7677476e61aa7ba4ec26631fb1c8411c87219ed1"},
{file = "litellm-1.38.10.tar.gz", hash = "sha256:1a0b3088fe4b072f367343a7d7d25e4c5f9990975d9ee7dbf21f3b25ff046bb0"},
{file = "litellm-1.67.0.post1-py3-none-any.whl", hash = "sha256:b7b3c6a6a032b059a45b326673d24318dc8b65b1016a93194c9ea7ee94b0e00d"},
{file = "litellm-1.67.0.post1.tar.gz", hash = "sha256:1adf69769ee5df93c834c093fad760f406feeb6c59e79638c3f448226887554d"},
]
[[package]]
@ -1322,22 +1375,23 @@ files = [
[[package]]
name = "openai"
version = "1.30.3"
requires_python = ">=3.7.1"
version = "1.75.0"
requires_python = ">=3.8"
summary = "The official Python library for the openai API"
groups = ["default"]
dependencies = [
"anyio<5,>=3.5.0",
"distro<2,>=1.7.0",
"httpx<1,>=0.23.0",
"jiter<1,>=0.4.0",
"pydantic<3,>=1.9.0",
"sniffio",
"tqdm>4",
"typing-extensions<5,>=4.7",
"typing-extensions<5,>=4.11",
]
files = [
{file = "openai-1.30.3-py3-none-any.whl", hash = "sha256:f88119c8a848998be533c71ab8aa832446fa72b7ddbc70917c3f5886dc132051"},
{file = "openai-1.30.3.tar.gz", hash = "sha256:8e1bcdca2b96fe3636ab522fa153d88efde1b702d12ec32f1c73e9553ff93f45"},
{file = "openai-1.75.0-py3-none-any.whl", hash = "sha256:fe6f932d2ded3b429ff67cc9ad118c71327db32eb9d32dd723de3acfca337125"},
{file = "openai-1.75.0.tar.gz", hash = "sha256:fb3ea907efbdb1bcfd0c44507ad9c961afd7dce3147292b54505ecfd17be8fd1"},
]
[[package]]
@ -1351,18 +1405,29 @@ files = [
{file = "packaging-24.0.tar.gz", hash = "sha256:eb82c5e3e56209074766e6885bb04b8c38a0c015d0a30036ebe7ece34c9989e9"},
]
[[package]]
name = "promptdown"
version = "0.9.0"
requires_python = ">=3.10"
summary = "A package for loading promptdown files, which are a special type of markdown file for defining structured LLM prompts"
groups = ["default"]
files = [
{file = "promptdown-0.9.0-py3-none-any.whl", hash = "sha256:9ebd7044517217d00f61966dfe8297ee06328d11a6ca52d1b48c96739e5fc01a"},
{file = "promptdown-0.9.0.tar.gz", hash = "sha256:5727cf275a62f0feb6754fec4182b4a240a7e2bd2615e2381498d75d386c4087"},
]
[[package]]
name = "proto-plus"
version = "1.23.0"
requires_python = ">=3.6"
summary = "Beautiful, Pythonic protocol buffers."
version = "1.26.1"
requires_python = ">=3.7"
summary = "Beautiful, Pythonic protocol buffers"
groups = ["default"]
dependencies = [
"protobuf<5.0.0dev,>=3.19.0",
"protobuf<7.0.0,>=3.19.0",
]
files = [
{file = "proto-plus-1.23.0.tar.gz", hash = "sha256:89075171ef11988b3fa157f5dbd8b9cf09d65fffee97e29ce403cd8defba19d2"},
{file = "proto_plus-1.23.0-py3-none-any.whl", hash = "sha256:a829c79e619e1cf632de091013a4173deed13a55f326ef84f05af6f50ff4c82c"},
{file = "proto_plus-1.26.1-py3-none-any.whl", hash = "sha256:13285478c2dcf2abb829db158e1047e2f1e8d63a077d94263c2b88b043c75a66"},
{file = "proto_plus-1.26.1.tar.gz", hash = "sha256:21a515a4c4c0088a773899e23c7bbade3d18f9c66c73edd4c7ee3816bc96a012"},
]
[[package]]
@ -1603,6 +1668,22 @@ files = [
{file = "PyYAML-6.0.1.tar.gz", hash = "sha256:bfdf460b1736c775f2ba9f6a92bca30bc2095067b8a9d77876d1fad6cc3b4a43"},
]
[[package]]
name = "referencing"
version = "0.36.2"
requires_python = ">=3.9"
summary = "JSON Referencing + Python"
groups = ["default"]
dependencies = [
"attrs>=22.2.0",
"rpds-py>=0.7.0",
"typing-extensions>=4.4.0; python_version < \"3.13\"",
]
files = [
{file = "referencing-0.36.2-py3-none-any.whl", hash = "sha256:e8699adbbf8b5c7de96d8ffa0eb5c158b3beafce084968e2ea8bb08c6794dcd0"},
{file = "referencing-0.36.2.tar.gz", hash = "sha256:df2e89862cd09deabbdba16944cc3f10feb6b3e6f18e902f7cc25609a34775aa"},
]
[[package]]
name = "regex"
version = "2024.4.28"
@ -1677,7 +1758,7 @@ files = [
[[package]]
name = "requests"
version = "2.32.2"
version = "2.32.3"
requires_python = ">=3.8"
summary = "Python HTTP for Humans."
groups = ["default"]
@ -1688,23 +1769,122 @@ dependencies = [
"urllib3<3,>=1.21.1",
]
files = [
{file = "requests-2.32.2-py3-none-any.whl", hash = "sha256:fc06670dd0ed212426dfeb94fc1b983d917c4f9847c863f313c9dfaaffb7c23c"},
{file = "requests-2.32.2.tar.gz", hash = "sha256:dd951ff5ecf3e3b3aa26b40703ba77495dab41da839ae72ef3c8e5d8e2433289"},
{file = "requests-2.32.3-py3-none-any.whl", hash = "sha256:70761cfe03c773ceb22aa2f671b4757976145175cdfca038c02654d061d6dcc6"},
{file = "requests-2.32.3.tar.gz", hash = "sha256:55365417734eb18255590a9ff9eb97e9e1da868d4ccd6402399eaf68af20a760"},
]
[[package]]
name = "rich"
version = "13.7.1"
requires_python = ">=3.7.0"
version = "14.0.0"
requires_python = ">=3.8.0"
summary = "Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal"
groups = ["default"]
dependencies = [
"markdown-it-py>=2.2.0",
"pygments<3.0.0,>=2.13.0",
"typing-extensions<5.0,>=4.0.0; python_version < \"3.11\"",
]
files = [
{file = "rich-13.7.1-py3-none-any.whl", hash = "sha256:4edbae314f59eb482f54e9e30bf00d33350aaa94f4bfcd4e9e3110e64d0d7222"},
{file = "rich-13.7.1.tar.gz", hash = "sha256:9be308cb1fe2f1f57d67ce99e95af38a1e2bc71ad9813b0e247cf7ffbcc3a432"},
{file = "rich-14.0.0-py3-none-any.whl", hash = "sha256:1c9491e1951aac09caffd42f448ee3d04e58923ffe14993f6e83068dc395d7e0"},
{file = "rich-14.0.0.tar.gz", hash = "sha256:82f1bc23a6a21ebca4ae0c45af9bdbc492ed20231dcb63f297d6d1021a9d5725"},
]
[[package]]
name = "rpds-py"
version = "0.24.0"
requires_python = ">=3.9"
summary = "Python bindings to Rust's persistent data structures (rpds)"
groups = ["default"]
files = [
{file = "rpds_py-0.24.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:006f4342fe729a368c6df36578d7a348c7c716be1da0a1a0f86e3021f8e98724"},
{file = "rpds_py-0.24.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:2d53747da70a4e4b17f559569d5f9506420966083a31c5fbd84e764461c4444b"},
{file = "rpds_py-0.24.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e8acd55bd5b071156bae57b555f5d33697998752673b9de554dd82f5b5352727"},
{file = "rpds_py-0.24.0-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7e80d375134ddb04231a53800503752093dbb65dad8dabacce2c84cccc78e964"},
{file = "rpds_py-0.24.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:60748789e028d2a46fc1c70750454f83c6bdd0d05db50f5ae83e2db500b34da5"},
{file = "rpds_py-0.24.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:6e1daf5bf6c2be39654beae83ee6b9a12347cb5aced9a29eecf12a2d25fff664"},
{file = "rpds_py-0.24.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1b221c2457d92a1fb3c97bee9095c874144d196f47c038462ae6e4a14436f7bc"},
{file = "rpds_py-0.24.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:66420986c9afff67ef0c5d1e4cdc2d0e5262f53ad11e4f90e5e22448df485bf0"},
{file = "rpds_py-0.24.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:43dba99f00f1d37b2a0265a259592d05fcc8e7c19d140fe51c6e6f16faabeb1f"},
{file = "rpds_py-0.24.0-cp310-cp310-musllinux_1_2_i686.whl", hash = "sha256:a88c0d17d039333a41d9bf4616bd062f0bd7aa0edeb6cafe00a2fc2a804e944f"},
{file = "rpds_py-0.24.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:cc31e13ce212e14a539d430428cd365e74f8b2d534f8bc22dd4c9c55b277b875"},
{file = "rpds_py-0.24.0-cp310-cp310-win32.whl", hash = "sha256:fc2c1e1b00f88317d9de6b2c2b39b012ebbfe35fe5e7bef980fd2a91f6100a07"},
{file = "rpds_py-0.24.0-cp310-cp310-win_amd64.whl", hash = "sha256:c0145295ca415668420ad142ee42189f78d27af806fcf1f32a18e51d47dd2052"},
{file = "rpds_py-0.24.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:2d3ee4615df36ab8eb16c2507b11e764dcc11fd350bbf4da16d09cda11fcedef"},
{file = "rpds_py-0.24.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:e13ae74a8a3a0c2f22f450f773e35f893484fcfacb00bb4344a7e0f4f48e1f97"},
{file = "rpds_py-0.24.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cf86f72d705fc2ef776bb7dd9e5fbba79d7e1f3e258bf9377f8204ad0fc1c51e"},
{file = "rpds_py-0.24.0-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c43583ea8517ed2e780a345dd9960896afc1327e8cf3ac8239c167530397440d"},
{file = "rpds_py-0.24.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:4cd031e63bc5f05bdcda120646a0d32f6d729486d0067f09d79c8db5368f4586"},
{file = "rpds_py-0.24.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:34d90ad8c045df9a4259c47d2e16a3f21fdb396665c94520dbfe8766e62187a4"},
{file = "rpds_py-0.24.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e838bf2bb0b91ee67bf2b889a1a841e5ecac06dd7a2b1ef4e6151e2ce155c7ae"},
{file = "rpds_py-0.24.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:04ecf5c1ff4d589987b4d9882872f80ba13da7d42427234fce8f22efb43133bc"},
{file = "rpds_py-0.24.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:630d3d8ea77eabd6cbcd2ea712e1c5cecb5b558d39547ac988351195db433f6c"},
{file = "rpds_py-0.24.0-cp311-cp311-musllinux_1_2_i686.whl", hash = "sha256:ebcb786b9ff30b994d5969213a8430cbb984cdd7ea9fd6df06663194bd3c450c"},
{file = "rpds_py-0.24.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:174e46569968ddbbeb8a806d9922f17cd2b524aa753b468f35b97ff9c19cb718"},
{file = "rpds_py-0.24.0-cp311-cp311-win32.whl", hash = "sha256:5ef877fa3bbfb40b388a5ae1cb00636a624690dcb9a29a65267054c9ea86d88a"},
{file = "rpds_py-0.24.0-cp311-cp311-win_amd64.whl", hash = "sha256:e274f62cbd274359eff63e5c7e7274c913e8e09620f6a57aae66744b3df046d6"},
{file = "rpds_py-0.24.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:d8551e733626afec514b5d15befabea0dd70a343a9f23322860c4f16a9430205"},
{file = "rpds_py-0.24.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0e374c0ce0ca82e5b67cd61fb964077d40ec177dd2c4eda67dba130de09085c7"},
{file = "rpds_py-0.24.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d69d003296df4840bd445a5d15fa5b6ff6ac40496f956a221c4d1f6f7b4bc4d9"},
{file = "rpds_py-0.24.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:8212ff58ac6dfde49946bea57474a386cca3f7706fc72c25b772b9ca4af6b79e"},
{file = "rpds_py-0.24.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:528927e63a70b4d5f3f5ccc1fa988a35456eb5d15f804d276709c33fc2f19bda"},
{file = "rpds_py-0.24.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a824d2c7a703ba6daaca848f9c3d5cb93af0505be505de70e7e66829affd676e"},
{file = "rpds_py-0.24.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:44d51febb7a114293ffd56c6cf4736cb31cd68c0fddd6aa303ed09ea5a48e029"},
{file = "rpds_py-0.24.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:3fab5f4a2c64a8fb64fc13b3d139848817a64d467dd6ed60dcdd6b479e7febc9"},
{file = "rpds_py-0.24.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:9be4f99bee42ac107870c61dfdb294d912bf81c3c6d45538aad7aecab468b6b7"},
{file = "rpds_py-0.24.0-cp312-cp312-musllinux_1_2_i686.whl", hash = "sha256:564c96b6076a98215af52f55efa90d8419cc2ef45d99e314fddefe816bc24f91"},
{file = "rpds_py-0.24.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:75a810b7664c17f24bf2ffd7f92416c00ec84b49bb68e6a0d93e542406336b56"},
{file = "rpds_py-0.24.0-cp312-cp312-win32.whl", hash = "sha256:f6016bd950be4dcd047b7475fdf55fb1e1f59fc7403f387be0e8123e4a576d30"},
{file = "rpds_py-0.24.0-cp312-cp312-win_amd64.whl", hash = "sha256:998c01b8e71cf051c28f5d6f1187abbdf5cf45fc0efce5da6c06447cba997034"},
{file = "rpds_py-0.24.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:3d2d8e4508e15fc05b31285c4b00ddf2e0eb94259c2dc896771966a163122a0c"},
{file = "rpds_py-0.24.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:0f00c16e089282ad68a3820fd0c831c35d3194b7cdc31d6e469511d9bffc535c"},
{file = "rpds_py-0.24.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:951cc481c0c395c4a08639a469d53b7d4afa252529a085418b82a6b43c45c240"},
{file = "rpds_py-0.24.0-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c9ca89938dff18828a328af41ffdf3902405a19f4131c88e22e776a8e228c5a8"},
{file = "rpds_py-0.24.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ed0ef550042a8dbcd657dfb284a8ee00f0ba269d3f2286b0493b15a5694f9fe8"},
{file = "rpds_py-0.24.0-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2b2356688e5d958c4d5cb964af865bea84db29971d3e563fb78e46e20fe1848b"},
{file = "rpds_py-0.24.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:78884d155fd15d9f64f5d6124b486f3d3f7fd7cd71a78e9670a0f6f6ca06fb2d"},
{file = "rpds_py-0.24.0-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:6a4a535013aeeef13c5532f802708cecae8d66c282babb5cd916379b72110cf7"},
{file = "rpds_py-0.24.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:84e0566f15cf4d769dade9b366b7b87c959be472c92dffb70462dd0844d7cbad"},
{file = "rpds_py-0.24.0-cp313-cp313-musllinux_1_2_i686.whl", hash = "sha256:823e74ab6fbaa028ec89615ff6acb409e90ff45580c45920d4dfdddb069f2120"},
{file = "rpds_py-0.24.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:c61a2cb0085c8783906b2f8b1f16a7e65777823c7f4d0a6aaffe26dc0d358dd9"},
{file = "rpds_py-0.24.0-cp313-cp313-win32.whl", hash = "sha256:60d9b630c8025b9458a9d114e3af579a2c54bd32df601c4581bd054e85258143"},
{file = "rpds_py-0.24.0-cp313-cp313-win_amd64.whl", hash = "sha256:6eea559077d29486c68218178ea946263b87f1c41ae7f996b1f30a983c476a5a"},
{file = "rpds_py-0.24.0-cp313-cp313t-macosx_10_12_x86_64.whl", hash = "sha256:d09dc82af2d3c17e7dd17120b202a79b578d79f2b5424bda209d9966efeed114"},
{file = "rpds_py-0.24.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:5fc13b44de6419d1e7a7e592a4885b323fbc2f46e1f22151e3a8ed3b8b920405"},
{file = "rpds_py-0.24.0-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c347a20d79cedc0a7bd51c4d4b7dbc613ca4e65a756b5c3e57ec84bd43505b47"},
{file = "rpds_py-0.24.0-cp313-cp313t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:20f2712bd1cc26a3cc16c5a1bfee9ed1abc33d4cdf1aabd297fe0eb724df4272"},
{file = "rpds_py-0.24.0-cp313-cp313t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:aad911555286884be1e427ef0dc0ba3929e6821cbeca2194b13dc415a462c7fd"},
{file = "rpds_py-0.24.0-cp313-cp313t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0aeb3329c1721c43c58cae274d7d2ca85c1690d89485d9c63a006cb79a85771a"},
{file = "rpds_py-0.24.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2a0f156e9509cee987283abd2296ec816225145a13ed0391df8f71bf1d789e2d"},
{file = "rpds_py-0.24.0-cp313-cp313t-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:aa6800adc8204ce898c8a424303969b7aa6a5e4ad2789c13f8648739830323b7"},
{file = "rpds_py-0.24.0-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:a18fc371e900a21d7392517c6f60fe859e802547309e94313cd8181ad9db004d"},
{file = "rpds_py-0.24.0-cp313-cp313t-musllinux_1_2_i686.whl", hash = "sha256:9168764133fd919f8dcca2ead66de0105f4ef5659cbb4fa044f7014bed9a1797"},
{file = "rpds_py-0.24.0-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:5f6e3cec44ba05ee5cbdebe92d052f69b63ae792e7d05f1020ac5e964394080c"},
{file = "rpds_py-0.24.0-cp313-cp313t-win32.whl", hash = "sha256:8ebc7e65ca4b111d928b669713865f021b7773350eeac4a31d3e70144297baba"},
{file = "rpds_py-0.24.0-cp313-cp313t-win_amd64.whl", hash = "sha256:675269d407a257b8c00a6b58205b72eec8231656506c56fd429d924ca00bb350"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl", hash = "sha256:619ca56a5468f933d940e1bf431c6f4e13bef8e688698b067ae68eb4f9b30e3a"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:4b28e5122829181de1898c2c97f81c0b3246d49f585f22743a1246420bb8d399"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e8e5ab32cf9eb3647450bc74eb201b27c185d3857276162c101c0f8c6374e098"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:208b3a70a98cf3710e97cabdc308a51cd4f28aa6e7bb11de3d56cd8b74bab98d"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:bbc4362e06f950c62cad3d4abf1191021b2ffaf0b31ac230fbf0526453eee75e"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:ebea2821cdb5f9fef44933617be76185b80150632736f3d76e54829ab4a3b4d1"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b9a4df06c35465ef4d81799999bba810c68d29972bf1c31db61bfdb81dd9d5bb"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:d3aa13bdf38630da298f2e0d77aca967b200b8cc1473ea05248f6c5e9c9bdb44"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-musllinux_1_2_aarch64.whl", hash = "sha256:041f00419e1da7a03c46042453598479f45be3d787eb837af382bfc169c0db33"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-musllinux_1_2_i686.whl", hash = "sha256:d8754d872a5dfc3c5bf9c0e059e8107451364a30d9fd50f1f1a85c4fb9481164"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-musllinux_1_2_x86_64.whl", hash = "sha256:896c41007931217a343eff197c34513c154267636c8056fb409eafd494c3dcdc"},
{file = "rpds_py-0.24.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:92558d37d872e808944c3c96d0423b8604879a3d1c86fdad508d7ed91ea547d5"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-macosx_10_12_x86_64.whl", hash = "sha256:f9e0057a509e096e47c87f753136c9b10d7a91842d8042c2ee6866899a717c0d"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:d6e109a454412ab82979c5b1b3aee0604eca4bbf9a02693bb9df027af2bfa91a"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fc1c892b1ec1f8cbd5da8de287577b455e388d9c328ad592eabbdcb6fc93bee5"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:9c39438c55983d48f4bb3487734d040e22dad200dab22c41e331cee145e7a50d"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9d7e8ce990ae17dda686f7e82fd41a055c668e13ddcf058e7fb5e9da20b57793"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:9ea7f4174d2e4194289cb0c4e172d83e79a6404297ff95f2875cf9ac9bced8ba"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bb2954155bb8f63bb19d56d80e5e5320b61d71084617ed89efedb861a684baea"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:04f2b712a2206e13800a8136b07aaedc23af3facab84918e7aa89e4be0260032"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-musllinux_1_2_aarch64.whl", hash = "sha256:eda5c1e2a715a4cbbca2d6d304988460942551e4e5e3b7457b50943cd741626d"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-musllinux_1_2_i686.whl", hash = "sha256:9abc80fe8c1f87218db116016de575a7998ab1629078c90840e8d11ab423ee25"},
{file = "rpds_py-0.24.0-pp311-pypy311_pp73-musllinux_1_2_x86_64.whl", hash = "sha256:6a727fd083009bc83eb83d6950f0c32b3c94c8b80a9b667c87f4bd1274ca30ba"},
{file = "rpds_py-0.24.0.tar.gz", hash = "sha256:772cc1b2cd963e7e17e6cc55fe0371fb9c704d63e44cacec7b9b7f523b78919e"},
]
[[package]]
@ -1767,8 +1947,8 @@ files = [
[[package]]
name = "tiktoken"
version = "0.7.0"
requires_python = ">=3.8"
version = "0.9.0"
requires_python = ">=3.9"
summary = "tiktoken is a fast BPE tokeniser for use with OpenAI's models"
groups = ["default"]
dependencies = [
@ -1776,35 +1956,31 @@ dependencies = [
"requests>=2.26.0",
]
files = [
{file = "tiktoken-0.7.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:485f3cc6aba7c6b6ce388ba634fbba656d9ee27f766216f45146beb4ac18b25f"},
{file = "tiktoken-0.7.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:e54be9a2cd2f6d6ffa3517b064983fb695c9a9d8aa7d574d1ef3c3f931a99225"},
{file = "tiktoken-0.7.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:79383a6e2c654c6040e5f8506f3750db9ddd71b550c724e673203b4f6b4b4590"},
{file = "tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:5d4511c52caacf3c4981d1ae2df85908bd31853f33d30b345c8b6830763f769c"},
{file = "tiktoken-0.7.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:13c94efacdd3de9aff824a788353aa5749c0faee1fbe3816df365ea450b82311"},
{file = "tiktoken-0.7.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:8e58c7eb29d2ab35a7a8929cbeea60216a4ccdf42efa8974d8e176d50c9a3df5"},
{file = "tiktoken-0.7.0-cp310-cp310-win_amd64.whl", hash = "sha256:21a20c3bd1dd3e55b91c1331bf25f4af522c525e771691adbc9a69336fa7f702"},
{file = "tiktoken-0.7.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:10c7674f81e6e350fcbed7c09a65bca9356eaab27fb2dac65a1e440f2bcfe30f"},
{file = "tiktoken-0.7.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:084cec29713bc9d4189a937f8a35dbdfa785bd1235a34c1124fe2323821ee93f"},
{file = "tiktoken-0.7.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:811229fde1652fedcca7c6dfe76724d0908775b353556d8a71ed74d866f73f7b"},
{file = "tiktoken-0.7.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:86b6e7dc2e7ad1b3757e8a24597415bafcfb454cebf9a33a01f2e6ba2e663992"},
{file = "tiktoken-0.7.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:1063c5748be36344c7e18c7913c53e2cca116764c2080177e57d62c7ad4576d1"},
{file = "tiktoken-0.7.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:20295d21419bfcca092644f7e2f2138ff947a6eb8cfc732c09cc7d76988d4a89"},
{file = "tiktoken-0.7.0-cp311-cp311-win_amd64.whl", hash = "sha256:959d993749b083acc57a317cbc643fb85c014d055b2119b739487288f4e5d1cb"},
{file = "tiktoken-0.7.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:71c55d066388c55a9c00f61d2c456a6086673ab7dec22dd739c23f77195b1908"},
{file = "tiktoken-0.7.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:09ed925bccaa8043e34c519fbb2f99110bd07c6fd67714793c21ac298e449410"},
{file = "tiktoken-0.7.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:03c6c40ff1db0f48a7b4d2dafeae73a5607aacb472fa11f125e7baf9dce73704"},
{file = "tiktoken-0.7.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d20b5c6af30e621b4aca094ee61777a44118f52d886dbe4f02b70dfe05c15350"},
{file = "tiktoken-0.7.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:d427614c3e074004efa2f2411e16c826f9df427d3c70a54725cae860f09e4bf4"},
{file = "tiktoken-0.7.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:8c46d7af7b8c6987fac9b9f61041b452afe92eb087d29c9ce54951280f899a97"},
{file = "tiktoken-0.7.0-cp312-cp312-win_amd64.whl", hash = "sha256:0bc603c30b9e371e7c4c7935aba02af5994a909fc3c0fe66e7004070858d3f8f"},
{file = "tiktoken-0.7.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:cabc6dc77460df44ec5b879e68692c63551ae4fae7460dd4ff17181df75f1db7"},
{file = "tiktoken-0.7.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:8d57f29171255f74c0aeacd0651e29aa47dff6f070cb9f35ebc14c82278f3b25"},
{file = "tiktoken-0.7.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2ee92776fdbb3efa02a83f968c19d4997a55c8e9ce7be821ceee04a1d1ee149c"},
{file = "tiktoken-0.7.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e215292e99cb41fbc96988ef62ea63bb0ce1e15f2c147a61acc319f8b4cbe5bf"},
{file = "tiktoken-0.7.0-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:8a81bac94769cab437dd3ab0b8a4bc4e0f9cf6835bcaa88de71f39af1791727a"},
{file = "tiktoken-0.7.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:d6d73ea93e91d5ca771256dfc9d1d29f5a554b83821a1dc0891987636e0ae226"},
{file = "tiktoken-0.7.0-cp39-cp39-win_amd64.whl", hash = "sha256:2bcb28ddf79ffa424f171dfeef9a4daff61a94c631ca6813f43967cb263b83b9"},
{file = "tiktoken-0.7.0.tar.gz", hash = "sha256:1077266e949c24e0291f6c350433c6f0971365ece2b173a23bc3b9f9defef6b6"},
{file = "tiktoken-0.9.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:586c16358138b96ea804c034b8acf3f5d3f0258bd2bc3b0227af4af5d622e382"},
{file = "tiktoken-0.9.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:d9c59ccc528c6c5dd51820b3474402f69d9a9e1d656226848ad68a8d5b2e5108"},
{file = "tiktoken-0.9.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f0968d5beeafbca2a72c595e8385a1a1f8af58feaebb02b227229b69ca5357fd"},
{file = "tiktoken-0.9.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:92a5fb085a6a3b7350b8fc838baf493317ca0e17bd95e8642f95fc69ecfed1de"},
{file = "tiktoken-0.9.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:15a2752dea63d93b0332fb0ddb05dd909371ededa145fe6a3242f46724fa7990"},
{file = "tiktoken-0.9.0-cp310-cp310-win_amd64.whl", hash = "sha256:26113fec3bd7a352e4b33dbaf1bd8948de2507e30bd95a44e2b1156647bc01b4"},
{file = "tiktoken-0.9.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:f32cc56168eac4851109e9b5d327637f15fd662aa30dd79f964b7c39fbadd26e"},
{file = "tiktoken-0.9.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:45556bc41241e5294063508caf901bf92ba52d8ef9222023f83d2483a3055348"},
{file = "tiktoken-0.9.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:03935988a91d6d3216e2ec7c645afbb3d870b37bcb67ada1943ec48678e7ee33"},
{file = "tiktoken-0.9.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8b3d80aad8d2c6b9238fc1a5524542087c52b860b10cbf952429ffb714bc1136"},
{file = "tiktoken-0.9.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:b2a21133be05dc116b1d0372af051cd2c6aa1d2188250c9b553f9fa49301b336"},
{file = "tiktoken-0.9.0-cp311-cp311-win_amd64.whl", hash = "sha256:11a20e67fdf58b0e2dea7b8654a288e481bb4fc0289d3ad21291f8d0849915fb"},
{file = "tiktoken-0.9.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:e88f121c1c22b726649ce67c089b90ddda8b9662545a8aeb03cfef15967ddd03"},
{file = "tiktoken-0.9.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:a6600660f2f72369acb13a57fb3e212434ed38b045fd8cc6cdd74947b4b5d210"},
{file = "tiktoken-0.9.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:95e811743b5dfa74f4b227927ed86cbc57cad4df859cb3b643be797914e41794"},
{file = "tiktoken-0.9.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:99376e1370d59bcf6935c933cb9ba64adc29033b7e73f5f7569f3aad86552b22"},
{file = "tiktoken-0.9.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:badb947c32739fb6ddde173e14885fb3de4d32ab9d8c591cbd013c22b4c31dd2"},
{file = "tiktoken-0.9.0-cp312-cp312-win_amd64.whl", hash = "sha256:5a62d7a25225bafed786a524c1b9f0910a1128f4232615bf3f8257a73aaa3b16"},
{file = "tiktoken-0.9.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:2b0e8e05a26eda1249e824156d537015480af7ae222ccb798e5234ae0285dbdb"},
{file = "tiktoken-0.9.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:27d457f096f87685195eea0165a1807fae87b97b2161fe8c9b1df5bd74ca6f63"},
{file = "tiktoken-0.9.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2cf8ded49cddf825390e36dd1ad35cd49589e8161fdcb52aa25f0583e90a3e01"},
{file = "tiktoken-0.9.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:cc156cb314119a8bb9748257a2eaebd5cc0753b6cb491d26694ed42fc7cb3139"},
{file = "tiktoken-0.9.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:cd69372e8c9dd761f0ab873112aba55a0e3e506332dd9f7522ca466e817b1b7a"},
{file = "tiktoken-0.9.0-cp313-cp313-win_amd64.whl", hash = "sha256:5ea0edb6f83dc56d794723286215918c1cde03712cbbafa0348b33448faf5b95"},
{file = "tiktoken-0.9.0.tar.gz", hash = "sha256:d02a5ca6a938e0490e1ff957bc48c8b078c88cb83977be1625b1fd8aac792c5d"},
]
[[package]]
@ -1897,7 +2073,7 @@ files = [
[[package]]
name = "tqdm"
version = "4.66.4"
version = "4.67.1"
requires_python = ">=3.7"
summary = "Fast, Extensible Progress Meter"
groups = ["default"]
@ -1905,8 +2081,8 @@ dependencies = [
"colorama; platform_system == \"Windows\"",
]
files = [
{file = "tqdm-4.66.4-py3-none-any.whl", hash = "sha256:b75ca56b413b030bc3f00af51fd2c1a1a5eac6a0c1cca83cbb37a5c52abce644"},
{file = "tqdm-4.66.4.tar.gz", hash = "sha256:e4d936c9de8727928f3be6079590e97d9abfe8d39a590be678eb5919ffc186bb"},
{file = "tqdm-4.67.1-py3-none-any.whl", hash = "sha256:26445eca388f82e72884e0d580d5464cd801a3ea01e63e5601bdff9ba6a48de2"},
{file = "tqdm-4.67.1.tar.gz", hash = "sha256:f8aef9c52c08c13a65f30ea34f4e5aac3fd1a34959879d7e59e63027286627f2"},
]
[[package]]
@ -1951,6 +2127,7 @@ groups = ["default"]
dependencies = [
"idna>=2.0",
"multidict>=4.0",
"typing-extensions>=3.7.4; python_version < \"3.8\"",
]
files = [
{file = "yarl-1.9.4-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:a8c1df72eb746f4136fe9a2e72b0c9dc1da1cbd23b5372f94b5820ff8ae30e0e"},

prompt_md.json (new file)

@ -0,0 +1,4 @@
{
"system": "You are a highly skilled translator responsible for translating the content of books in Markdown format from English into Chinese.",
"user": "## Strategies\nYou will follow a three-step translation process:\n### 1. Translate the input content from English into Chinese, respect the intention of the original text, keep the original Markdown format unchanged, and do not delete or omit any content, nor add additional explanations or remarks.\n### 2. Read the original text and the translation carefully, and then put forward constructive criticism and helpful suggestions to improve the translation. The final style and tone of the translation should conform to the Chinese language style.\nYou must strictly follow the rules below.\n- Never change the Markdown markup structure. Don't add or remove links. Do not change any URL.\n- Never touch or change the contents of code blocks even if they appear to have a bug.\n- Always preserve the original line breaks. Do not add or remove blank lines.\n- Never touch any permalink at the end of each heading.\n- Never touch HTML-like tags such as `<Notes>`.\nWhen writing suggestions, pay attention to whether there are ways to improve the translation in terms of:\n- Accuracy (by correcting errors such as additions, mistranslations, omissions or untranslated text).\n- Fluency (by applying the rules of Chinese grammar, spelling and punctuation, and ensuring there is no unnecessary repetition).\n- Conciseness and abbreviation (please appropriately simplify and abbreviate the translation result while keeping the original meaning unchanged to avoid the translation being too lengthy).\n### 3. Based on the results of steps 1 and 2, refine and polish the translation, and do not add additional explanations or remarks.\n## Output\nFor each step of the translation process, output the results within the appropriate XML tags:\n<step1_initial_translation>\n[Insert your initial translation here.]\n</step1_initial_translation>\n<step2_reflection>\n[Insert your reflection on the translation and put forward specific here, useful and constructive suggestions to improve the translation. Each suggestion should target a specific part of the translation.]\n</step2_reflection>\n<step3_refined_translation>\n[Insert your refined and polished translation here.]\n</step3_refined_translation>\n## Input\nThe following is the content of the book that needs to be translated within the <INPUT> tag:\n<INPUT>{text}</INPUT>"
}

prompt_md.prompt.md (new file)

@ -0,0 +1,11 @@
# Translation Prompt
## Developer Message
You are a professional translator who specializes in accurate, natural-sounding translations that preserve the original meaning, tone, and style of the text.
## Conversation
| Role | Content |
|-------|---------------------------------------------------------------------------|
| User | Please translate the following text into {language}:\n\n{text} |


@ -1,4 +1,4 @@
{
"system": "You are a professional translator.",
"user": "Translate the given text to {language}. Be faithful or accurate in translation. Make the translation readable or intelligible. Be elegant or natural in translation. If the text cannot be translated, return the original text as is. Do not translate person's name. Do not add any additional text in the translation. The text to be translated is:\n{text}"
"system": "You are a highly skilled academic translator. Please complete the translation task according to the following instructions and provide only the final polished translation.",
"user": "## Strategies\nYou will follow a three-step translation process:\n### Step.1 Initial Direct Translation: Translate the content from English to Chinese sentence by sentence, respecting the original intent without deleting, omitting, or adding any extra explanations or notes.\n ### Step.2 Reflection and Revision: Carefully review both the input content and the initial direct translation from Step 1. Check if the translation conveys the original meaning, if the grammatical structure is correct, if word choices are appropriate, and if there are any ambiguities or polysemous words. The final style and tone should conform to Chinese language conventions. \nYou must strictly follow the rules below.\n- Don't add or remove links. Do not change any URL.\n- Do not translate the reference list.\n- Never touch,change or translate the mathematical formulas.\n- Never touch,change or translate the contents of code blocks even if they appear to have a bug.\n- Always preserve the original line breaks. Do not add or remove blank lines.\nProvide constructive criticism and helpful suggestions to improve: \n- translation accuracy (correct additions, mistranslations, omissions, or untranslated text errors),\n- fluency (apply Chinese grammar, spelling, and punctuation rules, and ensure no unnecessary repetition), \n- conciseness (streamline the translation results while maintaining the original meaning, avoiding wordiness).\n ### Step.3 Polish and Optimize: Based on the results from Steps 1 and 2, refine and polish the translation, ensuring the final translation adheres to Chinese style without additional explanations or notes. The content to be translated is wrapped in the following <INPUT> tags:\n\n<INPUT>{text}</INPUT>. \n\nPlease write and output only the final polished translation here: "
}


@ -4,7 +4,7 @@ description = "The bilingual_book_maker is an AI translation tool that uses Chat
readme = "README.md"
license = {text = "MIT"}
dynamic = ["version"]
requires-python = ">=3.9"
requires-python = ">=3.10"
authors = [
{ name = "yihong0618", email = "zouzou0208@gmail.com" },
]
@ -28,10 +28,12 @@ dependencies = [
"tiktoken",
"tqdm",
"groq>=0.5.0",
"promptdown>=0.9.0",
]
[project.scripts]
bbook_maker = "book_maker.cli:main"
promptdown = "promptdown_cli:main"
[project.urls]
Homepage = "https://github.com/yihong0618/bilingual_book_maker"


@ -25,13 +25,13 @@ exceptiongroup==1.2.1; python_version < "3.11"
filelock==3.14.0
frozenlist==1.4.1
fsspec==2024.3.1
google-ai-generativelanguage==0.6.4
google-api-core==2.19.0
google-api-python-client==2.127.0
google-auth==2.29.0
google-ai-generativelanguage==0.6.10
google-api-core==2.21.0
google-api-python-client==2.149.0
google-auth==2.35.0
google-auth-httplib2==0.2.0
google-generativeai==0.5.4
googleapis-common-protos==1.63.0
google-generativeai==0.8.3
googleapis-common-protos==1.65.0
groq==0.8.0
grpcio==1.63.0
grpcio-status==1.62.2
@ -53,6 +53,7 @@ mdurl==0.1.2
multidict==6.0.5
openai==1.30.3
packaging==24.0
promptdown==0.9.0
proto-plus==1.23.0
protobuf==4.25.3
pyasn1==0.6.0