From 26fdfb0f0dd13dce59ba08619f0e07d132891a3e Mon Sep 17 00:00:00 2001 From: Hsieh Chin Fan Date: Thu, 9 Mar 2023 17:59:23 +0800 Subject: [PATCH] Improve README (#119) * Improve README - Improve English wording - Surround command, files , terms with "`" - In codeblock, make comment starts from capitalized char - command/parameter -> option - Add description about --translate-tags from PR#107 * Remove Traditional Chinese * Fix terms for Simplified Chinese --------- Co-authored-by: Hsieh Chin Fan --- README-CN.md | 53 ++++++++++++++++++++++++-------------- README.md | 73 +++++++++++++++++++++++++++++++++------------------- 2 files changed, 80 insertions(+), 46 deletions(-) diff --git a/README-CN.md b/README-CN.md index 177e86f..aa6be08 100644 --- a/README-CN.md +++ b/README-CN.md @@ -7,7 +7,7 @@ bilingual_book_maker 是一个 AI 翻译工具,使用 ChatGPT 帮助用户制 ## 准备 -1. ChatGPT or OpenAI token +1. ChatGPT or OpenAI token [^token] 2. epub books 3. 能正常联网的环境或 proxy 4. python3.8+ @@ -15,42 +15,54 @@ bilingual_book_maker 是一个 AI 翻译工具,使用 ChatGPT 帮助用户制 ## 使用 -1. pip install -r requirements.txt -2. OpenAI API key,如果有多个可以用英文逗号分隔(xxx,xxx,xxx),可以减少接口调用次数限制带来的错误 -3. 本地放了一个 animal_farm.epub 给大家测试 +1. `pip install -r requirements.txt` +2. 使用 `--openai_key` 指定 OpenAI API key,如果有多个可以用英文逗号分隔(xxx,xxx,xxx),可以减少接口调用次数限制带来的错误。 + 或者,指定环境变量 `OPENAI_API_KEY` 来略过这个选项。 +3. 本地放了一个 `test_books/animal_farm.epub` 给大家测试 4. 默认用了 [GPT-3.5-turbo](https://openai.com/blog/introducing-chatgpt-and-whisper-apis) 模型,也就是 ChatGPT 正在使用的模型,用 `--model gpt3` 来使用 gpt3 模型 -5. 加了 `--test` 命令如果大家没付费可以加上这个先看看效果(有 limit 稍微有些慢) -6. Set the target language like `--language "Simplified Chinese"`. - Suppot ` "Japanese" / "Traditional Chinese" / "German" / "French" / "Korean"`. - Default target language is `"Simplified Chinese"`. Support language list please see the LANGUAGES at [utils.py](./utils.py). -7. 加了 `--proxy` 参数,方便中国大陆的用户在本地测试时使用代理,传入类似 `http://127.0.0.1:7890` 的字符串 -8. 加入 `--resume` 命令,可以手动中断后,加入命令继续执行。 -9. 
如果你遇到了墙需要用 Cloudflare Workers 替换 api_base 请使用 `--api_base ${url}` 来替换。**请注意,此处你输入的api应该是"`https://xxxx/v1`"的字样,域名需要用引号包裹** -10. 翻译完会生成一本 ${book_name}_bilingual.epub 的双语书 -10. 如果出现了错误或 CTRL + C 中断,不想接下来继续翻译了,会生成一本 ${book_name}_bilingual_temp.epub 的书,直接改成你想要的名字就可以了 +5. 使用 `--test` 命令如果大家没付费可以加上这个先看看效果(有 limit 稍微有些慢) +6. 使用 `--language` 指定目标语言,例如: `--language "Simplified Chinese"`,预设值为 `"Simplified Chinese"`. + 请阅读 helper message 来查找可用的目标语言: `python make_book.py --help` +7. 使用 `--proxy` 参数,方便中国大陆的用户在本地测试时使用代理,传入类似 `http://127.0.0.1:7890` 的字符串 +8. 使用 `--resume` 命令,可以手动中断后,加入命令继续执行。 +9. epub 由 html 文件组成。默认情况下,我们只翻译 `
<p>
` 中的内容。 + 使用 `--translate-tags` 指定需要翻译的标签。使用逗号分隔多个标签。例如: + `--translate-tags h1,h2,h3,p,div` +10. 如果你遇到了墙需要用 Cloudflare Workers 替换 api_base 请使用 `--api_base ${url}` 来替换。 + **请注意,此处你输入的api应该是'`https://xxxx/v1`'的字样,域名需要用引号包裹** +11. 翻译完会生成一本 ${book_name}_bilingual.epub 的双语书 +12. 如果出现了错误或使用 `CTRL+C` 中断命令,不想接下来继续翻译了,会生成一本 ${book_name}_bilingual_temp.epub 的书,直接改成你想要的名字就可以了 e.g. ```shell # 如果你想快速测一下 python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --no_limit --test -# or do it # Chinese + +# 或翻译完整本书 python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --language zh-hans -# or 用 gpt3 模型 + +# 指定环境变量来略过 --openai_key export OPENAI_API_KEY=${your_api_key} + +# 或使用 gpt3 模型 python3 make_book.py --book_name test_books/animal_farm.epub --model gpt3 --no_limit --language ja + +# Translate contents in

<div> and <p>
+python3 make_book.py --book_name test_books/animal_farm.epub --translate-tags div,p ``` 更加小白的示例 ```shell python3 make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1' -# 有可能你不需要python3 而是python + +# 有可能你不需要 python3 而是python python make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1' ``` ## 注意 -1. 有 limit 如果想要速度可以付费 -2. PR welcome +1. Free trail 的 API token 有所限制,如果想要更快的速度,可以考虑付费方案 +2. 欢迎提交 PR 3. 尤其是 batch translate 做完效果会好很多 4. DeepL 模型稍后更新 @@ -63,10 +75,13 @@ python make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_b - 任何 issue PR 都欢迎 - Issue 中有些 TODO 没做的都可以选 -- 提交代码前请先 `black make_book.py` +- 提交代码前请先执行 `black make_book.py` [^black] ## 赞赏 谢谢就够了 ![image](https://user-images.githubusercontent.com/15976103/222407199-1ed8930c-13a8-402b-9993-aaac8ee84744.png) + +[^token]: https://platform.openai.com/account/api-keys +[^black]: https://github.com/psf/black diff --git a/README.md b/README.md index fb43ca1..dc24605 100644 --- a/README.md +++ b/README.md @@ -1,59 +1,75 @@ **[中文](./README-CN.md) | English** # bilingual_book_maker -The bilingual_book_maker is an AI translation tool that uses ChatGPT to assist users in creating multi-language versions of epub files and books. This tool is exclusively designed for translating epub books that have entered the public domain and is not intended for copyrighted works. Prior to usage, please review the project's **[disclaimer](./disclaimer.md)**. +The bilingual_book_maker is an AI translation tool that uses ChatGPT to assist users in creating multi-language versions of epub files and books. This tool is exclusively designed for translating epub books that have entered the public domain and is not intended for copyrighted works. Before using this tool, please review the project's **[disclaimer](./disclaimer.md)**. 
![image](https://user-images.githubusercontent.com/15976103/222317531-a05317c5-4eee-49de-95cd-04063d9539d9.png) ## Preparation -1. ChatGPT or OpenAI token -2. Prepared epub books +1. ChatGPT or OpenAI token [^token] +2. epub books 3. Environment with internet access or proxy 4. Python 3.8+ ## Use -1. pip install -r requirements.txt -2. OpenAI API key. If you have multiple keys, separate them by commas (xxx,xxx,xxx) to reduce errors caused by API call limits. -3. A sample book, test_books/animal_farm.epub, is provided for testing purposes. +1. `pip install -r requirements.txt` +2. Use `--openai_key` option to specify OpenAI API key. If you have multiple keys, separate them by commas (xxx,xxx,xxx) to reduce errors caused by API call limits. + Or, just set environment variable `OPENAI_API_KEY` to ignore this option. +3. A sample book, `test_books/animal_farm.epub`, is provided for testing purposes. 4. The default underlying model is [GPT-3.5-turbo](https://openai.com/blog/introducing-chatgpt-and-whisper-apis), which is used by ChatGPT currently. Use `--model gpt3` to change the underlying model to `GPT3` -5. Use --test command to preview the result if you haven't paid for the service. Note that there is a limit and it may take some time. -6. Set the target language like `--language "Simplified Chinese"`. - Support ` "Japanese" / "Traditional Chinese" / "German" / "French" / "Korean"`. - Default target language is `"Simplified Chinese"`. Support language list please see the LANGUAGES at [utils.py](./utils.py). -7. Use the --proxy parameter to enable users in mainland China to use a proxy when testing locally. Enter a string such as http://127.0.0.1:7890. -8. Use the --resume command to manually resume the process after an interruption. -9. If you want to change api_base like using Cloudflare Workers Use --api_base ${url} to support it. **Note: the api url you input should be `https://xxxx/v1', and quotation marks are required.** -10. 
Once the translation is complete, a bilingual book named ${book_name}_bilingual.epub will be generated. -11. If there are any errors or you wish to interrupt the translation using CTRL+C and do not want to continue further, a book named ${book_name}_bilingual_temp.epub will be generated. You can simply rename it to the desired name. +5. Use `--test` option to preview the result if you haven't paid for the service. Note that there is a limit and it may take some time. +6. Set the target language like `--language "Simplified Chinese"`. Default target language is `"Simplified Chinese"`. + Read available languages by helper message: `python make_book.py --help` +7. Use `--proxy` option to specify proxy server for internet access. Enter a string such as `http://127.0.0.1:7890`. +8. Use `--resume` option to manually resume the process after an interruption. +9. epub is made of html files. By default, we only translate contents in `
<p>
`. + Use `--translate-tags` to specify tags need for translation. Use comma to seperate multiple tags. For example: + `--translate-tags h1,h2,h3,p,div` +10. If you want to change api_base like using Cloudflare Workers, use `--api_base ` to support it. + **Note: the api url should be '`https://xxxx/v1`'. Quotation marks are required.** +11. Once the translation is complete, a bilingual book named `${book_name}_bilingual.epub` would be generated. +12. If there are any errors or you wish to interrupt the translation by pressing `CTRL+C`. A book named `${book_name}_bilingual_temp.epub` would be generated. You can simply rename it to any desired name. + +### Eamples -e.g. ```shell # Test quickly python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --no_limit --test --language zh-hans -# or do it + +# Or translate the whole book python3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key} --language zh-hans -# or use the GPT-3 model with Japanese + +# Set env OPENAI_API_KEY to ignore option --openai_key export OPENAI_API_KEY=${your_api_key} + +# Use the GPT-3 model with Japanese python3 make_book.py --book_name test_books/animal_farm.epub --model gpt3 --no_limit --language ja + +# Translate contents in

<div> and <p>
+python3 make_book.py --book_name test_books/animal_farm.epub --translate-tags div,p ``` + More understandable example ```shell python3 make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1' -# or + +# Or python3 is not in your PATH python make_book.py --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1' ``` ## Docker + You can use [Docker](https://www.docker.com/) if you don't want to deal with setting up the environment. + ```shell -# build image +# Build image docker build --tag bilingual_book_maker . -# run container -# "$folder_path" represents the folder where your book file is located. Also, it is where the processed file will be stored. +# Run container +# "$folder_path" represents the folder where your book file locates. Also, it is where the processed file will be stored. # Windows PowerShell $folder_path=your_folder_path # $folder_path="C:\Users\user\mybook\" @@ -72,7 +88,8 @@ export language=${your_language} docker run --rm --name bilingual_book_maker --mount type=bind,source=${folder_path},target='/app/test_books' bilingual_book_maker --book_name "/app/test_books/${book_name}" --openai_key ${openai_key} --no_limit --language "${language}" ``` -e.g. +For example: + ```shell # Linux docker run --rm --name bilingual_book_maker --mount type=bind,source=/home/user/my_books,target='/app/test_books' bilingual_book_maker --book_name /app/test_books/animal_farm.epub --openai_key sk-XXX --no_limit --test --test_num 1 --language zh-hant @@ -80,11 +97,10 @@ docker run --rm --name bilingual_book_maker --mount type=bind,source=/home/user/ ## Notes -1. here is a limit. If you want to speed up the process, consider paying for the service or use multiple OpenAI tokens -2. PR welcome +1. API token from free trial has limit. If you want to speed up the process, consider paying for the service or use multiple OpenAI tokens +2. PR is welcome 3. The DeepL model will be updated later. 
- # Thanks - @[yetone](https://github.com/yetone) @@ -93,10 +109,13 @@ docker run --rm --name bilingual_book_maker --mount type=bind,source=/home/user/ - Any issues or PRs are welcome. - TODOs in the issue can also be selected. -- Please run black make_book.py before submitting the code. +- Please run `black make_book.py`[^black] before submitting the code. ## Appreciation Thank you, that's enough. ![image](https://user-images.githubusercontent.com/15976103/222407199-1ed8930c-13a8-402b-9993-aaac8ee84744.png) + +[^token]: https://platform.openai.com/account/api-keys +[^black]: https://github.com/psf/black
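
The README text in this patch describes two comma-separated inputs: multiple OpenAI keys (via `--openai_key` or the `OPENAI_API_KEY` environment variable) and a tag list (via `--translate-tags`). The following is a minimal sketch of how such options could be resolved. It is not taken from `make_book.py`; the function name `parse_options`, the fallback order, and the default tag `p` are assumptions made for illustration only.

```python
import argparse
import os


def parse_options(argv, environ=None):
    """Hypothetical helper, not the project's code: resolve keys and tags."""
    if environ is None:
        environ = os.environ

    parser = argparse.ArgumentParser()
    # Option names follow the README; the defaults below are assumptions.
    parser.add_argument("--openai_key", default="")
    parser.add_argument("--translate-tags", default="p")
    args = parser.parse_args(argv)

    # Prefer the command-line option, then fall back to the environment variable.
    raw_keys = args.openai_key or environ.get("OPENAI_API_KEY", "")

    # Both inputs are comma-separated; drop surrounding spaces and empty items.
    keys = [k.strip() for k in raw_keys.split(",") if k.strip()]
    tags = [t.strip() for t in args.translate_tags.split(",") if t.strip()]
    return keys, tags


if __name__ == "__main__":
    # Demo values only; a real run would pass sys.argv[1:] and os.environ.
    demo_keys, demo_tags = parse_options(
        ["--translate-tags", "h1,h2,h3,p,div"],
        {"OPENAI_API_KEY": "sk-AAA,sk-BBB"},
    )
    print(demo_keys)
    print(demo_tags)
```

Passing the environment as a parameter keeps the sketch easy to exercise without touching real credentials.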