bilingual_book_maker/search/search_index.json

{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"bilingual book maker","text":"<p><code>bilingual_book_maker</code> is an AI translation tool that uses ChatGPT to assist users in creating multi-language versions of epub/txt files and books.</p> <p>This tool is exclusively designed for translating epub books that have entered the public domain and is not intended for copyrighted works. Before using this tool, please review the project's disclaimer.</p>"},{"location":"book_source/","title":"Translate from Different Sources","text":""},{"location":"book_source/#txtsrt","title":"txt/srt","text":"<p>txt and srt files are plain text, which this program can translate directly.</p> <pre><code>python3 make_book.py --book_name test_books/the_little_prince.txt --test --language zh-hans\n</code></pre>"},{"location":"book_source/#epub","title":"epub","text":"<p>An epub is made of HTML files. By default, only the contents of <code>&lt;p&gt;</code> tags are translated. Use <code>--translate-tags</code> to specify which tags to translate, separating multiple tags with commas. For example: <code>--translate-tags h1,h2,h3,p,div</code></p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --openai_key ${openai_key} --translate-tags div,p\n</code></pre> <p>If you want to translate strings in an e-book that aren't wrapped in any tag, you can use the <code>--allow_navigable_strings</code> parameter. This will add the strings to the translation queue. 
Where possible, prefer e-books with more standardized markup.</p>"},{"location":"book_source/#e-reader","title":"e-reader","text":"<p>Use the <code>--book_from</code> option to specify the e-reader type (currently only <code>kobo</code> is available), and use <code>--device_path</code> to specify the mount point.</p> <pre><code># Translate books downloaded from Rakuten Kobo on a Kobo e-reader\nbbook_maker --book_from kobo --device_path /tmp/kobo\n</code></pre>"},{"location":"cmd/","title":"Command Line Options","text":""},{"location":"cmd/#test-translate","title":"Test translate","text":"<p><code>--test</code> </p> <p>Use this option to preview the result if you haven't paid for the service or just want to test. Note that there is a limit and it may take some time.</p> <pre><code>bbook_maker --book_name test_books/Lex_Fridman_episode_322.srt --openai_key ${openai_key} --test\n</code></pre> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --openai_key ${openai_key} --test --language zh-hans\n</code></pre> <p><code>--test_num &lt;TEST_NUM&gt;</code></p> <p>Use this option to set how many paragraphs you want to translate for testing. The default is 10.</p>"},{"location":"cmd/#resume","title":"Resume","text":"<p><code>--resume</code> </p> <p>Use this option to manually resume the process after an interruption.</p>"},{"location":"cmd/#retranslate-epub-only","title":"Retranslate (epub only)","text":"<p><code>--retranslate &lt;translated_filepath, file_name_in_epub, start_str [, end_str]&gt;</code></p> <p>If part of an epub is not translated well, this option re-translates that part separately.</p> <p>This option takes four arguments: <code>translated_filepath</code>, <code>file_name_in_epub</code>, <code>start_str</code>, <code>end_str</code>. 
<code>end_str</code> is optional.</p> <ul> <li> <p>Retranslate from start_str to end_str's tag:</p> <pre><code>bbook_maker --book_name \"test_books/animal_farm.epub\" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which' 'This kind of thing is not a good symptom. Obviously'\n</code></pre> </li> <li> <p>Retranslate start_str's tag:</p> <pre><code>bbook_maker --book_name \"test_books/animal_farm.epub\" --retranslate 'test_books/animal_farm_bilingual.epub' 'index_split_002.html' 'in spite of the present book shortage which'\n</code></pre> </li> <li> <p>Retranslate start_str's tag, auto find filename:</p> <pre><code>bbook_maker --book_name \"test_books/animal_farm.epub\" --retranslate 'test_books/animal_farm_bilingual.epub' '' 'in spite of the present book shortage which'\n</code></pre> </li> </ul> <p>Warning:</p> <p>In the finished book, this option deletes everything from the tag containing start_str through the tag that follows the one containing end_str, and then re-translates that span.</p> <p>Therefore, please make sure that the tag following end_str is its translation. (If end_str is not provided, the tag following start_str must be its translation.) Translations may be missing between the two strings, but if end_str itself is not translated, there will be problems.</p>"},{"location":"cmd/#customize-output-style-epub-only","title":"Customize output style (epub only)","text":"<p><code>--translation_style &lt;TRANSLATION_STYLE&gt;</code></p> <p>Use this option to change the output style of epub files.</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --translation_style \"color: #4a4a4a; font-style: normal; background-color: #f7f7f7; padding: 5px; margin: 10px 0; border-radius: 5px;\"\n</code></pre> <p></p>"},{"location":"cmd/#proxy","title":"Proxy","text":"<p><code>--proxy &lt;PROXY&gt;</code> </p> <p>Use this option to specify a proxy server for internet access. 
Enter a string such as <code>http://127.0.0.1:7890</code>.</p>"},{"location":"cmd/#api-base","title":"API base","text":"<p><code>--api_base &lt;API_BASE_URL&gt;</code></p> <p>Use this option to change the API base URL, for example to route requests through Cloudflare Workers.</p> <pre><code>bbook_maker --book_name 'animal_farm.epub' --openai_key sk-XXXXX --api_base 'https://xxxxx/v1'\n</code></pre> <p>Note: the API URL should have the form '<code>https://xxxx/v1</code>'. The quotation marks are required.</p>"},{"location":"cmd/#microsoft-azure-endpoints","title":"Microsoft Azure Endpoints","text":"<p><code>--api_base &lt;API_BASE_URL&gt;</code> <code>--deployment_id &lt;DEPLOYMENT_ID&gt;</code></p> <p>You can use an API endpoint provided by Microsoft Azure.</p> <pre><code>bbook_maker --book_name 'animal_farm.epub' --openai_key XXXXX --api_base 'https://example-endpoint.openai.azure.com' --deployment_id 'deployment-name'\n</code></pre> <p>Note: <code>deployment_id</code> currently supports only the <code>chatgptapi</code> model, and <code>api_base</code> must be provided when using <code>deployment_id</code>. You can check here for more information about <code>deployment_id</code>.</p>"},{"location":"cmd/#batch-size-txt-only","title":"Batch size (txt only)","text":"<p><code>--batch_size</code></p> <p>Use this parameter to specify the number of lines for batch translation. The default is 10. (Currently only effective for txt files.)</p> <pre><code>python3 make_book.py --book_name test_books/the_little_prince.txt --test --batch_size 20\n</code></pre>"},{"location":"cmd/#accumulated-num","title":"Accumulated Num","text":"<p><code>--accumulated_num &lt;ACCUMULATED_NUM&gt;</code></p> <p>Sets how many tokens to accumulate before starting the translation. gpt3.5 limits the total_token to 4090. </p> <p>For example, if you use --accumulated_num 1600, OpenAI might output 2200 tokens, and the system and user messages might take another 200 tokens. 1600+2200+200=4000, so you are close to the limit. 
</p> <p>You have to choose your own value, there is no way to tell if the limit is reached before sending request.</p>"},{"location":"disclaimer/","title":"Disclaimer","text":"<ol> <li>The purpose of this project, bilingual_book_maker, is to assist users in creating multilingual versions of epub files and books. It is only applicable to books that have entered the public domain and is not intended for use with copyrighted material. We strongly advise users to read the copyright information carefully before using this project and to comply with relevant laws and regulations in order to protect their own and others' rights.</li> <li>In no event shall the authors or developers be liable for any loss or damage caused by the use of this project. Users assume all risks associated with the use of this project. Users must confirm that they have obtained permission from the original copyright holder or used open source EPUB files before using this project to avoid potential copyright risks.</li> </ol> <p>If you have any concerns or suggestions about the use of this project, please contact us through the issues section.</p> <p>\u514d\u8d23\u58f0\u660e\uff1a</p> <ol> <li>\u8be5\u9879\u76ee\u8bbe\u8ba1\u76ee\u7684\u662f\u4e3a\u4e86\u5e2e\u52a9\u7528\u6237\u5236\u4f5c\u591a\u8bed\u8a00\u7248\u672c\u7684epub\u6587\u4ef6\u548c\u56fe\u4e66\uff0c\u4ec5\u9002\u7528\u4e8e\u8fdb\u5165\u516c\u5171\u7248\u6743\u9886\u57df\u4e66\u7c4d\uff0c\u4e0d\u9002\u7528\u4e8e\u6709\u7248\u6743\u7684\u4e66\u7c4d\u3002\u6211\u4eec\u5f3a\u70c8\u5efa\u8bae\u7528\u6237\u5728\u4f7f\u7528\u8be5\u9879\u76ee\u65f6\u4ed4\u7ec6\u9605\u8bfb\u5176\u7248\u6743\u4fe1\u606f\u5e76\u9075\u5b88\u76f8\u5173\u6cd5\u5f8b\u548c\u89c4\u5b9a\uff0c\u4ee5\u4fdd\u62a4\u81ea\u5df1\u548c\u4ed6\u4eba\u7684\u6743\u76ca\u3002</li> 
<li>\u5728\u4efb\u4f55\u60c5\u51b5\u4e0b\uff0c\u4f5c\u8005\u548c\u5f00\u53d1\u8005\u4e0d\u5bf9\u56e0\u4f7f\u7528\u8be5\u9879\u76ee\u800c\u5bfc\u81f4\u7684\u4efb\u4f55\u635f\u5931\u6216\u635f\u5bb3\u627f\u62c5\u4efb\u4f55\u8d23\u4efb\u3002\u4f7f\u7528\u8be5\u9879\u76ee\u7684\u98ce\u9669\u7531\u7528\u6237\u81ea\u884c\u627f\u62c5\u3002\u7528\u6237\u5fc5\u987b\u5728\u4f7f\u7528\u8be5\u9879\u76ee\u4e4b\u524d\uff0c\u786e\u8ba4\u5176\u5df2\u83b7\u5f97\u4e86\u539f\u8457\u4f5c\u6743\u4eba\u7684\u8bb8\u53ef\u6216\u4f7f\u7528\u4e86\u516c\u5f00\u53ef\u7528\u7684\u5f00\u6e90EPUB\u6587\u4ef6\uff0c\u4ee5\u907f\u514d\u53ef\u80fd\u5b58\u5728\u7684\u7248\u6743\u98ce\u9669\u3002</li> </ol> <p>\u5982\u679c\u60a8\u5bf9\u8be5\u9879\u76ee\u7684\u4f7f\u7528\u6709\u4efb\u4f55\u7591\u8651\u6216\u5efa\u8bae\uff0c\u8bf7\u901a\u8fc7 issues \u4e0e\u6211\u4eec\u8054\u7cfb\u3002</p>"},{"location":"env_settings/","title":"Environment Settings","text":"<p>You can also set environment variables instead of passing some options on the command line.</p>"},{"location":"env_settings/#model-keys","title":"Model keys","text":"<pre><code># Set BBM_OPENAI_API_KEY to omit the --openai_key option\nexport BBM_OPENAI_API_KEY=${your_api_key}\n\n# Set BBM_CAIYUN_API_KEY to omit the --caiyun_key option\nexport BBM_CAIYUN_API_KEY=${your_api_key}\n</code></pre>"},{"location":"installation/","title":"Installation","text":""},{"location":"installation/#pip","title":"pip","text":"<p>bilingual_book_maker has been published as a Python package and can be installed with <code>pip</code>. 
(Using a virtual environment is recommended.)</p> <pre><code>pip install -U bbook_maker\n</code></pre>"},{"location":"installation/#git","title":"git","text":"<p>You can also install from GitHub if you want to use the latest version.</p> <pre><code>git clone git@github.com:yihong0618/bilingual_book_maker.git\ncd bilingual_book_maker\npip install .\n</code></pre>"},{"location":"model_lang/","title":"Model and Languages","text":""},{"location":"model_lang/#models","title":"Models","text":"<p><code>-m, --model &lt;Model&gt;</code> </p> <p>Currently <code>bbook_maker</code> supports these models: <code>chatgptapi</code>, <code>gpt3</code>, <code>google</code>, <code>caiyun</code>, <code>deepl</code>, <code>deeplfree</code>, <code>gpt4</code>, <code>gpt4omini</code>, <code>o1-preview</code>, <code>o1</code>, <code>o1-mini</code>, <code>o3-mini</code>, <code>claude</code>, <code>customapi</code>. The default model is <code>chatgptapi</code>. </p>"},{"location":"model_lang/#openai-models","title":"OPENAI models","text":"<p>There are three models you can choose from.</p> <ul> <li> <p>gpt3</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model gpt3 --openai_key ${openai_key}\n</code></pre> </li> <li> <p>chatgptapi</p> <p><code>chatgptapi</code> is GPT-3.5-turbo, which is used by ChatGPT currently.</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model chatgptapi --openai_key ${openai_key}\n</code></pre> </li> <li> <p>gpt4</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model gpt4 --openai_key ${openai_key}\n</code></pre> <p>If using <code>gpt4</code>, you can add <code>--use_context</code> to add a context paragraph to each passage sent to the model for translation.</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model gpt4 --openai_key ${openai_key} --use_context\n</code></pre> <p>The option <code>--use_context</code> prompts the GPT4 model to create a one-paragraph summary. 
</p> <p>At the beginning of the translation, it summarizes the entire passage sent (whose size depends on <code>--accumulated_num</code>).</p> <p>If there is a preceding passage, it amends the summary to include details from the most recent passage, creating a running one-paragraph context payload of the important details of the entire translated work, which improves consistency of flow and tone of each translation.</p> </li> </ul> <p>Note 1: Use the <code>--openai_key</code> option to specify your OpenAI API key. If you have multiple keys, separate them with commas (xxx, xxx, xxx) to reduce errors caused by API call limits.</p> <p>Note 2: You can set the environment variable <code>BBM_OPENAI_API_KEY</code> instead of passing <code>--openai_key</code>. See Environment Settings.</p>"},{"location":"model_lang/#caiyun","title":"CAIYUN","text":"<p>Use the Caiyun model to translate. The API currently supports only: </p> <ol> <li>Simplified Chinese &lt;-&gt; English</li> <li>Simplified Chinese &lt;-&gt; Japanese</li> </ol> <p>Caiyun has officially provided a test token (3975l6lr5pcbvidl6jl2). You can apply for your own token by following this tutorial (https://bobtranslate.com/service/translate/caiyun.html).</p> <pre><code>bbook_maker --model caiyun --caiyun_key 3975l6lr5pcbvidl6jl2 --book_name test_books/animal_farm.epub\n</code></pre>"},{"location":"model_lang/#deepl","title":"DEEPL","text":"<p>There are two models you can choose from.</p> <ul> <li> <p>deepl: DeepL Translator. </p> <p>A paid token is required. Use <code>--model deepl --deepl_key ${deepl_key}</code></p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model deepl --deepl_key ${deepl_key}\n</code></pre> </li> <li> <p>deeplfree: DeepL free model</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model deeplfree\n</code></pre> </li> </ul>"},{"location":"model_lang/#claude","title":"Claude","text":"<p>The Claude model is supported. 
Use <code>--model claude --claude_key ${claude_key}</code>.</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model claude --claude_key ${claude_key}\n</code></pre>"},{"location":"model_lang/#custom-api","title":"Custom API","text":"<p>A custom API model is supported. Use <code>--model customapi --custom_api ${custom_api}</code>.</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model customapi --custom_api ${custom_api}\n</code></pre>"},{"location":"model_lang/#google","title":"Google","text":"<p>The Google model is supported. Use <code>--model google</code>.</p>"},{"location":"model_lang/#languages","title":"Languages","text":"<p><code>--language &lt;LANGUAGE&gt;</code> </p> <p>Sets the target language. All models except <code>caiyun</code> support many languages. Use <code>bbook_maker --help</code> to list the available languages. The default target language is <code>\"Simplified Chinese\"</code>.</p> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model chatgptapi --openai_key ${openai_key} --language ja\n</code></pre> <pre><code>bbook_maker --book_name test_books/animal_farm.epub --model chatgptapi --openai_key ${openai_key} --language \"Simplified Chinese\"\n</code></pre>"},{"location":"prompt/","title":"Tweak the prompt","text":"<p>To tweak the prompt, use the <code>--prompt</code> parameter. Valid placeholders for the <code>user</code> role template include <code>{text}</code> and <code>{language}</code>. There are a few ways to configure the prompt:</p> <ul> <li> <p>If you don't need to set the <code>system</code> role content, you can simply set it up like this: <code>--prompt \"Translate {text} to {language}.\"</code> or <code>--prompt prompt_template_sample.txt</code></p> <pre><code># prompt_template_sample.txt\nTranslate the given text to {language}. Be faithful or accurate in translation. Make the translation readable or intelligible. Be elegant or natural in translation. 
If the text cannot be translated, return the original text as is. Do not translate person's name. Do not add any additional text in the translation. The text to be translated is: \n{text}\n</code></pre> </li> <li> <p>If you need to set the <code>system</code> role content, you can use the following format: <code>--prompt '{\"user\":\"Translate {text} to {language}\", \"system\": \"You are a professional translator.\"}'</code> or <code>--prompt prompt_template_sample.json</code></p> <pre><code># prompt_template_sample.json\n{\n \"system\": \"You are a professional translator.\", \n \"user\": \"Translate the given text to {language}. Be faithful or accurate in translation. Make the translation readable or intelligible. Be elegant or natural in translation. If the text cannot be translated, return the original text as is. Do not translate person's name. Do not add any additional text in the translation. The text to be translated is:\\n{text}\"\n}\n</code></pre> </li> </ul> <p>You can also set the <code>user</code> and <code>system</code> role prompt by setting environment variables: <code>BBM_CHATGPTAPI_USER_MSG_TEMPLATE</code> and <code>BBM_CHATGPTAPI_SYS_MSG</code>.</p> <ul> <li>You can now use PromptDown format (<code>.md</code> files) for more structured prompts: <code>--prompt prompt_md.prompt.md</code><pre><code># Translation Prompt\n\n## System Message\nYou are a professional translator who specializes in accurate translations.\n\n## Conversation\n\n| Role | Content |\n|-------|------------------------------------------|\n| User | Please translate the following text into {language}:\\n\\n{text} |\n\n# OR using Developer Message (for newer AI models)\n\n# Translation Prompt\n\n## Developer Message\nYou are a professional translator who specializes in accurate translations.\n\n## Conversation\n\n| Role | Content |\n|-------|------------------------------------------|\n| User | Please translate the following text into {language}:\\n\\n{text} |\n</code></pre> </li> 
</ul>"},{"location":"prompt/#examples","title":"Examples","text":"<pre><code>python3 make_book.py --book_name test_books/animal_farm.epub --prompt prompt_template_sample.txt\n# or\npython3 make_book.py --book_name test_books/animal_farm.epub --prompt prompt_template_sample.json\n# or\npython3 make_book.py --book_name test_books/animal_farm.epub --prompt \"Please translate \\`{text}\\` to {language}\"\n</code></pre>"},{"location":"quickstart/","title":"QuickStart","text":"<p>After successfully installing the package, <code>bbook-maker</code> appears in the output of <code>pip list</code>.</p>"},{"location":"quickstart/#preparation","title":"Preparation","text":"<ol> <li>ChatGPT or OpenAI token</li> <li>epub/txt books</li> <li>Environment with internet access or proxy</li> <li>Python 3.8+</li> </ol>"},{"location":"quickstart/#use","title":"Use","text":"<p>You can run the tool with the <code>bbook_maker</code> command. A sample book, <code>test_books/animal_farm.epub</code>, is provided for testing purposes.</p> <pre><code>bbook_maker --book_name ${path of a book} --openai_key ${openai_key}\n\n# Example\nbbook_maker --book_name test_books/animal_farm.epub --openai_key ${openai_key}\n</code></pre> <p>Or, you can use the script provided in the repository.</p> <pre><code>python3 make_book.py --book_name ${path of a book} --openai_key ${openai_key}\n\n# Example\npython3 make_book.py --book_name test_books/animal_farm.epub --openai_key ${openai_key}\n</code></pre> <p>Once the translation is complete, a bilingual book named <code>${book_name}_bilingual.epub</code> is generated.</p> <p>Note: If an error occurs or you interrupt the translation by pressing <code>CTRL+C</code>, a book named <code>${book_name}_bilingual_temp.epub</code> is generated. You can simply rename it to any desired name.</p>"}]}