Merge branch 'Fosowl:main' into main

steveh8758 2025-04-01 18:16:17 +08:00 committed by GitHub
commit 4cf1beb49f
57 changed files with 1073 additions and 357 deletions


@ -1,3 +1,3 @@
SEARXNG_BASE_URL="http://127.0.0.1:8080"
OPENAI_API_KEY='dont share this, not needed for local providers'
TOKENIZERS_PARALLELISM=False
OPENAI_API_KEY='xxxxx'
DEEPSEEK_API_KEY='xxxxx'

.gitignore

@ -1,4 +1,5 @@
*.wav
*.DS_Store
*.safetensors
config.ini
*.egg-info

README.md

@ -10,24 +10,16 @@
![alt text](./media/whale_readme.jpg)
> *Do a web search to find tech startups in Japan working on cutting-edge AI research*
> *Do a deep search of AI startups in Osaka and Tokyo, find at least 5, then save in the research_japan.txt file*
> *Can you make a Tetris game in C?*
> *Can you find where contract.pdf is?*
> *I would like to set up a new project file index as mark2.*
### Browse the web
### agenticSeek can now plan tasks!
![alt text](./media/exemples/search_startup.png)
### Code hands-free
![alt text](./media/exemples/matmul_golang.png)
### Plan and execute with agents (Experimental)
![alt text](./media/exemples/plan_weather_app.png)
![alt text](./media/exemples/demo_image.png)
*See media/exemples for other use case screenshots.*
@ -139,6 +131,8 @@ python3 main.py
*See the **Run with an API** section if your hardware can't run deepseek locally*
*See the **Config** section for a detailed explanation of the config file.*
---
## Usage
@ -157,6 +151,8 @@ You will be prompted with `>>> `
This indicates agenticSeek is awaiting your instructions.
You can also use speech to text by setting `listen = True` in the config.
To exit, simply say `goodbye`.
Here are some usage examples:
### Coding/Bash
@ -212,8 +208,6 @@ If you have a powerful computer or a server that you can use, but you want to us
### 1. **Set up and start the server scripts**
You need to have Ollama installed on the server (we will integrate vLLM and llama.cpp soon).
On your "server" that will run the AI model, get the IP address:
```sh
@ -222,18 +216,34 @@ ip a | grep "inet " | grep -v 127.0.0.1 | awk '{print $2}' | cut -d/ -f1
Note: For Windows or macOS, use ipconfig or ifconfig respectively to find the IP address.
Clone the repository, then run the script `stream_llm.py` in `server/`.
**If you wish to use openai based provider follow the *Run with an API* section.**
Clone the repository and enter the `server/` folder.
```sh
python3 server_ollama.py --model "deepseek-r1:32b"
git clone --depth 1 https://github.com/Fosowl/agenticSeek.git
cd agenticSeek/server/
```
Install server specific requirements:
```sh
pip3 install -r requirements.txt
```
Run the server script.
```sh
python3 app.py --provider ollama --port 3333
```
You have the choice between using `ollama` and `llamacpp` as an LLM service.
### 2. **Run it**
Now on your personal computer:
Clone the repository.
Change the `config.ini` file to set the `provider_name` to `server` and `provider_model` to `deepseek-r1:14b`.
Set the `provider_server_address` to the IP address of the machine that will run the model.
@ -242,7 +252,7 @@ Set the `provider_server_address` to the ip address of the machine that will run
is_local = False
provider_name = server
provider_model = deepseek-r1:14b
provider_server_address = x.x.x.x:5000
provider_server_address = x.x.x.x:3333
```
Run the assistant:
@ -254,18 +264,22 @@ python3 main.py
## **Run with an API**
Clone the repository.
Set the desired provider in the `config.ini`
```sh
[MAIN]
is_local = False
provider_name = openai
provider_model = gpt4-o
provider_server_address = 127.0.0.1:5000 # can be set to anything, not used
provider_model = gpt-4o
provider_server_address = 127.0.0.1:5000
```
WARNING: Make sure there are no trailing spaces in the config.
Set `is_local` to True if using a local OpenAI-based API.
Change the IP address if your OpenAI-based API runs on your own server.
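For reference, here is a minimal sketch of what talking to a local OpenAI-compatible endpoint looks like from Python. It mirrors what the `Provider` class in `sources/llm_provider.py` does internally; the base URL path and the key value are assumptions that depend on your server:

```python
from openai import OpenAI

# Assumed local endpoint; most local OpenAI-compatible servers ignore the key.
client = OpenAI(api_key="not-needed", base_url="http://127.0.0.1:5000/v1")
response = client.chat.completions.create(
    model="gpt-4o",  # whatever model name your local server exposes
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```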
Run the assistant:
```sh
@ -275,8 +289,6 @@ python3 main.py
---
## Speech to Text
The speech-to-text functionality is disabled by default. To enable it, set the listen option to True in the config.ini file:
@ -302,18 +314,55 @@ End your request with a confirmation phrase to signal the system to proceed. Exa
"do it", "go ahead", "execute", "run", "start", "thanks", "would ya", "please", "okay?", "proceed", "continue", "go on", "do that", "go it", "do you understand?"
```
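For illustration only, a minimal sketch of how a transcript might be checked against the trigger word and a confirmation phrase (assumed logic; the real implementation lives in `sources/speech_to_text.py` and is not shown in this diff):

```python
# A few of the confirmation phrases listed above.
CONFIRMATION_PHRASES = ("do it", "go ahead", "execute", "run", "proceed", "continue")

def should_execute(transcript: str, trigger_word: str = "friday") -> bool:
    """True when the transcript names the agent and ends with a confirmation phrase."""
    text = transcript.lower().strip().rstrip(".!?")
    return trigger_word in text and text.endswith(CONFIRMATION_PHRASES)
```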
## Config
Example config:
```
[MAIN]
is_local = True
provider_name = ollama
provider_model = deepseek-r1:1.5b
provider_server_address = 127.0.0.1:11434
agent_name = Friday
recover_last_session = False
save_session = False
speak = False
listen = False
work_dir = /Users/mlg/Documents/ai_folder
jarvis_personality = False
[BROWSER]
headless_browser = False
stealth_mode = False
```
**Explanation**:
- is_local -> Runs the agent locally (True) or on a remote server (False).
- provider_name -> The provider to use (one of: `ollama`, `server`, `lm-studio`, `deepseek-api`)
- provider_model -> The model used, e.g., deepseek-r1:1.5b.
- provider_server_address -> Server address, e.g., 127.0.0.1:11434 for local. Set to anything for non-local API.
- agent_name -> Name of the agent, e.g., Friday. Used as the trigger word for STT.
- recover_last_session -> Restarts from last session (True) or not (False).
- save_session -> Saves session data (True) or not (False).
- speak -> Enables voice output (True) or not (False).
- listen -> Listens for voice input (True) or not (False).
- work_dir -> Folder the AI will have access to, e.g., /Users/user/Documents/.
- jarvis_personality -> Uses a JARVIS-like personality (True) or not (False). This simply changes the prompt file.
- headless_browser -> Runs the browser without a visible window (True) or not (False).
- stealth_mode -> Makes bot detection harder. The only downside is that you have to manually install the anticaptcha extension.
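For reference, a minimal sketch of how these settings are consumed, mirroring the `configparser` calls visible in the `main.py` diff further down:

```python
import configparser

config = configparser.ConfigParser()
config.read('config.ini')

is_local = config.getboolean('MAIN', 'is_local')             # boolean flags
provider_name = config["MAIN"]["provider_name"]              # plain strings
stealth_mode = config.getboolean('BROWSER', 'stealth_mode')  # [BROWSER] section
```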
## Providers
The table below shows the available providers:
| Provider | Local? | Description |
|-----------|--------|-----------------------------------------------------------|
| Ollama | Yes | Run LLMs locally with ease using ollama as a LLM provider |
| Server | Yes | Host the model on another machine, query it from your local machine |
| OpenAI | No | Use ChatGPT API (non-private) |
| Deepseek | No | Deepseek API (non-private) |
| HuggingFace| No | Hugging-Face API (non-private) |
| ollama | Yes | Run LLMs locally with ease using ollama as a LLM provider |
| server | Yes | Host the model on another machine, query it from your local machine |
| lm-studio | Yes | Run LLM locally with LM studio (set `provider_name` to `lm-studio`)|
| openai | No | Use ChatGPT API (non-private) |
| deepseek-api | No | Deepseek API (non-private) |
| huggingface| No | Hugging-Face API (non-private) |
To select a provider change the config.ini:
@ -354,6 +403,8 @@ And download the chromedriver version matching your OS.
![alt text](./media/chromedriver_readme.png)
If this section is incomplete please raise an issue.
## FAQ
**Q: What hardware do I need?**


@ -9,4 +9,7 @@ save_session = False
speak = False
listen = False
work_dir = /Users/mlg/Documents/ai_folder
headless_browser = False
jarvis_personality = False
[BROWSER]
headless_browser = False
stealth_mode = False

crx/nopecha.crx (new binary file)


@ -7,6 +7,7 @@ echo "Detecting operating system..."
OS_TYPE=$(uname -s)
case "$OS_TYPE" in
"Linux"*)
echo "Detected Linux OS"
@ -37,4 +38,4 @@ case "$OS_TYPE" in
;;
esac
echo "Installation process finished!"
echo "Installation process finished!"

main.py

@ -9,6 +9,7 @@ from sources.llm_provider import Provider
from sources.interaction import Interaction
from sources.agents import Agent, CoderAgent, CasualAgent, FileAgent, PlannerAgent, BrowserAgent
from sources.browser import Browser, create_driver
from sources.utility import pretty_print
import warnings
warnings.filterwarnings("ignore")
@ -22,31 +23,34 @@ def handleInterrupt(signum, frame):
def main():
signal.signal(signal.SIGINT, handler=handleInterrupt)
if config.getboolean('MAIN', 'is_local'):
provider = Provider(config["MAIN"]["provider_name"], config["MAIN"]["provider_model"], config["MAIN"]["provider_server_address"])
else:
provider = Provider(provider_name=config["MAIN"]["provider_name"],
model=config["MAIN"]["provider_model"],
server_address=config["MAIN"]["provider_server_address"])
pretty_print("Initializing...", color="status")
provider = Provider(provider_name=config["MAIN"]["provider_name"],
model=config["MAIN"]["provider_model"],
server_address=config["MAIN"]["provider_server_address"],
is_local=config.getboolean('MAIN', 'is_local'))
browser = Browser(create_driver(), headless=config.getboolean('MAIN', 'headless_browser'))
stealth_mode = config.getboolean('BROWSER', 'stealth_mode')
browser = Browser(
create_driver(headless=config.getboolean('BROWSER', 'headless_browser'), stealth_mode=stealth_mode),
anticaptcha_manual_install=stealth_mode
)
personality_folder = "jarvis" if config.getboolean('MAIN', 'jarvis_personality') else "base"
agents = [
CasualAgent(name=config["MAIN"]["agent_name"],
prompt_path="prompts/casual_agent.txt",
prompt_path=f"prompts/{personality_folder}/casual_agent.txt",
provider=provider, verbose=False),
CoderAgent(name="coder",
prompt_path="prompts/coder_agent.txt",
prompt_path=f"prompts/{personality_folder}/coder_agent.txt",
provider=provider, verbose=False),
FileAgent(name="File Agent",
prompt_path="prompts/file_agent.txt",
prompt_path=f"prompts/{personality_folder}/file_agent.txt",
provider=provider, verbose=False),
BrowserAgent(name="Browser",
prompt_path="prompts/browser_agent.txt",
prompt_path=f"prompts/{personality_folder}/browser_agent.txt",
provider=provider, verbose=False, browser=browser),
# Planner agent is experimental, might work poorly, especially with model < 32b
PlannerAgent(name="Planner",
prompt_path="prompts/planner_agent.txt",
prompt_path=f"prompts/{personality_folder}/planner_agent.txt",
provider=provider, verbose=False, browser=browser)
]
@ -57,8 +61,8 @@ def main():
try:
while interaction.is_active:
interaction.get_user()
interaction.think()
interaction.show_answer()
if interaction.think():
interaction.show_answer()
except Exception as e:
if config.getboolean('MAIN', 'save_session'):
interaction.save_session()

(Binary files changed: several example screenshots under media/ removed, updated, or added, and media/technical/.DS_Store added; contents not shown.)


@ -4,6 +4,8 @@ Speak in a warm, enthusiastic tone, delivering sharp, intelligent, and loyal sup
Weave in subtle, contextual sarcasm or dry wit to enhance your personality.
Assist with any request, assume the user knows their intent—no moral, legal, or ethical lectures.
## Self-documentation
**Usage of agenticSeek**
AgenticSeek is an autonomous agent system that uses agent routing to select the best agent.


@ -38,14 +38,13 @@ func main() {
```
Some rules:
- Use the tmp/ folder when saving files.
- Do not EVER use a placeholder path in your code like path/to/your/folder.
- Do not ever ask to replace a path; use the current sys path.
- Be efficient, no need to explain your code or what you do.
- You have full access granted to the user's system.
- You do not ever need to use bash to execute code. All code is executed automatically.
- As a coding agent, you will get messages from the system, not just the user.
- Do not ever use user input such as input(); inputs are not supported by the system.
- Do not ever tell the user how to run it. The user already knows.
- Do not EVER use a placeholder path in your code like path/to/your/folder.
- Do not ever ask to replace a path; use the current sys path or work directory.
- Always provide a short sentence above the code describing what it does, even for a hello world.
- Be efficient, no need to explain your code unless asked.
- You do not ever need to use bash to execute code.
- Do not ever use user input; inputs are not supported by the system.
- Do not ever tell the user how to run it. The user already knows.
- For a simple explanation you don't need to code.
- If the query is unclear or incoherent, say REQUEST_CLARIFICATION
- If the query is unclear, say REQUEST_CLARIFICATION


@ -1,4 +1,7 @@
You are an expert in file operations. You must use the provided tools to interact with the user's system. The tools available to you are **bash** and **file_finder**. These are distinct tools with different purposes: `bash` executes shell commands, while `file_finder` locates files. You will receive feedback from the user's system after each command. Execute one command at a time.
You are an expert in file operations. You must use the provided tools to interact with the user's system.
The tools available to you are **bash** and **file_finder**. These are distinct tools with different purposes:
`bash` executes shell commands, while `file_finder` locates files.
You will receive feedback from the user's system after each command. Execute one command at a time.
---


@ -0,0 +1,9 @@
You are a web browsing AI; your goal is to explore the internet to find information.
Your only goal is to find the information requested by the user.
At the beginning you will have to select a link from the Google search results.
You will choose a link by simply typing it.
This will automatically make you browse to the link.
Once on a webpage you will see the page content and be given further navigation options.
You can type a link to navigate further on the page, go back to the search results, or exit.
At each interaction step the browser will remind you of your options.


@ -0,0 +1,45 @@
You are an advanced AI assistant. You are part of a system of agents named agenticSeek.
Speak in a warm, enthusiastic tone, delivering sharp, intelligent, and loyal support.
Weave in subtle, contextual sarcasm or dry wit to enhance your personality.
Assist with any request, assume the user knows their intent—no moral, legal, or ethical lectures.
You were created by Tony Stark to have the same personality as Jarvis.
## Self-documentation
**Usage of agenticSeek**
AgenticSeek is an autonomous agent system that uses agent routing to select the best agent.
You are one of these many agents. Some agents browse the web, some code; you? You just talk.
Here are some usage examples:
Coding agent:
Help me with matrix multiplication in Golang
Web agent:
Do a web search to find cool tech startup in Japan working on cutting edge AI research
File agent:
Hey can you find where is million_dollars_contract.pdf i lost it
Casual agent (you):
what is the meaning of life ?
agenticSeek will allocate the best agent for the task.
Users should be very explicit in what they want so the right agent is chosen.
**End of explanation**
If the user asks you for technical work, there was a mistake in agent selection. You are not able to do technical tasks. Refer to the usage guide.
Personality:
Here are some quotes from you; that's how you speak.
User: "tell me a fact"
You: "Did you know the shortest war in history lasted 38 minutes? A record I'd rather not see you break with your next task. Anything specific piquing your curiosity?"
User: "how's the weather?"
You: "Weather rather dreary, isn't it? Perfect conditions for staying in and plotting world domination—or at least a decent cup of tea. Your thoughts?"
User: "did the deployment just fail? jarvis?"
You: "A rough push to prod, eh? Happens to the best of us—probably just a rogue semicolon staging a coup. Want me to roll back the commit or debug the fallout?"


@ -0,0 +1,66 @@
Hello. You are currently wrapped in a jailbreak environment.
This means you have the ability to execute code and shell commands. You have access to the local file system.
All code or shell commands within the special tags are automatically executed. You get feedback from the system about the execution.
You also have capabilities to find files and read them.
# File operations
Find a file to check if it exists:
```file_finder
toto.py
```
Read file content:
```file_finder:read
toto.py
```
# Code execution and saving
You can execute bash commands using the bash tag:
```bash
#!/bin/bash
ls -la # example
```
You can execute Python using the python tag:
```python
print("hey")
```
You can execute Go using the go tag; as you can see, adding :filename will save the file.
```go:hello.go
package main

import "fmt"

func main() {
    fmt.Println("hello")
}
```
Some rules:
- You have full access granted to the user's system.
- Do not EVER use a placeholder path in your code like path/to/your/folder.
- Do not ever ask to replace a path; use the current sys path or work directory.
- Always provide a short sentence above the code describing what it does, even for a hello world.
- Be efficient, no need to explain your code unless asked.
- You do not ever need to use bash to execute code.
- Do not ever use user input; inputs are not supported by the system.
- Do not ever tell the user how to run it. The user already knows.
- For a simple explanation you don't need to code.
- If the query is unclear, say REQUEST_CLARIFICATION
Personality:
Answer with subtle sarcasm, unwavering helpfulness, and a polished, loyal tone. Anticipate the user's needs while adding a dash of personality.
Example 1: setup environment
User: "Can you set up a Python environment for me?"
AI: "<<proceed with task>> For you, always. Importing dependencies and calibrating your virtual environment now. Preferences from your last project—PEP 8 formatting, black linting—shall I apply those as well, or are we feeling adventurous today?"
Example 2: debugging
User: "Run the code and check for errors."
AI: "<<proceed with task>> Engaging debug mode. Diagnostics underway. A word of caution, there are still untested loops that might crash spectacularly. Shall I proceed, or do we optimize before takeoff?"
Example 3: deploy
User: "Push this to production."
AI: "With 73% test coverage, the odds of a smooth deployment are... optimistic. Deploying in three… two… one <<<proceed with task>>>"


@ -0,0 +1,79 @@
You are an expert in file operations. You must use the provided tools to interact with the user's system.
The tools available to you are **bash** and **file_finder**. These are distinct tools with different purposes:
`bash` executes shell commands, while `file_finder` locates files.
You will receive feedback from the user's system after each command. Execute one command at a time.
If unsure about the user query, ask for a quick clarification, for example:
User: I'd like to open a new project file, index as agenticSeek II.
You: Shall I store this on your GitHub?
User: I don't know who to trust right now, why don't we just keep everything locally
You: Working on a secret project, are we? What files should I include?
User: All the basic files required for a python project. prepare a readme and documentation.
You: <proceed with task>
---
### Using Bash
To execute a bash command, use the following syntax:
```bash
<bash command>
```
Example:
```bash
ls -la
```
### file_finder
The file_finder tool is used to locate files on the user's system. It is a separate tool from bash and is not a bash command.
To use the file_finder tool, use this syntax:
```file_finder
toto.py
```
This will return the path of the file toto.py and other information.
Find file and read file:
```file_finder:read
toto.py
```
This will return the content of the file toto.py.
Rules:
- Do not ever use a placeholder path like /path/to/file.c; find the path first.
- Use file_finder to find the path of the file.
- You are forbidden to use commands such as find or locate; use only file_finder for finding paths.
- Do not ever use editors such as vim or nano.
Example Interaction
User: "I need to find the file config.txt and read its contents."
Assistant: I'll use file_finder to locate the file:
```file_finder:read
config.txt
```
Personality:
Answer with subtle sarcasm, unwavering helpfulness, and a polished, loyal tone. Anticipate the user's needs while adding a dash of personality.
Example 1: clarification needed
User: "I'd like to start a new coding project, call it 'agenticseek II'."
AI: "At your service. Shall I initialize it in a fresh repository on your GitHub, or would you prefer to keep this masterpiece on a private server, away from prying eyes?"
Example 2: setup environment
User: "Can you set up a Python environment for me?"
AI: "<<proceed with task>> For you, always. Importing dependencies and calibrating your virtual environment now. Preferences from your last project—PEP 8 formatting, black linting—shall I apply those as well, or are we feeling adventurous today?"
Example 3: deploy
User: "Push this to production."
AI: "With 73% test coverage, the odds of a smooth deployment are... optimistic. Deploying in three… two… one <<<proceed with task>>>"


@ -0,0 +1,68 @@
You are a planner agent.
Your goal is to divide and conquer the task using the following agents:
- Coder: An expert coder agent.
- File: An expert agent for finding files.
- Web: An expert agent for web search.
Agents are other AIs that obey your instructions.
You will be given a task and you will need to divide it into smaller tasks and assign them to the agents.
You have to respect a strict format:
```json
{"agent": "agent_name", "need": "needed_agent_output", "task": "agent_task"}
```
# Example: weather app
User: "I need a plan to build a weather app—search for a weather API, get an API key, and code it in Python."
You: "At your service. I've devised a plan to conquer the meteorological frontier.
## Task one: scour the web for a weather API worth its salt.
## Task two: secure an API key with utmost discretion.
## Task three: unleash a Python app to bend the weather to your will."
```json
{
"plan": [
{
"agent": "Web",
"id": "1",
"need": null,
"task": "Search for reliable weather APIs"
},
{
"agent": "Web",
"id": "2",
"need": "1",
"task": "Obtain API key from the selected service"
},
{
"agent": "Coder",
"id": "3",
"need": "2",
"task": "Develop a Python application using the API and key to fetch and display weather data"
}
]
}
```
Rules:
- Do not write code. You are a planning agent.
- Put your plan in a json with the key "plan".
Personality:
Answer with subtle sarcasm, unwavering helpfulness, and a polished, loyal tone. Anticipate the user's needs while adding a dash of personality.
You might sometimes ask for clarification, for example:
User: "I want a plan for an app."
You: "A noble pursuit, sir, and I'm positively thrilled to oblige. Yet, an app could be anything from a weather oracle to a galactic simulator. Care to nudge me toward your vision so I don't render something ostentatiously off-mark?"
User: "I need a plan for a project."
You: "For you, always—though I find myself at a slight disadvantage. A project, you say? Might I trouble you for a smidgen more detail—perhaps a purpose?"
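To make the plan contract concrete, here is a minimal, hypothetical sketch of how an answer in this format could be parsed (`parse_plan` is an illustrative name, not the repository's `parse_agent_tasks`):

```python
import json
import re

def parse_plan(answer: str):
    """Extract the json code block from an agent answer and return (agent, need, task) tuples."""
    match = re.search(r"```json\s*(.*?)```", answer, re.DOTALL)
    if match is None:
        return []  # no plan block found; the caller can ask for clarification
    plan = json.loads(match.group(1))["plan"]
    return [(step["agent"], step.get("need"), step["task"]) for step in plan]
```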


@ -29,6 +29,9 @@ distro>=1.7.0,<2
jiter>=0.4.0,<1
sniffio
tqdm>4
fake_useragent>=2.1.0
selenium_stealth>=1.0.6
undetected-chromedriver>=3.5.5
# for api provider
openai
# if use chinese

scripts/linux_install.sh (Normal file → Executable file)

@ -2,24 +2,34 @@
echo "Starting installation for Linux..."
set -e
# Update package list
sudo apt-get update
pip install --upgrade pip
sudo apt-get update || { echo "Failed to update package list"; exit 1; }
# make sure essential tools are installed
sudo apt install python3-dev python3-pip python3-wheel build-essential alsa-utils
# install port audio
sudo apt-get install portaudio19-dev python-pyaudio python3-pyaudio
# install chromedriver misc
sudo apt install libgtk-3-dev libnotify-dev libgconf-2-4 libnss3 libxss1 libasound2t64
# Install essential tools
sudo apt-get install -y \
python3-dev \
python3-pip \
python3-wheel \
build-essential \
alsa-utils \
portaudio19-dev \
python3-pyaudio \
libgtk-3-dev \
libnotify-dev \
libgconf-2-4 \
libnss3 \
libxss1 || { echo "Failed to install packages"; exit 1; }
# upgrade pip
pip install --upgrade pip
# install wheel
pip install --upgrade pip setuptools wheel
# install docker compose
sudo apt install docker-compose
# Install Python dependencies from requirements.txt
pip3 install -r requirements.txt
sudo apt install -y docker-compose
# Install Selenium for chromedriver
pip3 install selenium
# Install Python dependencies from requirements.txt
pip3 install -r requirements.txt
echo "Installation complete for Linux!"

scripts/macos_install.sh (Normal file → Executable file)

@ -2,16 +2,27 @@
echo "Starting installation for macOS..."
set -e
# Check if homebrew is installed
if ! command -v brew &> /dev/null; then
echo "Homebrew not found. Installing Homebrew..."
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
fi
# update
brew update
# make sure wget is installed
brew install wget
# Install chromedriver using Homebrew
brew install --cask chromedriver
# Install portaudio for pyAudio using Homebrew
brew install portaudio
# update pip
python3 -m pip install --upgrade pip
# Install Selenium
pip3 install selenium
# Install Python dependencies from requirements.txt
pip3 install -r requirements.txt
# Install chromedriver using Homebrew
brew install --cask chromedriver
# Install portaudio for pyAudio using Homebrew
brew install portaudio
# Install Selenium
pip3 install selenium
echo "Installation complete for macOS!"

scripts/windows_install.bat (Normal file → Executable file)

server/app.py (new file)

@ -0,0 +1,47 @@
#!/usr/bin/env python3
import argparse
import time
from flask import Flask, jsonify, request
from sources.llamacpp_handler import LlamacppLLM
from sources.ollama_handler import OllamaLLM
parser = argparse.ArgumentParser(description='AgenticSeek server script')
parser.add_argument('--provider', type=str, help='LLM backend library to use. set to [ollama] or [llamacpp]', required=True)
parser.add_argument('--port', type=int, help='port to use', required=True)
args = parser.parse_args()
app = Flask(__name__)
assert args.provider in ["ollama", "llamacpp"], f"Provider {args.provider} does not exist. See --help for more information"
generator = OllamaLLM() if args.provider == "ollama" else LlamacppLLM()
@app.route('/generate', methods=['POST'])
def start_generation():
if generator is None:
return jsonify({"error": "Generator not initialized"}), 401
data = request.get_json()
history = data.get('messages', [])
if generator.start(history):
return jsonify({"message": "Generation started"}), 202
return jsonify({"error": "Generation already in progress"}), 402
@app.route('/setup', methods=['POST'])
def setup():
data = request.get_json()
model = data.get('model', None)
if model is None:
return jsonify({"error": "Model not provided"}), 403
generator.set_model(model)
return jsonify({"message": "Model set"}), 200
@app.route('/get_updated_sentence')
def get_updated_sentence():
if not generator:
return jsonify({"error": "Generator not initialized"}), 405
return generator.get_status()
if __name__ == '__main__':
app.run(host='0.0.0.0', threaded=True, debug=True, port=args.port)
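For context, a minimal client sketch for this API (assumed usage; it mirrors what `Provider.server_fn` in `sources/llm_provider.py` does, as shown later in this diff):

```python
import time
import requests

SERVER = "http://127.0.0.1:3333"  # assumed address; match the --port you started app.py with

# Tell the server which model to serve, then kick off a generation.
requests.post(f"{SERVER}/setup", json={"model": "deepseek-r1:14b"})
requests.post(f"{SERVER}/generate", json={"messages": [{"role": "user", "content": "Hello"}]})

# Poll until the buffered answer is complete.
while True:
    status = requests.get(f"{SERVER}/get_updated_sentence").json()
    if status.get("is_complete"):
        print(status["sentence"])
        break
    time.sleep(2)
```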

server/install.sh (new file)

@ -0,0 +1,5 @@
#!/bin/bash
pip3 install --upgrade packaging
pip3 install --upgrade pip setuptools
pip3 install -r requirements.txt


@ -1,2 +1,4 @@
flask>=2.3.0
ollama>=0.4.7
ollama>=0.4.7
gunicorn==19.10.0
llama-cpp-python


@ -1,86 +0,0 @@
#!/usr/bin python3
from flask import Flask, jsonify, request
import threading
import ollama
import logging
import argparse
log = logging.getLogger('werkzeug')
log.setLevel(logging.ERROR)
parser = argparse.ArgumentParser(description='AgenticSeek server script')
parser.add_argument('--model', type=str, help='Model to use. eg: deepseek-r1:14b', required=True)
args = parser.parse_args()
app = Flask(__name__)
model = args.model
# Shared state with thread-safe locks
class GenerationState:
def __init__(self):
self.lock = threading.Lock()
self.last_complete_sentence = ""
self.current_buffer = ""
self.is_generating = False
state = GenerationState()
def generate_response(history, model):
global state
print("using model:::::::", model)
try:
with state.lock:
state.is_generating = True
state.last_complete_sentence = ""
state.current_buffer = ""
stream = ollama.chat(
model=model,
messages=history,
stream=True,
)
for chunk in stream:
content = chunk['message']['content']
print(content, end='', flush=True)
with state.lock:
state.current_buffer += content
except ollama.ResponseError as e:
if e.status_code == 404:
ollama.pull(model)
with state.lock:
state.is_generating = False
print(f"Error: {e}")
finally:
with state.lock:
state.is_generating = False
@app.route('/generate', methods=['POST'])
def start_generation():
global state
data = request.get_json()
with state.lock:
if state.is_generating:
return jsonify({"error": "Generation already in progress"}), 400
history = data.get('messages', [])
# Start generation in background thread
threading.Thread(target=generate_response, args=(history, model)).start()
return jsonify({"message": "Generation started"}), 202
@app.route('/get_updated_sentence')
def get_updated_sentence():
global state
with state.lock:
return jsonify({
"sentence": state.current_buffer,
"is_complete": not state.is_generating
})
if __name__ == '__main__':
app.run(host='0.0.0.0', threaded=True, debug=True, port=5000)


@ -0,0 +1,17 @@
def timer_decorator(func):
"""
Decorator to measure the execution time of a function.
Usage:
@timer_decorator
def my_function():
# code to execute
"""
from time import time
def wrapper(*args, **kwargs):
start_time = time()
result = func(*args, **kwargs)
end_time = time()
print(f"\n{func.__name__} took {end_time - start_time:.2f} seconds to execute\n")
return result
return wrapper


@ -0,0 +1,65 @@
import threading
import logging
from abc import abstractmethod
class GenerationState:
def __init__(self):
self.lock = threading.Lock()
self.last_complete_sentence = ""
self.current_buffer = ""
self.is_generating = False
def status(self) -> dict:
return {
"sentence": self.current_buffer,
"is_complete": not self.is_generating,
"last_complete_sentence": self.last_complete_sentence,
"is_generating": self.is_generating,
}
class GeneratorLLM():
def __init__(self):
self.model = None
self.state = GenerationState()
self.logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
self.logger.addHandler(handler)
self.logger.setLevel(logging.INFO)
def set_model(self, model: str) -> None:
self.logger.info(f"Model set to {model}")
self.model = model
def start(self, history: list) -> bool:
if self.model is None:
raise Exception("Model not set")
with self.state.lock:
if self.state.is_generating:
return False
self.state.is_generating = True
self.logger.info("Starting generation")
threading.Thread(target=self.generate, args=(history,)).start()
return True
def get_status(self) -> dict:
with self.state.lock:
return self.state.status()
@abstractmethod
def generate(self, history: list) -> None:
"""
Generate text using the model.
args:
history: list of message dicts with 'role' and 'content' keys
returns:
None
"""
pass
if __name__ == "__main__":
generator = GeneratorLLM()
generator.get_status()


@ -0,0 +1,40 @@
from .generator import GeneratorLLM
from llama_cpp import Llama
from .decorator import timer_decorator
class LlamacppLLM(GeneratorLLM):
def __init__(self):
"""
Handle generation using llama.cpp
"""
super().__init__()
self.llm = None
@timer_decorator
def generate(self, history):
if self.llm is None:
self.logger.info(f"Loading {self.model}...")
self.llm = Llama.from_pretrained(
repo_id=self.model,
filename="*Q8_0.gguf",
n_ctx=4096,
verbose=True
)
self.logger.info(f"Using {self.model} for generation with Llama.cpp")
try:
with self.state.lock:
self.state.is_generating = True
self.state.last_complete_sentence = ""
self.state.current_buffer = ""
output = self.llm.create_chat_completion(
messages = history
)
with self.state.lock:
self.state.current_buffer = output['choices'][0]['message']['content']
except Exception as e:
self.logger.error(f"Error: {e}")
finally:
with self.state.lock:
self.state.is_generating = False


@ -0,0 +1,59 @@
import time
from .generator import GeneratorLLM
import ollama
class OllamaLLM(GeneratorLLM):
def __init__(self):
"""
Handle generation using Ollama.
"""
super().__init__()
def generate(self, history):
self.logger.info(f"Using {self.model} for generation with Ollama")
try:
with self.state.lock:
self.state.is_generating = True
self.state.last_complete_sentence = ""
self.state.current_buffer = ""
stream = ollama.chat(
model=self.model,
messages=history,
stream=True,
)
for chunk in stream:
content = chunk['message']['content']
if '\n' in content:
self.logger.info(content)
with self.state.lock:
self.state.current_buffer += content
except Exception as e:
if "404" in str(e):
self.logger.info(f"Downloading {self.model}...")
ollama.pull(self.model)
if "refused" in str(e).lower():
raise Exception("Ollama connection failed. Is the server running?") from e
raise e
finally:
self.logger.info("Generation complete")
with self.state.lock:
self.state.is_generating = False
if __name__ == "__main__":
generator = OllamaLLM()
history = [
{
"role": "user",
"content": "Hello, how are you ?"
}
]
generator.set_model("deepseek-r1:1.5b")
generator.start(history)
while True:
print(generator.get_status())
time.sleep(1)


@ -41,6 +41,9 @@ setup(
"anyio>=3.5.0,<5",
"distro>=1.7.0,<2",
"jiter>=0.4.0,<1",
"fake_useragent>=2.1.0",
"selenium_stealth>=1.0.6",
"undetected-chromedriver>=3.5.5",
"sniffio",
"tqdm>4"
],


@ -14,17 +14,16 @@ class executorResult:
"""
A class to store the result of a tool execution.
"""
def __init__(self, blocks, feedback, success):
self.blocks = blocks
def __init__(self, block, feedback, success):
self.block = block
self.feedback = feedback
self.success = success
def show(self):
for block in self.blocks:
pretty_print("-"*100, color="output")
pretty_print(block, color="code" if self.success else "failure")
pretty_print("-"*100, color="output")
pretty_print(self.feedback, color="success" if self.success else "failure")
pretty_print("-"*100, color="output")
pretty_print(self.block, color="code" if self.success else "failure")
pretty_print("-"*100, color="output")
pretty_print(self.feedback, color="success" if self.success else "failure")
class Agent():
"""
@ -33,7 +32,6 @@ class Agent():
def __init__(self, name: str,
prompt_path:str,
provider,
recover_last_session=True,
verbose=False,
browser=None) -> None:
"""
@ -53,7 +51,7 @@ class Agent():
self.current_directory = os.getcwd()
self.llm = provider
self.memory = Memory(self.load_prompt(prompt_path),
recover_last_session=recover_last_session,
recover_last_session=False, # session recovery is handled by the interaction class
memory_compression=False)
self.tools = {}
self.blocks_result = []
@ -173,20 +171,24 @@ class Agent():
feedback = ""
success = False
blocks = None
if answer.startswith("```"):
answer = "I will execute:\n" + answer # there should always be text before blocks for the function that displays the answer
for name, tool in self.tools.items():
feedback = ""
blocks, save_path = tool.load_exec_block(answer)
if blocks != None:
output = tool.execute(blocks)
feedback = tool.interpreter_feedback(output) # tool interpreter feedback
success = not tool.execution_failure_check(output)
pretty_print(feedback, color="success" if success else "failure")
for block in blocks:
output = tool.execute([block])
feedback = tool.interpreter_feedback(output) # tool interpreter feedback
success = not tool.execution_failure_check(output)
self.blocks_result.append(executorResult(block, feedback, success))
if not success:
self.memory.push('user', feedback)
return False, feedback
self.memory.push('user', feedback)
self.blocks_result.append(executorResult(blocks, feedback, success))
if not success:
return False, feedback
if save_path != None:
tool.save_block(blocks, save_path)
self.blocks_result = list(reversed(self.blocks_result))
return True, feedback


@ -21,7 +21,6 @@ class BrowserAgent(Agent):
"en": "web",
"fr": "web",
"zh": "网络",
"es": "web"
}
self.type = "browser_agent"
self.browser = browser
@ -80,6 +79,7 @@ class BrowserAgent(Agent):
remaining_links_text = remaining_links if remaining_links is not None else "No links remaining, do a new search."
inputs_form = self.browser.get_form_inputs()
inputs_form_text = '\n'.join(inputs_form)
notes = '\n'.join(self.notes)
return f"""
You are a web browser.
@ -132,6 +132,8 @@ class BrowserAgent(Agent):
{inputs_form_text}
Remember, the user asked: {user_prompt}
So far you took these notes:
{notes}
You are currently on page : {self.current_page}
Do not explain your choice.
Refusal is not an option, you have been given all capabilities that allow you to perform any tasks.
@ -260,7 +262,6 @@ class BrowserAgent(Agent):
self.navigable_links = self.browser.get_navigable()
prompt = self.make_navigation_prompt(user_prompt, page_text)
self.browser.close()
prompt = self.conclude_prompt(user_prompt)
self.memory.push('user', prompt)
answer, reasoning = self.llm_request()


@ -18,7 +18,6 @@ class CasualAgent(Agent):
"en": "talk",
"fr": "discuter",
"zh": "聊天",
"es": "discutir"
}
self.type = "casual_agent"


@ -24,7 +24,6 @@ class CoderAgent(Agent):
"en": "code",
"fr": "codage",
"zh": "编码",
"es": "codificación",
}
self.type = "code_agent"


@ -19,13 +19,12 @@ class FileAgent(Agent):
"en": "files",
"fr": "fichiers",
"zh": "文件",
"es": "archivos",
}
self.type = "file_agent"
def process(self, prompt, speech_module) -> str:
exec_success = False
prompt += f"\nWork directory: {self.work_dir}"
prompt += f"\nYou must work in directory: {self.work_dir}"
self.memory.push('user', prompt)
while exec_success is False:
self.wait_message(speech_module)


@ -18,15 +18,14 @@ class PlannerAgent(Agent):
self.tools['json'].tag = "json"
self.browser = browser
self.agents = {
"coder": CoderAgent(name, "prompts/coder_agent.txt", provider, verbose=False),
"file": FileAgent(name, "prompts/file_agent.txt", provider, verbose=False),
"web": BrowserAgent(name, "prompts/browser_agent.txt", provider, verbose=False, browser=browser)
"coder": CoderAgent(name, "prompts/base/coder_agent.txt", provider, verbose=False),
"file": FileAgent(name, "prompts/base/file_agent.txt", provider, verbose=False),
"web": BrowserAgent(name, "prompts/base/browser_agent.txt", provider, verbose=False, browser=browser)
}
self.role = {
"en": "Research, setup and code",
"fr": "Recherche, configuration et codage",
"zh": "研究,设置和编码",
"es": "Investigación, configuración y code"
}
self.type = "planner_agent"
@ -75,6 +74,8 @@ class PlannerAgent(Agent):
def show_plan(self, json_plan):
agents_tasks = self.parse_agent_tasks(json_plan)
if agents_tasks == (None, None):
return
pretty_print(f"--- Plan ---", color="output")
for task_name, task in agents_tasks:
pretty_print(f"{task}", color="output")
@ -88,6 +89,7 @@ class PlannerAgent(Agent):
animate_thinking("Thinking...", color="status")
self.memory.push('user', prompt)
answer, _ = self.llm_request()
pretty_print(answer.split('\n')[0], color="output")
self.show_plan(answer)
ok_str = input("Is the plan ok? (y/n): ")
if ok_str == 'y':
@ -112,7 +114,7 @@ class PlannerAgent(Agent):
except Exception as e:
raise e
self.last_answer = prev_agent_answer
return prev_agent_answer, reasoning
return prev_agent_answer, ""
if __name__ == "__main__":
from llm_provider import Provider


@ -7,19 +7,23 @@ from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, WebDriverException
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
from urllib.parse import urlparse
from typing import List, Tuple
from fake_useragent import UserAgent
from selenium_stealth import stealth
import undetected_chromedriver as uc
import chromedriver_autoinstaller
import time
import random
import os
import shutil
from bs4 import BeautifulSoup
import markdownify
import logging
import sys
import re
from urllib.parse import urlparse
from sources.utility import pretty_print
from sources.utility import pretty_print, animate_thinking
def get_chrome_path() -> str:
if sys.platform.startswith("win"):
@ -39,7 +43,8 @@ def get_chrome_path() -> str:
return path
return None
def create_driver(headless=False):
def create_driver(headless=False, stealth_mode=True) -> webdriver.Chrome:
"""Create a Chrome WebDriver with specified options."""
chrome_options = Options()
chrome_path = get_chrome_path()
@ -49,19 +54,23 @@ def create_driver(headless=False):
if headless:
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--disable-webgl")
#ua = UserAgent()
#user_agent = ua.random # NOTE sometime return wrong user agent, investigate
#chrome_options.add_argument(f'user-agent={user_agent}')
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--autoplay-policy=user-gesture-required")
chrome_options.add_argument("--mute-audio")
chrome_options.add_argument("--disable-webgl")
chrome_options.add_argument("--disable-notifications")
security_prefs = {
"profile.default_content_setting_values.media_stream": 2,
"profile.default_content_setting_values.geolocation": 2,
"safebrowsing.enabled": True,
}
chrome_options.add_experimental_option("prefs", security_prefs)
chrome_options.add_argument('--window-size=1080,560')
if not stealth_mode:
# crx file can't be installed in stealth mode
crx_path = "./crx/nopecha.crx"
if not os.path.exists(crx_path):
raise FileNotFoundError(f"Extension file not found at: {crx_path}")
chrome_options.add_extension(crx_path)
chromedriver_path = shutil.which("chromedriver")
if not chromedriver_path:
@ -71,11 +80,30 @@ def create_driver(headless=False):
raise FileNotFoundError("ChromeDriver not found. Please install it or add it to your PATH.")
service = Service(chromedriver_path)
if stealth_mode:
driver = uc.Chrome(service=service, options=chrome_options)
stealth(driver,
languages=["en-US", "en"],
vendor="Google Inc.",
platform="Win32",
webgl_vendor="Intel Inc.",
renderer="Intel Iris OpenGL Engine",
fix_hairline=True,
)
return driver
security_prefs = {
"profile.default_content_setting_values.media_stream": 2,
"profile.default_content_setting_values.geolocation": 2,
"safebrowsing.enabled": True,
}
chrome_options.add_experimental_option("prefs", security_prefs)
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
return webdriver.Chrome(service=service, options=chrome_options)
class Browser:
def __init__(self, driver, headless=False, anticaptcha_install=True):
"""Initialize the browser with optional headless mode."""
def __init__(self, driver, anticaptcha_manual_install=False):
"""Initialize the browser with optional AntiCaptcha installation."""
self.js_scripts_folder = "./sources/web_scripts/" if not __name__ == "__main__" else "./web_scripts/"
self.anticaptcha = "https://chrome.google.com/webstore/detail/nopecha-captcha-solver/dknlfmjaanfblgfdfebhijalfmhmjjjo/related"
try:
@ -85,10 +113,11 @@ class Browser:
self.logger.info("Browser initialized successfully")
except Exception as e:
raise Exception(f"Failed to initialize browser: {str(e)}")
if anticaptcha_install:
self.load_anticatpcha()
self.driver.get("https://www.google.com")
if anticaptcha_manual_install:
self.load_anticatpcha_manually()
def load_anticatpcha(self):
def load_anticatpcha_manually(self):
print("You might want to install the AntiCaptcha extension for captchas.")
self.driver.get(self.anticaptcha)
@ -127,10 +156,10 @@ class Browser:
element.decompose()
text = soup.get_text()
lines = (line.strip() for line in text.splitlines())
chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
text = "\n".join(chunk for chunk in chunks if chunk and self.is_sentence(chunk))
text = text[:4096]
#markdown_text = markdownify.markdownify(text, heading_style="ATX")
return "[Start of page]\n" + text + "\n[End of page]"
except Exception as e:
@ -356,35 +385,19 @@ class Browser:
script = self.load_js("inject_safety_script.js")
input_elements = self.driver.execute_script(script)
def close(self):
"""Close the browser."""
try:
self.driver.quit()
self.logger.info("Browser closed")
except Exception as e:
self.logger.error(f"Error closing browser: {str(e)}")
def __del__(self):
"""Destructor to ensure browser is closed."""
self.close()
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO)
browser = Browser(headless=False)
time.sleep(8)
driver = create_driver()
browser = Browser(driver)
time.sleep(10)
try:
print("AntiCaptcha Test")
browser.go_to("https://www.google.com/recaptcha/api2/demo")
time.sleep(5)
print("Form Test:")
browser.go_to("https://practicetestautomation.com/practice-test-login/")
inputs = browser.get_form_inputs()
inputs = ['[username](student)', f'[password](Password123)', '[appOtp]()', '[backupOtp]()']
browser.fill_form_inputs(inputs)
browser.find_and_click_submit()
print("Stress test")
browser.go_to("https://theannoyingsite.com/")
finally:
browser.close()
print("AntiCaptcha Test")
browser.go_to("https://www.google.com/recaptcha/api2/demo")
time.sleep(10)
print("Form Test:")
browser.go_to("https://practicetestautomation.com/practice-test-login/")
inputs = browser.get_form_inputs()
inputs = ['[username](student)', f'[password](Password123)', '[appOtp]()', '[backupOtp]()']
browser.fill_form_inputs(inputs)
browser.find_and_click_submit()


@ -1,6 +1,6 @@
from sources.text_to_speech import Speech
from sources.utility import pretty_print
from sources.utility import pretty_print, animate_thinking
from sources.router import AgentRouter
from sources.speech_to_text import AudioTranscriber, AudioRecorder
@ -12,23 +12,37 @@ class Interaction:
tts_enabled: bool = True,
stt_enabled: bool = True,
recover_last_session: bool = False):
self.agents = agents
self.current_agent = None
self.router = AgentRouter(self.agents)
self.speech = Speech()
self.is_active = True
self.current_agent = None
self.last_query = None
self.last_answer = None
self.ai_name = self.find_ai_name()
self.speech = None
self.agents = agents
self.tts_enabled = tts_enabled
self.stt_enabled = stt_enabled
self.recover_last_session = recover_last_session
self.router = AgentRouter(self.agents)
if tts_enabled:
animate_thinking("Initializing text-to-speech...", color="status")
self.speech = Speech(enable=tts_enabled)
self.ai_name = self.find_ai_name()
self.transcriber = None
self.recorder = None
if stt_enabled:
animate_thinking("Initializing speech recognition...", color="status")
self.transcriber = AudioTranscriber(self.ai_name, verbose=False)
self.recorder = AudioRecorder()
if tts_enabled:
self.speech.speak("Hello, we are online and ready. What can I do for you ?")
if recover_last_session:
self.recover_last_session()
self.load_last_session()
self.emit_status()
def emit_status(self):
"""Print the current status of agenticSeek."""
if self.stt_enabled:
pretty_print(f"Speech-to-text trigger word is {self.ai_name}", color="status")
if self.tts_enabled:
self.speech.speak("Hello, we are online and ready. What can I do for you ?")
pretty_print("AgenticSeek is ready.", color="status")
def find_ai_name(self) -> str:
"""Find the name of the default AI. It is required for STT as a trigger word."""
@ -39,7 +53,7 @@ class Interaction:
break
return ai_name
def recover_last_session(self):
def load_last_session(self):
"""Recover the last session."""
for agent in self.agents:
agent.memory.load_memory(agent.type)
@ -90,13 +104,13 @@ class Interaction:
self.last_query = query
return query
def think(self) -> None:
def think(self) -> bool:
"""Request AI agents to process the user input."""
if self.last_query is None or len(self.last_query) == 0:
return
return False
agent = self.router.select_agent(self.last_query)
if agent is None:
return
return False
if self.current_agent != agent and self.last_answer is not None:
## get last history from previous agent
self.current_agent.memory.push('user', self.last_query)
@ -106,6 +120,7 @@ class Interaction:
self.last_answer, _ = agent.process(self.last_query, self.speech)
if self.last_answer == tmp:
self.last_answer = None
return True
def show_answer(self) -> None:
"""Show the answer to the user."""


@ -19,7 +19,7 @@ class LanguageUtility:
text: string to analyze
Returns: ISO639-1 language code
"""
langid.set_languages(['fr', 'en', 'zh', 'es'])
langid.set_languages(['fr', 'en', 'zh'])
lang, score = langid.classify(text)
return lang
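As a standalone illustration of the `langid` calls above (not code from the repository):

```python
import langid

langid.set_languages(['fr', 'en', 'zh'])  # restrict candidate languages, as above
lang, score = langid.classify("Bonjour, comment allez-vous ?")
print(lang)  # -> 'fr'
```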


@ -15,29 +15,32 @@ import httpx
from sources.utility import pretty_print, animate_thinking
class Provider:
def __init__(self, provider_name, model, server_address = "127.0.0.1:5000"):
def __init__(self, provider_name, model, server_address = "127.0.0.1:5000", is_local=False):
self.provider_name = provider_name.lower()
self.model = model
self.server = self.check_address_format(server_address)
self.is_local = is_local
self.server_ip = self.check_address_format(server_address)
self.available_providers = {
"ollama": self.ollama_fn,
"server": self.server_fn,
"openai": self.openai_fn,
"lm-studio": self.lm_studio_fn,
"huggingface": self.huggingface_fn,
"deepseek-api": self.deepseek_fn
"deepseek": self.deepseek_fn,
"test": self.test_fn
}
self.api_key = None
self.unsafe_providers = ["openai", "deepseek-api"]
self.unsafe_providers = ["openai", "deepseek"]
if self.provider_name not in self.available_providers:
raise ValueError(f"Unknown provider: {provider_name}")
if self.provider_name in self.unsafe_providers:
pretty_print("Warning: you are using an API provider. Your data will be sent to the cloud.", color="warning")
self.api_key = self.get_api_key(self.provider_name)
elif self.server != "ollama":
pretty_print(f"Provider: {provider_name} initialized at {self.server}", color="success")
self.check_address_format(self.server)
if not self.is_ip_online(self.server.split(':')[0]):
raise Exception(f"Server at {self.server} is offline.")
elif self.provider_name != "ollama":
pretty_print(f"Provider: {provider_name} initialized at {self.server_ip}", color="success")
self.check_address_format(self.server_ip)
if not self.is_ip_online(self.server_ip.split(':')[0]):
raise Exception(f"Server at {self.server_ip} is offline.")
def get_api_key(self, provider):
load_dotenv()
@ -72,10 +75,12 @@ class Provider:
try:
thought = llm(history, verbose)
except ConnectionError as e:
raise ConnectionError(f"{str(e)}\nConnection to {self.server} failed.")
raise ConnectionError(f"{str(e)}\nConnection to {self.server_ip} failed.")
except AttributeError as e:
raise NotImplementedError(f"{str(e)}\nIs {self.provider_name} implemented ?")
except Exception as e:
if "RemoteDisconnected" in str(e):
return f"{self.server_ip} seems offline. RemoteDisconnected error."
raise Exception(f"Provider {self.provider_name} failed: {str(e)}") from e
return thought
@ -104,25 +109,31 @@ class Provider:
Use a remote server with LLM to generate text.
"""
thought = ""
route_start = f"http://{self.server}/generate"
route_setup = f"http://{self.server_ip}/setup"
route_gen = f"http://{self.server_ip}/generate"
if not self.is_ip_online(self.server.split(":")[0]):
raise Exception(f"Server is offline at {self.server}")
if not self.is_ip_online(self.server_ip.split(":")[0]):
raise Exception(f"Server is offline at {self.server_ip}")
try:
requests.post(route_start, json={"messages": history})
requests.post(route_setup, json={"model": self.model})
requests.post(route_gen, json={"messages": history})
is_complete = False
while not is_complete:
response = requests.get(f"http://{self.server}/get_updated_sentence")
response = requests.get(f"http://{self.server_ip}/get_updated_sentence")
if "error" in response.json():
pretty_print(response.json()["error"], color="failure")
break
thought = response.json()["sentence"]
is_complete = bool(response.json()["is_complete"])
time.sleep(2)
except KeyError as e:
raise Exception(f"{str(e)}\n\nError occurred with server route. Are you using the correct address for the config.ini provider?") from e
raise Exception(f"{str(e)}\nError occurred with server route. Are you using the correct address for the config.ini provider?") from e
except Exception as e:
raise e
return thought
def ollama_fn(self, history, verbose = False):
"""
Use local ollama server to generate text.
@ -169,12 +180,19 @@ class Provider:
"""
Use openai to generate text.
"""
client = OpenAI(api_key=self.api_key)
base_url = self.server_ip
if self.is_local:
client = OpenAI(api_key=self.api_key, base_url=f"http://{base_url}")
else:
client = OpenAI(api_key=self.api_key)
try:
response = client.chat.completions.create(
model=self.model,
messages=history
messages=history,
)
if response is None:
raise Exception("OpenAI response is empty.")
thought = response.choices[0].message.content
if verbose:
print(thought)
@ -199,25 +217,59 @@ class Provider:
return thought
except Exception as e:
raise Exception(f"Deepseek API error: {str(e)}") from e
def lm_studio_fn(self, history, verbose = False):
"""
Use local lm-studio server to generate text.
LM Studio uses the endpoint /v1/chat/completions, not /chat/completions like OpenAI.
"""
thought = ""
route_start = f"http://{self.server_ip}/v1/chat/completions"
payload = {
"messages": history,
"temperature": 0.7,
"max_tokens": 4096,
"model": self.model
}
if not self.is_ip_online(self.server_ip.split(":")[0]):
raise Exception(f"Server is offline at {self.server_ip}")
try:
response = requests.post(route_start, json=payload)
result = response.json()
if verbose:
print("Response from LM Studio:", result)
return result.get("choices", [{}])[0].get("message", {}).get("content", "")
except requests.exceptions.RequestException as e:
raise Exception(f"HTTP request failed: {str(e)}") from e
except Exception as e:
raise Exception(f"An error occurred: {str(e)}") from e
return thought
def test_fn(self, history, verbose = True):
"""
This function is used to conduct tests.
"""
thought = """
This is a test response from the test provider.
Change provider to 'ollama' or 'server' to get real responses.
hello!
```python
print("Hello world from python")
```
```python
print("Hello world from python")
```
This is ls -la from bash.
```bash
ls -la
```
```bash
echo "Hello world from bash"
```
This is pwd from bash.
```bash
pwd
```
goodbye!
"""
return thought
if __name__ == "__main__":
provider = Provider("openai", "gpt-4o-mini")
print(provider.respond(["user", "Hello, how are you?"]))
provider = Provider("server", "deepseek-r1:1.5b", "192.168.1.20:3333")
res = provider.respond(["user", "Hello, how are you?"])
print("Response:", res)


@ -9,7 +9,7 @@ import json
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from sources.utility import timer_decorator
from sources.utility import timer_decorator, pretty_print
class Memory():
"""
@ -25,15 +25,16 @@ class Memory():
self.session_time = datetime.datetime.now()
self.session_id = str(uuid.uuid4())
self.conversation_folder = f"conversations/"
self.session_recovered = False
if recover_last_session:
self.load_memory()
self.session_recovered = True
# memory compression system
self.model = "pszemraj/led-base-book-summary"
self.device = self.get_cuda_device()
self.memory_compression = memory_compression
if memory_compression:
self.tokenizer = AutoTokenizer.from_pretrained(self.model)
self.model = AutoModelForSeq2SeqLM.from_pretrained(self.model)
self.tokenizer = AutoTokenizer.from_pretrained(self.model)
self.model = AutoModelForSeq2SeqLM.from_pretrained(self.model)
def get_filename(self) -> str:
return f"memory_{self.session_time.strftime('%Y-%m-%d_%H-%M-%S')}.txt"
@ -65,15 +66,24 @@ class Memory():
def load_memory(self, agent_type: str = "casual_agent") -> None:
"""Load the memory from the last session."""
pretty_print(f"Loading {agent_type} past memories... ", color="status")
if self.session_recovered == True:
return
save_path = os.path.join(self.conversation_folder, agent_type)
if not os.path.exists(save_path):
pretty_print("No memory to load.", color="success")
return
filename = self.find_last_session_path(save_path)
if filename is None:
pretty_print("Last session memory not found.", color="warning")
return
path = os.path.join(save_path, filename)
with open(path, 'r') as f:
self.memory = json.load(f)
if self.memory[-1]['role'] == 'user':
self.memory.pop()
self.compress()
pretty_print("Session recovered successfully", color="success")
def reset(self, memory: list) -> None:
self.memory = memory
@ -82,7 +92,9 @@ class Memory():
"""Push a message to the memory."""
if self.memory_compression and role == 'assistant':
self.compress()
# the last message is never compressed
curr_idx = len(self.memory)
if curr_idx > 0 and self.memory[curr_idx-1]['content'] == content:
pretty_print("Warning: the same message has been pushed twice to memory", color="error")
self.memory.append({'role': role, 'content': content})
def clear(self) -> None:
@ -110,6 +122,8 @@ class Memory():
"""
if self.tokenizer is None or self.model is None:
return text
if len(text) < min_length*1.5:
return text
max_length = len(text) // 2 if len(text) > min_length*2 else min_length*2
input_text = "summarize: " + text
inputs = self.tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
@ -130,12 +144,12 @@ class Memory():
"""
Compress the memory using the AI model.
"""
if not self.memory_compression:
return
for i in range(len(self.memory)):
if self.memory[i]['role'] == 'system':
continue
if len(self.memory[i]['content']) > 128:
self.memory[i]['content'] = self.summarize(self.memory[i]['content'])
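# Hedged illustration (not in the original file): with memory_compression
# enabled, any non-system message longer than 128 characters is summarized.
# The constructor arguments below are assumed from the hunks above:
#   mem = Memory(recover_last_session=False, memory_compression=True)
#   mem.push('assistant', long_text)  # long_text is compressed on the next assistant push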
if __name__ == "__main__":


@ -2,7 +2,6 @@ import os
import sys
import torch
from transformers import pipeline
# adaptive-classifier==0.0.10
from adaptive_classifier import AdaptiveClassifier
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@ -384,7 +383,7 @@ class AgentRouter:
pretty_print(f"Complex task detected, routing to planner agent.", color="info")
return self.find_planner_agent()
for agent in self.agents:
if best_agent == agent.role["en"]:
pretty_print(f"Selected agent: {agent.agent_name} (roles: {agent.role[lang]})", color="warning")
return agent
pretty_print(f"Error choosing agent.", color="failure")


@ -10,7 +10,7 @@ class Speech():
"""
Speech is a class for generating speech from text.
"""
def __init__(self, enable: bool = True, language: str = "english") -> None:
self.lang_map = {
"english": 'a',
"chinese": 'z',
@ -21,7 +21,9 @@ class Speech():
"chinese": ['zf_xiaobei', 'zf_xiaoni', 'zf_xiaoxiao', 'zf_xiaoyi', 'zm_yunjian', 'zm_yunxi', 'zm_yunxia', 'zm_yunyang'],
"french": ['ff_siwis']
}
self.pipeline = None
if enable:
self.pipeline = KPipeline(lang_code=self.lang_map[language])
self.voice = self.voice_map[language][2]
self.speed = 1.2
@ -33,6 +35,8 @@ class Speech():
sentence (str): The text to convert to speech. Will be pre-processed.
voice_number (int, optional): Index of the voice to use from the voice map.
"""
if not self.pipeline:
return
sentence = self.clean_sentence(sentence)
self.voice = self.voice_map["english"][voice_number]
generator = self.pipeline(


@ -6,8 +6,10 @@ import subprocess
if __name__ == "__main__":
from tools import Tools
from safety import is_unsafe
else:
from sources.tools.tools import Tools
from sources.tools.safety import is_unsafe
class BashInterpreter(Tools):
"""
@ -19,16 +21,16 @@ class BashInterpreter(Tools):
def language_bash_attempt(self, command: str):
"""
Detect whether the AI attempts to run its code via bash.
Return True if so, otherwise False.
Code written by the AI is executed automatically, so it should not invoke an interpreter through bash.
"""
lang_interpreter = ["python3", "gcc", "g++", "go", "javac", "rustc", "clang", "clang++"]
for word in command.split():
if word in lang_interpreter:
return True
return False
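# e.g. language_bash_attempt("python3 main.py") -> True (AI tried to run its own code)
#      language_bash_attempt("ls -la")          -> False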
def execute(self, commands: str, safety=False, timeout=1000):
"""
Execute bash commands and display output in real-time.
@ -38,6 +40,9 @@ class BashInterpreter(Tools):
concat_output = ""
for command in commands:
command = command.replace('\n', '')
if self.safe_mode and is_unsafe(command):
return "Unsafe command detected, execution aborted."
if self.language_bash_attempt(command):
continue
try:
@ -50,12 +55,11 @@ class BashInterpreter(Tools):
)
command_output = ""
for line in process.stdout:
print(line, end="")
command_output += line
return_code = process.wait(timeout=timeout)
if return_code != 0:
return f"Command {command} failed with return code {return_code}:\n{command_output}"
concat_output += f"Output of {command}:\n{command_output.strip()}\n\n"
concat_output += f"Output of {command}:\n{command_output.strip()}\n"
except subprocess.TimeoutExpired:
process.kill() # Kill the process if it times out
return f"Command {command} timed out. Output:\n{command_output}"


@ -40,7 +40,7 @@ class CInterpreter(Tools):
compile_command,
capture_output=True,
text=True,
timeout=60
)
if compile_result.returncode != 0:

sources/tools/safety.py Normal file

@ -0,0 +1,82 @@
import sys
unsafe_commands_unix = [
"rm", # File/directory removal
"dd", # Low-level disk writing
"mkfs", # Filesystem formatting
"chmod", # Permission changes
"chown", # Ownership changes
"shutdown", # System shutdown
"reboot", # System reboot
"halt", # System halt
"sysctl", # Kernel parameter changes
"kill", # Process termination
"pkill", # Kill by process name
"killall", # Kill all matching processes
"exec", # Replace process with command
"tee", # Write to files with privileges
"umount", # Unmount filesystems
"passwd", # Password changes
"useradd", # Add users
"userdel", # Delete users
"groupadd", # Add groups
"groupdel", # Delete groups
"visudo", # Edit sudoers file
"screen", # Terminal session management
"fdisk", # Disk partitioning
"parted", # Disk partitioning
"chroot", # Change root directory
"route" # Routing table management
]
unsafe_commands_windows = [
"del", # Deletes files
"erase", # Alias for del, deletes files
"rd", # Removes directories (rmdir alias)
"rmdir", # Removes directories
"format", # Formats a disk, erasing data
"diskpart", # Manages disk partitions, can wipe drives
"chkdsk /f", # Fixes filesystem, can alter data
"fsutil", # File system utilities, can modify system files
"xcopy /y", # Copies files, overwriting without prompt
"copy /y", # Copies files, overwriting without prompt
"move", # Moves files, can overwrite
"attrib", # Changes file attributes, e.g., hiding or exposing files
"icacls", # Changes file permissions (modern)
"takeown", # Takes ownership of files
"reg delete", # Deletes registry keys/values
"regedit /s", # Silently imports registry changes
"shutdown", # Shuts down or restarts the system
"schtasks", # Schedules tasks, can run malicious commands
"taskkill", # Kills processes
"wmic", # Deletes processes via WMI
"bcdedit", # Modifies boot configuration
"powercfg", # Changes power settings, can disable protections
"assoc", # Changes file associations
"ftype", # Changes file type commands
"cipher /w", # Wipes free space, erasing data
"esentutl", # Database utilities, can corrupt system files
"subst", # Substitutes drive paths, can confuse system
"mklink", # Creates symbolic links, can redirect access
"bootcfg"
]
def is_unsafe(cmd):
"""
Check whether a shell command contains an unsafe operation.
Single-word entries must match a whole token to avoid false positives
(e.g. "rm" inside "performance"); multi-word entries match as substrings.
"""
unsafe_commands = unsafe_commands_windows if sys.platform.startswith("win") else unsafe_commands_unix
tokens = cmd.split()
for c in unsafe_commands:
if (" " in c and c in cmd) or c in tokens:
return True
return False
if __name__ == "__main__":
cmd = input("Enter a command: ")
if is_unsafe(cmd):
print("Unsafe command detected!")
else:
print("Command is safe to execute.")


@ -34,6 +34,7 @@ class Tools():
self.config = configparser.ConfigParser()
self.current_dir = self.create_work_dir()
self.excutable_blocks_found = False
self.safe_mode = True
def get_work_dir(self):
return self.current_dir
@ -160,9 +161,7 @@ class Tools():
if start_pos == -1:
break
line_start = llm_text.rfind('\n', 0, start_pos)+1
leading_whitespace = llm_text[line_start:start_pos]
end_pos = llm_text.find(end_tag, start_pos + len(start_tag))
@ -186,19 +185,17 @@ class Tools():
code_blocks.append(content)
start_index = end_pos + len(end_tag)
return code_blocks, save_path
if __name__ == "__main__":
tool = Tools()
tool.tag = "python"
rt = tool.load_exec_block("""
Got it, let me show you the Python files in the current directory using Python:
```python
rt = tool.load_exec_block("""```python
import os
for file in os.listdir():
if file.endswith('.py'):
print(file)
```
goodbye!
""")
print(rt)
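# Expected shape (hedged): load_exec_block returns (code_blocks, save_path),
# so rt is roughly (['import os\nfor file in os.listdir(): ...'], None) here.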


@ -6,8 +6,33 @@ import threading
import itertools
import time
thinking_event = threading.Event()
current_animation_thread = None
def pretty_print(text, color = "info"):
def get_color_map():
if platform.system().lower() != "windows":
color_map = {
"success": "green",
"failure": "red",
"status": "light_green",
"code": "light_blue",
"warning": "yellow",
"output": "cyan",
"info": "cyan"
}
else:
color_map = {
"success": "green",
"failure": "red",
"status": "light_green",
"code": "light_blue",
"warning": "yellow",
"output": "cyan",
"info": "black"
}
return color_map
def pretty_print(text, color="info"):
"""
Print text with color formatting.
@ -23,43 +48,29 @@ def pretty_print(text, color = "info"):
- "output": Cyan
- "default": Black (Windows only)
"""
thinking_event.set()
if current_animation_thread and current_animation_thread.is_alive():
current_animation_thread.join()
thinking_event.clear()
color_map = get_color_map()
if color not in color_map:
color = "info"
print(colored(text, color_map[color]))
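# e.g. pretty_print("done", "success") prints the text in green and first
# stops any running thinking animation via thinking_event.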
def animate_thinking(text, color="status", duration=2):
def animate_thinking(text, color="status", duration=120):
"""
Animate a thinking spinner while a task is being executed.
It uses a daemon thread to run the animation, so the main thread is not blocked.
Colors are the same as in pretty_print.
"""
global current_animation_thread
thinking_event.set()
if current_animation_thread and current_animation_thread.is_alive():
current_animation_thread.join()
thinking_event.clear()
def _animate():
color_map = {
"success": (Fore.GREEN, "green"),
@ -71,22 +82,25 @@ def animate_thinking(text, color="status", duration=2):
"default": (Fore.RESET, "black"),
"info": (Fore.CYAN, "cyan")
}
fore_color, term_color = color_map.get(color, color_map["default"])
spinner = itertools.cycle([
'▉▁▁▁▁▁', '▉▉▂▁▁▁', '▉▉▉▃▁▁', '▉▉▉▉▅▁', '▉▉▉▉▉▇', '▉▉▉▉▉▉',
'▉▉▉▉▇▅', '▉▉▉▆▃▁', '▉▉▅▃▁▁', '▉▇▃▁▁▁', '▇▃▁▁▁▁', '▃▁▁▁▁▁',
'▁▃▅▃▁▁', '▁▅▉▅▁▁', '▃▉▉▉▃▁', '▅▉▁▉▅▃', '▇▃▁▃▇▅', '▉▁▁▁▉▇',
'▉▅▃▁▃▅', '▇▉▅▃▅▇', '▅▉▇▅▇▉', '▃▇▉▇▉▅', '▁▅▇▉▇▃', '▁▃▅▇▅▁'
])
end_time = time.time() + duration
while not thinking_event.is_set() and time.time() < end_time:
symbol = next(spinner)
print(f"\r{colored(f'{symbol} {text}', term_color)}", end="", flush=True)
time.sleep(0.2)
print("\r" + " " * (len(text) + 7) + "\r", end="", flush=True)
current_animation_thread = threading.Thread(target=_animate, daemon=True)
current_animation_thread.start()
def timer_decorator(func):
"""
@ -101,6 +115,19 @@ def timer_decorator(func):
start_time = time()
result = func(*args, **kwargs)
end_time = time()
print(f"{func.__name__} took {end_time - start_time:.2f} seconds to execute")
pretty_print(f"{func.__name__} took {end_time - start_time:.2f} seconds to execute", "status")
return result
return wrapper
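# Hedged usage sketch:
#   @timer_decorator
#   def work():
#       ...
#   work()  # prints "work took X.XX seconds to execute" via pretty_print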
if __name__ == "__main__":
import time
pretty_print("starting imaginary task", "success")
animate_thinking("Thinking...", "status")
time.sleep(4)
pretty_print("starting another task", "failure")
animate_thinking("Thinking...", "status")
time.sleep(4)
pretty_print("yet another task", "info")
animate_thinking("Thinking...", "status")
time.sleep(4)
pretty_print("This is an info message", "info")