Refactor: minor change in server

This commit is contained in:
martin legrand 2025-03-07 13:56:14 +01:00
commit 7271466e30
4 changed files with 103 additions and 26 deletions

README.md

@@ -1,7 +1,7 @@
# 🚀 agenticSeek: Local AI Assistant Powered by DeepSeek Agents
# AgenticSeek: Fully local AI Assistant Powered by Deepseek R1 Agents.
**A fully local AI assistant** using Deepseek R1 agents.
**A fully local AI assistant** using AI agents. The goal of the project is to create a truly Jarvis-like assistant using reasoning models such as Deepseek R1.
> 🛠️ **Work in Progress** Looking for contributors! 🚀
---
@@ -11,13 +11,19 @@
- **Privacy-first**: Runs 100% locally, **no data leaves your machine**
- **Voice-enabled**: Speak and interact naturally
- **Coding abilities**: Code in Python, Bash, C, Golang, and soon more
- **Self-correcting**: Automatically fixes errors by itself
- **Trial-and-error**: Automatically fixes code or commands upon execution failure
- **Agent routing**: Select the best agent for the task
- **Multi-agent**: For complex tasks, divide and conquer with multiple agents
- **Web browsing (not implemented yet)**: Browse the web and search the internet
- **Tools**: Each agent has its own set of tools: basic search, flight API, file explorer, etc.
- **Web browsing (not implemented yet)**: Browse the web autonomously to conduct tasks.
---
## Installation
### 1. **Install Dependencies**
@@ -57,21 +63,25 @@ Run the assistant:
python3 main.py
```
### 4. **Alternative: Run the Assistant (Own Server)**
### 4. **Alternative: Run the LLM on your own server**
Get the ip address of the machine that will run the model
On your "server" that will run the AI model, get the ip address
```sh
ip a | grep "inet " | grep -v 127.0.0.1 | awk '{print $2}' | cut -d/ -f1
```
On the other machine that will run the model execute the script in stream_llm.py
Clone the repository, then run the script `stream_llm.py` located in `server/`:
```sh
python3 stream_llm.py
```
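This starts a small Flask API that generates responses with ollama, so it assumes ollama is installed and running on that machine (see `server/stream_llm.py` below).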
Now, on your personal computer:
Clone the repository.
Change the `config.ini` file to set `provider_name` to `server` and `provider_model` to `deepseek-r1:7b`.
Set `provider_server_address` to the IP address of the machine that runs the model, as in the sketch below.
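For example, assuming the server machine sits at 192.168.1.100 (a placeholder IP, substitute your own) and serves on port 5000 as `stream_llm.py` does, the relevant `config.ini` entries would look like this sketch:
```
is_local = False
provider_name = server
provider_model = deepseek-r1:7b
provider_server_address = 192.168.1.100:5000
```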
@@ -89,21 +99,40 @@ Run the assistant:
python3 main.py
```
## Provider
Currently the available providers are:
- ollama -> Uses ollama running on your computer. Ollama is a program for running large language models locally.
- server -> A custom script that lets you run the LLM on another machine. It currently uses ollama, but other options will follow soon.
- openai -> Use the ChatGPT API (not private).
- deepseek -> Use the Deepseek API (not private).
To select a provider, change the `config.ini`:
```
is_local = False
provider_name = openai
provider_model = gpt-4o
provider_server_address = 127.0.0.1:5000
```
- `is_local`: should be True for any locally running LLM, otherwise False.
- `provider_name`: select the provider to use by its name, see the provider list above.
- `provider_model`: set the model used by the agent.
- `provider_server_address`: can be set to anything if you are not using the server provider.
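As a minimal sketch of how these fields can be read with Python's standard `configparser` (which requires a section header such as `[MAIN]`, an assumption since the example above does not show one):
```python
import configparser

# Sketch only: assumes config.ini starts with a [MAIN] section header.
config = configparser.ConfigParser()
config.read('config.ini')

is_local = config.getboolean('MAIN', 'is_local')
provider_name = config.get('MAIN', 'provider_name')
provider_model = config.get('MAIN', 'provider_model')
provider_server_address = config.get('MAIN', 'provider_server_address')
```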
## Current capabilities
- All running locally
- Reasoning with deepseek R1
- Code execution capabilities (Python, Golang, C)
- Code execution capabilities (Python, Golang, C, etc.)
- Shell control capabilities in bash
- Will try to fix errors by itself
- Routing system, select the best agent for the task
- Fast text-to-speech using kokoro.
- Speech to text.
- Memory compression (reduces history as the interaction progresses using a summary model)
- Recovery: recover last session from memory
## UNDER DEVELOPMENT
- Web browsing
- Knowledge base RAG
- Graphical interface
- Speech-to-text using distil-whisper/distil-medium.en
- Recovery: recover and save sessions from the filesystem.

media/demo_img.png (new binary file, 237 KiB)

server/config.json (new file, 30 lines)

@@ -0,0 +1,30 @@
{
    "model_name": "deepseek-r1:14b",
    "known_models": [
        "qwq:32b",
        "deepseek-r1:1.5b",
        "deepseek-r1:7b",
        "deepseek-r1:14b",
        "deepseek-r1:32b",
        "deepseek-r1:70b",
        "deepseek-r1:671b",
        "deepseek-coder:1.3b",
        "deepseek-coder:6.7b",
        "deepseek-coder:33b",
        "llama2-uncensored:7b",
        "llama2-uncensored:70b",
        "llama3.1:8b",
        "llama3.1:70b",
        "llama3.3:70b",
        "llama3:8b",
        "llama3:70b",
        "phi4:14b",
        "mistral:7b",
        "mistral:70b",
        "mistral:33b",
        "qwen1:7b",
        "qwen1:14b",
        "qwen1:32b",
        "qwen1:70b"
    ]
}
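This file is read at startup by `stream_llm.py` (below), which validates the configured `model_name` against the `known_models` list.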

server/stream_llm.py

@@ -2,25 +2,43 @@ from flask import Flask, jsonify, request
import threading
import ollama
import logging
import json

log = logging.getLogger('werkzeug')
log.setLevel(logging.ERROR)

app = Flask(__name__)

model = 'deepseek-r1:14b'

# Shared state with thread-safe locks
class Config:
    def __init__(self):
        self.model = None
        self.known_models = []
        self.allowed_models = []
        self.model_name = None

    def load(self):
        with open('config.json', 'r') as f:
            data = json.load(f)
            self.known_models = data['known_models']
            self.model_name = data['model_name']

    def validate_model(self, model):
        if model not in self.known_models:
            raise ValueError(f"Model {model} is not known")

class GenerationState:
    def __init__(self):
        self.lock = threading.Lock()
        self.last_complete_sentence = ""
        self.current_buffer = ""
        self.is_generating = False
        self.model = None

state = GenerationState()
def generate_response(history, model):
def generate_response(history):
    global state
    try:
        with state.lock:
@@ -29,21 +47,18 @@ def generate_response(history, model):
            state.current_buffer = ""

        stream = ollama.chat(
            model=model,
            model=state.model,
            messages=history,
            stream=True,
        )

        for chunk in stream:
            content = chunk['message']['content']
            print(content, end='', flush=True)
            with state.lock:
                state.current_buffer += content
    except ollama.ResponseError as e:
        if e.status_code == 404:
            ollama.pull(model)
            ollama.pull(state.model)
        with state.lock:
            state.is_generating = False
        print(f"Error: {e}")
@@ -61,8 +76,7 @@ def start_generation():
        return jsonify({"error": "Generation already in progress"}), 400

    history = data.get('messages', [])
    # Start generation in background thread
    threading.Thread(target=generate_response, args=(history, model)).start()
    threading.Thread(target=generate_response, args=(history,)).start()
    return jsonify({"message": "Generation started"}), 202
@app.route('/get_updated_sentence')

@@ -75,4 +89,8 @@ def get_updated_sentence():
    })

if __name__ == '__main__':
    config = Config()
    config.load()
    config.validate_model(config.model_name)
    state.model = config.model_name
    app.run(host='0.0.0.0', port=5000, debug=False, threaded=True)
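For reference, a minimal client sketch for this server. The route that starts generation falls outside the hunks above, so the `/generate` path is an assumption; only `/get_updated_sentence` is confirmed:
```python
import time
import requests

SERVER = "http://192.168.1.100:5000"  # placeholder, use your server's IP

# Assumed start route; the actual @app.route decorator is not shown above.
resp = requests.post(f"{SERVER}/generate",
                     json={"messages": [{"role": "user", "content": "Hello!"}]})
print(resp.status_code)  # 202 once generation has started

# Poll the buffer the server fills while the model streams tokens.
for _ in range(30):
    print(requests.get(f"{SERVER}/get_updated_sentence").json())
    time.sleep(1)
```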