152 Commits

Author SHA1 Message Date
jmorganca
43efc893d7 basic progress 2024-09-03 21:15:13 -04:00
jmorganca
20afaae020 add more runner params 2024-09-03 21:15:13 -04:00
jmorganca
72f3fe4b94 truncate stop properly 2024-09-03 21:15:13 -04:00
jmorganca
a379d68aa9 wip stop tokens 2024-09-03 21:15:13 -04:00
jmorganca
b2ef3bf490 embeddings 2024-09-03 21:15:12 -04:00
jmorganca
ce15ed6d69 remove dependency on llm 2024-09-03 21:15:12 -04:00
jmorganca
c0b94376b2 grammar 2024-09-03 21:15:12 -04:00
jmorganca
72be8e27c4 sampling 2024-09-03 21:15:12 -04:00
jmorganca
d12db0568e better example module, add port 2024-09-03 21:15:12 -04:00
jmorganca
ec17359a68 wip 2024-09-03 21:15:12 -04:00
jmorganca
fbc8572859 add llava to runner 2024-09-03 21:15:12 -04:00
jmorganca
87af27dac0 fix output in build_hipblas.sh 2024-09-03 21:15:12 -04:00
jmorganca
54f391309f mods to build_hipblas.sh for linux 2024-09-03 21:15:12 -04:00
jmorganca
28bedcd807 wip 2024-09-03 21:15:12 -04:00
jmorganca
922d0acbdb improve cuda and hipblas build scripts 2024-09-03 21:15:12 -04:00
jmorganca
b22d78720e cuda linux 2024-09-03 21:15:12 -04:00
Jeffrey Morgan
905568a47f Update README.md 2024-09-03 21:15:12 -04:00
Jeffrey Morgan
a15ac52fbe Update README.md 2024-09-03 21:15:12 -04:00
jmorganca
9547aa53ff disable log file 2024-09-03 21:15:12 -04:00
jmorganca
e29205ad6d fix readme for llava 2024-09-03 21:15:12 -04:00
jmorganca
a8f91d3cc1 add llava 2024-09-03 21:15:12 -04:00
jmorganca
a9884ae136 llama: add clip dependencies 2024-09-03 21:15:12 -04:00
jmorganca
e37651cca0 add clip and parallel requests to the todo list 2024-09-03 21:15:12 -04:00
jmorganca
593d6836ab fix cuda build 2024-09-03 21:15:12 -04:00
jmorganca
533a7e7d50 fix build on windows 2024-09-03 21:15:12 -04:00
jmorganca
0873d28b16 fix ggml-metal.m build constraints 2024-09-03 21:15:12 -04:00
jmorganca
bb795faa6c fix ggml-metal.m 2024-09-03 21:15:12 -04:00
jmorganca
e86db9381a avx2 should only add avx2 2024-09-03 21:15:12 -04:00
jmorganca
4a5633e4bc fix sync script 2024-09-03 21:15:12 -04:00
jmorganca
86f453252b fix ggml-metal.m 2024-09-03 21:15:12 -04:00
jmorganca
dfd8f34806 fix ggml-metal.m 2024-09-03 21:15:12 -04:00
jmorganca
beb847b40f add license headers 2024-09-03 21:15:12 -04:00
jmorganca
785f76d390 pre-patch 2024-09-03 21:15:12 -04:00
jmorganca
9fe48978a8 move runner package down 2024-09-03 21:15:12 -04:00
jmorganca
01ccbc07fe replace static build in llm 2024-09-03 21:15:12 -04:00
jmorganca
ec09be97e8 fix build 2024-09-03 21:15:12 -04:00
jmorganca
6129f30479 wip... 2024-09-03 21:15:12 -04:00
jmorganca
eb1aa97961 rename server to runner 2024-09-03 21:15:12 -04:00
Jeffrey Morgan
5e921e06ac Update README.md 2024-09-03 21:15:12 -04:00
Jeffrey Morgan
02089baf70 Update README.md 2024-09-03 21:15:12 -04:00
Jeffrey Morgan
870e91be76 Update README.md 2024-09-03 21:15:12 -04:00
Jeffrey Morgan
7ecc8e86c4 Update README.md 2024-09-03 21:15:12 -04:00
jmorganca
b1696e308e Add missing hipcc flags 2024-09-03 21:15:12 -04:00
jmorganca
0110994d06 Initial llama Go module 2024-09-03 21:15:12 -04:00
jmorganca
2ef3a217d1 add sync of llama.cpp 2024-09-03 21:15:12 -04:00
Michael Yang
fccf8d179f partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00
Bruce MacDonald
984c9c628c fix embeddings invalid values 2023-08-09 16:50:53 -04:00
Bruce MacDonald
09d8bf6730 fix build errors 2023-08-09 10:45:57 -04:00
Bruce MacDonald
7a5f3616fd embed text document in modelfile 2023-08-09 10:26:19 -04:00
Michael Yang
f2074ed4c0 Merge pull request #306 from jmorganca/default-keep-system
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Bruce MacDonald
a6f6d18f83 embed text document in modelfile 2023-08-08 11:27:17 -04:00
Jeffrey Morgan
5eb712f962 trim whitespace before checking stop conditions
Fixes #295
2023-08-08 00:29:19 -04:00
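The fix above can be sketched as follows. This is an illustrative reconstruction of the idea (trim trailing whitespace before testing stop sequences, so whitespace emitted by the model cannot mask a stop condition), not ollama's actual implementation; the function name and signature are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// hasStopSuffix reports whether the generated text, after trimming
// surrounding whitespace, ends with any of the stop sequences.
// Hypothetical sketch of the behavior described in the commit above.
func hasStopSuffix(text string, stops []string) bool {
	trimmed := strings.TrimSpace(text)
	for _, stop := range stops {
		if strings.HasSuffix(trimmed, stop) {
			return true
		}
	}
	return false
}

func main() {
	stops := []string{"###"}
	fmt.Println(hasStopSuffix("answer ### \n", stops)) // true: whitespace trimmed first
	fmt.Println(hasStopSuffix("answer", stops))        // false
}
```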
Michael Yang
4dc5b117dd automatically set num_keep if num_keep < 0
num_keep defines how many tokens to keep in the context when truncating
inputs. If left at its default value of -1, the server will calculate
num_keep to be the length of the system instructions.
2023-08-07 16:19:12 -07:00
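The defaulting rule described above can be sketched as follows, assuming a hypothetical helper name and a pre-tokenized system prompt; this is not ollama's actual API.

```go
package main

import "fmt"

// resolveNumKeep applies the default described in the commit above:
// when numKeep is left at -1, fall back to the token count of the
// system instructions so they survive context truncation.
func resolveNumKeep(numKeep int, systemTokens []int) int {
	if numKeep < 0 {
		return len(systemTokens)
	}
	return numKeep
}

func main() {
	system := []int{101, 2023, 345} // pretend-tokenized system prompt
	fmt.Println(resolveNumKeep(-1, system)) // -1 defaults to len(system)
	fmt.Println(resolveNumKeep(16, system)) // an explicit value is kept
}
```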
Michael Yang
b9f4d67554 configurable rope frequency parameters 2023-08-03 22:11:58 -07:00
Michael Yang
c5bcf32823 update llama.cpp 2023-08-03 11:50:24 -07:00
Michael Yang
0e79e52ddd override ggml-metal if the file is different 2023-08-02 12:50:30 -07:00
Michael Yang
74a5f7e698 no gpu for 70B model 2023-08-01 17:12:50 -07:00
Michael Yang
7a1c3e62dc update llama.cpp 2023-08-01 16:54:01 -07:00
Michael Yang
319f078dd9 remove -Werror
there are compile warnings on Linux which -Werror elevates to errors,
preventing compilation
2023-07-31 21:45:56 -07:00
Jeffrey Morgan
7da249fcc1 only build metal for darwin,arm target 2023-07-31 21:35:23 -04:00
Bruce MacDonald
184ad8f057 allow specifying stop conditions in modelfile 2023-07-28 11:02:04 -04:00
Jeffrey Morgan
dffc8b6e09 update llama.cpp to d91f3f0 2023-07-28 08:07:48 -04:00
Michael Yang
3549676678 embed ggml-metal.metal 2023-07-27 17:23:29 -07:00
Michael Yang
fadf75f99d add stop conditions 2023-07-27 17:00:47 -07:00
Michael Yang
ad3a7d0e2c add NumGQA 2023-07-27 14:05:11 -07:00
Michael Yang
18ffeeec45 update llama.cpp 2023-07-27 14:05:11 -07:00
Michael Yang
cca61181cb sample metrics 2023-07-27 09:31:44 -07:00
Michael Yang
c490416189 lock on llm.lock(); decrease batch size 2023-07-27 09:31:44 -07:00
Michael Yang
f62a882760 add session expiration 2023-07-27 09:31:44 -07:00
Michael Yang
3003fc03fc update predict code 2023-07-27 09:31:44 -07:00
Michael Yang
35af37a2cb session id 2023-07-27 09:31:44 -07:00
Michael Yang
726bc647b2 enable k quants 2023-07-25 08:39:58 -07:00
Michael Yang
cb55fa9270 enable accelerate 2023-07-24 17:14:45 -07:00
Michael Yang
b71c67b6ba allocate a large enough tokens slice 2023-07-21 23:05:15 -07:00
Michael Yang
8526e1f5f1 add llama.cpp mpi, opencl files 2023-07-20 14:19:55 -07:00
Michael Yang
a83eaa7a9f update llama.cpp to e782c9e735f93ab4767ffc37462c523b73a17ddc 2023-07-20 11:55:56 -07:00
Michael Yang
5156e48c2a add script to update llama.cpp 2023-07-20 11:54:59 -07:00
Michael Yang
40c9dc0a31 fix multibyte responses 2023-07-14 20:11:44 -07:00
Michael Yang
0142660bd4 size_t 2023-07-14 17:29:16 -07:00
Michael Yang
1775647f76 continue conversation
feed responses back into the llm
2023-07-13 17:13:00 -07:00
Michael Yang
05e08d2310 return more info in generate response 2023-07-13 09:37:32 -07:00
Michael Yang
e1f0a0dc74 fix eof error in generate 2023-07-12 09:36:16 -07:00
Jeffrey Morgan
c63f811909 return error if model fails to load 2023-07-11 20:32:26 -07:00
Jeffrey Morgan
7c71c10d4f fix compilation issue in Dockerfile, remove from README.md until ready 2023-07-11 19:51:08 -07:00
Jeffrey Morgan
e64ef69e34 look for ggml-metal in the same directory as the binary 2023-07-11 15:58:56 -07:00
Michael Yang
442dec1c6f vendor llama.cpp 2023-07-11 11:59:18 -07:00
Michael Yang
fd4792ec56 call llama.cpp directly from go 2023-07-11 11:59:18 -07:00
Jeffrey Morgan
268e362fa7 fix binding build 2023-07-10 11:33:43 -07:00
Jeffrey Morgan
a18e6b3a40 llama: remove unnecessary std::vector 2023-07-09 10:51:45 -04:00
Jeffrey Morgan
5fb96255dc llama: remove unused helper functions 2023-07-09 10:25:07 -04:00
Patrick Devine
3f1b7177f2 pass model and predict options 2023-07-07 09:34:05 -07:00
Michael Yang
5dc9c8ff23 more free 2023-07-06 17:08:03 -07:00
Bruce MacDonald
da74384a3e remove prompt cache 2023-07-06 17:49:05 -04:00
Michael Yang
2c80eddd71 more free 2023-07-06 16:34:44 -04:00
Jeffrey Morgan
9fe018675f use Makefile for dependency building instead of go generate 2023-07-06 16:34:44 -04:00
Michael Yang
1b7183c5a1 enable metal gpu acceleration
ggml-metal.metal must be in the same directory as the ollama binary
otherwise llama.cpp will not be able to find it and load it.

1. go generate llama/llama_metal.go
2. go build .
3. ./ollama serve
2023-07-06 16:34:44 -04:00
Jeffrey Morgan
0998d4f0a4 remove debug print statements 2023-07-06 16:34:44 -04:00
Jeffrey Morgan
79a999e95d fix crash in bindings 2023-07-06 16:34:44 -04:00
Jeffrey Morgan
fd962a36e5 client updates 2023-07-06 16:34:44 -04:00
Jeffrey Morgan
0240165388 fix llama.cpp build 2023-07-06 16:34:44 -04:00