jmorganca
cd776e49ad
llama: wip vision support for runner
2024-08-12 22:18:30 -07:00
Daniel Hiltgen
41bf8d9932
Update sync with latest llama.cpp layout, and run against b3485
2024-07-31 09:50:39 -07:00
Daniel Hiltgen
5152a430f5
Prefix all build artifacts with an OS/ARCH dir
...
This will help keep incremental builds from stomping on each other and make it
easier to stitch together the final runner payloads
2024-07-29 15:38:52 -07:00
jmorganca
7ad4c5334e
clean up metal code
2024-07-29 15:38:52 -07:00
jmorganca
9caee9f8e3
fix Makefile on windows
2024-07-29 15:38:52 -07:00
jmorganca
518ba1c793
remove printing
2024-07-29 15:38:52 -07:00
jmorganca
2abf81885d
lint
2024-07-29 15:38:52 -07:00
jmorganca
f6faf66dac
fix metal
2024-07-29 15:38:52 -07:00
jmorganca
f8424faf75
fix build on windows
2024-07-29 15:38:52 -07:00
jmorganca
295c202b2f
link metal
2024-07-29 15:38:52 -07:00
jmorganca
f96cade3a6
wip
2024-07-29 15:38:52 -07:00
jmorganca
b767f6554c
wip meta
2024-07-29 15:38:52 -07:00
jmorganca
87833dd606
sync
2024-07-29 15:38:52 -07:00
jmorganca
2f94ffd801
remove perl docs
2024-07-29 15:38:52 -07:00
jmorganca
e9d15eb277
remove build scripts
2024-07-29 15:38:52 -07:00
jmorganca
a687913a97
fix output
2024-07-29 15:38:52 -07:00
jmorganca
6110d25dce
arch build
2024-07-29 15:38:52 -07:00
jmorganca
2081ec9ba1
add temporary makefile
2024-07-29 15:38:52 -07:00
jmorganca
34015ca10d
fix cgo flags for darwin amd64
2024-07-29 15:38:51 -07:00
jmorganca
89cb4b8d6b
basic progress
2024-07-29 15:38:51 -07:00
jmorganca
0d365e8d34
add more runner params
2024-07-29 15:38:51 -07:00
jmorganca
424627c347
embeddings
2024-07-29 15:38:51 -07:00
jmorganca
1a801fba2a
remove dependency on llm
2024-07-29 15:38:51 -07:00
jmorganca
727494ea54
grammar
2024-07-29 15:38:51 -07:00
jmorganca
b39fca7088
sampling
2024-07-29 15:38:51 -07:00
jmorganca
db55b1b89d
better example module, add port
2024-07-29 15:38:51 -07:00
jmorganca
1124e24aff
wip
2024-07-29 15:38:51 -07:00
jmorganca
df44d119a3
add llava to runner
2024-07-29 15:38:51 -07:00
jmorganca
aaca2ce093
wip
2024-07-29 15:38:51 -07:00
jmorganca
323a3f1f3a
cuda linux
2024-07-29 15:38:51 -07:00
jmorganca
31e0de825e
disable log file
2024-07-29 15:38:51 -07:00
jmorganca
878eb9a19f
add llava
2024-07-29 15:38:51 -07:00
jmorganca
3d656588a7
avx2 should only add avx2
2024-07-29 15:38:51 -07:00
jmorganca
4dd63c1fef
move runner package down
2024-07-29 15:38:51 -07:00
jmorganca
82214396b5
replace static build in llm
2024-07-29 15:38:51 -07:00
jmorganca
491ff41675
Initial llama Go module
2024-07-29 15:35:09 -07:00
jmorganca
075f2e88d9
add sync of llama.cpp
2024-07-29 15:35:09 -07:00
Michael Yang
fccf8d179f
partial decode ggml bin for more info
2023-08-10 09:23:10 -07:00
Bruce MacDonald
984c9c628c
fix embeddings invalid values
2023-08-09 16:50:53 -04:00
Bruce MacDonald
09d8bf6730
fix build errors
2023-08-09 10:45:57 -04:00
Bruce MacDonald
7a5f3616fd
embed text document in modelfile
2023-08-09 10:26:19 -04:00
Michael Yang
f2074ed4c0
Merge pull request #306 from jmorganca/default-keep-system
...
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Bruce MacDonald
a6f6d18f83
embed text document in modelfile
2023-08-08 11:27:17 -04:00
Jeffrey Morgan
5eb712f962
trim whitespace before checking stop conditions
...
Fixes #295
2023-08-08 00:29:19 -04:00
Michael Yang
4dc5b117dd
automatically set num_keep if num_keep < 0
...
num_keep defines how many tokens to keep in the context when truncating
inputs. if left to its default value of -1, the server will calculate
num_keep to be the left of the system instructions
2023-08-07 16:19:12 -07:00
Michael Yang
b9f4d67554
configurable rope frequency parameters
2023-08-03 22:11:58 -07:00
Michael Yang
c5bcf32823
update llama.cpp
2023-08-03 11:50:24 -07:00
Michael Yang
74a5f7e698
no gpu for 70B model
2023-08-01 17:12:50 -07:00
Michael Yang
319f078dd9
remove -Werror
...
there are compile warnings on Linux which -Werror elevates to errors,
preventing compile
2023-07-31 21:45:56 -07:00
Jeffrey Morgan
7da249fcc1
only build metal for darwin,arm target
2023-07-31 21:35:23 -04:00