Model Gallery

13 models from 1 repository

jina-reranker-v1-tiny-en
This model is designed for blazing-fast reranking while maintaining competitive performance. It is built on our JinaBERT model, a variant of the BERT architecture that supports the symmetric bidirectional variant of ALiBi. This allows jina-reranker-v1-tiny-en to process significantly longer sequences of text than other reranking models, up to 8,192 tokens.

Repository: localai
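
As a rough sketch of how this reranker might be called once loaded in a local LocalAI instance, the snippet below posts a query and a few candidate documents to a Jina-style rerank endpoint. The URL, port, and response shape are assumptions based on a typical OpenAI/Jina-compatible deployment, not details taken from this gallery; the 8,192-token window means fairly long documents can be scored without chunking.

    import requests

    # Assumed local endpoint: LocalAI commonly listens on port 8080,
    # and a Jina-style /v1/rerank route is assumed here.
    URL = "http://localhost:8080/v1/rerank"

    payload = {
        "model": "jina-reranker-v1-tiny-en",
        "query": "How does ALiBi help with long inputs?",
        "documents": [
            "ALiBi biases attention scores by token distance instead of positional embeddings.",
            "The 2008 financial crisis began in the housing market.",
            "Linear attention biases let models handle sequences longer than those seen in training.",
        ],
        "top_n": 2,
    }

    resp = requests.post(URL, json=payload, timeout=60)
    resp.raise_for_status()

    # Assumed response shape: {"results": [{"index": ..., "relevance_score": ...}, ...]}
    for item in resp.json().get("results", []):
        print(item.get("index"), item.get("relevance_score"))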

eurollm-9b-instruct
The EuroLLM project has the goal of creating a suite of LLMs capable of understanding and generating text in all European Union languages as well as some additional relevant languages. EuroLLM-9B is a 9B parameter model trained on 4 trillion tokens divided across the considered languages and several data sources: Web data, parallel data (en-xx and xx-en), and high-quality datasets. EuroLLM-9B-Instruct was further instruction tuned on EuroBlocks, an instruction tuning dataset with focus on general instruction-following and machine translation.

Repository: localai · License: apache-2.0
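
Since EuroLLM-9B-Instruct is instruction tuned with a focus on machine translation, a minimal way to exercise it through LocalAI's OpenAI-compatible chat endpoint might look like the sketch below; the base URL, port, and prompt are illustrative assumptions about a default local deployment.

    from openai import OpenAI

    # Assumed default LocalAI address; any OpenAI-compatible client works.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

    resp = client.chat.completions.create(
        model="eurollm-9b-instruct",
        messages=[
            {"role": "system", "content": "You are a translation assistant."},
            {"role": "user", "content": "Translate into Portuguese: 'The committee will meet next Tuesday.'"},
        ],
        temperature=0.2,
    )
    print(resp.choices[0].message.content)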

phi-4
phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. Phi-4 is a 14B-parameter dense decoder-only Transformer model.

Repository: localai · License: mit

LocalAI-functioncall-phi-4-v0.3
A model tailored to be conversational and execute function calls with LocalAI. This model is based on phi-4.

Repository: localai · License: mit
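
A sketch of how function calling with this model might look through LocalAI's OpenAI-compatible API follows; the get_weather tool, the base URL, and the port are illustrative assumptions, and whether the model emits a tool call or a plain reply depends on the prompt and configuration.

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")  # assumed local address

    # Hypothetical tool the model may choose to call.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="LocalAI-functioncall-phi-4-v0.3",
        messages=[{"role": "user", "content": "What's the weather like in Berlin right now?"}],
        tools=tools,
    )

    msg = resp.choices[0].message
    if msg.tool_calls:
        call = msg.tool_calls[0]
        print(call.function.name, call.function.arguments)  # arguments arrive as a JSON string
    else:
        print(msg.content)  # the model answered conversationally instead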

LocalAI-functioncall-phi-4-v0.2
A model tailored to be conversational and execute function calls with LocalAI. This model is based on phi-4. This is the second iteration of https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.1 with added CoT (o1) capabilities from the marco-o1 dataset.

Repository: localai · License: mit

LocalAI-functioncall-phi-4-v0.1
A model tailored to be conversational and execute function calls with LocalAI. This model is based on phi-4.

Repository: localai · License: mit

sicariussicariistuff_phi-lthy4
- The BEST Phi-4 Roleplay finetune in the world (not that much of an achievement here; Phi roleplay finetunes can probably be counted on a single hand).
- Compact size & fully healed from the brain surgery: only 11.9B parameters. Phi-4 wasn't that hard to run even at 14B; now with even fewer brain cells, your new phone could probably run it easily (SD8Gen3 and above recommended).
- Strong Roleplay & Creative writing abilities. This really surprised me. Actually good. Writes and roleplays quite uniquely, probably because of the lack of RP/writing slop in the pretrain. Who would have thought?
- Smart assistant with low refusals: it kept some of the smarts, and our little Phi-Lthy here will be quite eager to answer your naughty questions.
- Quite good at following the character card. Finally, it puts its math brain to some productive tasks. Gooner technology is becoming more popular by the day.

Repository: localai · License: mit

sicariussicariistuff_phi-line_14b
- Excellent Roleplay with more brains (who would have thought Phi-4 models would be good at this? So weird...).
- Medium-length responses (1-4 paragraphs, usually 2-3).
- Excellent assistant that follows instructions well enough and keeps good formatting.
- Strong Creative writing abilities. Will obey requests regarding formatting (markdown headlines for paragraphs, etc.). Writes and roleplays quite uniquely, probably because of the lack of RP/writing slop in the pretrain. This is just my guesstimate.
- LOW refusals: total freedom in RP, can do things other RP models won't, and I'll leave it at that. Low refusals in assistant tasks as well.
- VERY good at following the character card. Math brain is used for gooner tech, as it should be.

Repository: localai · License: mit

microsoft_phi-4-mini-instruct
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites, with a focus on high-quality, reasoning-dense data. The model belongs to the Phi-4 model family and supports a 128K token context length. The model underwent an enhancement process, incorporating both supervised fine-tuning and direct preference optimization to support precise instruction adherence and robust safety measures.

Repository: localai · License: mit

microsoft_phi-4-mini-reasoning
Phi-4-mini-reasoning is a lightweight open model built upon synthetic data with a focus on high-quality, reasoning-dense data, further finetuned for more advanced math reasoning capabilities. The model belongs to the Phi-4 model family and supports a 128K token context length.

Phi-4-mini-reasoning is designed for multi-step, logic-intensive mathematical problem-solving tasks under memory/compute constrained environments and latency-bound scenarios. Some of the use cases include formal proof generation, symbolic computation, advanced word problems, and a wide range of mathematical reasoning scenarios. These models excel at maintaining context across steps, applying structured logic, and delivering accurate, reliable solutions in domains that require deep analytical thinking.

This model is designed and tested for math reasoning only. It is not specifically designed or evaluated for all downstream purposes. Developers should consider common limitations of language models, as well as performance differences across languages, as they select use cases, and evaluate and mitigate for accuracy, safety, and fairness before using within a specific downstream use case, particularly for high-risk scenarios. Developers should be aware of and adhere to applicable laws or regulations (including but not limited to privacy, trade compliance laws, etc.) that are relevant to their use case. Nothing contained in this Model Card should be interpreted as or deemed a restriction or modification to the license the model is released under.

This release of Phi-4-mini-reasoning addresses user feedback and market demand for a compact reasoning model. It is a compact transformer-based language model optimized for mathematical reasoning, built to deliver high-quality, step-by-step problem solving in environments where compute or latency is constrained. The model is fine-tuned with synthetic math data from a more capable model (much larger, smarter, more accurate, and better at following instructions), which has resulted in enhanced reasoning performance. Phi-4-mini-reasoning balances reasoning ability with efficiency, making it potentially suitable for educational applications, embedded tutoring, and lightweight deployment on edge or mobile systems. If a critical issue is identified with Phi-4-mini-reasoning, it should be promptly reported through the MSRC Researcher Portal or secure@microsoft.com.

Repository: localai · License: mit
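
To illustrate the kind of multi-step math prompt described above, a minimal request through an OpenAI-compatible client could look like the sketch below; the model name matches this gallery entry, while the endpoint, port, and sampling settings are assumptions.

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")  # assumed local LocalAI address

    resp = client.chat.completions.create(
        model="microsoft_phi-4-mini-reasoning",
        messages=[{
            "role": "user",
            "content": (
                "A cyclist rides 24 km at 16 km/h and then 12 km at 24 km/h. "
                "What is the average speed over the whole trip? Show each step."
            ),
        }],
        temperature=0.0,  # keep the worked solution as deterministic as possible
    )
    print(resp.choices[0].message.content)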

microsoft_phi-4-reasoning-plus
Phi-4-reasoning-plus is a state-of-the-art open-weight reasoning model finetuned from Phi-4 using supervised fine-tuning on a dataset of chain-of-thought traces and reinforcement learning. The supervised fine-tuning dataset includes a blend of synthetic prompts and high-quality filtered data from public domain websites, focused on math, science, and coding skills as well as alignment data for safety and Responsible AI. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning. Phi-4-reasoning-plus has additionally been trained with Reinforcement Learning; hence, it has higher accuracy but generates on average 50% more tokens, resulting in higher latency.

Repository: localai · License: mit

microsoft_phi-4-reasoning
Phi-4-reasoning is a state-of-the-art open-weight reasoning model finetuned from Phi-4 using supervised fine-tuning on a dataset of chain-of-thought traces and reinforcement learning. The supervised fine-tuning dataset includes a blend of synthetic prompts and high-quality filtered data from public domain websites, focused on math, science, and coding skills as well as alignment data for safety and Responsible AI. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.

Repository: localai · License: mit

phillama-3.8b-v0.1
Phillama is a model based on Phi-3-mini and trained on the Llama-generated dataset raincandy-u/Dextromethorphan-10k to make it more "llama-like". The model is also converted into Llama format, so it will work with any Llama-2/3 workflow. It aims to generate text with a specific "llama-like" style and is suited for text-generation tasks.

Repository: localai · License: mit