Ruliad AI Releases DeepThought-8B: A New Small Language Model Built on LLaMA-3.1 with Test-Time Compute Scaling that Delivers Transparent Reasoning
Ruliad AI released Deepthought-8B-LLaMA-v0.01-alpha, focusing on reasoning transparency and control. This model, built on LLaMA-3.1 with 8 billion parameters, is designed to offer sophisticated problem-solving capabilities comparable to much larger models while maintaining operational efficiency.
Deepthought-8B distinguishes itself with unique features aimed at making AI reasoning more accessible and understandable. The standout characteristic is its transparent reasoning mechanism, in which every step of the decision-making process is documented. Users can follow the model’s thought process, which is output in a structured JSON format. This step-by-step reasoning builds trust in its outputs and facilitates integration into applications requiring clear, explainable AI logic. Another notable aspect of Deepthought-8B is its programmable reasoning patterns. Unlike many models that require retraining for different tasks, this model allows its reasoning approach to be customized without retraining, making it suitable for applications ranging from coding tasks to complex problem-solving scenarios. In addition, its test-time compute scaling lets it adjust reasoning depth to match task complexity, giving users a versatile tool for a range of challenges.
Deepthought-8B operates efficiently on systems with 16GB or more of VRAM and supports advanced features such as Flash Attention 2 for enhanced performance. Its technical ecosystem is built on widely used frameworks, including Python, PyTorch, and the Transformers library, giving developers broad compatibility and ease of use. Each reasoning chain in the model includes stages such as problem understanding, data gathering, analysis, calculation, verification, conclusion drawing, and implementation. These clearly defined steps enhance the model’s usability and position it as a valuable tool for domains requiring rigorous logical workflows.
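For illustration, a reasoning chain in this JSON format might look like the sketch below, shown as a Python literal to match the usage example later in this post. The field names here are assumptions for demonstration purposes, not Ruliad’s published schema:
# Hypothetical shape of a Deepthought-8B reasoning chain. The keys below
# are illustrative assumptions, not Ruliad's documented output schema.
reasoning_chain = [
    {"step": 1, "type": "problem_understanding",
     "thought": "The user asks whether 391 is prime."},
    {"step": 2, "type": "data_gathering",
     "thought": "Only primes up to sqrt(391) ~= 19.8 need to be checked."},
    {"step": 3, "type": "analysis",
     "thought": "Test divisibility by 2, 3, 5, 7, 11, 13, 17, and 19."},
    {"step": 4, "type": "calculation", "thought": "391 / 17 = 23, so 391 = 17 * 23."},
    {"step": 5, "type": "verification", "thought": "17 * 23 = 391, confirmed."},
    {"step": 6, "type": "conclusion", "thought": "391 is not prime."},
]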
Deepthought-8B also shows strong performance across various benchmarks, handling coding and mathematical tasks effectively. However, it has limitations: complex mathematical reasoning, long-context processing, and edge-case handling are areas where the model still has room for improvement. Acknowledging these limitations reflects Ruliad’s transparency in presenting the model’s capabilities, fostering user trust and encouraging constructive feedback for future iterations. Ruliad has positioned Deepthought-8B as a commercial enterprise solution, with licensing terms supporting this approach. The model is accompanied by comprehensive support options, including social media and email contact, so users can easily access assistance. The documentation for Deepthought-8B includes detailed installation and usage guidelines.
Installation
pip install torch transformers
# Optional: Install Flash Attention 2 for better performance
pip install flash-attn
Usage
1. First, set your Hugging Face token as an environment variable:
export HF_TOKEN=your_token_here
export HF_HUB_ENABLE_HF_TRANSFER=1
2. Use the model in your Python code; a minimal generation sketch continuing this snippet appears after the steps below:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Initialize the model
model_name = "ruliad/deepthought-8b-llama-v0.01-alpha"
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    add_bos_token=False,
    trust_remote_code=True,
    padding_side="left",  # left padding is the usual choice for decoder-only generation
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    attn_implementation="flash_attention_2",  # Use "eager" (or omit) if flash_attn is not installed
    use_cache=True,
    trust_remote_code=True,
)
3. Run the provided example script:
python deepthought_inference.py
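Beyond the example script, generation from Python follows the standard Transformers API. The sketch below continues the loading code from step 2; the prompt and generation settings are illustrative assumptions, not Ruliad’s reference configuration:
# Tokenize an example prompt and move it to the model's device.
prompt = "Explain why the sum of two even numbers is always even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate; leave enough new tokens for the step-by-step JSON reasoning.
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
)

# Decode only the tokens generated after the prompt.
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)
print(response)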
In conclusion, Deepthought-8B, with its 8.03 billion parameters, rivals larger 70B-scale models in reasoning tasks, leveraging advanced features such as JSON-structured outputs and customizable inference paths. Its ability to run on systems with as little as 16GB VRAM ensures accessibility, while test-time compute scaling allows users to tailor performance to task complexity. With over 10,000 downloads in the past month, the model’s adoption underscores its utility and relevance.