tomasmcm/juanako-7b-una

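Stable version

replicate ✦ Source: fblgit/juanako-7b-UNA ✦ Quant: TheBloke/juanako-7B-UNA-AWQ ✦ juanako uses UNA (Uniform Neural Alignment), a not-yet-published training technique that eases alignment between transformer layers.

Try the model

replicate API

Pricing

344 runs per $1

API documentation

juanako-7b-UNA (Uniform Neural Alignment)

This model is a fine-tuned version of fblgit/juanako-7b-UNA-v2-phase-1 on the HuggingFaceH4/ultrafeedback_binarized dataset. It outperforms most current Mistral-based models in many respects and is the latest and most powerful juanako version to date.

Scores

Official HuggingFace results can be found here: link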

|Model|Average ⬆️|ARC (25-shot) ⬆️|HellaSwag (10-shot) ⬆️|MMLU (5-shot) ⬆️|TruthfulQA (MC) (0-shot) ⬆️|Winogrande (5-shot)|GSM8K (5-shot)|DROP (3-shot)|
|-----|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|
|mistralai/Mistral-7B-v0.1|50.32|59.58|83.31|64.16|42.15|78.37|18.12|6.14|
|Intel/neural-chat-7b-v3-1|59.0|66.21|83.64|62.37|59.65|78.14|19.56|43.84|
|fblgit/juanako-7b-UNA|59.91|68.17|85.34|62.47|65.13|78.85|20.7|38.74|

It scores 59.91 on the HuggingFace LLM Leaderboard and 65.1 with the big-refactor branch of lm-eval-harness.
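The 59.91 figure for fblgit/juanako-7b-UNA is consistent with an unweighted mean of its seven benchmark columns. The sketch below checks that arithmetic. It assumes the average is a plain arithmetic mean, which the table does not state, and `leaderboard_average` is a hypothetical helper name.

```python
# Per-benchmark scores for fblgit/juanako-7b-UNA, copied from the scores table.
juanako_scores = {
    "ARC": 68.17,
    "HellaSwag": 85.34,
    "MMLU": 62.47,
    "TruthfulQA": 65.13,
    "Winogrande": 78.85,
    "GSM8K": 20.7,
    "DROP": 38.74,
}


def leaderboard_average(scores):
    """Unweighted mean of the benchmark scores, rounded to two decimals.

    Assumption: the leaderboard average is a plain arithmetic mean.
    """
    return round(sum(scores.values()) / len(scores), 2)


average = leaderboard_average(juanako_scores)
print(average)  # 59.91
```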

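Author: Xavier M. @fblgit

Model description

juanako uses UNA (Uniform Neural Alignment), a not-yet-published training technique that eases alignment between transformer layers.

Prompts

The following prompts showed positive results. How well they work may depend on the task and needs further experimentation, but they should be a good starting point: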

<|im_start|>system
- You are a helpful assistant chatbot trained by MosaicML.
- You answer questions.
- You are excited to be able to help the user, but will refuse to do anything that could be considered harmful to the user.
- You are more than just an information source, you are also able to write poetry, short stories, and make jokes.<|im_end|>
<|im_start|>user
Explain QKV<|im_end|>
<|im_start|>assistant

### Assistant: I am StableVicuna, a large language model created by CarperAI. I am here to chat!

### Human: Explain QKV
### Assistant:

[Round <|round|>]
问：Explain QKV
答：

[Round <|round|>]
Question：Explain QKV
Answer：

Question：Explain QKV
Answer：
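The first template above follows the `<|im_start|>` / `<|im_end|>` layout. A minimal sketch of assembling such a prompt as a string is shown here. `build_chatml_prompt` is a hypothetical helper, not part of any library, and the exact whitespace the model expects is an assumption taken from the template as printed.

```python
def build_chatml_prompt(system_lines, user_message):
    """Assemble a prompt in the <|im_start|>/<|im_end|> layout shown above.

    system_lines: list of instruction sentences, rendered as "- " bullets.
    user_message: the user's question.
    The result ends with the opening of the assistant turn, so the model
    continues from there.
    """
    system = "\n".join(f"- {line}" for line in system_lines)
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    ["You are a helpful assistant chatbot.", "You answer questions."],
    "Explain QKV",
)
print(prompt)
```

The other templates (`### Human:` / `### Assistant:`, `[Round <|round|>]`, and `Question` / `Answer`) can be built the same way by swapping the delimiters.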

Evaluations (lm-eval big-refactor branch)

TruthfulQA 0-Shot

|    Tasks     |Version|Filter|Metric|Value |   |Stderr|
|--------------|-------|------|------|-----:|---|-----:|
|truthfulqa_mc2|Yaml   |none  |acc   |0.6549|±  |0.0153|

ARC 25-Shot

|    Tasks    |Version|Filter| Metric |Value |   |Stderr|
|-------------|-------|------|--------|-----:|---|-----:|
|arc_challenge|Yaml   |none  |acc     |0.6476|±  |0.0140|
|             |       |none  |acc_norm|0.6809|±  |0.0136|

HellaSwag 10-Shot

|  Tasks  |Version|Filter| Metric |Value |   |Stderr|
|---------|-------|------|--------|-----:|---|-----:|
|hellaswag|Yaml   |none  |acc     |0.6703|±  |0.0047|
|         |       |none  |acc_norm|0.8520|±  |0.0035|

GSM8k 5-Shot

|Tasks|Version|  Filter  |  Metric   |Value |   |Stderr|
|-----|-------|----------|-----------|-----:|---|-----:|
|gsm8k|Yaml   |get-answer|exact_match|0.4898|±  |0.0138|

GPT Evaluations 0-Shot

|    Tasks     |Version|Filter|  Metric  |Value |   |Stderr|
|--------------|-------|------|----------|-----:|---|-----:|
|boolq         |Yaml   |none  |acc       |0.8703|±  |0.0059|
|lambada_openai|Yaml   |none  |perplexity|3.2598|±  |0.0705|
|              |       |none  |acc       |0.7336|±  |0.0062|
|piqa          |Yaml   |none  |acc       |0.8254|±  |0.0089|
|              |       |none  |acc_norm  |0.8292|±  |0.0088|
|sciq          |Yaml   |none  |acc       |0.9580|±  |0.0063|
|              |       |none  |acc_norm  |0.9130|±  |0.0089|

MathQA 0-Shot

|Tasks |Version|Filter| Metric |Value |   |Stderr|
|------|-------|------|--------|-----:|---|-----:|
|mathqa|Yaml   |none  |acc     |0.3752|±  |0.0089|
|      |       |none  |acc_norm|0.3772|±  |0.0089|

PiQa 1-Shot

|Tasks|Version|Filter| Metric |Value |   |Stderr|
|-----|-------|------|--------|-----:|---|-----:|
|piqa |Yaml   |none  |acc     |0.8308|±  |0.0087|
|     |       |none  |acc_norm|0.8357|±  |0.0086|

Winogrande 5-Shot

|  Tasks   |Version|Filter|Metric|Value|   |Stderr|
|----------|-------|------|------|----:|---|-----:|
|winogrande|Yaml   |none  |acc   |0.768|±  |0.0119|
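The Stderr column in these tables can be read as the standard error of an accuracy estimate. The sketch below uses the textbook binomial approximation and turns the reported Winogrande numbers into an approximate 95% interval. The evaluation-set size of 1,267 examples is an assumption that this document does not state, and lm-eval's own estimator may differ slightly from this formula.

```python
import math


def binomial_stderr(accuracy, n_examples):
    """Textbook standard error of a proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_examples)


def normal_interval(value, stderr, z=1.96):
    """Approximate 95% confidence interval under a normal approximation."""
    return value - z * stderr, value + z * stderr


WINOGRANDE_N = 1267  # ASSUMPTION: evaluation-set size, not given in the table

estimated_stderr = binomial_stderr(0.768, WINOGRANDE_N)
low, high = normal_interval(0.768, 0.0119)  # values as reported in the table

print(round(estimated_stderr, 4))
print(round(low, 4), round(high, 4))
```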

PubMedQA 0-Shot

| Tasks  |Version|Filter|Metric|Value|   |Stderr|
|--------|-------|------|------|----:|---|-----:|
|pubmedqa|Yaml   |none  |acc   | 0.76|±  |0.0191|

RACE 1-Shot

|Tasks|Version|Filter|Metric|Value |   |Stderr|
|-----|-------|------|------|-----:|---|-----:|
|race |Yaml   |none  |acc   |0.5282|±  |0.0154|

MMLU 5-Shot (8-bit)

|      Groups      |Version|Filter|Metric|Value |   |Stderr|
|------------------|-------|------|------|-----:|---|-----:|
|mmlu              |N/A    |none  |acc   |0.6137|±  |0.1243|
| - humanities     |N/A    |none  |acc   |0.5671|±  |0.1101|
| - other          |N/A    |none  |acc   |0.6859|±  |0.1164|
| - social_sciences|N/A    |none  |acc   |0.7195|±  |0.0713|
| - stem           |N/A    |none  |acc   |0.5087|±  |0.1297|

DROP 3-Shot (8-bit) (Instruct-Eval)

{'score': 0.49801113762927607}
{'drop': 49.8}
drop: 49.8

CRASS 0-Shot (Instruct-Eval)

{'score': 0.8357664233576643}
{'crass': 83.58}
crass: 83.58
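The three lines printed for DROP and for CRASS show the same result in different forms: a raw score in [0, 1] and that score as a percentage. A small sketch of the conversion follows. It assumes the percentage is the raw score times 100, rounded to two decimals, and `to_percent` is a hypothetical helper name.

```python
def to_percent(score, digits=2):
    """Convert a raw score in [0, 1] to a percentage rounded to `digits`."""
    return round(score * 100, digits)


drop = to_percent(0.49801113762927607)
crass = to_percent(0.8357664233576643)

print({"drop": drop})    # {'drop': 49.8}
print({"crass": crass})  # {'crass': 83.58}
```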

Training details

Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 14
- gradient_accumulation_steps: 16
- total_train_batch_size: 224
- total_eval_batch_size: 14
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 1
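The total batch sizes follow from the per-device values: 1 × 14 devices × 16 accumulation steps = 224 for training, and 1 × 14 = 14 for evaluation. The sketch below reproduces that arithmetic and illustrates the usual shape of a linear schedule with warmup. Both function names are hypothetical, the schedule formula is the common linear warmup and decay rather than a confirmed copy of the training code, and the `total_steps` value in the usage lines is purely illustrative.

```python
import math


def effective_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Samples contributing to one optimizer step (or one eval step)."""
    return per_device * num_devices * grad_accum_steps


def linear_lr(step, total_steps, base_lr=1e-4, warmup_ratio=0.01):
    """Linear warmup to base_lr, then linear decay to zero.

    Assumption: warmup length is ceil(total_steps * warmup_ratio).
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_train = effective_batch_size(1, 14, 16)
total_eval = effective_batch_size(1, 14)
print(total_train, total_eval)  # 224 14

# Illustrative only: 1000 is a hypothetical step count, not the real run length.
print(linear_lr(5, 1000), linear_lr(10, 1000), linear_lr(1000, 1000))
```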

Training results

|Training Loss|Epoch|Step|Validation Loss|Rewards/chosen|Rewards/rejected|Rewards/accuracies|Rewards/margins|Logps/rejected|Logps/chosen|Logits/rejected|Logits/chosen|
|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|
|0.4795|0.2|56|0.4958|-1.3684|-2.6385|0.7552|1.2701|-265.3887|-241.2612|-2.2572|-2.4922|
|0.4642|0.4|112|0.4859|-1.0380|-1.9769|0.7273|0.9389|-258.7718|-237.9569|-2.2414|-2.4751|
|0.4758|0.61|168|0.4808|-1.2594|-2.3704|0.7343|1.1110|-262.7074|-240.1708|-2.2305|-2.4633|
|0.4549|0.81|224|0.4768|-1.1906|-2.3201|0.7552|1.1295|-262.2044|-239.4827|-2.2284|-2.4610|
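In the training results, the reward margin at each checkpoint equals the reward for the chosen response minus the reward for the rejected one, which is the usual definition in preference-optimization training. The sketch below checks that relationship against the four logged rows. That the margin column was computed this way is an assumption, though the numbers agree to four decimals, and `reward_margin` is a hypothetical helper.

```python
# (step, reward for chosen, reward for rejected, reported margin),
# copied from the training results.
rows = [
    (56, -1.3684, -2.6385, 1.2701),
    (112, -1.0380, -1.9769, 0.9389),
    (168, -1.2594, -2.3704, 1.1110),
    (224, -1.1906, -2.3201, 1.1295),
]


def reward_margin(chosen, rejected):
    """Margin by which the chosen response is preferred over the rejected one."""
    return chosen - rejected


margins = {step: round(reward_margin(c, r), 4) for step, c, r, _ in rows}

for step, _, _, reported in rows:
    print(step, margins[step], reported)
```

A positive margin means the model assigns a higher implicit reward to the chosen response than to the rejected one.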

Framework versions

Transformers 4.35.0-UNA
Pytorch 2.1.0
Datasets 2.14.6
Tokenizers 0.14.1

Citations

If you find juanako useful, please cite it:

@misc{juanako7buna,
  title={Juanako: Uniform Neural Alignment}, 
  author={Xavier Murias},
  year={2023},
  publisher = {HuggingFace},
  journal = {HuggingFace repository},
  howpublished = {\url{https://huggingface.co/fblgit/juanako-7b-UNA}},
}

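Thanks to all the brilliant people behind the creation of AI. Listed below are some of the works we find relevant to our research. If you feel a citation is missing, please contact us.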

@misc{lin2021truthfulqa,
  title={TruthfulQA: Measuring How Models Mimic Human Falsehoods},
  author={Stephanie Lin and Jacob Hilton and Owain Evans},
  year={2021},
  eprint={2109.07958},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
@misc{tunstall2023zephyr,
      title={Zephyr: Direct Distillation of LM Alignment}, 
      author={Lewis Tunstall and Edward Beeching and Nathan Lambert and Nazneen Rajani and Kashif Rasul and Younes Belkada and Shengyi Huang and Leandro von Werra and Clémentine Fourrier and Nathan Habib and Nathan Sarrazin and Omar Sanseviero and Alexander M. Rush and Thomas Wolf},
      year={2023},
      eprint={2310.16944},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
@inproceedings{Bisk2020,
  author = {Yonatan Bisk and Rowan Zellers and
            Ronan Le Bras and Jianfeng Gao
            and Yejin Choi},
  title = {PIQA: Reasoning about Physical Commonsense in
           Natural Language},
  booktitle = {Thirty-Fourth AAAI Conference on
               Artificial Intelligence},
  year = {2020},
}
@software{eval-harness,
  author       = {Gao, Leo and
                  Tow, Jonathan and
                  Biderman, Stella and
                  Black, Sid and
                  DiPofi, Anthony and
                  Foster, Charles and
                  Golding, Laurence and
                  Hsu, Jeffrey and
                  McDonell, Kyle and
                  Muennighoff, Niklas and
                  Phang, Jason and
                  Reynolds, Laria and
                  Tang, Eric and
                  Thite, Anish and
                  Wang, Ben and
                  Wang, Kevin and
                  Zou, Andy},
  title        = {A framework for few-shot language model evaluation},
  month        = sep,
  year         = 2021,
  publisher    = {Zenodo},
  version      = {v0.0.1},
  doi          = {10.5281/zenodo.5371628},
  url          = {https://doi.org/10.5281/zenodo.5371628}
}
@misc{rafailov2023direct,
    title={Direct Preference Optimization: Your Language Model is Secretly a Reward Model}, 
    author={Rafael Rafailov and Archit Sharma and Eric Mitchell and Stefano Ermon and Christopher D. Manning and Chelsea Finn},
    year={2023},
    eprint={2305.18290},
    archivePrefix={arXiv},
}

Usage statistics

Total calls: 39

Average response time: 1.2s