Llama 2 Chat implementation #221
base: master
Conversation
I'm not having a lot of luck with this.
This is with TheBloke/Llama-2-13B-chat-GPTQ, which is otherwise coherent enough.
Huh, that's super weird; in my testing it worked flawlessly with Llama 2 7B Chat. It must have to do with the sequence_actual splitting; I will investigate with your prompt. Did you supply bn and un? Is this the 13B chat model, and does it work for anything other than "Hi!"?
I only supplied the model, and it seems to fail with any prompt.
That's extremely odd. This is my response (it also works with 7B chat for me):
System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
User: Write me a NodeJS script that tells the time in human readable format.
Here's what's happening here:
When you run this script, it should output the current time in the format "hh:mm AM" (or PM, depending on the time of day). For example:
Note that this script uses the
@turboderp I am quite baffled by your output; it doesn't make any sense to me, and I am not able to replicate your issue. Could you please print: I just personally printed them like this: Thank you very much.
The only other thing I could imagine causing trouble is the version of SentencePiece and torch.
Hi, I tested with the 13B model and everything seems OK. Maybe it is related to some hyperparameters; I have changed some of them.
The � characters (U+FFFD) still hint at a problem of some sort. The 13B chat model I tried works fine when just sampled. I'll have to dig into it a little more, maybe with a 7B model?
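One plausible source of those characters, sketched here as a hypothetical illustration rather than anything taken from this PR: if generated bytes are decoded token by token, a multi-byte UTF-8 character can be split across two decode calls, and decoding with errors="replace" then emits U+FFFD:

```python
# Hypothetical illustration (not code from this PR): splitting a multi-byte
# UTF-8 sequence mid-character produces the U+FFFD replacement character.
data = "héllo".encode("utf-8")                     # b'h\xc3\xa9llo' -- 'é' is two bytes
print(data[:2].decode("utf-8", errors="replace"))  # 'h�' -- 'é' was cut in half
print(data.decode("utf-8"))                        # 'héllo' -- whole-buffer decode is fine
```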
I have to add that I sometimes got those (or single ones) with your normal example chatbot too, with both the 7B chat and 13B chat models; I don't know where they come from. I think tomorrow I will be able to fix at least the sequence slicing problem.
Hey!
This is the correct Llama 2 Chat prompt formatting, implemented in example_llama2chat.py. This PR builds on #195 to copy the exact implementation from the original Llama repo.
The format for Llama 2 looks like this:
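Sketching the canonical template as documented in Meta's llama repo (assumed here to match what this PR implements):

```
<s>[INST] <<SYS>>
{system_prompt}
<</SYS>>

{user_message_1} [/INST] {model_reply_1} </s><s>[INST] {user_message_2} [/INST]
```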
To run it yourself:
python example_llama2chat.py -d ./your/model/path
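To also set the user and bot names asked about above (assuming the -un/-bn flags behave as in exllama's example_chatbot.py; the names here are just examples):
python example_llama2chat.py -d ./your/model/path -un "User" -bn "Assistant"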
Have fun!