[BUG]: NemoLLM error with NGC_API_KEY #2153

Open
2 tasks done
aadesoba-nv opened this issue Jan 29, 2025 · 3 comments
Labels
bug Something isn't working external This issue was filed by someone outside of the Morpheus team

Comments


aadesoba-nv commented Jan 29, 2025

Version

25.02

Which installation method(s) does this occur on?

Docker

Describe the bug.

When you set NGC_API_KEY=XYZ, you get a reasonable error message. But if the value differs from the real key by a single character, you get the unhelpful error shown in the log output below. Not sure if it's a Morpheus or a NemoLLM bug.

Minimum reproducible example

the LLM returns a response in an unexpected format, in this case "\n(current age)**0.43\n" rather than "26 ^ 0.43"

Relevant log output

Similar log output:

Exception occurred in pipeline. Rethrowing
Traceback (most recent call last):
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus/pipeline/pipeline.py", line 408, in post_start
    await executor.join_async()
  File "/opt/conda/envs/morpheus/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/opt/conda/envs/morpheus/lib/python3.10/asyncio/tasks.py", line 650, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus_llm/llm/nodes/llm_generate_node.py", line 55, in execute
    results = await self._llm_client.generate_batch_async(inputs, return_exceptions=self._return_exceptions)
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus_llm/llm/services/nemo_llm_service.py", line 190, in generate_batch_async
    results = await asyncio.gather(*futures, return_exceptions=return_exceptions)
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus_llm/llm/services/nemo_llm_service.py", line 153, in _process_one_async
    raise RuntimeError(
RuntimeError: Failed to generate response for prompt 'What is the capital of France?' after 5 attempts. Errors: ['\n401\n', '\n401\n', '\n401\n', '\n401\n', '\n401\n']
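
The final RuntimeError lists one raw error string per attempt, which is hard to read at a glance. As a rough illustration only (this is not Morpheus or NemoLLM code), the per-attempt strings could be collapsed into a short summary. The helper name `summarize_attempt_errors` is hypothetical, and treating a bare number as an HTTP status code is an assumption based on the '\n401\n' values in the log.

```python
from collections import Counter
from http import HTTPStatus


def summarize_attempt_errors(errors: list[str]) -> str:
    """Collapse per-attempt error strings into a short, readable summary.

    Hypothetical helper: assumes a purely numeric error is an HTTP status code.
    """
    # Strip surrounding whitespace and count identical errors.
    counts = Counter(err.strip() for err in errors)
    parts = []
    for text, count in counts.items():
        label = text
        if text.isdigit():
            try:
                status = HTTPStatus(int(text))
                label = f"HTTP {status.value} {status.phrase}"
            except ValueError:
                # Not a known HTTP status code; keep the raw text.
                pass
        parts.append(f"{label} (x{count})")
    return "; ".join(parts)


# The error list from the RuntimeError in the log output.
errors = ['\n401\n'] * 5
print(summarize_attempt_errors(errors))
```

With the list from the log, this reports a single repeated 401 (Unauthorized), which points at the API key rather than at the prompt or the model.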

Full env printout

Click here to see environment details

[Paste the results of print_env.sh here, it will be hidden by default]

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
aadesoba-nv added the bug label Jan 29, 2025
morpheus-bot-test bot added the Needs Triage and external labels Jan 29, 2025
@morpheus-bot-test

Hi @aadesoba-nv!

Thanks for submitting this issue - our team has been notified and we'll get back to you as soon as we can!
In the meantime, feel free to add any relevant information to this issue.

dagardner-nv removed the Needs Triage label Jan 30, 2025
@dagardner-nv
Contributor

We need to determine whether this is a NemoLLM bug or a Morpheus issue by reproducing it outside of Morpheus. If it reproduces there, we should report it to the NemoLLM team.

Contributor

dagardner-nv commented Jan 31, 2025

This appears to be an issue with running nemollm.NemoLLM.generate in async mode; in blocking mode we always get a good response. The error responses are multi-line, but in async mode we only get the last line.

import asyncio
import os

import nemollm

def test(nemo_key: str, model: str, prompt: str):
    """Blocking call: errors surface as exceptions."""
    try:
        con = nemollm.NemoLLM(api_key=nemo_key)
        response = con.generate(model, prompt)
        print(f"Success: {response['text']}")
    except Exception as e:
        print(f"Failure: {e}")

async def test_async(nemo_key: str, model: str, prompt: str):
    """Async call: await the future, then post-process the raw response."""
    con = nemollm.NemoLLM(api_key=nemo_key)
    # wrap_future makes the future returned by generate() awaitable on the event loop
    raw_response = await asyncio.wrap_future(con.generate(model, prompt, return_type="async"))
    response = nemollm.NemoLLM.post_process_generate_response(raw_response, return_text_completion_only=False)
    print(response)

async def main():
    model = "gpt-43b-002"
    prompt = "What is the minimum nvidia driver version needed for CUDA 12.5?"

    print("Test with real key")
    nemo_key = os.environ['NGC_API_KEY']
    await test_async(nemo_key, model, prompt)

    print("\n----------\n")
    print("Test with a bad key")
    await test_async("bad_key", model, prompt)

    print("\n----------\n")
    print("Test with a bad key one character off from a real key")
    await test_async(nemo_key[0:-1] + '5', model, prompt)

if __name__ == '__main__':
    asyncio.run(main())

Output:

Test with real key
{'text': ' The minimum NVIDIA driver version needed for CUDA 12.5 depends on the specific GPU architecture being used. For example', 'cumlogprobs': -7.698154, 'prompt_labels': [{'class_name': 'nontoxic', 'score': 0.98646855}], 'completion_labels': [{'class_name': 'nontoxic', 'score': 0.98950094}]}

----------

Test with a bad key
{'status': 'fail', 'msg': 'http: named cookie not present\n'}

----------

Test with a bad key one character off from a real key
{'status': 'fail', 'msg': '\n401\n'}
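
Since blocking mode reportedly gives a good response, one possible workaround is to run the blocking call in a worker thread and await it from async code. The sketch below shows only that pattern and is not a verified fix. `blocking_generate` is a hypothetical stand-in for a blocking call such as con.generate(model, prompt), its multi-line error text is invented for illustration, and `generate_in_thread` is likewise a made-up name.

```python
import asyncio


def blocking_generate(api_key: str, prompt: str) -> str:
    """Hypothetical stand-in for a blocking client call."""
    if api_key != "good_key":
        # Invented multi-line error, standing in for whatever the real client raises.
        raise RuntimeError("Request failed\nstatus: 401\nreason: invalid API key")
    return f"response to: {prompt}"


async def generate_in_thread(api_key: str, prompt: str) -> str:
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker thread
    # and re-raises any exception, message intact, in the awaiting task.
    return await asyncio.to_thread(blocking_generate, api_key, prompt)


async def demo() -> list:
    # return_exceptions=True collects failures as results instead of raising.
    return await asyncio.gather(
        generate_in_thread("good_key", "hello"),
        generate_in_thread("bad_key", "hello"),
        return_exceptions=True,
    )


results = asyncio.run(demo())
for result in results:
    print(repr(result))
```

The trade-off is one worker thread per in-flight request, so this is better suited to a quick comparison against the async path than to high request volumes.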

Projects
Status: Todo
Development

No branches or pull requests

2 participants