
NEW PLUGIN EMOCHATLM

LATEST PLUGIN VERSION: 1.11
Added a getModels function.
Added features for running the LM Studio CLI in the example (read about it there).
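The getModels call presumably wraps LM Studio's OpenAI-compatible /v1/models endpoint; here is a minimal Python sketch of the equivalent raw request (port 1234 is LM Studio's default, and the plugin's own call may differ):

import json
import urllib.request

# List the models the local LM Studio server reports.
# Port 1234 is LM Studio's default; adjust if you changed it.
with urllib.request.urlopen("http://localhost:1234/v1/models") as resp:
    models = json.load(resp)

for model in models.get("data", []):
    print(model["id"])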

You can now make API requests from VisualNeo WIN to your local server running LM Studio.

LM Studio:
Run LLMs on your laptop, entirely offline.
Download any compatible model files from Hugging Face repositories.
With the new LM Studio CLI you can now load/unload models and start/stop the API server.

LM Studio info
LM Studio blog
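For reference, here is what such an API request looks like at the HTTP level, sketched in Python rather than in the plugin's own calls (localhost:1234 is LM Studio's default address, and "local-model" is just a placeholder for whatever model you have loaded):

import json
import urllib.request

# Minimal chat-completion request against LM Studio's
# OpenAI-compatible endpoint on its default port.
payload = {
    "model": "local-model",  # placeholder; LM Studio answers with the loaded model
    "messages": [
        {"role": "user", "content": "Say hello in one sentence."}
    ],
}
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)

print(reply["choices"][0]["message"]["content"])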


Good work, Emmanuel. It works very well. Thanks.

Some information about the Pub interface:

  1. Max Tokens:
    • Definition:
      • Max Tokens sets the maximum length of the text the model generates.
      • By setting a limit, we can prevent the model from producing overly long texts, which could result in context loss or irrelevant responses.
    • Usage:
      • Max Tokens is especially useful when you want to limit the size of the generated outputs.
      • For example, when creating automatic summaries or short messages, setting a low value for Max Tokens ensures concise and relevant text.
    • Value -1:
      • When you set Max Tokens to -1, it means there is no specific limit.
      • The model will generate the complete text regardless of its length.
      • Use this setting carefully, as it may result in very long and possibly irrelevant outputs.
  2. Temperature:
    • Definition:
      • Temperature is a crucial parameter when working with Large Language Models (LLMs).
      • It controls the randomness and creativity of the text the model generates (values range from 0 to 1).
      • Higher Temperature values lead to more diverse and creative output.
      • Conversely, lower Temperature values result in more conservative and deterministic responses.
    • Value 0.2:
      • A Temperature of 0.2 yields more coherent and less random outputs.
      • It's a good choice when you want direct and predictable answers (see the request sketch just below).
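To make the two settings concrete, here is a hedged Python sketch of how they would appear in a request to the local server (assuming the plugin forwards them as the standard OpenAI-style max_tokens and temperature parameters; "local-model" remains a placeholder):

import json
import urllib.request

# Both Pub-interface settings shown in the request body.
payload = {
    "model": "local-model",
    "messages": [
        {"role": "user", "content": "Summarize LM Studio in two sentences."}
    ],
    "max_tokens": 120,    # cap the reply length; -1 would mean no explicit limit
    "temperature": 0.2,   # low value -> coherent, predictable output
}
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])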

You can use the same plugin to connect to Ollama on localhost:11434, and you can even load the models, since the API is fully compatible. https://ollama.com/download/windows

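Since Ollama exposes the same OpenAI-compatible API, only the port changes; a quick sketch (the model name "llama3" is a placeholder for any model you have pulled into Ollama):

import json
import urllib.request

# Same request shape as for LM Studio; only the port (11434) changes,
# and the model must be one you have pulled into Ollama.
payload = {
    "model": "llama3",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])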
