📃 NLP APIs

The NLP APIs let you add data for analysis on the LLM dashboard.

Prerequisites

  • Create Project.

  • Get the project_id and API key.

Initialisation

Initialise the CensiusClient object.

from censius.nlp import CensiusClient

API_KEY = "<API KEY>"
PROJECT_ID = <PROJECT ID>  # must be an integer

client = CensiusClient(api_key=API_KEY, project_id=PROJECT_ID)

Introducing Types

  • DatasetType

    • 🔹DatasetType.TEXT: Multi-line text, which may contain special characters and delimiters (, . & / \ * @ and so on)

  • ModelType

    • 🔹ModelType.NLP

    • 🔹ModelType.LLM

  • UseCase

    • NLP

      • 🔹UseCase.NLP.SUMMARIZATION

      • 🔹UseCase.NLP.SENTIMENT_CLASSIFICATION

      • 🔹UseCase.NLP.INTENT_CLASSIFICATION

      • 🔹UseCase.NLP.Q_AND_A

      • 🔹UseCase.NLP.TOXICITY_DETECTION

      • 🔹UseCase.NLP.INFORMATION_RETRIEVAL

      • 🔹UseCase.NLP.LANGUAGE_TRANSLATION

    • LLM

      • 🔸UseCase.LLM.SUMMARIZATION

      • 🔸UseCase.LLM.SENTIMENT_CLASSIFICATION

      • 🔸UseCase.LLM.INTENT_CLASSIFICATION

      • 🔸UseCase.LLM.Q_AND_A

      • 🔸UseCase.LLM.TOXICITY_DETECTION

      • 🔸UseCase.LLM.INFORMATION_RETRIEVAL

      • 🔸UseCase.LLM.LANGUAGE_TRANSLATION

      • 🔸UseCase.LLM.REASONING

      • 🔸UseCase.LLM.VARY_PROMPTING

      • 🔸UseCase.LLM.VARY_STRATEGY

      • 🔸UseCase.LLM.CALIBRATION

      • 🔸UseCase.LLM.HARM_EVALUATION

Example Dataset

A simple dataset of news articles, each with a reference summary and a summary generated by the T5 base model.

UseCase: Summarisation
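As an illustration of what such a dataset could look like on disk, the sketch below writes a two-row CSV using only the standard library. The column names (`article`, `reference_summary`, `generated_summary`), the file name, and the sample rows are all assumptions made up for this example; the format the platform requires is the one provided with your project.

```python
import csv
import os
import tempfile

# Hypothetical column names; match them to the format provided for your project.
FIELDNAMES = ["article", "reference_summary", "generated_summary"]

# Made-up sample rows, for illustration only.
rows = [
    {
        "article": "The city council approved the new transit budget on Monday.",
        "reference_summary": "Council approves transit budget.",
        "generated_summary": "The council approved the transit budget.",
    },
    {
        "article": "Heavy rain closed several roads in the region overnight.",
        "reference_summary": "Rain closes regional roads.",
        "generated_summary": "Several roads were closed because of rain.",
    },
]


def write_example_dataset(path, rows):
    """Write summarisation examples to a CSV file and return the row count."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


csv_path = os.path.join(tempfile.gettempdir(), "summarisation_example.csv")
row_count = write_example_dataset(csv_path, rows)
```

The resulting `csv_path` is the kind of value that would be passed as the `file` argument when registering the dataset.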

The following APIs are supported, in the order they are typically called:

  • Register Dataset: register_dataset() - Register a training dataset with the Censius Platform.

| Argument | Type | Description | Presence |
| --- | --- | --- | --- |
| name | Text | A name for reference. | Required |
| file | CSV path | Path to the training dataset CSV file. The CSV must be in the provided format. | Required |
| dataset_type | DatasetType.TEXT | Currently only DatasetType.TEXT is supported. Support for DatasetType.Vector is planned for a later stage. | Required |
| use_case | enum | UseCase.LLM.* or UseCase.NLP.* (see Introducing Types). | Required |

➡️ ROUGE score is calculated from generated and reference summaries. Therefore, both summaries must be provided by the user.
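To make concrete why both summaries are needed, here is a minimal sketch of ROUGE-1 (unigram overlap) in plain Python. It is a simplified illustration with whitespace tokenisation and no stemming, not the implementation used by the Censius platform; the function name `rouge_1` and the sample sentences are assumptions for this example.

```python
from collections import Counter


def rouge_1(generated, reference):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    if not gen_tokens or not ref_tokens:
        return 0.0, 0.0, 0.0
    # Clipped overlap: a word counts at most as often as it occurs in both texts.
    overlap = sum((Counter(gen_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(gen_tokens)  # share of generated words that match
    recall = overlap / len(ref_tokens)     # share of reference words recovered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = rouge_1(
    generated="the council approved the budget",
    reference="the city council approved the transit budget",
)
```

Without the reference summary there is nothing to measure overlap against, so no score can be computed for that row.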

Register Model - register_model()

  • This API allows the user to register a new model to the Censius platform.

| Argument | Type | Description | Presence |
| --- | --- | --- | --- |
| model_name | string | The name of the model. | Required |
| model_type | enum | ModelType.NLP or ModelType.LLM, whichever applies. | Required |
| use_case | enum | UseCase.LLM.* or UseCase.NLP.* (see Introducing Types). | Required |
| dataset_id | integer | ID of the dataset the model was trained on. | Required |
| parent_model_id | integer | ID of the model being updated (used for versioning). | Optional |
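The arguments above can be assembled as in the following sketch, which shows how the optional `parent_model_id` distinguishes a first registration from a new version. The helper `build_register_model_args`, the model name, and the IDs `101` and `7` are hypothetical; plain strings stand in for the enum values so the example runs without the censius package installed.

```python
def build_register_model_args(model_name, model_type, use_case, dataset_id,
                              parent_model_id=None):
    """Collect the documented register_model() arguments into a dict."""
    if not isinstance(dataset_id, int):
        raise TypeError("dataset_id must be an integer")
    args = {
        "model_name": model_name,
        "model_type": model_type,
        "use_case": use_case,
        "dataset_id": dataset_id,
    }
    # parent_model_id is optional: include it only when registering a new version.
    if parent_model_id is not None:
        if not isinstance(parent_model_id, int):
            raise TypeError("parent_model_id must be an integer")
        args["parent_model_id"] = parent_model_id
    return args


# Strings below are stand-ins for ModelType.LLM and UseCase.LLM.SUMMARIZATION.
first_version = build_register_model_args(
    "t5-base-summariser", "ModelType.LLM", "UseCase.LLM.SUMMARIZATION",
    dataset_id=101,  # hypothetical ID of a previously registered dataset
)
new_version = build_register_model_args(
    "t5-base-summariser", "ModelType.LLM", "UseCase.LLM.SUMMARIZATION",
    dataset_id=101,
    parent_model_id=7,  # hypothetical ID of the model being updated
)

# With a real client and real enum values:
# client.register_model(**first_version)
```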

Log - log your predictions

This function logs individual predictions and features. It can be integrated into the production environment to log the values below as predictions are made.

| Argument | Type | Description | Presence |
| --- | --- | --- | --- |
| log_id | string | The ID of this prediction log. It can be used later to update the actual for this log. | Required |
| model_id | int | The ID of the model. | Required |
| prediction | DatasetType.TEXT | The summary generated by the model. | Required |
| referenced_output | DatasetType.TEXT | The reference summary used to validate the prediction, i.e. the actual. | Required |
| timestamp | integer | Time at which the prediction was generated, in milliseconds. | Required |
| input | DatasetType.TEXT | The input query sent to the model. | Required |
| file | pandas.DataFrame | DataFrame for bulk insertion in a single call. Example provided. | Optional (WIP) |
| confidence_score | float | Model confidence score between 0 and 1. | Optional |
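A single log record with these fields could be prepared as in the sketch below. The helper `build_log_record` is hypothetical, as are the sample values; the sketch also assumes the timestamp is counted in milliseconds since the Unix epoch and that a UUID is an acceptable `log_id`, neither of which is stated above.

```python
import time
import uuid


def build_log_record(model_id, input_text, prediction, referenced_output,
                     confidence_score=None, log_id=None, timestamp=None):
    """Assemble the documented log arguments for one prediction."""
    if confidence_score is not None and not 0.0 <= confidence_score <= 1.0:
        raise ValueError("confidence_score must be between 0 and 1")
    record = {
        # Assumption: any unique string works as a log ID.
        "log_id": log_id if log_id is not None else str(uuid.uuid4()),
        "model_id": model_id,
        "input": input_text,
        "prediction": prediction,
        "referenced_output": referenced_output,
        # Assumption: integer milliseconds since the Unix epoch.
        "timestamp": timestamp if timestamp is not None else int(time.time() * 1000),
    }
    # confidence_score is optional, so it is only included when supplied.
    if confidence_score is not None:
        record["confidence_score"] = confidence_score
    return record


record = build_log_record(
    model_id=7,  # hypothetical model ID
    input_text="The city council approved the new transit budget on Monday.",
    prediction="The council approved the transit budget.",
    referenced_output="Council approves transit budget.",
    confidence_score=0.82,
)
```

Keeping the `log_id` alongside your own prediction records makes it possible to update the actual for that log later.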
