metadata
license: llama3.1
datasets:
  - huggingface/policy-docs
  - mteb/legal_summarization
  - grimulkan/document-editing
  - AleAle2423/Table_of_contents
  - siqideng/proposal_drafter_feedback
  - quarkymatter/PolicyPro_dataset
language:
  - en
tags:
  - legal
  - academic
  - handbook
  - document editor
  - policy
  - chat
  - proposal
library_name: transformers
metrics:
  - bleu
base_model:
  - meta-llama/Meta-Llama-3.1-70B-Instruct
pipeline_tag: text-generation

PolicyPro [ B E T A ]

Model Description

PolicyPro is a factual language model trained on PolicyPro handbook documents and everyday conversation data. It can generate formal, structured policy text, edit existing text, and search and summarize policy information.

**Developed by:** Brandon Cotton and Whitney Osborn

**Model type:** Text generation (causal language model)

**Language(s) (NLP):** English

Uses

To chat with the model via interactive_chat():

import os

from huggingface_hub import InferenceClient

# Read the access token from the environment instead of hard-coding it.
client = InferenceClient(
    "quarkymatter/PolicyPro",
    token=os.environ.get("HF_TOKEN"),
)


def interactive_chat():
    """Send each line of user input to the model and stream the reply."""
    while True:
        user_input = input("You: ")
        # Type "exit" or "quit" to leave the loop.
        if user_input.strip().lower() in {"exit", "quit"}:
            break
        messages = [{"role": "user", "content": user_input}]
        for chunk in client.chat_completion(
            messages=messages,
            max_tokens=500,
            stream=True,
        ):
            # Some streamed chunks carry no text, so skip them.
            if not chunk.choices:
                continue
            content = chunk.choices[0].delta.content
            if content:
                print(content, end="", flush=True)
        print()  # Print a newline after the entire response


if __name__ == "__main__":
    interactive_chat()
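
The loop above sends each input as a standalone message, so the model does not see earlier turns. A minimal sketch of keeping a rolling conversation history is shown here. `ChatHistory` and `max_turns` are hypothetical names introduced for illustration; they are not part of PolicyPro or `huggingface_hub`, and the sketch assumes the endpoint accepts the usual list of `role`/`content` messages.

```python
class ChatHistory:
    """Rolling window of chat messages (hypothetical helper, not a library class)."""

    def __init__(self, max_turns=5):
        # One turn is a user message plus the assistant reply that follows it.
        self._limit = 2 * max_turns
        self._messages = []

    def add(self, role, content):
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        self._messages.append({"role": role, "content": content})

    def as_messages(self):
        # Keep only the most recent messages.
        recent = self._messages[-self._limit:]
        # Drop leading assistant messages so the window starts with a user turn.
        while recent and recent[0]["role"] != "user":
            recent = recent[1:]
        return recent
```

Inside the chat loop, one way to use it is to call `history.add("user", user_input)`, pass `history.as_messages()` as the `messages` argument, collect the streamed text into a string, and then store it with `history.add("assistant", reply)`.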

Direct Use

PolicyPro can be used to:

* Update/modify/edit existing policies
* Get summaries of policies
* Ask questions about specific policies and get answers
* Generate alternative renderings of policy content, such as paraphrases and lists of key concepts

Note: PolicyPro is still under development, and its outputs should never be taken as legal advice.
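
Each of the tasks listed above can be phrased as a single chat message. The sketch below shows one possible way to do that. The template wording and the `build_prompt` helper are illustrative assumptions; they are not taken from PolicyPro's training format, and other phrasings may work better.

```python
# Hypothetical prompt templates, one per task from the list above.
TASK_TEMPLATES = {
    "edit": (
        "Revise the following policy as instructed.\n"
        "Instruction: {instruction}\n\nPolicy:\n{policy}"
    ),
    "summarize": "Summarize the following policy in a few sentences.\n\nPolicy:\n{policy}",
    "question": (
        "Answer the question using only the policy below.\n"
        "Question: {instruction}\n\nPolicy:\n{policy}"
    ),
    "paraphrase": (
        "Paraphrase the following policy and list its key concepts.\n\nPolicy:\n{policy}"
    ),
}


def build_prompt(task, policy, instruction=""):
    """Return a one-message chat payload for the given task."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    if task in ("edit", "question") and not instruction.strip():
        raise ValueError(f"task {task!r} needs an instruction or question")
    content = TASK_TEMPLATES[task].format(
        policy=policy.strip(), instruction=instruction.strip()
    )
    return [{"role": "user", "content": content}]
```

The returned list has the same shape as the `messages` argument used in the chat example, so it can be passed to `client.chat_completion` directly.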

Downstream Use

PolicyPro is planned for integration into a website or chatbot to provide easy access to policy documents and information.

Out-of-Scope Use

PolicyPro is not intended for:

* Generating legal documents without human evaluation
* Providing legal advice
* Creating misleading or false information about university policies

Bias, Risks, and Limitations

Bias:

  • PolicyPro is trained on a dataset of university policy documents, which may reflect institutional biases.
  • The model may not be accurate for all university policies or situations.

Risks:

  • PolicyPro could be used to generate misleading or false information about university policies.
  • Users may rely on PolicyPro's outputs as legal advice, which could lead to negative consequences.

Limitations:

  • PolicyPro is a factual language model and cannot understand the nuances of legal language.
  • The model may not be able to answer all questions about university policies accurately.
  • Accurate document editing is still under development.

Recommendations

  • The model should be continuously monitored and updated to address any biases or inaccuracies.
  • The libraries and datasets used for training should be refined to improve model quality.

How to Get Started with the Model

(chat coming soon)

  1. Install the transformers library and its other dependencies (the example below also needs PyTorch): pip install transformers
  2. Load the PolicyPro model:
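from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "quarkymatter/PolicyPro"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The base model (Llama 3.1) is decoder-only, so load it as a causal LM.
# A 70B-parameter model needs substantial GPU memory to load.
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example usage
prompt = "What is the Dominican College policy on academic leave?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Cap the reply length; without it, generate() may fall back to a short default.
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))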
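
Because the listed base model is an instruction-tuned Llama 3.1 model, prompts are usually wrapped in the tokenizer's chat template before generation. The sketch below assumes that the PolicyPro repository ships a chat template with its tokenizer, which has not been verified here. `generate_reply` is a hypothetical helper name, and `model` and `tokenizer` are the objects loaded in the previous step.

```python
def generate_reply(model, tokenizer, question, max_new_tokens=200):
    """Format a question with the chat template and return only the new text."""
    messages = [{"role": "user", "content": question}]
    # add_generation_prompt=True appends the marker that starts the assistant turn.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Skip the prompt tokens so that only the generated reply is decoded.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)


# Example call, assuming `model` and `tokenizer` are already loaded:
# print(generate_reply(model, tokenizer, "Summarize the academic leave policy."))
```

Slicing off the prompt tokens keeps the printed answer from repeating the question, which the plain `generate` call in the previous step does not do.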

Training Details

Note: Full training details are not publicly available due to client confidentiality.

The model was trained on the following custom datasets:

  • quarkymatter/PolicyPro_dataset (contains policy texts and documents)

Contact

For questions and/or concerns regarding this model, please contact Whitney at [email protected].