Add Formatron framework #5

adrianeboyd · 2024-09-21T13:14:50Z

Summary by Sourcery

Add the FormatronFramework to the project, enabling new tasks like multilabel classification and synthetic data generation with specific model configurations. Update the configuration file to include settings for the new framework.

New Features:

Introduce the FormatronFramework to support tasks such as multilabel classification and synthetic data generation using the 'unsloth/llama-3-8b-Instruct-bnb-4bit' model.

Enhancements:

Add configuration for the FormatronFramework in the config.yaml file, specifying tasks, model details, and parameters.

sourcery-ai · 2024-09-21T13:14:55Z

Reviewer's Guide by Sourcery

This pull request introduces the Formatron framework, a new machine learning framework for various NLP tasks. The changes include adding configuration for the Formatron framework in the config.yaml file and implementing the FormatronFramework class in a new file.

File-Level Changes

Change	Details	Files
Added configuration for the Formatron framework	Configured tasks for multilabel classification and synthetic data generation; set up parameters such as n_runs, prompt, LLM model, and other initialization arguments; commented out the configuration for the NER task	`config.yaml`
Implemented FormatronFramework class	Created a new class that inherits from BaseFramework; implemented the initialization method with model loading and configuration; added support for different tasks (multilabel classification and others); implemented the run method for executing experiments; integrated with the Formatron library for formatting and processing	`frameworks/formatron_framework.py`
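
For readers without the diff open, the class shape described in the table can be sketched roughly as follows. This is only an illustration: the `BaseFramework` shown here is a hypothetical stand-in rather than the project's real base class, the constructor arguments are guessed from the parameter names in the table (`n_runs`, `prompt`, LLM model), and the `generate` callable is an assumed placeholder for the step where the real class would load the model and build a Formatron-constrained generator.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Optional


class BaseFramework(ABC):
    """Hypothetical stand-in for the project's base class (not the real one)."""

    def __init__(self, task: str, prompt: str, llm_model: str, **init_kwargs: Any) -> None:
        self.task = task
        self.prompt = prompt
        self.llm_model = llm_model
        self.init_kwargs = init_kwargs

    @abstractmethod
    def run(self, n_runs: int, inputs: Dict[str, Any]) -> List[Optional[str]]:
        """Execute the task n_runs times and return one result per run."""


class FormatronFramework(BaseFramework):
    def __init__(
        self,
        *args: Any,
        generate: Optional[Callable[[str], str]] = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(*args, **kwargs)
        # Assumption: the real class would load the model and set up
        # Formatron here. A plain callable is injected instead so the
        # sketch runs without a GPU or the library installed.
        self._generate = generate or (lambda prompt: "")

    def run(self, n_runs: int, inputs: Dict[str, Any]) -> List[Optional[str]]:
        results: List[Optional[str]] = []
        for _ in range(n_runs):
            try:
                results.append(self._generate(self.prompt.format(**inputs)))
            except Exception:
                # A failed run is recorded as None so it can be counted later.
                results.append(None)
        return results
```

Injecting the generator keeps the experiment loop testable on its own; whether the actual implementation separates these concerns the same way is not visible from the summary.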

sourcery-ai

Hey @adrianeboyd - I've reviewed your changes - here's some feedback:

Overall Comments:

Consider removing the commented-out code for the 'ner_required_fields' task if it's not being used. This will improve code cleanliness and readability.
The fallback to a regex-based approach for multilabel classification due to issues with Formatron might be worth investigating further. Consider looking into why Formatron isn't handling this case well and potentially contributing a fix upstream.
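
To make the regex fallback concrete, here is a minimal sketch of what such a constraint could look like. The pattern used in the PR is not shown in this conversation, so the output shape (a JSON-style list of quoted labels), the helper names, and the example labels are all assumptions made for illustration.

```python
import re
from typing import List, Optional


def build_multilabel_pattern(labels: List[str]) -> str:
    """Build a regex that accepts a non-empty list of known, quoted labels."""
    alternatives = "|".join(re.escape(label) for label in labels)
    item = f'"(?:{alternatives})"'
    # Matches e.g. ["a"] or ["a", "b"]; it does not forbid repeated labels.
    return rf"\[{item}(?:, {item})*\]"


def parse_labels(output: str, labels: List[str]) -> Optional[List[str]]:
    """Return the labels in output, or None if output violates the pattern."""
    if re.fullmatch(build_multilabel_pattern(labels), output) is None:
        return None
    # Assumes labels contain no double quotes.
    return re.findall(r'"([^"]*)"', output)


# Hypothetical label set, not taken from the benchmark.
LABELS = ["billing", "tech_support"]
print(parse_labels('["billing", "tech_support"]', LABELS))
print(parse_labels('["unknown"]', LABELS))
```

A pattern like this can be handed to a constrained decoder or used only to validate output after generation; which of the two the PR does is not stated here.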

Here's what I looked at during the review

🟢 General issues: all looks good
🟢 Security: all looks good
🟢 Testing: all looks good
🟢 Complexity: all looks good
🟢 Documentation: all looks good

adrianeboyd · 2024-09-21T13:18:59Z

Some example results (1 run instead of 10, on an RTX A5000):

multilabel classification

           Reliability
Outlines          1.00
Formatron         0.99

           Latency_p95(s)
Outlines            1.804
Formatron          13.710

ner required fields

                  Reliability
Outlines                 1.00
Formatron                0.99
LMFormatEnforcer         0.98

                  Latency_p95(s)
Formatron                 16.950
Outlines                  31.033
LMFormatEnforcer          45.598

          framework  micro_precision  micro_recall  micro_f1
0          Outlines         0.656250      0.546243  0.596215
1         Formatron         0.762590      0.614493  0.680578
2  LMFormatEnforcer         0.648464      0.562130  0.602219
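
The `micro_f1` column above is consistent with the harmonic mean of the `micro_precision` and `micro_recall` columns in each row. For reference, micro-averaged scores can be computed along these lines; how the benchmark decides that a predicted entity matches a gold entity is not stated in this thread, so the exact set comparison below is an assumption.

```python
from typing import Iterable, Set, Tuple


def micro_scores(
    gold: Iterable[Set[str]], predicted: Iterable[Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over paired label sets."""
    tp = fp = fn = 0
    for gold_items, pred_items in zip(gold, predicted):
        tp += len(gold_items & pred_items)   # predicted and correct
        fp += len(pred_items - gold_items)   # predicted but not in gold
        fn += len(gold_items - pred_items)   # in gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

Counts are pooled across all examples before dividing, which is what distinguishes micro averaging from averaging per-example scores.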

synthetic data generation

           Reliability
Formatron          1.0

           Latency_p95(s)
Formatron           4.761

           Variety
Formatron      0.6
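
For anyone reading the numbers without the benchmark code at hand, the two metrics reported for every task could be computed roughly as below. This is a sketch under assumptions: reliability is taken to be the fraction of runs that produced a valid result, and the 95th percentile uses the nearest-rank method, whereas the benchmark may interpolate instead.

```python
import math
from typing import Sequence


def reliability(successes: Sequence[bool]) -> float:
    """Fraction of runs that produced a valid, parseable result."""
    return sum(successes) / len(successes) if successes else 0.0


def latency_p95(latencies_s: Sequence[float]) -> float:
    """95th percentile latency in seconds, nearest-rank method."""
    if not latencies_s:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies_s)
    rank = math.ceil(len(ordered) * 95 / 100)  # 1-based rank
    return ordered[rank - 1]
```

With a single run per framework, as in the results above, p95 is dominated by the slowest few requests, so small differences between frameworks should be read with caution.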

Add Formatron framework

fc1cec4

sourcery-ai bot reviewed Sep 21, 2024

adrianeboyd added 2 commits September 22, 2024 08:12

Fix import

dba4a23

Merge branch 'main' into add-formatron

63eda34
