IBM InstructLab
Designing a proof of concept for a synthetic data review platform.
Timeline: 2 Months
About IBM InstructLab
InstructLab is an open-source IBM project for training and fine-tuning enterprise-level LLMs with synthetically generated data on IBM's flagship watsonx AI product.
IBM InstructLab partnered with Graphite Group, a student tech consulting organization, to help teams of reviewers efficiently and collaboratively approve or deny sets of synthetic data.
I led 3 designers to determine the product vision, drive user research, and create high-fidelity designs.
Reviewing Synthetically Generated Data: Inconsistent, Tedious, and Manual
InstructLab's synthetic data is currently reviewed only by technical internal IBM teams, but InstructLab aims to expand to a wider, external user base of enterprise and non-technical reviewers.
The current data review process has no established conventions for tracking who is tackling a specific question-and-answer data set, or whether it has already been reviewed; each team determines these details arbitrarily.
The alternative, reviewing data directly in the command-line interface, appealed only to technical users and posed a significant barrier for non-technical reviewers.
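To make that barrier concrete, here is a minimal sketch of what command-line review can look like. It is an illustrative assumption rather than InstructLab's actual tooling: it supposes the generated pairs live in a JSONL file (a hypothetical generated_qna.jsonl) with question and answer fields.

```python
import json

# Hypothetical sketch of a command-line review flow: paging through
# synthetic Q&A pairs one at a time. Assumes a JSONL file where each line
# holds one generated pair with "question" and "answer" keys (illustrative
# field names, not necessarily InstructLab's actual schema).
decisions = {}
with open("generated_qna.jsonl", encoding="utf-8") as f:
    for i, line in enumerate(f, start=1):
        pair = json.loads(line)
        print(f"--- Pair {i} ---")
        print("Q:", pair["question"])
        print("A:", pair["answer"])
        decisions[i] = input("Approve? [y/n/skip] ").strip().lower()

# Decisions live only in this terminal session; nothing records who
# reviewed what, which is exactly the collaboration gap described above.
print(decisions)
```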
Pain points:
There is no existing standard review process or location, making review and collaboration especially difficult.
Users are often unsure about their decisions, leaving some questions partially or completely unreviewed.
Despite the process being inherently collaborative and involving multiple reviewers, users are left on their own to assign and review questions.
Prioritizing Collaboration, Efficiency, and Brevity
While high-fidelity designs cannot be shared, here are a few of our prioritized features addressing the pain points above.
List and Modular Views
We iterated on multiple data-viewing options: the list view lets reviewers work through questions faster and see multiple questions at once for a more holistic overview, while the modular view offers richer functionality for reviewing a single question-and-answer pair in detail.
Collaborative Team Tooling
Collaborative team tooling lets users discuss the data with one another: reviewers can see comments and feedback, and tag or mention teammates for help or another perspective.
Reference Documents
While reviewing questions, reviewers have access to the "source of truth" reference document, letting them gut-check their assessment of the data without relying on external collaboration or resources.
Continued Design Exploration
Stakeholders provided critical feedback, particularly on scaling the solution. Some potential next steps:
Metrics on the number of approved or denied datasets, plus confidence scores, to evaluate overall synthetic data quality.
Examining technical limitations if the design were to extend beyond manual reviews to automated validation mechanisms.
Role-based task assignment so teams can redirect questions to other members.