MYSTIC DEPOT: Vendor-Agnostic AI Evaluation Infrastructure
We look forward to your solution. To submit, scroll to the form at the bottom of this page.
As artificial intelligence (AI) capabilities evolve at an extraordinary pace, the government requires evaluation infrastructure that can keep pace by continuously assessing new models against mission-specific benchmarks as they are released.
Further, the success of AI systems in national security contexts will depend on human-machine teaming. Evaluation must assess not only whether AI systems can perform tasks in isolation, but whether human-AI teams achieve better mission outcomes than either humans or AI alone.
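As a purely illustrative sketch of the comparison described above, the check below asks whether a human-AI team beats the stronger of the two standalone conditions. The score scale, the `ConditionScores` type, and the `min_margin` parameter are assumptions made for this example and do not come from the solicitation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionScores:
    """Mean mission-outcome score per condition (hypothetical 0-1 scale)."""
    human_only: float
    ai_only: float
    human_ai_team: float


def teaming_uplift(scores: ConditionScores) -> float:
    """Return the team's margin over the stronger standalone condition."""
    best_alone = max(scores.human_only, scores.ai_only)
    return scores.human_ai_team - best_alone


def team_outperforms(scores: ConditionScores, min_margin: float = 0.0) -> bool:
    """True when the team beats both humans alone and AI alone by more than min_margin."""
    return teaming_uplift(scores) > min_margin
```

Comparing against the better standalone condition, rather than against each baseline separately, keeps the test aligned with the stated goal: a team that only beats the weaker baseline does not count as an improvement. A real evaluation would also need a statistical test over repeated trials, which this sketch omits.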
Evaluation must also keep pace as AI systems evolve from passive models to active agents that use tools, access systems, and execute multi-step tasks. Beyond model outputs, assessment must account for agent behaviors, including whether agents complete complex missions correctly and safely, use tools appropriately, and maintain auditability.
The Department of War (DoW), in partnership with the Office of the Director of National Intelligence (ODNI), seeks an evaluation harness and government-specific benchmarks that together enable rigorous, reproducible, vendor-agnostic assessment of any AI system against government-defined criteria. The Government intends to use this harness across multiple programs. Solutions should be designed for broad applicability rather than single-program optimization. This Area of Interest (AOI) comprises two Lines of Effort (LOE); vendors may respond to one or both. Vendors must state on the title slide or title page of their submission whether it addresses LOE 1, LOE 2, or both. Submission file titles should likewise carry “LOE1_”, “LOE2_”, or “LOE1&2_” as a prefix.
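Vendors who want to check file titles before uploading could use something like the sketch below. Only the three prefixes are taken from the text above; the function name, the return type, and the example filenames are hypothetical.

```python
import re
from typing import Tuple

# The three prefixes quoted in the solicitation text.
_PREFIX_PATTERN = re.compile(r"^(LOE1&2_|LOE1_|LOE2_)")

# Map each prefix to the Lines of Effort it declares.
_PREFIX_TO_LOES = {
    "LOE1_": (1,),
    "LOE2_": (2,),
    "LOE1&2_": (1, 2),
}


def loes_from_filename(filename: str) -> Tuple[int, ...]:
    """Return the LOE numbers declared by the filename prefix.

    Raises ValueError when the filename lacks one of the expected prefixes.
    """
    match = _PREFIX_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"missing LOE prefix: {filename!r}")
    return _PREFIX_TO_LOES[match.group(1)]
```

For example, `loes_from_filename("LOE1&2_solution_brief.pdf")` returns `(1, 2)`, while a title with no prefix raises `ValueError`.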
The Government is interested in considering solutions from a wide selection of vendors. All submissions should clearly explain which of the desired solution attributes they do and do not address, with proven examples of prior deployment, if applicable. The Government will consider partial solutions. Vendors are welcome to apply individually or in partnership. The Government may also request teaming arrangements amongst solution providers. Vendors are expected to demonstrate their solution in an unclassified environment as part of the Commercial Solutions Opening.
This AOI seeks an evaluation harness or harnesses to serve as the integrated infrastructure of an execution environment, tooling, and methodology that connects models to benchmarks and produces structured evaluation data. Harness architecture should enable standardized, reproducible assessment of AI systems against defined criteria by providing the following:
Submissions in response to this LOE should also have the following attributes:
This AOI seeks solutions from vendors that create benchmarks across unclassified, secret, and top secret workflows, and that provide their methodology for government review and adoption. These benchmarks would be executed using the evaluation harness in LOE 1.
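To make the relationship between benchmarks and the LOE 1 harness concrete, here is a minimal sketch of a vendor-agnostic run loop that connects a model to a benchmark and emits structured records. Everything in it is an assumption for illustration: the `ModelFn` interface, the record fields, and exact-match scoring are placeholders, not requirements stated in this AOI.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List

# Hypothetical vendor-agnostic model interface: prompt in, text out.
ModelFn = Callable[[str], str]


@dataclass(frozen=True)
class BenchmarkItem:
    item_id: str
    prompt: str
    expected: str


@dataclass(frozen=True)
class EvalRecord:
    benchmark: str
    model: str
    item_id: str
    output: str
    correct: bool


def run_benchmark(benchmark: str, items: List[BenchmarkItem],
                  model_name: str, model: ModelFn) -> List[EvalRecord]:
    """Run every item through the model and score it by exact match (an assumption)."""
    records = []
    for item in items:
        output = model(item.prompt)
        correct = output.strip() == item.expected
        records.append(EvalRecord(benchmark, model_name, item.item_id, output, correct))
    return records


def accuracy(records: List[EvalRecord]) -> float:
    """Fraction of records marked correct; 0.0 for an empty run."""
    return sum(r.correct for r in records) / len(records) if records else 0.0


def to_jsonl(records: List[EvalRecord]) -> str:
    """Serialize records as JSON Lines with sorted keys for reproducible diffs."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)
```

Because the harness sees the model only as a callable, the same benchmark items can be run against any vendor's system by swapping the adapter, and the JSON Lines output gives a structured, comparable record of each run.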
The benchmarking methodology should address:
Materials should enable government personnel to develop and maintain benchmarks without ongoing vendor support, including but not limited to: written methodology guide, worked examples, common pitfalls, quality assurance checklist, and training curriculum.
This AOI seeks solutions from vendors that are eligible to receive an Other Transaction award in accordance with 10 U.S.C. 4022 and have demonstrated expertise in AI evaluation, security testing, and benchmark creation. Submissions should provide specific, verifiable evidence of the following preferred qualifications as appropriate to the supported LOE:
This Area of Interest is being released in accordance with the Commercial Solutions Opening (CSO) process detailed within HQ0845-20-S-C001 (DIU CSO), posted to SAM.gov on 23 March 2020. This document can be found at: https://sam.gov/opp/c304359f88a0456bab1fa8837a3647f4/view
Any prototype Other Transaction (OT) agreement awarded may result in follow-on production without further competitive procedures. The follow-on may be significantly larger than the prototype OT.
Anticipated follow-on activities include:
Any prototype OT will include: “In accordance with 10 U.S.C. 4022(f), and upon a determination that the prototype project for this transaction has been successfully completed, this competitively awarded prototype OTA may result in the award of a follow-on production contract or transaction without the use of competitive procedures.”
When you submit to a DIU solicitation, we'll ask you to include a solution brief. Here's some guidance about what that entails.
Companies are advised that any Prototype Other Transaction (OT) agreement awarded in response to this solicitation may result in the direct award of a follow-on production contract or agreement without the use of further competitive procedures. Follow-on production activities will result from successful prototype completion.
The follow-on production contract or agreement will be available for use by one or more organizations within the Department of Defense. As a result, the magnitude of the follow-on production contract or agreement could be significantly larger than that of the Prototype OT agreement. All Prototype OT agreements will include the following statement relative to the potential for follow-on production: “In accordance with 10 U.S.C. § 4022(f), and upon a determination that the prototype project for this transaction has successfully been completed, this competitively awarded Prototype OT agreement may result in the award of a follow-on production contract or transaction without the use of competitive procedures.”
If you are having problems uploading your AOI submission to DIU, the cause may be one of several common submission issues; see DIU's solutions to common submission issues.
Need clarification? Having technical issues?
Reach out to our team.
We solicit commercial solutions that address current needs of our DoD partners. (View all open solicitations and challenges.)
You send us a short brief about your solution.
We’ll get back to you within 30 days if we’re interested in learning more through a pitch. If we're not interested, we'll strive to let you know ASAP.
If we think there’s a good match between your solution and our DoD partners, we’ll invite you to provide us with a full proposal. This is the beginning of negotiating all the terms and conditions of a proposed prototype contract.
After a successful prototype, the relationship can continue and even grow, as your company and any interested DoD entity can easily enter into follow-on contracts.