Active Training

Verified benchmarks.

We believe in full transparency. Official scores on industry-standard benchmarks will be published here in full.

STATUS: UNDER EVALUATION

Vilcus-1 Scores Pending

Vilcus-1 is currently in active training and parameter tuning. To keep this page accurate and avoid publishing speculative numbers, **we have removed all preliminary scores**.

Once training is finalized and verified by independent alignment teams, complete comparison tables (including performance graphs against other models) will be posted here.

Training Run Active

Our computing clusters are currently processing the final alignment layers. Benchmark results will be published automatically once the model is complete.

Evaluation Methodology

Coding Capabilities (HumanEval)

Evaluates whether Python functions synthesized from docstrings are functionally correct, as checked by the benchmark's unit tests. Vilcus-1 will be benchmarked on complex, multi-step programming problems.
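
HumanEval results are commonly reported as pass@k: the probability that at least one of k sampled solutions passes the unit tests. The sketch below shows the standard unbiased estimator for a single problem. Whether Vilcus-1 scores will be reported with this metric is an assumption, and the function name `pass_at_k` is illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of solutions sampled for the problem
    c: number of those solutions that passed every unit test
    k: sample budget being scored (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # Fewer than k failing samples: every draw of k contains a passing one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are failures.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A benchmark-level score would then be the mean of this value over all problems.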

Pending Training Completion

Mathematical Logic (GSM8K)

Standardized grade-school math word problems that require several steps of coherent reasoning before a final answer is given.
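
GSM8K reference solutions end with a line of the form `#### <number>`, so a common way to score a response is to compare the last number it contains against that reference value. The following is a minimal sketch of that exact-match check; the helper names are hypothetical, and the actual Vilcus-1 scoring procedure is not described on this page.

```python
import re
from typing import Optional

# Matches integers and decimals, allowing thousands separators such as 1,234.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[str]:
    """Return the last number found in text, with commas removed."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def is_correct(model_output: str, reference_solution: str) -> bool:
    """Compare the final number in the model output with the reference."""
    predicted = extract_final_number(model_output)
    expected = extract_final_number(reference_solution)
    if predicted is None or expected is None:
        return False
    return float(predicted) == float(expected)
```

Taking the last number is a heuristic: it assumes the response states its final answer after all intermediate calculations.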

Pending Training Completion

Scientific Reasoning (GPQA)

Graduate-level questions in biology, physics, and chemistry, written by domain experts to stress-test depth of reasoning.
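
GPQA questions are multiple choice, so a score is typically plain accuracy over the selected options. The sketch below assumes predictions and answer keys are given as option letters; the function name and input format are illustrative assumptions, not a published Vilcus-1 harness.

```python
from typing import Sequence


def multiple_choice_accuracy(
    predictions: Sequence[str], answer_key: Sequence[str]
) -> float:
    """Fraction of questions where the predicted letter matches the key."""
    if len(predictions) != len(answer_key):
        raise ValueError("predictions and answer_key must have equal length")
    if not answer_key:
        raise ValueError("at least one question is required")
    # Compare case-insensitively and ignore surrounding whitespace.
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answer_key)
    )
    return correct / len(answer_key)
```

With four options per question, random guessing lands near 25 percent, which is a useful baseline when reading reported accuracy.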

Pending Training Completion

Want early access?

Get notified when official model cards and verified benchmark results are posted.

Join the Waitlist