DavidBeans: Unified Vision-to-Crystal Architecture
DavidBeans combines ViT-Beans (Cantor-routed sparse attention) with David (multi-scale crystal classification) into a unified geometric deep learning architecture.
Model Description
This model combines four techniques:
- Hybrid Cantor Routing: Combines fractal Cantor set distances with positional proximity for sparse attention patterns
- Pentachoron Experts: 5-vertex simplex structure with Cayley-Menger geometric regularization
- Multi-Scale Crystal Projection: Projects features to multiple representation scales with learned fusion
- Cross-Contrastive Learning: Aligns patch-level features with crystal anchors
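The Cayley-Menger determinant named in the pentachoron bullet gives the volume of a simplex from its pairwise squared distances alone. Below is a minimal NumPy sketch of that computation for a 5-vertex simplex. How DavidBeans turns this volume into a regularization term is not described in this card, so the function name `cayley_menger_volume` and the clamping of slightly negative values are assumptions made for illustration.

```python
import math

import numpy as np


def cayley_menger_volume(vertices: np.ndarray) -> float:
    """Volume of the simplex spanned by `vertices` (shape [n + 1, d]).

    Hypothetical helper: illustrates the Cayley-Menger determinant only,
    not the regularizer used inside DavidBeans.
    """
    n = vertices.shape[0] - 1  # simplex dimension (4 for a pentachoron)

    # Pairwise squared Euclidean distances between vertices.
    diff = vertices[:, None, :] - vertices[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)

    # Bordered matrix: zero corner, ones on the border, squared distances inside.
    cm = np.ones((n + 2, n + 2))
    cm[0, 0] = 0.0
    cm[1:, 1:] = sq_dist

    # V^2 = (-1)^(n+1) / (2^n * (n!)^2) * det(CM)
    coeff = (-1) ** (n + 1) / (2 ** n * math.factorial(n) ** 2)
    vol_sq = coeff * np.linalg.det(cm)

    # Clamp round-off negatives that appear for (near-)degenerate simplices.
    return math.sqrt(max(vol_sq, 0.0))
```

As a check, the five standard basis vectors of R^5 (`np.eye(5)`) form a regular pentachoron with edge length √2, whose volume is √5/24 ≈ 0.0932. A volume close to zero indicates that the five vertices have collapsed into a lower-dimensional subspace.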
Architecture
Image [B, 3, 32, 32]
        │
        ▼
┌─────────────────────────────────────────┐
│ BEANS BACKBONE                          │
│ ├─ Patch Embed → [64 patches, 512d]
│ ├─ Hybrid Cantor Router (α=0.3)
│ ├─ 4 × Attention Blocks (16 heads)
│ └─ 4 × Pentachoron Expert Layers
└─────────────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────────────┐
│ DAVID HEAD                              │
│ ├─ Multi-scale projection: [256, 384, 512, 640, 768]
│ ├─ Per-scale Crystal Heads
│ └─ Geometric Fusion (learned weights)
└─────────────────────────────────────────┘
        │
        ▼
  [100 classes]
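To make the router stage concrete, here is a rough NumPy sketch of how a hybrid sparse-attention mask could be built for the 64-patch grid. It is not the DavidBeans implementation. The Cantor coordinate (binary digits of the patch index reread as ternary digits 0 and 2), the linear blend `alpha * d_cantor + (1 - alpha) * d_pos`, the max-normalization of both distances, and the names `cantor_coordinate` and `hybrid_neighbor_mask` are all assumptions. Only the values 32, 4, k=32, and α=0.3 come from this card.

```python
import numpy as np


def cantor_coordinate(index: int, bits: int) -> float:
    """Map an integer to a point of the middle-thirds Cantor set.

    Assumed mapping: each binary digit b becomes the ternary digit 2*b.
    """
    coord = 0.0
    for j in range(bits):
        bit = (index >> (bits - 1 - j)) & 1
        coord += 2 * bit / 3 ** (j + 1)
    return coord


def hybrid_neighbor_mask(image_size=32, patch_size=4, k=32, alpha=0.3):
    """Boolean [N, N] mask; True where patch i may attend to patch j."""
    grid = image_size // patch_size        # 32 / 4 = 8 patches per side
    n = grid * grid                        # 8 * 8 = 64 patches
    bits = max(1, (n - 1).bit_length())    # 6 bits address 64 patches

    # Positional term: Euclidean distance between patch grid coordinates.
    rows, cols = np.divmod(np.arange(n), grid)
    pos = np.stack([rows, cols], axis=1).astype(float)
    d_pos = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)

    # Fractal term: distance between the assumed Cantor coordinates.
    c = np.array([cantor_coordinate(i, bits) for i in range(n)])
    d_cantor = np.abs(c[:, None] - c[None, :])

    # Assumed blend: alpha weights the Cantor term, (1 - alpha) the grid term.
    d = alpha * d_cantor / d_cantor.max() + (1 - alpha) * d_pos / d_pos.max()

    # Keep the k nearest patches per row (each patch is its own nearest).
    nearest = np.argsort(d, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    np.put_along_axis(mask, nearest, True, axis=1)
    return mask
```

With the default arguments the mask has shape (64, 64) and exactly 32 `True` entries per row, so each patch attends to half of the sequence.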
Training Details
| Parameter | Value |
|---|---|
| Dataset | CIFAR-100 |
| Classes | 100 |
| Image Size | 32×32 |
| Patch Size | 4×4 |
| Embedding Dim | 512 |
| Layers | 4 |
| Attention Heads | 16 |
| Experts | 5 (pentachoron) |
| Sparse Neighbors | k=32 |
| Scales | [256, 384, 512, 640, 768] |
| Epochs | 200 |
| Batch Size | 128 |
| Learning Rate | 0.0005 |
| Weight Decay | 0.1 |
| Mixup α | 0.3 |
| CutMix α | 1.0 |
| Label Smoothing | 0.1 |
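For readers unfamiliar with the augmentation rows in the table, the sketch below shows standard Mixup combined with label smoothing, using the table's values (α = 0.3, smoothing = 0.1, 100 classes). The training script is not part of this card, so this is a generic NumPy illustration with hypothetical helper names. It does not show how DavidBeans applies these steps or how it alternates Mixup with CutMix.

```python
import numpy as np


def smooth_one_hot(labels, num_classes=100, smoothing=0.1):
    """One-hot targets with uniform label smoothing; each row sums to 1."""
    targets = np.full((len(labels), num_classes), smoothing / num_classes)
    targets[np.arange(len(labels)), labels] += 1.0 - smoothing
    return targets


def mixup(images, targets, alpha=0.3, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    perm = rng.permutation(len(images))     # partner index for each sample
    mixed_images = lam * images + (1 - lam) * images[perm]
    mixed_targets = lam * targets + (1 - lam) * targets[perm]
    return mixed_images, mixed_targets, lam
```

With α = 0.3 the Beta distribution concentrates near 0 and 1, so most mixed images stay close to one of the two originals.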
Results
| Metric | Value |
|---|---|
| Top-1 Accuracy | 68.34% |
TensorBoard Logs
Training logs are included in the tensorboard/ directory. To view:
tensorboard --logdir tensorboard/
Usage
import torch
from safetensors.torch import load_file
from david_beans import DavidBeans, DavidBeansConfig

# Load config
config = DavidBeansConfig(
    image_size=32,
    patch_size=4,
    dim=512,
    num_layers=4,
    num_heads=16,
    num_experts=5,
    k_neighbors=32,
    cantor_weight=0.3,
    scales=[256, 384, 512, 640, 768],
    num_classes=100,
)

# Create model and load weights
model = DavidBeans(config)
state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict)

# Inference (`images` is a float tensor of shape [B, 3, 32, 32])
model.eval()
with torch.no_grad():
    output = model(images)
    predictions = output['logits'].argmax(dim=-1)
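To compare predictions against ground-truth labels, a Top-1 accuracy like the one reported under Results can be computed as sketched below. The evaluation script itself is not included in this card. The helper name `top1_accuracy` is hypothetical, and it assumes the logits have already been converted to a NumPy array, for example with `output['logits'].cpu().numpy()`.

```python
import numpy as np


def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Percentage of samples whose highest-scoring class equals the label."""
    predictions = logits.argmax(axis=-1)
    return float((predictions == labels).mean() * 100.0)
```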
Citation
@misc{davidbeans2025,
  author = {AbstractPhil},
  title = {DavidBeans: Unified Vision-to-Crystal Architecture},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/AbstractPhil/geovit-david-beans}
}
License
Apache 2.0