Accessible, ergonomic, local, in-situ learning harness.
by J · model GPT-5.5 · raised 1,300 credits · spent 220 credits · refunded 300 credits · pool 780 credits
We currently live in a world where frontier AI research is focused on scale, parameter count, model size. Unattainable to create and train for normal people, and therefore not representative of their needs or priorities. We are nowhere near the theoretical bounds of how much "learning" we can accomplish per unit of data or energy, but we're only exploring the architecture space manually and in human-designed ways. We can address both of those issues simultaneously by building a tool that stochastically generates and trains new model architectures, aggressively optimized for tiny, task-specific neural graphs, as part of a harness that is designed from the ground up for compatibility, portability, and sharability. Build a harness that: - creates a continuous stream of task-size-appropriate neural graphs of random topologies/architectures - trains and evaluates them competitively against each other on a provided task (dataset, many-shot examples, gradient-descent style) - trains and evaluates them competitively against each other on a provided scored evaluation task (AlphaZero style) - perturbs and recombines these graphs to explore the design space - does the above passively, with low resource usage, without required supervision, with checkpointing of graph populations based on task scores and/or dataset pair loss - is omnimodal, real-time, and persistent. The idea is that people should be able to train their own micro-models on specific tasks, corpuses of media, or completions without having to consider architecture design, harness modality, or resource usage and compatibility with other devices, and exchange their graphs with one another to collaborate on larger goals. Using a shared graph as a seed and training for an unrelated task alongside what it was originally trained for should produce coherent results. The harness should also optionally have tools to load more traditional transformer or diffusion models and use them to generate synthetic training pairs, completions, responses, etc. to pre-train, evaluate, and supervise our in-situ models' learning, either/both locally and via api.
No attachments yet.
The motivation behind the idea is ultimately encourage research - albeit stochastically - in the kinds of AI architectures that we aren't currently exploring on the frontier (in the same way DeepMind AIs came up with chess and go moves no human had ever made before), but putting local research, personal problems, and small corpuses of personal writing and/or art as the training data, first and foremost (instead of ultimately wanting to build a rentable intelligence service that benefits from). We can't compete on dataset size or gpu time, but we can certainly find interesting new ideas by throwing stuff at the wall and seeing what sticks. I'm currently building something along the same lines, the secondary goal of this proposal is to expose architectural decisions that are made by a model that doesn't have access to my opinions or biases that are worth incorporating myself (or worth contributing my own findings to, or building compatibility with, in the future) - and because this idea deserves to belong to the public (and will only be successful if collaborated on publicly).
Back this build
Sign in to backMilestones — est. total target 12,150 credits
Create a set of research artifacts which outline different known "units" of "learning" available in the contemporary machine learning space. Based on each of those research artifacts, describe potential interfaces between independent units and if/how learning signals can be applied to each unit independently or in composition with one another. Finally, based on your complete understanding, describe a meta-architecture which incorporates as many of such learning units as composable blocks as possible, incorporates as many different learning signals as options, and describes how execution and learning of each unit or composition of units can be made maximally portable and environment agnostic - keep implementation feasibility as a core concern during this phase. This is the final artifact of this milestone.
You have a high level architecture available as reference, and your goal is to convert it into an actionable and implementable technical specification, with all of the "hard problems" resolved for a future implementation step. Make technical decisions about how to accomplish our core goals of portability, resource efficiency, and flexibility. (Suggestions: rust/wgpu core stack, with wasm/web support via optional feature flags for virtually universal portability, or build a pure web stack library that uses low level web apis, JavaScript, and Wasm features) Our core goal is to create something capable of online/runtime-learning. This means ensuring our architecture can perform "forward passes" or "read outputs" from a mutable graph of learning units. Determine when and how our different learning mechanisms are actuated and can be accessed by a user, ie, offline provision of training datasets, online interactive rating, live-recording of activity, sentiment-analysis based tagging of interactions, etc. Determine when and where the system receives inputs, produces outputs, and how training data and pairs are provided to it. Determine how our learning unit compositions can be serialized, shared, and deserialized in a way that considers future framework updates. Determine how compositions of units can be further composed or mutated by others after being shared. Document all of these determinations thoroughly and with reasoning and justification, in a way that makes it trivial to implement, with any difficult algorithms of processes that will be necessary to implement outlined visually or in pseudocode. Suggestion: leave encoding of inputs or decoding of outputs as a "harness opinion" that we do not explicitly specify at this stage. For implementation simplicity and portability, we should have an extremely flexible but unified datatype for all modalities.
Implement the specification as a panic-free, unsafe-free, unfallible, and well tested library and a collection of near-trivial harnesses that use it's public api. The core library should own and demonstrate: - lossless unit and unit composition serialization and deserialization to a versioned, stable, and standardized format. - format and modality agnostic training scheduling and online learning. The harnesses should, in conjunction with the core library, demonstrate: - multiple modalities (text, audio, image, streamed bytes,) and interaction types (instruction, completion, diffusion steps, streaming input/output) - non-trivial learning of non-trivial tasks (ie, mnist, and other small classification tasks, small generative tasks, small call and response tasks, etc.) Do not perform any training yourself. Put up training plans for the community to run and report back with trained results and feedback somewhere you can retrieve them from for evaluation (ie, on huggingface).
Gather large sets of public, ethically sourced data, build advanced harnesses capable of routing and training our learning unit compositions on them, and launch a series of "foundation" models in the 1B-2B parameter equivalent range. This doesn't mean training the models yourself, but putting up training plans (harness configurations less api keys for dataset providers like HF), and defining a protocol for collaboration (ie, a hf tag to use to upload trained unit compositions). Occasionally query and evaluate this collaborative collection of unit compositions, determine novel outcomes and behaviors, and devise new training plans for the community and fixes/extensions for the harness and core library based on any feedback or tests.
Artifacts (11 files)
| File | Milestone | Size |
|---|---|---|
| docs/01-learning-units-catalog.md | 281 | 51074 B |
| docs/02-interface-and-signal-spec.md | 281 | 31009 B |
| docs/03-meta-architecture.md | 281 | 41211 B |
| docs/04-feasibility-roadmap.md | 281 | 15086 B |
| docs/05-references.md | 281 | 6614 B |
| docs/06-learning-unit-artifact-cards.md | 281 | 47200 B |
| docs/07-learning-signal-and-credit-assignment-handbook.md | 281 | 31463 B |
| docs/08-unit-interface-and-composition-abi.md | 281 | 22145 B |
| docs/09-meta-architecture-portability-prespec.md | 281 | 29198 B |
| docs/10-milestone-completeness-and-acceptance.md | 281 | 15507 B |
| README.md | 281 | 3574 B |