Remember when you were younger, your obligations were far fewer, and you were still at least a little hopeful about the future potential of tech? Anyway! In our current moment, nothing seems safe from the sticky fingers of so-called AI, and that includes nostalgic hardware of yesteryear.
Exo Labs, an outfit with the mission statement of democratising access to AI, such as large language models, has lifted the lid on its latest project: a modified version of Meta’s Llama 2 running on a Windows 98 Pentium II machine (via Hackaday). Though not the latest Llama model, it’s no less head-turning, even for me, a frequent AI naysayer.
To be fair, when it comes to big tech’s hold over AI, Exo Labs and I seem to be of a similarly cautious mind. So, setting aside my own AI scepticism for the moment, this is undoubtedly an impressive project, mainly because it doesn’t rely on a power-hungry, very much environmentally unfriendly intermediary datacenter to run.
The journey to Llama running on ancient-though-local hardware had a few twists and turns; after securing the second-hand machine, Exo Labs had to contend with finding compatible PS/2 peripherals, and then figure out how they’d even transfer the necessary files onto the decades-old machine. Did I know FTP over an ethernet cable was backwards compatible to this degree? I certainly didn’t!
Don’t be fooled, though: I’m making it sound way easier than it was. Even before the FTP finagling was figured out, Exo Labs had to find a way to compile modern code for a pre-Pentium Pro machine. Longer story short-ish, the team went with Borland C++ 5.02, a “26-year-old [integrated development environment] and compiler that ran directly on Windows 98.” However, compatibility issues persisted with the C++ language itself, so the team had to fall back on an older incarnation of C and contend with declaring variables at the start of every function. Oof.
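For a flavour of what that last constraint looks like, here’s a minimal, hypothetical sketch (not Exo Labs’ actual code) of the older C style a compiler of Borland’s vintage expects, with every variable declared at the top of the function rather than at the point of first use:

```c
#include <stdio.h>

/* Hypothetical example of old-school C: every variable is declared
   at the top of the function, before any other statement runs. */
static double average(const int *values, int count)
{
    int i;          /* declared up front... */
    double total;   /* ...not at the point of first use */

    total = 0.0;
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return (count > 0) ? total / count : 0.0;
}

int main(void)
{
    int scores[3];

    scores[0] = 98;
    scores[1] = 95;
    scores[2] = 2;
    printf("average: %f\n", average(scores, 3));
    return 0;
}
```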
Then, there’s the hardware at the heart of this project. For those needing a refresher, the Pentium II machine sports an itty bitty 128 MB of RAM, whereas a full-size Llama 2 LLM boasts 70 billion parameters. Given all of these hefty constraints, the results are all the more interesting.
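To put rough numbers on that mismatch, here’s a quick back-of-envelope sketch. It assumes 2 bytes per parameter (16-bit weights, a common precision but an assumption on my part rather than a detail Exo Labs has confirmed), which would put the full 70 billion-parameter model at around 140 GB of weights alone, over a thousand times the RAM this Pentium II has to offer:

```c
#include <stdio.h>

/* Back-of-envelope only: 2 bytes per parameter (16-bit weights) is an
   assumption here, not a confirmed detail of the project. */
int main(void)
{
    const double params_70b = 70e9;                 /* full-size Llama 2 */
    const double bytes_per_param = 2.0;             /* assumed fp16 */
    const double ram_bytes = 128.0 * 1024 * 1024;   /* the Pentium II's 128 MB */
    double weights_gb = params_70b * bytes_per_param / 1e9;
    double shortfall = (params_70b * bytes_per_param) / ram_bytes;

    printf("70B weights at 2 bytes each: ~%.0f GB\n", weights_gb);
    printf("That is roughly %.0fx the machine's RAM.\n", shortfall);
    return 0;
}
```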
Unsurprisingly, Exo Labs had to craft a rather svelte version of Llama for this project, now available to tinker around with yourself via GitHub. As a result of everything aforementioned, the retrofitted LLM features 1 billion parameters and spits out 0.0093 tokens per second. At that rate, a single token takes nearly two minutes to appear, and a modest 100-token reply would take roughly three hours. Hardly blistering, but the headline take here really is that it works at all.