I've been wanting to write this for a long time. Even before large generative machine-learning models became a valid subject of conversation among non-technical people.
My relationship with "artificial intelligence" has been quite complicated. I've had a strong scientific and hacker-like curiosity towards machine-learning models for many years, and I feel there's a lot of positive potential to them. However, I dislike how they're being adopted by the civilization.
There's already a lot of political polarization around this topic. I often side with the skeptics and laugh at the naïve advocates, but I still feel both sides are misguided. Similar polarization has happened with many other technologies before, so it's nothing new.
It is much easier for me to hate, say, "mining"-based cryptocurrencies such as Bitcoin. They're all an environmentally destructive scam, period. But AI forms a mind-bogglingly huge territory of possibilities, and we've barely scratched the surface of it. Simplistic judgement is therefore impossible.
I often get obsessed by systems that do something they shouldn't be able to. Such as computer programs that do "impossible" things – how can a short formula like t*5&(t<<7)|t*3&(t*4<<10) sound like music? When I found potentially useful undocumented features in an old 8-bit computer, I could spend hundreds of hours studying them in order to show off the results at a demoscene event.
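To give an idea of how little such a formula needs around it: the value is computed for an ever-growing time counter t, and its lowest eight bits become raw audio samples. A minimal player sketch in C (my illustration of the usual "bytebeat" harness, assuming a Unix-like system where the output can be piped to something like aplay at 8000 Hz):

    #include <stdio.h>

    int main(void) {
        /* Evaluate the formula for t = 0, 1, 2, ... and emit the low 8 bits
           of each result as an unsigned 8-bit sample, e.g. piped into
           "aplay -r 8000 -f U8". */
        for (unsigned t = 0; ; t++)
            putchar(t*5 & (t<<7) | t*3 & (t*4<<10));
        return 0;
    }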
In 2015, Google revealed its DeepDream technique, which showed how neural networks designed for image recognition could be run backwards to produce psychedelic imagery on top of existing images. I got obsessed by the technique, dissecting several neural nets in order to find out how they worked and to find new visually interesting things to do with them.
In 2019, I became similarly obsessed by OpenAI's GPT-2. I had seen neural language models before, but they had looked just as bad as the Markov chains used in silly chatbots since the 1990s. But GPT-2 was different. It remained coherent over several paragraphs and even showed something that looked like creativity, so I wanted to know exactly how it worked and where its limits were. I implemented my own version in C in order to better understand it. I also trained it to speak my native language. Some people even paid me for helping them out with GPT.
Exploring neural nets was fascinating, but at the same time they were far out of my technological comfort zone. I generally like systems that are relatively simple on the surface but have a lot of emergent complexity. Elegant algorithms, simple formulas, 8-bit microchips. Big neural models are always somewhat like black boxes, alien artifacts very difficult for human minds to reverse-engineer. So, I guess I understand pretty well the anxiety that oldschool hackers feel when dealing with machine learning.
AI is just as old as computers. The terms "artificial intelligence" and "machine learning" were coined in the 1950s, and even the first artificial neural nets were made in that decade. Even earlier, there was the field of cybernetics, which focused on control processes in machines and living beings alike. Cybernetics was maybe a bit too open and wide-ranging for U.S. academics, however, so they split it into more straightforward subfields.
Another 1950s concept was "intelligence amplification", IA. Not many have heard about IA, but these days it's pretty much ingrained in all human-computer interaction. Graphical user interfaces, hypertext and WYSIWYG all originated in IA research, even though the original idea of IA has somewhat faded out over the decades.
AI researchers wanted to create machines that would simulate human intelligence well enough to replace it. In their vision, humans and machines would compete against each other by the same rules and metrics – pretty much like in the good old industrial capitalism. On the other hand, the IA side wanted to create machines that would assist and "amplify" people's natural intellects – so, instead of competition there would be co-operation combining the strengths of humans and computers.
Now that machine learning models have been framed as "AI", they have also brought back the old separation of AI and IA. It almost looks like we forgot all the decades of research in human-computer interaction and returned to the "Star Trek future" of the 1960s. The computer is a mysterious black box that acts like a person. You give it a question in natural language, and after some processing, a final and definitive answer will come out.
This "prompt'n'pray" kind of interface is particularly alienating to those who want to affect what's going on and to have control over the result. But not as alienating to the boss types who are more into giving orders than actually doing things. Just look at who are the most vocal advocates of "prompt'n'pray" services on social media.
Whenever I've played around with ML models, I've wanted to observe and interfere more than the default interfaces let me. With GPT, I wanted to watch the actual candidate tokens and their probabilities, sometimes select the next token by myself or hack the selection logic. With image generators, I preferred to feed them pre-existing images and to gradually refine the imagery by latent blending, inpainting and manual editing. Plain "prompt engineering" frustrated me very quickly.
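As a concrete example of what I mean by hacking the selection logic, here's a toy sketch (not my actual tooling – the candidate tokens and logits are made up, where a real model's output layer would provide them): turn the logits into probabilities, show them, and let the human pick the next token.

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        /* made-up candidates and logits; a real model would supply these */
        const char *tokens[] = { " the", " a", " weird", " my", " banana" };
        double logits[]      = {  3.1,    2.4,   0.7,     0.5,   -1.2 };
        int n = sizeof(logits) / sizeof(logits[0]);

        /* softmax: turn logits into probabilities */
        double probs[5], sum = 0.0;
        for (int i = 0; i < n; i++) { probs[i] = exp(logits[i]); sum += probs[i]; }
        for (int i = 0; i < n; i++) probs[i] /= sum;

        for (int i = 0; i < n; i++)
            printf("%d: \"%s\"  p=%.3f\n", i, tokens[i], probs[i]);

        /* instead of sampling, ask the human */
        printf("pick the next token: ");
        int choice;
        if (scanf("%d", &choice) == 1 && choice >= 0 && choice < n)
            printf("appending \"%s\"\n", tokens[choice]);
        return 0;
    }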
But am I not just being frustrated by the clumsiness of the early stages of a new technology? No, I don't believe that corporate generative ML will get any closer to IA ideals. Intelligent and knowledgeable users are bad for business, that's why. Even software developers need to remain ignorant: "Open"AI locks its API to mere "text in, text out" in order to protect its corporate interests.
And IA is currently unfashionable even when not contrasted with AI. Mainstream "user experience design", especially in social media applications, represents the diametrical opposite of IA, amplifying stupidity instead of intelligence.
Corporations will probably continue on this road – dumbing people down in order to boost the growth of AI – unless there's an adequate non-corporate intervention.
In the production design of the latest Dune movies, the team avoided getting inspired by anything found on the Internet. The reason was that the images easily found on-line form a "very shallow pool". Many TV/movie productions look similar because they're ultimately based on the same imagery.
This is a common discrepancy on today's Internet. In principle, there's a huge amount of endlessly diverse material, but the search and discovery algorithms hide the "long tail" of it. Additionally, "good content" is defined by gut-based "like" reactions, which narrows down the variety even more. This narrowing-down even applies to the algorithms themselves: when the various platform companies plagiarize a few market leaders, algorithmic diversity suffers.
Generative ML models are based on statistical probabilities and therefore share the same problem. Even if there are terabytes of training material behind an oversized model, it seldom spits out anything from the "long tail". If you ask ChatGPT to say something weird, it will say something that it considers a popular choice for "weird". The "prompt'n'pray" formula is a slave to the probability estimates and rarely chooses anything unpopular.
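To illustrate why the "long tail" so rarely surfaces (this is my own toy illustration with made-up numbers, not a description of any specific product): typical samplers such as "top-p" keep only the most probable tokens until a probability budget is filled, and everything below the cutoff simply can never be chosen.

    #include <stdio.h>

    int main(void) {
        /* made-up token probabilities, sorted from most to least popular */
        double probs[] = { 0.55, 0.25, 0.12, 0.05, 0.02, 0.01 };
        int n = sizeof(probs) / sizeof(probs[0]);
        double p = 0.90, cumulative = 0.0;

        for (int i = 0; i < n; i++) {
            /* a token is sampleable only while the nucleus is still being filled */
            int kept = cumulative < p;
            printf("token %d: p=%.2f -> %s\n", i, probs[i],
                   kept ? "can be sampled" : "never sampled");
            cumulative += probs[i];
        }
        return 0;
    }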
Some generative models even boost their shallowness with additional Internet-style shallowness. Stable Diffusion XL, for example, has been trained to generate images that maximize a one-dimensional "aesthetic score" based on the gut reactions of large groups of people. Something similar also happens when models are designed to be competitive at one-dimensional benchmarks.
So, if you want something unique out of a generative model, the seeds of uniqueness need to come from somewhere other than the probability-and-popularity engine. Just like on the algorithmic Internet: in order to keep your recommendations interesting, you need to look for marginal stuff with obscure search keywords from time to time.
Generative models already make it very cheap to produce shallow "content" such as marketing bullshit and things that look like stock photos, but the cheapness will not end there. I believe it will gradually extend to all kinds of "content".
When the models get more capable, humans will need to look for spaces that are still unreachable by "AI" – to invent new genres, just like photography led to the invention of new artistic styles. The shallow pools will get deeper, which will be refreshing for a while. But this whole hide-and-seek game against the machine may be futile in the end. The culture will need to grow beyond "content" altogether.
"Content" is a concept that is industrial-capitalist to the core. It prioritizes the corporate platform, be it a social media service, a commercial radio station or a book publisher, and devalues the creator. In "content talk", there are no authors who need publishers for the books they write, but a publishing industry that needs "content producers" for its books.
Once all "content" is as cheap as tap water, the present hiearchies and games of "content production" will have been demolished. What remains valuable will be the kind of culture that cannot be reduced to consumable "content". I don't know what it will be, but I hope it will be better than what today's commercial Internet offers.
This may be difficult to believe in today's world, but the most essential characteristic of computers is their flexibility. Anything can become anything with different software. Unfortunately, this also means adaptability to oppressive ideologies, especially those of the big owners. This is why we have destructively short hardware lifespans, ridiculous usage limitations, non-programmability, ultra-addictive applications and overall one-dimensionality and shallowness. None of these are built-in characteristics of computing.
The same is true for many subfields of computing, including machine learning. Most of the money and hype revolves around the aspects that can make the rich even richer. Build big centralized AI brains to replace human workers, and forget about the small, the decentralized and the IA-like.
The power dynamics of technology often create polarization, and polarization weakens the ability to see properly. The philosophy of technology has suffered from this from the beginning. Even Heidegger, despite all his insight, basically attacked a strawman called "Technik" without properly separating the socio-economic from the technical.
Polarization also often destroys the third option. Arts and Crafts started as a forward-looking and experimental movement that could have presented an alternative technological vision for the early 1900s: ditch the dehumanizing factory industry but adopt the kind of machines that support quality and artistic vision. A bit like IA against AI. But instead, the polarizing forces reframed Arts and Crafts as a "tradition", situating it on the "anti-progress" side and thus eliminating its transformative potential.
The personal computer started as a third option. The big computers of the 1970s were feared and opposed: they'd take our jobs, they'd destroy humanity. Microcomputers were advocated as an unusual, countercultural way of fighting back – learn the secrets of the establishment and use its tools against it – but then they became middle-class consumer products and then the new tools of oppression. The third option became one of the two.
Now we're in the middle of an AI bubble that looks much like the dot-com bubble of the late 1990s. Fraudsters who want to get rich quickly are aggressively pushing "AI" even for purposes it is ridiculously unsuitable for. All the hype makes the world look either bleak or utopian, depending on the side you choose in the ever louder polarization circus. But in the midst of this, it is important to work on creating a third option that can gain traction once the bubble bursts.
It is easy to fight a strawman, but difficult to envision alternatives. That requires understanding of history, society and technology, as well as imagination that goes beyond the "shallow pools". I consider myself lucky in this regard because I've been involved with computers and programming from an early age. Over a few decades, I've participated in underground computer subcultures whose outlooks I've been able to contrast with the mainstream ones. For someone with a background like mine, it would be simply stupid to declare "AI" as a taboo that right-minded fighters cannot touch.
Of course there's a lot to oppose. The energy consumption of machine learning is huge, and it is expected to blow up globally as mass-market applications drive up demand. Then there's a whole bunch of social and economic issues. And, as with all automation, it is also important to maintain the corresponding manual practices. But a categorical opposition to an entire macro-category of algorithms is misguided.
Despite my occasional obsession with big mainstream applications of ML, I'm mostly interested in the kinds of ML and AI that stay away from the hype. Classical hand-coded heuristics. Small neural nets that do small things, often making the huge and wasteful models look ridiculous in comparison. And, of course, the vast unexplored gaps between AI and IA.
I sometimes encounter interesting things but seldom anything practically useful. But I don't think practicality matters all that much anyway. Hobbyist microcomputers weren't very useful either in their early years, but they still created a cultural alternative to how the establishment was using the technology.