The Human-Computer Compromise
// 02.25.2024
An HCI (human-computer interface) paradigm I've been thinking about for a very long time is the compromise between how much work the computer has to do and how much work the human has to do to operate it. The two are inversely related: the more work the computer takes on, the less the human has to do. Alongside that, the bandwidth of information sent from the human to the computer (consciously, or later unconsciously) keeps growing, and so does the amount the computer has to handle. Of course, the computer doesn't care how much work it has to do every waking microsecond because it can't really think (yet, no matter how convincing LLMs are). It is just a pile of combinational logic gates and memory with a predetermined fate. However, it is still an interesting trend I think is worth exploring.
This relationship can be traced back throughout the entire history of the computer:
One of the earliest ways of controlling a computer was manually punching paper cards and inserting them into a machine. This required knowing exactly how every part of the computer worked to not break it. Bandwidth could be measured in bytes per hour.
Later we moved on to keyboards in the terminal or DOS. (Is that what you call it? idk, I was born in 2003) This still feels extremely primitive, but was a huge leap forward. Bandwidth skyrocketed to hundreds of bytes per minute! This is thanks to humans being able to take advantage of muscle memory gained from operating typewriters. They still had to think very precisely about what exact commands they wanted to send, but they could unconsciously type entire words without thinking. This paradigm was so effective that it remains largely unchanged and extremely standard even today. The fact that I'm writing this with a keyboard is proof enough.
Then came the magical graphical user interface (GUI). A lot more work has to be done on the computer's end because it now has to worry about a whole new type of visual rendering and a window manager. Bandwidth also increased significantly because humans could now move a mouse around on a table, which translates to X and Y coordinates on a screen. This opened up a new level of freedom in how we can choose to operate the computer. Complex GUIs also vastly lower the skill floor and increase the number of available actions, as we no longer have to know in advance what we want to do. We can just discover buttons: recognition instead of recall.
Touch screens are fine. They're not faster than a mouse but they are more convenient and easier to learn, so they're still going in the right direction.
Notice how the amount of work done by the computer increases by about an order of magnitude with each generation, somewhat mirroring the advances in hardware complexity and transistor count.
Dictation tangent
40 bits per second is the limit for verbal communication between humans. To my dismay, it seems like our speed of thought prohibits us from pushing beyond this (Regalado, 2023). Though I'm hopeful we'll still see benefits through our visual cortex.
Dictation is the fastest method of language input we have so far. Because I care so much about operating computers at the speed of thought, I use it pretty frequently, especially on my phone, which has no physical keyboard. I am surprised that more people don't feel the same way about that tradeoff. You might get looked at like a grandma for speaking into your phone in public, but honestly, try typing out dialogue from a video and see how hard it is. We speak over twice as fast as we can type without mistakes. Don't get me started on stenography.
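To put the figures so far on one scale, here is a rough back-of-the-envelope sketch in Python. The punch card and keyboard values are assumed placeholders (the essay only says "bytes per hour" and "hundreds of bytes per minute"), and raw bytes typed are not the same thing as the information rate quoted for speech, so treat this as an order-of-magnitude comparison and nothing more.

```python
# Rough comparison of human-to-computer input rates, in bits per second.
# ASSUMPTIONS: 100 bytes/hour for punch cards and 200 bytes/minute for typing
# are illustrative placeholders, not measurements. Only the 40 bit/s speech
# figure comes from the essay (Regalado, 2023).

BITS_PER_BYTE = 8


def bytes_per_hour_to_bps(n_bytes):
    """Convert a rate in bytes per hour to bits per second."""
    return n_bytes * BITS_PER_BYTE / 3600


def bytes_per_minute_to_bps(n_bytes):
    """Convert a rate in bytes per minute to bits per second."""
    return n_bytes * BITS_PER_BYTE / 60


rates_bps = {
    "punch cards (assumed 100 B/hour)": bytes_per_hour_to_bps(100),
    "keyboard (assumed 200 B/min)": bytes_per_minute_to_bps(200),
    "speech (40 bit/s, from the essay)": 40.0,
}

for name, bps in rates_bps.items():
    print(f"{name:36s} {bps:8.2f} bit/s")
```

Even with generous placeholder numbers, each generation lands roughly two orders of magnitude above punch cards, which is the trend this essay is about.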
Present
Now we can turn to modern-day computing trends. Apple Vision Pro was released this month, and its primary mode of interaction is eye tracking with hand-tracked pinch selection. The only way to interact with the device (except trackpads) is through your innate biological "cursor" that you have been unconsciously controlling your whole life. Every waking microsecond, the device runs machine learning on the feeds from 12, yes twelve, cameras to analyze your eye gaze, hand skeletal structure, hand visual cutout, and sometimes your entire face's movements simultaneously. On top of all of that, this is the first XR headset that constantly maps your environment with a depth sensor without the user even knowing. Its dedicated R1 sensor-processing chip has a staggering 256GB/s of memory bandwidth, though the usable human input extracted from all that is much, much less. There are certainly other processes constantly running in the background that I forgot to mention, like 6DoF tracking and app rendering. It is truly hard to keep track of everything this computer does, all for the sake of human convenience.
This sensor suite allows the device to peer into your unconscious intentions for some very cool UX moments that have never been possible before, like automagically jumping your trackpad cursor to wherever you are looking in any window, or popping up a "Connect" button above your MacBook keyboard when you look at it.
It's also worth noting that all this work can be used to optimize the computer, so that it does only as much work as you are capable of perceiving. AVP utilizes eye-tracked foveated rendering to only render what you're looking at in high resolution. I wrote a whole other paper about Exploiting Human Perception to Create the Perfectly Optimized Virtual Reality System.
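The idea behind eye-tracked foveated rendering can be sketched in a few lines of Python: pick a resolution for each region of the screen based on how far it is from the gaze direction. This is only a toy illustration of the general technique. The angular thresholds, the scale factors, and the function names are all made up here and say nothing about how Vision Pro actually does it.

```python
import math

# ASSUMPTION: these thresholds and scale factors are invented for
# illustration; they are not Apple Vision Pro's real parameters.
FOVEA_DEG = 5.0   # render at full resolution within this angle of the gaze
MID_DEG = 20.0    # render at half resolution within this angle of the gaze


def angle_between_deg(gaze_dir, tile_dir):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(g * t for g, t in zip(gaze_dir, tile_dir))
    norm = math.hypot(*gaze_dir) * math.hypot(*tile_dir)
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp rounding error
    return math.degrees(math.acos(cos_angle))


def resolution_scale(gaze_dir, tile_dir):
    """Fraction of full resolution to render a screen tile at."""
    angle = angle_between_deg(gaze_dir, tile_dir)
    if angle <= FOVEA_DEG:
        return 1.0
    if angle <= MID_DEG:
        return 0.5
    return 0.25


gaze = (0.0, 0.0, 1.0)  # looking straight ahead
for tile in [(0.0, 0.0, 1.0), (0.2, 0.0, 1.0), (1.0, 0.0, 1.0)]:
    print(tile, resolution_scale(gaze, tile))
```

The tile directly under the gaze gets full resolution, and tiles further into the periphery get progressively less, which is where the savings come from.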
As far as HCI goes, this certainly feels like the future. I wrote this entire essay in mine. You might think it's hard to imagine things getting even more complex. But you'd be wrong.
Intermission
I'd like to reemphasize that though the word "compromise" implies a tug-of-war between two active parties, we shouldn't feel bad for computers any more than we would feel empathy for a lightbulb or a dishwasher, because they are still simply much more beautiful versions of those things.
Future
If we look into the future, it's filled with neural interfaces. BCIs (brain-computer interfaces), even. Headsets that read your brain so you don't even have to look to operate them. The computational complexity increases even further, as the computer now has to train a neural network that is precisely calibrated to how your brain sends signals.
Regrettably Elon Musk and his Neuralink
It's not worth mentioning all the reasons why this guy is a shitty person, but unfortunately he occasionally has a good idea and enough resources to materialize it. One of those ideas is Neuralink, which aims to do a lot of things, but ultimately it uses micrometer-width wires implanted into the brain, combined with neural network training, to push the bandwidth of communication with computers higher than was ever possible before. (Regalado, 2023)
"We're a 300 baud modem. Very slowly outputting information into our phone or maybe a little bit faster into a computer if you're using 10 fingers," he said on the Times' Sway podcast. "And it's just very hard to communicate. AI will diverge from us just because it can't talk to us." (Keane, 2020)
This feels much more like hypothetical sci-fi than anything previously discussed, but I'm still interested to see what becomes possible from it.
Steve Jobs once said
This process of integrating personal computers into society is going to take maybe 10 years to conclude. And we of course want to continue to sell more and more computers. The key to that will be to make the computers easier and easier to use. And the way that that's going to happen is we're going to spend more and more computer power in the box to adapt the computer more to the way people are familiar with doing things so the people have to adapt less to the way computers do things. (Penn, 2024)
I stumbled upon this the other day. It was very gratifying to see that someone agrees with me.
Conclusion
Just to reiterate, hopefully you can see this relationship: the further we traverse down the computing skill tree, the more work the computer has to do and the less effort the human has to put in for them to communicate with each other. The bandwidth of communication also becomes much wider, to the point where, in the case of eye/face/hand/body tracking and BCIs, computers now have to filter all the incoming information to get to the useful stuff.
You might start to wonder if it's all worth it. All of this work and computing? Yeah. Sorry for throwing in another Steve Jobs quote before I let you go, but I really like this one:
I remember reading an article when I was about 12 years old, I think it might have been Scientific American. Where they measured the efficiency of locomotion for all the species on planet Earth, how many kilocalories did they expend to get from point A to point B? And the condor won. It came in the list, surpassed everything else, and humans came in about a third of the way down the list, which was not such a great showing for the crown of creation. But somebody there had the imagination to test the efficiency of a human riding a bicycle. [It] blew away the condor all the way off the top of the list and it made a really big impression on me that we humans are tool builders and that we can fashion tools that amplify these inherent abilities that we have to spectacular magnitudes. And so for me, a computer has always been a bicycle of the mind. Something that takes us far beyond our inherent abilities. I think we're just at the early stages of this tool. Very early stages. And we've come only a very short distance and it's still in its formation but already we've seen enormous changes. I think that's nothing compared to what's coming in the next 100 years. (Penn, 2024)
Sources
Keane, S. (2020, September 29). Elon Musk: Neuralink brain implant will improve “bandwidth” of human communication. CNET. https://www.cnet.com/science/elon-musk-neuralink-brain-implant-will-improve-bandwidth-of-human-communication/
Penn, T. (2024, January 22). Would Steve Jobs be proud of Vision Pro? YouTube. https://youtu.be/HVBZcHzv9-I?si=vOL-A8LOUb7ayFIs
Regalado, A. (2023, September 29). Elon Musk wants more bandwidth between people and machines. Do we need it? MIT Technology Review. https://www.technologyreview.com/2023/09/29/1080472/elon-musk-bandwidth-brains/