
Thoughts on GenAI (1/4): AI Development and Proof of Digital Identity

A while ago, a friend who does not work in IT asked me what GenAI actually is. As I tried to explain, I realized that even though I work in cybersecurity, at the forefront of IT, I did not know much about it myself. Over the weekend, I dug into various materials and organized my thoughts on current issues, security considerations, and assorted reflections about AI.

This is the first of four posts.


Generative AI (GenAI) is producing innovative results across various fields, from text, images, and music to code generation. Large language models (LLMs) like OpenAI’s ChatGPT and Google’s Gemini perform various language-related tasks such as translation, article summarization, and question answering. Visual creation tools like Midjourney enable image and artwork generation.

When I think about it, the way GenAI generates new outputs seems quite similar to how humans exercise creativity. As the saying goes, "imitation is the mother of creation." Humans produce creative work by connecting and combining existing knowledge and information in novel ways. Similarly, AI learns patterns from vast datasets of existing work and recombines those patterns probabilistically to generate outputs that feel new and original to humans. What matters in this creative process is the quantity and quality of the data, and the knowledge that can be extracted from it.
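As a deliberately toy illustration of "recombining learned patterns probabilistically": an LLM repeatedly scores every token in its vocabulary, converts the scores into a probability distribution (softmax, scaled by a temperature), and samples one token. The vocabulary, scores, and temperature below are made up for illustration; real models do this over tens of thousands of tokens with learned weights.

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Turn raw scores into a probability distribution (softmax),
    then draw one token at random according to those probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()
    cumulative = 0.0
    for token_id, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return token_id
    return len(probs) - 1

# Hypothetical vocabulary and scores a model might assign after "imitation is the":
vocab = ["mother", "father", "art", "sincerest"]
logits = [3.0, 0.5, 1.0, 2.0]
print(vocab[sample_next_token(logits, temperature=0.8)])
```

Lower temperatures sharpen the distribution toward the highest-scoring token (more imitation); higher temperatures flatten it (more surprising combinations). That single knob is much of what "probabilistic and organic" recombination amounts to mechanically.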

Some argue that AI cannot be truly creative. They say creativity stems from subjective human experiences and emotions, which AI cannot fully imitate or understand. But I wonder whether AI's current limitation is simply that it cannot yet imitate or understand the five human senses: sight, hearing, touch, taste, and smell. These senses deeply shape the cognitive process through which humans acquire data, and consequently shape creativity.

Current AI relies mainly on sight and hearing, but if AI gains the ability to imitate the five human senses through various sensors, experiences the world through them, accumulates data, and learns continuously, I believe it could advance to a stage where it thinks creatively at a level equal to or beyond humans. Big tech companies are already researching semiconductor sensors that mimic the human senses.

OpenAI’s recently released Sora shows that AI has reached a stage where it can generate videos from just prompt input. The introduction on OpenAI’s website is interesting:

We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction.

Watching this introduction and the videos generated by the model, I thought about how earlier AI was limited to expressing single scenes as text or still images. Video-generation AI captures interactions between scenes as continuous moments, opening a path toward imitating how humans interact with the physical world.

Google recently expanded Gemini's context window dramatically, to one million input tokens. This could further widen the path for AI to interact with the world.

Through these developments, AI might come to experience and learn the human senses indirectly, even without physical sensors.

It feels like the emergence of an "AI" identical to "me" is not far off. If AI's sensory systems mature and its capacity for continuous learning improves, an interactive AI given the same data as "me" could behave like "me" and make the same judgments. Existing methods of distinguishing between "me" and "AI" could then become meaningless: proof methods based on shared past experiences would no longer be valid. What methods, beyond biometric authentication, could distinguish a digitized "me" from the real "me"?
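One possible direction, sketched very loosely: shift from proving who I am (experiences, writing style, biometrics) to proving what only I possess, such as a secret cryptographic key. The enrollment story and function names below are hypothetical, but the challenge-response pattern itself is standard; an AI trained on all of my public data still would not hold the key.

```python
import hmac
import hashlib
import secrets

def issue_challenge():
    """Verifier sends a fresh random nonce so old responses cannot be replayed."""
    return secrets.token_bytes(32)

def respond(secret_key, challenge):
    """Prover answers with an HMAC over the challenge, computed with the secret key."""
    return hmac.new(secret_key, challenge, hashlib.sha256).digest()

def verify(secret_key, challenge, response):
    """Verifier recomputes the HMAC and compares in constant time."""
    expected = hmac.new(secret_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

key = secrets.token_bytes(32)   # enrolled once, known only to the real "me"
challenge = issue_challenge()
assert verify(key, challenge, respond(key, challenge))                   # real "me"
assert not verify(key, challenge, respond(secrets.token_bytes(32), challenge))  # a copy without the key
```

The point is not this particular construction but the principle: possession of a secret is something a behavioral imitation cannot replicate, no matter how much of "my" data it was trained on. Keeping that secret out of the imitator's training data becomes the new problem.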

This post is licensed under CC BY 4.0 by the author.