The data-filled “clean and smart” world: the questions that still need to be asked
- By ssyc1
- December 11, 2024
With GenAI and large language models (LLMs) the hot topics of 2024 and likely to remain so into 2025, with the recent launch of OpenAI's o1 and similar models "designed to spend more time thinking before they respond", with many investors expecting AI's impact to start ratcheting through industries and verticals, and with the FDA in the US set to pass the milestone of 1,000 approved AI- and machine-learning-enabled medical devices before year-end, what questions do we still need to be asking to help us navigate as this major, structural theme makes its impact on the broader economy and society?
Once in a while, we do wonder whether some of the AI chatbots out there can produce an illuminating answer, a better answer, an interesting answer, or even just some inspiration.
Of course, what these models can do (and do well) will drive what might get adoption to “stick”.
It has been clear for some time now that:
- the “smarts” in these AI agents are helping to “speed things up”, “make records of things” and so on, though the models depend on data for training (and data owners have recently become increasingly protective and assertive over their rights; top venture capitalists attribute a slow-down in the progress of AI models’ capabilities partly to this, and a recent MIT study found that 25% of data from the highest-quality sources had been restricted in the 12 months to April 2024);
- the o1-style models that “learn to think before they speak” seem able to achieve much higher performance on logical-reasoning tasks (e.g. an 85% mark on a Maths Olympiad test), even if there is still much these models cannot do, with both investment bank analysts and top professors reminding us that quite a number of jobs requiring “human cognition” cannot be replaced, that throwing more data at a model is not necessarily going to improve a customer service representative’s ability to help a customer troubleshoot problems, and that we are still looking for a “killer app” for GenAI;
- “hallucination” limits the usefulness of these agents (and more recently there is the “model collapse” talk due to “a lack of human data”, reminding us that a world increasingly filled with LLM-generated content will be “baggage” to, if not serve to “poison”, future LLM-generated predictions).
What else can we say about the usefulness and applications of these technologies for us, the venture investor?
- We were surprised by the wide difference between the best agents and the average ones – we wonder what the distribution of “super-smartness” / “out-performers” amongst AI agents might be (this is a technology writer’s model-by-model comparison).
- Related to the above, we are watching closely the rise of AI-powered search and the likes of Perplexity AI, which is more focused on the search task than an all-purpose conversational AI assistant like ChatGPT: are the models powering GenAI chatbots slated to become the basis for the “next Google” of AI search? (It is no accident that OpenAI earlier this year also launched a prototype of SearchGPT, essentially announcing its entry into the search market.)
- We did ask ourselves whether AI agents can shed some light on the run-up in share prices of energy giants (and of course there is the additional question of whether the proliferation of AI agents can actually cause share price increases …);
- We’ve noticed the emergence of interesting companies whose names reference things like “reflexive AI” and whose founder-CEOs speak about “embodied AI”, which seems to point to the relevance of the social sciences to AI and software engineers, and to the importance of concepts like human-centred AI, the convergence of human and artificial intelligence, and augmented AI.
All these questions seem to suggest at least three things to us, and then there is a fourth that is specific to “clean” (as in “green”):
(a) “clean” data will continue to command a premium, especially when the results of the data analyses have $ consequences and need to be accurate, reliable or robust (e.g. saving kWh and hence $, or supply-demand matching in order to delay capex);
(b) “cleaning” data, and some means of filtering the good from the bad, will need to be devised – whether mandated by government or corporations, established through standards, driven by the premium paid for superior data sets pushing low-quality data and their producers and sellers out of the market, or incorporating user-defined elements, as today’s Perplexity AI and its likes do by asking users to choose the model;
(c) we can look forward to disruptions even in some of the markets created by the now well-known names of web 1.0 – it will be fun trying to work out what the “killer app” for GenAI is going to be, as search and email clearly were for the internet; and
(d) as we are talking about “clean” or “green” as in reducing and optimizing energy use as well as reducing its emission intensity, we have to admit that the ramp-up of GenAI – with the concomitant need to build more data centres and the significant energy required to train models on huge amounts of data – has created a further conundrum for us and for the climate change-related decarbonisation need, though the increasing cost-competitiveness, scalability and modularity of many clean and low-carbon energy technologies are very much big pluses for Big Tech.
More good questions needed!